Chapter 5. Multimedia-Service Creation Overview

One of the major benefits that SIP offers is its ability to be used by programmers as a tool for multimedia-service creation. In this chapter, we define what “SIP services” are. We also discuss the different approaches to provide SIP services, and analyze the tools available for SIP-service creation.

What are SIP Services?

As we saw in Chapter 4, core SIP provides basic functionality for the management of sessions and the location of users. Such functionality at the signaling level, together with other protocols at the media level, can enable a wide variety of multimedia scenarios.

Nevertheless, there are cases where users may want an enhanced functionality on top of the basic multimedia-session management. For instance, imagine that the called user wants incoming calls from specific users to be blocked, or a company wants all the incoming calls to a group number to be distributed among a group of agents. These are examples of services—that is, of enhanced functionality on top of the basic multimedia-session-management functions.

It is not always the end users who demand a specific enhanced treatment for their calls. For instance, in order to decide whether a call can continue or not, a service provider might want all originating calls from prepaid users to go through a service logic that checks if the user has enough credit in his or her account.

The previously described services can be easily implemented by having an extra service logic that implements a smart manipulation of the SIP signaling. This is yet another example of the benefits that the signaling concept offers in communication scenarios: it enables the delivery of services.

So far, we have mentioned just scenarios that involve signaling manipulation. However, even the simplest multimedia call requires some media-level handling at the endpoints, at least in order to capture and present the media and to receive and transmit the media packets over the network. Hence, even in the most basic case, there is a need for the applications at the endpoints to have direct access to the media-handling capabilities in the terminal.

In addition, there are other cases in which services are delivered to the users, and these services involve extra processing at the media level. Imagine, for instance, a conferencing service; in such a service, the media generated by different sources need to be mixed together before they are delivered to the destination. Such media mixing can be seen as an enhanced media function—that is, as a media service.
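To make the idea of media mixing concrete, here is a minimal sketch of one common way to mix audio: adding the samples of each source and clamping the result. It assumes all sources deliver equally long frames of 16-bit PCM samples; the class name PcmMixer is hypothetical, and a real conference mixer would also deal with codecs, timing, and excluding each participant's own audio.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that mixes equally long frames of 16-bit PCM audio. */
final class PcmMixer {
    private PcmMixer() {}

    /** Sums the frames sample by sample and clamps each sum to the 16-bit range. */
    static short[] mix(List<short[]> frames) {
        if (frames.isEmpty()) {
            return new short[0];
        }
        int length = frames.get(0).length;
        for (short[] frame : frames) {
            if (frame.length != length) {
                throw new IllegalArgumentException("All frames must have the same length");
            }
        }
        short[] mixed = new short[length];
        for (int i = 0; i < length; i++) {
            int sum = 0; // an int accumulator avoids overflow while adding
            for (short[] frame : frames) {
                sum += frame[i];
            }
            mixed[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return mixed;
    }

    public static void main(String[] args) {
        short[] alice = {1000, -2000, 30000};
        short[] bob = {500, -500, 10000};
        // The last sample exceeds the 16-bit range and is clamped.
        System.out.println(Arrays.toString(mix(Arrays.asList(alice, bob)))); // prints [1500, -2500, 32767]
    }
}
```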

Enhanced manipulation of media is generally a very specialized task that is typically implemented by specific network elements called media servers. These elements typically offer a set of basic primitives that can be combined to deliver richer functionality. As it happens, in many cases, these primitives are accessible via SIP signaling. Thus, creation of multimedia services, be they pure signaling services or media services, in the end boils down to manipulating SIP signaling—that is, to creating SIP services.

SIP Services and SIP Entities

A SIP service is a piece of logic on top of a SIP entity that delivers an enhanced functionality. Depending on the requested functionality, SIP services may sit on top of a User Agent (UA), a proxy, or a Back-to-Back User Agent (B2BUA).

An example of a service sitting on top of a User Agent could be a simple wake-up call (Figure 5.1). This kind of service allows users to program when they want to receive a call that takes them out of bed in the morning. The machine that generates the call implements a simple logic on top of a UA. The simple logic consists of periodically polling a database that hosts the identity of users and the time when they want to be alerted. Once the application determines that a particular user needs to be alerted, it will use the call-initiation function that the underlying User Agent offers.

Figure 5.1. 
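The wake-up logic described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the CallInitiator interface stands in for whatever call-initiation function the underlying UA offers, an in-memory map stands in for the database, and all names are hypothetical.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical view of the call-initiation function offered by the underlying UA. */
interface CallInitiator {
    void initiateCall(String targetUri);
}

/** Wake-up logic: polls the stored alarm times and triggers the calls that are due. */
final class WakeUpService {
    // Stands in for the database that hosts user identities and alert times.
    private final Map<String, LocalTime> alarms = new LinkedHashMap<>();
    private final CallInitiator userAgent;

    WakeUpService(CallInitiator userAgent) {
        this.userAgent = userAgent;
    }

    void schedule(String userUri, LocalTime time) {
        alarms.put(userUri, time);
    }

    /** One polling cycle; a deployed service would run this periodically. */
    void poll(LocalTime now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, LocalTime> entry : alarms.entrySet()) {
            if (!entry.getValue().isAfter(now)) {
                due.add(entry.getKey());
            }
        }
        for (String userUri : due) {
            alarms.remove(userUri);          // alert each user only once
            userAgent.initiateCall(userUri); // delegate to the UA
        }
    }

    public static void main(String[] args) {
        WakeUpService service = new WakeUpService(uri -> System.out.println("Calling " + uri));
        service.schedule("sip:alice@example.com", LocalTime.of(7, 0));
        service.poll(LocalTime.of(6, 59)); // nothing is due yet
        service.poll(LocalTime.of(7, 0));  // Alice's call is initiated
    }
}
```

Keeping the service logic behind a small interface means the same code could sit on top of any UA implementation that can place a call.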

A voice-mail application is another example of service built on top of a UA (Figure 5.2). In this case, the application is able to receive calls and record or play a message.

Figure 5.2. 

SIP proxies can also host applications—for instance, a simple Call Forwarding on Busy (CFB) service (Figure 5.3). The service would receive an indication from the underlying proxy entity that the called user is busy, and it would then query a database to retrieve what the new address is to which the call needs to be diverted. Once obtained, the application would ask the underlying proxy to reroute the call to the new destination.

Figure 5.3. 
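The CFB flow just described might look as follows in code. This is a sketch, not a real proxy API: the ProxyRouting interface, the in-memory forwarding table, and the callback name are all assumptions made for illustration. The only protocol fact used is the SIP status code 486 (Busy Here).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical rerouting function exposed by the underlying proxy. */
interface ProxyRouting {
    void forwardTo(String callId, String newTargetUri);
}

/** Call Forwarding on Busy logic sitting on top of a proxy. */
final class CallForwardOnBusy {
    static final int BUSY_HERE = 486; // SIP status code "Busy Here"

    // Stands in for the database that stores forwarding addresses.
    private final Map<String, String> forwardingTable = new HashMap<>();
    private final ProxyRouting proxy;

    CallForwardOnBusy(ProxyRouting proxy) {
        this.proxy = proxy;
    }

    void setForwarding(String calledUri, String forwardUri) {
        forwardingTable.put(calledUri, forwardUri);
    }

    /**
     * Invoked by the proxy when a final response arrives for a call.
     * Returns true if the call was rerouted, false if the proxy should
     * carry on with its normal processing.
     */
    boolean onFinalResponse(String callId, String calledUri, int statusCode) {
        if (statusCode != BUSY_HERE) {
            return false;
        }
        String target = forwardingTable.get(calledUri);
        if (target == null) {
            return false; // the called user has no forwarding address
        }
        proxy.forwardTo(callId, target);
        return true;
    }

    public static void main(String[] args) {
        CallForwardOnBusy cfb = new CallForwardOnBusy(
                (callId, target) -> System.out.println("Rerouting " + callId + " to " + target));
        cfb.setForwarding("sip:alice@example.com", "sip:alice-mobile@example.com");
        cfb.onFinalResponse("call-1", "sip:alice@example.com", BUSY_HERE);
    }
}
```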

SIP services can also be hosted on B2BUAs. As a matter of fact, this is the most common arrangement for services that a service provider delivers from the network, because it is the approach that gives the most flexibility to manipulate the SIP signaling among different endpoints. When a service requires complex signaling manipulation, B2BUAs are typically used. One example is a Click-to-Dial (CTD) service, in which a user who is browsing a web site that sells sports articles can click on a “customer care” link and have the server automatically establish a call between that user and the customer-care center (Figure 5.4).

Figure 5.4. 
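A Click-to-Dial service can be pictured as logic that asks the B2BUA for two call legs and then connects them. The sketch below assumes a hypothetical CallLegControl interface for the B2BUA; the RecordingB2bua class is only a stand-in that logs the requests so the example can run without any SIP stack.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical view of the call-leg functions offered by the underlying B2BUA. */
interface CallLegControl {
    String createLeg(String targetUri);             // places a call and returns a leg identifier
    void bridge(String firstLeg, String secondLeg); // connects the two legs
}

/** Click-to-Dial logic: one leg toward the user, one toward customer care. */
final class ClickToDial {
    private final CallLegControl b2bua;
    private final String customerCareUri;

    ClickToDial(CallLegControl b2bua, String customerCareUri) {
        this.b2bua = b2bua;
        this.customerCareUri = customerCareUri;
    }

    /** Invoked when the user clicks the "customer care" link on the web site. */
    void onLinkClicked(String userUri) {
        String userLeg = b2bua.createLeg(userUri);
        String agentLeg = b2bua.createLeg(customerCareUri);
        b2bua.bridge(userLeg, agentLeg);
    }
}

/** Stand-in B2BUA that only records what it was asked to do. */
final class RecordingB2bua implements CallLegControl {
    final List<String> log = new ArrayList<>();
    private int legs = 0;

    @Override
    public String createLeg(String targetUri) {
        String legId = "leg-" + (++legs);
        log.add("create " + legId + " -> " + targetUri);
        return legId;
    }

    @Override
    public void bridge(String firstLeg, String secondLeg) {
        log.add("bridge " + firstLeg + " " + secondLeg);
    }

    public static void main(String[] args) {
        RecordingB2bua b2bua = new RecordingB2bua();
        new ClickToDial(b2bua, "sip:care@shop.example.com").onLinkClicked("sip:user@example.com");
        b2bua.log.forEach(System.out::println);
    }
}
```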

Terminal-Based or Network-Based SIP Services

Generally speaking, SIP services can sit on the SIP UA owned by the end user or they can be located in the network itself. Some services need to be implemented at the endpoints (think, for instance, of a gaming application). However, there are many cases where the application can be hosted either by the customer (in the terminal) or by the service provider (in the network). The decision about where to place a particular service is not necessarily of a technical nature. For instance, in some cases, it is desirable to have strong control over end users and over the services offered to them, which would call for a network-based approach. An enterprise SIP network infrastructure that offers PBX (Private Branch Exchange)-like services to the employees is one example of this type of scenario; another is a SIP network owned by a telecom operator that keeps control over the services invoked by the users in order to charge them accordingly.

In other situations, such as an Internet-wide deployment of multimedia services, the control relationship between the service provider and the end user might be weaker, and users would have complete freedom to implement whatever services they want in their terminals, assuming only basic functionality in the network nodes.

As we discussed in Chapter 3, the Internet and the SIP end-to-end approach very much advocate a decentralized model; however, SIP is flexible enough to also accommodate other scenarios that place some intelligence in the network. These types of scenarios are very appealing to telecom operators because they can then offer their customers the benefits of Internet-based applications while still retaining the control that allows them to continue making money out of new services. An example of this approach is the IMS (IP Multimedia Subsystem) network, which represents the paradigm for the application of SIP technology by telecom operators. The IMS will be looked at in Chapter 24.

In order to let the reader understand these concepts, we will next see an example of SIP service that can be implemented either in the end user’s terminal or in the network. We will consider an Incoming Call Screening (ICS) service that blocks incoming calls to a particular user, say Alice, only when they come from specific users; for instance, Alice might want to block all incoming calls from John.

Option A: Implementation at Alice’s Terminal

An application running on top of Alice’s UA might receive the indication of a new incoming call, and check if the call originator is in a list for barring. If it is, the application would automatically reject the call with a suitable status code. If it is not, the application would allow the normal UA processing to continue. This is shown in Figure 5.5.

Figure 5.5. 

Option B: Implementation at Alice’s SIP Inbound Proxy

In this case, when the call goes through Alice’s inbound proxy, an application sitting on top of it checks the identity of the originator against a blacklist, and rejects the call if the calling party is on the list. This is shown in Figure 5.6.

Figure 5.6. 
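In both options the screening decision itself is the same; only the entity that hosts it differs. A minimal sketch of that shared logic is shown next. The class name is hypothetical, and the use of status code 603 (Decline) as the "suitable status code" is an assumption; other codes could be chosen.

```java
import java.util.HashSet;
import java.util.OptionalInt;
import java.util.Set;

/** Incoming Call Screening logic that could sit on top of Alice's UA or of her inbound proxy. */
final class IncomingCallScreening {
    static final int DECLINE = 603; // SIP status code "Decline"; choosing this code is an assumption

    private final Set<String> barredCallers = new HashSet<>();

    void bar(String callerUri) {
        barredCallers.add(callerUri);
    }

    /**
     * Returns the status code with which the call should be rejected,
     * or an empty value if normal processing should continue.
     */
    OptionalInt screen(String callerUri) {
        return barredCallers.contains(callerUri)
                ? OptionalInt.of(DECLINE)
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        IncomingCallScreening alice = new IncomingCallScreening();
        alice.bar("sip:john@example.com");
        System.out.println(alice.screen("sip:john@example.com").isPresent()); // prints true
        System.out.println(alice.screen("sip:mary@example.com").isPresent()); // prints false
    }
}
```

Whether the host is a UA (Option A) or a proxy (Option B), it would call screen() when an INVITE arrives and either send the rejection or let the call proceed.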

Aspects to Consider

When deciding whether to go for a network-based or a terminal-based approach, there are a number of aspects to consider:

Control on Users

Some service providers may want to have enough control on their customers and on the services that are delivered to them. In this case, services are better implemented as part of the network infrastructure. The service provider delivers services on behalf of the communicating users and can charge for them.

Intelligence in the Terminals

Although SIP terminals may have a huge amount of functionality that enables a lot of terminal-based services, there are basic and cheap terminals that support just the basic SIP functionality and cannot host applications on top. In order to deliver services to users with these terminals, a network-based approach is needed.

Service Homogeneity

In some cases, the potential users of a service may have very different types of SIP endpoints, with very different capabilities from each other. In order to offer all the users a homogeneous user experience, the best approach is to let the application be implemented in the network, once and the same for everyone.

End-User Availability

There are situations where a particular call-terminating handling mechanism needs to be applied. In those cases, if the service logic is implemented in the called terminal and the terminal is not available, the service logic cannot be executed. If, on the other hand, the service logic sits on the network, it can have enough intelligence to detect situations where the called user is unavailable, and can then apply alternative call-terminating actions such as forwarding to email.

The aspects just mentioned are particularly relevant in scenarios that involve private networks or telecom operators’ networks. On the other hand, the model that best suits Internet-wide deployments is one where intelligence is fully distributed in the terminals, a model true to the freedom paradigm exemplified by the Net of nets.

Application Servers

An application that is hosted in the network typically resides in a SIP Application Server (AS). If the application requires specific media handling, a media server can sometimes be deployed under the control of the AS. We will use the term SIP application server in its broadest sense to denote a network element that hosts a SIP application. As we saw in previous sections, such an element can be based on a SIP UA, proxy, or B2BUA, with B2BUAs being the entities on which application servers are most frequently based.

SIP Programming Interfaces

In order to ease the development of SIP applications, programmers typically use Application Programming Interfaces (APIs) that encapsulate specific aspects of the SIP functionality so programmers can concentrate on the application service logic.

There are many ways to categorize the interfaces used for SIP programming. For instance, a first classification might split them into proprietary versus standard ones. SIP vendors may decide to expose the SIP functionality in their product by defining their own API that can be used only within the vendor’s platform. On the other hand, a number of standardization bodies have defined a number of standard APIs for SIP-application development. The standard approach helps boost innovation and cost reduction by addressing a much larger community of developers as compared with the proprietary approach.

Another possible high-level categorization of SIP APIs might refer to the level of abstraction the interface provides. For instance, high-level APIs completely hide the SIP functionality, and offer developers an abstract programming model that is largely decoupled from the protocol and network concepts. On the other hand, low-level APIs give the programmer the capability to manipulate the SIP protocol objects at the lowest level: messages, headers, and so on.

The main advantage of low-level interfaces is the power and flexibility they offer in developing any SIP application. The main disadvantage refers to the fact that low-level programming is difficult, time-consuming, and requires the programmers to have a significant understanding of the underlying protocols.

On the other hand, high-level interfaces are easy to use, permit quick application development, and, “in theory,” do not require programmers to have a deep understanding of network protocols. We say “in theory” because, in the author’s experience, in order to build quality SIP applications, the developer needs to fully understand how the protocol works, even if he or she is using a high-level API. The main drawback of high-level APIs is that, because they abstract a lot of underlying functionality, they do not give the programmer access to the full power of the SIP protocol; hence, complex applications cannot be developed using this type of interface.
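The contrast between the two styles can be illustrated with a small sketch. It does not use any of the standard APIs discussed later; the class and method names are hypothetical, and the message is a bare INVITE without a body, built as plain text. The low-level method makes the caller supply every protocol detail, whereas the high-level method hides them behind fixed, illustrative values.

```java
/** Contrasts a low-level and a high-level way of starting a call (illustrative only). */
final class InviteSketch {
    private InviteSketch() {}

    /** Low-level style: the caller controls each header of the request. */
    static String buildInvite(String fromUri, String toUri, String host,
                              String branch, String fromTag, String callId) {
        String crlf = "\r\n"; // SIP lines end with CR LF
        return "INVITE " + toUri + " SIP/2.0" + crlf
                + "Via: SIP/2.0/UDP " + host + ";branch=" + branch + crlf
                + "Max-Forwards: 70" + crlf
                + "To: <" + toUri + ">" + crlf
                + "From: <" + fromUri + ">;tag=" + fromTag + crlf
                + "Call-ID: " + callId + crlf
                + "CSeq: 1 INVITE" + crlf
                + "Contact: <sip:" + host + ">" + crlf
                + "Content-Length: 0" + crlf
                + crlf; // an empty line ends the header section
    }

    /** High-level style: the protocol details are chosen for the caller (fixed sample values). */
    static String call(String fromUri, String toUri) {
        return buildInvite(fromUri, toUri, "client.example.com",
                "z9hG4bK-sketch-1", "sketch-tag", "sketch-call-id@client.example.com");
    }

    public static void main(String[] args) {
        System.out.print(call("sip:alice@example.com", "sip:bob@example.com"));
    }
}
```

The high-level call() is shorter to use, but a programmer who needs, say, a different transport in the Via header has to drop down to buildInvite().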

Having access to an interface that encapsulates the functionality of the underlying protocol is not always enough. For instance, if our application is going to run in a network server, and therefore needs to handle many SIP messages concurrently in a robust, scalable, and high-performance way, then there is typically also the need for an additional piece of software that takes care of the nonfunctional aspects and provides the environment (in Java terminology, the container) where applications can be executed.
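To give a feel for what a container adds, the following sketch shows a toy one that hosts applications and dispatches incoming messages to them on a pool of worker threads. It is an assumption-laden illustration: SipApplication and MiniContainer are hypothetical names, messages are plain strings, and real containers also handle concerns such as application lifecycle, timers, and failover.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical contract between the container and a hosted application. */
interface SipApplication {
    void onMessage(String message);
}

/** Toy container: hosts applications and runs their message handling on worker threads. */
final class MiniContainer {
    private final Map<String, SipApplication> applications = new ConcurrentHashMap<>();
    private final ExecutorService workers;

    MiniContainer(int threads) {
        this.workers = Executors.newFixedThreadPool(threads);
    }

    void deploy(String name, SipApplication application) {
        applications.put(name, application);
    }

    /** Hands the message to the named application without blocking the caller. */
    void dispatch(String applicationName, String message) {
        SipApplication application = applications.get(applicationName);
        if (application == null) {
            throw new IllegalArgumentException("No such application: " + applicationName);
        }
        workers.execute(() -> {
            try {
                application.onMessage(message);
            } catch (RuntimeException e) {
                // A failing application must not take the container down.
                System.err.println("Application error: " + e.getMessage());
            }
        });
    }

    /** Stops accepting work and waits for in-flight messages; returns true if all finished. */
    boolean shutdown(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        MiniContainer container = new MiniContainer(4);
        container.deploy("echo", message -> System.out.println("Handled: " + message));
        container.dispatch("echo", "INVITE sip:bob@example.com SIP/2.0");
        container.shutdown(1000);
    }
}
```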

Next, we will list some examples of standardized technologies used in the context of SIP-application development. Some of them provide just the functional interface, whereas others represent a container for SIP applications as well.

Standard APIs

JAIN SIP

JAIN (Java APIs for Integrated Networks) SIP is a Java standard for a low-level SIP interface. It provides access to SIP at the lowest level, that is, at the SIP protocol level. Its programming constructs represent low-level concepts such as messages, headers, parameters, ports, and IP addresses. JAIN SIP is so low level that, as a matter of fact, it can be used to build SIP entities such as UAs, proxies, and B2BUAs. It can also be used to develop applications, but the programmer is left with the burden of having to implement the core SIP entity logic first, and then the application on top, which can be time-consuming and is not the best approach if time to market is crucial.

On the other hand, JAIN SIP, being so low level, gives access to the full power in the SIP protocol and enables the creation of SIP applications of any type.

JAIN SIP is just a functional API. In order to build an application running on a network server, it is also convenient for the programmer to have access to a piece of software (container) that facilitates handling the server-side and nonfunctional aspects. Another option would be for the programmer to build such a software layer on his or her own, which might be a gigantic task that would need to be repeated for every new application.

JAIN SIP is specified in [JSR 032] under the Java Community Process (JCP).[1]

JAIN SDP

SIP programming almost always implies manipulation of SDP (Session Description Protocol) content. JAIN SDP defines a Java interface to facilitate this task.

JAIN SDP corresponds to [JSR 141], but is not yet an approved standard.
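To show the kind of manipulation involved, the sketch below reads a session description by hand, without JAIN SDP. It relies only on the fact that SDP is a sequence of type=value lines (for example, m= lines describe media streams); the class, the method names, and the sample addresses are hypothetical, and ports written in the "port/count" form are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written SDP reading, to illustrate what an SDP API takes care of. */
final class SdpSketch {
    /** Sample session description with one audio and one video stream (illustrative values). */
    static final String SAMPLE = String.join("\r\n",
            "v=0",
            "o=alice 2890844526 2890844526 IN IP4 host.example.com",
            "s=-",
            "c=IN IP4 192.0.2.1",
            "t=0 0",
            "m=audio 49170 RTP/AVP 0",
            "m=video 51372 RTP/AVP 31",
            "");

    private SdpSketch() {}

    /** Returns the values of all lines of the given type, for example 'm' for media descriptions. */
    static List<String> values(String sdp, char type) {
        List<String> result = new ArrayList<>();
        for (String line : sdp.split("\r?\n")) {
            if (line.length() > 2 && line.charAt(0) == type && line.charAt(1) == '=') {
                result.add(line.substring(2));
            }
        }
        return result;
    }

    /** Extracts the port from an m= value such as "audio 49170 RTP/AVP 0". */
    static int mediaPort(String mediaValue) {
        return Integer.parseInt(mediaValue.split(" ")[1]);
    }

    public static void main(String[] args) {
        for (String media : values(SAMPLE, 'm')) {
            System.out.println(media + " -> port " + mediaPort(media));
        }
    }
}
```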

SIP Servlets

The SIP servlets API represents an interface to a Java container for SIP applications, including the functional interface. The functional interface is higher level than the one offered by JAIN SIP. The SIP servlets API is one of the most popular for creating server-side pure SIP applications. It can also be combined with the HTTP (HyperText Transfer Protocol) servlets interface, and offers the programmer a convenient way to develop applications that combine SIP and HTTP protocols.

The SIP servlets API is specified in [JSR 116]. A new version (1.1) is now being developed under [JSR 289].

SIMPLE Instant Messaging

[JSR 165] defines an interface for supporting SIP-based presence and Instant Messaging (IM) services. Presence and IM are covered in Chapter 16.

SIP API for J2ME

[JSR 180] defines a multipurpose SIP API for the Java 2 Platform, Micro Edition. It enables SIP applications to be executed in memory-limited terminals, and it is especially targeted at mobile phones.

JAIN SLEE

JAIN SLEE is another Java standard ([JSR 22] and [JSR 240])—which, in this case, does not provide a functional API, but just the interface to a container for carrier-grade telecommunication applications. Thus, it must be combined with other functional APIs that provide the access to the protocol functionality—for example, JAIN SIP.

The JAIN SLEE container represents a horizontal software layer that can accommodate almost any type of network protocols, thus being the natural choice for building applications that require both legacy signaling protocols—such as IN (Intelligent Network)[2] protocols—as well as newer Internet-based protocols such as SIP.

IMS API

The IMS API is currently being developed under [JSR 281]. It is intended to be used by application developers who wish to build Java applications for terminals that use the IP Multimedia Subsystem. It is targeted at the Java Platform, Micro Edition (JME).

It offers the programmer access to the IMS enablers such as presence, Push-to-Talk (PTT), and XML (Extensible Markup Language). These and other IMS enablers will be described in Chapter 24.

OSA/PARLAY

OSA/Parlay APIs represent a family of standard interfaces that abstract the whole functionality in a telecommunication network (not just limited to SIP). As such, they cover very different aspects—for example, call control, user interaction, mobility, messaging, presence, and charging. A full list of the different OSA/Parlay APIs is defined in Table 5.1.

Table 5.1. 

OSA/Parlay API Name                     3GPP Technical Specification
Generic Call Control                    3GPP TS 29.998-04-2
Multiparty Call Control                 3GPP TS 29.998-04-3
Multimedia Call Control                 3GPP TS 29.998-04-4
User Interaction                        3GPP TS 29.998-05
Mobility                                3GPP TS 29.998-06
Terminal Capabilities                   3GPP TS 29.998-07
Data Session Control                    3GPP TS 29.998-08
Generic Messaging                       3GPP TS 29.998-09
Connectivity Manager                    3GPP TS 29.998-10
Account Management                      3GPP TS 29.998-11
Charging                                3GPP TS 29.998-12
Policy Management                       3GPP TS 29.998-13
Presence and Availability Management    3GPP TS 29.998-14
Multimedia Messaging Service            3GPP TS 29.998-15

These APIs are all service APIs rather than protocol APIs, and so they provide a significant level of abstraction over the network functionality. For instance, the OSA/Parlay call-control API allows programmers to develop applications that manage calls irrespective of whether they are traditional circuit-switched calls or SIP-based VoIP (Voice over Internet Protocol) calls.
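The idea of a service API that hides the network type can be sketched as follows. This is not the OSA/Parlay call-control interface; CallControl and the other names are hypothetical, and the implementations only return a description of what they would do. The point is that the application code in CustomerCallback is identical for both network types.

```java
/** Hypothetical network-independent call-control interface (not the actual OSA/Parlay API). */
interface CallControl {
    String connect(String callerAddress, String calleeAddress);
}

/** Would drive a traditional circuit-switched network. */
final class CircuitSwitchedCallControl implements CallControl {
    @Override
    public String connect(String callerAddress, String calleeAddress) {
        return "circuit-switched call " + callerAddress + " -> " + calleeAddress;
    }
}

/** Would drive a SIP-based VoIP network. */
final class SipCallControl implements CallControl {
    @Override
    public String connect(String callerAddress, String calleeAddress) {
        return "SIP call " + callerAddress + " -> " + calleeAddress;
    }
}

/** Application logic written once against the abstract interface. */
final class CustomerCallback {
    private CustomerCallback() {}

    static String callBack(CallControl network, String agent, String customer) {
        return network.connect(agent, customer);
    }

    public static void main(String[] args) {
        System.out.println(callBack(new CircuitSwitchedCallControl(), "agent-1", "customer-7"));
        System.out.println(callBack(new SipCallControl(), "agent-1", "customer-7"));
    }
}
```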

OSA/Parlay standards are developed in a joint effort by both the Parlay Group [PARLAY] and the ETSI (European Telecommunications Standards Institute) and 3GPP (3rd Generation Partnership Project) in the OSA initiative.

PARLAY X

A number of factors have made the OSA/Parlay interfaces not as successful as many people had claimed they would be. On one hand, OSA/Parlay is based on CORBA (Common Object Request Broker Architecture),[3] and this has proved to hinder the operation of the API in an Internet environment. On the other hand, what many people see as the main problem in these interfaces is that they are neither high level nor low level. They stay in the intermediate ground. What this means is that programmers who need full control on the SIP signaling typically revert to low-level interfaces such as JAIN SIP or SIP servlets, whereas, for those programmers who desire a high level of abstraction, the OSA/Parlay APIs seem too complicated.

The Parlay-X APIs intend to resolve this situation by providing a very-high-level set of interfaces that are based on web services, and thus are suitable for Internet-wide operation.

Parlay-X interfaces are developed by the Parlay Group and adopted by 3GPP as part of the OSA initiative. The Parlay Group and 3GPP will work jointly in the future to further develop the specifications.

A full list of the different OSA Parlay-X Web Services APIs is defined in Table 5.2.

Table 5.2. 

OSA/Parlay X Web Services API Name    3GPP Technical Specification
Third Party Call                      3GPP TS 29.199-02
Call Notification                     3GPP TS 29.199-03
Short Messaging                       3GPP TS 29.199-04
Multimedia Messaging Service          3GPP TS 29.199-05
Payment                               3GPP TS 29.199-06
Account Management                    3GPP TS 29.199-07
Terminal Status                       3GPP TS 29.199-08
Terminal Location                     3GPP TS 29.199-09
Call Handling                         3GPP TS 29.199-10
Audio Call                            3GPP TS 29.199-11
Multimedia Conference                 3GPP TS 29.199-12
Address List Management               3GPP TS 29.199-13
Presence                              3GPP TS 29.199-14

In addition to the standard interfaces mentioned so far, there are many proprietary interfaces (almost every vendor has its own).

Open-Source Implementations

It is worth mentioning that there exist some open-source reference implementations for some of the previous APIs. Here are some examples:

  • The NIST (National Institute of Standards and Technology)[4] reference implementation of the JAIN SIP API is quite popular, and is being successfully used nowadays in a number of commercial projects around the world. It can be downloaded from [JSIP].

  • The Mobicents initiative represents the open-source approach for a JAIN SLEE implementation. It can be downloaded from [MOBICENTS].

  • The Jiplet API is a nonstandard SIP servlet-like API that uses JAIN SIP as the functional interface to the SIP protocol, and for which there exists an open-source implementation. It can be downloaded from [JIPLETS].

In addition to these, there are many open-source implementations of SIP entities—such as proxies, User Agents, and so on—that offer proprietary interfaces for building applications or extending the functionality in the product. For instance, the OpenSER [OPENSER] project offers an open-source implementation of a SIP server, and provides low-level, nonstandard APIs for SIP-application development.

Media-Programming APIs

In order to create applications that manipulate media, we can consider two main scenarios:

  1. The programmer wants to develop a multimedia application that has direct access to the media capabilities in the underlying platform. An example of such a scenario could be a voice and video soft-phone application.

  2. The programmer wants to develop an application that makes use of the existing media-handling capabilities present in an external platform, called a media server. An example of this scenario could be a conferencing application where the media mixer resides in a different platform than the controller of the conference.

In the first case, the programmer needs an API that exposes the media capabilities of the underlying platform. This can be a proprietary-platform API or a standard cross-platform API. An example of the latter could be the Java MMAPI (Mobile Media API) or JMF (Java Media Framework) interfaces.

In the second case, the programmer can leverage the fact that most media servers offer a control interface. Most frequently, such control interface is based on SIP; therefore, SIP APIs can be used for that purpose. In other cases, the control interface is based on protocols such as MEGACO (Media Gateway Control), for which specific protocol APIs need to be used. Yet another approach is to use protocol-agnostic APIs for media server control. There is currently work in progress to define such an API for Java under [JSR 309].

Next, we briefly describe two of the main Java APIs used for direct media manipulation: MMAPI and JMF.

Mobile Media API

MMAPI specifies a multimedia API for the Java Platform, Micro Edition. It is intended to be used in mobile terminals, and allows simple, easy access and control of basic audio and multimedia resources.

MMAPI is specified in [JSR 135].

Java Media Framework

JMF is a Java API that extends the Java Platform, Standard Edition, and enables audio, video, and other time-based media to be added to Java applications. It allows programmers to develop Java code to capture, present, store, and process time-based media.

JMF is specified in [JSR 908].

APIs Used in This Book

This book is about learning Internet multimedia applications. As such, it is very much focused on SIP as the fundamental protocol in the signaling layer. In order to let the reader better understand the SIP concepts, we have decided to include programming examples that use a protocol-level API, more specifically JAIN SIP. From a learning perspective, we believe this is the best approach because it allows readers to take the SIP concepts learned in the theory and directly map them to software constructs by using a general-purpose programming language. The JAIN SIP API very much resembles the internal layered architecture of the SIP protocol, and therefore is ideally suited to put into practice the ideas that we will learn about SIP.

Moreover, given that the JAIN SIP reference implementation from NIST is freely downloadable from the web, this also proves to be a cost-effective approach for readers.

For the media part, we will use the JMF API, a Java-based API that is powerful yet very easy to use for building media applications. The reference implementation of the API from Sun Microsystems and IBM can also be freely downloaded from the web.

Summary

The signaling concept represents a powerful enabler for service creation in the IP communications space. SIP represents a very flexible tool for service creation. Many different standards for APIs exist for SIP-service creation, each of them best suited for a particular scenario. Throughout this book, we will combine the theoretical explanations of the protocols with practical programming examples that help readers to better understand the theoretical concepts. With this chapter, the first part of this book (Fundamentals) is brought to an end. This part has been intended to give readers an overview of what they will find in the rest of the book, as well as to provide the key background information needed to proceed with the next part. The second part of this book (Core Protocols) dives in detail into the actual protocols used in multimedia applications, with a focus on SIP, SDP, RTP (Real-time Transport Protocol), and MSRP (Message Session Relay Protocol). In the next chapter, we start with SIP.



[1] The Java Community Process is a formalized process that allows interested parties to be involved in the definition of future versions and features of the Java platform. New proposed specifications and technologies to be added to the Java platform are defined in Java Specification Requests (JSRs).

[2] IN represents a set of telecommunication standards providing the means to deliver enhanced applications on top of circuit-switched network equipment. Among other things, the standards define the interface between a Service Control Point (SCP: entity where the services are hosted) and an SSP (Service Switching Point) that sits on the voice switch.

[3] CORBA is a standard architecture for distributed objects that maximizes interoperability by allowing the objects to be written in different programming languages.

[4] NIST is a nonregulatory agency of the U.S. Department of Commerce’s Technology Administration. Its mission is to promote U.S. innovation and industrial competitiveness.
