Chapter 9. Session Description

In this chapter, we will look at how multimedia sessions can be described. First we will explain the need for describing sessions and list some examples of scenarios in which sessions need to be described. Then we will focus on the Session Description Protocol, which defines the syntax for describing multimedia sessions. After that, the SDP offer/answer model will be explained. It describes a procedure for the exchange of session descriptions between communicating parties that is crucial for enabling IP multimedia communication services. We will also show how SDP is used in some particular IP communication scenarios.

Once the theory has been presented, we will also include an SDP programming section that shows the reader how to programmatically build and parse session descriptions. As part of this section, we will build a simple Java component that will be reused in our soft-phone project in Chapter 12.

The Purpose of Session Description

[RFC 4566] defines a multimedia session as “a set of multimedia senders and receivers and the data streams flowing from senders to receivers.” This is not a very specific definition, which highlights the fact that a multimedia session can be many different things. Examples of sessions[1] are:

  • a voice over IP call

  • a multicast conference in the Internet

  • the exchange of a series of related instant messages between two parties

  • an online game

  • a video-on-demand streaming session

  • the online transfer of an image as a TCP data stream between two communicating parties

These scenarios represent examples of IP multimedia communication or streaming services, and, in all of them, the session concept is used. In all of them, there is a need to describe the characteristics of the session and then to convey that information to the participants of the session. Such a description of the session would include parameters such as media types, transport addresses, start time and duration of the session, and so on, the knowledge of which is crucial for the participants in the session. Let’s take, for instance, the example of a multicast conference in the Internet. In order for a user to be able to participate, he or she needs to know:

  • on what multicast address they need to listen (that is, what “channel” they need to tune in to)

  • what media types and codecs the conference will use (otherwise, he or she will be unable to decode the media information correctly)

  • at what time the conference will start

  • etc.

These, and others, are parameters that characterize the conference and that need to be conveyed among the participants before the conference starts.

Also, in peer-to-peer real-time communication scenarios, there is a need to exchange session descriptions. Take, for instance, a two-party voice call. Before the actual voice transmission can start, the participants need to learn what IP addresses and ports they need to send the media packets to. Moreover, they also need to agree on what voice codec to use for transmission and reception, and so forth.

These examples highlight:

  1. The need to find a common format for describing sessions.

  2. The need to find a mechanism (protocol) for delivering those session descriptions among the participants.

The SDP specification covers the first point as it defines a general-purpose format for describing multimedia sessions. SDP is used in a variety of scenarios such as streaming services, real-time communication services, or Internet multicast conferences.

The second need is covered by different protocols depending on the particular service. For instance, we have seen that, for real-time communications, SIP is the protocol used to carry session descriptions, so, in those cases, SDP is included as content in the SIP message. In the streaming cases, the SDP content is embedded in RTSP (Real Time Streaming Protocol) messages exchanged between client and server, whereas, in Internet multicast conferences, it is included in Session Announcement Protocol (SAP) [RFC 2974] messages in order to announce a multicast conference to potential participants.

In the case of multicast sessions, alternative ways of conveying session descriptions include email or the web, so that applications for participating in a session can be launched automatically from the WWW client or email reader in a standard manner.

In the rest of this chapter (and the book!), we will focus on the utilization of SDP just in the remit of real-time IP communication services. In those cases, SDP is typically carried by SIP.

The Session Description Protocol (SDP)

Origins of SDP

SDP was originally conceived to describe multicast sessions on the MBone. In those scenarios, the SDP session descriptor was distributed among the potential participants using the Session Announcement Protocol [RFC 2974]. The SDP included, among other things, information about the multicast address for the media and the set of codecs used in the session.

SDP is used in many scenarios, including streaming, IP communications, and others. In order to apply SDP to IP communication scenarios, it is necessary to extend its semantics. For example, in a multicast conference, it is necessary only to convey a single multicast address for a particular media stream, whereas, in the case of a two-party communication, two unicast addresses are needed.

Moreover, there is also the need to define the operational details of how to use SDP in communication scenarios. In the case of multicast conferences, the codecs to be used for the session are simply indicated in the SDP sent to the participants, whereas, in the case of IP communications, the parties need to agree on the set of codecs to use; therefore, some negotiation needs to occur.

In the next sections, we will see the SDP syntax and semantics that are applicable to IP communication services, as well as a limited negotiation framework defined by [RFC 3264].

SDP Overview

SDP is specified in [RFC 4566]. As has already been mentioned, SDP does not define a true protocol, but rather, a language for representing the key parameters that characterize a multimedia session.

SDP is text based. An SDP message consists of a set of lines of text of the form:

<type> = <value>

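where <type> is a single character, and <value> is structured text whose format depends on <type>. On the wire, [RFC 4566] does not permit whitespace on either side of the “=” sign; the listings in this chapter show it spaced out. An example of an SDP message is shown next.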

v = 0
o = alice 2890844526 2890842807 IN IP4 1.2.3.4
s =
c = IN IP4 1.2.3.4
t = 0 0
m = audio 49170 RTP/AVP 0
a = sendrecv
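The <type>=<value> structure makes an SDP message easy to split mechanically. The following is a minimal sketch in plain Java that uses no SDP library; the class name SdpLineSplitter and its members are hypothetical names chosen for illustration. It assumes the on-the-wire form (no whitespace around “=”) but also tolerates the spaced form shown in the listings of this chapter.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name): splits SDP text into type/value lines. */
class SdpLineSplitter {

    /** One SDP line: a single-character type and its value. */
    static final class Line {
        final char type;
        final String value;

        Line(char type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    /** Splits the text on CRLF or LF and parses every non-empty line. */
    static List<Line> split(String sdp) {
        List<Line> lines = new ArrayList<>();
        for (String raw : sdp.split("\r?\n")) {
            String text = raw.trim();
            if (text.isEmpty()) {
                continue;
            }
            int eq = text.indexOf('=');
            // The type is whatever precedes '=' and must be exactly one character.
            String type = eq < 0 ? "" : text.substring(0, eq).trim();
            if (type.length() != 1) {
                throw new IllegalArgumentException("Not an SDP line: " + raw);
            }
            lines.add(new Line(type.charAt(0), text.substring(eq + 1).trim()));
        }
        return lines;
    }

    public static void main(String[] args) {
        String sdp = "v=0\r\n"
                + "o=alice 2890844526 2890842807 IN IP4 1.2.3.4\r\n"
                + "s= \r\n"
                + "c=IN IP4 1.2.3.4\r\n"
                + "t=0 0\r\n"
                + "m=audio 49170 RTP/AVP 0\r\n"
                + "a=sendrecv\r\n";
        for (Line line : split(sdp)) {
            System.out.println(line.type + " -> [" + line.value + "]");
        }
    }
}
```

The sketch only separates types from values; interpreting each value is left to type-specific code, since the format of the value depends on the type.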

An SDP message contains three levels of information:

  • Session-level description: contains lines that describe characteristics of the whole session.

  • Time description: contains lines indicating time-related aspects of the session.

  • Media description: contains lines that characterize the different media present in the session.

Tables 9.1, 9.2, and 9.3, taken from [RFC 4566], show the different types of lines for each level, indicating whether the field is required (R) or optional (O).

Table 9.1. Session-Level Description SDP Lines

Field | Description | R/O
v | Protocol version | R
o | Originator and session identifier | R
s | Session name | R
i | Session information | O
u | URI of description | O
e | Email address | O
p | Phone number | O
c | Connection information[a] | O
b | Bandwidth information | O
z | Time zone adjustments | O
k | Encryption key | O
a | Session attribute | O

[a] Not required if included in all media.

Table 9.2. Time-Level Description SDP Lines

Field | Description | R/O
t | Time the session is active | R
r | Repeat time | O

Table 9.3. Media-Level Description SDP Lines

Field | Description | R/O
m | Media name and transport addr. | R
i | Media title | O
c | Connection information[a] | R
b | Bandwidth information | O
k | Encryption key | O
a | Attribute line | O

[a] Optional if included at session level.
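The three levels can be recovered from the order of the lines alone: everything before the first m-line is session-level or time-level, and each m-line opens a new media description. The following is a minimal sketch of that grouping in plain Java; the class name SdpLevels and its fields are hypothetical, and the code does not check that required lines are present.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name): groups the lines of an SDP message by level. */
class SdpLevels {
    final List<String> sessionLevel = new ArrayList<>();
    final List<String> timeLevel = new ArrayList<>();
    /** One inner list per media description; each begins with its m-line. */
    final List<List<String>> mediaLevel = new ArrayList<>();

    static SdpLevels of(String sdp) {
        SdpLevels levels = new SdpLevels();
        List<String> currentMedia = null;
        for (String raw : sdp.split("\r?\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            char type = line.charAt(0);
            if (type == 'm') {
                // Every m-line opens a new media description.
                currentMedia = new ArrayList<>();
                levels.mediaLevel.add(currentMedia);
            }
            if (currentMedia != null) {
                // After the first m-line, lines belong to the latest media description.
                currentMedia.add(line);
            } else if (type == 't' || type == 'r') {
                levels.timeLevel.add(line);
            } else {
                levels.sessionLevel.add(line);
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        SdpLevels levels = of("v=0\r\no=- 1 1 IN IP4 1.2.3.4\r\ns= \r\n"
                + "c=IN IP4 1.2.3.4\r\nt=0 0\r\n"
                + "m=audio 49170 RTP/AVP 0\r\na=sendrecv\r\n");
        System.out.println("session: " + levels.sessionLevel);
        System.out.println("time:    " + levels.timeLevel);
        System.out.println("media:   " + levels.mediaLevel);
    }
}
```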

Next we will focus on those lines that are mandatory or of relevance for real-time communication services. We will explain the meaning of the different parameters in the remit of this kind of service.

Protocol Version (v-line)

The “v=” line gives the version of the Session Description Protocol.

It is set to 0 for the current version of the spec [RFC 4566].

Example:

v = 0

Origin (o-line)

The “o =” line identifies the originator of the session, and contains the following parameters:

  • username: Name of the originator. In IP communication scenarios, the identity of the user is already conveyed in the From header of the SIP signaling, so this parameter is not needed and may be set to “-”.

  • session id: A numeric string that has to be unique for this session in conjunction with the address. A way to assure uniqueness is to use NTP (Network Time Protocol) [RFC 1305] timestamp values.

  • session version: A version number for the session description data. Again it is recommended to use NTP timestamp values.

  • network type: It is set to “IN” for Internet.

  • address type: It may be IP4 or IP6.

  • unicast address: The sender’s IP address.

An example would be:

o = alice 2890844526 2890842807 IN IP4 1.2.3.4
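As a rough illustration of the recommendation to use NTP timestamps, the sketch below derives an NTP-style seconds value from the system clock and assembles an o-line from it. The class name OriginLine and its methods are hypothetical; the only fixed fact used is the offset of 2,208,988,800 seconds between the NTP epoch (1900) and the Unix epoch (1970). The line is emitted in the on-the-wire form, without spaces around “=”.

```java
import java.time.Instant;

/** Illustrative helper (hypothetical name): builds an o-line using NTP-style timestamps. */
class OriginLine {

    /** Seconds between the NTP epoch (1 January 1900) and the Unix epoch (1 January 1970). */
    static final long NTP_UNIX_OFFSET_SECONDS = 2208988800L;

    /** Converts an instant to whole seconds since the NTP epoch. */
    static long ntpSeconds(Instant instant) {
        return instant.getEpochSecond() + NTP_UNIX_OFFSET_SECONDS;
    }

    /** Assembles the o-line; the network type is fixed to "IN" (Internet). */
    static String build(String username, long sessionId, long sessionVersion,
                        String addressType, String address) {
        return "o=" + username + " " + sessionId + " " + sessionVersion
                + " IN " + addressType + " " + address;
    }

    public static void main(String[] args) {
        long now = ntpSeconds(Instant.now());
        // A new session starts with identical session id and session version.
        System.out.println(build("-", now, now, "IP4", "1.2.3.4"));
    }
}
```

When the description of an existing session is modified, one would keep the session id and use a larger session version, for example a fresh timestamp.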

Session Name (s-line)

The “s =” line conveys the subject of the session. This makes sense for multicast uses of SDP. In the case of unicast (as in IP communication services), a session has no meaningful name, so the value of the s-line is set to a single blank space. [RFC 3264] also allows a dash (“-”) for this purpose, which is the form used in several examples later in this chapter.

Example:

s =

Connection Information (c-line)

The “c =” line contains information about the connection address. That is the address at which the SDP sender expects to receive the incoming media packets.

Example:

c = IN IP4 1.2.3.4

where IN indicates that the following address is an Internet address, and IP4 indicates the version of the IP protocol.

The c-line can be present at session level, in which case it is valid for all the media in the session, or within a media description (after the corresponding m-line), in which case it applies to that particular media and overrides the session-level value.

Time Line (t-line)

The “t=” line conveys the start and stop times of the session. This is again meaningful for multicast sessions, where potential receivers need to know beforehand when the session will start, much as in TV broadcasting. In the case of unicast sessions, it is set to “0 0”.

Example:

t = 0 0

Media and Transport (m-line)

The “m=” line includes information about a particular media.

A session description may contain several m-lines, implying the session may contain several media. Each m-line indicates:

  • The type of media: audio, video, text, message, image, and so on.

  • The port where the sender expects to receive media packets.

  • The protocol to use for media transport: RTP/AVP, UDP, TCP, TCP/MSRP, and so on.

  • The media format.

The interpretation of the media format depends on the actual media transport protocol. When RTP/AVP is used, the media format represents the RTP payload type number (see Chapter 10). The RTP payload type number can be static or dynamic. If it is static, there is a well-known encoding associated with it, so there is no need to include further information about the payload type in the SDP. However, if it is dynamic, there is no fixed association, so an attribute line following the m-line should include more information that characterizes the format.

Dynamic payload types are assigned numbers between 96 and 127.

Next follows an example of an m-line with a static payload type (0), which indicates a PCM μ-law[2] encoding for audio.

m = audio 40000 RTP/AVP 0

The next example shows an m-line with a dynamic type (96).

m = audio 49230 RTP/AVP 96
a = rtpmap:96 L8/8000

Note that in this last case, there is an additional a-line, which is used to describe the type of encoding (L8) and the clock rate dynamically assigned to payload type 96.
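The four elements of an m-line can be pulled apart with simple tokenizing. The following sketch is illustrative only; the class name MediaLine is hypothetical. Formats are kept as strings because they are payload type numbers only for RTP/AVP, and the dynamic range check uses the 96 to 127 range given above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper (hypothetical name): the fields of one m-line. */
class MediaLine {
    final String media;
    final int port;
    final String proto;
    final List<String> formats;

    private MediaLine(String media, int port, String proto, List<String> formats) {
        this.media = media;
        this.port = port;
        this.proto = proto;
        this.formats = formats;
    }

    /** Parses "m=<media> <port> <proto> <fmt> ..."; spaces around '=' are tolerated. */
    static MediaLine parse(String line) {
        String text = line.trim();
        int eq = text.indexOf('=');
        if (eq < 0 || !text.substring(0, eq).trim().equals("m")) {
            throw new IllegalArgumentException("Not an m-line: " + line);
        }
        String[] tokens = text.substring(eq + 1).trim().split("\\s+");
        if (tokens.length < 4) {
            throw new IllegalArgumentException("Incomplete m-line: " + line);
        }
        // The port field may carry a "/<number of ports>" suffix; keep only the port.
        int port = Integer.parseInt(tokens[1].split("/")[0]);
        List<String> formats = new ArrayList<>(Arrays.asList(tokens).subList(3, tokens.length));
        return new MediaLine(tokens[0], port, tokens[2], formats);
    }

    /** True for RTP payload types in the dynamic range (96 to 127). */
    static boolean isDynamicPayloadType(int payloadType) {
        return payloadType >= 96 && payloadType <= 127;
    }

    public static void main(String[] args) {
        MediaLine m = parse("m=audio 49230 RTP/AVP 96");
        int payloadType = Integer.parseInt(m.formats.get(0));
        System.out.println(m.media + " on port " + m.port + " over " + m.proto
                + ", dynamic payload type: " + isDynamicPayloadType(payloadType));
    }
}
```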

Bandwidth (b-line)

The optional “b=” line denotes the proposed bandwidth to be used by the session or media.

A b-line contains two elements:

  1. The bandwidth figure itself expressed, by default, in kilobits per second.

  2. An alphanumeric modifier that gives the meaning to the bandwidth figure.

The values of the modifier more frequently used in IP communications are:

  • AS: Is typically used to specify the total bandwidth (in kilobits per second) allocated for a single media stream from a single site (source).

  • RS: Indicates the requested RTCP bandwidth (in bits per second) allocated to active data senders.

  • RR: Indicates the requested RTCP bandwidth (in bits per second) allocated to receivers.

The use of the modifier values RS and RR is not defined in the base SDP specification [RFC 4566], but in [RFC 3556].

The following SDP session description offers an audio communication, and requests a total bandwidth of 64 kilobits per second. For RTCP senders, the requested bandwidth is 800 bps; for RTCP receivers, it is 2400 bps.

v = 0
o = alice 2890844526 2890842807 IN IP4 1.2.3.4
s =
c = IN IP4 1.2.3.4
t = 0 0
m = audio 49170 RTP/AVP 0
b = AS:64
b = RS:800
b = RR:2400
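Because AS is expressed in kilobits per second while RS and RR are expressed in bits per second, code that compares b-lines has to normalize the units first. The sketch below does so for the three modifiers discussed here; the class name BandwidthLine is hypothetical, and any other modifier is simply treated as using the default unit (kilobits per second).

```java
/** Illustrative helper (hypothetical name): one b-line with its value normalized to bits per second. */
class BandwidthLine {
    final String modifier;
    final long value;

    private BandwidthLine(String modifier, long value) {
        this.modifier = modifier;
        this.value = value;
    }

    /** Parses "b=<modifier>:<value>"; spaces around '=' are tolerated. */
    static BandwidthLine parse(String line) {
        String text = line.trim();
        int eq = text.indexOf('=');
        if (eq < 0 || !text.substring(0, eq).trim().equals("b")) {
            throw new IllegalArgumentException("Not a b-line: " + line);
        }
        String[] parts = text.substring(eq + 1).trim().split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Missing modifier or value: " + line);
        }
        return new BandwidthLine(parts[0].trim(), Long.parseLong(parts[1].trim()));
    }

    /** RS and RR are already in bits per second; everything else is taken as kilobits per second. */
    long bitsPerSecond() {
        boolean rtcpModifier = modifier.equals("RS") || modifier.equals("RR");
        return rtcpModifier ? value : value * 1000L;
    }

    public static void main(String[] args) {
        for (String line : new String[] {"b=AS:64", "b=RS:800", "b=RR:2400"}) {
            BandwidthLine b = parse(line);
            System.out.println(b.modifier + " = " + b.bitsPerSecond() + " bit/s");
        }
    }
}
```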

Attributes (a-line)

Attributes are the primary means for extending SDP. They may be defined to be used as “session-level” attributes, “media-level” attributes, or both.

Some important attributes in the case of IP communication services are:

  • rtpmap: In the case of RTP/AVP transport, it is used to map the payload type in an m-line to some parameters characterizing the payload type, such as the encoding name, clock rate, or encoding parameters. For example, the following a-line maps the dynamic payload type 96 to an L8 encoding at an 8000 Hz sampling rate:

    a = rtpmap:96 L8/8000

  • sendrecv: Indicates that the sender of the media description wants to send and receive media.

  • recvonly: Indicates that the sender of the media description wants only to receive media.

  • sendonly: Indicates that the sender of the media description wants only to send media.

  • inactive: Indicates that the sender of the media description does not want to send or receive media.
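As an illustration of how the rtpmap attribute can be decomposed, the sketch below extracts the payload type, encoding name, clock rate, and optional encoding parameters with a regular expression. The class name RtpMap is hypothetical, and the pattern is a simplification rather than a full implementation of the SDP grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (hypothetical name): the fields of one rtpmap attribute line. */
class RtpMap {
    // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]
    private static final Pattern RTPMAP = Pattern.compile(
            "a\\s*=\\s*rtpmap:(\\d+)\\s+([^/\\s]+)/(\\d+)(?:/(\\S+))?");

    final int payloadType;
    final String encodingName;
    final int clockRate;
    /** Null when the line carries no encoding parameters. */
    final String encodingParameters;

    private RtpMap(int payloadType, String encodingName, int clockRate, String encodingParameters) {
        this.payloadType = payloadType;
        this.encodingName = encodingName;
        this.clockRate = clockRate;
        this.encodingParameters = encodingParameters;
    }

    static RtpMap parse(String line) {
        Matcher m = RTPMAP.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an rtpmap attribute: " + line);
        }
        return new RtpMap(Integer.parseInt(m.group(1)), m.group(2),
                Integer.parseInt(m.group(3)), m.group(4));
    }

    public static void main(String[] args) {
        RtpMap map = parse("a=rtpmap:96 L8/8000");
        System.out.println("payload type " + map.payloadType + " is "
                + map.encodingName + " at " + map.clockRate + " Hz");
    }
}
```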

Next follow some examples of SDP session descriptions for different types of media.

Example IP Communication Sessions Described with SDP

SDP can be used to describe many different types of sessions. Now we will see five possible uses of SDP to describe different types of IP communication sessions. All of these communication sessions can be established via SIP.

Voice and Video

This corresponds to the most classical use of SDP to describe a session composed of “pure” real-time media components such as audio and video transported over RTP. We can see that the SDP message indicates that the RTP audio/video profile should be used (see Chapter 10 for the definition of the audio/video profile). It also includes a media line for audio with PCM codec (payload type = 0), and a media line for video with the H.261 codec (payload type = 31).

v = 0
o = alice 2890844526 2890844526 IN IP4 host.ocean.com
s = -
c = IN IP4 host.ocean.com
t = 0 0
m = audio 49170 RTP/AVP 0
a = rtpmap:0 PCMU/8000
m = video 51372 RTP/AVP 31
a = rtpmap:31 H261/90000

Telephony Tones

In PSTN (Public Switched Telephone Network) scenarios, it is common to use telephony tones. An example of this is the so-called DTMF (Dual-Tone Multi-Frequency) tones, which are standard signals that are generated by pressing an ordinary telephone’s touch keys. These are typically used in scenarios involving interactive voice response (IVR) machines that prompt the user to enter some information (e.g., “If you want information in English, press 1; if you want information in Spanish, press 2”).

Multimedia applications in the Internet may need to send or receive these types of signals, especially in (but not limited to) scenarios that involve interworking with the PSTN.[3] DTMF and telephony tones are considered a particular type of audio media carried over RTP that is separately described using SDP. By using SDP, endpoints in an Internet multimedia communication can signal whether or not they support telephony tones, in case those need to be generated during the conversation.

Next follows an example of SDP that offers support for DTMF (only the relevant lines are shown).

m = audio 42000 RTP/AVP 100
a = rtpmap:100 telephone-event/8000

Transport of telephony tones over RTP is described in [RFC 4733] and [RFC 4734].

Real-time Text

“Real-time text over IP” sessions can be conveyed using RTP with a specific payload type. An example of SDP for a text session might be:

v = 0
o = alice 2890844526 2890844526 IN IP4 host.ocean.com
s = -
c = IN IP4 host.ocean.com
t = 0 0
m = text 49170 RTP/AVP 98
a = rtpmap:98 t140/1000

As we can see, the dynamic payload type 98 is mapped to the T.140 protocol, which is used for describing the content of a text session (see Chapter 10).

Instant Messages (MSRP)

Another type of media that can be described by SDP is the exchange of related instant messages between two parties. Such an exchange is also considered to be a media session. This scenario is called session-based instant messaging, and the media transport protocol in this case is the Message Session Relay Protocol (MSRP). At the time of writing, there is not yet an RFC for MSRP. It is specified in an Internet draft [draft-ietf-simple-message-sessions]. Therefore, it is considered work in progress. The status of the draft is quite advanced, and it is expected that very soon it will become a standards track RFC.

In order to describe an MSRP session using SDP, two new mandatory media-level attributes are defined:

  • path: This attribute always accompanies an MSRP media line. It indicates the MSRP URI of the user agent that sent this session description. An MSRP URI represents the end user address where he or she expects to receive incoming instant messages. An MSRP URI has the form:

    msrp://host:port/session_id;transport

  • accept-types: This is also a mandatory attribute that accompanies an m-line. It indicates the media types that are acceptable to the endpoint. It may indicate wrapper types (e.g., message/cpim) or simple types (e.g., text/plain).

In addition to these new attributes, the connection and media line in an SDP message describing an MSRP session have the following requirements:

c-line

  • The address in this line must coincide with the IP address or FQDN indicated in the path attribute.

m-line

  • The protocol parameter is “TCP/MSRP”.

  • The media field must be “message”. In order to further qualify the media type, the accept-types attribute is used.

  • The port parameter must match the port value used in the MSRP URI in the path attribute.

A curious reader might wonder why some values such as port or FQDN are duplicated in the SDP description. The reason is that actually the c-line and m-line are not used by MSRP devices; however, they need to be there and provide meaningful information for backward compatibility reasons.

Next follows an example of SDP describing an MSRP session:

v = 0
o = - 2890844526 2890844527 IN IP4 alice.ocean.com
s = -
c = IN IP4 alice.ocean.com
t = 0 0
m = message 8341 TCP/MSRP *
a = accept-types: message/cpim text/plain text/html
a = path:msrp://alice.ocean.com:8341/7hr38r3ew;tcp
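The consistency requirements between the path attribute, the c-line, and the m-line can be checked mechanically. The sketch below parses an MSRP URI of the form shown earlier (msrp://host:port/session_id;transport) and compares its host and port with the connection address and media port. The class name MsrpPathCheck is hypothetical, and the pattern assumes a host name or IPv4 address; other URI forms are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (hypothetical name): checks an MSRP path URI against the c-line and m-line. */
class MsrpPathCheck {
    // msrp://host:port/session_id;transport
    private static final Pattern MSRP_URI =
            Pattern.compile("msrp://([^:/]+):(\\d+)/([^;]+);(\\w+)");

    /**
     * Returns true when the host and port of the path URI match the
     * connection address (c-line) and the media port (m-line).
     */
    static boolean consistent(String connectionAddress, int mediaPort, String pathUri) {
        Matcher m = MSRP_URI.matcher(pathUri.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported MSRP URI: " + pathUri);
        }
        String host = m.group(1);
        int port = Integer.parseInt(m.group(2));
        return host.equalsIgnoreCase(connectionAddress) && port == mediaPort;
    }

    public static void main(String[] args) {
        boolean ok = consistent("alice.ocean.com", 8341,
                "msrp://alice.ocean.com:8341/7hr38r3ew;tcp");
        System.out.println("c-line and m-line agree with path: " + ok);
    }
}
```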

TCP Content

The media described by SDP can also represent media that is conveyed as a data stream using the TCP protocol between two communicating parties. The protocol identifier in this case has the value “TCP.” It indicates just the transport protocol, so the m-line must further qualify the application-layer protocol using a format identifier. Furthermore, two new attributes must be defined to describe how and when the TCP connection setup procedure is performed:

  • setup: This attribute indicates which of the endpoints should initiate the TCP connection establishment. It can have the following values:

    • “active”: The endpoint offers to initiate the connection.

    • “passive”: The endpoint offers to receive an incoming connection.

    • “actpass”: The endpoint is willing to accept an incoming connection or to initiate an outgoing connection.

    • “holdconn”: The endpoint does not want the connection to be established for the time being.

  • connection: This attribute indicates if a new connection needs to be established or the already existing one should be reused.[4] It can have the following values, which are straightforward: “new” or “existing.”

The following example shows an SDP for this type of session. In this case, the media type is “image”, the transport protocol is TCP, and the format indicates a T.38 fax application. The setup attribute is active, so the sender is willing to initiate the TCP connection, and the connection attribute is new, which means that a new TCP connection needs to be established.

v = 0
o = alice 2890844526 2890844526 IN IP4 host.ocean.com
s = -
c = IN IP4 host.ocean.com
t = 0 0
m = image 34772 TCP t38
a = setup:active
a = connection:new

The SDP usage to describe TCP media transport is defined in [RFC 4145].

The Offer/Answer Model with SDP

As we said before, the use of SDP in communication scenarios also requires defining a limited negotiation framework so that the communicating parties can agree on the session characteristics, such as which media streams are in the session, the codecs, and so forth. Such a negotiation framework is called the offer/answer model, and is defined in [RFC 3264].

The way it works is quite simple. A party wanting to communicate indicates the desired session description from his or her point of view. That is called the SDP offer. The offer contains, among others:

  • the set of media streams that the offerer wants to use.

  • the desired characteristics of the media streams as qualified by the format parameter and the media-line attributes.

  • the IP addresses and ports where the offerer wants to receive the media.

  • the additional parameters, if needed, that further qualify the media transport.

The other party receives the offer, and replies with an SDP answer. It contains the following pieces of information:

  • whether a media stream is accepted or not.[5]

  • the media stream characteristics that will be used for the session.

  • the IP addresses and ports that the answerer wants to use in order to receive media.

The offerer receives the answer, and, at this point, if the answer accepts at least one media stream, both parties have found an overlap in their respective desired session descriptions, and communication can start.

In addition to the media types and their characteristics, the parameters that can be negotiated using the offer/answer model differ slightly depending on the type of session being established. We will now see three examples of offer/answer model utilization for three different cases of IP communication.

In case of media types that are conveyed using RTP, we have seen that the offer/answer model enables the negotiation of the type of codecs. For other types of media, the offer/answer model allows us to negotiate other parameters.

Voice/Video

Let us assume that John wants to set up a communication with Alice that includes a bidirectional audio stream and two bidirectional video streams, using H.261 (payload type 31) and MPEG (payload type 32). Therefore, his SDP offer will look like:

v = 0
o = john 2890844526 2890844526 IN IP4 host.sea.com
s =
c = IN IP4 host.sea.com
t = 0 0
m = audio 48450 RTP/AVP 0
a = rtpmap:0 PCMU/8000
m = video 52792 RTP/AVP 31
a = rtpmap:31 H261/90000
m = video 53630 RTP/AVP 32
a = rtpmap:32 MPV/90000

Alice does not want to receive or send the first video stream, so she returns the SDP below as the answer:

v = 0
o = alice 2890844730 2890844730 IN IP4 host.alice.com
s =
c = IN IP4 host.alice.com
t = 0 0
m = audio 48950 RTP/AVP 0
a = rtpmap:0 PCMU/8000
m = video 0 RTP/AVP 31
m = video 53700 RTP/AVP 32
a = rtpmap:32 MPV/90000

We can notice that in the answer, the port for the first video stream is set to 0, indicating that Alice does not want to communicate using that particular media. Alice, on the other hand, accepts both the audio and the second video stream, so the communication will start including these two media.

As we can see from this example, in the case of media types that are conveyed using RTP, the offer/answer model enables the negotiation of the type of codecs.
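The mechanics of the answer in this example can be sketched as follows: the answer carries one m-line per offered m-line, in the same order, and a rejected stream keeps its m-line but with the port set to 0. The class name AnswerMediaLines and the convention of passing the answerer's ports as an array are hypothetical. As a simplification, the sketch copies the offered formats unchanged, whereas a real answerer would keep only the formats it supports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper (hypothetical name): derives answer m-lines from offered m-lines. */
class AnswerMediaLines {

    /**
     * Returns one answer m-line per offered m-line, in the same order.
     * localPorts[i] is the answerer's receive port for stream i, or 0 to reject it.
     */
    static List<String> answer(List<String> offeredMediaLines, int[] localPorts) {
        if (offeredMediaLines.size() != localPorts.length) {
            throw new IllegalArgumentException("One port is needed per offered m-line");
        }
        List<String> answer = new ArrayList<>();
        for (int i = 0; i < localPorts.length; i++) {
            String text = offeredMediaLines.get(i).trim();
            int eq = text.indexOf('=');
            String[] tokens = text.substring(eq + 1).trim().split("\\s+");
            if (eq < 0 || tokens.length < 4) {
                throw new IllegalArgumentException("Not an m-line: " + text);
            }
            // Replace the offerer's port with the answerer's port (0 rejects the stream).
            tokens[1] = Integer.toString(localPorts[i]);
            answer.add("m=" + String.join(" ", tokens));
        }
        return answer;
    }

    public static void main(String[] args) {
        List<String> offer = Arrays.asList(
                "m=audio 48450 RTP/AVP 0",
                "m=video 52792 RTP/AVP 31",
                "m=video 53630 RTP/AVP 32");
        // Alice accepts the audio and the second video stream, and rejects the first video stream.
        for (String line : answer(offer, new int[] {48950, 0, 53700})) {
            System.out.println(line);
        }
    }
}
```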

Putting a Media Stream on Hold

Let us assume in our previous example that, once the call is established, John decides to put the audio stream on hold. In our example, media is flowing in both directions, which is the default when the SDP contains no direction attribute (sendrecv, sendonly, recvonly, or inactive). Therefore, in order to put the call on hold, John just needs to send a reINVITE to Alice that includes an SDP with the attribute sendonly for the audio stream. If, later on, he wants to retrieve the call, he just needs to send a new reINVITE and change the SDP attribute to sendrecv, or simply omit the direction attribute.
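The direction handling described here can be sketched in a few lines: a media description without a direction attribute counts as sendrecv, and putting the stream on hold amounts to setting the attribute to sendonly. The class name MediaDirection and its methods are hypothetical, and the sketch only manipulates the lines of one media description; sending the reINVITE is outside its scope.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper (hypothetical name): reads and sets the direction of one media description. */
class MediaDirection {
    static final List<String> DIRECTIONS =
            Arrays.asList("sendrecv", "sendonly", "recvonly", "inactive");

    /** Returns the direction attribute of the media description, or "sendrecv" if none is present. */
    static String directionOf(List<String> mediaDescription) {
        for (String line : mediaDescription) {
            String value = attributeValue(line);
            if (value != null && DIRECTIONS.contains(value)) {
                return value;
            }
        }
        return "sendrecv";
    }

    /** Returns a copy of the media description with exactly one direction attribute. */
    static List<String> withDirection(List<String> mediaDescription, String direction) {
        if (!DIRECTIONS.contains(direction)) {
            throw new IllegalArgumentException("Unknown direction: " + direction);
        }
        List<String> result = new ArrayList<>();
        for (String line : mediaDescription) {
            String value = attributeValue(line);
            // Drop any previous direction attribute; keep every other line.
            if (value == null || !DIRECTIONS.contains(value)) {
                result.add(line);
            }
        }
        result.add("a=" + direction);
        return result;
    }

    /** Returns the text after "a=", or null if the line is not an a-line. */
    private static String attributeValue(String line) {
        String text = line.trim();
        int eq = text.indexOf('=');
        if (eq < 0 || !text.substring(0, eq).trim().equals("a")) {
            return null;
        }
        return text.substring(eq + 1).trim();
    }

    public static void main(String[] args) {
        List<String> audio = Arrays.asList("m=audio 48450 RTP/AVP 0", "a=rtpmap:0 PCMU/8000");
        System.out.println("before hold: " + directionOf(audio));
        System.out.println("on hold:     " + withDirection(audio, "sendonly"));
    }
}
```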

MSRP

Let us consider now that John wants to set up an instant messaging session with Alice. John intends to send messages containing text, and also he wants to send an image. This means that he will offer two media types to Alice:

  • text/plain

  • image/jpeg

John also includes his MSRP address in the path attribute. His SDP offer would be:

v = 0
o = john 2890844526 2890844527 IN IP4 host.sea.com
s = -
c = IN IP4 host.sea.com
t = 0 0
m = message 6554 TCP/MSRP *
a = accept-types: text/plain image/jpeg
a = path:msrp://host.sea.com:6554/5u42ihy542;tcp

Alice does not support the jpeg format, so she answers with the following SDP, in which she also includes her MSRP address in the path attribute:

v = 0
o = alice 2890844530 2890844532 IN IP4 host.ocean.com
s = -
c = IN IP4 host.ocean.com
t = 0 0
m = message 8651 TCP/MSRP *
a = accept-types: text/plain
a = path:msrp://host.ocean.com:8651/6tejdtw5eyde;tcp

At this point, John would set up a TCP connection to the address and port specified by the MSRP URI sent by Alice—that is, host.ocean.com:8651. Once the TCP connection is established, the exchange of just text messages might start.

TCP Content

When using TCP-based transport, it is possible to negotiate how and when the connection setup procedure is performed based on the exchanged values of the “connection” and “setup” attributes.

For instance, the offerer might set the setup attribute to passive, and, if the answerer responds with active, it means that the answerer will be the one responsible for initiating the TCP connection. This scenario is seen in the following exchange of SDP messages between John and Alice for a T.38 fax session:

Offer

m = image 52887 TCP t38
c = IN IP4 1.2.3.4
a = setup:passive
a = connection:new

Answer

m = image 55330 TCP t38
c = IN IP4 4.3.2.1
a = setup:active
a = connection:new

In another scenario, John might offer to either initiate an outgoing connection or accept an incoming one by setting the setup attribute to actpass. If Alice responds with a value of passive, that would mean that John is responsible for initiating the connection.

Offer

m = image 52887 TCP t38
c = IN IP4 1.2.3.4
a = setup:actpass
a = connection:new

Answer

m = image 55330 TCP t38
c = IN IP4 4.3.2.1
a = setup:passive
a = connection:new

The connection attribute can also be negotiated. For instance, while already on a session, John might initiate a new SDP exchange. In the offer, he proposes to use the existing TCP connection. Alice responds that she wants a new connection to be created; therefore, a new connection will be established, in this case by Alice, if we look at the values of the exchanged setup attributes.

Offer

m = image 52887 TCP t38
c = IN IP4 1.2.3.4
a = setup:passive
a = connection:existing

Answer

m = image 55330 TCP t38
c = IN IP4 4.3.2.1
a = setup:active
a = connection:new
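The outcome of the three exchanges above can be derived from the attribute values alone. The following sketch encodes that decision logic; the class name TcpSetup and its methods are hypothetical, and the set of offer/answer combinations accepted as valid reflects our reading of [RFC 4145] rather than a complete implementation of it.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative helper (hypothetical name): resolves the setup and connection attributes. */
class TcpSetup {
    enum Initiator { OFFERER, ANSWERER, NONE }

    private static final List<String> SETUP_VALUES =
            Arrays.asList("active", "passive", "actpass", "holdconn");

    /** Decides who opens the TCP connection from the setup values in the offer and the answer. */
    static Initiator initiator(String offerSetup, String answerSetup) {
        if (!SETUP_VALUES.contains(offerSetup)) {
            throw new IllegalArgumentException("Unknown setup value: " + offerSetup);
        }
        if (answerSetup.equals("holdconn")) {
            // The answerer does not want the connection established for the time being.
            return Initiator.NONE;
        }
        switch (offerSetup) {
            case "active":
                if (answerSetup.equals("passive")) return Initiator.OFFERER;
                break;
            case "passive":
                if (answerSetup.equals("active")) return Initiator.ANSWERER;
                break;
            case "actpass":
                if (answerSetup.equals("active")) return Initiator.ANSWERER;
                if (answerSetup.equals("passive")) return Initiator.OFFERER;
                break;
            default:
                break;
        }
        throw new IllegalArgumentException(
                "Invalid setup combination: " + offerSetup + "/" + answerSetup);
    }

    /** A new connection is needed unless both sides agree to reuse the existing one. */
    static boolean newConnectionNeeded(String offerConnection, String answerConnection) {
        return !(offerConnection.equals("existing") && answerConnection.equals("existing"));
    }

    public static void main(String[] args) {
        System.out.println(initiator("passive", "active"));
        System.out.println(initiator("actpass", "passive"));
        System.out.println(newConnectionNeeded("existing", "new"));
    }
}
```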

SDP Programming

As we have seen, IP communication applications that use SIP will in many cases need to describe sessions using SDP and transport such a description as part of the SIP message payload. From the developer’s perspective, there is therefore a need to encode and parse SDP content. There are a number of ways to do this; one is to use an implementation of the JAIN SDP API. JAIN SDP is part of the Java network API family to which JAIN SIP also belongs. Given that we are using JAIN SIP as the API to illustrate the SIP concepts throughout the book, it seems appropriate to embrace JAIN SDP for our discussion on SDP programming. In fact, we will use JAIN SIP in combination with JAIN SDP for the soft-phone project that we will describe in Chapter 12.

JAIN SDP Overview

JAIN SDP is a very simple API that just allows us to encode and decode SDP content. Like the JAIN SIP API, it is also based on the factory pattern. It defines a factory class called SdpFactory, and a number of interfaces that represent the key concepts in SDP. By invoking creation methods on SdpFactory, the programmer can obtain objects that implement those key interfaces in the API.

As already stated, JAIN SDP provides a number of interface classes that represent the key concepts in the Session Description Protocol. These interfaces fall into two categories. On one hand, there is the SessionDescription interface, which models the SDP message itself; on the other, there are a myriad of interfaces, each of them representing one or more lines in the SDP message.

Table 9.4 shows some of the main interfaces in the API and their mapping to the SDP concepts that they represent. Table 9.5 shows the methods needed to create them from the SdpFactory.

Table 9.4.

Interface Name | SDP Concept
SessionDescription | SDP message
Version | v-line (protocol version)
Origin | o-line (originator and session identifier)
SessionName | s-line (session name)
Connection | c-line (connection information)
Time | t-line (time the session is active)
Media | m-line (media name and transport address)
MediaDescription | m-line and related a-lines (media name, transport address, and associated attributes)

Table 9.5.

Return Type | SdpFactory Creation Method | Description
SessionDescription | createSessionDescription() | Creates an empty SessionDescription to which we can then add the different lines.
SessionDescription | createSessionDescription(String s) | Creates a SessionDescription out of a String that represents the received SDP message. Once created, we can invoke “getter” methods to parse the message and obtain the different lines.
Version | createVersion(int value) | Creates a v-line.
Origin | createOrigin(String userName, long sessionId, long sessionVersion, String networkType, String addrType, String address) | Creates an o-line.
SessionName | createSessionName(String name) | Creates an s-line.
Connection | createConnection(String netType, String addrType, String addr) | Creates a c-line.
Time | createTime() | Creates a “t=0 0” line.
MediaDescription | createMediaDescription(String media, int port, int numPorts, String transport, int[] staticRtpAvpTypes) | Creates an m-line. Once created, we can add related a-lines by invoking setAttribute() on the MediaDescription object.

It is worth noting that in order to create a MediaDescription, we need to pass an array of integer values for the media formats to the createMediaDescription() method so as to reflect the fact that more than one media format may be included in the same m-line.

So far, we have seen how to create SessionDescription objects as well as objects representing the different lines in an SDP message. We will now see how to actually encode and parse SDP messages.

Encoding SDP Messages

To encode an SDP message, the following steps need to be followed:

  1. Obtain an instance of the singleton SdpFactory class:

    SdpFactory mySdpFactory=SdpFactory.getInstance();
    
  2. Create an empty SessionDescription object:

    SessionDescription mySessionDescription=mySdpFactory.
    createSessionDescription();
    
  3. Create the lines we want to include in the SDP message (e.g., a “v=0” line):

    Version myVersion=mySdpFactory.createVersion(0);
    
  4. Add those lines to the SessionDescription by invoking the appropriate “setter” method:

    mySessionDescription.setVersion(myVersion);
    

There are setter methods for all the lines in an SDP message. Table 9.6 shows the main ones.

Table 9.6.

Return Type | SessionDescription Setter Methods | Description
void | setVersion(Version v) | Sets the v-line.
void | setOrigin(Origin o) | Sets the o-line.
void | setSessionName(SessionName s) | Sets the s-line.
void | setConnection(Connection c) | Sets the c-line.
void | setTimeDescriptions(Vector v) | Sets the t-lines. The argument is a Vector of Time objects.
void | setMediaDescriptions(Vector v) | Sets the m- and related lines. The argument is a Vector of MediaDescription objects.

After these steps have been completed, we have a SessionDescription object representing our SDP message. In order to include it in a SIP message, we need to convert it into a String and pass it as an argument to the JAIN SIP setContent() method.

An important point is that the JAIN SDP API does not offer a setter that adds one MediaDescription at a time. To include the m-lines in the message, we must instead call setMediaDescriptions() once, passing a Java Vector that contains all the MediaDescription objects we want to add. The same approach is followed for Time objects.

For instance, if we wanted to add the following lines to an SDP message:

m=audio 3401 RTP/AVP 0
m=video 3550 RTP/AVP 31

our code should look like this:

// One format per m-line: 0 for the audio stream, 31 for the video stream
int[] mf1 = new int[1];
mf1[0] = 0;
int[] mf2 = new int[1];
mf2[0] = 31;
MediaDescription media1 = mySdpFactory.
  createMediaDescription("audio", 3401, 1,
  "RTP/AVP", mf1);
MediaDescription media2 = mySdpFactory.
  createMediaDescription("video", 3550, 1,
  "RTP/AVP", mf2);
// All the media descriptions are passed together in a single Vector
Vector myMediaDescriptionVector = new Vector();
myMediaDescriptionVector.add(media1);
myMediaDescriptionVector.add(media2);
mySessionDescription.setMediaDescriptions(myMediaDescriptionVector);

Parsing SDP Messages

To parse an SDP message, the following steps need to be followed:

  1. Obtain an instance of the singleton SdpFactory class:

    SdpFactory mySdpFactory=SdpFactory.getInstance();
    
  2. Create a SessionDescription object from the received String representing the SDP message:

    SessionDescription receivedSessionDescription=
      mySdpFactory.createSessionDescription(receivedSdp);
    
  3. Obtain the desired lines from the session description (e.g., the v-line):

    Version receivedVersion = receivedSessionDescription.
      getVersion();

There are getter methods for all the lines in an SDP message. Table 9.7 shows the main ones.

Table 9.7.

Return Type | SessionDescription Getter Methods | Description
Version | getVersion() | Gets the v-line.
Origin | getOrigin() | Gets the o-line.
SessionName | getSessionName() | Gets the s-line.
Connection | getConnection() | Gets the c-line.
Vector | getTimeDescriptions(boolean b) | Gets the t-lines as a Vector of Time objects.
Vector | getMediaDescriptions(boolean b) | Gets the m- and related lines as a Vector of MediaDescription objects.
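The getters for the t- and m-lines return a Vector because those line types can appear more than once in a message, whereas the v-, o-, s-, and c-lines are read here as single values. The following plain-Java sketch illustrates that idea by grouping the lines of an SDP message by their type letter. It does not use JAIN SDP, and the class SdpLineSplitter and its split() method are hypothetical names chosen for this illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JAIN SDP.
class SdpLineSplitter {

    // Groups the "<type>=<value>" lines of an SDP message by their type letter,
    // keeping the order in which the types first appear.
    static Map<Character, List<String>> split(String sdp) {
        Map<Character, List<String>> lines = new LinkedHashMap<>();
        for (String raw : sdp.split("\r?\n")) {
            String line = raw.trim();
            // Skip anything that does not look like "<letter>=<value>".
            if (line.length() < 2 || line.charAt(1) != '=') {
                continue;
            }
            lines.computeIfAbsent(line.charAt(0), k -> new ArrayList<>())
                 .add(line.substring(2));
        }
        return lines;
    }

    public static void main(String[] args) {
        String sdp = "v=0\r\nt=0 0\r\n"
                   + "m=audio 3401 RTP/AVP 0\r\nm=video 3550 RTP/AVP 31\r\n";
        // Prints the list of m-line values found in the message.
        System.out.println(split(sdp).get('m'));
    }
}
```

A real parser does considerably more (validation, attribute handling, media-level c-lines), so this is only a picture of the grouping that the getter methods expose.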

SDP Practice

To put the JAIN SDP concepts learned so far into practice, we will now create a simple component that eases the task of creating and parsing SDP content. This component will be used by the soft-phone application that we will build in Chapter 12. That application is built only for training purposes and has a limited scope. With regard to SDP handling in our soft-phone application, we make the following assumptions; some of them help keep the code as simple as possible while still allowing us to show the fundamental concepts:

  • The soft phone will support audio and video.

  • The soft phone will support two media codecs for audio: GSM and G723.

  • The soft phone will support two media codecs for video: JPEG and H263.

  • The end user will select in the GUI the type of media that he or she desires for their communications:

    • audio only

    • audio and video

  • The end user will select in the GUI which codec (only one) he or she desires to use for each medium.

  • The first media line will be audio, and the second one (if it exists) will be video. This assumption simplifies the SDP parsing.

  • The SDP offer will contain only one proposed codec per media.

  • The recipient will always accept the voice media, but may not accept the video component, depending on GUI configuration.

  • All the payload formats will be static; therefore, there is no need to include or read a-lines associated with the m-lines.
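Because only static payload formats are used, each of the four codecs in the assumptions above can be identified by a fixed number in the m-line. The sketch below lists the static RTP/AVP payload type numbers that RFC 3551 assigns to them. The enum StaticPayloadType and its fromNumber() method are hypothetical names introduced for illustration; they are not part of the soft-phone code or of JAIN SDP.

```java
// Hypothetical enum, not part of JAIN SDP or of the soft-phone code.
// The numbers are the static RTP/AVP payload types assigned in RFC 3551.
enum StaticPayloadType {
    GSM(3, "audio"),
    G723(4, "audio"),
    JPEG(26, "video"),
    H263(34, "video");

    final int number;   // value written as the format in the m-line
    final String media; // media type of the m-line that carries it

    StaticPayloadType(int number, String media) {
        this.number = number;
        this.media = media;
    }

    // Maps a format number read from an m-line back to one of the four codecs.
    static StaticPayloadType fromNumber(int number) {
        for (StaticPayloadType type : values()) {
            if (type.number == number) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unsupported payload type: " + number);
    }
}
```

For example, an offer for GSM audio would carry the format 3 in its audio m-line, and a receiver could call StaticPayloadType.fromNumber(3) to recover the codec.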

Under all the previous assumptions, our soft-phone application will need only to include or get five pieces of information in or from the SDP message:

  • IP address

  • voice port

  • audio format

  • video port

  • video format

The SDP component that we will build now will simplify the task of setting or getting these pieces of information from an SDP message. The component is a Java class called SdpManager. We will also use another Java class called SdpInfo that is a data structure that holds the value of the five parameters we are interested in.

The SdpInfo class is shown next.

public class SdpInfo {
   // The five values exchanged through SDP by the soft phone
   String IpAddress="";
   int aport=0;
   int aformat=0;
   int vport=0;
   int vformat=0;
   public SdpInfo() {}
   public void setIpAddress(String IP) {IpAddress=IP;}
   public void setAudioPort(int AP) {aport=AP;}
   public void setAudioFormat(int AF) {aformat=AF;}
   public void setVideoPort(int VP) {vport=VP;}
   public void setVideoFormat(int VF) {vformat=VF;}
   public String getIpAddress() {return IpAddress;}
   public int getAudioPort() {return aport;}
   public int getAudioFormat() {return aformat;}
   public int getVideoPort() {return vport;}
   public int getVideoFormat() {return vformat;}
 }

The SdpManager class offers two methods:

  1. byte[] createSdp(SdpInfo sdpinfo)

  2. SdpInfo getSdp(byte [] sdpcontent)

The first one receives as input an SdpInfo object, and creates as output a byte array representing the SDP content.

The second one gets an SDP message as a byte array, and produces an SdpInfo object with the key info we are interested in.

The constructor method for SdpManager just obtains the instance of SdpFactory that will be used in the two main methods.

mySdpFactory = SdpFactory.getInstance();

The code for the createSdp() method is shown next. It is quite straightforward.

Version myVersion = mySdpFactory.createVersion(0);
//Session id and session version both come from the current NTP time
long ss=mySdpFactory.getNtpTime(new Date());
Origin myOrigin = mySdpFactory.createOrigin("-",
   ss,ss,"IN","IP4",sdpinfo.getIpAddress());
SessionName mySessionName = mySdpFactory.createSessionName("-");
Connection myConnection=mySdpFactory.createConnection("IN","IP4",
   sdpinfo.getIpAddress());
//Time description lines
Time myTime=mySdpFactory.createTime();
Vector myTimeVector=new Vector();
myTimeVector.add(myTime);
//Media description lines
int[] aaf=new int[1];
aaf[0]=sdpinfo.getAudioFormat();
MediaDescription myAudioDescription=mySdpFactory.
  createMediaDescription("audio",
   sdpinfo.getAudioPort(), 1, "RTP/AVP",aaf);
Vector myMediaDescriptionVector=new Vector();
myMediaDescriptionVector.add(myAudioDescription);
//A video port of -1 means that no video m-line is wanted
if (sdpinfo.getVideoPort()!=-1) {
   int[] avf=new int[1];
   avf[0]=sdpinfo.getVideoFormat();
   MediaDescription myVideoDescription =
      mySdpFactory.createMediaDescription("video",
         sdpinfo.getVideoPort(), 1, "RTP/AVP", avf);
   myMediaDescriptionVector.add(myVideoDescription);
}
SessionDescription mySdp = mySdpFactory.createSessionDescription();
mySdp.setVersion(myVersion);
mySdp.setOrigin(myOrigin);
mySdp.setSessionName(mySessionName);
mySdp.setConnection(myConnection);
mySdp.setTimeDescriptions(myTimeVector);
mySdp.setMediaDescriptions(myMediaDescriptionVector);
//Serialize the session description into the byte array to be returned
byte[] mySdpContent=mySdp.toString().getBytes();
return mySdpContent;
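To see what kind of text createSdp() is meant to produce, the following plain-Java sketch assembles the same six or seven lines by hand, in the order that [RFC 4566] prescribes (v, o, s, c, t, then the m-lines). It does not use JAIN SDP, and the exact formatting of the library's toString() output is not confirmed here. The class SdpTextSketch, its build() method, and the sample address and session id are assumptions made for illustration.

```java
// Hypothetical helper, not part of JAIN SDP or of SdpManager.
class SdpTextSketch {

    // Builds the SDP text for one audio stream and an optional video stream.
    // As in createSdp(), a videoPort of -1 means "no video m-line".
    static String build(String ip, long sessionId, int audioPort, int audioFormat,
                        int videoPort, int videoFormat) {
        StringBuilder sdp = new StringBuilder();
        sdp.append("v=0\r\n");
        // Session id and session version are set to the same value.
        sdp.append("o=- ").append(sessionId).append(' ').append(sessionId)
           .append(" IN IP4 ").append(ip).append("\r\n");
        sdp.append("s=-\r\n");
        sdp.append("c=IN IP4 ").append(ip).append("\r\n");
        sdp.append("t=0 0\r\n");
        sdp.append("m=audio ").append(audioPort)
           .append(" RTP/AVP ").append(audioFormat).append("\r\n");
        if (videoPort != -1) {
            sdp.append("m=video ").append(videoPort)
               .append(" RTP/AVP ").append(videoFormat).append("\r\n");
        }
        return sdp.toString();
    }

    public static void main(String[] args) {
        // Sample values only: a documentation address and an arbitrary session id.
        System.out.print(build("192.0.2.10", 3900000000L, 3401, 3, 3550, 34));
    }
}
```

Comparing this text with the bytes returned by createSdp() is a simple way to check that the setter calls were made with the intended values.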

It is worth mentioning that we have followed the recommendation in [RFC 4566] to create the session id based on NTP timestamps. The JAIN SDP API offers a convenience method to create NTP timestamps from a Java Date object:

static long getNtpTime(Date d)

The third parameter in the o-line—that is, the version of the session information—has been initialized to the same value as the session id.
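As background for the NTP-based session id, an NTP timestamp counts seconds from January 1, 1900, whereas a Java Date counts milliseconds from January 1, 1970. The sketch below shows the usual conversion between the two. It is an assumption that getNtpTime() computes its value in this way; the class NtpTimeSketch and its toNtpSeconds() method are hypothetical names used only for illustration.

```java
import java.util.Date;

// Hypothetical helper, not part of JAIN SDP.
class NtpTimeSketch {

    // Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01):
    // 70 years including 17 leap days, i.e. 25567 days of 86400 seconds.
    static final long NTP_UNIX_OFFSET_SECONDS = 2208988800L;

    // Converts a Java Date to whole seconds since the NTP epoch.
    static long toNtpSeconds(Date date) {
        long unixSeconds = Math.floorDiv(date.getTime(), 1000L);
        return unixSeconds + NTP_UNIX_OFFSET_SECONDS;
    }

    public static void main(String[] args) {
        // A value like this could serve as both session id and session version.
        System.out.println(toNtpSeconds(new Date()));
    }
}
```

Since the value grows by one every second, two descriptions created by the same host at different times receive different session ids.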

Next we show the code for the getSdp() method:

String s = new String(sdpcontent);
SessionDescription recSdp=mySdpFactory.createSessionDescription(s);
String myPeerIp=recSdp.getConnection().getAddress();
String myPeerName=recSdp.getOrigin().getUsername();
Vector recMediaDescriptionVector=recSdp.
  getMediaDescriptions(false);
//We assume first media line is audio
MediaDescription myAudioDescription = (MediaDescription)
  recMediaDescriptionVector.elementAt(0);
Media myAudio = myAudioDescription.getMedia();
int myAudioPort = myAudio.getMediaPort();
Vector audioFormats=myAudio.getMediaFormats(false);
Integer myAudioMediaFormat=(Integer) audioFormats.elementAt(0);
//A value of -1 signals that no video line was received
int myVideoPort =-1;
Integer myVideoMediaFormat = Integer.valueOf(-1);
//We assume second media line, if it exists, is video
//size() is the number of elements; capacity() is only the allocated space
if (recMediaDescriptionVector.size()>1) {
   MediaDescription myVideoDescription =
      (MediaDescription) recMediaDescriptionVector.elementAt(1);
   Media myVideo = myVideoDescription.getMedia();
   myVideoPort = myVideo.getMediaPort();
   Vector videoFormats = myVideo.getMediaFormats(false);
   myVideoMediaFormat = (Integer) videoFormats.elementAt(0);
}
SdpInfo mySdpInfo=new SdpInfo();
mySdpInfo.setIpAddress(myPeerIp);
mySdpInfo.setAudioPort(myAudioPort);
mySdpInfo.setAudioFormat(myAudioMediaFormat.intValue());
mySdpInfo.setVideoPort(myVideoPort);
mySdpInfo.setVideoFormat(myVideoMediaFormat.intValue());
return mySdpInfo;

It is worth highlighting that the port and the media format parameters are obtained through a Media object, not directly through the MediaDescription object. So, in order to get these parameters, we had to:

  1. obtain the MediaDescription from the SessionDescription

  2. obtain the Media object from the MediaDescription

  3. obtain the desired parameters from the Media object
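The port and the format that the Media object exposes correspond to the second and fourth tokens of the m-line. The following plain-Java sketch extracts them directly from the line text and also applies the convention from footnote 5 that a port of zero marks a stream as not accepted. It does not use JAIN SDP, and it assumes numeric (static RTP/AVP) formats, as in our soft phone. The class MediaLineFields and its members are hypothetical names chosen for illustration.

```java
// Hypothetical helper, not part of JAIN SDP.
class MediaLineFields {
    final String media;
    final int port;
    final String transport;
    final int firstFormat;

    private MediaLineFields(String media, int port, String transport, int firstFormat) {
        this.media = media;
        this.port = port;
        this.transport = transport;
        this.firstFormat = firstFormat;
    }

    // Parses the value part of an m-line, e.g. "audio 3401 RTP/AVP 0".
    // Assumes numeric formats, which holds for static RTP/AVP payload types.
    static MediaLineFields parse(String mLineValue) {
        String[] tokens = mLineValue.trim().split("\\s+");
        if (tokens.length < 4) {
            throw new IllegalArgumentException("Not a complete m-line: " + mLineValue);
        }
        // The port token may carry a "/<number of ports>" suffix; keep only the port.
        int port = Integer.parseInt(tokens[1].split("/")[0]);
        int firstFormat = Integer.parseInt(tokens[3]);
        return new MediaLineFields(tokens[0], port, tokens[2], firstFormat);
    }

    // A port of zero indicates that the media stream is not accepted.
    boolean isRejected() {
        return port == 0;
    }
}
```

For instance, MediaLineFields.parse("video 0 RTP/AVP 34").isRejected() returns true, which is how an answerer that accepts only the audio stream would decline the video one.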

Summary

In this chapter, we have analyzed the way to describe multimedia sessions using the Session Description Protocol. In the next chapter, we will examine the protocols used to convey the media that is described by SDP. Thus, we will look at RTP, MSRP, TCP, and the different applications that they can support. Once we have covered the media protocols, we will be in a position to look at how a complete multimedia application works, and to build one ourselves in Chapter 12.



[1] Strictly speaking a session qualifies as multimedia only if it includes more than one media. For instance, following such a terminology, a VoIP call would be a one-medium session, whereas an audio/video conference could be considered a true multimedia session. This requirement is very much relaxed in practice, and many people refer to multimedia sessions even if there is just one medium included.

[2] Pulse Code Modulation (PCM) is a scheme for digitally representing an analog signal. The magnitude of the signal is sampled regularly at uniform intervals and the samples are converted into a digital code. PCM µ-law is a variant used in North America and Japan, whereas PCM A-law is a variant used in Europe and the rest of the world.

[3] Interworking with the PSTN is further described in Chapter 18.

[4] A TCP connection may already exist if the SDP is sent to modify parameters in an existing session.

[5] The way to indicate that a media stream is not accepted is by setting the port value in the m-line for that media to zero.
