Chapter 12. SIP

Session Initiation Protocol (SIP) is a signaling protocol that controls the initiation, modification, and termination of interactive multimedia sessions. The multimedia sessions could be as diverse as audio or video calls among two or more parties, chat sessions, or game sessions. SIP extensions have also been defined for instant messaging, presence, and event notifications. SIP is a text-based protocol that is similar to HTTP and Simple Mail Transfer Protocol (SMTP).

SIP is a peer-to-peer protocol, which means that network capabilities such as call routing and session management functions are distributed across all the nodes (including endpoints and network servers) within the SIP network. This is in contrast to the traditional telephony model, where the phones or end-user devices are completely dependent on centralized switches in the network for call session establishment and services.

SIP was defined in RFC 2543 (March 1999) by the Multiparty Multimedia Session Control (MMUSIC) Working Group of the Internet Engineering Task Force (IETF). In June 2002, the IETF published a new SIP RFC (RFC 3261). IP telephony is still being developed and will require additional signaling capabilities in the future. The extensibility of SIP enables such incremental development of functionality. This chapter also describes some of the key extensions.

This chapter covers the following topics:

  • SIP Overview—Covers function, network elements, and interaction with other protocols

  • SIP Message Building Blocks—Covers SIP addressing, messages and headers, transactions, and dialogs

  • Basic Operation of SIP—Provides a proxy, redirect server, and B2BUA server example

  • SIP Procedures for Registration and Routing—Covers locating SIP servers, registering, and message routing

  • SIP Extensions—Covers caller and callee preferences, Subscription-Notification, REFER, presence, and IM

SIP Overview

This section describes the key components of a SIP network, their function, and the interaction between them.

Functionality That SIP Provides

SIP provides the following capabilities for enabling multimedia sessions:

  • User location—SIP provides the capability to discover the location of the end user for the purpose of establishing a session or delivering a SIP request. User mobility is inherently supported in SIP.

  • User capabilities—SIP enables the determination of the media capabilities of the devices that are involved in the session.

  • User availability—SIP enables the determination of the willingness of the end user to engage in communication.

  • Session setup—SIP enables the establishment of session parameters for the parties who are involved in the session.

  • Session handling—SIP enables the modification, transfer, and termination of an active session.

SIP Network Elements

The SIP network typically comprises the following devices:

  • User agent—A user agent (UA) is a logical function in the SIP network that initiates or responds to SIP transactions. A UA can act as either the client or the server in a SIP transaction. A UA might or might not directly interact with a human user. A UA is stateful—that is, it maintains session or dialog state.

  • User agent client—A user agent client (UAC) is a logical function that initiates SIP requests and accepts SIP responses. Examples of UAC are a SIP phone initiating a call on behalf of a human user or a SIP Proxy forwarding a request on behalf of a UAC.

  • User agent server—A user agent server (UAS) is a logical function that accepts SIP requests and sends back SIP responses. A SIP phone accepting an INVITE request is one example.

  • Proxy—A proxy is an intermediate entity in the SIP network that is responsible for forwarding SIP requests to the target UAS or another proxy on behalf of the UAC. A proxy primarily provides the routing function in the SIP network. A proxy might also enforce policy in the network, such as authenticating a user before providing him with service. A proxy can be stateless, transaction stateful, or call stateful. Typically, proxies are transaction stateful—that is, they maintain state for the duration of a transaction (about 32 seconds).

  • Redirect server—A redirect server is a UAS that generates 300 class SIP responses to requests it receives, directing the UAC to contact an alternate set of Uniform Resource Identifiers (URI).

  • Registrar server—A registrar is a UAS that accepts SIP REGISTER requests and updates the information from the request message into a location database.

  • Back-to-back user agent—A back-to-back user agent (B2BUA) is an intermediate entity that processes incoming SIP requests as a UAS. To answer the incoming SIP request, the B2BUA acts as a UAC, regenerates a SIP request, and sends it on the network. A B2BUA must maintain dialog state and participates in all transactions within the dialog.

Interaction with Other IETF Protocols

SIP by itself does not provide all the capabilities that are required to set up an interactive multimedia session. Instead, SIP is a protocol that is part of the framework of standard protocols that builds a multimedia architecture.

SIP agents or applications need other protocols for the following:

  • To describe the characteristics of a session—The characteristics include whether a session is an audio or video session, what codecs are used, what the media source is, and what the destination addresses are.

  • To handle media—These protocols control and transmit audio/video packets for a session.

  • To support functions—Needs include AAA for authentication, authorization, and accounting; Resource Reservation Protocol (RSVP) for reserving network resources; Telephony Routing over IP (TRIP) for gateway selection and load balancing; Simple Traversal of UDP Through NAT (STUN)/Traversal Using Relay NAT (TURN)/Interactive Connectivity Establishment (ICE) protocols for firewall and NAT traversals; Domain Name System (DNS) for hostname-to-IP address resolution; and Transport Layer Security (TLS) for preventing eavesdropping, tampering, or message forgery.

Sessions that are established using SIP typically use the following IETF protocols:

  • DNS—SIP session establishment might require the use of DNS to resolve host or domain names into routable IP addresses. DNS can also be used to load-share across multiple servers in a cluster identified by a hostname.

  • Session Description Protocol (SDP)—SDP is used in a SIP message body to describe the parameters of the multimedia session. This information includes session type such as audio, video, or both and parameters such as codecs or ports needed to establish a media stream. RFC 2327 defines SDP.

  • Real-time Transport Protocol (RTP)—RTP, first defined in RFC 1889, transports real-time data such as audio or video packets to the endpoints that are involved in a session. Real-time Transport Control Protocol (RTCP), defined in the same RFC, provides quality of service (QoS) feedback to the sender. RFC 3550 obsoletes RFC 1889.

  • RSVP—SIP can use RSVP to reserve network resources such as bandwidth prior to establishment of the media session. This ensures that the network resources are in place prior to the called party being alerted about an incoming call.

  • TLS—SIP recommends the use of TLS, defined in RFC 2246, to provide privacy and integrity of SIP signaling information over the network. TLS allows the client and server applications to authenticate each other, negotiate encryption algorithms, and establish cryptographic keys before sending the signaling information over the network.

  • STUN—SIP UACs can use the STUN protocol to discover the presence and type of Network Address Translation (NAT) between them and the public Internet. STUN also allows the client to discover the public IP address that is allocated to the NAT. This procedure works for most types of NAT except symmetric NAT. Symmetric NAT occurs when all requests from the same internal IP address and port to a specific destination IP address and port are mapped to the same external source IP address and port.

The preceding is by no means an exhaustive list of protocols that SIP uses. Depending on the signaling and application requirements, SIP might use other protocols.

SIP does not necessitate the use of the protocols in the preceding list. In the future, if newer or enhanced protocols perform similar functions, SIP will be able to use them with little or no change. Most of the information associated with these protocols is carried in the SIP message body. SIP treats the message body as an opaque container and transports it to the recipient. The SIP protocol layer does not interpret the message body.

SIP signaling is thus independent of the type of session being established. Therefore, from a SIP signaling perspective, the same set of messages is used regardless of whether an audio session, audio-video session, or some other type of session is established.

Message Flow in SIP Network

Figure 12-1 shows a basic SIP network comprising SIP proxies and user agents with connection to public switched telephone network (PSTN). The SIP UA, proxies, and SIP-PSTN gateway are located within an IP network. The SIP-PSTN gateway has SS7/PRI trunks going to a switch in the PSTN.

Figure 12-1. Path Taken by Request and Response Messages in a SIP Network

In Figure 12-1, solid lines represent SIP requests, and dotted lines indicate SIP responses.

SIP Message Building Blocks

This section describes the structure of SIP messages, addressing schemes, and key header fields. Much of the SIP message and header field syntax is identical to HTTP/1.1. Refer to RFC 3261 for an in-depth description.

Note

You can find all RFCs online at http://www.ietf.org/rfc/rfcxxxx.txt, where xxxx is the number of the RFC. If you do not know the number of the RFC, you can try searching by topic at http://www.rfc-editor.org/cgi-bin/rfcsearch.pl.

SIP Addressing

SIP addresses identify a user or a resource within a network domain. SIP addresses are typically referred to as SIP URIs. A SIP URI is typically an e-mail-type address with a format such as one of the following:

sip:user@domain:port

sip:user@host:port

The user field identifies a user by name, such as john.doe, or by telephone number, such as 4081234567, within the context of a domain or a host. The port is an optional field. If no port is specified, the default port for a SIP URI is 5060. If a port is explicitly specified, you must use it. Examples of SIP URIs are as follows:

sip:[email protected]

sip:[email protected]

The public SIP address of a user or a resource is referred to as an Address-of-Record (AOR). An AOR is a SIP URI that is globally routable and points to a domain whose location service can map the AOR to another SIP URI, where the user might be located.

RFC 3261 specifies a secure SIP URI format also known as a SIPS URI. The format of a SIPS URI is as follows:

sips:user@domain:port

or

sips:user@host:port

The default port for a SIPS URI is 5061.
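The addressing rules above can be sketched in a few lines of Python. This is a minimal illustration, not a complete RFC 3261 URI parser: the function name parse_sip_uri and the example.com addresses are assumptions made for the example, and passwords, IPv6 literals, and URI parameters are deliberately ignored.

```python
from typing import NamedTuple, Optional

# Default ports stated in the text: 5060 for sip, 5061 for sips.
DEFAULT_PORTS = {"sip": 5060, "sips": 5061}


class SipUri(NamedTuple):
    scheme: str
    user: Optional[str]
    host: str
    port: int


def parse_sip_uri(uri: str) -> SipUri:
    """Split a sip:/sips: URI into scheme, user, host, and port (hypothetical helper)."""
    scheme, sep, rest = uri.partition(":")
    scheme = scheme.lower()
    if not sep or scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a SIP or SIPS URI: {uri!r}")
    # The user part is optional; everything after the last '@' is host[:port].
    user, at, hostport = rest.rpartition("@")
    # Simplification: drop any URI parameters (;...) and headers (?...).
    hostport = hostport.split(";", 1)[0].split("?", 1)[0]
    host, colon, port = hostport.partition(":")
    if not host:
        raise ValueError(f"missing host in {uri!r}")
    # An explicit port must be used; otherwise fall back to the scheme default.
    return SipUri(scheme, user if at else None, host,
                  int(port) if colon else DEFAULT_PORTS[scheme])


print(parse_sip_uri("sip:john.doe@example.com"))
print(parse_sip_uri("sips:4081234567@example.com:5071"))
```

The first call falls back to port 5060 because no port is given; the second keeps the explicit port instead of the SIPS default.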

SIP Messages

SIP messages can be broadly divided into SIP requests and responses, as further defined in the sections that follow.

SIP Requests

SIP requests are messages that are sent from client to server to invoke a SIP operation. RFC 3261 defines six SIP requests or methods that enable UAs and proxies to locate users and initiate, modify, and tear down sessions:

  • INVITE—An INVITE method indicates that the recipient user or service is invited to participate in a session. You can also use this method to modify the characteristics of a previously established session. The INVITE message body might include the description of the media session being set up or modified, encoded per SDP. A successful response (200 OK response) to an INVITE indicates the willingness of the called party to participate in the resulting media session.

  • ACK—An ACK request confirms that the UAC has received the final response to an INVITE request. ACK is used only with INVITE requests. ACK is sent end to end for a 200 OK response. The previous hop proxy or UAC sends ACK for other final responses. The ACK request can include a message body with the final session description if the INVITE request did not contain a session description.

  • OPTIONS—A UA uses the OPTIONS request to query a UAS about its capabilities. If the UAS is capable of delivering a session to the user, it responds with the capability set of the UAS.

  • BYE—A UA uses BYE to request the termination of a previously established session.

  • CANCEL—The CANCEL request enables UACs and network servers to cancel an in-progress request, such as INVITE. This does not affect completed requests in which the UAS had already sent final responses.

  • REGISTER—A client uses a REGISTER request to register its current location information corresponding to the AOR of the user with SIP servers.

SIP Responses

A server sends a SIP response to a client to indicate the status of a SIP request that the client previously sent to the server. The UAS or proxy generates SIP responses in response to a SIP request that the UAC initiates. SIP responses are numbered from 100 to 699. SIP responses are grouped as 1xx, 2xx, and so on through 6xx. SIP responses are classified as provisional and final.

A provisional response indicates progress by the server but does not indicate the final outcome as a result of processing the SIP request. The 1xx class of SIP response indicates provisional status. A final response indicates the termination and the final status of a SIP request. All 2xx, 3xx, 4xx, 5xx, and 6xx class responses are final, specifically:

  • A 2xx class response indicates successful processing of the SIP request.

  • A 3xx class response indicates that the SIP request needs to be redirected to another UAS for processing.

  • A 4xx, 5xx, or 6xx class of response indicates failure in processing of the SIP request.
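The grouping of status codes into classes, and the split between provisional and final responses, can be expressed as a small lookup. The sketch below is illustrative only; the function name classify_response is an assumption, and the class names are the ones used in Table 12-1.

```python
from typing import Tuple

# Class names as used in Table 12-1, keyed by the first digit of the status code.
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client-Error",
    5: "Server-Error",
    6: "Global Failure",
}


def classify_response(status_code: int) -> Tuple[str, bool]:
    """Return (class name, is_final) for a SIP status code (hypothetical helper)."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"SIP status codes range from 100 to 699, got {status_code}")
    response_class = RESPONSE_CLASSES[status_code // 100]
    # Only 1xx responses are provisional; every other class is final.
    is_final = status_code >= 200
    return response_class, is_final


print(classify_response(180))  # ('Informational', False)
print(classify_response(486))  # ('Client-Error', True)
```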

Table 12-1 lists the various SIP responses per RFC 3261.

Table 12-1. SIP Response Table

Class of Response   Status Code   Explanation

Informational       100           Trying
                    180           Ringing
                    181           Call is being forwarded
                    182           Queued
                    183           Session progress

Success             200           OK

Redirection         300           Multiple choices
                    301           Moved permanently
                    302           Moved temporarily
                    305           Use proxy
                    380           Alternative service

Client-Error        400           Bad request
                    401           Unauthorized
                    402           Payment required
                    403           Forbidden
                    404           Not found
                    405           Method not allowed
                    406           Not acceptable
                    407           Proxy authentication required
                    408           Request timeout
                    410           Gone
                    413           Request entity too large
                    414           Request-URI too long
                    415           Unsupported media type
                    416           Unsupported URI scheme
                    420           Bad extension
                    421           Extension required
                    423           Interval too brief
                    480           Temporarily not available
                    481           Call leg or transaction does not exist
                    482           Loop detected
                    483           Too many hops
                    484           Address incomplete
                    485           Ambiguous
                    486           Busy here
                    487           Request terminated
                    488           Not acceptable here
                    491           Request pending
                    493           Undecipherable

Server-Error        500           Internal server error
                    501           Not implemented
                    502           Bad gateway
                    503           Service unavailable
                    504           Server timeout
                    505           SIP version not supported
                    513           Message too large

Global Failure      600           Busy everywhere
                    603           Decline
                    604           Does not exist anywhere
                    606           Not acceptable

SIP Message Structure

A SIP message consists of the following:

  • A start-line

  • One or more header fields

  • An empty line indicating the end of header fields

  • An optional message body

You must terminate the start-line, each message-header line, and the empty line by a Carriage Return Line Feed (CRLF) sequence.

The start-line for a SIP request is a Request-Line. The start-line for a SIP response is a Status-line.

The Request-Line specifies the SIP method, the Request-URI, and the SIP version. The Status-line describes the SIP version, the SIP response code, and an optional reason phrase. The reason phrase is a textual description of the 3-digit SIP response code.

Table 12-2 shows the various components of a SIP request message.

Table 12-2. SIP Request Components

INVITE sip:[email protected] SIP/2.0                              Request Line

Via: SIP/2.0/UDP ph1.company.com:5060;branch=z9hG4bK83749.1       SIP Message headers
From: Alice <sip:[email protected]>;tag=1234567
To: Bob <sip:[email protected]>
Call-ID: [email protected]
CSeq: 1 INVITE
Contact: <sip:[email protected]>
Content-Type: application/sdp
Content-Length: ...

                                                                  Blank line between SIP header fields and body

v=0                                                               SDP body in SIP message
o=alice 2890844526 28908445456 IN IP4 172.18.193.102
s=Session SDP
c=IN IP4 172.18.193.102
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000

*The information in Table 12-2 is taken from RFC 3261.
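To make the layout of a request concrete, the following sketch assembles a request from its parts: a Request-Line, header lines, the empty line, and an optional body, each terminated by CRLF. The function name build_request and the example.com addresses are assumptions for illustration. The Content-Length value is computed from the body rather than supplied by the caller.

```python
from typing import List, Tuple

CRLF = "\r\n"


def build_request(method: str, request_uri: str,
                  headers: List[Tuple[str, str]], body: str = "") -> bytes:
    """Assemble a SIP request as bytes (hypothetical helper, no validation)."""
    body_bytes = body.encode("utf-8")
    # Request-Line: method, Request-URI, and SIP version.
    lines = [f"{method} {request_uri} SIP/2.0"]
    # Each header is a field name, a colon, and the field value.
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append(f"Content-Length: {len(body_bytes)}")
    # Every line ends in CRLF; one extra CRLF forms the empty line before the body.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("utf-8") + body_bytes


message = build_request(
    "OPTIONS",
    "sip:bob@example.com",
    [("From", "Alice <sip:alice@example.com>;tag=1234567"),
     ("To", "Bob <sip:bob@example.com>"),
     ("CSeq", "1 OPTIONS")],
)
print(message.decode("utf-8"))
```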

Table 12-3 shows the structure of a SIP response message.

Table 12-3. SIP Response Components

SIP/2.0 200 OK                                                    Status (Response) Line

Via: SIP/2.0/UDP ph1.company.com:5060;branch=z9hG4bK83749.1       SIP message headers
From: Alice <sip:[email protected]>;tag=1234567
To: Bob <sip:[email protected]>;tag=9345678
Call-ID: [email protected]
CSeq: 1 INVITE
Content-Length: ...

                                                                  Blank line between SIP header fields and body

v=0                                                               SDP body in 200 OK response
o=bob 3800844316 3760844696 IN IP4 172.18.193.109
s=Session SDP
c=IN IP4 172.18.193.109
t=0 0
m=audio 48140 RTP/AVP 0
a=rtpmap:0 PCMU/8000

*The information presented in Table 12-3 is taken from RFC 3261.
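Reading a message reverses the same structure: the first empty line separates the headers from the body, and the first line is the start-line. The sketch below is a simplified illustration (the names parse_message and parse_status_line are assumptions); it does not handle folded header lines or compact header names.

```python
from typing import List, Tuple


def parse_message(raw: bytes) -> Tuple[str, List[Tuple[str, str]], bytes]:
    """Split a SIP message into start-line, header list, and body."""
    # The first empty line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Split on the first colon only; values such as Via contain colons.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


def parse_status_line(start_line: str) -> Tuple[str, int, str]:
    """Split a Status-Line into SIP version, response code, and reason phrase."""
    version, code, *reason = start_line.split(" ", 2)
    return version, int(code), reason[0] if reason else ""


raw = (b"SIP/2.0 200 OK\r\n"
       b"CSeq: 1 INVITE\r\n"
       b"Content-Length: 3\r\n"
       b"\r\n"
       b"v=0")
start_line, headers, body = parse_message(raw)
print(parse_status_line(start_line))  # ('SIP/2.0', 200, 'OK')
print(headers, body)
```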

SIP Headers

A SIP message is composed of header fields (defined in RFC 3261) that convey the signaling and routing information for the SIP network entities. SIP follows the same format as defined for an HTTP header (RFC 2616). Each header field consists of a field name followed by a colon (:) and the field value.

Table 12-4 describes the functions of the key SIP headers.

Table 12-4. Key SIP Headers

SIP Header      Explanation

From            This header indicates the identity of the initiator of a SIP request. The From header is usually the AOR of the sender. It consists of a SIP or SIPS URI and an optional display name.

To              This header indicates the desired recipient of a SIP request. The To header is usually the AOR of the recipient. The SIP request might not always be delivered to the “desired” recipient because of redirection or forwarding. The To header consists of a SIP or SIPS URI and an optional display name.

Call-ID         This header field identifies a series of SIP messages. Call-ID must be identical for all SIP requests and responses sent by either UA within a dialog.

CSeq            This header is composed of an integer value and a method name. This header identifies, orders, and sequences SIP requests within a dialog. The CSeq header also differentiates between message retransmissions and new messages.

Via             The Via header indicates the path taken by the request and identifies where the response needs to be sent.

Contact         This header identifies a SIP or SIPS URI where the UA wants to receive a new SIP request.

Allow           The Allow header lists the set of SIP methods supported by the UA that is generating the message.

Supported       This header lists all SIP extensions supported by the UA. SIP extensions are SIP RFCs other than RFC 3261. SIP extensions are represented as option tags such as 100rel defined in RFC 3262.

Require         This header has similar semantics to the Supported header, but the remote UA must support the listed SIP extension for the transaction to be processed.

Content-Type    This header indicates the type of the message body that is attached to a SIP request or response. This header must be present if the SIP message has a body.

Content-Length  This header indicates the size of the message body (in decimal) in a SIP message. This header is mandatory when SIP messages are carried over stream-based protocols such as TCP.

SIP Transactions and Dialog

A SIP signaling session between two user agents might comprise one or more SIP transactions. A SIP transaction occurs between a UAC and a UAS, which might involve one or more intermediate SIP servers such as proxy or redirect servers. A SIP transaction comprises all messages that begin with the SIP request initiated from the UAC, until a final response (that is, a non-1xx response) is received from the UAS. A SIP transaction is identified by the Call-ID, via-branch, local tag, remote tag, and CSeq value. Figure 12-2 shows a SIP REGISTER transaction between a UAC and a registrar server. The SIP transaction comprises a SIP request message followed by one or more SIP response messages. In this case, the REGISTER message is a SIP request sent from the UAC to the registrar server. 100 Trying and 200 OK are SIP responses. The user agent server sends SIP responses to the UAC indicating the status of the SIP request.

Figure 12-2. SIP REGISTER Transaction

A SIP transaction can result in the establishment, modification, or termination of a media session. The establishment of a session also results in a SIP signaling relationship between the peers, known as a dialog. A dialog is defined as a peer-to-peer SIP relationship between two UAs that persists for the duration of the session. A dialog is the state identified by the Call-ID, the local tag, and the remote tag. Not all SIP transactions affect the state of a dialog. Multiple SIP transactions might take place within the context of a SIP dialog. Each SIP transaction within a dialog has a sequentially increasing integer value in the CSeq header.
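The dialog state described above can be modeled as a small record. The following is a minimal sketch, assuming a hypothetical Dialog class and a made-up Call-ID value; the tags are the ones shown in Tables 12-2 and 12-3. It shows the three values that identify the dialog and the CSeq number increasing with each new request sent within it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Dialog:
    """Dialog state as seen by one UA (illustrative only)."""
    call_id: str
    local_tag: str
    remote_tag: str
    local_cseq: int = 0  # CSeq number of the last request this UA sent

    @property
    def dialog_id(self) -> Tuple[str, str, str]:
        # A dialog is identified by the Call-ID, the local tag, and the remote tag.
        return (self.call_id, self.local_tag, self.remote_tag)

    def next_cseq(self, method: str) -> str:
        # Each new transaction within the dialog uses the next CSeq number.
        self.local_cseq += 1
        return f"CSeq: {self.local_cseq} {method}"


# State at Alice's UA after the INVITE (CSeq 1) was answered with 200 OK.
dialog = Dialog(call_id="hypothetical-call-id", local_tag="1234567",
                remote_tag="9345678", local_cseq=1)
print(dialog.dialog_id)
print(dialog.next_cseq("INVITE"))  # CSeq: 2 INVITE (session modification)
print(dialog.next_cseq("BYE"))     # CSeq: 3 BYE
```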

A successful INVITE-200 OK transaction results in the establishment of a SIP dialog and an audio or video session between the participants. After the media session is established, you can exchange INVITE messages within the existing dialog context using the same Call-ID and tags to modify the media session parameters. Later, you can tear down the dialog using a BYE transaction or transfer it to another device using a REFER transaction, again within the dialog context.

A successful SUBSCRIBE-200 OK transaction results in the establishment of a dialog. The SUBSCRIBE request is discussed in the “SIP Extensions” section.

Dialog and transaction states are maintained at the SIP UAs or endpoints. SIP servers such as proxies and redirect servers typically maintain only the transaction state, which is held for at least 32 seconds per RFC 3261, and not the dialog state. This enables them to serve many SIP endpoints. Because the network servers maintain only transaction state, a proxy that is going out of service within a cluster affects the transactions that are in progress but has no effect on established dialogs.

Transport Layer Protocols for SIP Signaling

SIP transactions use either connection-oriented transport layer protocols such as TCP or Stream Control Transmission Protocol (SCTP) or connectionless protocols such as UDP. For connectionless protocols, SIP specifies that the SIP application start retransmission timers to retry the SIP requests so that messages are delivered reliably.
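As a rough illustration of the retransmission timers, the sketch below computes when a UAC would resend an unanswered INVITE over UDP. It assumes the RFC 3261 defaults for the INVITE client transaction: a base timer T1 of 500 ms, a retransmission interval that doubles after each attempt, and a transaction timeout of 64 × T1. The function name is hypothetical, and the sketch does not model what happens once a response arrives.

```python
from typing import List, Tuple

T1 = 0.5  # seconds; RFC 3261 default round-trip estimate (assumed here)


def invite_retransmit_schedule(t1: float = T1) -> Tuple[List[float], float]:
    """Return (retransmission times, timeout), in seconds after the first send."""
    timeout = 64 * t1          # the transaction gives up at this point
    times = []
    interval = t1              # first retransmission fires after T1
    elapsed = t1
    while elapsed < timeout:
        times.append(elapsed)
        interval *= 2          # the interval doubles after every retransmission
        elapsed += interval
    return times, timeout


times, timeout = invite_retransmit_schedule()
print(times)    # [0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
print(timeout)  # 32.0
```

With the default T1, the timeout works out to 32 seconds, which matches the transaction lifetime quoted earlier in this chapter.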

SIP defines a SIPS URI, which indicates the need for securing the end-to-end SIP signaling information in a network. SIP RFC 3261 specifies the use of TLS or IPsec to encrypt the signaling information.

Basic Operation of SIP

SIP servers can handle incoming requests in different ways. The basic operation is illustrated here by inviting a participant to a call. The three basic modes of SIP server operation described in this section are the proxy server mode, the redirect server mode, and the B2BUA server mode.

Proxy Server Example

Figure 12-3 illustrates the communication exchange for the INVITE method using the proxy server.

Source: Henning Schulzrinne, Columbia University

Figure 12-3. Proxy Mode of Operation

The operational steps in the proxy mode needed to successfully establish a two-way call are as follows:

  1. The proxy server accepts the INVITE request from the client.

  2. The proxy server contacts the location server to request the address of the called party UA.

  3. The location server identifies the location of the called party and provides the address of the target server.

  4. The INVITE request is forwarded to the address of the location that is returned. The proxy might add a Record-Route header to the INVITE message to ensure that all subsequent messages for that dialog are routed via the proxy. This might be needed for billing purposes or other applications that need to see the messaging for that dialog.

  5. The called party UA alerts the user. The user answers the call.

  6. The UAS returns a 200 OK indication to the requesting proxy server.

  7. The 200 OK response is forwarded from the proxy server to the calling party UA.

  8. The calling party UA confirms receipt of the 200 OK by issuing an ACK request, which is sent to the proxy (when the proxy inserts the Record-Route header in the INVITE message) or sent directly to the called party UA.

  9. The proxy forwards the ACK to the called party UA.
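Steps 1 through 4 above can be sketched as a single forwarding function. This is a simplified illustration under several assumptions: the location service is a plain dictionary, the function name proxy_forward and all addresses are hypothetical, and real proxy duties such as loop detection and response handling are left out. The proxy adds its own Via header so that responses return through it, and optionally a Record-Route header so that later requests in the dialog do as well.

```python
from typing import Dict, List, Optional, Tuple

Headers = List[Tuple[str, str]]


def proxy_forward(request_uri: str, headers: Headers,
                  location_db: Dict[str, str], proxy_host: str,
                  branch: str, record_route: bool = False
                  ) -> Optional[Tuple[str, Headers]]:
    """Return (new Request-URI, new headers) for the forwarded INVITE, or None."""
    # Steps 2 and 3: ask the location service where the called party is.
    target = location_db.get(request_uri)
    if target is None:
        return None  # a real proxy would answer 404 Not Found here
    # Step 4: the proxy's own Via goes on top so responses retrace the path.
    new_headers: Headers = [("Via", f"SIP/2.0/UDP {proxy_host};branch={branch}")]
    if record_route:
        # Ask both UAs to route subsequent requests in the dialog via this proxy.
        new_headers.append(("Record-Route", f"<sip:{proxy_host};lr>"))
    new_headers.extend(headers)
    return target, new_headers


location_db = {"sip:bob@example.com": "sip:bob@192.0.2.10:5060"}
incoming = [("Via", "SIP/2.0/UDP ph1.example.com:5060;branch=z9hG4bK83749.1"),
            ("CSeq", "1 INVITE")]
print(proxy_forward("sip:bob@example.com", incoming, location_db,
                    "proxy.example.com", "z9hG4bKproxy1", record_route=True))
```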

Redirect Server Example

Figure 12-4 illustrates the protocol exchange for the INVITE request using the redirect server.

Source: Henning Schulzrinne, Columbia University

Figure 12-4. Redirect Server Mode of Operation

The operational steps in the redirect mode needed to successfully establish a two-way call are as follows:

  1. The redirect server accepts the INVITE request from the calling party UA.

  2. The redirect server contacts location services to get the address of the called party UA.

  3. Location services returns the address of the called party UA.

  4. After the user is located, the redirect server returns the address directly to the calling party in a 3xx message, with an updated Contact: header pointing to the new destination(s). Unlike the proxy server, the redirect server does not forward an INVITE.

  5. The UAC sends an ACK to the redirect server acknowledging the 3xx response.

  6. The UAC sends an INVITE request directly to the Contact: address returned by the redirect server.

  7. The called party UA alerts the user, and the user answers the call. The called party UA provides a success indication (200 OK) to the UAC.

  8. The UAC sends an ACK to the UAS acknowledging the 200 OK response.
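The response built in step 4 can be sketched as follows. This is an illustration only: the function name redirect_response and the addresses are assumptions, 302 is used as a representative 3xx code, and the To tag that a real UAS adds to its response is omitted for brevity. The response copies the identifying headers of the request and lists the new destinations in Contact headers instead of forwarding the INVITE.

```python
from typing import List, Tuple

# Headers that a response carries over from the request.
COPIED_HEADERS = ("Via", "From", "To", "Call-ID", "CSeq")


def redirect_response(request_headers: List[Tuple[str, str]],
                      contacts: List[str]) -> str:
    """Build a 302 response that points the UAC at alternate URIs."""
    lines = ["SIP/2.0 302 Moved Temporarily"]
    lines += [f"{name}: {value}" for name, value in request_headers
              if name in COPIED_HEADERS]
    # One Contact header per destination returned by the location service.
    lines += [f"Contact: <{uri}>" for uri in contacts]
    lines.append("Content-Length: 0")
    return "\r\n".join(lines) + "\r\n\r\n"


request_headers = [
    ("Via", "SIP/2.0/UDP ph1.example.com:5060;branch=z9hG4bK83749.1"),
    ("From", "Alice <sip:alice@example.com>;tag=1234567"),
    ("To", "Bob <sip:bob@example.com>"),
    ("Call-ID", "hypothetical-call-id"),
    ("CSeq", "1 INVITE"),
    ("Content-Type", "application/sdp"),
]
print(redirect_response(request_headers, ["sip:bob@192.0.2.10:5060"]))
```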

B2BUA Server Example

RFC 3261 does not define the B2BUA functionality in detail. It describes a B2BUA only as a concatenation of a UAC and a UAS. However, the B2BUA forms an important element in providing centralized call control and feature management in SIP networks. Unlike a proxy, a B2BUA can initiate new SIP calls and modify and terminate existing calls. SIP calls via a B2BUA server result in the creation of two distinct dialogs, which enables it to modify one SIP session without affecting the other session.

B2BUA can act as a third-party call controller (3PCC) and can establish calls between two user agents. Figure 12-5 illustrates B2BUA acting as a 3PCC establishing calls between users A and B. RFC 3725 defines the best current practices for third-party call control in SIP. 3PCC modifies the session characteristics by modifying the Session Description Protocol (SDP) body. SDP is defined in RFC 2327.

Figure 12-5. B2BUA Server Mode of Operation

Figure 12-5 illustrates the steps for call establishment in B2BUA server mode. This illustration is one of the four message flows described in RFC 3725.

  1. B2BUA sends an INVITE to user A. This INVITE contains an SDP body without media lines. This means that the media characteristics will be defined later by another INVITE.

  2. User A is alerted. A 180 Ringing message will be sent from user A to B2BUA, but it is not shown explicitly in this message flow. When the call is answered, a 200 OK is sent to B2BUA. This 200 OK has an SDP body without media lines.

  3. B2BUA sends an ACK to user A.

  4. B2BUA sends an INVITE to user B. The INVITE message does not contain an SDP body.

  5. User B is alerted and answers the call. 200 OK with offer SDP is sent back to the B2BUA.

  6. B2BUA uses the SDP received in the 200 OK to create an INVITE with an SDP body and sends it to user A. The SDP body in the INVITE message is a modified version of the SDP body received in the 200 OK from user B. That is why it is labeled SDP2’ in the message flow.

  7. User A responds with a 200 OK with the answer SDP to B2BUA.

  8. B2BUA sends an ACK with answer SDP to user B.

  9. B2BUA sends an ACK to user A to acknowledge the 200 OK.

  10. A call is established between user A and B. Audio packets are sent between user A and user B using RTP. There are two distinct SIP dialogs—one between user A and B2BUA, and the other between user B and B2BUA.

The B2BUA function also helps in protocol interworking between SIP devices with other protocols, such as H.323 and Media Gateway Control Protocol (MGCP). B2BUA allows transport layer interworking, such as TCP and User Datagram Protocol (UDP), IPv4/IPv6 address mapping, and topology or address hiding in SIP headers, such as Via, Contact, and Record-Route.

The products that leverage B2BUA functions include SIP-based IP-PBXs, softswitches, firewall/NAT traversal applications, call center applications, and conference servers.

SIP Procedures for Registration and Routing

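This section describes the functional aspects of SIP as defined in RFC 3261 and the SIP extension RFCs. The registration and routing aspects covered are how a user agent discovers SIP servers in a network, SIP registration and user mobility, and SIP message routing.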

User Agent Discovering SIP Servers in a Network

The UA needs the IP address of the registrar or proxy server to register and provide SIP service. The UAC, however, might not have the IP address of a SIP proxy and might require mechanisms to discover the address of a SIP proxy server in its domain.

The UA gets the DNS server address during a DHCP procedure to acquire an IP address. The UA initiates a DNS procedure to discover the servers that provide SIP routing capabilities in its network. This allows the UA to reach peer user agents or services in the SIP network without explicit configuration.

The UAC can use DNS procedures (defined in RFC 3263) such as Naming Authority Pointer (NAPTR) to determine what services are supported in a domain. NAPTR records return a set of terminal DNS records, such as a Service Record (SRV) (defined in RFC 2782), to indicate the services and protocols that are supported within the domain. The UAC should filter those records that point to servers supporting SIP.

DNS SRV records help clients discover servers supporting an application protocol such as SIP by querying for a specific service and underlying transport protocol within a domain. To determine the address of SIP servers that support UDP transport in the company.com domain, the UA queries a DNS server for the SRV records of _sip._udp.company.com. The DNS SRV query yields the hostnames and ports of the SIP servers, and DNS A records then resolve those hostnames to SIP server IP addresses. The UA then sends the INVITE request to the resolved IP address.
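The SRV lookup can be illustrated without touching the network. In the sketch below, the record values and hostnames are made up, and the names srv_query_name and order_candidates are assumptions. Servers with a lower priority value are tried first, as RFC 2782 specifies. Ordering equal-priority servers by descending weight is a simplification of the weighted random selection that the RFC describes.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(domain: str, transport: str = "udp", service: str = "sip") -> str:
    """Build the SRV owner name: _service._transport.domain."""
    return f"_{service}._{transport}.{domain}"


def order_candidates(records: List[SrvRecord]) -> List[SrvRecord]:
    """Lowest priority first; higher weight first within a priority (simplified)."""
    return sorted(records, key=lambda r: (r.priority, -r.weight))


print(srv_query_name("company.com"))  # _sip._udp.company.com

# Hypothetical answer to the SRV query above.
answer = [
    SrvRecord(priority=20, weight=0, port=5060, target="backup.company.com"),
    SrvRecord(priority=10, weight=40, port=5060, target="sip2.company.com"),
    SrvRecord(priority=10, weight=60, port=5060, target="sip1.company.com"),
]
for record in order_candidates(answer):
    # Each target hostname would next be resolved through a DNS A query.
    print(record.target, record.port)
```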

SIP Registration and User Mobility

SIP endpoints register with a SIP registrar server. Typically, the registrar and proxy function are implemented within the same server. SIP endpoints involved in the registration process are typically end-user devices such as SIP IP phones or servers providing specialized functions such as voice mail and presence status.

SIP users and services typically have a well-known or public SIP or SIPS URI, also known as an AOR. An AOR should be a globally reachable address. The SIP AOR is just another way to reach a user, similar to a telephone number, and might be on the business card or the home page of a person.

The UA, upon activation, creates a time-limited binding between the user AOR and its current IP address at the registrar. This process is called registration. The AOR of the user is typically configured on a UA such as an IP phone. The IP address of the phone could vary because it is typically provided via DHCP.

The UA sends a SIP REGISTER request to the registrar within the domain, providing its current address in the Contact header. The UA indicates its AOR in the To header of the REGISTER message. The registrar updates the location service database to bind the AOR of the user to his current address or location. The location of the user is also known as the contact address and is conveyed in the Contact header of the REGISTER message. The location database that the proxy or registrar server uses thus maps an AOR to zero or more contact addresses.

The Contact header has an expires parameter that indicates the duration for which this binding is valid. In the call flow illustrated in Figure 12-6, expires is set to 3600 seconds, or one hour. The UA is expected to refresh the registration within the time specified in the expires parameter to keep this binding intact at the registrar. Therefore, the UA sends periodic REGISTER messages to refresh this information. In the absence of a refresh, the registrar deletes the binding.
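The time-limited binding can be sketched as a small in-memory location service. This is an illustration under stated assumptions rather than a registrar implementation: the class name LocationService, its methods, and the addresses used are hypothetical, and the current time is passed in explicitly so that the behavior is easy to follow. Treating an expires value of 0 as removal of the binding follows RFC 3261.

```python
class LocationService:
    """Maps an AOR to zero or more contact addresses, each with an expiry time."""

    def __init__(self):
        self._bindings = {}  # aor -> {contact: absolute expiry time in seconds}

    def register(self, aor, contact, expires, now):
        """Create or refresh a binding; expires == 0 removes it."""
        if expires == 0:
            self._bindings.get(aor, {}).pop(contact, None)
        else:
            self._bindings.setdefault(aor, {})[contact] = now + expires

    def lookup(self, aor, now):
        """Return the contacts whose bindings have not yet expired."""
        contacts = self._bindings.get(aor, {})
        return sorted(c for c, expiry in contacts.items() if expiry > now)


service = LocationService()
service.register("sip:alice@company.com", "sip:alice@192.0.2.5", 3600, now=0)
```

A lookup at 1800 seconds returns the contact, whereas a lookup at 3600 seconds returns nothing unless the UA has refreshed the registration in the meantime.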

Figure 12-6. SIP Registration Process

The proxy uses the updated location database to route SIP requests that are addressed to the target SIP AOR.

This mechanism of registering the contact address with the registrar/location service enables SIP to inherently support user mobility. For example, an employee who has a SIP softphone on his laptop might travel to a different site within the company. When the SIP softphone or UA is turned on, it sends a REGISTER message to its configured SIP registrar and updates the location service with the current address at the visited site. The proxy can now seamlessly send calls for this user to the visited location based on the contact address in the location service database. This is also true if the user is traveling or telecommuting and logs on to the company network over a secure VPN connection. Figure 12-6 illustrates the SIP registration process and usage of associated headers.

SIP Message Routing

A UAC that is acting on behalf of a user wants to establish an audio or audio-video session with another user. The INVITE request line has the SIP AOR of the called user for message routing. The UAC is now ready to send the INVITE to a SIP proxy server for routing the SIP message to the intended recipient.

After receiving a SIP request like INVITE, the proxy uses the AOR from the INVITE request line of the called user to look up the destination or next-hop address before forwarding the request.

SIP proxies are elements that route SIP requests to the UAS and SIP responses to the UAC. A request might traverse multiple proxies before it reaches the target UAS. Each proxy on the way makes routing decisions and modifies the request before forwarding the request to the next-hop device. A SIP proxy might rewrite the Request URI and add a Via header before forwarding the request message to the next-hop device. The proxy might insert additional headers like Record-Route in the request. SIP responses make their way through the same set of proxies as the request, but in the reverse order.

In practice, a SIP proxy is collocated with the SIP registrar server—that is, a proxy server usually implements the SIP registrar function. Thus, the proxy has access to the location database that is created during SIP registration. The UA routes the initial request such as INVITE or SUBSCRIBE to the local proxy in its domain. The proxy is then responsible for routing the SIP request based on its location database or static route information. Thus, the proxy serves as the rendezvous point for all SIP UAs and servers within the domain.

If the proxy is responsible for the domain in the request line, it looks up the location service database. This lookup provides zero or more addresses where the called user could be reached. Note that these contact address bindings are available because of the registration procedure described previously or because of static configuration. If the contact address is a hostname, the procedure described in the previous section applies: the proxy uses DNS SRV and A records to resolve the contact host to an IP address. The proxy then forwards the INVITE request to these locations.

If the proxy is not responsible for the domain in the request line, it forwards the request to the host that is specified in the request line.
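The two cases above can be sketched as a single routing decision. The function name route, the return values, and the sample addresses are assumptions made for illustration. The URI parsing is deliberately simplified and ignores IPv6 literals and escaped characters.

```python
def route(request_uri, local_domains, location_db):
    """Decide how a proxy handles a request, based on the Request-URI domain.

    Returns ("contacts", [...]) when the proxy is responsible for the domain,
    or ("forward", host) when the request must go toward another domain.
    """
    _scheme, _, rest = request_uri.partition(":")          # drop "sip" or "sips"
    hostport = rest.rsplit("@", 1)[-1].split(";", 1)[0]    # drop user part and URI parameters
    host = hostport.split(":", 1)[0].lower()               # drop an optional port
    if host in local_domains:
        # Zero or more contacts, from registration or static configuration.
        return ("contacts", list(location_db.get(request_uri, [])))
    return ("forward", host)


# Hypothetical location database for the company.com domain.
location_db = {"sip:bob@company.com": ["sip:bob@192.0.2.10:5060"]}
local_domains = {"company.com"}
```

An empty contact list for a local user means that the user has no current binding; a real proxy would then typically return a failure response rather than forward the request.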

To summarize, a SIP server such as a proxy or redirect server primarily performs the routing function—that is, it determines the next-hop device to where the SIP message needs to be forwarded. The next hop could be another proxy, redirect server, PSTN gateway, or UA. The SIP servers consult a location database to determine the next-hop address or the contact address of the user. Proxies can also have statically configured routes for devices like gateways into the PSTN or to another domain. A proxy can use DNS SRV or A records to route the message to the next hop.

Routing of Subsequent Requests Within a SIP Dialog

Subsequent SIP requests within a dialog usually do not traverse SIP proxies. UAs typically use the Contact header received in the dialog-creating transaction to send subsequent requests directly to their peer user agent. For example, the INVITE transaction might get routed to the UA of the called party via one or more proxies. This leads to the establishment of a SIP dialog and an active call. When one of the parties involved in a call hangs up, one UA usually sends the BYE request directly to the other UA. This enables the SIP proxies to scale up to provide service to an extremely large number of UAs.

SIP proxies, however, can indicate their willingness to be in the path of subsequent requests within a dialog by inserting a Record-Route header in a dialog-establishing request such as INVITE or SUBSCRIBE. In that case, following the establishment of a dialog, the UAs generate a Route header based on the combination of Record-Route and Contact headers. The UAs and intermediate proxies then use the Route header to route subsequent transactions.
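A minimal sketch of how a UA might derive the routing information for subsequent requests follows. It assumes that every proxy uses loose routing (the lr parameter), in which case the Request-URI is the remote Contact and the Route header lists the recorded proxies. The function name and the proxy URIs are hypothetical.

```python
def subsequent_request_routing(record_route, remote_contact, is_uac):
    """Derive the Request-URI and Route list for in-dialog requests.

    record_route is the Record-Route list in the order it appears in the
    message. The UAC reverses it; the UAS keeps the order as received.
    """
    route_set = list(reversed(record_route)) if is_uac else list(record_route)
    return {"request_uri": remote_contact, "route": route_set}


# The request crossed p1 and then p2, so p2 (the last proxy) is listed first.
record_route = ["<sip:p2.company.com;lr>", "<sip:p1.company.com;lr>"]
from_alice = subsequent_request_routing(record_route, "sip:bob@192.0.2.10", is_uac=True)
direct = subsequent_request_routing([], "sip:bob@192.0.2.10", is_uac=True)
```

With an empty Record-Route list, the route set is empty and the request goes directly to the peer Contact address, which is the behavior described for the BYE request earlier in this section.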

This is useful if the proxy provides services in addition to routing. For example, the proxy might generate accounting records for calls, including timestamps and the parties involved. In this case, the proxy wants to process not only the initial INVITE (that is, the establishment of the call) but also the BYE transaction (that is, the termination of the call). This lets the proxy record the start time, connect time, and disconnect time of a call.

You can also use the proxy to enforce policies or checks at a central point in the network, in which case the proxy inserts a Record-Route header to ensure that all subsequent requests pass through it.

Figure 12-7 illustrates the message flow for a SIP call setup and termination involving the SIP proxy.

Figure 12-7. SIP Call Setup and Teardown Involving the SIP Proxy

Figure 12-7 shows a SIP call setup from Alice to Bob via proxy. The SIP phones of Alice and Bob belong to the same enterprise, which is company.com. Both of their SIP IP phones are registered with the proxy server in the company.com domain.

  1. The UA of Alice sends an INVITE with Request-URI sip:[email protected] to the proxy server. The INVITE request has a unique Call-ID header and a From-Tag. The Contact header in the INVITE request has the address of the UA of Alice.

  2. The proxy server accepts the INVITE and sends a 100 Trying back to the UA of Alice.

  3. The proxy server looks up the location server database, gets the UA address of Bob, and forwards the INVITE to the user agent of Bob.

  4. The UA of Bob accepts the incoming INVITE request and sends 100 Trying back to the proxy.

  5. The user agent of Bob starts ringing to alert the user about the incoming call. The UA of Bob sends 180 Ringing to the proxy to indicate the ringing state.

  6. The proxy forwards the 180 Ringing to the UA of Alice. After the UA of Alice gets a 180, it starts playing a ringback tone to Alice.

  7. Bob answers the call. The UA of Bob sends 200 OK to the proxy. The To header in 200 OK has a To-Tag that the UA of Bob generated. 200 OK has a Contact header that specifies the address of the UA of Bob.

  8. The proxy forwards the 200 OK to the UA of Alice.

  9. The UA of Alice acknowledges the 200 OK and sends an ACK directly to the UA of Bob. An ACK is sent to the UA of Bob based on the Contact header that is received in the 200 OK response. The proxy did not insert a Record-Route header in the INVITE message, so subsequent messages in this dialog will not be routed via the proxy. At this time, the SIP dialog is established. The dialog identifiers are Call-ID, From-Tag, and To-Tag.

  10. Alice and Bob are conversing. Audio packets are sent directly between the phones using RTP over UDP.

  11. Bob disconnects the call. His UA sends a BYE request directly to the UA of Alice using the Contact header received in the initial INVITE message. The flow of audio packets between the two phones is stopped.

  12. The UA of Alice acknowledges the BYE transaction by sending 200 OK. Her phone also initiates call disconnection. The call is now terminated.
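The dialog identifiers mentioned in step 9 can be sketched as follows. Each UA stores the Call-ID together with its own tag and the tag of its peer, so the From-Tag and To-Tag swap roles depending on which side sent the request. The function names and the Call-ID and tag values are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_as_sender(call_id, from_tag, to_tag):
    """Dialog ID seen by the UA that sent the request: its own tag is the From-Tag."""
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)


def dialog_id_as_receiver(call_id, from_tag, to_tag):
    """Dialog ID seen by the UA that received the request: its own tag is the To-Tag."""
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)


CALL_ID, ALICE_TAG, BOB_TAG = "a84b4c76e66710", "1928301774", "a6c85cf"

# Alice sent the INVITE; the 200 OK from Bob supplied the To-Tag.
alice_dialog = dialog_id_as_sender(CALL_ID, from_tag=ALICE_TAG, to_tag=BOB_TAG)

# Bob sends the BYE, so his tag now appears in the From header.
bye_at_alice = dialog_id_as_receiver(CALL_ID, from_tag=BOB_TAG, to_tag=ALICE_TAG)
```

Because both values are equal, the UA of Alice can match the incoming BYE in step 11 to the dialog that the INVITE transaction created.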

Signaling Forking at the Proxy

The proxy forwards the SIP request to one or more contact addresses either in parallel or in sequence. This feature in SIP is known as forking. During sequential forking, the proxy waits for the final response for the forwarded request before forwarding the request to the next location. During parallel forking, requests are forwarded to all the locations in parallel. In each case, the proxy forwards the best final response to the previous hop device, which is the UAC. For example, if the proxy gets 486 Busy and 200 OK from two forked legs, the 200 OK is forwarded to the UAC. The proxy can use the CANCEL request to cancel the forked legs for which it has not received the final response.
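How a proxy might choose the best final response can be sketched as shown here. The preference order (any 2xx first, then 6xx, then the lowest remaining response class) follows the guidance in RFC 3261. Picking the numerically lowest code within a class is a simplification made for this sketch, and the function name is hypothetical.

```python
def best_final_response(status_codes):
    """Pick the final response to forward upstream from the forked legs.

    Provisional (1xx) responses are ignored. Returns None if no leg has
    produced a final response yet.
    """
    finals = [code for code in status_codes if code >= 200]
    if not finals:
        return None
    success = [code for code in finals if code < 300]
    if success:
        return success[0]          # a 2xx is always preferred
    global_failures = [code for code in finals if code >= 600]
    if global_failures:
        return min(global_failures)
    return min(finals)             # lowest class among 3xx, 4xx, and 5xx
```

For the example in the text, best_final_response([486, 200]) selects the 200 OK.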

Sequential forking might be used in find-me, follow-me services, whereas parallel forking might be used for group-ringing applications. The find-me, follow-me service lets users have one phone number, which enables them to be contacted on multiple physical phones such as office, home, and mobile phone in a user-defined sequence. A group-ringing application involves all phones in the group ringing simultaneously for an incoming call.

Enhanced Proxy Routing

The SIP proxy can also provide enhanced routing capability based on local policy, application-defined scripts, or caller preferences. RFC 3841 is a SIP extension that enables a caller to express his preferences about request handling at the servers.

In the PSTN world, the caller does not have the authority or the capability to indicate how to route his calls or which features he prefers. SIP enables this via caller preference headers. These headers let the caller specify whether to route the call via proxy or redirect server, route the call only to the mobile phone of the called party, or reach the voicemail of the called party without ringing his phone.

Preference headers provide for a highly flexible and customizable call-routing application. This capability is discussed later in this chapter.

SIP Extensions

The IETF has defined extensions to the core SIP specifications to support protocol features for advanced services such as presence, application-specific routing, instant messaging, and call features. This section introduces the extensions for SIP Subscribe-Notify, Refer, and routing, based on caller preferences.

SIP Extension Negotiation Mechanism: Require, Supported, Allow Headers

SIP continues to evolve, and new capabilities are being proposed through IETF drafts and RFCs. These new RFCs are extensions to the core SIP RFC 3261. To maintain backward compatibility with baseline SIP implementations and facilitate inter-working with devices that do not support the newer extensions, SIP defines the extension negotiation mechanism. The extension negotiation is achieved using the Require and Supported headers.

SIP mandates that SIP entities that receive a SIP message ignore unknown headers. If the UAC insists that the UAS must understand the SIP extension to process a request, the UAC must indicate this using the Require header. The Require header contains option tags that are defined in the SIP extensions.

SIP extensions can define new header fields within existing methods that cannot be reasonably processed by a UAS or proxy that supports only the core SIP RFC. Thus, the SIP extensions need to define option tags. Option tags are populated in Require or Supported headers.

The Require header indicates that the UAC insists that the UAS must understand the extension for processing the request. If the UAS does not support an option tag in the Require header, it must reject the request with a 420 (Bad Extension) response whose Unsupported header contains the offending option tag. The UAC can resend the request without the extensions, or it can choose to terminate the transaction.
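The Require check at the UAS can be sketched in a few lines. The function name and return shape are assumptions made for illustration, and the option tag foo is a made-up example of an extension that the UAS does not implement. The 420 (Bad Extension) status code is the one that RFC 3261 defines for this case.

```python
def check_require(require_header, supported_tags):
    """Return (420, headers) if any required option tag is unsupported, else None."""
    required = [tag.strip() for tag in require_header.split(",") if tag.strip()]
    unsupported = [tag for tag in required if tag not in supported_tags]
    if unsupported:
        # Echo the offending option tags back to the UAC.
        return 420, {"Unsupported": ", ".join(unsupported)}
    return None


uas_supported = {"100rel", "timer"}
rejection = check_require("100rel, foo", uas_supported)
```

When check_require returns None, the UAS continues to process the request normally.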

The Supported header is an indication to the UAS that the UAC understands a certain extension. It is up to the UAS to decide whether it wants to use that extension in the response messages. For example, option tag 100rel in the INVITE request indicates that the UAC supports RFC 3262. The UAS can choose to send 18x responses reliably by adding Require:100rel in the 18x responses. UAs that implement SIP extensions usually have configuration options that enable an administrator to control the enabling or disabling of the feature.

The Allow header field lists the set of methods supported by the UA that generated the SIP message. The addition of this header in dialog-initiating transactions such as INVITE or SUBSCRIBE enables the UAS to discover which SIP methods the UAC supports. Similarly, the UAS can send this header in the final responses such as 200 OK to indicate similar information to the UAC. For example, an INVITE request received with the Allow header that does not contain the REFER method might lead to the UAS disabling the “transfer” key on its user interface.

SIP UAs can also send an OPTIONS request to query the capabilities of a remote device such as a proxy or UA. An OPTIONS request can include Allow (SIP methods), Accept (content types), and Supported (SIP extensions) headers. The remote device sends back an OPTIONS response (200 OK) containing Allow, Accept, Accept-Language, Accept-Encoding, and Supported headers.

Caller and Callee Preferences

RFC 3841 is a SIP extension that enables a caller to express his preferences about request handling at the intermediate servers. This enables SIP servers to use additional information such as caller preferences for routing a SIP request. This includes matching the capability of the called user end devices with the caller preferences. For example, the caller might want to establish an audio and video session. In that case, the proxy should not route the INVITE request to those contacts that can perform only audio or IM sessions. Similarly, the caller might express an interest to reach the called party on his wireless IP phone only or directly leave a voice-mail message without talking to him.

The UAC indicates its preferences in the INVITE or SUBSCRIBE request by adding the Accept-Contact, Reject-Contact, and Request-Disposition headers (all defined in RFC 3841).

The Request-Disposition header field lets the caller specify preferences for the way a server should process a SIP request. The caller might indicate whether the server should proxy or redirect the SIP request. In the redirect server mode, all contact locations corresponding to the called party are returned to the UAC in a 3xx response in the Contact header. The UA of the caller can apply customized routing policies on the contact locations it receives from the redirect server.

The user can also specify whether the proxy should forward a request to all possible contact locations in parallel or go through them in sequence, contacting the next address when it has received a non-2xx or non-6xx final response for the previous address. For example:

Request-Disposition: proxy, parallel

During registration, the UA can indicate its capabilities such as support of SIP methods, audio, video, and IM to the proxy server. When a UA registers, it can choose to indicate a feature set associated with a registered contact. (See RFC 3840 for details.) During the message routing process, the proxy tries to match the caller preferences with the callee UA capabilities.

Some examples of usage of the headers added to INVITE or SUBSCRIBE requests are as follows:

  • Accept-Contact: *; video;require;explicit—Forces the INVITE to be routed to an endpoint that supports video capability.

  • Accept-Contact: *;msgserver;require;explicit—Indicates that the user wants to access the voice mail of the called party directly.

  • Accept-Contact: *;mobility="mobile";require;explicit—Sends the call only to the wireless phone of the called user.

  • Request-Disposition: proxy, parallel—Indicates that the server should proxy the request and use parallel fork if multiple contacts are present.

  • Reject-Contact: *;msgserver—Indicates that the caller wants to avoid the voice mail of the called party. In this case, the proxy must not include any contact that has the msgserver tag while routing the INVITE transaction. As a result, the caller is not routed to the voice-mail server of the called party.
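The effect of these headers on contact selection can be sketched with a simple filter. This is a strong simplification and not the matching algorithm of RFC 3841, which scores contacts against feature predicates: here every Accept-Contact feature is treated as if it carried require and explicit, and a contact is dropped if it advertises any rejected feature. The function name, the feature strings, and the contact addresses are hypothetical.

```python
def filter_contacts(contacts, accept=(), reject=()):
    """Keep contacts that advertise all accepted features and no rejected feature.

    contacts maps a contact URI to the set of features it registered.
    """
    selected = []
    for uri, features in contacts.items():
        if any(feature in features for feature in reject):
            continue
        if all(feature in features for feature in accept):
            selected.append(uri)
    return selected


# Hypothetical feature sets registered for the called party.
contacts = {
    "sip:bob@192.0.2.10": {"audio", "video"},
    "sip:bob@192.0.2.20": {"audio", "mobility=mobile"},
    "sip:voicemail@192.0.2.30": {"audio", "msgserver"},
}
video_targets = filter_contacts(contacts, accept=["video"])
no_voicemail = filter_contacts(contacts, reject=["msgserver"])
```

The proxy would then fork the request to the selected contacts only, in parallel or in sequence as the Request-Disposition header indicates.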

SIP Event Notification Framework: Subscription and Notifications

RFC 3265 describes the SIP extensions that enable SIP UAs to subscribe for notification from another SIP device when certain events take place. This is a framework by which SIP nodes can request notification from remote peers when monitored events take place. Notifications are generated for network events. Examples of network events include a user registering with or unregistering from the network, a voice mail being deposited in the mailbox of a user or a voice mail being retrieved from the mailbox, or a user changing his online status from idle to busy or away. This is essentially a mechanism to share state information in a distributed system, such as a SIP network. The framework is independent of the events that are being monitored. That is why you can adapt it so easily to a broad range of applications.

For example, a SIP phone might subscribe to the voice-mail status of its user from a SIP-based messaging system. The duration of the subscription is defined by the Expires header in the SUBSCRIBE request.

SUBSCRIBE and NOTIFY Methods

RFC 3265 defines the subscription and notification framework for SIP using two new SIP methods: SUBSCRIBE and NOTIFY.

A SIP entity acts as a subscriber when it sends a SUBSCRIBE for a specific event type, such as message-summary, to a SIP entity that the Request URI identifies. A new header “Event” defines the event type or the class of event types. The Event header is mandatory for SUBSCRIBE requests. The duration of the subscription is indicated in the Expires header.

The Request URI of a SUBSCRIBE request contains enough information to route the request to the appropriate entity per the request routing procedures that SIP outlines. The Request URI of a SUBSCRIBE request also contains enough information to identify the resource for which the event notification is desired, but not necessarily enough information to uniquely identify the nature of the event. The Event header defines the exact state for which the subscription is requested. For example, Event: presence refers to the presence state of the user, whereas Event: reg refers to the registration status of the SIP entity.

The UAS that is processing the SUBSCRIBE request acts as the notifier; it sends a NOTIFY request back to the subscriber whenever a state change takes place, while the subscription is active. For example, when a voice mail is deposited, the SIP-based messaging system sends a NOTIFY to the subscribing SIP phone. NOTIFY requests must contain a Subscription-State header that indicates whether a subscription is active, pending, or terminated.

If the Subscription-State value is pending in the NOTIFY request, the notifier has received the subscription but not authorized it. This might happen when the notifier is waiting for the end-user input to determine whether to accept a subscription from this subscriber.

If the Subscription-State value is active, the notifier has accepted and authorized the subscription. A Subscription-State value of terminated indicates that the notifier has terminated the subscription.
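A subscriber needs to read the Subscription-State header of each NOTIFY to track the subscription. The following sketch parses the header value into the state and its parameters. The function name is hypothetical, and the sample values (expires=3600 and reason=timeout) are illustrative.

```python
def parse_subscription_state(value):
    """Split a Subscription-State value into (state, parameters)."""
    parts = [part.strip() for part in value.split(";")]
    state = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a stray trailing semicolon
        name, _, param_value = part.partition("=")
        params[name.strip().lower()] = param_value.strip()
    return state, params


state, params = parse_subscription_state("active;expires=3600")
```

A subscriber would typically stop expecting notifications once the state is terminated and keep waiting while the state is pending.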

You can also send SUBSCRIBE messages within a preexisting dialog. If you send them outside of the dialog, SUBSCRIBE might cause the establishment of a new dialog. Subsequent NOTIFY messages are sent within the dialog that the SUBSCRIBE message creates. The subscriber can refresh the subscription by sending a new SUBSCRIBE with an Expires value.

Most common uses of this mechanism are for providing voice-mail status notifications, monitoring registration, and showing the presence status of users. You can also use this mechanism to emulate PBX features, which require shared state between devices, such as shared line appearance.

Monitoring Registration State Using the Subscription-Notification Framework

A registration represents a dynamic state that the registrar maintains in the network. Registration state changes when a user registers or unregisters from the network. Applications might be interested in monitoring the registration or online state of users.

For example, the application server subscribes to the registration state of Alice with the registrar. Initially, Alice is not registered, and the registrar indicates that in the initial NOTIFY. Later Alice comes online and registers. The registrar sends a NOTIFY to the application to indicate the new state.

The registration status is carried in the body of the NOTIFY request. It is represented using an XML document in the NOTIFY body. RFC 3680 defines the format of the XML body for the registration state.
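As a sketch of how an application server might read such a body, the following fragment extracts the active contacts for each AOR from a registration-information document. The sample document follows the general shape defined in RFC 3680 (a reginfo element containing registration and contact elements), but the AOR, identifiers, and contact address are made up for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"r": "urn:ietf:params:xml:ns:reginfo"}

# Hypothetical NOTIFY body sent after Alice registers.
SAMPLE_BODY = """
<reginfo xmlns="urn:ietf:params:xml:ns:reginfo" version="1" state="full">
  <registration aor="sip:alice@company.com" id="a7" state="active">
    <contact id="76" state="active" event="registered">
      <uri>sip:alice@192.0.2.5</uri>
    </contact>
  </registration>
</reginfo>
"""


def active_contacts(xml_text):
    """Map each AOR in the document to the URIs of its active contacts."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for registration in root.findall("r:registration", NS):
        uris = [
            contact.findtext("r:uri", namespaces=NS)
            for contact in registration.findall("r:contact", NS)
            if contact.get("state") == "active"
        ]
        result[registration.get("aor")] = uris
    return result


contacts_by_aor = active_contacts(SAMPLE_BODY)
```

An AOR that maps to an empty list corresponds to the initial notification in the example above, where Alice is not yet registered.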

Figure 12-8 illustrates the use of subscription to monitor registration status.

Figure 12-8. Monitoring Registration State Using the Subscribe-Notify Framework

SIP REFER Request

RFC 3515 is a SIP extension that defines a new SIP request called REFER. REFER asks the recipient (identified by the REFER Request-URI) to access the resource described in the Refer-To header. The recipient of the REFER method is also required to send notifications to the sender about the progress made in accessing the resource. The notifications are sent using the NOTIFY method. This RFC also describes a new event package “refer” that is used in the notification messages for REFER. Thus, REFER causes an implicit subscription to be established at the recipient.

You can send REFER within an established dialog, such as within an INVITE dialog to trigger call transfer. In this case, the Refer-To header specifies the SIP URI of the transfer target.

If you send REFER outside of an established dialog, it leads to the establishment of a dialog. An application that can possibly use out-of-dialog REFER is the click-to-dial service. In this case, a user clicks on an online directory that causes his phone to call the desired party. The application server for the click-to-dial feature sends a REFER message to the phone of the initiator with a Refer-To pointing to the remote party. After the phone receives the REFER, it alerts its user (initiator of click-to-dial service) and then places a call by sending an INVITE request to the URI provided in the Refer-To header.

You can add third-party participants in a conference by using REFER. A client can send REFER to the participant, asking him to send an INVITE request to the conference URI. In addition, the client can send a REFER request to the conference controller, asking it to send an INVITE to the participant, requesting him to join the conference bridge. These are some sample usages of the REFER request. REFER is a flexible and powerful concept that you can use for a variety of applications.

Presence and Instant Messaging Overview

Presence describes the willingness, availability, and ability of a person to communicate with another. Presence service enables users to publish their availability status and display messages or icons as a form of self-expression. Instant Messaging (IM) refers to the transfer of text messages between users in near real-time.

IM provides the capability of real-time, text-based communication, but it is useful only if the recipient can participate. Presence provides the ability to subscribe to the online status of a person to determine his willingness to engage in IM sessions and calls. When user presence is integrated within the communication infrastructure, it becomes easy to determine the best possible way to reach the target. For example, a busy executive might be in a conference call and might not be available for another call. However, he might be open for an IM session. Similarly, a missed call list on the SIP phone might indicate the current availability status of callers. This enables the user to prioritize the order in which he returns the calls.

The presence status of a person is the aggregation of the status from various devices such as desk phone, calendar application, IM client on computer, and cell phone. Presence service collects this information and provides a unified view. This aggregation determines not only the availability of a person, but also the best way to reach him. Presence thus increases the likelihood of success in a call attempt.

A basic presence service enables users to publish and share their information with others to make the communication experience more personalized and productive. The presence service enables users to control who gets access to their presence state and the degree to which information is shared. The presence service can provide a default (also known as polite blocking) status such as unavailable to those subscribers who are denied access.

SIP Extensions for IM and Presence

SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE) is an IETF group that works on using SIP and proposing SIP extensions for interoperability across IM and presence services. RFC 3856 defines a presence event package for SIP.

You use the SUBSCRIBE method to subscribe to the presence state of another user. Although you can send subscribes directly to another user, typically you use a presence server to handle presence subscription requests and publish presence status on behalf of a user.

A presence server is a SIP network server that handles presence requests on behalf of the target end users and publishes presence information on behalf of a user. The presence server has an advantage over the peer-to-peer infrastructure because polling is not required to monitor when a remote party has come online. In addition, the server enables the implementation of network-based policy such as security and privacy control.

A UA sends a SUBSCRIBE with a presence event package to subscribe to the presence status of another user. The target user or resource whose presence information is being tracked is known as presentity. Presentity is defined in RFC 2778, which provides a model for presence and instant messaging. A SIP URI typically identifies the presentity.

The SIP proxy routes this SIP request to the presence server in the network. The presence server, acting on behalf of the subscriber, subscribes to the presence status of the target user. The SIP UA or endpoint device of the target user accepts the subscription for presence. This UA is also referred to as the presence agent (PA).

When the target user changes status, the PA notifies the presence server of a presence status change. The SIP NOTIFY request provides the notification, with the presence status contained in an XML body. The presence server subsequently notifies all remote parties subscribing to this presence information.

Because the presence server controls the distribution of sensitive presence information, the presence server must seek permission of the target user whenever a remote party requests subscription to his presence information. If the presence server has a predefined policy, the server uses that to allow or deny the subscription to the presence status. If the server has no such policy, it sends a request to the local user to authorize the subscription or uses the default system-wide policy to handle the subscription.

RFC 3428 extends SIP and defines a new SIP request MESSAGE that enables the transfer of instant messages. MESSAGE requests carry the message content in the form of MIME body parts. Similarly, RFC 3903 defines a new SIP request: PUBLISH for publishing event states such as presence state.
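To illustrate how a MESSAGE request carries its content, the following sketch assembles a minimal request with a text/plain body. The function name, the addresses, and the Call-ID, tag, and branch values are assumptions made for illustration. A real UA would generate unique identifiers and send the result over a SIP transport.

```python
def build_message_request(from_uri, to_uri, text, call_id, from_tag, branch, via_host):
    """Assemble a SIP MESSAGE request as bytes, with a text/plain body."""
    body = text.encode("utf-8")
    lines = [
        f"MESSAGE {to_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <{from_uri}>;tag={from_tag}",
        f"To: <{to_uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 MESSAGE",
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",  # length of the body in bytes
        "",
        "",  # the two empty entries yield the blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("utf-8") + body


request = build_message_request(
    from_uri="sip:alice@company.com",
    to_uri="sip:bob@company.com",
    text="Hello",
    call_id="asd88asd77a@192.0.2.5",
    from_tag="49583",
    branch="z9hG4bK776sgdkse",  # branch values start with the z9hG4bK magic cookie
    via_host="192.0.2.5",
)
```

The recipient answers a MESSAGE request with a final response such as 200 OK, and no dialog is established by the request itself.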

Summary

SIP is an IETF signaling protocol for multimedia applications involving one or more participants. The IETF approach is to create a layered and functional architecture in which highly optimized protocols realize specific features and functionality. SIP is a flexible protocol that supports extensions for new applications and services.

SIP is distributed in nature, enabling better scalability at the network servers. SIP dialog state is maintained at the endpoints. SIP network servers are either stateless or maintain transaction state information for at least 32 seconds.

This chapter provides an overview of SIP and its operation. Refer to the appropriate SIP RFC for further details.
