Chapter 1. Introduction

There is a growing interest, both in the Internet and in the telecommunication industries, in multimedia communication services. An increasing number of Internet users who used to just surf the web or send emails are now becoming addicted to services such as Instant Messaging (IM), online gaming, and voice and video on the Net. These are examples of multimedia communication services delivered over the Internet that are enabled by the Session Initiation Protocol (SIP) in conjunction with other protocols.

In this first introductory chapter, we will explain what we mean by multimedia communication services. We will position these services in the context of the rest of the applications provided over the Internet.

We also want to give the reader a first hint of why SIP plays so crucial a role in the Internet communications space. That will lead us to dive into the importance of the signaling concept. We will underline the relevance of the signaling concept by looking at a very simple example of voice communication.

SIP not only enables voice on the Internet, but also a completely new universe of Total Communication services. To let the reader grasp the possibilities of SIP, we will show some examples of services and commercial products that currently use SIP.

SIP is, like any other Internet protocol, defined and developed by the Internet Engineering Task Force (IETF). More specifically, the core SIP specification is documented in [RFC 3261], and we will be referring throughout this book to this and other Internet specifications. So, in this chapter, we will also try to understand a bit better the SIP-related working groups in the IETF and the specifications they produce.

IP Multimedia Communication Services

A lot of very different services can be offered on top of the Internet and, in general, on top of an Internet Protocol (IP) network—a network based on the Internet Protocol.[1] It is not at all easy to find a categorization of those services from the user’s perspective, but we will try to offer a simple one here, with the purpose of allowing us to understand what the remit of IP multimedia communication services is.

A very high-level approach might split the services offered on the Internet into three different categories or domains from the end-user perspective. A first category might include the infotainment services, that is, those services that give the user access to information and entertainment applications typically stored and executed in remote servers. The web would represent the paradigm for this kind of service. A second category would include the streaming services. These allow the user to access, in real time, either live or stored time-based media content. Video-on-Demand (VOD) or the hot Internet Protocol Television (IPTV) service would fall into this category. The third service type includes the communication services, that is, those that allow people to communicate with each other using different types of media. A voice call or an email exchange would be examples of communication services (Figure 1.1).

Figure 1.1. 

Communication services can be further classified into offline and online. In online communications, both originator and recipient need to be “connected” simultaneously for communication to happen, and the exchange of information occurs immediately between them. Examples of this include a voice call, an IM exchange, or a chess game.

In offline communication services, the involved parties do not necessarily need to be “connected” simultaneously for communication to happen. The popular email service is a good example of this. In the email service, the submission of information is decoupled from its reception by a store and forward mechanism, so the parties can communicate with each other even if they are not connected at the same time. Let us imagine that John wants to send an email to Alice. He switches on his computer, starts the email program, and sends the message. At that point, John closes the program and switches off his computer. Sometime later, Alice starts her email application and checks if new mail has arrived. She sees John’s email and reads it. As we can see, John and Alice do not need to be connected simultaneously for the communication to happen.

The type of information (i.e., media) that can be exchanged in online communication services can be quite diverse. For instance, we might want to exchange real-time media such as voice or video, which have very stringent timing requirements. Packets containing voice samples should be received at regular intervals of some milliseconds so as to allow the receiver to play them back at the appropriate rate.

We might also want to exchange quasi-real-time information—that is, information that has requirements for timely delivery, but not as strong as in the case of voice. An example is an IM session or a chess game. In order to keep the interactivity in the session, data needs to arrive quickly enough—though, in this case, one or two seconds’ delay would not impact the end user’s experience.

Another type of information that we might want to exchange in online communication services is a prestored image or file. This scenario typically occurs in combination with an exchange of other types of media. Take, for instance, the case of John and Alice, who are engaged in a Voice over Internet Protocol (VoIP) conversation. John is at his 3G (third-generation) IP multimedia-enabled phone. Meanwhile, Alice is sitting at home in front of her PC. While talking to Alice, John takes a picture of a beautiful landscape with the camera integrated into his phone. He decides to show the picture to Alice. The image file would, in this case, be sent online and conveyed immediately to the recipient while both parties are talking so that they can comment on it (Figure 1.2).

Figure 1.2. 

Online IP communication services are typically referred to as IP multimedia communication services, and that is the term that we will use throughout this book.

Unlike what occurs in other types of services, signaling plays a key role in IP multimedia communication services. SIP is typically used as the application-level signaling protocol in this domain, and therefore its role is crucial.

It is important to understand that SIP has not been designed to replace existing Internet application-level protocols such as those used for the web (Hypertext Transfer Protocol, or HTTP) or email (Simple Mail Transfer Protocol, or SMTP; Post Office Protocol version 3, or POP3; Internet Message Access Protocol version 4, or IMAP4). On the contrary, SIP covers a piece that was originally missing in the Internet architecture: the signaling mechanism for multimedia communication services. SIP was designed in such a way as to fit smoothly with the existing Internet services and protocols such as web or email, so that, when combined with them, the promise of an all-IP total communications system encompassing all types of services can be made a reality.

It is also important to understand that SIP, all by itself, is not capable of delivering multimedia communication services. It needs to work alongside other protocols to accomplish that function. Most importantly, because SIP is a signaling protocol, it needs to work together with other protocols at the media layer.

The Role of Signaling and Media

In order to get a first understanding of the role of signaling and media protocols in IP multimedia communications, let us start by looking at a very simple example of voice communication on the Internet.

Let us assume that John and Alice, who are both in front of their PCs connected to the Internet, want to have a voice conversation. Each of them has a microphone and a loudspeaker connected to the soundcard in his or her computer. John is running a program such that, when he speaks on the microphone, the soundcard samples and encodes the voice signal into a bitstream. The computer program takes this stream of bits representing voice samples, and puts them into IP packets. These packets are then sent to Alice through the Internet. In order to make the packets reach Alice, the program in John’s computer has to fill in the IP packets with the IP address of Alice’s PC.

At the other end, Alice’s PC receives the IP packets, decodes the voice samples in the payload, and feeds them into the soundcard so that they can be played.
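The sender and receiver steps just described can be sketched in a few lines. This is only an illustration under stated assumptions: the Packet type, the 160-byte payload size, and the address 192.0.2.10 (taken from a range reserved for documentation) are all hypothetical, and no real network traffic is generated.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for an IP packet carrying voice samples."""
    dest_ip: str    # IP address of the receiving PC
    payload: bytes  # encoded voice samples


def packetize(bitstream: bytes, dest_ip: str, payload_size: int = 160) -> List[Packet]:
    """Sender side: cut the encoded bitstream into fixed-size payloads."""
    return [
        Packet(dest_ip, bitstream[i:i + payload_size])
        for i in range(0, len(bitstream), payload_size)
    ]


def depacketize(packets: Iterable[Packet]) -> bytes:
    """Receiver side: recover the sample stream to feed to the soundcard."""
    return b"".join(p.payload for p in packets)


# Usage: 512 bytes of fake "voice" travel from John to Alice.
encoded_voice = bytes(range(256)) * 2
packets = packetize(encoded_voice, "192.0.2.10")
received = depacketize(packets)
```

The sketch assumes that every packet arrives, in order. Real networks give no such guarantee, which is one reason why a dedicated media transport protocol is used in practice.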

In order for this real-time communication to work, this entire process has to be done with minimal latency, and it has to be done very regularly. Fortunately, PC programs can easily achieve this thanks to advances in computer technology.

In our example, we have assumed that the voice samples are carried directly over IP. In practice, instead of conveying the samples directly over IP, an upper-level protocol is generally used. The information exchanged between the communicating parties (in this case, the voice) is typically referred to as media; thus, these protocols are referred to as media transport protocols. Different media transport protocols are specially suited to the type of media that needs to be conveyed. For instance, if the media is voice, a protocol called RTP (Real-time Transport Protocol) is typically used, which runs on top of User Datagram Protocol (UDP)/IP. RTP contains features that facilitate the transport of pure real-time traffic, such as voice, over IP networks. RTP and other media transport protocols are further described in Chapter 10.
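To give a first flavor of what such a protocol adds, the sketch below packs and parses the 12-byte fixed RTP header defined in RFC 3550, which carries a sequence number and a timestamp in front of each chunk of voice samples. The function names are hypothetical, and the sketch deliberately ignores padding, header extensions, and CSRC lists.

```python
import struct

RTP_VERSION = 2
# Fixed header: 2 one-byte fields, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
HEADER = struct.Struct("!BBHII")


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Pack a fixed RTP header followed by the media payload."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = HEADER.pack(
        byte0, byte1, seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF
    )
    return header + payload


def parse_rtp_packet(packet):
    """Return the header fields and payload of a packet built as above."""
    byte0, byte1, seq, timestamp, ssrc = HEADER.unpack_from(packet)
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[HEADER.size:],
    }


# Usage: one packet with 160 bytes of (silent) samples, payload type 0.
packet = build_rtp_packet(0, seq=1, timestamp=160, ssrc=0x1234ABCD,
                          payload=b"\x00" * 160)
fields = parse_rtp_packet(packet)
```

The sequence number lets the receiver detect lost or reordered packets, and the timestamp tells it when each chunk should be played back.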

Figure 1.3 depicts the previous example.

Figure 1.3. 

So we have seen in the example how a simple voice communication might be enabled in the Internet, and we have understood the need for:

  • A media transport protocol to carry the real-time user information (media). In our example, we were using RTP to transport voice, but there are also other protocols more suited for other types of traffic.

  • An application (computer program) in the endpoints able to:

    • capture the voice samples from the microphone and send them over the network using a media transport protocol; and

    • receive the media transport protocol packets, get the voice samples, and feed them to the soundcard to be played.

Are these two aspects enough, or do we need something more in order to enable fully fledged online communications? In order to answer this question, let us look at some of the additional challenges that the previous scenario presents.

First of all, we were assuming that both John and Alice were already available, prepared, and sitting in front of their PCs at the time communication started. Of course they might have agreed sometime before, through other means, that they both would have this communication at a specific date and time so they could be prepared for it—but that is not really a practical approach. Also, it is not efficient, from the resource-utilization perspective, to have the microphone, the loudspeaker, the soundcard, and software program permanently activated and processing the voice signals in the environment or listening for the voice packets being transmitted in the network. What is needed here is a mechanism by which John can first signal to Alice his desire to start a communication with her. This would be like an invitation signal sent from John’s PC to Alice’s. When this signal reaches Alice’s PC, it would need to trigger some alerting mechanism—for example, a ringing audio signal—that can attract Alice’s attention even if she happens not to be in front of the PC at that precise moment. This would be analogous to the ringing of the phone in the traditional, legacy phone system.

Secondly, John may want to be informed about the progress of the communication process. For instance, he may need to know that his invitation went through and that Alice is being alerted. Most importantly, he needs to know when Alice accepts the communication so that he knows when he can start speaking. This event is also very important for the computer program because it will trigger the activation of the necessary resources in the PC (soundcard, IP resources, and so on) just at the precise moment when they will be needed, thus optimizing resource usage.
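The progress of the call as seen from John's side can be pictured as a small state machine. The sketch below is a simplification: the event names borrow the SIP terms INVITE, 180 Ringing, 200 OK, and BYE, but the state names and the transition table are assumptions made for illustration, and many details of real SIP (acknowledgments, errors, timeouts) are left out.

```python
# (current state, received or sent signal) -> next state
TRANSITIONS = {
    ("idle", "INVITE sent"): "calling",
    ("calling", "180 Ringing"): "ringing",      # Alice is being alerted
    ("calling", "200 OK"): "established",       # Alice accepted at once
    ("ringing", "200 OK"): "established",       # Alice accepted after ringing
    ("established", "BYE"): "idle",             # call finished
}


def run_call(events):
    """Follow the call progress and report whether media should be active."""
    state = "idle"
    history = [state]
    for event in events:
        # Events that make no sense in the current state are ignored.
        state = TRANSITIONS.get((state, event), state)
        history.append(state)
    # Soundcard and network resources are needed only once the call is accepted.
    media_active = state == "established"
    return history, media_active


history, media_active = run_call(["INVITE sent", "180 Ringing", "200 OK"])
```

Only when the state becomes "established" does the program need to activate the soundcard and start sending voice packets.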

The third aspect refers to the fact that in order to send voice samples over the network, they first need to be encoded. Likewise, the encoded data needs to be decoded at the receiving end. This coding or decoding process can be done by specialized hardware (the soundcard) or within the communication software running on top of the operating system, which is termed software CODEC (COder/DECoder). In any case, there are quite a few standard ways to code and decode the voice signals, and it is crucial that the codec used in John’s PC matches the one used by Alice. Because there may typically be many different codecs installed in both PCs, it is necessary that, prior to starting the voice communication itself, John and Alice agree on the codecs that they will use for this particular communication. In other words, there is the need to agree on some voice-related parameters (e.g., the voice codecs) before communication can start.
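A simple way to reach such an agreement is to pick the first codec in the caller's preference list that the called party also supports. The sketch below assumes exactly that policy; the function name is hypothetical and the codec names are used only as labels.

```python
def choose_codec(offered, supported):
    """Return the first codec offered by the caller that the callee supports.

    `offered` is ordered by the caller's preference. Returns None when the
    two parties have no codec in common, in which case no voice
    communication is possible.
    """
    supported_set = set(supported)
    for codec in offered:
        if codec in supported_set:
            return codec
    return None


johns_codecs = ["G.722", "PCMU", "PCMA"]   # in John's order of preference
alices_codecs = ["PCMA", "PCMU"]
agreed = choose_codec(johns_codecs, alices_codecs)
```

Here John prefers G.722, but Alice does not support it, so the first common choice in John's list is used.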

The fourth aspect refers to the way addressing is done. In our example, we said John’s program needed to add Alice’s computer IP address as the destination address in the IP packets that it sent to Alice. This is actually cumbersome because it forces John to learn Alice’s IP address as a set of four numbers, each of them ranging from 0 to 255.[2] Most importantly, a true communication system should be available for the user irrespective of his or her actual location (IP address). Alice would expect to be able to use the service from her PC at home connected to the Internet through Asymmetric Digital Subscriber Line (ADSL), but also when at the office, when traveling with her laptop, or with her IP Multimedia Subsystem (IMS)-enabled mobile phone. The problem is that, as location changes, so does the IP address of the device. So how can we guarantee that Alice will be able to receive the call from John irrespective of her location?
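One way to picture a solution is a lookup table that maps a stable user identity to whatever IP address the user's device currently has. The sketch below is a toy version of that idea; the class name, the identity format, and the addresses (taken from ranges reserved for documentation) are assumptions made for illustration.

```python
from typing import Dict, Optional


class LocationRegistry:
    """Hypothetical table from a stable user identity to a current IP address."""

    def __init__(self) -> None:
        self._current_ip: Dict[str, str] = {}

    def register(self, identity: str, ip_address: str) -> None:
        """Called by the user's device whenever it attaches at a new address."""
        self._current_ip[identity] = ip_address

    def lookup(self, identity: str) -> Optional[str]:
        """Called on the caller's behalf to find where to send the invitation."""
        return self._current_ip.get(identity)


registry = LocationRegistry()
registry.register("alice@example.com", "192.0.2.10")     # Alice at home
registry.register("alice@example.com", "198.51.100.7")   # later, at the office
where_is_alice = registry.lookup("alice@example.com")
```

John only needs to know the identity "alice@example.com"; the table answers with the address Alice registered most recently.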

All the aspects highlighted above call for the need to exchange some extra information between John and Alice. This is not actually voice information (media), but rather, information that helps John and Alice to control the way voice communication occurs. This control information is called signaling, and is sent in messages between Alice’s and John’s computers according to some signaling protocol. SIP is one such signaling protocol that can convey this type of information, but there are also others.

The relevance of signaling in this context is huge—not just in order to cope with the basic call scenarios, but also to enable more-complex multimedia value-added services. Accordingly, SIP plays a major role in making all these new services possible.

So we now have a clearer picture of what a true multimedia communication system requires in terms of information exchanges:

  • Exchange of media information (voice or others). This is governed by a media transport protocol such as RTP or others.

  • Exchange of control information (signaling). This is governed by a signaling protocol such as SIP or others.

The set of functions and elements that participate in the processing and exchange of the media are said to form the media plane (also called the user plane). The set of functions and elements that participate in the processing and exchange of the signaling are said to constitute the control plane (also called the signaling plane). Both the media plane and the control plane are integral parts of an IP multimedia communication system.

Figure 1.4 shows the two planes in a multimedia communication.

Figure 1.4. 

Both SIP and RTP are IP application-level protocols, meaning that they are sitting on top of the TCP/IP stack, and therefore they use the services provided by transport protocols such as UDP or TCP. Chapter 3 further explains the TCP/IP stack, and shows how the different protocols used in multimedia communications fit with it.

Media and control protocols are independent. For example, in the recent past, a lot of IP multimedia communication systems were using RTP in the media plane. Meanwhile, in the control plane, International Telecommunication Union (ITU) H.323 signaling protocols were used. Today, SIP is becoming commonplace as the signaling protocol in IP networks, and it is very often used with RTP in the media plane. Nevertheless, SIP can also be used with other media transport protocols. For instance, some session-based Instant Messaging systems use SIP as the signaling protocol and Message Session Relay Protocol (MSRP) as the media transport protocol.

The previous example of voice communication between John and Alice was just meant to help the reader grasp the key concepts around signaling, and does not show all the power of SIP-enabled IP communications. More sophisticated examples might be built with SIP. For example, communication between John and Alice might start out by being just voice, but after some minutes, Alice might want to add video so that she can show John the cake she is preparing for her birthday. At some point in time, they may even want to start playing a chess game through the Internet while they still talk to each other, or they may decide to stop voice communication but exchange real-time text messages while playing chess. Once they stop playing, Alice may want to show John a map of the new house she has recently bought, and write on top of it in order to mark where, for instance, the different rooms are located.

This would be an example of an IP multimedia call. Throughout this book, we will use the term “call” to refer to IP multimedia calls enabled by SIP. Obviously, this call concept is different from the traditional (voice communication only) telephone call that we are all used to. It is not only that the underlying technologies are completely different (the telephone call is supported by the legacy circuit-oriented telephone system, whereas the multimedia call is supported by the Internet), but also that the number of services and communication experiences that SIP-enabled IP communications can offer has no match in the legacy telephone system.

Types of Services Enabled by SIP

In this section, we intend to give an overview of the different types of services that can be enabled by SIP. The list is not intended to be exhaustive.

Basic Session Management Services

In the previous section, we explained that SIP is a crucial element to provide the main control functions needed in multimedia communication scenarios. So, first of all, SIP can be used to enable communications based on a variety of media, such as:

  • Voice communication.

  • Video communication.

  • Instant Messaging communication: interactive online exchange of (typically short) messages.

  • Text over IP: exchange of real-time text.

  • Peer-to-peer gaming: exchange of conversational data in order to implement peer-to-peer games (e.g., a chess game).

  • Whiteboarding: exchange of conversational data to implement a whiteboard service. In this type of service, each user sees a whiteboard on the screen of their device and can draw or write over it. Changes on the whiteboard made by the users are kept in sync with one another.

  • File transfer: exchange of peer-to-peer data.

Moreover, SIP provides off-the-shelf support for combining different types of media in the same communication. All the combinations one can imagine are possible; the most common are:

  • Voice combined with video, so-called video-telephony.

  • Voice combined with IM: Users can talk to each other while they also exchange messages.

  • Voice combined with real-time text: Users can talk to each other and exchange text at the same time. For example, at some point in the conversation, one of the parties does not understand the word the other is pronouncing, and asks the other participant to write it and send it in real time.

  • Voice combined with online transfer of a picture: The users can share a live picture while they are talking to each other. Imagine Alice and John talking to each other. At some point in time, Alice tells John about the house she has bought, and sends online a picture to him so they can share it while talking.

  • Voice combined with the online transfer of a generic file.

  • Voice combined with gaming: John and Alice can play a chess game while talking to each other.

  • Voice combined with whiteboarding: Users share a whiteboard while having a voice conversation.

These are just some possible examples; there are many others. As we can see, in most cases, the main media component is voice, to which an additional data medium is added. These particular scenarios are sometimes referred to as “rich voice.”

Enhanced Control Services

SIP signaling, as defined in the core SIP specification [RFC 3261] and its extensions, is a powerful tool to provide new services. SIP and Session Description Protocol (SDP) signaling convey useful information that can be used to provide enhanced services. Next follow some examples (the list is by no means exhaustive):

  • Identification of the originator of a session: SIP includes information about the identity of the caller that can be presented to the called party. Likewise, SIP also provides the means for the caller to prevent his or her identity from being shown to the recipient.

  • Multimedia identification of the originator: The originator can include multimedia content in the session initiation request so that it is rendered to the recipient when alerting is done. The multimedia content might consist of a picture, a personalized ringtone, a business card, and so on.

  • A call-blocking application might decide to block a call based on a combination of different signaling parameters, for example, originator, destination, media type, or other external parameters such as date or time of day.

  • A call-hold service can very easily be offered with SIP/SDP because SDP already provides the semantics for activating and deactivating the media being sent or received.

  • A call-forwarding service can also very easily be offered by modifying specific SIP parameters that represent the destination of the call when certain conditions are met (e.g., original recipient does not answer).

The previous examples highlight some of the features that can be offered based on core SIP signaling parameters. These types of services can typically be applied irrespective of the media being exchanged.
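As an illustration of the call-hold case, SDP describes the direction of a media stream with one of the attributes sendrecv, sendonly, recvonly, or inactive. The sketch below changes that attribute in a session description to put a call on hold and to resume it. It assumes a description with a single media section; the function name and the sample description are hypothetical, and a real implementation would have to update other fields of the description as well.

```python
DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")


def set_media_direction(sdp: str, direction: str) -> str:
    """Return a copy of the SDP text with its direction attribute replaced.

    Assumes a single media section, so the attribute can simply be
    appended at the end of the description.
    """
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    old_attributes = {"a=" + d for d in DIRECTIONS}
    kept = [line for line in sdp.splitlines() if line.strip() not in old_attributes]
    kept.append("a=" + direction)
    return "\r\n".join(kept) + "\r\n"


# Hypothetical description of John's audio stream.
OFFER = "\r\n".join([
    "v=0",
    "o=john 2890844526 2890844526 IN IP4 192.0.2.1",
    "s=-",
    "c=IN IP4 192.0.2.1",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=sendrecv",
]) + "\r\n"

held = set_media_direction(OFFER, "sendonly")      # John puts Alice on hold
resumed = set_media_direction(held, "sendrecv")    # John resumes the call
```

No change is needed in the media plane software itself: the hold is expressed entirely through the signaling.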

SIP also offers a number of primitives that can be used to deliver call-control services. These types of services typically involve complex manipulation of the signaling relationships between participants or a more-complex processing of different signaling fields. Examples of these services are:

  • Call transfer: John and Alice are communicating with each other, and then John decides to transfer the call to Bob so that Alice and Bob can communicate with each other.

  • Hunting groups: A call to a group number may be distributed according to different algorithms between a set of individuals pertaining to the group.

  • Call queuing: Calls to a group number can be queued before they are distributed among the agents.

  • Closed-user-group dialing: Users within a group can use short codes to call each other.

  • Ring back when free: A call from John to Alice is unsuccessful because Alice is busy with another call. As soon as Alice hangs up, a call is immediately placed back to John.

  • Simultaneous ringing: Several devices can share the same number and can ring in parallel when someone dials that number.

  • Call pickup: Two or more users can be part of a group that can pick up each other’s calls. For instance, a call is placed to John, who is not currently at his PC. Alice hears his PC ringing and instructs her phone to take John’s incoming call.

  • Click to dial: by pressing a button in a web page, a two-party call is initiated.

A comprehensive list of SIP call-control services is provided in [draft-ietf-sipping-service-examples].
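To make one of these services more concrete, the sketch below shows one possible distribution algorithm for a hunting group: round-robin selection that skips members who are busy. The class and method names are hypothetical, and round-robin is only one of the many algorithms such a service might use.

```python
class HuntGroup:
    """Hypothetical hunting group that distributes calls round-robin."""

    def __init__(self, members):
        self._members = list(members)
        self._next = 0  # index of the member to try first for the next call

    def pick(self, busy=()):
        """Return the next member who is not busy, or None if all are busy."""
        count = len(self._members)
        for offset in range(count):
            index = (self._next + offset) % count
            candidate = self._members[index]
            if candidate not in busy:
                # The following call starts with the member after this one.
                self._next = (index + 1) % count
                return candidate
        return None


group = HuntGroup(["john", "alice", "bob"])
first = group.pick()                  # "john"
second = group.pick()                 # "alice"
third = group.pick(busy={"bob"})      # bob is busy, so back to "john"
```

When `pick` returns None, a service such as call queuing could hold the call until a member becomes free.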

Media Services

Not all the services are delivered just through manipulation of the signaling. In other cases, there is a need for specific functions at the media plane, which are controlled through SIP. Examples of such services are:

  • Voice mail: Allows users to have their calls redirected to the Voice Mail System (VMS), where the callers can leave their messages. The mailbox owner can later retrieve those messages.

  • Music on hold: Whenever a user puts another user on hold, the first party can select the music to be played to the other party while waiting.

  • Ringback tones: The called user can decide what ringback tone the caller will hear during the alerting phase.

  • Do not disturb: The caller is redirected to specific announcements or menus based on configuration parameters, called-user preferences, and so forth.

Conferencing Services

Conferencing services deserve a separate consideration given their relevance and complexity. In order to provide a fully featured conferencing service, other protocols, in addition to SIP, need to be supported. Still, SIP provides the key call-signaling features.

Examples of conferencing services that can be enabled by SIP are:

  • Multiparty call: John is in a call with Alice and decides, during the call, to add Bob to the conversation.

  • Dial-in conferences: Participants in the conference just dial a predetermined identifier and are automatically connected to the conference.

  • Dial-out conferences: Conferencing servers can be configured to start a conference at a specific time and automatically initiate the establishment of the corresponding sessions with all the participants.

Presence

SIP offers the tools for publishing, subscribing to, and notifying information about the availability and willingness of users to set up multimedia communications. This feature is particularly useful when used in conjunction with other services such as IM and voice. A typical example of presence-enabled voice service is a multidevice application. It allows the user to have a set of different devices (PC, mobile, laptop, and so on), all with the same identity. When a call is addressed to the common identity, a presence check could be done in order to determine at what device the user is available and/or willing to take the incoming call. Then the call is routed to the appropriate device.
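The publish, subscribe, and notify pattern behind presence can be sketched as follows. This is a toy, in-memory illustration: the class, its method names, and the status strings are assumptions, and it says nothing about how the corresponding SIP messages actually look.

```python
from collections import defaultdict


class PresenceService:
    """Hypothetical in-memory presence service."""

    def __init__(self):
        self._status = {}                    # user -> last published status
        self._watchers = defaultdict(list)   # user -> callbacks to notify

    def subscribe(self, watched_user, callback):
        """Ask to be notified whenever `watched_user` publishes a new status."""
        self._watchers[watched_user].append(callback)

    def publish(self, user, status):
        """Store the new status and notify every watcher of `user`."""
        self._status[user] = status
        for callback in self._watchers[user]:
            callback(user, status)

    def status_of(self, user):
        """Return the last published status, or None if nothing was published."""
        return self._status.get(user)


notifications = []
service = PresenceService()
# John watches Alice's presence.
service.subscribe("alice", lambda user, status: notifications.append((user, status)))
service.publish("alice", "available")
service.publish("bob", "busy")   # nobody watches Bob, so no notification
```

A multidevice application could use `status_of` for each of the user's devices to decide where to route an incoming call.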

Moreover, the presence service is typically related to other services that are responsible for managing lists of groups of users (buddy lists). Presence information can typically be transparently shared only within these groups of users. These capabilities allow the development of community-based services.

Examples of SIP Applications

In order to illustrate the concepts in the previous section, we will describe a few examples of commercial SIP applications that group some of the features previously mentioned.

In the web pages www.pulver.com and www.sipforum.org, the user can find a list of SIP commercial products in different categories:

  • Gateways

  • Servers

  • Firewalls and NATs

  • Software components

  • Software tools

  • Terminals

  • PBXs (Private Branch Exchanges)

  • Application servers

A list of SIP service providers can also be found at www.pulver.com.

Within the SIP environment, there are a lot of open-source initiatives. We recommend that the reader visit www.sipforum.org, where he or she will be able to find more information on open-source SIP products and components.

SIP Communicator Applications

An SIP communicator is an SIP-based multimedia application hosted in the endpoint (i.e., PC, mobile device, laptop, and so on) that provides the capability to establish communications in different media such as voice, video, messaging, and file sharing, all of them integrated with presence and contact-list management. The typical interface that these applications provide shows a list of contact names. Associated with each contact is typically an icon showing the presence information for that contact. When the user clicks on a contact, a menu is displayed that asks for the type of communication requested (voice, video, messaging). The user clicks on the desired option, and the application sets up the required media session.

These types of services, though strongly based on the terminals, do frequently also require some back-end infrastructure, for instance, in order to store the presence and contact information.

SIP communicator applications allow for building on the “community” concept, and are becoming increasingly popular across the Internet. Such Internet players as Microsoft and Yahoo! have a significant customer base of SIP-based communicator customers.

Figure 1.5 shows the layout of one example of these applications.

Figure 1.5. 

Figure 1.6. 

IP PBX Applications

In the enterprise environment, an alternative to traditional circuit-switched PBXs is becoming increasingly popular: the IP PBX. An IP PBX offers enterprises three main benefits:

  • It allows them to converge their voice and data infrastructure over a unique IP network, thus resulting in OPEX (Operational Expenditure) reduction.

  • It supports the existing TDM (Time Division Multiplex) PBX services, but adds additional ones, plus the capability to integrate other media components and data services.

  • IP PBXs are cheaper, as well as easier to install and maintain, than traditional TDM PBXs.

Initial IP PBXs were not based on SIP, but on protocols such as H.323[3] and SCCP (Skinny Client Control Protocol).[4] Today, SIP IP PBXs are becoming commonplace in the market and are rapidly replacing the legacy PBXs. Big PBX providers that play in this space include Alcatel-Lucent, Avaya, Siemens, and Cisco Systems, although there are also many other smaller companies. Moreover, SIP-based open-source PBX applications are also becoming increasingly popular. The most famous example in this remit is the open-source Asterisk IP PBX (www.asterisk.org).

Enterprise Total Communication Systems

In order to cover enterprise communication needs, voice is a must, but other applications are also increasingly in demand, such as IM, video, conferencing, remote control of the desk phone from the PC, and presence. Moreover, there is an increasing demand to integrate this communication infrastructure with existing data applications such as email, text editors, and spreadsheets. In this kind of environment, the user might be writing a document with a text editor and come to a point where he or she needs to discuss something with some colleagues. By clicking on a menu item within the text editor, the user can select two colleagues and ask the application to set up a video conference with them.

Products such as the Microsoft Office Communications Server are becoming increasingly popular in this remit.

IP Centrex Applications

Many companies do not want to bother with managing a PBX infrastructure. They just want to outsource their communication services to a reliable partner. An ideal solution for service providers in these cases is to use IP Centrex (Central Office Exchange Service) applications. These applications are situated within the partners’ premises, and an IP network is integrated into the customer’s premises for connectivity with IP phones. A typical IP Centrex application can offer PBX-like features to several companies.

Some typical features offered by IP Centrex solutions are:

  • Calling Line Identification Presentation (CLIP)

  • Calling Line Identification Restriction (CLIR)

  • Call forwarding

  • Call blocking

  • Call return

  • Call pickup

  • Call hold

  • Call park

  • Music on hold

  • Hunt groups

  • Closed user groups

  • Conferencing

  • Call transfer

  • Dual ringing

  • Barge in

  • Call toggle

Vendors such as BroadSoft, Sylantro Systems, and Netcentrex (which was acquired by Comverse Converged IP Communications) have popular products in this space.

PSTN Emulation Applications

The evolution of the Public Switched Telephone Network (PSTN) implies changing to a SIP network infrastructure. In order to leverage the full power of SIP, intelligent terminals with full SIP capability are needed. However, this is not yet a reality in the fixed operators’ environment. In the meantime, some operators are starting to move their network infrastructure toward a SIP-based one without yet changing the end-user terminal and access. In this scenario, there is a need to offer the basic PSTN services (call forwarding, CLIP,[5] CLIR,[6] and so on) on the new SIP infrastructure. These are typically referred to as PSTN emulation services.

The Internet Engineering Task Force (IETF)

According to [RFC 3233], the IETF is an “open global community of network designers, operators, vendors, and researchers producing technical specifications for the evolution of the Internet architecture and the smooth operation of the Internet.” The IETF is the main body responsible for defining Internet standards.

The IETF is formally an activity that falls under the umbrella of the Internet Society (ISOC). The society is an international nonprofit organization with individual and corporate members all around the world whose aim is to promote Internet use and access. Among other things, the ISOC provides insurance service and some financial and logistical support to the IETF.

The IETF is organized into eight study areas. These are:

  • Applications

  • General

  • Internet

  • Operations and management

  • Real-time applications and infrastructure

  • Routing

  • Security

  • Transport

The study areas are broken down into Working Groups (WGs). Each WG focuses on a specific topic, and is intended to complete work on that topic and then shut down. Most of a WG’s work is done via email. Anyone can participate in an IETF Working Group, and decisions are made by rough consensus.

The Internet Standards process is defined in [RFC 2026].

The IETF Publications: RFCs and I-Ds

There are two types of IETF publications: Requests for Comments (RFCs) and Internet Drafts (I-Ds). The RFC series contains the Internet specifications, whereas Internet Drafts are just draft specifications made available for informal review and comment; an I-D may or may not become an RFC. There are three types of RFCs: Standards Track, non–Standards Track, and Best Current Practice (BCP). Every RFC has a unique number and is stored permanently at the IETF web site: http://www.ietf.org.

Standards Track RFCs

This category of RFCs includes those specifications that are intended to become Internet Standards. In the process of becoming an Internet Standard, these RFCs evolve through a set of maturity levels known as the Standards Track. These maturity levels are proposed standard, draft standard, and standard.

When a specification in a Working Group becomes stable, it turns into a proposed standard RFC for which a number is assigned. Proposed standard is the entry maturity level into the Standards Track. In addition to being stable, “a proposed standard has to be well-understood, must have received significant review, and must enjoy enough community interest.”

Proposed standards must remain at that level for at least six months before they can become draft standards. For an RFC to become a draft standard, it must have at least two interoperable implementations with sufficient operational experience. An RFC that becomes a draft standard is assigned a new RFC number. The old RFC is not deleted from the RFC series, but it is made clear that the old RFC has been made obsolete by a new RFC. The SIP specification shows how one RFC makes another obsolete. It was first published as a proposed standard in RFC 2543, which was then made obsolete by RFC 3261, which also has, at the time of this writing, the status of proposed standard. Both of them can be found in the IETF directory for RFCs: http://www.ietf.org/rfc.html.

Draft standards must remain at that level for at least four months, and at least one IETF meeting has to be held, before the draft standard can become an Internet Standard. If the specification has achieved a very high maturity level and has obtained significant implementation and successful operational experience, it may become an Internet Standard.

Internet Standards are published in the STD series, but they also retain their RFC number. For instance, the Real-time Transport Protocol (RTP) is Internet Standard STD 64 and [RFC 3550].
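The progression through the maturity levels can be pictured as a small state machine. The following Python sketch is only an illustration of the waiting-time rule described above; the names `Maturity`, `MIN_MONTHS_AT_LEVEL`, and `next_level` are hypothetical, and the other requirements (interoperable implementations, operational experience, the IETF meeting) are deliberately left out.

```python
from enum import Enum
from typing import Optional


class Maturity(Enum):
    """The three Standards Track maturity levels named in the text."""
    PROPOSED = "proposed standard"
    DRAFT = "draft standard"
    STANDARD = "standard"


# Minimum number of months a specification stays at a level before it
# may advance (six for proposed, four for draft, as stated in the text).
MIN_MONTHS_AT_LEVEL = {Maturity.PROPOSED: 6, Maturity.DRAFT: 4}

# Which level follows which; "standard" is the final level.
NEXT_LEVEL = {
    Maturity.PROPOSED: Maturity.DRAFT,
    Maturity.DRAFT: Maturity.STANDARD,
}


def next_level(level: Maturity, months_at_level: int) -> Optional[Maturity]:
    """Return the level a specification may advance to, or None.

    Only the minimum waiting time is checked; this is an assumption
    made to keep the sketch short, not the full IETF procedure.
    """
    target = NEXT_LEVEL.get(level)
    if target is None:
        return None  # already an Internet Standard
    if months_at_level < MIN_MONTHS_AT_LEVEL[level]:
        return None  # has not stayed long enough at this level
    return target
```

For instance, `next_level(Maturity.PROPOSED, 5)` returns `None`, whereas the same call with six months returns `Maturity.DRAFT`.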

Non–Standards Track RFCs

Non–Standards Track RFCs do not define a standard in any sense. These specifications can be further classified into “Experimental,” “Informational,” and “Historic.”

Experimental specifications contain the conclusions and the experiences of some research-and-development (R&D) effort.

Informational specifications contain general information for the Internet community. They do not represent an Internet community consensus or recommendation.

The term “Historic” applies to those specifications that have been superseded by a more recent one or that are considered to be obsolete for any other reason.

Best Current Practice RFCs

BCP RFCs standardize practices and the results of community deliberations. BCP RFCs are subject to the same basic set of procedures as Standards Track documents, but do not have maturity levels. BCP RFCs are published in the BCP series, but they also retain their RFC number. For instance, the IETF specification “SIP Basic Call Flow Examples” is [RFC 3665] and also BCP 75.

Internet Drafts (I-Ds)

Internet Drafts are draft versions of specifications that are under development. These draft specifications may or may not become RFCs. Internet drafts represent the current status of work and discussion within a Working Group regarding a particular topic. The IETF makes them public so as to obtain informal review and comments.

An Internet draft is valid for only six months. After that time, it becomes an RFC, is replaced by a new version of the draft, or is deleted.

Internet drafts are named using the following format: draft-ietf-working group-title-xx, where xx is the version of the draft starting from 00. For example, the Internet draft from the SIPPING WG called “Session Initiation Protocol Call Control - Transfer” is named draft-ietf-sipping-cc-transfer-07.
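The naming format above is regular enough to be split mechanically. The following Python sketch shows one way to do so; the names `DraftName` and `parse_draft_name` are hypothetical, and the pattern assumes that the working group name contains no hyphens while the title may contain several.

```python
import re
from typing import NamedTuple


class DraftName(NamedTuple):
    working_group: str
    title: str
    version: int


# draft-ietf-<working group>-<title>-<xx>
# Assumption: the working group has no hyphens; the title may have some.
_DRAFT_RE = re.compile(r"^draft-ietf-([a-z0-9]+)-([a-z0-9-]+)-(\d{2})$")


def parse_draft_name(name: str) -> DraftName:
    """Split a working-group draft name into its three parts."""
    match = _DRAFT_RE.match(name)
    if match is None:
        raise ValueError(f"not a working-group draft name: {name!r}")
    working_group, title, version = match.groups()
    return DraftName(working_group, title, int(version))


parsed = parse_draft_name("draft-ietf-sipping-cc-transfer-07")
print(parsed.working_group, parsed.title, parsed.version)
# prints: sipping cc-transfer 7
```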

The IETF recommends that Internet drafts be referred to simply as “work in progress.”

SIP in the IETF

Work on SIP in the IETF started in the MMUSIC Working Group, which produced RFC 2543 in March 1999. In September of that year, a new SIP Working Group was created. Later, that group was further split into two Working Groups: the SIP WG and the SIPPING WG.

The SIP WG is devoted to the SIP core protocol and its extensions, whereas the SIPPING WG is more focused on investigating new SIP applications and setting requirements for new SIP extensions.

In June 2002, the SIP WG published RFC 3261, which revises RFC 2543. It is mostly backward compatible with RFC 2543. RFC 3261 adds some modifications, and presents the protocol in a much cleaner layered approach. This book uses RFC 3261 (not RFC 2543) as the source of information for SIP.

Next follows a list that includes some of the working groups that, at the time of this writing, are to some extent related to SIP and multimedia communications.

SIP WG

The Session Initiation Protocol (SIP) Working Group is chartered to maintain and continue the development of SIP, currently specified as proposed standard RFC 3261, and its family of extensions.

SIPPING WG

The Session Initiation Protocol Project INvestiGation (SIPPING) Working Group is chartered to document the use of SIP for several applications related to telephony and multimedia, and to develop requirements for extensions to SIP needed for those applications.

MMUSIC WG

The Multiparty MUltimedia SessIon Control (MMUSIC) Working Group was originally chartered to develop protocols to support Internet teleconferencing and multimedia communications. It produced the first SIP specification as RFC 2543. After that, responsibility for the SIP specification was moved to the SIP Working Group. The SIP WG created RFC 3261, which renders RFC 2543 obsolete.

Among other things, the MMUSIC WG is now responsible for maintaining and continuing the development of the Session Description Protocol (SDP), currently specified as proposed standard [RFC 4566].

SIMPLE WG

The SIP Instant Messaging and Presence Leveraging Extensions (SIMPLE) Working Group focuses on the application of the Session Initiation Protocol to the suite of services collectively known as Instant Messaging and Presence (IMP).

ENUM WG

The TElephone NUmber Mapping (ENUM) Working Group has defined a DNS[7]-based architecture and protocol [RFC 3761] by which a telephone number[8] can be expressed as a Fully Qualified Domain Name (FQDN)[9] in a defined Internet domain (e164.arpa).
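To make the mapping concrete, the following Python sketch turns a telephone number into its ENUM domain name: it keeps only the digits, reverses their order, separates them with dots, and appends the e164.arpa domain. The function name `enum_domain` and the sample number are hypothetical, and the sketch covers only the construction of the name, not the DNS query that would follow.

```python
def enum_domain(e164_number: str, suffix: str = "e164.arpa") -> str:
    """Build the ENUM domain name for a telephone number.

    Steps: drop everything except the digits (the leading "+", spaces,
    hyphens), reverse the digits, join them with dots, add the suffix.
    """
    digits = [ch for ch in e164_number if ch in "0123456789"]
    if not digits:
        raise ValueError("telephone number contains no digits")
    return ".".join(reversed(digits)) + "." + suffix


print(enum_domain("+44 20 7946 0123"))
# prints: 3.2.1.0.6.4.9.7.0.2.4.4.e164.arpa
```

The digits are reversed because domain names are read from the most specific label on the left to the most general on the right, so the country code ends up next to e164.arpa.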

IPTEL WG

The focus of the IP Telephony (IPTEL) Working Group is on the problems related to naming and routing for Voice over Internet Protocol (VoIP) protocols. In particular, this Working Group is responsible for [RFC 3966], which defines the tel URI format for telephone numbers.

AVT WG

The Audio/Video Transport (AVT) Working Group was formed to specify a protocol for real-time transmission of audio and video over UDP/IP. In particular, this Working Group is responsible for [RFC 3550], which corresponds to Internet standard STD 64 and defines the Real-time Transport Protocol (RTP).

Summary

In this chapter, we have seen what multimedia communications are and the role that SIP plays in them. We also looked at some examples of services that might be delivered through SIP. The way SIP and related protocols work is defined in Internet specifications generated by the IETF. We also learned a bit about how the IETF is organized and the types of documents that it generates. In the next chapter, we will review the history of multimedia services in the Internet. As happens in other areas of knowledge, understanding the history can help us understand why things are as they are today.



[1] Internet Protocol and IP networks are reviewed in Chapter 3.

[2] For an IPv4 address.

[3] H.323 is an umbrella specification from the ITU (International Telecommunication Union) that defines protocols to provide packet-based multimedia services, originally targeted more toward a LAN environment.

[4] SCCP is a proprietary terminal-control protocol owned by Cisco Systems, Inc.

[5] CLIP is a PSTN service that allows the called party to know the identity of the caller.

[6] CLIR is a PSTN service that allows the calling party to hide his or her identity from the called party.

[7] DNS is an Internet system that is used to associate several types of information (e.g., IP addresses) with meaningful high-level names (so-called domain names).

[8] Telephone number as defined in ITU Recommendation [E.164].

[9] FQDN is an unambiguous domain name for an entity within the Domain Name System.
