Chapter 10. Audio and Telephony Control

Support for voice or, more generically, audio is a distinguishing attribute of Bluetooth wireless communication. With support for both voice and data, the technology is well positioned to bridge the domains of computing and communications, as evidenced by the enthusiastic support for the Bluetooth technology within both industries. Several of the profiles address scenarios in which both a computing device and a telephony device are used. This chapter, our final in-depth examination of the core specification, deals with the components of the protocol stack that enable telephony and voice (audio) communication. The telephony control protocol is embodied by the TCS-BIN (or just TCS for short) layer, while audio can be carried natively over the baseband. TCS is based upon the existing ITU-T Q.931 protocol [ITU98], but even so it occupies over 60 pages in the specification. TCS is a binary encoding for packet-based telephony control and resides above the L2CAP layer of the stack. TCS-BIN is sufficient to realize the version 1.0 telephony profiles, although applications using AT commands over the RFCOMM serial port abstraction (including headset, dial-up networking and fax) might also accomplish a form of telephony control (this latter form of telephony control is not included as a separate entity in the version 1.0 specification; it is discussed further in subsequent sections here). Audio is not a layer of the protocol stack per se but rather a specific packet format that can be transmitted directly over the baseband layer. Since audio is frequently (although not exclusively) associated with telephony applications, it is discussed together with TCS in this chapter as a logical convenience. As in preceding chapters, our examination of telephony functions, including audio, provides not only highlights and interpretations of the specification but also background information for these elements of the protocol stack, including the evolution of TCS-BIN.

Figure 10.1 depicts audio and TCS-BIN in the protocol stack; it also shows the component we call AT Command Telephony Control. This latter component is a remnant of what was once called TCS-AT and is explored further below. In general, when we refer simply to TCS we mean the TCS-BIN layer of the stack. TCS-BIN resides above L2CAP; audio communicates directly through the baseband; and AT command telephony control operates over RFCOMM. Telephony control applications can communicate directly with TCS-BIN and might also use AT command telephony control.

Figure 10.1. Audio and TCS-BIN in the Bluetooth protocol stack. Also shown is the AT command based form of telephony control used by some applications.

Audio and Telephony Control Operation

TCS-BIN is used for the call control aspects of telephony, including establishing and terminating calls along with many other control functions that apply to telephone calls. TCS can be used to control both voice and data calls. When a voice call is made, the audio element of the stack carries its content; in the case of data calls, the data content can be carried over the transport layers of the stack (perhaps also involving other middleware layers). The call control functions provided by TCS-BIN can be used no matter what the call content (voice or data) is: data calls like those used with the dial-up networking profile are supported, and so is voice telephony like that used for the cordless telephony and intercom profiles.

TCS-BIN also defines a method for devices to exchange call signaling information without actually having a call connection established between them; this is called connectionless TCS and is described more fully below. Another aspect of TCS-BIN is that of group management functions. When there is a group of devices that all support the TCS-BIN protocol, the members of the group (called a wireless user group, or WUG) can make use of some special functions defined by TCS, including group membership management, telephony service "sharing" among devices in the group and a method for a fast direct connection between two group members. The TCS-BIN call control and other functions are examined more fully below.

A second form of call control, which we have called AT command telephony control, was introduced above. While it is not defined as a named protocol in the specification, it is mentioned here because it is a well-known method for accomplishing call control, and it is used by several profiles. In fact, at one time this concept was embodied as a separate protocol and element of the stack called TCS-AT. While TCS-AT is no longer defined as a separate entity (and indeed, given the existence of TCS-BIN, a separate SIG-defined TCS-AT protocol is unnecessary, as described more fully below), it is worth acknowledging that this sort of telephony control does exist in many Bluetooth environments. AT commands are modem control commands that are likely to be used especially by legacy applications; these applications typically are configured to communicate with a modem over a serial port. Within the Bluetooth protocol stack these applications could use RFCOMM to communicate with a compatible modem service using the same AT command call control functions as in other environments, with little or no change to the application (especially through the use of a Bluetooth adaptation layer as described in Chapter 5). TCS-BIN is the only telephony control protocol defined as a separate entity in the specification, and it is the protocol upon which several telephony profiles are based. However, AT command-based telephony control is also used in the headset, fax and dial-up networking profiles, even though no separate AT protocol is specified by the SIG.

Audio, as already pointed out, is not really a layer of the protocol stack. In fact it would not be unreasonable to consider audio as a specialized sort of transport layer, since it is largely embodied as a particular packet format that is sent and received directly over the air-interface using the baseband protocol. Indeed, outside the baseband chapter, the specification directly addresses audio only in an appendix that is fewer than ten pages long! Yet we have established that voice support is a key differentiating value of Bluetooth wireless communications, and clearly audio directly supports voice (voice and audio are often equated, although voice is not the only form of audio). So why does the specification not contain a chapter on audio with a description and page count commensurate with the importance of audio for Bluetooth applications? The answer has already been suggested: because Bluetooth audio is really just a specification of a packet format and an encoding scheme for the data in those packets, it does not require a lengthy explanation. Once the allowances (including time slot reservation and audio packet definition, described more fully in Chapter 6) have been made at the baseband layer to support audio traffic, little more specification is required. In fact the actual bulk of the audio specification can be found in the baseband chapter of the specification, which even includes a section devoted entirely to audio baseband traffic. Thus to fully understand Bluetooth audio one should understand the baseband protocol stack layer, described in Chapter 6 of this book. However, because audio so often is associated closely with voice and thus with telephony, it is logically consistent to discuss it here along with the other telephony-related functions.

TCS Protocol Development

Telephony control is intertwined with audio functions, and in fact it was audio that drove the need for telephony control rather than the other way around. Before there was a TCS working group, it was agreed that the protocol stack needed to support audio so that voice as well as data traffic could be enabled. At first the audio requirement pointed out the need for some control functions, which initially were presented as "audio control" functions. These audio control capabilities were needed to support the ultimate headset, speaking laptop and three-in-one phone usage models (described in Chapter 3), and initially just a small set of simple operations (such as make a call, answer a call, terminate a call and adjust volume) was envisioned. As the telephony profiles (including those noted above along with dial-up networking and fax) were further developed, it became evident that a richer set of telephony control functions was desirable and a working group was formed to define these capabilities for the protocol stack.

With the initial recognition of a need for minimal audio control functions early in the SIG's history, it was at first supposed that these simple operations would be accomplished via AT commands using the RFCOMM serial port abstraction (recall that RFCOMM was fairly well defined even at this early stage). Thus was born the TCS-AT specification. This specification was intended to describe how standard AT commands could be mapped over the Bluetooth protocol stack and to define any new AT commands required for Bluetooth wireless communication. TCS-AT was designed to support legacy applications that send and receive AT commands over a serial port (most likely using a serial cable). TCS-AT of course specified the use of RFCOMM as the serial port replacement. As the specification progressed, it became apparent that there was very little need for any new AT commands specific to Bluetooth environments (only two new AT command responses were identified as being useful enough to propose specific definitions for Bluetooth TCS-AT). Thus the TCS-AT specification became a short reference that described how to use AT commands in the Bluetooth protocol stack, and its definition was absorbed into the profiles that use AT protocols (namely headset, fax and dial-up networking).

In the meantime a binary, packet-based telephony control protocol was also being defined within the Bluetooth protocol stack. Called TCS-Binary (or TCS-BIN), it was adapted from an existing ITU-T specification, Q.931 [ITU98]. As in other cases, the SIG's adoption of existing standards provided benefits for the protocol stack, in this case including the capability for robust telephony control operations in a standardized manner. In early 1999 it was observed that the likely future direction for telephony control applications was along the lines of the TCS-BIN (ITU-T) style, and it was further observed that TCS-BIN provided all of the functions necessary for all of the telephony-based profiles. Finally it was also observed that the TCS-AT specification did not provide significant new functions specific to Bluetooth environments and primarily specified a method by which legacy applications might use standard AT commands over RFCOMM as a means of cable replacement. Thus TCS-BIN subsumed TCS-AT as a separate protocol in the stack. The SIG decided to remove TCS-AT as a separate specification, although the functions were not removed; only the name was. Thus the version 1.0 specification does not mention TCS-AT,[1] although several applications in fact do use RFCOMM as a serial transport for AT commands in cases where a modem service supports such a configuration. Indeed, the headset, fax and dial-up networking profiles use AT command telephony control. With only TCS-BIN being explicitly mentioned in the specification, all further references to TCS herein imply TCS-BIN.

The TCS Protocol Examined

In addition to what the specification calls TCS supplemental services (including caller identification information and dual tone multi-frequency [DTMF] tone generation), TCS defines three major functional areas:

  • Call control

  • Group management

  • Connectionless TCS

Each of these is explored below. The majority of the more than 60 pages of specification devoted to TCS deals with the detailed syntax and semantics of TCS-BIN, which are not reproduced here. Instead we highlight some of the important features and nuances of TCS-BIN in the protocol stack.

TCS Call Control

The TCS call control functions serve to set up calls that subsequently will carry voice or data traffic. TCS acts as a state machine, performing the operations necessary to progress a call from one state to the next, and tracking the resulting state. When making calls, these operations might include such things as setting up the call, including dialing information; establishing and confirming a connection; and disconnecting when the call is complete. For received calls, the states and transitions include call presence (ringing), call acceptance and connection establishment and termination. Much of the TCS chapter of the specification is devoted to a full explanation of these states and their transition operations; the appendix to the TCS chapter of the specification details these states and transitions in comprehensive state diagrams.
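The state-machine idea described above can be illustrated with a minimal Python sketch. It is deliberately simplified: the state names, event names and transition table are hypothetical stand-ins chosen for readability, not the states and messages defined in the specification, which are considerably more numerous.

```python
from enum import Enum, auto


class CallState(Enum):
    # Illustrative, simplified states (assumption: not the full set in the spec).
    NULL = auto()             # no call in progress
    CALL_INITIATED = auto()   # outgoing side has requested call setup
    CALL_PRESENT = auto()     # incoming side has been offered a call (ringing)
    ACTIVE = auto()           # call connected; content flows elsewhere in the stack
    DISCONNECTING = auto()    # call is being torn down


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (CallState.NULL, "send_setup"): CallState.CALL_INITIATED,
    (CallState.NULL, "receive_setup"): CallState.CALL_PRESENT,
    (CallState.CALL_INITIATED, "receive_connect"): CallState.ACTIVE,
    (CallState.CALL_PRESENT, "send_connect"): CallState.ACTIVE,
    (CallState.CALL_INITIATED, "disconnect"): CallState.DISCONNECTING,
    (CallState.CALL_PRESENT, "disconnect"): CallState.DISCONNECTING,
    (CallState.ACTIVE, "disconnect"): CallState.DISCONNECTING,
    (CallState.DISCONNECTING, "release_complete"): CallState.NULL,
}


class CallStateMachine:
    """Tracks the state of a single call; one instance per call."""

    def __init__(self):
        self.state = CallState.NULL

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state


def run_events(events):
    """Feed a sequence of events to a fresh machine; return the visited state names."""
    machine = CallStateMachine()
    return [machine.handle(event).name for event in events]
```

For example, run_events(["send_setup", "receive_connect", "disconnect", "release_complete"]) walks an outgoing call from setup through connection and back to the idle state. Keeping the transitions in a table makes it easy to reject events that are not valid in the current state.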

The telephony control functions can operate not only in a point-to-point network topology but also in a point-to-multipoint configuration. The multipoint environment is relevant, as pointed out in the specification, for incoming calls when numerous phones all need to receive the incoming ring signal and control information. In this case, TCS uses multipoint signaling to alert all the telephones of the incoming call; it can then establish a single content channel (where the voice or data traffic will flow) with the telephone that answers the call.[2] TCS does not deal with the content that is subsequently streamed over the channel but only with the call control functions that occur on the control channel.

Unlike RFCOMM, in which a single instance of the protocol layer is multiplexed, the specification indicates that multiple instances of TCS may be executed at the same time to handle multiple calls (recall that Bluetooth wireless communication permits up to three voice channels simultaneously over the baseband). Multiple instances of TCS simply use multiple L2CAP channels.

TCS Group Management

Group management functions use the concept of a wireless user group (or WUG). The devices in such a group can take advantage of some special functions that TCS enables. These functions include a method for one device to make use of the telephony services of another device in the group; a way to manage group membership (called configuration distribution); and a way for two slave members of the group to use the TCS protocol to establish a direct connection (called fast intermember access).

Group management is useful in telephony applications to enable the provision of the sorts of telephony functions that many users expect, such as multiple telephone extensions, call forwarding and group calls. In addition, group management can help to accomplish parts of the three-in-one phone profile by permitting phones to join a WUG (thus enabling a cellular phone to be used as a cordless phone) and to directly communicate with other TCS devices (thus permitting the intercom or "walkie-talkie" function).

A WUG is just a group of devices that all support TCS. The specification makes special provisions for security within the WUG by allowing the WUG master to distribute keys used specifically for communications within the WUG. These keys cover communication with the master and, using a different key, separate communication with other WUG members, as is done with the fast intermember access described below.

One device in a WUG can request to use the telephony services of another device in the WUG; TCS calls this an access rights request. A handset might request the use of the telephony services of a base station to make a call, or an access rights request might be used to transfer a call from one TCS device (such as a handset or headset) to another.

Configuration distribution is the TCS-BIN method for managing the membership of the WUG. Again using the concept of a WUG master that maintains all of the information about the WUG, TCS-BIN defines a protocol for the WUG master to send updated WUG configuration information to each WUG member, each time that configuration information changes. For example, this might be used to inform all WUG members that a new member has joined (or that some member has left) the WUG. Among other applications, this feature could be used to support the three-in-one phone profile by advising WUG members (perhaps stationary handsets and base stations in a home) that a new member (say, a mobile phone brought into the home) has joined the WUG. Thus the mobile phone's presence is known and it can contact the base station (to act as a cordless phone) or it could directly contact other phones in the WUG (to act as an intercom).

Fast intermember access is a facility by which any two WUG members can quickly establish a connection with each other. This feature makes use of the fact that two members already belong to a WUG and have already established connections with a common WUG master. Thus all WUG members are already in a single piconet, all using the same hopping sequence established by the WUG master's clock. Furthermore, via the configuration distribution noted above, all WUG members can know about all other WUG members. Because all of this information is already known, it can be leveraged to establish a connection with another WUG member more quickly than such a connection could be established from scratch. With fast intermember access, a WUG member uses the configuration information to determine another member with which it wishes to establish contact. It forwards this information to the WUG master, which in turn contacts the target WUG member. That member then responds to the WUG master, includes its own clock offset information in the response, and then places itself into a page scan state. The master forwards the clock offset information to the requesting WUG member, which can then very quickly use this information to establish a connection with the target member by paging that member (which is now in page scan state to accept such pages), the result being a new piconet, consisting initially of the two devices. This scheme takes advantage of the other features that are already in place for a WUG to enable quick direct connection between any two devices in that WUG to support, for example, the "walkie-talkie" function of the three-in-one phone profile.
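The exchange just described can be sketched as a small simulation. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the clock offset is reduced to a plain integer, and no radio behavior (paging, page scan timing, hopping) is modeled. It shows only the order of the steps and which party holds which piece of information.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    name: str
    clock_offset: int                      # stand-in for real clock offset information
    page_scanning: bool = False
    known_members: List[str] = field(default_factory=list)
    connected_to: Optional[str] = None


class WugMaster:
    """Holds the WUG configuration and relays fast intermember access requests."""

    def __init__(self):
        self.members = {}

    def add_member(self, member):
        self.members[member.name] = member
        self.distribute_configuration()

    def distribute_configuration(self):
        # Configuration distribution: every member learns about every other member.
        names = sorted(self.members)
        for member in self.members.values():
            member.known_members = [n for n in names if n != member.name]

    def relay_access_request(self, target_name):
        # The master contacts the target, which replies with its clock offset
        # and then places itself into page scan.
        target = self.members[target_name]
        target.page_scanning = True
        return target.clock_offset


def fast_intermember_access(master, requester, target_name):
    """Run the exchange; return the clock offset the requester used to page the target."""
    if target_name not in requester.known_members:
        raise LookupError(f"{target_name} is not a known WUG member")
    offset = master.relay_access_request(target_name)   # forwarded by the master
    target = master.members[target_name]
    if not target.page_scanning:
        raise RuntimeError("target is not accepting pages")
    # The requester pages the target; the two devices now form a new piconet.
    target.page_scanning = False
    requester.connected_to = target.name
    target.connected_to = requester.name
    return offset
```

In this sketch the requester never discovers the target on its own: it picks the target from the distributed configuration and receives the clock offset through the master, which is what allows the subsequent page to succeed quickly.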

Connectionless TCS

Finally, TCS-BIN also provides a way for devices to exchange call signaling information without actually placing a call or having a TCS call connection established. This is called connectionless TCS. Connectionless TCS provides a sort of "sideband" in which devices within a WUG can send messages to each other without having to have a TCS connection established between them. What sort of messages might these devices want to send? The specification defines only a single message format for connectionless TCS, called CL Info. CL Info messages in turn can contain only two types of information: audio control, used to specify information about microphone gain and speaker volume settings, and company information, which is the common TCS way to allow any information not specified in a standardized TCS format to be interchanged. Thus it can be seen that connectionless TCS could be used to manage the audio settings of all members of a WUG as well as to communicate product-specific features, defined by the manufacturer, among all of the devices from that manufacturer in the WUG. Such use of connectionless TCS might allow, for example, advanced telephony features to be used between a base station and a handset from the same manufacturer that might not be available to handsets from another manufacturer (although through standard TCS operations the basic telephony functions would be expected to work within the WUG with all handsets).

Bluetooth Audio Development

There was no audio working group per se within the SIG. Audio has been an inherent part of Bluetooth wireless communication since its inception and thus has always been integrated into the fundamental design of the protocol stack. Audio (voice or other audio) is carried over SCO links at the baseband layer. These basic SCO links were already defined early in the SIG's history, shortly after it was publicly announced (the addition of multiple SCO connections to support multiple voice channels was introduced in mid-1998).

This evolution of Bluetooth audio mirrors its situation within the stack: it is not a distinct protocol layer but rather a fundamental part of the technology. Audio essentially is integrated into the baseband. Owing to a few specific considerations for audio in the protocol stack we discuss it as a separate topic. And as noted above, due to audio's affinity with telephony, for pragmatic reasons we discuss it in this chapter.

Bluetooth Audio Examined

A quick scan of the specification searching for audio information is likely to locate only Appendix V, "Bluetooth Audio." This appendix contains information interesting mostly to audio and sound engineers, including such things as recommended sound pressure, loudness and audio levels. Although important, it is not the fundamental information about how to deal with Bluetooth wireless audio traffic. That information, as might be expected from preceding discussions, is actually found in the "Bluetooth Audio" section of the Baseband chapter of the specification.

While audio in Bluetooth wireless communication need not be used exclusively for voice, its design is optimized for voice content. Sound tends to be continuous for periods of time and is thus isochronous, or time bounded. The transmission rate for Bluetooth audio traffic is set at 64 Kbps, chosen to be sufficient for normal voice conversations. While the communication of other audio media (say, music) over Bluetooth audio links is not precluded, the design is not based upon such audio traffic; it clearly is centered around voice traffic.

Two types of encoding schemes are specified for Bluetooth audio. The first is pulse code modulation (PCM) with either of two types of logarithmic compression (called A-law and µ-law) applied. PCM audio with these compression types is well known and widely used for general audio, including things like short sound clips. The second audio encoding scheme is continuous variable slope delta (CVSD) modulation. The characteristics of typical voice conversations, which have a more predictable continuity than general audio (music, for example), make a delta-slope prediction more efficient. CVSD generally is also more tolerant of communication errors. Thus CVSD, in general, is a more effective and efficient (and thus generally preferred) method to use for Bluetooth audio communication; we observe once again that this is an optimization for voice versus other forms of audio.
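To make the delta-slope idea concrete, the following Python sketch implements a simplified CVSD-style coder. It is a teaching aid under stated assumptions, not the codec defined in the specification: the step sizes, decay factors, run length, bit polarity and floating-point arithmetic are all illustrative choices. The coder emits one bit per input sample (so a 64 Kbps stream corresponds to 64,000 one-bit samples per second) and enlarges its step when several consecutive bits agree, which is the "variable slope" part of the name.

```python
import math


class CvsdState:
    """State shared by the encoder and decoder so that both track the same estimate.

    All parameter values are illustrative assumptions.
    """

    def __init__(self, step_min=10.0, step_max=1280.0,
                 step_decay=1 - 1 / 1024, leak=1 - 1 / 32, run_length=4):
        self.step_min = step_min
        self.step_max = step_max
        self.step_decay = step_decay
        self.leak = leak
        self.run_length = run_length
        self.estimate = 0.0
        self.step = step_min
        self.recent = []

    def update(self, bit):
        # Remember only the last `run_length` bits.
        self.recent = (self.recent + [bit])[-self.run_length:]
        if len(self.recent) == self.run_length and len(set(self.recent)) == 1:
            # A run of identical bits means the estimate is lagging: grow the step.
            self.step = min(self.step + self.step_min, self.step_max)
        else:
            # Otherwise let the step decay slowly toward its minimum.
            self.step = max(self.step * self.step_decay, self.step_min)
        # Leaky integrator: move the estimate up or down by the current step.
        self.estimate = self.leak * self.estimate + (self.step if bit else -self.step)
        return self.estimate


def cvsd_encode(samples):
    """Return one bit per sample: 1 if the sample is at or above the running estimate."""
    state = CvsdState()
    bits = []
    for sample in samples:
        bit = 1 if sample >= state.estimate else 0
        bits.append(bit)
        state.update(bit)
    return bits


def cvsd_decode(bits):
    """Rebuild the estimate sequence from the bit stream alone."""
    state = CvsdState()
    return [state.update(bit) for bit in bits]


def make_tone(count, period=128, amplitude=1000.0):
    """Generate a test sine wave with `period` samples per cycle."""
    return [amplitude * math.sin(2 * math.pi * n / period) for n in range(count)]
```

Because the decoder applies exactly the same update rule to the same bits, it reproduces the encoder's internal estimate without any side information. A single corrupted bit nudges the estimate by one step and the leak gradually washes the error out, which gives some intuition for why this style of coding tolerates communication errors.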

The specification says little else about audio as a topic unto itself. The remainder of what needs to be specified (and what implementers and others may wish to understand) about audio can be gleaned from a study of the baseband protocol, including the SCO packet structure and timing designed specifically to support audio traffic simultaneously with data traffic. This information is found in the baseband chapter of the specification and in Chapter 6 of this book.

One final note about audio: it should be clear that Bluetooth audio as described here and in the specification is digital isochronous (effectively streaming) audio traffic that operates directly over the air-interface using the baseband protocols. Of course audio information can also be encoded in a digital packet-based format[3] using local recording and playback. Such digital audio information clearly could be transmitted over Bluetooth links using the L2CAP layer of the protocol stack, but this is quite different from what we refer to here as Bluetooth audio.[4]

Audio and Telephony Control Usage

Several families of telephony applications are possible. TCS-BIN is intended to support applications that realize the Bluetooth telephony-based profiles: cordless telephony and intercom. These are the only two profiles technically classified as telephony profiles, based upon their usage of TCS-BIN. Such applications are expected to use TCS-BIN directly, as depicted in Figure 10.1.

Other sorts of applications also might be considered in some respects to be telephony applications; these include dial-up networking, fax and headset profile applications. In volume 2 of the specification these profiles are considered to be part of the serial port profile family; this is because the telephony facets of these applications tend to use the programming model of AT commands over a serial port (RFCOMM in the Bluetooth wireless communication case), as described earlier in this chapter. Although these applications are not TCS-BIN based, we also consider them in general to be telephony applications, and they are depicted in Figure 10.1 as such (that figure shows telephony applications using both TCS-BIN and AT commands over RFCOMM).

Legacy applications are likely to use AT command telephony control, since this is the typical programming model in the world of serial cables. New applications developed specifically to make use of Bluetooth wireless links, though, are encouraged via the specification to make use of the TCS-BIN protocol, which provides a robust set of telephony control functions based upon an existing standard that has been adapted for the Bluetooth stack.

Telephony and audio, particularly voice audio, play important roles in the Bluetooth stack. With their associated applications (which may involve both computing and telecommunications devices in some profiles and usage models), they provide a distinguishing feature of Bluetooth wireless communication.



[1] Actually there is one "leftover" reference to TCS-AT in the Bluetooth Assigned Numbers appendix of the specification, the last remnant of TCS-AT's former existence as a separately described protocol. As defined there, the value could be used to indicate a device's support for AT command telephony control.

[2] The need to transmit ring signals simultaneously to multiple telephone handsets was a primary motivation for including group abstraction and management and connectionless channels in L2CAP. These features could certainly be utilized in other future scenarios, but in version 1.0 they are used only in the context of TCS-BIN.

[3] Such as WAV and many other fundamentally similar representations.

[4] For one thing, it is not truly isochronous, at least not in a streaming, over-the-air fashion. For another, most encoding schemes for such digital packet audio are designed so that many types of audio (music, sound clips and so on) can be effectively rendered, rather than optimizing the audio content for one primary use such as voice.
