4.1 Introduction

The concept of spatial audio coding is to represent two or more audio channels by means of a down-mix, accompanied by parameters that model the spatial attributes of the original audio signals that are lost in the down-mix process. These ‘spatial parameters’ capture the perceptually relevant spatial attributes of an auditory scene and provide a means to store, process and reconstruct the original spatial image.

This chapter explains the concept of spatial audio coding. The first implementations of spatial audio coding employed a single audio channel as down-mix, an approach also denoted binaural cue coding (BCC). BCC and its underlying concepts are explained in detail in this chapter. The extension to multiple down-mix channels is explained in the context of MPEG Surround in Chapter 6.

Figure 4.1 shows a BCC encoder and decoder. As indicated in the figure, the input audio channels x_c(n) (1 ≤ c ≤ C) are down-mixed to a single audio channel s(n), denoted the down-mix signal. The ‘perceptually relevant differences’ between the audio channels, namely the inter-channel time difference (ICTD), inter-channel level difference (ICLD), and inter-channel coherence (ICC), are estimated as a function of frequency and time and transmitted as side information to the decoder. The decoder generates its output channels x̂_c(n) (1 ≤ c ≤ C) such that the ICTD, ICLD, and ICC between the channels approximate those of the original audio signal.
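The encoder side of this scheme can be illustrated with a minimal Python sketch for a single channel pair. It is not the estimator used in this book: it works on one full-band frame instead of per frequency band, and it assumes commonly used definitions (ICLD as a power ratio in dB, ICTD as the lag of the peak of the normalized cross-correlation, ICC as the magnitude of that peak). The function names `downmix` and `estimate_cues`, the parameter `max_lag`, and the plain-average down-mix are hypothetical choices made for illustration only.

```python
import numpy as np


def downmix(x1, x2):
    """Mono down-mix s(n), here simply the average of both channels (assumption)."""
    return 0.5 * (x1 + x2)


def estimate_cues(x1, x2, max_lag):
    """Return (icld_db, ictd_samples, icc) for one frame of a channel pair.

    Full-band simplification: a real BCC encoder would do this per
    frequency band and per time frame.
    """
    eps = 1e-12  # avoids division by zero for silent frames
    p1 = np.sum(x1 ** 2) + eps
    p2 = np.sum(x2 ** 2) + eps

    # ICLD: power of channel 2 relative to channel 1, in dB.
    icld_db = 10.0 * np.log10(p2 / p1)

    # Normalized cross-correlation for lags -max_lag..max_lag.
    n = len(x1)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.empty(len(lags))
    for i, d in enumerate(lags):
        if d >= 0:
            # sum over x1[m] * x2[m + d]
            corr[i] = np.dot(x1[:n - d], x2[d:])
        else:
            # same sum for negative d, written with non-negative indices
            corr[i] = np.dot(x1[-d:], x2[:n + d])
    corr /= np.sqrt(p1 * p2)

    # ICTD: lag at which the correlation magnitude peaks.
    # ICC: the magnitude of that peak.
    best = int(np.argmax(np.abs(corr)))
    return icld_db, int(lags[best]), float(abs(corr[best]))


# Usage: channel 2 is channel 1 delayed by 3 samples and halved in amplitude.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(1024)
x2 = 0.5 * np.roll(x1, 3)

s = downmix(x1, x2)
icld_db, ictd, icc = estimate_cues(x1, x2, max_lag=16)
print(f"ICLD = {icld_db:.2f} dB, ICTD = {ictd} samples, ICC = {icc:.3f}")
```

In this sketch a positive ICTD means that the second channel lags the first. Only `s` and the three cue values would need to be transmitted; the decoder's task is then to produce output channels whose cues approximate these values.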

The scheme is able to represent multi-channel audio signals at a bitrate only slightly higher than that required to represent a mono audio signal. This is because the estimated ICTD, ICLD, and ICC between a channel pair contain about two orders of magnitude less information than an audio waveform.

Besides the low bitrate, backward compatibility is also of interest. The transmitted down-mix signal corresponds to a mono down-mix of the stereo or multi-channel signal. For receivers that do not support stereo or multi-channel sound reproduction, playing back the transmitted down-mix signal is thus a valid way of presenting the audio material on mono reproduction setups. BCC can therefore also be used to upgrade existing services that deliver mono audio material to multi-channel audio. For example, existing mono audio radio broadcasting systems can be enhanced for stereo or multi-channel playback if the BCC side information can be embedded into the existing transmission channel.


Figure 4.1 Generic scheme for binaural cue coding (BCC).

Section 4.2 reviews previously proposed related techniques. BCC is motivated and explained in detail in Section 4.3. This includes a discussion of how ICTD, ICLD, and ICC relate to properties of auditory objects and the auditory spatial image. Multi-channel surround systems often support one or more discrete audio channels for low-frequency effects, denoted LFE channels (for more details see Section 2.2.3). Section 4.4 describes how to apply BCC for efficient coding of LFE channels.
