5.2 Interaction between core coder and spatial audio coding

So far, the core coder and the parametric stereo coder have mostly been described as separate processes. It is well known that core coders produce audible artifacts if the bitrate is relatively low. Such artifacts include audible pre-echoes resulting from quantization noise just before transients, ‘warbling’ or ‘underwater’ effects resulting from spectral holes, and the like. A parametric stereo coder may likewise produce audible artifacts if employed at low bitrates. Examples of such artifacts are a loss of ‘width’, audible crosstalk, or unstable sound source positioning. When these coding methods are combined in a single audio codec, it is likely that artifacts from both the core coder and the parametric stereo coder will determine the overall perceived quality, and that the artifacts may ‘add up’ in some way when the perceived quality of the complete system is determined.

As discussed in Chapter 1, bitrate reduction in conventional lossy audio coders is obtained predominantly by exploiting the phenomenon of masking. Lossy audio coders therefore rely on accurate and reliable masking models, which are often applied to the individual channel signals in the case of a stereo or multi-channel signal. For a parametric stereo extended (mono) audio coder, however, the masking model is applied only once, to a certain combination of the two input signals (the mono down-mix). This scheme has two implications with respect to masking phenomena.

The first implication relates to spatial unmasking of quantization noise. In stereo waveform or transform coders, individual quantizers are applied to the two input signals or to linear combinations of the input signals. As a consequence, the injected quantization noise may exhibit different spatial properties than the audio signal itself. Due to binaural unmasking, the quantization noise may thus become audible, even if it would be inaudible when presented monaurally. For tonal material, this unmasking effect (the binaural masking level difference, or BMLD, quantified as the threshold difference between a binaural condition and a monaural reference condition) has been shown to be relatively small (about 3 dB, see [127, 128]). For broadband maskers, however, the unmasking effect is expected to be much more prominent. If one assumes an interaurally in-phase noise as masker, and a quantization noise that is either interaurally in-phase or interaurally uncorrelated, BMLDs are reported to be about 6 dB [72]. More recent data revealed BMLDs of 13 dB for this condition, based on a sensitivity to changes in the correlation of 0.045 [29]. To prevent these spatial unmasking effects of quantization noise, conventional stereo coders often apply some sort of spatial unmasking protection algorithm.
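
As a rough check on the order of magnitude, the correlation sensitivity quoted above can be converted into a noise-to-masker ratio. The sketch below assumes a simplified model that is not taken from the cited studies: an interaurally in-phase masker of power S at each ear (correlation 1) plus interaurally uncorrelated noise of power N at each ear, so that the interaural correlation of the sum is S/(S+N). The function name and the model are illustrative assumptions only.

```python
import math


def noise_to_masker_ratio_db(delta_rho):
    """Noise-to-masker ratio (dB) at which the interaural correlation
    drops from 1 to (1 - delta_rho).

    Assumed model (hypothetical, for illustration):
      correlation = S / (S + N)  =>  N / S = delta_rho / (1 - delta_rho)
    """
    if not 0.0 < delta_rho < 1.0:
        raise ValueError("delta_rho must lie strictly between 0 and 1")
    ratio = delta_rho / (1.0 - delta_rho)
    return 10.0 * math.log10(ratio)


# Correlation sensitivity of 0.045, as quoted in the text.
nmr_db = noise_to_masker_ratio_db(0.045)
print(f"{nmr_db:.1f} dB")  # prints "-13.3 dB"
```

Under these assumptions, uncorrelated noise becomes detectable at roughly -13 dB relative to the masker. How this maps onto a BMLD depends on the monaural reference threshold, which the sketch does not model.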

For a parametric stereo enhanced coder, on the other hand, there is only one waveform or transform quantizer, working on the mono (down-mix) signal. In the stereo reconstruction phase, the quantization noise and the audio signal present in each frequency band will therefore have the same spatial properties. Since a difference in spatial characteristics between quantization noise and audio signal is a prerequisite for spatial unmasking, this effect is less likely to occur for parametric stereo enhanced coders than for conventional stereo coders.
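
The contrast between the two schemes can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions, not a description of any actual codec: quantization is modelled as additive uniform noise, the stereo input is a single amplitude-panned source, and the parametric decoder is reduced to two per-channel gains applied to the decoded mono signal (no decorrelation, no frequency bands). All names and parameter values are hypothetical.

```python
import math
import random


def correlation(x, y):
    """Normalized cross-correlation at lag zero."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def simulate(n=20000, gain_l=0.8, gain_r=0.3, noise_amp=0.05, seed=0):
    """Return the interaural correlation of the coding noise for
    (conventional stereo coder, parametric stereo coder)."""
    rng = random.Random(seed)

    # Conventional coder: one quantizer per channel, modelled
    # (assumption) as independent additive noise in each channel.
    q_left = [rng.uniform(-noise_amp, noise_amp) for _ in range(n)]
    q_right = [rng.uniform(-noise_amp, noise_amp) for _ in range(n)]

    # Parametric coder: a single quantizer on the mono down-mix.
    q_mono = [rng.uniform(-noise_amp, noise_amp) for _ in range(n)]

    # The decoder rescales the decoded mono signal per channel, so the
    # noise is rescaled by exactly the same gains as the signal.
    down = 0.5 * (gain_l + gain_r)   # down-mix gain of the panned source
    up_l = gain_l / down             # upmix gain restoring the left level
    up_r = gain_r / down             # upmix gain restoring the right level
    noise_l = [up_l * q for q in q_mono]
    noise_r = [up_r * q for q in q_mono]

    return correlation(q_left, q_right), correlation(noise_l, noise_r)


rho_conventional, rho_parametric = simulate()
print(f"conventional: {rho_conventional:+.3f}")
print(f"parametric:   {rho_parametric:+.3f}")
```

In this model the source itself is fully correlated between the channels. The noise from the two independent quantizers is essentially uncorrelated, so its spatial image differs from that of the source, whereas the noise of the parametric scheme is fully correlated and carries the same level difference as the source.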
