8.3 Binaural parameter analysis

8.3.1 Binaural parameters for a single sound source

In conventional binaural rendering systems, a sound source i with associated time-domain signal x_i(t) is rendered at a certain position by convolving the signal with a pair of head-related impulse responses (HRIRs) h_l,i(t), h_r,i(t), for the left and right ears, respectively, yielding the binaural signals y_l,i(t), y_r,i(t):

$$y_{v,i}(t) = (h_{v,i} * x_i)(t),$$

with v ∈ {l, r}. This process is visualized in the left panel of Figure 8.1.
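As a minimal sketch of this rendering step, the following Python fragment convolves a short source signal with a left and a right impulse response. The helper names (`convolve`, `render_binaural`) and the toy HRIR values are illustrative assumptions, not measured data; a practical renderer would typically use measured HRIRs and an FFT-based convolution.

```python
def convolve(x, h):
    """Full linear convolution of two real sequences (length len(x) + len(h) - 1)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, x_n in enumerate(x):
        for k, h_k in enumerate(h):
            y[n + k] += x_n * h_k
    return y


def render_binaural(x, hrir_left, hrir_right):
    """Return the pair (y_l, y_r) for a single source signal x."""
    return convolve(x, hrir_left), convolve(x, hrir_right)


# Toy example (hypothetical values): the right-ear response is delayed by
# one sample and attenuated relative to the left-ear response.
x = [1.0, 0.5, -0.25]
h_l = [1.0, 0.0]
h_r = [0.0, 0.5]

y_l, y_r = render_binaural(x, h_l, h_r)
print(y_l)
print(y_r)
```

The right-ear output is a delayed, scaled copy of the left-ear output, which is the kind of inter-aural level and time difference that an HRIR pair imposes.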

Figure 8.1 Synthesis of a virtual sound source by means of HRIR convolution (left panel) and by means of parametric representations (right panel). Reproduced from Breebaart, J. (2007). Analysis and synthesis of binaural parameters for efficient 3D audio rendering in MPEG Surround. IEEE Int. Conf. on Multimedia and Expo (ICME 2007), Beijing, China. Copyright 2007, IEEE.

It is often convenient to express the convolution in the frequency domain using a frequency-domain representation X_i(f) of a short segment of x_i(t):

$$Y_{v,i}(f) = H_{v,i}(f)\, X_i(f),$$

with H_l,i(f), H_r,i(f) the frequency-domain representations (head-related transfer functions) of h_l,i(t), h_r,i(t), respectively. The power p_yv,i at the eardrum resulting from signal y_v,i in frequency band b is given by:

$$p_{y_{v,i}}(b) = \sum_{f=f(b)}^{f(b+1)-1} Y_{v,i}(f)\, Y_{v,i}^{*}(f),$$

with (*) the complex conjugation operator, and f(b) the lower edge frequency of frequency band b. For clarity and readability, the band index b will not be given in the following equations; all described processing should nevertheless be performed in each parameter band individually. If the HRTF magnitude spectra |H_v,i(f)| are locally stationary (i.e. constant within the frequency band b), this can be simplified to:

$$p_{y_{v,i}} = p_{h_{v,i}}\, p_{x_i},$$

with p_hv,i the power within frequency band b of HRTF H_v,i:

$$p_{h_{v,i}} = \frac{1}{f(b+1)-f(b)} \sum_{f=f(b)}^{f(b+1)-1} H_{v,i}(f)\, H_{v,i}^{*}(f),$$

and p_xi the power of the source signal X_i(f) in frequency band b:

$$p_{x_i} = \sum_{f=f(b)}^{f(b+1)-1} X_i(f)\, X_i^{*}(f).$$

Thus, given the local stationarity constraint, the power at the level of the eardrums follows from a simple multiplication of the power of the sound source and the power of the HRTF in corresponding frequency bands. In other words, statistical properties of binaural signals can be deduced from statistical properties of the source signal and from the HRTFs. This parameter-based approach is visualized in the right panel of Figure 8.1.
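The multiplication rule can be checked numerically. The sketch below is only an illustration under stated assumptions: the band has a hypothetical 16 bins, the source power is summed over the bins of the band, and the HRTF power is taken as the mean of |H|² over those bins (an assumed normalization that makes the product exact when |H| is constant within the band). All names and values are illustrative.

```python
import cmath
import random


def band_sum_power(spectrum):
    """Sum of S(f) * conj(S(f)) over the bins of one band."""
    return sum((s * s.conjugate()).real for s in spectrum)


random.seed(0)
n_bins = 16  # hypothetical number of frequency bins in band b

# Source spectrum within the band: arbitrary complex values.
X = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_bins)]

# HRTF with constant magnitude (0.7) within the band; its phase may vary freely.
H = [0.7 * cmath.exp(1j * 0.3 * k) for k in range(n_bins)]

# Frequency-domain rendering: Y(f) = H(f) X(f).
Y = [h * x for h, x in zip(H, X)]

p_y = band_sum_power(Y)           # power at the eardrum in the band
p_x = band_sum_power(X)           # source power, summed over the band
p_h = band_sum_power(H) / n_bins  # HRTF power, assumed to be the mean over the band

print(p_y, p_h * p_x)  # the two values agree up to rounding error
```

If the magnitude of `H` is made to vary across the band, the two printed values drift apart, which is the deviation from local stationarity that the constraint rules out.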

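The inter-aural phase difference (IPD) ϕ in parameter band b is given by the phase difference ϕ_{y_l,i y_r,i} between the signals y_l,i and y_r,i in parameter band b:

$$\phi_{y_{l,i} y_{r,i}} = \angle\left(\sum_{f=f(b)}^{f(b+1)-1} Y_{l,i}(f)\, Y_{r,i}^{*}(f)\right).$$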

Under the assumption of local stationarity of inter-aural HRTF phase spectra, the IPD can be derived directly from the HRTF spectra themselves, without involvement of the sound source signal:

$$\phi_{y_{l,i} y_{r,i}} = \phi_{h_{l,i} h_{r,i}},$$

with ϕ_{h_l,i h_r,i} the average phase angle of the HRTF pair corresponding to position i and frequency band b:

$$\phi_{h_{l,i} h_{r,i}} = \angle\left(\sum_{f=f(b)}^{f(b+1)-1} H_{l,i}(f)\, H_{r,i}^{*}(f)\right).$$
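The claim that the IPD does not depend on the source signal can be illustrated with a small numerical sketch. It assumes that the IPD is the phase angle of the cross-spectrum summed over the band, and it builds a hypothetical HRTF pair whose inter-aural phase difference is constant (0.4 rad) within the band while the phase shared by both ears varies randomly. All names and values are illustrative.

```python
import cmath
import math
import random


def cross_phase(A, B):
    """Phase angle of sum_f A(f) * conj(B(f)) over the bins of one band."""
    return cmath.phase(sum(a * b.conjugate() for a, b in zip(A, B)))


random.seed(1)
n_bins = 16  # hypothetical number of frequency bins in the band
ipd = 0.4    # hypothetical inter-aural phase difference in radians

# Phase component common to both ears; it cancels in the cross-spectrum.
shared = [cmath.exp(1j * random.uniform(-math.pi, math.pi)) for _ in range(n_bins)]

# The IPD is split symmetrically: +ipd/2 for the left ear, -ipd/2 for the right ear.
H_l = [1.0 * s * cmath.exp(+0.5j * ipd) for s in shared]
H_r = [0.5 * s * cmath.exp(-0.5j * ipd) for s in shared]

# Arbitrary source spectrum and the resulting ear signals.
X = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_bins)]
Y_l = [h * x for h, x in zip(H_l, X)]
Y_r = [h * x for h, x in zip(H_r, X)]

ipd_from_signals = cross_phase(Y_l, Y_r)
ipd_from_hrtfs = cross_phase(H_l, H_r)
print(ipd_from_signals, ipd_from_hrtfs)  # both are approximately 0.4
```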

The equations above assume local stationarity of HRTF magnitude and inter-aural phase spectra to estimate the resulting binaural parameters. However, strong deviations from stationarity within analysis bands may result in a decrease in the inter-aural coherence (IC) for certain frequency bands, which can be perceived as a change in the spatial ‘compactness’ of a virtual sound source. To capture this property, the IC is estimated for each frequency band b. In the current context, the coherence is defined as the absolute value of the average normalized cross-spectrum:

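$$c_{y_{l,i} y_{r,i}} = \frac{\left|\sum_{f=f(b)}^{f(b+1)-1} Y_{l,i}(f)\, Y_{r,i}^{*}(f)\right|}{\sqrt{p_{y_{l,i}}\, p_{y_{r,i}}}}.$$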

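The IC parameter depends on the source signal x_i. For broadband signals, however, its expected value depends only on the HRTFs:

$$E\{c_{y_{l,i} y_{r,i}}\} = c_{h_{l,i} h_{r,i}},$$

with

$$c_{h_{l,i} h_{r,i}} = \frac{\left|\sum_{f=f(b)}^{f(b+1)-1} H_{l,i}(f)\, H_{r,i}^{*}(f)\right|}{\sqrt{\sum_{f=f(b)}^{f(b+1)-1} H_{l,i}(f)\, H_{l,i}^{*}(f)\; \sum_{f=f(b)}^{f(b+1)-1} H_{r,i}(f)\, H_{r,i}^{*}(f)}}.$$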
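To make the link between non-stationarity and coherence concrete, the sketch below evaluates the absolute value of the normalized cross-spectrum for three hypothetical HRTF pairs that differ only in how strongly the inter-aural phase varies within the band. The function name `coherence` and all values are illustrative assumptions.

```python
import cmath
import math


def coherence(A, B):
    """Absolute value of the normalized cross-spectrum of A and B in one band."""
    cross = sum(a * b.conjugate() for a, b in zip(A, B))
    p_a = sum(abs(a) ** 2 for a in A)
    p_b = sum(abs(b) ** 2 for b in B)
    return abs(cross) / math.sqrt(p_a * p_b)


n_bins = 8  # hypothetical number of frequency bins in the band
H_l = [1.0 + 0j] * n_bins

# (a) Inter-aural phase constant within the band (locally stationary).
H_r_flat = [0.5 * cmath.exp(-0.4j)] * n_bins

# (b) Inter-aural phase drifting by 1 rad across the band.
H_r_spread = [0.5 * cmath.exp(-1j * k / (n_bins - 1)) for k in range(n_bins)]

# (c) Inter-aural phase completing a full turn across the band.
H_r_turn = [0.5 * cmath.exp(-2j * math.pi * k / n_bins) for k in range(n_bins)]

c_flat = coherence(H_l, H_r_flat)      # close to 1
c_spread = coherence(H_l, H_r_spread)  # somewhat below 1
c_turn = coherence(H_l, H_r_turn)      # close to 0
print(c_flat, c_spread, c_turn)
```

The level difference between the ears (a factor 0.5 here) does not affect the result; only the variation of the inter-aural phase within the band lowers the coherence.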

In summary, under the local stationarity constraint, the binaural parameters p_yl, p_yr, IPD and IC resulting from a single sound source can be estimated from the sound source parameters p_xi and the HRTF parameters p_hl,i, p_hr,i, ϕ_{h_l,i h_r,i} and c_{h_l,i h_r,i}.

8.3.2 Binaural parameters for multiple independent sound sources

For multiple simultaneous sound sources, conventional methods convolve each individual source signal x_i(t) with the HRIR pair corresponding to its desired position, followed by summation:

$$y_v(t) = \sum_i (h_{v,i} * x_i)(t).$$

Under the constraint of independent sound source signals x_i(t), the power at the eardrums in frequency band b is given by the sum of the powers of each individual virtual sound source:

$$p_{y_v} = \sum_i p_{y_{v,i}},$$

which can be written for stationary HRTF properties as:

$$p_{y_v} = \sum_i p_{h_{v,i}}\, p_{x_i}.$$

The net IPD ϕ resulting from the simultaneous virtual sound sources i is given by:

$$\phi = \angle\left(\sum_{f=f(b)}^{f(b+1)-1} \sum_i Y_{l,i}(f)\, Y_{r,i}^{*}(f)\right).$$

This formulation can also be written in terms of parameters:

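$$\phi = \angle\left(\sum_i e^{j\phi_{h_{l,i} h_{r,i}}}\; c_{h_{l,i} h_{r,i}}\, \sqrt{p_{h_{l,i}}\, p_{h_{r,i}}}\; p_{x_i}\right).$$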

The IC can be estimated similarly:

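$$c = \frac{\left|\sum_i e^{j\phi_{h_{l,i} h_{r,i}}}\; c_{h_{l,i} h_{r,i}}\, \sqrt{p_{h_{l,i}}\, p_{h_{r,i}}}\; p_{x_i}\right|}{\sqrt{p_{y_l}\, p_{y_r}}}.$$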
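The parameter-domain combination of independent sources can be sketched as follows. The sketch assumes that each source i contributes the term exp(jϕ_i) · c_i · sqrt(p_hl,i · p_hr,i) · p_xi to the net cross-spectrum, that the ear powers add across sources, and that the HRTF powers are normalized so that p_h · p_x equals the power at the ear. The container `SourceParams`, the function `combine_independent` and all numbers are hypothetical.

```python
import cmath
import math
from typing import NamedTuple


class SourceParams(NamedTuple):
    p_x: float   # source power in the band
    p_hl: float  # left-ear HRTF power in the band
    p_hr: float  # right-ear HRTF power in the band
    phi: float   # HRTF inter-aural phase difference (radians)
    c: float     # HRTF inter-aural coherence


def combine_independent(sources):
    """Return (p_yl, p_yr, ipd, ic) for mutually independent sources."""
    p_yl = sum(s.p_hl * s.p_x for s in sources)
    p_yr = sum(s.p_hr * s.p_x for s in sources)
    # Net cross-spectrum: one complex term per source, no cross terms
    # between sources because the source signals are independent.
    cross = sum(
        cmath.exp(1j * s.phi) * s.c * math.sqrt(s.p_hl * s.p_hr) * s.p_x
        for s in sources
    )
    ipd = cmath.phase(cross)
    ic = abs(cross) / math.sqrt(p_yl * p_yr)
    return p_yl, p_yr, ipd, ic


# Two equally loud sources placed symmetrically to the left and right.
left = SourceParams(p_x=1.0, p_hl=1.0, p_hr=0.25, phi=0.6, c=1.0)
right = SourceParams(p_x=1.0, p_hl=0.25, p_hr=1.0, phi=-0.6, c=1.0)

p_yl, p_yr, ipd, ic = combine_independent([left, right])
print(p_yl, p_yr, ipd, ic)
```

In this symmetric example the net IPD is zero and the IC falls below 1 even though each HRTF pair is fully coherent, because the two independent sources sit at different positions.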

8.3.3 Binaural parameters for multiple sound sources with varying degrees of mutual correlation

The assumption of independent signals across various objects may hold for many applications, especially if each signal is associated with an independent sound source. However, for some applications, the various signals may comprise common components. For example, if a virtual multi-channel audio setup is simulated, the signals that are radiated by the virtual loudspeakers may exhibit a significant mutual correlation. In that case, these correlations have to be taken into account in the binaural parameter estimation process. With the ICC in band b between sound sources i₁ and i₂ denoted by c_{x_i₁,x_i₂}, the binaural parameters are estimated according to:

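$$p_{y_v} = \sum_{i_1} \sum_{i_2} \sqrt{p_{y_{v,i_1}}\, p_{y_{v,i_2}}}\; c_{x_{i_1},x_{i_2}} \cos\left(\frac{\phi_{i_1} - \phi_{i_2}}{2}\right),$$

with p_yv,i = p_hv,i p_xi the power that source i alone produces at ear v, ϕ_i shorthand for ϕ_{h_l,i h_r,i}, and c_{x_i,x_i} = 1.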

In a similar way, the IPD and IC are given by:

images

with

images

with

In these equations, the IPD ϕ of each sound source is assumed to be distributed symmetrically across the two binaural signals (i.e. ϕ/2 is the phase offset that is applied to the left-ear signal, and −ϕ/2 is the phase offset of the right-ear signal). As can be observed, these equations are equivalent to those given in Section 8.3.2 for c_xi₁,xi₂ = 0 if i₁ ≠ i₂.
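The effect of mutual correlation on the power at one ear can be sketched numerically. This is a minimal illustration derived from the model just described, assuming that source i reaches the ear with amplitude sqrt(p_i) and a phase offset of ±ϕ_i/2, and that the coherence between source signals is real-valued and non-negative. Under those assumptions each pair of sources contributes sqrt(p_i₁ p_i₂) · c_{x_i₁,x_i₂} · cos((ϕ_i₁ − ϕ_i₂)/2). The function name `ear_power` and all numbers are hypothetical.

```python
import math


def ear_power(p_ear, phi, c_x):
    """Power at one ear for sources with pairwise coherence matrix c_x.

    p_ear[i] is the power source i alone produces at this ear,
    phi[i] is the inter-aural phase difference of source i (radians),
    c_x[i1][i2] is the coherence between sources i1 and i2 (1 on the diagonal).
    """
    n = len(p_ear)
    total = 0.0
    for i1 in range(n):
        for i2 in range(n):
            total += (
                math.sqrt(p_ear[i1] * p_ear[i2])
                * c_x[i1][i2]
                * math.cos(0.5 * (phi[i1] - phi[i2]))
            )
    return total


p_ear = [1.0, 0.25]  # per-source power at this ear
phi = [0.6, -0.6]    # per-source IPD in radians

independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 1.0], [1.0, 1.0]]

p_indep = ear_power(p_ear, phi, independent)  # plain sum of the two powers
p_corr = ear_power(p_ear, phi, correlated)    # cross terms add to the sum
print(p_indep, p_corr)
```

With the identity matrix the result reduces to the sum of the individual powers, as for independent sources; with fully correlated sources the cross terms raise the power by 2 · sqrt(1.0 · 0.25) · cos(0.6).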

If the decrease in coherence due to HRTF convolution using different impulse responses for both ears is ignored (i.e. c_{h_l,i h_r,i} = 1) and hence it is assumed that the coherence of the binaural signal pair Y_l, Y_r is dominated by the fact that (partially) incoherent sources have different spatial positions, the estimation process simplifies to:

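$$\phi = \angle\left(\sum_{i_1} \sum_{i_2} \sqrt{p_{h_{l,i_1}}\, p_{h_{r,i_2}}\, p_{x_{i_1}}\, p_{x_{i_2}}}\; c_{x_{i_1},x_{i_2}}\; e^{j(\phi_{i_1} + \phi_{i_2})/2}\right),$$

$$c = \frac{\left|\sum_{i_1} \sum_{i_2} \sqrt{p_{h_{l,i_1}}\, p_{h_{r,i_2}}\, p_{x_{i_1}}\, p_{x_{i_2}}}\; c_{x_{i_1},x_{i_2}}\; e^{j(\phi_{i_1} + \phi_{i_2})/2}\right|}{\sqrt{p_{y_l}\, p_{y_r}}},$$

with ϕ_i shorthand for ϕ_{h_l,i h_r,i} and c_{x_i,x_i} = 1.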
