7.1 Introduction

Convolution with head-related transfer functions (HRTFs) is a widely used method to evoke the percept of a virtual sound source at a given spatial position. As described in Chapter 3, the use of HRTFs requires a database of pairs of impulse responses, preferably matched to the anthropometric properties of the user. Because the individual HRTF sets that are normally required to generate externalized virtual sound sources contain a large amount of data, it is desirable to find an efficient representation of HRTFs. For example, attempts have been made to measure HRTF sets only for a limited range of source positions and to interpolate HRTFs for the positions in between (based on the magnitude transfers [270], spherical harmonics [81], eigentransfer functions [56], pole-zero approximations [27] or spherical spline methods [54]). Other studies described HRTFs by deriving a small set of basis spectra with individual, position-dependent weights [56, 59, 162]. Although these methods are sound in physical terms, there is a risk that the basis functions that matter most for the least-squares error of the fit are not the most relevant for human auditory perception.

Another, more psycho-acoustically motivated approach consisted of determining the role of the spectral and inter-aural phase cues present in the HRTFs. Wightman and Kistler [274] showed that low-frequency inter-aural time differences dominate sound localization, while if the low frequencies are removed from the stimuli, the apparent direction is determined primarily by inter-aural level differences and pinna cues. Hartmann and Wittenberg [115] and Kulkarni et al. [170, 171] showed that the frequency-dependent ITD of anechoic HRTFs can be replaced by a frequency-independent delay without perceptual consequences. Huopaniemi and Zacharov [131] discussed three methods to reduce HRTF information. The first method entailed smoothing of the HRTF magnitude spectra by a rectangular smoothing filter with a bandwidth equal to the equivalent rectangular bandwidth (ERB) [97]. Similar experiments using gammatone transfer functions were performed by Breebaart and Kohlrausch [42, 43]. The second method weighted the errors in an HRTF approximation with the inverse of the ERB scale as weighting function. The third method used frequency warping to account for the nonuniform frequency resolution of the auditory system. From many of these studies, it can be concluded that, although a frequency-independent ILD does not result in an externalized image, the complex magnitude and phase spectra present in HRTFs can be simplified to some extent without deteriorating the externalization.
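
As an illustration of the first of these methods, the following minimal Python sketch smooths a magnitude spectrum with a rectangular window that is one ERB wide at each frequency. It is not the implementation used in [131]: the function names are hypothetical, the ERB is assumed to follow the common Glasberg and Moore approximation ERB(f) = 24.7 (4.37 f / 1000 + 1) Hz, and averaging is assumed to be done on power rather than on magnitude or decibel values.

```python
import math


def erb_bandwidth(f_hz):
    """ERB in Hz at centre frequency f_hz (Glasberg-Moore approximation, assumed)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def smooth_magnitude_erb(magnitude, fs, n_fft):
    """Smooth |H[k]| (bins 0..n_fft/2) with a rectangular, one-ERB-wide window.

    Assumption: the window is centred on each bin and power is averaged.
    """
    df = fs / n_fft                      # spacing between frequency bins in Hz
    last = len(magnitude) - 1
    smoothed = []
    for k in range(len(magnitude)):
        f_c = k * df
        half = 0.5 * erb_bandwidth(f_c)  # half the window width in Hz
        lo = max(0, math.ceil((f_c - half) / df))
        hi = min(last, math.floor((f_c + half) / df))
        window = magnitude[lo:hi + 1]    # always contains at least bin k
        power = sum(m * m for m in window) / len(window)
        smoothed.append(math.sqrt(power))
    return smoothed
```

Because the window widens with frequency, spectral detail at high frequencies is averaged over many bins, while at low frequencies the window may cover only a single bin and leave the spectrum essentially unchanged.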

The approach pursued here is to exploit limitations of the binaural hearing system to reduce the amount of information needed to describe HRTFs. It is assumed that the spatial audio coding approach that has so far been applied to stereo and multi-channel audio signals can be applied to HRTF impulse responses as well. More specifically, it is hypothesized that the inter-aural and spectral (envelope) properties of anechoic HRTFs can be ‘downsampled’ to an ERB-scale resolution without perceptual consequences. Furthermore, it is hypothesized that for anechoic HRTFs the absolute phase spectrum is irrelevant and only the relative (inter-aural) phase between the two impulse responses has to be taken into account. These assumptions lead to a very simple parametric description of HRTFs that comprises, for each frequency band, an average IPD or ITD and two level parameters describing the average signal level at each of the two ears. This set can be extended with an IC parameter for each frequency band; however, for many anechoic HRTFs the IC is very close to +1, so small deviations from +1 can be ignored in the anechoic case.
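
The parameter set described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure of Section 7.2: the function names are hypothetical, bands are taken to be one unit wide on the Glasberg and Moore ERB-rate scale, the level parameters are computed as rms magnitudes, and the IPD and IC are derived from the band-wise cross-spectrum of the two responses. A naive DFT keeps the example free of dependencies; in practice an FFT routine would be used.

```python
import cmath
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate (number of ERBs below f_hz), Glasberg-Moore formula (assumed)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def half_spectrum(x, n_fft):
    """Naive zero-padded DFT of a short real sequence, bins 0..n_fft/2."""
    return [sum(v * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n, v in enumerate(x))
            for k in range(n_fft // 2 + 1)]


def hrtf_parameters(hrir_left, hrir_right, fs, n_fft=512, erb_per_band=1.0):
    """Return per-band levels, IPD and IC for one HRIR pair.

    Assumed sign convention: a positive IPD means the left ear leads.
    Bands that contain no frequency bin are skipped.
    """
    h_l = half_spectrum(hrir_left, n_fft)
    h_r = half_spectrum(hrir_right, n_fft)
    bands = {}
    for k in range(len(h_l)):
        band = int(hz_to_erb_rate(k * fs / n_fft) / erb_per_band)
        acc = bands.setdefault(band, [0.0, 0.0, 0j, 0])
        acc[0] += abs(h_l[k]) ** 2              # left-ear energy
        acc[1] += abs(h_r[k]) ** 2              # right-ear energy
        acc[2] += h_l[k] * h_r[k].conjugate()   # cross-spectrum
        acc[3] += 1                             # number of bins in the band
    params = []
    for band in sorted(bands):
        e_l, e_r, cross, count = bands[band]
        norm = math.sqrt(e_l * e_r)
        params.append({
            "band": band,
            "level_left": math.sqrt(e_l / count),
            "level_right": math.sqrt(e_r / count),
            "ipd": cmath.phase(cross),
            "ic": abs(cross) / norm if norm > 0.0 else 0.0,
        })
    return params
```

For a band with centre frequency f_c, an ITD estimate follows from the IPD as IPD / (2π f_c), provided the phase has not wrapped. For an HRIR pair that differs only by a gain and a short delay, the IC computed this way stays close to +1, in line with the observation that it can be omitted for the anechoic case.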

Section 7.2 describes the HRTF parameterization procedure in more detail. Subsequently, three listening tests are described that evaluate the HRTF parameterization process and assess the number of parameters required for a perceptually transparent HRTF representation.
