5.3 Basic Spatial Effects for Stereophonic Loudspeaker and Headphone Playback

The most common loudspeaker layout is the two-channel setup, called the standard stereophonic setup. It came into wide use after the development of single-groove 45°/45° two-channel records in the late 1950s. Two loudspeakers are positioned in front of the listener, separated by 60° as seen from the listening position, as presented in Figure 5.2. Two-loudspeaker setups are very common, though quite often they do not match the figure. In domestic use and in car audio, the listener is typically not situated in the centre, so the loudspeakers lie in different directions and at different distances from the listener than in Figure 5.2. Even then, two-channel reproduction is preferred to monophonic presentation in most cases. In some cases, such as computer audio, the listener can be assumed to be in the best listening position.

Figure 5.2 Standard stereophonic listening configuration.

This section deals with the spatial effects obtainable with such two-channel reproduction and simple processing. Two types of effects are presented: the creation of point-like sources and the creation of spatially spread sources. More advanced methods for loudspeaker reproduction with HRTF processing are discussed in Section 5.4.5. They provide more degrees of freedom, but also impose limitations on the listening position and the listening room acoustics.

5.3.1 Amplitude Panning in Loudspeakers

Amplitude panning is the most frequently used virtual-source-positioning technique. A sound signal $x(t)$ is applied to the loudspeakers with different amplitudes, which can be formulated as

$$x_i(t) = g_i\,x(t), \quad i = 1, \ldots, N, \tag{5.1}$$

where $x_i(t)$ is the signal to be applied to loudspeaker $i$, $g_i$ is the gain factor of the corresponding channel, $N$ is the number of loudspeakers, and $t$ is the time. The listener perceives a virtual source, the direction of which depends on the gain factors.

If the listener is equally distant from the loudspeakers, a panning law estimates the perceived direction θ from the gain factors of the loudspeakers. The estimated direction is called the panning direction or panning angle. In [Pul01] it was found that amplitude panning provides consistent ITD cues up to 1.1 kHz, and roughly consistent ILD cues above 2 kHz, for a listener in the best listening position. Somewhat surprisingly, at low frequencies the level differences between the loudspeakers are turned into phase differences between the ears. This happens because the sound from each loudspeaker arrives at both ears, which is called cross-talk. At high frequencies the head shadows the cross-talk path, so the level differences between the loudspeakers turn into level differences between the ears.

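Many methods have been published for estimating the perceived direction. In practice, all of the proposed methods are equally good for audio effects. The tangent law by Bennett et al. [BBE85] is formulated as

$$\frac{\tan\theta}{\tan\theta_0} = \frac{g_1 - g_2}{g_1 + g_2}, \tag{5.2}$$

where $\theta$ is the panning angle, $\theta_0$ is half of the angle between the loudspeakers, and $g_1$ and $g_2$ are the gain factors of the two loudspeakers. This law has been found to estimate the perceived direction best in listening tests in anechoic conditions [Pul01]. Other panning laws are reviewed in [Pul01].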

The panning laws set only the ratio between the gain factors. To prevent undesired changes in loudness of the virtual source depending on panning direction, the sum-of-squares of the gain factors should be normalized:

$$\sum_{i=1}^{N} g_i^2 = 1. \tag{5.3}$$

This normalization equation is used in real rooms with some reverberation. Depending on listening room acoustics, different normalization rules may be used [Moo90].
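As a rough illustration of the sum-of-squares normalization, the following sketch sweeps the panning angle across the loudspeaker base and checks that the total power stays at one. The gain ratio is the one used in M-file 5.1 (stereopan.m), written here without the division. The 5° sweep step and the variable names are assumptions made for this example only.

```matlab
% Sketch (not one of the book's M-files): sweep the panning angle and
% check that sum-of-squares normalization keeps the total power constant.
lsbase = 30/180*pi;              % half of the loudspeaker opening angle
theta  = (-30:5:30)'/180*pi;     % panning angles to test (assumed 5-degree step)

% Unnormalized gains with the same ratio g(1)/g(2) as in stereopan.m
g = [tan(lsbase)-tan(theta), tan(lsbase)+tan(theta)];

% Normalize each row so that g(1)^2 + g(2)^2 = 1
norms = sqrt(sum(g.^2, 2));
g = g ./ [norms norms];

% Total power per panning angle; every entry should be 1
total_power = sum(g.^2, 2);

% Columns: angle in degrees, gain 1, gain 2, total power
disp([theta*180/pi, g, total_power]);
```

Each row of the printed table corresponds to one panning angle. The individual gains change with the angle while the last column stays constant, which is the property the normalization is meant to provide.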

The presented analysis is valid only if the loudspeakers are equidistant from the listener, and if the base angle is not larger than about 60°. This defines the best listening area where the virtual sources are localized between the loudspeakers. The area is located around the axis of symmetry of the setup, as shown in Figure 5.2. When the listener moves away from the area, the virtual source is localized towards the nearest loudspeaker which emanates a considerable amount of sound, due to the precedence effect.

In principle, the amplitude-panning method creates a comb-filter effect in the sound spectrum, as the same sound arrives from both loudspeakers at each ear. However, this effect is relatively mild, and in a normal room the reverberation smooths the coloration considerably. The sound color is also very similar at different positions in the room. The lack of prominent coloration and the relatively robust directional effect are very probably the reasons why amplitude panning is included in all mixing consoles as a “panpot” control, which makes it the most widely used technique for positioning virtual sources.
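The comb-filter effect can be sketched with a very simple free-field model: at one ear, the sound of the nearer loudspeaker is summed with an attenuated and delayed copy from the farther loudspeaker. The delay of 0.25 ms and the attenuation of 0.7 are assumed values chosen only for illustration, and the model ignores head diffraction and room reflections.

```matlab
% Sketch with assumed values: comb filtering at one ear for a
% centre-panned source (equal gains in both loudspeakers).
tau = 0.25e-3;            % assumed extra travel time of the far path in s
a   = 0.7;                % assumed attenuation of the far path
f   = (0:10:20000)';      % frequency grid in Hz

% Ear signal = near path + attenuated, delayed far path
H = 1 + a*exp(-1j*2*pi*f*tau);
mag_dB = 20*log10(abs(H));

% Locate the deepest dip below 4 kHz; the model predicts the first
% notch at f = 1/(2*tau)
[min_dB, idx] = min(mag_dB(f <= 4000));
first_notch = f(idx);
fprintf('First notch near %.0f Hz, depth %.1f dB\n', first_notch, min_dB);
```

Because the far path is attenuated, the notches in this model have a finite depth instead of cancelling completely, which is in line with the mild coloration described above.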

M-file 5.1 (stereopan.m)

% stereopan.m
% Author: V. Pulkki
% Stereophonic panning example with tangent law
Fs=44100;
theta=-20; % Panning direction
% Half of opening angle of loudspeaker pair
lsbase=30;
% Moving to radians
theta=theta/180*pi;
lsbase=lsbase/180*pi;
% Computing gain factors with tangent law
g(2)=1; % initial value has to be one
g(1)=- (tan(theta)-tan(lsbase)) / (tan(theta)+tan(lsbase)+eps);
% Normalizing the sum-of-squares
g=g/sqrt(sum(g.^2));
% Signal to be panned
signal=mod([1:20000]',200)/200;
% Actual panning
loudsp_sig=[signal*g(1) signal*g(2)];
% Play audio out with two loudspeakers
soundsc(loudsp_sig,Fs);

5.3.2 Time and Phase Delays in Loudspeaker Playback

When a constant delay is applied to one loudspeaker in stereophonic listening, virtual sources with transient signals are perceived to migrate towards the loudspeaker that radiates the earlier sound signal [Bla97]. Maximal effect is achieved asymptotically when the delay is approximately 1.0 ms or more. However, the effect depends on the signal used. With continuous signals containing low frequencies, the effect is much less prominent than with modulated signals containing high frequencies.

In such processing the phase or time delays between the loudspeakers are turned at low frequencies into level differences between the ears, and at high frequencies into time differences between the ears. All this makes the virtual source direction depend on frequency [Coo87, Lip86]. The produced binaural cues vary with frequency, and different cues suggest different directions for virtual sources [PKH99]. The processing may thus generate a “spread” perception of the direction of sound, which is desirable in some cases. The effect depends on the listening position. For example, if the sound signal is delayed by 1 ms in one loudspeaker, the listener can compensate for the delay by moving about 30 cm towards the delayed loudspeaker.
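The order of magnitude of this compensating movement can be checked with a small geometric sketch. It assumes a speed of sound of 343 m/s, loudspeakers at ±30° and 2 m from the initial listening position, and a purely sideways movement of the listener. All of these values are assumptions for illustration, and the result changes somewhat with the geometry.

```matlab
% Sketch with assumed geometry: how far must the listener move sideways
% to offset a 1 ms delay applied to one loudspeaker?
c = 343;                    % assumed speed of sound in m/s
r = 2;                      % assumed loudspeaker distance in m
lsbase = 30/180*pi;         % half of the loudspeaker opening angle
delay = 0.001;              % delay applied to one loudspeaker in s

% Loudspeaker positions [x y]; the listener starts at the origin
ls_delayed = r*[ sin(lsbase) cos(lsbase)];
ls_other   = r*[-sin(lsbase) cos(lsbase)];

% Candidate sideways shifts towards the delayed loudspeaker, in m
d = (0:0.001:0.6)';
pos = [d zeros(size(d))];
dist_delayed = sqrt(sum((pos - repmat(ls_delayed, length(d), 1)).^2, 2));
dist_other   = sqrt(sum((pos - repmat(ls_other,   length(d), 1)).^2, 2));

% Arrival-time advantage gained by the delayed loudspeaker
advantage = (dist_other - dist_delayed)/c;

% Shift at which the advantage is closest to the applied delay
[~, idx] = min(abs(advantage - delay));
shift_needed = d(idx);
fprintf('A shift of about %.2f m offsets a %.1f ms delay\n', ...
        shift_needed, delay*1000);
```

With this geometry the path-length difference grows roughly one-to-one with the sideways shift, so the required movement is close to the distance sound travels during the delay.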

M-file 5.2 (delaypan.m)

% delaypan.m
% Author: V. Pulkki
% Creating spatially spread virtual source by delaying one channel
Fs=44100;
% Delay parameter for channel 1 in seconds
delay=0.005;
% Corresponding number of delayed samples
delaysamp=round(delay*Fs);
% Signal to be used
signal=mod([1:20000]',400)/400;
signal(1:2000)=signal(1:2000).*[1:2000]'/2000; % Fade in
% Delaying first channel
loudsp_sig=[[zeros(delaysamp,1); signal(1:end-delaysamp)] signal];
% Play audio with loudspeakers
soundsc(loudsp_sig,Fs);

A special case of a phase difference in stereophonic reproduction is the use of antiphasic signals in the loudspeakers. In this technique the same signal is applied to both loudspeakers, but the polarity of one loudspeaker signal is inverted, which produces a constant 180° phase difference between the signals at all frequencies. This changes the perceived sound color and also spreads the virtual sources. Depending on the listening position, the low frequencies may be cancelled out. At higher frequencies the cancellation is milder, and it is also milder in rooms with longer reverberation. The directional perception of the antiphasic virtual source depends on the listening position. In the sweet spot, the high frequencies are perceived at the center and the low frequencies in random directions. Outside the sweet spot, the direction is either random or towards the closest loudspeaker. Professional audio engineers call this effect “phasy”, or say that “there is phase error in here”.

M-file 5.3 (phaseinvert.m)

% phaseinvert.m
% Author: V. Pulkki
% Create a spread virtual source by inverting phase in one loudspeaker
Fs=44100;
signal=mod([1:20000]',400)/400; % Signal to be used
% Inverting one loudspeaker signal
loudsp_sig=[-signal signal];
% Play audio out with two loudspeakers
soundsc(loudsp_sig,Fs);

A further method to spread the virtual source between the loudspeakers is to change the phase spectrum of the sound differently at different frequencies. A basic method is to convolve the signal for the loudspeakers with two different short bursts of white noise. Another method is to apply a different delay to different frequencies. This effectively spreads out the virtual source between the loudspeakers, and the effect is audible over a large listening area. Unfortunately, the processing changes the temporal response slightly, which may be audible as temporal smearing of transients of the signal.
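One way to change only the phase spectrum can be sketched in the frequency domain: each channel is multiplied by its own unit-magnitude response with a random phase per frequency bin. This is an assumed illustration and not one of the book's M-files. The filtering is circular over the whole block, which is acceptable for the periodic test signal used here but would not hold for general material.

```matlab
% Sketch (assumed approach): give each channel its own unit-magnitude,
% random-phase frequency response so that only the phase spectrum changes.
Fs = 44100;
signal = mod((1:20000)', 400)/400;   % same test signal as in the M-files
N = length(signal);                  % an even length is assumed below
X = fft(signal);
npos = N/2 - 1;                      % bins strictly between DC and Nyquist

loudsp_sig = zeros(N, 2);
for ch = 1:2
    ph = 2*pi*rand(npos, 1);         % independent random phase per bin
    % A conjugate-symmetric response keeps the output real;
    % the DC and Nyquist bins are left untouched
    H = [1; exp(1j*ph); 1; exp(-1j*flipud(ph))];
    loudsp_sig(:, ch) = real(ifft(X .* H));
end

% The magnitude spectrum is unchanged; only the phase differs per channel
mag_err = max(abs(abs(fft(loudsp_sig(:, 1))) - abs(X)));

% soundsc(loudsp_sig, Fs);           % uncomment to listen
```

The value of mag_err stays at rounding-error level, which confirms that the processing alters the phase but not the magnitude of each frequency component.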

Below is an example of creating spread virtual sources for stereophonic listening by convolving the sound with short noise bursts:

M-file 5.4 (spreadnoise.m)

% spreadnoise.m
% Author: V. Pulkki
% Example how to spread a virtual source over N loudspeakers
Fs=44100;
signal=mod([1:20000]',400)/400; % Signal to be used
NChan=2; % Number of channels
% Generate noise bursts for all channels
nois=rand(round(0.05*Fs),NChan)-0.5;
% Convolve signal with bursts
loudsp_sig=conv(signal,nois(:,1));
for i=2:NChan
  loudsp_sig=[loudsp_sig conv(signal,nois(:,i))];
end
if NChan == 2
  % Play audio out with loudspeakers
  soundsc(loudsp_sig,Fs);
else
  % Scale the peak absolute value to 0.9 to avoid clipping
  loudsp_sig=loudsp_sig/max(abs(loudsp_sig(:)))*0.9;
  % Write file to disk (audiowrite replaces wavwrite, which newer
  % MATLAB releases no longer provide)
  audiowrite('burstex.wav',loudsp_sig,Fs,'BitsPerSample',16);
end

5.3.3 Listening to Two-channel Stereophonic Material with Headphones

Headphone listening differs significantly from loudspeaker listening. In headphones the cross-talk present in loudspeaker listening is missing: the sound from the left headphone enters only the left ear canal, and similarly on the right side. Audio engineers typically create stereophonic content in studios using two-channel loudspeaker listening. It is therefore relevant to ask how the spatial perception of the content changes when it is listened to over headphones.

With amplitude-panned virtual sources the level difference between the headphone channels is turned directly into ILD, and ITD remains zero. This is very different from loudspeaker listening, where the direction of amplitude-panned sources relies on ITD cues and ILD remains zero at low frequencies. Although this seems a potential source of large differences in the spatial perception of the resulting virtual sources, the resulting spatial image is similar. The virtual sources are ordered from left to right in about the same order as in loudspeaker listening, but in headphone listening the sources are perceived inside the listener's head. This internalization has two causes. First, the dynamic cues suggest internalized sources, since the ITD and ILD do not change with listener movements. Second, the monaural spectral cues do not suggest external sources, since they are very different from the cues produced by distant sources.
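The level difference that headphones turn into ILD can be computed directly from the gain factors. The sketch below does this for the panning direction used in M-file 5.1 (stereopan.m). It assumes ideal headphones with matched left and right responses, so that the inter-channel level difference can be taken as the ILD, and the variable names are chosen for this example only.

```matlab
% Sketch: inter-channel level difference of an amplitude-panned source,
% taken as the ILD under the assumption of ideal, matched headphones.
theta  = -20/180*pi;      % panning direction used in stereopan.m
lsbase =  30/180*pi;      % half of the loudspeaker opening angle

% Gain factors with the same ratio as in stereopan.m, then normalized
g = [tan(lsbase)-tan(theta), tan(lsbase)+tan(theta)];
g = g/sqrt(sum(g.^2));

% Level difference between channel 1 and channel 2 in dB
ild_dB = 20*log10(g(1)/g(2));
fprintf('Inter-channel level difference: %.1f dB\n', ild_dB);
```

Since no delay is applied between the channels, the ITD in this idealized headphone case is zero regardless of the panning direction.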

If the stereophonic material includes virtual sources that have been spatialized by applying time delays, as in Section 5.3.2, the spatial perception in headphone listening may be vastly different. For example, a 5 ms delay in the left loudspeaker may produce a spread perception of the sound in loudspeaker listening, but in headphone listening the sound can be perceived to originate only from the right headphone.

Section 5.3.2 also described the technique of spreading out virtual sources by convolution with noise bursts. This technique produces a similar effect in both headphone and loudspeaker listening. The frequency-dependent alteration of signal phase and magnitude creates ITD and ILD cues that change as a function of frequency in both cases. Of course, the effect is not the same: in headphone listening the sound is perceived inside the head, and in loudspeaker listening it is perceived between the loudspeakers.
