13.3 Cross-adaptive AM-DAFX

DAFX architectures have been classified by their implementation [Zöl05]: filters, delays, modulators, time-segment processing, time-frequency processing, etc. Similarly, DAFX have also been classified by the perceptual attributes [ABLAV03] which they modify, e.g., timbre, delay, pitch, position or quality. Although these classifications tend to be accurate in many contexts, they are not optimal for understanding the signal-processing control architectures of some more complex effects. More recently, an adaptive digital audio effect (ADAFx) class was proposed [VZA06]. This class uses features extracted from the signals to control the signal processing. In terms of their adaptive properties, digital audio effects may be distinguished as follows:

1. Direct user control: Features are not extracted from input signals so they are non-adaptive. A multi-source extension of this approach is the result of unifying the user interface, for example when linking a stereo equaliser. This provides exactly the same equalisation for the left and right channel using a single user panel. Although the user interface is unified, the output signal processing is independent of the signal content.

2. Auto-adaptive: Control parameters are based on a feature extracted from the input source. These include, for example, auto tuning, harmonisers, simple single-channel noise gates and compressors.

3. External-adaptive: The system takes its control processing variable from a different source to the one on which it has been applied. This is the case for ducking effects, side-chain gates and side-chain compressors.

4. Cross-adaptive effects: Signal processing is the direct result of the analysis of the content of each individual channel with respect to the other channels. The signal processing in such devices is accomplished by inter-source dependency. It is a feedforward cross-adaptive effect if it takes its control variable from the inputs, and it is called a feedback cross-adaptive effect if it takes its control feature from the outputs.
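The difference between these classes lies in where the control value comes from. The following minimal Python sketch illustrates this for classes 2 to 4, using block RMS level as the feature. The function names, the `target`, `threshold` and `duck` parameters, and the level-matching rule are illustrative assumptions, not taken from the cited works.

```python
import math

def rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))

def auto_adaptive_gain(x, target=0.1):
    """Class 2: the control value comes from the source being processed."""
    level = rms(x)
    return target / level if level > 0.0 else 1.0

def external_adaptive_gain(x_e, threshold=0.1, duck=0.5):
    """Class 3 (ducking): the control value comes from a different source x_e."""
    return duck if rms(x_e) > threshold else 1.0

def cross_adaptive_gains(blocks):
    """Class 4 (feedforward): every gain depends on the levels of all inputs.

    Assumed rule for illustration: bring each source to the mean input level.
    """
    levels = [rms(b) for b in blocks]
    mean_level = sum(levels) / len(levels)
    return [mean_level / lv if lv > 0.0 else 1.0 for lv in levels]
```

In the cross-adaptive case, changing any one input changes the gains of all channels, which is the inter-source dependency described in item 4.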

When mixing audio, the user tends to perform signal-processing changes on a given signal source not only because of the source content, but also because there is a simultaneous need to blend it with the content of other sources, so that an overall mix balance is achieved. There is a need to be aware of the relationship between all the sources involved in the audio mixture. Thus, a cross-adaptive effect-processing architecture is ideal for automatic mixing. The general block diagram of a cross-adaptive device is depicted in Figure 13.4. Due to the importance of source inter-relationship in audio mixing for music, we can add another design objective to be performed by the AM-DAFX:

5. The signal processing of an individual source is the result of the inter-dependent relationships between all sources involved.

Figure 13.4 General diagram of a cross-adaptive device using side-chain processing.

A cross-adaptive process is characterised by the use of a multi-input multi-output (MIMO) architecture. For the sake of simplicity, the MIMO systems in this chapter are defined to have the same number of inputs and outputs, unless stated otherwise. We will identify inputs as xm(n) and outputs as ym(n), where m = 0, …, M − 1 and M is the maximum number of input sources involved in the signal-processing section of the AM-DAFX. External sources are denoted xe(n), as in Figure 13.2. Throughout this chapter we will use an architecture that does not make use of feedback. Therefore the side-chain processing inputs will be taken only from the input of the signal-processing section of the AM-DAFX. In a cross-adaptive AM-DAFX the side chain consists of two main sections:

1. A feature extraction processing section.

2. A cross-adaptive feature processing block.

The feature extraction vector for all sources, obtained from the feature extraction processing section, will be denoted fvm(n), where n denotes the discrete time index in samples. The control data vectors for all sources, obtained from the cross-adaptive feature processing block, will be denoted as cvm(n).
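As a rough illustration of this feedforward structure, the sketch below processes one block of samples per source: a feature fvm is extracted from each input, the cross-adaptive block turns all features into control data cvm, and the control data are applied in the signal path. The function names, the use of peak level as the feature, and the use of cvm as a plain gain are assumptions made for the example.

```python
from typing import Callable, List, Sequence

Block = Sequence[float]

def cross_adaptive_device(
    inputs: List[Block],
    extract: Callable[[Block], float],
    cross_process: Callable[[List[float]], List[float]],
) -> List[List[float]]:
    """Feedforward MIMO device with M inputs and M outputs."""
    # Side chain, section 1: one feature per input source (fv_m).
    fv = [extract(x_m) for x_m in inputs]
    # Side chain, section 2: control data (cv_m) derived from all features.
    cv = cross_process(fv)
    # Signal path: here each cv_m is simply applied as a gain to x_m.
    return [[c * s for s in x_m] for x_m, c in zip(inputs, cv)]

def peak(block: Block) -> float:
    """Example feature: absolute peak level of the block."""
    return max(abs(s) for s in block)

def match_loudest(fv: List[float]) -> List[float]:
    """Example cross-adaptive rule: raise every source to the highest peak."""
    top = max(fv)
    return [top / f if f > 0.0 else 1.0 for f in fv]

outputs = cross_adaptive_device([[0.5, -0.25], [0.1, 0.2]], peak, match_loudest)
```

Because the side chain reads only the inputs xm(n), no feedback path exists and the outputs never influence the control data.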

13.3.1 Feature Extraction for AM-DAFX

The feature extraction processing block is in charge of extracting a series of features per input channel. The ability to extract the features fast and accurately will determine the ability of the system to perform appropriately in real-time. The better the model for extracting a feature, the better the algorithm will perform. For example, if perceptual loudness is the feature to be extracted, the model of loudness chosen to extract the feature will have a direct impact on the performance of the system. According to their feature usage AM-DAFX can be in one of two forms:

1. Accumulative: This type of AM-DAFX aims to achieve a converging data value which improves in accuracy with time in proportion to the amount and distribution of data received. The system has no need to continuously update the data control stream, which means that the accumulative AM-DAFX can operate on systems which are performing real-time signal-processing operations, even if the feature extraction process is non-real-time. The main idea behind accumulative AM-DAFX, as implemented herein, is to obtain the probability mass function of the feature under study and use the most probable solution as the driving feature of the system. In other words we derive the mode, which corresponds to the peak value of the accumulated extracted feature.

2. Dynamic: This type of AM-DAFX makes use of fast extractable features to drive data-control processing parameters in real-time. An example of such a dynamic system can be a system which uses an RMS feature to ride vocals against background music. Another example can be gain-sharing algorithms for controlling microphones such as the one originally implemented in [Dug75]. Dynamic AM-DAFX do not tend to converge to a static value. A compromise between dynamic and accumulative feature extraction can be achieved by using relatively small accumulative windows with weighted averages.
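The two forms can be contrasted with a short sketch. The accumulative estimator below builds a histogram of the received feature values and reports its mode, while the dynamic estimator tracks the feature with a weighted average. The class names, the bin count, the assumption that feature values lie in [0, 1], and the choice of an exponential weighting are all illustrative and not prescribed by the text.

```python
class AccumulativeFeature:
    """Accumulates a histogram and reports the most probable value (mode)."""

    def __init__(self, n_bins=100):
        self.counts = [0] * n_bins

    def update(self, value):
        """Add one feature value, assumed to lie in [0, 1]."""
        n = len(self.counts)
        index = min(max(int(value * n), 0), n - 1)
        self.counts[index] += 1

    def mode(self):
        """Centre of the fullest bin, or None if nothing was received."""
        if sum(self.counts) == 0:
            return None
        n = len(self.counts)
        index = max(range(n), key=self.counts.__getitem__)
        return (index + 0.5) / n


class DynamicFeature:
    """Tracks a feature with an exponentially weighted average."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # larger alpha reacts faster to new values
        self.value = None

    def update(self, value):
        if self.value is None:
            self.value = value
        else:
            self.value += self.alpha * (value - self.value)
        return self.value
```

The accumulative estimate settles as more data arrive, whereas the dynamic estimate keeps following the input. A small `alpha` moves the dynamic estimator towards the compromise mentioned above.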

An important consideration to be taken into account during the feature extraction process is noise. The existence of bleed, crosstalk, self-noise and ambient noise will influence the reliability of the feature extraction. Common methods for obtaining more reliable features include averaging, coherence validation and gating. One of the most common methods used for AM-DAFX is adaptive gating, where the gating threshold adapts according to the existing noise. This method was introduced to automatic mixing applications by [Dug75, Dug89]. It requires an input noise source which is representative of the noise in the system. In the case of a live system a microphone outside of the input source capture area is a good representation of ambient noise. Therefore this microphone signal can be used to derive the adaptive threshold needed to validate the feature.
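A minimal sketch of adaptive gating, assuming block-based RMS levels, is given below. The threshold follows the level of a dedicated noise microphone, and a feature is accepted only when the source exceeds it. The `margin` factor and the function names are assumptions for illustration and do not reproduce the method of [Dug75, Dug89].

```python
import math

def rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))

def gated_feature(source_block, noise_block, margin=2.0):
    """Return the source level if it is valid, otherwise None.

    The gate threshold adapts to the current noise level: it is the RMS
    of the noise microphone block multiplied by an assumed safety margin.
    """
    threshold = margin * rms(noise_block)
    level = rms(source_block)
    return level if level > threshold else None

def valid_features(source_blocks, noise_blocks, margin=2.0):
    """Keep only the features that pass the adaptive gate."""
    features = []
    for src, noise in zip(source_blocks, noise_blocks):
        f = gated_feature(src, noise, margin)
        if f is not None:
            features.append(f)
    return features
```

Blocks in which the source is no louder than the ambient noise are discarded, so they cannot bias an accumulated feature towards the noise floor.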

For accumulative AM-DAFX, variance threshold measures can be used to validate the accuracy of the probability mass function peak value. The choice of feature extraction model will influence the convergence time needed to achieve the desired variance. For this to work appropriately in a system that is receiving an unknown input signal, in real-time, some re-scaling operations must be undertaken. First, the probability mass function must always sum to one. Second, if the maximum dynamic range of the feature is unknown, the probability mass function must be re-scaled. In such a case, the axis range should be normalised continuously to unity by dividing all received feature magnitudes by the magnitude of the maximum received input value.
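The two re-scaling operations can be sketched as follows. For simplicity this example keeps all received feature magnitudes and rebuilds the histogram on demand, which is an assumption made for clarity rather than an efficient real-time scheme. The function names and the bin count are hypothetical.

```python
def rescaled_pmf(values, n_bins=10):
    """Probability mass function over a feature axis normalised to [0, 1].

    `values` must be a non-empty list of non-negative feature magnitudes.
    Each value is divided by the largest magnitude received so far, and
    the bin counts are divided by the number of values so they sum to one.
    """
    peak = max(values)
    counts = [0] * n_bins
    for v in values:
        u = v / peak if peak > 0.0 else 0.0
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    total = len(values)
    return [c / total for c in counts]

def most_probable(values, n_bins=10):
    """Mode of the re-scaled histogram, mapped back to the original units."""
    pmf = rescaled_pmf(values, n_bins)
    index = max(range(n_bins), key=pmf.__getitem__)
    return (index + 0.5) / n_bins * max(values)
```

Whenever a new maximum arrives, every earlier value lands in a lower normalised bin, which is why the axis has to be re-scaled continuously rather than once.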

An example of the effects of adaptive gating and re-scaling in an accumulative feature extraction block is shown in Figure 13.5. In this example the feature under study is loudness, which has been extracted from a musical test signal. If neither re-scaling nor adaptive gating is used to optimise the loudness probability mass function, the resulting most probable feature value is always 0, as shown in Figure 13.5(a). This is because there is a large amount of low-level noise which biases the loudness measurement. A second test with re-scaling and no adaptive gating is shown in Figure 13.5(b). Although a Gaussian shape corresponding to the actual loudness is visible, there are still a large number of data points in the lowest bin of the histogram, causing an erroneous null measurement. When adaptive gating is performed without re-scaling, Figure 13.5(c), the number of zero-bin occurrences is dramatically reduced. Finally, a test consisting of both re-scaling and adaptive gating is depicted in Figure 13.5(d). It can be seen that the algorithm is able to correctly identify the most probable feature value. This means that both adaptive re-scaling and gating must be performed in order to achieve accurate extraction of the most probable feature value.

Figure 13.5 Accumulated histograms. The circular marker denotes the resulting accumulated peak loudness value [PGR09a].

13.3.2 Cross-adaptive Feature Processing

The cross-adaptive processing section of the AM-DAFX is in charge of determining the inter-dependence of the input features in order to output the appropriate control data. These data control parameters in the signal-processing section of the AM-DAFX. The obtained control parameters are usually interpolated before being sent to the signal-processing portion of the AM-DAFX. This can be achieved using a low-pass filter that will ensure a smooth interpolation between control data points. The cross-adaptive feature processing can be implemented by a mathematical function that maps the inter-dependence between channels. In many cases constraint rules can be used to narrow the inter-dependency between channels. To keep the cross-adaptive processing system stable, the overall gain contribution of the resulting control signals can be normalised so that the sum of all source control gains is equal to unity. The cross-adaptive function is unique for every design, and has to be individually derived according to the aim of the AM-DAFX.
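Two of the steps described above, the unity-sum normalisation and the low-pass interpolation of the control data, can be sketched as follows. The one-pole filter, its coefficient and the names used are assumptions chosen for brevity. The cross-adaptive function itself is design-specific and is not shown.

```python
def normalise_gains(gains):
    """Scale non-negative control gains so that they sum to unity."""
    total = sum(gains)
    n = len(gains)
    if total <= 0.0:
        return [1.0 / n] * n  # assumed fallback: share the gain equally
    return [g / total for g in gains]


class OnePoleSmoother:
    """First-order low-pass filter used to interpolate control data."""

    def __init__(self, coeff=0.9, initial=0.0):
        self.coeff = coeff  # closer to 1.0 gives slower, smoother changes
        self.state = initial

    def step(self, target):
        """Move the smoothed value one step towards the new control value."""
        self.state = self.coeff * self.state + (1.0 - self.coeff) * target
        return self.state


# Example: normalise two raw gains, then smooth the first one over time.
cv = normalise_gains([1.0, 3.0])
smoother = OnePoleSmoother(coeff=0.5)
ramp = [smoother.step(cv[0]) for _ in range(3)]
```

Normalising before smoothing keeps the target gains bounded, while the smoother avoids audible steps when the control data change between updates.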
