## Digital principles

### 3.1 Pure binary code

For digital audio use, the prime purpose of binary numbers is to express the values of the samples which represent the original analog sound-velocity or pressure waveform. Figure 3.1 shows some binary numbers and their equivalent in decimal. The radix point has the same significance in binary: symbols to the right of it represent one half, one quarter and so on. Binary is convenient for electronic circuits, which do not get tired, but numbers expressed in binary become very long, and writing them is tedious and error-prone. The octal and hexadecimal notations are both used for writing binary since conversion is so simple. Figure 3.1 also shows that a binary number is split into groups of three or four digits starting at the least significant end, and the groups are individually converted to octal or hexadecimal digits. Since sixteen different symbols are required in hex, the letters A–F are used for the numbers above nine.
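The grouping rule described above can be sketched in a few lines of Python. The function names and the sixteen-bit test value are chosen here purely for illustration and do not come from Figure 3.1.

```python
def group_bits(bits: str, size: int) -> list[str]:
    """Split a binary string into groups of `size` bits, starting at the LSB end."""
    # Pad with leading zeros so that the length is a whole number of groups.
    padded = bits.zfill(-(-len(bits) // size) * size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def to_octal(bits: str) -> str:
    # Each group of three bits becomes one symbol 0-7.
    return "".join(str(int(group, 2)) for group in group_bits(bits, 3))

def to_hex(bits: str) -> str:
    # Each group of four bits becomes one symbol 0-F.
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in group_bits(bits, 4))

sample = "1100100111110101"   # arbitrary sixteen-bit value (an assumption)
print(group_bits(sample, 4))  # ['1100', '1001', '1111', '0101']
print(to_hex(sample))         # C9F5
print(to_octal(sample))       # 144765
```

Because each group is converted independently, no arithmetic on the whole number is needed, which is why the conversion is so easy to do by eye.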

There will be a fixed number of bits in a PCM audio sample, and this number determines the size of the quantizing range. In the sixteen-bit samples used in much digital audio equipment, there are 65 536 different numbers. Each number represents a different analog signal voltage, and care must be taken during conversion to ensure that the signal does not go outside the convertor range, or it will be clipped. In Figure 3.2 it will be seen that in a sixteen-bit pure binary system, the number range goes from 0000 hex, which represents the largest negative voltage, through 7FFF hex, which represents the smallest negative voltage, through 8000 hex, which represents the smallest positive voltage, to FFFF hex, which represents the largest positive voltage. Effectively, the zero voltage level of the audio has been shifted so that the positive and negative voltages in a real audio signal can be expressed by binary numbers which are only positive. This approach is called offset binary, and is perfectly acceptable where the signal has been digitized only for recording or transmission from one place to another, after which it will be converted directly back to analog. Under these conditions it is not necessary for the quantizing steps to be uniform, provided both ADC and DAC are constructed to the same standard. In practice, it is the requirements of signal processing in the digital domain which make both non-uniform quantizing and offset binary unsuitable.

Figure 3.1    (a) Binary and decimal. (b) In octal, groups of three bits make one symbol 0–7. (c) In hex, groups of four bits make one symbol 0–F. Note how much shorter the number is in hex.

Figure 3.2    Offset binary coding is simple but causes problems in digital audio processing. It is seldom used.

Figure 3.3    Attenuation of an audio signal takes place with respect to midrange.

Figure 3.3 shows that an audio signal voltage is referred to midrange. The level of the signal is measured by how far the waveform deviates from midrange, and attenuation, gain and mixing all take place around that midrange. Audio mixing is achieved by adding sample values from two or more different sources, but unless all the quantizing intervals are of the same size, the sum of two sample values will not represent the sum of the two original analog voltages. Thus sample values which have been obtained by non-uniform quantizing cannot readily be processed.

Figure 3.4    The result of an attempted attenuation in pure binary code is an offset. Pure binary cannot be used for digital audio processing.

If two offset binary sample streams are added together in an attempt to perform digital mixing, the result will be that the offsets are also added and this may lead to an overflow. Similarly, if an attempt is made to attenuate by, say, 6.02 dB by dividing all the sample values by two, Figure 3.4 shows that the offset is also divided and the waveform suffers a shifted baseline. The problem with offset binary is that it works with reference to one end of the range. What is needed is a numbering system which operates symmetrically with reference to the centre of the range.
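Both failures can be reproduced numerically. The sketch below assumes a sixteen-bit system in which a signed sample value v is coded as v + 8000 hex; the helper names and sample values are illustrative only.

```python
BITS = 16
OFFSET = 1 << (BITS - 1)      # 8000 hex, the midrange code

def to_offset_binary(value: int) -> int:
    return value + OFFSET

def from_offset_binary(code: int) -> int:
    return code - OFFSET

a, b = 1000, -400             # two small signed sample values (assumed)

# Attempted mixing: the two offsets are added as well as the samples.
naive_sum = to_offset_binary(a) + to_offset_binary(b)
print(hex(naive_sum), naive_sum > 0xFFFF)    # 0x10258 True: overflow

# Attempted 6.02 dB attenuation: the offset is halved along with the signal.
halved = to_offset_binary(a) // 2
print(from_offset_binary(halved))            # -15884, not the wanted 500
```

The wanted results, 600 and 500, only appear if the offset is removed before the arithmetic and restored afterwards, which is what a midrange-referenced code achieves without extra steps.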

### 3.2 Two’s complement

In the two’s complement system, the upper half of the pure binary number range has been redefined to represent negative quantities. If a pure binary counter is constantly incremented and allowed to overflow, it will produce all the numbers in the range permitted by the number of available bits, and these are shown for a four-bit example drawn around the circle in Figure 3.5. As a circle has no real beginning, it is possible to consider it to start wherever it is convenient. In two’s complement, the quantizing range represented by the circle of numbers does not start at zero, but starts on the diametrically opposite side of the circle. Zero is midrange, and all numbers with the MSB (most significant bit) set are considered negative. The MSB is thus the equivalent of a sign bit where 1 = minus. Two’s complement notation differs from pure binary in that the most significant bit is inverted in order to achieve the half-circle rotation.

Figure 3.5    In this example of a four-bit two’s complement code, the number range is from –8 to +7. Note that the MSB determines polarity.

Figure 3.6    A two’s complement ADC. At (a) an analog offset voltage equal to one-half the quantizing range is added to the bipolar analog signal in order to make it unipolar as at (b). The ADC produces positive-only numbers at (c), but the MSB is then inverted at (d) to give a two’s complement output.

Figure 3.6 shows how a real ADC is configured to produce two’s complement output. At (a) an analog offset voltage equal to one half the quantizing range is added to the bipolar analog signal in order to make it unipolar as at (b). The ADC produces positive only numbers at (c) which are proportional to the input voltage. The MSB is then inverted at (d) so that the all-zeros code moves to the centre of the quantizing range. The analog offset is often incorporated into the ADC as is the MSB inversion. Some convertors are designed to be used in either pure binary or two’s complement mode. In this case the designer must arrange the appropriate DC conditions at the input. The MSB inversion may be selectable by an external logic level.
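The effect of the MSB inversion can be checked with the four-bit range of Figure 3.5. This is a minimal sketch; the function names are assumptions made for the example.

```python
BITS = 4
MSB = 1 << (BITS - 1)

def offset_to_twos(code: int) -> int:
    """Invert the MSB of a positive-only ADC code to give the two's complement code."""
    return code ^ MSB

def twos_value(code: int) -> int:
    """Interpret a BITS-wide code as a signed two's complement number."""
    return code - (1 << BITS) if code & MSB else code

for code in range(1 << BITS):
    twos = offset_to_twos(code)
    print(f"{code:04b} -> {twos:04b} = {twos_value(twos):+d}")
```

The lowest positive-only code 0000 becomes 1000 (–8), the midrange code 1000 becomes 0000 (zero) and the highest code 1111 becomes 0111 (+7).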

The two’s complement system allows two sample values to be added, or mixed in audio parlance, and the result will be referred to the system midrange; this is analogous to adding analog signals in an operational amplifier. Figure 3.7 illustrates how adding two’s complement samples simulates the audio mixing process. The waveform of input A is depicted by solid black samples, and that of B by samples with a solid outline. The result of mixing is the linear sum of the two waveforms obtained by adding pairs of sample values. The dashed lines depict the output values. Beneath each set of samples is the calculation which will be seen to give the correct result. Note that the calculations are pure binary. No special arithmetic is needed to handle two’s complement numbers.

Figure 3.7    Using two’s complement arithmetic, single values from two waveforms are added together with respect to midrange to give a correct mixing function.
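The claim that no special arithmetic is needed can be tested with a plain binary adder whose carry-out is discarded. The sketch below uses four-bit words; the sample pairs are arbitrary and are not the values drawn in Figure 3.7.

```python
BITS = 4
MASK = (1 << BITS) - 1
MSB = 1 << (BITS - 1)

def encode(value: int) -> int:
    """Signed integer to four-bit two's complement code."""
    return value & MASK

def decode(code: int) -> int:
    """Four-bit two's complement code to signed integer."""
    return code - (1 << BITS) if code & MSB else code

def mix(code_a: int, code_b: int) -> int:
    """Pure binary addition; the carry out of the top bit is simply dropped."""
    return (code_a + code_b) & MASK

for a, b in [(3, 2), (-3, 5), (-4, -2), (6, -7)]:
    result = mix(encode(a), encode(b))
    print(f"{encode(a):04b} + {encode(b):04b} = {result:04b}  ({a:+d} {b:+d} = {decode(result):+d})")
```

All four sums stay within the range –8 to +7, so the decoded result equals the ordinary signed sum in each case.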

Figure 3.8 shows some audio waveforms at various levels with respect to the coding values. Where an audio waveform just fits into the quantizing range without clipping it has a level which is defined as 0 dBFs where Fs indicates full scale. Reducing the level by 6.02 dB makes the signal half as large and results in the second bit in the sample becoming the same as the sign bit. Reducing the level by a further 6.02 dB to –12 dBFs will make the second and third bits the same as the sign bit and so on. If a signal at –36 dBFs is input to a sixteen-bit system, only ten bits will be active, the remainder will copy the sign bit. For the best performance, analog inputs to digital systems must have sufficient levels to exercise the whole quantizing range.
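The relationship between level and active bits can be checked by counting how many bits below the MSB merely repeat the sign bit. In this sketch each 6.02 dB step is modelled as an exact halving of the peak value, which is an assumption made to keep the arithmetic in integers.

```python
BITS = 16
MASK = (1 << BITS) - 1

def redundant_sign_bits(sample: int) -> int:
    """Count the bits directly below the MSB which are copies of the sign bit."""
    code = sample & MASK
    sign = (code >> (BITS - 1)) & 1
    count = 0
    for position in range(BITS - 2, -1, -1):
        if (code >> position) & 1 != sign:
            break
        count += 1
    return count

full_scale = (1 << (BITS - 1)) - 1           # largest positive code, 7FFF hex
for halvings in (0, 1, 2, 6):
    peak = full_scale >> halvings            # each halving is about 6.02 dB
    active = BITS - redundant_sign_bits(peak)
    print(f"{-6.02 * halvings:7.2f} dBFs  peak {peak:5d}  active bits {active}")
```

Six halvings, roughly –36 dBFs, leave ten active bits in a sixteen-bit word, in agreement with the figure quoted above.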

Figure 3.8    0 dBFs is defined as the level of the largest sinusoid which will fit into the quantizing range without clipping.

It is often necessary to phase reverse or invert an audio signal, for example a microphone input to a mixer. The process of inversion in two’s complement is simple. All bits of the sample value are inverted to form the one’s complement, and one is added. This can be checked by mentally inverting some of the values in Figure 3.5. The inversion is transparent and performing a second inversion gives the original sample values.

Using inversion, signal subtraction can be performed using only adding logic. The inverted input is added to perform a subtraction, just as in the analog domain. This permits a significant saving in hardware complexity, since only carry logic is necessary and no borrow mechanism need be supported.
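Inversion and subtraction by addition can both be sketched with four-bit words. The helper names are illustrative assumptions.

```python
BITS = 4
MASK = (1 << BITS) - 1
MSB = 1 << (BITS - 1)

def decode(code: int) -> int:
    return code - (1 << BITS) if code & MSB else code

def invert(code: int) -> int:
    """Two's complement inversion: form the one's complement, then add one."""
    ones_complement = code ^ MASK
    return (ones_complement + 1) & MASK

def subtract(code_a: int, code_b: int) -> int:
    """A minus B using only an adder: add the inverted B to A."""
    return (code_a + invert(code_b)) & MASK

print(f"{invert(0b0011):04b}")                 # 1101, i.e. +3 becomes -3
print(decode(subtract(0b0101, 0b0111)))        # 5 - 7 = -2
```

Note that 1000 (–8) inverts to itself in a four-bit word, because +8 lies outside the range; a second inversion still returns the original code in every case.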

In summary, two’s complement notation is the most appropriate scheme for bipolar signals, and allows simple mixing in conventional binary adders. It is in virtually universal use in digital audio processing, and is accordingly adopted by all the major digital audio interfaces and recording formats.

Two’s complement numbers can have a radix point and bits below it just as pure binary numbers can. It should, however, be noted that in two’s complement, if a radix point exists, numbers to the right of it are added. For example, 1100.1 is not –4.5, it is –4 + 0.5 = –3.5.
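The example can be verified by giving the MSB a negative weight and every other bit, on either side of the radix point, a positive weight. The function name is an assumption made for this sketch.

```python
def twos_complement_value(bits: str) -> float:
    """Evaluate a two's complement string such as '1100.1'."""
    whole, _, fraction = bits.partition(".")
    digits = whole + fraction
    top_power = len(whole) - 1            # power of two carried by the MSB
    total = 0.0
    for index, digit in enumerate(digits):
        weight = 2.0 ** (top_power - index)
        if index == 0:
            weight = -weight              # only the MSB counts negatively
        total += int(digit) * weight
    return total

print(twos_complement_value("1100.1"))    # -3.5
print(twos_complement_value("0100.1"))    # 4.5
```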

### 3.3 Introduction to digital processing

However complex a digital process, it can be broken down into smaller stages until finally one finds that there are really only two basic types of element in use. Figure 3.9 shows that the first type is a logical element. This produces an output which is a logical function of the input with minimal delay. The second type is a storage element which samples the state of the input(s) when clocked and holds or delays that state. The strength of binary logic is that the signal has only two states, and considerable noise and distortion of the binary waveform can be tolerated before the state becomes uncertain. At every logical element, the signal is compared with a threshold, and can thus pass through any number of stages without being degraded. In addition, the use of a storage element at regular locations throughout logic circuits eliminates time variations or jitter.

Figure 3.9    Logic elements have a finite propagation delay between input and output and cascading them delays the signal an arbitrary amount. Storage elements sample the input on a clock edge and can return a signal to near coincidence with the system clock. This is known as reclocking. Reclocking eliminates variations in propagation delay in logic elements.

Figure 3.9 shows that if the inputs to a logic element change, the output will not change until the propagation delay of the element has elapsed. However, if the output of the logic element forms the input to a storage element, the output of that element will not change until the input is sampled at the next clock edge. In this way the signal edge is aligned to the system clock and the propagation delay of the logic becomes irrelevant. The process is known as reclocking.
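As a rough timing model of reclocking, each signal change can be moved to the first clock edge at or after its arrival. The 10 ns clock period and the arrival times below are assumed values, and set-up and hold requirements are ignored.

```python
def reclock(arrival_times_ns: list[int], period_ns: int) -> list[int]:
    """Return the clock edge at which each input change appears at the latch output."""
    # Ceiling division finds the first clock edge not earlier than the arrival.
    return [-(-t // period_ns) * period_ns for t in arrival_times_ns]

# Two runs of the same logic with different propagation delays (assumed values).
fast_logic = [3, 12, 21]
slow_logic = [7, 18, 29]

print(reclock(fast_logic, 10))   # [10, 20, 30]
print(reclock(slow_logic, 10))   # [10, 20, 30]
```

Provided the logic settles before the next clock edge, the output timing is the same in both cases, which is the sense in which the propagation delay becomes irrelevant.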

### 3.4 Logic elements

The two states of the signal when measured with an oscilloscope are simply two voltages, usually referred to as high and low. The actual voltage levels will depend on the type of logic family in use, and on the supply voltage used. Within logic, these levels are not of much consequence, and it is only necessary to know them when interfacing between different logic families or when driving external devices.

The pure logic designer is not interested at all in the precise signal voltages, only in their meaning. Just as the electrical waveform from a microphone represents sound velocity, so the waveform in a logic circuit represents the truth of some statement. As there are only two states, there can only be true or false meanings. The true state of the signal can be assigned by the designer to either voltage state. When a high voltage represents a true logic condition and a low voltage represents a false condition, the system is known as positive logic, or high true logic. This is the usual system, but sometimes the low voltage represents the true condition and the high voltage represents the false condition. This is known as negative logic or low true logic. Provided that everyone is aware of the logic convention in use, both work equally well.

Figure 3.10    Using open-collector drive, several signal sources can share one common bus. If negative logic is used, the bus drivers turn off their output transistors with a false input, allowing another driver to control the bus. This will not happen with positive logic.

Negative logic was found in the TTL (transistor-transistor logic) family, because in this technology it was easier to sink current to ground than to source it from the power supply. Figure 3.10 shows that if it is necessary to connect several logic elements to a common bus so that any one can communicate with any other, an open collector system is used, where high levels are provided by pull-up resistors and the logic elements only pull the common line down. If positive logic were used, when no device was operating the pull-up resistors would cause the common line to take on an absurd true state; whereas if negative logic is used, the common line pulls up to a sensible false condition when there is no device using the bus. Whilst the open collector is a simple way of obtaining a shared bus system, it is limited in frequency of operation due to the time constant of the pull-up resistors charging the bus capacitance. In the so-called tri-state bus systems, there are both active pull-up and pull-down devices connected in the so-called totem-pole output configuration. Both devices can be disabled to a third state, where the output assumes a high impedance, allowing some other driver to determine the bus state.

In logic systems, all logical functions, however complex, can be configured from combinations of a few fundamental logic elements or gates. It is not profitable to spend too much time debating which are the truly fundamental ones, since most can be made from combinations of others. Figure 3.11 shows the important simple gates and their derivatives, and introduces the logical expressions to describe them, which can be compared with the truth-table notation. The figure also shows the important fact that when negative logic is used, the OR gate function interchanges with that of the AND gate. Sometimes schematics are drawn to reflect which voltage state represents the true condition. In the so-called intentional logic scheme, a negative logic signal always starts and ends at an inverting ‘bubble’. If an AND function is required between two negative logic signals, it will be drawn as an AND symbol with bubbles on all the terminals, even though the component used will be a positive logic OR gate. Opinions vary on the merits of intentional logic.
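The interchange of the AND and OR functions under negative logic can be confirmed by exhaustive truth table. In this sketch voltages are modelled as 1 for high and 0 for low; the function names are assumptions.

```python
from itertools import product

def or_gate(a: int, b: int) -> int:
    """Physical gate: output is high if either input is high."""
    return a | b

def and_gate(a: int, b: int) -> int:
    """Physical gate: output is high only if both inputs are high."""
    return a & b

def as_negative_logic(gate):
    """Return the logical function seen when a low voltage means true."""
    def logical(a_true: bool, b_true: bool) -> bool:
        a_volts, b_volts = int(not a_true), int(not b_true)   # true -> low
        return gate(a_volts, b_volts) == 0                    # low -> true
    return logical

for a, b in product((False, True), repeat=2):
    print(a, b, as_negative_logic(or_gate)(a, b), as_negative_logic(and_gate)(a, b))
```

The third column reproduces the AND truth table and the fourth the OR truth table, so a positive logic OR gate is the component needed for an AND function between two negative logic signals.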

If numerical quantities need to be conveyed down the two-state signal paths described here, then the only appropriate numbering system is binary, which has only two symbols, 0 and 1. Just as positive or negative logic could be used for the truth of a logical binary signal, it can also be used for a numerical binary signal. Normally, a high voltage level will represent a binary 1 and a low voltage will represent a binary 0, described as a ‘high for a one’ system. Clearly a ‘low for a one’ system is just as feasible. Decimal numbers have several columns, each of which represents a different power of ten; in binary the column position specifies the power of two.

Figure 3.11    The basic logic gates compared.

Several binary digits or bits are needed to express the value of a binary audio sample. These bits can be conveyed at the same time by several signals to form a parallel system, which is most convenient inside equipment because it is fast, or one at a time down a single signal path, which is slower, but convenient for cables between pieces of equipment because the connectors require fewer pins. When a binary system is used to convey numbers in this way, it can be called a digital system.

### 3.5 Storage elements

The basic memory element in logic circuits is the latch, which is constructed from two gates as shown in Figure 3.12(a), and which can be set or reset. A more useful variant is the D-type latch shown at (b) which remembers the state of the input at the time a separate clock either changes state for an edge-triggered device, or after it goes false for a level-triggered device. D-type latches are commonly available with four or eight latches to the chip. A shift register can be made from a series of latches by connecting the Q output of one latch to the D input of the next and connecting all the clock inputs in parallel. Data are delayed by the number of stages in the register. Shift registers are also useful for converting between serial and parallel data transmissions.
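A shift register can be modelled as a list of latch outputs which all update on the same clock. The class below is a behavioural sketch with invented names, not a description of any particular device.

```python
class ShiftRegister:
    def __init__(self, stages: int):
        self.q = [0] * stages          # Q outputs, first stage at index 0

    def clock(self, d: int) -> int:
        """Clock all latches together.

        Returns the last-stage Q as it stood before the edge, which is what a
        following circuit clocked by the same edge would sample.
        """
        serial_out = self.q[-1]
        self.q = [d] + self.q[:-1]     # each latch takes the previous stage's Q
        return serial_out

# Delay: a three-stage register delays the data by three clock periods.
delay = ShiftRegister(3)
print([delay.clock(bit) for bit in [1, 0, 1, 1, 0, 0, 0]])   # [0, 0, 0, 1, 0, 1, 1]

# Serial to parallel: after four clocks the Q outputs hold the whole word.
sipo = ShiftRegister(4)
for bit in [1, 0, 1, 1]:
    sipo.clock(bit)
print(sipo.q)                          # [1, 1, 0, 1], the first bit in is at the far end
```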

Where large numbers of bits are to be stored, cross-coupled latches are less suitable because they are more complicated to fabricate inside integrated circuits than dynamic memory, and consume more current.

In large random access memories (RAMs), the data bits are stored as the presence or absence of charge in a tiny capacitor as shown in Figure 3.12(c). The capacitor is formed by a metal electrode, insulated by a layer of silicon dioxide from a semiconductor substrate, hence the term MOS (metal oxide semiconductor). The charge will suffer leakage, and the value would become indeterminate after a few milliseconds. Where the delay needed is less than this, decay is of no consequence, as data will be read out before they have had a chance to decay. Where longer delays are necessary, such memories must be refreshed periodically by reading the bit value and writing it back to the same place. Most modern MOS RAM chips have suitable circuitry built in. Large RAMs store thousands of bits, and it is clearly impractical to have a connection to each one. Instead, the desired bit has to be addressed before it can be read or written. The size of the chip package restricts the number of pins available, so that large memories use the same address pins more than once. The bits are arranged internally as rows and columns, and the row address and the column address are specified sequentially on the same pins.

The binary circuitry necessary for adding two’s complement numbers is shown in Figure 3.13. Addition in binary requires two bits to be taken at a time from the same position in each word, starting at the least significant bit. Should both be ones, the output is zero, and there is a carry-out generated. Such a circuit is called a half adder, shown in Figure 3.13(a) and is suitable for the least-significant bit of the calculation. All higher stages will require a circuit which can accept a carry input as well as two data inputs. This is known as a full adder (b). Multibit full adders are available in chip form, and have carry-in and carry-out terminals to allow them to be cascaded to operate on long wordlengths. Such a device is also convenient for inverting a two’s complement number, in conjunction with a set of invertors. The adder chip has one set of inputs grounded, and the carry-in permanently held true, such that it adds one to the one’s complement number from the invertor.
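The adder stages can be expressed directly as Boolean operations. The sketch below builds a full adder from two half adders, cascades four of them, and then uses the carry-in to invert a two's complement number as described. For simplicity every stage here is a full adder; with the carry-in false the first stage behaves as a half adder. The function names and the four-bit wordlength are assumptions.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b

def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits plus a carry input."""
    partial, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial, carry_in)
    return total, carry1 | carry2

def ripple_add(x: int, y: int, bits: int = 4, carry_in: int = 0) -> tuple[int, int]:
    """Cascade full adders from the LSB upwards; return (result, carry_out)."""
    result, carry = 0, carry_in
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry

def negate(x: int, bits: int = 4) -> int:
    """Invertors feed one input, the other is grounded, carry-in is held true."""
    ones_complement = x ^ ((1 << bits) - 1)
    return ripple_add(ones_complement, 0, bits, carry_in=1)[0]

print(ripple_add(0b0011, 0b0101))      # (8, 0): 3 + 5 as pure binary
print(ripple_add(0b1101, 0b0101))      # (2, 1): -3 + 5 = +2, carry out ignored
print(f"{negate(0b0011):04b}")         # 1101
```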

Figure 3.12    Digital semiconductor memory types. In (a), one data bit can be stored in a simple set–reset latch, which has little application because the D-type latch in (b) can store the state of the single data input when the clock occurs. These devices can be implemented with bipolar transistors or FETs, and are called static memories because they can store indefinitely. They consume a lot of power.

In (c), a bit is stored as the charge in a potential well in the substrate of a chip. It is accessed by connecting the bit line with the field effect from the word line. The single well where the two lines cross can then be written or read. These devices are called dynamic RAMs because the charge decays, and they must be read and rewritten (refreshed) periodically.

Figure 3.13    (a) Half adder; (b) full-adder circuit and truth table; (c) comparison of sign bits prevents wraparound on adder overflow by substituting clipping level.

When mixing by adding sample values, care has to be taken to ensure that if the sum of the two sample values exceeds the number range the result will be clipping rather than wraparound. In two’s complement, the action necessary depends on the polarities of the two signals. Clearly if one positive and one negative number are added, the result cannot exceed the number range. If two positive numbers are added, the symptom of positive overflow is that the most significant bit sets, causing an erroneous negative result, whereas a negative overflow results in the most significant bit clearing. The overflow control circuit will be designed to detect these two conditions, and override the adder output. If the MSB of both inputs is zero, the numbers are both positive, thus if the sum has the MSB set, the output is replaced with the maximum positive code (0111…). If the MSB of both inputs is set, the numbers are both negative, and if the sum has no MSB set, the output is replaced with the maximum negative code (1000…). These conditions can also be connected to warning indicators. Figure 3.13(c) shows this system in hardware. The resultant clipping on overload is sudden, and sometimes a PROM is included which translates values around and beyond maximum to soft-clipped values below or equal to maximum.
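The overflow rules translate directly into a comparison of the three sign bits. This is a sketch of the hard-clipping case only, for an assumed sixteen-bit wordlength; the soft-clipping PROM is not modelled.

```python
BITS = 16
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)
MAX_POSITIVE = SIGN - 1     # 0111... (7FFF hex)
MAX_NEGATIVE = SIGN         # 1000... (8000 hex)

def clipping_add(code_a: int, code_b: int) -> int:
    """Add two two's complement codes, substituting a clipping level on overflow."""
    total = (code_a + code_b) & MASK
    a_negative, b_negative = bool(code_a & SIGN), bool(code_b & SIGN)
    sum_negative = bool(total & SIGN)
    if not a_negative and not b_negative and sum_negative:
        return MAX_POSITIVE             # positive overflow
    if a_negative and b_negative and not sum_negative:
        return MAX_NEGATIVE             # negative overflow
    return total                        # mixed polarity can never overflow

print(hex(clipping_add(0x7000, 0x2000)))   # 0x7fff instead of wrapping to 0x9000
print(hex(clipping_add(0x9000, 0x9000)))   # 0x8000 instead of wrapping to 0x2000
print(hex(clipping_add(0x7000, 0x9000)))   # 0x0, no overflow possible
```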

A storage element can be combined with an adder to obtain a number of useful functional blocks which will crop up frequently in audio equipment. Figure 3.14(a) shows that a latch is connected in a feedback loop around an adder. The latch contents are added to the input each time it is clocked. The configuration is known as an accumulator in computation because it adds up or accumulates values fed into it. In filtering, it is known as a discrete-time integrator. If the input is held at some constant value, the output increases by that amount on each clock. The output is thus a sampled ramp.

Figure 3.14(b) shows that the addition of an invertor allows the difference between successive inputs to be obtained. This is digital differentiation. The output is proportional to the slope of the input.
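Both configurations reduce to a latch and one arithmetic operation per clock. The classes below are a behavioural sketch with assumed names; Python integers are unbounded, so the finite wordlength of real hardware is not modelled.

```python
class Accumulator:
    """Latch in a feedback loop around an adder (discrete-time integrator)."""
    def __init__(self):
        self.latch = 0

    def clock(self, x: int) -> int:
        self.latch = self.latch + x      # previous sum is added to the input
        return self.latch

class Differentiator:
    """Latch holds the previous input; its inverse is added to the current one."""
    def __init__(self):
        self.latch = 0

    def clock(self, x: int) -> int:
        difference = x - self.latch
        self.latch = x
        return difference

integrator = Accumulator()
ramp = [integrator.clock(3) for _ in range(4)]
print(ramp)                                   # [3, 6, 9, 12], a sampled ramp

differentiator = Differentiator()
print([differentiator.clock(v) for v in ramp])  # [3, 3, 3, 3], the constant slope
```

Feeding the ramp into the differentiator recovers the constant input, which illustrates that the two operations are inverses of one another.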

### 3.7 The computer

The computer is now a vital part of digital audio systems, being used both for control purposes and to store, access and process audio signals as data. In control, the computer finds applications in database management, automation, editing, and in electromechanical systems such as synchronizers. Some time ago, processing speeds advanced sufficiently to allow computers to manipulate digital audio in real time.

Figure 3.14    Two configurations which are common in processing. In (a) the feedback around the adder adds the previous sum to each input to perform accumulation or digital integration. In (b) an invertor allows the difference between successive inputs to be computed. This is differentiation.

The computer is a programmable device in that its operation is not determined by its construction alone, but instead by a series of instructions forming a program. The program is supplied to the computer one instruction at a time so that the desired sequence of events takes place.

Programming of this kind has been used for over a century in electromechanical devices, including automated knitting machines and street organs which are programmed by punched cards. However, today’s computers differ from these devices in that the program is not fixed, but can be modified by the computer itself. This possibility led to the creation of the term software to suggest a contrast to the constancy of hardware.

Computer instructions are binary numbers each of which is interpreted in a specific way. As these instructions don’t differ from any other kind of data, they can be stored in RAM. The computer can change its own instructions by accessing the RAM. Most types of RAM are volatile, in that they lose data when power is removed. Clearly if a program is entirely stored in this way, the computer will not be able to recover from a power failure. The solution is that a very simple starting or bootstrap program is stored in non-volatile ROM which will contain instructions which will bring in the main program from a storage system such as a disk drive after power is applied. As programs in ROM cannot be altered, they are sometimes referred to as firmware to indicate that they are classified between hardware and software.

Making a computer do useful work requires more than simply a program which performs the required computation. There is also a lot of mundane activity which does not differ significantly from one program to the next. This includes deciding which part of the RAM will be occupied by the program and which by the data, producing commands to the storage disk drive to read the input data from a file and write back the results. It would be very inefficient if all programs had to handle these processes themselves. Consequently the concept of an operating system was developed. This manages all the mundane decisions and creates an environment in which useful programs or applications can execute.

The ability of the computer to change its own instructions makes it very powerful, but it also makes it vulnerable to abuse. Programs exist which are deliberately written to do damage. These viruses are generally attached to plausible messages or data files and enter computers through storage media or communications paths.

There is also the possibility that programs contain logical errors such that in certain combinations of circumstances the wrong result is obtained. If this results in the unwitting modification of an instruction, the next time that instruction is accessed the computer will crash. In consumer-grade software, written for the vast personal computer market, this kind of thing is unfortunately accepted.

For critical applications, software must be verified. This is a process which can prove that a program can recover from absolutely every combination of circumstances and keep running properly. This is a nontrivial process, because the number of combinations of states a computer can get into is staggering. As a result most software is unverified.

Figure 3.15    A simple computer system. All components are linked by a single data/address/control bus. Although cheap and flexible, such a bus can only make one connection at a time, so it is slow.

It is of the utmost importance that networked computers which can suffer virus infection or computers running unverified software are never used in a life-support or critical application.

Figure 3.15 shows a simple computer system. The various parts are linked by a bus which allows binary numbers to be transferred from one place to another. This will generally use tri-state logic (see Section 3.4) so that when one device is sending to another, all other devices present a high impedance to the bus.

The ROM stores the startup program, the RAM stores the operating system, applications programs and the data to be processed. The disk drive stores large quantities of data in a non-volatile form. The RAM only needs to be able to hold part of one program as other parts can be brought from the disk as required. A program executes by fetching one instruction at a time from the RAM to the processor along the bus.

The bus also allows keyboard/mouse inputs and outputs to the display and printer. Inputs and outputs are generally abbreviated to I/O. Finally a programmable timer will be present which acts as a kind of alarm clock for the processor.

### 3.8 The processor

The processor or CPU (central processing unit) is the heart of the system. Figure 3.16 shows the data path of a simple CPU. The CPU has a bus interface which allows it to generate bus addresses and input or output data. Sequential instructions are stored in RAM at contiguously increasing locations so that a program can be executed by fetching instructions from a RAM address specified by the program counter (PC) to the instruction register in the CPU. As each instruction is completed, the PC is incremented so that it points to the next instruction. In this way the time taken to execute the instruction can vary.

Figure 3.16    The data path of a simple CPU. Under control of an instruction, the ALU will perform some function on a pair of input values from the registers and store or output the result.

The processor is notionally divided into data paths and control paths. The CPU contains a number of general-purpose registers or scratchpads which can be used to store partial results in complex calculations. Pairs of these registers can be addressed so that their contents go to the ALU (arithmetic logic unit). This performs various arithmetic (add, subtract, etc.) or logical (AND, OR, etc.) functions on the input data. The output of the ALU may be routed back to a register or output. By reversing this process it is possible to get data into the registers from the RAM. The ALU also outputs the conditions resulting from the calculation, which can control conditional instructions.

Which function the ALU performs and which registers are involved are determined by the instruction currently in the instruction register which is decoded in the control path. One pass through the ALU can be completed in one cycle of the processor’s clock. Instructions vary in complexity as do the number of clock cycles needed to complete them. Incoming instructions are decoded and used to access a look-up table which converts them into microinstructions, one of which controls the CPU at each clock cycle.

### 3.9 Interrupts

Ordinarily instructions are executed in the order that they are stored in RAM. However, some instructions direct the processor to jump to a new memory location. If this is a jump to an earlier instruction, the program will enter a loop. The loop must increment a count in a register each time, and contain a conditional instruction called a branch, which allows the processor to jump out of the loop when a predetermined count is reached.
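A counted loop of this kind can be sketched with a toy interpreter. The opcode names, the single count register and the four-instruction program are all invented for the example and do not correspond to any real instruction set.

```python
def run(program: list[tuple]) -> tuple[int, int]:
    """Execute (opcode, operand) pairs; return (final count, instructions executed)."""
    pc = 0          # program counter
    count = 0       # register holding the loop count
    executed = 0
    while True:
        opcode, operand = program[pc]    # fetch from the address in the PC
        pc += 1                          # PC now points to the next instruction
        executed += 1
        if opcode == "INC":
            count += 1
        elif opcode == "BRANCH_IF_COUNT":
            limit, target = operand
            if count == limit:           # conditional jump out of the loop
                pc = target
        elif opcode == "JUMP":
            pc = operand                 # unconditional jump to an earlier address
        elif opcode == "HALT":
            return count, executed
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

program = [
    ("INC", None),                  # address 0: loop body, increment the count
    ("BRANCH_IF_COUNT", (4, 3)),    # address 1: leave the loop when count is 4
    ("JUMP", 0),                    # address 2: otherwise go round again
    ("HALT", None),                 # address 3: first instruction after the loop
]
print(run(program))                 # (4, 12)
```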

However, it is often required that the sequence of execution should be changeable by some external event. This might be the changing of some value due to a keyboard input. Events of this kind are handled by interrupts, which are created by devices needing attention. Figure 3.17 shows that in addition to the PC, the CPU contains another dedicated register called the stack pointer. Figure 3.18 shows how this is used. At the end of every instruction the CPU checks to see if an interrupt is asserted on the bus.

If it is, a different set of microinstructions is executed. The PC is incremented as usual, but the next instruction is not executed. Instead, the contents of the PC are stored so that the CPU can resume execution when it has handled the current event. The PC state is stored in a reserved area of RAM known as the stack, at an address determined by the stack pointer.

Figure 3.17    Normally the program counter (PC) increments each time an instruction is completed in order to select the next instruction. However, an interrupt may cause the PC state to be stored in the stack area of RAM prior to the PC being forced to the start address of the interrupt subroutine. Afterwards the PC can get its original value back by reading the stack.

Figure 3.18    How an interrupt is handled. See text for details.

Once the PC is stacked, the processor can handle the interrupt. It issues a bus interrupt acknowledge, and the interrupting device replies with a unique code identifying itself. This is known as a vector which steers the processor to a RAM address containing a new program counter. This is the RAM address of the first instruction of the subroutine which is the program that will handle the interrupt. The CPU loads this address into the PC and begins execution of the subroutine.

At the end of the subroutine there will be a return instruction. This causes the CPU to use the stack pointer as a memory address in order to read the return PC state from the stack. With this value loaded into the PC, the CPU resumes execution where it left off.

The stack exists so that subroutines can themselves be interrupted. If a subroutine is executing when a higher-priority interrupt occurs, the subroutine can be suspended by incrementing the stack pointer and storing the current PC in the next location in the stack.

When the second interrupt has been serviced, the stack pointer allows the PC of the first subroutine to be retrieved. Whenever a stack PC is retrieved, the stack pointer decrements so that it always points to the PC of the next item of unfinished business.
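The stacking sequence can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name `ToyCPU`, the addresses and the handler locations are hypothetical, and the stack pointer convention (increment on storing, decrement on retrieving) simply follows the description above.

```python
class ToyCPU:
    """Toy model of PC stacking; all names and addresses are illustrative."""

    def __init__(self, stack_base=0x100):
        self.pc = 0
        self.sp = stack_base   # stack pointer (nothing is stacked yet)
        self.ram = {}          # sparse RAM: address -> stored value

    def interrupt(self, handler_address):
        """Stack the return PC, then jump to the interrupt subroutine."""
        self.pc += 1                   # the PC is incremented as usual
        self.sp += 1                   # move to the next stack location
        self.ram[self.sp] = self.pc    # store the return address
        self.pc = handler_address      # the vector supplies the new PC

    def ret(self):
        """Return instruction: reload the PC from the stack."""
        self.pc = self.ram[self.sp]
        self.sp -= 1                   # point at the next unfinished item


cpu = ToyCPU()
cpu.pc = 10            # the main program is at instruction 10
cpu.interrupt(200)     # first interrupt: return address 11 is stacked
cpu.pc = 205           # the first subroutine has made some progress
cpu.interrupt(300)     # a higher-priority interrupt nests on top
cpu.ret()              # resume the first subroutine
first_resume = cpu.pc  # 206
cpu.ret()              # resume the main program
main_resume = cpu.pc   # 11
```

The two return addresses come back in the reverse of the order in which they were stored, which is what allows interrupts to nest.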

### 3.10 Programmable timers

Ordinarily processors have no concept of time and simply execute instructions as fast as their clock allows. This is fine for general-purpose processing, but not for time-critical processes such as audio. One way in which the processor can be made time conscious is to use programmable timers. These are devices which reside on the computer bus and which run from a clock. The CPU can set up a timer by loading it with a count. When the count is reached, the timer will interrupt. To give an example, if the count were to be equal to one audio sample period, there would be one interrupt per sample, and this would result in the execution of a subroutine once per sample, provided, of course, that all the instructions could be executed in time.
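As a rough illustration of the arithmetic, the count to load is the ratio of the timer clock to the required interrupt rate. The function name and the 12.288 MHz timer clock below are assumptions chosen for the example, not values taken from any particular device.

```python
def timer_count(timer_clock_hz, interrupt_rate_hz):
    """Return the count giving one interrupt per period of interrupt_rate_hz."""
    count, remainder = divmod(timer_clock_hz, interrupt_rate_hz)
    if remainder:
        raise ValueError("timer clock is not a whole multiple of the rate")
    return count


# Assumed figures: a 12.288 MHz timer clock and a 48 kHz sampling rate.
count = timer_count(12_288_000, 48_000)   # 256 timer clocks per sample
```

In the same way, dividing the processor clock by the sampling rate gives the number of clock cycles available to the subroutine in each sample period.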

### 3.11 Timebase compression and correction

In Chapter 1 it was stated that a strength of digital technology is the ease with which delay can be provided. Accurate control of delay is the essence of timebase correction, necessary whenever the instantaneous time of arrival or rate from a data source does not match the destination. In digital audio, the destination will almost always have perfectly regular timing, namely the sampling rate clock of the final DAC. Timebase correction consists of aligning jittery signals from storage media or transmission channels with that stable reference. In this way, wow and flutter are rendered unmeasurable.

A further function of timebase correction is to reverse the time compression applied prior to recording or transmission. As was shown in section 1.8, digital audio recorders compress data into blocks to facilitate editing and error correction as well as to permit head switching between blocks in rotary-head machines. Owing to the spaces between blocks, data arrive in bursts on replay, but must be fed to the output convertors in an unbroken stream at the sampling rate. The extreme time compression used in DAT to reduce the tape wrap is a further example of the use of the principle (see Chapter 9).

In computer hard-disk drives, which are used in digital audio editing systems, time compression is also used, but a converse problem also arises. Data from the disk blocks arrive at a reasonably constant rate, but cannot necessarily be accepted at a steady rate by the logic because of contention for the use of buses and memory by the different parts of the system. In this case the data must be buffered by a relative of the timebase corrector which is usually referred to as a silo.

Although delay is easily implemented, it is not possible to advance a data stream. Most real machines cause instabilities balanced about the correct timing: the output jitters between too early and too late. Since the information cannot be advanced in the corrector, only delayed, the solution is to run the machine in advance of real time. In this case, correctly timed output signals will need a nominal delay to align them with reference timing. Early output signals will receive more delay, and late output signals will receive less.
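The relationship between timing error and corrective delay can be sketched as follows. The function name and the use of sample periods as the unit are assumptions made for illustration; a negative error means the data arrived early.

```python
def corrective_delay(timing_error, nominal_delay):
    """Delay needed to align data arriving timing_error away from nominal.

    Early data (negative error) receive more delay and late data
    (positive error) receive less. Only delay is possible, never advance.
    """
    delay = nominal_delay - timing_error
    if delay < 0:
        raise ValueError("data are later than the nominal delay can absorb")
    return delay


nominal = 5                      # machine runs five periods ahead of reference
errors = [-3, 0, 2]              # early, on time, late
delays = [corrective_delay(e, nominal) for e in errors]   # [8, 5, 3]
```

In every case the error plus the applied delay equals the nominal delay, so all the data emerge aligned with the reference.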

Figure 3.19    In a RAM-based TBC, the RAM is reference synchronous, and an arbitrator decides when it will read and when it will write. During reading, asynchronous input data back up in the input silo, asserting a write request to the arbitrator. The arbitrator will then cause a write cycle between read cycles.

Figure 3.20 shows the operation of a FIFO chip, colloquially known as a silo because the data are tipped in at the top on delivery and drawn off at the bottom when needed. Each stage of the chip has a data register and a small amount of logic, including a data-valid or V bit. If the input register does not contain data, the first V bit will be reset, and this will cause the chip to assert ‘input ready’. If data are presented at the input, and clocked into the first stage, the V bit will set, and the ‘input ready’ signal will become false. However, the logic associated with the next stage sees the V bit set in the top stage, and if its own V bit is clear, it will clock the data into its own register, set its own V bit, and clear the input V bit, causing ‘input ready’ to reassert, when another word can be fed in. This process then continues as the word moves down the silo, until it arrives at the last register in the chip. The V bit of the last stage becomes the ‘output ready’ signal, telling subsequent circuitry that there are data to be read. If this word is not read, the next word entered will ripple down to the stage above. Words thus stack up at the bottom of the silo. When a word is read out, an external signal must be provided which resets the bottom V bit. The ‘output ready’ signal now goes false, and the logic associated with the last stage now sees valid data above, and loads down the word when it will become ready again. The last register but one will now have no V bit set, and will see data above itself and bring that down. In this way a reset V bit propagates up the chip while the data ripple down, rather like a hole in a semiconductor going the opposite way to the electrons.

Figure 3.20    Structure of FIFO or silo chip. Ripple logic controls propagation of data down silo.
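The ripple behaviour of the V bits can be modelled in software. The sketch below is a behavioural model only, assuming a hypothetical `RippleFifo` class in which index 0 is the input stage and the last index is the output stage; a real chip ripples asynchronously in hardware rather than in a loop.

```python
class RippleFifo:
    """Behavioural model of a ripple-through silo (names are illustrative)."""

    def __init__(self, depth):
        self.data = [None] * depth    # one data register per stage
        self.valid = [False] * depth  # one V bit per stage

    @property
    def input_ready(self):
        return not self.valid[0]      # top stage empty

    @property
    def output_ready(self):
        return self.valid[-1]         # bottom stage holds a word

    def write(self, word):
        if not self.input_ready:
            raise RuntimeError("silo full")
        self.data[0], self.valid[0] = word, True
        self._ripple()

    def read(self):
        if not self.output_ready:
            raise RuntimeError("silo empty")
        word = self.data[-1]
        self.valid[-1] = False        # external signal resets the bottom V bit
        self._ripple()
        return word

    def _ripple(self):
        """Each empty stage pulls down valid data from the stage above."""
        moved = True
        while moved:
            moved = False
            for i in range(len(self.data) - 1, 0, -1):
                if not self.valid[i] and self.valid[i - 1]:
                    self.data[i] = self.data[i - 1]
                    self.valid[i], self.valid[i - 1] = True, False
                    moved = True


fifo = RippleFifo(depth=4)
for word in ("a", "b", "c"):
    fifo.write(word)                  # words stack up at the bottom
drained = [fifo.read() for _ in range(3)]   # first in, first out
```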

### 3.12 Gain control

When making a digital recording, the gain of the analog input will usually be adjusted so that the quantizing range is fully exercised in order to make a recording of maximum signal-to-noise ratio. During post-production, the recording may be played back and mixed with other signals, and the desired effect can only be achieved if the level of each can be controlled independently. Gain is controlled in the digital domain by multiplying each sample value by a coefficient. If that coefficient is less than one, attenuation will result; if it is greater than one, amplification can be obtained.

Figure 3.21    Structure of fast multiplier. The input A is multiplied by 1, 2, 4, 8, etc. by bit shifting. The digits of the B input then determine which multiples of A should be added together by enabling AND gates between the shifters and the adder. For long wordlengths, the number of gates required becomes enormous, and the device is best implemented in a chip.

Multiplication in binary circuits is difficult. It can be performed by repeated adding, but this is too slow to be of any use. In fast multiplication, one of the inputs will be simultaneously multiplied by one, two, four, etc., by hard-wired bit shifting. Figure 3.21 shows that the other input bits will determine which of these powers will be added to produce the final sum, and which will be neglected. If multiplying by five, the process is the same as multiplying by four, multiplying by one, and adding the two products. This is achieved by adding the input to itself shifted two places. As the wordlength of such a device increases, the complexity increases rapidly (roughly as the square of the wordlength for a simple array of adders), so this is a natural application for an integrated circuit. It is probably true that digital video would not have been viable without such chips.
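The shift-and-add principle can be sketched in software. In the hardware all the shifted versions exist simultaneously and are summed in one adder tree; the loop below visits them one at a time purely for illustration. The function name and the unsigned, eight-bit default are assumptions.

```python
def shift_add_multiply(a, b, wordlength=8):
    """Multiply a by unsigned b using only shifts and additions."""
    if not 0 <= b < (1 << wordlength):
        raise ValueError("b does not fit in the stated wordlength")
    total = 0
    for bit in range(wordlength):
        if (b >> bit) & 1:        # this bit of B enables its AND gates
            total += a << bit     # A shifted bit places is A times 2**bit
    return total


# Multiplying by five (binary 101): the input plus itself shifted two places.
product = shift_add_multiply(7, 5)   # 7 + 28 = 35
```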

### 3.13 Digital faders and controls

In a digital mixer, the gain coefficients will originate in hand-operated faders, just as in analog. Analog mixers having automated mixdown employ a system similar to the one shown in Figure 3.22. Here, the faders produce a varying voltage and this is converted to a digital code or gain coefficient in an ADC and recorded alongside the audio tracks. On replay the coefficients are converted back to analog voltages which control VCAs (voltage-controlled amplifiers) in series with the analog audio channels. A digital mixer has a similar structure, and the coefficients can be obtained in the same way. However, on replay, the coefficients are not converted back to analog, but remain in the digital domain and control multipliers in the digital audio channels directly. As the coefficients are digital, it is so easy to add automation to a digital mixer that there is not much point in building one without it.

Figure 3.22    The automated mixdown system of an audio console digitizes fader positions for storage and uses the coefficients later to drive VCAs via convertors.

Whilst gain coefficients can be obtained by digitizing the output of an analog fader, it is also possible to obtain coefficients directly in digital faders. Digital faders are a form of displacement transducer in which the mechanical position of the knob is converted directly to a digital code. The position of other controls, such as for equalizers or scrub wheels, will also need to be digitized. Controls can be linear or rotary, and absolute or relative. In an absolute control, the position of the knob determines the output directly. These are inconvenient in automated systems because unless the knob is motorized, the operator does not know the setting the automation system has selected. In a relative control, the knob can be moved to increase or decrease the output, but its absolute position is meaningless. The absolute setting is displayed on a bar LED nearby. In a rotary control, the bar LED may take the form of a ring of LEDs around the control. The automation system setting can be seen on the display and no motor is needed. In a relative linear fader, the control may take the form of an endless ridged belt like a caterpillar track. If this is transparent, the bar LED may be seen through it.

Figure 3.23    An absolute linear fader uses a number of light beams which are interrupted in various combinations according to the position of a grating. A Gray code shown in Figure 3.24 must be used to prevent false codes.

Figure 3.23 shows an absolute linear fader. A grating is moved with respect to several light beams, one for each bit of the coefficient required. The interruption of the beams by the grating determines which photocells are illuminated. It is not possible to use a pure binary pattern on the grating because this results in transient false codes due to mechanical tolerances. Figure 3.24 shows some examples of these false codes. For example, on moving the fader from 3 to 4, the MSB goes true slightly before the middle bit goes false. This results in a momentary value of 4 + 2 = 6 between 3 and 4. The solution is to use a code in which only one bit ever changes in going from one value to the next. One such code is the Gray code, which was devised to overcome timing hazards in relay logic but is now used extensively in position encoders.

Gray code can be converted to binary in a suitable PROM or gate array. These are available as industry-standard components.
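The conversion that such a PROM or gate array performs can be expressed with exclusive-OR operations. The sketch below uses the common reflected binary Gray code; the function names are illustrative, and the actual pattern on a given fader grating may differ.

```python
def binary_to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Inverse conversion: XOR together all right shifts of the Gray code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


# Moving from 3 to 4 changes three bits in pure binary but one in Gray code.
binary_step = bits_changed(3, 4)                                  # 3
gray_step = bits_changed(binary_to_gray(3), binary_to_gray(4))    # 1
```

Because only one bit changes per step, a small misalignment between the light beams can delay a transition but cannot produce a value that lies outside the two adjacent positions.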

Figure 3.25 shows a rotary incremental encoder. This produces a sequence of pulses whose number is proportional to the angle through which it has been turned. The rotor carries a radial grating over its entire perimeter. This turns over a second fixed radial grating whose bars are not parallel to those of the first grating. The resultant moiré fringes travel inward or outward depending on the direction of rotation. Two suitably positioned light beams falling on photocells will produce outputs in quadrature. The relative phase determines the direction and the frequency is proportional to speed. The encoder outputs can be connected to a counter whose contents will increase or decrease according to the direction the rotor is turned. The counter provides the coefficient output and drives the display.
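The counting logic can be sketched as below, assuming the two photocell outputs have been squared up into logic levels and sampled as two-bit states. Which rotation sense counts as positive is an arbitrary assumption here, as are the names.

```python
# State order for one direction of rotation: 00 -> 01 -> 11 -> 10 -> 00.
_FORWARD = {(0b00, 0b01), (0b01, 0b11), (0b11, 0b10), (0b10, 0b00)}


def count_quadrature(states, start=0):
    """Accumulate a position count from successive two-bit quadrature states."""
    count = start
    previous = states[0]
    for current in states[1:]:
        if (previous, current) in _FORWARD:
            count += 1                  # one step in the assumed forward sense
        elif (current, previous) in _FORWARD:
            count -= 1                  # one step in the reverse sense
        # Repeated states and invalid two-bit jumps leave the count unchanged.
        previous = current
    return count


one_cycle = [0b00, 0b01, 0b11, 0b10, 0b00]
forward = count_quadrature(one_cycle)            # 4
reverse = count_quadrature(one_cycle[::-1])      # -4
```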

Figure 3.24    (a) Binary cannot be used for position encoders because mechanical tolerances cause false codes to be produced. (b) In Gray code, only one bit (arrowed) changes in between positions, so no false codes can be generated.

For audio use, a logarithmic characteristic is required in gain control. Linear coefficients can conveniently be converted to logarithmic in a PROM. The wordlength of the gain coefficients requires some thought as they determine the number of discrete gains available. If the coefficient wordlength is inadequate, the gain control becomes ‘steppy’ particularly towards the end of a fadeout. This phenomenon is quite noticeable on some low-cost home studio equipment. A compromise between performance and the expense of high-resolution faders is to insert a digital interpolator having a low-pass characteristic between the fader and the gain control stage. This will compute intermediate gains to higher resolution than the coarse fader scale so that the steps cannot be heard. Digital filters used for equalization can also be sensitive to sudden step changes to their control coefficients.1 Again the solution is to filter the coefficients.
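One simple way to obtain such a low-pass characteristic is a first-order recursive smoother applied to the coefficient stream. This is a minimal sketch rather than the interpolator of any particular product; the function name and the smoothing constant `alpha` are assumptions.

```python
def smooth_coefficients(fader_steps, alpha=0.25):
    """Low-pass filter a stepped sequence of gain coefficients.

    Each output moves a fraction alpha of the way from the previous
    output towards the current fader value.
    """
    smoothed = []
    state = fader_steps[0]
    for target in fader_steps:
        state += alpha * (target - state)
        smoothed.append(state)
    return smoothed


# A coarse fader step from 1.0 to 0.5 becomes a gradual glide.
glide = smooth_coefficients([1.0, 1.0, 0.5, 0.5, 0.5, 0.5])
# glide begins 1.0, 1.0, 0.875, 0.78125, ...
```

A smaller `alpha` gives a smoother glide at the cost of a slower response to deliberate fader movements.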

Figure 3.25    The fixed and rotating gratings produce moiré fringes which are detected by two light paths as quadrature sinusoids. The relative phase determines the direction, and the frequency is proportional to speed of rotation.

### 3.14 A digital mixer

The signal path of a simple digital mixer is shown in Figure 3.26. The two inputs are multiplied by their respective coefficients, and added together in two’s complement to achieve the mix as was shown in Figure 3.7. Peak limiting will be required as in Section 3.6. The sampling rate of the inputs must be exactly the same, and in the same phase, or the circuit will not be able to add on a sample-by-sample basis. If the inputs have come from different sources, they must be synchronized by the same master clock, and/or timebase correction must be provided on the inputs. Synchronization of audio sources follows the principle long established in video in which a reference signal is fed to all devices which then slave or genlock to it. This process will be covered in detail in Chapter 8.

Figure 3.26    One multiplier/accumulator can be time shared between several signals by operating at a multiple of sampling rate. In this example, four multiplications are performed during one sample period.

Some thought must be given to the wordlength of the system. If a sample is attenuated, it will develop bits which are below the radix point. For example, if an eight-bit sample is attenuated by 24 dB, the sample value will be shifted four places down. Extra bits must be available within the mixer to accommodate this shift. Digital mixers can have an internal wordlength of up to 32 bits. When several attenuated sources are added together to produce the final mix, the result will be a stream of 32-bit or longer samples. As the output will generally need to be of the same format as the input, the wordlength must be shortened. This must be done very carefully, as it is a form of quantizing and will require dithering. The necessary techniques will be treated in Chapter 4.
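The eight-bit example can be checked numerically. The sketch below assumes unsigned samples for simplicity and a hypothetical helper `attenuate`; the result is an integer expressed in units of 2 to the power of minus `fraction_bits`.

```python
import math


def attenuate(sample, shift, fraction_bits):
    """Divide by 2**shift, keeping fraction_bits bits below the radix point."""
    return (sample << fraction_bits) >> shift


sample = 0b10110111                        # 183 as an eight-bit value
attenuation_db = 20 * math.log10(2 ** 4)   # four places down is about 24.08 dB

kept = attenuate(sample, shift=4, fraction_bits=4)       # 183, meaning 183/16
truncated = attenuate(sample, shift=4, fraction_bits=0)  # 11, fraction lost
```

With four extra bits the attenuated value 183/16 = 11.4375 is held exactly; without them the four bits shifted below the radix point are simply discarded.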

In practice a digital mixer would not have one multiplier for every input. Multiplier chips are expensive, but can work much faster than the relatively low frequencies used in audio sampling. Figure 3.26 also shows that a more economical system results when a time-shared bus system is used with only one multiplier followed by an accumulator. In one sample period, each of the input samples is fed in turn to the lower input of the multiplier at the same time as the corresponding coefficient is fed to the upper input. The products from the multiplier are accumulated during the sample period, so that at the end of the sample period, the accumulator holds the sum of all the products, which is the digitally mixed sample. The process then repeats for the next sample period. To facilitate the sharing of common circuits by many signals, tri-state logic devices can be used. The outputs of such devices can be wired in parallel, and the state of the parallel connection will be the state of the device whose output is enabled. Clearly only one output can be enabled at a time, and this will be ensured by a sequencer circuit connected to all the device enables. In digital signal processor (DSP) chips, the processes shown above can be simulated in software.
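The time-shared multiply/accumulate process can be simulated in a few lines, much as a DSP would do it in software. The function names and the sample values are illustrative assumptions.

```python
def mix_sample_period(samples, coefficients):
    """One sample period: multiply each input by its gain and accumulate."""
    accumulator = 0
    for sample, coefficient in zip(samples, coefficients):
        accumulator += sample * coefficient   # one pass through the multiplier
    return accumulator                        # the digitally mixed sample


def mix(channels, coefficients):
    """Mix several equal-length channels, one sample period at a time."""
    return [mix_sample_period(period, coefficients)
            for period in zip(*channels)]


channels = [[100, 200], [40, -40]]      # two inputs, two sample periods
mixed = mix(channels, [0.5, 0.25])      # [60.0, 90.0]
```

The accumulator is cleared at the start of each sample period, which is why it is a local variable of `mix_sample_period`.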

In analog audio mixers, the controls have to be positioned close to the circuitry for performance reasons; thus one control knob is needed for every variable, and the control panel is physically large. Remote control is difficult with such construction. The order in which the signal passes through the various stages of the mixer is determined at the time of design, and any changes are difficult.

In a digital mixer,2,3 all the filters are controlled by simply changing the coefficients, and remote control is easy. Since control is by digital parameters, it is possible to use assignable controls, such that there need only be one set of filter and equalizer controls, whose setting is conveyed to any channel chosen by the operator.4 The use of digital processing allows the console to include a video display of the settings. This was seldom attempted in analog desks because the magnetic field from the scan coils tended to break through into the audio circuitry.

Since the audio processing in a digital mixer is by program control, the configuration of the desk can be changed at will by running the programs for the various functions in a different order. The operator can configure the desk to his own requirements by entering symbols on a block diagram on the video display, for example. The configuration and the setting of all the controls can be stored in memory or, for the longer term, on disk, and recalled instantly. Such a desk can be in almost constant use, because it can easily be put back exactly to a known state after someone else has used it.

A further advantage of working in the digital domain is that delay can be controlled individually in the audio channels.5 This allows for the time of arrival of wavefronts at various microphones to be compensated despite their physical position.

Figure 3.27 shows a typical digital mixer installation.4 The analog microphone inputs are from remote units containing ADCs so that the length of analog cabling can be kept short. The input units communicate with the signal processor using digital fibre-optic links.

The sampling rate of a typical digital audio signal is low compared to the speed at which typical logic gates can operate. It is sensible to minimize the quantity of hardware necessary by making each perform many functions in one sampling period. Although general-purpose computers can be programmed to process digital audio, they are not really suitable for the following reasons:

Figure 3.27    Digital mixer installation. The convenience of digital transmission without degradation allows the control panel to be physically remote from the processor.

1. The number of arithmetic operations in audio processing, particularly multiplications, is far higher than in data processing.
2. Audio processing is done in real time; data processors do not generally work in real time.
3. The program needed for an audio function generally remains constant for the duration of a session, or changes slowly, whereas a data processor rapidly jumps between many programs.
4. Data processors can suspend a program on receipt of an interrupt; audio processors must work continuously for long periods.
5. Data processors tend to be I/O limited, in that their operating speed is limited by the problems of moving large quantities of data and instructions into the CPU. Audio processors in contrast have a relatively small input and output rate, but compute intensively.

The above is a sufficient case for the development of specialized digital audio signal processors.6–8 These units are implemented with more internal registers than data processors to facilitate multi-point filter algorithms. The arithmetic unit will be designed to offer high-speed multiply/accumulate using techniques such as pipelining, which allows operations to overlap.9 The functions of the register set and the arithmetic unit are controlled by a microsequencer.

External control of a DSP will generally be by a smaller processor, often in the operator’s console, which passes coefficients to the DSP as the operator moves the controls. In large systems, it is possible for several different consoles to control different sections of the DSP.10

### 3.15 Effects

In addition to equalization and mixing, modern audio production requires numerous effects, and these can be performed in the digital domain by simply mimicking the analog equivalent.

One of the oldest effects is the use of a tape loop to produce an echo, and this can be implemented with memory or, for longer delays, with a disk drive. Figure 3.28(a) shows the basic configuration necessary for echo. If the delay period is dynamically changed from zero to about 10 ms, the result is flanging, where a notch sweeps through the audio spectrum. This was originally done by having two identical analog tapes running, and modifying the capstan speed with hand pressure! A relative of echo is reverberation, which is used to simulate ambience on an acoustically dry recording. Figure 3.28(b) shows that reverberation actually consists of a series of distinct early reflections, followed by the reverberation proper, which is due to multiple reflections. The early reflections are simply provided by short delays, but the reverberation is more difficult. A recursive structure is a natural choice for a decaying response, but simple recursion sounds artificial. The problem is that, in a real room, standing waves and interference effects cause large changes in the frequency response at each reflection. The effect can be simulated in a digital reverberator by adding various comb-filter sections which have the required effect on the response.

Figure 3.28    (a) A simple configuration to obtain digital echo. The delay would normally be several tens of milliseconds. If the delay is made about 10 ms, the configuration acts as a comb filter, and if the delay is changed dynamically, a notch will sweep the audio spectrum resulting in flanging.

Figure 3.28    (b) In a reverberant room, the signal picked up by a microphone is a mixture of direct sound, early reflections and a highly confused reverberant tail. A digital reverberator will simulate this with various combinations of recursive delay and attenuation.
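The difference between a single echo and a recursive, decaying response can be sketched with two short functions. The exact configuration of Figure 3.28(a) is not reproduced here; the structures below are the usual feedforward and feedback delay arrangements, with delays expressed in sample periods and all names assumed.

```python
def echo(samples, delay, gain):
    """Feedforward structure: the input plus one delayed, attenuated copy."""
    out = []
    for n, x in enumerate(samples):
        delayed = samples[n - delay] if n >= delay else 0.0
        out.append(x + gain * delayed)
    return out


def recursive_echo(samples, delay, feedback):
    """Feedback structure: the delayed output is fed back to the input."""
    out = []
    for n, x in enumerate(samples):
        delayed = out[n - delay] if n >= delay else 0.0
        out.append(x + feedback * delayed)
    return out


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
single = echo(impulse, delay=2, gain=0.5)
# [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
decaying = recursive_echo(impulse, delay=2, feedback=0.5)
# [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125]
```

The recursive version produces the endlessly decaying train of repeats that makes it a natural starting point for reverberation, and its perfectly regular spacing is also why it sounds artificial on its own.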

### 3.16 The phase-locked loop

All digital audio systems need to be clocked at the appropriate rate in order to function properly. Whilst a clock may be obtained from a fixed frequency oscillator such as a crystal, many operations in audio require genlocking or synchronizing the clock to an external source. The phase-locked loop excels at this job, and many others, particularly in connection with recording and transmission.

In phase-locked loops, the oscillator can run at a range of frequencies according to the voltage applied to a control terminal. This is called a voltage-controlled oscillator or VCO. Figure 3.29 shows that the VCO is driven by a phase error measured between the output and some reference. The error changes the control voltage in such a way that the error is reduced, so that the output eventually has the same frequency as the reference. A low-pass filter is fitted in the control voltage path to prevent the loop becoming unstable. If a divider is placed between the VCO and the phase comparator, as in the figure, the VCO frequency can be made to be a multiple of the reference. This also has the effect of making the loop more heavily damped, so that it is less likely to change frequency if the input is irregular.

Figure 3.29    A phase-locked loop requires these components as a minimum. The filter in the control voltage serves to reduce clock jitter.

In digital audio, the frequency multiplication of a phase-locked loop is extremely useful. Figure 3.30 shows how the 48 kHz sampling clock is obtained from the sync pulses of a video reference by such a multiplication process.

Figure 3.30    Obtaining a 48 kHz sampling clock from the line frequency of 625/50 video using a phase-locked loop.
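The multiplication ratio required can be worked out from the two frequencies. The line rate of 625/50 video is 625 lines at 25 frames per second; the divider values actually drawn in Figure 3.30 are not reproduced here, so the sketch below only derives the overall ratio that any such loop must implement.

```python
from fractions import Fraction

line_rate_hz = 625 * 25                       # 15 625 Hz line frequency
sampling_rate_hz = 48_000

ratio = Fraction(sampling_rate_hz, line_rate_hz)   # reduces to 384/125

# One way to realise the ratio: divide the line rate by the denominator to
# obtain a common frequency, then let the loop multiply by the numerator.
common_hz = Fraction(line_rate_hz, ratio.denominator)   # 125 Hz
vco_hz = common_hz * ratio.numerator                    # 48 000 Hz
```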

Figure 3.31 shows the NLL or numerically locked loop. This is similar to a phase-locked loop, except that the two phases concerned are represented by the state of a binary number. The NLL is useful to generate a remote clock from a master. The state of a clock count in the master is periodically transmitted to the NLL which will recreate the same clock frequency. The technique is used in MPEG transport streams.

Figure 3.31    The numerically locked loop (NLL) is a digital version of the phase-locked loop.

### 3.17 Multiplexing principles

Multiplexing is used where several signals are to be transmitted down the same channel. The channel bit rate must be the same as or greater than the sum of the source bit rates. Figure 3.32 shows that when multiplexing is used, the data from each source have to be time compressed. This is done by buffering source data in a memory at the multiplexer. They are written into the memory in real time as they arrive, but will be read from the memory with a clock which has a much higher rate. This means that the readout occurs in a smaller timespan. If, for example, the clock frequency is raised by a factor of ten, the data for a given signal will be transmitted in a tenth of the normal time, leaving time in the multiplex for nine more such signals.

Figure 3.32    Multiplexing requires time compression on each input.

In the demultiplexer another buffer memory will be required. Only the data for the selected signal will be written into this memory at the bit rate of the multiplex. When the memory is read at the correct speed, the data will emerge with their original timebase.

In practice it is essential to have mechanisms to identify the separate signals to prevent them being mixed up and to convey the original signal clock frequency to the demultiplexer. In time-division multiplexing the timebase of the transmission is broken into equal slots, one for each signal. This makes it easy for the demultiplexer, but forces a rigid structure on all the signals such that they must all be locked to one another and have an unchanging bit rate. Packet multiplexing overcomes these limitations.

### 3.18 Packets

The multiplexer must switch between different time-compressed signals to create the bitstream and this is much easier to organize if each signal is in the form of data packets of constant size. Figure 3.33 shows a packet multiplexing system.

Figure 3.33    Packet multiplexing relies on headers to identify the packets.

Each packet consists of two components: the header, which identifies the packet, and the payload, which is the data to be transmitted. The header will contain at least an identification code (ID) which is unique for each signal in the multiplex. The demultiplexer checks the ID codes of all incoming packets and discards those which do not have the wanted ID.

In complex systems it is common to have a mechanism to check that packets are not lost or repeated. This is the purpose of the packet continuity count which is carried in the header. For packets carrying the same ID, the count should increase by one from one packet to the next. Upon reaching the maximum binary value, the count overflows and recommences.
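The selection and continuity checking performed by a demultiplexer can be sketched as follows. The `Packet` fields, the function name and the four-bit counter width are assumptions made for illustration rather than the layout of any particular standard.

```python
from collections import namedtuple

# Hypothetical packet layout: an ID, a continuity count and a payload.
Packet = namedtuple("Packet", "pid count payload")


def demultiplex(packets, wanted_pid, count_bits=4):
    """Select packets by ID and count continuity errors for that ID."""
    modulus = 1 << count_bits          # the count overflows back to zero here
    payloads, errors = [], 0
    expected = None                    # unknown until the first wanted packet
    for packet in packets:
        if packet.pid != wanted_pid:
            continue                   # discard packets for other signals
        if expected is not None and packet.count != expected:
            errors += 1                # a packet was lost or repeated
        expected = (packet.count + 1) % modulus
        payloads.append(packet.payload)
    return payloads, errors


stream = [
    Packet("A", 14, "a0"), Packet("B", 0, "b0"),
    Packet("A", 15, "a1"), Packet("A", 0, "a2"),   # count wraps from 15 to 0
    Packet("A", 2, "a4"),                          # the packet with count 1 is missing
]
payloads, errors = demultiplex(stream, "A")
# payloads == ["a0", "a1", "a2", "a4"], errors == 1
```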

### 3.19 Statistical multiplexing

Packet multiplexing has advantages over time-division multiplexing because it does not set the bit rate of each signal. A demultiplexer simply checks packet IDs and selects all packets with the wanted code. It will do this however frequently such packets arrive. Consequently it is practicable to have variable bit rate signals in a packet multiplex. The multiplexer has to ensure that the total bit rate does not exceed the rate of the channel, but that rate can be allocated arbitrarily between the various signals.

As a practical matter it is usually necessary to keep the bit rate of the multiplex constant. With variable rate inputs this is done by creating null packets which are generally called stuffing or packing. The headers of these packets contain a unique ID which the demultiplexer does not recognize and so these packets are discarded on arrival.
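The stuffing process can be sketched very simply. Packets are modelled here as (ID, payload) pairs, and the value of `NULL_ID`, the function names and the idea of a fixed number of packet slots per period are all assumptions chosen for illustration.

```python
NULL_ID = 0x1FFF   # hypothetical ID reserved for stuffing packets


def fill_period(packets, slots):
    """Pad one multiplex period with null packets up to a fixed slot count."""
    if len(packets) > slots:
        raise ValueError("inputs exceed the capacity of the channel")
    stuffing = [(NULL_ID, b"")] * (slots - len(packets))
    return list(packets) + stuffing


def discard_stuffing(multiplex):
    """Demultiplexer side: null packets are simply thrown away."""
    return [packet for packet in multiplex if packet[0] != NULL_ID]


wanted = [(1, b"audio"), (2, b"video")]
period = fill_period(wanted, slots=5)      # always five packets long
recovered = discard_stuffing(period)       # the two original packets
```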

In an MPEG environment, statistical multiplexing can be extremely useful because it allows for the varying difficulty of real program material. In a multiplex of several television programs, it is unlikely that all the programs will encounter difficult material simultaneously. When one program encounters a detailed scene or frequent cuts which are hard to compress, more data rate can be allocated at the allowable expense of the remaining programs which are handling easy material.

### 3.20 Filters

Filtering is inseparable from digital audio. Analog or digital filters, and sometimes both, are required in ADCs, DACs, in the data channels of digital recorders and transmission systems and in sampling rate convertors and equalizers. Optical systems used in disk recorders also act as filters.11 There are many parallels between analog, digital and optical filters, which this section treats as a common subject. The main difference between analog and digital filters is that in the digital domain very complex architectures can be constructed at low cost in LSI and that arithmetic calculations are not subject to component tolerance or drift.

Filtering may modify the frequency response of a system, and/or the phase response. Every combination of frequency and phase response determines the impulse response in the time domain. Figure 3.34 shows that impulse response testing tells a great deal about a filter. In a perfect filter, all frequencies should experience the same time delay. If some groups of frequencies experience a different delay from others, there is a group-delay error. As an impulse contains an infinite spectrum, a filter suffering from group-delay error will separate the different frequencies of an impulse along the time axis.

A pure delay will cause a phase shift proportional to frequency, and a filter with this characteristic is said to be phase-linear. The impulse response of a phase-linear filter is symmetrical. If a filter suffers from group-delay error it cannot be phase-linear. It is almost impossible to make a perfectly phase-linear analog filter, and many filters have a group-delay equalization stage following them which is often as complex as the filter itself. In the digital domain it is straightforward to make a phase-linear filter, and phase equalization becomes unnecessary.

Figure 3.34    Group delay time-displaces signals as a function of frequency.

Figure 3.35    (a) The impulse response of a simple RC network is an exponential decay. This can be used to calculate the response to a squarewave, as in (b).

Because of the sampled nature of the signal, whatever the response at low frequencies may be, all digital channels (and sampled analog channels) act as low-pass filters cutting off at the Nyquist limit, or half the sampling frequency.

Figure 3.35(a) shows a simple RC network and its impulse response. This is the familiar exponential decay due to the capacitor discharging through the resistor (in series with the source impedance which is assumed here to be negligible). The figure also shows the response to a squarewave at (b). These responses can be calculated because the inputs involved are relatively simple. When the input waveform and the impulse response are complex functions, this approach becomes almost impossible.

In any filter, the time domain output waveform represents the convolution of the impulse response with the input waveform. Convolution can be followed by reference to a graphic example in Figure 3.36. Where the impulse response is asymmetrical, the decaying tail occurs after the input. As a result it is necessary to reverse the impulse response in time so that it is mirrored prior to sweeping it through the input waveform. The output voltage is proportional to the shaded area shown where the two impulses overlap.

Figure 3.36    In the convolution of two continuous signals (the impulse response with the input), the impulse must be time reversed or mirrored. This is necessary because the impulse will be moved from left to right, and mirroring gives the impulse the correct time-domain response when it is moved past a fixed point. As the impulse response slides continuously through the input waveform, the area where the two overlap determines the instantaneous output amplitude. This is shown for five different times by the crosses on the output waveform.

Figure 3.37    In time discrete convolution, the mirrored impulse response is stepped through the input one sample period at a time. At each step, the sum of the cross-products is used to form an output value. As the input in this example is a constant-height pulse, the output is simply proportional to the sum of the coincident impulse response samples. This figure should be compared with Figure 3.36.

The same process can be performed in the sampled, or discrete time domain as shown in Figure 3.37. The impulse and the input are now a set of discrete samples which clearly must have the same sample spacing. The impulse response only has value where impulses coincide. Elsewhere it is zero. The impulse response is therefore stepped through the input one sample period at a time. At each step, the area is still proportional to the output, but as the time steps are of uniform width, the area is proportional to the impulse height and so the output is obtained by adding up the lengths of overlap. In mathematical terms, the output samples represent the convolution of the input and the impulse response by summing the coincident cross-products.
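The stepping process described above can be sketched in a few lines. This is a minimal illustration rather than a production routine; the function name `convolve` and the sample values are assumptions chosen for the example.

```python
def convolve(x, h):
    """Discrete convolution: y[n] is the sum over k of h[k] * x[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k, hk in enumerate(h):
            if 0 <= n - k < len(x):
                y[n] += hk * x[n - k]   # one coincident cross-product
    return y

pulse = [1.0, 1.0, 1.0, 1.0]           # constant-height input pulse
impulse_response = [1.0, 0.5, 0.25]    # decaying response (illustrative values)
output = convolve(pulse, impulse_response)
print(output)   # [1.0, 1.5, 1.75, 1.75, 0.75, 0.25]
```

Because the input is a constant-height pulse, each output value is simply the sum of the impulse response samples which coincide with the pulse at that step.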

As a digital filter works in this way, perhaps it is not a filter at all, but just a mathematical simulation of an analog filter. This approach is quite useful in visualizing what a digital filter does.

### 3.21 Transforms

Figure 3.38 shows that if a signal with a spectrum or frequency content a is passed through a filter with a frequency response b the result will be an output spectrum which is simply the product of the two. If the frequency responses are drawn on logarithmic scales (i.e. calibrated in dB) the two can be simply added because the addition of logs is the same as multiplication. Whilst frequency in audio has traditionally meant temporal frequency measured in Hertz, frequency in optics can also be spatial and measured in lines per millimetre (mm⁻¹). Multiplying the spectra of the responses is a much simpler process than convolution.
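The equivalence between multiplying responses and adding their logarithms can be checked numerically at a single frequency. The levels below are illustrative assumptions, and `to_db` is a hypothetical helper name.

```python
import math

def to_db(gain):
    """Convert a linear amplitude ratio to decibels."""
    return 20 * math.log10(gain)

signal_level = 0.5   # spectrum level at one frequency (illustrative)
filter_gain = 0.1    # filter response at the same frequency (illustrative)

output_level = signal_level * filter_gain                   # product on a linear scale
output_db = to_db(signal_level) + to_db(filter_gain)        # sum on a dB scale
```

Both routes give the same result: `to_db(output_level)` and `output_db` agree to within rounding error.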

In order to move to the frequency domain or spectrum from the time domain or waveform, it is necessary to use the Fourier transform, or in sampled systems, the discrete Fourier transform (DFT). Fourier analysis holds that any periodic waveform can be reproduced by adding together an arbitrary number of harmonically related sinusoids of various amplitudes and phases. Figure 3.39 shows how a squarewave can be built up of harmonics. The spectrum can be drawn by plotting the amplitude of the harmonics against frequency. It will be seen that this gives a spectrum which is a decaying wave. It passes through zero at all even multiples of the fundamental. The shape of the spectrum is a sinx/x curve. If a squarewave has a sinx/x spectrum, it follows that a filter with a rectangular impulse response will have a sinx/x spectrum.
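The build-up of a squarewave from harmonics can be sketched as a partial Fourier sum. The sketch assumes the standard series for an ideal squarewave of unit amplitude, in which only odd harmonics are present and the n-th has amplitude proportional to 1/n; the function name is hypothetical.

```python
import math

def square_partial_sum(t, fundamental_hz, highest_harmonic):
    """Sum the odd harmonics 1, 3, 5, ... up to highest_harmonic at time t."""
    total = 0.0
    for n in range(1, highest_harmonic + 1, 2):
        total += math.sin(2 * math.pi * n * fundamental_hz * t) / n
    return 4 / math.pi * total

# Value at the middle of the positive half-cycle of a 1 Hz squarewave,
# where the ideal waveform has the value 1.
approx_15 = square_partial_sum(0.25, 1.0, 15)
approx_999 = square_partial_sum(0.25, 1.0, 999)
```

With harmonics 1 to 15 the sum is already within a few per cent of the ideal value, and adding more harmonics brings it closer still.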

A low-pass filter has a rectangular spectrum, and this has a sinx/x impulse response. These characteristics are known as a transform pair. In transform pairs, if one domain has one shape of the pair, the other domain will have the other shape. Thus a squarewave has a sinx/x spectrum and a sinx/x impulse has a square spectrum. Figure 3.40 shows a number of transform pairs. Note the pulse pair. A time domain pulse of infinitely short duration has a flat spectrum. Thus a flat waveform, i.e. DC, has only zero frequency in its spectrum. Interestingly, the transform of a Gaussian response is still Gaussian. The impulse response of the optics of a laser disk has a sin²x/x² function, and this is responsible for the triangular falling frequency response of the pickup.

Figure 3.38    In the frequency domain, the response of two series devices is the product of their individual responses at each frequency. On a logarithmic scale the responses are simply added.

The spectrum of a pseudo-random sequence is not flat because it has a finite sequence length. The rate at which the sequence repeats is visible in the spectrum. Where pseudo-random sequences are to be used in sample manipulation, i.e. where their effects can be audible, it is essential that the sequence length should be long enough to prevent the periodicity being audible.

Figure 3.39    Fourier analysis of a squarewave into fundamental and harmonics. A, amplitude; δ, phase of fundamental wave in degrees; 1, first harmonic (fundamental); 2, odd harmonics 3–15; 3, sum of harmonics 1–15; 4, ideal squarewave.

Figure 3.41 shows that the spectrum of a pseudo-random sequence has a sinx/x characteristic, with nulls at multiples of the clock frequency. A closer inspection of the spectrum shows that it is not continuous, but takes the form of a comb where the spacing is equal to the repetition rate of the sequence.

### 3.22 FIR and IIR Filters

Filters can be described in two main classes, as shown in Figure 3.42, according to the nature of the impulse response. Finite-impulse response (FIR) filters are always stable and, as their name suggests, respond to an impulse once, as they have only a forward path. In the temporal domain, the time for which the filter responds to an input is finite, fixed and readily established. The same is therefore true about the distance over which a FIR filter responds in the spatial domain. FIR filters can be made perfectly phase linear if required. Most filters used for sampling rate conversion and oversampling fall into this category.

Infinite-impulse response (IIR) filters respond to an impulse indefinitely and are not necessarily stable, as they have a return path from the output to the input. For this reason they are also called recursive filters. As the impulse response is not symmetrical, IIR filters are not phase linear. In this respect they are similar to analog tone controls.
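The difference between the two classes can be seen by feeding a unit impulse to each. The following is a minimal sketch; the three-point FIR coefficients and the single-pole recursive structure are assumptions chosen for brevity, not designs taken from the text.

```python
def fir_filter(x, coeffs):
    """Forward path only: each output is a weighted sum of recent inputs."""
    return [sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(x))]

def iir_filter(x, feedback):
    """One-pole recursive filter: y[n] = x[n] + feedback * y[n-1]."""
    y, previous = [], 0.0
    for sample in x:
        previous = sample + feedback * previous   # return path from the output
        y.append(previous)
    return y

impulse = [1.0] + [0.0] * 7
fir_out = fir_filter(impulse, [0.25, 0.5, 0.25])   # zero after three samples
iir_stable = iir_filter(impulse, 0.5)              # decays but never reaches zero
iir_unstable = iir_filter(impulse, 1.5)            # grows without limit
```

The FIR output is finished once the impulse has left the filter, whereas the recursive output halves at every sample with a feedback of 0.5 and grows at every sample with a feedback of 1.5.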

Figure 3.40    The concept of transform pairs illustrates the duality of the frequency (including spatial frequency) and time domains.

Figure 3.41    The spectrum of a pseudo-random sequence has a sinx/x characteristic, with nulls at multiples of the clock frequency. The spectrum is not continuous, but resembles a comb where the spacing is equal to the repetition rate of the sequence.

Figure 3.42    An FIR filter (a) responds only to an input, whereas the output of an IIR filter (b) continues indefinitely rather like a decaying echo.

### 3.23 FIR filters

A FIR filter works by graphically constructing the impulse response for every input sample. It is first necessary to establish the correct impulse response. Figure 3.43(a) shows an example of a low-pass filter which cuts off at 1/4 of the sampling rate. The impulse response of a perfect low-pass filter is a sinx/x curve, where the time between the two central zero crossings is the reciprocal of the cut-off frequency. According to the mathematics, the waveform has always existed, and carries on for ever. The peak value of the output coincides with the input impulse. This means that the filter is not causal, because the output has changed before the input is known. Thus in all practical applications it is necessary to truncate the extreme ends of the impulse response, which causes an aperture effect, and to introduce a time delay in the filter equal to half the duration of the truncated impulse in order to make the filter causal.

As an input impulse is shifted through the series of registers in Figure 3.43(b), the impulse response is created, because at each point it is multiplied by a coefficient as in (c). These coefficients are simply the result of sampling and quantizing the desired impulse response. Clearly the sampling rate used to sample the impulse must be the same as the sampling rate for which the filter is being designed. In practice the coefficients are calculated, rather than attempting to sample an actual impulse response. The coefficient wordlength will be a compromise between cost and performance. Because the input sample shifts across the system registers to create the shape of the impulse response, the configuration is also known as a transversal filter. In operation with real sample streams, there will be several consecutive sample values in the filter registers at any time in order to convolve the input with the impulse response.
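A transversal filter of this kind can be sketched as follows. The coefficients are calculated from a truncated sinx/x response for a cut-off at one quarter of the sampling rate; the nine-point length and the function names are assumptions made for illustration, and no window or quantizing is applied.

```python
import math

def sinc(x):
    """Normalized sin(pi x)/(pi x), defined as 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def lowpass_coefficients(cutoff, n_points):
    """Truncated ideal low-pass response; cutoff is a fraction of the sampling rate."""
    centre = (n_points - 1) / 2   # delay of half the truncated impulse makes it causal
    return [2 * cutoff * sinc(2 * cutoff * (n - centre)) for n in range(n_points)]

def transversal_filter(samples, coeffs):
    registers = [0.0] * len(coeffs)
    out = []
    for s in samples:
        registers = [s] + registers[:-1]   # shift the new sample into the registers
        out.append(sum(r * c for r, c in zip(registers, coeffs)))
    return out

coeffs = lowpass_coefficients(0.25, 9)
impulse = [1.0] + [0.0] * 8
response = transversal_filter(impulse, coeffs)   # reproduces the coefficients in turn
```

Feeding a single unit sample through the registers reproduces the coefficients one by one at the output, which is the behaviour described for Figure 3.43(c).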

Figure 3.43    (a) The impulse response of an LPF is a sinx/x curve which stretches from – ∞ to + ∞ in time. The ends of the response must be neglected, and a delay introduced to make the filter causal.

Figure 3.43    (b) The structure of an FIR LPF. Input samples shift across the register and at each point are multiplied by different coefficients.

Figure 3.43    (c) When a single unit sample shifts across the circuit of Figure 3.43(b), the impulse response is created at the output as the impulse is multiplied by each coefficient in turn.

Simply truncating the impulse response causes an abrupt transition between input samples which matter and those which do not. Truncating the filter superimposes a rectangular shape on the time domain impulse response. In the frequency domain the rectangular shape transforms to a sinx/x characteristic which is superimposed on the desired frequency response as a ripple. One consequence of this is known as Gibbs’ phenomenon; a tendency for the response to peak just before the cut-off frequency.12,13 As a result, the length of the impulse which must be considered will depend not only on the frequency response but also on the amount of ripple which can be tolerated. If the relevant period of the impulse is measured in sample periods, the result will be the number of points or multiplications needed in the filter. Figure 3.44 compares the performance of filters with different numbers of points. A high-quality digital audio FIR filter may need in excess of 100 points.

Figure 3.44    The truncation of the impulse in an FIR filter caused by the use of a finite number of points (N) results in ripple in the response. Shown here are three different numbers of points for the same impulse response. The filter is an LPF which rolls off at 0.4 of the fundamental interval. (Courtesy Philips Technical Review)

Rather than simply truncate the impulse response in time, it is better to make a smooth transition from samples which do not count to those that do. This can be done by multiplying the coefficients in the filter by a window function which peaks in the centre of the impulse. Figure 3.45 shows some different window functions and their responses. The rectangular window is the case of truncation, and the response is shown at I. A linear reduction in weight from the centre of the window to the edges characterizes the Bartlett window II, which trades ripple for an increase in transition-region width. At III is shown the Hanning window, which is essentially a raised cosine shape. Not shown is the similar Hamming window, which offers a slightly different trade-off between ripple and the width of the main lobe. The Blackman window introduces an extra cosine term into the Hamming window at half the period of the main cosine period, reducing Gibbs’ phenomenon and ripple level, but increasing the width of the transition region. The Kaiser window is a family of windows based on the Bessel function, allowing various tradeoffs between ripple ratio and main lobe width. Two of these are shown in IV and V.
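Several of these windows have simple closed forms which can be sampled directly. The sketch below uses the commonly quoted definitions (0.54 and 0.46 for Hamming; 0.42, 0.5 and 0.08 for Blackman), which are assumed here rather than taken from Figure 3.45; the truncated coefficients are illustrative values only.

```python
import math

def bartlett(n, size):
    return 1.0 - abs(2.0 * n / (size - 1) - 1.0)

def hann(n, size):
    return 0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))

def hamming(n, size):
    return 0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1))

def blackman(n, size):
    x = 2 * math.pi * n / (size - 1)
    return 0.42 - 0.5 * math.cos(x) + 0.08 * math.cos(2 * x)

def apply_window(coeffs, window):
    """Multiply truncated impulse response coefficients by sampled window values."""
    size = len(coeffs)
    return [c * window(n, size) for n, c in enumerate(coeffs)]

truncated = [0.1, -0.2, 0.3, 0.5, 0.3, -0.2, 0.1]   # illustrative values only
windowed = apply_window(truncated, hamming)
```

Every window here has a weight of one at the centre, so the central coefficient is unchanged, while the outermost coefficients are reduced (to 8 per cent of their value in the Hamming case).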

Figure 3.45    The effect of window functions. At top, various window functions are shown in continuous form. Once the number of samples in the window is established, the continuous functions shown here are sampled at the appropriate spacing to obtain window coefficients. These are multiplied by the truncated impulse response coefficients to obtain the actual coefficients used by the filter. The amplitude responses I–V correspond to the window functions illustrated. (Responses courtesy Philips Technical Review)

Figure 3.46    The Dolph window shape is shown at (a). The frequency response is at (b). Note the constant height of the response ripples.

The Dolph window14 shown in Figure 3.46 results in an equiripple filter which has the advantage that the attenuation in the stopband never falls below a certain level.

Filter coefficients can be optimized by computer simulation. One of the best-known techniques used is the Remez exchange algorithm, which converges on the optimum coefficients after a number of iterations.

In the example of Figure 3.47, a low-pass FIR filter is shown which is intended to allow downsampling by a factor of two. The key feature is that the stopband must have begun before one half of the output sampling rate. This is most readily achieved using a Hamming window because it was designed empirically to have a flat stopband so that good aliasing attenuation is possible. The width of the transition band determines the number of significant sample periods embraced by the impulse. The Hamming window doubles the width of the transition band. This determines in turn both the number of points in the filter and the filter delay. For the purposes of illustration, the number of points is much smaller than would normally be the case in an audio application.

Figure 3.47    A downsampling filter using the Hamming window.

As the impulse is symmetrical, the delay will be half the impulse period. The impulse response is a sinx/x function, and this has been calculated in the figure. The equation for the Hamming window function is shown with the window values which result. The sinx/x response is next multiplied by the Hamming window function to give the windowed impulse response shown.
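The effect of the window on the stopband can be estimated numerically. The following sketch compares a simply truncated sinx/x response with a Hamming-windowed one by evaluating the frequency response directly; the 27-point length and the cut-off at 0.125 of the input sampling rate are assumed values, not those of Figure 3.47.

```python
import cmath
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def design(n_points, cutoff, windowed):
    """Truncated sin x/x response, optionally weighted by a Hamming window."""
    centre = (n_points - 1) / 2
    taps = []
    for n in range(n_points):
        h = 2 * cutoff * sinc(2 * cutoff * (n - centre))
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1)) if windowed else 1.0
        taps.append(h * w)
    return taps

def worst_stopband_gain(taps, start=0.25, steps=50):
    """Largest response magnitude between `start` and half the input sampling rate."""
    worst = 0.0
    for i in range(steps + 1):
        f = start + (0.5 - start) * i / steps
        response = sum(t * cmath.exp(-2j * math.pi * f * n) for n, t in enumerate(taps))
        worst = max(worst, abs(response))
    return worst

N_POINTS, CUTOFF = 27, 0.125   # assumed values for illustration
truncated = design(N_POINTS, CUTOFF, windowed=False)
hamming_taps = design(N_POINTS, CUTOFF, windowed=True)
delay_in_samples = (N_POINTS - 1) / 2   # half the impulse period
```

Under these assumptions the windowed design leaks less above one quarter of the input rate, which is the band that would alias after downsampling by two, and the delay is 13 sample periods.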

If the coefficients are not quantized finely enough, it will be as if they had been calculated inaccurately, and the performance of the filter will be less than expected. Figure 3.48 shows an example of quantizing coefficients. Conversely, raising the wordlength of the coefficients increases cost.
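Coefficient quantizing can be imitated by rounding each value to a fixed-point grid. This is a sketch under the assumption of a signed fractional format with simple rounding; real hardware may scale or round differently, and the coefficient values are illustrative.

```python
def quantize(coeffs, bits):
    """Round each coefficient to a signed fixed-point grid of the given wordlength."""
    scale = 2 ** (bits - 1)
    return [round(c * scale) / scale for c in coeffs]

def worst_error(coeffs, bits):
    return max(abs(c - q) for c, q in zip(coeffs, quantize(coeffs, bits)))

ideal = [0.3183, -0.1061, 0.5]      # illustrative coefficient values
error_4_bit = worst_error(ideal, 4)
error_12_bit = worst_error(ideal, 12)
```

The worst-case error is bounded by half a quantizing step, so each extra bit of wordlength halves the bound, at the cost of wider multipliers.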

Figure 3.48    Frequency response of a 49-point transversal filter with infinite precision (solid line) shows ripple due to finite window size. Quantizing coefficients to 12 bits reduces attenuation in the stopband. (Responses courtesy Philips Technical Review)

Figure 3.49    A seven-point folded filter for a symmetrical impulse response. In this case K1 and K7 will be identical, and so the input sample can be multiplied once, and the product fed into the output shift system in two different places. The centre coefficient K4 appears once. In an even-numbered filter the centre coefficient would also be used twice.

The FIR structure is inherently phase linear because it is easy to make the impulse response absolutely symmetrical. The individual samples in a digital system do not know in isolation what frequency they represent, and they can only pass through the filter at a rate determined by the clock. Because of this inherent phase-linearity, a FIR filter can be designed for a specific impulse response, and the frequency response will follow.

The frequency response of the filter can be changed at will by changing the coefficients. A programmable filter only requires a series of PROMs to supply the coefficients; the address supplied to the PROMs will select the response. The frequency response of a digital filter will also change if the clock rate is changed, so it is often less ambiguous to specify a frequency of interest in a digital filter in terms of a fraction of the fundamental interval rather than in absolute terms. The configuration shown in Figure 3.43 serves to illustrate the principle. The units used on the diagrams are sample periods and the response is proportional to these periods or spacings, and so it is not necessary to use actual figures.

Where the impulse response is symmetrical, it is often possible to reduce the number of multiplications, because the same product can be used twice, at equal distances before and after the centre of the window. This is known as folding the filter. A folded filter is shown in Figure 3.49.
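The saving from folding can be illustrated for a seven-point symmetrical response. Note that this sketch adds the two samples which share a coefficient before multiplying, which is an equivalent arrangement assumed here for simplicity rather than the exact structure of Figure 3.49; all values are illustrative.

```python
def direct_fir(window, coeffs):
    """One multiplication per point."""
    return sum(s * c for s, c in zip(window, coeffs))

def folded_fir(window, half_coeffs):
    """half_coeffs holds K1 up to the centre coefficient of an odd-length response."""
    n = len(window)
    total = 0.0
    for i in range(n // 2):
        # samples equidistant from the centre share one coefficient
        total += (window[i] + window[n - 1 - i]) * half_coeffs[i]
    total += window[n // 2] * half_coeffs[n // 2]   # centre coefficient used once
    return total

coeffs = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]   # symmetrical, K1 = K7 and so on
window = [1.0, 0.0, 2.0, 5.0, 1.0, 3.0, 2.0]   # samples currently in the filter
direct = direct_fir(window, coeffs)            # seven multiplications
folded = folded_fir(window, coeffs[:4])        # four multiplications
```

Both functions return 38.0 for these values, but the folded version needs only four multiplications instead of seven.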

### 3.24 Sampling-rate conversion

The topic of sampling-rate conversion will become increasingly important as digital audio equipment becomes more common and attempts are made to create large interconnected systems. Many of the circumstances in which a change of sampling rate is necessary are set out here:

1. To realize the advantages of oversampling convertors, an increase in sampling rate is necessary prior to DACs and a reduction in sampling rate is necessary following ADCs. In oversampling the factors by which the rates are changed are very much higher than in other applications.
2. When a digital recording is played back at other than the correct speed to achieve some effect or to correct pitch, the sampling rate of the reproduced signal changes in proportion. If the playback samples are to be fed to a digital mixing console which works at some standard frequency, rate conversion will be necessary.
3. In the past, many different sampling rates were used on recorders which are now becoming obsolete. With sampling-rate conversion, recordings made on such machines can be played back and transferred to more modern formats at standard sampling rates.
4. Different sampling rates exist today for different purposes. Rate conversion allows material to be exchanged freely between rates. For example, master tapes made at 48 kHz on multitrack recorders may be digitally mixed down to two tracks at that frequency, and then converted to 44.1 kHz for Compact Disc or DCC mastering, or to 32 kHz for broadcast use.
5. When digital audio is used in conjunction with film or video, difficulties arise because it is not always possible to synchronize the sampling rate with the frame rate. An example of this is where the digital audio recorder uses its internally generated sampling rate, but also records studio timecode. On playback, the timecode can be made the same as on other units, or the sampling rate can be locked, but not both. Sampling-rate conversion allows a recorder to play back an asynchronous recording locked to timecode.
6. When programs are interchanged over long distances, there is no guarantee that source and destination are using the same timing reference. In this case the sampling rates at both ends of a link will be nominally identical, but drift in reference oscillators will cause the relative sample phase to be arbitrary.

In items 5 and 6 above, the difference of rate between input and output is small, and the process is then referred to as synchronization. This can be simpler than rate conversion, and will be treated in Chapter 8.

Sampling-rate conversion can be effected by returning to the analog domain. A DAC is connected to an ADC. In order to satisfy the requirements of sampling theory, there must be a low-pass filter between the two having a frequency response restricted to one-half of the lower sampling rate. In reality this is seldom done, because all practical machines have anti-aliasing filters at their analog inputs and anti-image filters at their analog outputs. Connecting one machine to another via the analog domain therefore includes one unnecessary filter in the chain. Since analog filters are seldom optimal, degradation may be caused by rate-converting in this way, particularly in the area of phase response, although the introduction of oversampling convertors has lessened the problem.

Analog filters usually have a fixed response, and this is not necessarily the correct one if both input and output rates are to be varied significantly. The increase in noise caused by an additional quantizing stage, together with a second exposure to clock jitter, is not beneficial. Methods of sampling-rate conversion in the digital domain are preferable and will be described here.

There are three basic but related categories of rate conversion, as shown in Figure 3.50. The most straightforward (a) changes the rate by an integer ratio, up or down. The timing of the system is thus simplified because all samples (input and output) are present on edges of the higher-rate sampling clock. Such a system is generally adopted for oversampling convertors; the exact sampling rate immediately adjacent to the analog domain is not critical, and will be chosen to make the filters easier to implement.

Figure 3.50    Categories of rate conversion. (a) Integer-ratio conversion, where the lower-rate samples are always coincident with those of the higher rate. There are a small number of phases needed. (b) Fractional-ratio conversion, where sample coincidence is periodic. A larger number of phases are required. Example here is conversion from 50.4 kHz to 44.1 kHz (8/7). (c) Variable-ratio conversion, where there is no fixed relationship, and a large number of phases are required.

Next in order of difficulty is the category shown at (b) where the rate is changed by the ratio of two small integers. Samples in the input periodically time-align with the output. Many of the early proposals for professional sampling rates were based on simple fractional relationships to 44.1 kHz, such as the 8/7 ratio of the 50.4 kHz example in Figure 3.50(b), so that this technique could be used. This technique is not suitable for variable-speed replay or for asynchronous operation.

The most complex rate-conversion category is where there is no simple relationship between input and output sampling rates, and indeed they are allowed to vary. This situation, shown at (c), is known as variable-ratio conversion. The time relationship of input and output samples is arbitrary, and independent clocks are necessary. Once it was established that variable-ratio conversion was feasible, the choice of a professional sampling rate became very much easier, because the simple fractional relationships could be abandoned. The conversion fraction between 48 kHz and 44.1 kHz is 160:147, which is indeed not simple.

As the technique of integer-ratio conversion is used almost exclusively for oversampling in digital audio it will be discussed in that context. Sampling-rate reduction by an integer factor is dealt with first.

Figure 3.51(a) shows the spectrum of a typical sampled system where the sampling rate is a little more than twice the analog bandwidth. Attempts to reduce the sampling rate by simply omitting samples, a process known as decimation, will result in aliasing, as shown in (b). Intuitively it is obvious that omitting samples is the same as if the original sampling rate was lower. In order to prevent aliasing, it is necessary to incorporate low-pass filtering into the system where the cut-off frequency reflects the new, lower, sampling rate. An FIR type low-pass filter could be installed, as described earlier in this chapter, immediately prior to the stage where samples are omitted, but this would be wasteful, because for much of its time the FIR filter would be calculating sample values which are to be discarded.

A more effective method is to combine the low-pass filter with the decimator so that the filter only calculates values to be retained in the output sample stream. Figure 3.51(c) shows how this is done. The filter makes one accumulation for every output sample, but that accumulation is the result of multiplying all relevant input samples in the filter window by an appropriate coefficient. The number of points in the filter is determined by the number of input samples in the period of the filter window, but the number of multiplications per second is obtained by multiplying that figure by the output rate. If the filter is not integrated with the decimator, the number of points has to be multiplied by the input rate. The larger the rate-reduction factor, the more advantageous the decimating filter ought to be, but this is not quite the case, as the greater the reduction in rate, the longer the filter window will need to be to accommodate the broader impulse response.
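The saving can be demonstrated by comparing a filter which computes every output and then discards some with one which computes only the values to be kept. This is a minimal sketch; the three-point coefficients are far too short for real use and, like the function names, are assumptions for illustration.

```python
def full_fir(samples, coeffs):
    """Compute an output for every input sample once the filter window is full."""
    taps = len(coeffs)
    return [sum(samples[n - k] * coeffs[k] for k in range(taps))
            for n in range(taps - 1, len(samples))]

def decimating_fir(samples, coeffs, factor):
    """Compute only every `factor`-th output; the rest are never calculated."""
    taps = len(coeffs)
    return [sum(samples[n - k] * coeffs[k] for k in range(taps))
            for n in range(taps - 1, len(samples), factor)]

samples = [float(i % 5) for i in range(20)]
coeffs = [0.25, 0.5, 0.25]

full = full_fir(samples, coeffs)                   # 18 outputs, 54 multiplications
decimated = decimating_fir(samples, coeffs, 2)     # 9 outputs, 27 multiplications
```

The decimating version yields exactly the values that would survive if every second output of the full filter were discarded, for half the number of multiplications.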

When the sampling rate is to be increased by an integer factor, additional samples must be created at even spacing between the existing ones. There is no need for the bandwidth of the input samples to be reduced since, if the original sampling rate was adequate, a higher one must also be adequate.

Figure 3.51    The spectrum of a typical digital audio sample stream in (a) will be subject to aliasing as in (b) if the baseband width is not reduced by an LPF. In (c) an FIR low-pass filter prevents aliasing. Samples are clocked transversely across the filter at the input rate, but the filter only computes at the output sample rate. Clearly this will only work if the two rates are related by an integer factor.

Figure 3.52 shows that the process of sampling-rate increase can be thought of in two stages. First the correct rate is achieved by inserting samples of zero value at the correct instant, and then the additional samples are given meaningful values by passing the sample stream through a low-pass filter which cuts off at the Nyquist frequency of the original sampling rate. This filter is known as an interpolator, and one of its tasks is to prevent images of the lower input-sampling spectrum from appearing in the extended baseband of the higher-rate output spectrum.
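The two stages can be sketched directly, although doing so is deliberately inefficient. The filter here is a simply truncated sinx/x response with no window, and the `half_width` parameter and function names are assumptions for illustration.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def zero_stuff(samples, factor):
    """Stage one: insert zero-valued samples to reach the higher rate."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out

def interpolate(samples, factor, half_width=8):
    """Stage two: low-pass filter cutting off at the original Nyquist frequency."""
    stuffed = zero_stuff(samples, factor)
    span = half_width * factor
    kernel = [sinc(k / factor) for k in range(-span, span + 1)]   # truncated sin x/x
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(kernel):
            m = n - (j - span)            # index of the sample this coefficient meets
            if 0 <= m < len(stuffed):
                acc += h * stuffed[m]
        out.append(acc)
    return out

samples = [math.sin(2 * math.pi * 0.05 * i) for i in range(32)]
doubled = interpolate(samples, 2)
```

At the original sample instants the output equals the input to within rounding error, because the sinx/x response is zero at the positions of all other input samples.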

Figure 3.52    In integer-ratio sampling, rate increase can be obtained in two stages. Firstly, zero-value samples are inserted to increase the rate, and then filtering is used to give the extra samples real values. The filter necessary will be an LPF with a response which cuts off at the Nyquist frequency of the input samples.

How do interpolators work? It is important to appreciate that, according to sampling theory, all sampled systems have finite bandwidth. An individual digital sample value is obtained by sampling the instantaneous voltage of the original analog waveform, and because it has zero duration, it must contain an infinite spectrum. However, such a sample can never be heard in that form because of the reconstruction process, which limits the spectrum of the impulse to the Nyquist limit. After reconstruction, one infinitely short digital sample ideally represents a sinx/x pulse whose central peak width is determined by the response of the reconstruction filter, and whose amplitude is proportional to the sample value. This implies that, in reality, one sample value has meaning over a considerable timespan, rather than just at the sample instant. If this were not true, it would be impossible to build an interpolator.

As in rate reduction, performing the steps separately is inefficient. The bandwidth of the information is unchanged when the sampling rate is increased; therefore the original input samples will pass through the filter unchanged, and it is superfluous to compute them. The combination of the two processes into an interpolating filter minimizes the amount of computation.

As the purpose of the system is purely to increase the sampling rate, the filter must be as transparent as possible, and this implies that a linear-phase configuration is mandatory, suggesting the use of an FIR structure. Figure 3.53 shows that the theoretical impulse response of such a filter is a sinx/x curve which has zero value at the position of adjacent input samples. In practice this impulse cannot be implemented because it is infinite.

Figure 3.53    A single sample results in a sinx/x waveform after filtering in the analog domain. At a new, higher, sampling rate, the same waveform after filtering will be obtained if the numerous samples of differing size shown here are used. It follows that the value of these new samples can be calculated from the input samples in the digital domain in an FIR filter.

The impulse response used will be truncated and windowed as described earlier. To simplify this discussion, assume that a sinx/x impulse is to be used. The process of interpolation is the same in principle as the reconstruction filtering which takes place in DACs. It will be seen in Chapter 4 that a continuous time analog signal is obtained by summing the analog impulses due to each sample. In a digital interpolating filter, this process is duplicated but in discrete time.15

If the sampling rate is to be doubled, new samples must be interpolated exactly halfway between existing samples. The necessary impulse response is shown in Figure 3.54; it can be sampled at the output sample period and quantized to form coefficients. If a single input sample is multiplied by each of these coefficients in turn, the impulse response of that sample at the new sampling rate will be obtained. Note that every other coefficient is zero, which confirms that no computation is necessary on the existing samples; they are just transferred to the output. The intermediate sample is computed by adding together the impulse responses of every input sample in the window. The figure shows how this mechanism operates. If the sampling rate is to be increased by a factor of four, three sample values must be interpolated between existing input samples. Figure 3.55 shows that it is only necessary to sample the impulse response at one-quarter the period of input samples to obtain three sets of coefficients which will be used in turn. In hardware-implemented filters, the input sample which is passed straight to the output is transferred by using a fourth filter phase where all coefficients are zero except the central one which is unity.
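The division of the impulse response into coefficient phases can be sketched as follows for four-times oversampling. The four-point filter is far shorter than would be used in practice and no window is applied; both choices, and the function names, are assumptions for illustration.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def coefficient_phases(factor, points):
    """Sample a sin x/x response at 1/factor of the input period, one set per phase."""
    half = points // 2
    phases = []
    for p in range(factor):
        # coefficients for the input samples at offsets -half+1 .. half
        # around an output instant lying p/factor of a period after sample 0
        phases.append([sinc(k - p / factor) for k in range(-half + 1, half + 1)])
    return phases

phases = coefficient_phases(4, 4)
```

Phase 0 is zero everywhere except for a central coefficient of unity, so it merely passes the input sample to the output. The half-way phase is symmetrical, and the quarter and three-quarter phases are mirror images of one another.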

Figure 3.54    A two times oversampling interpolator. To compute an intermediate sample, the input samples are imagined to be sinx/x impulses, and the contributions from each at the point of interest can be calculated. In practice, rather more samples on either side need to be taken into account.

Figure 3.55    In 4× oversampling, for each set of input samples, four phases of coefficients are necessary, each of which produces one of the oversampled values.

Figure 3.50 showed that when the two sampling rates have a simple fractional relationship m/n, there is a periodicity in the relationship between samples in the two streams. It is possible to have a system clock running at the least-common multiple frequency which will divide by different integers to give each sampling rate.16
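The common clock frequency and the two divisors follow from simple integer arithmetic. The sketch below applies it to 48 kHz and 44.1 kHz; `common_clock` is a hypothetical helper name.

```python
import math
from fractions import Fraction

def common_clock(rate_a, rate_b):
    """Least common multiple of two rates and the divisor needed to obtain each."""
    lcm = rate_a * rate_b // math.gcd(rate_a, rate_b)
    return lcm, lcm // rate_a, lcm // rate_b

ratio = Fraction(48_000, 44_100)                         # reduces to 160/147
clock_hz, div_48k, div_44k1 = common_clock(48_000, 44_100)
```

For these two rates the common clock is 7.056 MHz, divided by 147 to give 48 kHz and by 160 to give 44.1 kHz. The same calculation for 50.4 kHz and 44.1 kHz reduces to 8/7.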

The existence of a common clock frequency means that a fractional-ratio convertor could be made by arranging two integer-ratio convertors in series. This configuration is shown in Figure 3.56(a). The input-sampling rate is multiplied by m in an interpolator, and the result is divided by n in a decimator. Although this system would work, it would be grossly inefficient, because only one in n of the interpolator’s outputs would be used. A decimator followed by an interpolator would also offer the correct sampling rate at the output, but the intermediate sampling rate would be so low that the system bandwidth would be quite unacceptable.

Figure 3.56    In (a), fractional-ratio conversion of 3/4 in this example is by increasing to 4× the input rate prior to reducing by a factor of 3. The inefficiency due to discarding previously computed values is clear. In (b), efficiency is raised since only needed values will be computed. Note how the interpolation phase changes for each output. Fixed coefficients can no longer be used.

As has been seen, a more efficient structure results from combining the processes. The result is exactly the same structure as an integer-ratio interpolator, and requires an FIR filter. The impulse response of the filter is determined by the lower of the two sampling rates, and as before it prevents aliasing when the rate is being reduced, and prevents images when the rate is being increased. The interpolator has sufficient coefficient phases to interpolate m output samples for every input sample, but not all of these values are computed; only interpolations which coincide with an output sample are performed. It will be seen in Figure 3.56(b) that input samples shift across the transversal filter at the input sampling rate, but interpolations are performed only at the output sample rate. This is possible because a different filter phase will be used at each interpolation.
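The way the filter phase changes from one output sample to the next can be tabulated. In this sketch the rate is multiplied by m and divided by n, and each output is located on the imaginary m-times interpolated grid; the function name is hypothetical.

```python
def phase_schedule(m, n, n_outputs):
    """For a rate change of m/n: (input sample index, coefficient phase) per output."""
    # Output j falls on position j * n of the m-times interpolated grid.
    return [divmod(j * n, m) for j in range(n_outputs)]

schedule = phase_schedule(4, 3, 5)
print(schedule)   # [(0, 0), (0, 3), (1, 2), (2, 1), (3, 0)]
```

Only the phases listed are ever computed; the other interpolated values, which a separate interpolator and decimator would have calculated and thrown away, are never produced. The pattern repeats after m outputs, when input and output samples coincide again.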

In the previous examples, the sample rate of the filter output had a constant relationship to the input, which meant that the two rates had to be phase-locked. This is an undesirable constraint in some applications, including sampling rate convertors used for variable-speed replay. In a variable-ratio convertor, values will exist for the instants at which input samples were made, but it is necessary to compute what the sample values would have been at absolutely any time between available samples. The general concept of the interpolator is the same as for the fractional-ratio convertor, except that an infinite number of filter phases is necessary. Since a realizable filter will have a finite number of phases, it is necessary to study the degradation this causes.

The desired continuous time axis of the interpolator is quantized by the phase spacing, and a sample value needed at a particular time will be replaced by a value for the nearest available filter phase. The number of phases in the filter therefore determines the time accuracy of the interpolation. The effects of calculating a value for the wrong time are identical to sampling with jitter, in that an error occurs proportional to the slope of the signal. The result is program-modulated noise. The higher the noise specification, the greater the desired time accuracy and the larger the number of phases required. The number of phases is equal to the number of sets of coefficients available, and should not be confused with the number of points in the filter, which is equal to the number of coefficients in a set (and the number of multiplications needed to calculate one output value).

In Chapter 4 it will be shown that the sampling jitter accuracy necessary for sixteen-bit working is a few hundred picoseconds. This implies that something like $2^{15}$ (32 768) filter phases will be required for adequate performance in a sixteen-bit sampling-rate convertor.17 The direct provision of so many phases is difficult, since more than a million different coefficients must be stored; so alternative methods have been devised. When several interpolators are cascaded, the number of phases available is the product of the number of phases in each stage. For example, if a filter which could interpolate sample values halfway between existing samples were followed by a filter which could interpolate at one-quarter, one-half and three-quarters the input period, the overall number of phases available would be eight. This is illustrated in Figure 3.57.
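The arithmetic behind these figures can be checked quickly. The sketch below takes the phase count as 2 to the power 15; the sampling rates of 44.1 kHz and 48 kHz and the figure of 32 points per coefficient set are assumptions chosen for illustration and are not stated in the text.

```python
def time_step_ps(sampling_rate_hz, phases):
    """Spacing between adjacent filter phases, in picoseconds."""
    return 1e12 / (sampling_rate_hz * phases)

PHASES = 2 ** 15
POINTS_PER_SET = 32                       # hypothetical filter length

for rate in (44_100, 48_000):
    print(rate, round(time_step_ps(rate, PHASES)), "ps")

total_coefficients = PHASES * POINTS_PER_SET

# Cascading: a 2-phase stage followed by a 4-phase stage gives 8 phases overall.
cascaded_phases = 2 * 4
```

Under these assumptions the phase spacing comes out in the region of several hundred picoseconds, and the coefficient store exceeds one million words, which is consistent with the argument above.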

For a practical convertor, four filters in series might be needed. To increase the sampling rate, the first two filters interpolate at fixed points between samples input to them, effectively multiplying the input sampling rate by some large factor as well as removing images from the spectrum; the second two work with variable coefficients, like the fractional-ratio convertor described earlier, so that only samples coincident with the output clock are computed. To reduce the sampling rate, the positions of the two pairs of filters are reversed, so that the fixed-response filters perform the anti-aliasing function at the output sampling frequency.

Figure 3.57    Cascading interpolators multiplies the factor of sampling-rate increase of each stage.

As mentioned earlier, the response of a digital filter is always proportional to the sampling rate. When the sampling rate on input or output varies, the phase of the interpolators must change dynamically. The necessary phase must be selected to the stated accuracy, and this implies that the position of the relevant clock edge must be measured in time to the same accuracy. This is not possible because, in real systems, the presence of noise on binary signals of finite-rise time shifts the time where the logical state is considered to have changed. The only way to measure the position of clocks in time without jitter is to filter the measurement digitally, and this can be done with a numerically locked loop. Figure 3.58 shows the essential stages of a variable-ratio convertor of this kind.

When suitable processing speed is available, a digital computer can act as a filter, since each multiplication can be executed serially, and the results accumulated to produce an output sample. For simple filters, the coefficients would be stored in memory, but the number of coefficients needed for rate conversion precludes this. However, it is possible to compute what a set of coefficients should be algorithmically, and this approach permits single-stage conversion.

The two sampling clocks are compared as before, to produce an accurate relative-phase parameter. The lower sampling rate is measured to determine what the impulse response of the filter should be to prevent aliasing or images, and this is fed, along with the phase parameter, to a processor which computes a set of coefficients and multiplies them by a window function. These coefficients are then used by the single-filter stage to compute one output sample. The process then repeats for the next output sample.

Figure 3.58    (a) In a variable-ratio convertor, the phase relationship of input and output clock edges must be measured to determine the coefficients needed. Jitter on clocks prevents their direct use, and phase-locked loops must be used to average the jitter over many sample clocks.

Figure 3.58    (b) The clock relationships in (a) determine the relative phases of output and input samples, which in conjunction with the filter impulse response determine the coefficients necessary.

Figure 3.58    (c) The coefficients determined in (b) are fed to the configuration shown (or the equivalent implemented in software) to compute the output sample at the correct interpolated position. Note that an actual filter will have many more points than this simple example shows.

### 3.25 IIR filters

Figure 3.59 is an FIR filter which has been adapted in an attempt to simulate an RC network. Because an RC network is causal, i.e. the output cannot appear before the input, the impulse response is asymmetrical, and represents an exponential decay, as shown in Figure 3.59(a). The asymmetry of the impulse response confirms the expected result that this filter will not be phase-linear. The structure of the filter is exactly the same as the earlier examples given in this chapter; only the coefficients have been changed. The simulation of RC networks is common in digital audio for the purposes of equalization or provision of tone controls. A large number of points are required in an FIR filter to create the long exponential decays necessary, and the FIR filter is at a disadvantage here because an exponential decay can be computed more simply: every output sample is a fixed proportion of the previous one.

Figure 3.59    In (a) an FIR filter is supplied with exponentially decaying coefficients to simulate an RC response. In (b) the configuration of an IIR or recursive filter uses much less hardware (or computation) to give the same response, shown in (c).

Figure 3.59(b) shows a much simpler hardware configuration, where the output is returned in attenuated form to the input. The response of this circuit to a single sample is a decaying series of samples, in which the rate of decay is controlled by the gain of the multiplier. If the gain is one, the output can carry on indefinitely. For this reason, the configuration is known as an infinite impulse response (IIR) filter. If the gain of the multiplier is slightly more than one, the output will increase exponentially after a single non-zero input until the end of the number range is reached. Unlike FIR filters, IIR filters are not necessarily stable. FIR filters are easy to understand, but difficult to make in audio applications; IIR filters are easier to make, because less hardware is needed, but they are harder to understand.
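The behaviour described can be tried with a minimal sketch. It assumes the recursive path is simply y[n] = x[n] + gain × y[n–1]; the function name is illustrative only.

```python
def recursive_response(gain, length=8):
    """Response of y[n] = x[n] + gain * y[n-1] to a single unity-value sample."""
    output, previous = [], 0.0
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        previous = x + gain * previous     # output fed back in attenuated form
        output.append(previous)
    return output

decaying = recursive_response(0.5)     # each output is half the previous one
sustained = recursive_response(1.0)    # carries on indefinitely
growing = recursive_response(1.1)      # increases exponentially: unstable
```

With a gain below one the single input sample produces a decaying series, with a gain of exactly one the output never dies away, and with a gain slightly above one it grows without limit until the number range is exhausted.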

Figure 3.60    (a) First-order lag network IIR filter. Note recursive path through single sample delay latch. (b) The arrangement of (a) becomes an integrator if K2 = 1, since the output is always added to the next sample. (c) When the time coefficients sum to unity, the system behaves as an RC lag network. (d) The same performance as in (c) can be realized with only one multiplier by reconfiguring as shown here.

One major consideration when recursive techniques are to be used is that the accuracy of the coefficients must be much higher. This is because an impulse response is created by making each output some fraction of the previous one, and a small error in the coefficient becomes a large error after several recursions. This error between what is wanted and what results from using truncated coefficients can often be enough to make the actual filter unstable whereas the theoretical model is not.

By way of introduction to this class of filters, the characteristics of some useful configurations will be discussed. It will be seen that parallels can be drawn with some classical analog circuits.

Figure 3.61    The response of the configuration of Figure 3.60 to a unit step. With K2 = 1, the system is an integrator, and the straight line shows the output with K1 = 0.1. With K1 = 0.1 and K2 = 0.9, K1 + K2 = 1 and the exponential response of an RC network is simulated.

The terms phase lag and phase lead are used to describe analog circuit characteristics, and they are also applicable to digital circuits. Figure 3.60(a) shows a first-order lag network containing two multipliers, a register to provide one sample period of delay, and an adder. As might be expected, the characteristics of the circuit can be transformed by changing the coefficients. If K2 is greater than unity, the circuit is unstable, as any non-zero input causes the output to increase exponentially. Making K2 equal to unity (Figure 3.60(b)) produces a digital integrator, because the current value in the latch is added to the input to form the next value in the latch. The coefficient K1 determines the time constant in the same way that the RC network does for the analog circuit.

Figure 3.62    (a) First-order lead configuration. Unlike the lag filter this arrangement is always stable, but as before the effect of changing the coefficients is dramatic. (b) When K2 of (a) is made zero, the configuration subtracts successive samples, and thus acts as a differentiator. (c) Setting K2 of (a) to unity gives the high-pass filter response shown here.

Figure 3.60(c) shows the case where K1 + K2 = 1; the response will be the same as an RC lag network. In this case it will be more economical to construct a different configuration shown in (d) having the same characteristics but eliminating one stage of multiplication. The operation of these configurations can be verified by computing their responses to an input step. This is simply done by applying some constant input value, and deducing how the output changes for each applied clock pulse to the register. This has been done for two cases in Figure 3.61 where the linear integrator response and the exponential responses can be seen. It is interesting to experiment with different coefficients to see how the results change.
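The experiment suggested above is easily carried out in software. The sketch assumes the configuration obeys y[n] = K1 × x[n] + K2 × y[n–1], which agrees with the recurrence relationship used for this circuit in Section 3.26; the function name is illustrative.

```python
def step_response(k1, k2, steps):
    """Apply a constant unity input and clock the register 'steps' times."""
    output, latch = [], 0.0
    for _ in range(steps):
        latch = k1 * 1.0 + k2 * latch
        output.append(latch)
    return output

integrator = step_response(0.1, 1.0, 10)   # straight line: 0.1, 0.2, 0.3, ...
rc_lag = step_response(0.1, 0.9, 10)       # exponential: 0.1, 0.19, 0.271, ...
```

With K2 = 1 the output climbs by K1 on every clock, whereas with K1 + K2 = 1 it follows 1 – 0.9^(n+1) and approaches unity asymptotically, as an RC network would. Other coefficient pairs can be substituted freely.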

Figure 3.62(a) shows a first-order lead network using the same basic building blocks. Again, the coefficient values have dramatic power. If K2 is made zero, the circuit simply subtracts the previous sample value from the current one, and so becomes a true differentiator as in (b). K1 determines the time constant. If K2 is made unity, the configuration acts as a high-pass filter as in (c).

### 3.26 The z-transform

Whereas it was possible to design effective FIR filters with relatively simple theory, the IIR filter family are too complicated for that. The z-transform is particularly appropriate for IIR digital filter design because it permits a rapid graphic assessment of the characteristics of a proposed filter. This graphic nature of the z-plane also lends itself to explanation so that an understanding of filter concepts can be obtained without their becoming obscured by mathematics.

Digital filters rely heavily on delaying sample values by one or more sample periods. One tremendous advantage of the z-transform is that a delay which is difficult to handle in time-domain expressions corresponds to a multiplication in the z-domain. This means that the transfer function of a circuit in the z-domain can often be written down by referring to the block diagram, which is why a register causing a sample delay is usually described by $z^{-1}$.

The circuit configuration of Figure 3.60(c) is repeated in Figure 3.63(a). For simplicity in calculation, the two coefficients have been set to 0.5. The impulse response of this circuit will be found first, followed by the characteristics of the circuit in the z-plane which will immediately give the frequency and phase response.

The impulse response can be found graphically by supplying a series of samples as input which are all zero except for one which has unity value. This has been done in the figure, where it will be seen that once the unity sample has entered, the output y will always be one half the previous output, resulting in an exponential decay.

Figure 3.63    (a) A digital filter which simulates an RC network. In this example the coefficients are both 0.5.

Figure 3.63    (b) The response of (a) to a single unity-value sample. The initial output of 0.5 is due to the input coefficient. Subsequent outputs are always 0.5 of the previous sample, owing to the recursive path through the latch.

Figure 3.63    (c) The z-plane showing real and imaginary axes. Frequency increases anticlockwise from Re(z) = +1 to the Nyquist limit at Re(z) = –1, returning to Re(z) = +1 at the sampling rate. The origin of aliasing is clear.

Figure 3.63    (d) The z-plane is used by inserting poles and zeros.

Figure 3.63    (e) Some examples of response of the circuit (a) using the z-plane. The zero vector is divided by the pole vector to obtain the amplitude response, and the phase response is the difference between the arguments of the two vectors.

Where there is a continuous input to the circuit, the output will be one half the previous output y plus one half the current input x. The output then becomes the convolution of the input and the impulse response.

It is possible to express the operation of the circuit mathematically. Time is discrete, and is measured in sample periods, so the time can be expressed by the number of the sample n. Accordingly the input at time n will be called x[n] and the corresponding output will be called y[n]. The previous output is called y[n–1]. It is then possible to write:

y[n] = 0.5 x[n] + 0.5 y[n–1]

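This is called a recurrence relationship, because if it is repeated for all values of n, a convolution results. The relationship can be transformed into an expression in z, referring to Figure 3.63(b):

$$y[z] = 0.5\,x[z] + 0.5\,z^{-1}y[z] \tag{3.1}$$

As with any system the transfer function is the ratio of the output to the input. Thus:

$$H[z] = \frac{y[z]}{x[z]}$$

Rearranging equation (3.1) gives:

$$y[z]\,(1 - 0.5\,z^{-1}) = 0.5\,x[z]$$

Thus:

$$H[z] = \frac{0.5}{1 - 0.5\,z^{-1}} = \frac{0.5\,z}{z - 0.5} = \frac{z}{2z - 1} \tag{3.2}$$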

The term in the numerator would make the transfer function zero when it became zero, whereas the terms in the denominator would make the transfer function infinite if they were to become zero. These result in poles.

Poles and zeros are plotted on a z-plane diagram. The basics of the z-plane are shown in Figure 3.63(c). There are two axes at right angles, the real axis Re(z) and the imaginary axis Im(z). Upon this plane is drawn a circle of unit radius whose centre is at the intersection of the axes. Frequency increases anticlockwise around this circle from 0 Hz at Re(z) = +1 to the Nyquist limit frequency at Re(z) = –1. Negative frequency increases clockwise around the circle reaching the negative Nyquist limit at Re(z) = –1. Essentially the circle on the z-plane is produced by taking the graph of a Fourier spectrum and wrapping it round a cylinder. The repeated spectral components at multiples of the sampling frequency in the Fourier domain simply overlap the fundamental interval when rolled up in this way, and so it is an ideal method for displaying the response of a sampled system, since only the response in the fundamental interval is of interest. Figure 3.63(d) shows that the z-plane diagram is used by inserting poles (X) and zeros (0).

In equation (3.2), H[z] can be made zero only if z is zero; therefore a zero is placed on the diagram at z = 0. H[z] becomes infinite if z is 0.5, because the denominator becomes zero. A pole (X) is placed on the diagram at Re(z) = 0.5. It will be recalled that 0.5 was the value of the coefficient used in the filter circuit being analysed. The performance of the system can be analysed for any frequency by drawing two vectors. The frequency in question is a fraction of the sampling rate and is expressed as an angle which is the same fraction of 360 degrees. A mark is made on the unit circle at that angle; a pole vector is drawn from the pole to the mark, and the zero vector is drawn from the zero to the mark.

The amplitude response at the chosen frequency is found by dividing the magnitude of the zero vector by the magnitude of the pole vector. A working approximation can be made by taking distances from the diagram with a ruler. The phase response at that frequency can be found by subtracting the argument of the pole vector from the argument of the zero vector, where the argument is the angle between the vector and the positive real z-axis. The phase response is clearly also the angle between the vectors. In Figure 3.63(e) the resulting diagram has been shown for several frequencies, and this has been used to plot the frequency and phase response of the system. This is confirmation of the power of the z-transform, because it has given the results of a Fourier analysis directly from the block diagram of the filter with trivial calculation. There cannot be more zeros than poles in any realizable system, and the poles must remain within the unit circle or instability results.
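The ruler-and-protractor procedure can be mimicked numerically. The sketch below places the zero at z = 0 and the pole at z = 0.5 as described, and compares the vector method with a direct evaluation of the transfer function. The constant factor of 0.5 is an assumption taken from the input coefficient of the recurrence relationship; it scales the ratio of vector lengths so that the gain at DC is unity. The function names are illustrative.

```python
import cmath
import math

ZERO, POLE, GAIN = 0 + 0j, 0.5 + 0j, 0.5

def graphical_response(fraction_of_fs):
    """Amplitude and phase from the zero and pole vectors drawn to the mark."""
    mark = cmath.exp(2j * math.pi * fraction_of_fs)   # point on the unit circle
    zero_vector = mark - ZERO
    pole_vector = mark - POLE
    amplitude = GAIN * abs(zero_vector) / abs(pole_vector)
    phase = cmath.phase(zero_vector) - cmath.phase(pole_vector)
    return amplitude, phase

def direct_response(fraction_of_fs):
    """H[z] = 0.5 / (1 - 0.5 z^-1) evaluated on the unit circle."""
    z = cmath.exp(2j * math.pi * fraction_of_fs)
    return 0.5 / (1 - 0.5 / z)

for f in (0.0, 0.1, 0.25, 0.5):
    amplitude, phase = graphical_response(f)
    print(f, round(amplitude, 4), round(math.degrees(phase), 1))
```

The gain falls from unity at DC to one third at the Nyquist limit, which is the low-pass behaviour expected of an RC lag network.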

Figure 3.64(a) shows a slightly different configuration which will be used to support the example of a high-pass filter. High-pass filters with a low cut-off frequency are used extensively in digital audio to remove DC components in sample streams which have arisen due to convertor drift.

Using the same terminology as the previous example:

u[n] = x[n] + Ku[n–1]

so that in the z-domain:

$$u[z] = x[z] + K z^{-1} u[z] \tag{3.3}$$

Figure 3.64    (a) Configuration used as a high-pass filter to remove DC offsets in audio samples. (b) The coefficient K determines the position of the pole on the real axis. It would normally be very close to the zero. (c) In the passband, the closeness of the pole and zero means the vectors are almost the same length, so the gain tends to unity and the phase shift is small (left). In the stopband, the gain falls and the phase angle tends to 90°.

Also: y[n] = u[n] – u[n–1]

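So that in the z-domain:

$$y[z] = u[z] - z^{-1}u[z] = u[z]\,(1 - z^{-1})$$

Therefore:

$$\frac{y[z]}{u[z]} = 1 - z^{-1} \tag{3.4}$$

From equation (3.3):

$$u[z] - K z^{-1} u[z] = x[z]$$

Therefore:

$$\frac{u[z]}{x[z]} = \frac{1}{1 - K z^{-1}} \tag{3.5}$$

From equations (3.4) and (3.5):

$$H[z] = \frac{y[z]}{x[z]} = \frac{y[z]}{u[z]} \cdot \frac{u[z]}{x[z]} = \frac{1 - z^{-1}}{1 - K z^{-1}} = \frac{z - 1}{z - K}$$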

Since the numerator determines the position of the zero, this will be at z = 1, and the pole will be at z = K because this makes the denominator go to zero. Figure 3.64(b) shows that if the pole is put close to the zero by making K almost unity, the filter will only attenuate very low frequencies. At high frequencies, the ratio of the length of the zero vector to the pole vector will be almost unity, whereas at very low frequencies the ratio falls steeply, becoming zero at DC. The phase characteristics can also be established. At high frequencies the pole and zero vectors will be almost parallel, so phase shift will be minimal. It is only in the area of the zero that the phase will change.
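The two recurrence relationships translate directly into a short routine. This is a sketch only; the value K = 0.99 and the function name are assumptions chosen for illustration.

```python
def dc_blocker(samples, k=0.99):
    """u[n] = x[n] + K u[n-1];  y[n] = u[n] - u[n-1]."""
    output, u_previous = [], 0.0
    for x in samples:
        u = x + k * u_previous
        output.append(u - u_previous)
        u_previous = u
    return output

offset_only = dc_blocker([1.0] * 3000)                         # pure DC input
nyquist = dc_blocker([(-1.0) ** n for n in range(3000)])       # alternating input
```

A constant input decays towards zero at the output as K^n, while a signal at the Nyquist limit settles to a gain of 2/(1 + K), which is very nearly unity when K is close to one.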

### 3.27 Bandpass filters

The low- and high-pass cases have been examined, and it has been seen that they can be realized with simple first-order filters. Bandpass circuits will now be discussed; these will generally involve higher-order configurations. Bandpass filters are used extensively for presence filters and their more complex relative the graphic equalizer, and are essentially filters which are tuned to respond to a certain band of frequencies more than others.

Figure 3.65(a) shows a bandpass filter, which is essentially a lead filter and a lag filter combined. The coefficients have been made simple for clarity; in fact two of them are set to zero. The adder now sums three terms, so the recurrence relationship is given by:

y[n] = x[n] – x[n–2] – 0.25y[n–2]

Since in the z-transform $x[n-1] = z^{-1}x[n]$,

$$y[z] = x[z] - z^{-2}x[z] - 0.25\,z^{-2}y[z]$$

$$H[z] = \frac{y[z]}{x[z]} = \frac{1 - z^{-2}}{1 + 0.25\,z^{-2}} = \frac{z^2 - 1}{z^2 + 0.25}$$

As before, the position of poles and zeros is determined by finding what values of z will cause the denominator or the numerator to become zero. This can be done by factorizing the terms. In the denominator the roots will be complex:

$$z^2 + 0.25 = (z + j0.5)(z - j0.5)$$

Figure 3.65(b) illustrates that the poles have come off the real axis of the z-plane, and then appear as a complex conjugate pair, so that the diagram is essentially symmetrical about the real axis. There are also two zeros, at Re(z) = +1 and Re(z) = –1.

With more poles and zeros, the graphical method of determining the frequency response becomes more complicated. The procedure is to multiply together the lengths of all the zero vectors, and divide by the product of the lengths of all the pole vectors. The process is shown in Figure 3.65(c). The frequency response shows an indistinct peak at half the Nyquist frequency. This is because the poles are some distance from the unit radius.
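The multiply-and-divide procedure for several poles and zeros is sketched below for this filter, using the poles at z = ±j0.5 that follow from the recurrence relationship and the zeros at z = ±1. The function name is illustrative.

```python
import cmath
import math

zeros = [1 + 0j, -1 + 0j]
poles = [0.5j, -0.5j]          # roots of z^2 + 0.25

def gain(angle_degrees):
    """Product of zero-vector lengths divided by product of pole-vector lengths."""
    mark = cmath.exp(1j * math.radians(angle_degrees))
    numerator = math.prod(abs(mark - z) for z in zeros)
    denominator = math.prod(abs(mark - p) for p in poles)
    return numerator / denominator

for angle in (0, 30, 60, 90, 120, 150, 180):
    print(angle, round(gain(angle), 3))
```

The gain is zero at DC and at the Nyquist limit, where the mark sits on a zero, and reaches a broad maximum of about 2.67 at 90 degrees, which is half the Nyquist frequency.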

A more pronounced peak can be obtained by placing the poles closer to unit radius. In contrast to the previous examples, which have accepted a particular configuration and predicted what it will do, this example will decide what response is to be obtained and then compute the coefficients needed. Figure 3.66(a) shows the z-plane diagram. The resonant frequency is one third of the Nyquist limit, or 60 degrees around the circle, and the poles have been placed at a radius of 0.9 to give a peakier response.

Figure 3.65    (a) Simple bandpass filter combining lead and lag stages. Note that two terms marked have zero coefficients, reducing the complexity of implementation. (b) Bandpass filters are characterized by poles which are away from the real axis. (c) With multiple poles and zeros, the computation of gain and phase is a little more complicated.

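It is possible to write down the transfer function directly from the position of the poles and zeros:

$$H[z] = \frac{y[z]}{x[z]} = \frac{(z + 1)(z - 1)}{(z - re^{j\omega T})(z - re^{-j\omega T})} = \frac{z^2 - 1}{z^2 - 2rz\cos\omega T + r^2}$$

$$y[z]\,(z^2 - 2rz\cos\omega T + r^2) = x[z]\,(z^2 - 1)$$

$$y[z]\,z^2 = x[z]\,z^2 - x[z] + y[z]\,2rz\cos\omega T - y[z]\,r^2$$

$$y[z] = x[z] - x[z]\,z^{-2} + y[z]\,2rz^{-1}\cos\omega T - y[z]\,r^2z^{-2}$$

As cos 60° = 0.5 and r = 0.9, so that 2r cos ωT = 0.9 and r² = 0.81, the recurrence relationship can be written: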

y[n] = x[n] – x[n–2] + 0.9y[n–1] – 0.81y[n–2]

The configuration necessary can now be seen in Figure 3.66(b), along with the necessary coefficients. Since the transfer function is the ratio of two quadratic expressions, this arrangement is often referred to as a biquadratic section.

Figure 3.66    (a) The peak frequency chosen here is one-sixth of the sample rate, which requires the poles to be on radii at ±60° from the real axis. The radius of 0.9 places the poles close to the unit circle, resulting in a pronounced peak.

Figure 3.66    (b) The biquadratic configuration shown here implements the recurrence relationship derived in the text: y[n] = x[n] – x[n – 2] + 0.9y[n – 1] – 0.81y[n – 2].

Figure 3.66    (c) Calculating the frequency response of the filter of (b). The method of Figure 3.65(c) is used. Note how the pole vector becomes very short as the resonance is reached. As this is in the denominator, the response is large. The instability resulting from a pole on or outside the unit circle is thus clearly demonstrated.

Figure 3.66    (d) The response of the filter of (b) to a unit impulse has been tabulated by using the recurrence relation. The filter rings with a period of six samples, as expected.

As with the previous example, the frequency response is computed by multiplying the vector lengths, and it will be seen in (c) to have the desired response. The impulse response has been computed for a single non-zero input sample in Figure 3.66(d), and this will be seen to ring with a period of six samples as expected.
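The design procedure and the tabulated impulse response can be reproduced with a short sketch. The coefficients are computed from the chosen pole radius and angle rather than typed in; the variable and function names are illustrative.

```python
import math

r, theta = 0.9, math.radians(60)       # pole radius and angle chosen in the text
c1 = 2 * r * math.cos(theta)           # coefficient of y[n-1], approximately 0.9
c2 = r * r                             # coefficient of y[n-2], 0.81

def biquad_impulse(length):
    """y[n] = x[n] - x[n-2] + c1 y[n-1] - c2 y[n-2] for a unit impulse input."""
    x = [1.0] + [0.0] * (length - 1)
    y = []
    for n in range(length):
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(x[n] - x2 + c1 * y1 - c2 * y2)
    return y

response = biquad_impulse(20)
print([round(v, 4) for v in response[:10]])
```

After the first sample the response repeats its shape every six samples, reduced each time by a factor of 0.9 to the sixth power, which is the ringing at one-sixth of the sampling rate that the pole positions predict.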

In the example above, the calculations were performed with precision, and the result was as desired. In practice, the coefficients will be represented by a finite wordlength, which means that some inaccuracy will be unavoidable. Owing to the recursion of previous calculations, IIR filters are sensitive to the accuracy of the coefficients, and the higher the order of the filter, the more sensitivity will be shown. In the worst case, a stable filter with a pole near the unit circle may become unstable if the coefficients are represented to less than the required accuracy. Whilst it is possible to design high-order digital filters with a response fixed for a given application, programmable filters of the type required for audio are seldom attempted above the second order to avoid undue coefficient sensitivity. The same response as for a higher-order filter can be obtained by cascading second-order filter sections. For certain applications, such as graphic equalizers, filter sections might be used in parallel.

A further issue which demands attention is the effect of truncation of the wordlength within the data path of the filter. Truncation of coefficients causes only a fixed change in the filter performance which can be calculated. The same cannot be said for data-path truncation. When a sample is multiplied by a coefficient, the necessary wordlength increases dramatically. In a recursive filter the output of one multiplication becomes the input to the next, and so on, making the theoretical wordlength required infinite. By definition, the required wordlength is not available within a realizable filter. Some low-order bits of the product will be lost, which causes noise or distortion depending on the input signal. A series cascade will produce more noise of this kind than a parallel implementation.18

In some cases, truncation can cause oscillation. Consider a recursive decay following a large input impulse. Successive output samples become smaller and smaller, but if truncation takes place, the sample may be coarsely quantized as it becomes small, and at some point will not be the correct proportion of the previous sample. In an extreme case, the decay may reverse, and the output samples will grow in magnitude until they are great enough to be represented accurately in the truncation, when they will again decay. The filter is then locked in an endless loop known as a limit cycle.19 It is a form of instability which cannot exist on large signals because the larger a signal becomes, the smaller the effect of a given truncation. It can be prevented by the injection of digital dither at an appropriate point in the data path. The randomizing effect of the dither destroys the deterministic effect of truncation and prevents the occurrence of the limit cycle.
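A limit cycle is easy to provoke in a toy example. The sketch below is an assumption-laden illustration rather than a model of any real filter: it uses a single recursive coefficient of –0.9, integer sample values, and rounding to the nearest integer (ties away from zero) in place of truncation, since rounding gives a compact demonstration. Integer arithmetic is used so that the result does not depend on floating-point behaviour.

```python
def round_tenths(value):
    """Return value / 10 rounded to the nearest integer, ties away from zero."""
    magnitude = (2 * abs(value) + 10) // 20
    return magnitude if value >= 0 else -magnitude

def quantized_decay(start, steps):
    """y[n] = -0.9 * y[n-1], requantized to an integer after every multiply."""
    y, output = start, []
    for _ in range(steps):
        y = round_tenths(-9 * y)
        output.append(y)
    return output

def ideal_decay(start, steps):
    """The same recursion without requantizing."""
    return [start * (-0.9) ** (n + 1) for n in range(steps)]

stuck = quantized_decay(10, 100)
ideal = ideal_decay(10, 100)
print(stuck[:8], stuck[-2:], ideal[-1])
```

The ideal output decays towards zero, but the requantized output stops decaying once it reaches a magnitude of five and alternates between –5 and +5 indefinitely.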

In the opposite case from truncation due to losing low-order sample bits, products are also subject to overflow if sufficient high-order bits are not available after a multiplication.20 A simple overflow results in a wraparound of the sample value, and is most disturbing as well as being a possible source of large-amplitude limit cycles. The clipping or saturating adders described in Chapter 3 find an application in digital filters, since the output clips or limits instead of wrapping, and limit cycles are prevented. In order to balance the requirements of truncation and saturation, the output of one stage may be shifted one or more binary places before entering the next stage. This process is known as scaling, since shifting down in binary divides by powers of two; it can be used to prevent overflow. Conversely if the coefficients in use dictate that the high-order bits of a given multiplier output will never be exercised, a shift up may be used to reduce the effects of truncation.
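The difference between wraparound and saturation, and the effect of scaling, can be shown with sixteen-bit two's complement values. This is a minimal sketch; the function names are illustrative.

```python
LOWEST, HIGHEST = -32768, 32767        # sixteen-bit two's complement range

def wrap_add(a, b):
    """Plain adder: an overflow wraps round to the other end of the range."""
    return (a + b - LOWEST) % 65536 + LOWEST

def saturating_add(a, b):
    """Clipping adder: the result limits at the end of the range."""
    return max(LOWEST, min(HIGHEST, a + b))

def scale_down(sample, places=1):
    """Shift down in binary: divide by a power of two to gain headroom."""
    return sample >> places

wrapped = wrap_add(30000, 10000)                              # large negative value
clipped = saturating_add(30000, 10000)                        # held at full scale
scaled = wrap_add(scale_down(30000), scale_down(10000))       # no overflow occurs
```

The wrapped result changes sign, which is far more disturbing than the clipped one, while scaling the inputs down by one binary place avoids the overflow altogether at the cost of one bit of resolution.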

The configuration shown in Figure 3.66 is not the only way of implementing a two-pole two-zero filter. Figure 3.67 shows some alternatives. Starting with the direct form 2 filter, the delays are exchanged with the adders to give a structure which is sometimes referred to as the canonic form.

Filters can also be transposed to yield a different structure with the same transfer function. Transposing is done by reversing signal flow in all the branches, the delays and the multipliers, and replacing nodes in the flow with adders and vice versa. Coefficients and delay lengths are unchanged. The transposed configurations are shown in Figure 3.67(c) and (d). The transposed direct form 2 filter has advantages for audio use,21 since it has less tendency to problems with overflow, and can be made so that the dominant truncation takes place at one node, which eases the avoidance of limit cycles.

### 3.29 Pole/zero positions

Since the coefficients of a digital filter are all binary numbers of finite wordlength, it follows that there must be a finite number of positions of poles and zeros in the z-plane. For audio use, the frequencies at which the greatest control is required are usually small compared to the sampling frequency. In presence filters, the requirement for a sharp peak places the poles near to the unit circle, and the low frequencies used emphasize the area adjacent to unity on the real axis. For maximum flexibility of response, a large number of pole and zero positions are needed in this area for given coefficient wordlengths. The structure of the filter has a great bearing on the pole and zero positions available.
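The finite set of pole positions can be enumerated for a simple case. The sketch below assumes a second-order denominator of the form z² + az + b, with a taken in steps of 0.25 and b in steps of 0.125 (eight magnitudes each, i.e. three bits excluding sign). These step sizes and names are assumptions made for illustration, not values taken from the text.

```python
import math

A_STEP, B_STEP, LEVELS = 0.25, 0.125, 8     # assumed three-bit coefficient grids

def direct_form_poles():
    """List (radius, angle in degrees) of the realizable complex poles."""
    found = []
    for ia in range(LEVELS):
        for ib in range(1, LEVELS):
            a, b = -ia * A_STEP, ib * B_STEP
            if a * a < 4 * b:                   # roots form a conjugate pair
                radius = math.sqrt(b)
                angle = math.degrees(math.acos(-a / (2 * radius)))
                found.append((radius, angle))
    return found

poles = direct_form_poles()
radii = sorted({round(radius, 4) for radius, _ in poles})
lowest_angle = min(angle for radius, angle in poles if radius > 0.93)
```

Under these assumptions the largest available radius is about 0.935, and the lowest pole angle at that radius is a little over 20 degrees, so no poles at all can be placed in the region close to unity on the real axis where audio filters most need them.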

Figure 3.67    The same filter can be realized in the direct and canonical forms, (a) and (b), and each can be transposed, (c) and (d).


Canonic structures result in highly non-uniform distributions of available pole positions, and the direct form is little better.22 Figure 3.68 shows a comparison of direct form and coupled form, where the latter has a uniform pole distribution. The transfer function of the arrangement of Figure 3.68(a) is given by:

Reference to Section 3.27 will show that

$$a = -2r\cos\omega T \quad\text{and}\quad b = r^2$$

When poles are near to unit radius, r is nearly 1, and so also is coefficient b. When ωT is small, a will be nearly –2. In this case a more accurate representation of the coefficients can be had by expressing them as the difference between the wanted values and 1 and –2 respectively. Thus b becomes 1–b′ and a becomes 2– a′. Since a′ and b′ are small, representing them with a reasonable wordlength means that the low-order bit represents a much smaller quantizing step with respect to unity. This is the approach of the Agarwal–Burrus filter structure.23

Figure 3.69 shows the equivalent of Figure 3.68(a) where the use of difference coefficients can be seen. The multiplication by –1 requires only complementing, and by –2 requires a single-bit shift in addition.

An alternative method of providing accurate coefficients is to transform the z-plane with a horizontal shift, as in Figure 3.70(a). A new z′ plane is defined, where the origin is at unity on the real z-axis:

$$z' = z - 1 \quad\text{so that}\quad z'^{-1} = \frac{1}{z - 1} = \frac{z^{-1}}{1 - z^{-1}}$$

Figure 3.70(b) shows that the above expression is used in the realization of a z′ stage, which will be seen to be a digital integrator. The general expression for a biquadratic section is given in (c). $z^{-1}$ is replaced by 1/(z′ + 1) throughout, and the expression is multiplied out. It will be seen that the z-plane coefficients gathered in brackets can be replaced by z′-plane coefficients. When poles and zeros are required near Re[z] = 1, it will be found that all the z′ coefficients are small, allowing accurate multiplication with short-coefficient wordlengths. Figure 3.70(d) shows a filter constructed in this way. These configurations demonstrate their accuracy with minimal truncation noise and limit cycles.

Figure 3.68    (a) A second-order, direct-form digital filter, with coefficients quantized to three bits (not including the sign bit). This means that each coefficient can have only eight values. As the pole radius is the square root of the coefficient b the pole distribution is non-uniform, and poles are not available in the area near to Re[z] = 1 which is of interest in audio.

Figure 3.68    (b) In the coupled structure shown here, the realizable pole positions are now on a uniform grid, with the advantage of more pole positions near to Re[z] = 1, but with the penalty that more processing is required.

Figure 3.69    In the Agarwal–Burrus filter structure, advantage is taken of the fact that the coefficients are nearly 1 and 2 when the poles are close to unit radius. The coefficients actually used are the difference between these round numbers and the desired value. This configuration should be compared with that of Figure 3.68(a).

Figure 3.70    (a) By defining a new z′-plane whose origin is at Re[z] = 1, small coefficients in z′ will correspond to poles near to z = 1.

Figure 3.70    (b) The above configuration implements a z′⁻¹ transfer function. It is in fact a digital integrator.

Figure 3.70    (c) Conversion to the z′-plane requires the coefficients in z to be combined as shown here.

### 3.30 The Fourier transform

Figure 3.39 showed that if the amplitude and phase of each frequency component is known, linearly adding the resultant components in an inverse transform results in the original waveform. In digital systems the waveform is expressed as a number of discrete samples. As a result the Fourier transform analyses the signal into an equal number of discrete frequencies. This is known as a discrete Fourier transform or DFT in which the number of frequency coefficients is equal to the number of input samples. The fast Fourier transform is no more than an efficient way of computing the DFT [24]. As was seen in the previous section, practical systems must use windowing to create short-term transforms.

Figure 3.70    (d) Implementation of a z′-plane filter. The z′⁻¹ sections can be seen at centre. As the coefficients are all small, scaling (not shown) must be used.

It will be evident from Figure 3.71 that the knowledge of the phase of the frequency component is vital, as changing the phase of any component will seriously alter the reconstructed waveform. Thus the DFT must accurately analyse the phase of the signal components.

There are a number of ways of expressing phase. Figure 3.72 shows a point which is rotating about a fixed axis at constant speed. Looked at from the side, the point oscillates up and down at constant frequency. The waveform of that motion is a sine wave, and that is what we would see if the rotating point were to translate along its axis whilst we continued to look from the side.

Figure 3.71    Fourier analysis allows the synthesis of any waveform by the addition of discrete frequencies of appropriate amplitude and phase.

One way of defining the phase of a waveform is to specify the angle through which the point has rotated at time zero (T = 0). If a second point is made to revolve at 90 degrees to the first, it would produce a cosine wave when translated. It is possible to produce a waveform having arbitrary phase by adding together the sine and cosine wave in various proportions and polarities. For example, adding the sine and cosine waves in equal proportion results in a waveform leading the sine wave by 45 degrees.

Figure 3.72 shows that the proportions necessary are respectively the sine and the cosine of the phase angle. Thus the two methods of describing phase can be readily interchanged.
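The interchange between the two descriptions of phase can be written out as a standard trigonometric identity. The symbols A (amplitude), ω (angular frequency) and φ (phase angle) are notation introduced here for illustration:

$$A\sin(\omega t + \phi) = (A\cos\phi)\,\sin\omega t + (A\sin\phi)\,\cos\omega t$$

With equal proportions of sine and cosine, the case φ = 45 degrees is obtained:

$$\sin\omega t + \cos\omega t = \sqrt{2}\,\sin(\omega t + 45^\circ)$$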

Figure 3.72    The origin of sine and cosine waves is to take a particular viewpoint of a rotation. Any phase can be synthesized by adding proportions of sine and cosine waves.

The discrete Fourier transform spectrum-analyses a string of samples by searching separately for each discrete target frequency. It does this by multiplying the input waveform by a sine wave, known as the basis function, having the target frequency and adding up or integrating the products. Figure 3.73(a) shows that multiplying by basis functions gives a non-zero integral when the input frequency is the same, whereas (b) shows that with a different input frequency (in fact all other different frequencies) the integral is zero showing that no component of the target frequency exists. Thus from a real waveform containing many frequencies all frequencies except the target frequency are excluded. The magnitude of the integral is proportional to the amplitude of the target component.

Figure 3.73    The input waveform is multiplied by the target frequency and the result is averaged or integrated. At (a) the target frequency is present and a large integral results. With another input frequency the integral is zero as at (b). The correct frequency will also result in a zero integral shown at (c) if it is at 90° to the phase of the search frequency. This is overcome by making two searches in quadrature.

Figure 3.73(c) shows that the target frequency will not be detected if it is phase shifted 90 degrees as the product of quadrature waveforms always integrates to zero. Thus the discrete Fourier transform must make a further search for the target frequency using a cosine basis function. It follows from the arguments above that the relative proportions of the sine and cosine integrals reveal the phase of the input component. Thus each discrete frequency in the spectrum must be the result of a pair of quadrature searches.
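The pair of quadrature searches can be sketched in a few lines. This is a minimal illustration, not an efficient transform; the function name `quadrature_search`, the block length of 64 samples and the test signal are all assumptions made for the example.

```python
import math

def quadrature_search(samples, k):
    """Multiply the block by sine and cosine basis functions making
    k cycles per block, and sum (integrate) the products."""
    n = len(samples)
    sine_sum = sum(x * math.sin(2 * math.pi * k * i / n)
                   for i, x in enumerate(samples))
    cosine_sum = sum(x * math.cos(2 * math.pi * k * i / n)
                     for i, x in enumerate(samples))
    return sine_sum, cosine_sum

N = 64
# Hypothetical input: amplitude 0.8, three cycles per block, 30 degrees of phase.
signal = [0.8 * math.sin(2 * math.pi * 3 * i / N + math.radians(30))
          for i in range(N)]

s3, c3 = quadrature_search(signal, 3)   # target frequency present
s5, c5 = quadrature_search(signal, 5)   # target frequency absent

amplitude = 2 * math.hypot(s3, c3) / N
phase_degrees = math.degrees(math.atan2(c3, s3))
print(amplitude, phase_degrees, s5, c5)
```

The search at the input frequency recovers both the amplitude and the phase from the two sums, whereas the search at a different frequency gives sums that are zero to within rounding error.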

Searching for one frequency at a time as above will result in a DFT, but only after considerable computation. However, a lot of the calculations are repeated many times over in different searches. The fast Fourier transform gives the same result with less computation by logically gathering together all the places where the same calculation is needed and making the calculation once.

The amount of computation can be reduced by performing the sine and cosine component searches together. Another saving is obtained by noting that every 180 degrees the sine and cosine have the same magnitude but are simply inverted in sign. Instead of performing four multiplications on two samples 180 degrees apart and adding the pairs of products it is more economical to subtract the sample values and multiply twice, once by a sine value and once by a cosine value.

Figure 3.74    An example of a filtering search. Pairs of samples are subtracted and multiplied by sampled sine and cosine waves. The products are added to give the sine and cosine components of the search frequency.

The first coefficient is the arithmetic mean, which is the sum of all the sample values in the block divided by the number of samples. Figure 3.74 shows how the search for the lowest frequency in a block is performed. Pairs of samples are subtracted as shown, and each difference is then multiplied by the sine and the cosine of the search frequency. The process shifts one sample period, and a new sample pair is subtracted and multiplied by new sine and cosine factors. This is repeated until all the sample pairs have been multiplied. The sine and cosine products are then added to give the value of the sine and cosine coefficients respectively.
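The saving from subtracting sample pairs can be checked against the direct search. The sketch below assumes that the paired samples are half a block apart, which is 180 degrees at the lowest search frequency; the exact pairing drawn in Figure 3.74 is not reproduced here, and the function names and sample values are illustrative.

```python
import math

def search_direct(samples):
    """Lowest-frequency search: one sine and one cosine multiply per sample."""
    n = len(samples)
    s = sum(x * math.sin(2 * math.pi * i / n) for i, x in enumerate(samples))
    c = sum(x * math.cos(2 * math.pi * i / n) for i, x in enumerate(samples))
    return s, c

def search_by_pairs(samples):
    """Same search, but samples half a block apart are subtracted first,
    so only half as many multiplications are needed."""
    n = len(samples)
    half = n // 2
    s = c = 0.0
    for i in range(half):
        difference = samples[i] - samples[i + half]
        s += difference * math.sin(2 * math.pi * i / n)
        c += difference * math.cos(2 * math.pi * i / n)
    return s, c

block = [3, 1, -4, 1, 5, -9, 2, 6]      # arbitrary eight-sample block
mean = sum(block) / len(block)          # the first (DC) coefficient
direct = search_direct(block)
paired = search_by_pairs(block)
print(mean, direct, paired)
```

Both functions return the same pair of sums because the sine and cosine values half a block later are the same values with inverted sign.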

Figure 3.75    The basic element of an FFT is known as a butterfly as at (a) because of the shape of the signal paths in a sum and difference system. The use of butterflies to compute the first two coefficients is shown in (b).

It is possible to combine the calculation of the DC component which requires the sum of samples and the calculation of the fundamental which requires sample differences by combining stages shown in Figure 3.75(a) which take a pair of samples and add and subtract them. Such a stage is called a butterfly because of the shape of the schematic. Figure 3.75(b) shows how the first two components are calculated. The phase rotation boxes attribute the input to the sine or cosine component outputs according to the phase angle. As shown the box labelled 90 degrees attributes nothing to the sine output, but unity gain to the cosine output. The 45 degree box attributes the input equally to both components.

Figure 3.75    (c) An actual calculation of a sine coefficient. This should be compared with the result shown in (d).

Figure 3.75(c) shows a numerical example. If a sinewave input is considered where zero degrees coincides with the first sample, this will produce a zero sine coefficient and non-zero cosine coefficient. Figure 3.75(d) shows the same input waveform shifted by 90 degrees. Note how the coefficients change over.

Figure 3.75(e) shows how the next frequency coefficient is computed. Note that exactly the same first-stage butterfly outputs are used, reducing the computation needed.

A similar process may be followed to obtain the sine and cosine coefficients of the remaining frequencies. The full FFT diagram for eight samples is shown in Figure 3.76(a). The spectrum this calculates is shown in (b). Note that only half of the coefficients are useful in a real band-limited system because the remaining coefficients represent frequencies above one half of the sampling rate.
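As an illustration of how shared butterfly outputs reduce the work, the sketch below compares a direct DFT with a recursive radix-2 FFT. It uses the common decimation-in-time arrangement with complex arithmetic, which is an assumption of this example and may not match the signal flow drawn in Figure 3.76. In this notation the real part of each coefficient is the cosine component and the negated imaginary part is the sine component.

```python
import cmath

def dft(x):
    """Direct DFT: a separate search for every frequency."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 FFT; the block length must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        rotated = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + rotated            # butterfly sum
        out[k + n // 2] = even[k] - rotated   # butterfly difference
    return out

samples = [0.0, 1.0, 0.0, -1.0, 0.5, 0.5, -0.5, -0.5]   # arbitrary real block
slow = dft(samples)
fast = fft(samples)
print([round(abs(c), 3) for c in fast])
```

For a real input the coefficients above half the sampling rate are the complex conjugates of those below it, which is why only half of them carry useful information.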

In short-time Fourier transforms (STFTs) the overlapping input sample blocks must be multiplied by window functions. The principle is the same as for the application in FIR filters shown in Section 3.23. Figure 3.77 shows that multiplying the search frequency by the window has exactly the same result except that this need be done only once and much computation is saved. Thus in the STFT the basis function is a windowed sine or cosine wave.

Figure 3.75    (d) With a quadrature input the frequency is not seen.

Figure 3.75    (e) The butterflies used for the first coefficients form the basis of the computation of the next coefficient.

The FFT is used extensively in such applications as phase correlation, where the accuracy with which the phase of signal components can be analysed is essential. It also forms the foundation of the discrete cosine transform.

### 3.31 The discrete cosine transform (DCT)

The DCT is a special case of a discrete Fourier transform in which the sine components of the coefficients have been eliminated leaving a single number. This is actually quite easy. Figure 3.78(a) shows a block of input samples to a transform process. By repeating the samples in a time-reversed order and performing a discrete Fourier transform on the double-length sample set a DCT is obtained. The effect of mirroring the input waveform is to turn it into an even function whose sine coefficients are all zero. The result can be understood by considering the effect of individually transforming the input block and the reversed block.

Figure 3.76    At (a) is the full butterfly diagram for an FFT. The spectrum this computes is shown at (b).

Figure 3.77    Multiplication of a windowed block by a sine wave basis function is the same as multiplying the raw data by a windowed basis function but requires less multiplication as the basis function is constant and can be pre-computed.

Figure 3.78    The DCT is obtained by mirroring the input block as shown at (a) prior to an FFT. The mirroring cancels out the sine components as at (b), leaving only cosine coefficients.

Figure 3.78(b) shows that the phases of all the components of one block are in the opposite sense to those in the other. This means that when the components are added to give the transform of the double-length block all the sine components cancel out, leaving only the cosine coefficients, hence the name of the transform [25]. In practice the sine component calculation is eliminated. Another advantage is that doubling the block length by mirroring doubles the frequency resolution, so that twice as many useful coefficients are produced. In fact a DCT produces as many useful coefficients as input samples.
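The mirroring argument can be verified numerically. In the sketch below the block is followed by its time-reversed copy and a direct DFT is taken of the double-length set. One detail is an assumption of this example rather than something stated in the text: the mirrored block is symmetrical about a point half a sample before its start, so each coefficient is rotated by half a sample period before the sine (imaginary) part is examined. The function names are illustrative.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]

def dct_by_mirroring(block):
    """Mirror the block, transform the double-length set, then refer the
    phase to the point of symmetry (half a sample before the start)."""
    n = len(block)
    mirrored = list(block) + list(reversed(block))
    spectrum = dft(mirrored)
    return [spectrum[k] * cmath.exp(-1j * math.pi * k / (2 * n))
            for k in range(n)]

def dct_direct(block):
    """Cosine-only sums (an unnormalized DCT-II scaled by two)."""
    n = len(block)
    return [2 * sum(x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                    for i, x in enumerate(block))
            for k in range(n)]

block = [1.0, 3.0, -2.0, 0.5]           # arbitrary input block
via_mirror = dct_by_mirroring(block)
direct = dct_direct(block)
print([round(c.real, 6) for c in via_mirror])
```

The imaginary parts of `via_mirror` vanish to within rounding error, and the real parts agree with the cosine-only sums, giving as many coefficients as input samples.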

### 3.32 The wavelet transform

The wavelet transform was not discovered by any one individual, but has evolved via a number of similar ideas and was only given a strong mathematical foundation relatively recently [26–29]. The wavelet transform is similar to the Fourier transform in that it has basis functions of various frequencies which are multiplied by the input waveform to identify the frequencies it contains. However, the Fourier transform is based on periodic signals and endless basis functions and requires windowing. The wavelet transform is fundamentally windowed, as the basis functions employed are not endless sine waves, but are finite on the time axis; hence the name. Wavelet transforms do not use a fixed window, but instead the window period is inversely proportional to the frequency being analysed. As a result a useful combination of time and frequency resolutions is obtained. High frequencies corresponding to transients in audio or edges in video are transformed with short basis functions and therefore are accurately located. Low frequencies are transformed with long basis functions which have good frequency resolution.

Figure 3.79 shows that a set of wavelets or basis functions can be obtained simply by scaling (stretching or shrinking) a single wavelet on the time axis. Each wavelet contains the same number of cycles such that as the frequency reduces, the wavelet gets longer. Thus the frequency discrimination of the wavelet transform is a constant fraction of the signal frequency. In a filter bank such a characteristic would be described as ‘constant Q’. Figure 3.80 shows that the division of the frequency domain by a wavelet transform is logarithmic whereas in the Fourier transform the division is uniform. The logarithmic coverage is effectively dividing the frequency domain into octaves and as such parallels the frequency discrimination of human hearing.
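The scaling rule can be sketched as follows. The wavelet shape used here, a sine wave under a raised-cosine (Hann) envelope, is an assumption made purely for illustration, as are the sampling rate, the eight cycles per wavelet and the octave-spaced centre frequencies.

```python
import math

SAMPLE_RATE = 48000   # assumed sampling rate in Hz
CYCLES = 8            # assumed number of cycles in every wavelet

def wavelet(freq_hz):
    """A sine burst of CYCLES cycles under a Hann envelope."""
    n = round(CYCLES * SAMPLE_RATE / freq_hz)
    return [0.5 * (1 - math.cos(2 * math.pi * i / n))
            * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

centre_frequencies = [125, 250, 500, 1000]    # one wavelet per octave
lengths = [len(wavelet(f)) for f in centre_frequencies]

# Duration multiplied by frequency is the same for every wavelet, so the
# analysis bandwidth is a constant fraction of the centre frequency.
cycles_per_wavelet = [n * f / SAMPLE_RATE
                      for n, f in zip(lengths, centre_frequencies)]
print(lengths, cycles_per_wavelet)
```

Each octave rise in frequency halves the length of the basis function, trading frequency resolution for time resolution.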

Figure 3.79    Unlike discrete Fourier transforms, wavelet basis functions are scaled so that they contain the same number of cycles irrespective of frequency. As a result their frequency discrimination ability is a constant proportion of the centre frequency.

Figure 3.80    Wavelet transforms divide the frequency domain into octaves instead of the equal bands of the Fourier transform.

As it is relatively recent, the wavelet transform has yet to be widely used although it shows great promise. It has been successfully used in audio and in commercially available non-linear video editors and in other fields such as radiology and geology.

### 3.33 Modulo-n arithmetic

Conventional arithmetic which is in everyday use relates to the real world of counting actual objects, and to obtain correct answers the concepts of borrow and carry are necessary in the calculations.

Figure 3.81    In modulo-2 calculations, there can be no carry or borrow operations and conventional addition and subtraction become identical. The XOR gate is a modulo-2 adder.

There is an alternative type of arithmetic which has no borrow or carry which is known as modulo arithmetic. In modulo-n no number can reach n. If it would, n or whole multiples of n are subtracted until the result is less than n. Thus 25 modulo-16 is 9 and 12 modulo-5 is 2. The count shown in Figure 3.81 is from a four-bit device which overflows to zero on the clock after it reaches 1111 because the carry-out is ignored. If a number of clock pulses m are applied from the zero state, the state of the counter will be given by m mod.16. Thus modulo arithmetic is appropriate to systems in which there is a fixed wordlength and this means that the range of values the system can have is restricted by that wordlength. A number range which is restricted in this way is called a finite field.
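The behaviour of the four-bit counter can be sketched as below; the function name `counter_state` is illustrative. Discarding the carry-out is modelled by masking the count to the wordlength, which gives the same result as the modulo operation.

```python
def counter_state(clock_pulses, bits=4):
    """State of a counter of the given wordlength after a number of
    clock pulses applied from the zero state (carry-out ignored)."""
    return clock_pulses & ((1 << bits) - 1)

examples = (25 % 16, 12 % 5)
states = [counter_state(m) for m in (15, 16, 17, 25)]
print(examples, states)
```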

Modulo-2 is a numbering scheme which is used frequently in digital processes. Figure 3.81 also shows that in modulo-2 the conventional addition and subtraction are replaced by the XOR function such that: A + B Mod.2 = A XOR B. When multi-bit values are added Mod.2, each column is computed quite independently of any other. This makes Mod.2 circuitry very fast in operation as it is not necessary to wait for the carries from lower-order bits to ripple up to the high-order bits.

Modulo-2 arithmetic is not the same as conventional arithmetic and takes some getting used to. For example, adding something to itself in Mod.2 always gives the answer zero.
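These properties are easily confirmed with the bitwise XOR operator. The sketch below is illustrative only; the helper names and the four-bit example values are assumptions.

```python
def add_mod2(a, b):
    """Modulo-2 addition of multi-bit words: a bitwise XOR, no carries."""
    return a ^ b

def add_mod2_by_columns(a, b, bits=4):
    """The same sum computed one column at a time, each independently."""
    result = 0
    for column in range(bits):
        bit_a = (a >> column) & 1
        bit_b = (b >> column) & 1
        result |= ((bit_a + bit_b) % 2) << column
    return result

total = add_mod2(0b1011, 0b0110)
print(format(total, "04b"))
```

Subtraction gives the same answer as addition, so adding the same word a second time restores the original value.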

### 3.34 The Galois field

Figure 3.82 shows a simple circuit consisting of three D-type latches which are clocked simultaneously. They are connected in series to form a shift register. At (a) a feedback connection has been taken from the output to the input and the result is a ring counter where the bits contained will recirculate endlessly. At (b) one XOR gate is added so that the output is fed back to more than one stage. The result is known as a twisted-ring counter and it has some interesting properties. Whenever the circuit is clocked, the left-hand bit moves to the right-hand latch, the centre bit moves to the left-hand latch and the centre latch becomes the XOR of the two outer latches. The figure shows that whatever the starting condition of the three bits in the latches, the same state will always be reached again after seven clocks, except if zero is used.

Figure 3.82    The circuit shown is a twisted-ring counter which has an unusual feedback arrangement. Clocking the counter causes it to pass through a series of non-sequential values. See text for details.

The states of the latches form an endless ring of non-sequential numbers called a Galois field after the French mathematical prodigy Évariste Galois who discovered them. The states of the circuit form a maximum-length sequence because there are as many states as are permitted by the wordlength. As the states of the sequence have many of the characteristics of random numbers, yet are repeatable, the result can also be called a pseudo-random sequence (prs). As the all-zeros case is disallowed, the length of a maximum-length sequence generated by a register of m bits cannot exceed (2ᵐ – 1) states. The Galois field, however, includes the zero term. It is useful to explore the bizarre mathematics of Galois fields which use modulo-2 arithmetic. Familiarity with such manipulations is helpful when studying error correction, particularly the Reed–Solomon codes used in recorders and treated in Chapter 6. They will also be found in processes which require pseudo-random numbers such as digital dither, treated in Section 3.14, and randomized channel codes used in, for example, NICAM 728 and discussed in Chapters 6 and 8.

The circuit of Figure 3.82 can be considered as a counter and the four points shown will then be representing different powers of 2 from the MSB on the left to the LSB on the right. The feedback connection from the MSB to the other stages means that whenever the MSB becomes 1, two other powers are also forced to one so that the code of 1011 is generated.

Each state of the circuit can be described by combinations of powers of x, such as

$$x^2 = 100,\quad x = 010,\quad x^2 + x = 110,\ \text{etc.}$$

The fact that three bits have the same state because they are connected together is represented by the Mod.2 equation:

$$x^3 + x + 1 = 0$$

Let x = a, which is a primitive element. Now

$$a^3 + a + 1 = 0 \tag{1}$$

In modulo 2

$$
\begin{aligned}
a + a &= a^2 + a^2 = 0 \\
a &= x = 010 \\
a^2 &= x^2 = 100 \\
a^3 &= a + 1 = 011 \quad \text{from (1)} \\
a^4 &= a \cdot a^3 = a(a + 1) = a^2 + a = 110 \\
a^5 &= a^2 + a + 1 = 111 \\
a^6 &= a \cdot a^5 = a(a^2 + a + 1) = a^3 + a^2 + a = a + 1 + a^2 + a = a^2 + 1 = 101 \\
a^7 &= a(a^2 + 1) = a^3 + a = a + 1 + a = 1 = 001
\end{aligned}
$$

Figure 3.83    In NICAM, randomizing is done by adding the serial data to a polynomial generated by the circuit shown here. The receiver needs an identical system which is synchronous with that of the transmitter.

In this way it can be seen that the complete set of elements of the Galois field can be expressed by successive powers of the primitive element. Note that the twisted-ring circuit of Figure 3.82 simply raises a to higher and higher powers as it is clocked; thus the seemingly complex multibit changes caused by a single clock of the register become simple to calculate using the correct primitive and the appropriate power.
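The equivalence between clocking the register and raising a to successive powers can be checked with a short simulation. This is a sketch: the tuple ordering (left, centre, right) follows the description of Figure 3.82(b) in the text, and the function names are invented for the example.

```python
def clock(state):
    """One clock of the twisted-ring counter: the centre bit moves to the
    left-hand latch, the centre latch becomes the XOR of the two outer
    latches, and the left-hand bit moves to the right-hand latch."""
    left, centre, right = state
    return (centre, left ^ right, left)

def times_a(value):
    """Multiply a field element by a, reducing with a^3 + a + 1 = 0."""
    value <<= 1
    if value & 0b1000:          # an a^3 term has appeared
        value ^= 0b1011         # replace it by a + 1
    return value

def as_bits(value):
    return ((value >> 2) & 1, (value >> 1) & 1, value & 1)

state = (0, 1, 0)               # start at a = 010
power = 0b010
register_states = []
powers_of_a = []
for _ in range(7):
    register_states.append(state)
    powers_of_a.append(as_bits(power))
    state = clock(state)
    power = times_a(power)

print(["".join(map(str, s)) for s in register_states])
```

After seven clocks both the register and the arithmetic return to the starting value, having visited every non-zero state once.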

The numbers produced by the twisted-ring counter are not random; they are completely predictable if the equation is known. However, the sequences produced are sufficiently similar to random numbers that in many cases they will be useful. They are thus referred to as pseudo-random sequences. The feedback connection is chosen such that the expression it implements will not factorize. Otherwise a maximum-length sequence could not be generated because the circuit might sequence around one or other of the factors depending on the initial condition. A useful analogy is to compare the operation of a pair of meshed gears. If the gears have numbers of teeth which are relatively prime, many revolutions are necessary to make the same pair of teeth touch again. If the numbers of teeth have a common factor, far fewer turns are needed.
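The effect of a factorizable expression can be demonstrated by measuring how many clocks each starting state needs before it recurs. The sketch below compares x³ + x + 1 with x³ + 1, which in modulo-2 factorizes as (x + 1)(x² + x + 1) and corresponds to a plain three-bit ring counter. The register is modelled as multiplication by x, and the function names are illustrative.

```python
def step(state, polynomial, bits=3):
    """Multiply the state by x and reduce by the feedback polynomial."""
    state <<= 1
    if state & (1 << bits):
        state ^= polynomial
    return state

def cycle_length(start, polynomial, bits=3):
    """Number of clocks before the starting state is reached again."""
    state = step(start, polynomial, bits)
    count = 1
    while state != start:
        state = step(state, polynomial, bits)
        count += 1
    return count

IRREDUCIBLE = 0b1011    # x^3 + x + 1, will not factorize
FACTORIZABLE = 0b1001   # x^3 + 1 = (x + 1)(x^2 + x + 1)

maximal = {s: cycle_length(s, IRREDUCIBLE) for s in range(1, 8)}
short = {s: cycle_length(s, FACTORIZABLE) for s in range(1, 8)}
print(maximal)
print(short)
```

With the first expression every non-zero start gives the full sequence of seven states. With the second the sequence length is only one or three, depending on the initial condition.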

Figure 3.83 shows the pseudo-random sequence generator used in NICAM 728. Its purpose is to break up the transmitted spectrum so that the sound carrier does not cause patterning on the TV picture. The sequence length of the circuit shown is 511 because the expression will not factorize. Further details of NICAM 728 can be found in Chapter 8.

Figure 3.84    White noise in analog circuits generally has the Gaussian amplitude distribution.

### 3.35 Noise and probability

Probability is a useful concept when dealing with processes which are not completely predictable. Thermal noise in electronic components is random, and although under given conditions the noise power in a system may be constant, this value only determines the heat that would be developed in a resistive load. In digital systems, it is the instantaneous voltage of noise which is of interest, since it is a form of interference which could alter the state of a binary signal if it were large enough. Unfortunately the instantaneous voltage cannot be predicted; indeed if it could the interference could not be called noise. Noise can only be quantified statistically, by measuring or predicting the likelihood of a given noise amplitude.

Figure 3.84 shows a graph relating the probability of occurrence to the amplitude of noise. The noise amplitude increases away from the origin along the horizontal axis, and for any amplitude of interest, the probability of that noise amplitude occurring can be read from the curve. The shape of the curve is known as a Gaussian distribution, which crops up whenever the overall effect of a large number of independent phenomena is considered. Thermal noise is due to the contributions from countless molecules in the component concerned. Magnetic recording depends on superimposing some average magnetism on vast numbers of magnetic particles.

If it were possible to isolate an individual noise-generating microcosm of a tape or a head on the molecular scale, the noise it could generate would have physical limits because of the finite energy present. The noise distribution might then be rectangular as shown in Figure 3.85(a), where all amplitudes below the physical limit are equally likely. The output of a twisted-ring counter such as that in Figure 3.82 can have a uniform probability. Each value occurs once per sequence. The outputs are positive only and do not include zero, but every value from 1 up to 2ⁿ – 1 is then equally likely.

Figure 3.85    At (a) is a rectangular probability; all values are equally likely but between physical limits. At (b) is the sum of two rectangular probabilities, which is triangular, and at (c) is the Gaussian curve which is the sum of an infinite number of rectangular probabilities.

The output of a prs generator can be made into the two’s complement form by inverting the MSB. This has the effect of exchanging the missing all zeros value for a missing fully negative value as can be seen by considering the number ring in Figure 3.5. In this example, inverting the MSB causes the code of 1000 representing –8 to become 0000. The result is a four-bit prs generating uniform probability from –7 to +7 as shown in Figure 3.85(a).
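The MSB inversion can be sketched as below. For brevity the fifteen non-zero codes of a four-bit maximum-length sequence are simply enumerated in numerical order rather than generated by a register, which is an assumption of the example; the function name is illustrative.

```python
def to_twos_complement(code, bits=4):
    """Invert the MSB of a code and read the result as a signed number."""
    flipped = code ^ (1 << (bits - 1))
    if flipped & (1 << (bits - 1)):      # sign bit set: negative value
        return flipped - (1 << bits)
    return flipped

prs_codes = range(1, 16)                 # every four-bit code except 0000
signed_values = sorted(to_twos_complement(c) for c in prs_codes)
print(signed_values)
```

The missing all-zeros code would have mapped to –8, so the values that remain run symmetrically from –7 to +7, each occurring once per sequence.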

If the combined effect of two of these uniform probability processes is considered, clearly the maximum amplitude is now doubled, because the two effects can add, but provided the two effects are uncorrelated, they can also subtract, so the probability is no longer rectangular, but becomes triangular as in Figure 3.85(b). The probability falls to zero at peak amplitude because the chances of two independent mechanisms reaching their peak value with the same polarity at the same time are understandably small.

If the number of mechanisms summed together is now allowed to increase without limit, the result is the Gaussian curve shown in Figure 3.85(c), where it will be seen that the curve has no amplitude limit, because it is just possible that all mechanisms will simultaneously reach their peak value, although the chances of this happening are incredibly remote. Thus the Gaussian curve is the overall probability of a large number of uncorrelated uniform processes.
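The progression from rectangular through triangular towards a bell shape can be reproduced by counting the ways in which independent uniform values can add. The sketch below uses the –7 to +7 range of the four-bit example; summing eight processes is an arbitrary choice standing in for "a large number", and the function name is illustrative.

```python
def convolve(p, q):
    """Distribution of the sum of two independent discrete processes."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

uniform = [1] * 15                          # one way to reach each of -7..+7
triangular = convolve(uniform, uniform)     # sums from -14 to +14

summed = uniform
for _ in range(7):                          # eight processes in total
    summed = convolve(summed, uniform)

print(triangular)
print(max(summed) / sum(summed), summed[0] / sum(summed))
```

With two processes the counts rise linearly to a central peak and fall again. In this discrete case the peak amplitude can be reached in exactly one way, so its probability is small rather than exactly zero. With eight processes the extremes are still possible but have become extremely unlikely compared with values near the centre.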

### References

1. Spreadbury, D., Harris, N. and Lidbetter, P. So you think performance is cracked using standard floating point DSPs? Proc. 10th. Int. AES Conf., 105–110 (1991)
2. Richards, J.W., Digital audio mixing. The Radio and Electron. Eng., 53, 257–264 (1983)
3. Richards, J.W. and Craven, I., An experimental ‘all digital’ studio mixing desk. J. Audio Eng. Soc., 30, 117–126 (1982)
4. Jones, M.H., Processing systems for the digital audio studio. In Digital Audio, edited by B. Blesser, B. Locanthi and T.G. Stockham Jr, pp. 221–225, New York: Audio Engineering Society (1982)
5. Lidbetter, P.S., A digital delay processor and its applications. Presented at the 82nd Audio Engineering Society Convention (London, 1987), Preprint 2474(K-4)
6. McNally, G.J., COPAS A high speed real time digital audio processor. BBC Research Dept Report, RD 1979/26
7. McNally, G.W., Digital audio: COPAS-2, a modular digital audio signal processor for use in a mixing desk. BBC Research Dept Report, RD 1982/13
8. Vandenbulcke, C. et al., An integrated digital audio signal processor. Presented at the 77th Audio Engineering Society Convention (Hamburg, 1985), Preprint 2181(B-7)
9. Moorer, J.A., The audio signal processor: the next step in digital audio. In Digital Audio, edited by B. Blesser, B. Locanthi and T.G. Stockham Jr, pp. 205–215, New York: Audio Engineering Society (1982)
10. Gourlaoen, R. and Delacroix, P., The digital sound mixing desk: architecture and integration in the future all-digital studio. Presented at the 80th Audio Engineering Society Convention (Montreux, 1986), Preprint 2327(D-1)
11. Ray, S.F., Applied Photographic Optics. Oxford: Focal Press (1988) (Ch. 17)
12. van den Enden, A.W.M. and Verhoeckx, N.A.M., Digital signal processing: theoretical background. Philips Tech. Rev., 42, 110–144 (1985)
13. McClellan, J.H., Parks, T.W. and Rabiner, L.R., A computer program for designing optimum FIR linear-phase digital filters. IEEE Trans. Audio and Electroacoustics, AU-21, 506–526 (1973)
14. Dolph, C.L., A current distribution for broadside arrays which optimises the relationship between beam width and side lobe level. Proc. IRE, 34, 335–348 (1946)
15. Crochiere, R.E. and Rabiner, L.R., Interpolation and decimation of digital signals – a tutorial review. Proc. IEEE, 69, 300–331 (1981)
16. Rabiner, L.R., Digital techniques for changing the sampling rate of a signal. In Digital Audio, edited by B. Blesser, B. Locanthi and T.G. Stockham Jr, pp. 79–89, New York: Audio Engineering Society (1982)
17. Lagadec, R., Digital sampling frequency conversion. In Digital Audio, edited by B. Blesser, B. Locanthi and T.G. Stockham Jr, pp. 90–96, New York: Audio Engineering Society (1982)
18. Jackson, L.B., Roundoff noise analysis for fixed-point digital filters realized in cascade or parallel form. IEEE Trans. Audio and Electroacoustics, AU-18, 107–122 (1970)
19. Parker, S.R., Limit cycles and correlated noise in digital filters. In Digital Signal Processing, Western Periodicals Co., 177–179 (1979)
20. Claasen, T.A.C.M., Mecklenbrauker, W.F.G. and Peek, J.B.H., Effects of quantizing and overflow in recursive digital filters. IEEE Trans. ASSP, 24, 517–529 (1976)
21. McNally, G.J., Digital audio: recursive digital filtering for high-quality audio signals. BBC Res. Dept Report, RD 1981/10
22. Rabiner, L.R. and Gold, B., Theory and Application of Digital Signal Processing, New Jersey: Prentice Hall (1975)
23. Agarwal, R.C. and Burrus, C.S., New recursive digital filter structures having very low sensitivity and roundoff noise. IEEE Trans. Circuits. Syst., CAS-22, 921–927 (1975)
24. Kraniauskas, P., Transforms in Signals and Systems, Wokingham: Addison-Wesley (1992)
25. Ahmed, N., Natarajan, T. and Rao, K., Discrete Cosine Transform, IEEE Trans. Computers, C-23, 90–93 (1974)
26. Goupillaud, P., Grossman, A. and Morlet, J., Cycle-Octave and related transforms in seismic signal analysis. Geoexploration, 23, 85–102, Elsevier Science (1984/5)
27. Daubechies, I., The wavelet transform, time–frequency localisation and signal analysis. IEEE Trans. Info. Theory, 36, No.5, 961–1005 (1990)
28. Rioul, O. and Vetterli, M., Wavelets and signal processing. IEEE Signal Process. Mag., 14–38 (Oct. 1991)
29. Strang, G. and Nguyen, T., Wavelets and Filter Banks, Wellesley, MA: Wellesley-Cambridge Press (1996)