The number of digital signals per sample is the point on which so many attempts to achieve digital coding of audio have foundered in the past. As so often happens, the problems could be more easily solved using tape methods, because it would be quite feasible to make a 16-track tape recorder using wide tape and to use each channel for one particular bit in a number. This is, in fact, the method that can be used for digital mastering, where tape size is not a problem, but the disadvantage here is that for original recordings some 16–32 separate music tracks will be needed. If each of these were to consist of 16 digital tracks, the recorder would, to put it mildly, be rather overloaded. Since there is no possibility of creating or using a 16-track disc, the attractively simple idea of using one track per digital bit has to be put aside. The alternative is serial transmission and recording.
Definition: Serial means one after another. For 16 bits of a binary number, serial transmission means that the bits are transmitted in a stream of 16 separate signals of 0 or 1, rather than in the form of separate signals on 16 channels at once. The rate of transmission of a digital serial signal is stated in kb/s (kilobits per second) or Mb/s (megabits per second).
Now if the signals are samples that have been taken at the rate of 40 kHz, and each sample requires 16 bits to be sent out, then the rate of sending digital signals is 16 × 40 kb/s, which if we coded it as one wave per bit would be equivalent to a frequency of 640 kHz, well beyond the rates with which ordinary tape or disc systems can cope. As it happens, we can get away with much more efficient coding and with slower sampling rates, as we shall see, but this does not offer much relief because there are further problems.
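The arithmetic of that serial bit rate can be checked in a couple of lines. This is only a sketch of the calculation in the text; the variable names are illustrative.

```python
# Serial bit rate for 16-bit samples taken at a 40 kHz sampling rate,
# using the figures from the text.
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 40_000

bit_rate = BITS_PER_SAMPLE * SAMPLE_RATE_HZ  # bits per second
print(bit_rate)  # 640000, i.e. 640 kb/s
```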
Note: A well-known rule (Shannon's law, which was worked out long before digital recording was established) states that the sampling rate must be at least twice the highest frequency component of the analog signal, so that a 40 kHz sampling rate is adequate for a signal whose highest frequency is 20 kHz. Remember that the components of an audio signal at this highest frequency will be present only at very small amplitudes.
When a parallel system is used, with one channel for each bit, there is no problem of identifying a number, because the bits are present at the same time on the 16 different channels, with each bit in its place; the most significant bit will be on the MSB line, the least significant on the LSB line, and so on. When bits are sent one at a time, though, how do you know which bits belong to which number? How can you be sure whether a bit is the last bit of one number or the first bit of the next? The point is very important because when the 8-4-2-1 system is used, a 1 as the most significant bit of a 16-bit number means a value of 32,768, but a 1 as the least significant bit means just 1. The difference in terms of signal amplitudes is enormous, which is why binary codes other than the 8-4-2-1 type are used industrially. The 8-4-2-1 code is used mainly in computing because of the ease with which arithmetical operations can be carried out on numbers that use this code.
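The difference in bit weights can be seen directly: in a 16-bit binary-weighted (8-4-2-1 style) word, each position carries twice the value of the one below it, so the top and bottom bits differ in value by a factor of 32,768. A minimal sketch:

```python
# Weights of each bit position in a 16-bit binary-weighted word, LSB first.
weights = [2 ** n for n in range(16)]
print(weights[0])   # 1      (least significant bit)
print(weights[-1])  # 32768  (most significant bit)

# A word with only the MSB set versus one with only the LSB set:
msb_only = 0b1000_0000_0000_0000
lsb_only = 0b0000_0000_0000_0001
print(msb_only, lsb_only)  # 32768 1
```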
Even if we assume that the groups of 16 bits can be counted out perfectly, what happens if one bit is missed or mistaken? At a frequency of 1 MHz or more it would be hopelessly optimistic to assume that a bit might never be lost or changed. There are tape dropouts and dropins to consider, and discs cannot have perfect surfaces. At such a density of data, faults are inevitable, and some method must be used to ensure that the groups of 16 bits (words) remain correctly gathered together. Whatever method is used must not compromise the rate at which the numbers are transmitted, however, because this is the sampling rate and it must remain fixed. Fortunately, the problems are neither new nor unique to audio; they have existed for a long time and have been tackled by the designers of computer systems. A look at how these problems are tackled in simple computer systems gives a few clues as to how the designers of audio digital systems went about their task.
Summary: The main problem of converting analog sound signals to digital form is the rate of conversion that is needed. The accepted number of digits per sample is 16, and Shannon's law states that the sampling rate must be at least twice the highest frequency component of the analog signal. For a maximum of 20 kHz, this means a sampling rate of 40 kHz, and for 16-bit samples this requires a rate of 16 × 40 thousand bits per second, which is 640,000 bits per second. These bit signals have to be recorded and transmitted in serial form, meaning one after another.
Analog to Digital
Converting from an analog into a digital signal involves the quantization steps that have been explained above, but the mechanism needs some explanation in the form of block diagrams. All analog to digital (A/D) conversions start with a sample and hold circuit, illustrated in block form in Figure 9.2.
The input to this circuit is the waveform that is to be converted to digital form, and this is taken through a buffer stage to a switch and a capacitor. While the switch is closed, the voltage across the capacitor will be the waveform voltage; the buffer ensures that the capacitor can be charged and discharged without taking power from the signal source. At the instant when the switch opens, the voltage across the capacitor is the sampled waveform voltage, and this will remain stored until the capacitor discharges. Since the next stage is another buffer, it is easy to ensure that the amount of discharge is negligible. While the switch is open and the voltage is stored, the conversion of this voltage into digital form (quantization) can take place, and this action must be completed before the switch closes again at the end of the sampling period.
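The switch-and-capacitor action can be modeled in a few lines: while the switch is "closed" the stored voltage tracks the input, and while it is "open" the last value is held for the converter to work on. This is a toy simulation, not a circuit model; the function name and the rates used are illustrative only.

```python
import math

# Toy simulation of a sample-and-hold stage. Every `hold_every` steps the
# "switch" closes briefly and the capacitor charges to the input voltage;
# in between, the buffered output simply holds the stored value.
def sample_and_hold(signal, hold_every):
    held = []
    stored = 0.0
    for i, v in enumerate(signal):
        if i % hold_every == 0:  # switch closes: capacitor charges to input
            stored = v
        held.append(stored)      # switch open: output holds the stored voltage
    return held

# A sine wave "held" every 8 steps of a finely spaced input.
wave = [math.sin(2 * math.pi * t / 64) for t in range(64)]
print(sample_and_hold(wave, 8)[:10])
```

Between sampling instants the output stays flat, which is exactly the staircase waveform a real sample-and-hold presents to the quantizer.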
In this diagram, a simple mechanical switch has been indicated, but in practice the switch action would be carried out using MOSFETs that are part of the conversion IC. To put some figures on the process, at the sampling rate of 44.1 kHz that is used for CDs, the hold period cannot be longer than about 23 µs (1/44,100 of a second), which looks long enough for conversion – but read on!
The conversion can use a circuit such as is outlined in Figure 9.3, which very closely resembles the diagram of a digital voltmeter illustrated in Chapter 11. The clock pulses are at a frequency that is much higher than the sampling pulses, and while a voltage is being held at the input, the clock pulses pass through the gate and are counted. The clock pulses are also the input to the integrator, whose output is a rising voltage. When the output voltage from the integrator reaches the same level as the input voltage (or slightly above it), the gate shuts off, and the counted number at this point is used as the digital signal. The reset pulse (from the sample and hold switch circuit) then resets the counter and the integrator so that a new count can start for the next sampled voltage.
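The counting action above can be sketched as a loop: the integrator's ramp climbs one step per clock pulse, and the count at the moment the ramp reaches the held input voltage becomes the digital output. This is a toy model of the single-slope idea, not a real converter; the full-scale voltage and step size are assumptions for illustration.

```python
# Toy model of the counting (single-slope) converter described above.
# One clock pulse raises the integrator ramp by one step; the gate stays
# open, and the counter runs, until the ramp reaches the held voltage.
def single_slope_convert(v_in, v_full_scale=1.0, n_bits=16):
    step = v_full_scale / (2 ** n_bits)  # ramp increment per clock pulse
    ramp = 0.0
    count = 0
    while ramp < v_in and count < 2 ** n_bits - 1:
        ramp += step   # integrator output rises a little more
        count += 1     # gate still open: this clock pulse is counted
    return count       # counter now holds the quantized value

print(single_slope_convert(0.5))  # 32768: half of the 65,536-step scale
```

The drawback shown in the text falls straight out of this loop: a worst-case input needs the full 65,536 counts inside one hold period, which is what forces the impractically high clock rate.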
The clock pulse must be at a rate that will permit a full set of pulses to be counted in a sampling interval. For example, if the counter uses 16-bit output, corresponding to a count of 65,536 (which is 2^16), and the sampling time is 20 µs, then it must be possible to count 65,536 clock pulses in 20 µs, giving a clock rate of about 3.3 GHz. This is not a rate that would be easy to supply or to work with (though rates of this magnitude are now commonplace in computers), so the conversion process is not quite as simple as has been suggested here. For CD recording, more advanced A/D conversion methods are used, such as the successive approximation method (in which the input voltage is first compared to the voltage corresponding to the most significant digit, then successively to all the others, so that only 16 comparisons are needed for a 16-bit output rather than 65,536). If you are curious about these methods, see the book Introducing Digital Audio from PC Publishing.
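The successive approximation idea can be sketched directly: try each bit from the most significant downwards, keeping a bit only if the trial value does not overshoot the input. A minimal sketch, with an assumed full-scale voltage of 1 V:

```python
# Sketch of successive approximation: test one bit at a time from the MSB
# down, keeping each bit only if the trial value stays at or below the
# input. Only n comparisons are needed for an n-bit result (16 here).
def sar_convert(v_in, v_full_scale=1.0, n_bits=16):
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                     # propose this bit
        if trial * v_full_scale / (2 ** n_bits) <= v_in:
            code = trial                              # keep it: no overshoot
    return code

print(sar_convert(0.5))  # 32768: just the MSB set, after only 16 comparisons
```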
Note: The sampling rate for many analog signals can be much lower than the 44.1 kHz that is used for CD recording, and most industrial processes can use much lower rates. ICs for these A/D conversions are therefore mostly of the simpler type, taking a few milliseconds for each conversion. Very high-speed (flash) converters are also obtainable that can work at sampling rates of many megahertz.
Serial Transmission
To start with, when computers transmit data serially, the word that is transmitted is not just the group of digits that is used for coding a number. For historic reasons, computers transmit in units of 8-bit bytes, rather than in 16-bit words, but the principles are equally valid. When a byte is transmitted over a serial link, using what are called asynchronous methods, it is preceded by one start bit and followed by one or two (according to the system that is used) stop bits.
Since the use of two stop bits is very common, we will stick to the example of one start bit, eight number bits and two stop bits. The start bit is a 0 and the stop bits are 1s, so that each group of 11 bits that are sent will start with a 0 and end with two 1s. The receiving circuits will place each group of 11 bits into a temporary store and check for these start and stop bits being correct. If they are not, then the digits as they come in are shifted along until the pattern becomes correct. This means that an incorrect bit will cause loss of data, because it may need several attempts to find that the pattern fits again, but it will not result in every byte that follows being incorrect, as would happen if the start and stop bits were not used.
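The 11-bit framing just described can be sketched as follows. The bit order within the byte (least significant bit first, as most serial hardware sends it) is an assumption for illustration, not something the text specifies.

```python
# Asynchronous framing as described: one start bit (0), eight data bits,
# and two stop bits (1). Data bits are sent LSB first (an assumption).
def frame_byte(value):
    data = [(value >> i) & 1 for i in range(8)]  # eight data bits, LSB first
    return [0] + data + [1, 1]                   # start bit + data + stop bits

def check_frame(bits):
    # Accept an 11-bit group only if the start and stop bits are in place.
    return len(bits) == 11 and bits[0] == 0 and bits[9] == 1 and bits[10] == 1

frame = frame_byte(0b0101_1010)
print(frame)               # 11 bits: a leading 0, then data, then 1, 1
print(check_frame(frame))  # True
```

A receiver that finds `check_frame` failing would shift the incoming bits along, exactly as the text describes, until the 0...11 pattern lines up again.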
The use of start and stop bits is one very simple method of checking the accuracy of digital transmissions, and it is remarkably successful, but it is just one of a number of methods. In conjunction with the use of start and stop bits, many computer systems also use what is known as parity, an old-established method of detecting one-bit errors in text data. In a group of eight bits, only seven are normally used to carry text data (in ASCII code) and the eighth is spare. This redundant bit is made to carry a checking signal, which is of a very simple type.

We can illustrate how it works with an example of what is termed even parity. Even parity means that the number of 1s in a group of eight shall always be even. If the number is odd, then there has been an error in transmission, and a computer system may be able to make the transmitting equipment try again. When each byte is sent, the number of 1s is counted. If this number is even, then the redundant bit is left as a 0, but if the number is odd, then the redundant bit is made a 1, so that the group of eight now contains an even number of 1s. At the receiver, all that is normally done is to check for the number of 1s being even; no attempt is made to find which bit is at fault if an error is detected. The redundant bit is not used for any purpose other than making the total number even. The process is illustrated in Table 9.2.
Table 9.2 Using parity to check that a byte has been received correctly. The numbers each use seven bits, and one extra parity bit is added, in this case to make the count of 1s an even number. On reception, the byte can be checked to find if its parity is even. The parity bit is located on the left-hand side.

| Signal byte       | 0011001  | 0101111  | 0101110  | 1110010  |
| Even parity added | 1        | 1        | 0        | 0        |
| Signal sent       | 10011001 | 10101111 | 00101110 | 01110010 |
| Received byte     | 10001001 | 10101111 | 10101000 | 01110010 |
| Parity check      | odd      | even     | odd      | even     |
| Result            | error    | OK       | error    | OK       |
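The same check can be written in a few lines, using the bytes from Table 9.2. This is only a sketch of the even-parity rule; the function names are illustrative.

```python
# Even parity as in Table 9.2: add one bit (on the left) so that the count
# of 1s in the byte is even, then check received bytes for an even count.
def add_even_parity(seven_bits):
    ones = seven_bits.count('1')
    return ('1' if ones % 2 else '0') + seven_bits  # parity bit on the left

def parity_ok(eight_bits):
    return eight_bits.count('1') % 2 == 0

print(add_even_parity('0011001'))  # '10011001', as in the table
print(parity_ok('10001001'))       # False: this received byte is in error
print(parity_ok('10101111'))       # True: this byte arrived correctly
```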
Parity, used in this way, is a very simple system indeed, and if two bits in a byte are in error it is possible that the parity could be correct even though the transmitted data was not. In addition, the parity bit itself might be the one affected by the error, so that the data is signaled as being faulty even though it is perfect. Nevertheless, parity, like the use of start bits and stop bits, works remarkably well and allows large masses of computer text data to be transmitted over serial lines at reasonably fast rates. What is a reasonably fast rate for a computer is not, however, very brilliant for audio, and even for the less demanding types of computing purposes the use of parity is not really good enough, so much better methods have been devised. Parity has now almost vanished as a main checking method because it is now unusual to send plain (7-bit) text; we use formatted text with all 8 bits in each byte, and we also send coded pictures and sound using all 8 bits of each byte, so that an added parity bit would make a 9-bit unit (not impossible, but awkward, and better methods are available).
The rates of sending bits serially over telephone lines in the pre-broadband days ranged from the painfully slow 110 bits per second (used at one time for teleprinters) to the more tolerable 56,000 bits per second. Even this fast rate is very slow by the standards that we have been talking about, so it is obvious that something rather better is needed for audio information. Using direct cable connections, rates of several million bits per second can be achieved, and these rates are used for the universal serial bus (USB) system that is featured in modern computers. We will look at broadband methods later.
As a further complication, recording methods do not cope well with signals that are composed of long strings of 1s or 0s; this is equivalent to trying to record square waves in an analog system. The way round this is to use a signal of more than 8 bits for a byte, using a form of conversion table for bytes that contain long sequences of 1s or 0s. A very popular format is called eight-to-fourteen modulation (EFM), and as the name suggests, this converts each 8-bit byte into a 14-bit code (with three further merging bits placed between codes), so that the recorded signal contains no long runs of 1s or 0s and is also free of sequences that alternate too quickly, such as 01010101010101.
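A simple way to see what such a code has to achieve is to check the run lengths in a bit stream. The limits below (no run of identical bits shorter than 3 or longer than 11) follow the usual statement of the EFM channel constraints, but this is only an illustrative checker, not the real CD encoding tables.

```python
# Illustrative run-length checker of the kind a code such as EFM enforces.
# The limits are assumptions for illustration, not the actual CD tables.
def run_lengths(bits):
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1          # run of identical bits continues
        else:
            runs.append(count)  # run ended: record its length
            count = 1
    runs.append(count)
    return runs

def obeys_limits(bits, shortest=3, longest=11):
    return all(shortest <= r <= longest for r in run_lengths(bits))

print(obeys_limits('00011100000000111'))  # True: runs of 3, 3, 8, 3
print(obeys_limits('01010101'))           # False: runs of 1 alternate too fast
```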
All in all, then, the advantages that digital coding of audio signals can deliver are not obtained easily, whether we work with tape or with disc. The rate of transmission of data is enormous, as is the bandwidth required, and the error-detecting methods must be very much better and work very much more quickly than is needed for the familiar personal computers that are used to such a large extent today. That the whole business should have been solved so satisfactorily as to permit mass production is very satisfying, and even more satisfying is the point that there is just one worldwide CD standard, not the furiously competing systems that made video recording such a problem for the consumer in the early days.
For coding television signals, the same principles apply, but we do not attempt to convert the entire analog television signal into digital form, because we need only the video portion, not the synchronization pulses. Each position on the screen is represented by a binary number which carries the information on brightness and color. Pictures of the quality we are used to can be obtained using only 8 bits. Once again, the problems relate to the speed at which the information is sampled, and the methods used for digital television video signals are quite unlike those used for the older system, though the signals have to be converted to analog form before they are applied to the guns of a cathode-ray tube (CRT). Even this conversion becomes unnecessary when CRTs are replaced by color liquid crystal display (LCD) screens, because the IC that deals with television processing feeds a digital processor that drives the LCD dots.
A more detailed description and block diagram of CD replay systems is contained in Chapter 12.
Summary: Digital coding has the enormous advantage that various methods can be used to check that a signal has not been changed during storage or transmission. The simplest system uses parity, adding one extra bit to check the number of 1s in a byte. More elaborate systems can allow each bit to be checked, so that circuits at the far end can correct errors. In addition, using coding systems such as EFM can avoid sequences of bits that are difficult to transmit or record with perfect precision.