Fundamentals
2.1 What is an audio signal?
Actual sounds are converted to electrical signals for convenience of handling, recording and conveying from one place to another. This is the job of the microphone. There are two basic types of microphone, those which measure the variations in air pressure due to sound, and those which measure the air velocity due to sound, although there are numerous practical types which are a combination of both.
The sound pressure or velocity varies with time and so does the output voltage of the microphone, in proportion. The output voltage of the microphone is thus an analog of the sound pressure or velocity.
As sound causes no overall air movement, the average velocity of all sounds is zero, which corresponds to silence. As a result the bidirectional air movement gives rise to bipolar signals from the microphone, where silence is in the centre of the voltage range, and instantaneously negative or positive voltages are possible. Clearly the average voltage of all audio signals is also zero, and so when level is measured, it is necessary to take the modulus of the voltage, which is the job of the rectifier in the level meter. When this is done, the greater the amplitude of the audio signal, the greater the modulus becomes, and so a higher level is displayed. Whilst the nature of an audio signal is very simple, there are many applications of audio, each requiring different bandwidth and dynamic range.
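The role of the rectifier can be illustrated numerically. The following is a minimal sketch, assuming a pure sine wave as the test signal and a simple average of the modulus as the level measure; real level meters use various time constants which are not modelled here, and the function names are purely illustrative.

```python
import math

def mean_and_rectified_level(samples):
    """Return (plain average, average of the modulus) for a list of voltages."""
    n = len(samples)
    mean = sum(samples) / n
    rectified = sum(abs(v) for v in samples) / n
    return mean, rectified

def sine_cycle(amplitude, points=1000):
    """One full cycle of a sine wave centred on zero volts (silence)."""
    return [amplitude * math.sin(2 * math.pi * k / points) for k in range(points)]

quiet_mean, quiet_level = mean_and_rectified_level(sine_cycle(0.1))
loud_mean, loud_level = mean_and_rectified_level(sine_cycle(1.0))

# The plain averages are both essentially zero; only the rectified
# values distinguish the quiet signal from the loud one.
print(quiet_mean, quiet_level)
print(loud_mean, loud_level)
```

The plain average cannot distinguish the two signals, whereas the rectified value rises with amplitude.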
2.2 What is a video signal?
The goal of television is to allow a moving picture to be seen at a remote place. The picture is a two-dimensional image, which changes as a function of time. This is a three-dimensional information source where the dimensions are distance across the screen, distance down the screen and time. Whilst telescopes convey these three dimensions directly, this cannot be done with electrical signals or radio transmissions, which are restricted to a single parameter varying with time.
The solution in film and television is to convert the three-dimensional moving image into a series of still pictures, taken at the frame rate, and then, in television only, the two-dimensional images are scanned as a series of lines1 to produce a single voltage varying with time which can be digitized, recorded or transmitted. Europe, the Middle East and the former Soviet Union use the scanning standard of 625/50, whereas the USA and Japan use 525/59.94.
2.3 Types of video
Figure 2.1 shows some of the basic types of analog colour video. Each of these types can, of course, exist in a variety of line standards. Since practical colour cameras generally have three separate sensors, one for each primary colour, an RGB component system will exist at some stage in the internal workings of the camera, even if it does not emerge in that form. RGB consists of three parallel signals each having the same spectrum, and is used where the highest accuracy is needed, often for production of still pictures. Examples of this are paint systems and computer-aided design (CAD) displays. RGB is seldom used for real-time video recording.
Some compression can be obtained by using colour difference working. The human eye relies on brightness to convey detail, and much less resolution is needed in the colour information. R, G and B are matrixed together to form a luminance (and monochrome compatible) signal Y which has full bandwidth. The matrix also produces two colour difference signals, R–Y and B–Y, but these do not need the same bandwidth as Y; one half or one quarter will do, depending on the application. Colour difference signals represent an early application of perceptive coding; a saving in bandwidth is obtained by expressing the signals according to the way the eye operates.
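The matrixing can be sketched in a few lines. The luminance weights below are an assumption taken from ITU-R BT.601, since the text does not state them, and the function names are illustrative. No scaling or bandwidth reduction of the colour difference signals is attempted.

```python
# Luminance weights assumed from ITU-R BT.601; not given in the text.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_colour_difference(r, g, b):
    """Matrix R, G, B into Y, R-Y and B-Y."""
    y = KR * r + KG * g + KB * b
    return y, r - y, b - y

def colour_difference_to_rgb(y, r_y, b_y):
    """Inverse matrix: recover R, G, B from Y, R-Y and B-Y."""
    r = r_y + y
    b = b_y + y
    g = (y - KR * r - KB * b) / KG   # G is whatever remains of Y
    return r, g, b

# White (or any grey) produces zero colour difference signals, which is
# why Y alone is compatible with a monochrome display.
print(rgb_to_colour_difference(1.0, 1.0, 1.0))
```

G is not transmitted; it is recovered at the destination from Y and the two colour difference signals.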
Analog colour difference recorders such as Betacam and M II record these signals separately. The D-1 and D-5 formats record 525/60 or 625/50 colour difference signals digitally and Digital Betacam does so using compression. In casual parlance, colour difference formats are often called component formats to distinguish them from composite formats.
For colour television broadcast in a single channel, the PAL, SECAM and NTSC systems interleave into the spectrum of a monochrome signal a subcarrier which carries two colour difference signals of restricted bandwidth. As the bandwidth required for composite video is no greater than that of luminance, it can be regarded as a form of compression performed in the analog domain. The artifacts which composite video introduces and the inflexibility in editing resulting from the need to respect colour framing serve as a warning that compression is not without its penalties. The subcarrier is intended to be invisible on the screen of a monochrome television set. A subcarrier-based colour system is generally referred to as composite video, and the modulated subcarrier is called chroma.
It is not advantageous to compress composite video using modern transform-based coders as the transform process cannot identify redundancy in a subcarrier. Composite video compression is restricted to differential coding systems. Transform-based compression must use RGB or colour difference signals. As RGB requires excessive bandwidth it makes no sense to use it with compression and so in practice only colour difference signals, which have been bandwidth reduced by perceptive coding, are used in MPEG. Where signals to be compressed originate in composite form, they must be decoded first. The decoding must be performed as accurately as possible, with particular attention being given to the quality of the Y/C separation. The chroma in composite signals is deliberately designed to invert from frame to frame in order to lessen its visibility. Unfortunately any residual chroma in luminance will be interpreted by inter-field compression systems as temporal luminance changes which need to be reproduced. This eats up data which should be used to render the picture. Residual chroma also results in high horizontal and vertical spatial frequencies in each field which appear to be wanted detail to the compressor.
2.4 What is a digital signal?
One of the vital concepts to grasp is that digital audio and video are simply alternative means of carrying the same information as their analog counterparts. An ideal digital system has the same characteristics as an ideal analog system: both of them are totally transparent and reproduce the original applied waveform without error. Needless to say, in the real world ideal conditions seldom prevail, so analog and digital equipment both fall short of the ideal. Digital equipment simply falls short of the ideal to a smaller extent than does analog and at lower cost, or, if the designer chooses, can have the same performance as analog at much lower cost. Compression is one of the techniques used to lower the cost, but it has the potential to lower the quality as well.
Any analog signal source can be characterized by a given useful bandwidth and signal-to-noise ratio (SNR). Video signals have very wide bandwidth extending over several megahertz but require only 50 dB or so SNR, whereas audio signals require a bandwidth of only 20 kHz but need much better SNR.
Although there are a number of ways in which audio and video waveforms can be represented digitally, there is one system, known as pulse code modulation (PCM), which is in virtually universal use. Figure 2.2 shows how PCM works. Instead of being continuous, the time axis is represented in a discrete or stepwise manner. The waveform is not carried by continuous representation, but by measurement at regular intervals. This process is called sampling and the frequency with which samples are taken is called the sampling rate or sampling frequency Fs. The sampling rate is generally fixed and is not necessarily a function of any frequency in the signal, although in component video it will be line-locked for convenience. If every effort is made to rid the sampling clock of jitter, or time instability, every sample will be made at an exactly even time step. Clearly if there are any subsequent timebase errors, the instants at which samples arrive will be changed and the effect can be detected. If samples arrive at some destination with an irregular timebase, the effect can be eliminated by storing the samples temporarily in a memory and reading them out using a stable, locally generated clock. This process is called timebase correction which all properly engineered digital systems employ. It should be stressed that sampling is an analog process. Each sample still varies infinitely as the original waveform did.
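Timebase correction can be modelled very simply. This is a sketch only, assuming that samples arrive in the correct order with jittered arrival times and that the local read clock starts after a fixed delay; the function name, the units and the underflow handling are illustrative.

```python
from collections import deque

def timebase_correct(arrivals, period, delay):
    """Re-time jittered samples onto a stable local clock.

    arrivals: list of (arrival_time, sample) in arrival order.
    period:   period of the stable, locally generated read clock.
    delay:    time of the first read, allowing the memory to fill.
    Returns a list of (read_time, sample) on an exactly even time step.
    """
    memory = deque(arrivals)                 # temporary sample store
    output = []
    for k in range(len(arrivals)):
        read_time = delay + k * period       # stable clock instant
        arrival_time, sample = memory.popleft()
        if arrival_time > read_time:
            raise ValueError("underflow: delay too short for this jitter")
        output.append((read_time, sample))
    return output

jittered = [(0.02, "a"), (0.95, "b"), (2.10, "c")]
print(timebase_correct(jittered, period=1.0, delay=0.5))
```

The sample values are untouched; only the instants at which they are presented change, at the cost of a fixed delay.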
Figure 2.2 also shows that each sample is also discrete, or represented in a stepwise manner. The length of the sample, which will be proportional to the voltage of the waveform, is represented by a whole number. This process is known as quantizing and results in an approximation, but the size of the error can be controlled until it is negligible. If, for example, we were to measure the height of humans to the nearest metre, virtually all adults would register two metres high and obvious difficulties would result. These are generally overcome by measuring height to the nearest centimetre. Clearly there is no advantage in going further and expressing our height in a whole number of millimetres or even micrometres. An appropriate resolution can be found just as readily for audio or video, and greater accuracy is not beneficial. The link between quality and sample resolution is explored later in this chapter. The advantage of using whole numbers is that they are not prone to drift. If a whole number can be carried from one place to another without numerical error, it has not changed at all. By describing waveforms numerically, the original information has been expressed in a way which is better able to resist unwanted changes.
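Quantizing to whole numbers can be sketched as follows. This is a minimal example assuming a uniform, symmetrical quantizer with clipping at full scale; the wordlengths and function names are illustrative rather than taken from any standard.

```python
def quantize(voltage, full_scale=1.0, bits=8):
    """Represent a voltage by the nearest whole number."""
    steps = 2 ** (bits - 1) - 1              # largest positive code, 127 for 8 bits
    code = round(voltage / full_scale * steps)
    return max(-steps, min(steps, code))     # clip inputs beyond full scale

def dequantize(code, full_scale=1.0, bits=8):
    """Convert a whole number back to the voltage it stands for."""
    steps = 2 ** (bits - 1) - 1
    return code / steps * full_scale

voltage = 0.3
for bits in (8, 16):
    code = quantize(voltage, bits=bits)
    error = abs(dequantize(code, bits=bits) - voltage)
    print(bits, code, error)
```

The error never exceeds half a quantizing step, and each extra bit of wordlength halves the size of the step.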
Essentially, digital systems carry the original waveform numerically. The number of the sample is an analog of time, and the magnitude of the sample is an analog of the signal voltage. As both axes of the waveform are discrete, the waveform can be accurately restored from numbers as if it were being drawn on graph paper. If we require greater accuracy, we simply choose paper with smaller squares. Clearly more numbers are required and each one could change over a larger range.
Discrete numbers are used to represent the value of samples so that they can readily be transmitted or processed by binary logic. There are two ways in which binary signals can be used to carry sample data. When each digit of the binary number is carried on a separate wire this is called parallel transmission. The state of the wires changes at the sampling rate. This approach is used in the parallel video interfaces, as video needs a relatively short wordlength; eight or ten bits. Using multiple wires is cumbersome where a long wordlength is in use, and a single wire can be used where successive digits from each sample are sent serially. This is the definition of pulse code modulation. Clearly the clock frequency must now be higher than the sampling rate.
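The relationship between wordlength and serial clock frequency can be sketched as follows. The example assumes unsigned samples sent most significant bit first, with no framing or synchronizing bits; practical serial interfaces add such overheads, so the figure is a lower bound.

```python
def serialize(sample, wordlength):
    """Return the bits of an unsigned sample, most significant bit first."""
    return [(sample >> i) & 1 for i in range(wordlength - 1, -1, -1)]

def serial_clock_hz(sampling_rate_hz, wordlength):
    """Minimum bit clock needed to send every bit of every sample."""
    return sampling_rate_hz * wordlength

print(serialize(5, 8))
print(serial_clock_hz(48_000, 16))
```

In the parallel case the wires change state only at the sampling rate; in the serial case the single wire must change wordlength times faster.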
Digital signals of this form will be used as the input to compression systems and must also be output by the decoding stage in order that the signal can be returned to analog form. Figure 2.3 shows the stages involved. Between the coder and the decoder the signal is not PCM but will be in a format which is highly dependent on the kind of compression technique used. It will also be evident from Figure 2.3 where the signal quality of the system can be impaired. The PCM digital interfaces between the ADC and the coder and between the decoder and the DAC cause no loss of quality. Quality is determined by the ADC and by the performance of the coder. Generally, decoders do not cause significant loss of quality; they make the best of the data from the coder. Similarly, DACs cause little quality loss above that due to the ADC. In practical systems the loss of quality is dominated by the action of the coder. In communication theory, compression is known as source coding in order to distinguish it from the channel coding necessary reliably to send data down transmission or recording channels. This book is not concerned with channel coding but details can be found elsewhere.2
2.5 Sampling
Sampling is a process of periodic measurement which can take place in space or time and in several dimensions at once. Figure 2.4(a) shows that in temporal sampling the frequency of the signal to be sampled and the sampling rate Fs are measured in Hertz (Hz), the standard unit of temporal frequency. In still images such as photographs there is no temporal change and Figure 2.4(b) shows that the sampling is spatial. The sampling rate is now a spatial frequency. The absolute unit of spatial frequency is cycles per metre, although for imaging purposes cycles-per-millimetre is more practical.
If the human viewer is considered, none of these units is useful because they don’t take into account the viewing distance. The acuity of the eye is measured in cycles per degree. As Figure 2.4(c) shows, a large distant screen subtends the same angle as a small nearby screen. Figure 2.4(c) also shows that the nearby screen, possibly a computer monitor, needs to be able to display a higher spatial frequency than a distant cinema screen to give the same sharpness perceived at the eye. If the viewing distance is proportional to size, both screens could have the same number of pixels, leading to the use of a relative unit, shown in (d), which is cycles-per-picture-height (cph) in the vertical axis and cycles-per-picture-width (cpw) in the horizontal axis.
The computer screen has more cycles-per-millimetre than the cinema screen, but in this example has the same number of cycles-per-picture-height. Spatial and temporal frequencies are related by the process of scanning as given by:
Temporal frequency = spatial frequency × scanning velocity
Figure 2.5 shows that if the 1024 pixels along one line of an SVGA monitor were scanned in one tenth of a millisecond, the sampling clock frequency would be 10.24 MHz.
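The scanning relationship can be checked with the figures quoted for the monitor example. In this sketch the spatial frequency is expressed in pixels per picture width and the scanning velocity in picture widths per second, which is one convenient choice of units rather than the only one.

```python
def temporal_frequency_hz(spatial_frequency, scanning_velocity):
    """Temporal frequency = spatial frequency x scanning velocity."""
    return spatial_frequency * scanning_velocity

pixels_per_width = 1024
line_scan_time_s = 0.1e-3                    # one tenth of a millisecond
widths_per_second = 1 / line_scan_time_s     # scanning velocity

clock_hz = temporal_frequency_hz(pixels_per_width, widths_per_second)
print(clock_hz / 1e6, "MHz")
```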
Sampling theory does not require regular sample spacing, but it is the most efficient arrangement. As a practical matter if regular sampling is employed, the process of timebase correction can be used to eliminate any jitter due to recording or transmission.
The sampling process originates with a pulse train which is shown in Figure 2.6(a) to be of constant amplitude and period. This pulse train can be temporal or spatial. The information to be sampled amplitude-modulates the pulse train in much the same way as the carrier is modulated in an AM radio transmitter. One must be careful to avoid over-modulating the pulse train as shown in (b) and this is achieved by suitably biasing the information waveform as at (c).
Figure 2.7 shows a constant speed rotation viewed along the axis so that the motion is circular. Imagine, however, the view from one side in the plane of the rotation. From a distance, only a vertical oscillation will be observed and if the position is plotted against time the resultant waveform will be a sine wave. The sine wave is unique because it contains only a single frequency. All other waveforms contain more than one frequency.
Imagine a second viewer who is at right angles to the first viewer. He will observe the same waveform, but at a different time. The displacement is given by multiplying the radius by the cosine of the phase angle. When plotted on the same graph, the two waveforms are phase-shifted with respect to one another. In this case the phase-shift is 90° and the two waveforms are said to be in quadrature. Incidentally the motions on each side of a steam locomotive are in quadrature so that it can always get started (the term used is quartering). Note that the phase angle of a signal is constantly changing with time whereas the phase-shift between two signals can be constant. It is important that these two are not confused.
It was seen in Figure 2.7 that a sinusoidal function is a rotation resolved in one axis. In order to obtain a purely sinusoidal motion, the motion on the other axis must be eliminated. Conceptually this may be achieved by having a contra-rotating system in which there is one rotation at +ω and another at −ω. Figure 2.8(a) shows that the sine components of these two rotations will be in the same phase and will add, whereas the cosine components will be in anti-phase and will cancel. Thus all real frequencies actually contain equal amounts of positive and negative frequencies. These cannot be distinguished unless a modulation process takes place. Figure 2.8(b) shows that when two signals of frequency ±ω1 and ±ω2 are multiplied together, the result is that the rotations of each must be added. The result is four frequencies, ±(ω1 + ω2) and ±(ω1 − ω2), one of which is the sum of the input frequencies and one of which is the difference between them. These are called sidebands.
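The sum and difference frequencies can be confirmed numerically. The sketch below assumes two cosine waves of arbitrary frequency and compares their product with the sum of the two sidebands, using the standard product-to-sum identity; the frequencies chosen are illustrative.

```python
import math

def product(f1, f2, t):
    """Two real signals multiplied together (a modulation process)."""
    return math.cos(2 * math.pi * f1 * t) * math.cos(2 * math.pi * f2 * t)

def sidebands(f1, f2, t):
    """Half-amplitude components at the sum and difference frequencies."""
    upper = math.cos(2 * math.pi * (f1 + f2) * t)
    lower = math.cos(2 * math.pi * (f1 - f2) * t)
    return 0.5 * (upper + lower)

f1, f2 = 1000.0, 150.0
worst = max(abs(product(f1, f2, k * 1e-5) - sidebands(f1, f2, k * 1e-5))
            for k in range(1000))
print(worst)
```

The two expressions agree to within rounding error at every instant tested: the product contains no energy at the original frequencies, only at their sum and difference.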
Sampling is a modulation process and also produces sidebands although the carrier is now a pulse train and has an infinite series of harmonics as shown in Figure 2.9(a). The sidebands repeat above and below each harmonic of the sampling rate as shown in (b). The consequence of this is that sampling does not alter the spectrum of the baseband signal at all. The spectrum is simply repeated. Consequently sampling need not lose any information.
The sampled signal can be returned to the continuous domain simply by passing it into a low-pass filter. This filter has a frequency response which prevents the images from passing, and only the baseband signal emerges, completely unchanged. If considered in the frequency domain, this filter can be called an anti-image filter; if considered in the time domain it can be called a reconstruction filter. It can also be considered as a spatial filter if a sampled still image is being returned to a continuous image. Such a filter will be two dimensional.
If an input is supplied having an excessive bandwidth for the sampling rate in use, the sidebands will overlap (Figure 2.9(c)) and the result is aliasing, where certain output frequencies are not the same as their input frequencies but instead become difference frequencies (Figure 2.9(d)). It will be seen from Figure 2.9 that aliasing does not occur when the input bandwidth is equal to or less than half the sampling rate, and this derives the most fundamental rule of sampling, which is that the sampling rate must be at least twice the input bandwidth. Nyquist3 is generally credited with being the first to point out the need for sampling at twice the highest frequency in the signal in 1928, although the mathematical proofs were given independently by Shannon4,5 and Kotelnikov. It subsequently transpired that Whittaker6 beat them all to it, although his work was not widely known at the time. One half of the sampling frequency is often called the Nyquist frequency.
Whilst aliasing has been described above in the frequency domain, it can be described equally well in the time domain. In Figure 2.9(a) the sampling rate is obviously adequate to describe the waveform, but at (b) it is inadequate and aliasing has occurred. In some cases there is no control over the spectrum of input signals and in this case it becomes necessary to have a low-pass filter at the input to prevent aliasing. This anti-aliasing filter prevents frequencies of more than half the sampling rate from reaching the sampling stage.
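Aliasing can be demonstrated in both domains with a short sketch. The sampling rate of 48 kHz and the input of 30 kHz are illustrative values, and the function names are hypothetical.

```python
import math

def alias_frequency(f_in, fs):
    """Frequency between 0 and fs/2 at which an input at f_in emerges."""
    f = f_in % fs                            # sidebands repeat every fs
    return fs - f if f > fs / 2 else f       # fold about the Nyquist frequency

def sample_cosine(frequency, fs, count):
    """Instantaneous samples of a cosine wave taken at rate fs."""
    return [math.cos(2 * math.pi * frequency * n / fs) for n in range(count)]

fs = 48_000
f_alias = alias_frequency(30_000, fs)        # 30 kHz exceeds fs/2
too_high = sample_cosine(30_000, fs, 8)
aliased = sample_cosine(f_alias, fs, 8)

# The two sets of samples are indistinguishable once sampling has occurred.
print(f_alias, max(abs(a - b) for a, b in zip(too_high, aliased)))
```

Because the samples of the two waveforms are identical, no subsequent process can tell them apart, which is why the filtering has to take place before the sampling stage.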
Figure 2.10 shows that all practical sampling systems consist of a pair of filters, the anti-aliasing filter before the sampling process and the reconstruction filter after it. It should be clear that the results obtained will be strongly affected by the quality of these filters which may be spatial or temporal according to the application.
2.6 Reconstruction
Perfect reconstruction was theoretically demonstrated by Shannon as shown in Figure 2.11. The input must be band limited by an ideal linear phase low-pass filter with a rectangular frequency response and a bandwidth of one-half the sampling frequency. The samples must be taken at an instant with no averaging of the waveform. These instantaneous samples can then be passed through a second, identical filter which will perfectly reconstruct that part of the input waveform which was within the passband.
There are some practical difficulties in implementing Figure 2.11 exactly, but well-engineered systems can approach it and so it forms a useful performance target. The impulse response of a linear-phase ideal low-pass filter is a sin x/x waveform as shown in Figure 2.12(a). Such a waveform passes through zero volts periodically. If the cut-off frequency of the filter is one-half of the sampling rate, the impulse passes through zero at the sites of all other samples. It can be seen from Figure 2.12(b) that at the output of such a filter, the voltage at the centre of a sample is due to that sample alone, since the value of all other samples is zero at that instant. In other words the continuous output waveform must pass through the tops of the input samples. In between the sample instants, the output of the filter is the sum of the contributions from many impulses (theoretically an infinite number), causing the waveform to pass smoothly from sample to sample.
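The behaviour of the sin x/x impulse response can be sketched as follows. Time is measured in sample periods, and the sum is taken over a short, finite list of samples, so values between the sample instants are only an approximation to the theoretically infinite sum.

```python
import math

def sinc(x):
    """sin(pi x)/(pi x): impulse response of the ideal filter, time in sample periods."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Filter output at time t as the sum of one impulse response per sample."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

samples = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]

# At a sample instant every other impulse passes through zero, so the
# output equals that sample alone.
print(reconstruct(samples, 2))

# Between instants many impulses contribute to the output.
print(reconstruct(samples, 2.5))
```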
It is a consequence of the band-limiting of the original anti-aliasing filter that the filtered analog waveform could only take one path between the samples. As the reconstruction filter has the same frequency response, the reconstructed output waveform must be identical to the original band-limited waveform prior to sampling. A rigorous mathematical proof of reconstruction can be found in Porat7 or Betts.8
Perfect reconstruction with a Nyquist sampling rate is a limiting condition which cannot be exceeded and can only be reached under ideal and impractical conditions. Thus in practice Nyquist rate sampling can only be approached. Zero-duration pulses are impossible and the ideal linear-phase filter with a vertical ‘brick-wall’ cut-off slope is impossible to implement. In the case of temporal sampling, as the slope tends to the vertical, the delay caused by the filter goes to infinity. In the case of spatial sampling, sharp-cut optical filters are impossible to build. Figure 2.13 shows that the spatial impulse response of an ideal lens is a symmetrical intensity function. Note that the function is positive only as the expression for intensity contains a squaring process. The negative excursions of the sin x/x curve can be handled in an analog or digital filter by negative voltages or numbers, but in optics there is no negative light. The restriction to positive-only impulse response limits the sharpness of optical filters.
In practice real filters with finite slopes can still be used. The cut-off slope begins at the edge of the required pass band, and because the slope is not vertical, aliasing will always occur. However, it can be seen from Figure 2.14 that the sampling rate can be raised to drive aliasing products to an arbitrarily low level. The perfect reconstruction process still works, but the system is a little less efficient in information terms because the sampling rate has to be raised. There is no absolute factor by which the sampling rate must be raised. A figure of 10 per cent is typical in temporal sampling, although it depends upon the filters which are available and the level of aliasing products which is acceptable.
There is another difficulty which is that the requirement for linear phase means the impulse response of the filter must be symmetrical. In the time domain, such filters cannot be causal because the output has to begin before the input occurs. A filter with a finite slope has a finite window and so a linear-phase characteristic can be obtained by incorporating a delay of one-half the window period so that the filter can be causal. This concept will be expanded in Chapter 3.
2.7 Aperture effect
In practical sampling systems the sample impulse cannot be infinitely small in time or space. Figure 2.15 shows that real equipment may produce impulses whose possible shapes include rectangular and Gaussian. The result is an aperture effect where the frequency response of the sampling system is modified. The new response is the Fourier transform of the aperture function.
In the case where the pulses are rectangular, the proportion of the sample period occupied by the pulse is defined as the aperture ratio which is normally expressed as a percentage.
The case where the pulses have been extended in width to become equal to the sample period is known as a zero-order hold (ZOH) system and has a 100 per cent aperture ratio as shown in Figure 2.16(a). This produces a waveform which is more like a staircase than a pulse train.
To see how the use of ZOH compares with ideal Shannon reconstruction, it must be recalled that pulses of negligible width have a uniform spectrum and so the frequency response of the sampler and reconstructor is flat within the pass band. In contrast, pulses of 100 per cent aperture ratio have a sin x/x spectrum which falls to a null at the sampling rate, and as a result is about 4 dB down at the Nyquist frequency as shown in Figure 2.16(b).
Figure 2.17(a) shows how ZOH is normally represented in texts with the pulses extending to the right of the sample. This representation is incorrect because it does not have linear phase as can be seen in (b). Figure 2.17(c) shows the correct representation where the pulses are extended symmetrically about the sample to achieve linear phase (d). This is conceptually easy if the pulse generator is considered to cause a half-sample-period delay relative to the original waveform. If the pulse width is stable, the reduction of high frequencies is constant and predictable, and an appropriate filter response shown in (e) can render the overall response flat once more. Note that the equalization filter in (e) is conceptually a low-pass reconstruction filter in series with an inverse sin x/x response.
An alternative in the time domain is to use resampling which is shown in Figure 2.18. Resampling passes the zero-order hold waveform through a further synchronous sampling stage which consists of an analog switch that closes briefly in the centre of each sample period. The output of the switch will be pulses which are narrower than the original. If, for example, the aperture ratio is reduced to 50 per cent of the sample period, the first frequency response null is now at twice the sampling rate, and the loss at the edge of the pass band is reduced. As the figure shows, the frequency response becomes flatter as the aperture ratio falls. The process should not be carried too far, as with very small aperture ratios there is little energy in the pulses and noise can be a problem. A practical limit is around 12.5 per cent where the frequency response is virtually ideal.
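The effect of aperture ratio on the loss at the edge of the pass band can be estimated from the sin x/x response of a rectangular pulse. This is a sketch assuming ideal rectangular pulses and a response normalized to 0 dB at low frequencies; the function name is illustrative.

```python
import math

def aperture_loss_db(frequency, sampling_rate, aperture_ratio=1.0):
    """Response in dB of a rectangular aperture, relative to low frequencies."""
    x = math.pi * aperture_ratio * frequency / sampling_rate
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))

fs = 48_000                                  # illustrative sampling rate
nyquist = fs / 2
for ratio in (1.0, 0.5, 0.125):
    print(ratio, round(aperture_loss_db(nyquist, fs, ratio), 2), "dB")
```

Under these assumptions the loss at the Nyquist frequency is about 3.9 dB at 100 per cent aperture ratio, about 0.9 dB at 50 per cent and under 0.1 dB at 12.5 per cent.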
It should be stressed that in real systems there will often be more than one aperture effect. The result is that the frequency responses of the various aperture effects multiply, which is the same as saying that their impulse responses convolve. Whatever fine words are used, the result is an increasing loss of high frequencies where a series of acceptable devices when cascaded produce an unacceptable result.
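The way in which cascaded responses accumulate can be shown with simple arithmetic. The gains below are hypothetical figures for three devices, each less than 1 dB down at the band edge, and are not measurements of any real equipment.

```python
import math

def to_db(gain):
    """Express a voltage gain in decibels."""
    return 20 * math.log10(gain)

def cascade(gains):
    """Overall gain of devices in series: the individual responses multiply."""
    total = 1.0
    for gain in gains:
        total *= gain
    return total

stage_gains = [0.9, 0.9, 0.9]                # hypothetical band-edge gains
print(to_db(stage_gains[0]), to_db(cascade(stage_gains)))
```

Multiplying the gains is the same as adding the losses in decibels, so three devices which are individually acceptable give an overall loss three times as great.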
In many systems, for reasons of economy or ignorance, reconstruction is simply not used and the system output is an unfiltered ZOH waveform. Figure 2.19 shows some examples of this kind of thing which are associated with the ‘digital look’. It is important to appreciate that in well-engineered systems containing proper filters there is no such thing as the digital look.
2.8 Choice of audio sampling rate
The Nyquist criterion is only the beginning of the process which must be followed to arrive at a suitable sampling rate. The slope of available filters will compel designers to raise the sampling rate above the theoretical Nyquist rate. For consumer products, the lower the sampling rate, the better, since the cost of the medium or channel is directly proportional to the sampling rate: thus sampling rates near to twice 20 kHz are to be expected.
Where very low bit rate compression is to be used, better results may be obtained by reducing the sampling rate so that the compression factor is not as great.
For professional products, there is a need to operate at variable speed for pitch correction. When the speed of a digital recorder is reduced, the offtape sampling rate falls, and Figure 2.20 shows that with a minimal sampling rate the first image frequency can become low enough to pass the reconstruction filter. If the sampling frequency is raised without changing the response of the filters, the speed can be reduced without this problem. It follows that variable-speed recorders, generally those with stationary heads, must use a higher sampling rate.
In the early days of digital audio, video recorders were adapted to store audio samples by creating a pseudo-video waveform which could convey binary as black and white levels.9 The sampling rate of such a system is constrained to relate simply to the field rate and field structure of the television standard used, so that an integer number of samples can be stored on each usable TV line in the field. Such a recording can be made on a monochrome recorder, and these recordings are made in two standards, 525 lines at 60 Hz and 625 lines at 50 Hz. Thus it was necessary to find a frequency which is a common multiple of the two and also suitable for use as a sampling rate.
The allowable sampling rates in a pseudo-video system can be deduced by multiplying the field rate by the number of active lines in a field (blanked lines cannot be used) and again by the number of samples in a line. By careful choice of parameters it is possible to use either 525/60 or 625/50 video with a sampling rate of 44.1 kHz.
In 60 Hz video, there are 35 blanked lines, leaving 490 lines per frame, or 245 lines per field for samples. If three samples are stored per line, the sampling rate becomes
60 × 245 × 3 = 44.1 kHz
In 50 Hz video, there are 37 lines of blanking, leaving 588 active lines per frame, or 294 per field, so the same sampling rate is given by
50 × 294 × 3 = 44.1 kHz
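The two calculations can be expressed as one function. The line counts and blanking figures are those quoted above; the function and parameter names are illustrative.

```python
def pseudo_video_sampling_rate(field_rate_hz, total_lines, blanked_lines,
                               samples_per_line):
    """Sampling rate in Hz of audio stored on the active lines of a video signal."""
    active_per_frame = total_lines - blanked_lines
    active_per_field = active_per_frame // 2     # two fields per frame
    return field_rate_hz * active_per_field * samples_per_line

rate_60 = pseudo_video_sampling_rate(60, 525, 35, 3)
rate_50 = pseudo_video_sampling_rate(50, 625, 37, 3)
print(rate_60, rate_50)
```

Both standards arrive at the same figure of 44 100 Hz.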
The sampling rate of 44.1 kHz came to be that of the Compact Disc. Even though CD has no video circuitry, the equipment used to make CD masters was originally video-based and determined the sampling rate.
For landlines to FM stereo broadcast transmitters having a 15 kHz audio bandwidth, the sampling rate of 32 kHz is more than adequate, and has been in use for some time in the United Kingdom and Japan. This frequency is also used in the NICAM 728 stereo TV sound system, in DVB audio and in DAB. The professional sampling rate of 48 kHz was proposed as having a simple relationship to 32 kHz, being far enough above 40 kHz for variable-speed operation, and having a simple relationship with 50 Hz field rate video which would allow digital video recorders to store the convenient number of 960 audio samples per video field. This is the sampling rate used by all production DVTRs. The field rate offset of 59.94 Hz video does not easily relate to any of the above sampling rates, and requires special handling which is outside the scope of this book.10
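The number of audio samples per video field can be worked out exactly using rational arithmetic. The sketch assumes that the nominal 59.94 Hz field rate is exactly 60000/1001 Hz, a value which is not stated in the text.

```python
from fractions import Fraction

def samples_per_field(sampling_rate_hz, field_rate_hz):
    """Exact number of audio samples in one video field."""
    return Fraction(sampling_rate_hz) / Fraction(field_rate_hz)

FIELD_RATE_5994 = Fraction(60000, 1001)      # assumed exact value of 59.94 Hz

for rate in (32_000, 44_100, 48_000):
    print(rate, samples_per_field(rate, 50), samples_per_field(rate, FIELD_RATE_5994))
```

At 50 Hz each rate gives a whole number of samples per field. Under the stated assumption, 48 kHz audio with 59.94 Hz video gives 4004/5 samples per field, so a whole number of samples is only reached after five fields.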
Although in a perfect world the adoption of a single sampling rate might have had virtues, for practical and economic reasons digital audio now has essentially three rates to support: 32 kHz for broadcast, 44.1 kHz for CD, and 48 kHz for professional use.11 In MPEG these audio sampling rates may be halved for low bit rate applications with a corresponding loss of audio bandwidth.
2.9 Video sampling structures
Component or colour difference signals are used primarily for postproduction work where quality and flexibility are paramount. In colour difference working, the important requirement is for image manipulation in the digital domain. This is facilitated by a sampling rate which is a multiple of line rate because then there is a whole number of samples in a line and samples are always in the same position along the line and can form neat columns. A practical difficulty is that the line period of the 525 and 625 systems is slightly different. The problem was overcome by the use of a sampling clock which is an integer multiple of both line rates. ITU-601 (formerly CCIR-601) recommends the use of certain sampling rates which are based on integer multiples of the carefully chosen fundamental frequency of 3.375 MHz. This frequency is normalized to 1 in the document.
In order to sample 625/50 luminance signals without quality loss, the lowest multiple possible is 4 which represents a sampling rate of 13.5 MHz. This frequency line-locks to give 858 samples per line period in 525/59.94 and 864 samples per line period in 625/50.
In the component analog domain, the colour difference signals used for production purposes typically have one half the bandwidth of the luminance signal. Thus a sampling rate multiple of 2 is used and results in 6.75 MHz. This sampling rate allows respectively 429 and 432 samples per line. Component video sampled in this way has a 4:2:2 format. Whilst other combinations are possible, 4:2:2 is the format for which the majority of digital component production equipment is constructed. The D-1, D-5, D-9, SX and Digital Betacam DVTRs operate with 4:2:2 format data. Figure 2.21 shows the spatial arrangement given by 4:2:2 sampling.
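The line-locked sample counts quoted above can be checked with exact rational arithmetic. The following is a minimal sketch: the names are illustrative, and the line rates are derived from assumed nominal frame rates of 25 Hz for 625/50 and 30000/1001 Hz for 525/59.94, which are not stated in the text.

```python
from fractions import Fraction

FUNDAMENTAL_HZ = Fraction(3_375_000)   # ITU-601 fundamental frequency

# Line rate = lines per frame x frames per second (assumed nominal values).
LINE_RATE_HZ = {
    "625/50": Fraction(625 * 25),
    "525/59.94": Fraction(525 * 30000, 1001),
}

def samples_per_line(multiple, standard):
    """Samples in one line period when sampling at multiple x 3.375 MHz."""
    return FUNDAMENTAL_HZ * multiple / LINE_RATE_HZ[standard]

for standard in LINE_RATE_HZ:
    luma = samples_per_line(4, standard)      # 13.5 MHz luminance
    chroma = samples_per_line(2, standard)    # 6.75 MHz colour difference
    print(standard, luma, chroma)
```

Both results are whole numbers in both standards, which is the property that allows samples to fall in the same position on every line.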
Luminance samples appear at half the spacing of colour difference samples, and every other luminance sample is co-sited with a pair of colour difference samples. Co-siting is important because it allows all attributes of one picture point to be conveyed with a three-sample vector quantity. Modification of the three samples allows such techniques as colour correction to be performed. This would be difficult without co-sited information. Co-siting is achieved by clocking the three ADCs simultaneously.
For lower bandwidths, particularly in prefiltering operations prior to compression, the sampling rate of the colour difference signal can be halved. 4:1:1 delivers colour bandwidth in excess of that required by the composite formats and is used in 60 Hz DV camcorder formats.
In 4:2:2 the colour difference signals are sampled horizontally at half the luminance sampling rate, yet the vertical colour difference sampling rates are the same as for luma. Whilst this is not a problem in a production application, this disparity of sampling rates represents a data rate overhead which is undesirable in a compression environment. In this case it is possible to halve the vertical sampling rate of the colour difference signals as well, producing a format known as 4:2:0.
This topic is the source of considerable confusion. In MPEG-1, which was designed for a low bit rate, the single ideal 4:2:0 subsampling strategy of Figure 2.22(a) was used. The colour data are vertically low-pass filtered to the same bandwidth in two dimensions and interpolated so that the remaining colour pixels are equidistant from the source pixels.
This has the effect of symmetrically disposing the colour information with respect to the luma. When MPEG-2 was developed, it was a requirement to support 4:4:4, 4:2:2 and 4:2:0 colour structures. Figure 2.22(b) shows that in MPEG-2, the colour difference samples require no horizontal interpolation in 4:4:4 or 4:2:2 and so for consistency they don’t get it in 4:2:0 either. As a result there are two different 4:2:0 structures, one for MPEG-1 and one for MPEG-2 and MPEG-4. In vertical subsampling a new virtual raster has been created for the chroma samples. At the decoder a further interpolation will be required to put the decoded chroma data back onto the display raster. This causes a small generation loss which is acceptable for a single-generation codec as used in broadcasting, but not in multi-generation codecs needed for production. This problem was one of the reasons for the development of the 4:2:2 Profile of MPEG-2 which avoids the problem by retaining the colour information on every line. The term chroma format factor will be found in connection with colour downsampling. This is the factor by which the addition of colour difference data increases the source bit rate with respect to monochrome. For example, 4:2:0 has a factor of 1.5 whereas 4:2:2 has a factor of 2.
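The chroma format factor follows directly from the subsampling ratios. The sketch below assumes that each of the two colour difference signals is subsampled by the same horizontal and vertical divisors; the function name and the table of divisors are illustrative only.

```python
def chroma_format_factor(h_div, v_div):
    """Source bit rate relative to monochrome: one luminance signal plus
    two colour difference signals, each reduced by h_div x v_div."""
    return 1 + 2 / (h_div * v_div)

# (horizontal divisor, vertical divisor) for each colour structure
FORMATS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:1:1": (4, 1),
    "4:2:0": (2, 2),
}

for name, (h_div, v_div) in FORMATS.items():
    print(name, chroma_format_factor(h_div, v_div))
```

The values for 4:2:0 and 4:2:2 agree with the factors of 1.5 and 2 given in the text.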
The sampling rates of ITU-601 are based on commonality between 525- and 625-line systems. However, the consequence is that the pixel spacing is different in the horizontal and vertical axes. This is incompatible with computer graphics, in which so-called ‘square’ pixels are used. ‘Square’ means that the horizontal and vertical spacing is the same, giving the same resolution in both axes. High-definition TV and computer graphics formats universally use ‘square’ pixels. MPEG can handle various pixel aspect ratios and allows a control code to be embedded in the sequence header to help the decoder.
2.10 The phase-locked loop
All digital video systems need to be clocked at the appropriate rate in order to function properly. Whilst a clock may be obtained from a fixed frequency oscillator such as a crystal, many operations in video require genlocking or synchronizing the clock to an external source.
In phase-locked loops, the oscillator can run at a range of frequencies according to the voltage applied to a control terminal. This is called a voltage-controlled oscillator or VCO. Figure 2.23 shows that the VCO is driven by a phase error measured between the output and some reference. The error changes the control voltage in such a way that the error is reduced, so that the output eventually has the same frequency as the reference. A low-pass filter is fitted in the control voltage path to prevent the loop becoming unstable. If a divider is placed between the VCO and the phase comparator, as in the figure, the VCO frequency can be made to be a multiple of the reference. This also has the effect of making the loop more heavily damped, so that it is less likely to change frequency if the input is irregular.
In digital video, the frequency multiplication of a phase-locked loop is extremely useful. Figure 2.24 shows how the 13.5 MHz clock of component digital video and the 27 MHz master clock of MPEG are obtained from the sync pulses of an analog reference by such a multiplication process.
The numerically locked loop is a digital relative of the phase-locked loop. Figure 2.25 shows that the input is an intermittently transmitted value from a counter. The input count is compared with the value of a local count and the difference is used to control the frequency of a local oscillator. Once lock is achieved, the local oscillator and the remote oscillator will run at exactly the same frequency even though there is no continuous link between them.
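The behaviour of a numerically locked loop can be illustrated with a small simulation. This is only a sketch under stated assumptions: the proportional-plus-integral control law, the gains `kp` and `ki`, the update interval and all names are hypothetical choices made for the example, and the wrapping of real finite-length counters is ignored.

```python
def simulate_nll(remote_hz, free_run_hz, interval_s=0.1, updates=200,
                 kp=2.0, ki=5.0):
    """Return the local oscillator frequency after a number of count updates."""
    remote_count = 0.0
    local_count = 0.0
    integral = 0.0
    local_hz = free_run_hz
    for _ in range(updates):
        # Both counters advance between intermittently transmitted values.
        remote_count += remote_hz * interval_s
        local_count += local_hz * interval_s
        # The difference between the counts steers the local oscillator.
        error = remote_count - local_count
        integral += error * interval_s
        local_hz = free_run_hz + kp * error + ki * integral
    return local_hz

locked = simulate_nll(remote_hz=27_000_000.0, free_run_hz=26_999_500.0)
print(locked)
```

With these assumed gains the loop settles so that the local frequency matches the remote one, even though the only information exchanged is an occasional count value.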
2.11 Quantizing
Quantizing is the process of expressing some infinitely variable quantity by discrete or stepped values. Quantizing turns up in a remarkable number of everyday guises. Figure 2.26 shows that an inclined ramp enables infinitely variable height to be achieved, whereas a stepladder allows only discrete heights to be had. A stepladder quantizes height. When accountants round off sums of money to the nearest pound or dollar they are quantizing.
In audio the values to be quantized are infinitely variable voltages from an analog source. Strict quantizing is a process which is restricted to the voltage domain only. For the purpose of studying the quantizing of a single sample, time is assumed to stand still. This is achieved in practice either by the use of a track/hold circuit or the adoption of a quantizer technology which operates before the sampling stage.
Figure 2.27(a) shows that the process of quantizing divides the voltage range up into quantizing intervals Q. In applications such as telephony these may be of differing size, but for digital audio and video the quantizing intervals are made as identical as possible. If this is done, the binary numbers which result are truly proportional to the original analog voltage, and the digital equivalents of filtering and gain changing can be performed by adding and multiplying sample values. If the quantizing intervals are unequal this cannot be done. When all quantizing intervals are the same, the term uniform quantizing is used. The term linear quantizing will be found, but this is, like military intelligence, a contradiction in terms.
The term LSB (least significant bit) will also be found in place of quantizing interval in some treatments, but this is a poor term because quantizing is not always used to create binary values and because a bit can only have two values. In studying quantizing we wish to discuss values smaller than a quantizing interval, but a fraction of an LSB is a contradiction in terms.
Whatever the exact voltage of the input signal, the quantizer will determine the quantizing interval in which it lies. In what may be considered a separate step, the quantizing interval is then allocated a code value which is typically some form of binary number. The information sent is the number of the quantizing interval in which the input voltage lay. Exactly where that voltage lay within the interval is not conveyed, and this mechanism puts a limit on the accuracy of the quantizer. When the number of the quantizing interval is converted back to the analog domain, it will result in a voltage at the centre of the quantizing interval as this minimizes the magnitude of the error between input and output. The number range is limited by the wordlength of the binary numbers used. In a sixteen-bit system commonly used for audio, 65 536 different quantizing intervals exist, whereas video systems typically have eight-bit systems having 256 quantizing intervals.
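The two steps described above, finding the interval and then returning to the centre of that interval, can be sketched as follows. The function names and the choice of Q = 0.25 V are illustrative assumptions, and clipping at the ends of the number range is deliberately left out.

```python
import math

def quantize(voltage, q):
    """Number of the quantizing interval in which the voltage lies.
    Interval n is centred on n * q, so zero volts is mid-interval."""
    return math.floor(voltage / q + 0.5)

def reconstruct(interval, q):
    """Ideal DAC: the voltage at the centre of the numbered interval."""
    return interval * q

def intervals_available(wordlength):
    """Count of distinct intervals expressible in a given wordlength."""
    return 2 ** wordlength

q = 0.25
for v in (-0.30, 0.0, 0.10, 0.13, 0.37):
    n = quantize(v, q)
    print(v, n, reconstruct(n, q), v - reconstruct(n, q))

print(intervals_available(16), intervals_available(8))
```

The last column printed is the quantizing error, whose magnitude never exceeds half an interval because the output is always placed at the interval centre.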
2.12 Quantizing error
It is possible to draw a transfer function for such an ideal quantizer followed by an ideal DAC, and this is shown in Figure 2.27(b). A transfer function is simply a graph of the output with respect to the input. When the term linearity is used, this generally means the straightness of the transfer function. Linearity is a goal in audio and video, yet it will be seen that an ideal quantizer is anything but. Quantizing causes a voltage error in the sample which cannot exceed ±½Q unless the input is so large that clipping occurs. Figure 2.27(b) shows the transfer function is somewhat like a staircase, and the voltage corresponding to audio muting or video blanking is half-way up a quantizing interval, or on the centre of a tread. This is the so-called mid-tread quantizer which is universally used in audio and video. Figure 2.27(c) shows the alternative mid-riser transfer function which causes difficulty because it does not have a code value at muting/blanking level and as a result the code value is not proportional to the signal voltage.
In studying the transfer function it is better to avoid complicating matters with the aperture effect of the DAC. For this reason it is assumed here that output samples are of negligible duration. Then impulses from the DAC can be compared with the original analog waveform and the difference will be impulses representing the quantizing error waveform. As can be seen in Figure 2.28, the quantizing error waveform can be thought of as an unwanted signal which the quantizing process adds to the perfect original. As the transfer function is non-linear, ideal quantizing can cause distortion. As a result practical digital audio devices use non-ideal quantizers to achieve linearity. The quantizing error of an ideal quantizer is a complex function, and it has been researched in great depth.12
As the magnitude of the quantizing error is limited, its effect can be minimized by making the signal larger. This will require more quantizing intervals and more bits to express them. The number of quantizing intervals multiplied by their size gives the quantizing range of the convertor. A signal outside the range will be clipped. Clearly if clipping is avoided, the larger the signal, the less will be the effect of the quantizing error.
Consider first the case where the input signal exercises the whole quantizing range and has a complex waveform. In audio this might be orchestral music; in video a bright, detailed contrasty scene. In these cases successive samples will have widely varying numerical values and the quantizing error on a given sample will be independent of that on others. In this case the size of the quantizing error will be distributed with equal probability between the limits.
Figure 2.28(c) shows the resultant uniform probability density. In this case the unwanted signal added by quantizing is an additive broadband noise uncorrelated with the signal, and it is appropriate in this case to call it quantizing noise. This is not quite the same as thermal noise which has a Gaussian probability shown in Figure 2.28(d). The subjective difference is slight.
Treatments which then assume that quantizing error is always noise give results which are at variance with reality. Such approaches only work if the probability density of the quantizing error is uniform. Unfortunately at low levels, and particularly with pure or simple waveforms, this is simply not true. At low levels, quantizing error ceases to be random, and becomes a function of the input waveform and the quantizing structure. Once an unwanted signal becomes a deterministic function of the wanted signal, it has to be classed as a distortion rather than a noise. We predicted a distortion because of the non-linearity or staircase nature of the transfer function. With a large signal, there are so many steps involved that we must stand well back, and a staircase with many steps appears to be a slope. With a small signal there are few steps and they can no longer be ignored.
The non-linearity of the transfer function results in distortion, which produces harmonics. Unfortunately these harmonics are generated after the anti-aliasing filter, and so any which exceed half the sampling rate will alias. Figure 2.29 shows how this results in anharmonic distortion in audio. These anharmonics result in spurious tones known as birdsinging.
When the sampling rate is a multiple of the input frequency the result is harmonic distortion. Where more than one frequency is present in the input, intermodulation distortion occurs, which is known as granulation.
As the input signal is further reduced in level, it may remain within one quantizing interval. The output will be silent because the signal is now the quantizing error. In this condition, low-frequency signals such as air-conditioning rumble can shift the input in and out of a quantizing interval so that the quantizing distortion comes and goes, resulting in noise modulation.
In video, quantizing error in luminance results in visible contouring on low-key scenes or flat fields. Slowly changing brightness across the screen is replaced by areas of constant brightness separated by sudden steps. In colour difference signals, contouring results in an effect known as posterization where subtle variations in colour are removed and large areas are rendered by the same colour as if they had been painted by numbers.
2.13 Dither
At high signal level, quantizing error is effectively noise. As the level falls, the quantizing error of an ideal quantizer becomes more strongly correlated with the signal and the result is distortion. If the quantizing error can be decorrelated from the input in some way, the system can remain linear. Dither performs the job of decorrelation by making the action of the quantizer unpredictable.
The first documented use of dither was in picture coding.12 In this system, the noise added prior to quantizing was subtracted after reconversion to analog. This is known as subtractive dither. Although subsequent subtraction has some slight advantages13 it suffers from practical drawbacks, since the original noise waveform must accompany the samples or must be synchronously re-created at the DAC. This is virtually impossible in a system where the signal may have been edited. Practical systems use non-subtractive dither where the dither signal is added prior to quantization and no subsequent attempt is made to remove it. The introduction of dither inevitably causes a slight reduction in the signal-to-noise ratio attainable, but this reduction is a small price to pay for the elimination of non-linearities. As linearity is an essential requirement for digital audio and video, the use of dither is equally essential.
The ideal (noiseless) quantizer of Figure 2.28 has fixed quantizing intervals and must always produce the same quantizing error from the same signal. In Figure 2.30 it can be seen that an ideal quantizer can be dithered by linearly adding a controlled level of noise either to the input signal or to the reference voltage which is used to derive the quantizing intervals. There are several ways of considering how dither works, all of which are valid.
The addition of dither means that successive samples effectively find the quantizing intervals in different places on the voltage scale. The quantizing error becomes a function of the dither, rather than just a function of the input signal. The quantizing error is not eliminated, but the subjectively unacceptable distortion is converted into broadband noise which is more benign. An alternative way of looking at dither is to consider the situation where a low-level input signal is changing slowly within a quantizing interval. Without dither, the same numerical code results, and the variations within the interval are lost. Dither has the effect of forcing the quantizer to switch between two or more states. The higher the voltage of the input signal within the interval, the more probable it becomes that the output code will take on a higher value. The lower the input voltage within the interval, the more probable it is that the output code will take the lower value. The dither has resulted in a form of duty cycle modulation, and the resolution of the system has been extended indefinitely instead of being limited by the size of the steps.
Dither can also be understood by considering the effect it has on the transfer function of the quantizer. This is normally a perfect staircase, but in the presence of dither it is smeared horizontally until with a certain minimum amplitude the average transfer function becomes straight.
The characteristics of the noise used are rather important for optimal performance, although many suboptimal but nevertheless effective systems are in use. The main parameters of interest are the peak-to-peak amplitude, and the probability distribution of the amplitude. Triangular probability works best and this can be obtained by summing the output of two uniform probability processes.
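The duty cycle view of dither, and the construction of triangular probability noise from two uniform processes, can both be tried numerically. The sketch below makes several assumptions that are not from the text: Q is normalized to 1, each uniform process spans one quantizing interval peak-to-peak, and Python's pseudo-random generator stands in for the noise source.

```python
import math
import random

def quantize(voltage):
    """Mid-tread quantizer with Q = 1; returns the interval number."""
    return math.floor(voltage + 0.5)

def triangular_dither(rng):
    """Sum of two independent uniform processes, each 1 Q peak-to-peak."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def mean_output(voltage, samples, rng, dithered):
    """Average quantizer output for a constant input voltage."""
    total = 0
    for _ in range(samples):
        noise = triangular_dither(rng) if dithered else 0.0
        total += quantize(voltage + noise)
    return total / samples

rng = random.Random(1)
v = 0.3   # a level well inside one quantizing interval
print("undithered:", mean_output(v, 20000, rng, dithered=False))
print("dithered:  ", mean_output(v, 20000, rng, dithered=True))
```

Without dither every sample gives the same code and the position within the interval is lost. With dither the codes switch between neighbouring values in proportions that make the average follow the input.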
The use of dither invalidates the conventional calculations of signal-to-noise ratio available for a given wordlength. This is of little consequence, as the rule of thumb that multiplying the number of bits in the wordlength by 6 dB gives the SNR yields a result that will be close enough for all practical purposes.
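The 6 dB figure arises because each extra bit doubles the number of quantizing intervals, and a doubling of voltage ratio is close to 6.02 dB. A short sketch, with illustrative function names, compares the rule of thumb with the ratio of the whole quantizing range to one interval:

```python
import math

def snr_rule_of_thumb_db(wordlength):
    """The 6 dB per bit approximation."""
    return 6 * wordlength

def range_to_interval_db(wordlength):
    """Ratio of the full quantizing range to a single interval, in dB."""
    return 20 * math.log10(2 ** wordlength)

for bits in (8, 16):
    print(bits, snr_rule_of_thumb_db(bits), round(range_to_interval_db(bits), 2))
```

Neither figure is an exact SNR for a dithered system; the comparison only shows how little is lost by rounding to 6 dB per bit.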
It has only been possible to introduce the principles of conversion of audio and video signals here. For more details of the operation of convertors the reader is referred elsewhere.2,14
2.14 Introduction to digital processing
However complex a digital process, it can be broken down into smaller stages until finally one finds that there are really only two basic types of element in use, and these can be combined in some way and supplied with a clock to implement virtually any process. Figure 2.31 shows that the first type is a logic element. This produces an output which is a logical function of the input with minimal delay. The second type is a storage element which samples the state of the input(s) when clocked and holds or delays that state.
The strength of binary logic is that the signal has only two states, and considerable noise and distortion of the binary waveform can be tolerated before the state becomes uncertain. At every logical element, the signal is compared with a threshold, and can thus pass through any number of stages without being degraded. In addition, the use of a storage element at regular locations throughout logic circuits eliminates time variations or jitter. Figure 2.31 shows that if the inputs to a logic element change, the output will not change until the propagation delay of the element has elapsed. However, if the output of the logic element forms the input to a storage element, the output of that element will not change until the input is sampled at the next clock edge. In this way the signal edge is aligned to the system clock and the propagation delay of the logic becomes irrelevant. The process is known as reclocking.
2.15 Logic elements
The two states of the signal when measured with an oscilloscope are simply two voltages, usually referred to as high and low. The actual voltage levels will depend on the type of logic family in use, and on the supply voltage used. Supply voltages have tended to fall as designers seek to reduce power consumption. Within logic, the exact levels are not of much consequence, and it is only necessary to know them when interfacing between different logic families or when driving external devices. The pure logic designer is not interested at all in these voltages, only in their meaning.
Just as the electrical waveform from a microphone represents sound velocity, so the waveform in a logic circuit represents the truth of some statement. As there are only two states, there can only be true or false meanings. The true state of the signal can be assigned by the designer to either voltage state. When a high voltage represents a true logic condition and a low voltage represents a false condition, the system is known as positive logic, or high true logic. This is the usual system, but sometimes the low voltage represents the true condition and the high voltage represents the false condition. This is known as negative logic or low true logic. Provided that everyone is aware of the logic convention in use, both work equally well.
In logic systems, all logical functions, however complex, can be configured from combinations of a few fundamental logic elements or gates. It is not profitable to spend too much time debating which are the truly fundamental ones, since most can be made from combinations of others. Figure 2.32 shows the important simple gates and their derivatives, and introduces the logical expressions to describe them, which can be compared with the truth-table notation. The figure also shows the important fact that when negative logic is used, the OR gate function interchanges with that of the AND gate.
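The interchange of AND and OR under negative logic can be confirmed by enumerating the truth table. In this sketch a Python `True` stands for a high voltage inside the gate models; the helper `as_negative_logic` is a hypothetical name for reinterpreting the same hardware with low meaning true.

```python
from itertools import product

def and_gate(a, b):
    """Output is high only when both inputs are high."""
    return a and b

def or_gate(a, b):
    """Output is high when either input is high."""
    return a or b

def as_negative_logic(gate):
    """Logical function of the same gate when a low voltage means true:
    invert the meanings on the way in and again on the way out."""
    return lambda a, b: not gate(not a, not b)

for a, b in product([False, True], repeat=2):
    print(a, b, as_negative_logic(and_gate)(a, b), or_gate(a, b))
```

The last two columns are identical for every input combination, so a gate that performs AND in positive logic performs OR in negative logic, and the same holds the other way round.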
If numerical quantities need to be conveyed down the two-state signal paths described here, then the only appropriate numbering system is binary, which has only two symbols, 0 and 1. Just as positive or negative logic could be used for the truth of a logical binary signal, it can also be used for a numerical binary signal. Normally, a high voltage level will represent a binary 1 and a low voltage will represent a binary 0, described as a ‘high for a one’ system. Clearly a ‘low for a one’ system is just as feasible. Decimal numbers have several columns, each of which represents a different power of ten; in binary the column position specifies the power of two.
Several binary digits or bits are needed to express the value of a binary sample. These bits can be conveyed at the same time by several signals to form a parallel system, which is most convenient inside equipment or for short distances because it is inexpensive, or one at a time down a single signal path, which is more complex, but convenient for cables between pieces of equipment because the connectors require fewer pins. When a binary system is used to convey numbers in this way, it can be called a digital system.
2.16 Storage elements
The basic memory element in logic circuits is the latch, which is constructed from two gates as shown in Figure 2.33(a), and which can be set or reset. A more useful variant is the D-type latch shown at (b) which remembers the state of the input at the time a separate clock either changes state, for an edge-triggered device, or after it goes false, for a level-triggered device. A shift register can be made from a series of latches by connecting the Q output of one latch to the D input of the next and connecting all the clock inputs in parallel. Data are delayed by the number of stages in the register. Shift registers are also useful for converting between serial and parallel data formats.
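A shift register built from edge-triggered D-type latches can be modelled in a few lines. This is a behavioural sketch only; the class and method names are illustrative, and every stage is assumed to sample its input on the same clock edge.

```python
class ShiftRegister:
    """Chain of D-type latches sharing one clock."""

    def __init__(self, stages):
        self.q = [0] * stages        # Q outputs, first stage at index 0

    def clock(self, d):
        """Apply one clock edge with d on the first D input.
        Each stage takes the previous stage's old Q output.
        Returns the Q output of the final stage after the edge."""
        self.q = [d] + self.q[:-1]
        return self.q[-1]

# Delay: a single 1 emerges from a four-stage register on the fourth clock.
delay_line = ShiftRegister(4)
print([delay_line.clock(bit) for bit in [1, 0, 0, 0, 0, 0]])

# Serial to parallel: clock in four bits, then read all Q outputs at once.
converter = ShiftRegister(4)
for bit in [1, 0, 1, 1]:
    converter.clock(bit)
print(converter.q)                   # most recent bit is at index 0
```

Reading all the Q outputs together after the serial word has been clocked in gives the parallel form, with the earliest bit held in the last stage.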
Where large numbers of bits are to be stored, cross-coupled latches are less suitable because they are more complicated to fabricate inside integrated circuits than dynamic memory, and consume more current.
In large random access memories (RAMs), the data bits are stored as the presence or absence of charge in a tiny capacitor as shown in Figure 2.33(c). The capacitor is formed by a metal electrode, insulated by a layer of silicon dioxide from a semiconductor substrate, hence the term MOS (metal oxide semiconductor). The charge will suffer leakage, and the value would become indeterminate after a few milliseconds. Where the delay needed is less than this, decay is of no consequence, as data will be read out before they have had a chance to decay. Where longer delays are necessary, such memories must be refreshed periodically by reading the bit value and writing it back to the same place. Most modern MOS RAM chips have suitable circuitry built in. Large RAMs store many megabits, and it is clearly impractical to have a connection to each one. Instead, the desired bit has to be addressed before it can be read or written. The size of the chip package restricts the number of pins available, so that large memories use the same address pins more than once. The bits are arranged internally as rows and columns, and the row address and the column address are specified sequentially on the same pins.
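Sharing the address pins between row and column can be sketched as a simple split of the bit address. The assignment of the upper bits to the row and the lower bits to the column is an assumption made for illustration, as are the names and the 10 + 10 bit example.

```python
def split_address(address, row_bits, col_bits):
    """Divide a bit address into the two values presented in turn
    on the shared address pins: the row first, then the column."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits
    col = address & ((1 << col_bits) - 1)
    return row, col

# A memory of 2**20 bits needs only 10 address pins when used twice.
row, col = split_address(0b1100000001_0000000011, row_bits=10, col_bits=10)
print(row, col)
```

The saving is that a twenty-bit address is delivered over ten pins at the cost of two sequential transfers.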
Just like recording devices, electronic data storage devices come in many varieties. The basic volatile RAM will lose data if power is interrupted. However, there are also non-volatile RAMs or NVRAMs which retain the data in the absence of power. A type of memory which is written once is called a read-only memory or ROM. Some of these are programmed by using a high current which permanently vaporizes conductors in each location so that the data are fixed. Other types can be written electrically, but cannot be erased electrically. These need to be erased by exposure to ultraviolet light and are called UVROMs. Once erased they can be reprogrammed with new data. Another type of ROM can be rewritten electrically a limited number of times. These are known as electrically alterable ROMs or EAROMs.
2.17 Binary coding
In many cases a binary code is used to represent a sample of an audio or video waveform. Practical digital hardware places a limit on the wordlength which in turn limits the range of values available. In the eight-bit samples used in much digital video equipment, there are 256 different numbers, whereas in the sixteen-bit codes common in digital audio, there are 65 536 different numbers.
Figure 2.34(a) shows the result of counting upwards in binary with a fixed wordlength. When the largest possible value of all ones is reached, adding a further one to the LSB causes it to become zero with a carry-out. This carry is added to the next bit which becomes zero with a carry-out and so on. The carry will ripple up the word until the MSB becomes zero and produces a carry-out. This carry-out represents the setting of a bit to the left of the MSB, which is not present in the hardware and is thus lost. Consequently when the highest value is reached, further counting causes the value to reset to zero and begin again. This is known as an overflow. Counting downwards will achieve the reverse. When zero is reached, subtracting one will cause an underflow where a borrow should be taken from a bit to the left of the MSB, which does not exist, the result being that the bits which do exist take the value of all ones, being the highest possible code value.
Storage devices such as latches can be configured so that they count pulses. Figure 2.34(b) shows such an arrangement. The pulses to be counted are fed to the clock input of a D-type latch, whose input is connected to its complemented output. This configuration will change state at every input pulse, so that it will be in a true state after every other pulse. This is a divide-by-two counter. If the output is connected to the clock input of another stage, this will divide by four. A series of divide-by-two stages can be cascaded indefinitely in this way to count up to arbitrarily high numbers. Note that when the largest possible number is reached, when all latches are in the high state, the next pulse will result in all latches going to a low state, corresponding to the count of zero. This is the overflow condition described above.
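The cascade of divide-by-two stages can be modelled behaviourally. The sketch assumes that each following stage is triggered when the preceding output falls from true to false, which gives an upward count; the text does not specify the polarity, and the names are illustrative.

```python
class DivideByTwo:
    """D-type latch with its input fed from its complemented output."""

    def __init__(self):
        self.q = 0

    def toggle(self):
        """Change state on an input pulse.
        Returns True when the output has just fallen from 1 to 0."""
        self.q ^= 1
        return self.q == 0

def count_pulses(pulses, stages):
    """Feed pulses to a chain of stages and return the resulting count."""
    chain = [DivideByTwo() for _ in range(stages)]
    for _ in range(pulses):
        for stage in chain:
            if not stage.toggle():
                break                # no falling edge, later stages hold
    return sum(stage.q << position for position, stage in enumerate(chain))

print([count_pulses(n, 4) for n in (5, 15, 16, 17)])
```

With four stages the sixteenth pulse returns every latch to the low state, which is the overflow condition described above.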
Counters often include reset inputs which can be used to force the count to zero. Some are presettable so that a specific value can be loaded into each latch before counting begins.
As a result of the fixed wordlength, underflow and overflow, the infinite range of real numbers is mapped onto the limited range of a binary code of finite wordlength. Figure 2.34(c) shows that the overflow makes the number scale circular and it is as if the real number scale were rolled around it so that a binary code could represent any of a large possible number of real values, positive or negative. This is why the term wraparound is sometimes used to describe the result of an overflow condition.
Mathematically the pure binary mapping of Figure 2.34(c) from an infinite scale to a finite scale is known as modulo arithmetic. The four-bit example shown expresses real numbers as Modulo-16 codes.
In a practical ADC, each number represents a different analog signal voltage, and the hardware is arranged such that voltages outside the finite range do not overflow but instead result in one or other limit codes being output. This is the equivalent of clipping in analog systems. In Figure 2.35(a) it will be seen that in an eight-bit pure binary system, the number range goes from 00 hex, which represents the smallest voltage and all those voltages below it, through to FF hex, which represents the largest positive voltage and all voltages above it.
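The difference between a counter that wraps around and a convertor that clips can be shown side by side. Both functions below are illustrative sketches; in particular the convertor model places its interval boundaries at whole multiples of Q above an assumed `v_min`, which is a simplification.

```python
import math

def counter_step(code, step, bits=4):
    """Fixed-wordlength counting: a carry or borrow beyond the MSB is
    lost, so the result is taken modulo 2**bits."""
    return (code + step) % (1 << bits)

def adc_code(voltage, v_min, v_max, bits=8):
    """Pure binary convertor which outputs a limit code for any
    voltage outside the quantizing range instead of overflowing."""
    levels = 1 << bits
    q = (v_max - v_min) / levels
    code = math.floor((voltage - v_min) / q)
    return max(0, min(levels - 1, code))

print(counter_step(15, 1), counter_step(0, -1))      # overflow, underflow
print([hex(adc_code(v, 0.0, 1.0)) for v in (-0.2, 0.5, 1.7)])
```

The counter returns to the opposite end of the scale, whereas the convertor holds the nearest limit code, which is the digital equivalent of analog clipping.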
In some computer graphics systems these extremes represent black and peak white respectively. In television systems the traditional analog video waveform must be accommodated within this number range. Figure 2.35(b) shows how this is done for a broadcast standard luminance signal. As digital systems only handle the active line, the quantizing range is optimized to suit the gamut of the unblanked luminance and the sync pulses go off the bottom of the scale. There is a small offset in order to handle slightly misadjusted inputs. Additionally the codes at the extremes of the range are reserved for synchronizing and are not available to video values.
Colour difference video signals are bipolar and so blanking is in the centre of the signal range. In order to accommodate colour difference signals in the quantizing range, the blanking voltage level of the analog waveform has been shifted as in Figure 2.35(c) so that the positive and negative voltages in a real signal can be expressed by binary numbers which are only positive. This approach is called offset binary and has the advantage that the codes of all ones and all zeros are still at the ends of the scale and can continue to be used for synchronizing.
Figure 2.36 shows that analog audio signal voltages are referred to midrange. The level of the signal is measured by how far the waveform deviates from midrange, and attenuation, gain and mixing all take place around that level. Digital audio mixing is achieved by adding sample values from two or more different sources, but unless all the quantizing intervals are of the same size and there is no offset, the sum of two sample values will not represent the sum of the two original analog voltages. Thus sample values which have been obtained by non-uniform or offset quantizing cannot readily be processed because the binary numbers are not proportional to the signal voltage.
If two offset binary sample streams are added together in an attempt to perform digital mixing, the result will be that the offsets are also added and this may lead to an overflow. Similarly, if an attempt is made to attenuate by, say, 6.02 dB by dividing all the sample values by two, Figure 2.37 shows that the offset is also divided and the waveform suffers a shifted baseline. This problem can be overcome with digital luminance signals simply by subtracting the offset from each sample before processing as this results in positive-only numbers truly proportional to the luminance voltage. This approach is not suitable for audio or colour difference signals because negative numbers would result when the analog voltage goes below blanking and pure binary coding cannot handle them.
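The difficulty with processing offset binary directly can be seen with small numbers. The sketch assumes an eight-bit system in which code 128 represents zero volts; the sample values and names are arbitrary illustrations.

```python
OFFSET = 128    # assumed eight-bit offset binary: code 128 means 0 V

def to_offset_binary(value):
    return value + OFFSET

def from_offset_binary(code):
    return code - OFFSET

a = to_offset_binary(40)
b = to_offset_binary(-10)

# Mixing by adding codes: the two offsets are added as well.
naive_sum = a + b
wrapped_sum = naive_sum % 256        # what an eight-bit adder would keep
print(naive_sum, from_offset_binary(wrapped_sum))

# Attenuating by halving codes: the offset is halved as well.
naive_half = a // 2
print(from_offset_binary(naive_half))

# Even silence is moved away from the centre of the range.
print(from_offset_binary(to_offset_binary(0) // 2))
```

Neither result corresponds to the intended 30 or 20; the sum has overflowed and the halved sample sits on a shifted baseline.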
The problem with offset binary is that it works with reference to one end of the range. What is needed is a numbering system which operates symmetrically with reference to the centre of the range.
In the two’s complement system, the mapping of real numbers onto the finite range of a binary word is modified. Instead of the mapping of Figure 2.38(a) in which only positive numbers are mapped, in (b) the upper half of the pure binary number range has been redefined to represent negative quantities. In two’s complement, the range represented by the circle of numbers does not start at zero, but starts on the diametrically opposite side of the circle such that zero is now in the centre of the number range. All numbers clockwise from zero are positive and have the MSB reset. All numbers anticlockwise from zero are negative and have the MSB set. The MSB is thus the equivalent of a sign bit where 1 = minus. Two’s complement notation differs from pure binary in that the most significant bit is inverted in order to achieve the half circle rotation.
Figure 2.39 shows how a real ADC is configured to produce two’s complement output. At (a) an analog offset voltage equal to one half the quantizing range is added to the bipolar analog signal in order to make it unipolar as at (b). The ADC produces positive-only numbers at (c) which are proportional to the input voltage. This is actually an offset binary code. The MSB is then inverted at (d) so that the all-zeros code moves to the centre of the quantizing range. The analog offset is often incorporated into the ADC as is the MSB inversion. Some convertors are designed to be used in either pure binary or two’s complement mode. In this case the designer must arrange the appropriate DC conditions at the input. The MSB inversion may be selectable by an external logic level. In the broadcast digital video interface standards the colour difference signals use offset binary because the codes of all zeros and all ones are at the end of the range and can be reserved for synchronizing. A digital vision mixer simply inverts the MSB of each colour difference sample to convert it to two’s complement.
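The MSB inversion of step (d) amounts to a single exclusive-OR. The sketch below assumes eight-bit samples; `offset_to_twos` and `twos_to_signed` are hypothetical helper names, not part of any interface standard.

```python
def offset_to_twos(code, bits=8):
    """Convert an offset binary code to two's complement by inverting the MSB."""
    return code ^ (1 << (bits - 1))


def twos_to_signed(code, bits=8):
    """Interpret a two's complement code as a signed Python integer."""
    if code & (1 << (bits - 1)):      # MSB set means the value is negative
        return code - (1 << bits)
    return code


# The centre of the offset binary range (128) becomes the all-zeros code.
print(offset_to_twos(128))                   # 0
# Offset binary code 118 stands for 118 - 128 = -10.
print(twos_to_signed(offset_to_twos(118)))   # -10
```

Because the exclusive-OR is its own inverse, the same operation converts two's complement back to offset binary.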
The two’s complement system allows two sample values to be added, or mixed in audio and video parlance, and the result will be referred to the system midrange; this is analogous to adding analog signals in an operational amplifier.
Figure 2.40 illustrates how adding two’s complement samples simulates a bipolar mixing process. The waveform of input A is depicted by solid black samples, and that of B by samples with a solid outline. The result of mixing is the linear sum of the two waveforms obtained by adding pairs of sample values. The dashed lines depict the output values. Beneath each set of samples is the calculation which will be seen to give the correct result. Note that the calculations are pure binary. No special arithmetic is needed to handle two’s complement numbers.
It is interesting to see why the two’s complement adding process works. Effectively both two’s complement numbers to be added contain an offset of half full scale. When they are added, the two offsets add to produce a sum offset which has a value of full scale. As adding full scale to a code consists of moving one full rotation round the circle of numbers, the offset has no effect and is effectively eliminated.
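The wrap-around argument can be checked numerically. This is a minimal sketch assuming eight-bit words, in which the adder is modelled as an ordinary pure binary addition whose carry out of the top bit is discarded; the names `encode`, `decode` and `mix` are illustrative only.

```python
BITS = 8
MASK = (1 << BITS) - 1     # keeps only the low 8 bits, like a fixed-length register


def encode(value):
    """Signed integer -> 8-bit two's complement code."""
    return value & MASK


def decode(code):
    """8-bit two's complement code -> signed integer."""
    return code - (1 << BITS) if code & (1 << (BITS - 1)) else code


def mix(code_a, code_b):
    """Pure binary addition; the carry out of the MSB is simply lost."""
    return (code_a + code_b) & MASK


print(decode(mix(encode(-3), encode(5))))      # 2
print(decode(mix(encode(-100), encode(40))))   # -60
```

Discarding the carry corresponds to one full rotation round the circle of numbers, which is why no special arithmetic is needed provided the true sum stays within range.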
It is sometimes necessary to phase reverse or invert a digital signal. The process of inversion in two’s complement is simple. All bits of the sample value are inverted to form the one’s complement, and one is added. This can be checked by mentally inverting some of the values in Figure 2.38(b). The inversion is transparent and performing a second inversion gives the original sample values. Using inversion, signal subtraction can be performed using only adding logic.
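Inversion and subtraction by addition can be sketched as follows, assuming eight-bit samples held as unsigned codes; the function names are hypothetical.

```python
def invert(code, bits=8):
    """Two's complement inversion: form the one's complement, then add one."""
    mask = (1 << bits) - 1
    ones_complement = code ^ mask
    return (ones_complement + 1) & mask


def subtract(code_a, code_b, bits=8):
    """Compute a - b using only inversion and adding logic."""
    mask = (1 << bits) - 1
    return (code_a + invert(code_b, bits)) & mask


print(format(invert(0b00000011), "08b"))   # 11111101, the code for -3
print(invert(invert(0b00000011)))          # 3: a second inversion restores the value
print(subtract(10, 3))                     # 7
```

One edge case is worth knowing: the most negative code (1000 0000 in eight bits) has no positive counterpart, so inverting it returns the same code.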
Two’s complement numbers can have a radix point and bits below it just as pure binary numbers can. It should, however, be noted that in two’s complement, if a radix point exists, numbers to the right of it are added. For example, 1100.1 is not −4.5, it is −4 + 0.5 = −3.5.
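The example can be verified with a short routine. This sketch assumes the number is supplied as a text string with the leftmost bit acting as the sign bit; `twos_fixed_value` is a hypothetical name.

```python
def twos_fixed_value(bit_string):
    """Evaluate a two's complement binary string that may contain a radix point."""
    whole, _, frac = bit_string.partition(".")
    bits = whole + frac
    code = int(bits, 2)
    if bits[0] == "1":                 # MSB set: subtract full scale to get the sign
        code -= 1 << len(bits)
    return code / (1 << len(frac))     # move the radix point back into place


print(twos_fixed_value("1100.1"))   # -3.5
print(twos_fixed_value("0100.1"))   # 4.5
```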
The circuitry necessary for adding pure binary or two’s complement binary numbers is shown in Figure 2.41. Addition in binary requires two bits to be taken at a time from the same position in each word, starting at the least significant bit. Should both be ones, the output is zero, and there is a carry-out generated. Such a circuit is called a half adder, shown in Figure 2.41(a) and is suitable for the least significant bit of the calculation. All higher stages will require a circuit which can accept a carry input as well as two data inputs. This is known as a full adder (Figure 2.41(b)).
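The adder stages can be modelled in software one bit at a time. The following is an illustrative sketch, not a description of Figure 2.41 itself: it builds the full adder from two half adders, which is one common arrangement, and for simplicity uses a full adder with a zero carry-in at the least significant bit, which behaves identically to a half adder.

```python
def half_adder(a, b):
    """Return (sum, carry) for two single bits."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Return (sum, carry_out) for two bits plus a carry input."""
    partial, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial, carry_in)
    return total, carry1 | carry2


def ripple_add(x, y, bits=8):
    """Add two words bit by bit, starting at the least significant bit."""
    result, carry = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


print(half_adder(1, 1))     # (0, 1): both ones give zero with a carry-out
print(ripple_add(100, 27))  # (127, 0)
print(ripple_add(253, 5))   # (2, 1): the two's complement sum -3 + 5 = 2
```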
Such a device is also convenient for inverting a two’s complement number, in conjunction with a set of invertors. The adder has one set of inputs taken to a false state, and the carry-in permanently held true, such that it adds one to the one’s complement number from the invertor.
When mixing by adding sample values, care has to be taken to ensure that if the sum of the two sample values exceeds the number range the result will be clipping rather than overflow. In two’s complement, the action necessary depends on the polarities of the two signals. Clearly if one positive and one negative number are added, the result cannot exceed the number range. If two positive numbers are added, the symptom of positive overflow is that the most significant bit sets, causing an erroneous negative result, whereas a negative overflow results in the most significant bit clearing. The overflow control circuit will be designed to detect these two conditions, and override the adder output. If the MSB of both inputs is zero, the numbers are both positive, thus if the sum has the MSB set, the output is replaced with the maximum positive code (0111 …). If the MSB of both inputs is set, the numbers are both negative, and if the sum has no MSB set, the output is replaced with the maximum negative code (1000 …). These conditions can also be connected to warning indicators. Figure 2.41(c) shows this system in hardware. The resultant clipping on overload is sudden, and sometimes a PROM is included which translates values around and beyond maximum to soft-clipped values below or equal to maximum.
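The overflow control logic described above can be expressed as a short function. This is a sketch assuming eight-bit two's complement codes held as unsigned integers; the soft-clipping PROM is not modelled, and `saturating_add` is a hypothetical name.

```python
def saturating_add(code_a, code_b, bits=8):
    """Add two two's complement codes, clipping instead of overflowing."""
    mask = (1 << bits) - 1
    msb = 1 << (bits - 1)
    total = (code_a + code_b) & mask

    a_negative = bool(code_a & msb)
    b_negative = bool(code_b & msb)
    sum_negative = bool(total & msb)

    if not a_negative and not b_negative and sum_negative:
        return msb - 1        # positive overflow: force 0111...
    if a_negative and b_negative and not sum_negative:
        return msb            # negative overflow: force 1000...
    return total              # mixed signs or in-range sum: pass the adder output


print(saturating_add(100, 100))     # 127: clipped at maximum positive
print(saturating_add(0x9C, 0x9C))   # 128 (0x80): -100 + -100 clipped at maximum negative
print(saturating_add(100, 0x9C))    # 0: opposite polarities cannot overflow
```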
A storage element can be combined with an adder to obtain a number of useful functional blocks which will crop up frequently in digital signal processing. Figure 2.42(a) shows that a latch is connected in a feedback loop around an adder. The latch contents are added to the input each time it is clocked. The configuration is known as an accumulator in computation because it adds up or accumulates values fed into it. In filtering, it is known as a discrete time integrator. If the input is held at some constant value, the output increases by that amount on each clock. The output is thus a sampled ramp.
Figure 2.42(b) shows that the addition of an invertor allows the difference between successive inputs to be obtained. This is digital differentiation. The output is proportional to the slope of the input.
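Both blocks can be modelled as small objects holding one stored value. This is a minimal sketch in which the latch is a Python integer of unlimited range, so the wrap-around of a real fixed-wordlength accumulator is deliberately not modelled; the class names are illustrative.

```python
class Accumulator:
    """Latch in a feedback loop round an adder (discrete time integrator)."""

    def __init__(self):
        self.latch = 0

    def clock(self, sample):
        self.latch += sample          # latch contents added to the input
        return self.latch


class Differentiator:
    """Outputs the difference between successive inputs."""

    def __init__(self):
        self.previous = 0

    def clock(self, sample):
        difference = sample - self.previous
        self.previous = sample
        return difference


acc = Accumulator()
ramp = [acc.clock(3) for _ in range(4)]
print(ramp)       # [3, 6, 9, 12]: a constant input gives a sampled ramp

diff = Differentiator()
slopes = [diff.clock(value) for value in ramp]
print(slopes)     # [3, 3, 3, 3]: the slope of the ramp
```

Feeding the accumulator output into the differentiator returns the original constant input, which shows that the two operations are inverses of one another.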
2.18 Gain control
When processing digital audio or image data the gain of the system will need to be variable so that mixes and fades can be performed. Gain is controlled in the digital domain by multiplying each sample value by a coefficient. If that coefficient is less than one, attenuation will result; if it is greater than one, amplification can be obtained.
Multiplication in binary circuits is difficult. It can be performed by repeated adding, but this is too slow to be of any use. In fast multiplication, one of the inputs will be simultaneously multiplied by one, two, four, etc., by hard-wired bit shifting. Figure 2.43 shows that the other input bits will determine which of these powers will be added to produce the final sum, and which will be neglected. If multiplying by five, the process is the same as multiplying by four, multiplying by one, and adding the two products. This is achieved by adding the input to itself shifted two places. As the wordlength of such a device increases, the complexity rises steeply, roughly as the square of the wordlength, because every bit of one input selects a shifted copy of the other.
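The shift-and-add principle can be sketched in software, with the caveat that a loop performs one after another the additions that hardware performs simultaneously. The sketch assumes unsigned inputs and an eight-bit coefficient; `shift_add_multiply` is a hypothetical name.

```python
def shift_add_multiply(sample, coefficient, bits=8):
    """Multiply by summing shifted copies of the sample.

    Each set bit of the coefficient selects one power-of-two multiple
    of the sample; cleared bits cause that multiple to be neglected.
    """
    total = 0
    for position in range(bits):
        if (coefficient >> position) & 1:
            total += sample << position    # sample times 2**position
    return total


# Multiplying by five (binary 101) adds the input to itself shifted two places.
print(shift_add_multiply(7, 5))     # 35
print(shift_add_multiply(13, 11))   # 143
```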
In a given application, all that matters is that the output has the correct numerical value. It does not matter if this is achieved using dedicated hardware or using software in a general-purpose processor. It should be clear that if it is wished to simulate analog gain control in the digital domain by multiplication, the samples to be multiplied must have been uniformly quantized. If the quantizing is non-uniform the binary numbers are no longer proportional to the original parameter and multiplication will not give the correct result.
In audio, uniform quantizing is universal in production systems. However, in video it is not owing to the widespread use of gamma which will be discussed in Chapter 6. Strictly speaking, video signals with gamma should be returned to the uniformly quantized domain before processing but this is seldom done in practice.
2.19 Floating-point coding
Computers operate on data words of fixed length and if binary or two’s complement coding is used, this limits the range of the numbers. For this reason many computers use floating-point coding which allows a much greater range of numbers with a penalty of reduced accuracy.
Figure 2.44(a) shows that in pure binary, numbers which are significantly below the full scale value have a number of high-order bits which are all zero. Instead of handling these bits individually, as they are all zero it is good enough simply to count them. Figure 2.44(b) shows that every time a leading zero is removed, the remaining bits are shifted left one place and this has the effect in binary of multiplying by two. Two shifts multiply by four, three shifts by eight and so on. In order to recreate the number with the right magnitude, the power of two by which the number was multiplied must also be sent. This value is known as the exponent.
In order to convert a binary number of arbitrary value with an arbitrarily located radix point into floating-point notation, the position of the most significant or leading one and the position of the radix point are noted. The number is then multiplied or divided by powers of two until the radix point is immediately to the right of the leading one. This results in a value known as the mantissa (plural: mantissae) which always has the form 1.XXX … where X is 1 or 0 (known in logic as ‘don’t care’).
The exponent is a two’s complement code which determines whether the mantissa has to be multiplied by positive powers of two which will shift it left and make it bigger, or whether it has to be multiplied by negative powers of two which will shift it right and make it smaller.
In floating-point notation, the range of the numbers and the precision are independent. The range is determined by the wordlength of the exponent. For example, a six-bit exponent having 64 values allows a range from 1.XX × 2^31 to 1.XX × 2^−32. The precision is determined by the length of the mantissa. As the mantissa is always in the format 1.XXX it is not necessary to store the leading one so the actual stored value is in the form .XXX. Thus a ten-bit mantissa has eleven-bit precision. It is possible to pack a ten-bit mantissa and a six-bit exponent in one sixteen-bit word.
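The sixteen-bit packing can be illustrated as follows. This is a sketch under stated assumptions: the exponent is placed in the upper six bits and the stored mantissa in the lower ten (the text does not specify the layout), only positive non-zero values are handled because the format as described has no sign bit and no code for zero, and excess mantissa bits are truncated rather than rounded. The names `pack` and `unpack` are hypothetical.

```python
import math

MANTISSA_BITS = 10
EXPONENT_BITS = 6


def pack(value):
    """Positive non-zero value -> 16-bit word (assumed layout: exponent, then mantissa)."""
    if value <= 0:
        raise ValueError("this sketch handles positive non-zero values only")
    fraction, exp = math.frexp(value)      # value = fraction * 2**exp, 0.5 <= fraction < 1
    exponent = exp - 1                     # rescale so the mantissa has the form 1.XXX
    if not -32 <= exponent <= 31:
        raise OverflowError("exponent does not fit in six bits")
    # Drop the leading one and keep ten fraction bits (truncating the rest).
    stored = int((2 * fraction - 1) * (1 << MANTISSA_BITS))
    return ((exponent & 0x3F) << MANTISSA_BITS) | stored


def unpack(word):
    """16-bit word -> value, restoring the implied leading one."""
    exponent = word >> MANTISSA_BITS
    if exponent & 0x20:                    # six-bit two's complement exponent
        exponent -= 1 << EXPONENT_BITS
    stored = word & ((1 << MANTISSA_BITS) - 1)
    return (1 + stored / (1 << MANTISSA_BITS)) * 2.0 ** exponent


print(unpack(pack(5.0)))       # 5.0: exactly representable
print(unpack(pack(0.375)))     # 0.375: negative exponent
print(unpack(pack(1000.3)))    # 1000.0: the fraction is lost to limited precision
```

The last line shows the penalty of the format: with eleven bits of precision, a value near 1000 can only be resolved in steps of 0.5.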
Although floating-point operation extends the number range of a computer, the user must constantly be aware that floating point has limited precision. Floating point is the computer’s equivalent of lossy compression. In trying to get more for less, there is always a penalty.
In some signal-processing applications, floating-point coding is simply not accurate enough. For example, in an audio filter, if the stopband needs to be, say, 100 dB down, this can only be achieved if the entire filtering arithmetic has adequate precision. 100 dB is one part in 100 000 and needs more than sixteen bits of resolution. The poor quality of a good deal of digital audio equipment is due to the unwise adoption of floating-point processing of inadequate precision.
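The figures quoted can be checked with the usual relationship between decibels and voltage ratio, 20 log10(ratio). The sketch below is only an arithmetic check; the function names are illustrative.

```python
import math


def db_to_ratio(decibels):
    """Convert a level difference in dB to a voltage (amplitude) ratio."""
    return 10 ** (decibels / 20)


def bits_for_ratio(ratio):
    """Smallest whole number of bits whose range of 2**bits covers the ratio."""
    return math.ceil(math.log2(ratio))


def ratio_to_db(ratio):
    """Convert a voltage (amplitude) ratio to dB."""
    return 20 * math.log10(ratio)


print(bits_for_ratio(db_to_ratio(100)))   # 17: more than sixteen bits
print(round(ratio_to_db(2 ** 16), 1))     # 96.3: the span of sixteen bits in dB
```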
Computers of finite wordlength can operate on larger numbers without the inaccuracy of floating-point coding by using techniques such as double precision. For example, thirty-two-bit precision data words can be stored in two adjacent memory locations in a sixteen-bit machine, and the processor can manipulate them by operating on the two halves at different times. This takes longer, or needs a faster processor.
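The double-precision technique can be sketched by adding two thirty-two-bit numbers using only sixteen-bit operations, with the carry from the low halves passed to the high halves. The helper names are hypothetical and unsigned values are assumed.

```python
MASK16 = 0xFFFF


def split(value):
    """Split a 32-bit value into (high, low) 16-bit halves."""
    return (value >> 16) & MASK16, value & MASK16


def join(high, low):
    """Reassemble two 16-bit halves into one 32-bit value."""
    return (high << 16) | low


def add32(a_high, a_low, b_high, b_low):
    """Add two 32-bit numbers in two 16-bit steps."""
    low_sum = a_low + b_low
    carry = low_sum >> 16                        # carry out of the low half
    high_sum = (a_high + b_high + carry) & MASK16
    return high_sum, low_sum & MASK16


result = join(*add32(*split(0x0001FFFF), *split(0x00000001)))
print(hex(result))   # 0x20000
```

The two steps must be performed in sequence because the second depends on the carry from the first, which is why the operation takes longer.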
2.20 Multiplexing principles
Multiplexing is used where several signals are to be transmitted down the same channel. The channel bit rate must be the same as or greater than the sum of the source bit rates. Figure 2.45 shows that when multiplexing is used, the data from each source has to be time compressed. This is done by buffering source data in a memory at the multiplexer. It is written into the memory in real time as it arrives, but will be read from the memory with a clock which has a much higher rate. This means that the readout occurs in a smaller timespan. If, for example, the clock frequency is raised by a factor of ten, the data for a given signal will be transmitted in a tenth of the normal time, leaving time in the multiplex for nine more such signals.
In the demultiplexer another buffer memory will be required. Only the data for the selected signal will be written into this memory at the bit rate of the multiplex. When the memory is read at the correct speed, the data will emerge with its original timebase.
In practice it is essential to have mechanisms to identify the separate signals to prevent them being mixed up and to convey the original signal clock frequency to the demultiplexer. In time-division multiplexing the timebase of the transmission is broken into equal slots, one for each signal. This makes it easy for the demultiplexer, but forces a rigid structure on all the signals such that they must all be locked to one another and have an unchanging bit rate. Packet multiplexing overcomes these limitations.
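Time-division multiplexing with equal slots can be sketched with lists standing in for the signals. The sketch assumes that all sources are locked together and deliver the same number of samples, which is exactly the rigid structure described above; the function names are illustrative.

```python
def tdm_multiplex(sources):
    """Interleave the sources, giving each one slot per frame."""
    return [slot for frame in zip(*sources) for slot in frame]


def tdm_demultiplex(stream, source_count, wanted):
    """Select one signal purely by the position of its slot in each frame."""
    return stream[wanted::source_count]


audio_a = [1, 2, 3]
audio_b = [10, 20, 30]
stream = tdm_multiplex([audio_a, audio_b])

print(stream)                          # [1, 10, 2, 20, 3, 30]
print(tdm_demultiplex(stream, 2, 1))   # [10, 20, 30]
```

No identification is carried in the stream itself: the demultiplexer relies entirely on slot position.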
2.21 Packets
The multiplexer must switch between different time-compressed signals to create the bitstream and this is much easier to organize if each signal is in the form of data packets of constant size. Figure 2.46 shows a packet multiplexing system.
Each packet consists of two components: the header, which identifies the packet, and the payload, which is the data to be transmitted. The header will contain at least an identification code (ID) which is unique for each signal in the multiplex. The demultiplexer checks the ID codes of all incoming packets and discards those which do not have the wanted ID.
In complex systems it is common to have a mechanism to check that packets are not lost or repeated. This is the purpose of the packet continuity count which is carried in the header. For packets carrying the same ID, the count should increase by one from one packet to the next. Upon reaching the maximum binary value, the count overflows and recommences.
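Packet selection and the continuity check can be sketched as below. The four-bit width of the count and the field names `pid`, `count` and `payload` are assumptions made for the example; the text does not specify them.

```python
from dataclasses import dataclass

COUNT_BITS = 4    # assumed wordlength of the continuity count


@dataclass
class Packet:
    pid: int        # identification code carried in the header
    count: int      # continuity count carried in the header
    payload: bytes


def demultiplex(packets, wanted_pid):
    """Keep payloads with the wanted ID and count continuity errors."""
    payloads, errors, expected = [], 0, None
    modulus = 1 << COUNT_BITS
    for packet in packets:
        if packet.pid != wanted_pid:
            continue                          # discard unwanted IDs
        if expected is not None and packet.count != expected:
            errors += 1                       # a packet was lost or repeated
        expected = (packet.count + 1) % modulus   # count overflows and recommences
        payloads.append(packet.payload)
    return payloads, errors


stream = [
    Packet(1, 14, b"a"), Packet(2, 0, b"x"), Packet(1, 15, b"b"),
    Packet(1, 0, b"c"), Packet(2, 1, b"y"), Packet(1, 2, b"d"),
]
print(demultiplex(stream, 1))   # ([b'a', b'b', b'c', b'd'], 1)
```

The step from 15 to 0 is the normal overflow and is not flagged; the step from 0 to 2 reveals one missing packet.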
2.22 Statistical multiplexing
Packet multiplexing has advantages over time-division multiplexing because it does not set the bit rate of each signal. A demultiplexer simply checks packet IDs and selects all packets with the wanted code. It will do this however frequently such packets arrive. Consequently it is practicable to have variable bit rate signals in a packet multiplex. The multiplexer has to ensure that the total bit rate does not exceed the rate of the channel, but that rate can be allocated arbitrarily between the various signals.
As a practical matter it is usually necessary to keep the bit rate of the multiplex constant. With variable-rate inputs this is done by creating null packets which are generally called stuffing or packing. The headers of these packets contain a unique ID which the demultiplexer does not recognize and so these packets are discarded on arrival.
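Stuffing can be sketched by filling a fixed number of packet slots per unit time. The value of `NULL_ID`, the tuple layout and the function names are assumptions made for illustration only.

```python
NULL_ID = 0x1FFF    # assumed ID reserved for null packets in this sketch


def build_multiplex(waiting, slots):
    """Fill a fixed number of slots, stuffing with null packets if data runs short."""
    sent = waiting[:slots]
    stuffing = [(NULL_ID, b"")] * (slots - len(sent))
    return sent + stuffing


def demultiplex(multiplex, wanted_id):
    """Select the wanted packets; null packets never match and are discarded."""
    return [payload for pid, payload in multiplex if pid == wanted_id]


waiting = [(1, b"a"), (2, b"b"), (1, b"c")]
multiplex = build_multiplex(waiting, slots=5)

print(len(multiplex))               # 5: the multiplex rate stays constant
print(demultiplex(multiplex, 1))    # [b'a', b'c']
```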
In an MPEG environment, statistical multiplexing can be extremely useful because it allows for the varying difficulty of real program material. In a multiplex of several television programs, it is unlikely that all the programs will encounter difficult material simultaneously. When one program encounters a detailed scene or frequent cuts which are hard to compress, more data rate can be allocated at the allowable expense of the remaining programs which are handling easy material.
2.23 Timebase correction
One of the strengths of digital technology is the ease with which delay can be provided. Accurate control of delay is the essence of timebase correction, necessary whenever the instantaneous time of arrival or rate from a data source does not match the destination. In digital video and audio, the destination will almost always have perfectly regular timing, namely the sampling rate clock of the final DAC. Timebase correction consists of aligning irregular data from storage media, transmission channels or compression decoders with that stable reference.
When compression is used, the amount of data resulting from equal units of time will vary. Figure 2.47 shows that if these data have to be sent at a constant bit rate, a buffer memory will be needed between the encoder and the channel. The result will be that effectively the picture period varies. Similar buffering will be needed at the decoder. Timebase correction is used to recover a constant picture rate.
Section 2.16 showed the principles of digital storage elements which can be used for delay purposes. The shift-register approach and the RAM approach to delay are very similar, as a shift register can be thought of as a memory whose address increases automatically when clocked. The data rate and the maximum delay determine the capacity of the RAM required. Figure 2.48 shows that the addressing of the RAM is by a counter that overflows endlessly from the end of the memory back to the beginning, giving the memory a ring-like structure. The write address is determined by the incoming data, and the read address is determined by the outgoing data. This means that the RAM has to be able to read and write at the same time.
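The ring-structured memory can be sketched with two address counters that wrap at the end of the memory. In this illustration reads and writes are separate method calls rather than truly simultaneous, and the class name and the overflow and underflow checks are additions made for the example.

```python
class RingBuffer:
    """RAM addressed by counters that overflow back to the beginning."""

    def __init__(self, size):
        self.memory = [None] * size
        self.size = size
        self.write_address = 0
        self.read_address = 0
        self.fill = 0                 # number of samples currently stored

    def write(self, sample):
        """Store a sample at the rate of the incoming data."""
        if self.fill == self.size:
            raise OverflowError("buffer overflow")
        self.memory[self.write_address] = sample
        self.write_address = (self.write_address + 1) % self.size
        self.fill += 1

    def read(self):
        """Fetch a sample at the rate of the stable output clock."""
        if self.fill == 0:
            raise IndexError("buffer underflow")
        sample = self.memory[self.read_address]
        self.read_address = (self.read_address + 1) % self.size
        self.fill -= 1
        return sample


buffer = RingBuffer(4)
for sample in (1, 2, 3):        # an irregular burst arrives
    buffer.write(sample)
first = buffer.read()           # regular readout begins
buffer.write(4)
buffer.write(5)                 # the write address wraps to the start
rest = [buffer.read() for _ in range(4)]
print(first, rest)              # 1 [2, 3, 4, 5]
```

The samples emerge in their original order regardless of when they arrived, provided the buffer neither empties nor fills.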
In an MPEG decoder, the exact time of arrival of the data corresponding to a picture can vary, along with the time taken to decode it. In practice the decoded picture is placed in memory and read out according to a locally re-created picture rate clock. If the phase of the picture clock is too early, decoding may not be completed before the memory has to be read. Conversely if the picture clock phase is too late, the memory may overflow. During lock-up, a decoder has to set the picture phase in such a way that the buffer memories are, on average, half full so that equal correcting power is available in both directions.
References
1. Watkinson, J.R., Television Fundamentals. Oxford: Focal Press (1998)
2. Watkinson, J.R., The Art of Digital Audio, third edition. Oxford: Focal Press (2001)
3. Nyquist, H., Certain topics in telegraph transmission theory. AIEE Trans., 617–644 (1928)
4. Shannon, C.E., A mathematical theory of communication. Bell Syst. Tech. J., 27, 379 (1948)
5. Jerri, A.J., The Shannon sampling theorem – its various extensions and applications: a tutorial review. Proc. IEEE, 65, 1565–1596 (1977)
6. Whittaker, E.T., On the functions which are represented by the expansions of the interpolation theory. Proc. R. Soc. Edinburgh, 181–194 (1915)
7. Porat, B., A Course in Digital Signal Processing. New York: John Wiley (1996)
8. Betts, J.A., Signal Processing Modulation and Noise, Ch. 6. Sevenoaks: Hodder and Stoughton (1970)
9. Ishida, Y. et al., A PCM digital audio processor for home use VTRs. Presented at 64th AES Convention (New York, 1979), Preprint 1528
10. Rumsey, F.J. and Watkinson, J.R., The Digital Interface Handbook. Oxford: Focal Press (1995)
11. Anon., AES recommended practice for professional digital audio applications employing pulse code modulation: preferred sampling frequencies. AES5–1984 (ANSI S4.28–1984). J. Audio Eng. Soc., 32, 781–785 (1984)
12. Roberts, L.G., Picture coding using pseudo-random noise. IRE Trans. Inform. Theory, IT-8, 145–154 (1962)
13. Lipshitz, S.P., Wannamaker, R.A. and Vanderkooy, J., Quantization and dither: a theoretical survey. J. Audio Eng. Soc., 40, 355–375 (1992)
14. Watkinson, J.R., The Art of Digital Video, second edition. Oxford: Focal Press (1994)