1

Introduction to Interfacing

1.1  The Need for Digital Interfaces

1.1.1 Transparent Links

Digital audio and video systems make it possible for the user to maintain a high and consistent sound or picture quality from beginning to end of a production. Unlike analog systems, the quality of the signal in a digital system need not be affected by the normal processes of recording, transmission or transferral over interconnects, but this is only true provided that the signal remains in the digital domain throughout the signal chain. Converting the signal to and from the analog domain at any point has the effect of introducing additional noise and distortion, which will appear as audible or visual artefacts in the programme material. Herein lies the reason for adopting digital interfaces when transferring signals between digital devices – it is the means of ensuring that the signal is carried ‘transparently’, without the need to introduce a stage of analog conversion. It enables the receiving device to make a ‘cloned’ copy of the original data, which may be identical numerically and temporally.

1.1.2 The Need for Standards

The digital interface between two or more devices in an audio or video system is the point at which data is transferred. Digital interconnects allow programme data to be exchanged, and they may also provide a certain capacity for additional information such as ‘housekeeping data’ (to inform a receiver of the characteristics of the programme signal, for example), text data, subcode data replayed from a tape or disk, user data, communications channels (e.g. low quality speech) and perhaps timecode. These applications are all covered in detail in the course of this book. Standards have been developed in an attempt to ensure that the format of the data adheres to a convention and that the meaning of different bits is clear, in order that devices may communicate correctly, but there is more than one standard and thus not all devices will communicate with each other. Furthermore, even between devices using ostensibly the same interface there are often problems in communication due to differences in the level or completeness of implementation of the standard, the effects of which are numerous. Older devices may have problems with data from newer devices or vice versa, since the standard may have been modified or clarified over the years.

As digital audio and video systems become more mature the importance of correct communication also increases, and the evidence is that manufacturers are now beginning to take correct implementation of interface standards more seriously, since more people are adopting fully digital signal chains. At the same time, however, the applications of such technology are becoming increasingly complicated, and the additional data which accompanies programme data on many interfaces is becoming more comprehensive, with a wide range of different uses for such data. Thus manufacturers must decide the extent to which they implement optional features, and what to do with data which is not required or understood by a particular device. Eventually it is likely that digital interface receivers will become more ‘intelligent’, such that they may analyse the incoming data and adapt so as to accommodate it with the minimum of problems, but this is rare in today’s systems and would currently add considerably to their cost.

1.1.3 Digital Interfaces and Programme Quality

To say that signal quality cannot be affected provided that the signal remains in the digital domain is a bold statement and requires some qualification, since there will be cases where signal processing in the digital chain may affect quality. The statement is true if it is possible to assume that the sampling rate of the signal, its resolution (number of bits per sample) and the method of quantization all remain unchanged (see Chapter 2). Further one must assume that the signal has not been subjected to any processing, since filtering, gain changing, and other such operations may introduce audible or visual side effects. Operations such as sampling frequency conversion and changes in resolution (say, between 20 and 16 bits per sample) may also introduce artefacts, since these can never be ‘perfect’ processes. Therefore an operation such as copying a signal digitally between two recording systems with the same characteristics is a transparent process, resulting in absolutely no loss of quality (see Figure 1.1), but copying between two recorders with different sampling rates via a sampling frequency convertor is not (although the side effects in most cases are very small).

Figure 1.1 (a) A ‘clone’ copy may be made using a digital interconnect between two devices operating at the same sampling rate and resolution. (b) When sampling parameters differ, digital interconnects may still be used, such as in this example, but the copy will not be a true ‘clone’ of the original.

Confusion arises when users witness a change in sound or picture quality even in the former of the above two cases, leading them to suggest that digital copying is not a transparent process, but the root of this problem is not in the digital copying process – it is in the digital-to-analog conversion process of the device which the operator is using to monitor the signal, as discussed in greater detail in Chapter 2. It is true that timing instabilities and (occasionally) errors may arise when signals are transferred digitally between devices, but both of these are normally correctable or avoidable within the digital domain. Since data errors are extremely rare in digitally interfaced systems, it is timing instabilities which are most likely to affect the convertor. Poor quality clock recovery in the receiver and lack of timebase correction in the convertor often allow timing instabilities resulting from the digital interface to affect programme quality when it is monitored in the analog domain. This does not mean that the digital programme itself is of poor quality, simply that the convertor is incapable of rejecting the instability. Although it is difficult to ensure low jitter clock recovery in receivers, especially with the stability required for very high convertor resolutions (e.g. 20 bits in audio), it is definitely here that the root of the problem lies and not really in the nature of digital signals themselves, since it is possible to correct such instabilities with digital signals but not normally possible with analog signals. This is discussed further in Chapter 2 and in section 6.4.3, and for additional coverage of these topics the reader is referred to Rumsey [1] and Watkinson [2, 3].

1.2  Analog and Digital Communication Compared

Before going on to examine specific digital audio and video interfaces it would be useful briefly to compare analog and digital interfaces in general, and then to look at the basic principles of digital communication.

In an analog wire link between two devices the baseband audio or video signal from a transmitter is carried directly in the form of variations in electrical voltage. Any unwanted signals induced in the wire, such as radio frequency interference (RFI) or mains hum, will be indistinguishable from the wanted signal at the receiver, as will any noise or timing instability introduced between transmitter and receiver, and these are likely to affect signal quality (see Figure 1.2). Techniques such as balancing (see section 1.7.1) and the use of transmission lines (see section 1.7.3) are used in analog communication to minimize the effects of long lines and interference. Forms of modulation are used in analog links, especially in radio frequency transmission, whereby the baseband (unmodulated) signal is used to alter the characteristics of a high frequency carrier, resulting in a spectrum with a sideband structure around the carrier. Modulation may give the signal increased immunity to certain types of interference [4], but modulated analog signals are rarely carried over wire links within a studio installation.

Figure 1.2 If noise is added to an analog signal the result is a noisy signal. In other words, the noise has become a feature of the signal which may not be separated from it.

Pulse amplitude modulation (PAM) is a means of modulating an analog signal onto a series of pulses, such that the amplitude of the pulses varies according to the instantaneous amplitude of the analog signal (see Figure 1.3) and this is the basis of pulse code modulation (PCM) whereby the amplitude of PAM pulses is quantized, resulting in a binary signal that is exceptionally immune to noise and interference since it only has two states. This process normally takes place in the analog-to-digital (A/D) convertor of an audio or video system, described in more detail in Chapter 2. PCM is the basis of all current digital audio and video interfaces, and, as shown in Figure 1.4, the effects of unwanted noise and timing instability may be rejected by reclocking the binary signal and comparing it with a fixed threshold. Provided that the receiving device is able to distinguish between the two states, one and zero, and can determine the timing slot in which each binary digit (bit) resides, the system may be shown to be able to reject any adverse effects of the link. Over long links a digital signal may be reconstructed regularly, in order to prevent it from becoming impossibly distorted and difficult to decode. Of course there will be cases in which an interfering signal will cause a sufficiently large effect on the digital signal to prevent it from being correctly reconstructed, but below the threshold at which this happens there will be no effect at all.

Figure 1.3 In pulse amplitude modulation (PAM) a regular chain of pulses is amplitude-modulated by the baseband signal.

Figure 1.4 (a) A binary signal is compared with a threshold and reclocked on receipt, thus the meaning will be unchanged. (b) Jitter on a signal can appear as noise with respect to fixed timing. (c) Noise on a signal can appear as jitter when compared with a fixed threshold.
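The threshold comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the voltage levels (0 V and 1 V), the 0.5 V threshold and the bounded uniform noise model are all hypothetical, and the reclocking stage is not modelled (one sample per bit is taken, as if the bit timing had already been recovered).

```python
import random

def transmit(bits, high=1.0, low=0.0):
    """Map each bit to a nominal voltage level (assumed values)."""
    return [high if b else low for b in bits]

def add_noise(levels, peak, seed=0):
    """Add bounded uniform noise of at most +/- peak volts to each level."""
    rng = random.Random(seed)
    return [v + rng.uniform(-peak, peak) for v in levels]

def regenerate(levels, threshold=0.5):
    """Compare each received level with a fixed threshold to recover the bit."""
    return [1 if v > threshold else 0 for v in levels]

bits = [0, 1, 0, 1, 1, 1, 0, 1]
received = add_noise(transmit(bits), peak=0.4)
recovered = regenerate(received)
# The noise never crosses the 0.5 V threshold, so the data is unchanged.
print(recovered == bits)
```

With a peak noise of 0.4 V the received levels always stay on the correct side of the threshold, so the recovered bits are identical to those sent. Raising `peak` above 0.5 V allows occasional errors, which corresponds to the threshold effect mentioned in the text.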

Compared with an analog baseband signal, its digital counterpart normally requires a considerably greater bandwidth, as can be seen from the diagrams above, but the advantage of a digital signal is that it can normally survive over a channel with a relatively low signal-to-noise ratio. Another advantage of digital communications is that a number of signals of different types may be carried over the same physical link without interfering with each other, since they may be time-division multiplexed (TDM), as discussed in section 1.5.5.

1.3  Quantization, Binary Data and Word Length

When a PAM pulse is quantized it is converted into a PCM word of a certain length or resolution. The number of bits in the word determines the accuracy with which the original analog signal may be represented – a larger number of bits allowing more accurate quantization, resulting in lower noise and distortion. This process is covered further in Chapter 2, and thus will not be discussed further here; suffice it to say that for digital audio word lengths of up to 24 bits may be considered necessary for high sound quality, whereas for digital video a smaller number of bits per sample (typically 8–10) is adequate. Since the number of bits per sample is related directly to the signal-to-noise (S/N) ratio of the system it can be deduced that the S/N ratio required for high quality sound is greater than that required for high quality pictures.
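As a rough illustration of the link between word length and S/N ratio, the following sketch uses the commonly quoted approximation for an ideal quantizer driven by a full-scale sine wave, 6.02n + 1.76 dB for n bits. This formula is not given in the text above and is introduced here as an assumption; real convertors will fall short of it to some degree.

```python
def quantization_snr_db(bits):
    """Approximate S/N ratio (dB) of an ideal quantizer, full-scale sine wave.

    Uses the textbook approximation 6.02 * n + 1.76 dB (an assumption here,
    not a figure taken from this chapter).
    """
    return 6.02 * bits + 1.76

for n in (8, 10, 16, 20, 24):
    print(n, "bits:", round(quantization_snr_db(n), 1), "dB")
```

Each additional bit adds roughly 6 dB, which is why the longer word lengths used for audio imply a considerably higher S/N ratio than the 8–10 bits typical of video.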

Although the resolution of audio samples may be greater than that of video samples, the sampling rate of a video system is much higher than that required for audio. Thus the total amount of data required per second to represent a moving video picture is considerably greater than that required to represent a sound signal. (In all cases it is assumed that no form of data reduction is used.) This has important implications when considering the requirements for different types of digital interface.
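The difference in scale can be illustrated with a simple calculation. The figures used below are assumptions chosen for illustration rather than values taken from this section: a two-channel audio signal sampled at 48 kHz with 24 bits per sample, and a video signal totalling 27 million 10-bit samples per second (a figure assumed here as representative of standard-definition component video).

```python
def data_rate_bits_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Raw data rate with no data reduction and no interface overhead."""
    return sample_rate_hz * bits_per_sample * channels

# Illustrative (assumed) parameters, not taken from the text.
audio = data_rate_bits_per_second(48_000, 24, channels=2)
video = data_rate_bits_per_second(27_000_000, 10)

print("audio:", audio / 1e6, "Mb/s")
print("video:", video / 1e6, "Mb/s")
print("ratio:", round(video / audio))
```

Under these assumptions the video signal requires over a hundred times the data rate of the stereo audio signal, despite its shorter word length.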

In computer terminology, eight bits is a byte, and this is the unit of storage often used in computer systems, even though data may actually be handled in word lengths considerably longer than eight bits. A binary word is sometimes confused with a byte, but a word can be of virtually any length, whereas a byte may only ever be eight bits. The bit with the greatest ‘weight’ in a binary word (the leftmost, or the highest power of two, when written down in conventional form) is called the most significant bit, and the bit with the least weight (2⁰ = 1, which is normally the rightmost bit) is called the least significant bit. The following example illustrates the point:

Take the eight-bit binary value ‘01011101’:
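The breakdown of this value can be sketched as follows. The variable names are purely illustrative; the code simply picks out the leftmost and rightmost bits and sums the weights of the bits that are set.

```python
word = "01011101"

msb = word[0]    # leftmost bit, weight 2**7 = 128
lsb = word[-1]   # rightmost bit, weight 2**0 = 1

# Sum the weight of every bit, taking the leftmost bit as the highest power of two.
value = sum(int(bit) << (len(word) - 1 - i) for i, bit in enumerate(word))

print("MSB =", msb, "LSB =", lsb, "decimal value =", value)
```

Here the most significant bit is 0 and the least significant bit is 1, and the word represents 64 + 16 + 8 + 4 + 1 = 93 in decimal.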

A kilobit (Kb) is 1024 bits (2¹⁰ bits), and a kilobyte (Kbyte) is 1024 bytes. A megabit (Mb) is 1024 kilobits. Confusingly, in communications terminology a data rate of a kilobit per second represents 1000 bits per second, not 1024 bits per second.

1.4  Serial and Parallel Communications

The bits of a binary word may be transmitted either in parallel or serial form (see Figure 1.5). In the parallel form each bit is carried over a separate communications channel and the result is at least as many channels as there are bits in the word (there are normally additional lines for controlling the exchange of data on a parallel interface). Thus a 24-bit parallel interface would require at least 24 wires, an earth return, a clock line, and a number of address and handshaking lines. Such an approach is normally used for short distance communications (‘buses’) within a digital device, but is bulky and uneconomical for use over longer distances.

Figure 1.5 When a signal is carried in numerical form, either parallel or serial, the mechanisms of Figure 1.4 ensure that the only degradation is in the conversion process.

When data is carried serially it only requires a single channel (although electrically that channel may consist of more than one wire), and this makes it economical and simple to implement over large distances. On a serial interface the bits of a word are sent one after the other, so for a given clock rate it is slower than the parallel equivalent, although this is something of an oversimplification since some serial interfaces are extremely fast. Some serial interfaces carry clock and control information over the same channel as the data, whereas others accompany the data channel with a number of additional parallel lines and a clock signal to control the flow of data between devices. Depending on the standard protocol in use it is possible for serial data to be sent either MSB first or LSB first (see section 1.3), and knowing the convention is clearly important when interpreting received data. These matters are examined further in the next section.
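The importance of the bit-order convention can be shown with a short sketch. The function names `serialize` and `deserialize` are hypothetical and do not correspond to any particular interface standard.

```python
def serialize(value, word_length, msb_first=True):
    """Return the bits of a word in the order they would be sent."""
    bits = [(value >> i) & 1 for i in range(word_length)]  # LSB first
    return bits[::-1] if msb_first else bits

def deserialize(bits, msb_first=True):
    """Rebuild a word from received bits, given the assumed bit order."""
    ordered = bits if msb_first else bits[::-1]
    value = 0
    for bit in ordered:
        value = (value << 1) | bit
    return value

sent = serialize(0b01011101, 8, msb_first=True)
print(sent)
print(deserialize(sent, msb_first=True))    # correct convention
print(deserialize(sent, msb_first=False))   # wrong convention gives a different value
```

A receiver that assumes the wrong convention still decodes a valid word, but its bits are reversed, so the value is generally wrong without any error being flagged.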

1.5  Introduction to Interface Terminology

It is important to understand certain fundamental principles which will arise regularly in the discussion of interfacing and communications.

1.5.1 Data Rate Versus Baud Rate

The data rate of an interface is the rate at which information is carried (sometimes referred to as the information rate), whereas the baud rate of an interface is the modulation rate or number of data ‘symbols’ per second. Although in many cases the two are equivalent, modulation schemes exist which allow for more than one bit to be carried per baud, such as by using multi-level encoding, by using a form of phase-shift keying or other such channel code (see section 3.6).
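The relationship between the two rates can be sketched for the multi-level case, assuming that every symbol is equally capable of taking any one of a fixed number of levels. The function names are illustrative only.

```python
import math

def bits_per_symbol(levels):
    """Number of bits carried by one symbol that can take `levels` states."""
    return math.log2(levels)

def data_rate(baud_rate, levels):
    """Data rate in bits per second for a given symbol rate and level count."""
    return baud_rate * bits_per_symbol(levels)

print(data_rate(1000, 2))   # two-level signal: data rate equals baud rate
print(data_rate(1000, 4))   # four-level signal: two bits per symbol
```

With a simple two-level signal the data rate and baud rate are equal, whereas a four-level scheme at the same baud rate carries twice as much data.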

Data rate is normally quoted as so many kilo- or megabits per second (Kb/s, Mb/s) and this must normally include any capacity for control and additional data. The term baud rate is used more widely in computer and telecommunications systems than it is in audio and video interfacing, but it is useful to understand the distinction.

1.5.2 Synchronous, Asynchronous and Isochronous Communications

The receiving device must be able to determine in which time slot it is to register each bit of data which arrives and there are two approaches to achieving this end. In synchronous communications a clock signal normally accompanies the data, either on a separate wire or modulated with the data (see Chapter 3) and this is used to synchronize the receiver’s clock to that of the transmitter. Each bit of data may be latched at the receiver on one of the edges of a separate clock, or the clock may be extracted from the modulated data using a suitable phase-locked loop (PLL), as described in section 3.5.

In asynchronous communications the clocks of the transmitter and receiver are not locked directly, but must have an almost identical frequency. The tolerance is often around ±1%. In such a protocol each byte of data is prefixed with a start bit and followed by one or more stop bits (see Figure 1.6) and the phase of the receiver’s clock is adjusted at the trailing edge of the start bit. The following data bits are then clocked in according to the receiver’s clock, which should remain sufficiently in phase with the transmitted data over the duration of one byte to ensure correct reception. The receiver’s clock is then resynchronized at the start of the next data byte. Such an approach is often used in computer systems for exchange of data with remote locations over a modem, for example, where the gaps between received bytes may be variable and data flow may not be regular.

Figure 1.6 In asynchronous data transfer each data byte is normally preceded by a start bit and followed by a stop bit to synchronize the receiver.
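The framing of a byte for asynchronous transfer might be sketched as below. The polarity of the start and stop bits (0 and 1 respectively) and the LSB-first ordering of the data bits are assumptions typical of simple serial ports, not details given in the text, and the function names are hypothetical.

```python
def frame_byte(byte, stop_bits=1):
    """Start bit (0), eight data bits sent LSB first, then stop bit(s) (1)."""
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1] * stop_bits

def unframe(bits, stop_bits=1):
    """Check the start and stop bits, then rebuild the data byte."""
    if bits[0] != 0 or any(b != 1 for b in bits[9:9 + stop_bits]):
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))

frame = frame_byte(0x5D)
print(frame)
print(hex(unframe(frame)))
```

With one start bit and one stop bit, ten bits are sent for every eight bits of data, so a fifth of the channel capacity is spent on synchronization.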

In isochronous systems one master clock is used to synchronize all receiving devices, and data transmission and reception is clocked with relation to this source.

1.5.3 Uni- and Bi-Directional Interfaces

In a uni-directional interface data may only be transmitted in one direction, and no return path is allowed from the receiver back to the transmitter. In a bi-directional interface a return path is provided and this allows for two-way communications. The return path is often used in a simple serial situation to send back handshaking information to the transmitter, telling it whether or not the data was received satisfactorily. Unfortunately such an approach is not particularly useful in digital audio and video interfacing because the data in these situations is transferred in real time, making retransmission in the case of an error an unrealistic concept.

A simplex interface is one which operates in one direction only; a half-duplex interface is one which operates in both directions, but only one at a time; and a full-duplex interface is one capable of simultaneous transmission and reception.

1.5.4 Clock Signals

A clock signal may be needed, as described above, to synchronize the receiver. In digital audio and video interfacing a variety of methods are used to ensure synchronism between transmitter and receiver which are discussed in detail in the sections concerned. In general two important clock frequencies exist, that is the ‘word clock’ which indicates the sampling frequency or rate of sample words over the interface, and the ‘bit clock’ which indicates the rate of individual data bits. Some interfaces, such as the AES/EBU audio interface, combine the bit clock with the data using a modulation scheme or channel code known as bi-phase mark and indicate the starts of sample words using a violation of the modulation scheme which is easily detected by the receiver. This approach avoids the need for additional lines to carry clock signals and the data is said to be ‘self-clocking’. In contrast, an interface such as Mitsubishi’s audio interface carries the bit clock and word clock signals on individual wires.
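The principle of a self-clocking code can be illustrated with a minimal bi-phase mark encoder and decoder. This sketch assumes the usual definition of the code (a transition at the start of every bit cell, plus an extra transition in the middle of the cell for a one) and deliberately omits the code violations used to mark the starts of sample words; the function names are hypothetical.

```python
def biphase_mark_encode(bits, initial_level=0):
    """Encode bits as a list of half-cell levels (two per data bit)."""
    level = initial_level
    half_cells = []
    for bit in bits:
        level ^= 1                 # transition at the start of every bit cell
        half_cells.append(level)
        if bit:
            level ^= 1             # extra mid-cell transition encodes a one
        half_cells.append(level)
    return half_cells

def biphase_mark_decode(half_cells):
    """A cell whose two halves differ is a one; otherwise it is a zero."""
    return [int(a != b) for a, b in zip(half_cells[0::2], half_cells[1::2])]

data = [1, 0, 1, 1, 0]
encoded = biphase_mark_encode(data)
print(encoded)
print(biphase_mark_decode(encoded) == data)
```

Because every bit cell begins with a transition, the receiver can recover the bit clock from the data stream itself, and because only transitions carry information the decoded data is unaffected if the polarity of the signal is inverted.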

An alternative approach is that used in the so-called ‘MADI’ audio interface (see Chapter 4) in which the transmitter and receiver are both locked to a common reference clock signal and the data is transmitted asynchronously, using a buffer at both ends of the interface to allow for flexibility in timing. This is a form of isochronous approach. The different methods are summarized in Figure 1.7.

Figure 1.7 A number of approaches may be used to ensure synchronization in data transfer. (a) Asynchronous transfer relies on transmitter and receiver having almost identical frequencies, requiring only the receiver’s clock phase to be adjusted by the start bit of each byte. (b) In synchronous transfer the data signal is accompanied by a separate clock. (c) A form of synchronous transfer involves modulating the data in such a way that a clock signal is part of the channel code. (d) In the isochronous approach both devices are locked to a common clock.

1.5.5 Multiplexing

A multiplexed interface is one which carries more than one data signal over a single channel. This is normally achieved using time-division multiplexing (TDM), whereby different time slots in the data stream carry different kinds of information (see Figure 1.8). Here the channel capacity is divided up between the data streams using it, and a multiplexer takes control of the insertion of data into the correct time slots. A demultiplexer at the receiving end extracts the data from each time slot and produces a number of separate data streams.

Figure 1.8 In a serial time-division multiplex a number of input channels share a single high-speed link. Each time slot carries a data packet for one of the channels.
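The action of the multiplexer and demultiplexer can be sketched as a simple interleaving of samples. This assumes that all channels run at the same rate and that one sample per channel is sent in each frame; the function names and sample labels are illustrative.

```python
def multiplex(channels):
    """Interleave equal-length channels sample by sample into one stream."""
    return [sample for frame in zip(*channels) for sample in frame]

def demultiplex(stream, channel_count):
    """Extract every channel_count-th time slot to rebuild each channel."""
    return [stream[i::channel_count] for i in range(channel_count)]

left = ["L0", "L1", "L2"]
right = ["R0", "R1", "R2"]

stream = multiplex([left, right])
print(stream)
print(demultiplex(stream, 2))
```

The two input channels take alternate time slots on the link, and the demultiplexer recovers them by selecting the slots belonging to each channel.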

The AES/EBU audio interface is a simple example of a multiplexed interface, since left and right channel data are multiplexed over the same communications channel, with samples of data for each channel taking alternate time slots on the interface. The MADI interface is a more complicated example in which 56 channels of audio data are multiplexed onto one communications channel – a result of which is that the data rate of the communications channel must be extremely high in order to ensure that data for each audio channel can still be carried in real time.

Thus a multiplexed interface must have a data rate high enough to handle the total data rate of all the data streams which are to be multiplexed onto it, otherwise delays will arise as data queues to use the link. In a computer network it is often acceptable to have short delays where the network is shared between many users, since one can wait for a file to load from a remote server, but for real-time applications such as audio and video it is clearly unacceptable to have a shared network which cannot always carry each channel without breaks or queuing.
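The minimum data rate of a multiplexed link follows directly from the number of channels it carries. The sketch below assumes a sampling rate of 48 kHz and a 32-bit time slot per sample, both of which are illustrative assumptions rather than figures from this section, and ignores any further overhead added by the channel code.

```python
def required_link_rate(channel_count, sample_rate_hz, bits_per_slot):
    """Minimum data rate (bits per second) to carry all channels in real time."""
    return channel_count * sample_rate_hz * bits_per_slot

# Assumed parameters: 48 kHz sampling, 32 bits per time slot.
two_channel = required_link_rate(2, 48_000, 32)
many_channel = required_link_rate(56, 48_000, 32)

print(two_channel / 1e6, "Mb/s for 2 channels")
print(many_channel / 1e6, "Mb/s for 56 channels")
```

Under these assumptions a 56-channel link must carry 28 times the data rate of a two-channel link, which shows why the data rate of the communications channel must be so high.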

1.5.6 Buffering

Buffering is sometimes used at transmitting and receiving ends of an interface to store a number of samples of data temporarily. The buffer is normally a RAM (Random Access Memory) store configured in the FIFO (First In, First Out) mode whose input may be addressed separately to its output (see Figure 1.9). Using such a buffer it is possible to iron out irregularities in data flow, since erratic data arriving at the input may be stored in the buffer and read out at a more regular rate after a short delay. The approach will be successful provided that the average rate of flow into the buffer equals the average rate of flow out of it, and the buffer is big enough to accommodate the irregularities which may arise at its input, otherwise the buffer will either overflow or become empty after a time.

Figure 1.9 A memory buffer may be used as a temporary store to handle irregularities in data flow. Data samples are written to successive memory locations and read out a short time later under control of the read clock.
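The behaviour of such a buffer can be sketched as follows. The class and function names are hypothetical, and the model is deliberately simple: samples arrive in irregular bursts, and once a chosen number of samples (`preload`) has accumulated, one sample is read out on every clock tick.

```python
from collections import deque

class FifoBuffer:
    """First In, First Out store with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = deque()

    def write(self, sample):
        if len(self.store) >= self.capacity:
            raise OverflowError("buffer overflow")
        self.store.append(sample)

    def read(self):
        if not self.store:
            raise LookupError("buffer empty (underflow)")
        return self.store.popleft()

def smooth(bursts, capacity, preload):
    """bursts[i] holds the samples arriving during tick i."""
    fifo = FifoBuffer(capacity)
    output, started = [], False
    for burst in bursts:
        for sample in burst:
            fifo.write(sample)
        started = started or len(fifo.store) >= preload
        if started:
            output.append(fifo.read())   # one sample out per tick
    while fifo.store:                    # drain what is left after input stops
        output.append(fifo.read())
    return output

print(smooth([[0, 1, 2], [], [], [3, 4], [5], []], capacity=4, preload=2))
```

In this example six samples arrive over six ticks, so the average input and output rates match and the samples emerge at a regular rate. If the bursts are too large for the chosen capacity, or the gaps too long, the buffer overflows or runs empty, as described in the text.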

A disadvantage of the approach is the delay which arises between input and output, which may be undesirable.

1.6  Introduction to Networks

In the most general sense a network is a means of communication between a large number of places. According to this definition the Post Office is a network, as are parcel and courier companies. This type of network delivers physical objects. If, however, we restrict the delivery to information only the result is a telecommunications network. The telephone system is a good example of a telecommunications network because it displays most of the characteristics of later networks.

It is fundamental in a network that any port can communicate with any other port. Figure 1.10 shows a primitive three-port network. Clearly each port must select one or other of the remaining ports in a trivial switching system. However, if it were attempted to redraw Figure 1.10 with 100 ports, each one would need a 99-way switch and the number of wires needed would be phenomenal. Another approach is needed.

Figure 1.10 A simple three-port network has trivial switching requirements.
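The growth in complexity can be quantified with a short calculation, assuming that every pair of ports is joined by its own dedicated link. The function name is illustrative.

```python
def full_mesh(ports):
    """Return (ways per switch, total links) for a fully interconnected network."""
    switch_ways = ports - 1                 # each port selects any other port
    links = ports * (ports - 1) // 2        # one link per pair of ports
    return switch_ways, links

print(full_mesh(3))     # the three-port case of Figure 1.10
print(full_mesh(100))   # the hypothetical 100-port case
```

Under this assumption the three-port network needs only three links, whereas 100 ports would need 4950, since the number of links grows roughly with the square of the number of ports.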

Figure 1.11 shows that the common solution is to have an exchange, also known as a router, hub or switch, which is connected to every port by a single cable. In this case when a port wishes to communicate with another, it instructs the switch to make the connection. The complexity of the switch varies with its performance. The minimal case may be to install a single input selector and a single output selector. This allows any port to communicate with any other, but only one at a time. If more simultaneous communications are needed, further switching is needed. The extreme case is where every possible pair of ports can communicate simultaneously.

Figure 1.11 A network implemented with a router or hub.

The amount of switching logic needed to implement the extreme case is phenomenal and in practice it is unlikely to be needed. One fundamental property of networks is that they are seldom implemented with the extreme case supported. There will be an economic decision made balancing the number of simultaneous communications with the equipment cost. Most of the time the user will be unaware that this limit exists, until there is a statistically abnormal condition which causes more than the usual number of nodes to attempt communication.

The phrase ‘the switchboard was jammed’ has passed into the language and stayed there despite the fact that manual switchboards are only seen in museums. This is a characteristic of networks. They generally only work up to a certain throughput and then there are problems. This doesn’t mean that networks aren’t useful, far from it. What it means is that with care, networks can be very useful, but without care they can be a nightmare.

There are two key factors to get right in a network. The first is that it must have enough throughput, bandwidth or connectivity to handle the anticipated usage and the second is that a priority system or algorithm is chosen which has appropriate behaviour during overload. These two characteristics are quite different, but often come as a pair in a network corresponding to a particular standard.

Where each device is individually cabled, the result is a radial network shown in Figure 1.12(a). It is not necessary to have one cable per device and several devices can co-exist on a single cable if some form of multiplexing is used. This might be time-division multiplexing (TDM) or frequency division multiplexing (FDM). In TDM, shown in Figure 1.12(b), the time axis is divided into steps which may or may not be equal in length. In Ethernet, for example, these are called frames. During each time step or frame a pair of nodes have exclusive use of the cable. At the end of the time step another pair of nodes can communicate. Rapidly switching between steps gives the illusion of simultaneous transfer between several pairs of nodes. In FDM, simultaneous transfer is possible because each message occupies a different band of frequencies in the cable. Each node has to ‘tune’ to the correct signal. In practice it is possible to combine FDM and TDM. Each frequency band can be time multiplexed in some applications.

Figure 1.12 Radial network at (a) has one cable per node. TDM network (b) shares timeslots on a single cable.

Data networks originated to serve the requirements of computers and it is a simple fact that most computer processes don’t need to be performed in real time or indeed at a particular time at all. Networks tend to reflect that background as many of them, particularly the older ones, are asynchronous.

Asynchronous means that the time taken to deliver a given quantity of data is unknown. A TDM system may chop the data into several different transfers and each transfer may experience delay according to what other transfers the system is engaged in. Ethernet and most storage system buses are asynchronous. For broadcasting purposes an asynchronous delivery system is no use at all, but for copying a video data file between two storage devices an asynchronous system is perfectly adequate.

The opposite extreme is the synchronous system in which the network can guarantee a constant delivery rate and a fixed and minor delay. An AES/EBU router is a synchronous network.

In between asynchronous and synchronous networks reside the isochronous approaches. These can be thought of as sloppy synchronous networks or more rigidly controlled asynchronous networks. Both descriptions are valid. In the isochronous network there will be a maximum delivery time which is not normally exceeded. The data transmission rate may vary, but if the rate has been low for any reason, it will accelerate to prevent the maximum delay being reached. Isochronous networks can deliver near-real-time performance. If a data buffer is provided at both ends, synchronous data such as AES/EBU audio can be fed through an isochronous network. The magnitude of the maximum delay determines the size of the buffer and the length of the fixed overall delay through the system. This delay is responsible for the term ‘near-real time’. ATM is an isochronous network.
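The relationship between maximum delay and buffer size can be sketched as a simple product of data rate and delay. The figures used (a 3.072 Mb/s stream and a 2 ms maximum delay) are hypothetical, and integer arithmetic in microseconds is used to avoid rounding problems.

```python
def buffer_size_bits(data_rate_bps, max_delay_us):
    """Smallest whole number of bits that covers the worst-case network delay."""
    return -(-data_rate_bps * max_delay_us // 1_000_000)  # ceiling division

# Hypothetical example: 3.072 Mb/s stream, 2 ms maximum network delay.
size = buffer_size_bits(3_072_000, 2_000)
print(size, "bits, or", size // 8, "bytes")
```

The receiving buffer must hold at least this much data before output begins, so that the synchronous output can continue uninterrupted while the network is at its slowest.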

These three different approaches are needed for economic reasons. Asynchronous systems are very efficient because as soon as one transfer completes, another can begin. This can only be achieved by making every device wait with its data in a buffer so that transfer can start immediately. Asynchronous systems also make it possible for low bit rate devices to share a network with high bit rate devices. The low bit rate device will only need a small buffer and will therefore send short data blocks, whereas the high bit rate device will send long blocks. Asynchronous systems have no difficulty in handling blocks of varying size, whereas in a synchronous system this is very difficult.

Isochronous systems try to give the best of both worlds, generally by sacrificing some flexibility in block size. FireWire (see Chapter 5) is an example of a network which is part isochronous and part asynchronous so that the advantages of both are available.

Whilst computer industry network technology is not ideal for real-time audio-visual information, it has to be accepted that the volume of computer equipment production is such that this technology must be less expensive than hardware specifically designed for broadcast use. There will thus be economic pressure to find ways of using it for audio-visual applications.

1.7  The Electrical Interface

Although digital interfaces are primarily concerned with the transfer of binary data, there are ‘analog’ problems to be considered, since the electrical characteristics of the interface such as the type of cable used, its frequency response and impedance will affect the ability of the interface to carry data signals over distances without distortion.

1.7.1 Balanced and Unbalanced Compared

In an unbalanced interface there is one signal wire and a ground, and the data signal alternates between a positive and negative voltage with respect to ground (see Figure 1.13). The shield of the cable is normally connected to the ground at the transmitter end, and may or may not be connected at the receiver end depending on whether there is a problem with earth loops (a situation in which the earths of the two devices are at different potentials, causing a current to circulate between them, sometimes resulting in hum induction into the signal wire). The unbalanced interface tends to be quite susceptible to interference, since any unwanted signal induced in the data wire will be inseparable from the wanted signal.

Figure 1.13 Electrical configuration of an unbalanced interface.

In a balanced interface there are two signal wires and a ground (see Figure 1.14) and the interface is terminated at both ends either in a differential amplifier or a transformer. The driver drives the two legs of the line in opposite phase, and the advantage of the balanced interface is that any interfering signal is induced equally into the two legs, in phase. At the receiver any so-called ‘common mode’ signals are cancelled out either in the transformer or differential amplifier, since such devices are only interested in the difference between the two legs. The degree to which the receiver can reject common mode signals is called the common mode rejection ratio (CMRR). Although it is a generally held belief that transformers offer better CMR than electronically balanced lines, the performance of modern differential line drivers and receivers is often as good. The advantage of transformers is that they make the line truly ‘floating’, that is independent of the ground, and there is no DC coupling across them, thus isolating the two devices.

Figure 1.14 Electrical configuration of a balanced interface. (a) Transformer balanced. (b) Electronically balanced.
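
The cancellation of common mode interference can be illustrated numerically. The following Python sketch is not drawn from any interface standard: the signal and interference values and the function names are assumptions chosen purely for illustration, and the receiver is idealized in that the two legs are taken to be perfectly matched.

```python
import math

def differential_receive(leg_plus, leg_minus):
    """Ideal differential receiver: responds only to the difference between the legs."""
    return [p - m for p, m in zip(leg_plus, leg_minus)]

def cmrr_db(differential_gain, common_mode_gain):
    """Common mode rejection ratio in dB, from a receiver's two gains."""
    return 20 * math.log10(abs(differential_gain / common_mode_gain))

# Hypothetical data signal (volts), driven in opposite phase on the two legs.
data = [1.0, -1.0, -1.0, 1.0]
# Hypothetical interference induced equally, and in phase, into both legs.
hum = [0.3, 0.1, -0.2, -0.4]

leg_plus = [+d / 2 + h for d, h in zip(data, hum)]
leg_minus = [-d / 2 + h for d, h in zip(data, hum)]

# The interference appears on both legs but vanishes in the difference.
recovered = differential_receive(leg_plus, leg_minus)
```

In a real receiver the cancellation is never perfect. A receiver whose common mode gain is one thousandth of its differential gain would have a CMRR of about 60 dB, which is what `cmrr_db(1.0, 0.001)` evaluates to.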

The balanced interface therefore requires one more wire than the unbalanced interface, and will usually have lines labelled ‘Ground’, ‘Data+’ and ‘Data–’, whereas the unbalanced interface will simply have ‘Ground’ and ‘Data’. For temporary test set-ups it is sometimes possible to interconnect between balanced and unbalanced electrical interfaces or vice versa, by connecting the unbalanced interface between the two legs of the balanced one, or between the ground and one leg, but often the voltages involved are different, and one must take care to ensure that the two data streams are compatible. Balancing transformers are available which will convert a signal from one form to the other.

1.7.2 Electrical Interface Standards

The different interface standards specify various peak-to-peak voltages for the data signal, and also specify a minimum acceptable voltage at the receiver to ensure correct decoding (this is necessary because the signal may have been attenuated after passing over a length of cable). Quite commonly serial interfaces conform to one of the international standard conventions which describe the electro-mechanical characteristics of data interfaces, such as RS-422 [5], which is a standard for balanced communication over long lines devised by the EIA (Electronic Industries Association). For example, the AES3-1992 audio interface is designed to be able to use RS-422 drivers and receivers and specifies a peak-to-peak voltage between 2 and 7 volts at the transmitter (when measured across a 110 ohm resistor with no interconnecting cable present), and also specifies a minimum ‘eye-height’ (see section 3.5) at the receiver of 200 mV.
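
As a rough worked example of what these figures imply, the ratio between the transmitted voltage and the minimum voltage at the receiver gives an upper bound on the cable attenuation that could be tolerated. The Python sketch below makes the simplifying assumption that the eye height can be treated as the received peak-to-peak amplitude. This ignores noise, jitter and waveform distortion, so the results are optimistic. The function name is hypothetical.

```python
import math

def max_attenuation_db(v_tx_pp, v_rx_min):
    """Largest voltage attenuation (dB) that still leaves v_rx_min at the receiver."""
    return 20 * math.log10(v_tx_pp / v_rx_min)

# AES3-1992 figures quoted in the text: 2 to 7 V peak-to-peak at the
# transmitter, 200 mV minimum eye height at the receiver.
weakest_transmitter = max_attenuation_db(2.0, 0.2)    # about 20 dB
strongest_transmitter = max_attenuation_db(7.0, 0.2)  # about 31 dB
```

On this crude basis a link driven at the lowest permitted level has roughly 20 dB of attenuation in hand, which is why the transmitter level matters on long cable runs.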

(In passing it should be noted that standards such as RS-422 are mostly only electrical or electro-mechanical standards, and do not say anything about the format or protocol of the data to be carried over them.)

Unbalanced serial interfaces such as RS-232 are not used in audio and video systems, since they are not designed for the high data rates and long distances involved, and are more suited to telecommunications. The voltages involved in an RS-232 interface can be up to 25V, and thus it is not recommended that one should interconnect an RS-232 output with an RS-422 input (which may be damaged by anything above around 12 volts). An RS-422 interface is designed to carry data at rates up to 100 kbaud over a distance of 1200 metres. Above this rate the distance which may be covered satisfactorily drops with increasing baud rate, as shown in Figure 1.15, depending on whether the line is terminated or not (see section 1.7.3).

Figure 1.15 The RS-422 standard allows different cable lengths at different frequencies. Shown here is the guideline for data signalling rate versus cable length when using twisted-pair cable with a wire diameter of 0.51 mm. Longer distances may be achieved using thicker wire.
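
The general shape of this trade-off can be sketched in a few lines of Python. The only figures taken from the text are the 1200 metre limit and the 100 kbaud rate up to which it applies. The inverse-proportional fall-off above that rate is an assumption made for illustration, not the curve of Figure 1.15, and the real limit depends on the cable and on whether the line is terminated. The function name is hypothetical.

```python
def approx_max_length_m(rate_baud, base_rate=100e3, base_length_m=1200.0):
    """Crude estimate of usable cable length for a given signalling rate.

    Full length is assumed up to base_rate; above it the length is assumed
    to fall in inverse proportion to the rate (an illustrative model only).
    """
    if rate_baud <= 0:
        raise ValueError("rate must be positive")
    if rate_baud <= base_rate:
        return base_length_m
    return base_length_m * base_rate / rate_baud

# Estimated lengths at a few example rates (50 kbaud, 1 Mbaud, 10 Mbaud).
estimates = {rate: approx_max_length_m(rate) for rate in (50e3, 1e6, 10e6)}
```

Under this assumption the usable length falls to around 120 metres at 1 Mbaud and to around 12 metres at 10 Mbaud.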

Other manufacturer-specific interfaces often use TTL levels of 0–5 volts over unbalanced lines, and these are only suitable for communications over relatively short distances.

1.7.3 Transmission Lines

At low data rates a piece of wire can be considered as a simple entity which conducts current and perhaps attenuates the signal resistively to some extent, and in which all components of the signal travel at the same speed as each other, but at higher rates it is necessary to consider the interconnect as a ‘transmission line’ in which reflections may be set up and where such factors as the characteristic impedance and terminating impedance of the line become important.

When considering a simple electrical circuit it is normal to assume that changes in voltage and current occur at the same time throughout the circuit, since the speed at which electricity travels down a wire is fast enough for this to be a reasonable assumption in many cases. When very long cables are involved, the time taken for an electrical signal to travel from one end to the other may approach a significant proportion of one cycle of the signal’s waveform, and when the frequency of the electrical signal is high this is yet more likely since the cycle is short. Another way of considering this is to think in terms of the effective wavelength of the signal in its electrical form, which will be very long since the speed at which electricity travels in wire approaches the speed of light. When the wavelength of the signal in the wire becomes of the same order of magnitude as the length of the wire, transmission line issues may arise.
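
The comparison between cable length and wavelength is easily put into numbers. In the Python sketch below the velocity factor of 0.66 (the fraction of the speed of light at which the signal travels) is an assumed value of the order found in common cables, and the example frequencies and cable length are likewise illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def wavelength_m(frequency_hz, velocity_factor=0.66):
    """Wavelength of a signal in a cable with the given velocity factor."""
    return velocity_factor * SPEED_OF_LIGHT / frequency_hz

def length_in_wavelengths(cable_length_m, frequency_hz, velocity_factor=0.66):
    """Cable length expressed as a number of wavelengths."""
    return cable_length_m / wavelength_m(frequency_hz, velocity_factor)

# A 100 m cable carrying a 20 kHz analog audio component ...
analog_audio = length_in_wavelengths(100.0, 20e3)
# ... and the same cable carrying a 6 MHz component of a data signal.
digital_data = length_in_wavelengths(100.0, 6e6)
```

At 20 kHz the cable is about one hundredth of a wavelength long, so it can be treated as a simple wire. At 6 MHz it is about three wavelengths long, and transmission line behaviour can be expected.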

In a transmission line the ends of the line may be considered to be impedance discontinuities, that is points at which the impedance of the line changes from the line’s characteristic impedance to the impedance of the termination. The characteristic impedance of a line is a difficult concept to grasp but it may be modelled as shown in Figure 1.16, being a combination of inductance and capacitance (and probably, in reality, also some resistance) which depends on the spacing between the conductors, their size and the type of insulation used. Characteristic impedance has been defined as the input impedance of a line of infinite length [6]. The situation at the ends of a transmission line may be likened to what happens when a sound wave hits a wall – a portion of the power is reflected and a portion is absorbed. The wall represents an impedance discontinuity, and at such points sound energy is reflected. At certain frequencies standing wave modes will be set up in the room, at frequencies where multiples of half a wavelength equal one of the dimensions of the room, whereby the reflected wave combines constructively with the incident wave to produce points of maximum and minimum sound pressure within the room.

Figure 1.16 Electrical model of a transmission line.
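
If the resistance in this model is neglected (the lossless approximation), the characteristic impedance is the square root of the ratio of inductance to capacitance per unit length, and the propagation velocity is the reciprocal of the square root of their product. The Python sketch below applies these two standard relationships. The per-metre values are assumptions chosen to be roughly representative of a 75 ohm coaxial cable, not figures from any data sheet.

```python
import math

def characteristic_impedance_ohms(l_per_m, c_per_m):
    """Lossless-line characteristic impedance from inductance and capacitance per metre."""
    return math.sqrt(l_per_m / c_per_m)

def propagation_velocity_m_per_s(l_per_m, c_per_m):
    """Lossless-line propagation velocity from inductance and capacitance per metre."""
    return 1.0 / math.sqrt(l_per_m * c_per_m)

# Illustrative (assumed) values for a coaxial cable.
L_PER_M = 0.377e-6  # henries per metre
C_PER_M = 67e-12    # farads per metre

z0 = characteristic_impedance_ohms(L_PER_M, C_PER_M)   # close to 75 ohms
v = propagation_velocity_m_per_s(L_PER_M, C_PER_M)     # roughly two thirds of c
```

Neither result depends on the length of the cable, which is why the characteristic impedance is a property of the cable's construction rather than of a particular installation.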

If an electrical transmission line is incorrectly terminated, reflections will be set up at the ends of the line, resulting in a secondary electrical wave travelling back down the line. This reflected wave may interfere with the transmitted wave, and in the case of a data signal may corrupt it if the reflected energy is high. The mismatch also results in a loss of power transferred to the next stage. If the line is correctly terminated the end of the line will not appear to be a discontinuity, the optimum power transfer will result at this point, and no reflections will be set up. A line is correctly terminated by ensuring that source and receiver impedances are the same as the characteristic impedance of the line (see Figure 1.17).

Figure 1.17 Electrical characteristics of a matched transmission line.
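
The size of a reflection is conventionally described by the reflection coefficient, the ratio of reflected to incident voltage at the termination, which equals (Z_load − Z_0)/(Z_load + Z_0). The Python sketch below evaluates it for a few cases. It assumes purely resistive impedances, and the function names are hypothetical.

```python
def reflection_coefficient(z_load, z0):
    """Ratio of reflected to incident voltage at a resistive termination."""
    return (z_load - z0) / (z_load + z0)

def reflected_power_fraction(z_load, z0):
    """Fraction of the incident power that is reflected back down the line."""
    return reflection_coefficient(z_load, z0) ** 2

matched = reflection_coefficient(75.0, 75.0)        # correctly terminated: no reflection
mismatched = reflection_coefficient(110.0, 75.0)    # 110 ohm load on a 75 ohm line
short_circuit = reflection_coefficient(0.0, 75.0)   # total reflection, inverted
```

A correctly terminated line gives a coefficient of zero, a short circuit gives −1 (everything reflected with inverted polarity), and a load of very high impedance approaches +1. The 110 ohm load on a 75 ohm line reflects about 19 per cent of the incident voltage, or under 4 per cent of the power.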

The upshot of all this for the purposes of this book is that long interconnects which carry high frequency audio or video data signals, often having bandwidths of many megahertz, may be subject to transmission line phenomena, and thus lines should normally be correctly terminated. The penalty for not doing so may be reduced signal strengths and erroneous data reception after some distance, and such situations can sometimes arise, especially when audio data signals are sent down existing cable runs which may pass through jackfields and different types of wire, each of which presents a change in characteristic impedance. In video environments people tend to be used to the concepts of transmission lines and correct termination, since video signals have always been subject to transmission line phenomena. In audio environments it may be a new concept, since analog audio signals do not contain high enough frequencies for such matters to become a problem, yet digital audio signals may. It is not recommended, for example, to parallel a number of receivers across one source line when dealing with digital signals, since the line will not then be correctly terminated and the signal level may also be considerably attenuated.
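
The effect of paralleling receivers can be estimated with a simple resistive model. The Python sketch below treats the source and each receiver as a pure resistance of 110 ohms (the AES3 figure mentioned in section 1.7.2) and lumps all the receivers together at one point. Both are simplifying assumptions, and the function names are hypothetical.

```python
import math

def parallel_load(z_receiver, n_receivers):
    """Combined impedance of n identical receivers connected in parallel."""
    return z_receiver / n_receivers

def load_voltage_fraction(z_source, z_load):
    """Fraction of the source voltage appearing across the load (voltage divider)."""
    return z_load / (z_source + z_load)

def level_change_db(z_source, z_receiver, n_receivers):
    """Change in received level, relative to a single matched receiver, in dB."""
    matched = load_voltage_fraction(z_source, z_receiver)
    loaded = load_voltage_fraction(z_source, parallel_load(z_receiver, n_receivers))
    return 20 * math.log10(loaded / matched)

two_receivers = level_change_db(110.0, 110.0, 2)   # about -3.5 dB
four_receivers = level_change_db(110.0, 110.0, 4)  # about -8 dB
```

Two receivers present a combined load of 55 ohms, so the level at each falls by about 3.5 dB and the line is no longer matched. With four receivers the drop is about 8 dB.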

1.7.4 Cables

The types of cables used in serial audio and video interfaces vary from balanced screened, twisted pair (e.g. AES/EBU) to unbalanced coaxial links (e.g. Sony SDIF-2) and the characteristic impedances of these cables are different. Typical twisted pair audio cable, such as is used in many analog installations (and also put to use in digital links), tends to have a characteristic impedance of around 90–100 ohms, although this is not carefully controlled in manufacture since it normally does not matter for analog audio signals, whilst the more expensive ‘star-quad’ audio cable has a lower characteristic impedance of around 35 ohms. Typical coaxial cables used in video work have a characteristic impedance of 75 ohms, and other RF coaxial links use 50 ohm cable.

One problem with the twisted pair balanced line is that it radiates more electromagnetic interference than the coaxial link. A further advantage of the coaxial link is that its attenuation does not become severe until much higher frequencies than that of the twisted pair, and the speed of propagation along a coaxial link is significantly faster than along a twisted pair, resulting in smaller signal delays. On the other hand, the twisted pair link is balanced whereas the coaxial link is not, which makes the twisted pair more immune to external interference, and this is a great advantage.

Cable losses at high frequencies will clearly reduce the bandwidth of the interconnect, and the effect of this on the data signal is to slow the rise and fall times of the data edges, and to delay the point of transition between one state and the other by an amount which depends on the state of the data in the previous bit cell. The practical result of this is data link timing jitter, as proposed by Dunn [7], which may affect signal quality in D/A conversion if the clock is recovered from a portion of the data frame which is subject to a large degree of jitter. Links which suffer HF loss can often be equalized at the receiver to prop up the high frequency end of the spectrum, and this can help to accommodate longer interconnects which would otherwise fail. This matter is examined further in section 6.4.1.
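
The dependence of transition timing on the preceding data can be demonstrated with a very simple model. The Python sketch below represents the band-limited link as a first-order (single RC) low-pass filter, which is an assumption made for illustration and not a model of any particular cable. Signal levels are normalized to the range 0 to 1, the time constant is hypothetical, and the line is assumed to have settled fully at 1 before the run of zeros begins.

```python
import math

def crossing_time(v_start, tau, threshold=0.5):
    """Time for the filter output to rise from v_start to the threshold
    after the input steps to 1.0."""
    return tau * math.log((1.0 - v_start) / (1.0 - threshold))

def residual_after_zero_run(n_cells, cell_time, tau):
    """Output level still remaining from an earlier '1' after n_cells of '0'."""
    return math.exp(-n_cells * cell_time / tau)

CELL = 1.0  # bit cell duration, arbitrary units
TAU = 0.8   # hypothetical link time constant, same units

# A rising edge following a long run of zeros starts from almost nothing ...
after_long_run = crossing_time(residual_after_zero_run(10, CELL, TAU), TAU)
# ... whereas after a single zero cell the line has not fully decayed,
# so the threshold is reached sooner.
after_one_cell = crossing_time(residual_after_zero_run(1, CELL, TAU), TAU)

data_dependent_jitter = after_long_run - after_one_cell
```

The two rising edges cross the decision threshold at different times even though they were transmitted with identical timing, and the difference is the data-dependent jitter. Reducing the time constant, which is in effect what equalization does, shrinks the difference.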

Thus a number of factors combine to suggest that the type of cabling used in a digital interconnect is an important matter. A cable is required whose characteristic impedance matches that of the driver and receiver as closely as possible and is consistent along the length of the cable. The cable should have low loss at high frequencies, and the precise specification requires a knowledge of the bandwidth of the signal to be transferred. Cable manufacturers can usually quote such figures in specification sheets or on request, and it pays to study such matters, especially when recabling a large installation. More detailed discussion of these matters is contained within the sections of this book covering individual interfaces.

1.7.5 Connectors

The connectors used in digital interfacing fall into a number of distinct categories (see Figure 1.18). First, there are the unbalanced coaxial connectors, normally BNC type, which differ slightly depending on the characteristic impedance of the line; 50 ohm BNC connectors have a slightly larger central pin than 75 ohm connectors and can damage 75 ohm sockets if used inadvertently. RCA phono connectors, such as are often found in consumer hi-fi systems, are also used for unbalanced consumer digital audio interfaces using coaxial cable, although they are not proper coaxial connectors and do not have a controlled characteristic impedance. Both these connector types carry the data signal on the central pin and the shield on the outer ring.

Figure 1.18 A number of connector types are commonly used in digital interfacing. (a) RCA phono connector. (b) BNC connector. (c) XLR connector. (d) D-type connector.

The XLR-3 connector is used for one balanced digital audio interface, and it has its roots as an analog professional audio connector. The convention is for pin 1 to be the shield, pin 2 to be ‘Data+’, and pin 3 to be ‘Data–’.

The D-sub type of connector stems from computer systems and remote control applications, and it has a number of individual pins arranged in two or three rows. The 9 pin D connector is often used for RS-422 communications, since it allows for two balanced sends and returns plus a ground, whereas some custom digital interfaces (either parallel or multichannel serial) use 25, 36 or even 50 pin D-type connectors.

These are not the only connectors used in audio and video interfacing, but they are the most common. Miscellaneous manufacturer-specific formats use none of these, Yamaha preferring the 8 pin DIN connector, for example, as its digital ‘cascade’ connector in one format.

1.8  Optical Interfaces

Optical fibres are now playing a larger part in everyday data communications, and the cost:performance ratio of such fibre interconnects makes them a reasonable proposition when a large amount of data is to be carried over long distances. The key features of optical fibres are a large bandwidth, very low losses and immunity to interference, coupled with small size and flexibility. Also, when fibres are used as interconnects, devices are electrically isolated thus avoiding problems such as ground loops, shorts and crosstalk. Data is transferred over fibres by modulating a light source with the data, the light being carried within the fibre to an optical detector at the receiving end. The light source may be an LED or a laser diode, with the LED capable of operation up to a maximum of a few hundred megahertz at low power, whilst the laser is preferable in applications at higher frequencies or over longer distances.

1.8.1 Fibre Principles

The typical construction of an optical fibre is shown in Figure 1.19, and light travels in the fibre as in a typical ‘waveguide’, by reflection at the boundaries. Total internal reflection occurs at the boundaries between the fibre core and its cladding, provided that the angle at which the light wave hits the boundary is shallower than a certain value called the critical angle, and thus the method of coupling the light source to the fibre is important in ensuring that light is propagated in the right way. Unless the fibre is exceptionally narrow, with a diameter approaching the wavelength of the light travelling along it (say between 4 and 10 μm), the light will travel along a number of paths – known as ‘multimodes’ – and thus the time taken for a source pulse to arrive at the receiver may vary slightly depending on the length of each path. This results in smearing of pulses in the time domain, and is known as modal dispersion, or, looked at in the frequency domain, it represents a reduction in the bandwidth of the link with increasing length. Up to distances of around 1.5 km the bandwidth decreases roughly linearly with distance (quoted in MHz per km), and after this in proportion to the square root of the length. Optical signals will also be attenuated with distance due to scattering by metal ions within the fibre, and by absorption due to water present within the structure. Losses are usually quoted in dB per km at a specified wavelength of light, and can be as low as 1 dB/km with high quality silica, graded index multimode fibres, or higher than 100 dB/km with plastic or ordinary glass cores. Clearly the high loss cables would be cheaper, and perhaps adequate for consumer applications in which the distances to be covered might be quite small.

Figure 1.19 Cross-section through a typical optical fibre, and mode of transmission.
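
The condition for total internal reflection follows from the refractive indices of the core and cladding. If angles are measured from the normal to the boundary, reflection is total when the angle of incidence exceeds arcsin(n_cladding/n_core), which corresponds to a ray meeting the boundary at a shallow, grazing angle. The Python sketch below evaluates this, together with the numerical aperture that governs the range of angles over which light can be launched into the fibre. The refractive index values are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def critical_angle_deg(n_core, n_cladding):
    """Critical angle in degrees, measured from the normal to the boundary."""
    return math.degrees(math.asin(n_cladding / n_core))

def numerical_aperture(n_core, n_cladding):
    """Numerical aperture of a step-index fibre."""
    return math.sqrt(n_core ** 2 - n_cladding ** 2)

def acceptance_half_angle_deg(n_core, n_cladding, n_outside=1.0):
    """Largest launch angle (from the fibre axis) that will be guided,
    for light entering from a medium of index n_outside (air by default)."""
    return math.degrees(math.asin(numerical_aperture(n_core, n_cladding) / n_outside))

N_CORE, N_CLADDING = 1.48, 1.46  # hypothetical refractive indices

theta_c = critical_angle_deg(N_CORE, N_CLADDING)           # roughly 80.6 degrees
na = numerical_aperture(N_CORE, N_CLADDING)                # roughly 0.24
launch = acceptance_half_angle_deg(N_CORE, N_CLADDING)     # roughly 14 degrees
```

With these values only rays striking the boundary within about 9 degrees of grazing incidence are guided, and light must enter the end of the fibre within about 14 degrees of its axis, which is why a source radiating over a wide angle couples poorly.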

Single mode fibres with very fine cores achieve very wide bandwidths with very low losses, and thus are suitable for use over long distances. Attenuations of around 0.5 dB/km are not uncommon with such fibres, which have only recently become feasible due to the development of suitable sources and connectors.
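
Because losses are quoted in dB per km, the total loss of a link is simply the product of the loss figure and the length. The Python sketch below compares the three loss figures mentioned above against a hypothetical loss budget of 20 dB. That budget is an assumption for illustration, and connector losses and dispersion are ignored.

```python
def received_power_mw(launch_power_mw, loss_db_per_km, length_km):
    """Optical power remaining after a length of fibre with the given loss."""
    total_loss_db = loss_db_per_km * length_km
    return launch_power_mw * 10 ** (-total_loss_db / 10)

def max_length_km(loss_budget_db, loss_db_per_km):
    """Longest link that stays within the loss budget."""
    return loss_budget_db / loss_db_per_km

BUDGET_DB = 20.0  # hypothetical allowable loss between source and detector

plastic = max_length_km(BUDGET_DB, 100.0)      # 0.2 km
graded_index = max_length_km(BUDGET_DB, 1.0)   # 20 km
single_mode = max_length_km(BUDGET_DB, 0.5)    # 40 km
```

On these assumptions a high-loss plastic fibre is limited to a couple of hundred metres, ample for a domestic installation, whereas the low-loss types reach tens of kilometres.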

1.8.2 Light Sources and Connectors

LED (Light-Emitting Diode) light sources are made of gallium arsenide (GaAs) and can be doped to produce light with a wavelength between 800 and 1300 nm. The bandwidth of the radiated light is fairly wide, having a range of wavelengths of around 40 nm, which is another factor leading to greater losses over distance as the light of different wavelengths propagates over different modal paths within the fibre. The light from an LED source is incoherent (i.e. the phase and plane of the wavelets are random), and the angle over which it is radiated is quite wide. Since light radiated into the fibre at angles greater than a certain ‘acceptance angle’ will not be internally reflected it will effectively be lost in the cladding; thus the effectiveness of the coupling of light from an LED into the fibre is not good and only a few hundred microwatts of power can be transmitted.

An ILD (Injection Laser Diode) on the other hand produces coherent light of a similar wavelength to the LED, but over a narrower angle and with a narrower bandwidth (between around 1 and 3 nm in wavelength), thus providing better coupling of the light power to the fibre and resulting in less dispersion. Because of this, ILD drivers can be used for links which work in the gigahertz region whilst maintaining low losses.

Optical detectors are forms of photodiode which are very efficient at converting received light power into electrical current. Rise times of around 10 ns or less are achievable. The main problem with detectors is the distortion and noise they introduce, and different types of photodiode differ enormously in their S/N ratios. The so-called avalanche photodiode has a noise floor considerably lower than its counterpart, the PIN diode, and thus is preferable in critical applications. The integrated detector/preamplifier (IPD) provides amplification and detection of the light source in an integrated device, providing a higher output level than the other two and an improved S/N ratio.

The important features of connectors are their insertion loss and their return loss, these being respectively the amount of power lost due to the insertion of a connector into an otherwise unbroken link, and the amount of power reflected back in the direction of the source due to the presence of the connector. Typically insertion loss should be low (less than 1 dB), and return loss should be high (greater than 40 dB), for a reliable installation.
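
Both quantities are power ratios expressed in decibels, as the Python sketch below shows. The function names are hypothetical, and the example values are the limits suggested in the preceding paragraph.

```python
import math

def insertion_loss_db(power_in, power_out):
    """Power lost through the connector, in dB (low is good)."""
    return 10 * math.log10(power_in / power_out)

def return_loss_db(power_in, power_reflected):
    """Incident power relative to the power reflected back, in dB (high is good)."""
    return 10 * math.log10(power_in / power_reflected)

def through_fraction(insertion_loss):
    """Fraction of the power that passes a connector with the given insertion loss."""
    return 10 ** (-insertion_loss / 10)

passed_at_1_db = through_fraction(1.0)          # about 0.79 of the power gets through
reflection_40_db = return_loss_db(1.0, 1e-4)    # one ten-thousandth reflected: 40 dB
```

An insertion loss of 1 dB therefore corresponds to losing about a fifth of the optical power at each connector, and a return loss of 40 dB to only one ten-thousandth of the power being reflected towards the source.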

There are seven principal types of fibre optic connector, and it is not intended to cover each of them in detail here. For a comprehensive survey the reader is referred to Ajemian [8]. The so-called FDDI (Fibre Distributed Data Interface) MIC connector is gaining widespread use in optical networks since it is a duplex connector (allows separate communication in two directions) with a typical insertion loss of 0.6 dB, whilst the SC connector is a popular ‘snap-on’ device developed by the Japanese NTT Corporation, available in both simplex and duplex forms, offering low insertion loss of around 0.25 dB and small size. The ST series of connectors, developed by AT&T, is also used widely.

1.9  Timebase Recovery in Interfacing

Unlike generic data, audio-visual data is only meaningful when reproduced with an appropriate timebase, both in the long term and the short term. Whatever means may be used to deliver the data, there must also be a means to recover the timebase. In audio and video installations it is common practice to distribute master clocks from a central generator. When this is done, the interfaces essentially become data transmissions, because timebase correction will be used at any receiving device in order to align the signal timing with the master reference.

This approach is feasible within a building complex, but not over long distances. Where data networks are used, packets may be routed over different physical paths between the same points and the time taken will be subject to some variation, known as packet jitter. Under these conditions it is much more difficult to recreate the correct timebase in the receiving device. Where several signal formats are possible, the receiver may need metadata that help it generate the correct timing.

References

1.  Rumsey, F.J., Desktop Audio Technology, Focal Press, Oxford (2004)

2.  Watkinson, J.R., The Art of Digital Audio, third edition, Focal Press, Oxford (2001)

3.  Watkinson, J.R., The Art of Digital Video, third edition, Focal Press, Oxford (2000)

4.  Connor, F.R., Modulation, Edward Arnold, London (1982)

5.  EIA, Industrial electronics bulletin no. 12. EIA standard RS-422A. Electronic Industries Association, Engineering Dept., Washington DC

6.  Sinnema, W., Electronic Transmission Technology, Prentice-Hall, Englewood Cliffs, NJ (1979)

7.  Dunn, J., Jitter: specification and assessment in digital audio equipment. Presented at the 93rd AES Convention, San Francisco, 1–4 October. Audio Engineering Society (1992)

8.  Ajemian, R.G., Fiber-optic connector considerations for professional audio. Journal of the Audio Engineering Society, vol. 40, no. 6, June, pp. 524–531 (1992)
