15 Digital audio tape (DAT) format

Although digital audio processors have been used for many years with conventional video recorders to store high-quality audio information, it was inevitable that a dedicated tape mechanism would be required to do the job in a more compact way. Two main formats have been specified.

The first format, known as rotary-head, digital audio tape (R-DAT), is based on the same rotary-head principle as a video recorder, and so has the same limitations in portability. The second format, known as stationary-head, digital audio tape (S-DAT), was developed under the name DCC, mentioned in the short history chapter.

R-DAT

One important difference between a standard video recorder and R-DAT is that in a video recorder the recorded signal is continuous: two heads on the drum each make contact with the tape for 180° (the system is said to have a 180° wrap angle, as shown in Figure 15.2a) or for 221° (a 221° wrap angle, as in Figure 15.2b). In the R-DAT system the digital audio signal is time-compressed, so the heads need to make contact with the tape for only a smaller proportion of the time (actually 50%), and a smaller wrap angle may be used (90°, as shown in Figure 15.2c).

Figure 15.1 DAT mechanism.

Figure 15.2

This means only a short length of tape is in contact with the drum at any one time. Tape damage is consequently reduced, and only a low tape tension is necessary with resultant increase in head life.
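The relation between wrap angle and head contact time can be checked with a few lines of arithmetic. The sketch below assumes two heads mounted 180° apart on the drum, as described above; the function name is only illustrative.

```python
def contact_fraction(heads: int, wrap_angle_deg: float) -> float:
    """Fraction of one drum revolution during which the heads read or write.

    Each head touches the tape for wrap_angle_deg out of every 360 degrees,
    so the contact periods of all heads are simply added together.
    """
    return heads * wrap_angle_deg / 360.0


# Video recorder with a 180 degree wrap: a head is always on the tape.
print(contact_fraction(2, 180))  # 1.0
# R-DAT with a 90 degree wrap: the heads are on the tape half of the time.
print(contact_fraction(2, 90))   # 0.5
```

With a 221° wrap the result exceeds 1, which simply reflects the overlap between the two heads at the ends of each scan.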

The R-DAT standard specifies three sampling frequencies:

•	48 kHz; this frequency is mandatory and is used for recording and playback.
•	44.1 kHz; this frequency, which is the same as for CD, is used for playback of pre-recorded tapes only.
•	32 kHz; this frequency is optional and three modes are provided.

32 kHz has been selected as it corresponds with the broadcast standard.

Quantization:

•	A 16-bit linear quantization is the standard for all three sampling rates.
•	A 12-bit non-linear quantization is provided for special applications such as long-play mode at reduced drum speed, 1000 rpm (mode III), and 4-channel applications.

Figure 15.3 shows a simplified R-DAT track pattern.

The standard track width is 13.591 μm, the track length is 23.5 mm and the linear tape speed is 8.1 mm s^–1 (for comparison, the tape speed of the analog compact cassette is 47.6 mm s^–1). This results in a packing density of 114 Mbit s^–1 m^–2 (see Table 15.1).

The R-DAT format specifies a track width of only 13.6 μm, but the head width is about 1.5 times this value, around 20 μm. A procedure known as overwrite recording is used, where one head partially records over the track recorded by the previous head, illustrated in Figure 15.4. This means that as much tape as possible is used – rotating-head recorders without this overwrite record facility must leave a guardband between each track on the tape. Because of this, recorders using overwrite recording techniques are sometimes known as guard-bandless. To prevent crosstalk on playback (as each head is wide enough to pick up all of its own track and half of the next), the heads are set at azimuth angles of ±20°. This enables, as will be explained later, automatic track following (ATF).

Figure 15.3 Simplified R-DAT track pattern.

Table 15.1

Figure 15.4 Overwrite recording is used to ensure each track is as narrow as possible and no guard-band is required.

These overwrite record and head azimuth techniques are fairly standard approaches to rotating-head video recording, and are used specifically to increase the recording density.

Figure 15.5 shows the R-DAT track format on the tape, while Table 15.2 shows the track contents. Table 15.2 lists each part of a track and gives the recording angle, recording period and number of blocks allocated to each part. For those parts that do not hold digital data, the recorded frequencies are also listed.

As specified in the standard, a head drum of 30 mm diameter is used, rotating at a speed of 2000 rpm. At this size and speed, the drum has a resistance to external disturbances similar to that of a gyroscope. In future applications, however, smaller drums with appropriate speeds can be used.

Under these conditions, the 2.46 Mbit s^–1 signal to be recorded, which includes audio as well as many other types of data, is compressed by a factor of 3 and processed at 7.5 Mbit s^–1. This enables the signal to be recorded continuously.

In order to overcome the well-known low-frequency problems of coupling transformers in the record/playback head, an 8/10 modulation channel code converts the 8-bit signals to 10-bit signals.

Figure 15.5 R-DAT tape track format.

Table 15.2

This channel coding also gives the benefit of reducing the range of wavelengths to be recorded. The resultant maximum wavelength is only four times the minimum wavelength. This allows overwriting, eliminating the need for a separate erase head.

The track outline is given in Figure 15.5. Each helical track is divided into seven areas, separated by inter-block gaps. As can be seen, each track has one PCM area, 128 blocks of 288 bits long, containing the modulated digital information (audio data and error codes). Table 15.2 lists all parts of a track.

The PCM area is separated from the other areas by an inter-block gap (IBG), three blocks long. At both sides of the PCM area, two ATF areas are inserted, each five blocks long.

Again, an IBG is inserted at both ends of the track, separating the ATF areas from the sub-1 and sub-2 areas (subcode areas), each eight blocks long. These subareas contain all the information on time code, tape contents, etc.

Then, at both track ends, a margin area 11 blocks long is inserted to cover tolerances in the tape mechanism and head position.

A single track comprises 196 blocks of data, of which the major part is made up of 128 blocks of PCM data. Other important parts are the subcode blocks (sub-1 and sub-2, containing system data, similar to the CD subcode data), automatic track-finding (ATF) signals (to allow high-speed search) and the IBGs around the ATF signals (which means that the PCM and subcode information can be overwritten independently without interference to surrounding areas). Parts are recorded successively along the track.
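The block counts and drum figures quoted so far can be cross-checked against each other. The sketch below uses only numbers from the text, plus two assumptions that are not stated explicitly above: that the 196 blocks exactly fill the 90° head sweep, and that block lengths are counted as 288 bits before 8/10 modulation.

```python
TRACK_BLOCKS = 196      # blocks per track
BLOCK_BITS = 288        # bits per block (before 8/10 modulation, assumed)
DRUM_RPM = 2000         # drum speed
WRAP_ANGLE_DEG = 90     # head contact angle

# Parts of the track whose lengths are given in the text.
named_parts = {
    "PCM": 128,
    "sub-1": 8,
    "sub-2": 8,
    "ATF (two areas of 5)": 10,
    "margin (two areas of 11)": 22,
}
# Whatever is left is taken up by the IBGs and any other parts in Table 15.2.
remaining_blocks = TRACK_BLOCKS - sum(named_parts.values())

revolution_s = 60 / DRUM_RPM                        # one drum revolution
sweep_s = revolution_s * WRAP_ANGLE_DEG / 360       # time one head is on tape
bits_per_track = TRACK_BLOCKS * BLOCK_BITS
burst_rate_bps = bits_per_track / sweep_s           # rate while a head writes

print(remaining_blocks)               # 20
print(bits_per_track)                 # 56448
print(round(burst_rate_bps / 1e6, 2)) # 7.53
```

Under these assumptions the rate while a head is on the tape comes out at about 7.53 Mbit s^–1, in line with the 7.5 Mbit s^–1 processing rate quoted for the time-compressed signal.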

The PCM area format is shown in Table 15.3. PCM and subcode parts comprise similar data blocks, shown in Figure 15.6. Each block is 288 bits long.

Each block comprises eight synchronization bits, the identification word (W1, 8 bits), the block address word (W2, 8 bits), 8-bit parity word and 256-bit (32 × 8-bit symbol) data. The ID code W1 contains control signals related to the main data. Table 15.4 shows the bit assignment of the ID codes. W2 contains the block address. The most significant bit (MSB) of the W2 word defines whether the data block is of PCM or subcode form. Where the MSB is zero, the block consists of PCM audio data, and the remainder of word W2, i.e., 7 bits, gives the block address within the track. The 7 bits therefore identify the absolute block address (as 2⁷ is 128).

Table 15.3 PCM area format

Figure 15.6 PCM and subcode data blocks.

Table 15.4 Bit assignment of ID codes

On the other hand, when the MSB of word W2 is 1, the block is of subcode form and data bits in the word are as shown in Figure 15.7, where a further 3 bits are used to extend the W1 word sub-code identity code, and the four least significant bits give the block address.

Figure 15.7 Subcode data blocks.

The P-word, block parity, is used to check the validity of the W1 and W2 words and is calculated as follows:

P = W1 ⊕ W2

where ⊕ signifies modulo-2 addition, as explained in Appendix 1.
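As an illustration of how a decoder might use the W1, W2 and P words, the following minimal sketch assumes that P is the bitwise modulo-2 sum of W1 and W2, and that the three bits following the MSB of a subcode W2 word are the identity-code extension. The function names and the dictionary layout are hypothetical.

```python
def block_parity(w1: int, w2: int) -> int:
    """Parity word, assumed to be the bitwise modulo-2 sum (XOR) of W1 and W2."""
    return (w1 ^ w2) & 0xFF


def header_is_valid(w1: int, w2: int, p: int) -> bool:
    """True when the received parity word matches W1 and W2."""
    return block_parity(w1, w2) == p


def decode_w2(w2: int) -> dict:
    """Split the block address word W2 according to its most significant bit."""
    if w2 & 0x80 == 0:
        # MSB = 0: PCM block, remaining 7 bits are the block address (0..127).
        return {"kind": "pcm", "block_address": w2 & 0x7F}
    # MSB = 1: subcode block, 3 bits of ID extension (assumed position),
    # then the 4 least significant bits as block address.
    return {
        "kind": "subcode",
        "id_extension": (w2 >> 4) & 0x07,
        "block_address": w2 & 0x0F,
    }


print(decode_w2(0x05))                     # PCM block, address 5
print(decode_w2(0xA3))                     # subcode block, extension 2, address 3
print(header_is_valid(0x12, 0x34, 0x26))   # True
```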

Automatic track following

In the R-DAT system, no control track is provided. In order to obtain correct tracking during playback, a unique ATF signal is recorded along with the digital data.

The ATF track pattern is illustrated in Figure 15.8. One data frame is completed in two tracks and one ATF pattern completed in two frames (four tracks). Each frame has an A and a B track. A tracks are recorded by the head with +20° azimuth and B tracks are recorded by the head with –20° azimuth.

The ATF signal pattern is repeated over subsequent groups of four tracks. The frequencies of the ATF signals are listed in Figure 15.8. The key to the operation lies in the fact that different frames hold different combinations and lengths. Furthermore, the ATF operation is based upon the use of the crosstalk signals, picked up by the wide head, which is 1.5 times the track width, and the azimuth recording. This method is called the area divided ATF.

As shown in Figure 15.8, the ATF uses a pilot signal f₁; sync signal 1, f₂; sync signal 2, f₃; and erase signal f₄. When the head passes along the track in the direction of the arrow (V-head) and detects an f₂ or f₃ signal, the six adjacent pilot signals f₁ on both sides are immediately compared, which results in a correction of the tracking when necessary.

Figure 15.8 ATF signal frequencies.

The f₂ and f₃ signals thus act as sync signals to start the ATF servo operation.

The pilot signal f₁ is a low-frequency signal (130.67 kHz). A low frequency is used because low-frequency signals are not affected by the azimuth setting, so crosstalk can be picked up and detected from both sides. The pilot signals f₁ are positioned so that they do not overlap as the head scans across three successive tracks.

Error correction

As with any digital recording format, the error-detection and error-correction scheme is very important. It must detect and correct errors in the digital audio data, as well as in subcodes, ID codes and other auxiliary data.

Types of errors that must be corrected are burst errors – dropouts caused by dust, scratches and head clogging – and random errors – caused by crosstalk from an adjacent track, traces of an imperfectly erased or overwritten signal, or mechanical instability.

Error-correction strategy

In common with other digital audio systems, R-DAT uses a significant amount of error-correction coding to allow error-free replay of recorded information. The error-correction code used is a double-encoded Reed–Solomon code.

These two Reed–Solomon codes produce C1 (32, 28) and C2 (32, 26) parity symbols, which are calculated on GF(2⁸) by the polynomial:

g(x) = x⁸ + x⁴ + x³ + x² + 1
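To make the role of this polynomial concrete, the sketch below multiplies two 8-bit symbols in GF(2⁸), assuming g(x) is used as the reduction polynomial of the field (written as the bit pattern 0x11D). It shows only the field arithmetic on which the Reed–Solomon parity calculation rests, not the C1/C2 encoders themselves.

```python
FIELD_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 as a bit pattern


def gf_mul(a: int, b: int) -> int:
    """Multiply two symbols in GF(2^8), reducing modulo FIELD_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1                  # multiply a by x
        if a & 0x100:
            a ^= FIELD_POLY      # reduce when the degree reaches 8
        b >>= 1
    return result


# x * x^7 = x^8, which reduces to x^4 + x^3 + x^2 + 1 = 0x1D
print(hex(gf_mul(0x02, 0x80)))  # 0x1d
```

Repeated multiplication by 0x02 runs through all 255 non-zero symbols before returning to 1, which is what allows every non-zero symbol to be written as a power of one element.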

C1 is interleaved on two neighbouring blocks, while C2 is interleaved on one entire track of PCM data every four blocks (see Figure 15.9 for the interleaving format).

Figure 15.9 ECC interleaving format.

In order to perform C1 ↔ C2 decoding/encoding, one track worth of data must be stored in memory.

One track contains 128 blocks consisting of 4096 (32 × 128) symbols. Of these, 1184 symbols (512 symbols C1 parity and 672 symbols C2 parity) are used for error correction, leaving 2912 data symbols (28 × 104).

In fact, C1 encoding adds four symbols of parity to the 28 data symbols C1 (32, 28), while C2 encoding adds six symbols of parity to every 26 PCM data symbols C2 (32, 26).
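The symbol counts above follow from the two code sizes. The sketch below reproduces them; the way the number of C2 codewords is derived (one codeword per non-parity symbol position for each of the four interleaved block groups) is an interpretation of the interleaving description, not something stated explicitly in the text.

```python
BLOCKS_PER_TRACK = 128
SYMBOLS_PER_BLOCK = 32
C1_N, C1_K = 32, 28   # C1 (32, 28): 4 parity symbols per block
C2_N, C2_K = 32, 26   # C2 (32, 26): 6 parity symbols per codeword

total_symbols = BLOCKS_PER_TRACK * SYMBOLS_PER_BLOCK

# Every block carries its own C1 parity.
c1_parity = BLOCKS_PER_TRACK * (C1_N - C1_K)

# Assumed: C2 codewords take one symbol from every fourth block, so there
# are 128 / 32 = 4 codewords for each of the 28 non-C1-parity positions.
c2_codewords = C1_K * (BLOCKS_PER_TRACK // C2_N)
c2_parity = c2_codewords * (C2_N - C2_K)

data_symbols = total_symbols - c1_parity - c2_parity

print(total_symbols, c1_parity, c2_parity, data_symbols)  # 4096 512 672 2912
print(data_symbols == 28 * 104)                            # True
```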

The main data allocation is shown in Figure 15.10.

This double-Reed–Solomon code gives the format a powerful correction capability for random errors.

PCM data interleave

In order to cope with burst errors (head clogging, tape dropouts, etc.), PCM data are interleaved over two tracks, together called one frame. This effectively turns burst errors into random errors, which are correctable using the Reed–Solomon technique already described.

Figure 15.10 Data allocation.

To interleave the PCM data, the contents of two tracks have first to be processed in a memory. The memory size required for one PCM interleave block is (128 × 32) symbols × 8 bits × 2 tracks = 65 536 bits (64 kbit), which means a 128-kbit memory is required.

The symbols are interleaved, based on the following method, according to the respective number of the audio data symbol. The interleaving format depends on whether a 16-bit or 12-bit quantization is used. The interleave format discussed here is for 16-bit quantization, the most important format.

One 16-bit audio data word indicated as A_i or B_i is converted to two audio data symbols each consisting of 8 bits. The audio data symbol converted from the upper 8 bits of A_i or B_i is expressed as A_iu or B_iu. The audio data symbol converted from the lower 8 bits of A_i or B_i is expressed as A_il or B_il. Note: A stands for the left channel, B for the right channel.

If the audio data symbol is equal to A_iu or A_il, let a = 0.

If the audio data symbol is equal to B_iu or B_il, let a = 1.

If the audio data symbol is equal to A_iu or B_iu, let u = 0.

If the audio data symbol is equal to A_il or B_il, let u = 1.
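The conversion from words to symbols, together with the a and u flags, can be sketched as follows. The sketch takes u = 1 for both lower symbols (A_il and B_il) and covers only the word-to-symbol step; the placement of each symbol within the two tracks is defined by the interleave format of Figure 15.11 and is not reproduced here. The function name and the returned layout are hypothetical.

```python
def split_word(word: int, channel: str) -> list:
    """Split one 16-bit audio word into its upper and lower 8-bit symbols.

    channel is "A" (left) or "B" (right). Each returned entry carries the
    flags used by the interleave rule: a = 0 for A, 1 for B; u = 0 for the
    upper symbol, 1 for the lower symbol.
    """
    if channel not in ("A", "B"):
        raise ValueError("channel must be 'A' or 'B'")
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    a = 0 if channel == "A" else 1
    upper = (word >> 8) & 0xFF
    lower = word & 0xFF
    return [
        {"symbol": upper, "a": a, "u": 0},
        {"symbol": lower, "a": a, "u": 1},
    ]


# Right-channel word 0x1234 gives B_iu = 0x12 and B_il = 0x34.
for entry in split_word(0x1234, "B"):
    print(entry)
```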

Figure 15.11 PCM data interleave format.

Tables 15.5a and b represent an example of the data assignment for the two tracks (+ azimuth and – azimuth, respectively), for 16-bit sampled data words.

Subcode

The data subcode capacity is about four times that of a CD and various applications will be available in the future. A subcode format which is essentially the same as the CD subcode format is currently specified for pre-recorded tapes.

The most important control bits, such as the sampling frequency bit and copy inhibit bit, are recorded in the PCM-ID area, so it is impossible to change these bits without rewriting the PCM data. As the PCM data are protected by the main error-correction process, subcodes requiring a high reliability are usefully stored here.

Data to allow fast accessing, programme number, time code, etc. are recorded in subcode areas (sub-1 and sub-2) which are located at both ends of the helical tracks. These subcode areas are identical. Figure 15.12 illustrates the sub-1 and sub-2 areas, along with the PCM area containing subcode information.

An example of the subcode area format is shown in Figure 15.13. Data are recorded in a pack format.

Figure 15.14 shows the pack format, and the pack item codes are listed in Table 15.6. All the CD-Q channel subcodes are available to be used.

Each pack block comprises an item code of 4 bits, indicating what information is stored in the pack data area. The item code 0100 indicates that the related pack data is a table of contents (TOC) pack. This TOC is recorded repeatedly throughout the tape, in order to allow high-speed access and search (at 200 times normal speed). Every subcode data block is protected by an 8-bit C1 parity word, which allows the validity of the data to be checked.

Figure 15.12

Figure 15.13 Subcode area format.

Figure 15.14 Pack format.

Subcode data in the subarea can be rewritten or modified independently from the PCM data.

Figure 15.15 shows an example of subcode information for pre-recorded tape. The figure shows the use of different codes and pack data on a tape, such as programme time, absolute time, programme number, etc.

Table 15.5a

Table 15.5b

Figure 15.15

Table 15.6

Item	Mode
0000	No information
0001	Programme time
0010	Absolute time
0011	Running time
0100	TOC
0101	Calendar
0110	Catalogue
0111	ISRC
1000–1110	Reserved
1111	For tape maker
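The item codes of Table 15.6 map naturally onto a lookup table. The following minimal sketch shows how a decoder might translate a 4-bit item code into its mode; the function and variable names are hypothetical, and extracting the item code from the pack itself is left out because the bit layout is given only in Figure 15.14.

```python
PACK_ITEM_MODES = {
    0b0000: "No information",
    0b0001: "Programme time",
    0b0010: "Absolute time",
    0b0011: "Running time",
    0b0100: "TOC",
    0b0101: "Calendar",
    0b0110: "Catalogue",
    0b0111: "ISRC",
    0b1111: "For tape maker",
}


def pack_item_mode(item_code: int) -> str:
    """Return the mode for a 4-bit pack item code, following Table 15.6."""
    if not 0 <= item_code <= 0b1111:
        raise ValueError("item code must be a 4-bit value")
    if 0b1000 <= item_code <= 0b1110:
        return "Reserved"
    return PACK_ITEM_MODES[item_code]


print(pack_item_mode(0b0100))  # TOC
print(pack_item_mode(0b1010))  # Reserved
```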

Tape duplication

High-speed duplication of R-DAT tapes can be done by using the magnetic contact printing technique. In this method a master tape of the mirror type is produced on a master tape recorder (Figure 15.16, 1).

The magnetic surfaces of the master tape and the copy tape are mounted in contact with each other on a printing machine, as shown in Figure 15.16 (2).

The magnetizing process is performed by controlling the pressure of both tapes between the pinch drum and the bias head while a magnetic bias is applied to the contact area. Special tape and a special bias head are required (see Figures 15.17 and 15.18).