CHAPTER 8
DVD

The DVD format is a successor to the CD, and also creates new market opportunities for digital optical disc technology. DVD improves on the CD in all respects and in particular provides larger storage capacity and faster reading and writing. The technological improvement provided by the DVD specification allows filmmakers, musicians, and programmers to expand their creative horizons beyond old data carriers, and lets users enjoy the result. For everyone, DVD provides greatly upgraded capacity and flexibility of data storage. In particular, compared to previous consumer video formats, the DVD provides a significant improvement in the video and sound quality of the playback of motion pictures.

Development and Overview

Beginning with the CD-Audio format, and the subsequent development of CD-ROM, CD-R, and CD-RW, the CD revolutionized data storage. However, the CD’s limited capacity and low throughput bit rate made it unsuitable for high-bandwidth or large-volume applications such as high-quality digital video. Thus, the motion picture industry sought to develop a small optical disc holding a feature film coded with high-quality digital video. In December 1994, Sony and Philips proposed the MultiMedia Compact Disc (MMCD). In January 1995, Toshiba and Time Warner proposed the Super Density disc (SD). The DVD Technical Working Group subsequently outlined these criteria for the new video disc: a single standard that was fully interchangeable for TV and PC applications; full backward compatibility with existing CD media; forward compatibility with write-once and rewritable media; low-cost and high-quality disc replication; a single universal file structure; support for both linear and nonlinear applications; and high capacity that was expandable for high-definition media.

The SD and MMCD formats were similar but incompatible, and computer industry representatives urged the two sides to produce a unified standard. Subsequently, the SD and MMCD formats were merged. Moreover, the scope of the video-centric format was expanded to include digital audio and both playback-only and recordable media for general computer applications. The DVD family of formats was thus developed by a consortium of manufacturers known as the DVD Forum. Several working groups were charged with the development of different formats and aspects within the family. The preliminary DVD format was announced in December 1995. The DVD family includes different formats for video, audio, and computer applications. Because the scope of the applications far exceeded digital video, the original name of Digital Video Disc was changed to Digital Versatile Disc, but that name was never fully accepted. Instead, the format is simply called DVD.

Whatever the jargon, DVD supersedes the CD in the music and computer software markets and supersedes the LaserDisc and VHS tape in the video market. The DVD-Video and DVD-ROM formats were the first of the family to be introduced, early in 1997. Approximately 5 million DVD-ROM drives and 1.2 million DVD-Video players were sold in 1998. At the end of 1998, about 2000 DVD-Video titles were available. By mid-2004, over 34,000 DVD titles were in release, and since launch, more than 103 million DVD players had been sold and over 3 billion discs had been shipped. These early growth rates surpassed those of any other existing entertainment medium. DVD is also credited as a key factor behind the migration to digital television.

The DVD family portrait is shown in Fig. 8.1. There are six DVD books: Book A is DVD-ROM (Read-Only Memory); Book B is DVD-Video; Book C is DVD-Audio; Book D is DVD-R (Recordable); Book E is DVD-RAM (Random-Access Memory); and Book F is DVD-RW (ReWritable). The specification is further divided into several parts. Part 1 defines the physical specifications, Part 2 defines the file system specifications, and subsequent parts define specific applications and extensions. For example, Part 3 defines the Video application, Part 4 defines the Audio application, and Part 5 defines the Video Audio Navigation (VAN) extension. DVD-ROM, DVD-Video, and DVD-Audio discs (all read-only in nature) share the same disc specifications and physical format. Likewise, they all share the same file system. The DVD-R, DVD-RAM, and DVD-RW formats differ more from one another. The DVD specification borrows from other existing specifications. For example, the DVD file system uses elements of the UDF, ISO 9660, and ISO 13346 specifications. DVD-Video generally uses MPEG video coding and Dolby Digital (AC-3) or DTS audio coding, and DVD-Audio generally uses PCM and MLP (Meridian Lossless Packing) coding, as well as Dolby Digital.

FIGURE 8.1 The DVD family of specifications includes six books for read-only and recordable discs. Some physical and file system attributes are shared, but specific application details are distinct to each specification book.

Although based on CD technology, DVD employs new physical specifications and a new file format. Philosophically, DVD differs considerably from the CD. Whereas the CD was originally designed exclusively as an audio storage format, and incrementally adapted to other applications, DVD was wholly designed as a universal storage platform. The CD is also a simple format, designed to work with or without microprocessors in the player. In contrast, DVD is based on sophisticated microprocessor control to read its file structure and interact with the disc and its contents. Perhaps most importantly, Red Book CD-Audio was designed to play back a continuous stream of data; thus, addressing was not needed. Yellow Book CD-ROM only subsequently added addressing capability. In contrast, DVD is founded on the premise that all data will be addressable and randomly accessible. In short, all DVD contents are essentially viewed as software data. In that respect, DVD is more akin to CD-ROM than CD-Audio.

In all respects, DVD surpasses the CD format. Perhaps most strikingly, although its outer physical dimensions are identical, one DVD data layer provides seven times the storage capacity of CD. This increase is due to the shorter wavelength laser, higher numerical aperture, smaller track pitch, and other aspects, as illustrated in Fig. 8.2. In addition, overall disc and player tolerances are more stringent compared to those stipulated for the CD; this takes into account the improvement in manufacturing precision gained over the intervening 15 years. In all, the recording density of CD is 1 bit/μm² whereas the DVD recording density is 6 to 7 bits/μm². Even so, time marches on. The specifications of the DVD format have been eclipsed by the Blu-ray format.

FIGURE 8.2 A DVD disc layer holds seven times the data capacity of CD. This is accomplished through improvements in optics design, improved drive design and precision, and more sophisticated decoding electronics. (Schouhamer Immink, 1996)

Disc Design

Part 1 of the DVD specification defines the physical specifications of DVD discs. The specifications for the DVD-ROM, DVD-Video, and DVD-Audio discs are identical; thus, Part 1 applies to all three formats. These read-only formats thus share disc construction, modulation code, error correction, and so on. The physical parameters call for 120-mm- and 80-mm-diameter discs, and single and dual layers per substrate. As with CD, DVD discs use a pit/land structure to store data. The DVD track pitch is 0.74 μm. The constant linear velocity (CLV) track velocity is 3.49 m/s on a single layer and 3.84 m/s on a dual layer. The pits and land that store binary data are as short as 0.4 μm. Minimum/maximum pit length is 0.40/1.87 μm (single layer) and 0.44/2.05 μm (dual layer).
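
The minimum pit length and the scanning velocity together fix the channel clock. The following is a rough sketch, not taken from the specification: it takes the pit lengths in micrometers and assumes that the shortest pit spans three channel bits (a minimum run length of two zeros between transitions, as described under EFMPlus modulation later in this chapter). The function name and the `bits_in_min_pit` parameter are illustrative.

```python
def channel_bit_period_ns(min_pit_um: float, velocity_m_s: float,
                          bits_in_min_pit: int = 3) -> float:
    """Estimate the channel bit period in nanoseconds.

    Assumes the shortest pit occupies `bits_in_min_pit` channel bits.
    """
    bit_length_m = (min_pit_um * 1e-6) / bits_in_min_pit  # length of one channel bit
    return bit_length_m / velocity_m_s * 1e9              # time to scan one channel bit


single = channel_bit_period_ns(0.40, 3.49)  # single-layer figures from the text
dual = channel_bit_period_ns(0.44, 3.84)    # dual-layer figures from the text
print(f"single layer: {single:.1f} ns, dual layer: {dual:.1f} ns")
```

Under these assumptions both cases come out near 38 ns, a channel clock of roughly 26 MHz; the longer pits on a dual-layer disc are offset by its faster scanning velocity.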

These small dimensions are possible because the laser beam used to read DVD discs uses a visible red wavelength of 635 nm or 650 nm (both wavelengths are supported) compared to 780 nm in a CD. The standard specifies a lens with a numerical aperture (NA) of 0.6, compared to a CD’s NA of 0.45. Together, the shorter laser wavelength and higher NA allow smaller pit dimensions which in turn allow a data density increase of 467% over that of CD. A DVD data layer holds about four times as many pits as a CD layer as shown in Fig. 8.3. If unwound, a DVD track would run for almost 7.5 miles. Combined with other coding efficiencies, a DVD layer can store 4.7 Gbytes of data and multiple data layers provide greater capacity. It is worth noting that the quoted DVD capacity of 4.7 Gbytes is more precisely 4.7 billion bytes (measured in multiples of 1000); in computer terms (measured in multiples of 1024) the capacity is 4.38 Gbytes.

Disc Optical Specification

A DVD disc appears similar to a CD and is the same diameter (120 mm) and thickness (1.2 mm). Whereas a CD uses a single polycarbonate substrate, a DVD disc employs two 0.6-mm substrates bonded together with the data layers placed near the internal interface. All data is held in a spiral data track that is read with counterclockwise rotation. The layer closest to the readout surface is Layer 0, and the layer further from the readout surface is Layer 1. Layer 0 is read first. A reading laser starts at the innermost radius of Layer 0 and reads to the disc outer edge, where it reaches a middle area, stops and refocuses, and reads Layer 1 from the outer edge to the inner radius; this is called opposite track path (OTP) organization. Alternatively, Layer 1 can be independently written from an inner to outer radius; this is called parallel track path (PTP). In PTP, each layer requires its own lead-in and lead-out areas.

FIGURE 8.3 A comparison of CD and DVD data surfaces shows that a DVD holds about four times as many pits as a CD. With other improved efficiencies, overall capacity of a DVD is increased seven times.

FIGURE 8.4 The thin (0.6-mm) DVD substrate is less sensitive to tracking and detection errors due to disc tilt. A. Thick CD substrate allows greater deviation. B. Thin DVD substrate has less deviation.

Whereas a CD data layer is near the disc’s top surface and thus somewhat vulnerable to damage, DVD data layers are embedded deeply within the disc and thus are more protected. Thinner substrates are advantageous because they are inherently more resistant to tracking errors that result when a disc is slightly tilted relative to the laser pickup, as shown in Fig. 8.4. However, because the thin substrate places the data layer closer to the outer disc surface, surface contamination is not placed as far out of focus as with a CD; this is compensated for with more powerful error correction. The DVD finished disc specification for radial tilt is ±0.8°, and for refractive index is 1.55 ± 0.10. A summary of specifications for CD and DVD is given in Table 8.1.

The dual substrate construction allows manufacturing variants, yielding five different capacities of playback-only discs. They are known as DVD-5, DVD-9, DVD-10, DVD-14, and DVD-18. As the nomenclature loosely suggests, the disc capacities are: 4.7, 8.54, 9.4, 13.24, and 17.08 billions of bytes, respectively. Expressed in Gbytes of 2³⁰ bytes, they hold 4.38, 7.95, 8.75, 12.33, and 15.91 Gbytes. This is roughly a 7.4% difference (1 billion is 1,000,000,000 or 10⁹, and 1 Gbyte is 1,073,741,824 or 2³⁰). When the average data output bit rate is 4.8 Mbps, the approximate playing times are: 133, 241, 266, 375, and 482 minutes, respectively. Of course, the bit rate can vary widely, and thus playing times vary too. For example, a DVD-5 disc can hold from 1 to 9 hours of video. Generally, a DVD-Video disc that holds a 2-hour movie and modest bonus features is a DVD-9 disc.
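
The conversion between the two capacity conventions can be checked with a few lines of Python. This is only an illustrative sketch; the dictionary and function names are invented for the example, and the capacities are the figures quoted above.

```python
GBYTE = 2**30  # bytes in one binary Gbyte (1,073,741,824)

# Capacities in billions (10**9) of bytes, as quoted in the text.
CAPACITY_BILLION_BYTES = {
    "DVD-5": 4.7,
    "DVD-9": 8.54,
    "DVD-10": 9.4,
    "DVD-14": 13.24,
    "DVD-18": 17.08,
}


def to_binary_gbytes(billions_of_bytes: float) -> float:
    """Convert a capacity in billions of bytes to Gbytes of 2**30 bytes."""
    return billions_of_bytes * 1e9 / GBYTE


for name, billions in CAPACITY_BILLION_BYTES.items():
    print(f"{name}: {billions} billion bytes = {to_binary_gbytes(billions):.2f} Gbytes")

# Relative size of a binary Gbyte versus one billion bytes, in percent.
difference_percent = (GBYTE / 1e9 - 1) * 100
print(f"difference: {difference_percent:.1f}%")
```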

A single-layer, single-sided DVD-5 disc uses one substrate with a data surface and one blank substrate. The substrates are bonded together with either a hot-melt glue or an ultraviolet-cured photopolymer (2P); the former is generally preferred. Two substrates with data surfaces can be bonded together to form a single-layer, double-sided DVD-10 disc; the disc is turned over to access the opposite layer. The DVD standard also allows data to be placed on two layers in a substrate, one embedded beneath the other to create a dual-layer disc that is read from one side; this comprises a DVD-9 disc. Two dual-layer substrates comprise a double-sided DVD-18 disc. The layers are separated by a clear resin and a very thin semi-reflective (from 25 to 40%) layer of gold or silicon; this layer is sometimes referred to as a semi-transparent layer. Gold is sputtered with conventional metallization techniques. Silicon is similarly sputtered, using argon gas. Clearly, gold is more expensive than silicon. The environmental stability and playability performance of the silicon semi-reflective layer meets or exceeds that of gold-based discs. When a layer is read through a bonding layer, the bonding material must be transparent; a 2P bonding agent can be used.

TABLE 8.1 Comparison of CD and DVD disc and player specifications.

When using a semi-reflective and fully reflective layer, both layers can be read from one disc side by simply focusing the reading laser on either layer. The beam either reflects from the lower semi-reflective layer or passes through it and reflects from the top fully reflective layer. The laser light can be switched to either data layer in a few milliseconds (they are about 40 μm to 70 μm apart) by simply moving the objective lens; a buffer memory makes the transition indiscernible. Because reflectivity of the embedded layer is reduced, as is the signal-to-noise ratio (because of out-of-focus imaging from the inner layer), for reliable playback the embedded layer is formed with a faster linear velocity and thus holds somewhat less data than the top data layer. The interior data surface uses a faster scanning velocity of 3.84 m/s versus 3.49 m/s, thus the pit length is longer; for example, the minimum pit length is 0.44 μm versus 0.4 μm.

Conventional DVD discs cannot be played in CD players. However, the DVD Forum devised specifications for a hybrid CD/DVD-Audio disc that plays in CD and DVD-Audio capable players. The DualDisc format comprises a double-sided disc that allows DVD-Audio data (multichannel audio or video) to be read from one side, and CD data (stereo music) from the other. For example, a movie could be accompanied by its (CD) soundtrack, or DVD bonus features could accompany a CD album. DualDiscs are slightly thicker (perhaps 1.4 mm to 1.5 mm) than the 1.2-mm nominal thickness of CD and DVD discs, but are playable in most DVD-Audio and CD players.

Disc Manufacturing and Playback

DVD manufacturing is very similar to CD manufacturing, as described in Chap. 7, but tighter tolerances and some new manufacturing steps are needed. Following authoring, disc content is typically imaged on a hard-disk drive, and then transferred to another delivery medium. In many cases, the content image is delivered to the disc mastering facility on Digital Linear Tape (DLT) using ANSI format. A DLT Type III tape cartridge can hold up to 10 Gbytes of uncompressed data; DLT Type IV tapes are also widely used and can hold up to 80 Gbytes of uncompressed data. With a transfer rate of 1.25 Mbytes/s, a 135-minute program can be transferred in about an hour. A separate DLT is used for each physical disc layer, and copy protection such as Content Scrambling System (CSS) and Content Protection for Prerecorded Media (CPPM) can be enabled. Alternatively, other authoring media such as DVD-R or DVD+R (single or dual layer) using the Cutting Master Format, or Exabyte tape, are sometimes used for simpler projects. In some cases, electronic file delivery via private secure networks is used to move files from authoring studios to replication plants. Content that will be copy protected cannot be submitted on CD-R.

Disc Description Protocol (DDP) files may accompany the master; this data provides the laser beam recorder (LBR) with disc identification information. Other data files (such as DVDID/DDPID and CONTROL.DAT) also accompany the content image file, supplying disc type, copy protection, and other information. DVD-Video and DVD-Audio authoring systems often use the Joliet extensions (supporting longer file names) in the ISO 9660 file format (within UDF Bridge); this assists compatibility in legacy operating systems that are unable to read UDF Bridge. Authoring systems must also read Macintosh file formats to allow conversion to the UDF Bridge format.

In DVD mastering, shorter wavelengths must be used in the laser beam recorder to create the smaller formations; blue, ultraviolet, or violet krypton lasers may be used. In some cases, solid-state lasers (instead of gas lasers) with frequency-doubling crystals are used. Either photoresist mastering or direct dye-polymer mastering may be used. All DVD discs use two substrates. Even in single-sided discs, two substrates, one holding data and the other a dummy substrate (with a cosmetic metal layer) must be manufactured. The thin DVD substrates require greater care in the molding process. For example, it is more difficult to uniformly flow molten polycarbonate into a thinner mold with minimal stress. The finer pit structure and the geometry of the pits (shorter and narrower than CD pits resulting in a steeper height-to-width ratio) may require injection molding machines with higher tonnage. Finally, it is more difficult to separate the disc from the stamper mold without strain.

In the case of double-sided discs, two substrates are independently formed, and then bonded together using a hot-melt adhesive or UV-curable bonding agent (the latter is preferred). As noted, dual-layer discs can be manufactured by independently molding two 0.6-mm polycarbonate substrates with one layer receiving full metallization and the other receiving semi-reflective metallization. The two substrates are then bonded together with a layer of UV-cured optically clear photopolymer as shown in Fig. 8.5. The reading laser can focus through the semi-reflective layer (and the bonding material with a thickness of about 55 μm) to read the upper substrate; this design can be used to manufacture single-sided discs (such as some DVD-9 discs).

FIGURE 8.5 Single-sided, dual-layer DVD-9 discs can be manufactured with data layers on two substrates, one with a semi-reflective surface and another with a fully reflective surface. The replication steps are shown. A1. Replicate first-layer substrate. A2. Replicate second-layer substrate. B1. Deposit semi-reflective layer. B2. Deposit fully reflective layer. C. Put UV-hardened bonding resin on substrate. D. Bond substrates. E. Harden with UV light.

FIGURE 8.6 Dual-layer substrates can be manufactured by pressing a second data layer into an intermediate resin layer. This technique can be used to produce substrates for single-sided, dual-layer (DVD-9) discs and double-sided, dual-layer (DVD-18) discs. The replication steps are shown. A. Replicate first layer substrate. B. Deposit semi-reflective layer. C. Put UV-hardened resin on substrate. D. Replicate pits of second layer on resin by stamper. E. Deposit fully reflective layer. F. Apply UV-hardened resin to form protective layer.

Alternatively, a molded single-layer substrate can be coated with a semi-reflective layer followed by a layer of liquid photopolymer that is molded by a second stamper and hardened by exposure to ultraviolet light; after the layer is hardened, the stamper is removed and a fully reflective metal layer is applied and the substrate is ready for bonding to a second substrate, as shown in Fig. 8.6. This technique is used for some DVD-9 and DVD-18 discs. In another approach, the first information layer is molded on an interim 0.6-mm polymethyl methacrylate (PMMA) substrate, and it is sputtered with aluminum. The PMMA substrate (which shares no strong molecular bonds with aluminum) is peeled away and recycled, leaving the resultant aluminum information layer that is transferred onto a basic gold metal/polycarbonate DVD substrate with an adhesive. The six disc types are shown in Fig. 8.7.

Optical Playback

Although backward compatibility with CD is not required (but is recommended) by the DVD specification, in practice all DVD players can read CD discs with molded plastic pits. To accomplish this, the laser must be able to focus on CD data layers at about 1.2 mm from the readout surface, as well as DVD layers at about 0.6 mm from the surface. Some pickups use a single shared objective lens for the CD and DVD wavelengths. The pickup has both 780-nm and 635/650-nm laser diodes, and a largely shared light path. In some designs, as shown in Fig. 8.8, holographic pickups use an aspherical lens; annular rings are cut into the center of the lens so that light passing through it is diffracted, yielding a longer focal length. Light passing through the outer smooth part of the lens has a shorter focal length. The optical spots appear to the photodiode array as concentric circles; light reflecting from 1.2 mm forms an inner circle and light reflecting from 0.6 mm forms an outer circle. Alternatively, a dual pickup can be designed with independent sections for each wavelength source including two separate objective lenses—one for CD and another for DVD.

FIGURE 8.7 The DVD specification allows multiple disc types. A. Single-sided, single-layer DVD-5. B. Single-sided, dual-layer DVD-9. C. Single-sided, dual-layer (alternate version) DVD-9. D. Double-sided, single-layer DVD-10. E. Double-sided, single-/dual-layer DVD-14. F. Double-sided, dual-layer DVD-18.

FIGURE 8.8 A DVD pickup can be designed to focus on either CD or DVD data layers. Different focal lengths can be achieved with a variety of techniques including a holographic lens.

Playback of CD-R discs is problematic; the optical response of the organic dye recording layer is extremely wavelength-dependent, with high absorption below a narrow range of around 780 nm. For example, a CD-R disc may have 65% reflectivity at 780 nm and only 10% reflectivity at 650 nm. As a result, contrast is low and the disc is difficult to read at 635 nm or 650 nm. CD-R-compatible DVD pickups are designed with two discrete optical paths at two wavelengths, or may employ one objective lens with two lasers. In one design, the numerical aperture is adjusted by coating the lens’ outer circumference with a material that is opaque at 780 nm but transparent at 635 nm and 650 nm. When a 780-nm laser is used, the coating restricts the NA to 0.45, but when a 635-nm or 650-nm laser is used, since the coating is transparent, the NA is increased to 0.6 for reading DVD discs. Alternatively, for example, a dual-laser pickup could mount two objective lenses on a rotating head that places the appropriate lens in the optical path.

Many DVD drives use a differential phase-detection (DPD) method for autotracking. The pickup monitors asymmetry in the intensity pattern of the diffraction from the edges of pits. In particular, as an off-center spot moves from the leading edge to the trailing edge of a pit, the intensity of the pattern rotates. The pattern illuminates a four-quadrant photodiode, as shown in Fig. 8.9. In this example, at the leading edge of the pit, diode pairs B + D receive less light than pairs A + C; when the spot is in the middle of the pit, the diode pairs receive equal intensity; at the trailing edge, B + D receive more light than A + C. This is used to generate a tracking error signal.

FIGURE 8.9 In the DPD autotracking method, an off-center beam creates a rotation in the intensity of the reflected beam as a pit is scanned. This is sensed by a four-quadrant photodiode. In this figure, darkened areas in the photodiode represent lower light intensity. (Carriere et al., 2000)

Data Coding

The disc lead-in area is the innermost area of the Information Area. It consists of the Initial zone, Reference code zone, Buffer zone 1, Control data zone, and Buffer zone 2. A Control data block comprises 16 sectors; information includes disc size, minimum readout rate, single/dual layer, track path, disc manufacturing information, and copyright. A Burst Cutting Area (BCA) is located inside the lead-in area, between 44.6 mm and 47 mm from center. BCA can create a unique serial number or ID code comprising a series of low-reflectance stripes, similar to a bar code, extending along the radial direction. It holds up to 188 bytes, in 16-byte increments. The code can be recorded by a high-power YAG laser that melts the aluminum sputtering layer to create the lines; codes can be written in both single and dual layers. The code is read by a drive’s optical pickup. Recordable DVD media use the BCA to uniquely identify each disc; this can be used to encrypt recorded data for copy-protection purposes.

The read-only DVD data structure is similar to that of a CD-ROM in which data is stored in files within directories; this increases data density. DVD data is placed on a disc in physical sectors that run continuously without a gap from the lead-in to the lead-out areas. The lead-in area ends at address 02FFFF and data begins at address 030000. Two types of dual-layer discs are defined: parallel track and opposite track. In a parallel track path disc, addresses of both layers ascend from the inner radius, and there are two lead-in and lead-out areas. In an opposite track path disc, Layer 1 addresses ascend as the laser moves toward the inner radius, and there is one lead-in and lead-out area, and two middle areas. In a DVD disc, the lead-in area starts 2 mm closer to the center than on a CD.
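
The address boundary described above can be illustrated with a short sketch. It treats the quoted addresses as hexadecimal physical sector numbers and assumes, purely for illustration, that logical sectors are numbered from zero at the start of the data area on a single-layer disc (or on Layer 0); the function names are hypothetical.

```python
LEAD_IN_END = 0x02FFFF   # last physical sector of the lead-in (from the text)
DATA_START = 0x030000    # first physical sector of the data area (from the text)
USER_BYTES_PER_SECTOR = 2048


def is_lead_in(physical_sector: int) -> bool:
    """True if the physical sector number lies at or before the end of the lead-in."""
    return physical_sector <= LEAD_IN_END


def logical_sector(physical_sector: int) -> int:
    """Zero-based sector index within the data area (an assumed convention)."""
    if physical_sector < DATA_START:
        raise ValueError("sector lies before the data area")
    return physical_sector - DATA_START


def user_byte_offset(physical_sector: int) -> int:
    """Offset of the sector's first user byte from the start of the data area."""
    return logical_sector(physical_sector) * USER_BYTES_PER_SECTOR


print(logical_sector(0x030000), user_byte_offset(0x030010))
```

Opposite track path discs number Layer 1 differently, so this mapping is not meant to cover the second layer.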

A data sector comprises 2064 bytes, called Data Unit 1. It consists of 2048 bytes of main user data and 16 bytes of header; the latter comprises four bytes of sector identification (ID), two bytes of ID error detection (IED), six bytes reserved for copyright management, and four bytes of error detection code (EDC) data. A data sector can be viewed as a block of 2064 bytes with 172 bytes in 12 lines, as shown in Fig. 8.10. The four bytes of ID contain one byte of sector information and three bytes of sector number. A synchronization code is added to the head of every 91 bytes in the recording sector; this forms a physical sector; in all, 52 bytes of synchronization code are added. The initial 2048 bytes of user data is thus increased to 2418 bytes (2048 user + 16 header + 302 error correction + 52 synchronization).
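
The byte budget of a data sector can be made concrete with a minimal sketch. The field order used here (ID, IED, reserved bytes, user data, EDC), the big-endian sector number, and the zero-filled IED and EDC placeholders are assumptions for illustration; a real encoder computes IED and EDC from the sector contents. The function names are hypothetical.

```python
USER_BYTES = 2048
ROW_WIDTH = 172


def build_data_sector(sector_info: int, sector_number: int, user_data: bytes) -> bytes:
    """Assemble a 2064-byte data sector with placeholder check fields."""
    if len(user_data) != USER_BYTES:
        raise ValueError("user data must be exactly 2048 bytes")
    sector_id = bytes([sector_info]) + sector_number.to_bytes(3, "big")  # 1 + 3 bytes
    ied = bytes(2)       # placeholder for ID error detection
    reserved = bytes(6)  # reserved for copyright management
    edc = bytes(4)       # placeholder for the error detection code
    return sector_id + ied + reserved + user_data + edc


def as_rows(sector: bytes, width: int = ROW_WIDTH) -> list:
    """View the sector as rows of `width` bytes (12 rows of 172 bytes)."""
    return [sector[i:i + width] for i in range(0, len(sector), width)]


sector = build_data_sector(0x00, 0x030000, bytes(USER_BYTES))
rows = as_rows(sector)
print(len(sector), len(rows), len(rows[0]))

# Size of a physical sector once parity and synchronization bytes are added.
physical_sector_bytes = 2048 + 16 + 302 + 52
```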

FIGURE 8.10 DVD data is placed in sectors, each with 2064 bytes of data. Four bytes of ID data contain sector information.

Reed–Solomon Product Code

The DVD format employs a Reed–Solomon Product Code (RS-PC) for error correction. This code uses a combination of two Reed–Solomon codes (denoted by C1 and C2) as a product code. It differs from the CD’s CIRC code and is more similar to the code used in the DAT format. In the CD format, CD-ROM data can be coded with additional error correction; in the DVD format, all disc types use the same level of error correction. Moreover, in DVD, error concealment is not used. Instead, the data reliability of all DVD data must approach computer standards. The CIRC code uses a convolutional structure that is suited to long streams of data. In contrast, the matrix structure of the RS-PC code is suited to small blocks of data. It is a product code in which rows of outer parity are crossed with columns of inner parity. A small disadvantage of an RS-PC code is its larger memory requirement.

Error correction is more challenging on a DVD disc than on a CD because the pit size is smaller. In addition, because of the thin substrates, surface defects can more readily obscure the data surface (although the outer surface is out of focus to the reading laser in relation to the interior data layer). However, the superior RS-PC error correction code provides improved overall error protection compared to CIRC and is also more powerful than the double error correction used in the CD-ROM format. Because RS-PC is more efficient than CIRC in terms of overhead, its use increases data density by 16%.

In RS-PC, the two C1 and C2 product codes are (208,192) and (182,172) in length. The rate of the code is thus (172 × 192)/(182 × 208), or 0.872. RS-PC is applied to the 2048 bytes of main data, each error-correction code (ECC) block providing error correction encoding over 16 data sectors. A total of 302 bytes of error correction code are added to each sector. Each ECC block thus contains 32,768 bytes of user data, 4832 bytes of ECC, 96 bytes of sector EDC, and 160 bytes of ID and copy protection, totaling 37,856 bytes. A 16-byte outer code parity (PO) and 10-byte inner code parity (PI) are added to form recording sectors. PO is formed from 172 columns and yields 16 new rows. PI is formed from 208 rows (192 + 16). The data block is parsed into groups of 12 data rows plus one parity row to create recording sectors, each with 13 rows of 182 bytes (2366 bytes). Overall, error detection and correction data requires an overhead of about 13% of the recorded sectors. Single-byte synchronization codes are placed in the middle of each recording sector. This Data Unit 3 thus holds 2418 bytes (2366 + 52); this unmodulated physical sector is used to record data in all the recordable DVD formats.
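
The block arithmetic above can be verified with a short sketch. It does not compute any Reed–Solomon parity; it only reproduces the dimensions and shows one plausible way the 208 rows could be grouped into recording sectors. Pairing the n-th parity row with the n-th group of 12 data rows is an assumption made for illustration, and all names are invented for the example.

```python
DATA_ROWS, DATA_COLS = 192, 172   # 16 sectors x 12 rows, 172 bytes per row
PO_ROWS, PI_COLS = 16, 10         # outer parity rows, inner parity bytes per row
SECTORS_PER_BLOCK = 16

total_rows = DATA_ROWS + PO_ROWS   # 208
total_cols = DATA_COLS + PI_COLS   # 182

code_rate = (DATA_ROWS * DATA_COLS) / (total_rows * total_cols)
block_bytes = total_rows * total_cols
parity_bytes = block_bytes - DATA_ROWS * DATA_COLS
parity_per_sector = parity_bytes // SECTORS_PER_BLOCK
overhead = parity_bytes / block_bytes


def recording_sector_rows(sector_index: int) -> list:
    """Row indices of one recording sector: 12 data rows plus one PO row."""
    first = sector_index * 12
    data_rows = list(range(first, first + 12))
    po_row = DATA_ROWS + sector_index  # assumed pairing of parity row and sector
    return data_rows + [po_row]


rows = recording_sector_rows(0)
recording_sector_bytes = len(rows) * total_cols
print(f"rate={code_rate:.3f} block={block_bytes} parity={parity_bytes} "
      f"per sector={parity_per_sector} overhead={overhead:.1%}")
```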

The principal error criterion for CDs is the BLER measurement. In DVD discs, PI and PO error rates are used. PI errors use an 8 ECC block running sum. It counts the number of PI rows with any bad symbols. PI failures are the number of uncorrectable PI rows per ECC block. PO failures are the number of uncorrectable PO columns per ECC block. C1 and C2 can be decoded multiple times to improve performance. The maximum correctable burst length for CIRC is about 500 bytes (2.4 mm), while it is about 2200 bytes (4.6 mm) for RS-PC. The RS-PC can reduce a random input error rate of 2 × 10⁻² to a data error rate of 10⁻¹⁵. This is better than a CD by a factor of 10. For example, on a DVD-ROM, a burst error of about 2800 bytes can be corrected; this corresponds to an obstruction that is 6 mm in length.

EFMPlus Modulation

Data on DVD discs is recorded with EFMPlus modulation. It is an 8/16 RLL code that writes each 8-bit data byte as a 16-bit modulated channel byte. The modulated physical sector is thus 4836 bytes. EFMPlus is similar to the EFM code used in CDs; for example, it uses the same minimum (2) and maximum (10) run length and represents binary one channel bits as pit/land or land/pit transitions, and binary channel zero bits as no transition. EFM uses 8/14 coding with three merging bits to yield an 8/17 ratio. EFMPlus provides a 6% increase in user storage capacity compared to EFM because its coding is more efficient than EFM. As with many other codes, EFMPlus promotes timing recovery and suppression of low-frequency components. However, whereas EFM uses merging bits, a single lookup table, and simple concatenation rules to suppress low-frequency content, EFMPlus does not require merging bits and uses a more sophisticated lookup method. The EFMPlus encoder defines four lookup tables each with 351 possible source words. In practice, the source codebook size is 344; seven possible words are discarded to allow for a unique 26-bit synchronization word. Of these, 256 words are used to code input data. The remaining excess 88 words (344 − 256 = 88) are used to control low-frequency content.
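
Two properties described above, the run-length limits and the representation of channel ones as transitions, can be sketched without reference to the actual EFMPlus codebook. The 16-bit pattern below is a made-up example, not a codeword from the EFMPlus tables, and the function names are illustrative.

```python
def run_lengths_ok(channel_bits, min_zeros=2, max_zeros=10):
    """Check that consecutive ones are separated by 2 to 10 zeros."""
    ones = [i for i, bit in enumerate(channel_bits) if bit == 1]
    return all(min_zeros <= (b - a - 1) <= max_zeros
               for a, b in zip(ones, ones[1:]))


def to_waveform(channel_bits, start_level=0):
    """Convert channel bits to a pit/land waveform: each one toggles the level."""
    level = start_level
    waveform = []
    for bit in channel_bits:
        if bit:
            level ^= 1  # a channel one marks a pit/land or land/pit transition
        waveform.append(level)
    return waveform


# Hypothetical 16-bit channel pattern, chosen only to satisfy the run-length rule.
pattern = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(run_lengths_ok(pattern), to_waveform(pattern))

modulated_sector_bytes = 2418 * 16 // 8   # 8/16 coding doubles the sector size
gain_over_efm = 17 / 16 - 1               # 8/17 versus 8/16 channel bits per byte
```

The last two lines reproduce the 4836-byte modulated sector and the roughly 6% capacity gain over EFM quoted in the text.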

DC suppression is accomplished by monitoring the digital sum value (DSV); the surplus words are used as alternative channel representations for codewords 0 through 87 to create alternative tables. Either main or alternative codewords are actively selected to minimize the running DSV. The decoder uses an array to examine the current 16-bit codeword and two positions of the upcoming codeword to translate 16 + 2 channel bits into 8-bit data words. Data-to-clock jitter is measured from the EFMPlus signal to the PLL clock for one disc revolution. Data-to-data jitter (effect length) is measured as timing between pits and lands. This differs somewhat from the data-to-data jitter measurement often used in CDs. The DVD-ROM specification calls for jitter of less than 8% of the channel bit clock period; if this period is 38 ns, then jitter must be less than 3 ns. DVD discs do not have a separate subcode area as in CDs; the subcode functions are intrinsically contained in the data format.
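
The idea of choosing between a main and an alternative codeword to minimize the running DSV can be sketched as follows. This is a simplified illustration, not the EFMPlus encoder: it ignores the state-dependent lookup tables, and the two candidate patterns are hypothetical rather than taken from the codebook. The waveform is modeled as +1 or −1 per channel bit, toggling on each channel one.

```python
def dsv_after(channel_bits, start_dsv=0, start_level=-1):
    """Return (running DSV, final waveform level) after appending a codeword."""
    dsv, level = start_dsv, start_level
    for bit in channel_bits:
        if bit:
            level = -level  # a channel one flips the waveform polarity
        dsv += level        # accumulate +1 or -1 for every channel bit
    return dsv, level


def pick_codeword(candidates, dsv, level):
    """Choose the candidate that leaves the running DSV closest to zero."""
    return min(candidates, key=lambda word: abs(dsv_after(word, dsv, level)[0]))


# Hypothetical main and alternative 16-bit patterns for the same data byte.
main_word = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
alt_word = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

chosen = pick_codeword([main_word, alt_word], dsv=0, level=-1)
print(dsv_after(main_word), dsv_after(alt_word), chosen is alt_word)

# Jitter limit quoted in the text: 8% of a 38-ns channel bit period.
jitter_limit_ns = 0.08 * 38
```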

When a DVD disc is read, data passes through a buffer and then is evaluated by a navigator/splitter that separates the bitstream into video, sub-picture, audio, and navigational information. If necessary, the video, sub-picture, and audio data is descrambled and decoding takes place; for example, MPEG-2 video data is decoded as is Dolby Digital (AC-3) audio data. This can occur in a dedicated hardware chip or with software via a computer CPU. Video data is routed to the display monitor, audio is sent to other outputs, and navigational information is used by a controller for the user interface.

Universal Disc Format (UDF) Bridge

Part 2 of the DVD specification specifies a file format called UDF Bridge. It is fundamentally based on a simplified version of UDF called Micro UDF and includes ISO 9660. Read-only DVD discs must use the Universal Disc Format (UDF) Bridge for volume structure and file format; it is designed specifically for optical disc storage. It is common to the DVD-ROM, DVD-Video, and DVD-Audio formats (and applies to the write-once and rerecordable disc formats). However, application-specific parameters are unique to each of the B (Video) and C (Audio) books. Unlike CD-Audio, the DVD format is fundamentally computer-based, with a file format defined for all its applications. And, whereas CD-ROM was designed without designating a specific file format, a DVD disc must use UDF Bridge. UDF Bridge is a simplified version based on Part 4 of ISO/IEC 13346 and conforms to both UDF and ISO 9660 (the file format used in CD-ROM). In other words, DVD-Video and DVD-Audio are much closer in concept to CD-ROM than to CD-Audio.

UDF Bridge defines data structures such as volumes, files, blocks, sectors, CRCCs, paths, records, allocation tables, partitions, and character sets, as well as methods for reading, writing, and other operations. It is a flexible, multi-platform, multi-application, multi-language, multi-user oriented format that has been adapted to DVD. It is backward-compatible to existing ISO 9660 operating system software; however, a DVD-Video or DVD-Audio player supports only Micro UDF and not ISO 9660. UDF Bridge standardizes many file and directory names to simplify operation. For example, certain subdirectories are always read first. UDF Bridge permits use of DVD in Windows, Macintosh, Unix, OS/2, and DOS operating systems as well as dedicated players. Because the UDF Bridge file system is unified with the DVD format, host computers can be programmed to use DVDs. Conversely, a dedicated player can ignore files within UDF Bridge that it does not need for operation.

UDF Bridge defines the following: a Sector is the smallest addressable data field (2048 bytes); a Volume is a sector address space; a Volume Set is a collection of one or more volumes; a Volume Group within a volume consists of one or more consecutively numbered volumes; a File is a set of sectors with sector numbers in a continuously ascending sequence; an Application is a program that processes the contents of a file; a Descriptor contains information about a volume or file. UDF Bridge supports multiple extent files in which parts may be located noncontiguously on a disc. It supports file names with mixed cases, up to 255 characters long. It also supports Unicode, which covers all character sets. UDF Bridge also supports the resource fork and data fork in Macintosh files. UDF Bridge specifies a time stamp: year (1 to 9999), month, day, hour, minute, second, centiseconds, hundreds of microseconds, and microseconds. The UDF specification was developed by the Optical Storage Technology Association.

Because of its diverse applications, the UDF Bridge file format specification is quite detailed. However, its basic directory and file structure is quite explicit. For example, Fig. 8.11A shows how data is organized in a DVD-Video “Video” disc. Under a root directory, a DVD-Video zone and DVD-Other zone are defined. Within the DVD-Video zone, the VIDEO_TS directory contains both menu and program data. In particular, a Video Manager defines file types and organization of both video and audio data, and Video Title Set (VTS) subdirectories contain video and audio data files (such as MPEG-2 video and Dolby Digital audio). One Video Manager can contain up to 99 VTS subdirectories. Other computer data may be contained in the DVD-Other zone; this data may be used by DVD-ROM drives, and is ignored by DVD-Video players. This information may allow Internet connectivity, enhanced music, and other features.

Figure 8.11B shows how data is organized in a DVD-Audio “Audio-Only” disc. Information is contained in the DVD-Audio zone. Audio data, such as PCM, is contained in an Audio Title Set (ATS). An Audio Manager defines file types and organizes both audio and video data. Both menu and program data are included. Figure 8.11C shows how data is organized in a DVD-Audio AV “Audio with Video” disc. Audio data is contained in an Audio Title Set and video data in a Video Title Set. The Audio Manager and Video Manager define file types and organize both audio and video data. Both menu and program data are included. The Audio Manager can control a subset of the DVD-Video data. Link Info shows that a DVD-Audio player can play audio components of video contents. Figure 8.11D shows how data is organized in a DVD-Video VAN “Video Audio Navigation” disc. Audio data is contained in an Audio Title Set, and video data in a Video Title Set. The Audio Manager and Video Manager define file types and organize both audio and video data; both menu and program data are included. “Link Info” shows that a DVD-Audio player can play audio components of video contents.

DVD-Video

DVD-Video was the first DVD format to be launched and is the most widely used DVD format. In a typical application, a DVD-Video disc stores a feature film with 5.1-channel and stereo soundtracks. The film industry participated in the development of DVD-Video, and it was designed according to recommendations from the Studio Advisory Committee: approximately 135 minutes of digital video on one disc side, approaching CCIR-601 broadcast picture quality, stereo or multichannel digital audio, multiple aspect ratios, up to 8 language soundtracks, up to 32 subtitle streams, parental control options, and copy protection.

FIGURE 8.11 The same directory and file structure is used for different types of DVD discs. A. In a “Video” DVD-Video disc, the DVD-Video zone contains the Video Manager and VTS subdirectories that contain data files. B. In an “Audio-Only” DVD-Audio disc, the DVD-Audio zone contains the Audio Manager and ATS subdirectories. C. In an “AV” DVD-Audio disc, the Audio Manager can control a subset of the DVD-Video data. D. In a “VAN” DVD-Video disc, Link Info allows a DVD-Audio player to play audio components of video contents.

DVD-Video Video Coding

The task of storing a feature film on an optical disc is far from trivial. A CD disc is woefully inadequate for high-quality video storage. For example, if a movie were coded at a video bit rate of 166 Mbps, a CD would hold only about 40 seconds of this uncompressed video and would have to spin at a rate of 58,850 rpm, or 118 times faster than its normal speed. To accomplish its task, a DVD-Video disc takes advantage of the inherently higher capacity and output bit rate of DVD, and more importantly employs data reduction techniques. Although a DVD-Video data layer (4.7 Gbytes) can hold seven times the data of a CD, it is insufficient to store a feature film; at a bit rate of 166 Mbps, a layer would hold less than 4 minutes of video. To overcome this, the DVD-Video standard uses the MPEG-2 data compression algorithm to encode its video program. The algorithm used for DVD-Video is based on the MPEG-2 Main Profile at Main Level protocol, also known as MP@ML.

MP@ML is an intermediate level, and below the High Level sometimes used for digital television (DTV). However, MP@ML yields a high-quality picture that equals that of the professional CCIR-601 (or D-1) standard operating at a rate of 270 Mbps. Even if it could be recorded to a DVD-Video disc without compression (it can’t), this CCIR-601 video bit-stream would fill a single-sided, single-layer DVD disc in 140 seconds. To instead store over 2 hours of an audio/video program would imply an overall reduction of about 60:1. However, several prefiltering operations reduce the burden of algorithmic compression.

An NTSC CCIR-601 signal assumes a sampling rate of 858 samples per scan line, 525 lines per frame (yielding an 858-by-525 display) and 30 frames per second; however, many of these samples are offscreen in blanking intervals. Thus, DVD-Video reduces the number of pixels to a 720-by-480-pixel display. The video bit rate is further decreased by decreasing the word length (for example, from 10 bits per sample to 8 bits per sample). Furthermore, rather than code RGB components, a YCrCb representation can be used more efficiently. For example, both the vertical and horizontal resolution of the chrominance information can be halved; the video program is stored as 4:2:0 component video instead of 4:2:2. These steps reduce the bit rate by 54%.

Additional efficiency can be realized when coding movies. Movies are filmed at 24 frames per second, whereas DVD-Video operates at 30 frames per second; this means that after conversion, 6 out of every 30 video frames are repeated, and these repeated frames need not be separately coded. Overall, this type of prefiltering on the input signal may decrease the bit rate by 63% (for movie sources). Although the bit rate may be “only” 100 Mbps, it still requires algorithmic compression. For example, to place a 133-minute movie on a single-sided, single-layer disc, an average compression ratio of 21:1 is still needed.
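The percentages quoted for prefiltering can be reproduced with a few lines of arithmetic. The sketch below assumes that 4:2:2 sampling averages two samples per pixel and 4:2:0 averages one and a half, so that 10-bit 4:2:2 costs 20 bits per pixel and 8-bit 4:2:0 costs 12; it also takes a layer as 4.7 billion bytes. The function and variable names are illustrative only.

```python
def video_bit_rate(width, height, fps, bits_per_pixel):
    """Uncompressed video bit rate in bits per second."""
    return width * height * fps * bits_per_pixel


# CCIR-601 NTSC: 858 x 525 samples, 30 frames/s, 10-bit 4:2:2 (20 bits/pixel).
ccir_601 = video_bit_rate(858, 525, 30, 20)

# DVD-Video picture: 720 x 480, 8-bit 4:2:0 (12 bits/pixel).
dvd_30fps = video_bit_rate(720, 480, 30, 12)
dvd_24fps = video_bit_rate(720, 480, 24, 12)   # film source, repeats not coded

reduction_30 = 1 - dvd_30fps / ccir_601
reduction_24 = 1 - dvd_24fps / ccir_601

# One 4.7-Gbyte layer and a 133-minute program.
LAYER_BITS = 4_700_000_000 * 8
seconds_of_ccir_601 = LAYER_BITS / ccir_601
available_rate = LAYER_BITS / (133 * 60)
compression_ratio = dvd_24fps / available_rate

print(f"CCIR-601 rate: {ccir_601 / 1e6:.1f} Mbps")
print(f"reduction at 30 frames/s: {reduction_30:.0%}")
print(f"reduction at 24 frames/s: {reduction_24:.0%}")
print(f"layer filled by CCIR-601 in {seconds_of_ccir_601:.0f} s")
print(f"compression still needed: {compression_ratio:.0f}:1")
```

The results agree with the text: roughly 270 Mbps for CCIR-601, reductions of 54% and 63%, a layer filled in about 140 seconds, and a remaining compression ratio near 21:1.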

The MPEG-2 video compression algorithm uses psychovisual models to analyze the video signal to determine how a human viewer will perceive it. Image data that is deemed redundant, not perceived, or marginally perceived, is not coded. This analysis is carried out for both individual video frames and series of frames. Over time, as much as 95% of the video data can be omitted without significant degradation of the picture. Video compression is discussed further in Chap. 16.

An important aspect of MPEG-2 coding is its variable bit rate (MPEG-1 uses a fixed bit rate). Because some pictures are more difficult to code than others, MPEG-2 allows for a variable bit rate. The bit rate needed to code motion pictures can vary greatly from scene to scene. For example, a scene with a “talking head” surrounded by a static background would require a relatively low bit rate. A fast action scene with a changing complex picture would require a higher bit rate. MPEG-2 encoders output a changing bit rate that reflects the changing degree of picture complexity and coding difficulty; in this way, bits are not wasted on low complexity frames, and artifacts are avoided in high complexity frames. In some encoders, video content may be coded in three passes, to optimize coding efficiency.

The MPEG-2 algorithm is specifically engineered so that improvements can be made in the encoding algorithm while retaining complete compatibility with existing decoders. Thus, the look of video software titles can improve, and the improvement will be seen on current and future players. The picture quality of a particular DVD-Video title is also influenced by the expertise used in the picture encoding. An overview of the DVD-Video properties is shown in Table 8.2.

TABLE 8.2 Summary of the principal characteristics of the DVD-Video format.

DVD-Video Audio Coding

The audio portion of the DVD-Video standard accommodates both multichannel and stereo soundtracks. DVD-Video titles can accommodate up to eight independent audio bitstreams. These audio bitstreams can be 1 to 8 channels of PCM, 1 to 6 channels (up to 5.1: five main channels plus a low-frequency effects channel) of Dolby Digital, or 1 to 8 channels (5.1 or 7.1) of MPEG-2 audio. An NTSC title must include at least one Dolby Digital or PCM audio track. A PAL title must contain at least one Dolby Digital, PCM, or MPEG-2 audio track. A disc can also optionally employ DTS, SDDS, or other audio coding. Neither MP3 nor AAC coding is allowed in the DVD-Video or DVD-Audio zone; however, these codings may be used in areas outside the zones. Dolby Digital is the standard coding used for multichannel soundtracks in the United States and Canada (Region 1) and other regions as well. Typically, Dolby Digital soundtracks are either stereo or 5.1-channel. For strictly commercial reasons, NTSC players generally play only NTSC discs, but many PAL players can play both NTSC and PAL discs. In either case, the corresponding display (NTSC or PAL) must be connected. Most computers can play both NTSC and PAL discs.

As used with DVD-Video, Dolby Digital (AC-3) codes 1 to 5 main discrete channels plus a discrete low-frequency effects (LFE) channel. A rear center channel can be added using Dolby Digital Surround EX which uses phase matrix encoding. The Dolby Digital sampling frequency is 48 kHz (the norm in digital video applications), the nominal output bit rate is 384 kbps, and the maximum bit rate is 448 kbps. Dolby Digital accommodates resolution of up to 24 bits. Dolby Digital decoders must also be able to down-mix 5.1-channel soundtracks to stereo PCM. The center and surround channels are matrixed with the main stereo channels in the Dolby Surround format (that can be decoded by Dolby Pro Logic decoders). The LFE channel is not included in the down-mix (for that reason, surround mixes should have bass content that takes full advantage of the low-frequency response of the main channels). An alternative to downmixing in the player is to simply create a new stereo mix and place that on the disc.

In the DVD-Video format, DTS can be optionally used to code 1 to 6 discrete main channels of audio data plus a discrete LFE channel. A rear center channel can be added discretely or by matrixing. The DTS sampling frequency is 48 kHz, and resolution is up to 24 bits. Typical bit rates for a 5.1-channel program are 768 kbps or 1536 kbps (the maximum rate). MPEG-1 stereo audio (Layer II) is sampled at 48 kHz with a maximum resolution of 20 bits and a maximum bit rate of 384 kbps. MPEG-2 multichannel audio (BC matrix mode) codes up to 7.1 channels. It is also coded at 48 kHz with a maximum resolution of 20 bits and a maximum bit rate of 912 kbps. The AAC codec is not supported. Audio codecs such as Dolby Digital, DTS, MPEG-1, and MPEG-2 are discussed in more detail in Chap. 11.

For compatibility, DVD-Video movies also carry a redundant PCM digital stereo soundtrack. These PCM audio tracks can employ sampling rates of either 48 kHz or 96 kHz, and word lengths of 16, 20, or 24 bits. The CD sampling frequency of 44.1 kHz is not supported on DVD-Video; files must undergo sampling-rate conversion. These PCM audio configurations are supported: 16/48 (up to eight channels), 20/48 (up to six channels), 24/48 (up to five channels), 16/96 (up to four channels), 20/96 (up to three channels), and 24/96 (up to two channels). Because up to eight independent PCM channels are permitted, for example, movies can be released in eight different languages. PCM coding can also employ a dynamic range control (the same provision as in the DVD-Audio specification). This is a disc option and player requirement. The maximum PCM bit rate allowed on a DVD-Video disc is 6.144 Mbps. Of course, an increase in the audio bit rate decreases the bit rate available to the digital video signal.
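The channel limits listed above follow from the 6.144-Mbps ceiling on PCM audio. A minimal sketch, assuming the limit is simply the largest whole number of channels that fits under that ceiling (capped at eight), reproduces the table; the function name is illustrative.

```python
MAX_PCM_BPS = 6_144_000   # maximum PCM bit rate on a DVD-Video disc
MAX_CHANNELS = 8


def max_pcm_channels(word_length_bits, sampling_rate_hz):
    """Largest channel count whose total bit rate stays within the ceiling."""
    per_channel = word_length_bits * sampling_rate_hz
    return min(MAX_CHANNELS, MAX_PCM_BPS // per_channel)


for rate in (48_000, 96_000):
    for bits in (16, 20, 24):
        channels = max_pcm_channels(bits, rate)
        total = channels * bits * rate / 1e6
        print(f"{bits}/{rate // 1000}: up to {channels} channels ({total:.3f} Mbps)")
```

Only the 16/48 eight-channel and 16/96 four-channel cases use the full 6.144 Mbps; the others leave some of the allowance unused.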

To summarize, on a DVD-Video disc, the maximum channel bit rate for DVD is 26.16 Mbps. Following demodulation, this rate is halved to 13.08 Mbps. Following error correction, the maximum bit transfer rate is 11.08 Mbps. The maximum user data bit rate for video, audio, and sub-picture is 10.08 Mbps. The maximum bit rate into the respective buffers is 9.8 Mbps for MPEG-2 video, 1.856 Mbps for MPEG-1 video, 6.144 Mbps for PCM audio (DVD-Video), 9.6 Mbps for MLP/PCM audio (DVD-Audio), and 3.360 Mbps for sub-picture. In addition, these limits apply: 448 kbps for Dolby Digital, 384 kbps for MPEG-1 audio, 912 kbps for MPEG-2 audio, 1536 kbps for DTS, and 1280 kbps for SDDS. From these bit rates, playing times can be calculated; for example, a DVD-9 disc would hold 42 hours and 22 minutes of 5.1-channel Dolby Digital data. Similarly, disc authors can calculate bit budgets, to ensure that contents can fit on a particular disc format.
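The playing-time example can be checked directly. The sketch below assumes a DVD-9 capacity of 8.54 billion bytes, a commonly quoted nominal figure that is not stated in this passage, and rounds to the nearest minute.

```python
def playing_time(capacity_bytes, bit_rate_bps):
    """Return (hours, minutes) of playback, rounded to the nearest minute."""
    total_minutes = round(capacity_bytes * 8 / bit_rate_bps / 60)
    return divmod(total_minutes, 60)


DVD9_BYTES = 8_540_000_000      # assumed nominal DVD-9 capacity
DOLBY_DIGITAL_MAX = 448_000     # bits per second, from the limits above

hours, minutes = playing_time(DVD9_BYTES, DOLBY_DIGITAL_MAX)
print(f"{hours} hours {minutes} minutes")
```

The same function can be used with any of the other bit rates listed above to estimate the playing time of a single stream on a given disc type.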

Figure 8.12 shows an example of how the bit rates of various contents are accommodated on a DVD-Video disc. In this example, the average video bit rate is 3.5 Mbps, there are three audio soundtracks each at 0.384 Mbps, and four subtitle streams each at 0.01 Mbps. This yields a total bit rate of 4.692 Mbps, which is within the DVD-Video specification. At this average bit rate, one 4.7-Gbyte layer would hold a 133-minute program comprising 4.68 Gbytes. This capacity can accommodate over 90% of all feature films; longer titles can use dual-layer discs. A 4.7-Gbyte disc might hold 133 minutes of program, an 8.5-Gbyte disc might hold 241 minutes, a 9.4-Gbyte disc might hold 266 minutes, and a 17-Gbyte disc might hold 482 minutes. These timings are only representative examples; different combinations of content streams and their bit rates would yield different durations. During authoring, data reduction is applied so that content fits available disc capacity, within the constraints of sound and picture quality.
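A bit budget of this kind is easy to script. The following sketch uses the stream counts and rates from the example; the dictionary layout and function names are illustrative, and capacities are taken as decimal Gbytes.

```python
MAX_USER_BPS = 10_080_000   # maximum combined video, audio, and sub-picture rate

# stream type -> (number of streams, bit rate of each in bits per second)
streams = {
    "video": (1, 3_500_000),
    "audio": (3, 384_000),
    "subtitle": (4, 10_000),
}

total_bps = sum(count * rate for count, rate in streams.values())


def program_bytes(minutes):
    """Bytes needed to store a program of the given length."""
    return total_bps * minutes * 60 // 8


def minutes_on(capacity_bytes):
    """Whole minutes of program that fit in the given capacity."""
    return capacity_bytes * 8 // (total_bps * 60)


print(f"total rate: {total_bps / 1e6:.3f} Mbps (within limit: {total_bps <= MAX_USER_BPS})")
print(f"133 minutes needs {program_bytes(133) / 1e9:.2f} Gbytes")
for gbytes in (4.7, 8.5):
    print(f"{gbytes} Gbytes holds {minutes_on(int(gbytes * 1e9))} minutes")
```

Applied to 9.4 and 17 Gbytes, the same calculation gives about a minute more than the 266 and 482 minutes quoted, which appear to be twice the single-layer and dual-layer figures.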

FIGURE 8.12 An example of how a video program, three audio programs, and four subtitle streams can be placed on a DVD-Video disc, while maintaining an overall capacity requirement of less than 4.7 Gbytes.

DVD-Video Playback Features

A DVD-Video player is connected to a home theater system and is used to play motion pictures and other video programs from DVD-Video and VAN discs (see below), as well as the video contents on a DVD-Audio AV “Audio with Video” disc and other compatible audio portions such as Dolby Digital tracks. In addition, a DVD-Video player can play audio CDs (not all players can play CD-R discs). Universal DVD players can play both DVD-Audio and DVD-Video discs. DVD-Video discs cannot be played in CD players. Most DVD-Video players provide a component video output (a professional video standard that avoids the carrier frequencies used in composite video signals) as well as an HDMI output. The 8-bit video signal is generally reproduced with D/A converters with 10-bit resolution. The DVD-Video standard provides a parental lockout; movies can be coded to play different versions, skipping potentially offensive scenes or using alternate scenes and dialogue tracks. A block diagram of the principal elements and signal flow in a DVD-Video player is shown in Fig. 8.13.

Hybrid DVD-Video discs may contain a movie that is playable in a dedicated DVD-Video player and a DVD-ROM-enabled PC; in addition, the ROM drive may be used to access disc contents such as the movie’s screenplay, interactive games, screen savers, and hyperlinks to Web sites. The DVD-Video specification (Part 5) also describes a hybrid video-audio disc. Because it contains “video audio navigation” information, it is sometimes known as a VAN disc. VAN discs are video discs but they contain audio information that can be played on DVD-Audio players. The content provider selects which portions of the audio tracks can be played on DVD-Audio players.

FIGURE 8.13 A DVD-Video player reads data from a disc, distinguishes between CD and DVD data, and directs data types to specific decoding circuits for processing and signal output.

DVD-Video supports both normal (4:3) and widescreen (16:9) aspect ratios and an automatic pan-and-scan feature. To perform pan-and-scan, the player must use data specially inserted in the bitstream to display portions of a 16:9 picture on a 4:3 screen. In 16:9 mode, the player ignores the pan-and-scan data and instead produces a wide-screen image. Alternatively, a 16:9 image can be letterboxed on a 4:3 screen with black stripes at the top and bottom of the screen. Other features include chapter division, forward and reverse scanning, up to nine camera angles and interactive story lines. These features are disc options, and implementation is left to the content provider. Digital audio data stored at a 96 kHz sampling rate is not output through a player’s conventional digital audio outputs; it is downsampled to 48 kHz.

From an audio standpoint, the most significant feature of DVD-Video (and DVD-Audio) is multichannel playback of surround sound. Although discs can be coded with different audio channel outputs, the de facto standard is 5.1-channel playback. Figure 8.14 shows a loudspeaker configuration recommended for reproducing Dolby Digital surround sound. The front left/right speakers are placed at ±30° and the surround speakers at ±110° from the center. Optimal placement of the LFE channel speaker depends on room acoustics. One or two additional rear channel speakers are sometimes added.

FIGURE 8.14 This loudspeaker configuration, taken from the ITU-R BS.775-2 recommendation, is often used for playback of multichannel audio from DVD, HDTV, and other surround sources. Other configurations are also used. Optimal placement of the LFE subwoofer depends on room acoustics. In some cases, additional rear speakers are added.

The DVD-Video format supports up to 32 channels of sub-picture information. Sub-pictures are bitmapped graphic files that are overlaid onto the picture. Sub-pictures are generally used for captions, subtitles, or other text. Sub-pictures can be scrolled and faded and can change in every video field. The color palette comprises 16 colors and 16 contrast values; four colors and four contrast values can be displayed in one channel at a time. Sub-picture data is a run-length coded bitmap with 2 bits/pixel; each sub-picture stream has a bit rate of 10 kbps. Sub-picture information can be accessed by timecode or user button depressions to produce graphics and simple animation.

Discs and players contain regional coding flags so players will only play properly coded discs. For example, a Region 2 (Europe and Japan) player will not play discs intended for the North American (Region 1) market. This allows movie studios to control release of titles to different global markets. Regional codes on discs are optional; support is mandatory on players. Discs can be made with multiple codes, or no codes; discs with no codes will play on all players, regardless of country. There are six geographic regions: (1) Canada and the United States and its territories; (2) Japan, Western Europe, South Africa, Turkey, and the Middle East; (3) South Korea, Southeast and East Asia including Hong Kong; (4) Australia, New Zealand, Pacific Islands, Central America, Mexico, South America, and the Caribbean; (5) Russia, Indian Subcontinent, Africa, North Korea, and Mongolia; (6) People’s Republic of China and Tibet. Region 7 is reserved. Region 8 coding is used for nontheatrical venues such as airplanes, cruise ships, hotels, and so on.
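The playback rule described above can be expressed compactly. This is a sketch of the logic only: it models a disc's regional coding as a set of region numbers, which is a simplification chosen for readability and not the way the flags are stored on a disc.

```python
def can_play(player_region, disc_regions):
    """Return True if a player of the given region accepts the disc.

    player_region: the single region number assigned to the player.
    disc_regions:  set of region numbers coded on the disc; an empty set
                   models a disc with no regional codes.
    """
    if not disc_regions:
        return True                     # uncoded discs play everywhere
    return player_region in disc_regions


print(can_play(2, {1}))        # Region 2 player, Region 1 disc
print(can_play(1, {1}))        # matching regions
print(can_play(4, {1, 2, 4}))  # disc coded for several regions
print(can_play(5, set()))      # disc with no regional codes
```

Because support is mandatory on players but optional on discs, the check always runs in the player; the content provider decides only which regions, if any, to code.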

DVD-Video Authoring

A DVD-Video disc can hold video content, multiple soundtracks, subtitles, bonus features, and Web links, and allow elaborate navigation features. A DVD-Audio disc can hold stereo and multichannel music as well as videos, still photographs, graphics, animation, lyrics, and other features. In addition, most DVD-Audio discs hold a Dolby Digital mix for playback in DVD-Video players. A simple DVD title can be authored on a personal computer. A complex title can entail use of diverse professional authoring tools; in many cases, specialists in audio, video, graphic design, and interactivity handle different aspects of the workflow. Whether simple or complex, all title production is computer-based. First, the contents and functionality of the title are determined, along with navigational use of the contents. Also, a bit budget is determined to ensure that all contents will fit on a disc’s layers, and peak output bit rates are checked. In most cases, content is already prepared before authoring begins. However, authoring steps must ensure that content adheres to the DVD standards. Audio and video content is supplied and coding is applied when necessary. Artwork, transition sequences, and still or animated menu graphics are created. Web-based multimedia programs are created. Following postproduction, all video and audio content is processed with appropriate coding such as Dolby Digital, DTS, or MPEG-2.

In DVD-Video authoring, program chains for parental lockout, camera angles, alternate endings, regional coding information, and supplemental information are prepared. In some cases, movie soundtracks need to be reconformed to the video being used, or remixed to a surround format. Also in DVD-Video authoring, it is essential that time synchronization of video and audio content is assured on the final disc. In DVD-Audio authoring, MLP coding and downmixing data are applied as necessary. Menu buttons are linked to content or other menus. Elementary streams are created and checked. The simulation is checked for any violations of the DVD specification and thoroughly tested and debugged. The streams are input to an authoring system to create a DVD disc image. The image is compiled to a virtual disc format; data is multiplexed into a single composite stream, navigation files are generated, and data is formatted according to the UDF Bridge (Micro UDF/ISO 9660) format. The image is placed on a hard-disk drive and playback is emulated with a DVD player application. The image is burned to DVD-R or another disc type and tested again. All audio and video elements are reviewed, and sound-to-picture synchronization is checked, along with disc navigation and functionality. In some cases, the image is played from the DVD-R into the authoring workstation and the emulation is further checked and debugged there. A finished, encrypted version is written to DLT tape along with DDP data. A manufacturing plant produces test discs that are checked for functionality and player compatibility. After approval, finished discs can be replicated. A list of the principal steps in DVD-Video authoring (which is similar to DVD-Audio authoring) is given in Table 8.3.

TABLE 8.3 Principal steps in the authoring of a DVD-Video disc.

DVD-Video Developer’s Summary

Programmers and others developing DVD-Video products should know the specific definitions of terms, file structures, and the interrelationships of file types. This section provides such overview information. As noted, the Part 3 DVD-Video format adheres to Parts 1 and 2 of the DVD specification. It employs the UDF file format; Part 3 specifically defines how the user can access disc contents (Navigation) and how the video data itself is structured (Video Objects). Discs can contain multiple titles; for example, there might be a movie and a trailer, or perhaps several short films. The Title is a disc’s highest level of navigation, and users select which title they wish to view. Internally, the title manager contains one or more Video Title Sets (VTSs) and Program Chains (PGCs) that also contain audio and video elements. Within each title, a main menu shows particular contents, possibly leading to submenus. Typically, four directional buttons are used for onscreen selection, and an enter button activates selections within a menu. Other dedicated buttons access specific features such as audio tracks and subtitles. This and other navigation structure is defined in Part 3.

Part 3 defines a video disc for moving pictures. The Presentation data structure complies with the MPEG-1 and MPEG-2 specifications. A Pack is a pack header followed by one or more packets. A Pack is a layer in the system coding that is described in ISO/IEC 13818-1 (the MPEG-2 stream layer specification). ISO/IEC 13818-1 defines the program stream used on disc and the transport stream used for broadcast; 13818-2 defines video compression; 13818-3 defines audio compression for surround sound. (MPEG-1 is defined in ISO/IEC 11172-1, -2, and -3.)

A Packet is the elementary data stream following the header; there are five kinds of Packets. A Stream-ID defines the type of packet and is defined in ISO/IEC 13818-1. An ISO/IEC 13818-1 stream contains five packetized elementary streams of video, audio, sub-picture, presentation control information (PCI), and data search information (DSI). A Cell is a group of MPEG frames of indeterminate length starting and ending with an I-frame. An I-frame is an intra-coded picture without temporal prediction, as opposed to P (forward-predicted) frames and B (bidirectional) frames. Cells are used by PGCs, which are part of the Navigation system. (There are also audio cells.)

A Program Chain (PGC) contains navigation pointers to cells and cell groups. These PGCs can be selected by the viewer. For example, one PGC pointing to cells 1 to 20 and 35 to 50 would show the G-rated version of a movie, and another PGC pointing to cells 1 to 50 would show the complete R-rated version.
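The parental-lockout example can be sketched by treating each PGC as a list of inclusive cell ranges. This is an illustration of the idea only; the names and the range representation are hypothetical and do not reflect the on-disc PGCI layout.

```python
def cells_in(pgc):
    """Expand a PGC, given as inclusive (first, last) cell ranges, into cell numbers."""
    return [cell for first, last in pgc for cell in range(first, last + 1)]


# Two program chains over the same set of cells.
G_RATED = [(1, 20), (35, 50)]
R_RATED = [(1, 50)]

g_cells = cells_in(G_RATED)
r_cells = cells_in(R_RATED)
skipped = sorted(set(r_cells) - set(g_cells))

print(f"G-rated version plays {len(g_cells)} cells")
print(f"R-rated version plays {len(r_cells)} cells")
print(f"cells skipped in the G-rated version: {skipped[0]} to {skipped[-1]}")
```

Both versions share the same stored cells; only the pointers differ, so the alternate version costs almost no additional disc space.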

The Volume Space of a DVD-Video disc consists of the Volume and File structure, a single DVD-Video zone, and a DVD-Other zone (see Fig. 8.11A). A DVD-Video zone consists of one Video Manager (VMG) and one or more Video Title Sets (VTSs). The VMG is the table of contents for all Video Title Sets; each VTS is a collection of titles. The VMG contains a main menu for disc title, text data, and so on. A VTS contains a menu for title chapter, language for audio/sub-picture, Program Chain Information (PGCI), and audio-video VOBS data. A Video Object Set (VOBS) is a collection of Video Objects (VOBs) that hold presentation data such as video, audio, or sub-picture data. For example, a VOB might contain all of an MPEG-2 video program. A Title consists of one or more PGCs, each containing Program Chain Information and VOBs. Titles with multiple PGCs permit branching, multiple story lines, etc. The DVD-Video data structure is shown in Fig. 8.15.
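The containment relationships described above can be modeled with a few data classes. This is a conceptual sketch for orientation: the class and field names are hypothetical, the fields are reduced to the minimum needed to show the hierarchy, and the limit of 99 Video Title Sets is the one given earlier in the chapter.

```python
from dataclasses import dataclass, field
from typing import List

MAX_VTS = 99   # one Video Manager can reference up to 99 Video Title Sets


@dataclass
class VideoObject:
    """A VOB: presentation data such as video, audio, or sub-picture."""
    name: str


@dataclass
class ProgramChain:
    """A PGC: Program Chain Information plus the VOBs it presents."""
    pgci: str
    vobs: List[VideoObject] = field(default_factory=list)


@dataclass
class Title:
    """A Title: one or more PGCs; several PGCs permit branching."""
    program_chains: List[ProgramChain]


@dataclass
class VideoTitleSet:
    """A VTS: a collection of titles."""
    titles: List[Title] = field(default_factory=list)


@dataclass
class VideoManager:
    """The VMG: table of contents for all Video Title Sets."""
    main_menu: str = "disc main menu"


@dataclass
class DVDVideoZone:
    vmg: VideoManager
    title_sets: List[VideoTitleSet]

    def __post_init__(self):
        if not 1 <= len(self.title_sets) <= MAX_VTS:
            raise ValueError("a DVD-Video zone needs 1 to 99 Video Title Sets")


movie = Title([ProgramChain("main feature", [VideoObject("feature video")])])
zone = DVDVideoZone(VideoManager(), [VideoTitleSet([movie])])
print(len(zone.title_sets), "title set,", len(zone.title_sets[0].titles), "title")
```

A title with alternate endings would simply carry additional ProgramChain entries that point to different Video Objects.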

A DVD-Video zone also contains Navigation data (playback control) and Presentation data (the video program to be played back). The Navigation Manager handles navigation data (VMGI, VTSI, PCI, and DSI) to control the user interface, control playback, interpret user actions, and determine how the Presentation Engine should play back Presentation data. A Button is an onscreen user control; Buttons are defined in PCI. A Menu is an onscreen display that includes Buttons.

FIGURE 8.15 The DVD-Video data structure can be viewed as a disc image with the DVD-Video zone holding the Video Manager and Video Title Sets. DVD-Audio follows the same structure.

The Presentation Engine follows instructions issued by the Navigation Manager to play Presentation data from the disc and control the displayed output. Presentation data is divided into cells. Presentation data consists of VOBs. For example, a chapter may be a VOB. Different VOBs may be used for different scenes, cuts, and so on (for director’s cut, angles, parental lockout, and the like). A VOBS consists of one or more Video Object blocks.

Navigation data consists of attributes and playback control for the Presentation data. Navigation data allows the user to access disc contents. Content providers can use this data to code branching and interactivity. There are four types: Video Manager Information (VMGI), Video Title Set Information (VTSI), Presentation Control Information (PCI), and Data Search Information (DSI). VMGI is described in the Video Manager (VMG). It describes information in the VIDEO_TS directory. Data includes a video copy flag, an audio copy flag, number of volumes, disc side identifier, NTSC/PAL, aspect ratio, picture resolution, number of audio streams, audio-coding method, quantization, sampling rate, number of channels, sub-picture coding, menu language, and parental management.

Video Title Set Information (VTSI) is described in the Video Title Set (VTS). It describes information for one or more Video titles and the Video Title Set Menu. Its data is similar to VMGI. PGCI pointers are the Navigation data used to control presentation of the PGC and order of cell playback. PGCI pointers are usually played sequentially, but can be played in random or shuffled sequence. PGC is composed of PGCI and VOBs. For example, PGC may be used to create interactive programs. PGC has data such as presentation time in hours, minutes, seconds, frames, and cell information. PCI is dispersed in the VOBS along with Presentation data. PCI is the Navigation data used to control the presentation of a VOB Unit (VOBU). PCI is used by the playback engine to control what is seen and heard. PCI has data such as angle information, highlight information, relation between sub-picture and highlight, and buttons.

Data Search Information (DSI) is dispersed in the VOBS along with Presentation data. DSI is the Navigation information used to search and seamlessly play back the VOBU. DSI is also used for navigation and search control (such as branching). DSI has data such as interleaving, start address, and synchronization. Navigation Commands are used by content providers to allow changes in player operation including branching and interactivity (as opposed to linear playback). Navigation commands appear in PCIs and PGCIs. DVD-Video uses Navigation commands to provide a high degree of interactivity. Linking, looping, jumping, searching, and decision making are built into the specification as navigation commands. Software developers use this standardized command set. There are twenty-four 16-bit System Parameters (SPRMs) registers (such as angle number, video capability, audio capability, parental level, and language code) for player settings and sixteen 16-bit General Parameters (GPRMs) registers (such as go to, jump, link, and compare) to memorize the user’s operational history and modify the player’s operation.
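The register model described above can be sketched in a few lines. This is only an illustration of twenty-four 16-bit SPRMs and sixteen 16-bit GPRMs; the class name, method names, and the compare-and-branch helper are hypothetical and do not reproduce the actual navigation command encoding.

```python
class PlayerRegisters:
    """Toy model of a player's parameter registers (names are illustrative)."""

    NUM_SPRM = 24   # System Parameters: player settings
    NUM_GPRM = 16   # General Parameters: memory available to disc programs
    MASK = 0xFFFF   # every register is 16 bits wide

    def __init__(self):
        self.sprm = [0] * self.NUM_SPRM
        self.gprm = [0] * self.NUM_GPRM

    def set_gprm(self, index, value):
        # Keep only the low 16 bits, as a fixed-width register would.
        self.gprm[index] = value & self.MASK

    def branch_if_equal(self, index, constant, target, fallthrough):
        # Hypothetical compare-and-jump in the spirit of the navigation commands.
        return target if self.gprm[index] == constant else fallthrough


regs = PlayerRegisters()
regs.set_gprm(0, 3)  # e.g. remember that the user chose menu item 3
print(regs.branch_if_equal(0, 3, "alternate_ending", "main_ending"))
```

A disc program could use a GPRM in this way to remember a menu choice and branch to a different Program Chain later in playback.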

A hardware splitter/navigator controls DVD playback. It uses PCI information that describes the stream contents, and then splits the data to the appropriate decoders. The navigation engine uses DSI information and user input to control playback. Omitting PCI and DSI, the three remaining stream types have a combined maximum bit rate of 10.08 Mbps (variable). In practice, the average rate might be 4.7 Mbps.

A VOB contains Presentation data (video data, audio data, sub-picture data, and VBI data) and part of the Navigation data (PCI and DSI). Video Objects are defined as pack types and have restrictions on data transfer rate. The video stream (maximum of 1) has a maximum transfer rate of 9.80 Mbps; the PCM audio stream (maximum of 8) has a maximum transfer rate of 6.144 Mbps; the sub-picture stream (maximum of 32) has a maximum transfer rate of 3.36 Mbps.

A Video Object Set is a collection of Video Objects, as shown in Fig. 8.16. Each VOB can be divided into cells (or scenes). Each cell contains Video Object Units that are groups of audio or video blocks; a cell is the smallest addressable data chunk, but may last for a moment or for a movie’s entire duration. Each VOBU contains VOB pack types such as V_PCK (video packs), A_PCK (audio packs), and NV_PCK (navigation packs). VOBUs may also contain analog copy-protection data for Macrovision. A pack comprises packets and both comply with ISO/IEC 13818-1 (the MPEG-2 bitstream format, not the MPEG-2 audio or video coding formats). A pack is 2048 bytes total with up to 2034 bytes of user information. Figure 8.17 shows the structure of a pack. Navigation Packs (NV_PCK) contain PCI (979 bytes) and DSI (1017 bytes) data. Sub-picture packs (SP_PCK) contain sub-picture data (2024 bytes). Video Blanking Information (data placed in the video blanking period) packs (VBI_PCK) contain VBI (640 bytes). Video packs (V_PCK) contain video data (2025 bytes). Audio packs (A_PCK) contain audio data such as PCM (2013 bytes), AC-3 (2016 bytes), and MPEG (2020 bytes). An audio pack contains data such as audio emphasis, audio mute, audio frame number, quantization word length, audio sampling frequency, number of audio channels, dynamic range control, copyright, and so on.
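The pack sizes quoted above imply a small, fixed header cost per pack. The following sketch simply subtracts each quoted payload size from the 2048-byte pack; the dictionary labels and the function name are illustrative, and the "overhead" figure lumps together all header and stuffing bytes without breaking them down.

```python
PACK_SIZE = 2048  # bytes per pack

# Payload sizes in bytes, as quoted in the text.
PAYLOAD_BYTES = {
    "NV_PCK (PCI + DSI)": 979 + 1017,
    "SP_PCK": 2024,
    "V_PCK": 2025,
    "A_PCK (PCM)": 2013,
    "A_PCK (AC-3)": 2016,
    "A_PCK (MPEG)": 2020,
}


def header_overhead(payload_bytes, pack_size=PACK_SIZE):
    """Bytes in a pack that are not available to the payload."""
    if not 0 <= payload_bytes <= pack_size:
        raise ValueError("payload cannot exceed the pack size")
    return pack_size - payload_bytes


for name, payload in PAYLOAD_BYTES.items():
    print(f"{name:20s} payload={payload:4d}  overhead={header_overhead(payload):3d}")
```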

To summarize, audio and video data as well as presentation and control information are held in packets. Usually, one MPEG Group of Pictures (GOP) occupies one VOBU. VOBUs are collected into cells. Sequences of cells comprise a program that is stored in a VOB. Sequences of programs comprise a Presentation Control Block (PCB). This, along with command information, creates a Program Chain. Its audio and video content is stored in a VOBS. Titles are grouped to form a VTS.
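The containment relationships summarized above can be modeled as nested containers. This is a minimal sketch for orientation only: the class names mirror the terms in the text, but the fields and the `pack_count` helper are assumptions, not structures taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VOBU:
    # Pack type labels such as "NV_PCK", "V_PCK", "A_PCK".
    packs: List[str] = field(default_factory=list)


@dataclass
class Cell:
    vobus: List[VOBU] = field(default_factory=list)


@dataclass
class Program:
    cells: List[Cell] = field(default_factory=list)


@dataclass
class ProgramChain:
    programs: List[Program] = field(default_factory=list)

    def pack_count(self):
        """Total number of packs reachable from this Program Chain."""
        return sum(
            len(vobu.packs)
            for program in self.programs
            for cell in program.cells
            for vobu in cell.vobus
        )


vobu = VOBU(packs=["NV_PCK", "V_PCK", "V_PCK", "A_PCK"])
pgc = ProgramChain(programs=[Program(cells=[Cell(vobus=[vobu, vobu])])])
print(pgc.pack_count())
```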

FIGURE 8.16 A Video Object Set (VOBS) is a collection of Video Objects, which in turn contain cells, and Video Object Units, which contain pack data.

FIGURE 8.17 The structure of packs and packets used in DVD-Video adheres to the MPEG-2 standard. Presentation data is contained in packets.

In the DVD-Video player, packs in the program stream are received from the disc and transferred to the appropriate decoder. A buffer is used to ensure a continuous supply of data to the decoders. DSI data is treated separately. Video presentation data complies with ISO/IEC 13818-2 (MPEG-2 video standard) or ISO/IEC 11172-2 (MPEG-1 video standard). MPEG places constraints on the picture coding. Video data is split into VOBUs. Video data in a VOBU consists of one or more GOPs. Audio presentation data comprises PCM or compression coded data such as AC-3, DTS, or MPEG. The audio stream is divided into packs and recorded on the disc. Sub-picture presentation data comprises the sub-picture header unit, pixel data, and display control sequence table. Video Blanking Information Unit presentation data is placed in the blanking period.

DVD-Audio

The DVD-Audio portion of the DVD specification describes a high-quality audio storage format that provides a wide variety of channels, sampling frequencies, word lengths, and other features. Although primarily an audio specification, it also provides for incorporation of video and other elements. In many ways, DVD-Audio is based on the DVD-Video specification. Development of DVD-Audio began in December 1995, with the formation of the DVD Forum’s Working Group 4 (WG-4). Its first meeting was held in January 1996, and Version 0.9 of the specification was released in June 1998. The DVD-Audio Version 1.0 specification was finalized in February 1999, and was the last of the original DVD formats to be ratified by the DVD Forum. DVD-Audio products were introduced in early 2000. WG-4 received input from the International Steering Committee (ISC) representing the interests of the major record labels through trade associations (RIAA, IFPI, and RIAJ). The ISC established 15 criteria for DVD-Audio such as high-quality sound, multichannel audio, scalable parameters, CD compatibility, long playing time, optional video content, simple or menu-based disc navigation, and copy protection.

Although the DVD-Video format can provide high-quality audio (such as six channels of 48-kHz/20-bit audio) its maximum audio bit rate of 6.144 Mbps cannot support the highest quality levels. DVD-Audio’s maximum bit rate of 9.6 Mbps increases its abilities. For example, with PCM coding, six channels of 96-kHz/16-bit audio is allowed. However, six channels of 96 kHz/24-bit PCM audio exceeds the maximum bit rate. In any case, high bit-rate PCM streams reduce playing time. Thus lossless and lossy compression algorithms can be optionally employed to reduce bit rate demands, and increase playing time.
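The bit-rate comparisons above follow from multiplying channels, sampling frequency, and word length. The sketch below checks the three configurations mentioned against the two limits quoted in the text; the function and constant names are illustrative, and only raw PCM payload is counted (no pack or header overhead).

```python
DVD_VIDEO_AUDIO_MAX_BPS = 6_144_000  # DVD-Video audio limit quoted in the text
DVD_AUDIO_MAX_BPS = 9_600_000        # DVD-Audio limit quoted in the text


def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Raw PCM bit rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


def fits(channels, sample_rate_hz, bits_per_sample, limit_bps):
    """True if the raw PCM stream stays within the given limit."""
    return pcm_bit_rate(channels, sample_rate_hz, bits_per_sample) <= limit_bps


for fs, bits in [(48_000, 20), (96_000, 16), (96_000, 24)]:
    rate = pcm_bit_rate(6, fs, bits)
    print(f"6 ch {fs / 1000:g} kHz/{bits}-bit: {rate / 1e6:.3f} Mbps, "
          f"fits DVD-Audio: {fits(6, fs, bits, DVD_AUDIO_MAX_BPS)}")
```

Six channels at 96 kHz/24-bit come to 13.824 Mbps, which is why lossless compression is needed to carry that configuration.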

The primary intent of the developers was to create an audio format that would retain compatibility with other DVD disc formats, some backward compatibility with the CD format, and introduce improved sound quality and multichannel playback. In addition, DVD-Audio would protect its content with stringent anti-piracy measures. To augment the already large capacity of DVD, DVD-Audio also provides for lossless data compression of audio. This option allows storage of over 74 minutes of high-quality multichannel music on a single data layer. DVD-Audio discs must contain an uncompressed or MLP-compressed PCM version of the DVD-Audio portion of the program. For further flexibility and added compatibility with existing DVD-Video players, DVD-Audio discs may also include video programs with Dolby Digital, DTS and/or PCM tracks. In most cases, in addition to high-resolution PCM tracks, DVD-Audio discs also contain Dolby Digital tracks, so that discs are playable in both DVD-Audio and DVD-Video players. Dolby Digital tracks are mandatory on discs that contain associated video tracks.

Two types of DVD-Audio discs are defined. An Audio-Only disc (see Fig. 8.11B) contains only music information. An Audio-Only disc can optionally include still pictures (one per track), text information, and a visual menu. In addition, an Audio with Video (AV) disc is defined (see Fig. 8.11C); it can contain motion video information formatted as a subset of the DVD-Video format.

DVD-Audio Coding and Channel Options

The DVD-Audio format supports a variety of coding methods and recording parameters, as shown in Table 8.4. PCM tracks are mandatory on all discs. Optional disc coding methods include MLP, Dolby Digital, MPEG-1, or MPEG-2 without extension bitstream, MPEG-2 with extension bitstream, DTS, DSD, and SDDS. DVD-Audio is said to be extensible; it is open-ended and can be adapted to future coding technologies. All DVD-Audio players must support MLP decoding. DVD-Audio is a “scalable” format; that is, its specification provides considerable flexibility for content providers. When PCM coding is used, the number of channels (1 to 6), the word length (16, 20, 24 bit), and the sampling frequency (44.1, 48, 88.2, 96, 176.4, or 192 kHz) can be interchanged. At the highest sampling frequencies of 176.4 kHz and 192 kHz, only two-channel playback is possible. To limit the output bit rate to the 9.6-Mbps maximum, other restrictions may apply to lower sampling frequencies. Audio attributes such as sampling frequency and word length can be set differently for each track.

The coding options, range of sampling frequencies and word lengths, and number of titles (stereo and/or multichannel) create a range of playback times. In addition, the number of disc layers determines playing times. For example, depending on its recording parameters, a stereo PCM program on a data layer might play for 422 or 65 minutes. Similarly, different configurations of multichannel recordings will yield a range of playing times, as shown in Table 8.5. MLP lossless compression can effectively almost double the disc capacity, thus increasing playing times. The compression achieved by MLP depends on the music being coded. Very approximately, it gives about a 1.85:1 compression ratio; thus, it can almost halve bit rate, and double playing time with no loss of audio quality. Similarly, lossy compression increases playing time. Dolby Digital, DTS, MPEG, MLP, and other audio codecs are discussed in more detail in Chap. 11.

TABLE 8.4 The DVD-Audio specification supports a variety of coding methods, each with many possible recording parameters. Some examples are shown here; this table is not inclusive of all possibilities.

TABLE 8.5 Examples of coding methods and recording parameters and resulting playing times per disc layer (DVD-5).
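Playing times of the kind listed in Table 8.5 can be roughly estimated from capacity and bit rate. The sketch below assumes a nominal single-layer capacity of 4.7 billion bytes and treats the approximate 1.85:1 MLP ratio as a constant; both are simplifications, and because formatting overhead is ignored the results come out somewhat longer than published figures.

```python
LAYER_BYTES = 4.7e9  # assumed nominal capacity of one DVD-5 layer


def playing_time_minutes(channels, sample_rate_hz, bits_per_sample,
                         compression_ratio=1.0, capacity_bytes=LAYER_BYTES):
    """Rough playing time, ignoring all formatting overhead."""
    bit_rate = channels * sample_rate_hz * bits_per_sample / compression_ratio
    return capacity_bytes * 8 / bit_rate / 60


print(f"Stereo 44.1 kHz/16-bit PCM: {playing_time_minutes(2, 44_100, 16):.0f} min")
print(f"Stereo 192 kHz/24-bit PCM:  {playing_time_minutes(2, 192_000, 24):.0f} min")
print(f"6 ch 96 kHz/24-bit, MLP:    {playing_time_minutes(6, 96_000, 24, 1.85):.0f} min")
```

Because the MLP ratio depends on the program material, the last figure should be read as an order-of-magnitude estimate only.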

The use of high sampling frequencies such as 96 kHz and 192 kHz may seem unnecessary. In rare cases, a person may be able to hear frequencies of 24 kHz or 26 kHz, far below the cutoff frequencies of 48 kHz and 96 kHz. In most cases, high-frequency hearing response is below 20 kHz. Thus, for steady-state tones, the higher frequency response may not be useful. However, it can be argued that high sampling frequencies improve the binaural time response, leading to improved imaging. For example, if short pulses are applied to each ear, a 15-µs difference between the pulses can be heard, and that time difference is shorter than the time between two samples at 48 kHz (about 20.8 µs). Some people can hear a 5-µs difference, and that corresponds to the time difference between two samples at 192 kHz (about 5.2 µs). In theory, this high sampling frequency may improve spatial imaging. Thus, it may take two ears to distinguish between a recording at 48 kHz, and one at 192 kHz. Its designers hoped that the DVD-Audio specification would offer improvements in fidelity and in any case its specifications would not be a limiting factor.
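The timing argument rests on the sample period, which is the reciprocal of the sampling frequency. This short sketch computes the period in microseconds and the corresponding half-sampling (cutoff) frequency for the rates discussed; the function names are illustrative.

```python
def sample_period_us(sample_rate_hz):
    """Time between successive samples, in microseconds."""
    return 1e6 / sample_rate_hz


def cutoff_hz(sample_rate_hz):
    """Half the sampling frequency, the upper limit of the audio band."""
    return sample_rate_hz / 2


for fs in (48_000, 96_000, 192_000):
    print(f"{fs / 1000:g} kHz: period {sample_period_us(fs):.2f} us, "
          f"cutoff {cutoff_hz(fs) / 1000:g} kHz")
```

The sample periods at 48 kHz and 192 kHz are roughly 20.8 and 5.2 microseconds, which sets the scale for the interaural time differences being compared.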

Various channel assignments can be made by placing the channels into two Channel Groups (CGs); examples of channel assignments are shown in Table 8.6. This prioritizes mixes that use the front L and R channels; front L, R, and C channels; and the corner L, R, Ls, and Rs channels. The sampling frequency and word length of CG1 are always greater than or equal to those of CG2, as shown in Table 8.7. Generally, CG1 assignments are for the front channels, and CG2 assignments are for the rear channels. There are numerous ways to assign channels, ranging from monaural to six channels, and different word lengths and sampling frequencies can be employed on the front and rear channels. For example, front channels could be coded at 24/96 with the rear channels coded at 16/48. Coding the rear channels at a lower bit rate, for example, would allow longer playing times, or would allow disc capacity to be budgeted to other content such as videos or stereo mixes.

TABLE 8.6 DVD-Audio channel assignments are made with two Channel Groups (CG1 and CG2). Assignments enable front mixes, front mixes with center channel, and four-channel corner mixes. There are other possible assignments beyond the 21 examples shown here.

Sampling frequencies in channel groups must be in the same family, that is, related by a simple integer such as 48/96/192 kHz or 44.1/88.2/176.4 kHz. Table 8.8 shows examples of different channel configurations for 5.0- and 5.1-channel playback using PCM coding. Unlike some 5.1-channel systems (Dolby Digital, DTS, MPEG), the PCM coding used in DVD-Audio does not bandlimit the LFE channel; it is a full-bandwidth channel. The choice of sampling frequency family is probably best determined by the sampling frequency of the original recording; noninteger sample rate conversion might introduce audible artifacts. The frame period is defined to be 1/600 second at sampling frequencies of 48, 96, and 192 kHz, and 1/551.25 second at 44.1, 88.2, and 176.4 kHz.

TABLE 8.7 The Channel Groups are scalable. The sampling frequency and word lengths of CG1 must be greater than or equal to those of CG2.

TABLE 8.8 Examples of multichannel PCM channel configurations with multiple sampling rates, showing bit rate and playing time (on single-layer/dual-layer discs). A. 5.1-channels coded at 48 kHz/96 kHz. B. 5.0-channels coded at 48 kHz/96 kHz.
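The Channel Group rules can be expressed as a small validity check: both groups must use the same sampling frequency family, CG1 must not be coded at a lower sampling frequency or word length than CG2, and the combined PCM bit rate must stay within 9.6 Mbps. The sketch below is a simplified reading of those rules; the tuple layout and function names are assumptions, and it does not check the permitted channel assignments of Table 8.6.

```python
FAMILIES = ({44_100, 88_200, 176_400}, {48_000, 96_000, 192_000})
MAX_BPS = 9_600_000


def same_family(fs1, fs2):
    """True if both sampling frequencies belong to one family."""
    return any(fs1 in fam and fs2 in fam for fam in FAMILIES)


def check_channel_groups(cg1, cg2):
    """Each group is (channels, sample_rate_hz, bits). Returns total bit rate."""
    n1, fs1, b1 = cg1
    n2, fs2, b2 = cg2
    if not same_family(fs1, fs2):
        raise ValueError("sampling frequencies are from different families")
    if fs1 < fs2 or b1 < b2:
        raise ValueError("CG1 must be coded at least as high as CG2")
    total = n1 * fs1 * b1 + n2 * fs2 * b2
    if total > MAX_BPS:
        raise ValueError("combined bit rate exceeds 9.6 Mbps")
    return total


# Front L, R, C at 96 kHz/24-bit; rear Ls, Rs at 48 kHz/16-bit.
print(check_channel_groups((3, 96_000, 24), (2, 48_000, 16)) / 1e6, "Mbps")
```

Coding the rear pair at 48 kHz/16-bit keeps the five-channel total under the limit, whereas five channels at 96 kHz/24-bit would not fit as plain PCM.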

PCM coding also provides for an emphasis characteristic (zero at 50 µs and pole at 15 µs, for sampling frequencies of 48 kHz and 44.1 kHz); this boosts high frequencies during encoding and correspondingly cuts high frequencies during decoding to reduce the noise floor. This can be applied when all channels use the same sampling frequency. Use of pre-emphasis is optional on discs, but provision for de-emphasis is mandatory in players. PCM coding can also employ a dynamic range control (the same provision as in the DVD-Video specification). This is a disc option and player requirement.
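The emphasis curve can be examined numerically. The sketch below assumes the familiar first-order characteristic with time constants of 50 microseconds (zero) and 15 microseconds (pole), modeled as the analog prototype H(s) = (1 + s·T_zero)/(1 + s·T_pole); a player's digital de-emphasis filter would approximate the inverse of this response. Names are illustrative.

```python
import math

T_ZERO = 50e-6  # seconds, time constant of the zero (assumed)
T_POLE = 15e-6  # seconds, time constant of the pole (assumed)


def corner_hz(time_constant_s):
    """Corner frequency corresponding to a time constant."""
    return 1.0 / (2.0 * math.pi * time_constant_s)


def preemphasis_gain_db(freq_hz):
    """Magnitude of (1 + jwT_zero) / (1 + jwT_pole) in decibels."""
    w = 2.0 * math.pi * freq_hz
    num = math.hypot(1.0, w * T_ZERO)
    den = math.hypot(1.0, w * T_POLE)
    return 20.0 * math.log10(num / den)


print(f"corners: {corner_hz(T_ZERO):.0f} Hz and {corner_hz(T_POLE):.0f} Hz")
for f in (100, 1_000, 5_000, 10_000, 20_000):
    print(f"{f:6d} Hz: {preemphasis_gain_db(f):+.2f} dB")
```

De-emphasis in the player applies the opposite gain, so the boost shown here becomes an equal cut to high-frequency noise added after encoding.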

A DVD-Audio disc can contain one or several selections, as provided by a content provider. For example, a disc might contain one selection coded as PCM; every player could play back this selection. Another disc might contain two selections, one coded as PCM multichannel and the other coded as PCM stereo; the provider can choose the order of the selections; only players with multichannel capability could play the multichannel selection. Another disc might contain two selections, one coded as PCM stereo and the other coded in an optional format such as Dolby Digital; the optional selection could be played by players equipped with that circuitry. It is advantageous to place Dolby Digital tracks on a DVD-Audio disc so they can be played in a DVD-Video player. A single-inventory disc may include: DVD-Audio stream of up to six channels of MLP at 96/24, stereo PCM stream, Dolby Digital 5.1-channel stream on the DVD-Video portion, and possibly even a Red Book layer at 44.1/16.

DVD-Audio discs can employ the SMART (System Managed Audio Resource Technique) feature with PCM tracks. SMART provides automatic downmixing so that a multichannel audio program can be mixed down to two channels by the player during playback and thus replayed over a stereo playback system. The content provider can program how the downmixing will occur, by selecting one of 16 coefficient tables, stored along with the audio data on the disc. Each coefficient table defines level, pan position, and phase. The level mixing ratio can vary from 0 to −60 dB. Coefficient tables can be varied on a track-by-track basis in each Audio Title Set. The SMART feature eliminates the need to include a separate stereo mix on a multichannel disc, thus saving disc space. However, SMART may not allow the creative flexibility demanded by some content providers. When a separate stereo mix is coded along with a multichannel mix, the separate mix is automatically selected instead of a folded-down mix. Use of SMART downmixing is optional on discs, but its support is mandatory in players.
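A coefficient-driven downmix of this kind can be sketched as a weighted sum per output channel. The table layout below (a level in dB toward each output, plus a phase-invert flag) is a hypothetical stand-in for the on-disc coefficient format, and the example values are arbitrary rather than taken from any real disc.

```python
def db_to_gain(level_db):
    """Convert a level in decibels to a linear gain factor."""
    return 10 ** (level_db / 20)


# Hypothetical table: channel -> (level to left in dB or None,
#                                 level to right in dB or None,
#                                 invert phase)
EXAMPLE_TABLE = {
    "L": (0.0, None, False),
    "R": (None, 0.0, False),
    "C": (-3.0, -3.0, False),
    "Ls": (-6.0, None, False),
    "Rs": (None, -6.0, False),
    "LFE": (None, None, False),
}


def downmix(sample, table):
    """Fold one multichannel sample (dict of channel -> value) to (left, right)."""
    left = right = 0.0
    for channel, value in sample.items():
        left_db, right_db, invert = table[channel]
        if invert:
            value = -value
        if left_db is not None:
            left += value * db_to_gain(left_db)
        if right_db is not None:
            right += value * db_to_gain(right_db)
    return left, right


silence = dict.fromkeys(EXAMPLE_TABLE, 0.0)
print(downmix({**silence, "C": 1.0}, EXAMPLE_TABLE))
```

A content provider choosing among several such tables per track is, in effect, choosing how much of the center and surround channels reach each stereo output.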

MLP supports the downmix feature. The player first decodes the multichannel signal and then accesses the coefficients to provide a two-channel playback. Optionally, the downmix can be created by the MLP encoder rather than the player so both two-channel and multichannel mixes can be conveyed separately by MLP substreams. The MLP decoder in the player reads the two-channel substream and outputs two channels. This reduces the computation required in stereo-only players. However, the two-channel mix increases the bit rate by about one bit per sample. For multichannel playback, the player extracts both substreams and the MLP decoder decodes all channels prior to possible downmixing.

Other “value-added” content on DVD-Audio discs may include artist names, song titles, liner notes, artist commentary, biographies, discographies, music videos, and Internet URLs. Nonreal-time information (such as content) is recorded in Information areas while real-time information (such as lyrics) is recorded in Data areas. Two character sets are supported: ISO 8859-1 for European languages and Music Shift JIS for Japanese. Multiple languages may be supported. Still images may be tagged to individual tracks; they may be displayed like a slide show (manual or automatic) while the music plays. Likewise, text and sound effects may be played in real time. This extra information is a disc option (track names are mandatory if there is any text information), but decoding must be supported by Universal players (described below). In some cases, the player uses the text information to construct a text menu.

Full motion video can be added to a DVD-Audio AV disc as an independent video portion; it is defined as a subset of the DVD-Video specification. Several restrictions apply: there is a maximum of two audio streams, at least one of which must be PCM and the PCM stream is limited to six channels with restricted channel assignments. In addition, there is no multi-story, multi-angle, parental control, or region control features. There is no mandatory PCM audio in the DVD-Video portion of disc (there is already a PCM version in the DVD-Audio part). Dolby Digital is mandatory in the DVD-Video portion (PCM is optional). DVD-Audio also defines a DVD-ROM zone that can contain compressed audio files. These files can be moved, for example, to portable music players. MPEG-4 High-Efficiency AAC (HE AAC), also known as aacPlus, files can be placed in this compressed audio zone.

DVD-Audio Disc Contents

DVD-Audio disc contents are arranged hierarchically as shown in Fig. 8.18. One album (or volume) describes the entire contents of one disc side. An album can contain up to nine groups. A group can contain up to 99 tracks. A track may contain up to 99 indices (in an Audio-Only disc). Figure 8.19 shows an example of the contents of a DVD-Audio Audio-Only disc. This disc contains two groups; in this case, Group 1 holds five main tracks and Group 2 holds two alternate remixed tracks. In addition, each group has two selections (labeled #1 and #2); for example, tracks 1 and 2 in Group 1 have two selections. In track 1, selection #1 is a multichannel mix and selection #2 is a stereo mix (downmixing is not used). Tracks 3, 4, and 5 may use downmixing. In this example, Group 2 selection #2 tracks use optional coding. Each group has one or more Audio Title Sets, and the tracks are objects within the Audio Title Sets.
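The limits on the album hierarchy lend themselves to a simple check. In this sketch an album is represented as a list of groups, and each group as a list of per-track index counts; that representation, the function name, and the assumption that every level holds at least one entry are illustrative choices rather than part of the specification.

```python
MAX_GROUPS = 9     # groups per album
MAX_TRACKS = 99    # tracks per group
MAX_INDICES = 99   # indices per track


def validate_album(album):
    """Return the total track count, or raise ValueError if a limit is broken."""
    if not 1 <= len(album) <= MAX_GROUPS:
        raise ValueError("an album holds 1 to 9 groups")
    total = 0
    for group_no, tracks in enumerate(album, start=1):
        if not 1 <= len(tracks) <= MAX_TRACKS:
            raise ValueError(f"group {group_no}: 1 to 99 tracks allowed")
        for track_no, indices in enumerate(tracks, start=1):
            if not 1 <= indices <= MAX_INDICES:
                raise ValueError(
                    f"group {group_no}, track {track_no}: 1 to 99 indices allowed")
        total += len(tracks)
    return total


# A disc like the example: five main tracks in Group 1, two remixes in Group 2.
example = [[1, 1, 1, 1, 1], [1, 1]]
print(validate_album(example))
```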

FIGURE 8.18 Contents of DVD-Audio discs are arranged hierarchically. Any track in an album is accessible using a group number and track number.

FIGURE 8.19 An example of the contents of a DVD-Audio Audio-Only disc, showing two groups with a total of seven tracks.

FIGURE 8.20 An example of the contents of a DVD-Audio AV disc, showing one group with seven tracks.

TABLE 8.9 Compatibility between DVD-Video and DVD-Audio discs, and four types of players: Audio-Only player, Video Capable Audio player, Universal player, and Video player.

An example of the contents of a DVD-Audio AV disc is shown in Fig. 8.20. There is one group in this album. It has five audio-only tracks and two AV tracks. Tracks 1 and 2 have two selections; for example, in track 1, selection #1 is a stereo mix and selection #2 is a multichannel mix. Tracks 6 and 7 have video components (not playable on an Audio-Only player). The group has three Titles (two Audio Title Sets and one Video Title Set). The tracks are objects within the ATS and VTS.

DVD-Audio players (without video capability) can play back the audio contents and audio components of video contents of DVD-Audio AV discs. They can play selected audio components on DVD-Video VAN discs. Disc and player compatibility is illustrated in Table 8.9. DVD-Video players cannot play the high-resolution PCM tracks in the DVD-Audio zone. However, the video zone in DVD-Audio discs adheres to the DVD-Video format. Thus, for partial compatibility with DVD-Video players, many DVD-Audio discs contain a stereo PCM or Dolby Digital version of the album in their video zone. DVD-Audio players can also play hybrid DVD-Audio discs that contain both a DVD-Audio data layer and a Red Book CD layer. The DVD-Audio format also supports a SACD-like disc (Super Audio CD) as an optional format, thus some players can play both DVD and SACD discs. Universal DVD players can play the spectrum of DVD-Audio and DVD-Video discs. Mandatory player functions include user transport controls, selection of groups and tracks, and track searches. Optional features include group search, index search, visual menu, random play, and highlight selection. The visual menu is a subset of the DVD-Video menu specification; it is used to select groups and tracks, view multiple languages, and view still information such as liner notes and images. The visual menu is optional for Audio players but mandatory for Universal players.

The Simple Audio Play Pointer (SAPP) facilitates user navigation of the disc contents. SAPP information is contained in a table located in the disc lead-in area (it is similar to the TOC in the Red Book specification). SAPP is a subset of the more sophisticated and general Audio Navigation table; SAPP provides basic information for monaural and stereo PCM playback only, usually in simple players.

As with other DVD discs, DVD-Audio uses a robust RS-PC error correction system that includes a wide interleave. Local disc damage can cause unreadable data; the player’s response depends on how the navigator is implemented. If the navigator attempts to reread the damaged sector, its success partly depends on the data rate. If the rate is low, the player may have time to reread without an interruption in the output data; if the rate is high and especially near the maximum rate of 9.6 Mbps, there may not be time for a reread, and the data output may be interrupted.

DVD-Audio Developer’s Summary

Programmers and others developing DVD-Audio products should know the specific definitions of terms, file structures, and the interrelationships of file types. This section provides such overview information. Part 4 of the DVD specification describes the DVD-Audio format. It uses all the specifications from Part 1 and Part 2, and many of the specifications from Part 3. In particular, there are strong parallels between DVD-Video and DVD-Audio in terms of file management for navigation and presentation data. For example, instead of a Video Manager and Video Title Sets, an Audio Manager and Audio Title Sets are used. Likewise, many of the same navigation features are used. As noted, the complexity in the DVD standards is not in the coding itself (UDF, MPEG-2, and AC-3 are complex unto themselves). The DVD standards focus on how data is organized, and how the player should function. However, in some ways, DVD-Audio is somewhat more complex than DVD-Video. Whereas DVD-Video uses one kind of video coding (MPEG-2), DVD-Audio allows for many kinds of coding, with one to six channels. Parenthetically, it is worth noting that the terminology used in DVD-Audio differs from that used in DVD-Video. For example, in DVD-Video we refer to Title, Program, and Cell; whereas, in DVD-Audio we refer to Group, Track, and Index, respectively.

DVD-Audio provides stereo and multichannel playback. A Volume space includes a DVD-Volume zone and DVD-Audio zones as well as DVD-Video zones and DVD-Other zones (see Fig. 8.11B). The DVD-Volume zone complies with the UDF Bridge structure defined in Part 2. The DVD-Video zone and DVD-Other zone are defined in Part 3. A DVD-Audio zone is an area to record the audio contents in a volume of a DVD-Audio disc. It contains one Audio Manager (AMG) and one or more (maximum of 99) Audio Title Sets (ATSs). The AMG is the table for all contents in the DVD-Audio zone (and DVD-Video zone if present) and Navigation data. The AMG is composed of Audio Manager Information (AMGI), optional Video Object Set for AMG Menu (AMGM VOBS) and a backup of AMGI. ATT is a general name given to Audio Only Title (AOTT) and Audio with Video Title (AVTT). An AOTT title has no video data except for still pictures and is defined in the PGCI in the ATS. An AVTT title (otherwise known as AV) has video data and is defined in the PGCI in the VTS. AOTTs are playable by Audio players and Universal players, and AVTTs are playable by Universal players. The DVD-Audio file structure is the same as that used in DVD-Video (see Fig. 8.15).

The Audio Title Set defines the Audio Only Titles. The ATS contains Audio Objects (AOBs) and can contain visual menus. Generally, one AOB comprises one track on the disc. Further, an AOB can have two streams, such as stereo and multichannel. (AVTTs are defined in the Video Title Set with links from the ATS). An AOB track can also optionally contain still images, stored in Audio Still Video (ASV) files. There are two kinds of ATSs. One kind is composed of Audio Title Set Information (ATSI), Audio Object Set for Audio Only Title (AOTT AOBS), and a backup of ATSI. The other is composed of ATSI and its backup.

The Presentation hierarchical structure is an Album (one disc side), a Group, an Audio Title (ATT), a Track, and Index. ATT is not accessible by the user. A Group contains one or more Audio Titles. There are several types of ATTs. AOTT is playable by all Audio players. AVTT is playable by a Video Capable Audio player.

Both an AOTT and AVTT contain a Program Chain (PGC). A Track is a Program (PG) defined in the PGC of ATS. The attribute is the definition of sampling frequency, quantization word length, and so on. An Index is a cell defined for audio contents in the PGC. An index may consist of two or more cells.

Presentation of contents starts from the track (or index) selected by the user. This is the same as the playback of PGC as defined in the ATS. Video data including sub-picture in video contents and still picture in audio contents are played by a Video Capable Audio player. Some types of Real-Time information can be recorded within the audio contents, if desired. Real-Time Text Data (RTXTDT) contains text data such as lyrics, and explanations of contents. One Page consists of 4 lines with 30 characters per line or 2 lines with 15 characters per line. Text is presented onscreen one page at a time. Eight languages are available. Two kinds of audio data (such as multichannel or not, or PCM or another coding) may be defined in an ATT; this is also known as a selection. (For AOTTs, they are defined as the PGC block, and for AVTTs, they are defined as two streams of audio in a Video Object.)
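The page geometry for Real-Time Text suggests how lyrics might be broken up for display. The sketch below wraps text to a line width and groups the lines into pages, using the 4-line, 30-character layout by default; it works on plain Python strings and makes no attempt to model the character sets or the actual RTXTDT encoding.

```python
import textwrap


def paginate(text, lines_per_page=4, chars_per_line=30):
    """Wrap text to the line width, then group the lines into pages."""
    lines = textwrap.wrap(text, width=chars_per_line)
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


lyrics = ("These are placeholder words standing in for a song lyric that is "
          "long enough to spill across more than one page of onscreen text.")
for number, page in enumerate(paginate(lyrics), start=1):
    print(f"--- page {number} ---")
    print("\n".join(page))
```

Calling `paginate(text, lines_per_page=2, chars_per_line=15)` gives the smaller of the two layouts.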

The Visual Menu is the menu for the AMG. Presentation of the Visual Menu is the same as playback of one or more PGCs that are defined for the AMG Menu. It is played back by Video Capable Audio players but ignored by Audio-Only players which use a simpler SAPP (Simple Audio Play Pointer) feature.

The AMG is the table of contents for all ATSs that exist in the DVD-Audio zone, and all VTSs for audio titles that exist in the DVD-Video zone. The AMG contains Audio Manager Information that contains Navigation data for every audio title and its backup. It may also contain Video Objects used for the Visual Menu. The AMGI is composed of the Audio Manager Information Management Table, Audio Title Search Pointer Table, Audio Only Title Search Pointer Table, Audio Manager Menu PGCI Unit Table, and optional Audio Text Data Manager. The AMGI describes information in the AUDIO_TS directory. The Audio Manager Information Management Table (AMGI_MAT) is a table that describes the size of the AMG and AMGI, starting addresses of information in the AMG and other attribute information such as number of volumes, disc side where a volume is recorded, video display mode, aspect ratio, audio coding mode, and sampling frequency. The Audio Title Search Pointer Table is a table with search information (starting and ending addresses) for Audio Titles and is used by a Video Capable Audio player.

The Audio Only Title Search Pointer Table is a table that contains search information (addresses) of AOTTs and is used by Audio-Only players; this is part of the SAPP feature. The Audio Manager Menu PGCI Unit Table is a table that describes the audio menu. The optional Audio Text Data Manager contains information such as album, group, and track names.

The Video Object for Audio Manager Menu (AMGM_VOB) contains Presentation data (video, audio, and sub-picture data) and some of the Navigation data (PCI and DSI). The AMGM_VOB is the same as the Video Object, with the same contents, same pack structure, and same data transfer rate. The Presentation data is essentially the same as in Part 3. Its PCI and DSI data are essentially the same, with some added restrictions. The AMGM_VOB contains the ISRC code.

The Audio Title Set defines the Audio Only Titles that are defined by the Navigation data and the Audio Objects in the ATS, or by the Navigation data in the ATS and the audio part of Video Objects in the VTS. The ATSs are recorded in the DVD-Audio zone along with the Audio Manager, and the VTSs to be used for the audio title are recorded in the DVD-Video zone along with the Video Manager. The ATSs contain Audio Title Set Information (ATSI), an Audio Object Set for Audio Only Title (AOTT_AOBS), and a backup. The ATSI contains the Navigation data needed to play back every ATT in the ATS and provides information to support User Operation. ATSI contains the Audio Title Set Information Management Table (ATSI_MAT), and Audio Title Set Program Chain Information Table (ATS_PGCIT). The AOTT_AOBS is a collection of Audio Objects for Audio Only Title that contains Presentation data such as audio data, optional still picture data, and some kinds of optional Real-Time Information (RTI).

The Audio Title Set Information Management Table (ATSI_MAT) describes the size and starting addresses of ATS and ATSI, as well as attributes. It describes the audio coding method, downmix mode, quantization word length, and sampling frequency of two channel groups. The ATSI_MAT also describes the coefficients to mix down the audio data from multichannel to two-channel. An area containing 16 coefficient tables is provided. The ATS also contains the Audio Title Set Program Chain Information Table (ATS_PGCIT), which is the Navigation data to control the presentation of the Audio Title Set Program Chain (ATS_PGC). This information describes the addresses of data, as well as the presentation order of programs and cells.

The Audio Object for Audio Only Title (AOTT_AOB) contains the Presentation data: audio data, Real-Time Information (RTI) data, and still picture data. The AOTT_AOB is an elementary program stream described by the ISO/IEC 13818-1 standard. The AOTT_AOB uses three types of packs: Audio pack, Real-Time Information pack, and Still Picture pack. The maximum length of a pack is 2048 bytes. The maximum transfer rate of the audio stream is 9.6 Mbps. The maximum video transfer rate for still pictures is 9.8 Mbps.

The AOTT_AOBS Structure is a collection of AOTT_AOB files whose attributes are the same or different up to eight. It is composed of one or more cells that are made up of packs. The pack structure of the AOTT_AOB follows the general 2048-byte DVD pack layout (see Fig. 8.10) with 14 bytes of pack header and 2034 bytes of packets. In many ways, the AOTT_AOB is the core of a DVD-Audio disc. It contains the Presentation data (mainly audio). It adheres to the ISO/IEC 13818-1 standard (the MPEG-2 stream layer specification). As noted, 13818-1 compliance does not mean that audio is coded as MPEG-2. Rather, the format of the bitstream itself adheres to MPEG-2. Linear PCM data is held in A_PKT packets. As in Part 3, a Pack is a header, followed by packets. A Pack adheres to ISO/IEC 13818-1; a Packet is the elementary data stream following the header. An AOTT_AOB Audio pack (A_PCK) has up to 2013 bytes of user data. An LPCM packet (A_PKT) comprises a packet header, the private header, and the audio data. The private header has information such as ISRC, audio emphasis, downmix code, quantization word length, sampling frequency, multichannel type, and dynamic range control.
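The pack layout described above (a 2048-byte pack made of a 14-byte pack header followed by 2034 bytes of packets) can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical, the demonstration pack is synthetic, and the check for the MPEG-2 program stream pack start code (0x000001BA) assumes a header without stuffing bytes. Packet headers and private headers inside the packet area are not parsed.

```python
PACK_SIZE = 2048         # bytes per pack, per the text
PACK_HEADER_SIZE = 14    # bytes of pack header, per the text
PACK_START_CODE = b"\x00\x00\x01\xba"  # MPEG-2 program stream pack start code


def split_pack(pack: bytes) -> tuple[bytes, bytes]:
    """Split one 2048-byte pack into (pack_header, packet_area)."""
    if len(pack) != PACK_SIZE:
        raise ValueError(f"expected {PACK_SIZE} bytes, got {len(pack)}")
    if not pack.startswith(PACK_START_CODE):
        raise ValueError("missing pack start code")
    return pack[:PACK_HEADER_SIZE], pack[PACK_HEADER_SIZE:]


def iter_packs(stream: bytes):
    """Yield successive whole packs from a byte string of concatenated packs."""
    for offset in range(0, len(stream) - PACK_SIZE + 1, PACK_SIZE):
        yield stream[offset:offset + PACK_SIZE]


# Synthetic pack for illustration only (not real disc data).
demo_pack = PACK_START_CODE + bytes(PACK_SIZE - len(PACK_START_CODE))
header, packets = split_pack(demo_pack)
```

Because every pack has the same fixed length, a reader can step through an object file in 2048-byte strides without scanning for boundaries.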

An AOTT_AOB Real-Time Information pack (RTI_PCK) contains up to 2015 user bytes. An RTI packet (RTI_PKT) comprises a packet header, the private header, and RTI data. RTI data (such as real-time text and ISRC) is used synchronously with audio data. An AOTT_AOB Still Picture pack (SPCT_PCK) contains up to 2025 user bytes. A Still Picture packet (SPCT_PKT) comprises a packet header and video data for the still picture. The still picture is one GOP (with one I-frame), which complies with ISO/IEC 13818-2 (MPEG-2 video). The following audio data is mandatory in an AOTT: PCM of one or two audio channels, or PCM for three to six audio channels with downmix coefficients, or PCM data of three to six channels without downmix coefficients. (In some cases, an Audio player may not play back this PCM data.) Other types of audio data (such as compressed and lossless compressed) may be contained in an AOTT_AOB as an option. The audio stream is divided into packs of 2048 bytes.

The general DVD-Audio specification for linear PCM describes the number of channels, sampling frequency, quantization, and emphasis. Total maximum bit rate is 9.6 Mbps. Coding is two’s complement. When there are three or more audio channels, they are classified into two groups: Channel Group 1 (CG1) and Channel Group 2 (CG2). The data in each group may use different sampling frequencies and word lengths. The word length is identical in every channel of the same CG, but each CG may have a different word length (16, 20, or 24 bits). Sampling frequencies of 176.4 kHz and 192 kHz are only used when there are two channels (L and R) or fewer. CG1 generally defines stereo and front channels, and CG2 defines rear channels.
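The arithmetic behind the 9.6-Mbps ceiling is simple: each channel group contributes channels × sampling frequency × word length bits per second. The sketch below checks only the two constraints stated in the text (the total bit rate limit, and the restriction of 176.4 kHz and 192 kHz to two channels or fewer). The function names and the tuple layout are illustrative assumptions, and the actual specification imposes further rules that are not modeled here.

```python
MAX_LPCM_BIT_RATE = 9_600_000        # bits per second, per the text
HIGH_RATES_HZ = {176_400, 192_000}   # allowed only with two channels or fewer


def group_bit_rate(channels: int, fs_hz: int, word_bits: int) -> int:
    """Raw bit rate of one channel group."""
    return channels * fs_hz * word_bits


def lpcm_bit_rate(cg1, cg2=None) -> int:
    """Total raw bit rate; each group is (channels, fs_hz, word_bits)."""
    total = group_bit_rate(*cg1)
    if cg2 is not None:
        total += group_bit_rate(*cg2)
    return total


def is_allowed(cg1, cg2=None) -> bool:
    """Apply the two constraints named in the text (not the full specification)."""
    groups = [g for g in (cg1, cg2) if g is not None]
    total_channels = sum(g[0] for g in groups)
    if total_channels > 2 and any(g[1] in HIGH_RATES_HZ for g in groups):
        return False
    return lpcm_bit_rate(cg1, cg2) <= MAX_LPCM_BIT_RATE


stereo_192_24 = lpcm_bit_rate((2, 192_000, 24))            # 9,216,000 bps
six_ch_96_24 = lpcm_bit_rate((6, 96_000, 24))              # 13,824,000 bps
mixed = lpcm_bit_rate((3, 96_000, 24), (3, 48_000, 16))    # 9,216,000 bps
```

The third case shows why the two channel groups are useful: six channels at 96 kHz/24-bit exceed the limit, but coding the rear group at 48 kHz/16-bit brings the total under it.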

When an ATS has AOTT_AOBS, the structure of CG1 and CG2 and the relation between the audio channel and audio signal is described in the ATS Multi Channel Type area (it must be Type 1) according to an Assignment for Audio Object table. When an ATS has no AOTT_AOBS, the relation between the audio channel and audio signal and the number of audio channels is described in the ATS Multi Channel Type area (it must be Type 1) according to an Assignment for Video Object table. The channels supported are: Left Front, Right Front, Center, Low Frequency Effects, Surround, Left Surround, and Right Surround.

The downmix procedure outputs channel signals from input signals that have more channels than outputs. Mixing phase and mixing coefficients are used to produce a two-channel output from six-channel input signals. In this way, it is not necessary to place a separate two-channel mix on a multichannel disc. Each gain controller is programmed with a coefficient value. It also manages the phase (polarity) of the signal. The downmix coefficient (DM_COEFT) is defined in the disc ATS and SAPP lead-in area. It is used by the player to control the downmix procedure. The decimal coefficient value is 0 to 255. It is calculated according to a formula for a specific gain control curve.
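As a rough sketch of the downmix procedure, each output channel is a weighted sum of the input channels, with the sign of each weight carrying the polarity. The code below assumes the coefficients have already been converted to signed linear gains; the mapping from the 0 to 255 DM_COEFT value to a gain is not reproduced here because the text does not give the formula. The gain values and channel abbreviations are illustrative only.

```python
def downmix_to_stereo(frame, gains_left, gains_right):
    """Mix one multichannel sample frame down to a (left, right) pair.

    frame:       dict mapping channel name to sample value
    gains_left:  dict mapping channel name to signed linear gain for the left output
    gains_right: dict mapping channel name to signed linear gain for the right output
    """
    left = sum(gains_left.get(ch, 0.0) * sample for ch, sample in frame.items())
    right = sum(gains_right.get(ch, 0.0) * sample for ch, sample in frame.items())
    return left, right


# Illustrative gains only; a real disc supplies its own coefficient tables.
GAINS_LEFT = {"Lf": 1.0, "C": 0.5, "Ls": 0.5}
GAINS_RIGHT = {"Rf": 1.0, "C": 0.5, "Rs": -0.5}  # negative gain inverts polarity

frame = {"Lf": 0.5, "Rf": -0.25, "C": 0.25, "LFE": 0.0, "Ls": 0.125, "Rs": 0.125}
left, right = downmix_to_stereo(frame, GAINS_LEFT, GAINS_RIGHT)
```

Channels that have no entry in a gain table (LFE in this example) simply contribute nothing to that output.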

The video contents of an Audio with Video Title (AVTT) (also known as AV) are recorded in the DVD-Video zone (one with one VMG and one or more VTSs). VMG is not used by Audio players but may be used by DVD-Video players, which cannot recognize AMG. The inclusion of a VMG allows DVD-Video players to play video tracks on a DVD-Audio disc that may include PCM or Dolby Digital content. VTS is used to define the AVTT of a DVD-Audio disc. Titles defined by VTS are pointed to by the Audio Manager. Various restrictions apply to the VMG of DVD-Audio; for example, a DVD-Audio disc has no region management or parental management. Restrictions also apply to the VTS of DVD-Audio.

DVD-Audio provides a degree of navigation interactivity, such as branching. However, these features are operational only for Video Capable Audio players. Audio-Only players may ignore these features. The navigation features for DVD-Audio are a subset of those specified for DVD-Video. Some special navigation features are added to the reserved areas of the Part 3 specification to support specific audio needs. Part 4 navigation parameters are classified as General Parameters (GPRM) and System Parameters (SPRM). There are sixteen 16-bit GPRMs (such as go to, jump, link, and compare) to memorize the user’s operational history and modify operation of Video Capable Audio players. In addition, there are twenty-four 16-bit SPRMs (such as audio selection number, sub-picture stream number, highlighted button number, audio player configuration, and track number) for player settings.

The Simple Audio Play Pointer (SAPP) provides TOC-like data for simple Audio-Only players that may use a simple alphanumeric readout instead of a video display. The SAPPT is a table (a subset of navigation) recorded in the control data in the disc lead-in area and consists of one or more SAPPs. Each SAPP is information for the track presented by a simple Audio player that plays back only PCM not using the Program Chain defined in the Audio Title Set. Every audio program defined in the ATS_PGC that satisfies certain conditions is a SAPP. The conditions are: audio coding mode of PCM, and stereo or monaural output. A SAPPT specifically describes the number of SAPPs and the end address of the SAPPT. The size of an SAPPT must be less than 16,384 bytes. A SAPP describes address information and playback information for a track such as track number, start time of the first audio cell, playback time, track attributes such as word length and sampling frequency of CG, downmix coefficient, and end address.

At the hardware player, packs in the program stream are received from the disc and transferred to the appropriate decoder. A buffer is used to ensure continuous supply of data to the decoders. In a Video Capable Audio player, DSI data (navigation used to search and seamlessly play back branching) in AVTT is treated separately. The audio stream may include the Audio Gap that is the discontinuous period of the audio stream during the presentation of a Still Picture. During the Audio Gap, the player’s audio output is muted.

Alternative DVD Formats

In addition to the DVD-Video format defined in Book B and the DVD-Audio format defined in Book C, the DVD family includes DVD-ROM (Read-Only Memory) defined in Book A, DVD-R (Recordable) defined in Book D, DVD-RAM (Random-Access Memory) defined in Book E, and DVD-RW (ReWritable) defined in Book F. DVD Books A, B, and C use a UDF Bridge file format (M-UDF + ISO 9660) and Books D, E, and F use the UDF format. The DVD-ROM, DVD-R, DVD-RAM, DVD-RW, and DVD + RW formats are employed as computer peripherals, in professional authoring environments, or in consumer applications. Single-sided and double-sided recordable discs are available. DVD-ROM is a read-only format, DVD-R is a write-once format, while the others are rewritable. The specifications for DVD-R, DVD-RW, and DVD-RAM are supported by the DVD Forum (www.dvdforum.org). The DVD + R and DVD + RW specifications are supported by the DVD + RW Alliance (www.dvdrw.com). The family of recordable DVD formats is summarized in Table 8.10.

TABLE 8.10 Specifications for recordable DVD disc formats (4.7-Gbyte capacity).

The recordable DVD formats can store diverse types of data; however, several specifications for specific data types have been defined: DVD Video Recording (DVD-VR), DVD Audio Recording (DVD-AR), and DVD Stream Recording (DVD-SR). The DVD-VR recording format is borrowed from DVD-Video. DVD-VR recorders allow real-time recording of video, stereo PCM audio, as well as still pictures, and users can create custom play lists. A new VOBU map, located in the VOBI area, stores time stamps; when a recording is completed, users can create menus to easily access programs. DVD-AR is derived from the DVD-Video and DVD-Audio formats; it supports real-time audio recording, as well as still pictures and text. The various DVD-Audio sampling frequencies are supported, as are the PCM, Dolby Digital, and MPEG formats. Users can create custom play lists to access disc contents. The DVD-SR format is derived from the DVD-Video format. It acts as a bit bucket to allow recording of streaming data from digital sources such as camcorders, cable boxes, and satellite receivers. An IEEE 1394 interface can be used.

DVD-ROM

At their base level, all DVD discs are DVD-ROM (Read-Only Memory) discs. That is, all DVD discs use the UDF format. Different DVD applications, such as DVD-Video, place specialized material in a specific place, such as the DVD-Video zone. Content contained in the DVD-Other zone may be quite varied, and DVD-ROM uses that opportunity for open-ended storage. In that respect, DVD-ROM is a large capacity bit bucket formatted as UDF. DVD-ROM discs are playback-only media used for high-capacity storage of data, software, games, and so on. DVD-ROM drives are connected to personal computers and function much like CD-ROM drives. With appropriate software, DVD-ROM drives can play DVD-Video and DVD-Audio discs. As with other DVD players, to play back CD-R discs at a 780-nm wavelength and DVD discs at a 635-nm or 650-nm wavelength, DVD-ROM drives must use pickups with dual lasers and other appropriate optical design. DVD-ROM drives support DVD-Video regional coding as well as CSS copy protection. The various recordable formats are not mutually compatible, and there is variability in disc-to-drive compatibility.

DVD-R and DVD + R

DVD-R (Recordable) discs, like CD-R, offer write-once capability to permanently record data. The DVD-R(A) Authoring format is often used for professional authoring and testing of DVD titles. The DVD-R(G) General format is used for business and consumer applications. Because DVD-R(A) uses a 635-nm laser for writing and DVD-R(G) uses a 650-nm laser, the two media are not write-compatible. However, discs are playable in both types of drives. DVD-R(A) has a Cutting Master Format (CMF) functionality that allows a Disc Description Protocol (DDP) file to be written in the lead-in area for mastering applications. Replication plants can use these discs directly. DVD-R(G) discs include measures to limit piracy; for example, some decryption keys are blanked out. It is thus impossible to copy CSS-encrypted data to a disc. Also, DDP data cannot be written to DVD-R(G) discs.

FIGURE 8.21 DVD-R discs contain a dye recording layer backed by a reflective metal and protective layer, all sandwiched between two substrates.

DVD-R discs comprise two substrates bonded together. A single-sided disc uses one pregrooved substrate bonded to one pregrooved dummy substrate. The recording side of a single-sided disc comprises a polycarbonate substrate, organic dye recording layer, reflective layer, and protective lacquer overcoat, as shown in Fig. 8.21. The dummy side comprises a substrate, cosmetic reflective layer, and protective lacquer overcoat. The CLV wobbled pregroove generates a carrier signal used for motor control, tracking, and focus. However, whereas CD-R discs use a physical frequency modulation of the pregroove carrier signal to encode the Absolute Time In Pregroove address and prerecorded signal, DVD-R discs use pits (known as land pre-pits) molded into the land areas between grooves. Placed at the beginning of each sector, the pre-pits contain addressing, laser writing power, and synchronization information. The reading laser tracks the pregroove, but the light shines on the pre-pits peripherally to create a secondary signal that can be extracted from the main signal. As with CD-R, DVD-R uses a pulsed laser to create marks in the organic dye, controlling the duration and intensity of the laser bursts. However, whereas the CD-R write strategy typically simply turns the laser on and off, during DVD-R writing the laser is modulated between a recording and reading bias power to create a multi-pulse train to write one mark. This efficiently controls heat, and creates smaller and more accurate marks. Disc manufacturers can optionally place a write strategy code in the lead-in pre-pits to modify the recorder’s write strategy. Reflectivity for DVD-R and DVD + R discs is about 45 to 85%.

DVD-R discs contain a power calibration area (PCA) for testing laser power. A recording management area (RMA) stores calibration information, disc contents and recording locations, remaining capacity information, and recorder and disc identifiers for copy protection. Recorders perform an optimum power calibration (OPC) procedure to determine the correct laser writing power for particular discs. The PCA can hold 7088 different calibrations, and the RMA can hold OPC information for as many as four different recorders. The remainder of the disc comprises the Information Area. It contains the lead-in, data recordable area, and lead-out. The lead-in contains information on disc format, specification version, physical size and structure, minimum readout rate, recording density, and pointers to the location of the data recordable area where user data is recorded. The lead-out marks the end of the recording area.

DVD-R discs can use the same reference velocity and track pitch as molded discs to achieve the same unformatted storage capacity; a “4.7 Gbyte” Version 2.0 disc holds 4.7 billion bytes, or 4.37 Gbytes of user data per side. A cyanine, phthalocyanine, or azo dye recording layer may be used, with a 635-nm or 650-nm laser. Both sequential (disc-at-once) and incremental writing can be performed. Once recorded, DVD-R discs are highly compatible and can be played in many DVD-ROM, DVD-Video, and DVD-Audio players. Longevity of a recorded disc is similar to that of a CD-R disc; estimates range from 50 to 300 years. On the other hand, as with most recordable media, the shelf life of unrecorded discs might be only 10 years. Single-sided, dual-layer discs (using the same physical parameters as DVD-ROM discs) hold 8.5 billion bytes. Most drives can read both layers; however, a dual-layer (DL) recorder is needed to write to the second layer.

FIGURE 8.22 DVD-RW discs contain a phase-change recording layer backed by a reflective metal layer. The recording layer is sandwiched between two dielectric layers to control thermal properties. Two substrates are bonded together. DVD + RW discs use the same physical construction.

The DVD + R format is another write-once format. It is not officially a part of the DVD specification written by the DVD Forum. It uses a dye recording layer and CLV rotation. Discs are available in a 4.7- and 8.5-Gbyte (DL) capacity. DVD + R discs are highly compatible and can be played in many DVD-ROM, DVD-Video, and DVD-Audio players. However, DL recorders are needed to record to the added layer.

DVD-RW and DVD + RW

DVD-RW (ReWritable) allows rewriting of data; the specification is essentially an extension to the DVD-R format. It is similar to the CD-RW format. DVD-RW is used for both professional authoring and consumer applications. Discs use a phase-change recording mechanism and a multilayer disc structure shown in Fig. 8.22. The recording layer may use a silver, indium, antimony, and tellurium compounded layer, and perhaps 1000 write/erase cycles are possible. Unlike dye-polymer technologies, phase-change recording is not wavelength-specific. Reflectivity for DVD-RW discs (and other phase-change discs) is about 18 to 30%. The disc uses a wobbled pregroove, and pre-pits with addressing and synchronization information. Data is recorded inside the pregroove, and in relatively large blocks. As with DVD-R, there are PCA and RMA zones. A DVD-RW disc may hold 4.37 Gbytes per side. DVD-RW uses CLV rotation; thus, it is particularly suited to sequential writing, as in mastering applications. Because it has less robust error protection and a relatively small number of rewrite cycles, DVD-RW is not intended for general purpose data storage and distribution. Although not required, some players use a protective disc caddy. DVD-RW was previously known as DVD-R/W. DVD-RW discs are highly compatible and can be played in many DVD drives. As with other recordable DVD media, the longevity of a DVD-RW disc might be as long as 100 years.

DVD + RW is another rewritable format. It is not officially a part of the DVD specification written by the DVD Forum. It uses phase-change media and a wobbled pregroove; the frequency modulation in the wobble provides address in pregroove (ADIP) addressing information. Data is written inside the groove and there is no preembossed addressing data. Disc layer construction is the same as in the DVD-RW format (see Fig. 8.22). Data is written and read in relatively large blocks compared to DVD-RAM. CLV or CAV rotation is allowed for recording, for either sequential data transfer (as in audio/video recording) or faster random access (as in computer data), respectively. Optional defect management features, similar to those found on DVD-RAM, are available. One thousand rewrite cycles are possible. Nominal capacity of a CLV disc is 4.7 Gbytes, and a double-layer (DL) disc holds 8.5 Gbytes. DVD + RW discs recorded with CLV can be played in some DVD drives.

DVD-RAM

DVD-RAM (Random-Access Memory) is a rewritable format. It uses a phase-change recording mechanism and a wobbled land and groove disc design. Using this structure, data may be recorded on both planar surfaces of the groove and land, as shown in the upper part of Fig. 8.23. This technique doubles disc capacity, but deep grooves with steep walls are needed to avoid crosstalk interference between adjacent data. In addition, servos must be employed to switch the pickup’s focus between the groove and land area on each revolution, and the tracking signal is inverted when the switch occurs. However, designers contend that the wider groove pitch provided by the groove/land recording technique allows easier tracking and faster recovery from physical shock. Discs also contain pre-embossed pit areas (for every 2-kbyte sector) to provide addressing header information, as shown in the lower part of Fig. 8.23. A zoned constant linear velocity (ZCLV) rotational control is used. This technique divides the disc surface into a number of zones; the disc rotates at a constant angular velocity within each zone, and the rotational speed changes from zone to zone so that the linear velocity stays approximately constant across the disc. There are a total of 35 recording zones across a 120-mm Version 2.0 disc. Successive zones contain more sectors, with 39,200 sectors in the first zone and 105,728 sectors in the last zone. The ZCLV feature enables DVD-RAM to be used as a true random-access, nonsequential medium. Thus, DVD-RAM is well-suited for writing and reading chores done from computer drives.
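One way to picture zoned rotation control is to hold the spindle speed fixed inside each zone and step it down from one zone to the next, so that the linear velocity under the pickup stays close to a target value. The sketch below does this for 35 equal-width zones. Only the zone count comes from the text; the inner and outer radii, the target velocity, and the equal-width zone layout are placeholder assumptions chosen for illustration and are not taken from the DVD-RAM specification.

```python
import math

NUM_ZONES = 35              # per the text, for a 120-mm Version 2.0 disc
INNER_RADIUS_MM = 24.0      # placeholder assumption
OUTER_RADIUS_MM = 58.0      # placeholder assumption
TARGET_VELOCITY_M_S = 8.0   # placeholder assumption
ZONE_WIDTH_MM = (OUTER_RADIUS_MM - INNER_RADIUS_MM) / NUM_ZONES


def zone_of(radius_mm: float) -> int:
    """Return the zone index (0 = innermost) containing the given radius."""
    if not INNER_RADIUS_MM <= radius_mm <= OUTER_RADIUS_MM:
        raise ValueError("radius outside the recording area")
    return min(int((radius_mm - INNER_RADIUS_MM) / ZONE_WIDTH_MM), NUM_ZONES - 1)


def zone_rpm(zone: int) -> float:
    """Fixed spindle speed for a zone, set to hit the target at the zone's mid radius."""
    mid_radius_m = (INNER_RADIUS_MM + (zone + 0.5) * ZONE_WIDTH_MM) / 1000.0
    return TARGET_VELOCITY_M_S / (2 * math.pi * mid_radius_m) * 60.0


def linear_velocity(radius_mm: float) -> float:
    """Linear velocity (m/s) at a radius, given the fixed speed of its zone."""
    rpm = zone_rpm(zone_of(radius_mm))
    return rpm / 60.0 * 2 * math.pi * radius_mm / 1000.0
```

Within a zone the velocity drifts slightly with radius, but with many narrow zones the drift stays small, and the drive only needs to change spindle speed when a seek crosses a zone boundary.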

FIGURE 8.23 DVD-RAM discs use a land and groove recording technique. As seen in the upper portion of the figure, phase change data may be placed on planar surfaces of both the wobbled pregroove and land. As shown in the lower part of the figure, rewritable areas are separated by pre-embossed sector headers holding addressing information. (Parker, 1998)

DVD-RAM provides advanced error correction and defect management features. In the latter feature, defective sectors are identified during manufacture or formatting (or reformatting) and preallocated spare sectors can substitute for them. To reduce wear and tear on specific portions of the disc that are repeatedly written to, the system automatically shifts data placement on the disc surface.

DVD-RAM Version 2.0 discs marketed with capacities of 4.7 and 9.4 billion bytes hold 4.37 Gbytes (single-sided) and 8.74 Gbytes (double-sided), respectively. A recording rate of 22.16 Mbps is possible. A disc allows perhaps 100,000 rewrite cycles, and offers a high degree of stability for archiving integrity. DVD-RAM is designed primarily for professional DVD authoring and other post-production work, but some consumers use DVD-RAM discs. DVD-RAM discs may be played in many DVD-ROM drives and in some DVD-Video and DVD-Audio players. Higher-capacity discs are usually held in protective cartridges that require slot-loading drives. Some discs can be removed from their cartridges for playback in tray-loading drives. Because of its distinctive design features, DVD-RAM is not as compatible as other recordable DVD formats.
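The difference between the marketed capacities and the Gbyte figures is a units conversion: marketed capacity counts billions of bytes, whereas a Gbyte here is 2^30 bytes. A short sketch of the arithmetic follows; the function name is illustrative. The results agree with the figures quoted in the text once truncated to two decimal places (the double-sided 8.74 figure is twice the truncated per-side value).

```python
BYTES_PER_GBYTE = 2**30  # 1,073,741,824 bytes


def decimal_bytes_to_gbytes(n_bytes: float) -> float:
    """Convert a byte count to Gbytes, where 1 Gbyte = 2**30 bytes."""
    return n_bytes / BYTES_PER_GBYTE


single_sided = decimal_bytes_to_gbytes(4.7e9)   # about 4.377
double_sided = decimal_bytes_to_gbytes(9.4e9)   # about 8.754
dual_layer = decimal_bytes_to_gbytes(8.5e9)     # about 7.916
```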

DVD Multi is not a disc format; rather, it is a specification that promotes compatibility within the DVD family. A read-only drive denoted as DVD Multi is capable of reading DVD-ROM, DVD-R, DVD-RW, and DVD-RAM media. Likewise, a writeable drive denoted as a DVD Multi can read all of these media, and can also write on DVD-R, DVD-RW, and DVD-RAM media.

DVD Content Protection

The intellectual property potentially stored on DVD discs has a monetary value that is almost incalculable. Prerecorded formats such as DVD-Video and DVD-Audio provide content owners with the option of securing and monitoring their data in a variety of ways including encryption and watermarking. Likewise, recordable media such as DVD-R, DVD-RW, DVD-RAM, and DVD + RW can employ content protection to prohibit or limit copying. A delicate balance is required so that while content is secured, the user is not unnecessarily inconvenienced; these requirements are mutually contradictory in any content protection system. A number of copy-protection mechanisms, summarized in Table 8.11, are optionally available to content owners.

TABLE 8.11 Summary of copy-protection systems.

DVD-Video Copy Protection

A group known as the 4C entity, comprising Intel, IBM, Matsushita, and Toshiba, developed the Content Protection System Architecture (CPSA) that encompasses security issues for DVD formats. In all, eight different security features are used in DVD formats.

In cooperation with 4C, the Copy Protection Technical Working Group (CPTWG) representing 60 companies and interest groups developed the Content Scrambling System (CSS) copy-protection system that is standard in DVD-Video discs. The CPTWG also established an independent, nonprofit group to oversee nominal cost-based licensing of the CSS technology. Data encryption is used so that content is self-protecting, but use of the technology is voluntary; discs can be distributed with or without copy protection. Likewise, manufacturers could offer a player without decryption hardware; however, it could only play nonencrypted discs. To obtain the algorithms and keys needed to decrypt data in their players, manufacturers must first obtain a DVD license. The data stream is flagged so computer programs can properly interpret the encryption. A Matsushita proposal is the basis for the CSS system. With CSS, content is self-protecting; that is, content cannot be digitally copied because software keys needed to decrypt the data are missing in any copy. Although it is a different technology, regional coding must be implemented in any CSS device.

When encrypted data is decoded in software (as opposed to a dedicated hardware chip) care must be taken so that excessive demands are not placed on the microprocessor. CSS is designed to minimize the burden, allowing efficient decryption (descrambling) without compromising integrity. Moreover, CSS limits the processor overhead required to perform decryption. In the variable-rate MPEG-2 video coding algorithm, video frames are stored as data sectors. There are 2048 bytes/sector. At a fast data rate of 10 Mbps, there might be 600 sectors/second or 20 sectors/frame; at a slow data rate of 2 Mbps, there might be 120 sectors/second or 4 sectors/frame. Instead of encrypting all video sectors, the CPTWG sets an upper limit (and lower limit) on the rate of sectors encrypted by CSS. For example, 10 or 15% of sectors might be encrypted. Even with a low rate, the picture will be unviewable, but the limit minimizes microprocessor overhead.
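The sector figures above follow from dividing the bit rate by the sector size. The sketch below reproduces them, assuming a frame rate of about 30 frames per second (which the text's numbers imply but do not state). The function names and the 15% example fraction are illustrative; the actual limits set by the CPTWG are not given in the text.

```python
SECTOR_BYTES = 2048   # bytes per sector, per the text
FRAME_RATE = 30.0     # frames per second; an assumption implied by the text's figures


def sectors_per_second(bit_rate_bps: float) -> float:
    """Sectors read per second at a given video bit rate."""
    return bit_rate_bps / 8 / SECTOR_BYTES


def sectors_per_frame(bit_rate_bps: float, frame_rate: float = FRAME_RATE) -> float:
    """Average number of sectors holding one video frame."""
    return sectors_per_second(bit_rate_bps) / frame_rate


def encrypted_sectors_per_second(bit_rate_bps: float, fraction: float) -> float:
    """Sectors per second that need descrambling if only a fraction is encrypted."""
    return sectors_per_second(bit_rate_bps) * fraction


fast = sectors_per_second(10_000_000)   # about 610 sectors/second
slow = sectors_per_second(2_000_000)    # about 122 sectors/second
load = encrypted_sectors_per_second(10_000_000, 0.15)  # about 92 sectors/second
```

The exact quotients (roughly 610 and 122 sectors per second) are slightly above the rounded 600 and 120 quoted in the text; the per-frame values round to the same 20 and 4.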

The proprietary CSS system encrypts data during encoding and then uses authentication to verify that the player’s decoder is authorized to decrypt the data. Moreover, communication within the system is encrypted to maintain security over keys. Most DVD-Video players have dedicated authentication hardware. CSS decoding can be performed in hardware or software and every decoder has a 40-bit player key used to decrypt a disc key, and uses the result with the title key to decrypt the movie contents.

CSS features two copy-protection methods. The first, the “Content Scrambled DVD” method, is designed for DVD-Video players. Sectors containing audio and video signals are encrypted; navigation data in sector headers is not encrypted. Content providers must select two encryption keys—one disc key and one title key—jointly used to encrypt the data prior to storage on a DVD-Video disc. The title key is placed in a disc sector header, and the disc key is concealed in a control area of a disc that cannot be read by a DVD-ROM drive unless instructed by authentication commands. Each licensed manufacturer is assigned one of 400 unique player keys; all 400 keys are stored in every disc using CSS encryption. If a license lapses, that manufacturer’s key can be omitted from future disc pressings.
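The key hierarchy described above can be illustrated with a toy model. The cipher below is a stand-in (an XOR with a hash-derived byte stream), not the CSS algorithm, and the player identifiers, key values, and the way the disc key and title key are combined are all hypothetical. The sketch shows only the structure: the disc carries the disc key encrypted under each licensed player key, a player recovers the disc key with its own key, and a manufacturer whose key is omitted from a later pressing can no longer decrypt that pressing.

```python
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in XOR cipher, NOT the CSS algorithm; one call encrypts or decrypts."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


def press_disc(content, disc_key, title_key, licensed_player_keys):
    """Build a toy disc image; licensed_player_keys maps a hypothetical id to a key."""
    return {
        # The disc key, encrypted once under every licensed player key.
        "disc_key_block": {
            player_id: toy_cipher(player_key, disc_key)
            for player_id, player_key in licensed_player_keys.items()
        },
        "title_key": title_key,
        # Disc key and title key are used jointly to encrypt the content.
        "content": toy_cipher(disc_key + title_key, content),
    }


def play(disc, player_id, player_key):
    """Return the decrypted content, or None if this player has no entry on the disc."""
    entry = disc["disc_key_block"].get(player_id)
    if entry is None:
        return None
    disc_key = toy_cipher(player_key, entry)
    return toy_cipher(disc_key + disc["title_key"], disc["content"])


# Hypothetical 40-bit (5-byte) keys for two manufacturers.
player_keys = {"maker_a": b"\x01\x02\x03\x04\x05", "maker_b": b"\x0a\x0b\x0c\x0d\x0e"}
movie = b"feature film sectors"
disc = press_disc(movie, b"DISCK", b"TITLE", player_keys)
# A later pressing that omits maker_b, as when a license lapses.
later_pressing = press_disc(movie, b"DISCK", b"TITLE", {"maker_a": player_keys["maker_a"]})
```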

The DVD-Video player’s hardware decrypting chip is placed in the bitstream between the source data, and the internal Dolby Digital and MPEG-2 decoders. The person viewing the program via an analog output will not know that decryption is taking place. However, if the player contains a digital output, that output will be tapped off prior to decrypting. Copies made from the output digital stream cannot be decrypted because any subsequent decoders will not be able to retrieve the encryption keys and use them to decrypt the data.

The second, the “Bus Authentication and Encryption” method, is designed for use in the computer environment, where encrypted keys must be transmitted from a DVD-Video disc across a computer bus to decryption software or hardware. An authentication key is used in addition to the disc and title keys, and each key is checked by elaborately sending data between the disc and the decrypter. This method is more sophisticated because during playback it performs additional encryption on the keys themselves.

In addition, CSS requires that an analog protection system (APS) be employed. Macrovision copy protection, similar to that used in set-top boxes and video networks (which in turn is similar to that protection used in analog videocassettes) is typically used. The Macrovision system can prevent digital-to-analog copying, for example, attempting to use the analog output from a DVD-Video player to make a VHS tape copy. This system uses automatic gain control (AGC) and Colorstripe methods. The AGC portion is virtually identical to that used in prerecorded videocassettes; bipolar pulse signals are added to the video vertical blanking signal causing a VCR to record a weak, noisy, and unstable signal. Because the AGC of a television works quite differently from the AGC of a VCR, VCR playback is disrupted, whereas television display is not. The Colorstripe method is similar to that used in digital set-top boxes; it modulates the phase of the colorburst signal in a rapid, controlled manner, creating horizontal stripes in a copy. A recording VCR recognizes the colorburst phase changes as timebase errors and acts to correct them, thus inducing color errors in the picture. An unauthorized copy shows stripes of color, distortion, rolling, a black and white picture, and dark/light cycling. Colorburst is not present in a component video signal. Use of Macrovision is optional and per-disc licensing fees are paid. The disc identifies its Macrovision protection to the player. A player or drive that does not contain APS would not play DVD-Video discs encrypted with CSS. Likewise, video cards may use APS.

CSS technology is used primarily by the motion picture industry (but its use on a disc is optional). Many computer software providers do not use CSS, even for their audio/video content. Importantly, CSS does not protect other types of data such as software programs. Manufacturers who want to accommodate playback of CSS-coded titles may apply for a license and place CSS decoders in their products. Products without CSS decoding would not play back CSS-coded titles. For example, a DVD-ROM drive might not contain a CSS descrambler; the drive could be used to play back non-scrambled data, but could not be used to watch scrambled movies on a computer. Some professional pirates use DVD replication lines to produce bit-for-bit accurate DVD-Video discs, complete with CSS encryption. Another piracy method rips a DVD-Video disc into its component video and audio contents and re-codes them to Video CD. Both methods violate copyright law. The Data Hiding Sub-Group (DHSG) of the CPTWG is charged with the development and evaluation of watermarks. Encryption and watermarking are also discussed in Chap. 15.

DVD-Audio Copy Protection

To protect against unauthorized copying, the DVD-Audio format uses the optional Content Protection for Prerecorded Media (CPPM) framework, which employs encryption and embedded watermark technology. Copy-protected DVD-Audio discs can only be played on licensed players. CPPM was devised in March 1999 by IBM, Intel, Matsushita, and Toshiba in conjunction with music industry companies such as BMG, EMI, Sony Music, Universal Music Group, and Warner Music Group. WG-9 is charged with copy-protection issues. CPPM is similar in intent to the CSS system used in the DVD-Video format, and CPPM uses the same authentication measures as CSS. However, CPPM’s protection is more sophisticated.

The CPPM encryption code is stronger than that used in the DVD-Video format. A secret album identifier is placed in a control area of the disc that cannot be read by recordable drives, and so cannot be copied to blank media. Each player or drive has 16 device keys. A media key block is placed on every disc, and the player’s device keys interact with the media key block to generate a media key. The media key is used with the album identifier to decrypt encrypted portions of the disc contents. In the event of hacking, there is capability to revoke, expire, or recover encryption keys.

The CPPM content protection system provides a number of options to content providers of prerecorded media; for example, consumers can make one CD-quality digital copy, per recorder, of the original content. Related content such as supporting text and images is not copied. Content providers can also allow additional copies at various quality levels, up to and including the full quality of the DVD-Audio multichannel original. The encryption used in DVD-Audio can allow two-channel CD-quality, real-time copying along the IEC-958 interface. It also allows both two-channel and multichannel, CD-quality and higher quality, high speed copying along the IEEE 1394 interface. The recorder receives ISRC data that identifies the original recording along with copy permission information describing, for example, how many copies are permitted.

The CPPM watermark is designed to identify content through unencrypted digital (and analog) links. It is not used in high-speed encrypted links and instead verifies copy status of unencrypted signals. The watermark is embedded in the audio signal and is robust over analog and data-compressed transmission links. The watermark serves a purpose similar to that of SCMS in the digital domain, but it operates in the analog domain or the unencrypted digital domain. A copy-permit is the default status; when a copy is made, the embedded watermark signal is updated to mark the copy as a second-generation source. Watermark-compliant recorders will check this mark prior to recording. The watermark can also identify the manufacturer, artist, copyright holder, and other characteristics. The encryption and watermarking technologies are independent; for example, watermarking is optional in encrypted discs.

Content Protection for Recordable Media

Recordable media are protected by the Content Protection for Recordable Media (CPRM) protocol. CPRM links content to the media it is recorded to, so that the recording is playable but copies of the recording are not. CPRM is similar to the CPPM system used specifically for the DVD-Audio format. All blank DVD media have a 64-bit media identifier placed in the burst cutting area at the time of manufacture that uniquely identifies each disc. With CPRM, when protected content is recorded to the disc, the media identifier is used to encrypt a title key, which in turn encrypts content. When the disc is played, the media identifier is again used along with other keys to decrypt the title key, which in turn is used to decode the contents. If the content is moved to another disc, that disc's media identifier will not correctly decode the content. Only audio/video sectors are encrypted; navigation and other data are not encrypted.
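The binding of a recording to its disc can be shown with a small sketch. It is a toy model, not the CPRM cipher: SHA-256 and an XOR keystream are stand-ins, the way the media identifier is mixed with the "other keys" is an assumption, and the names `record`, `play`, and `media_key` are hypothetical.

```python
import hashlib
import os

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a hash-derived keystream (its own inverse)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def binding_key(media_id: bytes, media_key: bytes) -> bytes:
    """Key that depends on the disc's 64-bit media identifier (assumed mixing)."""
    return hashlib.sha256(media_key + media_id).digest()

def record(content: bytes, media_id: bytes, media_key: bytes) -> dict:
    """Encrypt content with a fresh title key, then wrap that key for this disc."""
    title_key = os.urandom(16)
    return {
        "enc_title_key": xor_cipher(title_key, binding_key(media_id, media_key)),
        "enc_content": xor_cipher(content, title_key),
    }

def play(recording: dict, media_id: bytes, media_key: bytes) -> bytes:
    """Unwrap the title key using the identifier of the disc being played."""
    title_key = xor_cipher(recording["enc_title_key"],
                           binding_key(media_id, media_key))
    return xor_cipher(recording["enc_content"], title_key)

media_key = os.urandom(16)   # stands in for the "other keys" held by the device
disc_a_id = os.urandom(8)    # 64-bit identifier from disc A's burst cutting area
disc_b_id = os.urandom(8)    # a different blank disc

video = b"audio/video sector data"
recording = record(video, disc_a_id, media_key)

played_on_a = play(recording, disc_a_id, media_key)
played_on_b = play(recording, disc_b_id, media_key)  # same bytes moved to disc B
```

Played from disc A, the recording decodes to the original data; the same encrypted bytes moved to disc B yield a wrong title key and therefore unusable output.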

The Copy Generation Management System (CGMS) controls the copying of digital and analog video signals. Discs can specify whether any copying is permitted; this is conveyed in data in the output analog and digital signals and interpreted by recorders. Copy instructions in analog signals are placed in the XDS section of the NTSC signal; in digital signals they are conveyed via the DTCP and HDCP protocols. Copy control information (CCI) can specify no copies, one copy, or unlimited copies. When one copy is permitted, the second-generation copy then contains a no-copy instruction; however, multiple copies may be made from the original. When a disc carries CSS, CPPM, or CPRM, then a “no copy” condition is assumed. Furthermore, when copying to unprotected media such as CD-R, DVD-Audio limits authorized copies to no more than two channels, 48 kHz and 16 bits.
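The generational rule can be expressed as a small state transition. The sketch below models only the logic described in the paragraph above; the enum and function names are hypothetical and do not correspond to any field names in the CGMS, DTCP, or HDCP specifications.

```python
from enum import Enum
from typing import Optional, Tuple

class CCI(Enum):
    """The three copy control states named in the text."""
    UNLIMITED_COPIES = "unlimited copies"
    ONE_COPY = "one copy"
    NO_COPIES = "no copies"

def cci_for_copy(source: CCI) -> Optional[CCI]:
    """Return the CCI a compliant recorder writes to the copy, or None if refused."""
    if source is CCI.NO_COPIES:
        return None                  # recorder refuses to copy
    if source is CCI.ONE_COPY:
        return CCI.NO_COPIES         # the copy cannot itself be copied
    return CCI.UNLIMITED_COPIES

def cap_for_unprotected_media(channels: int, rate_hz: int,
                              bits: int) -> Tuple[int, int, int]:
    """Limit a DVD-Audio copy bound for CD-R: at most 2 channels, 48 kHz, 16 bits."""
    return (min(channels, 2), min(rate_hz, 48_000), min(bits, 16))

original = CCI.ONE_COPY
first_copy = cci_for_copy(original)       # made from the original
another_copy = cci_for_copy(original)     # the original may be copied again
second_generation = cci_for_copy(first_copy)
capped = cap_for_unprotected_media(6, 96_000, 24)
```

Here both copies made from the original succeed and carry a no-copy instruction, and the attempt to copy a copy is refused.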

Secure Digital Transmission

Many applications require a secure link between two devices, such as a computer video card and a display. Two principal systems have been developed. DVD data can be conveyed along these paths, but the disc itself does not participate; the player and display perform the necessary operations independent of the disc contents. The High-bandwidth Digital Content Protection (HDCP) system defines a secure digital interface for players and displays designed according to the Digital Visual Interface (DVI) specification. DVI can support transmission at 4.95 Gbps; this supports a resolution of 1600 × 1200, which encompasses HDTV formats. Twin links can support even higher resolution. HDCP for DVI makes DVI a secure interface. Connected devices, such as a video display card and a monitor, exchange keys to authenticate the devices; the system uses forty 56-bit device keys and 40-bit key selection. Data is encrypted at the transmitting device and decrypted at the receiving device. If the receiving device is not HDCP equipped, the transmitting device may send a lower resolution version of the content. HDCP was proposed by Intel and ratified by the Digital Display Working Group in 1998.
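The role of the forty 56-bit device keys and the 40-bit key selection value can be illustrated as follows. This sketch rests on the publicly described structure of the first-generation HDCP key agreement, which is an assumption beyond what this chapter states: each device's keys are derived from a secret symmetric 40 × 40 matrix held by the licensing authority, each selection vector has 20 ones and 20 zeros, and each side sums (modulo 2^56) the keys selected by the other side's vector. The matrix is randomly generated and all function names are hypothetical.

```python
import random

NUM_KEYS = 40            # forty device keys per device
MODULUS = 1 << 56        # each key is 56 bits wide

def make_master_matrix(rng: random.Random):
    """Secret symmetric matrix held by the licensing authority (toy values)."""
    matrix = [[0] * NUM_KEYS for _ in range(NUM_KEYS)]
    for i in range(NUM_KEYS):
        for j in range(i, NUM_KEYS):
            matrix[i][j] = matrix[j][i] = rng.randrange(MODULUS)
    return matrix

def make_selection_vector(rng: random.Random):
    """40-bit key selection vector with exactly 20 ones and 20 zeros."""
    ones = set(rng.sample(range(NUM_KEYS), 20))
    return [1 if i in ones else 0 for i in range(NUM_KEYS)]

def issue_device_keys(matrix, selection):
    """Device key i is the sum of row i over the device's own selected columns."""
    return [sum(matrix[i][j] for j in range(NUM_KEYS) if selection[j]) % MODULUS
            for i in range(NUM_KEYS)]

def shared_secret(own_keys, other_selection):
    """Add up the device keys picked out by the other device's selection vector."""
    return sum(k for k, bit in zip(own_keys, other_selection) if bit) % MODULUS

rng = random.Random(1998)
master = make_master_matrix(rng)

card_selection = make_selection_vector(rng)      # video display card
monitor_selection = make_selection_vector(rng)   # monitor
card_keys = issue_device_keys(master, card_selection)
monitor_keys = issue_device_keys(master, monitor_selection)

# The devices exchange only their selection vectors, never their keys.
card_secret = shared_secret(card_keys, monitor_selection)
monitor_secret = shared_secret(monitor_keys, card_selection)
```

Because the matrix is symmetric, both devices arrive at the same 56-bit value without transmitting any key material, and that shared value can then seed the encryption of the link.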

The Digital Transmission Content Protection (DTCP) system provides secure transmission over bidirectional digital lines such as the IEEE 1394 bus. For example, a DVD player could be digitally and securely connected to an LCD display, and DTCP would resist unauthorized copying by another connected device. DTCP is described in Chap. 14 in the context of IEEE 1394.

DVD Watermarking

Watermarking can be used to intertwine data into DVD contents so that the watermark can later be retrieved to identify the contents on the disc. Furthermore, most watermarks resist tampering or removal. Watermarking does not prevent copying; it merely identifies the content. In some cases, a fragile watermark is used; analog copying degrades the watermark and thus identifies the content as a copy. A watermark is only useful if downstream equipment recognizes it. In the case of DVD, a license agreement needed to play encrypted contents may also legally bind the manufacturer to detect watermarks. DVD-Audio uses a watermark system developed by Verance. The license that enables the drive to play CPPM or CPRM discs obligates the manufacturer to detect the watermark. DVD-Audio recorders recognize CCI copy-generation watermarks.

FIGURE 8.24 The abandoned HD DVD format provided greater storage capacity and higher output bit rate than DVD. HD DVD discs use two 0.6-mm substrates. A single-sided disc is shown, but the specification also supports double-sided discs.

HD DVD

The HD DVD format was envisioned as the successor to the DVD-Video format. This high-density disc format was designed primarily to deliver high-definition playback of motion pictures. Players and discs were introduced in March and April 2006, but the HD DVD format ultimately did not find commercial success against the competing Blu-ray system. In February 2008, its principal backer, Toshiba, announced that it would no longer develop or manufacture HD DVD players or drives. Soon thereafter, the format was abandoned in the marketplace.

The HD DVD (High Density DVD) format uses a 405-nm blue-light laser and an NA of 0.65 to achieve high storage capacity. An HD DVD-ROM disc holds 15 Gbytes on a single-layer disc and 30 Gbytes on a dual-layer disc. The structure of the HD DVD disc is shown in Fig. 8.24. The VC-1 video codec (earlier designated VC-9, and based on Microsoft Windows Media Video 9), MPEG-4 H.264 Advanced Video Codec (AVC), and MPEG-2 are mandatory video codecs for all licensed HD DVD players. Dolby Digital Plus and DTS are mandatory audio-coding formats. Lossless MLP 2-channel coding is mandatory and lossless DTS coding is optional. AES encryption is used to copy-protect contents. In addition, rewritable HD DVD discs have been developed. A single-sided, single-layer HD DVD-RW disc holds 20 Gbytes and a double-sided, single-layer disc holds 40 Gbytes. A single-sided, single-layer HD DVD-R disc holds 15 Gbytes. HD DVD was supported by the DVD Forum.
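A rough calculation shows how much of the capacity gain follows from the optics alone. The sketch assumes the usual first-order scaling in which the focused spot diameter is proportional to wavelength divided by NA, so areal density scales with (NA / wavelength) squared. The DVD reference values (650 nm, NA of 0.60, 4.7 Gbytes per layer) are not stated in this section and are taken here as the standard DVD parameters; the variable names are illustrative.

```python
# First-order optical scaling: spot diameter ~ wavelength / NA,
# so areal density ~ (NA / wavelength) ** 2.

DVD_WAVELENGTH_NM = 650.0      # assumed standard DVD value
DVD_NA = 0.60                  # assumed standard DVD value
DVD_CAPACITY_GB = 4.7          # assumed single-layer DVD capacity

HD_DVD_WAVELENGTH_NM = 405.0   # from the text
HD_DVD_NA = 0.65               # from the text
HD_DVD_CAPACITY_GB = 15.0      # from the text (single-layer HD DVD-ROM)

def relative_density(wavelength_nm: float, na: float) -> float:
    """Areal density up to a constant factor, under the spot-size scaling above."""
    return (na / wavelength_nm) ** 2

optical_gain = (relative_density(HD_DVD_WAVELENGTH_NM, HD_DVD_NA)
                / relative_density(DVD_WAVELENGTH_NM, DVD_NA))
capacity_gain = HD_DVD_CAPACITY_GB / DVD_CAPACITY_GB

print(f"density gain from optics alone: {optical_gain:.2f}x")
print(f"actual single-layer capacity gain: {capacity_gain:.2f}x")
```

Under these assumptions the shorter wavelength and higher NA account for roughly a threefold density increase, close to but slightly below the actual capacity ratio; the remainder would have to come from other format changes that this estimate does not model.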
