12

Digital audio in optical disks

Optical disks are particularly important to digital audio, not least because of the success of the Compact Disc and subsequent developments such as MiniDisc, magneto-optical production recorders and DVD. CD, DVD and MiniDisc are worthy of detailed consideration, as they are simultaneously consumer products available in large numbers at low cost, and yet are technically advanced devices. Optical disks result from the marriage of many disciplines, including laser optics, servomechanisms, error correction and both analog and digital circuitry in VLSI form.

12.1 Types of optical disk

There are numerous types of optical disk, which have different characteristics.1 There are, however, three broad groups, shown in Figure 12.1, which can be usefully compared:

1 The Compact Disc, the Digital Video Disc and the prerecorded MiniDisc are read-only laser disks, which are designed for mass duplication by stamping. They cannot be recorded.
2 Some laser disks can be recorded, but once a recording has been made, it cannot be changed or erased. These are usually referred to as write-once-read-many (WORM) disks. Recordable CDs and DVDs work on this principle.
3 Erasable optical disks have essentially the same characteristic as magnetic disks, in that new and different recordings can be made in the same track indefinitely. Recordable MiniDisc is in this category. Sometimes a separate erase process is necessary before rewriting.

Figure 12.1    The various types of optical disk. See text for details.

The Compact Disc, generally abbreviated to CD, is a consumer digital audio recording which is intended for mass replication. When optical recording was in its infancy, many companies were experimenting with a variety of optical media. In most cases the goal was to make an optical recorder where the same piece of apparatus could record and immediately reproduce information. This would be essential for most computer applications and for use in audio or video production. This was not, however, the case in the consumer music industry, where the majority of listening was, and still is, to prerecorded music. The vinyl disk could not be recorded in the home, yet it sold by the million. Individual vinyl disks were not recorded as such by the manufacturer as they were replicated by pressing, or moulding, molten plastic between two surfaces known as stampers which were themselves made from a master disk produced on a cutting lathe. This master disk was the recording.

Philips’ approach was to invent an optical medium which would have the same characteristics as the vinyl disk in that it could be mass replicated by moulding or stamping with no requirement for it to be recordable by the user. The information on it is carried in the shape of flat-topped physical deformities in a layer of plastic, and as a result the medium has no photographic, magnetic or electronic properties, but is simply a relief structure. Such relief structures lack contrast and are notoriously difficult to study with conventional optics, but in 1934 Zernike2 described a Nobel Prize-winning technique called phase contrast microscopy which allowed an apparent contrast to be obtained from such a structure using optical interference. This principle is used to read relief recordings.

Figure 12.2    (a) The information layer of CD is reflective and uses interference. (b) Write-once disks may burn holes or raise blisters in the information layer. (c) High data rate MO disks modulate the laser and use a constant magnetic field. (d) At low data rates the laser can run continuously and the magnetic field is modulated.

Figure 12.2(a) shows that the information layer of CD and the prerecorded MiniDisc is an optically flat mirror upon which microscopic bumps are raised. A thin coating of aluminium renders the layer reflective. When a small spot of light is focused on the information layer, the presence of the bumps affects the way in which the light is reflected back, and variations in the reflected light are detected in order to read the disk. Figure 12.2 also illustrates the very small dimensions which are common to both disks. For comparison, some sixty CD/MD tracks can be accommodated in the groove pitch of a vinyl LP. These dimensions demand the utmost cleanliness in manufacture.

Figure 12.2(b) shows that there are several types of WORM disks. The disk may contain a thin layer of metal; on recording, a powerful laser melts spots on the layer. Surface tension causes a hole to form in the metal, with a thickened rim around the hole. Subsequently a low-power laser can read the disk because the metal reflects light, but the hole passes it through. Computer WORM disks work on this principle. As an alternative, the layer of metal may be extremely thin, and the heat from the laser heats the material below it to the point of decomposition. This causes gassing which raises a blister or bubble in the metal layer. Recordable CDs can use this principle as the relief structure can be read like a normal CD. It is also possible to impregnate a disk with a chemical dye which darkens when it is struck by a high level of radiation from a writing laser. Clearly once such a pattern of holes, blisters or dark areas has been made it is permanent.

Rerecordable or erasable optical disks rely on magneto-optics,3 also known more fully as thermomagneto-optics. Writing in such a device makes use of a thermomagnetic property possessed by all magnetic materials, which is that above a certain temperature, known as the Curie temperature, their coercive force becomes zero. This means that they become magnetically very soft, and take on the flux direction of any externally applied field. On cooling, this field orientation will be frozen in the material, and the coercivity will oppose attempts to change it. Although many materials possess this property, there are relatively few which have a suitably low Curie temperature. Compounds of terbium and gadolinium have been used, and one of the major problems to be overcome is that almost all suitable materials from a magnetic viewpoint corrode very quickly in air.

There are two ways in which magneto-optic (MO) disks can be written. Figure 12.2(c) shows the first system, in which the intensity of the laser is modulated with the waveform to be recorded. If the disk is considered to be initially magnetized along its axis of rotation with the north pole upwards, it is rotated in a field of the opposite sense, produced by a steady current flowing in a coil. This field is weaker than the room-temperature coercivity of the medium and will therefore have no effect. A laser beam is focused on the medium as it turns, and a pulse from the laser will momentarily heat a very small area of the medium past its Curie temperature, whereby it will take on a reversed flux due to the presence of the field coils. This reversed-flux direction will be retained indefinitely as the medium cools.

Alternatively the waveform to be recorded modulates the magnetic field from the coils as shown in Figure 12.2(d). In this approach, the laser is operating continuously in order to raise the track beneath the beam above the Curie temperature, but the magnetic field recorded is determined by the current in the coil at the instant the track cools. Magnetic field modulation is used in the recordable MiniDisc.

In both of these cases, the storage medium is clearly magnetic, but the writing mechanism is the heat produced by light from a laser; hence the term thermomagneto-optics. The advantage of this writing mechanism is that there is no physical contact between the writing head and the medium. The distance can be several millimetres, some of which is taken up with a protective layer to prevent corrosion. In prototypes, this layer is glass, but commercially available disks use plastics.

The laser beam will supply a relatively high power for writing, since it is supplying heat energy. For reading, the laser power is reduced, such that it cannot heat the medium past the Curie temperature, and it is left on continuously. Readout depends on the so-called Kerr effect, which describes a rotation of the plane of polarization of light due to a magnetic field. The magnetic areas written on the disk will rotate the plane of polarization of incident polarized light to two different planes, and it is possible to detect the change in rotation with a suitable pickup.

Erasable disks may also be made which rely on phase changes. Certain chemical compounds may exist in two states, crystalline or amorphous. One state appears shiny and the other dull to a readout beam. It is possible to switch from one state to another by using different power levels in the laser.

12.2 CD, DVD and MD contrasted

CD and MD have a great deal in common. Both use a laser of the same wavelength which creates a spot of the same size on the disk. The track pitch and speed are the same and both offer the same playing time. The channel code and error-correction strategy are the same. DVD uses the same principle as CD, but the readout spot is smaller so that the recording density can significantly be raised.

CD carries 44.1 kHz sixteen-bit PCM audio and is intended to be played in a continuous spiral like a vinyl disk. The CD process, from cutting, through pressing and reading, produces no musical degradation whatsoever, since it simply conveys a series of numbers which are exactly those recorded on the master tape. The only part of a CD player which can cause subjective differences in sound quality in normal operation is the DAC, although in the presence of gross errors some players will correct and/or conceal better than others. Chapter 4 deals with the principles of conversion.

DVD records an MPEG program stream. This is a multiplex of compressed audio and video elementary streams which can be decoded to reproduce television programs or movies.

MD begins with the same PCM data as CD, but uses a form of compression known as ATRAC (see Chapter 5) having a compression factor of 0.2 which introduces a slight but audible loss of quality. After the addition of subcode and housekeeping data MD has an average data rate which is 0.225 that of CD. However, MD has the same recording density and track speed as CD, so the data rate from the disk is greatly in excess of that needed by the audio decoders. The difference is absorbed in RAM exactly as shown in Figure 10.34.

The RAM in a typical MD player is capable of buffering about 3 seconds of audio. When the RAM is full, the disk drive stops transferring data but keeps turning. As the RAM empties into the decoders, the disk drive will top it up in bursts. As the drive need not transfer data for over three quarters of the time, it can reposition between transfers and so it is capable of editing in the same way as a magnetic hard disk. A further advantage of a large RAM buffer is that if the pickup of a CD or MD player is knocked off-track by an external shock the RAM continues to provide data to the audio decoders and provided the pickup can get back to the correct track before the RAM is exhausted there will be no audible effect.
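
The burst-transfer arithmetic can be checked with a short calculation. The following is only a sketch: the ratio of 0.225 and the 3-second buffer come from the text, while the figure of two audio channels is an assumption used to obtain the CD audio bit rate.

```python
# Sketch of the MiniDisc burst-transfer arithmetic (illustrative only).
CD_SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2                  # assumption: stereo, not stated in this section

MD_TO_CD_RATE_RATIO = 0.225   # average MD data rate relative to CD (from the text)
BUFFER_SECONDS = 3.0          # audio held in RAM (from the text)

# PCM audio rate of CD in bits per second.
cd_audio_rate = CD_SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

# Average rate needed by the MD decoders.
md_average_rate = MD_TO_CD_RATE_RATIO * cd_audio_rate

# Fraction of the time the drive must transfer data, and the rest.
transfer_fraction = md_average_rate / cd_audio_rate
idle_fraction = 1.0 - transfer_fraction

# Approximate RAM needed to hold the buffered audio.
buffer_bits = BUFFER_SECONDS * md_average_rate

print(f"CD audio rate:   {cd_audio_rate} bit/s")
print(f"MD average rate: {md_average_rate:.0f} bit/s")
print(f"Drive idle for:  {idle_fraction:.1%} of the time")
print(f"Buffer size:     {buffer_bits / 8 / 1024:.0f} KiB")
```

The idle fraction of about 0.775 is what leaves the pickup free to reposition between transfers.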

When recording an MO disk, the MiniDisc drive also uses the RAM buffer to allow repositioning so that a continuous recording can be made on a disk which has become chequerboarded through selective erasing. The full total playing time is then always available irrespective of how the disk is divided into different recordings.

CD and MD have a fixed bit rate, whereas DVD does not. It is a characteristic of video material that the degree of compression possible varies with program material. Consequently the greatest efficiency is obtained if the bit rate can increase to handle complex pictures and slow down again to handle simple images. In a multiplex a variable video bit rate can exist alongside a fixed audio bit rate.

12.3 CD and MD – disk construction

Figure 12.3 shows the mechanical specification of CD. Within an overall diameter of 120 mm the program area occupies a 33 mm-wide band between the diameters of 50 and 116 mm. Lead-in and lead-out areas increase the width of this band to 35.5 mm. As the track pitch is a constant 1.6 μm, there will be

$$\frac{35.5 \times 10^{-3}\ \text{m}}{1.6 \times 10^{-6}\ \text{m}} \approx 22\,188$$

tracks crossing a radius of the disk. As the track is a continuous spiral, the track length will be given by the above figure multiplied by the average circumference:

$$\text{Length} \approx 22\,188 \times \pi \times \frac{46 + 117}{2}\ \text{mm} \approx 5.7\ \text{km}$$

Figure 12.3    Mechanical specification of CD. Between diameters of 46 and 117 mm is a spiral track 5.7 km long.
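
The track count and spiral length quoted for Figure 12.3 can be reproduced as follows. This is a minimal sketch that treats the spiral as a set of concentric rings of average circumference; the variable names are illustrative.

```python
import math

# Dimensions taken from the text and the caption of Figure 12.3.
TRACK_PITCH_M = 1.6e-6
INNER_DIAMETER_M = 46e-3
OUTER_DIAMETER_M = 117e-3

# Radial width of the recorded band, including lead-in and lead-out.
band_width_m = (OUTER_DIAMETER_M - INNER_DIAMETER_M) / 2

# Number of tracks crossing a radius of the disk.
track_count = band_width_m / TRACK_PITCH_M

# Approximate the spiral by rings of the average circumference.
mean_circumference_m = math.pi * (INNER_DIAMETER_M + OUTER_DIAMETER_M) / 2
track_length_m = track_count * mean_circumference_m

print(f"Band width:   {band_width_m * 1e3:.1f} mm")
print(f"Track count:  {track_count:.0f}")
print(f"Track length: {track_length_m / 1e3:.1f} km")
```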

These figures give a good impression of the precision involved in CD manufacture. The CD case is for protection in storage and the CD has to be taken out of its case and placed in the player. The disk has a plain centre hole and most players clamp the disk onto the spindle from both sides. There are some CD players designed for broadcasters which require the standard CD to be placed into a special cassette. The CD can then be played inside the cassette.

Figure 12.4 shows the mechanical specification of prerecorded MiniDisc. Within an overall diameter of 64 mm the lead-in area begins at a diameter of 29 mm and the program area begins at 32 mm. The track pitch is exactly the same as in CD, but the MiniDisc can be smaller than CD without any sacrifice of playing time because of the use of compression. For ease of handling, MiniDisc is permanently enclosed in a shuttered plastic cartridge which is 72 × 68 × 5 mm. The cartridge resembles a smaller version of a 3½-inch floppy disk, but unlike a floppy, it is slotted into the drive with the shutter at the side. An arrow is moulded into the cartridge body to indicate this.

In the prerecorded MiniDisc, it was a requirement that the whole of one side of the cartridge should be available for graphics. Thus the disk is designed to be secured to the spindle from one side only. The centre of the disk is fitted with a ferrous clamping plate and the spindle is magnetic. When the disk is lowered into the drive it simply sticks to the spindle. The ferrous disk is only there to provide the clamping force. The disk is still located by the moulded hole in the plastic component. In this way the ferrous component needs no special alignment accuracy when it is fitted in manufacture. The back of the cartridge has a centre opening for the hub and a sliding shutter to allow access by the optical pickup.

Figure 12.4    The mechanical dimensions of MiniDisc.

The recordable MiniDisc and cartridge have the same dimensions as the prerecorded MiniDisc, but access to both sides of the disk is needed for recording. Thus the recordable MiniDisc has a shutter which opens on both sides of the cartridge, rather like a double-sided floppy disk. The opening on the front allows access by the magnetic head needed for MO recording, leaving a smaller label area.

Figure 12.5    The construction of the MO recordable MiniDisc.

Figure 12.5 shows the construction of the MO MiniDisc. The 1.1 μm wide tracks are separated by grooves which can be optically tracked. Once again the track pitch is the same as in CD. The MO layer is sandwiched between protective layers.

12.4 Rejecting surface contamination

A fundamental goal of consumer optical disks is that no special working environment or handling skill is required. The bandwidth needed by PCM audio is such that high-density recording is mandatory if reasonable playing time is to be obtained in CD. Although MiniDisc uses compression, it does so in order to make the disk smaller and the recording density is actually the same as for CD.

High-density recording implies short wavelengths. Using a laser focused on the disk from a distance allows short-wavelength recordings to be played back without physical contact, whereas conventional magnetic recording requires intimate contact and implies a wear mechanism, the need for periodic cleaning, and susceptibility to contamination.

Figure 12.6    The objective lens of a CD pickup has a numerical aperture (NA) of 0.45; thus the outermost rays will be inclined at approximately 27° to the normal. Refraction at the air/disk interface changes this to approximately 17° within the disk. Thus light focused to a spot on the information layer has entered the disk through a 0.7 mm diameter circle, giving good resistance to surface contamination.

The information layer of CD and MD is read through the thickness of the disk. Figure 12.6 shows that this approach causes the readout beam to enter and leave the disk surface through the largest possible area. The actual dimensions involved are shown in the figure. Despite the minute spot size of about 1.2 μm diameter, light enters and leaves through a 0.7 mm-diameter circle. As a result, surface debris has to be nearly three orders of magnitude larger than the readout spot (roughly 600 times its diameter) before the beam is obscured. This approach has the further advantage in MO drives that the magnetic head, on the opposite side to the laser pickup, is then closer to the magnetic layer in the disk.

The bending of light at the disk surface is due to refraction of the wavefronts arriving from the objective lens. Wave theory of light suggests that a wavefront advances because an infinite number of point sources can be considered to emit spherical waves which will only add when they are all in the same phase. This can only occur in the plane of the wavefront. Figure 12.7 shows that at all other angles, interference between spherical waves is destructive.

When such a wavefront arrives at an interface with a denser medium, such as the surface of an optical disk, the velocity of propagation is reduced; therefore the wavelength in the medium becomes shorter, causing the wavefront to leave the interface at a different angle (Figure 12.8). This is known as refraction. The ratio of velocity in vacuo to velocity in the medium is known as the refractive index of that medium; it determines the relationship between the angles of the incident and refracted wavefronts. Reflected light, however, leaves at the same angle to the normal as the incident light. If the speed of light in the medium varies with wavelength, dispersion takes place, where incident white light will be split into a rainbow-like spectrum leaving the interface at different angles. Glass used for chandeliers and cut glass is chosen to be highly dispersive, whereas glass for optical instruments will be chosen to have a refractive index which is as constant as possible with changing wavelength. The use of monochromatic light in optical disks allows low-cost optics to be used as they only need to be corrected for a single wavelength.
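
The angles and entry-circle size quoted for Figure 12.6 follow from this refraction relationship. The sketch below applies Snell's law to the outermost ray. The numerical aperture of 0.45 comes from the figure caption, whereas the refractive index of 1.55 and the disk thickness of 1.2 mm are assumed typical values that are not stated in the text.

```python
import math

NUMERICAL_APERTURE = 0.45   # from the caption of Figure 12.6
REFRACTIVE_INDEX = 1.55     # assumption: typical value for polycarbonate
DISK_THICKNESS_MM = 1.2     # assumption: nominal CD thickness

# In air, NA is the sine of the angle of the outermost ray to the normal.
angle_in_air = math.asin(NUMERICAL_APERTURE)

# Snell's law: sin(angle in air) = n * sin(angle in disk).
angle_in_disk = math.asin(math.sin(angle_in_air) / REFRACTIVE_INDEX)

# The cone of light converges to a point at the far side of the disk,
# so the entry circle diameter is set by the disk thickness.
entry_diameter_mm = 2 * DISK_THICKNESS_MM * math.tan(angle_in_disk)

print(f"Angle in air:   {math.degrees(angle_in_air):.1f} degrees")
print(f"Angle in disk:  {math.degrees(angle_in_disk):.1f} degrees")
print(f"Entry diameter: {entry_diameter_mm:.2f} mm")
```

With these assumed values the results agree with the approximate 27°, 17° and 0.7 mm given in the figure.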

Figure 12.7    Plane-wave propagation considered as infinite numbers of spherical waves.

Figure 12.8    Reflection and refraction, showing the effect of the velocity of light in a medium.

The size of the entry circle in Figure 12.6 is a function of the refractive index of the disk material, the numerical aperture of the objective lens and the thickness of the disk. MiniDiscs are permanently enclosed in a cartridge, and scratching is unlikely. This is not so for CD, but fortunately the method of readout through the disk thickness tolerates surface scratches very well. In extreme cases of damage, a scratch can often successfully be removed with metal polish. By way of contrast, the label side is actually more vulnerable than the readout side, since the lacquer coating is only 30 μm thick. For this reason, writing on the label side of CD is not recommended. Pressure from a ballpoint pen could distort the information layer, and solvents from marker pens have been known to penetrate the lacquer and cause corruption. The common party-piece of writing on the readout surface of CD with a felt pen to show off the error-correction system is quite harmless, since the disk base material is impervious to most solvents.

The base material is in fact a polycarbonate plastic produced by (among others) Bayer under the trade name of Makrolon. It has excellent mechanical and optical stability over a wide temperature range, and lends itself to precision moulding and metallization. It is often used for automotive indicator clusters for the same reasons. An alternative material is polymethyl methacrylate (PMMA), one of the first optical plastics, known by such trade names as Perspex and Plexiglas, and widely used for illuminated signs and aircraft canopies. Polycarbonate is preferred by some manufacturers since it is less hygroscopic than PMMA. The differential change in dimensions of the lacquer coat and the base material can cause warping in a hygroscopic material. Audio disks are too small for this to be a problem, but the larger analog video disks are actually two disks glued together back-to-back to prevent this warpage.

12.5 Playing optical disks

A typical laser disk drive resembles a magnetic drive in that it has a spindle drive mechanism to revolve the disk, and a positioner to give radial access across the disk surface. The positioner has to carry a collection of lasers, lenses, prisms, gratings and so on, and cannot be accelerated as fast as a magnetic-drive positioner. A penalty of the very small track pitch possible in laser disks, which gives the enormous storage capacity, is that very accurate track following is needed, and it takes some time to lock onto a track. For this reason tracks on laser disks are usually made as a continuous spiral, rather than the concentric rings of magnetic disks. In this way, a continuous data transfer involves no more than track following once the beginning of the file is located.

In order to record MO disks or replay any optical disk, a source of monochromatic light is required. The light source must have low noise otherwise the variations in intensity due to the noise of the source will mask the variations due to reading the disk. The requirement for a low-noise monochromatic light source is economically met using a semiconductor laser.

The semiconductor laser is a relative of the light-emitting diode (LED). Both operate by raising the energy of electrons to move them from the valence band to the conduction band. Electrons which fall back to the valence band emit a quantum of energy as a photon whose frequency is proportional to the energy difference between the bands. The process is described by Planck’s Law:

Energy difference $E = H \times f$
where $H$ = Planck’s Constant
  $= 6.6262 \times 10^{-34}$ joules/Hertz

For gallium arsenide, the energy difference is about 1.6 eV, where 1 eV is $1.6 \times 10^{-19}$ joules.

Using Planck’s Law, the frequency of emission will be:

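$$f = \frac{E}{H} = \frac{1.6 \times 1.6 \times 10^{-19}}{6.6262 \times 10^{-34}} \approx 3.86 \times 10^{14}\ \text{Hz}$$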

The wavelength will be c/f where

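$c$ = the velocity of light $\approx 3 \times 10^{8}$ m/s, so that

$$\lambda = \frac{c}{f} = \frac{3 \times 10^{8} \times 6.6262 \times 10^{-34}}{1.6 \times 1.6 \times 10^{-19}}\ \text{m} \approx 780\ \text{nm}$$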
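
The same arithmetic can be run as a quick check. This sketch uses only the constants quoted in the text, with the velocity of light rounded to 3 × 10^8 m/s.

```python
# Emission frequency and wavelength from the band-gap energy.
PLANCK_J_PER_HZ = 6.6262e-34      # Planck's Constant, as quoted in the text
ELECTRON_VOLT_J = 1.6e-19         # one electron-volt in joules
BAND_GAP_EV = 1.6                 # energy difference quoted in the text
SPEED_OF_LIGHT_M_PER_S = 3.0e8    # rounded value

energy_j = BAND_GAP_EV * ELECTRON_VOLT_J
frequency_hz = energy_j / PLANCK_J_PER_HZ            # f = E / H
wavelength_m = SPEED_OF_LIGHT_M_PER_S / frequency_hz  # wavelength = c / f

print(f"Frequency:  {frequency_hz:.3e} Hz")
print(f"Wavelength: {wavelength_m * 1e9:.0f} nm")
```

The result is a little under 780 nm, which lies in the near infrared.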

In the LED, electrons fall back to the valence band randomly, and the light produced is incoherent. In the laser, the ends of the semiconductor are optically flat mirrors, which produce an optically resonant cavity. One photon can bounce to and fro, exciting others in synchronism, to produce coherent light. This is known as light amplification by stimulated emission of radiation, mercifully abbreviated to LASER, and can result in a runaway condition, where all available energy is used up in one flash. In injection lasers, an equilibrium is reached between energy input and light output, allowing continuous operation. The equilibrium is delicate, and such devices are usually fed from a current source. To avoid runaway when temperature change disturbs the equilibrium, a photosensor is often fed back to the current source. Such lasers have a finite life, and become steadily less efficient. The feedback will maintain output, and it is possible to anticipate the failure of the laser by monitoring the drive voltage needed to give the correct output.

Some of the light reflected back from the disk re-enters the aperture of the objective lens. The pickup must be capable of separating the reflected light from the incident light. Figure 12.9 shows two systems. In (a) an intensity beamsplitter consisting of a semisilvered mirror is inserted in the optical path and reflects some of the returning light into the photosensor. This is not very efficient, as half of the replay signal is lost by transmission straight on. In the example at (b) separation is by polarization.

In natural light, the electric-field component will be in many planes. Light is said to be polarized when the electric field direction is constrained. The wave can be considered as made up from two orthogonal components. When these are in phase, the polarization is said to be linear. When there is a phase shift between the components, the polarization is said to be elliptical, with a special case at 90° called circular polarization. These types of polarization are contrasted in Figure 12.10.

Figure 12.9    (a) Reflected light from the disk is directed to the sensor by a semisilvered mirror. (b) A combination of polarizing prism and quarter-wave plate separates incident and reflected light.

Figure 12.10    (a) Linear polarization: orthogonal components are in phase. (b) Circular polarization: orthogonal components are in phase quadrature.

In order to create polarized light, anisotropic materials are necessary. Polaroid material, invented by Edwin Land, is vinyl which is made anisotropic by stretching it while hot. This causes the long polymer molecules to line up along the axis of stretching. If the material is soaked in iodine, the molecules are rendered conductive, and short out any electric-field component along themselves. Electric fields at right angles are unaffected; thus the transmission plane is at right angles to the stretching axis.

Stretching plastics can also result in anisotropy of refractive index; this effect is known as birefringence. If a linearly polarized wavefront enters such a medium, the two orthogonal components propagate at different velocities, causing a relative phase difference proportional to the distance travelled. The plane of polarization of the light is rotated. Where the thickness of the material is such that a 90° phase change is caused, the device is known as a quarter-wave plate. The action of such a device is shown in Figure 12.11. If the plane of polarization of the incident light is at 45° to the planes of greatest and least refractive index, the two orthogonal components of the light will be of equal magnitude, and this results in circular polarization. Similarly, circular-polarized light can be returned to the linear-polarized state by a further quarter-wave plate. Rotation of the plane of polarization is a useful method of separating incident and reflected light in a laser pickup. Using a quarter-wave plate, the plane of polarization of light leaving the pickup will have been turned 45°, and on return it will be rotated a further 45°, so that it is now at right angles to the plane of polarization of light from the source. The two can easily be separated by a polarizing prism, which acts as a transparent block to light in one plane, but as a prism to light in the other plane, such that reflected light is directed towards the sensor.
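
The effect of one and two passes through a quarter-wave plate can be illustrated with Jones vectors, in which the two orthogonal field components are held as complex numbers. This is a sketch under simplifying assumptions: the plate is ideal, the disk is treated as a perfect mirror, and the change of coordinate handedness on reflection is ignored. The helper names are hypothetical.

```python
import cmath
import math

def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix by a two-component Jones vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

def phase_difference_deg(vector):
    """Phase of the y component relative to the x component, in degrees."""
    x, y = vector
    return math.degrees(cmath.phase(y / x))

def overlap(u, v):
    """Inner product of two Jones vectors; zero means orthogonal states."""
    return sum(p * q.conjugate() for p, q in zip(u, v))

# Ideal quarter-wave plate: the y component is shifted by 90 degrees.
QUARTER_WAVE = ((1, 0), (0, 1j))

# Linear polarization at 45 degrees to the plate axes: equal components.
linear_45 = (1 / math.sqrt(2), 1 / math.sqrt(2))

outgoing = apply(QUARTER_WAVE, linear_45)   # towards the disk
returning = apply(QUARTER_WAVE, outgoing)   # after the second pass

print("Phase difference after one pass:", phase_difference_deg(outgoing))
print("Overlap with source polarization:", abs(overlap(returning, linear_45)))
```

After one pass the components are equal in magnitude and 90° apart in phase, which is circular polarization. After the second pass the light is linear again but orthogonal to the source polarization, which is what allows the polarizing prism to divert it to the sensor.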

Figure 12.11    Different speed of light in different planes rotates the plane of polarization in a quarter-wave plate to give a circular-polarized output.

In a CD player, the sensor is concerned only with the intensity of the light falling on it. When playing MO disks, the intensity does not change, but the magnetic recording on the disk rotates the plane of polarization one way or the other depending on the direction of the vertical magnetization. MO disks cannot be read with circular-polarized light. Light incident on the medium must be plane polarized and so the quarter-wave plate of the CD pickup cannot be used. Figure 12.12(a) shows that a polarizing prism is still required to linearly polarize the light from the laser on its way to the disk. Light returning from the disk has had its plane of polarization rotated by approximately ± 1°. This is an extremely small rotation. Figure 12.12(b) shows that the returning rotated light can be considered to be composed of two orthogonal components. Rx is the component which is in the same plane as the illumination and is called the ordinary component and Ry is the component due to the Kerr effect rotation and is known as the magneto-optic component. A polarizing beam splitter mounted squarely would reflect the magneto-optic component Ry very well because it is at right angles to the transmission plane of the prism, but the ordinary component would pass straight on in the direction of the laser. By rotating the prism slightly a small amount of the ordinary component is also reflected. Figure 12.12(c) shows that when combined with the magneto-optic component, the angle of rotation has increased.4 Detecting this rotation requires a further polarizing prism or analyser as shown in Figure 12.12. The prism is twisted such that the transmission plane is at 45° to the planes of Rx and Ry. Thus with an unmagnetized disk, half of the light is transmitted by the prism and half is reflected. If the magnetic field of the disk turns the plane of polarization towards the transmission plane of the prism, more light is transmitted and less is reflected.
Conversely if the plane of polarization is rotated away from the transmission plane, less light is transmitted and more is reflected. If two sensors are used, one for transmitted light and one for reflected light, the difference between the two sensor outputs will be a waveform representing the angle of polarization and thus the recording on the disk. This differential analyser eliminates common-mode noise in the reflected beam.5 As Figure 12.12 shows, the output of the two sensors is summed as well as subtracted in a MiniDisc player. When playing MO disks, the difference signal is used. When playing prerecorded disks, the sum signal is used and the effect of the second polarizing prism is disabled.
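
The behaviour of the differential analyser can be sketched with Malus's law, which gives the intensity passed by an ideal polarizer as the cosine squared of the angle between the polarization and the transmission plane. The function names are hypothetical, and an ideal lossless prism is assumed.

```python
import math

def analyser_outputs(intensity, rotation_deg):
    """Return (transmitted, reflected) intensity for an ideal analyser.

    The transmission plane is at 45 degrees to the unrotated polarization;
    a positive rotation turns the light towards the transmission plane.
    """
    angle = math.radians(45.0 - rotation_deg)
    transmitted = intensity * math.cos(angle) ** 2
    reflected = intensity * math.sin(angle) ** 2
    return transmitted, reflected

def sum_and_difference(intensity, rotation_deg):
    """Sum (intensity) and difference (polarization) signals of the sensors."""
    transmitted, reflected = analyser_outputs(intensity, rotation_deg)
    return transmitted + reflected, transmitted - reflected

for rotation in (-1.0, 0.0, 1.0):
    total, diff = sum_and_difference(1.0, rotation)
    print(f"rotation {rotation:+.0f} deg: sum {total:.4f}, difference {diff:+.4f}")
```

With no rotation the two sensors receive equal light and the difference is zero whatever the beam intensity, whereas the sum does not depend on the rotation at all. This is consistent with using the difference signal for MO replay and the sum signal for prerecorded disks.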

Figure 12.12    A pickup suitable for the replay of magneto-optic disks must respond to very small rotations of the plane of polarization.

Since the residual stresses set up in moulding plastic tend to align the long-chain polymer molecules, plastics can have different refractive indices in different directions, a phenomenon known as birefringence. It is possible accidentally to rotate the plane of polarization of light in birefringent plastics, and there were initial reservations as to the feasibility of this approach for playing moulded CDs. The necessary quality of disk moulding was achieved by using relatively high temperatures where the material flows more easily. This has meant that birefringence is negligible, and the polarizing beamsplitter is widely used.

12.6 Focus systems

The frequency response of the laser pickup and the amount of crosstalk are both a function of the spot size and care must be taken to keep the beam focused on the information layer. If the spot on the disk becomes too large, it will be unable to discern the smaller features of the track, and can also be affected by the adjacent track. Disk warp and thickness irregularities will cause focal-plane movement beyond the depth of focus of the optical system, and a focus servo system will be needed. The depth of focus is related to the numerical aperture of the objective lens, and the accuracy of the servo must be sufficient to keep the focal plane within that depth, which is typically ± 1 μm.

The focus servo moves a lens along the optical axis in order to keep the spot in focus. Since dynamic focus-changes are largely due to warps, the focus system must have a frequency response in excess of the rotational speed. A moving-coil actuator is often used owing to the small moving mass which this permits. Figure 12.13 shows that a cylindrical magnet assembly almost identical to that of a loudspeaker can be used, coaxial with the light beam. Alternatively a moving-magnet design can be used. A rare earth magnet allows a sufficiently strong magnetic field without excessive weight.

A focus-error signal is necessary to drive the lens. There are a number of ways in which this can be derived, the most common of which will be described here.

images

Figure 12.13    Moving-coil-focus servo can be coaxial with the light beam as shown.

images

Figure 12.14    The cylindrical lens focus method produces an elliptical spot on the sensor whose aspect ratio is detected by a four-quadrant sensor to produce a focus error.

In Figure 12.14 a cylindrical lens is installed between the beamsplitter and the photosensor. The effect of this lens is that the beam has no focal point on the sensor. In one plane, the cylindrical lens appears parallel-sided, and has negligible effect on the focal length of the main system, whereas in the other plane, the lens shortens the focal length. The image will be an ellipse whose aspect ratio changes as a function of the state of focus. Between the two foci, the image will be circular. The aspect ratio of the ellipse, and hence the focus error, can be found by dividing the sensor into quadrants. When these are connected as shown, the focus-error signal is generated. The data readout signal is the sum of the quadrant outputs.
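The quadrant arithmetic can be illustrated with a minimal sketch. It assumes the quadrants are labelled A to D with A, C and B, D as the diagonal pairs, following the caption of Figure 12.24; the function name is hypothetical and the input values are arbitrary illustrative photocurrents.

```python
def quadrant_signals(a, b, c, d):
    """Return (focus_error, readout) from four quadrant photocurrents.

    Assumes diagonal pairs (A, C) and (B, D), as labelled in Figure 12.24.
    """
    focus_error = (a + c) - (b + d)   # ellipse elongated along one diagonal or the other
    readout = a + b + c + d           # data signal is the sum of all quadrants
    return focus_error, readout


# Circular spot (in focus): all quadrants equally lit, so the error is zero.
in_focus = quadrant_signals(1.0, 1.0, 1.0, 1.0)
# Elliptical spot favouring the A/C diagonal: the error becomes positive.
defocused = quadrant_signals(1.5, 0.5, 1.5, 0.5)
```

Note that the readout sum is the same in both cases; only the distribution of light between the diagonals changes with the state of focus.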

images

Figure 12.15    (a)–(c) Knife-edge focus method requires only two sensors, but is critically dependent on knife-edge position. (d)–(f) Twin-prism method requires three sensors (A, B, C), where focus error is (A + C) – B. Prism misalignment reduces sensitivity without causing focus offset.

Figure 12.15 shows the knife-edge method of determining focus. A split sensor is also required. At (a) the focal point is coincident with the knife edge, so it has little effect on the beam. At (b) the focal point is to the right of the knife edge, and rising rays are interrupted, reducing the output of the upper sensor. At (c) the focal point is to the left of the knife edge, and descending rays are interrupted, reducing the output of the lower sensor. The focus error is derived by comparing the outputs of the two halves of the sensor. A drawback of the knife-edge system is that the lateral position of the knife edge is critical, and adjustment is necessary. To overcome this problem, the knife edge can be replaced by a pair of prisms, as shown in Figure 12.15(d)–(f). Mechanical tolerances then only affect the sensitivity, without causing a focus offset.

The cylindrical lens method is compared with the knife-edge/prism method in Figure 12.16, which shows that the cylindrical lens method has a much smaller capture range. A focus-search mechanism will be required, which moves the focus servo over its entire travel, looking for a zero crossing. At this time the feedback loop will be completed, and the sensor will remain on the linear part of its characteristic. The spiral track of CD and MiniDisc starts at the inside and works outwards. This was deliberately arranged because there is less vertical run-out near the hub, and initial focusing will be easier.
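A focus search of this kind might be sketched as follows. The S-shaped error characteristic, the 20 μm width, the arming threshold and all function names are assumptions made for illustration; they are not taken from any particular player.

```python
import math


def s_curve(z_um, focus_um=0.0, width_um=20.0):
    """Hypothetical focus-error characteristic: linear near focus, fading to zero far away."""
    x = (z_um - focus_um) / width_um
    return x * math.exp(-x * x)


def focus_search(error_at, start_um, stop_um, step_um=1.0, arm_level=0.1):
    """Ramp the lens and return the position of the first genuine zero crossing.

    The search is armed only once the error magnitude has exceeded arm_level,
    so the near-zero error far from focus is not mistaken for a crossing.
    """
    z = start_um
    prev = error_at(z)
    armed = abs(prev) > arm_level
    while z < stop_um:
        z += step_um
        cur = error_at(z)
        if armed and prev * cur <= 0.0:
            return z          # the feedback loop would be closed here
        if abs(cur) > arm_level:
            armed = True
        prev = cur
    return None               # no disk present, or focus outside the travel


# Lens travel of 1 mm with the information layer at an arbitrary 137.5 um.
found = focus_search(lambda z: s_curve(z, focus_um=137.5), 0.0, 1000.0)
```

The arming threshold reflects the small capture range: outside it the error signal is close to zero and carries no useful information, so a crossing is only trusted after a clear excursion has been seen.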

images

Figure 12.16    Comparison of capture range of knife-edge/prism method and astigmatic (cylindrical lens) system. Knife edge may have a range of 1 mm, whereas astigmatic may only have a range of 40 μm, requiring a focus-search mechanism.

12.7 Tracking systems

The track pitch is only 1.6 μm, and this is much smaller than the accuracy to which the player chuck or the disk centre hole can be made; on a typical player, run-out will swing several tracks past a fixed pickup. The non-contact readout means that there is no inherent mechanical guidance of the pickup. In addition, a warped disk will not present its surface at 90° to the beam, but will constantly change the angle of incidence during two whole cycles per revolution. Owing to the change of refractive index at the disk surface, the tilt will change the apparent position of the track to the pickup, and Figure 12.17 shows that this makes it appear wavy. Warp also results in coma of the readout spot. The disk format specifies a maximum warp amplitude to keep these effects under control. Finally, vibrations induced in the player from outside, particularly in portable and automotive players, will tend to disturb tracking. A track-following servo is necessary to keep the spot centralized on the track in the presence of these difficulties. There are several ways in which a tracking error can be derived.

In the three-spot method, two additional light beams are focused on the disk track, one offset to each side of the track centreline. Figure 12.18 shows that, as one side spot moves away from the track into the mirror area, there is less destructive interference and more reflection. This causes the average amplitude of the side spots to change differentially with tracking error. The laser head contains a diffraction grating which produces the side spots, and two extra photosensors onto which the reflections of the side spots will fall. The side spots feed a differential amplifier, which has a low-pass filter to reject the channel-code information and retain the average brightness difference. Some players use a delay line in one of the side-spot signals whose period is equal to the time taken for the disk to travel between the side spots. This helps the differential amplifier to cancel the channel code.
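The differential side-spot measurement can be sketched in a few lines. The signal model is an assumption: each side spot is taken to be fully bright over mirror and dimmed over bumps by an amount that depends on how much of the spot overlaps the track, with a random binary sequence standing in for the channel code. The function names are hypothetical, and a simple mean stands in for the low-pass filter.

```python
import random


def side_spot(overlap, n=2000, seed=0):
    """Hypothetical side-spot signal: 1.0 over mirror, dimmed by `overlap` over bumps."""
    rng = random.Random(seed)
    # A random 0/1 sequence stands in for the channel code.
    return [1.0 - overlap * rng.choice((0, 1)) for _ in range(n)]


def tracking_error(e_samples, f_samples):
    """Difference of average side-spot brightness, E - F as in Figure 12.24."""
    mean_e = sum(e_samples) / len(e_samples)   # the mean acts as a crude low-pass filter
    mean_f = sum(f_samples) / len(f_samples)
    return mean_e - mean_f


# On track: both side spots overlap the track equally, so the error is near zero.
on_track = tracking_error(side_spot(0.3, seed=1), side_spot(0.3, seed=2))
# Off track: spot E has moved towards the mirror area, spot F further onto the track.
off_track = tracking_error(side_spot(0.1, seed=1), side_spot(0.5, seed=2))
```

The residual on-track error in this sketch comes from the two side spots seeing different code patterns, which is the effect the delay line mentioned above is intended to cancel.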

images

Figure 12.17    Owing to refraction, the angle of incidence (i) is greater than the angle of refraction (r). Disk warp causes the apparent position of the track (dashed line) to move, requiring the tracking servo to correct.

images

Figure 12.18    Three-spot method of producing tracking error compares average level of side-spot signals. Side spots are produced by a diffraction grating and require their own sensors.

The side spots are generated as follows. When a wavefront reaches an aperture which is small compared to the wavelength, the aperture acts as a point source, and the process of diffraction can be observed as a spherical wavefront leaving the aperture as in Figure 12.19. Where the wavefront passes through a regular structure, known as a diffraction grating, light on the far side will form new wavefronts wherever radiation is in phase, and Figure 12.20 shows that these will be at an angle to the normal depending on the spacing of the structure and the wavelength of the light. A diffraction grating illuminated by white light will produce a dispersed spectrum at each side of the normal. To obtain a fixed angle of diffraction, monochromatic light is necessary.
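The dependence of the diffraction angle on spacing and wavelength is given by the standard grating relation, stated here as general background rather than taken from the text. For light arriving along the normal, with grating spacing d, wavelength λ and integer order m (symbols chosen for this illustration):

$$ d \sin\theta_m = m\lambda, \qquad m = 0, \pm 1, \pm 2, \ldots $$

Since the angle for any non-zero order depends on λ, each colour in white light leaves at a different angle, whereas a single wavelength gives fixed angles. In the three-spot pickup the m = ±1 orders would provide the two side spots and m = 0 the main spot.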

images

Figure 12.19    Diffraction as a plane wave reaches a small aperture.

images

Figure 12.20    In a diffraction grating, constructive interference can take place at more than one angle for a single wavelength.

The alternative approach to tracking-error detection is to analyse the diffraction pattern of the reflected beam. The effect of an off-centre spot is to rotate the radial diffraction pattern about an axis along the track. Figure 12.21 shows that, if a split sensor is used, one half will see greater modulation than the other when off-track. Such a system may be prone to develop an offset due either to drift or to contamination of the optics, although the capture range is large. A further tracking mechanism is often added to obviate the need for periodic adjustment. Figure 12.22 shows this dither-based system, which resembles in many respects the track-following method used in many professional videotape recorders. A sinusoidal drive is fed to the tracking servo, causing a radial oscillation of spot position of about ± 50 nm. This results in modulation of the envelope of the readout signal, which can be synchronously detected to obtain the sense of the error. The dither can be produced by vibrating a mirror in the light path, which enables a high frequency to be used, or by oscillating the whole pickup at a lower frequency.
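The principle of synchronous detection can be sketched numerically. The parabolic envelope model and its constant are assumptions chosen only to give a maximum on the track centre; the ±50 nm dither amplitude is the figure quoted above, and the function names are hypothetical.

```python
import math


def envelope(offset_nm, k=1e-6):
    """Hypothetical readout envelope: maximum on the track centre, falling off parabolically."""
    return 1.0 - k * offset_nm ** 2


def synchronous_detect(static_offset_nm, dither_nm=50.0, samples_per_cycle=64, cycles=10):
    """Multiply the envelope by the dither reference and average over whole cycles."""
    n = samples_per_cycle * cycles
    acc = 0.0
    for i in range(n):
        ref = math.sin(2.0 * math.pi * i / samples_per_cycle)   # dither reference
        pos = static_offset_nm + dither_nm * ref                # instantaneous spot position
        acc += envelope(pos) * ref                              # synchronous rectification
    return acc / n


left = synchronous_detect(-100.0)    # spot sitting to one side of the track
centred = synchronous_detect(0.0)    # spot on the track centre
right = synchronous_detect(100.0)    # spot sitting to the other side
```

With this model the detected output is proportional to the static offset and changes sign with it, while on the track centre the envelope is modulated only at twice the dither frequency and the detected output is zero. Because the result depends on the shape of the envelope rather than on a DC level, it does not drift in the way a split-sensor error can.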

images

Figure 12.21    Split-sensor method of producing tracking error focuses image of spot onto sensor. One side of spot will have more modulation when off-track.

images

Figure 12.22    Dither applied to readout spot modulates the readout envelope. A tracking error can be derived.

12.8 Typical pickups

It is interesting to compare different designs of laser pickup. Figure 12.23 shows a Philips laser head.6 The dual-prism focus method is used, which combines the output of two split sensors to produce a focus error. The focus amplifier drives the objective lens which is mounted on a parallel motion formed by two flexural arms. The capture range of the focus system is sufficient to accommodate normal tolerances without assistance. A radial differential tracking signal is extracted from the sensors as shown in the figure. Additionally, a dither frequency of 600 Hz produces envelope modulation which is synchronously rectified to produce a drift-free tracking error. Both errors are combined to drive the tracking system. As only a single spot is used, the pickup is relatively insensitive to angular errors, and a rotary positioner can be used, driven by a moving coil. The assembly is statically balanced to give good resistance to lateral shock.

images

Figure 12.23    Philips laser head showing semisilvered prism for beam splitting. Focus error is derived from dual-prism method using split sensors. Focus error (A + D) – (B + C) is used to drive focus motor which moves objective lens on parallel action flexure. Radial differential tracking error is derived from split sensor (A + B) – (C + D). Tracking error drives entire pickup on radial arm driven by moving coil. Signal output is (A + B + C + D). System includes 600 Hz dither for tracking. (Courtesy Philips Technical Review)

images

Figure 12.24    Sony laser head showing polarizing prism and quarter-wave plate for beam splitting, and diffraction grating for production of side spots for tracking. The cylindrical lens system is used for focus, with a four-quadrant sensor (A, B, C, D) and two extra sensors E, F for the side spots. Tracking error is E–F; focus error is (A + C) – (B + D). Signal output is (A + B + C + D). The focus and tracking errors drive the two-axis device. (Courtesy Sony Broadcast)

Figure 12.24 shows a Sony laser head used in consumer players. The cylindrical-lens focus method is used, requiring a four-quadrant sensor. Since this method has a small capture range, a focus-search mechanism is necessary. When a disk is loaded, the objective lens is ramped up and down looking for a zero crossing in the focus error. The three-spot method is used for tracking. The necessary diffraction grating can be seen adjacent to the laser diode. Tracking error is derived from side-spot sensors (E, F). Since the side-spot system is sensitive to angular error, a parallel-tracking laser head traversing a disk radius is essential. A cost-effective linear motion is obtained by using a rack-and-pinion drive for slow, coarse movements, and a laterally moving lens in the light path for fine rapid movements. The same lens will be moved up and down for focus by the so-called two-axis device, which is a dual-moving coil mechanism. In some players this device is not statically balanced, making the unit sensitive to shock, but this was overcome on later heads designed for portable players. Figure 12.25 shows a later Sony design having a prism which reduces the height of the pickup above the disk.

images

Figure 12.25    For automotive and portable players, the pickup can be made more compact by incorporating a mirror, which allows most of the elements to be parallel to the disk instead of at right angles.

12.9 DVD and CD readout in detail

The CD medium is designed to be read with a phase contrast microscope, and so it is correct to describe the deformities on the information layer as a phase structure. The original LaserVision disk patent7 contains a variety of approaches, but in the embodiment used in CD it consists of two parallel planes separated by a distance which is constant and specifically related to the wavelength of the light which will be used to read it. The phase structure is created with deformities which depart from the first of the planes and whose extremities are in the second plane. These deformities are called pits when the second plane is below the first and bumps when the second plane is above the first.

Whilst a phase structure can be read by transmission or reflection, commercial designs based on the Philips medium, such as LaserVision,8 DVD, CD, CD-Video, CD-ROM and prerecorded MiniDisc, use reflective readout exclusively.

Optical physicists characterize materials by their reflectivity, transmissivity and absorption. Light energy cannot disappear, so when light is incident on some object, the amounts of light transmitted, absorbed and reflected must add up to the original incident amount. When no light is absorbed, the incident light is divided between that transmitted and that reflected. When light is absorbed, the transmitted and reflected amounts of light are both reduced. A medium such as a photograph contains pigments which absorb light more in the dark areas and less in the light areas. Thus the amount of light reflected varies. A medium such as a transparency also contains such pigments but in this case it is primarily the amount of light transmitted which varies. Such a variation in transmitted or reflected light from place to place is known as contrast.

Figure 12.2(a) showed that in CD, the information layer consists of an optically flat surface above which flat-topped bumps project. The entire surface of the phase structure is metallized to render it reflective. This metallization of the entire information layer means that little light is transmitted or absorbed, and as a result virtually all incident light must be reflected. The information layer of CD does not have conventional contrast. Contrast is in any case unnecessary as interference is used for readout, and this works better with a totally reflecting structure. Referring to Figure 12.26 it will be seen that a spot of light is focused onto the phase structure such that it straddles a bump. Ideally half the light energy should be incident on the top of the bump and half on the surrounding mirror surface. The height of the bump is ideally one quarter the wavelength of the light in a reflective system and as a result light which has reflected from the mirror surface has travelled one half a wavelength further than light which has reflected from the top of the bump. Consequently, along the normal, there are two components of light of almost equal energy, but they are in phase opposition, and destructive interference occurs, such that no wavefront can form in that direction. As light energy cannot disappear, wavefronts will leave the phase structure at any oblique angle at which constructive interference between the components can be achieved, creating a diffraction pattern. In the case of the light beam straddling the centre of a long bump the diffraction pattern will be in a plane which is normal to the disk surface and which intersects a disk radius. It is thus called a radial diffraction pattern. The zeroth-order radiation (that along the normal) will be heavily attenuated, and most of the incident energy will be concentrated in the first- and second-order wavefronts.
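The quarter-wave condition can be worked through as a short check. Here h is the bump height, λ the wavelength of the readout light at the information layer, Δ the extra path and φ the resulting phase difference; the symbols are chosen for this illustration. Light reflected from the land travels the bump height twice, once down and once back, relative to light reflected from the top of the bump:

$$ \Delta = 2h = 2 \cdot \frac{\lambda}{4} = \frac{\lambda}{2}, \qquad \varphi = \frac{2\pi\Delta}{\lambda} = \pi $$

With equal energy in the two components, the resultant along the normal is proportional to

$$ 1 + e^{i\pi} = 0 $$

which is the ideal cancellation of the zeroth order. Any inequality between the two components leaves a residual, so the cancellation is only partial in practice.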

images

Figure 12.26    The structure of a maximum frequency recording is shown here, related to the intensity function of an objective of 0.45 NA with 780 nm light. Note that track spacing puts adjacent tracks in the dark rings, reducing crosstalk. Note also that as the spot has an intensity function it is meaningless to specify the spot diameter without some reference such as an intensity level.

Some treatments use the word scattering to describe the effect of the interaction of the readout beam with the relief structure. This is technically incorrect as scattering is a random phenomenon which is independent of wavelength. The diffraction pattern is totally predictable and strongly wavelength dependent.

When the light spot is focused on a plain part of the mirror surface, known as a land, clearly most of the energy is simply reflected back whence it came. Thus when a bump is present, light is diffracted away from the normal, whereas in the absence of a bump, it returns along the normal. Although all incident light is reflected at all times, the effect of diffraction is that the direction in which wavefronts leave the phase structure is changed by the presence of the bump. What then happens is a function of the optical system being used. In a conventional CD player the angle to the normal of the first diffracted order in the radial diffraction pattern due to a long bump will be sufficiently oblique that it passes outside the aperture of the objective and does not return to the photosensor. Thus the bumps appear dark to the photosensor and the lands appear bright. Although all light is reflected at all times and there is no conventional contrast, inside the pickup there are variations in the light falling on the photosensor, a phenomenon called phase contrast.

The phase contrast technique described will only work for a given wavelength and with an appropriate aperture and lens design, and so the CD must be read with monochromatic light. Whilst the ideal case is where the two components of light are equal to give exact cancellation, in practice this ideal is not met but instead there is a substantial reduction in the light returning to the pickup.

Some treatments of CD refer to a ‘beam’ of light returning from the disk to the pickup, but this is incorrect. What leaves the disk is a hemispherical diffraction pattern certain orders of which enter the aperture of the pickup. The destructive interference effect can be seen with the naked eye by examining any CD under a conventional incandescent lamp. The data surface of a CD has many parallel tracks and works somewhat like a diffraction grating by dispersing the incident white light into a spectrum. However, the resultant spectrum is not at all like that produced by a conventional diffraction grating or by a prism. These latter produce a spectrum in which the relative brightness of the colours is like that of a rainbow, i.e. the green in the centre is brightest, the red at one end is less bright and the blue at the other end is fainter still. This is due to the unequal response of the eye to various colours, where equal red, green and blue stimuli produce responses in approximately the proportions 2:5:1 respectively. In the diffracted spectrum from a CD, however, the blue component appears as strong or stronger than the other colours. This is because the relief structure of CD is designed not to reflect infrared light of 780 nm wavelength. This relief structure will, however, reflect perfectly ultraviolet light of half that wavelength as the zeroth-order light reflected from the top of the bumps will be in phase with light reflected from the land. Thus a CD reflects visible blue light much more strongly than longer-wavelength colours.

It is essential to the commercial success of CD that a useful playing time (75 min max.) should be obtained from a recording of reasonable size (12 cm). The size was determined by the European motor industry as being appropriate for car dashboard-mounted units. It follows that the smaller the spot of light which can be created, the smaller can be the deformities carrying the information, and so more information per unit area (known in the art as the superficial recording density) can be stored. Development of a successful high-density optical recorder requires an intimate knowledge of the behaviour of light focused into small spots. If it is attempted to focus a uniform beam of light to an infinitely small spot on a surface normal to the optical axis, it will be found that it is not possible. This is probably just as well as an infinitely small spot would have infinite intensity and any matter it fell on would not survive. Instead the result of such an attempt is a distribution of light in the area of the focal point which has no sharply defined boundary. This is called the Airy distribution9 (sometimes pattern or disk) after Sir George Airy (1835), the then Astronomer Royal. If a line is considered to pass across the focal plane, through the theoretical focal point, and the intensity of the light is plotted on a graph as a function of the distance along that line, the result is the intensity function shown in Figure 12.26. It will be seen that this contains a central sloping peak surrounded by alternating dark rings and light rings of diminishing intensity. These rings will in theory reach to infinity before their intensity becomes zero. The intensity distribution or function described by Airy is due to diffraction effects across the finite aperture of the objective. For a given wavelength, as the aperture of the objective is increased, so the diameter of the features of the Airy pattern reduces. The Airy pattern vanishes to a singularity of infinite intensity with a lens of infinite aperture which, of course, cannot be made. The approximation of geometric optics is quite unable to predict the occurrence of the Airy pattern.

An intensity function does not have a diameter, but for practical purposes an effective diameter typically quoted is that at which the intensity has fallen to some convenient fraction of that at the peak. Thus one could state, for example, the half-power diameter.

Since light paths in optical instruments are generally reversible, it is possible to see an interesting corollary which gives a useful insight into the readout principle of CD. Considering light radiating from a phase structure, as in Figure 12.27, the more closely spaced the features of the phase structure, i.e. the higher the spatial frequency, the more oblique the direction of the wavefronts in the diffraction pattern which results and the larger the aperture of the lens needed to collect the light if the resolution is not to be lost. The corollary of this is that the smaller the Airy distribution it is wished to create, the larger must be the aperture of the lens. Spatial frequency is measured in lines per millimetre and as it increases, the wavefronts of the resultant diffraction pattern become more oblique. In the case of a CD, the smaller the bumps and the spaces between them along the track, the higher the spatial frequency, and the more oblique the diffraction pattern becomes in a plane tangential to the track. With a fixed-objective aperture, as the tangential diffraction pattern becomes more oblique, less light passes the aperture and the depth of modulation transmitted by the lens falls. At some spatial frequency, all the diffracted light falls outside the aperture and the modulation depth transmitted by the lens falls to zero. This is known as the spatial cut-off frequency. Thus a graph of depth of modulation versus spatial frequency can be drawn, which is known as the modulation transfer function (MTF). This is a straight line commencing at unity at zero spatial frequency (no detail) and falling to zero at the cut-off spatial frequency (finest detail). Thus one could describe a lens of finite aperture as a form of spatial low-pass filter. The Airy function is no more than the spatial impulse response of the lens, and the concentric rings of the Airy function are the spatial analog of the symmetrical ringing in a phase-linear electrical filter. The Airy function and the triangular frequency response form a transform pair10 as shown in Chapter 3.
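The straight-line characteristic just described can be written compactly. With f_s the spatial frequency and f_c the spatial cut-off frequency (symbols chosen for this illustration), the depth of modulation M transmitted by the lens in this idealization is

$$ M(f_s) = \begin{cases} 1 - \dfrac{f_s}{f_c}, & 0 \le f_s \le f_c \\ 0, & f_s > f_c \end{cases} $$

so that, for example, detail at half the cut-off frequency would be reproduced at half the depth of modulation.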

images

Figure 12.27    Fine detail in an object can only be resolved if the diffracted wavefront due to the highest spatial frequency is collected by the lens. Numerical aperture (NA) = sin θ, and as θ is the diffraction angle it follows that, for a given wavelength, NA determines resolution.

When an objective lens is used in a conventional microscope, the MTF will allow the resolution to be predicted in lines per millimetre. However, in a scanning microscope the spatial frequency of the detail in the object is multiplied by the scanning velocity to give a temporal frequency measured in Hertz. Thus lines per millimetre multiplied by millimetres per second gives lines per second. Instead of a straight-line MTF falling to the spatial cut-off frequency, a scanning microscope has a temporal frequency response falling to zero at the optical cut-off frequency. Whilst this concept requires a number of idiomatic terms to be assimilated at once, the point can be made clear by a simple analogy. Imagine the evenly spaced iron railings outside a schoolyard. These are permanently fixed, and can have no temporal frequency, yet they have a spatial frequency which is the number of railings per unit distance. A small boy with a stick takes great delight in running along the railings so that his stick hits each one in turn and makes a great noise. The rate at which his stick hits the railings is the temporal frequency which results from their being scanned. This rate would increase if the boy ran faster, but it would also increase if the rails were closer together. As a consequence it can be seen that the temporal frequency is proportional to the spatial frequency multiplied by the scanning speed. Put more technically, the frequency response of an optical recorder is the Fourier transform of the Airy distribution of the readout spot multiplied by the track velocity.
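The railings analogy reduces to a single multiplication, sketched below. The railing spacing and running speeds are made-up figures and the function name is hypothetical.

```python
def temporal_frequency(spatial_freq_per_m, speed_m_per_s):
    """Events per second produced by scanning a spatial pattern at a given speed."""
    return spatial_freq_per_m * speed_m_per_s


# Hypothetical railings spaced at 8 per metre.
walking = temporal_frequency(8, 1.5)    # stick hits per second at walking pace
running = temporal_frequency(8, 3.0)    # doubling the speed doubles the rate
closer = temporal_frequency(16, 1.5)    # so does halving the railing spacing
```

The last two cases give the same rate, which is the point of the analogy: the temporal frequency alone cannot distinguish finer detail from faster scanning.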

images

Figure 12.28    Frequency response of laser pickup. Maximum operating frequency is about half of cut-off frequency Fc.

In magnetic recorders and vinyl disk recorders there is at least a frequency band where the response is reasonably flat. CD is basically a phase contrast scanning microscope. Figure 12.28 shows that the frequency response falls progressively from DC to the optical cutoff frequency which is given by:

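$$ F_c = \frac{2\,\mathrm{NA}}{\lambda} \times v $$

where NA is the numerical aperture of the objective, λ is the wavelength of the light and v is the track velocity.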

The minimum linear velocity of CD is 1.2 m/s, giving a cutoff frequency of

$$ F_c = \frac{2 \times 0.45 \times 1.2}{780 \times 10^{-9}} \approx 1.38\ \text{MHz} $$

Actual measurements reveal that the optical response is only a little worse than the theory predicts. This characteristic has a large bearing on the type of modulation schemes which can be successfully employed. Clearly, to obtain any noise immunity, the maximum operating frequency must be rather less than the cutoff frequency. The maximum frequency used in CD is 720 kHz, which represents an absolute minimum wavelength of 1.666 μm, or a bump length of 0.833 μm, for the lowest permissible track speed of 1.2 m/s used on the full-length 75 min-playing disks. One-hour-playing disks have a minimum bump length of 0.972 μm at a track velocity of 1.4 m/s. The maximum frequency is the same in both cases. This maximum frequency should not be confused with the bit rate of CD since this is different owing to the channel code used. Figure 12.26 showed a maximum-frequency recording, and the physical relationship of the intensity function to the track dimensions.
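The figures in this paragraph can be checked with a short calculation. The sketch below assumes the usual diffraction-limited cut-off relation, Fc = (2 × NA / wavelength) × velocity, and takes the NA of 0.45 and wavelength of 780 nm quoted elsewhere in this chapter; the function and constant names are hypothetical.

```python
NA = 0.45                # numerical aperture quoted for CD
WAVELENGTH_M = 780e-9    # readout wavelength
F_MAX_HZ = 720e3         # maximum frequency recorded on CD


def cutoff_hz(velocity_m_s, na=NA, wavelength_m=WAVELENGTH_M):
    """Optical cut-off frequency, assuming Fc = 2 * NA / wavelength * velocity."""
    return 2.0 * na / wavelength_m * velocity_m_s


def min_bump_length_um(velocity_m_s, f_max_hz=F_MAX_HZ):
    """Shortest bump: half of the shortest recorded wavelength (one bump plus one land)."""
    return velocity_m_s / f_max_hz / 2.0 * 1e6


for v in (1.2, 1.4):
    print(f"{v} m/s: Fc = {cutoff_hz(v) / 1e6:.2f} MHz, "
          f"min bump = {min_bump_length_um(v):.3f} um, "
          f"Fmax/Fc = {F_MAX_HZ / cutoff_hz(v):.2f}")
```

At 1.2 m/s the maximum frequency comes out at roughly half the cut-off frequency, in line with Figure 12.28; at 1.4 m/s the margin is a little greater because the cut-off frequency rises with track velocity while the maximum frequency stays the same.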

In a CD player, the source of light is a laser, and this does not produce a beam of uniform intensity. It is more intense in the centre than it is at the edges, and this has the effect of slightly increasing the half-power diameter of the intensity function. The effect is analogous to the effect of window functions in FIR filters (see Chapter 3). The intensity function can also be enlarged if the lens used suffers from optical aberrations. This was studied by Maréchal11 who established criteria for the accuracy to which the optical surfaces of the lens should be made to allow the ideal Airy distribution to be obtained. CD player lenses must meet the Maréchal criterion. With such a lens, the diameter of the distribution function is determined solely by the combination of numerical aperture (NA) and the wavelength. When the size of the spot is as small as the NA and wavelength allow, the optical system is said to be diffraction limited. Figure 12.27 showed how numerical aperture is defined, and illustrates that the smaller the spot needed, the larger must be the NA. Unfortunately the larger the NA, the more obliquely to the normal the light arrives at the focal plane and the smaller the depth of focus will be. This was investigated by Hopkins,12 who established the depth of focus available for a given NA. CD players have to use an NA of 0.45 which is a compromise between a small spot and an impossibly small depth of focus.13 The later DVD uses an NA of 0.6 in conjunction with a shorter-wavelength laser. This allows a significant reduction in the size of the recorded bumps and a corresponding increase in storage density. Essentially the information layer of DVD is a scaled-down CD.
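The scaling between CD and DVD can be estimated from the proportionality of spot size to wavelength divided by NA. In the sketch below the DVD wavelength of 650 nm is an assumed figure, since the text says only that it is shorter; the constant of proportionality is omitted because it cancels in the ratio, and the function name is hypothetical.

```python
def spot_scale_nm(wavelength_nm, na):
    """Diffraction-limited spot size scales as wavelength / NA (constant factor omitted)."""
    return wavelength_nm / na


cd = spot_scale_nm(780.0, 0.45)
dvd = spot_scale_nm(650.0, 0.60)   # 650 nm is an assumption, not a figure from the text

linear_ratio = cd / dvd            # how much smaller the DVD spot is in one dimension
areal_ratio = linear_ratio ** 2    # corresponding gain in spot area
```

On these assumptions the spot shrinks by a factor of about 1.6 linearly, or about 2.6 in area. This accounts only for the optical part of the increase in density; any gain from other aspects of the format is outside the scope of the sketch.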

The intensity function will also be distorted and grossly enlarged if the optical axis is not normal to the medium. The initial effect is that the energy in the first bright ring increases strongly in one place and results in a secondary peak adjacent to the central peak. This is known as coma and its effect is extremely serious as the enlargement of the spot restricts the recording density. The larger the NA, the smaller becomes the allowable tilt of the optical axis with respect to the medium before coma becomes a problem. With the NA of CD this angle is less than a degree.13

Numerical aperture is defined as the sine of the angle between the optical axis and rays converging from the perimeter of the lens. It will be apparent that there are many combinations of lens diameter and focal length which will have the same NA. As the difficulty of manufacture, and consequently the cost, of a lens meeting the Maréchal criterion increases disproportionately with size, it is advantageous to use a small lens of short focal length, mounted close to the medium and held precisely perpendicular to the medium to prevent coma. As the lens needs to be driven along its axis by a servo to maintain focus, the smaller lens will facilitate the design of the servo by reducing the mass to be driven. It is extremely difficult to make a lens which meets the Maréchal criterion over a range of wavelengths because of dispersion. The use of monochromatic light eases the lens design as it has only to be correct for one wavelength.
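That many lenses share the same NA follows directly from the geometry, as the sketch below shows. It takes NA as the sine of the half-angle of the converging cone, as in Figure 12.27, and treats the lens as a simple thin lens focusing in air; the dimensions and the function name are hypothetical.

```python
import math


def numerical_aperture(lens_diameter_mm, focal_length_mm):
    """NA = sin(theta), where theta is the half-angle of the converging cone (thin lens, in air)."""
    theta = math.atan((lens_diameter_mm / 2.0) / focal_length_mm)
    return math.sin(theta)


small_lens = numerical_aperture(4.0, 4.0)   # hypothetical small lens close to the disk
large_lens = numerical_aperture(8.0, 8.0)   # twice the size at twice the distance
```

Both lenses give the same NA, a little under 0.45, because only the ratio of diameter to focal length matters. The smaller lens is the one preferred on grounds of cost and moving mass.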

At the high recording density of CD, there is literally only one scanning mechanism with which all the optical criteria can be met and this is the approach known from the scanning microscope. The optical pickup is mounted in a carriage which can move it parallel to the medium in such a way that the optical axis remains at all times parallel to the axis of rotation of the medium. The latter rotates as the pickup is driven away from the axis of rotation in such a way that a spiral track on the disk is followed. The pickup contains a short focal length lens of small diameter which must therefore be close to the disk surface to allow a large NA. All high-density optical recorders operate on this principle in which the readout of the carrier is optical but the scanning is actually mechanical.

12.10 How optical disks are made

The steps used in the production of CDs will next be outlined. Prerecorded MiniDiscs are made in an identical fashion except for detail differences which will be noted. MO disks need to be grooved so that the track-following system will work. The grooved substrate is produced in a similar way to a CD master, except that the laser is on continuously instead of being modulated with a signal to be recorded. As stated, CD is replicated by moulding, and the first step is to produce a suitable mould. This mould must carry deformities of the correct depth for the standard wavelength to be used for reading, and as a practical matter these deformities must have slightly sloping sides so that it is possible to release the CD from the mould.

The major steps in CD manufacture are shown in Figure 12.29. The mastering process commences with an optically flat glass disk about 220 mm in diameter and 6 mm thick. The blank is washed first with an alkaline solution, then with a fluorocarbon solvent, and spun dry prior to polishing to optical flatness. A critical cleaning process is then undertaken using a mixture of de-ionized water and isopropyl alcohol in the presence of ultrasonic vibration, with a final fluorocarbon wash. The blank must now be inspected for any surface irregularities which would cause data errors. This is done by using a laser beam and monitoring the reflection as the blank rotates. Rejected blanks return to the polishing process, those which pass move on, and an adhesive layer is applied followed by a coating of positive photoresist. This is a chemical substance which softens when exposed to an appropriate intensity of light of a certain wavelength, typically ultraviolet. Upon being thus exposed, the softened resist will be washed away by a developing solution down to the glass to form flat-bottomed pits whose depth is equal to the thickness of the undeveloped resist. During development the master is illuminated with laser light of a wavelength to which it is insensitive. The diffraction pattern changes as the pits are formed. Development is arrested when the appropriate diffraction pattern is obtained.14 The thickness of the resist layer must be accurately controlled, since it affects the height of the bumps on the finished disk, and an optical scanner is used to check that there are no resist defects which would cause data errors or tracking problems in the end product. Blanks which pass this test are oven-cured, and are ready for cutting. Failed blanks can be stripped of the resist coating and used again.


Figure 12.29    The many stages of CD manufacture, most of which require the utmost cleanliness.


Figure 12.30    CD cutter. The focus subsystem controls the spot size of the main cutting laser on the photosensitive blank. Disc and traverse motors are coordinated to give constant track pitch and velocity. Note that the power of the focus laser is insufficient to expose the photoresist.

The cutting process is shown in simplified form in Figure 12.30. A continuously operating helium cadmium15 or argon ion16 laser is focused on the resist coating as the blank revolves. Focus is achieved by a separate helium neon laser sharing the same optics. The resist is insensitive to the wavelength of the He–Ne laser. The laser intensity is controlled by a device known as an acousto-optic modulator which is driven by the encoder. When the device is in a relaxed state, light can pass through it, but when the surface is excited by high-frequency vibrations, light is scattered. Information is carried in the lengths of time for which the modulator remains on or remains off. As a result, as the disk turns, the deformities produced in the resist while the modulator allows light to pass are separated by areas left unaffected while the modulator is shut off. Information is carried solely in the variations of the lengths of these two areas.

The laser makes its way from the inside to the outside as the blank revolves. As the radius of the track increases, the rotational speed is proportionately reduced so that the velocity of the beam over the disk remains constant. This constant linear velocity (CLV) results in rather longer playing time than would be obtained with a constant speed of rotation. Owing to the minute dimensions of the track structure, the cutter has to be constructed to extremely high accuracy. Air bearings are used in the spindle and the laser head, and the whole machine is resiliently supported to prevent vibrations from the building from affecting the track pattern.
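The relationship between track radius and spindle speed under CLV can be illustrated numerically. The following is a minimal sketch only: the track velocity of 1.2 m/s and the radii used are illustrative assumptions which are not given in the text above, and the function name is hypothetical.

```python
import math

ASSUMED_TRACK_VELOCITY = 1.2  # m/s; an illustrative assumption, not a figure from the text


def spindle_rpm(track_velocity_m_s, radius_m):
    """Rotational speed in rev/min needed to hold the track velocity at a given radius."""
    circumference_m = 2.0 * math.pi * radius_m
    return track_velocity_m_s / circumference_m * 60.0


# Assumed inner, middle and outer radii in millimetres.
for radius_mm in (25, 40, 58):
    rpm = spindle_rpm(ASSUMED_TRACK_VELOCITY, radius_mm / 1000.0)
    print(f"radius {radius_mm} mm: {rpm:.0f} rev/min")
```

Doubling the radius halves the required rotational speed, which is the proportional reduction described above.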

Early CD cutters worked in real time, but subsequently the operating speed has been increased dramatically to increase the throughput.

As the player is a phase contrast microscope, it must produce an intensity function which straddles the deformities. As a consequence the intensity function which produces the deformities in the photoresist must be smaller in diameter than that in the reader. This is conveniently achieved by using a shorter wavelength of 400–500 nm from a helium–cadmium or argon–ion laser combined with a larger lens aperture of 0.9. Such lasers and lenses are expensive, but are only needed for the mastering process.

It is a characteristic of photoresist that its development rate is not linearly proportional to the intensity of light. This non-linearity is known as ‘gamma’. As a result there are two intensities of importance when scanning photoresist; the lower sensitivity, or threshold, below which no development takes place, and the upper threshold above which there is full development. As the laser light falling on the resist is an intensity function, it follows that the two thresholds will be reached at different diameters of the function. It can be seen in Figure 12.31 that advantage is taken of this effect to produce tapering sides to the pits formed in the resist. In the centre, the light is intense enough to fully develop the resist right down to the glass. This gives the deformity a flat bottom. At the edge, the intensity falls and as some light is absorbed by the resist, the diameter of the resist which can be developed falls with depth in the resist. By controlling the intensity of the laser, and the development time, the slope of the sides of the pits can be controlled.

In summary, the resist thickness controls the depth of the pits, the cutter laser wavelength and the NA of the objective together control the width of the pits in the radial direction, and the laser intensity and sensitivity of the resist together with the development time control the slope. The length of the pits in the tangential direction, i.e. along the track, is controlled by the speed of the disk past the objective and the length of time for which the modulator allows light to pass. The space between the pits along the track is controlled by the speed of the disk past the objective and the length of time for which the modulator blocks the laser light. In practice all these values are constant for a given cutting process except for the times for which the modulator turns on or off. As a result pits of constant depth and cross-section are formed, and only their length and the space between them along the track are changed in order to carry information.
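The dependence of pit and land length on disk speed and modulator timing amounts to a simple product, sketched below. The track velocity of 1.2 m/s is an assumed figure for illustration, and the time step T is taken as half a cycle of the 2.16 MHz master clock described later in this chapter; the function name is hypothetical.

```python
ASSUMED_TRACK_VELOCITY = 1.2        # m/s; illustrative assumption
T_SECONDS = 1.0 / (2 * 2.16e6)      # half a cycle of a 2.16 MHz clock, about 231 ns


def feature_length_um(track_velocity_m_s, modulator_time_s):
    """Length along the track, in micrometres, written while the modulator holds one state."""
    return track_velocity_m_s * modulator_time_s * 1e6


# Shortest (3T) and longest (11T) modulator periods.
for periods in (3, 11):
    length = feature_length_um(ASSUMED_TRACK_VELOCITY, periods * T_SECONDS)
    print(f"{periods}T: {length:.2f} um")
```

The same function serves for pits and for the spaces between them, since only the modulator state differs.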


Figure 12.31    The two levels of exposure sensitivity of the resist determine the size and edge slope of the bumps in the CD. (a) Large exposure results in large bump with gentle slope; (b) less exposure results in smaller bump with steeper sloped sides.

The specified wavelength of 780 nm and the numerical aperture of 0.45 used for playback result in an Airy function where the half-power level is at a diameter of about 1 μm. The first dark ring will be at about 1.9 μm diameter. As the illumination follows an intensity function, it is really meaningless to talk about spot size unless the relative power level is specified. The analogy is quoting frequency response without dB limits. Allowable crosstalk between tracks then determines the track pitch. The first ring outside the central disk carries some 7 per cent of the total power, and limits crosstalk performance. The track spacing is such that with a slightly defocused beam and a slight tracking error, crosstalk due to adjacent tracks is acceptable. Since aberrations in the objective will increase the spot size and crosstalk, the CD specification requires the lens to be within the Maréchal criterion. Clearly the numerical aperture of the lens, the wavelength of the laser, the refractive index and thickness of the disk and the height and size of the bumps must all be simultaneously specified.

The master recording process has produced a phase structure in relatively delicate resist, and this cannot be used for moulding directly. Instead a thin metallic silver layer is sprayed onto the resist to render it electrically conductive so that electroplating can be used to make robust copies of the relief structure. This conductive layer then makes the resist optically reflective and it is possible to ‘play’ the resist master for testing purposes. However, it cannot be played by the cutter, as the beam in the cutter is too small and it would not straddle the pits. The necessary phase contrast between light energy leaving the lands and pits would not then be achieved. Unfortunately the resist master cannot be played by a normal CD pickup either, because the pits in the resist are full of air, in which the velocity (and therefore the wavelength) of light is different from the value it will have in the finished disk when the pits are filled with plastic. Thus the correct pit depth for a plastic disk is incorrect in air; a third type of optical system is needed to test play a resist master in which the wavelength is shorter than the wavelength used in the normal player. This would produce an Airy pattern which was too small with a conventional lens aperture, and so a lens of smaller aperture is needed to produce a spot of the correct diameter from the ‘wrong’ wavelength.

The electrically conductive resist master is then used as the cathode of an electroplating process where a first layer of metal is laid down over the resist, conforming in every detail to the relief structure thereon. This metal layer can then be separated from the glass; the resist is dissolved away and the silver is recovered, leaving a laterally inverted phase structure on the surface of the metal, in which the pits in the photoresist have become bumps in the metal. From this point on, the production of CD is virtually identical to the replication process used for vinyl disks, save only that a good deal more precision and cleanliness are needed.

This first metal layer could itself be used to mould disks, or it could be used as a robust submaster from which many stampers could be made by pairs of plating steps. The first metal phase structure can itself be used as a cathode in a further electroplating process in which a second metal layer is formed having a mirror image of the first. A third such plating step results in a stamper. The decision to use the master or substampers will be based on the number of disks and the production rate required.

The master is placed in a moulding machine, opposite a flat plate. A suitable quantity of molten plastic is injected between, and the plate and the master are forced together. The flat plate renders one side of the disk smooth, and the bumps in the metal stamper produce pits in the other surface of the disk. The surface containing the pits is next metallized, with any good electrically conductive material, typically aluminium. This metallization is then covered with a lacquer for protection. In the case of CD, the label is printed on the lacquer. In the case of a prerecorded MiniDisc, the ferrous hub needs to be applied prior to fitting the cartridge around the disk.

12.11 Direct metal mastering

An alternative method of CD duplication has been developed by Teldec.17 As Figure 12.32 shows, the recording process is performed by a diamond stylus which embosses the pit structure into a thin layer of copper. The stylus is driven by a piezoelectric element using motional feedback. The element is supported in an elastic medium so that its own centre of gravity tends to remain stationary. Application of drive voltage makes the element contract, lifting the stylus completely off the copper between pits. Since the channel code of CD is DC-free, the stylus spends exactly half its time in contact with the copper, and therefore the embossing force is exactly twice the static force applied. The pits produced are vee-shaped, rather than the flat-bottomed type produced by the photoresist method, but this is not of much consequence, since the diffraction-limited optics of the player cannot determine any more about the pit than its presence or absence. It is claimed that this form of pit is easier to mould.


Figure 12.32    In direct metal mastering, a piezoelectric element embosses a copper layer, which is plated over with nickel and subsequently etched away to make a father, or for direct use as a stamper for short runs.

A glass master disk is prepared as before, and following a thin separation layer, a coating of copper about 300 nm thick is sputtered on. The recording is made on this copper layer, which is then gold-coated, and nickel-plated to about 0.25 mm thick. The resultant metal sandwich can then be peeled off the glass master, which can be re-used. The recording is completely buried in the sandwich, and can be stored or transported in this form.

In order to make a stamper, the copper is etched away with ferric chloride to reveal the gold-coated nickel. This can then be used as an electroplating father, as before. For short production runs, the nickel layer can be used as a stamper directly, but it is recommended that the gold layer be replaced by rhodium for this application.

12.12 MiniDisc read/write in detail

MiniDisc has to operate under a number of constraints which largely determine how the read/write pickup operates. A prerecorded MiniDisc has exactly the same track dimensions as CD so that it can be mastered on similar equipment. When playing a prerecorded disk, the MiniDisc player pickup has to act in the same way as a CD pickup. This determines the laser wavelength, the NA of the objective and the effective spot diameter on the disk. This spot diameter must also be used when the pickup is operating with an MO disk.

Figure 12.33(a) shows to scale a CD track being played by a standard pickup. The readout spot straddles the track so that two antiphase components of reflected light can be obtained. As was explained in section 12.10, the CD mastering cutter must use a shorter wavelength and larger NA than the subsequent player in order to ‘cut’ the small pits in the resist. Figure 12.33(b) shows that the cutting process convolves the laser-enabling pulse with the spot profile so that the pit is actually longer than the pulse duration by a spot diameter. The effect is relatively small in CD because of the small spot used in the cutter.

When using an MO disk, the tracks recorded will be equal in width to the spot diameter and so will be wider than CD tracks as Figure 12.33(c) shows. MO writing can be performed in two ways. The conventional method used in computer disks is to apply a steady current to the coil and to modulate the laser. This is because the coil has to be some distance from the magnetic layer of the disk and must be quite large. The inductance of the coil is too great to allow it to be driven at the data frequency in computer applications.


Figure 12.33    (a) A CD track and readout spot to scale. (b) The CD track is cut by a smaller spot, but the process results in pits which are longer than the pulse duration. (c) An MO track and readout spot to scale. If this spot is pulsed for writing, the magnetized areas are much larger than the pulse period and density is compromised as in (d). If, however, the magnetic field is modulated, as in (e), the recording is made at the trailing edge of the spot and short wavelengths can be used.

Figure 12.33(d) shows what would happen if MD used laser modulation. The spot profile is convolved with the modulation pulse as for a CD cutter, but the spot is the same size as a replay spot. As a result the magnetized area is considerably longer than the modulation pulse. The shortest wavelengths of a recording could not be reproduced by this system, and it would be necessary to increase the track speed, reducing the playing time.

The data rate of MiniDisc is considerably lower than is the case for computer disks, and it is possible to use magnetic field modulation instead of laser modulation. The laser is then on continuously, and the spot profile is no longer convolved with the modulation. The recording is actually made at the instant the magnetic layer cools below the Curie temperature of about 180°C just after the spot has passed. The state of the magnetic field at this instant is preserved on the disk. Figure 12.33(e) shows that the recorded wavelength can be much shorter because the recording is effectively made by the trailing edge of the spot. This makes the ends of the recorded flux patterns somewhat crescent-shaped. Thus a spot the same size as a CD readout spot can be made to record flux patterns as short as the pits made by the smaller spot of a cutter. The recordable MiniDisc can thus have the same playing time as a prerecorded disk. The optical pickup is simplified because no laser modulator is needed.

The magnetic layer of MO disks should show a large Kerr rotation angle in order to give an acceptable SNR on replay. A high Curie temperature requires a high recording power, but allows greater readout power to be used without fear of demagnetization. This increases the readout signal with respect to the photodiode noise. As a result the Curie temperature is a compromise. Magnetic layers with practical Curie temperatures are made from proprietary alloys of iron, cobalt, platinum, terbium, gadolinium and various other rare earths. These are all highly susceptible to corrosion in air and are also incompatible with the plastics used for moulded substrates. The magnetic layer must be protected by sandwiching it between layers of material which must be impervious to corrosive ions and yet optically transmissive. Thus only dielectrics such as silicon dioxide or aluminium nitride can be used.

The disk pickup is concerned with analysing light which has returned from within the MO layer as only this will have the Kerr rotation. Reflection from the interface between the MO layer and the dielectric overlayer will have no Kerr rotation. The optical characteristics of the dielectric layers can be used to enhance readout by reducing the latter reflection. Figure 12.34 shows that the MO disks have an optically reflective layer behind the sandwiched MO layer. The thickness of the dielectric between the MO layer and the reflector is selected such that light from the reflector is antiphase with light from the overlayer/MO layer interface and instead of being reflected back to the pickup is absorbed in the MO layer. Conversely, light originating in the MO layer and leaving in the direction of the pickup experiences constructive interference with reflected components of that light. These components which contain Kerr rotation are readily able to exit the disk. These measures enhance the ratio of the magneto-optic component to ordinary light at the pickup.


Figure 12.34    In MO disks the dielectric and reflective layers are as important as the magneto-optic layer itself.

12.13 How recordable MiniDiscs are made

Recordable MiniDiscs make the recording as flux patterns in a magnetic layer. However, the disks need to be pre-grooved so that the tracking systems described in section 12.7 can operate. The grooves have the same pitch as CD and the prerecorded MD, but the tracks are the same width as the laser spot: about 1.1 μm. The grooves are not a perfect spiral, but have a sinusoidal waviness at a fixed wavelength. Like CD, MD uses constant track linear velocity, not constant speed of rotation. When recording on a blank disk, the recorder needs to know how fast to turn the spindle to get the track speed correct. The wavy grooves will be followed by the tracking servo and the frequency of the tracking error will be proportional to the disk speed. The recorder simply turns the spindle at a speed which makes the grooves wave at the correct frequency. The groove frequency is 75 Hz; the same as the data sector rate. Thus a zero crossing in the groove signal can also be used to indicate where to start recording. The grooves are particularly important when a chequer-boarded recording is being replayed. On a CLV disk, every seek to a new track radius results in a different track speed. The wavy grooves allow the track velocity to be monitored as soon as a new track is reached.
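Because the groove frequency is proportional to disk speed, the correction needed to bring the spindle to the right speed follows from a simple ratio. The sketch below is a one-step illustration under that proportionality only; a real recorder would use a closed-loop servo, and the function name and the example figures are hypothetical.

```python
TARGET_GROOVE_HZ = 75.0  # groove frequency quoted in the text


def corrected_spindle_speed(current_speed, measured_groove_hz, target_hz=TARGET_GROOVE_HZ):
    """Scale the spindle speed so that the groove signal lands on the target frequency.

    Relies on the groove frequency being proportional to disk speed, so the
    result is in whatever unit current_speed uses.
    """
    if measured_groove_hz <= 0:
        raise ValueError("no groove signal detected")
    return current_speed * target_hz / measured_groove_hz


# Example: after a seek the grooves wave at 100 Hz while the spindle turns at 600 rev/min.
print(corrected_spindle_speed(600.0, 100.0))
```

In the example the grooves wave too fast by a third, so the spindle must slow to three-quarters of its present speed.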

The pre-grooves are moulded into the plastics body of the disk when it is made. The mould is made in a similar manner to a prerecorded disk master, except that the laser is not modulated and the spot is larger. The track velocity is held constant by slowing down the resist master as the radius increases, and the waviness is created by injecting 75 Hz into the lens radial positioner. The master is developed and electroplated as normal in order to make stampers. The stampers make pre-grooved disks which are then coated by vacuum deposition with the MO layer, sandwiched between dielectric layers. The MO layer can be made less susceptible to corrosion if it is smooth and homogeneous. Layers which contain voids, asperities or residual gases from the coating process present a larger surface area for attack. The life of an MO disk is affected more by the manufacturing process than by the precise composition of the alloy.

Above the sandwich an optically reflective layer is applied, followed by a protective lacquer layer. The ferrous clamping plate is applied to the centre of the disk, which is then fitted in the cartridge. The recordable cartridge has a double-sided shutter to allow the magnetic head access to the back of the disk.

12.14 Channel code of CD and MiniDisc

CD and MiniDisc use the same channel code known as EFM. This was optimized for the optical readout of CD and prerecorded MiniDisc, but is also used for the recordable version of MiniDisc for simplicity. DVD uses a refinement of the CD code called EFM+.

The frequency response falling to the optical cut-off frequency is only one of the constraints within which the modulation scheme has to work. There are a number of others. In all players the tracking and focus servos operate by analysing the average amount of light returning to the pickup. If the average amount of light returning to the pickup is affected by the content of the recorded data, then the recording will interfere with the operation of the servos. Debris on the disk surface affects the light intensity and means must be found to prevent this reducing the signal quality excessively.

Optical disks are serial media which produce on replay only a single voltage varying with time. If it is attempted to simply serialize raw data, a process known as direct recording, it is not difficult to see what will happen in the case where the data are digital audio samples where the audio is muted. Upon serializing the all-zeros code for muting, the serial waveform is simply a steady logical low level, and in the absence of a separate clock it is impossible to tell how many zeros were present, nor in the case of CD will there be a track to follow. A similar problem would be experienced if all ones occur in the data, except that a steady high logic level results in a continuous bump. In digital logic circuits it is common to have signal lines and separate clock lines to overcome this problem, but with a single signal the separate clock is not possible.

A further problem with direct optical recording is that the average brightness of the track is a function of the relative proportion of ones and zeros. Focus and tracking servos cannot be used with direct recordings because the data determine the average brightness and confuse the servos. Chapter 6 discussed modulation schemes known as DC-free codes. If such a code is used, the average brightness of the track is constant and independent of the data bits.

Figure 12.35(a) shows the replay signal from the pickup being compared with a threshold voltage in order to recover a binary waveform from the analog pickup waveform, a process known as slicing. If the light beam is partially obstructed by debris, the pickup signal level falls, and the slicing level is no longer correct and errors occur. If, however, the code is DC-free, the waveform from the pickup can be passed through a high-pass filter (e.g. a series capacitor) and Figure 12.35(b) shows that this rejects the falling level and converts it to a reduction in amplitude about the slicing level so that the slicer still works properly. This step cannot be performed unless a DC-free code is used.
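The benefit of high-pass filtering a DC-free signal before slicing can be demonstrated with a small simulation. The sketch below is illustrative only: the pickup model, the envelope representing debris and all names are assumptions, and the high-pass filter is approximated by subtracting a moving average over one waveform period rather than by modelling a series capacitor.

```python
def square_wave(n):
    """DC-free test waveform: runs of four samples alternating between +1 and -1."""
    return 1 if (n // 4) % 2 == 0 else -1


def envelope(n):
    """Assumed light level: full at first, then falling slowly to half as debris intrudes."""
    if n < 200:
        return 1.0
    if n < 300:
        return 1.0 - 0.005 * (n - 200)
    return 0.5


def slice_signal(signal, threshold):
    """Compare each sample with the slicing level to recover a two-state waveform."""
    return [1 if value > threshold else -1 for value in signal]


def high_pass(signal, window):
    """Stand-in high-pass filter: subtract the mean of the most recent samples."""
    filtered = []
    for n in range(len(signal)):
        recent = signal[max(0, n - window + 1):n + 1]
        filtered.append(signal[n] - sum(recent) / len(recent))
    return filtered


def count_errors(decided, reference, skip):
    """Count disagreements, ignoring the first few samples while the filter settles."""
    return sum(1 for d, r in zip(decided[skip:], reference[skip:]) if d != r)


SAMPLES = 600
WINDOW = 8  # one full period of the test waveform

data = [square_wave(n) for n in range(SAMPLES)]
pickup = [envelope(n) * (1.0 + 0.4 * data[n]) for n in range(SAMPLES)]

fixed_errors = count_errors(slice_signal(pickup, 1.0), data, WINDOW)
filtered_errors = count_errors(slice_signal(high_pass(pickup, WINDOW), 0.0), data, WINDOW)
print("fixed threshold errors:", fixed_errors)
print("high-pass then slice errors:", filtered_errors)
```

With the fixed threshold, slicing fails once the signal peaks fall below the slicing level, whereas the filtered signal continues to cross zero at the right places because only its amplitude has changed.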


Figure 12.35    A DC-free code allows signal amplitude variations due to debris to be rejected.

As the frequency response on replay falls linearly to the cut-off frequency determined by the aperture of the lens and the wavelength of light used, the shorter bumps and lands produce less modulation than longer ones. Figure 12.36(a) shows what happens to the replay waveform as a bump between two long lands is made shorter. At some point the replayed signal no longer crosses the slicing level and readout is impossible. Figure 12.36(b) shows that the same effect occurs as a land between two long bumps is made shorter. In these cases recorded frequencies have to be restricted to those which produce wavelengths long enough for the player to register. Using direct recording where, for example, lands represent a 1 and bumps represent a 0 it is clear that the length of track corresponding to a one or a zero would have to be greater than the limit at which the slicing in the player failed and this would restrict the playing time.


Figure 12.36    If the recorded waveform is not DC-free, timing errors occur until slicing becomes impossible. With a DC-free code, jitter-free slicing is possible in the presence of serious amplitude variation.

Figure 12.36(c) shows that if the recorded waveform is restricted to one which is DC-free, as the length of bumps and lands falls with rising density, the replay waveform simply falls in amplitude but the average voltage remains the same and so the slicer still operates correctly. It will be clear that by using a DC-free code correct slicing remains possible with much shorter bumps and lands than with direct recording. Thus in practical high-density optical disk players, including CD players, a DC-free code must be used.

The output of the pickup passes to two filters. A low-pass filter removes the DC-free modulation and leaves a signal which can be used for tracking, and the high-pass filter removes the effect of debris and allows the slicer to continue to function properly. Clearly direct recording of serial data from a shift register cannot be DC-free, and so it cannot be read at high density; it will not be self-clocking, it will not be resistant to errors caused by debris, and it will interfere with the operation of the servos. The solution to all these problems is to use a suitable channel code.

The concepts of channel coding were discussed in Chapter 6, in which frequency shift keying (FSK) was described. In FSK it is possible to use a larger number of different discrete frequencies, for example four frequencies allow all combinations of two bits to be conveyed, eight frequencies allow all combinations of three bits to be conveyed and so on. The channel code of CD is similar in that it is the minimal case of multi-tone FSK where only a half-cycle of each of nine different frequencies is used. These frequencies are approximately 196, 216, 240, 270, 308, 360, 430, 540 and 720 kHz and are obtained by dividing a master clock of 2.16 MHz by 11, 10, 9, 8, 7, 6, 5, 4, and 3. There are therefore nine different periods or run lengths in the CD signal, and it does not matter whether the period is the length of a land or the length of a bump. In fact the signal from a CD pickup could be inverted without making the slightest difference to the data recovery as all that is of any consequence is the time between successive zero crossings of the signal.

In run-length-limited coding of this kind, the time periods are described in a relative rather than an absolute manner. Thus if half a cycle of the master clock has a period T, then the periods or run lengths of the code can be from 3T to 11T. The run lengths are combined in ways which make the resulting waveform DC-free and so the slicer will function properly as the response falls at higher frequencies. The various frequencies or periods used in CD can be seen by examining the replay waveform from the pickup with an oscilloscope. Figure 12.37 shows the resultant eye pattern. It will be seen that the higher frequencies (period 3T) have the smallest amplitude on replay.

Note that the optical cut-off frequency of CD is only 1.4 MHz, and so it will be evident that the master clock frequency of 2.16 MHz cannot be recorded or reproduced. This is of no consequence in CD as it does not need to be recorded. 1.4 MHz is the frequency at which the depth of modulation has fallen to zero. As stated, the highest frequency which can be reliably recorded is about one half of the optical cut-off frequency. Frequencies above this replay with an amplitude so small that they have inadequate signal-to-noise ratio. It will be seen that the highest frequency in CD is 720 kHz which is about half of 1.4 MHz. Although frequencies lower than 196 kHz can be replayed easily, the clock content of lower frequencies is considered inadequate.
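The nine frequencies follow directly from dividing the master clock, and the arithmetic can be reproduced as below. This is a minimal sketch using only the 2.16 MHz clock, the divisors 3 to 11 and the 1.4 MHz cut-off quoted in the text; the variable names are hypothetical.

```python
MASTER_CLOCK_KHZ = 2160.0      # 2.16 MHz master clock from the text
OPTICAL_CUTOFF_KHZ = 1400.0    # 1.4 MHz optical cut-off from the text

# Dividing by n gives the frequency whose half-cycle lasts n periods of T.
frequencies_khz = {divisor: MASTER_CLOCK_KHZ / divisor for divisor in range(3, 12)}

for divisor in sorted(frequencies_khz, reverse=True):
    print(f"{divisor:2d}T  {frequencies_khz[divisor]:6.1f} kHz")

# Ratio of the highest recorded frequency to the optical cut-off.
highest_to_cutoff = frequencies_khz[3] / OPTICAL_CUTOFF_KHZ
print(f"highest frequency / cut-off = {highest_to_cutoff:.2f}")
```

The ratio printed on the last line is close to one half, in keeping with the statement that the highest usable frequency is about half the optical cut-off.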


Figure 12.37    The characteristic eye pattern of EFM observed by oscilloscope. Note the reduction in amplitude of the higher-frequency components. The only information of interest is the time when the signal crosses zero.

CD uses a coding scheme where combinations of the data bits to be recorded are represented by unique waveforms. These waveforms are created by combining various run lengths from 3T to 11T together to give a channel pattern which is 14T long.18 Within the run length limits of 3T to 11T, a waveform 14T long can have 267 different patterns. This is slightly more than the 256 combinations of eight data bits and so eight bits are represented by a waveform lasting 14T. Some of these patterns are shown in Figure 12.38. As stated, these patterns are not polarity conscious and they could be inverted without changing the meaning.


Figure 12.38    (a–g) Part of the codebook for EFM code showing examples of various run lengths from 3T to 11T. (h,i) Invalid patterns which violate the run-length limits.

Not all the 14T patterns used are DC-free; some spend more time in one state than the other. The recorded waveform as a whole is rendered DC-free by inserting an extra portion of waveform, known as a packing period, between the 14T channel patterns. This packing period is 3T long and may or may not contain a transition, which, if present, can be in one of three places. The packing period contains no information, but serves to control the DC content of the overall waveform.19 The packing waveform is generated in such a way that in the long term the amount of time the channel signal spends in one state is equal to the time it spends in the other state. A packing period is placed between every pair of channel patterns and so the overall length of time needed to record eight bits is 17T. Packing periods were discussed in Chapter 6.
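The quantity which the packing period is used to control can be sketched as a running sum, counting +1 for each period T spent in one state and -1 for each period spent in the other. The sketch below assumes the channel-bit convention in which a 1 denotes a transition; the function name and the example pattern are hypothetical and are not taken from the EFM codebook.

```python
def digital_sum(channel_bits, start_level):
    """Return (sum, final level) for a pattern, where each period T adds the current level.

    channel_bits is a string of '0' and '1'; a '1' toggles the level between +1 and -1.
    """
    level = start_level
    total = 0
    for bit in channel_bits:
        if bit == "1":
            level = -level
        total += level
    return total, level


pattern = "00100100000000"  # illustrative 14T pattern which is not DC-free on its own

sum_from_high, _ = digital_sum(pattern, +1)
sum_from_low, _ = digital_sum(pattern, -1)
print(sum_from_high, sum_from_low)
```

The same pattern contributes equal and opposite amounts depending on the state in which it begins. A transition placed in the preceding packing period inverts that starting state, which is how the packing waveform can steer the long-term sum towards zero.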

CD is recorded using such patterns where the lengths of bumps and lands are modulated in ideally discrete steps. The simplest way in which such patterns can be generated is to use a look-up table which converts the data bits to a control code for a programmable waveform generator. As stated, the polarity of the CD waveform is irrelevant. What matters on the disk are the lengths of the bumps or lands. The change of state in the signal sent to the cutter laser is called a transition. Clearly if a bump is being cut, it will be terminated by interrupting the light beam. If a land is being recorded, it will be terminated by allowing through the light beam. Both of these are classified as a transition, therefore it is logical for the control code to cause transitions rather than to control the waveform level as it is not concerned with the polarity of the waveform. This is conveniently achieved by controlling the cutter laser with the output waveform of a JK type bistable as shown in Figure 12.39. A bistable of this kind can be configured to have a data input and a clock input. If the data input is 0, there is no effect on the output when the clock edge arrives, whereas if the data input is 1 the output changes state when the clock edge arrives. The change of state causes a transition on the disk. If the clock has a period of T, at each channel time period or detent the output waveform will contain a transition if the control code is 1 or not if it is 0.
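The action of the bistable can be modelled in a few lines. This is a sketch of the toggling behaviour described above rather than of any particular circuit; the function names and the example channel bits are hypothetical.

```python
from itertools import groupby


def channel_bits_to_waveform(channel_bits, initial_level=0):
    """Toggle the output level at every channel bit which is 1; hold it when the bit is 0."""
    level = initial_level
    waveform = []
    for bit in channel_bits:
        if bit == 1:
            level ^= 1  # a 1 causes a transition, as in the bistable
        waveform.append(level)
    return waveform


def run_lengths(waveform):
    """Lengths, in periods of T, of successive stretches at a constant level."""
    return [len(list(group)) for _, group in groupby(waveform)]


bits = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
normal = channel_bits_to_waveform(bits, initial_level=0)
inverted = channel_bits_to_waveform(bits, initial_level=1)
print(normal, run_lengths(normal))
print(inverted, run_lengths(inverted))
```

Starting the bistable in the opposite state inverts the whole waveform but leaves the run lengths unchanged, which is why the polarity of the CD waveform is irrelevant.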


Figure 12.39    A bistable is necessary to convert a stream of channel bits to a channel-coded waveform. It is the waveform which is recorded not the channel bits.

The control code is a binary word having fourteen bits which are known in the art as channel bits or binits. Thus a group of eight data bits is represented by a code of fourteen channel bits, hence the name of eight to fourteen modulation (EFM). The use of groups gives rise to the generic name of group code recording (GCR). It is a common misconception that the channel bits of a group code are recorded; in fact they are simply a convenient but not essential way of synthesizing a coded waveform having uniform time steps. It should be clear that channel bits cannot be recorded as they have a rate of 4.3 Mbits/s whereas the optical cut-off frequency of CD is only 1.4 MHz.

Another common misconception is that channel bits are data. If channel bits were data, all combinations of fourteen bits, or 16 384 different values could be used. In fact only 267 combinations produce waveforms which can be recorded.
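The figure of 267 can be checked by enumeration. The sketch below assumes the channel-bit reading of the 3T to 11T limits, namely that a 1 denotes a transition, that at least two 0s must separate successive 1s, and that no run of 0s anywhere in the pattern, including at its ends, may exceed ten. The handling of the end runs is an assumption of this sketch, and the function names are hypothetical.

```python
def zero_runs(bits):
    """Lengths of the runs of 0s before, between and after the 1s in a bit string."""
    return [len(run) for run in bits.split("1")]


def obeys_run_length_limits(bits, min_zeros=2, max_zeros=10):
    """True if the pattern keeps transitions at least 3T and at most 11T apart."""
    runs = zero_runs(bits)
    interior = runs[1:-1]  # only runs enclosed by two 1s have a lower limit
    return (all(run >= min_zeros for run in interior)
            and all(run <= max_zeros for run in runs))


all_patterns = [format(value, "014b") for value in range(2 ** 14)]
valid_patterns = [p for p in all_patterns if obeys_run_length_limits(p)]
print(len(all_patterns), len(valid_patterns))
```

Under these assumptions the enumeration leaves 267 of the 16 384 possible fourteen-bit words, in agreement with the figure quoted above.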

In a practical CD modulator, the eight-bit data symbols to be recorded are used as the address of a look-up table which outputs a fourteen-bit channel bit pattern. As the highest frequency which can be used in CD is 720 kHz, transitions cannot be closer together than 3T and so successive 1s in the channel bit stream must have two or more zeros between them. Similarly transitions cannot be further apart than 11T or there will be insufficient clock content. Thus there cannot be more than 10 zeros between channel 1s. Whilst the look-up table can be programmed to prevent code violations within the 14T pattern, they could occur at the junction of two successive patterns. Thus a further function of the packing period is to prevent violation of the run-length limits. If the previous pattern ends with a transition and the next begins with one, there will be no packing transition and so the 3T minimum requirement can be met. If the patterns either side have long run lengths, the sum of the two might exceed 11T unless the packing period contained a transition. In fact the minimum run-length limit could be met with 2T of packing, but the requirement for DC control dictated 3T of packing.
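The role of the packing period in preventing run-length violations at pattern junctions can be sketched by trying each possible packing pattern in turn. The example below deals with the run-length limits only and makes no attempt at DC control; the function names and the example patterns are hypothetical and are not taken from the EFM codebook.

```python
PACKING_CANDIDATES = ("000", "100", "010", "001")  # no transition, or one in any of three places


def obeys_run_length_limits(bits, min_zeros=2, max_zeros=10):
    """True if 1s are separated by at least two 0s and no run of 0s exceeds ten."""
    runs = [len(run) for run in bits.split("1")]
    interior = runs[1:-1]  # runs enclosed by two 1s
    return (all(run >= min_zeros for run in interior)
            and all(run <= max_zeros for run in runs))


def allowed_packing(previous_pattern, next_pattern):
    """Packing patterns which leave the joined channel bits within the run-length limits."""
    return [packing for packing in PACKING_CANDIDATES
            if obeys_run_length_limits(previous_pattern + packing + next_pattern)]


# Previous pattern ends with a transition and the next begins with one.
print(allowed_packing("00100100000001", "10000100000000"))

# Long runs either side of the junction.
print(allowed_packing("00100100000000", "00000000100100"))
```

In the first case only the packing period without a transition is acceptable; in the second a transition is obligatory, since eight 0s either side of three more would otherwise exceed the upper limit. Where more than one candidate survives, the choice is available for DC control.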

The coding of CD may appear complex, but this is because it was designed to offer the required playing time on a disk of restricted size. It does this by reducing the frequency of the recorded signal compared to the data frequency. Eight data bits are represented by a length of track corresponding to 17T. The shortest run length in a conventional recording code such as MFM would be the length of one bit, and as eight bits require 17T of track, the length of one bit would be 17/8T or 2.125T. Using the CD code the shortest run length is 3T. Thus the highest frequency in the CD code is less than that of an MFM recording, so a density improvement of 3/2.125 or 1.41 is obtained. Thus CD can record 41 per cent more using EFM than if it used MFM. A CD can play for 75 minutes maximum. Using MFM a CD would only play for 53 minutes.
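The arithmetic of the density comparison is easily reproduced. The variable names below are illustrative only.

```python
track_per_byte_T = 17                        # 14T symbol plus 3T packing
mfm_shortest_run_T = track_per_byte_T / 8    # one data bit occupies 2.125T
efm_shortest_run_T = 3                       # minimum run length of EFM

gain = efm_shortest_run_T / mfm_shortest_run_T
print(round(gain, 2))       # 1.41
print(round(75 / gain))     # 53 (minutes of playing time with MFM)
```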

The high-pass filtered DC-free signal from the CD pickup can be readily sliced back to a binary signal having transitions at the zero crossings. A group-coded waveform needs a suitably designed data separator to decode and deserialize the replay signal. When the disk is initially scanned, the data separator simply sees a single voltage varying with time, and it has no other information to go on whatsoever. The scanning of the disk will not necessarily be at the correct speed, and the transitions recovered will suffer from jitter. The jitter comes from two main sources. The first of these is variations in the thickness of the disk. Everyone is familiar with the illusion that the bottom of a shallow pond is moving when there are ripples in the water. In the same way, ripples in the disk thickness make the track appear to vary in speed. The second source is simply the production tolerance to which bump edges can be made. The replication process from master to stamper will cause some slight migration of edge position, and stampers can wear in service. In order to interpret the replay waveform in the presence of jitter, use is made of the fact that transitions ideally occur at integer multiples of T. When a real transition occurs at a time other than an exact multiple of T, it can be attributed to the nearest multiple if the jitter is not too serious, and the jitter will be completely rejected. If, however, the jitter is too great, the wrongly timed transition will be attributed to the incorrect detent, and the wrong pattern will be identified.
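Attributing each transition to the nearest multiple of T can be sketched as follows. The transition times are invented for illustration and the clock period T is assumed to be already known, which in practice requires the phase-locked loop.

```python
def run_lengths(transition_times, T):
    """Snap each transition to the nearest multiple of T and return run lengths in T."""
    detents = [round(t / T) for t in transition_times]
    return [b - a for a, b in zip(detents, detents[1:])]

# Modest jitter is rejected completely: the intended 3T, 4T and 11T runs are recovered.
print(run_lengths([0.1, 2.8, 7.3, 18.2], T=1.0))   # [3, 4, 11]
# Excessive jitter on the second transition puts it in the wrong detent.
print(run_lengths([0.1, 2.4, 7.3, 18.2], T=1.0))   # [2, 5, 11]
```

In the second case the result contains a 2T run, which cannot occur in EFM, and so the wrong pattern would be identified.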

A phase-locked loop is an essential part of a practical high-density data separator. The operation of a phase-locked loop was described in section 5.9. If the input is a group-coded signal, it will contain transitions at certain multiples of the basic time period T, but not at every cycle owing to the run-length limits. The reason for the use of multiples of a basic time period in group codes is simply that a phase-locked loop can lock to such a waveform. When a transition occurs, a phase comparison can be made, but when no transition occurs, there is no phase comparison but the VCO will continue to run at the same frequency like a flywheel. The maximum run-length limit of 11T in CD is to ensure that the VCO does not have to run for too long between phase corrections. As a result, the VCO recreates a continuous clock from the intermittent clock content of the channel-coded signal. In a group-coded system, the VCO recreates the channel bit rate. In CD this is the only way in which the channel bit rate can be reproduced, as the disk itself cannot record the channel bit rate.

Jitter in the transition timing is handled by inserting a low-pass or averaging filter between the phase detector and the VCO and/or by increasing the division ratio in the feedback. Both of these steps increase the flywheel inertia. The VCO then runs at the average frequency obtained from many channel transitions and the jitter is substantially removed from the re-created clock. With a jitter-free continuous clock available from the VCO, the actual time at which a transition occurs can differ from the ideal by a considerable amount. When the recording was made, the transitions were intended to be spaced at multiples of the channel bit period, and the run lengths in the code ideally should be discrete. In practice the analog nature of the channel causes the run lengths to vary. A certain amount of variation can be rejected in a properly engineered channel code. The VCO is used to create windows called detents along the time axis of the replay signal. An ideal jitter-free signal would have a transition in the centre of the window, but real transitions may occur before or after the centre. As long as the variation is within the window, it is rejected, but if the jitter were so large that a transition crossed into an adjacent window, an error would occur. It was shown in Chapter 6 that the jitter window of EFM is 8/17 of a data bit. Transitions on a CD replay signal can be up to plus or minus 4/17 of a data bit period out of time before errors are caused. This jitter rejection is a requirement of the CD system because such jitter actually occurs on real disks as has been described. Indeed if it did not, the designers would have used a code with less jitter tolerance and even higher recording density. Thus it is simplistic to regard the surface of a high-density recording as a nice neat set of areas like toy bricks. In practice the manufacturing tolerances are eased so that the recording becomes cheaper even if the transitions become a little jittery. 
Provided the channel code can reject the jitter, the extra density makes the product more cost-effective. The deformities on real CDs are not exact multiples of the basic unit in practice. If this were a requirement they could never be sold on the consumer market. The jitter-rejection mechanism allows considerable production tolerances to be absorbed so that disks can be mass produced.

The length of a deformity on a CD master is affected not only by the duration of the record pulse, which can be as accurate as necessary, but also by the sensitivity of the resist and the intensity function of the laser. The pit which is formed in the resist is the result of the convolution of the rectangular pulse operating the modulator with the Airy function. Thus the pit will be longer than the period of the pulse would suggest. The pit edge is then subject to further position tolerance as a result of electroplating mothers and sons to create a large number of stampers. The stampers themselves will wear in service. The position of a transition is now subject to the tolerance of the cutting laser intensity function and state of focus, resist sensitivity, electroplating accuracy and wear and so the actual disk will be non-ideal.

The shortest deformity in CD is nominally 3T long or 3 × 8/17 data bits long. This can suffer nearly plus or minus 4/17 data bit periods of jitter at each end before it cannot be read properly. Thus in the worst case, where the leading edge was early and the trailing edge late, the deformity could be almost one-third (33 per cent) longer than the ideal. In typical production disks, the edge position is held a little more accurately than this theoretical limit in order to allow extra jitter in the replay process due to thickness ripple, coma due to warped disks or out-of-focus conditions.

Once the phase-locked loop has reached the lock condition, it outputs a clock whose frequency is proportional to the speed of the track. If the track speed is correct it will have the same frequency as the channel bit clock in the cutter. This clock can then be used to sample the sliced analog signal from the pickup. As can be seen from Figure 12.40, transitions nominally occur in the centre of a T period. If the samples are taken on the edge of every T period, a transition will be reliably detected as the difference between two successive samples even if it has positional jitter approaching plus or minus T/2. Thus the output of the sampler is a jitter-free replica of the replay signal, and in the absence of errors it will be identical to the output of the JK bistable in the cutter. The sampling clock runs at the average phase of a large number of transitions from the track. Every transition not only conveys part of the waveform representing data, but also allows the phase of the clock to be updated and so every transition can also be considered to have a synchronizing function. The 11T maximum run-length limit is necessary to ensure that synchronizing information for the VCO is regularly available in the replay waveform; a requirement that cannot be met by direct recording.

The information in the CD replay waveform is carried in the timing of the transitions, not in the polarity. It is thus necessary to create a polarity-independent signal from the sliced de-jittered replay waveform. This is done by differentiating the sampler output. Figure 12.40 shows that this can be achieved by a D-type latch and an exclusive-OR gate. The latch is clocked at the channel bit rate, and so acts as a one-bit delay. The gate compares the input and output of the delay. When they are the same, there is no transition and the gate outputs 0. When a transition passes through, the input and output of the latch will be different and the gate outputs 1. Thus some distance through the replay circuitry from the pickup, the channel bits reappear, just as they disappeared before reaching the cutter laser.

Figure 12.40    The output of the slicer is sampled at the boundary of every T period. Where successive samples differ, a channel bit 1 is generated.
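The latch and exclusive-OR gate can be modelled in a few lines. This is a sketch of the principle only; the sample values are invented and the latch is assumed to start holding the first sample.

```python
def differentiate(samples):
    """Return a channel bit 1 wherever successive samples of the sliced signal differ."""
    bits = []
    latch = samples[0]            # the D-type latch holds the previous sample
    for s in samples[1:]:
        bits.append(s ^ latch)    # exclusive-OR: 1 only when the level has changed
        latch = s
    return bits

waveform = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
inverted = [1 - s for s in waveform]
print(differentiate(waveform))   # [0, 0, 1, 0, 0, 0, 1, 0, 0]
print(differentiate(inverted))   # [0, 0, 1, 0, 0, 0, 1, 0, 0]
```

The two results are identical, showing that the output depends only on the timing of the transitions and not on the polarity.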

12.15 Deserialization

Decoding the stream of channel bits into data requires that the boundaries between successive 17T periods are identified. This is the process of deserialization. On the disk one 17T period runs straight into the next; there are no dividing marks. Symbol separation is performed by counting channel bit periods and dividing them by 17 from a known reference point. The three packing periods are discarded and the remaining 14T symbol is decoded to eight data bits. The reference point is provided by the synchronizing pattern which is given that name because its detection synchronizes the deserialization counter to the replay waveform.

Synchronization has to be as reliable as possible because if it is incorrect all the data will be corrupted up to the next sync pattern. Synchronization is achieved by the detection of a unique waveform periodically recorded on the track at a regular spacing. It must be unique in the strict sense in that nothing else can give rise to it, because the detection of a false sync is just as damaging as failure to detect a correct one. Clearly the sync pattern cannot be a data code value in CD as there would then be a Catch-22 situation. It would not be possible to deserialize the EFM symbols in order to decode them until the sync pattern had been detected, but if the sync pattern were a data code value, it could not be detected until the deserialization of the EFM waveform had been synchronized. Thus in a group code recording a data code value simply cannot be used for synchronizing. In any case it is undesirable and unnecessary to restrict the data code values which can be recorded; CD requires all 256 combinations of the eight-bit symbols recorded.

In practice CD synchronizes deserialization with a waveform which is unique in that it is different from any of the 256 waveforms which represent data. For reliability, the sync pattern should have the best signal-to-noise ratio possible, and this is obtained by making it one complete cycle of the lowest frequency (11T plus 11T) which gives it the largest amplitude and also makes it DC-free. Upon detection of the 2 × Tmax waveform, the deserialization counter which divides the channel bit count by 17 is reset. This occurs on the next system clock, which is the reason for the 0 in the sync pattern after the third 1 and before the merging bits. CD therefore uses forward synchronization and correctly deserialized data are available immediately after the first sync pattern is detected. The sync pattern is longer than the data symbols, and so clearly no data code value can create it, although it would be possible for certain adjacent data symbols to create a false sync pattern by concatenation were it not for the presence of the packing period. It is a further job of the packing period to prevent false sync patterns being generated at the junction of two channel symbols.

Figure 12.41    One CD data block begins with a unique sync pattern, and one subcode byte, followed by 24 audio bytes and eight redundancy bytes. Note that each byte requires 14T in EFM, with 3T packing between symbols, making 17T.

Each data block or frame in CD and MD, shown in Figure 12.41, consists of 33 symbols of 17T each following the preamble, making a total of 588T or 136 μs. Each symbol represents eight data bits. The first symbol in the block is used for subcode, and the remaining 32 bytes represent 24 audio sample bytes and 8 bytes of redundancy for the error-correction system. The subcode byte forms part of a subcode block which is built up over 98 successive data frames, and this will be described in detail later in this chapter.

The channel bits which are re-created by sampling and differentiating the sliced replay waveform in time to the restored clock from the VCO are conveniently converted to parallel format for decoding in a shift register which need only have fourteen stages. The bit counter which is synchronized to the serial replay waveform by the detection of the sync pattern will output a pulse every 17T when a complete 14T pattern of channel bits is in the register. This pattern can then be transferred in parallel to the decoder which will identify the channel pattern and output the data code value.

Detection of sync in CD is simply a matter of identifying a complete cycle of the lowest recorded frequency. In practical players the sync pattern will be sliced, sampled and differentiated to channel bits along with the rest of the replay waveform. As a shift register is already present it is a matter of convenience to extend it to 23 stages so that the sync pattern can be detected by continuously examining the parallel output as the patterns from the track shift by. The pattern will be detected by a combination of logic gates which will only output a ‘true’ value when the shift register contains 10000000000100000000001 in the correct place.

This is not a bit pattern which exists on the disk; the disk merely contains two maximum run-lengths in series and it does not matter whether these are a bump followed by a land or a land followed by a bump. The sliced replay waveform cannot be sampled at the correct frequency until the VCO has locked and this requires the T rate synchronizing information from a prior length of data track. If the VCO were not locked, the sync waveform would be sampled into the wrong number of periods and would not be detected. Following sampling, the replay signal is differentiated so that transitions of either direction produce a channel bit 1.
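Sync detection and the subsequent division by 17 can be sketched in software, treating the recovered channel bits as a string. The symbol pattern used to build the test frame is a hypothetical placeholder and the packing is simply set to 000; a real frame would carry EFM table entries and calculated packing bits.

```python
SYNC = "1" + "0" * 10 + "1" + "0" * 10 + "1"   # 23 channel bits: two 11T runs
SYMBOLS_PER_FRAME = 33

def find_sync(bits: str) -> int:
    """Index of the first sync pattern in the channel bit stream, or -1."""
    return bits.find(SYNC)

def split_frame(bits: str, sync_at: int) -> list:
    """Discard packing and return the 33 fourteen-bit symbols following the sync."""
    start = sync_at + 24 + 3        # skip the 24T sync pattern and its 3T packing
    return [bits[start + 17 * n: start + 17 * n + 14]
            for n in range(SYMBOLS_PER_FRAME)]

placeholder = "00100100100100"                      # hypothetical channel pattern
frame = SYNC + "0" + "000" + (placeholder + "000") * SYMBOLS_PER_FRAME
stream = "0001000" + frame                          # arbitrary bits before the sync

position = find_sync(stream)
symbols = split_frame(stream, position)
print(len(frame), position, len(symbols))           # 588 7 33
```

Each recovered symbol would then be passed to the reverse look-up table to obtain the eight data bits.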

Figure 12.42 shows an overall block diagram of the record modulation scheme used in CD mastering and the corresponding replay system or data separator. The input to the record channel coder consists of sixteen-bit audio samples which are divided in two to make symbols of eight bits. These symbols are used in the error-correction system which interleaves them and adds redundant symbols. For every twelve audio symbols, there are four symbols of redundancy, but the channel coder is not concerned with the sequence or significance of the symbols and simply records their binary code values.

Figure 12.42    Overall block diagram of the EFM encode/decode process. A MiniDisc will contain both. A CD player only has the decoder; the encoding is in the mastering cutter.

Symbols are provided to the coder in eight-bit parallel format, with a symbol clock. The symbol clock is obtained by dividing down the 4.3218 MHz T rate clock by a factor of 17. Each symbol is used to address the look-up table which outputs a corresponding fourteen-bit channel bit pattern in parallel into a shift register. The T rate clock then shifts the channel bits along the register. The look-up table also outputs data corresponding to the digital sum value (DSV) of the fourteen-bit symbol to the packing generator. The packing generator determines if action is needed between symbols to control DC content. The packing generator also checks for run-length violations and potential false sync patterns. As a result of all the criteria, the packing generator loads three channel bits into the space between the symbols, such that the register then contains fourteen-bit symbols with three bits of packing between them. At the beginning of each frame, the sync pattern is loaded into the register just before the first symbol is looked up in such a way that the packing bits are correctly calculated between the sync pattern and the first symbol.
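One plausible way for a packing generator to exercise DC control is sketched below. It is an assumption rather than a description of any particular encoder: the candidates are taken to have been screened already for run-length and false-sync violations, the waveform level is represented as +1 or -1, and the candidate giving the smallest magnitude of running DSV after the next symbol is chosen. The symbol pattern is hypothetical.

```python
def dsv_after(bits: str, level: int, dsv: int):
    """Advance the waveform level and running DSV through a string of channel bits."""
    for b in bits:
        if b == "1":
            level = -level          # a channel bit 1 produces a transition
        dsv += level                # each T period adds +1 or -1 to the sum
    return level, dsv

def choose_packing(legal, next_symbol: str, level: int, dsv: int):
    """Return (packing, level, dsv) for the candidate giving the smallest |DSV|."""
    best = None
    for m in legal:
        lv, d = dsv_after(m + next_symbol, level, dsv)
        if best is None or abs(d) < abs(best[2]):
            best = (m, lv, d)
    return best

print(choose_packing(["100", "010", "001"], "00000000100100", level=1, dsv=5))
# ('001', -1, -2)
```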

A channel bit one indicates that a transition should be generated, and so the serial output of the shift register is fed to the JK bistable along with the T rate clock. The output of the JK bistable is the ideal channel coded waveform containing transitions separated by 3T to 11T. It is a self-clocking, run-length-limited waveform. The channel bits and the T rate clock have done their job of changing the state of the JK bistable and do not pass further on. At the output of the JK the sync pattern is simply two 11T run lengths in series.
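The action of the JK bistable can be modelled as a level which toggles on every channel bit 1. The sketch below, with illustrative function names, applies this to the 24T sync pattern.

```python
def jk_bistable(channel_bits: str, level: int = 0) -> list:
    """Toggle the output level on each channel bit 1; one output value per T period."""
    out = []
    for b in channel_bits:
        if b == "1":
            level ^= 1
        out.append(level)
    return out

def runs(levels) -> list:
    """Lengths, in T periods, of the successive runs of constant level."""
    lengths, count = [], 1
    for a, b in zip(levels, levels[1:]):
        if a == b:
            count += 1
        else:
            lengths.append(count)
            count = 1
    lengths.append(count)
    return lengths

sync = "1" + "0" * 10 + "1" + "0" * 10 + "10"   # the 24T sync pattern
print(runs(jk_bistable(sync)))                   # [11, 11, 2]
```

The final value of 2 is only the beginning of the run which follows the sync pattern; its full length depends on the packing bits and the first symbol.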

At this stage the run-length-limited waveform is used to control the acousto-optic modulator in the cutter. This actually results in pits which are slightly too long and lands which are too short because of the convolution of the record waveform with the Airy function which was mentioned above. As the cutter spot is about 0.4 μm across, the pit edges in the resist are moved slightly. Thus although the ideal waveform is created in the encoding circuitry, having integer multiples of T between transitions, the pit structure is non-ideal and pit edges are not located at exact multiples of a basic distance. The duty cycle of the pits and lands is not exactly 50 per cent and the replay waveform will have a DC offset. This is of no consequence in CD as the channel code is known to be DC-free and an equivalent offset can be generated in the slicing level of the player such that the duty cycle of the slicer output becomes 50 per cent.

The resist master is developed and used to create stampers. The resulting disks can then be replayed. The track velocity of a given CD is constant, but the rotational speed depends upon the radius. In order to get into lock, the disk must be spun at roughly the right track speed. This is done using the run-length limits of the recording. The pick-up is focused and the tracking is enabled. The replay waveform from the pickup is passed through a high-pass filter to remove level variations due to contamination and sliced to return it to a binary waveform. The slicing level is self-adapting as Figure 12.43 shows, so that a 50 per cent duty cycle is obtained. The slicer output is then sampled by the unlocked VCO running at approximately T rate. If the disk is running too slowly, the longest run length on the disk will appear as more than 11T, whereas if the disk is running too fast, the shortest run length will appear as less than 3T. As a result, the disk speed can be brought to approximately the right speed and the VCO will then be able to lock to the clock content of the EFM waveform from the slicer. Once the VCO is locked, it will be possible to sample the replay waveform at the correct T rate. The output of the sampler is then differentiated and the channel bits reappear and are fed into the shift register. The sync pattern detector will then function to reset the deserialization counter which allows the 14T symbols to be identified. The 14T symbols are then decoded to eight bits in the reverse coding table.
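The coarse speed decision described above amounts to comparing the apparent run lengths with the limits of the code. The following sketch assumes that the run lengths have been measured in periods of the unlocked VCO; the function name and return strings are illustrative.

```python
def speed_hint(apparent_runs_T) -> str:
    """Judge disk speed from run lengths measured in periods of the unlocked VCO."""
    if max(apparent_runs_T) > 11:
        return "too slow"        # longest run appears longer than 11T
    if min(apparent_runs_T) < 3:
        return "too fast"        # shortest run appears shorter than 3T
    return "about right"

print(speed_hint([4, 13, 6, 14]))   # too slow
print(speed_hint([2, 7, 2, 5]))     # too fast
print(speed_hint([3, 11, 5, 4]))    # about right
```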

Figure 12.43    Self-slicing a DC-free channel code. Since the channel code signal from the disk is band limited, it has finite rise times, and slicing at the wrong level (as shown here) results in timing errors, which cause the data separator to be less reliable. As the channel code is DC-free, the binary signal when correctly sliced should integrate to zero. An incorrect slice level gives the binary output a DC content and, as shown here, this can be fed back to modify the slice level automatically.

Figure 12.44    CD timing structure.

Figure 12.44 reveals the timing relationships of the CD format. The sampling rate of 44.1 kHz with sixteen-bit words in left and right channels results in an audio data rate of 176.4 kbytes/s (k = 1000 here, not 1024). Since there are 24 audio bytes in a data frame, the frame rate will be:

176.4/24 = 7.35 kHz

If this frame rate is divided by 98, the number of frames in a subcode block, the subcode block or sector rate of 75 Hz results. This frequency can be divided down to provide a running-time display in the player. Note that this is the frequency of the wavy grooves in recordable MDs.

If the frame rate is multiplied by 588, the number of channel bits in a frame, the master clock-rate of 4.3218 MHz results. From this the maximum and minimum frequencies in the channel, 720 kHz and 196 kHz, can be obtained using the run-length limits of EFM.
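The chain of timing relationships can be verified numerically. The sketch below uses only figures quoted in the text; the variable names are illustrative.

```python
sample_rate = 44_100
bytes_per_second = sample_rate * 2 * 2     # two-byte samples, two channels
frame_rate = bytes_per_second // 24        # 24 audio bytes per frame
sector_rate = frame_rate // 98             # 98 frames per subcode block
t_rate = frame_rate * 588                  # 588 channel bits per frame

f_max = t_rate / (2 * 3)                   # one cycle is two 3T run lengths
f_min = t_rate / (2 * 11)                  # one cycle is two 11T run lengths
frame_us = 588 / t_rate * 1e6

print(bytes_per_second, frame_rate, sector_rate, t_rate)   # 176400 7350 75 4321800
print(f_max, round(f_min), round(frame_us, 2))             # 720300.0 196445 136.05
```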

12.16 Error-correction strategy

This section discusses the track structure of CD in detail. The track structure of MiniDisc is based on that of CD and the differences will be noted in the next section.

Each sync block was seen in Figure 12.41 to contain 24 audio bytes, but these are non-contiguous owing to the extensive interleave.20–22 There are a number of interleaves used in CD, each of which has a specific purpose. The full interleave structure is shown in Figure 12.45. The first stage of interleave is to introduce a delay between odd and even samples. The effect is that uncorrectable errors cause odd samples and even samples to be destroyed at different times, so that interpolation can be used to conceal the errors, with a reduction in audio bandwidth and a risk of aliasing. The odd/even interleave is performed first in the encoder, since concealment is the last function in the decoder. Figure 12.46 shows that an odd/even delay of two blocks permits interpolation in the case where two uncorrectable blocks leave the error-correction system.

Figure 12.45    CD interleave structure.

Figure 12.46    Odd/even interleave permits the use of interpolation to conceal uncorrectable errors.
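Concealment by interpolation can be sketched as follows. The example assumes the simplest first-order interpolation, in which each flagged sample is replaced by the mean of its two neighbours, and relies on the odd/even interleave to ensure that those neighbours are good. The sample values are invented.

```python
def conceal(samples, bad_indices):
    """Replace each flagged sample by the mean of its two (assumed good) neighbours."""
    out = list(samples)
    for n in sorted(bad_indices):
        out[n] = (samples[n - 1] + samples[n + 1]) // 2
    return out

# The odd-numbered samples 1 and 3 are uncorrectable; the even samples survive.
received = [0, 999, 20, -999, 40, 50]
print(conceal(received, {1, 3}))   # [0, 10, 20, 30, 40, 50]
```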

Left and right samples from the same instant form a sample set. As the samples are sixteen bits, each sample set consists of four bytes, AL, BL, AR, BR. Six sample sets form a 24-byte parallel word, and the C2 encoder produces four bytes of redundancy Q. By placing the Q symbols in the centre of the block, the odd/even distance is increased, permitting interpolation over the largest possible error burst. The 28 bytes are now subjected to differing delays, which are integer multiples of four blocks. This produces a convolutional interleave, where one C2 codeword is stored in 28 different blocks, spread over a distance of 109 blocks.
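The spreading effect of the differing delays can be illustrated numerically. The sketch below assumes that symbol i of a C2 codeword is delayed by 4 × i blocks and ignores the later odd/even output delay; the burst position is arbitrary.

```python
DELAY_UNIT = 4     # the delays are integer multiples of four blocks
SYMBOLS = 28       # 24 audio bytes plus four Q bytes

def blocks_for_codeword(first_block: int) -> list:
    """Blocks in which the 28 symbols of one C2 codeword are stored."""
    return [first_block + DELAY_UNIT * i for i in range(SYMBOLS)]

blocks = blocks_for_codeword(0)
span = blocks[-1] - blocks[0] + 1
print(len(set(blocks)), span)      # 28 109

# A burst destroying sixteen consecutive blocks (40 to 55 here) reaches
# only a few symbols of this codeword.
hits = sum(1 for b in blocks if 40 <= b < 56)
print(hits)                        # 4
```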

At one instant, the C1 encoder will be presented with 28 bytes which have come from 28 different C2 codewords. The C1 encoder produces a further four bytes of redundancy P. Thus the C1 and C2 codewords are produced by crossing an array in two directions. This is known as cross-interleaving.

Figure 12.47    The final interleave of the CD format spreads P codewords over two blocks. Thus any small random error can only destroy one symbol in one codeword, even if two adjacent symbols in one block are destroyed. Since the P code is optimized for single-symbol error correction, random errors will always be corrected by the C1 process, maximizing the burst-correcting power of the C2 process after de-interleave.

Figure 12.48    Owing to cross-interleave, the 28 symbols from the Q encode process (C2) are spread over 109 blocks, shown hatched. The final interleave of P codewords (as in Figure 12.47) is shown stippled. The result of the latter is that the Q codeword has 5, 3, 5, 3 spacing rather than 4, 4.

The final interleave is an odd/even output symbol delay, which causes P codewords to be spread over two blocks on the disk as shown in Figure 12.47. This mechanism prevents small random errors destroying more than one symbol in a P codeword. The choice of eight-bit symbols in EFM assists this strategy. The expressions in Figure 12.45 determine how the interleave is calculated. Figure 12.48 shows an example of the use of these expressions to calculate the contents of a block and to demonstrate the cross-interleave.

The calculation of the P and Q redundancy symbols is made using Reed–Solomon polynomial division. The P redundancy symbols are primarily for detecting errors, to act as pointers or error flags for the Q system. The P system can, however, correct single-symbol errors.

12.17 Track layout of MD

MD uses the same channel code and error-correction interleave as CD for simplicity and the sectors are exactly the same size. The interleave of CD is convolutional, which is not a drawback in a continuous recording. However, MD uses random access and the recording is discontinuous. Figure 12.49 shows that the convolutional interleave causes codewords to run between sectors. Rerecording a sector would prevent error correction in the area of the edit. The solution is to use a buffering zone in the area of an edit where the convolution can begin and end. This is the job of the link sectors. Figure 12.50 shows the layout of data on a recordable MD. In each cluster of 36 sectors, 32 are used for encoded audio data. One is used for subcode and the remaining three are link sectors. The cluster is the minimum data quantum which can be recorded and represents just over two seconds of decoded audio. The cluster must be recorded continuously because of the convolutional interleave. Effectively the link sectors form an edit gap which is large enough to absorb both mechanical tolerances and the interleave overrun when a cluster is rewritten. One or more clusters will be assembled in memory before writing to the disk is attempted.

Figure 12.49    The convolutional interleave of CD is retained in MD, but buffer zones are needed to allow the convolution to finish before a new one begins, otherwise editing is impossible.

Figure 12.50    Format of MD uses clusters of sectors including link sectors for editing. Prerecorded MDs do not need link sectors, so more subcode capacity is available. The ATRAC coder of MD produces the sound groups shown here.

Prerecorded MDs are recorded at one time, and need no link sectors. In order to keep the format consistent between the two types of MiniDisc, three extra subcode sectors are made available. As a result it is not possible to record the entire audio and subcode of a prerecorded MD onto a recordable MD because the link sectors cannot be used to record data.

The ATRAC coder produces what are known as sound groups (see Chapter 5). Figure 12.50 shows that these contain 212 bytes for each of the two audio channels and are the equivalent of 11.6 ms of real-time audio. Eleven of these sound groups will fit into two standard CD sectors with 20 bytes to spare. The 32 audio data sectors in a cluster thus contain a total of 16 × 11 = 176 sound groups.
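The cluster arithmetic can be checked directly from the figures quoted. The variable names are illustrative.

```python
audio_sectors = 32
groups_per_sector_pair = 11
bytes_per_group = 2 * 212                    # 212 bytes for each of two channels

sound_groups = (audio_sectors // 2) * groups_per_sector_pair
cluster_seconds = sound_groups * 11.6e-3     # each sound group represents 11.6 ms

print(sound_groups, bytes_per_group)         # 176 424
print(round(cluster_seconds, 2))             # 2.04
```

The result agrees with the statement that a cluster represents just over two seconds of decoded audio.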

12.18 CD subcode

Subcode is essentially an auxiliary data stream which is merged with the audio samples, and which has numerous functions. One of these is to assist in locating the beginning of the different musical pieces on a disk, and providing a catalogue of their location on the disk and their durations. A further vital function is to convey the status of pre-emphasis in the recording, so that de-emphasis can be automatically selected in the player. The subcode information in CD is conveyed by including an extra byte, which corresponds to one EFM symbol, in the main frame structure. As the format of the disk is standardized, the player is designed to route the subcode byte in the frame to a different destination from that of the audio sample bytes. The separation is based upon the physical position of the subcode byte in the frame. The player uses the sync pattern at the beginning of the frame to reset a byte count so that it always knows how far through the frame it is. As a result, subcode bytes will be separated from the data stream at frame frequency.

It has been shown that there are 98 bytes in a subcode block, since this results in a subcode block rate of exactly 75 Hz. This frequency can be used to run the playing-time display.

It is necessary for the player to know when a new subcode block is beginning. This is the function of the subcode sync patterns which are placed in the subcode byte position of two successive frames. There are more than 256 legal fourteen-bit patterns in EFM, and two of these additional legal channel-bit patterns are used for subcode-block synchronizing. The EFM decoder will be able to distinguish them from the patterns used to represent subcode-data bytes. For this reason it is impossible to describe the subcode sync patterns by a byte, and they have to be specified as fourteen channel bits.

Figure 12.51 shows the subcode sync patterns, and illustrates the contents of the subcode block. After the subcode sync patterns, there are 96 bytes in the block. The block is arranged as eight 96-bit words, labelled P Q R S T U V and W. The choice of labelling is unfortunate because the letters P and Q have already been used to describe the redundancy in the error-correction system. The subcode P and Q data have absolutely nothing to do with that. The eight words are quite independent, and each subcode byte in a disk frame contains one bit from each word. This is a form of interleaving which reduces the damage done to a particular word by an error.
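Separating the eight words from the 96 subcode bytes is a simple bit-slicing operation. The sketch below assumes that P occupies the most significant bit of each subcode byte and W the least significant; the text does not state the bit order, so this should be checked against the format specification.

```python
NAMES = "PQRSTUVW"

def split_subcode(block_bytes) -> dict:
    """Split the 96 subcode bytes of a block into eight 96-bit words."""
    assert len(block_bytes) == 96
    words = {name: [] for name in NAMES}
    for byte in block_bytes:
        for i, name in enumerate(NAMES):
            words[name].append((byte >> (7 - i)) & 1)   # assumed order: P is bit 7
    return words

# A block in which only the P bit is set in every byte, as during a start flag.
words = split_subcode([0x80] * 96)
print(sum(words["P"]), sum(words["Q"]))   # 96 0
```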

The P data word is used to denote the start of specific bands (having the same meaning as the bands on a vinyl disk) in the sound recorded. The entire word is recorded as data ones during the start-flag period. It can be used even where there is no audible pause in the music, since the start point is defined as where the P data become zeros again. The CD standard calls for a minimum of two seconds of start flag to be recorded. This seems wasteful, but it allows a very simple player to recognize the beginning of a piece easily by skipping tracks. The fact that every bit is a one means that it is not necessary to wait for subcode block sync to be found before finding pause status on the disk track. The two-second-flag period means that the status will be seen a few tracks in advance of the actual start-point, helping to prevent the pickup from overshooting. If a genuine pause exists in the music, the start flag may be extended to the length of the pause if it exceeds two seconds. Again, for the benefit of simple players, the start flag alternates on and off at 2 Hz in the lead-out area at the end of the recording.

Figure 12.51    Each CD frame contains one subcode byte. After 98 frames, the structure above will repeat. Each subcode byte contains 1 bit from eight 96-bit words following the two synchronizing patterns. These patterns cannot be expressed as a byte, because they are 14-bit EFM patterns additional to those which describe the 256 combinations of eight data bits.

At the time of writing the only other defined subcode data word is the Q word. This word has numerous modes and uses which can be taken advantage of by CD players with greater processing and display capability.

Figure 12.52 shows the structure of the Q subcode word. In the 96 bits following the sync patterns, there are two four-bit words for control, a 72-bit data block, and a sixteen-bit CRC character which makes all 96 bits a codeword.

Figure 12.52    The structure of the Q data block. The 72-bit data can be interpreted in three ways determined by the address bits.
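
The field boundaries of the Q word can be sketched as follows. The text states only that a sixteen-bit CRC makes the 96 bits a codeword; the generator polynomial used here (x^16 + x^12 + x^5 + 1), the zero preset and the assumption that the check bits are recorded inverted are illustrative choices which should be checked against the format specification. All function names are hypothetical.

```python
def crc16(bits):
    """Bit-serial CRC over a sequence of 0/1 values.

    Assumption: generator x^16 + x^12 + x^5 + 1, register preset to zero.
    """
    register = 0
    for bit in bits:
        feedback = ((register >> 15) & 1) ^ bit
        register = (register << 1) & 0xFFFF
        if feedback:
            register ^= 0x1021
    return register


def bits_to_int(bits):
    """Interpret a list of bits, most significant first, as an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def parse_q_word(q_bits):
    """Split a 96-bit Q word into control, address, data and CRC status."""
    if len(q_bits) != 96:
        raise ValueError("the Q word is 96 bits long")
    stored_crc = bits_to_int(q_bits[80:96])
    return {
        "control": bits_to_int(q_bits[0:4]),    # first four-bit word
        "address": bits_to_int(q_bits[4:8]),    # second four-bit word (mode)
        "data": bytes(bits_to_int(q_bits[8 + 8 * i:16 + 8 * i])
                      for i in range(9)),       # 72-bit block as nine bytes
        # Assumption: the sixteen check bits are stored inverted.
        "crc_ok": (crc16(q_bits[:80]) ^ 0xFFFF) == stored_crc,
    }
```

A player would discard any block for which `crc_ok` is false, since the Q data carry error detection only.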

The first four-bit control word contains flags specifying the number of audio channels encoded, to permit automatic decoding of four-channel disks, the copy-prohibit status and the pre-emphasis status. Since de-emphasis is often controlled by a relay or electronic switch in the analog stages of the player, the pre-emphasis status is only allowed to change during a P code start flag.

The second four-bit word determines the meaning of the subsequent 72-bit block. There can be three meanings: mode 1, which tells the player the number and start times of the bands on the disk; mode 2, which carries the disk catalogue number; and mode 3, which carries the ISRC (International Standard Recording Code) of each band. Of all the subcode blocks on a disk, the mode 1 blocks are by far the most common.

Mode 1 has two major functions. During the lead-in track it contains a table of contents (TOC), listing each piece of music and the absolute playing time when it starts. During the music content of the disk, it contains running time.

Figure 12.53 shows that the 72-bit block is subdivided into nine bytes, one of which is unused and permanently zero. Each byte represents two hexadecimal digits where not all codes are valid. The first byte in the block is the music number (MNR), which specifies the number of the track on the disk, where in this context ‘track’ corresponds to the bands on a vinyl disk. The tracks are numbered from one upwards, and the track number of 00 indicates that the pickup is in the lead-in area and that the rest of the block contains an entry in the table of contents.

Figure 12.53    General format of Q subcode frame in mode 1. There are eight unused bits, leaving eight active bytes. First byte is music or track number, which determines meaning of remaining bytes.

Figure 12.54    During lead-in TNO is zero and Q subcode builds up a table of contents using numbered points with starting times. For multidisk sets, the band numbering can continue from one disk to the next, and there are point-limit codes A0 and A1 which specify the range of bands on a given disk. The example of a two-disk set is given, with five bands on the first disk and six bands on the second. Point = 00–99, point = music number, and point (min, sec, frame) denotes absolute starting time of that music number. This forms an entry in TOC. Point = A0 hex, point min byte = music number of first band on this disk, denotes beginning MNR of TOC. Point = A1 hex, point min byte = music number of last band on this disk, denotes end MNR of TOC. Point = A2 hex, point (min, sec, frame), denotes absolute starting time of lead-out track.

The table of contents is built up by listing points in time where each track starts. One point can be described in one subcode block. Figure 12.54 shows that the second byte of the block is the point number. The absolute time at which that point will be reached after the start of the first track is contained in the last three bytes as point minutes, point seconds and point frames. These bytes are two BCD digits, where the maximum value of point frame is 74. As there is only error detection in the Q data, the point is repeated in three successive subcode blocks. The number of points allowed is 99, but the track numbering can continue through a set of disks. For example, in a two-disk set, there could be five tracks, 1 to 5, on the first disk, and six tracks, 6 to 11, on the second disk. Clearly the first point on the second disk is going to be point 6, and to prevent the player fruitlessly looking for points that are absent, the point range is specified.

If the point byte has the value A0 hex, the point-minute byte contains the number of the first track on the disk, which in the example given would be 6. If the point byte has the value A1 hex, the point-minute byte contains the number of the last track on the disk, which would here be 11. A further point is specified, which is the absolute running time of the start of the lead-out track, which uses the point code of A2 hex. These three points come after the actual music start points. During the lead-in track, the running time is counted by the minute, second and frame bytes in the block.
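
The way a player might assemble the table of contents from the lead-in points can be sketched as below. The tuple layout, the function names and the dictionary keys are hypothetical; the figure of 75 frames per second follows from the maximum point-frame value of 74.

```python
FRAMES_PER_SECOND = 75  # point-frame values run from 0 to 74


def bcd(byte):
    """Decode one byte holding two BCD digits, rejecting invalid codes."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"{byte:#04x} is not a valid BCD byte")
    return high * 10 + low


def msf_to_frames(minute, second, frame):
    """Convert minutes, seconds and frames to a frame count."""
    return (minute * 60 + second) * FRAMES_PER_SECOND + frame


def build_toc(entries):
    """Build a TOC from (point, pmin, psec, pframe) raw bytes.

    Repeated points (each is recorded in three successive blocks)
    simply overwrite the same entry.
    """
    toc = {"first": None, "last": None, "lead_out": None, "tracks": {}}
    for point, pmin, psec, pframe in entries:
        if point == 0xA0:        # number of the first track on this disk
            toc["first"] = bcd(pmin)
        elif point == 0xA1:      # number of the last track on this disk
            toc["last"] = bcd(pmin)
        elif point == 0xA2:      # absolute start time of the lead-out
            toc["lead_out"] = msf_to_frames(bcd(pmin), bcd(psec), bcd(pframe))
        else:                    # ordinary point: start time of a track
            toc["tracks"][bcd(point)] = msf_to_frames(
                bcd(pmin), bcd(psec), bcd(pframe))
    return toc
```

For the second disk of the two-disk example, the A0 and A1 points would yield 6 and 11, so the player need not search for points 1 to 5.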

If the first byte of the block is between 01 and 99, the block is in a music track, and the meaning shown in Figure 12.55 applies. The running time is given in three ways. Minute, second and frame are the running time from the start of that track, and A(bsolute)min, Asec and Aframe are the running time from the start of the first track on the disk. The third running-time mode employs the index or X byte. When this is zero, it denotes a pause, which corresponds to the P subcode’s being 1. During this pause, which precedes the start of a track, the running time counts down to zero, so that a player can display the time to go before a track starts to play. The absolute time is unaffected by this mode. Non-zero values of X denote a subdivision of the track into shorter sections. This would be useful to locate individual phrases on a language-course disk, or the individual effects on a sound-effects disk. Figure 12.56 shows an example of the use of P and Q subcode and the relationship between them and the music bands.

Figure 12.55    During music bands TNO is 01–99, and subcode shows time through band and time through disk. The former counts down during pause. Each band can be subdivided by index count X.

Figure 12.56 The relationship of P and Q subcode timing to the music bands. P flag is never less than 2 s between bands, whereas index reflects actual pause, and vanishes at a crossfade. Time counts down during index 00.

*1: lead-in time does not have to start from zero;

*2: A time must start from zero;

*3: de-emphasis can only change during pause of 2 s or more.

Mode 2 of the Q subcode allows the recording of the barcode number of the disk, and is denoted by the address code of 2 in the block as shown in Figure 12.57. The 52-bit barcode, along with twelve zeros and a continuation of the absolute frame count, are protected by the CRC character. If this mode is used, it should show up at least once in every 100 subcode blocks and the contents of each block should be identical. The use of the mode is not compulsory.

Figure 12.57    In mode 2, the catalogue number can be recorded. This must always be the same throughout the disk, and must appear in at least one out of a hundred successive blocks.

Figure 12.58    ISRC format in mode 3 allows each band to have a different code. All mode 3 frames must be the same within same TNO. Must appear in at least one out of every hundred successive blocks. Not present in lead-in or lead-out tracks.

Mode 3 of Q subcode is similar to mode 2, except that a code number can be allocated to each track on the disk. Figure 12.58 shows that the ISRC requires five alphanumeric characters of six bits each and seven BCD characters of four bits each. Again the mode is optional but, if used, the mode 3 subcode block must occur at least once in every 100 blocks.

The R to W subcode is currently not standardized, but proposed uses for this data include a text display which would enable the words of a song to appear on a monitor in synchronism with the sound played from the disk. A difficulty in this area is the requirement to support not only the kind of alphanumerics in which this book is written but also the complex Kanji characters which would be needed for the Japanese market.

12.19 MD table of contents

The TOC of the pre-recorded MiniDisc is basically similar to the CD TOC as it performs the same function. Recordable MiniDiscs have a different approach. Recordable MD is more like a hard disk than a real-time audio recorder, and the buffer memory allows continuous audio listening from records which are fragmented across the disk surface. Thus the UTOC (user table of contents) of the recordable MD is more like the directory of a data disk (see Chapter 10). UTOC contains one entry for each numbered recorded item which lists the physical cluster addresses at which the data for that item are recorded. When the user selects the number of an item, the player reads the UTOC in order to locate the data addresses. Item numbers are contiguous, so if an item is deleted or if two items are merged, the numbering scheme beyond will move up by one. It is not necessary to actually erase unwanted recordings. Instead the directory entry is deleted and then as far as the system is concerned the recording no longer exists and the clusters it uses are available for overwriting. Figure 12.59 shows some examples of UTOC operations. Despite the internal complexity, the disk mapping is taken care of by a microprocessor and the user simply selects item numbers.
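
The directory-like behaviour of the UTOC can be sketched with a small model. The class, its method names and the representation of cluster addresses as (first, last) ranges are hypothetical; the real UTOC layout is not described here.

```python
class Utoc:
    """Toy model of a user table of contents.

    Item number n is stored at self.items[n - 1], so numbering is
    always contiguous. Each item is a list of (first, last) cluster
    ranges, which may be fragmented across the disk.
    """

    def __init__(self):
        self.items = []
        self.free = []   # cluster ranges available for overwriting

    def record(self, extents):
        """Add a new item and return its number."""
        self.items.append(list(extents))
        return len(self.items)

    def locate(self, number):
        """Return the cluster ranges the player must read for an item."""
        return self.items[number - 1]

    def delete(self, number):
        """Remove the entry only; the audio data are not erased."""
        freed = self.items.pop(number - 1)
        self.free.extend(freed)
        return freed

    def merge(self, number):
        """Join an item with the one that follows it."""
        following = self.items.pop(number)
        self.items[number - 1].extend(following)
```

Deleting or merging shortens the list, so every later item moves up by one without any audio data being touched.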

12.20 CD player structure

The physics of the manufacturing process and the readout mechanism have been described, along with the format on the disk. Here, the details of actual CD and MD players will be explained. One of the design constraints of the CD and MD formats was that the construction of players should be straightforward, since they were to be mass-produced.

Figure 12.59    Recordings on MD are accessed via the UTOC which functions like a hard disk index. Editing is achieved simply by altering UTOC.

Figure 12.60 shows the block diagram of a typical CD player, and illustrates the essential components. The most natural division within the block diagram is into the control/servo system and the data path. The control system provides the interface between the user and the servo mechanisms, and performs the logical interlocking required for safety and the correct sequence of operation.

The servo systems include any power-operated loading drawer and chucking mechanism, the spindle-drive servo, and the focus and tracking servos already described.

Power loading is usually implemented on players where the disk is placed in a drawer. Once the drawer has been pulled into the machine, the disk is lowered onto the drive spindle, and clamped at the centre, a process known as chucking. In the simpler top-loading machines, the disk is placed on the spindle by hand, and the clamp is attached to the lid so that it operates as the lid is closed.

Figure 12.60    Block diagram of CD player showing the data path (broad arrow) and control/servo systems.

The lid or drawer mechanisms have a safety switch which prevents the laser operating if the machine is open. This is to ensure that there can be no conceivable hazard to the user. In actuality there is very little hazard in a CD pickup. This is because the beam is focused a few millimetres away from the objective lens, and beyond the focal point the beam diverges and the intensity falls rapidly. It is almost impossible to position the eye at the focal point when the pickup is mounted in the player, but it would be foolhardy to attempt to disprove this.

The data path consists of the data separator, timebase correction and the de-interleaving and error-correction process followed by the error-concealment mechanism. This results in a sample stream which is fed to the convertors.

The data separator which converts the readout waveform into data was detailed in the description of the CD channel code. LSI chips have been developed to perform the data-separation function: for example, the Philips SAA 7010 or the Sony CX 7933. The separated output from both of these consists of subcode bytes, audio samples, redundancy and a clock. The data stream and the clock will contain speed variations due to disk run-out and chucking tolerances, and these have to be removed by a timebase corrector.

The timebase corrector is a memory addressed by counters which are arranged to overflow, giving the memory a ring structure as described in Chapter 1. Writing into the memory is done by using clocks from the data separator whose frequency rises and falls with run-out, whereas reading is done using a crystal-controlled clock, which removes speed variations from the samples, and makes wow and flutter unmeasurable. The timebase-corrector will only function properly if the two addresses are kept apart. This implies that the long-term data rate from the disk must equal the crystal-clock rate. The disk speed must be controlled to ensure that this is always true, and there are two contrasting ways in which it can be done.
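
The ring structure of the timebase corrector can be modelled in a few lines. This is a sketch only: the class name, the memory size and the choice to start the two addresses half a memory apart are assumptions made for the illustration.

```python
class TimebaseCorrector:
    """Memory addressed by two counters which overflow to form a ring."""

    def __init__(self, size=16):
        self.size = size
        self.memory = [0] * size
        # Assumption: begin with the addresses half the memory apart.
        self.write_address = size // 2
        self.read_address = 0

    def write(self, sample):
        """Clocked by the data separator, so the rate varies with run-out."""
        self.memory[self.write_address] = sample
        self.write_address = (self.write_address + 1) % self.size

    def read(self):
        """Clocked by the crystal, so the output rate is stable."""
        sample = self.memory[self.read_address]
        self.read_address = (self.read_address + 1) % self.size
        return sample

    def separation(self):
        """Number of locations by which the write address leads."""
        return (self.write_address - self.read_address) % self.size
```

Samples written in irregular bursts emerge in the same order at the steady read rate, provided that the separation never falls to zero or reaches the memory size.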

The data-separator clock counts samples from the disk. By phase-comparing this clock with the crystal reference, the phase error can be used to drive the spindle motor. This system was used in the Sony CDP-101, where the principle was implemented with a CX-193 chip, originally designed for DC turntable motors. The data-separator signal replaces the feedback signal which would originally have come from a toothed wheel on the turntable.

The alternative approach is to analyse the address relationship of the timebase corrector. If the disk is turning too fast, the write address will move towards the read address; if the disk is turning too slowly, the write address moves away from the read address. Subtraction of the two addresses produces an error signal which can be fed to the motor. The TBC RAM controller produces the motor-control signal. In these systems, and in all CD players, the speed of the motor is unimportant. The important factor is that the sample rate is correct, and the system will drive the spindle at whatever speed is necessary to achieve the correct rate. As the disk cutter produces constant bit density along the track by reducing the rate of rotation as the track radius increases, the player will automatically duplicate that speed reduction. The actual linear velocity of the track will be the same as the velocity of the cutter, and although this will be constant for a given disk, it can vary between 1.2 and 1.4 m/s on different disks.
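
The address-subtraction method, and the way rotational speed must follow radius at constant linear velocity, can be sketched as below. The function names, the half-memory target and the radii used in the usage note are assumptions for the illustration rather than figures from the text.

```python
import math


def spindle_error(write_address, read_address, size):
    """Signed speed error from the timebase-corrector addresses.

    The amount of unread data is the write address minus the read
    address, taken modulo the memory size. Assumption: the target is a
    half-full memory. A positive result means that data are arriving
    faster than they are read (disk too fast); a negative result means
    that the disk is too slow.
    """
    unread = (write_address - read_address) % size
    return unread - size // 2


def spindle_rpm(linear_velocity, radius):
    """Rotational speed in rev/min for a given track velocity and radius.

    linear_velocity in metres per second, radius in metres.
    """
    return linear_velocity / (2 * math.pi * radius) * 60
```

With a track velocity of 1.2 m/s, `spindle_rpm(1.2, 0.025)` is a little under 460 rev/min, whereas `spindle_rpm(1.2, 0.058)` is a little under 200 rev/min; the two radii are assumed values for the inner and outer edges of the programme area.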

These speed-control systems can only operate when the data separator has phase-locked, and this cannot happen until the disk speed is almost correct. A separate mechanism is necessary to bring the disk up to roughly the right speed. One way of doing this is to make use of the run-length limits of the channel code. Since transitions closer than 3T and further apart than 11T are not present, it is possible to estimate the disk speed by analysing the run lengths. The period between transitions should be from 694 ns to 2.55 μs. During disk run-up the periods between transitions can be measured, and if the longest period found exceeds 2.55 μs, the disk must be turning too slowly, whereas if the shortest period is less than 694 ns, the disk must be turning too fast. Once the data separator locks up, the coarse speed control becomes redundant. The method relies upon the regular occurrence of maximum and minimum run lengths in the channel. Synchronizing patterns have the maximum run length, and occur regularly. The description of the disk format showed that the C1 and C2 redundancy was inverted. This injects some ones into the channel even when the audio is muted. This is the situation during the lead-in track – the very place that lock must be achieved. The presence of the table of contents in subcode during the lead-in also helps to produce a range of run lengths.
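
The run-length test for coarse speed control can be sketched as follows. The channel bit rate of 4.3218 MHz is an assumption chosen because it reproduces the 694 ns and 2.55 μs limits quoted above; the tolerance parameter and the function name are also hypothetical.

```python
CHANNEL_BIT_RATE = 4.3218e6          # channel bits per second (assumed)
T = 1.0 / CHANNEL_BIT_RATE           # one channel-bit period, about 231 ns
MIN_RUN = 3 * T                      # shortest legal period, about 694 ns
MAX_RUN = 11 * T                     # longest legal period, about 2.55 us


def coarse_speed(periods, tolerance=0.05):
    """Judge disk speed from measured periods between transitions.

    periods: measured times in seconds between successive transitions.
    tolerance: fractional margin allowed for measurement error (assumed).
    """
    if max(periods) > MAX_RUN * (1 + tolerance):
        return "too slow"    # transitions are further apart than 11T
    if min(periods) < MIN_RUN * (1 - tolerance):
        return "too fast"    # transitions are closer together than 3T
    return "near correct"
```

The test is only meaningful if both the shortest and the longest run lengths occur in the measured sample, which is why the regular sync patterns and the inverted redundancy matter during lead-in.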

Owing to the use of constant linear velocity, the disk speed will be wrong if the pickup is suddenly made to jump to a different radius using manual search controls. This may force the data separator out of lock, and the player will mute briefly until the correct track speed has been restored, allowing the PLO to lock again. This can be demonstrated with most players, since it follows from the format.

Following data separation and timebase correction, the error-correction and de-interleave processes take place. Because of the cross-interleave system, there are two opportunities for correction, first, using the C1 redundancy prior to deinterleaving, and second, using the C2 redundancy after de-interleaving. In Chapter 6 it was shown that interleaving is designed to spread the effects of burst errors among many different codewords, so that the errors in each are reduced. However, the process can be impaired if a small random error, due perhaps to an imperfection in manufacture, occurs close to a burst error caused by surface contamination. The function of the C1 redundancy is to correct single-symbol errors, so that the power of interleaving to handle bursts is undiminished, and to generate error flags for the C2 system when a gross error is encountered.

The EFM coding is a group code which means that a small defect which changes one channel pattern into another will have corrupted up to eight data bits. In the worst case, if the small defect is on the boundary between two channel patterns, two successive bytes could be corrupted. However, the final odd/even interleave on encoding ensures that the two bytes damaged will be in different C1 codewords; thus a random error can never corrupt two bytes in one C1 codeword, and random errors are therefore always correctable by C1. From this it follows that the maximum size of a defect considered random is 17T or 3.9 μs. This corresponds to about a 5 μm length of the track. Errors of greater size are, by definition, burst errors.
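
The arithmetic behind the 17T figure can be checked with a short calculation. The sketch assumes a channel bit rate of 4.3218 MHz and that 17T is one 14-bit channel pattern plus three further channel bits between patterns; neither figure is stated in this section.

```python
CHANNEL_BIT_RATE = 4.3218e6   # channel bits per second (assumed)


def defect_duration_us(channel_bits):
    """Duration in microseconds of a defect spanning channel_bits periods."""
    return channel_bits / CHANNEL_BIT_RATE * 1e6


def defect_length_um(channel_bits, linear_velocity):
    """Length along the track in micrometres at a given velocity in m/s."""
    # microseconds multiplied by metres per second gives micrometres
    return defect_duration_us(channel_bits) * linear_velocity


for velocity in (1.2, 1.4):
    print(f"{velocity} m/s: {defect_length_um(17, velocity):.2f} um")
```

At the two extremes of track velocity the result lies between roughly 4.7 and 5.5 μm, in keeping with the figure of about 5 μm quoted above.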

The de-interleave process is achieved by writing sequentially into a memory and reading out using a sequencer. The RAM can perform the function of the timebase-corrector as well. The size of memory necessary follows from the format; the amount of interleave used is a compromise between the resistance to burst errors and the cost of the de-interleave memory. The maximum delay is 108 blocks of 28 bytes, and the minimum delay is negligible. It follows that a memory capacity of 54 × 28 = 1512 bytes is necessary. Allowing a little extra for timebase error, odd/even interleave and error flags transmitted from C1 to C2, the convenient capacity of 2048 bytes is reached.
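
The figure of 1512 bytes can be reproduced by summing the delay given to each of the 28 symbols. The sketch assumes that the delays are evenly graded from zero to the maximum of 108 blocks, which implies a step of four blocks; the text gives only the two extremes.

```python
SYMBOLS_PER_BLOCK = 28
MAX_DELAY_BLOCKS = 108


def deinterleave_memory_bytes():
    """Total symbols in flight if delays are graded evenly from 0 to 108."""
    step = MAX_DELAY_BLOCKS // (SYMBOLS_PER_BLOCK - 1)   # four blocks
    delays = [step * i for i in range(SYMBOLS_PER_BLOCK)]
    # A symbol delayed by d blocks needs d byte locations of storage.
    return sum(delays)


print(deinterleave_memory_bytes())
```

The sum equals the average delay of 54 blocks multiplied by the 28 symbols in a block.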

The C2 decoder is designed to locate and correct a single-symbol error, or to correct two symbols whose locations are known. The former case occurs very infrequently, as it implies that the C1 decoder has miscorrected. However, the C1 decoder works before de-interleave, and there is no control over the burst-error size that it sees. There is a small but finite probability that random data in a large burst could produce the same syndrome as a single error in good data. This would cause C1 to miscorrect, and no error flag would accompany the miscorrected symbols. Following de-interleave, the C2 decoder could detect and correct the miscorrected symbols as they would now be single-symbol errors in many codewords. The overall miscorrection probability of the system is thus quite minute. Where C1 detects burst errors, error flags will be attached to all symbols in the failing C1 codeword. After de-interleave in the memory, these flags will be used by the C2 decoder to correct up to two corrupt symbols in one C2 codeword. Should more than two flags appear in one C2 codeword, the errors are uncorrectable, and C2 flags the entire codeword bad, and the interpolator will have to be used. The final odd/even sample de-interleave makes interpolation possible because it displaces the odd corrupt samples relative to the even corrupt samples.

If the rate of bad C2 codewords is excessive, the correction system is being overwhelmed, and the output must be muted to prevent unpleasant noise. Unfortunately digital audio cannot be muted by simply switching the sample stream to zero, since this would produce a click. It is necessary to fade down to the mute condition gradually by multiplying sample values by descending coefficients, usually in the form of a half-cycle of a cosine wave. This gradual fade-out requires some advance warning, in order to be able to fade out before the errors arrive. This is achieved by feeding the fader through a delay. The mute status bypasses the delay, and allows the fade-out to begin sufficiently in advance of the error. The final output samples of this system will be either correct, interpolated or muted, and these can then be sent to the convertors in the player.
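
The fade-to-mute process can be sketched as below. The function names and the fade length are hypothetical, and the delay is modelled simply by allowing the fade to begin a fixed number of samples before the first bad sample.

```python
import math


def fade_coefficients(length):
    """Descending gains following a half-cycle of a cosine, 1.0 to 0.0."""
    return [0.5 * (1 + math.cos(math.pi * n / (length - 1)))
            for n in range(length)]


def mute_with_fade(samples, bad_from, fade_length):
    """Fade out so that the gain has reached zero before bad_from.

    samples: the delayed sample stream.
    bad_from: index of the first sample affected by uncorrectable errors.
    fade_length: number of samples over which to fade (at least 2). The
    mute status bypasses the delay, so the fade can start this many
    samples early.
    """
    coefficients = fade_coefficients(fade_length)
    start = bad_from - fade_length
    output = []
    for index, sample in enumerate(samples):
        k = index - start
        if k < 0:
            gain = 1.0                 # before the fade begins
        elif k < fade_length:
            gain = coefficients[k]     # during the fade
        else:
            gain = 0.0                 # fully muted
        output.append(sample * gain)
    return output
```

The delay in the audio path must be at least as long as the fade, otherwise the bad samples would arrive before the gain had reached zero.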

The power of the CD error correction is such that damage to the disk generally results in mistracking before the correction limit is reached. There is thus no point in making it more powerful. CD players vary tremendously in their ability to track imperfect disks and expensive models are not automatically better. It is generally a good idea when selecting a new player to take along some marginal disks to assess tracking performance.

The control system of a CD player is inevitably microprocessor-based, and as such does not differ greatly in hardware terms from any other microprocessor-controlled device. Operator controls will simply interface to processor input ports and the various servo systems will be enabled or overridden by output ports. Software, or more correctly firmware, connects the two. The necessary controls are Play and Eject, with the addition in most players of at least Pause and some buttons which allow rapid skipping through the program material.

Although machines vary in detail, the flowchart of Figure 12.61 shows the logic flow of a simple player, from start being pressed to sound emerging. At the beginning, the emphasis is on bringing the various servos into operation. Towards the end, the disk subcode is read in order to locate the beginning of the first section of the program material.

When track-following, the tracking-error feedback loop is closed, but for track-crossing, in order to locate a piece of music, the loop is opened, and a microprocessor signal forces the laser head to move. The tracking error becomes an approximate sinusoid as tracks are crossed. The cycles of tracking error can be counted as feedback to determine when the correct number of tracks have been crossed. The ‘mirror’ signal obtained when the read-out spot is half a track away from target is used to brake pickup motion and re-enable the track-following feedback.

The control system of a professional player for broadcast use will be more complex because of the requirement for accurate cueing. Professional machines will make extensive use of subcode for rapid access, and in addition are fitted with a hand-operated rotor which simulates turning a vinyl disk by hand. In this mode the disk constantly repeats the same track by performing a single track-jump once every revolution. Turning the rotor moves the jump point to allow a cue point to be located. The machine will commence normal play from the cue point when the start button is depressed or from a switch on the audio fader. An interlock is usually fitted to prevent the rather staccato cueing sound from being broadcast.

CD changers running from 12 volts are available for remote installation in cars. These can be fitted out of sight in the luggage trunk and controlled from the dashboard. The RAM buffering principle can be employed to overcome skipping as in MD, but a larger memory is required.

Personal portable CD players are available, but these have not displaced the personal analog cassette in the youth market. This is possibly due to the cost of player and disks relative to the Compact Cassette. The Compact Cassette is also more immune to rough handling. Personal CD players are more of a niche market, being popular with professionals who are more likely to have a quality audio system and CD collection. The same CDs can then be enjoyed whilst travelling.

Figure 12.61    Simple flowchart for the control system, which focuses, starts the disk and reads subcode to locate the first item of programme material.

12.21 MD recorder/player structure

Figure 12.62 shows the block diagram of an MD player. There is a great deal of similarity with a conventional CD player in the general arrangement. Focus, tracking and spindle servos are basically the same, as is the EFM and Reed–Solomon replay circuitry. The main difference is the presence of recording circuitry connected to the magnetic head, the large buffer memory and the data reduction codec. The figure also shows the VLSI chips developed by Sony for MD. Whilst MD machines are capable of accepting 44.1 kHz PCM or analog audio in real time, there is no reason why a twin-spindle machine should not be made which can dub at four to five times normal speed.

Figure 12.62    MiniDisc block diagram. See text for details.

12.22 Structure of a DVD player

Figure 12.63 shows the block diagram of a typical DVD player, and illustrates the essential components. The most natural division within the block diagram is into the control/servo system and the data path. The control system provides the interface between the user and the servo mechanisms, and performs the logical interlocking required for safety and the correct sequence of operation.

Figure 12.63    A DVD player’s essential parts. See text for details.

The servo systems include any power-operated loading drawer and chucking mechanism, the spindle-drive servo, and the focus and tracking servos already described for CD.

The data path consists of the data separator, the de-interleaving and error-correction process followed by a RAM buffer which supplies the MPEG decoders.

The data separator converts the EFM+ read-out waveform into data. Following data separation the error-correction and de-interleave processes take place. Because of the interleave system, there are two opportunities for correction, first, using the inner code prior to de-interleaving, and second, using the outer code after de-interleaving. As MPEG data are very sensitive to error the correction performance has to be extremely good.

Following the de-interleave and outer error-correction process an MPEG program stream emerges. Some of the program stream data will be video, some will be audio and this will be routed to the appropriate decoder. It is a fundamental concept of DVD that the bit rate of this program stream is not fixed, but can vary with the difficulty of the program material in order to maintain consistent image quality. Although the bit rate allocated to the audio remains constant, the video bit rate does not. The bit rate is changed by changing the linear speed of the disk track. However, there is a complication because the disk uses constant linear velocity rather than constant angular velocity. It is not possible to obtain a particular bit rate with a fixed spindle speed.

The solution is to use a RAM buffer between the transport and the MPEG decoders. The amount of data read from the disk over the long term must equal the amount of data used by the MPEG decoders. The speed of the motor is unimportant. The important factor is that the data rate needed by the decoder is correct, and the system will drive the spindle at whatever speed is necessary so that the buffer neither underflows nor overflows.

The MPEG decoder will convert the compressed elementary streams into PCM video and audio and place the pictures and audio blocks into RAM. These will be read out of RAM whenever the time stamps recorded with each picture or audio block match the state of a time stamp counter. If bidirectional coding is used, the RAM readout sequence will convert the recorded picture sequence back to the real-time sequence. The time stamp counter is derived from a crystal oscillator in the player which is divided down to provide the 90 kHz time stamp clock.
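
The release of decoded pictures against the time stamp counter can be sketched as below. The function names and the tuple layout are hypothetical, and the model steps the counter one tick at a time purely for clarity.

```python
import heapq

TIMESTAMP_CLOCK_HZ = 90_000   # time stamp clock rate given in the text


def ticks_per_picture(frame_rate):
    """Number of 90 kHz ticks between successive pictures."""
    return round(TIMESTAMP_CLOCK_HZ / frame_rate)


def present(decoded, end_tick):
    """Release pictures when their time stamps match the counter.

    decoded: list of (time_stamp, label) pairs in decoding order, which
    differs from display order when bidirectional coding is used.
    Returns the labels in the order in which they leave the RAM.
    """
    waiting = list(decoded)
    heapq.heapify(waiting)            # earliest time stamp first
    shown = []
    for counter in range(end_tick + 1):
        while waiting and waiting[0][0] <= counter:
            shown.append(heapq.heappop(waiting)[1])
    return shown
```

At 25 pictures per second the counter advances 3600 ticks per picture, so pictures decoded in the order I, P, B, B with time stamps 0, 10800, 3600 and 7200 leave the RAM in the real-time order I, B, B, P.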

As a result the frame rate at which the disk was mastered will be replicated as the pictures are read from RAM. Once a picture buffer is read out, this will trigger the decoder to decode another picture. It will read data from the buffer until this has been completed and thus indirectly influence the disk speed.

Figure 12.64    Simple processes required for a DVD player to operate.

Owing to the use of constant linear velocity, the disk speed will be wrong if the pickup is suddenly made to jump to a different radius using manual search controls. This may force the data separator out of lock, or cause a buffer overflow and the decoder may freeze briefly until this has been remedied.

Although machines vary in detail, the flowchart of Figure 12.64 shows the logic flow of a simple player, from start being pressed to pictures and sound emerging. At the beginning, the emphasis is on bringing the various servos into operation. Towards the end, the disk subcode is read in order to locate the beginning of the first section of the program material.

References

1. Bouwhuis, G. et al., Principles of Optical Disc Systems, Bristol: Adam Hilger (1985)
2. Zernike, F., Beugungstheorie des schneidenverfahrens und seiner verbesserten form, der phasenkontrastmethode. Physica, 1, 689 (1934)
3. Mee, C.D. and Daniel, E.D. (eds) Magnetic Recording, Vol. III, New York: McGraw-Hill (1987)
4. Connell, G.A.N., Measurement of the magneto-optical constants of reactive metals. Appl. Opt., 22, 3155 (1983)
5. Goldberg, N., A high density magneto-optic memory. IEEE Trans. Magn., MAG-3, 605 (1967)
6. Various authors, Philips Tech. Rev., 40, 149–180 (1982)
7. German Patent No. 2,208,379
8. Various authors, Video long-play systems. Appl. Opt., 17, 1993–2036 (1978)
9. Airy, G.B., Trans. Camb. Phil. Soc., 5, 283 (1835)
10. Ray, S.F., Applied Photographic Optics, Oxford: Focal Press (1988)
11. Maréchal, A., Rev. d’Optique, 26, 257 (1947)
12. Hopkins, H.H., Diffraction theory of laser read-out systems for optical video discs. J. Opt. Soc. Am., 69, 4 (1979)
13. Bouwhuis et al., op. cit., Chapter 2.
14. Pasman, J.H.T., Optical diffraction methods for analysis and control of pit geometry on optical discs. J. Audio Eng. Soc., 41, 19–31 (1993)
15. Verkaik, W., Compact Disc (CD) mastering – an industrial process. In Digital Audio, edited by B.A. Blesser, B. Locanthi and T.G. Stockham Jr, New York: Audio Engineering Society, 189–195 (1983)
16. Miyaoka, S., Manufacturing technology of the Compact Disc. In Digital Audio, op. cit., 196–201
17. Redlich, H. and Joschko, G., CD direct metal mastering technology: a step toward a more efficient manufacturing process for Compact Discs. J. Audio Eng. Soc., 35, 130–137 (1987)
18. Ogawa, H., and Schouhamer Immink, K.A., EFM – the modulation system for the Compact Disc digital audio system. In Digital Audio, op. cit., 117–124
19. Schouhamer Immink, K.A. and Gross, U., Optimization of low-frequency properties of eight-to-fourteen modulation. Radio Electron. Eng., 53, 63–66 (1983)
20. Peek, J.B.H., Communications aspects of the Compact Disc digital audio system. IEEE Commun. Mag., 23, 7–15 (1985)
21. Vries, L.B. et al., The digital Compact Disc – modulation and error correction. Presented at the 67th Audio Engineering Society Convention (New York, 1980), Preprint 1674
22. Vries, L.B. and Odaka, K., CIRC – the error correcting code for the Compact Disc digital audio system. In Digital Audio, op. cit., 178–186