Tone reproduction is concerned with the reproduction of original intensities and intensity differences, as well as with the observer’s impression of these quantities, i.e. brightness (or lightness) and contrast reproduction respectively. Intensity is used in this context as a generic term for physical quantities related to the imaging signal or output, such as luminance and illuminance, transmittance, reflectance and density, and pixel value. Tone reproduction is the most crucial dimension of image quality, being a critical component of the subjective impression of the excellence of an image and the fidelity of a reproduction. Other important aspects of image quality, such as colour and perceived sharpness, are greatly affected by the contrast of an image and their subjective evaluation relies on optimum tone reproduction.
The theory of tone reproduction was established by L.A. Jones (an American physicist specializing in sensitometry) in the 1920s and 1930s, who first distinguished between objective and subjective tone reproduction. The terms were later formalized by C.N. Nelson. Objective tone reproduction refers to the measured or modelled relationship between the input and output intensities of an individual imaging device or system, or an imaging chain consisting of a number of systems. Subjective tone reproduction is, of course, dependent on the objective tone reproduction but also takes into account the viewing conditions (luminance level, background luminance, flare, etc.) that greatly influence the perception of image tones.
The relationship between input and output intensities in an imaging device or system is described by one or a set of transfer functions. The luminance–brightness relationship introduced in Chapter 4 (see Figure 4.16) can be considered as the transfer function of the human visual system (HVS). The photographic transfer function, known as the characteristic curve, describes the sigmoid relationship between the common logarithm of relative exposure and the reproduced density on film or photographic paper. Note that density is also a logarithmic quantity, defined as the log10 of the reciprocal of the film transmittance or print reflectance (Eqns 8.4 and 8.7). Chapter 8 is dedicated to sensitometry and the characteristic curve. Later in this chapter we discuss in detail the transfer functions of input and output imaging devices employed in digital imaging, as well as the transfer functions, such as the opto-electronic conversion function (OECF), of various encoding systems.
Transfer functions are commonly plotted in linear–linear, log–log or linear–log units (with respect to luminance). The photographic characteristic curve – the earliest imaging transfer function – is plotted in log–log units, as mentioned above. The slope (or gradient) of the straight-line portion of the characteristic curve is termed gamma (denoted by the Greek letter γ) and it is a descriptor of the contrast of the photographic material under a given set of development conditions. The transfer functions of digital image capturing devices and display systems are usually plotted in linear–linear units (for example, luminance vs. pixel value in displays), but not exclusively. As we will see later on, the input–output relationships of such systems are often described by power transfer functions, i.e. output = input^exponent, where the ‘exponent’ represents gamma, a descriptor of the imaging contrast with a similar meaning to the photographic gamma. It is important to note that a power relationship appears linear when plotted in log10 vs. log10 units. This implies that the slope of the straight line of a transfer function represented in log–log units is equal to the exponent of a transfer function represented in linear–linear units. Therefore, mathematically, gamma has the same meaning whether it is calculated from the slope of a transfer function plotted in log–log units, or from the exponent of a transfer function plotted in linear–linear units. This concept is illustrated in Figure 21.1.
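As a rough illustration of this equivalence, the short sketch below (an illustrative example, not from the text; the gamma of 2.2 is an arbitrary assumption) generates power-law data and recovers the exponent as the slope of a straight-line fit in log–log space.

```python
import numpy as np

gamma = 2.2                                  # assumed example exponent
x = np.linspace(0.01, 1.0, 50)               # relative input intensity
y = x ** gamma                               # output = input ^ exponent

# In log10-log10 space the power law becomes a straight line whose slope is gamma.
slope, intercept = np.polyfit(np.log10(x), np.log10(y), 1)
print(round(slope, 3))                       # prints 2.2
```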
The relationship between original scene intensity and reproduced image intensity characterizes the tone reproduction of an imaging chain and is affected by the tonal characteristics of each component in the chain. The overall transfer function of an imaging chain consisting of more than one imaging component is the product of the individual transfer functions of the separate components, as illustrated in the example in Figure 21.2. This is true provided that all individual transfer functions are plotted in the same mathematical space (i.e. log–log or linear–linear) and expressed using relative intensities. The overall gamma of the imaging chain in this example may therefore be derived by calculating the product of the component gammas (Eqn 21.1), or equally by deriving the gamma from the overall transfer function.
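Written out, the product in Eqn 21.1 is:

γO = γP × γS × γD × γC      (21.1)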
where the subscripts O, P, S and D refer to the overall system, the photographic print, the scanner and the display respectively. γC is the amount of gamma correction, which can be applied to adjust the overall gamma to a desired value. Gamma correction can be carried out in imaging software, applied during scanning or can be a built-in function in video and digital cameras.
The term gamma correction comes from the television industry and originally referred to the modification of the video camera signal required to compensate for the tonal distortion introduced by the non-linearity of cathode ray tube (CRT) display systems (see Chapter 15 for CRT transfer characteristics and later in this chapter). In this particular case, gamma correction was achieved by applying the inverse (of the display) transfer function to the original signal to obtain an overall system gamma equal to unity:
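V′ = V^(1/γD)      (21.2)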
where V′ is the corrected video signal, V is the original video signal and γD is the exponent of the power function that models the transfer characteristics of the display.
Equation 21.2 is a simplification of the case of real systems, since it assumes that the CRT display is perfectly set up and thus can be modelled by a single power function – as we will see later on, strictly speaking this is not true. Additionally, the gamma for television signal transmission is greater than 1/γD and thus the overall gamma is greater than unity to compensate for the dim viewing environments common to television viewing (see section on subjective tone reproduction below). For example, if γD is 2.5 as in the case of High Definition television (HDTV) systems, the effective gamma correction is close to 0.5 (whereas 1/2.5 = 0.4) and the overall gamma equal to 1.25.
As we will see later in the chapter, in reality imaging transfer functions are rarely perfectly linear or pure power functions. The effective gamma correction refers to the exponent of a power function that approximates the system’s overall transfer function and is usually slightly different to the actual exponent applied in gamma correction.
In objective tone reproduction, measured gamma values higher than 1.0 indicate higher contrast in the reproduction than in the original scene; gamma values lower than 1.0 indicate lower contrast than the original; gamma equal to 1.0 indicates equal scene and reproduction contrast. While the aim of objective tone reproduction is a one-to-one reproduction of the relative input intensity, i.e. relative luminance reproduction, the aim of subjective tone reproduction is a linear reproduction of brightness relative to white, i.e. a linear lightness reproduction (lightness is the subjective perception of relative luminance – see Chapter 5). As we saw in Chapter 5, however, the perception of lightness is dependent on the relative intensity of the stimulus and the viewing conditions. More specifically, the perceived contrast decreases as the intensity of the stimulus, adapting background and surround decreases. Thus, the optimum tone reproduction can only be considered when the viewing conditions are known. Additionally, projection, printing and/or viewing flare lower image contrast (and thus the overall viewing gamma), especially in the dark image regions. To compensate for this ‘flare effect’ an ideal imaging system must have a gradually decreasing gamma at the light end of its transfer function.
Optimum gamma values for subjective tone reproduction vary between 1.0 and 1.5 (Figure 21.3). For example, reflection prints, typically viewed in bright conditions, require an overall viewing gamma of approximately 1.0, but the overall objective gamma is close to 1.1 to compensate for flare. This is achieved by having, for example, a negative gamma close to 0.90 and a print gamma of 1.2. A gamma of approximately 1.25 is optimum for television and cut-sheet transparencies viewed in dim environments, whereas a gamma of approximately 1.5 is optimum for transparencies projected in dark surrounds and for motion pictures. Thus, slide and motion picture films are designed to have high contrast – typically a gamma of 1.6 – to compensate also for flare. Note that overall gamma values of 1.25 and 1.5 for dim and dark surrounds are optimum for fairly high imaging intensities – in the case of film and print, densities. For lower densities slightly lower gamma values may be more appropriate. Computer images are viewed in settings brighter than those for television – typically office settings – and therefore the overall target gamma for viewing displayed digital images is between 1.1 and 1.15.
The viewing conditions can be modelled by a separate transfer function included in the imaging chain. This idea was first represented in the photographic tone reproduction diagram or quadrant diagram, introduced by L.A. Jones, which aims to model the tone reproduction of all stages in the imaging chain in a single diagram. The diagram consists of four quadrants, which are interrelated graphs representing each stage of the photographic process, proceeding in a clockwise direction. The data output from the graph in the first quadrant is transferred as the input to the graph in the next quadrant, as illustrated in Figure 21.4. The representative graphs in the quadrants in the diagram are:
1. Characteristic curve of negative material and development
2. Characteristic curve of positive material
3. Transfer line
4. Overall tone reproduction characteristics achieved with steps 1, 2 and 3.
The transfer line in the third quadrant can be used to represent the viewing conditions. A simple straight line at 45° (slope = 1.0) can represent viewing of the print under conditions similar to those used when measuring the print with the reflection densitometer. Different viewing arrangements should produce a different line, depending on how the final print densities are viewed by the observer.
In the quadrant diagram, points from the first characteristic curve in the reproduction process are transferred from quadrant to quadrant via the characteristic curve of each stage in the process to produce the overall tone reproduction curve. Thus, the idea is essentially the same as that presented in Figure 21.2. Quadrant diagrams can also be used to represent the tonal characteristics of digital imaging chains (as illustrated in Figure 21.5), for example a camera/gamma correction/display system. To obtain a desired tone reproduction curve, the gamma correction required for tonal compensation in the digital chain can be derived from the transfer line.
The evaluation of the tone reproduction of CRT display systems is based on the relationship between input voltage in mV (or input pixel values in the frame buffer, provided that they are linearly related to voltage) and output luminance in cd m–2 produced on the CRT faceplate. CRTs are inherently non-linear: the output luminance is a non-linear function of the input voltage. The CRT transfer function can be roughly described by a power function which obeys what physicists call the 5/2 power law (see Figure 21.6). Most CRTs have a numerical value of gamma – described by the exponent of the power function – close to 2.5.
Most commonly, the CRT transfer function is described by the gamma model given below:
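L = (g·V + o)^γD      (21.3)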
where: L is the normalized luminance (obtained by calculating (Ln – Lmin)/(Lmax – Lmin), with Ln representing the nth luminance level); V is the normalized voltage (or normalized pixel value in the case of digital systems); o is the system offset, controlled by the brightness setting on the display; g is the system gain, controlled by the contrast setting on the display; and γD is the descriptor of the non-linearity in the contrast of the displayed image.
Theoretically, on a correctly adjusted display the offset and gain are set so that they have values of 0 and 1 respectively. This makes output luminance equal the input voltage raised to the power of γD, which in turn becomes the sole descriptor of the display tone reproduction.
In reality the offset is almost never exactly 0 and the gain never exactly equal to 1, so the measured display transfer function cannot be modelled solely by a straight power function. This is evident in Figure 21.6, where we notice in log–log space (where the very low luminances are better described) a deviation of the measured CRT response from a straight power curve.
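As an illustration, the sketch below (illustrative only; the ‘measured’ luminances are synthetic stand-ins generated from assumed gain, offset and gamma values) fits the gamma model of Eqn 21.3 to a set of normalized display measurements by least squares.

```python
import numpy as np
from scipy.optimize import curve_fit

def gamma_model(v, gain, offset, gamma):
    # Eqn 21.3: normalized luminance as a function of normalized voltage/pixel value.
    return np.clip(gain * v + offset, 0.0, None) ** gamma

v = np.linspace(0.0, 1.0, 18)                    # normalized input levels
lum = gamma_model(v, 0.98, 0.02, 2.4)            # stand-in for measured, normalized luminance

params, _ = curve_fit(gamma_model, v, lum, p0=[1.0, 0.0, 2.2])
print(params)                                    # estimated gain, offset and display gamma
```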
What Poynton (2003) refers to as an ‘amazing coincidence’ is that the luminance–lightness function describing the human visual system’s lightness sensitivity is nearly the inverse of the CRT transfer function: lightness is approximately luminance raised to the power of 0.33 (see Chapters 4 and 5). This coincidence has implications in imaging: the gamma-corrected functions implemented in video cameras, digital image encoding systems and digital input devices essentially mimic the visual transfer function. This is advantageous with respect to image quality: to minimize the perceptibility of image noise (see Chapter 24) it is advisable to use a perceptually uniform code. (This is especially true when encoding is carried out at bit depths lower than 9 bits per pixel, as we will see later in the chapter.) The gamma-corrected video camera voltage (Eqn 21.2), as well as the corrected signal in most encoding systems (see later), is a perceptually uniform input.
More complicated models than the gamma function are used to describe the CRT non-linearity with greater accuracy, although their implementation is generally more complex. One that is widely implemented is the GOG (gain, offset, gamma) model, described in Eqn 21.4 for the red channel, in which the CRT image is described in terms of its spectral radiance (similar expressions describe the green and blue channels):
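In a commonly quoted form (after Berns, 1996), with dr denoting the red-channel digital count:

Lλ,r = Lλ,r,max [kg,r (LUT(dr) / (2^N − 1)) + ko,r]^γr      (21.4)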
where Lλ,r,max defines the maximum spectral radiance of the red channel (r) for a given CRT set-up, LUT represents the video look-up table and N is the number of bits in the digital-to-analogue converter (DAC). Constants kg,r and ko,r are referred to as the system gain and offset respectively, and γr is the exponent describing the CRT’s inherent non-linearity between video voltages and the electron beam currents hitting the faceplate. Lλ,r is the spectral radiance produced on the CRT faceplate. Because the spectral radiance depends on both the graphics display controller and the actual CRT, Eqn 21.4 includes components common to all computer-controlled CRT displays, such as the DAC and video LUTs, and thus characterizes the ‘display system’ as a whole.
Display technologies more modern than the CRT, including LCDs, have inherently different transfer functions from that of the CRT. They are intrinsically linear devices, i.e. the output luminance produced on the faceplate is linearly related to the input pixel value, at least over the largest part of the system’s luminance range. The native transfer function of an LCD depends on the particular cell structure and the mode in which it operates. Because of the ‘amazing coincidence’ described above, however, it is essential for image quality that the encoding of image data is carried out in a non-linear space. Therefore, LCDs incorporate some internal (local) correction to adapt their intrinsic transfer functions, often modelled by hyperbolic (i.e. sigmoid) functions, to transfer functions that have been standardized for video transmission and image interchange. This signal remapping is achieved via a voltage modification or internal LUTs. Many modern LCDs have transfer functions that mimic that of the CRT and therefore the model described in Eqn 21.3 can be used to approximate the transfer characteristics of such displays; however, it does not describe them accurately.
Display transfer functions are measured in total darkness, where the display is the only emissive device. The display needs to have been turned on for a sufficient time before the measurement to ensure stabilization (see Chapter 15). CRTs need approximately 45 minutes to 1 hour to stabilize; LCDs stabilize in a maximum of 10 minutes. Photometers, spectrophotometers and colorimeters can be used to measure the display luminance (Y) output in candelas per square metre (cd m–2) of the red, green and blue channels, as well as the monochrome display response. Spectroradiometers can also be used to record spectral radiance. When measuring LCDs, instruments with very small apertures are required; also, measurements are carried out along an axis exactly perpendicular to the display faceplate to avoid angle-of-view dependencies (see Chapter 15).
Measuring methods involve displaying uniform patches in the central 50% area of the display, from which the luminance is measured (Figure 21.7). For CRT measurements, the remaining display area is often set to display either the complementary colour or an average display luminance, i.e. an average grey, to avoid electron gun overload. For LCD measurements the remainder of the display is set to black. Displayed patches range from the display black (input pixel value d = 0) to the maximum colour of each channel (input pixel value d = 2^N − 1, where N is the number of bits per pixel sent to the graphics card – usually 8), with equal steps in between. Seventeen to twenty-two steps per channel are usually enough to evaluate the display transfer characteristics. For measuring the greyscale response, the input is set to dr = dg = db, where dr, dg and db are the red, green and blue channel counts respectively. For measuring a single-channel response, d of that channel is set to the desired input pixel value and the pixel values of the remaining two channels are always set to zero. Linear or non-linear interpolation between measured points can be used to estimate missing values and create LUTs that represent fairly well the response of the system for all input digital counts.
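The short sketch below (an illustrative example; the patch count, bit depth and ‘measured’ luminances are assumptions) generates the series of input counts for such patches and builds a full look-up table by linear interpolation between the measured points.

```python
import numpy as np

N_BITS = 8
STEPS = 18
d = np.round(np.linspace(0, 2**N_BITS - 1, STEPS)).astype(int)   # patch pixel values

# measured_lum would come from the photometer/colorimeter, one reading per patch;
# a synthetic response is used here as a stand-in.
measured_lum = (d / d.max()) ** 2.2

# Interpolate between the measured points to estimate the response at every count.
all_counts = np.arange(2**N_BITS)
lut = np.interp(all_counts, d, measured_lum)
```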
In a perfectly additive display, the measured normalized red, green and blue channel responses should match. In reality this is rarely the case; any significant deviation, however, indicates problems with the greyscale tracking of the display, resulting in a colour cast in the areas where the deviation occurs. Figure 21.8 shows measured red, green, blue and greyscale channel responses of an LCD system before (Figure 21.8a) and after (Figure 21.8b) normalization. The shape of the transfer function is similar to that of a CRT. In Figure 21.8a we notice that the neutral response is not identical to the sum of the three colour channel responses, an indication of a lack of total additivity that is also seen in the slightly non-matching normalized blue channel response (see also Chapter 23).
In the evaluation of display transfer functions there is often a problem in numerical precision when measuring low-luminance colours – especially at zero digital input values. This results from either the low dynamic range of some displays, or the insufficient precision of measuring instruments at low luminance levels, or both.
The tonal characteristics of digital acquisition devices are commonly described by the relationship between the original scene luminance (often expressed by luminance ratios such as scene or print reflectance, or film transmittance) and the generated digital counts. Although the charge-coupled device (CCD) and complementary metal oxide semiconductor (CMOS) sensors incorporated in such systems generally respond in a linear fashion over the majority of input luminances (see Chapter 9), a non-linear mapping (gamma correction) of the output signal takes place in the system firmware or software to mimic the eye’s response and, at the same time, compensate for the display non-linearity and bring the overall system gamma close to unity. Note that digital RAW files (see Chapter 17), however, are reproduced with linear tone reproduction (gamma 1.0) – but also with high enough bit depths to compensate for the problems accompanying linear quantization (see later). The RAW converter applies gamma correction to redistribute the tonal information when images are displayed.
The transfer function of acquisition devices, also known as the opto-electronic conversion function (OECF), describes the overall transfer characteristics of a camera or a scanner system (i.e. sensor, firmware, software) and can be modelled roughly by the inverse of the CRT transfer function given in Eqn 21.3:
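PV = (g·L + o)^γA      (21.5)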
where PV is the generated pixel value, often normalized by the maximum count (i.e. 2^N − 1, where N is the number of output bits per channel), L is the input luminance ratio, o is the system offset, g is the system gain and γA is the measure of digital image contrast.
An offset in the positive direction can be caused by either an electronic shift or stray light in the system. While the electronic offset can be set equal to zero and offset from uniform stray light can be adjusted out electronically, signals coming from flare light (i.e. stray light coming through the lens) are often scene dependent.
In digital cameras the acquisition gamma, γA, is usually near 1/γD (i.e. the inverse of the display gamma, ≅ 1/2.2 to 1/2.5 or 0.45 to 0.40), or 1.0 – depending on the choice of output colour space (or encoding) and/or file format (see Chapter 17). When scanning, γA may be adjusted by the user via the scanning software so that a desired gamma correction is applied during digital image acquisition. When capturing using a chosen colour encoding such as sRGB, γA = γE, the gamma of the encoding system (see below and Chapter 23).
The transfer functions of digital acquisition devices can be measured by averaging the response of the device to uniform transmittance or reflectance steps of conventional test charts (see Figure 21.9). Examples of such charts are: the ISO camera OECF test chart (ISO 14524:1999 method for measuring camera OECFs – see Bibliography), which includes 12 reflective uniform neutral grey patches with visual density increments that are equal with respect to the cube root of the luminance illuminating the target; and the Kodak Q-13 greyscale, which includes 20 uniform grey steps in 0.10 density increments between almost 0.0 (white) and a practical printing black of 1.90 density. Visual densities in such charts are measured with conventional densitometers (see Chapter 8). Reflectance (or transmittance in transmissive targets) values can be calculated using the inverse of Eqn 8.4. Alternatively, for reflection-only charts and camera measurements, chart luminances can be measured with a telescopic photometer placed at the camera location, or chart relative luminances can be measured using a spectrophotometer. Strictly, the resulting curves from such measurements are average responses to the specific target, target positioning and illuminance. Various non-uniformities affect the response of the system, including the response of individual sensor elements in the camera/scanner sensor and lamp variations during illumination or scanning.
OECFs are often plotted in a linear–linear space (for example, (normalized) pixel value vs. scene relative luminance – or test chart reflectance, or transmittance) or a log–log space (for example, log10 (normalized) pixel value vs. chart density). γA can be approximated by extracting the exponent of the power function in the former case, or the slope of the straight line in the latter case. The ISO 14524:1999 method suggests plotting the output pixel value, or log2 of the output pixel value, vs. the input log exposure or the input log luminance. Log luminance values are calculated from chart density measurements.
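As an illustration, the sketch below (hypothetical patch values, idealized here to follow a 1/2.2 power law) converts chart densities to reflectances using the inverse of Eqn 8.4 and estimates γA from the slope of a straight-line fit in log–log space.

```python
import numpy as np

density = np.array([0.05, 0.25, 0.45, 0.65, 0.85, 1.05, 1.25, 1.45])   # chart visual densities
pixel_value = np.array([242, 196, 159, 129, 105, 85, 69, 56])          # mean 8-bit camera output

reflectance = 10.0 ** (-density)            # inverse of Eqn 8.4
norm_pv = pixel_value / 255.0

# The slope of the straight-line fit in log-log space approximates gamma_A.
slope, _ = np.polyfit(np.log10(reflectance), np.log10(norm_pv), 1)
print(round(slope, 2))                      # close to 1/2.2 (approximately 0.45)
```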
OECFs can also be constructed by interpolating between closely spaced measured points and creating LUTs that represent the response of the system at every pixel value. Figure 21.10 shows two example RGB OECFs plotted in linear–linear (a, c) and log–log (b, d) spaces. The overlapping functions in Figure 21.10a and b indicate good greyscale response, where γAr ≅ γAg ≅ γAb (the subscripts r, g and b denote red, green and blue). Notice that these responses are nearly the inverse of the CRT responses. However, when the OECFs are plotted in log–log space they do not appear linear but as hyperbolic functions, similar to the photographic characteristic curve, where tones are compressed in the shoulder and toe regions (see Chapter 8). In fact, digital camera manufacturers often use S-shaped LUTs to emulate the response of silver halide materials, which are known to produce pleasing tone reproduction. In Figure 21.10c and d, the OECFs show poor greyscale response, with the blue channel response differing from that of the red and green channels, indicating a colour cast in the mid and dark tones of the acquired image.
sRGB (standard Red, Green, Blue) is a colour image encoding that is devised for personal computers and for image exchange on the Internet. Most commercial digital cameras use sRGB as their standard output colour space, or at least give the option to choose sRGB encoding. The sRGB encoding space is discussed in detail in Chapter 23, but here we present its transfer function.
sRGB is calibrated colorimetrically for a reference display output, viewing conditions and observer. It assumes that the encoded signals will be output on a standard CRT display system with a nominal gamma value of 2.5. sRGB has an effective power-function gamma of 1/2.2 (0.45), resulting in an overall system (encoding × display) gamma of 1.125, appropriate for dim illumination conditions and flare. The sRGB overall gamma is similar (but not identical) to that of the Rec. 709 standard encoding for HDTV. The encoding transfer function is given by:
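sR′G′B′ = 12.92 × sRGB,  for sRGB ≤ 0.0031308
sR′G′B′ = 1.055 × sRGB^(1/2.4) − 0.055,  for sRGB > 0.0031308      (21.6)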
where sR′G′B′ is the gamma-corrected encoded signal (either red, green or blue – between 0 and 1.0) and sRGB is the linear input signal prior to the gamma correction (either red, green or blue – between 0 and 1.0).
Note that the sRGB transfer function is not a pure power function. At very low relative luminances the transfer function is linear (top part of Eqn 21.6). At relative luminances larger than 0.0031308 the encoding gamma, γE, is equal to the exponent 1/2.4, the offset is –0.055 and the gain is 1.055. The effect of Eqn 21.6 is to fit closely a straightforward effective gamma 1/2.2 (0.45) curve, with o = 0 and g = 1.0. The gamma correction and the –0.055 offset are included to compensate for contrast ratios that occur when a value is at or near zero and for ambient light effects.
Another commonly employed image encoding is the Adobe RGB 1998 encoding system. It has a pure power transfer function, similar in shape to that described in Eqn 21.6, with a straightforward encoding gamma, γE, approximately equal to 1/2.2. As mentioned above, if we model the sRGB function as a pure power function its effective exponent is also 1/2.2. This makes the sRGB and Adobe RGB 1998 encoding functions very similar to each other. Figure 21.11 illustrates the sRGB and Adobe RGB 1998 transfer functions.
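A minimal sketch of the two encoding functions is given below (using the constants of Eqn 21.6; the Adobe RGB 1998 curve is approximated here by a pure 1/2.2 power function).

```python
import numpy as np

def srgb_encode(x):
    # Piecewise sRGB encoding of Eqn 21.6.
    x = np.asarray(x, dtype=float)
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * x ** (1.0 / 2.4) - 0.055)

def adobe_rgb_encode(x):
    # Adobe RGB 1998 approximated as a pure power function with gamma 1/2.2.
    return np.asarray(x, dtype=float) ** (1.0 / 2.2)

x = np.linspace(0.0, 1.0, 11)
print(np.round(srgb_encode(x), 3))
print(np.round(adobe_rgb_encode(x), 3))      # the two curves remain close across the range
```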
In the JPEG encoding standard (see Chapter 29) there is no reference to a transfer function, but a non-linear transfer function, similar to video encoding or to a standard RGB colour encoding system (when such a system is used for image encoding), is embedded. Much lower quality results are obtained when JPEG is applied to linear data, due to the greater visibility of JPEG quantization artefacts.
The tonal characteristics of desktop printers are described by the relationship between input digital counts and the generated density (or reflectance) produced on the paper. The transfer functions of printers do not follow a specific model and depend on many variables, such as printer technology, dot size and dot gain setting, ink concentration, number of inks, printer driver and settings. To characterize this relationship a number of steps are printed from white to the maximum density for the neutral (K) and for the colour channels of the printer (CMY, for example). Linear interpolation is used between measured points to obtain a curve that characterizes the response of the device for the specific ink set, paper and printer settings.
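As a simple illustration (the printer step values and densitometer readings below are hypothetical), a tone curve for the neutral channel can be built by interpolating the measured densities of a printed step wedge across all input counts.

```python
import numpy as np

input_counts = np.array([0, 26, 51, 77, 102, 128, 153, 179, 204, 230, 255])   # printed steps
measured_density = np.array([0.05, 0.12, 0.25, 0.41, 0.60, 0.82,
                             1.05, 1.30, 1.55, 1.75, 1.90])                   # densitometer readings

# Linear interpolation gives an estimated density for every input digital count.
all_counts = np.arange(256)
density_curve = np.interp(all_counts, input_counts, measured_density)
```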
Two other objective measures related to tone are the dynamic range and the contrast ratio. Several definitions of dynamic range exist. Generally, it refers to the range of intensities in a scene (subject luminance range), a medium or a picture, i.e. the difference between the maximum and minimum intensities. Contrast ratio is the ratio of maximum to minimum signal, or output light intensities.
Dynamic range is often expressed in densities (density is a base-10 logarithmic quantity, so the value represents the exponent of ten of the corresponding intensity ratio; for example, 3.0D corresponds to a 1000:1 contrast ratio) or in photographic stops (see Chapter 12). Photographic flare might modify the subject luminance range. ISO 21550:2004 specifies methods for measuring and reporting the dynamic range of electronic scanners for continuous-tone photographic media. It applies to scanners for reflective and transmissive media.
The dynamic range of real scenes may extend from direct sunlight to dark shadows and is thus sometimes much higher than that of imaging systems. High-dynamic-range (HDR) imaging is concerned with a set of techniques that allow the realistic reproduction of such high-dynamic-range scenes. It generally involves multiple captures of the same scene at a range of exposures and their combination to produce one image. An issue with HDR is viewing the reproduced image: typical methods of displaying and printing images have a limited dynamic range, and thus various methods of converting HDR images into a viewable format have been developed. These are generally referred to as ‘tonal mapping’. A number of tonal mapping algorithms exist but they are all scene dependent, meaning that no single algorithm is appropriate for the best rendering of all types of HDR scenes. More on HDR imaging can be found in Chapter 12.
Table 21.1 Typical contrast ratios of imaging devices

DEVICE | CONTRAST RATIO
Liquid crystal computer displays | 200:1 to 20,000:1
Cathode ray tube computer displays | 300:1 to 1500:1
Digital data projectors | 600:1 to 2000:1
Slide projectors | 500:1 up to 1000:1
High contrast ratio is a desired feature of any device; higher ratios bring out subtle tonal and colour differences. The contrast ratios of devices are highly affected by the viewing conditions: the brighter the surrounds, the lower the effective contrast ratio. Typical contrast ratios are given in Table 21.1 and typical luminance ranges in Table 21.2.
A non-linear mapping of the input luminance (i.e. logarithmic or cube root) mimicking the perceptual response is required to make the most of limited contrast ratios in imaging systems (see later in this chapter). Many factors, such as projection flare and backlighting in LCD devices, increase the luminance of the black and therefore lessen the contrast ratio of imaging systems, resulting in decreased image quality.
Due to their discrete nature, digital images allow some further direct measures related to tonal issues. One is the grey-level histogram, which summarizes the frequency of occurrence of the available intensity levels in an image. Histograms are discussed in detail in Chapter 27. Normalizing the grey-level histogram by dividing it by the number of pixels in the image produces the probability density function (PDF), also defined in Chapter 27. The PDFs of three different greyscale images are shown in Figure 21.12. Several characteristics of an image can be understood from its PDF, such as exposure and range, and whether the image is low key (PDF skewed toward the lower pixel values) or high key (PDF skewed toward the higher pixel values). Further, several statistical measures providing information relating to image tones can be calculated from image PDFs. Examples include the mean and median of the PDF, both relating to global image intensity, the standard deviation, which relates to global image contrast (see below), and the entropy, which relates to the information content and random tonal changes in the image. Some of these are discussed in later chapters.
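A minimal sketch of these calculations is given below (using a random stand-in image): the histogram is normalized into a PDF and the mean and standard deviation are then computed from it.

```python
import numpy as np

image = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)   # stand-in 8-bit image

hist, _ = np.histogram(image, bins=256, range=(0, 256))
pdf = hist / image.size                                              # PDF: frequencies sum to 1

levels = np.arange(256)
mean_level = np.sum(levels * pdf)                                    # global image intensity
std_contrast = np.sqrt(np.sum((levels - mean_level) ** 2 * pdf))     # global image contrast
```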
The two definitions that follow represent contrast as a ratio of luminance change to mean background luminance. They are commonly used for measuring the contrast of simple uniform fields and test targets. One is the Weber fraction or Weber definition of contrast (see also Chapter 4), which is used to measure local contrast of an area with uniform luminance against a uniform background:
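CW = ∆L / L      (21.7)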
Table 21.2 Typical dynamic ranges of scenes, media and sensors

MEDIUM/SCENE | DYNAMIC RANGE (LOG UNITS)
Outdoor scenes (luminance range) | 1.25–1.50 typically, up to 3.3
Colour transparency film | 3.0–3.7
Projected transparencies | 2.1 (reduced to 1.4 perceivable scene luminance, due to dark surrounds that reduce apparent contrast)
Black-and-white negative film | 2.1
Colour negative film | 2.1
Black-and-white photographic paper | 1.9
Colour photographic paper | 2.4
Colour reflection prints, in ordinary room illumination conditions | 2.1 (reduced to 1.25 perceivable, due to flare from topmost surface of print)
CCD and CMOS sensors in commercial cameras and scanners | 2.5–3.6
where ∆L is the increment or decrement of the area’s luminance from the luminance, L, of the uniform background.
The second, Michelson’s formula, measures the contrast of a spatially periodic pattern, such as a single frequency sinusoidal grating:
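CM = (Lmax − Lmin) / (Lmax + Lmin)      (21.8)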
where Lmax and Lmin are the maximum and minimum luminance values respectively in the grating (see also Chapter 24).
Equations 21.7 and 21.8 cannot be used to define contrast in digital images with complex scene content (i.e. non-uniform in luminance and complex in terms of spatial frequency content). Because of this difficulty, many definitions of contrast can be found in the literature. A common way to define global contrast in an image, so that the contrast of two different images can be compared, is to measure the root mean square (rms) contrast, defined as the standard deviation of the pixel values (grey levels):
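Crms = [(1/(N − 1)) Σ (xi − x̄)²]^(1/2)      (21.9)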
where N is the total number of image pixels, xi is the normalized grey-level value of the ith pixel (ranging between 0 and 1) and x̄ is the normalized mean grey level, given by:
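x̄ = (1/N) Σ xi      (21.10)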
The rms contrast does not depend on the spatial frequency content of the image or the spatial distribution of the contrast in the image.
We have seen in previous chapters that the human response to luminance is nearly logarithmic (or the cube root of the luminance), that the contrast ratio of human vision at most levels of adaptation is about 100:1 and that the difference between two just-distinguishable luminance levels is about 1% of the luminance – for the majority of luminance levels. What happens when linear quantization is applied to the luminance signal (i.e. when output pixel values are proportional to input luminance) using 256 grey levels? Poynton (2003) points out that in an 8-bit linearly quantized luminance signal the code value 100 is where the ratio between adjacent luminance levels is approximately 1%, thus equal to the visual threshold. Steps between adjacent luminance levels below code value 100 (in the darkest image areas) become increasingly visible, whereas steps above this point become more and more indistinguishable from each other, so that many code values are wasted. As a result, smoothly varying dark areas in images, where the step between adjacent code values can be as high as 4%, suffer from contouring, an artefact that has the appearance of visible steps (or bands) in areas that otherwise would look continuous. Thus, linear quantization in 8 bits per channel is unsuitable. This is one of the reasons why the majority of digital acquisition devices that employ linear quantization use bit depths higher than 8 bits. In fact, even if logarithmic quantization is employed, a bit depth of 9 bits per pixel would be required to produce non-perceptible steps over the entire range of quantized luminances. Figure 21.13 shows that on a perceptual scale from 1 to 100 which is logarithmically related to luminance, equal steps in visual response (represented by the double arrows on the y-axis) require different steps in a linearly quantized luminance signal. It also demonstrates that around code value 100 the balance shifts from the need for fine quantization of the dark values to coarser quantization of the light values.
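The sketch below (an illustrative calculation) shows the relative step between adjacent code values under 8-bit linear luminance quantization: it is well above the approximately 1% visual threshold in the shadows and well below it in the highlights.

```python
import numpy as np

codes = np.arange(1, 256)              # 8-bit linearly quantized luminance codes
step_ratio = 1.0 / codes               # relative luminance step from code k to k + 1

print(step_ratio[codes == 20])         # 5% step: visible banding in dark areas
print(step_ratio[codes == 100])        # 1% step: approximately at the visual threshold
print(step_ratio[codes == 200])        # 0.5% step: finer than the eye can distinguish
```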
Another reason why digital acquisition devices quantize images at bit depths higher than 8 bits is to minimize the effects of the rounding errors caused during image modifications. The limited precision of the integer mathematics used by digital imaging devices and computers presents problems at low bit depths. One example is gamma correction which, as described earlier, is applied to the acquisition signal to adjust the overall gamma in the imaging chain to an optimum value. The problem with tonal correction in discrete systems is that it introduces a loss in the original number of the available intensity levels. In a gamma-corrected signal, the limits of the intensity range (pixel values representing black and white) remain the same as in the original discrete signal, but some of the original available intensity levels are lost whereas some others are repeated due to rounding errors (see example in Table 21.3). The losses appear more significant when the quantization levels are coarse and gamma correction in an 8-bit space may result in up to a 30% loss in the available intensity levels. The loss of intensity levels with gamma correction is illustrated in Figure 21.14, which presents a range of gamma corrections in an 8-bit per channel quantized signal. Missing grey levels in an image may lead to posterization, an artefact similar to contouring, mentioned above, where continuous gradation of image tones is replaced with visible steps of fewer tones.
A solution to this is to start with more available intensity levels, i.e. higher bit depths, so that the loss is not so significant. Performing tone modification in 12 or even 16 bits per channel and down-sampling the optimized signal to 8 bits per channel for output means that the remaining levels after gamma correction are enough to be mapped to 256 output code values. This is a very common implementation in digital cameras and scanners. The subject of 8-bit versus 16-bit depth imaging is further dealt with in Chapter 28.
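The sketch below (an illustrative calculation with an assumed correction exponent of 1/2.2) counts how many distinct grey levels survive when gamma correction is performed with 8-bit integer precision, and how the loss is largely avoided when the same correction is applied to a 16-bit signal before down-sampling to 8 bits.

```python
import numpy as np

# Gamma correction applied directly to an 8-bit signal.
levels8 = np.arange(256)
corrected8 = np.round(255.0 * (levels8 / 255.0) ** (1 / 2.2)).astype(np.uint8)
print(len(np.unique(corrected8)))      # well below 256: levels are lost to rounding

# The same correction applied to a 16-bit signal, then down-sampled to 8 bits.
levels16 = np.arange(65536)
corrected16 = np.round(255.0 * (levels16 / 65535.0) ** (1 / 2.2)).astype(np.uint8)
print(len(np.unique(corrected16)))     # essentially all 256 output codes remain populated
```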
Adobe RGB 1998 Color Image Encoding, version 2005-05, 2005. Adobe Systems Incorporated.
Berns, R.S., 1996. Characterization of a CRT display. Displays 16 (4), 173–182.
Bilissi, E., Langford, M., 2008. Langford’s Advanced Photography, seventh ed. Focal Press, Oxford, UK.
Eggleston, J., 1984. Sensitometry for Photographers. Focal Press, Oxford, UK.
Gibson, J.E., Fairchild, M.D., 2000. Colorimetric Characterization of Three Computer Displays (LCD and CRT). Munsell Color Science Laboratory Technical Report.
Hunt, R.W.G., 2004. The Reproduction of Colour, sixth ed. Wiley, USA.
IEC 61966-2-1, 1998. Multimedia Systems and Equipment – Colour Measurement and Management, Part 2.1: Colour Management in Multimedia Systems – Default RGB Colour Space – sRGB. International Electrotechnical Commission.
ISO 14524:1999(E), 1999. Methods for Measuring Opto-Electronic Conversion Functions (OECFs). International Organization for Standardization.
ISO 21550:2004, 2004. Photography – Electronic Scanners for Photographic Images – Dynamic Range Measurements.
ISO 22028-1:2004, 2004. Photography and Graphic Technology – Extended Colour Encodings for Digital Image Storage, Manipulation and Interchange.
Jones, L.A., 1920. On the Theory of Tone Reproduction, with a Graphic Method for the Solution of Problems. Journal of the Franklin Institute 190 (1), 39–90.
Nelson, C.N., 1966. The theory of tone reproduction. In: Mees, C.E.K., James, T.H. (Eds.), The Theory of the Photographic Process. The MacMillan Co., New York, USA.
Peli, E., 1990. Contrast in complex images. Journal of the Optical Society of America A 7 (10), 2032–2040.
Poynton, C., 2003. Digital Video and HDTV: Algorithms and Interfaces. Morgan Kaufmann, Elsevier Science, San Francisco.
Triantaphillidou, S., 2001. Image Quality in the Digitisation of Photographic Collections. Ph.D. thesis, University of Westminster, UK.