The issue of the storage of digital images is more complex than that of images on silver halide materials, where the sensor is also the storage medium. Thought must be given to much more than simply the physical conditions of storage. The image data must be encoded in some way in an image file; this must be stored in a file format, which is then saved to some form of hardware storage device. There are a range of factors to consider in selecting a particular image file format, which are fundamentally governed by the image properties, the way in which the image is being used and specific storage/file size requirements.
Digital images can be considered simply as arrays of pixel values; however, the image data must be converted to binary code to be processed, stored and transmitted. This code needs to be represented in an image file in a way that allows it to be interpreted as it moves through the imaging chain. The image file format determines how the image data are organized. It also defines the ‘packaging’ around the data: at the very least it will include a header which contains extra information necessary or useful for its interpretation, such as datatype, resolution, bit depth, colour space; it may also include metadata (defined as ‘data about data’) such as Exif (Exchangeable Image File Format) information from digital cameras.
There is a wealth of file formats available for the storage of images and many more that have been developed and have since become obsolete. Luckily, as digital imaging technology has progressed, the range of formats used within the imaging industries has decreased, converging to a collection which is now widely used for the majority of imaging applications. These formats encompass a range of image types, employ different approaches to store images efficiently and have various advantages and limitations; in some cases these allow them to be optimal for a particular aspect of imaging.
The majority of formats currently used in the imaging industries have something in common: they are, in most cases, imaging standards, meaning that they have gone through a fairly lengthy process of development and that they are widely available. They also tend to have enough flexibility to allow further development and improvement to adapt to the changing needs of the technology and systems. In some cases they are also de facto standards, meaning that they have become the dominant standard for a particular imaging task or application (note, however, that a de facto standard is not necessarily a standardized file format). This may be either as a result of widespread adoption by manufacturers, or because they have become the preference of the majority of users over other equivalent formats.
There are a number of important properties to consider when selecting a file format. These are briefly summarized below, before more detailed information about the most relevant formats is given.
Most fundamental in defining the nature of an image file, the type of imaging graphics relates to the structure of the image information across the image plane. In photographic imaging we are predominantly dealing with raster graphics, which are essentially defined by an array of discrete values, commonly known as a bitmap. The raster is the image grid of picture elements; nowadays we use raster displays, i.e. displays made up of pixels. The alternative type of image representation uses vector graphics. These are most commonly used in computer graphics applications and drawing programs, although one may well work with both types if using text with images or selection paths, which are defined using vector descriptions.
Vector images are described in terms of lines and shapes, which are represented by mathematical formulae. The vectors represent end points, directions and magnitudes of lines, for example, in an efficient way.
The difference between the two image types is shown in Figure 17.1. The bitmap has samples fixed at regular intervals, hence it suffers from the various artefacts, such as aliasing, which are characteristic of sampled images (see Chapter 7). It is also of a fixed resolution and if magnified enough the samples will be visible (i.e. pixelation).
The vector image has no fixed points – a line can have end points at any position within the image. Curves may be represented in various ways, most commonly using a parametric description. Bezier curves and splines are examples. Vector graphics provide a compact approach to describing shapes and lines, and these may be easily altered in size and shape using geometric transformations. Because the objects within a vector image are described by equations rather than sets of discrete samples, the image is resolution independent and resizing operations do not result in artefacts related to sample size and separation. It is more difficult, however, to represent filled shapes with vectors, and impossible to reproduce the many spatial, tonal and colour variations that occur at a minute level in natural scenes. Vector descriptions are therefore not so useful for representing photographic quality images; they tend to be used for computer-generated images and drawings.
To be displayed on a typical computer monitor, vector graphics must be rasterized, i.e. converted into an array of pixels.
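As a rough illustration of rasterization, the sketch below samples one vector line segment, defined only by its end points in resolution-independent unit coordinates, onto pixel grids of two different sizes. The function name and the simple point-sampling scheme are assumptions made for this example; practical rasterizers use more refined algorithms and usually apply anti-aliasing.

```python
def rasterize_line(p0, p1, width, height):
    """Rasterize a line given in unit coordinates (0..1) onto a width x height grid."""
    grid = [[0] * width for _ in range(height)]
    # Sample densely enough that consecutive samples are less than a pixel apart.
    steps = 2 * max(width, height)
    for i in range(steps + 1):
        t = i / steps
        x = p0[0] + t * (p1[0] - p0[0])
        y = p0[1] + t * (p1[1] - p0[1])
        # Map the continuous position to a discrete pixel index.
        col = min(int(x * width), width - 1)
        row = min(int(y * height), height - 1)
        grid[row][col] = 1
    return grid

line = ((0.0, 0.0), (1.0, 0.5))        # the vector description never changes
small = rasterize_line(*line, 8, 8)    # coarse raster: visibly stepped
large = rasterize_line(*line, 32, 32)  # finer raster of the same line

for row in small:
    print("".join("#" if v else "." for v in row))
```

The same two end points serve any output size; only the rasterized result is tied to a particular pixel grid.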
There are a number of image file formats that use some form of vector representation. Many use vectors to describe text and graphics objects, but will have the capability to also contain bitmap information. In this chapter we will mainly concentrate on the bitmap formats, but include the Portable Document Format (PDF), which is an example of a vector-based format used widely nowadays for proofing documents that contain both graphics and bitmaps.
Bit-depth support defines the image modes that can be stored by a particular file format. The image mode relates to how many colour channels and how many bits per channel are stored for an individual pixel. The bit-depth support therefore limits the colour encoding (see Chapter 23). Many formats were developed at a time when imaging industries dealt with predominantly 8-bit (per channel) images, but it is becoming more common nowadays to work with 16 bits per channel. In many cases formats have been updated to provide support for 16-bit quantization. Graphics Interchange Format (GIF) is a special case, as GIF images contain up to 8 bits in just a single channel, whether the image is greyscale or colour. This is a paletted or indexed image, which means that a colour palette, or table, is included with the image data. Because each pixel may be represented by a maximum of 8 bits, it can take a maximum of 256 different values. However, the pixel values are actually indices into the colour palette and, for an RGB image, the index will point to a triplet of RGB values defining the colour.
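The indexed-colour idea can be sketched in a few lines. The palette and pixel values below are invented for illustration (a real GIF palette may hold up to 256 entries); the point is only that the stored pixel values are indices, and the colour is found by a table lookup.

```python
# Hypothetical 4-entry palette; each entry is an RGB triplet.
palette = [
    (0, 0, 0),        # index 0: black
    (255, 0, 0),      # index 1: red
    (0, 255, 0),      # index 2: green
    (255, 255, 255),  # index 3: white
]

# The stored image data: one small index per pixel, not an RGB triplet.
indexed_pixels = [
    [0, 1, 1, 0],
    [2, 3, 3, 2],
]

def expand_indexed(pixels, palette):
    """Replace each palette index with the RGB triplet it points to."""
    return [[palette[i] for i in row] for row in pixels]

rgb_pixels = expand_indexed(indexed_pixels, palette)
max_colours = 2 ** 8  # 8 bits per pixel allow at most 256 palette entries
```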
The file format will define which types of compression algorithms are available and importantly whether the compression is lossless or lossy. Lossy compression is only suitable for applications where some loss in image quality can be tolerated. This is an immediate limiting factor in terms of the selection of certain file formats such as the Joint Photographic Experts Group (JPEG) format. This was designed as a lossy method and although it has a lossless version, this has not been widely adopted or supported. Compression is covered in more detail in Chapter 29.
If a format is a standard, then it has the advantage of being openly documented and widely available. Technical standards are a formalization of a method or process. A standard file format will have certain aspects completely specified, such as the language used to store the data, the type of data stored and the structure of the file. This means that a standard file format will have the same or very similar structure, independently of implementation. Standards make it easier to ensure cross-platform compatibility.
Proprietary file formats are developed by individuals or companies who then ‘own’ them, meaning that they control the distribution and availability of the format, as well as the ability to make changes to it. This also means that they may charge for their use. An example of the problems associated with proprietary formats is that of the RAW format (or formats!). At the time of writing there is no standard RAW format. Most digital camera manufacturers have developed their own proprietary RAW format. The result is a set of formats, each using different methods to encode and interpret the data and, until recently, each requiring its own converter, i.e. software that opens the RAW images and allows them to be processed, converted and saved to other formats (see later RAW format).
De facto standards are those that have become standards as a result of widespread adoption. As mentioned earlier, they may also be formal standards, but not necessarily.
The metadata contained within a file is extra information that is useful to include with the image data and that assists the processing of the image within the imaging chain. Metadata may include capture information such as ISO rating, aperture and shutter speed, focal length, lighting conditions and the average colour temperature of the scene. Information about the scene content, for example keyword descriptions, may also be included. Exchangeable Image File Format (Exif) is one of a number of specifications for the storage of camera metadata, used in a range of different file formats. Exif data are embedded in the image file format.
The file format will specify what methods of colour representation are available and more specifically which colour space encodings are supported. A format may allow for the use of image profiles, allowing the image to be colour managed by International Color Consortium (ICC) colour management (see Chapter 26). If colour space support is limited by a file format, information may be lost when the image is saved in this format and into a particular colour space.
Other features are many and various, and are usually extra capabilities that make a file format suitable for a particular application. Examples include progressive display, which allows the image to be partially displayed as it is being transmitted, important for web applications; support for transparency and layers, important in composite images and for formats to be useful as an interim during image editing; and multi-resolution support, meaning that the image can be saved at a high resolution, but opened at lower resolutions, cutting down the need to save multiple different-sized image files.
Tagged Image File Format (TIFF; extension .tif or .tiff) is a tag-based image file format, developed for storing and interchanging bitmap images originating from scanner and desktop publishing applications. The first public version of the TIFF specifications was published by Aldus Corporation in 1987 in consultation with scanner manufacturers and software developers, who agreed on a common image file format to replace individual manufacturer proprietary formats. This first version was the third revision of TIFF (named TIFF Revision 3.0) and stored only greyscale images. Very soon after this, TIFF Revision 4.0 was published, supporting RGB images and a little later in August 1988 TIFF Revision 5.0 came out, adding the capability of storing indexed colour and supporting Lempel-Ziv-Welch (LZW) compression (see Chapter 29). The current TIFF Revision 6.0 was released in June 1992 by Adobe Systems (which merged with Aldus Corporation in 1994) and additionally supported CMYK, YCbCr and CIE L*a*b* image encoding (see Chapter 23), and standard JPEG compression (see Chapter 29). Since then some extensions have been made to this current version, published in the form of technical notes.
Today TIFF is widely used and supported by most imaging and desktop publishing software. Although it has not been standardized, it is a de facto standard. Two different file format specifications that are based on TIFF have been standardized: the TIFF/EP (ISO 12234-2:2001) and the TIFF/IT (ISO 12639:2004) – seen later in the chapter. TIFF is considered to be the leading commercial and scientific image file format, because of its flexibility, power, extensible nature and many options. The fact that the source code of TIFF is easily accessible, alongside its stability and its easy access and retrieval across all platforms (PC, Mac, Unix, etc.), makes TIFF the most suitable format for image archiving purposes nowadays, along with JPEG 2000. The goal of Adobe when TIFF 6.0 was published was to update the format while maintaining compatibility ‘so that TIFF will never become obsolete and should not have to be revised more frequently than absolutely necessary’.
TIFF is a highly adaptable file format that stores images and data in a single file by including a basic set of file header tags (see Figure 17.2) that indicate the basic properties of the image. Manufacturers can use private tags to enable them to include their own proprietary information inside a TIFF file without causing problems for file interchange. TIFF is ideal for most storage needs, allowing storage of multiple bitmap images of virtually any colour encoding and bit depth: 1 bit per pixel bi-tonal, 4- or 8-bit greyscale or palette colour, and up to 48- and 64-bit colour. The largest possible TIFF file is 2^32 bytes in length, equal to 4 gigabytes. It is important to note that, although the TIFF 6.0 specification allows for up to 64-bit colour, many TIFF readers will only open a maximum of 24-bit RGB images or 32-bit CMYK images. Also, not all readers open multi-image TIFF files. TIFF supports the storage of uncompressed images – that is its most popular application. However, as mentioned earlier, lossless schemes such as LZW (compression of full colour images – Chapter 29) and ITU-T T.6 (compression for bi-tonal images) are used for reducing file size without any information loss and JPEG lossy compression (see Chapter 29) is supported by TIFF. Other available TIFF options include multi-pages and layers and the support of ICC colour profiles. Because of multi-page support and the support of compression of bi-tonal images TIFF is being widely used for the storage of faxes, especially on fax servers.
Figure 17.2 is adapted from Gosney et al. (1995).
On the downside, TIFF image files are generally large, mostly due to the versatility of the TIFF tags. Uncompressed TIFF files are approximately the same size in bytes as the image size in the computer memory. Because of its versatility and flexibility, TIFF was considered for many years as complicated and confusing, but not so any more. Tags that are contained with the data can sometimes cause incompatibility, but any imaging application will nowadays handle standard baseline TIFFs. A source of problems when opening compressed TIFF files can be the use of applications that do not support algorithms used to compress image data, but these are very rare nowadays.
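The relationship between image dimensions and uncompressed file size is simple arithmetic, sketched below. The function name and the example image dimensions are assumptions for illustration, and the calculation ignores header, tag and row-padding overhead, so it gives only an approximate figure.

```python
def uncompressed_size_bytes(width, height, channels, bits_per_channel):
    """Approximate size of the raw pixel data, ignoring header and tag overhead."""
    total_bits = width * height * channels * bits_per_channel
    return (total_bits + 7) // 8  # round up to whole bytes

TIFF_MAX_BYTES = 2 ** 32  # the 4 gigabyte limit, a consequence of 32-bit offsets

# Hypothetical 4000 x 3000 pixel image stored as 8-bit and as 16-bit RGB.
size_8bit = uncompressed_size_bytes(4000, 3000, 3, 8)
size_16bit = uncompressed_size_bytes(4000, 3000, 3, 16)
print(size_8bit, size_16bit, size_16bit < TIFF_MAX_BYTES)
```

Doubling the bit depth doubles the size of the pixel data, which is one reason why high-bit-depth uncompressed TIFF files become large so quickly.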
The TIFF structure is illustrated in Figure 17.2. It contains an image file header (IFH) describing the byte order used in the file, the identification number 42 (a number providing the answer to the universe) that identifies the file as a TIFF and the offset position of the first image file directory (IFD). The IFD is a collection of information similar to the header used to describe the bitmap data to which it is attached, such as image height and width, bit depth, number of colour channels and type of compression used. The information is stored in one or more data structures, called tags. The IFD also contains the offset position of the next IFD. There may be more than one IFD in a TIFF file, but a baseline TIFF reader is not required to read any IFDs beyond the first one. The image data follow each IFD.
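The header and directory layout just described can be read with a few lines of code. The following is a minimal sketch rather than a complete reader: the function name is hypothetical, only the first IFD is parsed, and each tag is assumed to hold a single LONG value stored directly in its entry, which is not true of TIFF tags in general. The tag numbers 256 (ImageWidth) and 257 (ImageLength) are those defined in the TIFF 6.0 specification.

```python
import struct

def read_tiff_header(data):
    """Parse the 8-byte image file header (IFH) and the tags of the first IFD."""
    order = data[:2]
    if order == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif order == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF: unknown byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF: identification number is not 42")
    # The IFD starts with a 2-byte entry count, followed by 12-byte tag entries.
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    tags = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        tag, ftype, n, value = struct.unpack(endian + "HHII", data[start:start + 12])
        tags[tag] = value  # simplification: assumes one LONG value stored inline
    end = ifd_offset + 2 + 12 * count
    (next_ifd,) = struct.unpack(endian + "I", data[end:end + 4])
    return endian, ifd_offset, tags, next_ifd

# Hand-built little-endian example with two tags and no image data.
sample = (
    b"II" + struct.pack("<HI", 42, 8)        # byte order, 42, offset of first IFD
    + struct.pack("<H", 2)                   # number of tags in the IFD
    + struct.pack("<HHII", 256, 4, 1, 640)   # ImageWidth, type 4 (LONG), 1 value
    + struct.pack("<HHII", 257, 4, 1, 480)   # ImageLength
    + struct.pack("<I", 0)                   # 0 means there is no further IFD
)
endian, first_ifd, tags, next_ifd = read_tiff_header(sample)
print(tags)
```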
Tagged Image File Format for Electronic Photography (TIFF/EP; extensions .tif or .tiff) is an image file format based on a subset of the current revision of TIFF Version 6.0 and of the Exif standard. TIFF/EP was standardized by the International Organization for Standardization in 2001 (ISO 12234-2:2001). It is defined to be as compatible as possible with existing desktop software packages, to enable them to operate with images from electronic cameras. Several still-picture cameras use it as their native TIFF format. There are no major departures from the TIFF 6.0 standard structure (Figure 17.2) except that many of the existing TIFF tags are not used while there are a small number of new tags. A TIFF/EP file is a valid TIFF file that contains the TIFF/EP format identifier and has exactly the same header as the TIFF header. Unlike TIFF, in TIFF/EP there is a method for dealing with thumbnail images. The allowed colour space encodings are RGB, YCbCr and CFA (colour filter array) – see Chapter 23 – but not all TIFF/EP readers read CFA encoded files. The format supports uncompressed, JPEG lossless and lossy compression algorithms; however, TIFF/EP readers are only required to open uncompressed images.
Tagged Image File Format for Image Technology (TIFF/IT; extensions .tif or .tiff) is another file format based on TIFF Version 6.0 which has been standardized. Its current second edition is defined in the ISO 12639:2004 standard, which specifies media-independent means for prepress electronic data exchange. The TIFF/IT standard is intended to facilitate the interchange of rasterized images among electronic digital systems used in prepress image processing, graphic arts design and related document creation and production operations. It supports the encoding of continuous-tone images, colour line art, binary images and binary line art, screened data and images for composite final print pages. The current second edition of the standard specifies three levels of conformance: TIFF/IT (also referred to simply as TIFF/IT), TIFF/IT-P1 and TIFF/IT-P2. TIFF/IT-P1 provides a smaller set of options to permit simpler implementation and compatibility, where possible, with commonly available TIFF 6.0 readers and writers. TIFF/IT-P2 is also a subset which incorporates all of the options defined for TIFF/IT-P1 but provides wider image type support. The primary colour space for the TIFF/IT standard is CMYK since the intended output is a printed page. The 2004 edition of the standard added support for an expanded line-art palette (up to 65,535 colours) and support for up to 32 colour separations. Other colour spaces and the use of ICC profiles are supported, but the P1 profile is limited to CMYK. Many magazines and journals require that advertising materials are submitted as TIFF/IT.
JPEG is a lossy compression standard and also the name of the related file format (file extension .jpg, .jpeg). The compression standard was developed by the Joint Photographic Experts Group, a joint ISO/CCITT committee, with the aim of producing an international standardized method for the compression of ‘natural’, continuous-tone, greyscale or RGB images. The specification was proposed in December 1991 and standardized as ISO/IEC IS 10918-1 | ITU-T Recommendation T.81 in 1994. A file format, JPEG Interchange Format (JIF), was included in an annexe in the standard. This was quickly updated to the JPEG File Interchange Format (JFIF), which is described in the standard as ‘a minimal format which enables JPEG bitstreams to be exchanged between a wide variety of platforms and applications … the only purpose of this format is to allow the exchange of JPEG compressed bitstreams’.
Two JPEG file formats are in use today: JPEG/JFIF and JPEG/Exif. The files stored using both these formats are commonly referred to simply as ‘JPEG’ files. JPEG/JFIF is used most commonly for storing or transmitting files on the Internet and has now become a de facto standard. JPEG/Exif has gained widespread use in digital cameras. It should be noted that a lossless version of JPEG compression, JPEG-LS, was also released, but has never really been used; therefore, we will deal here only with the well-known lossy version, baseline JPEG.
The JFIF implementation is compatible with Mac, Windows or Linux operating systems. It provides support for up to 8-bit greyscale or 24-bit colour, but uses the YCbCr colour space, into which images to be saved as JPEGs are transformed in a pre-processing step before compression. If an RGB image is to be saved as a JPEG, its YCbCr components are calculated using a linear transformation. The compression method is the lossy JPEG compression algorithm, which is detailed in Chapter 29. More in-depth information about the colour encoding in JPEG can be found in Chapter 23.
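The linear transformation from RGB to YCbCr can be sketched as follows, using the coefficients given in the JFIF specification for 8-bit data (derived from ITU-R BT.601, with the chroma components offset by 128). The function name and the rounding and clipping choices are assumptions made for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit YCbCr using the JFIF (BT.601-based) coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128

    def clamp(v):
        # Round to the nearest integer and clip to the 8-bit range.
        return max(0, min(255, int(v + 0.5)))

    return clamp(y), clamp(cb), clamp(cr)

print(rgb_to_ycbcr(255, 255, 255))  # neutral colours give Cb = Cr = 128
print(rgb_to_ycbcr(255, 0, 0))
```

Separating luminance (Y) from the two chrominance components is what later allows the compression algorithm to treat colour information more coarsely than brightness information.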
JPEG files consist of a series of markers, specifying various properties which may be followed by payload data. The markers specify the type of JPEG implementation, the quantization tables used, the Huffman tables used in entropy coding (see Chapter 29), the start of the image data and the end of the file. Some markers are for description of specific data, such as Exif, and some are used for text comments. The payload data includes any data relating to the markers, for example the values within the tables and the entropy coded image data.
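The marker structure can be explored with a short scanner such as the sketch below. Each marker is a two-byte code beginning with 0xFF; most are followed by a two-byte big-endian length and then the payload. The function name is hypothetical, the marker table is deliberately incomplete, and the sample byte string is hand-built and heavily simplified (a real JFIF header segment, for instance, carries more than the identifier shown here).

```python
import struct

MARKER_NAMES = {
    0xD8: "SOI (start of image)",
    0xE0: "APP0 (JFIF header)",
    0xE1: "APP1 (typically Exif)",
    0xDB: "DQT (quantization table)",
    0xC0: "SOF0 (baseline frame header)",
    0xC4: "DHT (Huffman table)",
    0xDA: "SOS (start of scan)",
    0xFE: "COM (text comment)",
    0xD9: "EOI (end of image)",
}

def list_segments(data):
    """Return (marker name, payload length) pairs up to the start of scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker: not a JPEG stream")
    segments = [(MARKER_NAMES[0xD8], 0)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((MARKER_NAMES.get(code, "0x%02X" % code), length - 2))
        if code == 0xDA:
            break  # entropy-coded image data follows; stop here
        pos += 2 + length
    return segments

# Hand-built, simplified stream: SOI, APP0, a comment, SOS, dummy data, EOI.
sample = (
    b"\xff\xd8"
    + b"\xff\xe0" + struct.pack(">H", 2 + 5) + b"JFIF\x00"
    + b"\xff\xfe" + struct.pack(">H", 2 + 4) + b"test"
    + b"\xff\xda" + struct.pack(">H", 2) + b"\x00\x00"
    + b"\xff\xd9"
)
segments = list_segments(sample)
for name, size in segments:
    print(name, size)
```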
JPEG can achieve high levels of compression, up to 1:100 (with heavy penalties in image quality). Up to 1:10, it is generally considered to be perceptually lossless, meaning that the artefacts caused by the algorithm are virtually imperceptible. High compression rates, coupled with the fact that JPEG represents the appearance of continuous tone, have led to the format being one of the most widely used with images on the Internet, as well as a range of other imaging applications.
The JPEG/Exif format is based upon the JFIF implementation, but includes camera metadata in an Exif format, such as date, time and camera settings and an image thumbnail. JPEG/Exif files are supported by nearly all consumer digital cameras on the market and it is expected that this will continue. JPEG files are able to embed ICC colour profiles, such as sRGB and Adobe RGB (1998), making them compatible with ICC colour management systems, although JPEG profiles are not always recognized by application software.
There are certain characteristic artefacts of JPEG at high compression levels, known as ‘blocking’ and ‘ringing’ artefacts (see Figure 29.15). These can be more bothersome in higher resolution printed images, where quality requirements are higher and images are likely to be more closely examined, but in such cases lossless formats are more commonly used. The quantization in the JPEG compression algorithm is achieved using visually weighted quantization tables, which are defined based upon a quality setting selected by the user. There is always a trade-off between the degree of compression and the image quality of JPEG images, but that is a problem with any lossy compression method. The global adoption of JPEG indicates that as a lossy format it is one of the best. One of the areas, however, where it is less successful is in the compression of images containing text. This has been one of the reasons behind the development of the latest standard, JPEG 2000.
The JPEG 2000 format (file extension .jp2) is a wavelet-based image compression standard, developed by the JPEG committee to address the evolving requirements of digital imaging technologies and to improve upon some of the limitations of the earlier JPEG format. As with JPEG, the standard specifies both compression method and file format. The compression method is covered in more detail in Chapter 29.
The call for proposals for the new compression standard went out in 1997. One of the committee’s aims was to develop a standard which would be useful in some of the emerging digital imaging areas, such as medical imaging, digital image libraries and mobile applications. The format was required to support multiple bit depths and image types (bi-tonal, greyscale, colour), provide lossless and lossy compression, and support a range of imaging models, preferably in some unified system. Another key requirement was superior compression performance to existing standards, especially at low bit rates (corresponding to high compression levels): JPEG and other schemes performed well at high bit rates (low levels of compression) but suffered from severe distortions at low bit rates. This was relevant in many of the new areas of digital imaging, especially in multimedia, the Internet and mobile phones, where images needed to be heavily compressed.
The features in the resulting standard, ISO/IEC 15444, approved in 2001, are summarized in Chapter 29. Further versions of JPEG 2000 have since been approved, adding various new parts to the standard. Key to the JPEG 2000 format is that it produces a multiple resolution representation of the image. This is achieved because JPEG 2000 uses a discrete wavelet transform (DWT) instead of the DCT used in JPEG, to pre-process the data prior to optional quantization.
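To see why a wavelet transform yields a multiple resolution representation, consider the sketch below. It uses the Haar wavelet, the simplest possible case, purely for illustration; JPEG 2000 itself uses longer filters (the 5/3 and 9/7 wavelets). One level of the transform splits an image into a half-resolution approximation plus detail coefficients; keeping the details allows exact reconstruction, while quantizing or discarding them gives lossy compression. Function names and the test image are assumptions made for this example.

```python
def haar_level(image):
    """Split an even-sized greyscale image into a half-size approximation and details."""
    approx, details = [], []
    for r in range(0, len(image), 2):
        a_row, d_row = [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 4)  # low-pass: local average
            # high-pass: horizontal, vertical and diagonal differences
            d_row.append((a - b + d - e, a + b - d - e, a - b - d + e))
        approx.append(a_row)
        details.append(d_row)
    return approx, details

def haar_inverse(approx, details):
    """Rebuild the full-resolution image exactly from approximation and details."""
    image = []
    for a_row, d_row in zip(approx, details):
        top, bottom = [], []
        for s, (h, v, g) in zip(a_row, d_row):
            top += [s + (h + v + g) / 4, s + (-h + v - g) / 4]
            bottom += [s + (h - v - g) / 4, s + (-h - v + g) / 4]
        image += [top, bottom]
    return image

image = [
    [10, 12, 20, 20],
    [10, 12, 20, 20],
    [30, 30, 40, 44],
    [30, 30, 44, 40],
]
approx, details = haar_level(image)
print(approx)  # a 2 x 2 version of the 4 x 4 image
```

Applying `haar_level` again to `approx` would give the next, still smaller resolution level, which is how a decoder can open an image at reduced size without decoding all of the data.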
The ability of JPEG 2000 to support both lossless and lossy compression makes it suitable for imaging applications which require perfectly reconstructed images, such as medical imaging and image archiving, as well as the applications already supported by JPEG. The results from comparison studies with JPEG on the image quality of the lossy compression are discussed in Chapter 29. One type of scene content that JPEG 2000 handles more successfully is text, where JPEG performs poorly (Figure 17.3).
JPEG 2000 provides for multiple colour spaces and image types. Images may be single channel or multiple channels, greyscale, bi-tonal or colour. In terms of colour space encoding support, it includes palettes, device-dependent RGB, sRGB, YCbCr and some ICC colour spaces. Because it is able to work with bi-tonal images and text in particular, it is suitable for layout applications. However, possibly as a result of the development and growth in the use of the PDF format, it has not been widely adopted for this purpose. In general, although JPEG 2000 addresses many of the areas of weakness in JPEG and provides many features useful in imaging applications today, it is still not widely used. This may be partly due to the lack of support by web browsers, although there are plug-ins available, but also because there are a number of other competing formats that possess some of the application-specific features. The same applies to digital cameras, where support for the format is inconsistent. However, there is much research going on in various specialist imaging application areas such as forensic imaging and image archiving into the suitability and performance of JPEG 2000, which is likely to ensure that it will eventually become a more widely available format.
RAW (the extension is variable – see Table 17.1) is not a single file format but is a generic term indicating proprietary image file formats that contain unprocessed (or minimally processed) digital image data originating from digital capturing devices. A RAW file is composed of two parts: the image data and the metadata. It is expected to contain a record of the image data as captured by the electronic sensor – the so-called sensor data – which has been sampled and quantized according to the specifications of the sensor and the analogue-to-digital converter (ADC) of the capturing device. In most commercial cameras some processing, such as dark noise reduction and the mapping out of ‘dead’ pixels, takes place on the sensor and thus the raw data are not totally unprocessed (see Chapter 14 for the image-processing pipeline in digital cameras). Also, non-linear quantization may be employed at capture, which allows distribution of the tonal information in a similar manner to the human visual system (HVS).
FILE NAME EXTENSION | MANUFACTURER
.dng | Adobe
.crw, .ciff, .cr2 | Canon
.raf | Fuji
.k25, .kdc, .dcr | Kodak
.nef | Nikon
.orf | Olympus
.cap, .tif, .iiq | Phase One
.srf, .sr2, .arw | Sony
RAW image files are often compared to photographic negatives, for these are unrendered images containing image information from which one can generate different kinds of prints. In a similar fashion RAW files, sometimes called digital negatives (note, however, that there is also an Adobe RAW format called a digital negative – see below), can be rendered in different ways depending on the rendering settings chosen by the user, who can adjust a wide range of parameters, including white balance, tonal mapping, noise reduction, sharpening and others to achieve a preferred look (more details are given in Chapter 14). Processed images are then saved for display or print in a rendered image file format such as TIFF or JPEG, but the RAW data file remains unchanged, and can be ‘archived’ for future re-rendering.
In most commercial digital cameras, where colour filter arrays are overlaid on CCD or CMOS sensitive elements to filter the incoming light and separate colour information, the raw pixel values of the red, green and blue channels are of incomplete resolution (see Chapters 9 and 14). The resulting captured image has only one full channel of information in which some of the pixels represent (most commonly) red, some green and some blue pixel values (CFA colour space). To retrieve the full resolution three-channel RGB image from the raw file the mosaic data are interpolated and combined – a process known as demosaicing. RAW files from scanner sensors, or from full-colour camera sensors (such as the Foveon™ range of multi-layer image sensors – see Chapters 9 and 14), provide raw data which do not need demosaicing. Further, since electronic sensors are linear devices (they respond to light in a linear fashion, unlike the HVS and silver-based photographic media – see Chapters 4, 8, 9 and 21), the tone reproduction of RAW files is linear – provided that linear quantization has been employed. In exceptional cases, curve shaping may have taken place during or after ADC and the RAW file ends up with a gamma different from 1.0. Digital RAW files contain pixel values of high bit depth (often 12, 14 or 16 bits per pixel per colour channel) compared to typical 8-bit per channel renderings and thus can store greater tonal latitude and finer tonal and colour variations. Important image processes result in fewer artefacts when performed on high-bit-depth data, such as raw data (or high-bit-depth TIFF data), than when done on already typically rendered 8-bit per channel images.
Applying tone modification, such as a typical gamma correction of 1/2.2 for example, on 8-bit per channel data, will result in a loss of almost 30% of the available intensity levels in the image due to rounding errors, compared to a loss of 3% when the same gamma correction is applied to 12-bit data which are then down-sampled to 8 bits for viewing (details in Chapter 21).
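The loss of levels quoted above can be checked numerically. The sketch below applies a gamma of 1/2.2 to every possible input code value, rounds the result to 8 bits and counts how many distinct output levels are actually used; the function name and the rounding convention are assumptions made for this example, so the exact counts may differ slightly from figures obtained in other ways.

```python
def levels_after_gamma(input_bits, gamma=1 / 2.2, output_bits=8):
    """Count distinct output codes after gamma-correcting every possible input code."""
    in_max = 2 ** input_bits - 1
    out_max = 2 ** output_bits - 1
    outputs = {
        int(out_max * (v / in_max) ** gamma + 0.5)  # round to nearest output code
        for v in range(in_max + 1)
    }
    return len(outputs)

levels_8bit = levels_after_gamma(8)
levels_12bit = levels_after_gamma(12)
for bits, used in ((8, levels_8bit), (12, levels_12bit)):
    lost = 100 * (1 - used / 256)
    print("%d-bit input: %d of 256 output levels used (%.1f%% lost)" % (bits, used, lost))
```

With 8-bit input, many neighbouring highlight values collapse onto the same output code while the stretched shadow values leave gaps; with 12-bit input there are enough intermediate values to fill nearly all of the 256 output levels.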
In addition to image pixel values, RAW files contain large amounts of metadata. Apart from Exif type metadata, further metadata are included in RAW files to be used by RAW converters; these are necessary to process the RAW files and render them for appropriate viewing.
While raw data provide great flexibility and choice of image rendering, RAW file formats are proprietary and their structure differs from manufacturer to manufacturer and even from one camera model to another. They are not standardized or documented. Often image information is compressed using lossless (or nearly lossless) data compression methods. The fact that different RAW formats are in use creates problems of support with software for conversion and guarantees no compatibility with future software.
Manufacturers of cameras that support RAW files provide proprietary RAW converters to render their RAW format. While other conversion programs, plug-ins and open source programs are available, they do not always deal with all proprietary RAW files. These other applications aim to provide a common interface for the conversion of many different RAW files. Adobe Camera Raw is a generic converter which has gained widespread use within the commercial industry. Images are rendered differently by different RAW converters which employ different (proprietary) rendering algorithms. For example, as there is no standard algorithm for converting data from Bayer colour filter arrays into RGB, different RAW converters may produce different colours for the same digital file, even if the rendering settings chosen by the user are the same. Most RAW converters will perform the following operations:
• Demosaicing – involves interpolation (see Chapters 14 and 26) and is based on the information provided in the metadata regarding the arrangement of the colour filters on the sensor.
• Colour space mapping – involves mapping the camera colour space (see Chapter 23) to a device-independent space such as CIEXYZ.
• Gamma correction/curve shaping – involves a nonlinear conversion of the linear pixel values (with respect to input intensity) to redistribute the tonal information in the image, so that it best matches the human visual system (HVS) and is appropriate for viewing images on displays (see Chapter 21).
• White balance – involves white point conversion and is based on settings selected by the user during capture or during rendering (see also Chapters 14 and 23).
• Noise reduction, anti-aliasing and sharpening – these steps are commonly included in the camera processing pipeline to counteract the blurring and aliasing artefacts caused by sampling, imaging optics and demosaicing, and the noise from multiple sources inherent in digital capture devices. They are therefore also necessary steps in processing RAW images, although they may sometimes be applied with more precision after the RAW conversion process in an imaging software application instead (see Chapter 14).
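The operations listed above can be sketched as a very reduced rendering pipeline. This is an illustration under several stated assumptions, not the method of any particular converter: the sensor is assumed to have an RGGB Bayer layout, demosaicing is replaced by a crude half-resolution merge of each 2 × 2 cell rather than true interpolation, the matrix `cam_to_out` is a hypothetical combined camera-to-output matrix (in practice the mapping passes through a space such as CIEXYZ), and noise reduction and sharpening are omitted. The order of the steps is also an assumption.

```python
import numpy as np

def demosaic_half(cfa):
    """Merge each RGGB 2x2 cell into one RGB pixel (half resolution)."""
    r = cfa[0::2, 0::2]
    g = (cfa[0::2, 1::2] + cfa[1::2, 0::2]) / 2.0  # average the two greens
    b = cfa[1::2, 1::2]
    return np.stack([r, g, b], axis=-1)

def render_raw(cfa, black, white, wb_gains, cam_to_out, gamma=1 / 2.2):
    """Render linear CFA data to an 8-bit RGB image."""
    # Linearise to the range 0..1 using the black and white levels.
    linear = np.clip((cfa.astype(float) - black) / (white - black), 0, 1)
    rgb = demosaic_half(linear)      # demosaicing (simplified)
    rgb = rgb * wb_gains             # white balance: per-channel gains
    rgb = rgb @ cam_to_out.T         # colour space mapping (3x3 matrix)
    encoded = np.clip(rgb, 0, 1) ** gamma  # gamma correction
    return np.round(255 * encoded).astype(np.uint8)

# Hypothetical 12-bit sensor data: a uniform mid-level patch.
cfa = np.full((4, 4), 2048, dtype=np.uint16)
image = render_raw(cfa, black=0, white=4095,
                   wb_gains=np.array([1.0, 1.0, 1.0]),
                   cam_to_out=np.eye(3))
print(image.shape, image[0, 0])
```

Because every step after linearisation is a matter of choice (interpolation method, gains, matrix, tone curve), two converters given the same file can legitimately produce different results.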
Digital Negative (DNG; extensions .dng or .tif) was introduced by Adobe in 2004 in an attempt to address the lack of a universal format for RAW files. It is a publicly available file format for RAW files generated by digital cameras. The specification of the latest DNG version, 1.3.0.0, was published in June 2009. DNG is compatible with TIFF/EP (see above) and is structured according to TIFF, which permits extensive use of metadata. Adobe aims to promote DNG as a universal RAW file format for archival purposes and is currently submitting it for standardization to the International Organization for Standardization. A number of camera manufacturers have introduced cameras that support DNG either as their native or as an alternative RAW format, while a number of digital imaging and desktop publishing applications can convert to and write DNG files. DNG files store uncompressed or lossless JPEG compressed data in a linear, non-white-balanced colour space, which is usually the native RGB colour space of the camera. The format enables data storage either in CFA form or in full-colour RGB.
It is possible (but not required) for a DNG file to comply simultaneously with both the DNG specification and the TIFF/EP standard. The format for storing image data in DNG files is based on the published TIFF/EP standard, but the metadata are not; they contain all the information that a converter needs to convert the RAW image data, even if the application was not designed for a specific camera. DNG 1.2.0.0 and later versions allow one or more ‘camera profiles’ to be embedded as a set of tags that include information such as colour calibration matrices for conversion between CIEXYZ and the native camera colour space, linearization tables that map stored values into linear values, whether noise reduction has been applied to the raw data, etc. DNG specifies a required set of metadata but does not prevent the inclusion of additional proprietary (not publicly documented) metadata, stored in private tags, which can embed features chosen by camera manufacturers. It enables inclusion of TIFF/EP, Exif, IPTC and XMP formats for metadata tags.
In addition to the DNG specification, Adobe provides a free DNG converter that converts most camera-specific RAW format files, including of course DNG, but without necessarily preserving all of the original proprietary metadata, because this information is not publicly documented. The converter is not open source.
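Because DNG is structured according to TIFF, its metadata can be located by walking the TIFF tag directory. The sketch below reads the first image file directory (IFD) of a TIFF-structured byte string and looks for the DNGVersion tag (tag number 50706 in the DNG specification). The sample bytes are synthetic and built purely for illustration; a real DNG file contains many more tags, and values longer than 4 bytes are stored elsewhere in the file, which this sketch does not follow.

```python
import struct

DNG_VERSION_TAG = 50706  # DNGVersion tag number

def read_ifd_entries(data):
    """Return {tag: (field_type, count, raw_value_bytes)} for the first IFD."""
    order = {b"II": "<", b"MM": ">"}[data[:2]]  # byte order mark
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF-structured file")
    (count,) = struct.unpack_from(order + "H", data, ifd_offset)
    entries = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each entry is 12 bytes long
        tag, field_type, n = struct.unpack_from(order + "HHI", data, start)
        entries[tag] = (field_type, n, data[start + 8:start + 12])
    return entries

# Synthetic little-endian file: header, one IFD entry, end-of-IFD marker.
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHI4B", DNG_VERSION_TAG, 1, 4, 1, 3, 0, 0)
    + struct.pack("<I", 0)
)
field_type, n, raw = read_ifd_entries(sample)[DNG_VERSION_TAG]
dng_version = tuple(raw[:n])
print(dng_version)
```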
The GIF format (file extension .gif) was introduced by CompuServe Inc. in 1987 as a raster format, originally designed for easy transmission and viewing of image data, stored either locally or on remote computers. GIF is optimized for the storage of graphics and has since gained worldwide usage as a result of the high degree of compression it achieves, making it suitable for certain types of images on the World Wide Web.
GIF images are paletted or indexed images and contain a maximum of 8 bits per pixel, whether greyscale or colour. GIF files compress data using LZW coding, a lossless dictionary-based compression technique. This method is covered in detail in Chapter 29, but in summary it assigns single dictionary codes to sequences of pixel values, meaning that an image with many repeating sequences (such as one containing lots of graphics) may be substantially compressed. The other key characteristic of the GIF format is the ability to store multiple images in the same file, which means that it is possible to store the frames of an animated sequence in a single file.
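The dictionary principle can be illustrated with a textbook LZW coder operating on a list of palette indices. This is a sketch of the general technique only: the variant used in GIF additionally reserves clear and end-of-information codes and packs the output into variable-width bit codes, all of which are omitted here.

```python
def lzw_encode(indices, palette_size=256):
    """Replace repeated sequences of palette indices with dictionary codes."""
    dictionary = {(i,): i for i in range(palette_size)}
    next_code = palette_size
    current, codes = (), []
    for value in indices:
        candidate = current + (value,)
        if candidate in dictionary:
            current = candidate            # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = (value,)
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decode(codes, palette_size=256):
    """Rebuild the original indices; the dictionary is regenerated on the fly."""
    dictionary = {i: (i,) for i in range(palette_size)}
    next_code = palette_size
    previous = dictionary[codes[0]]
    output = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:  # code refers to the sequence currently being defined
            entry = previous + (previous[0],)
        output.extend(entry)
        dictionary[next_code] = previous + (entry[0],)
        next_code += 1
        previous = entry
    return output

row = [7] * 8                      # a run of one flat colour
codes = lzw_encode(row)
print(codes)                       # → [7, 256, 257, 256]
print(lzw_decode(codes) == row)    # → True
```

Eight pixel values are represented by four codes, and decoding restores them exactly, which is why the coding stage itself is lossless.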
Compuserve Inc. released the first version of GIF, GIF87a, as a free open specification. It was superseded 2 years later by GIF89a, which added various additional features, including transparency support and storage of metadata for a specific application. During the next few years, it became a de facto standard for Internet imaging (at this point JPEG had only just been released). It emerged, however, in 1994 that the LZW compression algorithm used within GIF was a method that had been previously patented by Unisys. No mention of this had been made in the GIF specification, but it meant that Unisys could charge a licence fee to certain users, for example developers creating software to read or write GIF files. The issue eventually led to the development of the Portable Network Graphics (PNG) format as an alternative.
The colour palette used in GIF is the fundamental limitation in its suitability as a format for the majority of photographic imaging applications. It is commonly termed a lossless format, because LZW compression is a lossless compression method. However, the process of storing a 24-bit RGB image as a GIF involves the conversion of a range of over 16 million possible colours into one containing just 256 individual colours. Even if a local colour table is included, in which the colours are selected based on those in the image, it is a quantization process and is irreversible; therefore, image information will be lost, with an associated loss of image quality.
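The irreversibility of palette quantization can be demonstrated with a deliberately simple fixed palette. The sketch below assumes a uniform ‘3-3-2’ palette (3 bits for red, 3 for green, 2 for blue), chosen only because it is easy to write down; real GIF encoders usually build an adaptive palette from the colours present in the image, which reduces but does not remove the loss.

```python
import numpy as np

def quantize_332(rgb):
    """Map 24-bit RGB values to indices into a fixed 256-colour palette."""
    rgb = rgb.astype(int)
    r = rgb[..., 0] >> 5   # keep the 3 most significant bits
    g = rgb[..., 1] >> 5
    b = rgb[..., 2] >> 6   # keep the 2 most significant bits
    return (r << 5) | (g << 2) | b

def palette_332():
    """Return the 256 RGB colours that the indices refer to."""
    idx = np.arange(256)
    r, g, b = (idx >> 5) & 7, (idx >> 2) & 7, idx & 3
    return np.stack([r * 255 // 7, g * 255 // 7, b * 255 // 3],
                    axis=-1).astype(np.uint8)

# A smooth grey ramp with 256 distinct levels.
ramp = np.stack([np.arange(256)] * 3, axis=-1).astype(np.uint8)
indices = quantize_332(ramp)
restored = palette_332()[indices]
print(np.unique(indices).size, "distinct levels remain")
```

The ramp collapses into a small number of bands, which is the contouring effect seen in smoothly graduated areas.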
The effects of this are clearly shown in Figure 17.4, which depicts an original TIF image and the appearance of the image and its histogram, after it is saved as a GIF. Contouring or posterization artefacts are indicated on the histogram and are clearly visible in the image, particularly in areas of smoothly graduating tones. The significant degree of compression achieved is indicated by the file sizes at the bottom. The visibility of the artefacts means that GIF is only really suitable for photographic images that are small and/or being viewed as part of a quickly moving sequence, hence it is still sometimes used on web pages. However, the JPEG format can achieve significant compression without the same degree of quality loss on continuous-tone images; therefore it is now more commonly used for Internet images. The GIF format is much more successful for images containing graphics with solid colour or text.
All GIF files begin with a header and a logical screen descriptor, optionally followed by a global colour table. The header simply identifies the file as a GIF format; the logical screen descriptor provides colour and screen information necessary to correctly display the image, such as the minimum screen resolution required to display the image without scaling. The global colour table is an optional colour palette, which is used as the default table to index the colour data from the pixel values within the contained image file (or files). However, each image included in the file may use its own local colour table, which then overrides the use of the global colour table.
In GIF89a, extension information is included after this information and before the individual image information. This information is termed control extension information, as it controls the way in which the graphical data within the file, both bitmap and text, is rendered.
Following this, in both specifications, is the image information for one or multiple images. This is made up of a logical image descriptor, a local colour table and the image data. The logical image descriptor describes the position of the image on screen and the height and width in pixels, as well as information about whether a local colour table is attached to it, and how many bits are allocated to each local colour table indexed entry. The image data are the output from LZW compression and are organized in a series of data sub-blocks rather than a continuous stream. The GIF file ends with a trailer, which is a single byte of data, always the same value, marking the end of the file.
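The opening blocks of this layout are simple enough to read directly. The sketch below parses the header, logical screen descriptor and global colour table from a byte string and checks for the trailer byte (0x3B). The sample data are synthetic and contain no image blocks; the function name and the returned dictionary keys are illustrative only.

```python
import struct

def parse_gif_start(data):
    """Parse the header, logical screen descriptor and global colour table."""
    if data[:3] != b"GIF":
        raise ValueError("not a GIF file")
    version = data[3:6].decode("ascii")          # '87a' or '89a'
    width, height, packed, background, aspect = struct.unpack_from(
        "<HHBBB", data, 6)
    has_table = bool(packed & 0x80)              # global colour table flag
    entries = 2 ** ((packed & 0x07) + 1) if has_table else 0
    table = data[13:13 + 3 * entries]            # 3 bytes (RGB) per entry
    palette = [tuple(table[i:i + 3]) for i in range(0, len(table), 3)]
    return {
        "version": version,
        "screen_size": (width, height),
        "background_index": background,
        "palette": palette,
        "ends_with_trailer": data[-1] == 0x3B,
    }

# Synthetic example: 2 x 2 logical screen with a two-colour global table.
sample = (
    b"GIF89a"
    + struct.pack("<HHBBB", 2, 2, 0x80, 0, 0)
    + bytes([0, 0, 0, 255, 255, 255])
    + b"\x3B"
)
info = parse_gif_start(sample)
print(info)
```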
The PNG format (file extension .png) is a lossless raster format designed as an alternative to (and an improvement on) the GIF format and, most importantly, it is an open specification which is patent free. It was developed with two main applications in mind: as a storage format for images to be used on the World Wide Web and as an intermediate storage format in image editing.
It has several advantages over the GIF format. It compresses images using the ‘Deflate’ algorithm, a combination of LZ77 and Huffman coding (see Chapter 29), which provides greater levels of lossless compression than the LZW compression in GIF, particularly on very small images. Additionally, it defines a set of prediction filters, which are applied to the image data before compression to improve the compression results. Perhaps most importantly, where GIF is limited to a maximum 8-bit colour palette, PNG provides extensive bit-depth support. As well as supporting 1-, 2-, 4- and 8-bit paletted images, it also supports up to 16-bit greyscale images and up to 16-bit per channel RGB images. In addition to allowing transparency equivalent to that of GIF, PNG allows the inclusion of an alpha channel, which can provide ‘variable transparency’, such as that used in image masks, with 8- or 16-bit greyscale or RGB images, hence its suitability as an intermediate editing format. It does not, however, support CMYK images, as its main applications use images displayed on screen. It also does not provide multi-image support so cannot store animations.
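The benefit of filtering before compression can be seen with the simplest PNG filter, ‘Sub’, which replaces each byte by its difference from the byte to its left. The sketch below assumes one byte per pixel and uses Python's `zlib` module for Deflate; the test image is a hypothetical noisy ramp generated for the purpose, and the per-scanline filter-type byte that a real PNG stream carries is omitted.

```python
import random
import zlib

def sub_filter(row):
    """PNG 'Sub' filter for 1 byte per pixel: difference from the left neighbour."""
    return bytes((row[i] - (row[i - 1] if i else 0)) % 256
                 for i in range(len(row)))

# Hypothetical test image: 64 rows, each a ramp rising by 0, 1 or 2 per pixel.
random.seed(0)
rows = []
for _ in range(64):
    value, row = 0, []
    for _ in range(256):
        value = (value + random.choice((0, 1, 2))) % 256
        row.append(value)
    rows.append(bytes(row))

raw = b"".join(rows)
filtered = b"".join(sub_filter(r) for r in rows)
raw_size = len(zlib.compress(raw, 9))
filtered_size = len(zlib.compress(filtered, 9))
print("unfiltered:", raw_size, "bytes; Sub-filtered:", filtered_size, "bytes")
```

The filtered data contain only a handful of distinct byte values, which Deflate codes far more compactly than the slowly rising values of the original rows. The filter is exactly reversible, so no information is lost.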
PNG has a range of other useful extra features, such as the facility to include information for gamma correction to ensure the correct display of images across different browsers and platforms, and to embed text annotations in the image file, for copyright information, for example. The latest version supports ICC colour management. The format also supports interlacing, which is a method in which the image information is selected and transmitted in a number of different passes, taking some lines (or blocks) of pixels but not others at each pass until all have been selected. The image information can be filled in at the other end progressively from each pass. If the interlacing method is successful, then aspects of the image content will appear in a rough version when the image is first displayed in a web browser and the details will be filled in as each subsequent pass is received. GIF also supports interlacing, although using a slightly less sophisticated method than PNG. It should be noted here that this is mainly relevant for images being transmitted across a modem. The fact that broadband connections are now commonplace may be one of the reasons that the adoption of PNG has not been as widespread as was expected.
The history of the PNG format is interesting, as it is a demonstration of the influence of the Internet in the sharing and promulgation of information to a common end. The announcement that GIF was to be partially licensed as a result of the patent on the LZW compression method held by Unisys led to the formation in 1995 of an informal Internet working group, now known as the PNG Development Group, led by Thomas Boutell. He posted a draft format called the Portable Bitmap Format on a number of relevant newsgroup websites and the group was established from interested parties with the aim of developing a royalty-free alternative to GIF. The format was developed very much by consensus; within a few weeks most of its major features had been proposed, and within a few months seven new drafts of the format had been developed.
The PNG specification (version 0.92) was released by the World Wide Web Consortium in 1995 as a working draft, followed a few months later by version 0.95 as an Internet draft by the Internet Engineering Task Force (IETF). The update to version 1.1 in 1998 included new information for cross-platform colour correction using sRGB and a revision to the gamma correction method. This was followed by the current specification, version 1.2, in 1999 with some minor revisions. In 1997 PNG had been formally adopted by the Virtual Reality Modelling Language (VRML) Architecture Group as one of the required image formats for conformance with VRML 2.0. This helped to progress it towards standardization. It became the joint ISO/IEC (International Electrotechnical Commission) standard 15948 in 2004.
Despite some of the advantages that PNG has over some equivalent formats, particularly in its use on the Internet, it has not yet been widely adopted by users instead of GIF and JPEG. This is partly due to the fact that it is not as widely supported in application software and web browsers as the other two. It may also be due to its inability to store animated sequences. It is quite clear that PNG is more successful than GIF for storing continuous-tone images but GIF does not tend to be widely used for such images unless they are very small. GIF is useful mainly for images containing graphics; it may be that the majority of users continue with it out of habit, because it is good enough for their requirements. By contrast, JPEG is a lossy format, resulting in much smaller file sizes than the PNG equivalent, as it is designed for continuous-tone images. The artefacts produced may not be problematic on low-resolution images on the Internet. It is possible, now that images are being transmitted across higher bandwidths, that PNG may come into its own as a lossless format for larger images; however, it also has competition from JPEG 2000, another format yet to be widely adopted.
A PNG file begins with a PNG file signature, which is a set of 8 bytes always containing the same decimal values, which signify that a PNG file follows. It is then made up of a series of chunks, each of which consists of a length field, a chunk type code, the chunk data and a cyclic redundancy check (CRC) value. The chunks are of two types: critical and ancillary.
The critical chunks include: a header chunk containing data about image dimensions, bit depth, colour type, compression and compression filtering methods and interlace method; an optional colour palette chunk (maximum 8 bits); one or more image data chunks, which hold the output data from the compression algorithm; and an image trailer chunk marking the end of the PNG file. The ancillary chunks are optional chunks which allow developers to include other useful information in the file and fully utilize the potential features of the PNG format. They include, amongst other information, background colour, primary chromaticities and white point to enable colour management support, image gamma for gamma correction, the image histogram, image transparency and text chunks.
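The chunk structure can be explored with the standard library alone. The following sketch builds a minimal 1 × 1 pixel, 8-bit greyscale PNG in memory and then walks its chunks, verifying each CRC. The helper names are illustrative; the critical or ancillary status is read from the case of the first letter of the chunk type (upper case for critical).

```python
import struct
import zlib

PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])

def make_chunk(chunk_type, data):
    """Assemble length + type + data + CRC (CRC covers type and data)."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def read_chunks(png):
    """Return a list of (type, is_critical, data) for every chunk."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    pos, chunks = 8, []
    while pos < len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from(">I", png, pos + 8 + length)
        if crc != zlib.crc32(chunk_type + data):
            raise ValueError("corrupt chunk")
        is_critical = chunk_type[:1].isupper()
        chunks.append((chunk_type.decode("ascii"), is_critical, data))
        pos += 12 + length   # 4 length + 4 type + data + 4 CRC
    return chunks

# Header fields: width, height, bit depth, colour type (0 = greyscale),
# compression method, filter method, interlace method.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x80")   # filter-type byte + one mid-grey pixel
png = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

chunks = read_chunks(png)
for name, critical, data in chunks:
    print(name, "critical" if critical else "ancillary", len(data), "bytes")
```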
Photoshop Document (PSD; extension .psd) is a proprietary image file format. It is the native bitmap file format of Adobe Photoshop – the most commonly used image-editing application – and although it is a de facto standard format for designers and photography professionals it is not considered a general-purpose interchange format, rather an intermediate editing format. Because PSD is an application-based format it is expected to change in the future. It is a complex format but its strength lies in the fact that it stores image layers, effects, paths and other Photoshop-specific elements. Photoshop and a few other editing and desktop publishing applications can write PSD files while a few additional ones can read them.
PSD provides good support for various colour modes, storing binary, indexed colour, greyscale, half-tone, RGB, CMYK and L* a* b* image data at bit depths of up to 16 bits per channel. The maximum spatial resolution supported by the format is 30,000 × 30,000 pixels. Image data are stored uncompressed, or using the incorporated RLE lossless compression (a compression algorithm used by the Macintosh ROM and the TIFF standard), with the result that PSD files are usually very large.
The structure of PSD is illustrated in Figure 17.5. Files consist of a header and three informational sections, followed by the image data. The header includes fields for the file identification number (signature), the number of channels in the image (from 1 to 24), image dimensions, bit depth and the colour mode of the file. The other sections are of variable length. Only indexed colour and duotone images have colour mode data. For indexed colour images the colour data contain the colour table for the image. For duotone half-tone images, the colour data contain the duotone specification. Applications other than Photoshop that read PSD files treat a duotone image as a greyscale image. The Image Resources section consists of non-pixel data associated with the image; it is followed by the Layer and Mask section, where each layer and mask is documented, and then by the image data.
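The fixed-length header described above can be decoded in a few lines. This sketch assumes the layout given in Adobe's published file format documentation: a 26-byte big-endian header consisting of the signature ‘8BPS’, a version number, 6 reserved bytes, the channel count, height, width, bit depth and colour mode. The sample header is synthetic, and the colour mode table lists only the commonly documented values.

```python
import struct

COLOUR_MODES = {0: "Bitmap", 1: "Greyscale", 2: "Indexed", 3: "RGB",
                4: "CMYK", 7: "Multichannel", 8: "Duotone", 9: "Lab"}

def parse_psd_header(data):
    """Decode the 26-byte PSD file header."""
    (signature, version, _reserved, channels,
     height, width, depth, mode) = struct.unpack_from(">4sH6sHIIHH", data, 0)
    if signature != b"8BPS":
        raise ValueError("not a PSD file")
    return {
        "version": version,
        "channels": channels,
        "size": (width, height),
        "bit_depth": depth,
        "colour_mode": COLOUR_MODES.get(mode, f"unknown ({mode})"),
    }

# Synthetic header: 3-channel, 640 x 480, 8 bits per channel, RGB.
sample = struct.pack(">4sH6sHIIHH", b"8BPS", 1, bytes(6), 3, 480, 640, 8, 3)
header = parse_psd_header(sample)
print(header)
```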
Photo CD (PCD; extension .pcd) is a proprietary image file format, attached to the Photo CD system. The Photo CD was a commercial system introduced by Kodak in 1992 but was discontinued in the mid-2000s (Photo CD discs are still available from an independent vendor in the USA). It provided film (negative or slide) development, high-quality scanning and storage of the digital image files on a CD-ROM for access from Kodak CD players, other media players and computers with suitable software. Although the Photo CD system has now been replaced by other Kodak services it is worth explaining PCD, as it is considered a high standard file format.
PCD is still used by some professionals and supported by most graphics software applications. The Photo YCC encoding (one luminance, Y, and two chrominance, C1 and C2, channels – see Chapter 23) of the PCD format is suitable for archiving master image files in cases where the original colorimetry needs to be preserved, because it is an input-referred encoding system, i.e. the transformation from sensor encoding to input-referred encoding is known (see Chapter 23). This allows PCD readers to convert the YCC data to device-independent colour spaces, such as CIELAB. PCD images are often used by imaging researchers as test images, because of their high quality and calibrated colour.
Images on Photo CD were scanned using high-quality scanners at a spatial resolution of 2200 or 4400 dpi, the latter offered with the Pro Photo CD service. Thirty-five-millimetre frames produced sampled files of 3072 × 2048 pixels (or 6144 × 4096 pixels with the Pro Photo CD for 35 mm, which also offered scans from 120 and 4 × 5 film sizes) for each of the red, green and blue channels, quantized to 8 bits per channel. RGB files were then converted to Photo YCC, an encoding that was primarily developed for the Photo CD system. They were compressed using a ‘visually lossless’ Kodak proprietary compression method and stored as 2.0–6.0 megabyte files. Compression was carried out in Photo YCC colour space, where much of the chrominance information was averaged across adjacent pixels and subsampled to reduce image size significantly. After two subsampling phases, the original and compressed image data were compared and the differences between them, the so-called residuals, were saved as two separate files. The residuals were recombined later with the subsampled data to rebuild two high-resolution versions of the image. The compression of chrominance information in the Photo YCC encoding takes advantage of the fact that the human visual system is much more sensitive to luminance differences than to differences in colour (or hue; see Chapters 4 and 5). Without compression each image would occupy 18 MB (or 72 MB for the Pro version) of disc storage.
PCD is a multi-resolution image format. Images can be opened at five separate resolutions because the information is stored in a so-called image pac. The main advantage of multi-resolution formats is that lower resolutions, intended for Internet, TV and multimedia applications, can be viewed and downloaded quickly. The Base image resolution contains the original data, averaged and subsampled. There are two low-resolution versions (Base/4 and Base/16) and two high-resolution versions (4 × Base and 16 × Base). The high resolution versions contain the Base plus the first (4 × Base) and second (16 × Base) residuals. A Base/N image contains 1/N the number of pixels in the Base image and an N× Base contains N times as many pixels as the Base image.
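The pixel dimensions of the five versions follow directly from the Base/N rule. The sketch below assumes a Base image of 768 × 512 pixels, which is inferred from the 3072 × 2048 pixel 16 × Base scan described earlier (16 times the pixel count means 4 times the width and height), and assumes three 8-bit channels for the uncompressed size.

```python
def image_pac(base=(768, 512)):
    """Pixel dimensions of the five resolutions in a Photo CD image pac."""
    # (name, multiply, divide) applied to both width and height; the pixel
    # count therefore scales by the square of the linear factor.
    levels = [("Base/16", 1, 4), ("Base/4", 1, 2), ("Base", 1, 1),
              ("4 x Base", 2, 1), ("16 x Base", 4, 1)]
    return {name: (base[0] * m // d, base[1] * m // d)
            for name, m, d in levels}

sizes = image_pac()
for name, (width, height) in sizes.items():
    print(f"{name:>9}: {width} x {height}")

# Uncompressed storage for the largest version: 3 channels, 1 byte each.
width, height = sizes["16 x Base"]
uncompressed_bytes = width * height * 3
print(uncompressed_bytes / 2 ** 20, "MB uncompressed")   # → 18.0
```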
The specifications of the PCD have been published and the format has been widely used by many libraries and archives. According to Kodak: ‘The PCD file format was designed to reliably perform a specific function. However, the equipment and software required to scan an image from film and write it to a PCD file is expensive.’ Thus, despite its advantages the PCD format’s future is unclear.
PostScript (PS; extension .ps) is a file format for files that are saved in the PostScript page description language (PDL). It is not an image file format. PostScript is a programming language that describes very accurately the appearance of a page, including vector graphics, high-quality bitmaps and text of various fonts for both print and display, and is used extensively in desktop publishing. The program code is processed by related software and hardware in printers (the Raster Image Processor – RIP) that support PostScript. The RIP interprets the PostScript code into a bitmap, producing graphical information (i.e. vector graphics and text are rasterized at a desired resolution). PostScript instructions can be rendered into a display for viewing instead of to paper, in which case an RIP is also used by the PS viewer. PostScript Level 1 was developed by Adobe in 1984 and has since gone through many revisions and updates. The current version is PostScript 3, introduced by Adobe in 1997.
One advantage of PostScript files is that they are rendered in exactly the same way by different PostScript-compatible printers and viewers, so they are device independent. The rich font system uses the PS graphics primitives to draw glyphs (or characters) as line art, which can be rendered at any resolution. PS files are relatively small in size as they contain instructions in ASCII form to be sent to the printer, rather than bitmap information. This applies unless they incorporate bitmaps, in which case the resulting PS files are large.
PostScript used to be a de facto standard for printed output, but this is no longer the case. The evolution of PostScript led to the development of Adobe Acrobat, which creates Portable Document Format (PDF) documents (see later).
Encapsulated PostScript files (EPS; extension .eps) are PostScript files that include a low-resolution ‘preview’ encapsulated within them, so that the content can be displayed without interpreting the associated PostScript code. EPS is essentially an image file format that focuses mainly on output. It is suitable for high-end print workflows but can be rendered by any application that renders bitmaps. An EPS file must contain, as a minimum, one comment giving the bounding box of the page area containing the image described by the EPS file. Applications can use this information to lay out the page, even if they are unable to directly render the PostScript inside. In recent years, applications have tended to ignore the embedded ‘preview’ of an EPS file and instead generate their own preview by interpreting the PostScript.
The PDF format (file extension .pdf) was released by Adobe Systems in 1993 and during the 1990s became the de facto standard for the storage and communication of documents. PDF was officially standardized and published by the International Organization for Standardization as ISO 32000-1:2008 in July 2008. It is not really an imaging format, rather it is a document format that allows the inclusion of images, providing support for both raster and vector data representation. A PDF file contains a fixed layout of a page and provides a complete description of the document, which includes graphics, images, text and fonts. The document is displayed as an image, which can be viewed or printed, making it particularly suitable for proofing composite images. Hence it was originally used mainly in desktop publishing applications. It has now found widespread use for images or documents which are to be printed from the web.
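The layout information that an application needs can be read without a PostScript interpreter. The sketch below assumes that the minimum page description referred to above is the `%%BoundingBox` comment required by the EPS specification, which gives the lower-left and upper-right corners of the artwork in points. The sample file content is hypothetical.

```python
import re

def eps_bounding_box(eps_text):
    """Return (llx, lly, urx, ury) from the %%BoundingBox comment, or None."""
    match = re.search(
        r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)",
        eps_text, re.MULTILINE)
    if match is None:
        return None
    return tuple(int(v) for v in match.groups())

# Hypothetical EPS file drawing a filled rectangle.
sample = """%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 0 0 200 100
%%EndComments
0 0 200 100 rectfill
%%EOF
"""
box = eps_bounding_box(sample)
width_pt, height_pt = box[2] - box[0], box[3] - box[1]
print(box, width_pt, height_pt)
```

A layout program can reserve a 200 × 100 point area for this file on the page and leave the actual rendering to the output device.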
A PDF document is represented in a way that is independent of the system on which it is displayed, i.e. the application software, the operating system and the hardware. It is a collection of objects, describing the appearance of a page or pages and includes structural information about the page layout. The graphics, text and images in the page are contained in a content stream of graphics objects which is painted on to the page, the layout and formatting all defined by the application creating the file. The main graphics objects are: path objects, which are vector descriptions of graphical objects (points, lines, curves, which may be filled shapes); text objects, which are glyphs representing text characters (which are included in a separate font data structure); and image objects, which are rectangular arrays of sampled values, essentially bitmaps. The images are embedded in the PDF file using a PDF-specific image format, from which the original image format may not be determined. The images have up to 64-bit colour support in RGB, YCbCr and CMYK colour spaces. The images may be uncompressed, losslessly compressed using various algorithms including LZW, or compressed using lossy JPEG compression. This means that the degree of quality loss may be controlled for different types of documents.
Specific software (the Acrobat family of products from Adobe) is necessary to view or create PDF files. Initially, Adobe charged for this software, which may have contributed to the relatively slow uptake of the format; however, Acrobat Reader is now distributed at no cost, making it feasible for distribution of PDF documents on the Internet, resulting in its current status as a de facto standard. Early versions of PDF and Acrobat allowed only the viewing and printing of documents, but more recent developments have added the facility to extract specific selections of text and images. There have been eight versions of PDF and Acrobat, culminating in PDF version 1.7 and Adobe Extension 3/Acrobat 9.0 in 2008. This has now been published as an open standard, ISO 32000-1:2008.
PDF is an extremely useful format in imaging workflow as a means for the communication of information, layouts and the overall look of a document. It is also suitable as a format to provide documents containing images for downloading and printing from the Internet. It is not, however, comparable to the other formats designed specifically for images; it has a specific purpose in document communication and should be used in this context, with more suitable image formats selected for the storage of images.
Adobe Photoshop Software Development Kit 2006. Adobe Systems Incorporated, USA.
Burns, P.D., Houchin, S., Parulski, K., Rabbani, M., 2002. Using JPEG 2000 in future digital cameras: advantages and challenges. Proc. ICIS’02 Conference, Tokyo, 371–372.
CIPA DC-008-2010: Exchangeable Image File Format for Digital Still Cameras: Exif Version 2.3 Camera and Imaging Products Association (CIPA) and Japan Electronics and Information Technology Industry Association (JEITA).
Common Image File Formats 2008. http://www.library.cornell.edu/preservation/tutorial.
Digital Negative (DNG) Specification 2008. Adobe Systems Incorporated, USA.
Encapsulated PostScript File Format Specification version 2.0 1996. Adobe Systems Incorporated, USA.
Fraser, B., 2004. Real World Camera Raw. Peachpit Press, Berkeley, CA, USA.
Gosney, M., et al., 1995. The Official Photo CD Handbook. Peachpit Press, Berkeley, CA, USA.
Graphics Interchange Format 87a Specification 1987. Compuserve Inc. Available from the World Wide Web Consortium website.
Graphics Interchange Format 89a Specification 1989. Compuserve Inc. Available online from the World Wide Web Consortium website.
Hamilton, E., 1992. JPEG File Interchange Format version 1.02 Specification. C-Cube Microsystems, available from the World Wide Web Consortium website.
ISO 12234-2:2001, 2001. Electronic Still-Picture Imaging – Removable Memory, Part 2: TIFF/EP Image Data Format. International Organization for Standardization.
ISO 12639:2004, 2004. Graphic Technology – Prepress Digital Data Exchange – TIFF/IT Tag Image File Format for Imaging Technology. International Organization for Standardization.
ISO 32000-1:2008, 2008. Document Management – Portable Document Format, Part 1: PDF 1.7. International Organization for Standardization.
Murray, D.J., vanRyper, W., 1996. Encyclopaedia of Graphics File Formats, second ed. O’Reilly & Associates, Cambridge, UK.
PDF Specification version 6, 2007. PDF Reference and related documentation. Adobe Systems (updated October 2007).
PNG (Portable Network Graphics) Specification, version 1.2, 1999. Portable Network Graphics Development Group.
PostScript Printer Description File Format Specification version 4.3, 1992. Adobe Systems Incorporated, USA.
Rabbani, M., Joshi, R., 2002. An overview of the JPEG 2000 Still Image Compression Standard. Signal Processing: Image Communication 17, 3–48.
TIFF Revision 6, 1992. Adobe Systems Incorporated, USA.
Wallace, G.K., 1991. The JPEG Still Picture Compression Standard. IEEE Transactions on Consumer Electronics, December.