3.1.3 Input Devices
Raster images have to come from somewhere, and any image that wasn’t com-
puted by some algorithm has to have been measured by some raster input device,
most often a camera or scanner. Even in rendering images of 3D scenes, pho-
tographs are used constantly as texture maps (see Chapter 11). A raster input
device has to make a light measurement for each pixel, and (like output devices)
it is usually based on an array of sensors.
Figure 3.7. The operation of a digital camera.

A digital camera is an example of a 2D array input device. The image sensor
in a camera is a semiconductor device with a grid of light-sensitive pixels. Two
common types of arrays are known as CCDs (charge-coupled devices) and CMOS
(complementary metal–oxide–semiconductor) image sensors. The camera's lens
projects an image of the scene to be photographed onto the sensor, and then each
pixel measures the light energy falling on it, ultimately resulting in a number that
goes into the output image (Figure 3.7). In much the same way as color displays
use red, green, and blue subpixels, most color cameras work by using a color-filter
array or mosaic to allow each pixel to see only red, green, or blue light, leaving
the image processing software to fill in the missing values in a process known as
demosaicking (Figure 3.8).
Figure 3.8. Most color digital cameras use a color-filter array similar to the
Bayer mosaic shown here. Each pixel measures either red, green, or blue light.
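
To make demosaicking concrete, here is a minimal sketch of the simplest approach,
bilinear interpolation, assuming an RGGB Bayer layout and using NumPy/SciPy for
brevity. The layout, kernels, and function name are illustrative assumptions, not
the book's algorithm; real cameras use considerably more sophisticated methods.

    import numpy as np
    from scipy.ndimage import convolve

    def demosaic_bilinear(raw):
        """Reconstruct an RGB image from an RGGB Bayer mosaic by bilinearly
        interpolating each color channel's known samples."""
        h, w = raw.shape
        rows, cols = np.mgrid[0:h, 0:w]

        # Which pixels carry which color in an RGGB layout.
        r_mask = (rows % 2 == 0) & (cols % 2 == 0)
        b_mask = (rows % 2 == 1) & (cols % 2 == 1)
        g_mask = ~(r_mask | b_mask)

        # Kernels that average a pixel's known neighbors; dividing by the
        # convolved mask normalizes the weights, including at the borders.
        k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 4.0
        k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], float) / 4.0

        def interpolate(mask, kernel):
            known = np.where(mask, raw, 0.0)
            weights = convolve(mask.astype(float), kernel, mode="mirror")
            return convolve(known, kernel, mode="mirror") / weights

        return np.stack([interpolate(r_mask, k_rb),
                         interpolate(g_mask, k_g),
                         interpolate(b_mask, k_rb)], axis=-1)
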
Other cameras use three separate arrays, or three separate layers in the array, to
measure independent red, green, and blue values at each pixel, producing a usable
color image without further processing. The resolution of a camera is determined
by the fixed number of pixels in the array and is usually quoted using the total
count of pixels: a camera with an array of 3000 columns and 2000 rows produces
an image of resolution 3000 × 2000, which has 6 million pixels, and is called a
6 megapixel (MP) camera. (People who are selling cameras use "mega" to mean
10^6, not 2^20 as with megabytes.) It's important to remember that a mosaic
sensor does not measure a complete color image, so a camera that measures the
same number of pixels but with independent red, green, and blue measurements
records more information about the image than one with a mosaic sensor.
A flatbed scanner also measures red, green, and blue values for each of a grid
of pixels, but like a thermal dye transfer printer it uses a 1D array that sweeps
across the page being scanned, making many measurements per second. The
resolution across the page is fixed by the size of the array, and the resolution
along the page is determined by the frequency of measurements compared to
the speed at which the scan head moves. (The resolution of a scanner is
sometimes called its "optical resolution" since most scanners can produce
images of other resolutions, via built-in conversion.) A color scanner has a
3 × n_x array, where n_x is the number of pixels across the page, with the three
rows covered by red, green, and blue filters. With an appropriate delay between
the times at which the three colors are measured, this allows three independent
color measurements at each grid point. As with continuous-tone printers, the
resolution of scanners is reported in pixels per inch (ppi).
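
To make the along-page resolution concrete, here is a small worked example with
hypothetical numbers (the rates below are illustrative, not from the book):

    # Along-page resolution = measurement rate / scan-head speed.
    samples_per_second = 1200.0              # hypothetical rate of the 1D sensor array
    head_speed_inches_per_second = 2.0       # hypothetical scan-head speed
    print(samples_per_second / head_speed_inches_per_second)   # 600.0 ppi along the page
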
With this concrete information about where our images come from and where
they will go, we’ll now discuss images more abstractly, in the way we’ll use them
in graphics algorithms.
Figure 3.9. The operation of a flatbed scanner.
3.2 Images, Pixels, and Geometry
We know that a raster image is a big array of pixels, each of which stores informa-
tion about the color of the image at its grid point. We’ve seen what various output
devices do with images we send to them and how input devices derive them from
images formed by light in the physical world. But for computations in the com-
puter we need a convenient abstraction that is independent of the specifics of any
device, that we can use to reason about how to produce or interpret the values
stored in images.
When we measure or reproduce images, they take the form of two-dimensional
distributions of light energy: the light emitted from the monitor as a function of
position on the face of the display; the light falling on a camera's image sensor as
a function of position across the sensor's plane; the reflectance, or fraction of
light reflected (as opposed to absorbed), as a function of position on a piece of
paper. ("A pixel is not a little square!" —Alvy Ray Smith (A. R. Smith, 1995).)
So in the physical world, images are functions defined over two-dimensional
areas—almost always rectangles. So we can abstract an image as a function

    I(x, y) : R → V,

where R ⊂ ℝ^2 is a rectangular area and V is the set of possible pixel values.
(Are there any raster devices that are not rectangular?) The simplest case is an
idealized grayscale image where each point in the rectangle has just a brightness
(no color), and we can say V = ℝ^+ (the non-negative reals). An idealized color
image, with red, green, and blue values at each pixel, has V = (ℝ^+)^3. We'll
discuss other possibilities for V in the next section.
How does a raster image relate to this abstract notion of a continuous image?
Looking to the concrete examples, a pixel from a camera or scanner is a measure-
ment of the average color of the image over some small area around the pixel. A
display pixel, with its red, green, and blue subpixels, is designed so that the aver-
age color of the image over the face of the pixel is controlled by the corresponding
pixel value in the raster image. In both cases, the pixel value is a local average
of the color of the image, and it is called a point sample of the image. In other
words, when we find the value x in a pixel, it means "the value of the image in the
vicinity of this grid point is x." The idea of images as sampled representations of
functions is explored further in Chapter 9.
A mundane but important question is where the pixels are located in 2D space.
This is only a matter of convention, but establishing a consistent convention is
important! In this book, a raster image is indexed by the pair (i, j) indicating the
column (i) and row (j) of the pixel, counting from the bottom left. (In some APIs,
and many file formats, the rows of an image are organized top-to-bottom, so that
(0, 0) is at the top left. This is for historical reasons: the rows in analog
television transmission started from the top.) If an image has n_x columns and
n_y rows of pixels, the bottom-left pixel is (0, 0) and the top-right is pixel
(n_x - 1, n_y - 1). We need 2D real screen coordinates to specify pixel positions.
We will place the pixels' sample points at integer coordinates, as shown by the
4 × 3 screen in Figure 3.10.

Figure 3.10. Coordinates of a four pixel × three pixel screen. Note that in some
APIs the y-axis will point downwards.
The rectangular domain of the image has width n_x and height n_y and is centered
on this grid, meaning that it extends half a pixel beyond the last sample point on
each side. So the rectangular domain of an n_x × n_y image is

    R = [-0.5, n_x - 0.5] × [-0.5, n_y - 0.5].

(Some systems shift the coordinates by half a pixel to place the sample points
halfway between the integers but place the edges of the image at integers.)
Again, these coordinates are simply conventions, but they will be important
to remember later when implementing cameras and viewing transformations.
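
As a concrete illustration of this convention, here is a minimal sketch mapping
between pixel indices and the continuous coordinates above; the function names
are illustrative, not from the book.

    import math

    def pixel_center(i: int, j: int) -> tuple[float, float]:
        """Sample point of the pixel in column i, row j (counting from the bottom left)."""
        return (float(i), float(j))

    def image_domain(nx: int, ny: int) -> tuple[float, float, float, float]:
        """(xmin, xmax, ymin, ymax) of the continuous image rectangle R."""
        return (-0.5, nx - 0.5, -0.5, ny - 0.5)

    def pixel_containing(x: float, y: float) -> tuple[int, int]:
        """Index (i, j) of the pixel whose unit square contains the point (x, y)."""
        return (math.floor(x + 0.5), math.floor(y + 0.5))

    # For the 4 x 3 screen of Figure 3.10: the domain is [-0.5, 3.5] x [-0.5, 2.5],
    # and the point (3.2, 1.9) falls in the top-right pixel (3, 2).
    assert image_domain(4, 3) == (-0.5, 3.5, -0.5, 2.5)
    assert pixel_containing(3.2, 1.9) == (3, 2)
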
3.2.1 Pixel Values
So far we have described the values of pixels in terms of real numbers, represent-
ing intensity (possibly separately for red, green, and blue) at a point in the image.
This suggests that images should be arrays of floating-point numbers, with either
one (for grayscale, or black and white, images) or three (for RGB color images)
32-bit floating-point numbers stored per pixel. This format is sometimes used,
when its precision and range of values are needed, but images have a lot of pixels,
and memory and bandwidth for storing and transmitting images are invariably
scarce. Just one ten-megapixel photograph would consume about 115 MB of
RAM in this format. (Why 115 MB and not 120 MB?)
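
The arithmetic behind that figure also answers the parenthetical question: "mega"
in megapixel means 10^6, while the megabytes of RAM are counted in units of
2^20 bytes.

    pixels = 10_000_000           # ten megapixels
    bytes_total = pixels * 3 * 4  # three 32-bit floats (4 bytes each) per pixel = 120,000,000 bytes
    print(bytes_total / 2**20)    # ~114.4, i.e., about 115 MB rather than 120
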
Less range is required for images that are meant to be displayed directly.
While the range of possible light intensities is unbounded in principle, any given
device has a decidedly finite maximum intensity, so in many contexts it is per-
fectly sufficient for pixels to have a bounded range, usually taken to be [0, 1] for
simplicity. For instance, the possible values in an 8-bit image are 0, 1/255, 2/255,
..., 254/255, 1. (The denominator of 255, rather than 256, is awkward, but being
able to represent 0 and 1 exactly is important.) Images stored with floating-point
numbers, allowing a wide range of values, are often called high dynamic range
(HDR) images to distinguish them from fixed-range, or low dynamic range (LDR),
images that are stored with integers. See Chapter 23 for an in-depth discussion
of techniques and applications for high dynamic range images.
Here are some pixel formats with typical applications:

- 1-bit grayscale—text and other images where intermediate grays are not
  desired (high resolution required);
- 8-bit RGB fixed-range color (24 bits total per pixel)—web and email
  applications, consumer photographs;
- 8- or 10-bit fixed-range RGB (24–30 bits/pixel)—digital interfaces to
  computer displays;
- 12- to 14-bit fixed-range RGB (36–42 bits/pixel)—raw camera images for
  professional photography;
- 16-bit fixed-range RGB (48 bits/pixel)—professional photography and
  printing; intermediate format for image processing of fixed-range images;
- 16-bit fixed-range grayscale (16 bits/pixel)—radiology and medical
  imaging;
- 16-bit "half-precision" floating-point RGB—HDR images; intermediate
  format for real-time rendering;
- 32-bit floating-point RGB—general-purpose intermediate format for
  software rendering and processing of HDR images.

Reducing the number of bits used to store each pixel leads to two distinc-
tive types of artifacts, or artificially introduced flaws, in images. First, encoding
images with fixed-range values produces clipping when pixels that would other-
wise be brighter than the maximum value are set, or clipped, to the maximum
representable value. For instance, a photograph of a sunny scene may include re-
flections that are much brighter than white surfaces; these will be clipped (even if
they were measured by the camera) when the image is converted to a fixed range
to be displayed. Second, encoding images with limited precision leads to quan-
tization artifacts, or banding, when the need to round pixel values to the nearest
representable value introduces visible jumps in intensity or color. Banding can
be particularly insidious in animation and video, where the bands may not be
objectionable in still images but become very visible when they move back and
forth.
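
A minimal sketch of the conversion that produces both artifacts, assuming pixel
values normalized to [0, 1] and an 8-bit target (NumPy for brevity; the function
names are illustrative, not from the book):

    import numpy as np

    def to_8bit(values):
        """Clip to the representable range, then quantize to one of 256 levels."""
        clipped = np.clip(values, 0.0, 1.0)               # clipping: anything brighter than 1 saturates
        return np.round(clipped * 255).astype(np.uint8)   # quantization: rounding can cause banding

    def to_float(values8):
        """Map 8-bit values back to [0, 1]; 0 and 255 map exactly to 0.0 and 1.0."""
        return values8.astype(np.float32) / 255.0
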
3.2.2 Monitor Intensities and Gamma
All modern monitors take digital input for the “value” of a pixel and convert this
to an intensity level. Real monitors have some non-zero intensity when they are
off because the screen reflects some light. For our purposes we can consider this
"black" and the monitor fully on as "white." We assume a numeric description
of pixel color that ranges from zero to one. Black is zero, white is one, and a
gray halfway between black and white is 0.5. Note that here “halfway” refers to
the physical amount of light coming from the pixel, rather than the appearance.
The human perception of intensity is non-linear and will not be part of the present
discussion; see Chapter 22 for more.
There are two key issues that must be understood to produce correct images
on monitors. The first is that monitors are non-linear with respect to input. For
example, if you give a monitor 0, 0.5, and 1.0 as inputs for three pixels, the
intensities displayed might be 0, 0.25, and 1.0 (off, one-quarter fully on, and
fully on). As an approximate characterization of this non-linearity, monitors are
commonly characterized by a γ ("gamma") value. This value is the degree of
freedom in the formula

    displayed intensity = (maximum intensity) a^γ,        (3.1)
where a is the input pixel value between zero and one. For example, if a monitor
has a gamma of 2.0, and we input a value of a = 0.5, the displayed intensity
will be one fourth the maximum possible intensity because 0.5^2 = 0.25. Note
that a = 0 maps to zero intensity and a = 1 maps to the maximum intensity
regardless of the value of γ.
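
As a concrete illustration of Equation (3.1), here is a minimal sketch assuming
intensities normalized to [0, 1]; the inverse mapping is the usual way to
compensate for this non-linearity, and the function names are illustrative, not
from the book.

    def displayed_intensity(a, gamma, max_intensity=1.0):
        """Intensity a monitor with the given gamma produces for input pixel value a."""
        return max_intensity * a ** gamma

    def gamma_correct(a, gamma):
        """Input value to send so the displayed intensity is proportional to a."""
        return a ** (1.0 / gamma)

    # A gamma-2.0 monitor displays an input of 0.5 at one quarter of maximum
    # intensity; sending 0.5 ** (1 / 2.0) ≈ 0.707 instead would display one half.
    assert abs(displayed_intensity(0.5, 2.0) - 0.25) < 1e-9
    assert abs(gamma_correct(0.5, 2.0) - 0.5 ** 0.5) < 1e-9
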
Describing a display's non-linearity using γ is only an approximation; we do not
need a great deal of accuracy in estimating the γ of a device. A nice visual way
to gauge the non-linearity is to find what value of a gives an intensity halfway
between black and white.