Chapter 21
Like sounds, digital images are discretized versions of their analog counterparts. Regular samples of color (instead of air pressure) are stored in a two-dimensional grid (instead of a one-dimensional array). Access to these samples allows us to change them in a variety of ways.
The following program requires image.py from Listing 24.3 to be in the same folder or directory.
Listing 21.1: Two-Tone Image
 1  # twotone.py
 2
 3  from image import ImagePPM
 4
 5  def luminance(c):
 6      r, g, b = c
 7      return 0.2 * r + 0.7 * g + 0.1 * b
 8
 9  def twotone(c, bright, cutoff, dark):
10      return bright if luminance(c) > cutoff else dark
11
12  def f(c):
13      return twotone(c, (255, 255, 255), 110, (0, 30, 70))
14
15  def applyfn(f, img):
16      width, height = img.size
17      for i in range(width):
18          for j in range(height):
19              img.putpixel((i, j), f(img.getpixel((i, j))))
20
21  def main():
22      img = ImagePPM.open("sup.ppm")
23      applyfn(f, img)
24      img.save("sup2.ppm")
25
26  main()
Computer programs represent images as two-dimensional grids of pixels. Pixel locations are given by (i, j) coordinates, with (0, 0) at the upper left corner; i increases as you move right and j increases as you move down. For example, the pixel coordinates for a tiny 5 × 5 image are:
(0, 0) | (1, 0) | (2, 0) | (3, 0) | (4, 0) |
(0, 1) | (1, 1) | (2, 1) | (3, 1) | (4, 1) |
(0, 2) | (1, 2) | (2, 2) | (3, 2) | (4, 2) |
(0, 3) | (1, 3) | (2, 3) | (3, 3) | (4, 3) |
(0, 4) | (1, 4) | (2, 4) | (3, 4) | (4, 4) |
Notice the differences between these coordinates and mathematical graphs in the xy-plane: the origin is in the upper left corner instead of lower left, and the j coordinate grows down instead of up.
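As a small sketch of this convention, the following snippet builds and prints the same coordinate grid. The names `width`, `height`, and `rows` are illustrative only. The outer loop runs over j so that each printed line corresponds to one row of pixels, top to bottom.

```python
# Build the (i, j) coordinates of a small image, one pixel row per list.
width, height = 5, 5  # illustrative size matching the 5 x 5 example

rows = []
for j in range(height):                   # j selects the row, growing downward
    row = [(i, j) for i in range(width)]  # i selects the column, growing rightward
    rows.append(row)
    print(" ".join(str(p) for p in row))
```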
Each pixel in a 24-bit RGB image has one byte of storage for each of three components: red (R), green (G), and blue (B). Each byte is interpreted as an unsigned integer, so its value is between 0 and 255. At 3 bytes per pixel, image files become large very quickly, so they are usually compressed.
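To get a feel for how quickly uncompressed pixel data grows, here is a short calculation. The helper name `raw_rgb_bytes` and the sample sizes are assumptions made for illustration; the count covers pixel data only, not any file header.

```python
def raw_rgb_bytes(width, height, bytes_per_pixel=3):
    """Return the number of bytes needed for uncompressed pixel data."""
    return width * height * bytes_per_pixel

# Illustrative image sizes, from tiny to full HD.
for w, h in [(5, 5), (640, 480), (1920, 1080)]:
    print(f"{w} x {h}: {raw_rgb_bytes(w, h):,} bytes")
```

A 1920 × 1080 image already needs more than 6 million bytes before any compression.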
Python programs represent RGB values as integer tuples. Common RGB colors are:
RGB Tuple | Color |
---|---|
(0, 0, 0) | black |
(255, 0, 0) | red |
(0, 255, 0) | green |
(0, 0, 255) | blue |
(255, 255, 0) | yellow |
(255, 0, 255) | magenta |
(0, 255, 255) | cyan |
(255, 255, 255) | white |
RGB values represent the intensities of red, green, and blue wavelengths of light, and so (0, 0, 0) is the darkest color because each wavelength has no intensity, and (255, 255, 255) is the brightest because each wavelength is at full intensity.
⇒ Caution: Mixing light is different from mixing paint. Light colors are additive, whereas paint colors are subtractive.
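Additive mixing can be sketched by adding RGB tuples channel by channel. The helper `add_light` is hypothetical, and clamping each sum at 255 is a simplifying assumption rather than a model of real optics.

```python
def add_light(c1, c2):
    """Combine two RGB tuples additively, clamping each channel at 255."""
    return tuple(min(a + b, 255) for a, b in zip(c1, c2))

red, green, blue = (255, 0, 0), (0, 255, 0), (0, 0, 255)

yellow = add_light(red, green)    # red light plus green light
magenta = add_light(red, blue)    # red light plus blue light
white = add_light(yellow, blue)   # all three primaries together
print(yellow, magenta, white)
```

The results match the table of common colors: adding more light moves toward white, whereas mixing more paint moves toward black.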
Many common image formats such as JPEG, GIF, and PNG compress image data in order to save space. Processing these files requires complex compression and decompression algorithms, which are beyond the scope of this course. PPM files contain uncompressed RGB image data which is easy to manipulate in code and can be viewed in a text editor. Thus, Listing 21.1 uses PPM rather than a more complex format. Use a tool such as GIMP, available at http://www.gimp.org, to convert your own images to PPM (ASCII) format. See page 167 for more information on the PPM format.
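To show what "can be viewed in a text editor" means, the sketch below writes a tiny image as plain text. It assumes the standard Netpbm plain (P3) layout: a magic number, the width and height, the maximum channel value, and then the RGB values. The helper `ppm_ascii` and the file name `tiny.ppm` are made up for this example, and whether ImagePPM accepts exactly this layout is not confirmed here.

```python
import os
import tempfile

def ppm_ascii(rows):
    """Return plain-text (P3) PPM data for a list of rows of RGB tuples."""
    height = len(rows)
    width = len(rows[0])
    lines = ["P3", f"{width} {height}", "255"]  # magic number, size, max value
    for row in rows:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"

# A 2 x 2 image: red and green on top, blue and white below.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]

text = ppm_ascii(tiny)
path = os.path.join(tempfile.gettempdir(), "tiny.ppm")  # hypothetical file name
with open(path, "w") as out:
    out.write(text)
print(text)
```

Opening the resulting file in a text editor shows the header followed by one line of numbers per pixel row.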
Listing 21.1 uses a new data type called ImagePPM to represent a PPM image. This is not a built-in type; it is defined in a module called image, and you can see its code in Listing 24.3. For now, you just need to know how to use ImagePPM objects.
There are two ways to create a new ImagePPM object, either from an existing PPM file or as a brand new image:
ImagePPM.open(fname) | Open an existing image file. |
ImagePPM.new((width, height)) | Create a new (all black) PPM image of the given size. |
⇒ Caution: The parameter of ImagePPM.new() is a tuple, so it looks like there is an extra set of parentheses.
Both ImagePPM.open() and ImagePPM.new() return an ImagePPM object that responds to these methods:
img.getpixel((i, j)) | Get RGB tuple at (i, j). |
img.putpixel((i, j), (r, g, b)) | Set pixel (i, j) to (r, g, b). |
img.save(fname) | Save img in file fname. |
The .getpixel() and .putpixel() methods take tuple parameters.
Listing 21.1 provides a framework for writing many image manipulation programs. The nested loops in applyfn() are key:
for i in range(width):
    for j in range(height):
        # modify pixel at location i, j
They allow us to access every pixel in the image.
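The framework can be tried without any image file by using a stand-in object. In the sketch below, `TinyImage` is a hypothetical in-memory class that only mimics the size, getpixel, and putpixel interface described above; it is not the ImagePPM class from Listing 24.3. The `invert` function is likewise just one example of a per-pixel function.

```python
class TinyImage:
    """Hypothetical stand-in mimicking the interface described in the text."""

    def __init__(self, size):
        self.size = size
        width, height = size
        # Start all black, like a newly created image.
        self._pixels = {(i, j): (0, 0, 0)
                        for i in range(width) for j in range(height)}

    def getpixel(self, ij):
        return self._pixels[ij]

    def putpixel(self, ij, color):
        self._pixels[ij] = color

def invert(c):
    """Return the photographic negative of an RGB tuple."""
    r, g, b = c
    return (255 - r, 255 - g, 255 - b)

def applyfn(f, img):
    """Replace every pixel of img with f(pixel), as in Listing 21.1."""
    width, height = img.size
    for i in range(width):
        for j in range(height):
            img.putpixel((i, j), f(img.getpixel((i, j))))

img = TinyImage((3, 2))
img.putpixel((1, 0), (255, 30, 70))
applyfn(invert, img)
print(img.getpixel((0, 0)), img.getpixel((1, 0)))
```

Swapping `invert` for any other one-parameter function changes the effect without touching the loops.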
The physical basis for RGB color is that human eyes have three types of cones that respond to different wavelengths of light: one centered around red, one around green, and one around blue. These cones differ noticeably in sensitivity, so our eyes are much more sensitive to green light than to red, and more sensitive to red light than to blue.
Luminance is an attempt to estimate the brightness of a color as perceived by our eyes. An approximate value based on current display technology (see, e.g., [1]) is

$$\text{luminance} = 0.2\,R + 0.7\,G + 0.1\,B$$
The coefficients 0.2, 0.7, and 0.1 add up to 1.0, so that the result stays between 0 and 255.
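The luminance function from Listing 21.1 can be tried on a few of the common colors listed earlier. The dictionary `colors` is only an illustrative selection.

```python
def luminance(c):
    """Approximate perceived brightness of an RGB tuple (Listing 21.1)."""
    r, g, b = c
    return 0.2 * r + 0.7 * g + 0.1 * b

colors = {
    "black": (0, 0, 0),
    "blue": (0, 0, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "white": (255, 255, 255),
}

for name, c in colors.items():
    print(f"{name:6s} {luminance(c):6.1f}")
```

Pure green comes out far brighter than pure red or pure blue, so with the cutoff of 110 used in Listing 21.1, green falls on the bright side while red and blue fall on the dark side.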
Every object in Python stores its data in variables called fields or instance variables (Python's own documentation calls them attributes). Most of the time, we access that data through an object's methods, but occasionally there is a need to access a field directly. The syntax to access an object's field is:
<object>.<field>
This is identical to the syntax for calling a method except that there are no parentheses after the name of a field.
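The difference in parentheses can be seen with an array from the standard library. The variable names here are illustrative.

```python
from array import array

a = array("i", [3, 1, 3])

code = a.typecode      # field: no parentheses, just reads stored data
threes = a.count(3)    # method: parentheses call it with an argument
method_obj = a.count   # a method name without parentheses is not called

print(code, threes, callable(method_obj))
```

Leaving the parentheses off a method name does not raise an error; it simply yields the method object itself, which is a common source of confusion.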
Chapter 19 introduced the typecode field of an array; recall that it used this dot syntax with no parentheses. Images have an important field that we need to access when opening an existing image:
img.size | Tuple (width, height) giving size of img in pixels. |
Line 16 unpacks the tuple, although we could also have used img.size[0] and img.size[1].
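The two approaches are equivalent, as this small sketch shows. The tuple `size` stands in for img.size and its values are made up.

```python
size = (640, 480)  # illustrative (width, height) tuple, standing in for img.size

# Tuple unpacking, as on line 16 of Listing 21.1.
width, height = size

# Equivalent access by index.
width_by_index = size[0]
height_by_index = size[1]

print(width, height, width_by_index, height_by_index)
```

Unpacking is usually easier to read because the names document which element is which.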
Finally, the function f in Listing 21.1 probably looks a little strange. What is its role?
The related functions applyfn() and twotone() should be easier to follow. The first applies any one-parameter function to each pixel of an image, which is useful well beyond this program. The second computes a two-tone color for an input color c based on its luminance. The issue is that the function applyfn() calls can take only one parameter, while twotone() takes four. This isn't really a problem, because any time we apply a two-tone effect we will want to specify the other three parameters anyway: bright, cutoff, and dark. That is what f() does: it creates a one-parameter function from twotone() by fixing values for bright, cutoff, and dark.
Python provides lambda expressions exactly for situations like this. Exercise 21.4 asks you to try it as an alternative to writing f().
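One possible shape for this is sketched below, applied to a short list of colors rather than an image. The names `WHITE`, `NAVY`, and `samples` are illustrative. The sketch also shows functools.partial from the standard library as a second way to fix the extra parameters; that is an addition for comparison, not something Listing 21.1 uses.

```python
from functools import partial

def luminance(c):
    r, g, b = c
    return 0.2 * r + 0.7 * g + 0.1 * b

def twotone(c, bright, cutoff, dark):
    return bright if luminance(c) > cutoff else dark

WHITE, NAVY = (255, 255, 255), (0, 30, 70)       # illustrative color names
samples = [(0, 255, 0), (255, 0, 0), (200, 200, 200)]

# A lambda expression builds the one-parameter function in place.
via_lambda = list(map(lambda c: twotone(c, WHITE, 110, NAVY), samples))

# partial() fixes the keyword arguments and leaves c as the only parameter.
via_partial = list(map(partial(twotone, bright=WHITE, cutoff=110, dark=NAVY),
                       samples))

print(via_lambda == via_partial)
```

Either form can be passed wherever a one-parameter function such as f() is expected.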
Discuss the results.
The preceding exercises only modify color based on each pixel’s own previous value and so may use applyfn() as is. However, the exercises that follow require creating a separate new image using ImagePPM.new(). Approach writing these functions by starting with the body of applyfn() and then modifying that code to create and return the appropriate new image.
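As a rough sketch of that approach, the function below flips an image left to right. A new image is needed because writing into the original would overwrite pixels that are still to be read. `TinyImage` is a hypothetical in-memory stand-in for the interface described earlier, not the ImagePPM class, and `mirror` is an example chosen for illustration.

```python
class TinyImage:
    """Hypothetical stand-in mimicking the interface described in the text."""

    def __init__(self, size):
        self.size = size
        width, height = size
        self._pixels = {(i, j): (0, 0, 0)
                        for i in range(width) for j in range(height)}

    def getpixel(self, ij):
        return self._pixels[ij]

    def putpixel(self, ij, color):
        self._pixels[ij] = color

def mirror(img):
    """Return a new image that is img flipped left to right."""
    width, height = img.size
    result = TinyImage((width, height))   # plays the role of ImagePPM.new()
    for i in range(width):
        for j in range(height):
            # Column i of the old image lands in column width - 1 - i.
            result.putpixel((width - 1 - i, j), img.getpixel((i, j)))
    return result

src = TinyImage((3, 1))
src.putpixel((0, 0), (255, 0, 0))
flipped = mirror(src)
print(flipped.getpixel((2, 0)), src.getpixel((0, 0)))
```

The loops are the same as in applyfn(); only the destination image and the target coordinates differ.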
Functions that change the size of the image can also usually be written in two ways: one that loops over the old size and one that loops over the new size.
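The two styles can be compared on a simple enlargement that doubles each dimension by repeating pixels. Pixel doubling is chosen here only as an example, and `TinyImage`, `double_by_new`, and `double_by_old` are hypothetical names; `TinyImage` stands in for the interface described earlier rather than the real ImagePPM class.

```python
class TinyImage:
    """Hypothetical stand-in mimicking the interface described in the text."""

    def __init__(self, size):
        self.size = size
        width, height = size
        self._pixels = {(i, j): (0, 0, 0)
                        for i in range(width) for j in range(height)}

    def getpixel(self, ij):
        return self._pixels[ij]

    def putpixel(self, ij, color):
        self._pixels[ij] = color

def double_by_new(img):
    """Loop over the NEW size; each new pixel looks up its source pixel."""
    width, height = img.size
    result = TinyImage((2 * width, 2 * height))
    for i in range(2 * width):
        for j in range(2 * height):
            result.putpixel((i, j), img.getpixel((i // 2, j // 2)))
    return result

def double_by_old(img):
    """Loop over the OLD size; each old pixel is written to four new pixels."""
    width, height = img.size
    result = TinyImage((2 * width, 2 * height))
    for i in range(width):
        for j in range(height):
            c = img.getpixel((i, j))
            for di in (0, 1):
                for dj in (0, 1):
                    result.putpixel((2 * i + di, 2 * j + dj), c)
    return result

src = TinyImage((2, 1))
src.putpixel((1, 0), (0, 255, 0))
big_a = double_by_new(src)
big_b = double_by_old(src)
print(big_a.size, big_a.getpixel((3, 1)))
```

Looping over the new size guarantees that every new pixel is assigned exactly once, which often makes that version easier to get right.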