568 22. Visual Perception
As shown in Figure 22.14, the packing density of rods drops to zero at the
center of the fovea. Away from the fovea, the rod density first increases and then
decreases. One result of this is that there is no foveal vision when illumination
is very low. The lack of rods in the fovea can be demonstrated by observing a
night sky on a moonless night, well away from any city lights. Some stars will
be so dim that they will be visible if you look at a point in the sky slightly to the
side of the star, but they will disappear if you look directly at them. This occurs
because when you look directly at these features, the image of the features falls
only on the cones in the retina, which are not sufficiently light sensitive to detect
the feature. Looking slightly to the side causes the image to fall on the more
light sensitive rods. Scotopic vision is also limited in acuity, in part because
of the lower density of rods over much of the retina and in part because greater
pooling of signals from the rods occurs in the retina in order to increase the light
sensitivity of the visual information passed back to the brain.
22.2.5 Motion
When reading about visual perception and looking at static figures on a printed
page, it is easy to forget that motion is pervasive in our visual experience. The
patterns of light that fall on the retina are constantly changing due to eye and body
motion and the movement of objects in the world. This section covers our ability
to detect visual motion. Section 22.3.4 describes how visual motion can be used
to determine geometric information about the environment. Section 22.4.3 deals
with the use of motion to guide our movement through the environment.
The detectability of motion in a particular pattern of light falling on the retina
is a complex function of speed, direction, pattern size, and contrast. The issue is
further complicated because simultaneous contrast effects occur for motion per-
ception in a manner similar to that observed in brightness perception. In the
extreme case of a single small pattern moving against a contrasting, homogenous
background, perceivable motion requires a rate of motion corresponding to
0.2°–0.3°/second of visual angle. Motion of the same pattern moving against a
textured pattern is detectable at about a tenth this speed.
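To relate these thresholds to a concrete viewing situation, the following is a minimal sketch that converts the speed of a small pattern on a surface into degrees of visual angle per second. The function name, the 600 mm viewing distance, the 1 mm/s speed, and the choice of the upper end of the 0.2°–0.3° range are illustrative assumptions. The geometry assumes motion perpendicular to the line of sight, near the center of view.

```python
import math

# Thresholds taken from the text above. Using the upper end of the
# 0.2-0.3 degrees/second range is an assumption made for illustration.
UNTEXTURED_THRESHOLD = 0.3                      # degrees of visual angle per second
TEXTURED_THRESHOLD = UNTEXTURED_THRESHOLD / 10  # "about a tenth this speed"


def angular_speed(speed, viewing_distance):
    """Angular speed in degrees/second of a point moving at `speed`
    (length units per second) perpendicular to the line of sight, seen
    from `viewing_distance` (same length units)."""
    return math.degrees(math.atan2(speed, viewing_distance))


# Hypothetical example: a dot drifting at 1 mm/s, viewed from 600 mm.
dot = angular_speed(1.0, 600.0)
print(f"{dot:.3f} deg/s")
print("detectable on a uniform background:", dot >= UNTEXTURED_THRESHOLD)
print("detectable on a textured background:", dot >= TEXTURED_THRESHOLD)
```

Under these assumptions the dot moves at a little under 0.1°/second, which falls below the threshold quoted for a homogenous background but above the lower threshold quoted for a textured one.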
With this sensitivity to retinal motion, combined with the frequency and ve-
locity of saccadic eye movements, it is surprising that the world usually appears
stable and stationary when we view it. The vision system accomplishes this in
three ways. Contrast sensitivity is reduced during saccades, reducing the visual
effects generated by these rapid changes in eye position. Between saccades, a
variety of sophisticated and complex mechanisms adjust eye position to compen-
sate for head and body motion and the motion of objects of interest in the world.
Finally, the visual system exploits information about the position of the eyes to
assemble a mosaic of small patches of high resolution imagery from multiple
fixations into a single, stable whole.
Figure 22.15. The aperture problem: (a) If a straight line or edge moves in such a way
that its end points are hidden, the visual information is not sufficient to determine the actual
motion of the line. (b) 2D motion of a line is unambiguous if there are any corners or other
distinctive markings on the line.
The motion of straight lines and edges is ambiguous if no endpoints or cor-
ners are visible, a phenomenon referred to as the aperture problem (Figure 22.15).
The aperture problem arises because the component of motion parallel to the line
or edge does not produce any visual changes. The geometry of the real world
is sufficiently complex that this rarely causes difficulties in practice, except for
intentional illusions such as barber poles. The simplified geometry and texturing
found in some computer graphics renderings, however, have the potential to
introduce inaccuracies in perceived motion.
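The geometric core of the aperture problem can be sketched in a few lines. The function below is a hypothetical helper, not part of any library. It removes the component of a 2D velocity that lies along a line, leaving only the perpendicular component, which is the only part that changes the image when no endpoints or corners are visible.

```python
import math


def visible_motion(velocity, line_direction):
    """Return the component of a 2D velocity perpendicular to a line."""
    vx, vy = velocity
    dx, dy = line_direction
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("line_direction must be non-zero")
    # Unit vector along the line.
    ux, uy = dx / length, dy / length
    # Component of the velocity that slides the line along itself.
    along = vx * ux + vy * uy
    # Subtracting it leaves only the perpendicular (visible) component.
    return (vx - along * ux, vy - along * uy)


# Two different true motions of a vertical edge seen through an aperture.
a = visible_motion((1.0, 0.0), (0.0, 1.0))
b = visible_motion((1.0, 5.0), (0.0, 1.0))
print(a, b)
```

Both calls yield the same visible motion even though the true velocities differ, which is why the motion of the line alone cannot be recovered.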
Real-time computer graphics, film, and video would not be possible without
an important perceptual phenomenon: discontinuous motion, in which a series of
static images are visible for discrete intervals in time and then move by discrete
intervals in space, can be nearly indistinguishable from continuous motion. The
effect is called apparent motion to highlight that the appearance of continuous
motion is an illusion.
Figure 22.16 illustrates the difference between continuous motion, which is
typical of the real world, and apparent motion, which is generated by almost all
dynamic image display devices. The motion plotted in Figure 22.16 (b) consists
of an average motion comparable to that shown in Figure 22.16 (a), modulated by
a high space-time frequency that accounts for the alternation between a stationary
pattern and one that moves discontinuously to a new location. Apparent perception
of continuous motion occurs because the visual system is insensitive to the
high frequency component of the motion.
Figure 22.16. Position plotted against time: (a) Continuous motion. (b) Discontinuous
motion with the same average velocity. Under some circumstances, the perception of
these two motion patterns may be similar.
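The two motion patterns of Figure 22.16 can be sketched as functions of time. This is a minimal illustration that assumes constant velocity and an idealized display that holds each image perfectly still until the next update. The function names and the example values are hypothetical.

```python
import math


def continuous_position(t, velocity):
    """Position at time t for motion at a constant velocity."""
    return velocity * t


def displayed_position(t, velocity, update_hz):
    """Position shown at time t when the image is only updated update_hz
    times per second and held stationary between updates."""
    # Index of the most recent image update at or before time t.
    frame = math.floor(t * update_hz)
    return velocity * frame / update_hz


velocity, update_hz = 10.0, 10.0  # hypothetical units per second and Hz
for t in (0.0, 0.26, 0.5, 0.99):
    print(t, continuous_position(t, velocity),
          displayed_position(t, velocity, update_hz))
```

The displayed position agrees with the continuous one at each update and lags it by at most `velocity / update_hz` in between, so the two share the same average velocity and differ only by a rapidly repeating sawtooth error.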
A compelling sense of apparent motion occurs when the rate at which individual
images appear is above about 10 Hz, as long as the positional changes
between successive images are not too great. This rate is not fast enough, however,
to produce a satisfying sense of continuous motion for most image display
devices. Almost all such devices introduce brightness variation as one image is
switched to the next. In well-lit conditions, the human visual system is sensitive
to this varying brightness for rates of variation up to about 80 Hz. In lower light,
detectability is present up to about 40 Hz. When the rate of alternating brightness
is sufficiently high, flicker fusion occurs and the variation is no longer visible.
To produce a compelling sense of visual motion, an image display must therefore
satisfy two separate constraints:
• images must be updated at a rate ≥ 10 Hz;
• any flicker introduced in the process of updating images must occur at a
  rate ≥ 60–80 Hz.
One solution is to require that the image update rate be greater than or equal to
60–80 Hz. In many situations, however, this is simply not possible. For computer
graphics displays, the frame computation time is often substantially greater than
12–15 msec. Transmission bandwidth and limitations of older monitor technolo-
gies limit normal broadcast television to 25–30 images per second. (Some HDTV
formats operate at 60 images/sec.) Movies update images at 24 frames/second
due to exposure time requirements and the mechanical difficulties of physically
moving film any faster than that.
Different display technologies solve this problem in different ways. Computer
displays refresh the displayed image at 70–80 Hz, regardless of how often the
contents of the image change. The term frame rate is ambiguous for such displays,
since two values are required to characterize this display: refresh rate, which
indicates the rate at which the image is redisplayed, and frame update rate, which
indicates the rate at which new images are generated for display. Standard non-
HDTV broadcast television uses a refresh rate of 60 Hz (NTSC, used in North
America and some other locations) or 50 Hz (PAL, used in most of the rest of
the world). The frame update rate is half the refresh rate. Instead of displaying
each new image twice, the display is interlaced by dividing alternating horizontal
image lines into even and odd fields and alternating the display of these even and
odd fields. Flicker is avoided in movies by using a mechanical shutter to blink
each frame of the film three times before moving to the next frame, producing a
refresh rate of 72 Hz while maintaining the frame update rate of 24 Hz.
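The distinction between refresh rate and frame update rate can be made concrete with a small sketch that checks the two constraints for the film and NTSC figures quoted above. The function names are hypothetical, and the 60 Hz flicker limit is an assumption that takes the lower end of the 60–80 Hz range.

```python
def frame_time_ms(rate_hz):
    """Time available per image, in milliseconds, at a given rate."""
    return 1000.0 / rate_hz


def satisfies_constraints(update_hz, refresh_hz,
                          min_update_hz=10.0, min_flicker_hz=60.0):
    """Check the two constraints: image updates fast enough for apparent
    motion, and brightness variation fast enough for flicker fusion."""
    return update_hz >= min_update_hz and refresh_hz >= min_flicker_hz


# Film: 24 new frames per second, each flashed three times by the shutter.
film_update_hz = 24.0
film_refresh_hz = film_update_hz * 3

# NTSC: 60 interlaced fields per second carrying 30 new images per second.
ntsc_refresh_hz = 60.0
ntsc_update_hz = ntsc_refresh_hz / 2

print(film_refresh_hz, satisfies_constraints(film_update_hz, film_refresh_hz))
print(satisfies_constraints(film_update_hz, film_update_hz))  # one flash per frame
print(ntsc_update_hz, satisfies_constraints(ntsc_update_hz, ntsc_refresh_hz))
print(frame_time_ms(80.0))  # time budget per image at an 80 Hz update rate
```

With a single flash per frame, film would meet the update constraint but not the flicker constraint. Repeating each frame raises the refresh rate without requiring any new images.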
The use of apparent motion to simulate continuous motion occasionally pro-
duces undesirable artifacts. Best known of these is the wagon wheel illusion in
which the spokes of a rotating wheel appear to revolve in the opposite direction
from what would be expected given the translational motion of the wheel. The
wagon wheel illusion is an example of temporal aliasing. Spokes, or other spa-
tially periodic patterns on a rotating disk, produce a temporally periodic signal
for viewing locations that are fixed with respect to the center of the wheel or disk.
Fixed frame update rates have the effect of sampling this temporally periodic signal
in time. If the temporal frequency of the sampled pattern is too high, undersampling
results in an aliased, lower temporal frequency appearing when the image
is displayed. Under some circumstances, this distortion of temporal frequency
causes a spatial distortion in which the wheel appears to move backwards. Wagon
wheel illusions are more likely to occur with movies than with video, since the
temporal sampling rate is lower.
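The aliasing behind the wagon wheel illusion can be sketched numerically. The example below assumes an idealized wheel with identical, evenly spaced spokes and instantaneous sampling at a fixed rate. The function names, the 12-spoke wheel, and the use of 60 Hz as the video sampling rate are illustrative assumptions.

```python
import math


def aliased_frequency(signal_hz, sample_hz):
    """Frequency seen after sampling, folded into the range
    [-sample_hz / 2, sample_hz / 2). A negative result means the
    pattern appears to run backwards."""
    nearest_multiple = math.floor(signal_hz / sample_hz + 0.5)
    return signal_hz - nearest_multiple * sample_hz


def apparent_rotation(rev_per_s, spokes, frame_hz):
    """Apparent revolutions per second of a wheel with identical,
    evenly spaced spokes when sampled at frame_hz."""
    # The spoke pattern repeats `spokes` times per revolution.
    pattern_hz = rev_per_s * spokes
    return aliased_frequency(pattern_hz, frame_hz) / spokes


# Hypothetical wheel: 12 spokes turning at 1.9 revolutions per second.
print(apparent_rotation(1.9, 12, 24.0))  # sampled at the film rate
print(apparent_rotation(1.9, 12, 60.0))  # sampled at an assumed 60 Hz
```

At 24 Hz the spoke pattern, which repeats at about 22.8 Hz, is undersampled and the result is a slow negative rotation, so the wheel appears to turn backwards. At 60 Hz the same wheel is sampled often enough that its true rotation is recovered.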
Problems can also occur when apparent motion imagery is converted from
one medium to another. This is of particular concern when 24 Hz movies are
transferred to video. Not only does a non-interlaced format need to be translated
to an interlaced format, but there is no straightforward way to move from 24
frames per second to 50 or 60 fields per second. Some high-end display devices
have the ability to partially compensate for the artifacts introduced when film is
converted to video.
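One widely used approach to the 24 frames to 60 fields conversion, not described in the text above, is 3:2 pulldown, in which successive film frames are held for two and three fields alternately. The sketch below is only an illustration of that cadence. The function name and the field labels are hypothetical, and real transfers involve further details that are omitted here.

```python
from itertools import cycle


def pulldown_32(frames):
    """Map film frames to interlaced fields using a 2, 3, 2, 3, ... cadence.
    Returns a list of (frame, field_parity) pairs."""
    fields = []
    parity = cycle(("odd", "even"))  # fields alternate between the two line sets
    for frame, repeats in zip(frames, cycle((2, 3))):
        for _ in range(repeats):
            fields.append((frame, next(parity)))
    return fields


# Four film frames become ten video fields.
print("".join(frame for frame, _ in pulldown_32("ABCD")))
# One second of film (24 frames) becomes 60 fields.
print(len(pulldown_32(range(24))))
```

Because frames are shown for unequal lengths of time, and some video frames combine fields from two different film frames, the cadence itself is a source of the motion artifacts mentioned above.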
22.3 Spatial Vision
One of the critical operations performed by the visual system is the estimation of
geometric properties of the visible environment, since these are central to determining
information about objects, locations, and events. Vision has sometimes
been described as inverse optics, to emphasize that one function of the visual sys-
tem is to invert the image formation process in order to determine the geometry,
materials, and lighting in the world that produced a particular pattern of light
on the retina. The central problem for a vision system is that properties of the
visible environment are confounded in the patterns of light imaged on the retina.
Brightness is a function of both illumination and reflectance, and can depend on
environmental properties across large regions of space due to the complexities of
light transport. Image locations of a projected environmental location at best can
be used to constrain the position of that location to a half-line. As a consequence,
it is rarely possible to uniquely determine the nature of the world that produced a
particular imaged pattern of light.
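The half-line ambiguity can be illustrated with an idealized pinhole projection, which is an assumption introduced here for simplicity and not a model of the optics of the eye. The function names and the unit focal length are hypothetical.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z), with z > 0, onto an
    image plane at distance focal_length from the center of projection."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the viewer")
    return (focal_length * x / z, focal_length * y / z)


def point_on_ray(image_point, depth, focal_length=1.0):
    """One of the infinitely many 3D points that project to image_point."""
    u, v = image_point
    return (u * depth / focal_length, v * depth / focal_length, depth)


# Points at very different depths along one line of sight all
# project to the same image location.
image_point = (0.25, -0.5)
for depth in (1.0, 2.0, 8.0):
    print(depth, project(point_on_ray(image_point, depth)))
```

Every depth gives the same image location, so the image location alone constrains the environmental point only to a half-line through the center of projection.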
Determining surface layout—the location and orientation of visible surfaces
in the environment—is thought to be a key step in human vision. Most discus-
sions of how the vision system extracts information about surface layout from the
patterns of light it receives divide the problem into a set of visual cues, with each
cue describing a particular visual pattern which can be used to infer properties
of surface layout along with the needed rules of inference. Since surface layout
can rarely be determined accurately and unambiguously from vision alone, the
process of inferring surface layout usually requires additional, non-visual infor-
mation. This can come from other senses or assumptions about what is likely to
occur in the real world.
Visual cues are typically divided into four categories. Ocularmotor cues
involve information about the position and focus of the eyes. Disparity cues in-
volve information extracted from viewing the same surface point with two eyes,
beyond that available just from the positioning of the eyes. Motion cues provide
information about the world that arises from either the movement of the observer
or the movement of objects. Pictorial cues result from the process of projecting
3D surface shapes onto a 2D pattern of light that falls on the retina. This sec-
tion deals with the visual cues relevant to the extraction of geometric information
about individual points on surfaces. More general extraction of location and shape
information is covered in Section 22.4.
22.3.1 Frames of Reference and Measurement Scales
Descriptions of the location and orientation of points on a visible surface must be
done within the context of a particular frame of reference that specifies the origin,
orientation, and scaling of the coordinate system used in representing the geometric
information. The human vision system uses multiple frames of reference,