22. Visual Perception (4/9)

As shown in Figure 22.14, the packing density of rods drops to zero at the
center of the fovea. Away from the fovea, the rod density first increases and then
decreases. One result of this is that there is no foveal vision when illumination
is very low. The lack of rods in the fovea can be demonstrated by observing a
night sky on a moonless night, well away from any city lights. Some stars will
be so dim that they will be visible if you look at a point in the sky slightly to the
side of the star, but they will disappear if you look directly at them. This occurs
because when you look directly at these features, the image of the features falls
only on the cones in the retina, which are not sufficiently light sensitive to detect
the feature. Looking slightly to the side causes the image to fall on the more
light-sensitive rods. Scotopic vision is also limited in acuity, in part because
of the lower density of rods over much of the retina and in part because greater
pooling of signals from the rods occurs in the retina in order to increase the light
sensitivity of the visual information passed back to the brain.

22.2.5 Motion

When reading about visual perception and looking at static figures on a printed
page, it is easy to forget that motion is pervasive in our visual experience. The
patterns of light that fall on the retina are constantly changing due to eye and body
motion and the movement of objects in the world. This section covers our ability
to detect visual motion. Section 22.3.4 describes how visual motion can be used
to determine geometric information about the environment. Section 22.4.3 deals
with the use of motion to guide our movement through the environment.

The detectability of motion in a particular pattern of light falling on the retina
is a complex function of speed, direction, pattern size, and contrast. The issue is
further complicated because simultaneous contrast effects occur for motion
perception in a manner similar to that observed in brightness perception. In the
extreme case of a single small pattern moving against a contrasting, homogeneous
background, perceivable motion requires a rate of motion corresponding to
0.2°–0.3°/second of visual angle. Motion of the same pattern moving against a
textured pattern is detectable at about a tenth this speed.

With this sensitivity to retinal motion, combined with the frequency and
velocity of saccadic eye movements, it is surprising that the world usually appears
stable and stationary when we view it. The vision system accomplishes this in
three ways. Contrast sensitivity is reduced during saccades, reducing the visual
effects generated by these rapid changes in eye position. Between saccades, a
variety of sophisticated and complex mechanisms adjust eye position to
compensate for head and body motion and the motion of objects of interest in the world.
Finally, the visual system exploits information about the position of the eyes to
assemble a mosaic of small patches of high resolution imagery from multiple
fixations into a single, stable whole.

Figure 22.15. The aperture problem: (a) If a straight line or edge moves in such a way
that its end points are hidden, the visual information is not sufficient to determine the actual
motion of the line. (b) 2D motion of a line is unambiguous if there are any corners or other
distinctive markings on the line.

The motion of straight lines and edges is ambiguous if no endpoints or
corners are visible, a phenomenon referred to as the aperture problem (Figure 22.15).
The aperture problem arises because the component of motion parallel to the line
or edge does not produce any visual changes. The geometry of the real world
is sufficiently complex that this rarely causes difficulties in practice, except for
intentional illusions such as barber poles. The simplified geometry and
texturing found in some computer graphics renderings, however, have the potential to
introduce inaccuracies in perceived motion.
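The ambiguity can be made concrete with a small sketch. It assumes a 2D velocity and an edge direction given as plain tuples; the function name `visible_motion` is hypothetical and only illustrates the geometry. Only the velocity component perpendicular to the edge survives, so two different true motions can yield the same observation.

```python
import math

def visible_motion(velocity, edge_direction):
    """Return the part of a 2D velocity that is observable for a featureless edge.

    Motion along the edge changes nothing in the image, so only the
    component along the edge normal is kept.
    """
    vx, vy = velocity
    ex, ey = edge_direction
    length = math.hypot(ex, ey)
    ex, ey = ex / length, ey / length      # unit vector along the edge
    nx, ny = ey, -ex                       # unit normal to the edge
    normal_speed = vx * nx + vy * ny       # signed speed along the normal
    return (normal_speed * nx, normal_speed * ny)

# A vertical edge seen through an aperture (illustrative values).
edge = (0.0, 1.0)
sideways = visible_motion((1.0, 0.0), edge)   # pure horizontal motion
diagonal = visible_motion((1.0, 5.0), edge)   # horizontal plus sliding along the edge
print(sideways == diagonal)                   # the two motions look the same
```

A corner or marking on the line, as in Figure 22.15(b), supplies a second constraint that this sketch deliberately leaves out.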

Real-time computer graphics, film, and video would not be possible without
an important perceptual phenomenon: discontinuous motion, in which a series of
static images are visible for discrete intervals in time and then move by discrete
intervals in space, can be nearly indistinguishable from continuous motion. The
effect is called apparent motion to highlight that the appearance of continuous
motion is an illusion.

Figure 22.16 illustrates the difference between continuous motion, which is
typical of the real world, and apparent motion, which is generated by almost all
dynamic image display devices. The motion plotted in Figure 22.16(b) consists
of an average motion comparable to that shown in Figure 22.16(a), modulated by
a high space-time frequency that accounts for the alternation between a stationary
pattern and one that moves discontinuously to a new location. Apparent
perception of continuous motion occurs because the visual system is insensitive to the
high frequency component of the motion.

Figure 22.16. (a) Continuous motion. (b) Discontinuous motion with the same average
velocity. Both panels plot position against time. Under some circumstances, the
perception of these two motion patterns may be similar.
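As a rough illustration of this decomposition, the sketch below models displayed motion as a sample-and-hold version of constant-velocity motion. The function names and the constant-velocity assumption are illustrative choices, not taken from the text. The difference between the two is a sawtooth that repeats once per displayed image, which is the high frequency component referred to above.

```python
import math

def continuous_position(t, velocity):
    """Position of an object moving at constant velocity."""
    return velocity * t

def displayed_position(t, velocity, update_hz):
    """Position shown by a display that holds each image until the next update."""
    frame_index = math.floor(t * update_hz)
    return velocity * frame_index / update_hz

def residual(t, velocity, update_hz):
    """Difference between continuous and displayed position (a sawtooth)."""
    return continuous_position(t, velocity) - displayed_position(t, velocity, update_hz)

# Illustrative values: 10 units/second shown at 24 images/second.
for t in (0.0, 0.01, 0.03, 0.05, 0.07):
    print(t, round(displayed_position(t, 10.0, 24.0), 4), round(residual(t, 10.0, 24.0), 4))
```

The residual never exceeds the distance travelled in one update interval, so raising the update rate shrinks the discontinuity as well as raising its frequency.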

A compelling sense of apparent motion occurs when the rate at which
individual images appear is above about 10 Hz, as long as the positional changes
between successive images are not too great. This rate is not fast enough,
however, to produce a satisfying sense of continuous motion for most image display
devices. Almost all such devices introduce brightness variation as one image is
switched to the next. In well-lit conditions, the human visual system is sensitive
to this varying brightness for rates of variation up to about 80 Hz. In lower light,
detectability is present up to about 40 Hz. When the rate of alternating brightness
is sufficiently high, flicker fusion occurs and the variation is no longer visible.

To produce a compelling sense of visual motion, an image display must
therefore satisfy two separate constraints:

• images must be updated at a rate ≥ 10 Hz;

• any flicker introduced in the process of updating images must occur at a
rate ≥ 60–80 Hz.

One solution is to require that the image update rate be greater than or equal to
60–80 Hz. In many situations, however, this is simply not possible. For computer
graphics displays, the frame computation time is often substantially greater than
12–15 msec. Transmission bandwidth and limitations of older monitor
technologies limit normal broadcast television to 25–30 images per second. (Some HDTV
formats operate at 60 images/sec.) Movies update images at 24 frames/second
due to exposure time requirements and the mechanical difficulties of physically
moving film any faster than that.

Different display technologies solve this problem in different ways. Computer
displays refresh the displayed image at ∼70–80 Hz, regardless of how often the
contents of the image change. The term frame rate is ambiguous for such displays,
since two values are required to characterize them: refresh rate, which
indicates the rate at which the image is redisplayed, and frame update rate, which
indicates the rate at which new images are generated for display. Standard
non-HDTV broadcast television uses a refresh rate of 60 Hz (NTSC, used in North
America and some other locations) or 50 Hz (PAL, used in most of the rest of
the world). The frame update rate is half the refresh rate. Instead of displaying
each new image twice, the display is interlaced by dividing alternating horizontal
image lines into even and odd fields and alternating the display of these even and
odd fields. Flicker is avoided in movies by using a mechanical shutter to blink
each frame of the film three times before moving to the next frame, producing a
refresh rate of 72 Hz while maintaining the frame update rate of 24 Hz.
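The relationship between the two rates can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, the 60 Hz flicker threshold is the lower end of the 60–80 Hz range quoted earlier, and an interlaced frame is treated as simply being shown in two passes, which glosses over the fact that each field contains only half the lines.

```python
MIN_UPDATE_HZ = 10.0    # apparent-motion threshold quoted in the text
MIN_FLICKER_HZ = 60.0   # assumed: lower end of the 60-80 Hz range in the text

def refresh_rate(frame_update_hz, shows_per_frame):
    """Refresh rate when every new image is presented shows_per_frame times."""
    return frame_update_hz * shows_per_frame

def satisfies_constraints(frame_update_hz, shows_per_frame):
    """True if both the update-rate and the flicker constraints are met."""
    fast_enough_updates = frame_update_hz >= MIN_UPDATE_HZ
    fast_enough_refresh = refresh_rate(frame_update_hz, shows_per_frame) >= MIN_FLICKER_HZ
    return fast_enough_updates and fast_enough_refresh

print(refresh_rate(24.0, 3), satisfies_constraints(24.0, 3))  # film with a three-blade shutter
print(refresh_rate(30.0, 2), satisfies_constraints(30.0, 2))  # NTSC: two fields per frame
print(refresh_rate(24.0, 1), satisfies_constraints(24.0, 1))  # film shown once per frame
```

Separating the two rates is the design choice that matters here: the update rate is set by what the medium can deliver, while the refresh rate is raised independently until flicker is no longer visible.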

The use of apparent motion to simulate continuous motion occasionally
produces undesirable artifacts. Best known of these is the wagon wheel illusion in
which the spokes of a rotating wheel appear to revolve in the opposite direction
from what would be expected given the translational motion of the wheel. The
wagon wheel illusion is an example of temporal aliasing. Spokes, or other
spatially periodic patterns on a rotating disk, produce a temporally periodic signal
for viewing locations that are fixed with respect to the center of the wheel or disk.
Fixed frame update rates have the effect of sampling this temporally periodic
signal in time. If the temporal frequency of the sampled pattern is too high,
undersampling results in an aliased, lower temporal frequency appearing when the
image is displayed. Under some circumstances, this distortion of temporal frequency
causes a spatial distortion in which the wheel appears to move backwards. Wagon
wheel illusions are more likely to occur with movies than with video, since the
temporal sampling rate is lower.
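The aliasing can be sketched numerically. The code below assumes a wheel with identical, evenly spaced spokes, so that the pattern repeats every time one spoke moves into the place of the next, and it folds the resulting temporal frequency into the range representable at a given sampling rate. The function names and the example numbers are illustrative assumptions.

```python
def aliased_frequency(signal_hz, sample_hz):
    """Frequency seen after sampling, folded into [-sample_hz/2, sample_hz/2]."""
    return signal_hz - sample_hz * round(signal_hz / sample_hz)

def apparent_rotation(spokes, revs_per_second, sample_hz):
    """Apparent rotation in revolutions/second; negative means backwards."""
    spoke_hz = spokes * revs_per_second   # rate at which spokes pass a fixed point
    return aliased_frequency(spoke_hz, sample_hz) / spokes

# An 8-spoke wheel turning at 2.75 revolutions/second (22 Hz spoke signal).
print(apparent_rotation(8, 2.75, 24.0))   # sampled at the 24 Hz film rate
print(apparent_rotation(8, 2.75, 60.0))   # sampled at 60 Hz
```

With these numbers the 22 Hz spoke signal exceeds half the 24 Hz sampling rate, so the sampled wheel appears to turn slowly backwards, while at 60 Hz the same wheel is sampled fast enough to keep its true rotation.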

Problems can also occur when apparent motion imagery is converted from
one medium to another. This is of particular concern when 24 Hz movies are
transferred to video. Not only does a non-interlaced format need to be translated
to an interlaced format, but there is no straightforward way to move from 24
frames per second to 50 or 60 fields per second. Some high-end display devices
have the ability to partially compensate for the artifacts introduced when film is
converted to video.
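The difficulty is easy to see in the numbers: 60 fields per second divided by 24 frames per second is 2.5 fields per frame, which is not a whole number. One widely used workaround, which is not described in the text and is shown here only as an illustration, is a 2:3 pulldown cadence in which film frames are alternately held for two and three fields. The function name below is hypothetical, and the sketch ignores which lines each field carries.

```python
def pulldown_2_3(film_frames):
    """Spread film frames over video fields using a repeating 2, 3 cadence."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeats = 2 if index % 2 == 0 else 3   # alternate two fields, then three
        fields.extend([frame] * repeats)
    return fields

print(pulldown_2_3(["A", "B", "C", "D"]))
print(len(pulldown_2_3(list(range(24)))))   # fields produced from one second of film
```

Because some frames are held longer than others, motion that was uniform on film becomes slightly uneven on video, which is one source of the conversion artifacts mentioned above.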

22.3 Spatial Vision

One of the critical operations performed by the visual system is the estimation of
geometric properties of the visible environment, since these are central to
determining information about objects, locations, and events. Vision has sometimes
been described as inverse optics, to emphasize that one function of the visual
system is to invert the image formation process in order to determine the geometry,
materials, and lighting in the world that produced a particular pattern of light
on the retina. The central problem for a vision system is that properties of the
visible environment are confounded in the patterns of light imaged on the retina.
Brightness is a function of both illumination and reflectance, and can depend on
environmental properties across large regions of space due to the complexities of
light transport. Image locations of a projected environmental location at best can
be used to constrain the position of that location to a half-line. As a consequence,
it is rarely possible to uniquely determine the nature of the world that produced a
particular imaged pattern of light.
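The half-line constraint can be illustrated with an idealized pinhole camera. This is a simplifying assumption for the sketch, not a model of the eye, and the function names and focal length are hypothetical. Every 3D point along the ray through an image location projects back to that same location, so the image alone cannot tell the points apart.

```python
import math

def project(point, focal_length=1.0):
    """Perspective projection of a 3D point (x, y, z), z > 0, onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def point_on_ray(image_point, depth, focal_length=1.0):
    """3D point at the given depth along the half-line through an image location."""
    u, v = image_point
    return (u * depth / focal_length, v * depth / focal_length, depth)

image_location = (0.2, -0.1)
for depth in (1.0, 2.5, 10.0):
    world_point = point_on_ray(image_location, depth)
    u, v = project(world_point)
    same = math.isclose(u, image_location[0]) and math.isclose(v, image_location[1])
    print(world_point, same)
```

Resolving which depth is correct requires information beyond the single image location, which is the role the visual cues in this section play.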

Determining surface layout (the location and orientation of visible surfaces
in the environment) is thought to be a key step in human vision. Most
discussions of how the vision system extracts information about surface layout from the
patterns of light it receives divide the problem into a set of visual cues, with each
cue describing a particular visual pattern that can be used to infer properties
of surface layout, along with the needed rules of inference. Since surface layout
can rarely be determined accurately and unambiguously from vision alone, the
process of inferring surface layout usually requires additional, non-visual
information. This can come from other senses or from assumptions about what is likely to
occur in the real world.

Visual cues are typically grouped into four categories. Ocularmotor cues
involve information about the position and focus of the eyes. Disparity cues
involve information extracted from viewing the same surface point with two eyes,
beyond that available just from the positioning of the eyes. Motion cues provide
information about the world that arises from either the movement of the observer
or the movement of objects. Pictorial cues result from the process of projecting
3D surface shapes onto a 2D pattern of light that falls on the retina. This
section deals with the visual cues relevant to the extraction of geometric information
about individual points on surfaces. More general extraction of location and shape
information is covered in Section 22.4.

22.3.1 Frames of Reference and Measurement Scales

Descriptions of the location and orientation of points on a visible surface must be
done within the context of a particular frame of reference that specifies the
origin, orientation, and scaling of the coordinate system used in representing the
geometric information. The human vision system uses multiple frames of reference,
