Chapter 8. Travel

As we described in the introduction to Part IV, navigation is a fundamental human task in our physical environment. We are also increasingly faced with navigation tasks in synthetic environments: navigating the Web via a browser, navigating a complex document in a word processor, navigating through many layers of information in a spreadsheet, or navigating the virtual world of a computer game. Navigation in 3D UIs is the subject of this chapter.

8.1 Introduction

This chapter discusses both aspects of navigation: travel and wayfinding. In the first edition of this book, we covered these topics in two separate chapters. However, because they are tightly integrated, we have combined the topics in this edition. The emphasis is on interaction techniques for travel, which are supported by wayfinding information.

8.1.1 Travel

Travel is the motor component of navigation—the task of moving from the current location to a new target location or moving in the desired direction. In the physical environment, travel is often a “no-brainer.” Once we formulate the goal to walk across the room and through the door, our brains can instruct our muscles to perform the correct movements to achieve that goal. However, when our travel goal cannot be achieved effectively with simple body movements (we want to travel a great distance, or we want to travel very quickly, or we want to fly), then we use vehicles (bicycles, cars, planes, etc.). All vehicles contain some interface that maps various physical movements (turning a wheel, depressing a pedal, flipping a switch) to travel.

In 3D UIs, the situation is similar: there are some 3D interfaces where simple physical motions, such as walking, can be used for travel (e.g., when head and/or body trackers are used), but this is only effective within a limited space at a very limited speed. For most travel in 3D UIs, our actions must be mapped to travel in other ways, such as through a vehicle metaphor. A major difference between real-world travel in vehicles and virtual travel, however, is that 3D UIs normally provide only visual motion cues, neglecting vestibular cues—this visual-vestibular mismatch can lead to cybersickness (see Chapter 3, “Human Factors Fundamentals,” section 3.3.2).

Interaction techniques for the task of travel are especially important for two major reasons. First, travel is easily the most common and universal interaction task in 3D interfaces. Although there are some 3D applications in which the user’s viewpoint is always stationary or where movement is automated, those are the exception rather than the rule. Second, travel (and navigation in general) often supports another task rather than being an end unto itself. Consider most 3D games: travel is used to reach locations where the user can pick up treasure, fight with enemies, or obtain critical information. Counterintuitively, the secondary nature of the travel task in these instances actually increases the need for usability of travel techniques. That is, if the user has to think about how to turn left or move forward, then he has been distracted from his primary task. Therefore, travel techniques must be intuitive—capable of becoming “second nature” to users.

8.1.2 Wayfinding

Wayfinding is the cognitive process of determining and following a route between an origin and a destination (Golledge 1999). It is the cognitive component of navigation—high-level thinking, planning, and decision-making related to user movement. It involves spatial understanding and planning tasks, such as determining the current location within the environment, determining a path from the current location to a goal location, and building a mental map of the environment. Real-world wayfinding has been researched extensively, with studies of aids like maps, directional signs, landmarks, and so on (Golledge 1999).

In virtual worlds, wayfinding can also be crucial. In a large, complex environment, an efficient travel technique is of no use if one has no idea where to go. When we speak of “wayfinding techniques,” we refer to designed wayfinding aids included as part of the interface or in the environment. Unlike travel techniques or manipulation techniques, where the computer ultimately performs the action, wayfinding techniques only support the performance of the task in the user’s mind (see Chapter 3, section 3.4.2 for a discussion of situation awareness, cognitive mapping, and the different types of spatial knowledge).

Clearly, travel and wayfinding are both part of the same process (navigation) and contribute towards achieving the same goals. However, from the standpoint of 3D UI design, we can generally consider them to be distinct. A travel technique is necessary to perform navigation tasks, and in some small or simple environments a good travel technique may be all that is necessary. In more complex environments, wayfinding aids may also be needed. In some cases, the designer can combine techniques for travel and wayfinding into a single integrated technique, reducing the cognitive load on the user and reinforcing the user’s spatial knowledge each time the technique is used. Techniques that make use of miniature environments or maps fit this description (see section 8.6.1), but these techniques are not suitable for all navigation tasks.

8.1.3 Chapter Roadmap

In this chapter, we discuss both interaction techniques for travel tasks and the design of wayfinding aids. We begin by describing specific types of travel tasks (section 8.2) and some classifications of travel techniques (section 8.3). We then discuss a wide variety of travel techniques, classified by their metaphors:

walking (section 8.4)

steering (section 8.5)

selection-based travel (section 8.6)

manipulation-based travel (section 8.7)

Section 8.8 completes the discussion of travel by discussing other aspects of travel technique design. The design of wayfinding aids is presented in section 8.9. Finally, we present design guidelines for navigation interfaces (section 8.10), followed by our case studies (section 8.11).

8.2 3D Travel Tasks

There are many different reasons why a user might need to perform a 3D travel task. Understanding the various types of travel tasks is important because the usability of a particular technique often depends on the task for which it is used. As we describe various techniques in the following sections, we provide guidance (as often as possible) on the task types for which a particular technique is appropriate. Experiments based on travel “testbeds” (e.g., Bowman, Johnson et al. 1999; Lampton et al. 1994; Nabiyouni and Bowman 2015) have attempted to empirically relate task type to technique usability. We classify travel tasks under the headings of exploration, search, and maneuvering.

8.2.1 Exploration

In an exploration or browsing task, the user has no explicit goal for her movement. Rather, she is browsing the environment, obtaining information about the objects and locations within the world and building up knowledge of the space. For example, the client of an architecture firm may explore the latest building design in a 3D environment. Exploration is typically used at the beginning of an interaction with an environment, serving to orient the user to the world and its features, but it may also be important in later stages. Because a user’s path during exploration may be based on serendipity (seeing something in the world may cause the user to deviate from the current path), techniques to support exploration should allow continuous and direct control of viewpoint movement or at least the ability to interrupt a movement that has begun. Forcing the user to continue along the chosen path until its completion would detract from the discovery process. Of course, this must be balanced, in some applications, with the need to provide an enjoyable experience in a short amount of time (Pausch et al. 1996). Techniques should also impose little cognitive load on users so that they can focus cognitive resources on spatial knowledge acquisition, information gathering, or other primary tasks.

To what extent should 3D UIs support exploration tasks? The answer depends on the goals of the application. In some cases, exploration is an integral component of the interaction. For example, in a 3D visualization of network traffic data, the structure and content of the environment is not known in advance, making it difficult to provide detailed wayfinding aids. The benefits of the visualization depend on how well the interface supports exploration of the data. Also, in many 3D gaming environments, exploration of unknown spaces is an important part of the entertainment value of the game. On the other hand, in a 3D interface where the focus is on performing tasks within a well-known 3D environment, the interface designer should provide more support for search tasks via goal-directed travel techniques.

8.2.2 Search

Search tasks involve travel to a specific goal or target location within the environment. In other words, the user in a search task knows the final location to which he wants to navigate. However, it is not necessarily the case that the user has knowledge of where that location is or how to get there from the current location. For example, a gamer may have collected all the treasure on a level, so he needs to travel to the exit. The exit may be in a part of the environment that hasn’t yet been explored, or the user may have seen it previously. This leads to the distinction between a naïve search task, where the user does not know the position of the target or a path to it in advance, and a primed search task, where the user has visited the target before or has some other knowledge of its position (Darken and Sibert 1996).

Naïve search has similarities with exploration, but clues or wayfinding aids may direct the search so that it is much more limited and focused than exploration. Primed search tasks also exist on a continuum, depending on the amount of knowledge the user has of the target and the surrounding environment. A user may have visited a location before but still might have to explore the environment around his starting location before he understands how to begin traveling toward the goal. On the other hand, a user with complete survey knowledge of the environment can start at any location and immediately begin navigating directly to the target. Although the lines between these tasks are often blurry, it is still useful to make the distinction.

Many 3D UIs involve search via travel. For example, the user in an architectural walkthrough application may wish to travel to the front door to check sight lines. Techniques for this task may be more goal oriented than techniques for exploration. For example, the user may specify the final location directly on a map rather than through incremental movements. Such techniques do not apply to all situations, however. Bowman, Johnson, and Hodges (1999) found that a map-based technique was quite inefficient, even for primed search tasks, when the goal locations were not explicitly represented on the map. It may be useful to combine a target-based technique with a more general technique to allow for the continuum of tasks discussed above.

8.2.3 Maneuvering

Maneuvering is an often-overlooked category of 3D travel. Maneuvering tasks take place in a local area and involve small, precise movements. The most common use of maneuvering is to position the viewpoint more precisely within a limited local area to perform a specific task. For example, the user needs to read some written information in the 3D environment but must position herself directly in front of the information in order to make it legible. In another scenario, the user wishes to check the positioning of an object she has been manipulating in a 3D modeling system and needs to examine it from many different angles. This task may seem trivial compared to large-scale movements through the environment, but it is precisely these small-scale movements that can cost the user precious time and cause frustration if not supported by the interface.

A designer might consider maneuvering tasks to be search tasks, because the destination is known, and therefore use the same type of travel techniques for maneuvering as for search, but this would ignore the unique requirements of maneuvering tasks. In fact, some applications may require special travel techniques solely for maneuvering. In general, travel techniques for this task should allow great precision of motion but not at the expense of speed. The best solution for maneuvering tasks may be physical motion of the user’s head and body because this is efficient, precise, and natural, but not all applications include head and body tracking, and even those that do often have limited range and precision. Therefore, if close and precise work is important in an application, other techniques for maneuvering, such as the object-focused travel techniques in section 8.7.2, must be considered.

8.2.4 Additional Travel Task Characteristics

The classification above distinguishes travel tasks by the user’s goal for the movement. Remember that many other characteristics of the task should be considered when choosing or designing travel techniques:

Distance to be traveled: In a 3D UI using head or body tracking, it may be possible to accomplish short-range travel tasks using natural physical motion only. Medium-range travel requires a virtual travel technique but may not require velocity control. Long-range travel tasks should use techniques with velocity control or the ability to jump quickly between widely scattered locations.

Amount of curvature or number of turns in the path: Travel techniques should take into account the amount of turning required in the travel task. For example, steering (see section 8.5.1) based on torso direction may be appropriate when turning is infrequent, but a less strenuous method, such as hand-directed steering (most users steer from the hip with their elbows locked rather than holding up their hands), would be more comfortable when the path involves many turns.

Visibility of the target from the starting location: Many target-based techniques (section 8.6) depend on the availability of a target for selection. Gaze-directed steering (section 8.5.1) works well when the target is visible but not when the user needs to search for the target visually while traveling.

Number of DOF required for the movement: If the travel task requires motion only in a horizontal plane, the travel technique should not force the user to also control vertical motion. In general, terrain-following is a useful constraint in many applications.

Required accuracy of the movement: Some travel tasks require strict adherence to a path or accurate arrival at a target location. In such cases, it’s important to choose a travel technique that allows for fine control and adjustment of direction, speed, or target location. For example, map-based target selection (section 8.6) is usually inaccurate because of the scale of the map, imprecision of hand tracking, or other factors. Travel techniques should also allow for easy error recovery (e.g., backing up if the target was overshot) if accuracy is important.

Other primary tasks that take place during travel: Often, travel is a secondary task performed during another more important task. For example, a user may be traveling through a building model in order to count the number of windows in each room. It is especially important in such situations that the travel technique be unobtrusive, intuitive, and easily controlled.

8.3 Classifications for 3D Travel

With the various types of travel tasks in mind, we turn to the design of interaction techniques for travel. Before discussing the techniques themselves, we present a number of ways to classify travel techniques.

8.3.1 Technique Classifications

A common theme of 3D interaction research has been the attempt to classify and categorize interaction techniques into structures. This is not a pointless academic exercise, but rather an attempt to gain a more complete understanding of the tasks and techniques involved. For the task of travel, several different classification schemes have been proposed. None of these should be considered the “correct” taxonomy; they each provide different views of the same space. In this section, we discuss four classification schemes:

active versus passive

physical versus virtual

using task decomposition

by metaphor

Active versus Passive Techniques

One way to classify travel techniques is to distinguish between active travel techniques, in which the user directly controls the movement of the viewpoint, and passive travel techniques, in which the viewpoint’s movement is controlled by the system. Most of the techniques we present in this chapter are active, but in some cases it may be useful to consider a technique that is automated or semiautomated by the system (Galyean 1995; Mackinlay et al. 1990a). Passive, automated travel is especially useful if the user has another primary task, such as gathering information about the environment while traveling; we discuss this issue in section 8.8.4. Bowman, Davis et al. (1999) studied active and passive techniques and also included a third category called route planning. Route planning is both active and passive—users actively plan their path through the environment, then the system executes this path (section 8.6.2).

Physical versus Virtual Techniques

We can also classify travel techniques into those that use physical travel, in which the user’s body physically translates or rotates in order to translate or rotate the viewpoint, and virtual travel, in which the user’s body primarily remains stationary even though the virtual viewpoint moves. Desktop 3D systems and gaming consoles utilize virtual translation and rotation. Many VR systems use a combination of physical rotation (via head tracking) and virtual translation, while others use physical translation and rotation via a locomotion device or real walking technique. We discuss physical techniques in section 8.4, while sections 8.5–8.7 present primarily virtual techniques. The active/passive and physical/virtual classifications are orthogonal, so they can be combined to define a 2 × 2 design space.

Classifications Using Task Decomposition

Bowman et al. (1997) decomposed the task of travel into three subtasks: direction or target selection, velocity/acceleration selection, and conditions of input (Figure 8.1). Each subtask can be performed using a variety of technique components.

Direction or target selection refers to the primary subtask in which the user specifies how to move or where to move.

Velocity/acceleration selection describes how users control their speed.

Conditions of input refers to how travel is initiated, continued, and terminated.

This taxonomy covers a large portion of the design space for travel techniques (only a few representative technique components are shown in the figure) and allows us to view the task of travel in more fine-grained chunks (subtasks) that are separable. By choosing a technique component for each of the three subtasks, we can define a complete travel technique. For example, the most common implementation of the pointing technique (see section 8.5.1) uses pointing for direction selection, constant velocity/acceleration, and continuous input (a button held down).
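To make the compositional nature of this taxonomy concrete, the following minimal sketch (our illustration in Python, not code from any cited system) represents a travel technique as one callable per subtask; all names and the constant speed are hypothetical.

```python
# Sketch: composing a travel technique from the three subtasks of the
# Bowman et al. (1997) taxonomy. Names and values are illustrative only.
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class TravelTechnique:
    select_direction: Callable[[dict], np.ndarray]  # direction selection subtask (steering-style)
    select_velocity: Callable[[dict], float]        # velocity/acceleration selection subtask
    input_active: Callable[[dict], bool]            # conditions of input subtask

    def update(self, state: dict, dt: float) -> np.ndarray:
        """Return the viewpoint displacement for this frame."""
        if not self.input_active(state):
            return np.zeros(3)
        direction = self.select_direction(state)
        return direction * self.select_velocity(state) * dt

# The common "pointing" configuration: steer with the hand's forward vector,
# move at a constant speed while a button is held down.
pointing = TravelTechnique(
    select_direction=lambda s: s["hand_forward"] / np.linalg.norm(s["hand_forward"]),
    select_velocity=lambda s: 2.0,              # constant 2 m/s (assumed)
    input_active=lambda s: s["button_pressed"],
)
```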

Figure 8.1 Taxonomy of travel techniques focusing on subtasks of travel. (Bowman et al. 1997, © 1997 IEEE)

Figure 8.2 Taxonomy of travel techniques focusing on level of user control. (Bowman, Davis et al. 1999; reprinted by permission of MIT Press and Presence: Teleoperators and Virtual Environments)

A second task decomposition (Bowman, Davis et al. 1999) subdivides the task of travel in a different, more chronological way (Figure 8.2). In order to complete a travel task, the user first starts to move, then indicates position and orientation, and then stops moving. Of course, this order does not always strictly hold: in some target-based techniques, the user first indicates the target position and then starts to move. One feature of this taxonomy that distinguishes it from the first is the explicit mention of specifying viewpoint orientation, which is an important consideration for 3D UIs without head tracking (see section 8.8.1). As shown in the figure, the taxonomy further decomposes the position specification subtask into position (xyz coordinates), velocity, and acceleration subtasks. Finally, it lists three possible metaphors for the subtask of specifying position: discrete target specification, one-time route specification, and continuous specification of position. These metaphors differ in the amount of control they give the user over the exact path (the active/passive distinction). We discuss these metaphors in sections 8.5 and 8.6.

Classification by Metaphor

Finally, we can classify travel techniques by their overall interaction metaphor. Such a classification is less useful in one respect: it does not allow us to look at subtasks of travel separately. However, classification by metaphor is easier to understand, especially from the user’s point of view. If someone tells you that a particular travel technique uses an “airplane” metaphor, for example, you can infer that it allows you to move in all three dimensions and to steer using hand motions. In addition, if new metaphors are developed, they can be added to such an informal classification easily. Thus, classification by metaphor is a useful way to think about the design space for interaction techniques.

In this chapter, we provide our own classification based on metaphors. Sections 8.4–8.7 organize travel techniques by four common metaphors: walking, steering, target/route selection, and travel-by-manipulation. These sections consider only the task of controlling the position of the viewpoint and do not consider issues such as specifying viewpoint orientation, controlling the speed of travel, or scaling the world. We cover these issues in section 8.8.

8.4 Walking Metaphors

The most natural technique for traveling in a 3D world is to physically walk around it. However, due to technological and space limitations, real walking is not always practical or feasible. To overcome this, many researchers have designed and developed interaction metaphors based on walking. We have classified those metaphors into three categories based on human gait. We call the first category “full gait” techniques, as those metaphors involve all of the biomechanics of a full gait cycle (see Figure 8.3). In contrast, the second category of walking metaphors mimics only some of the biomechanical aspects of human gait. We refer to these metaphors as “partial gait” techniques. Finally, “gait negation” techniques are designed to keep the user walking within a defined space by negating the user’s forward locomotion.

Figure 8.3 Illustration of the human gait cycle. (Image courtesy of Ryan P. McMahan)

8.4.1 Full Gait Techniques

The human gait cycle involves two main phases for each leg—the stance phase and the swing phase (Whittle 1996). The stance phase begins with a heel strike and is immediately followed by the foot flattening on the ground. At this point, the leg is in midstance, supporting the body’s weight while the other leg is in its swing phase. The stance phase ends with the heel and toes coming off the ground. At this point, the leg is in its swing phase until the heel strikes the ground again. Full gait techniques are walking metaphors that afford all of the events and biomechanics involved with the stance and swing phases of human gait. We will discuss:

real walking

redirected walking

scaled walking

Real Walking

Real walking is the most direct and obvious full gait technique. It is natural, provides vestibular cues (which help the user understand the size of the environment and avoid getting sick), and promotes spatial understanding. However, real walking is not always practical or feasible and can work only when the size of the environment is less than the range of the tracking system, unless combined with another travel technique. Even if a large-area tracking system is available, the physical space must also be free of obstacles.

Real walking also raises issues with cabling: cables for trackers, input devices, and/or displays may not be long enough to allow complete freedom of movement in the tracked area, and unless cabling is carefully handled by another person or managed by a mechanical system, walking users can easily become tangled in cables as they move about the space. Wireless devices, such as wireless video transmitters and receivers, can alleviate these concerns for some systems. Alternatively, in so-called backpack systems, the user carries a portable computer to which all the devices are connected.

Usoh et al. (1999) used a large-area tracking system to determine the effect of a physical walking technique on the sense of presence. They created a small, compelling environment with a virtual pit whose bottom was far below the floor on which the user stood. They found that users who physically walked through the environment felt more present and exhibited greater fear of the virtual pit than users who walked in place (described in section 8.4.2) or used a virtual travel technique. Other researchers have found that real walking leads to a higher level of spatial knowledge in a complex environment (Chance et al. 1998).

Real walking is also becoming more important for other types of 3D applications, especially mobile augmented reality (Höllerer et al. 1999; Nurminen et al. 2011). In these systems, users are free to walk around in very large-area indoor or outdoor environments and have additional graphical information superimposed on their view of the real world (Figure 8.4). These applications often use Global Positioning System (GPS) data for tracking, possibly enhanced with orientation information from self-contained inertial trackers. Most GPS devices only give coarse-grained position information on the scale of meters, which is not sufficiently precise for most 3D UIs. Recent technological developments may increase the precision of GPS devices to the centimeter level (Pesyna et al. 2014); however, most 3D UIs will still require millimeter precision. Another option is to use inside-out optical tracking of the environment using SLAM or one of its variants (see Chapter 6, “3D User Interface Input Hardware,” section 6.3.1).

Figure 8.4 Mobile augmented reality: (a) prototype system (© 2002 Computer Graphics and User Interfaces Lab, Columbia University); (b) user’s view (© 1999 Tobias Höllerer, Steve Feiner, and John Pavlik, Computer Graphics and User Interfaces Lab, Columbia University)

Real walking can be a compelling travel technique, but its effectiveness is dependent upon the tracking system and tracking area. In most practical 3D UIs, real walking can be used in a small local area, but other techniques are needed to reach other parts of the environment. However, in our experience, when users are allowed to both walk and use a virtual travel technique, they quickly learn to use only the virtual technique because it requires less effort.

Redirected Walking

To overcome the large-area requirements of real walking, researchers have investigated redirecting users away from the boundaries of the tracking space while walking. This is known as redirected walking (Razzaque et al. 2002). The concept behind the technique is to guide users unwittingly along paths in the real world that differ from the path perceived in the virtual world. The perceptual phenomenon that enables redirected walking is that visual stimuli often dominate proprioceptive and vestibular stimuli when there is a conflict among them. Hence, a user can be imperceptibly led away from the tracking boundaries by, for example, rotating the virtual scene and the intended virtual path toward the center of the tracking area as the user turns or walks.

A basic approach to redirected walking is the stop-and-go method (Bruder et al. 2015). With this method, the virtual world is rotated around a stationary user faster or slower than the user is physically turning (i.e., rotation gains are applied). This allows the technique to orient the user’s forward direction away from the tracking boundaries. The key to the stop-and-go method is to influence the user to physically turn in the first place. Researchers have investigated a number of ways to accomplish this, including location-oriented tasks (Kohli et al. 2005), visual distractors (e.g., a butterfly; Peck et al. 2009), and verbal instructions (Hodgson et al. 2014).

A more advanced approach to redirected walking is to continuously redirect the user while walking through the virtual environment (Bruder et al. 2015). For example, if the user is walking straight ahead in the virtual world, small rotations of the virtual scene can be used to redirect the user along a circular path within the tracking area (Figure 8.5). If the rotations are small enough, they will be imperceptible and the user will have the impression of truly walking straight ahead in the virtual world.

Figure 8.5 With continuous redirected walking, the user is imperceptibly guided away from the boundaries of the tracking area through subtle manipulations of the virtual scene. (Image adapted from Bruder et al. 2015)

When implementing a redirected walking technique, it is important to avoid perceptible visual conflicts with the proprioceptive and vestibular cues. Steinicke et al. (2010) conducted studies to estimate detection thresholds for these conflicts. They found that users could be physically turned about 49% more or 20% less than a displayed virtual rotation and could be physically translated about 14% more or 26% less than a displayed virtual translation. Additionally, the researchers found that users perceive themselves walking straight in the virtual environment while being continuously redirected on a circular arc if the arc’s radius is greater than 22 meters.
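The following minimal sketch illustrates how such rotation gains might be applied each frame while respecting these thresholds. The gain values are derived from the percentages quoted above, the logic that decides which way the scene offset should grow (e.g., a steer-to-center controller) is assumed to exist elsewhere, and the function and parameter names are our own.

```python
# Sketch: applying a per-frame rotation gain within the Steinicke et al. (2010)
# detection thresholds. Gains are the ratio of displayed virtual rotation to
# physical rotation (approximate values).
MIN_ROT_GAIN = 1.0 / 1.49   # user can be physically turned ~49% more than the rotation shown
MAX_ROT_GAIN = 1.0 / 0.80   # user can be physically turned ~20% less than the rotation shown

def redirected_yaw(physical_yaw_delta: float, desired_offset_sign: float) -> float:
    """Return the virtual yaw rotation (radians) to display for this frame's physical turn.

    Whichever way the user turns, the gain is chosen so that the offset between
    the virtual scene and the physical tracking space grows in the desired
    direction (desired_offset_sign is +1 or -1), without exceeding the
    detection thresholds.
    """
    if physical_yaw_delta * desired_offset_sign >= 0:
        gain = MAX_ROT_GAIN   # show slightly more rotation than the user performed
    else:
        gain = MIN_ROT_GAIN   # show slightly less rotation than the user performed
    return physical_yaw_delta * gain
```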

An important aspect of continuous redirected walking is how to rotate the virtual scene relative to the physical tracking space while the user is walking. Two commonly used algorithms are steer-to-center and steer-to-orbit (Hodgson et al. 2014). The goal of the steer-to-center algorithm is to continuously redirect the user’s physical movements through the center of the tracking area while adhering to the detection thresholds. In contrast, the goal of the steer-to-orbit algorithm is to continuously redirect the user’s physical movements along a circular arc, as if the user were orbiting the center of the tracking space. Hodgson and Bachmann have shown that the steer-to-center algorithm helps to avoid tracking boundaries better for relatively open-spaced virtual environments (Hodgson and Bachmann 2013), while the steer-to-orbit algorithm performs better for relatively enclosed VEs (Hodgson et al. 2014).
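The sketch below illustrates only the steering goals of these two algorithms (not the published implementations): each function returns the physical-space direction toward which a redirection controller should bend the user’s walking path, given 2D horizontal positions. All names are our own.

```python
# Sketch: steering goals for steer-to-center and steer-to-orbit redirection.
import numpy as np

def steer_to_center_target(user_pos: np.ndarray, center: np.ndarray) -> np.ndarray:
    """Direction (2D, unit length) from the user toward the tracking-area center."""
    to_center = center - user_pos
    norm = np.linalg.norm(to_center)
    if norm < 1e-6:
        return np.zeros(2)            # already at the center; nothing to steer toward
    return to_center / norm

def steer_to_orbit_target(user_pos: np.ndarray, center: np.ndarray,
                          clockwise: bool = True) -> np.ndarray:
    """Direction (2D, unit length) tangent to a circular orbit around the center."""
    radial = user_pos - center
    norm = np.linalg.norm(radial)
    if norm < 1e-6:
        return np.zeros(2)            # at the center the orbit direction is undefined
    radial /= norm
    # The tangent to the orbit is perpendicular to the radial direction.
    if clockwise:
        return np.array([radial[1], -radial[0]])
    return np.array([-radial[1], radial[0]])
```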

An alternative to rotating the virtual scene is to manipulate the dimensional properties of the scene. Suma et al. (2012) demonstrated that VEs can make use of self-overlapping architectural layouts called “impossible spaces.” These layouts effectively compress large interior environments into smaller tracking spaces by showing only the current room to the user despite another room architecturally overlapping it. Vasylevska et al. (2013) have expanded upon the concept of impossible spaces to create flexible spaces, which are created by dynamically generating corridors between rooms to direct the user away from the center of the tracking space when exiting a room and back toward the center when entering the next room.

While redirected walking can overcome some of the tracking limitations of real walking, it has its own challenges. First, continuous redirected walking requires a large tracking space to be imperceptible. Using it in smaller spaces is not very effective. Second, even in large tracking spaces, users can still walk outside of the space if they decide to ignore visual cues, tasks, and distractors. Finally, Bruder et al. (2015) have demonstrated that redirected walking with small arcs demands more cognitive resources than with larger arcs. For those readers interested in more details on redirected walking, we recommend a book on human walking in VEs (Steinicke et al. 2013).

Scaled Walking

Another full gait technique that overcomes the large-area requirements of real walking is scaled walking. With this technique, users can travel through virtual environments larger than the physical space by scaling their movements, so that one physical step equals several virtual steps. A basic approach to scaled walking is to apply a uniform gain to the horizontal components of the user’s head tracker data. However, Interrante et al. (2007) have shown that this approach induces significantly more cybersickness and is significantly less natural and easy to use than real walking.

A better scaled-walking approach is the Seven League Boots technique (Interrante et al. 2007). With this technique, only the user’s intended direction of travel is scaled. This avoids scaling the oscillatory motions of the user’s head that occur during gait, which would result in excessive swaying of the user’s viewpoint. Seven League Boots also uses a dynamically changing scaling factor based on the speed of physical walking. When users walk slowly, no scaling is applied, but at faster walking speeds, travel is scaled to be even faster. This is similar to common mouse acceleration techniques in desktop UIs and enables users to explore local areas with one-to-one walking and reach distant areas by quickly walking in their direction.

The user’s intended direction of travel must be determined to implement Seven League Boots. Interrante et al. (2007) propose that the best method is to determine the direction of travel as a weighted combination of the user’s gaze direction and the direction of the user’s horizontal displacement during the previous few seconds. When displacement is minimal, as it is when standing, a weight of 1 is assigned to the gaze direction. However, when displacement resembles walking, a weight of 0 is assigned to the gaze direction to allow the user to look around while walking.
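A minimal sketch of this idea follows, assuming per-frame head-tracker data; the blending weights, speed thresholds, and maximum scale factor are illustrative assumptions rather than values from Interrante et al. (2007).

```python
# Sketch: Seven League Boots-style scaled walking. Only the component of head
# motion along the intended travel direction is amplified.
import numpy as np

def seven_league_offset(head_delta: np.ndarray, gaze_dir: np.ndarray,
                        recent_disp_dir: np.ndarray, speed: float) -> np.ndarray:
    """Return the extra virtual translation to add to this frame's physical head motion.

    head_delta: physical head translation this frame (meters).
    gaze_dir, recent_disp_dir: unit horizontal directions of gaze and of the
    user's displacement over the previous few seconds.
    speed: current horizontal walking speed (m/s).
    """
    # Gaze dominates when nearly standing still; displacement dominates when walking.
    w_gaze = float(np.clip(1.0 - speed / 0.5, 0.0, 1.0))     # 0.5 m/s ~ walking (assumed)
    travel_dir = w_gaze * gaze_dir + (1.0 - w_gaze) * recent_disp_dir
    norm = np.linalg.norm(travel_dir)
    if norm < 1e-6 or speed < 0.3:                           # slow movement: one-to-one (assumed)
        return np.zeros(3)
    travel_dir /= norm
    # Amplify only the forward component, leaving the lateral sway of gait unscaled.
    forward = float(np.dot(head_delta, travel_dir))
    scale = 1.0 + 4.0 * min((speed - 0.3) / 1.2, 1.0)        # up to 5x at a brisk walk (assumed)
    return travel_dir * forward * (scale - 1.0)
```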

While scaled walking techniques afford the biomechanics of a full gait cycle and circumvent the physical limitations of real walking, they can be unnatural to use. Based on work by Steinicke et al. (2010), scaled walking becomes perceptible when virtual translations exceed the user’s physical motions by a factor of about 1.35. And while scaled walking can increase the size of the area users can reach with natural walking, that area is still finite, meaning that additional travel techniques will be needed to reach other parts of the virtual environment.

8.4.2 Partial Gait Techniques

Not all walking metaphors incorporate both the stance and swing phases of human gait. Some techniques focus on recreating specific aspects of the gait cycle to provide seminatural interactions for travel. These techniques often circumvent the restrictions of real walking, such as space limitations, by representing only a subset of the gait cycle. For instance, walking in place incorporates the actions associated with the stance phase but not those of the swing phase in order to keep the user in place. The human joystick metaphor incorporates the swing aspects of gait but not the stance aspects. Other techniques focus on recreating the motions of the gait cycle using anatomical substitutions, such as finger walking (Kim et al. 2008). In this section, we cover:

walking in place

human joystick

Walking in Place

An alternative to real walking is walking in place: users move their feet to simulate walking while actually remaining in the same location. This technique seems like a good compromise because users still physically exert themselves, which should increase the sense of presence, and the limitation on the size of the environment is removed.

However, there are several caveats to be considered. First, walking in place does not incorporate the swing phase of the gait cycle, as real walking does, so the sense of presence or real movement is diminished. Second, the size of the environment, while theoretically unlimited, still has a practical limitation, because users exert more energy when walking in place than during real walking (Nilsson et al. 2013).

Several approaches have been used to implement walking in place. Early on, Slater et al. (1995) used a position tracker and a neural network to analyze the motions of the user’s head to distinguish walking in place from other types of movements. When the system detects that the user is walking in place, it moves the user through the virtual world in the direction of the user’s gaze. Instead of tracking head motions, Templeman et al. (1999) tracked and analyzed leg motions in their Gaiter system to determine when the user was stepping in place and in which direction. In a similar system, Feasel et al. (2008) used a chest tracker and the orientation of the user’s torso to determine the direction of virtual motion. They also analyzed vertical heel movements instead of steps to reduce the half-step latency seen in most other implementations.
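The sketch below shows a greatly simplified, heel-based detector in the spirit of these systems: a step is counted each time a tracked heel rises above a threshold and returns to the floor, and each step translates the viewpoint in the torso’s forward direction. The thresholds and per-step distance are illustrative assumptions.

```python
# Sketch: a simple heel-height walking-in-place detector with torso-directed motion.
import numpy as np

class WalkInPlace:
    def __init__(self, lift_threshold: float = 0.05, step_distance: float = 0.7):
        self.lift_threshold = lift_threshold   # meters the heel must rise (assumed)
        self.step_distance = step_distance     # virtual meters moved per step (assumed)
        self.heel_raised = {"left": False, "right": False}

    def update(self, heel_heights: dict, torso_forward: np.ndarray) -> np.ndarray:
        """Return this frame's virtual translation.

        heel_heights: {"left": meters above floor, "right": ...} from foot trackers.
        torso_forward: forward vector of a torso tracker (used as the travel direction).
        """
        translation = np.zeros(3)
        forward = torso_forward / np.linalg.norm(torso_forward)
        for foot, height in heel_heights.items():
            if not self.heel_raised[foot] and height > self.lift_threshold:
                self.heel_raised[foot] = True               # heel came off the ground
            elif self.heel_raised[foot] and height < self.lift_threshold * 0.5:
                self.heel_raised[foot] = False              # heel struck: count one step
                translation += forward * self.step_distance
        return translation
```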

More recently, Nilsson et al. (2013) investigated the kinematics of walking in place (Figure 8.6). In most implementations, a marching gesture is utilized, in which the user lifts each foot off the ground by raising the thighs. Nilsson et al. (2013) proposed two additional gestures—wiping and tapping. With wiping, the user bends each knee while keeping the thigh relatively steady, similar to wiping one’s feet on a doormat. With tapping, the user lifts each heel off the ground while keeping the toes in contact with the ground. In a study of perceived naturalness, the researchers found that tapping was considered the most natural and least strenuous of the three walking-in-place gestures.

Figure 8.6 Three types of walking-in-place gestures: (a) marching by raising the thighs to lift each foot; (b) wiping by bending the knees to move each foot backward; (c) tapping by lifting the heels while keeping the toes on the ground. (Image adapted from Nilsson et al. 2013)

Studies have shown that walking in place does increase the sense of presence relative to completely virtual travel but that it’s not as effective as real walking (Usoh et al. 1999). These techniques also sometimes suffer from the problems of recognition errors and user fatigue. This approach applies to systems where higher levels of naturalism are needed and where the environment is not too large. For applications in which the focus is on efficiency and task performance, however, a steering technique (see section 8.5) is often more appropriate.

Human Joystick

While walking in place excludes the swing phase of gait, the human joystick metaphor incorporates it instead of the stance phase. The concept of this metaphor is that the user’s body acts like the handle of a joystick to initiate travel in different directions. One early implementation was the Virtual Motion Controller, or VMC (Wells et al. 1996). This general-purpose device aimed to enable virtual motion by having users perform a subset of the walking motions they would perform in the physical world. The VMC consisted of a platform with embedded pressure sensors beneath the surface (Figure 8.7). The sensors were distributed along the rim of the platform, so that a user standing in the center of the platform placed no pressure on the sensors and so that the pressure sensed would increase the farther the user stepped from the center. By analyzing the pressure sensed at various locations, the device could determine the direction and distance that the user had stepped from the center.

Another implementation of the human joystick technique is to calculate the 2D horizontal vector from the center of the tracking space to the user’s tracked head position and then use that vector to define the direction and velocity of virtual travel. McMahan et al. (2012) implemented this technique in a CAVE-like system. They defined a no-travel zone at the center of the tracking volume to avoid constant virtual locomotion due to minute offsets between the user’s head position and the center of the tracking volume. To initiate travel, the user stepped outside of this no-travel zone to define the 2D travel vector. After this initial swing phase, the user simply stood in place to continue the same direction and speed of travel. The user could move closer to or farther from the center to adjust the travel speed. Additionally, the user could move laterally to redefine the direction of travel relative to the tracking center.
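A minimal sketch of this head-based human joystick is shown below, assuming y is the up axis; the dead-zone radius and speed scaling are illustrative assumptions.

```python
# Sketch: head-based human joystick with a central no-travel (dead) zone.
import numpy as np

DEAD_ZONE_RADIUS = 0.3    # meters around the tracking-space center (assumed)
SPEED_PER_METER = 2.0     # virtual m/s per meter of lean beyond the dead zone (assumed)

def human_joystick_velocity(head_pos: np.ndarray, center: np.ndarray) -> np.ndarray:
    """Return the virtual travel velocity (m/s) in the horizontal plane."""
    offset = np.array(head_pos, dtype=float) - np.array(center, dtype=float)
    offset[1] = 0.0                      # ignore height; use only the horizontal offset
    distance = np.linalg.norm(offset)
    if distance < DEAD_ZONE_RADIUS:
        return np.zeros(3)               # inside the no-travel zone: stand still
    direction = offset / distance
    return direction * (distance - DEAD_ZONE_RADIUS) * SPEED_PER_METER
```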

In both implementations of the human joystick, the user’s gaze direction is independent of the direction of travel, which is defined by either the positions of the user’s feet or the position of the user’s head. This allows the user to travel backwards, which techniques like walking in place do not allow. The human joystick has also been shown to afford higher levels of presence (McMahan et al. 2012) than purely virtual travel. However, McMahan (2011) also found that the human joystick technique was significantly less efficient than a traditional keyboard and mouse travel technique. He attributed this to the fact that the human joystick technique requires the user to step back to the center of the tracking volume to terminate travel. This is very different from real human gait, which requires little to no effort to stop locomotion.

Figure 8.7 Virtual Motion Controller. (Photograph courtesy of HIT Lab, University of Washington)

8.4.3 Gait Negation Techniques

When realism is desired (e.g., in training applications), partial gait techniques can be less than satisfactory because they don’t capture the same motion and effort as real walking. However, real walking and full gait techniques are often limited by space and technological constraints. Thus, there is a third type of walking metaphor that uses special locomotion devices to provide a somewhat realistic walking motion and feel while not actually translating the user’s body. We refer to these as gait negation techniques, because they negate the forward movement of the user’s gait. In this section, we will discuss the following gait negation techniques:

treadmills

passive omnidirectional treadmills

active omnidirectional treadmills

low-friction surfaces

step-based devices

Treadmills

The simplest form of gait negation is a common treadmill. Of course, the basic treadmill does not allow the user to turn naturally, so some sort of indirect turning mechanism, such as a joystick, is needed (Brooks 1986). This can be implemented easily but is not appropriate for applications requiring high levels of realism. Researchers have created many different devices to try to solve this problem. One simple idea is to track the user’s head and feet on a standard treadmill in order to detect when the user is trying to make a turn. This detection is based on analysis of the direction the feet are pointing, deviation of a foot’s motion from the forward direction, and other factors. When the system detects a turn, it rotates the treadmill, which is mounted on a large motion platform (Noma and Miyasato 1998). The motion platform also allows tilting of the treadmill to simulate slopes. Experience with this setup indicates that it works well for slow, smooth turns but has too much latency for sudden or sharp turns. In addition, it does not allow sidestepping.

Passive Omnidirectional Treadmills

Another approach to gait negation is to use a treadmill that is specially constructed to allow walking in any direction. There are two categories of these omnidirectional treadmills. The first category relies on the user’s weight and momentum to activate the treadmill’s surface. These are known as passive omnidirectional treadmills, which we discuss in this section. The second category actively controls the treadmill’s surface in reaction to detecting the user’s movements. These are known as active omnidirectional treadmills, which we discuss in the next section.

As explained, passive omnidirectional treadmills rely on the user’s weight and momentum to activate their surfaces. In turn, these moving surfaces negate the user’s gait and keep the user in the center of the space. An example of a passive omnidirectional treadmill is the Omnidirectional Ball-Bearing Disc Platform (OBDP) developed by Huang (2003). This prototype uses hundreds of ball bearings arranged in a concave platform to passively react to the user’s walking motions. When the user steps forward, the ball bearings roll toward the center of the device due to the curvature and return the user’s foot to the center of the platform. Due to the unnaturalness of stepping on a concave surface and the abrupt response of the ball bearings, the OBDP also requires a frame around the user’s waist to stabilize the user in the event of a fall.

Another example of a passive omnidirectional treadmill is the Virtusphere (Figure 8.8; Medina et al. 2008). This device serves as a human-sized “hamster ball” that rolls in place while the user walks inside of it. Like the OBDP, the weight of the user’s steps initiates the rolling of the Virtusphere’s surface. However, once the Virtusphere is rolling, the user must exert reverse forces in order to stop the device without falling. Research conducted by Nabiyouni et al. (2015) shows that the Virtusphere provides a user experience that is significantly worse than a joystick or real walking.

Figure 8.8 The Virtusphere, a human-sized “hamster ball,” is an example of a passive omnidirectional treadmill. (Image courtesy of Mahdi Nabiyouni)

Active Omnidirectional Treadmills

While passive omnidirectional treadmills are built to react to the user’s weight and momentum, active omnidirectional treadmills are built to detect the user’s walking motions and move to negate them. One implementation for such a device is to chain together several belt-based treadmills side to side to create a larger treadmill that can be rotated left and right. While each section supports forward and backward motions, the overall chain supports lateral movements. Hence, the treadmill surface has the ability to move in any arbitrary horizontal direction. The omnidirectional treadmill, or ODT (Darken et al. 1997), the Torus treadmill (Iwata 1999), and the Cyberwalk (Schwaiger et al. 2007) are examples of this type of active omnidirectional treadmill. The ODT was constructed for the military with the hope that it could be used to train infantry naturally in a 3D UI. The device allowed walking or even running in any direction, but also had some serious usability and safety issues that made it less than effective.

The most important issue was that the ODT continually moved its surface to recenter the user on the treadmill and keep the user from walking off the edge of the device; this recentering motion caused users to lose their balance, especially during turns or sidestepping. The ODT was found to support walking at a moderate speed with gradual turns, but not many of the other movements that soldiers would need to make.

Another approach to creating an active omnidirectional treadmill is to use conveyor rollers in a radial pattern (i.e., with all the rollers’ direction of motion being toward the center of the device) instead of belts. Some commercial versions consist of 16 sections of rollers surrounding a small stable platform in the center. When an optical tracking system detects the user walking outside of the small platform, the relevant rollers are actively controlled to move the user back to the center platform. The speed of the rollers is controlled to match the user’s forward momentum; a mismatch between the user’s inertia and the roller speed could cause the user to fall.

Low-Friction Surfaces

Gait negation can also be implemented by using low-friction surfaces to negate the kinetics or forces of walking. Like omnidirectional treadmills, low-friction surfaces can support virtual travel in any horizontal direction. Iwata and Fujii (1996) implemented an early example of such a device. They used sandals with low-friction film on the soles to allow the user to shuffle his feet back and forth on a floor sheet of low-friction film. The sandals also had rubber soles on the toes to afford braking. Swapp et al. (2010) implemented a similar device called the Wizdish. The major difference between the two devices is that the Wizdish used a concave low-friction surface that not only negated the forces of stepping forward but also brought the user’s foot back to the center. In recent years, a number of low-friction surfaces have been developed to target the consumer market (Figure 8.9). These devices often utilize a support frame around the user’s waist for balance and safety.

Figure 8.9 A low-friction surface for virtual locomotion in HWD-based VR. (Image courtesy of Virtuix)

There are several fundamental challenges for low-friction surfaces. First, the degree of friction must be properly balanced between the surface and the shoes. If the amount of friction is too high, the user will exert much more energy to move than in real walking. If the amount of friction is too low, the surface will be slippery, and the user is likely to slip or fall. Second, though these devices are designed to simulate real walking, they require biomechanics that are more similar to skating than walking. Finally, based on the authors’ personal experiences, low-friction surfaces take time to learn to use effectively.

Step-Based Devices

The final gait negation techniques that we cover here are step-based devices. These devices are designed to detect where the user is about to step and to provide a surface for the user to step on. By “catching” the user’s steps, these devices can then recenter the user by moving the surfaces under both feet back toward the center of the space. The GaitMaster (Iwata 2001) is such an approach. Rather than using a treadmill or a frictionless surface, the user straps her feet onto platforms that rest on top of two small motion bases (Figure 8.10). The system attempts to detect the user’s walking motion via force sensors embedded in the platforms and move the platforms appropriately so that a hard “ground” surface is felt at the correct location after each step. This device is technically quite complex, and it can have serious safety and usability issues because of the powerful hydraulics involved and the latency in detecting foot motion.

Figure 8.10 GaitMaster2 locomotion device. (Iwata 2001, © 2003 IEEE; reprinted by permission)

Another example of a step-based device is the CirculaFloor, also developed by Iwata et al. (2005). This system uses multiple omnidirectional robotic vehicles as movable tiles that are programmed to position themselves under the user’s steps. Each tile provides a sufficient area for walking, so precise positioning is not required. As the user walks, the vehicles can move in the opposite direction to cancel the user’s global movement. Additionally, once the user steps off a tile, it can be circulated back to catch the next step.

In general, gait negation devices have not been as successful as one might hope. This is because they are still too expensive, too susceptible to mechanical failures, and too slow to respond to the user’s movements. Most importantly, the devices do not produce the perception of natural walking for the user because of biomechanics and forces that are different from real walking. Therefore, instead of being able to use her natural ability to walk, the user must learn to adapt her walking motion to the device’s characteristics. Still, such devices may prove useful in certain applications that require physical locomotion via walking. For a more detailed overview of locomotion devices, see Hollerbach (2002) and Steinicke et al. (2013).

8.5 Steering Metaphors

Although a great deal of work has been focused on natural locomotion techniques such as those described in section 8.4, most 3D UIs, from video games to immersive VR, use some sort of virtual travel technique. Among virtual techniques, by far the most common metaphor is steering, which refers to the continuous control of the direction of motion by the user. In other words, the user constantly specifies either an absolute (“move along the vector (1,0,0) in world coordinates”) or relative (“move to my left”) direction of travel. In most cases, this specification of travel direction is either achieved through spatial interactions or with physical steering props. Hence, we categorize steering techniques under these two major categories.

8.5.1 Spatial Steering Techniques

Spatial steering techniques allow the user to guide or control travel by manipulating the orientation of a tracking device. Steering techniques are generally easy to understand and provide the highest level of control to the user. The spatial steering techniques we will describe are:

gaze-directed steering

hand-directed steering (pointing)

torso-directed steering

lean-directed steering

Gaze-Directed Steering

The most common steering technique—and the default travel technique in many 3D toolkits (Kessler et al. 2000)—is gaze-directed steering (Mine 1995a). Quite simply, this technique allows the user to move in the direction toward which he is looking. In a tracked environment, the gaze direction is obtained from the orientation of a head tracker (although true gaze-directed steering would use an eye tracker); in a desktop environment, the gaze direction is along a ray from the virtual camera position (assumed to be one of the user’s eye positions) through the center of the viewing window. Once the gaze direction vector is obtained, it is normalized, and then the user is translated along this vector in the world coordinate system. The vector may also be multiplied by a velocity factor to allow for different rates of travel (see section 8.8.2 for velocity specification techniques). Some discrete event (e.g., a button press or joystick movement) is needed to start and stop the motion of the user.
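A minimal sketch of this implementation follows; the function signature, constant speed, and planar constraint option are our own assumptions.

```python
# Sketch: one frame of gaze-directed steering from a head tracker's forward vector.
import numpy as np

def gaze_steer(position: np.ndarray, head_forward: np.ndarray,
               button_pressed: bool, dt: float,
               speed: float = 2.0, constrain_to_plane: bool = False) -> np.ndarray:
    """Return the new viewpoint position after one frame of gaze-directed steering."""
    if not button_pressed:
        return position                  # a discrete event starts and stops the motion
    direction = np.array(head_forward, dtype=float)
    if constrain_to_plane:
        direction[1] = 0.0               # restrict motion to the horizontal plane (y up)
    norm = np.linalg.norm(direction)
    if norm < 1e-6:
        return position                  # e.g., looking straight up while plane-constrained
    return position + (direction / norm) * speed * dt
```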

The basic concept of gaze-directed steering can be extended by allowing motion along vectors orthogonal to the gaze vector. This gives the user the ability to strafe—to move backward, up, down, left, and right (Chee and Hooi 2002). This ability is especially important for desktop systems, where setting the gaze direction may be more cumbersome than in a head-tracked VE. Additionally, strafing is often used for maneuvering (see section 8.2.3).

From the user’s point of view, gaze-directed steering is quite easy to understand and control. In a desktop 3D environment, it seems quite natural to move “into” the screen. In an immersive 3D UI with head tracking, this technique also seems intuitive, especially if motion is constrained to the 2D horizontal plane. In addition, the hardware requirements of the technique are quite modest; even in an immersive system, the user needs only a head tracker and a button. However, if complete 3D motion is provided (i.e., “flying”), gaze-directed steering has two problems. First, when users attempt to travel in the horizontal plane, they are likely to travel slightly up or down because it’s very difficult to tell whether the head is precisely level. Second, it’s quite awkward to travel vertically up or down by looking straight up or down, especially when wearing an HWD.

But perhaps the most important problem with gaze-directed steering is that it couples gaze direction and travel direction, meaning that users cannot look in one direction while traveling in another. This may seem a small issue, but consider how often you look in a direction other than your travel direction while walking, cycling, or driving in the physical world. Studies have shown that pointing (see below) outperforms gaze-directed steering on tasks requiring motion relative to an object in the environment (Bowman et al. 1997).

Hand-Directed Steering (Pointing)

To avoid the coupling of gaze direction and travel direction, the pointing technique (Mine 1995a) uses the orientation of the user’s hand (or tracked controller) to specify the direction of travel. Hence, pointing is also known as hand-directed steering. The forward vector of the hand tracker is first transformed into a world coordinate vector (this forward vector depends on the specific tracking system, method of mounting the tracker on the hand, and 3D UI toolkit used). The vector is then normalized and scaled by the velocity, and the user is moved along the resulting vector. The same concept could be implemented on the desktop, for example, by using the keyboard to set the travel direction and the mouse to set the gaze direction. In this case, however, well-designed feedback indicating the direction of travel would be necessary. In the case of an immersive 3D UI, the user’s proprioceptive sense (i.e., the sense of one’s own body and its parts) can tell her the direction in which her hand is pointing. Once the travel vector is obtained, the implementation of the pointing technique is identical to that of the gaze-directed steering technique.

An extension of the pointing concept uses two hands to specify the vector (Mine 1997). Rather than use the orientation of the hand to define the travel direction, the vector between the two hands’ positions is used. An issue for this technique is which hand should be considered the “forward” hand. In one implementation using Pinch Gloves (Bowman, Wingrave et al. 2001), the hand initiating the travel gesture was considered to be forward. This technique makes it easy to specify any 3D direction vector and also allows easy addition of a velocity-control mechanism based on the distance between the hands.
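
As an illustration, here is a minimal sketch of the two-handed variant, assuming NumPy arrays for the tracked hand positions in world coordinates; the speed_scale gain and the decision of which hand is "forward" are assumptions added for the example, not part of the published technique.

    import numpy as np

    def two_handed_pointing_step(forward_hand, rear_hand, user_position,
                                 speed_scale, dt):
        """Travel along the vector between the hands; wider hand separation
        yields a faster velocity (a simple velocity-control assumption)."""
        direction = forward_hand - rear_hand
        distance = np.linalg.norm(direction)
        if distance < 1e-6:                 # hands together: no movement
            return user_position
        direction /= distance               # normalize the travel vector
        return user_position + direction * (speed_scale * distance) * dt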

Pointing is more flexible but also more complex than gaze-directed steering, because it requires the user to control two orientations (head for viewing and hand for steering) simultaneously. This can lead to higher levels of cognitive load, which may reduce performance on cognitively complex tasks like information gathering (Bowman, Koller et al. 1998). The pointing technique is excellent for promoting the acquisition of spatial knowledge because it gives the user the freedom to look in any direction while moving (Bowman, Davis et al. 1999).

Torso-Directed Steering

Another simple steering technique uses the user’s torso to specify the direction of travel. This torso-directed technique is motivated by the fact that people naturally turn their bodies to face the direction in which they are walking. A tracker is attached to the user’s torso, somewhere near the waist (for example, the tracker can be mounted on a belt that the user wears). If the tracker is attached much higher than this, undesirable rotations may occur when the user looks away from the direction of travel. After the travel direction vector is obtained from this tracker, the technique is implemented exactly as gaze-directed steering. The torso-directed technique does not apply to desktop 3D UIs.

The major advantage of the torso-directed technique is that, like pointing, it decouples the user’s gaze direction and travel direction. Unlike pointing, it does this in a natural way. The user’s cognitive load should be lessened with the torso-directed technique, although this has not been verified experimentally. The torso-directed technique also leaves the user’s hands free to perform other tasks. However, the torso-directed technique also has several disadvantages. The most important of these is that the technique applies only to environments in which all travel is limited to the horizontal plane, because it is not practical to point the torso up or down. The technique also requires an additional tracker beyond the standard head and hand trackers for tracking the user’s torso.

Lean-Directed Steering

A slightly more complex steering technique allows the user to define the direction of travel by leaning. This metaphor uses the natural motion of leaning towards something to view it but interprets the leaning direction as a direction for travel. Lean-directed steering is similar to the Human Joystick techniques we described in section 8.4.2, except that the user does not take any steps in lean-directed steering.

One example is the PenguFly technique developed by von Kapri et al. (2011). With PenguFly, both of the user’s hands are tracked in addition to the head. The direction of travel is then specified by the addition of the two vectors defined from each hand to the head. The travel speed is defined by half of the length of this lean-directed vector (Figure 8.11). Lean-directed steering has been shown to be more accurate for traveling than pointing due to its higher granularity of control over the travel speed (von Kapri et al. 2011). However, it also induces a significant increase in cybersickness, likely due to the large body motions required.
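
A rough sketch of the core PenguFly mapping is given below; the dead zone and gain are assumptions added so that a relaxed standing posture produces no motion, and a real implementation would also calibrate a neutral pose or project out the vertical component for ground-based travel.

    import numpy as np

    def pengufly_step(head, left_hand, right_hand, user_position, dt,
                      gain=1.0, dead_zone=0.05):
        """Direction = sum of the two hand-to-head vectors; speed = half of
        that vector's length (von Kapri et al. 2011), all in world coordinates."""
        lean = (head - left_hand) + (head - right_hand)
        length = np.linalg.norm(lean)
        if length < dead_zone:              # assumed dead zone for standing still
            return user_position
        direction = lean / length
        speed = 0.5 * length * gain
        return user_position + direction * speed * dt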

Figure 8.11 PenguFly is a lean-directed steering technique that defines travel direction as the vector created by adding the two vectors created from the hands to the head. The length of this vector also defines the velocity of travel. (Image adapted from von Kapri et al. 2011)

Another lean-directed steering implementation is the ChairIO interface developed by Beckhaus et al. (2007). The interface is based on an ergonomic stool with a rotating seat and a spring-based column that can tilt and even allow the user to bounce up and down. With magnetic trackers attached to the seat and the column, the ChairIO allows the movements of the user to be tracked as she leans and turns in the seat. This tracking information can then be used for steering. Similarly, Marchal et al. (2011) developed the Joyman interface, which essentially consisted of a trampoline, a rigid surface, an inertial sensor, and a safety rail to prevent falling. While holding onto the rail, the user could lean extremely far in any direction, which would cause the rigid surface to lean toward one side of the trampoline’s structure. The inertial sensor detected the orientation of the lean and translated it into a direction and speed for steering.

All of the lean-directed steering techniques integrate direction and speed into a single, easy-to-understand movement. These techniques allow the user to rely on natural proprioceptive and kinesthetic senses to maintain spatial orientation and understanding of movement within the environment. Additionally, Kruijff et al. (2016) found that adding walking-related sensory cues, such as footstep sounds, head-bobbing camera motions, and footstep-based vibrotactile feedback, can all enhance the user’s senses of vection and presence. The major disadvantage to these techniques is that they are limited to 2D locomotion unless another technique or feature is added to allow for vertical motion.

8.5.2 Physical Steering Props

Steering techniques for travel can also be implemented with a variety of physical props (specialized devices designed for the task of steering). In general, steering props are useful when a certain type of vehicle is being simulated, when the interface must be usable by anyone without any training, or when steering is an important part of the overall user experience. Additionally, automobile steering wheels may be a good choice for some general-purpose applications because they are understandable by anyone who has driven a car. Props provide users with appropriate affordances and feedback—telling them what can be done, how to do it, and what has been done. A potential pitfall is that props may create unrealistic expectations of realistic control and response in users accustomed to using the same steering interface in a real vehicle. Additionally, most physical steering props facilitate only 2D horizontal travel.

We discuss the following types of steering props in this section:

cockpits

cycles

Cockpits

The most obvious steering prop is a simple steering wheel similar to that found in a car, which of course can be combined with a typical accelerator and brake for virtual driving. These devices generally require that the user be seated but can be quite effective for implementing a simple vehicle metaphor. They are also usable in either immersive or desktop VEs and are understandable by almost any user.

Other specialized steering props can be used to simulate real or imaginary vehicles for particular application domains. For example, realistic ship controls can be used to pilot a virtual ship (Brooks 1999), or an actual cockpit from a tractor can be used to control a virtual tractor (Deisinger et al. 2000). Of course, this near-field haptics approach (Brooks 1999) has been used in aircraft simulators for years. Disney used steering props in several of its VR-based attractions (Pausch et al. 1996; Schell and Shochet 2001). For example, the virtual jungle cruise ride at DisneyQuest in Orlando allows several guests to collaboratively steer and control the speed of a virtual raft using physical oars, and the Pirates of the Caribbean attraction includes a steering wheel and throttle for the virtual ship. In addition, many arcade games use such props—motorcycle handlebars, steering wheels, skateboards, skis, and the like.

Cycles

If physical exertion is desired but walking is not necessary, a bicycle or other pedal-driven device can be used. A typical exercise bicycle setup (Brogan et al. 1998) is the easiest to implement, because these devices usually already report the speed of pedaling. Some of them also report the degree of handlebar turning or user leaning, allowing natural steering.

Figure 8.12 Uniport locomotion device. (Photograph courtesy of Sarcos)

The Uniport is a unicycle-like device that allows travel through virtual worlds seen through an HWD (Figure 8.12); it was designed for infantry training (as one of the precursors to the omnidirectional treadmill). It is obviously less effective than a treadmill at producing a believable simulation of walking, and users may also have difficulty steering the device. However, it is mechanically much less complex and produces significant exertion for the user, which may be desired for some applications.

8.6 Selection-Based Travel Metaphors

Another major category of travel metaphors depends on the user selecting either a target to travel to or a path to travel along. These selection-based travel metaphors often simplify travel by not requiring the user to continuously think about the details of travel. Instead, the user specifies the desired parameters of travel first and then allows the travel technique to take care of the actual movement. While these techniques are not the most natural, they tend to be extremely easy to understand and use.

8.6.1 Target-Based Travel Techniques

In some cases, the user’s only goal for a travel task is to move the viewpoint to a specific position in the environment. For example, the user may want to inspect a virtual piece of art by moving next to it. The user in these situations is likely willing to give up control of the actual motion to the system and simply specify the endpoint. Target-based travel techniques meet these requirements.

Even though the user is concerned only with the target of travel, this should not necessarily be construed to mean that the system should move the user directly to the target via teleportation. An empirical study (Bowman et al. 1997) found that teleporting instantly from one location to another in a 3D UI significantly decreases the user’s spatial orientation (users find it difficult to get their bearings when instantly transported to the target location). Therefore, continuous movement from the starting point to the endpoint is recommended when spatial orientation is important. On the other hand, continuous movement that’s not under the user’s control can increase cybersickness. For this reason, a compromise is often used in which the user is not teleported instantaneously but instead is moved very quickly through virtual space to the target. This “blink” mode of travel gives users enough visual information to help them understand how they have moved, but is so short that it’s unlikely to make users feel sick.

There are many ways to specify the target of travel. In this section, we describe two techniques:

representation-based target techniques

dual-target techniques

Many other target-based travel techniques specify the target using interaction techniques designed for another task. We call this type of technique a cross-task technique because the technique implementation has crossed from one task to another. Cross-task target-specification techniques include

selecting an object in the environment as the target of travel using a selection technique (see Chapter 7, “Manipulation”)

placing a target object in the environment using a manipulation technique (see Chapter 7)

selecting a predefined target location from a list or menu (see Chapter 9, “System Control”)

entering 2D or 3D coordinates using a number entry technique or a location name using a text entry technique (see Chapter 9)

Representation-Based Target Techniques

A 2D map or 3D world-in-miniature can be used to specify a target location or object within the environment (Figure 8.13). A typical map-based implementation of this technique (Bowman, Wineman et al. 1998) uses a pointer of some sort (a tracker in an immersive 3D UI, a mouse on the desktop) to specify a target and simply creates a linear path from the current location to the target, then moves the user along this path. The height of the viewpoint along this path is defined to be a fixed height above the ground.
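
A sketch of the map-to-world mapping and the resulting linear path is shown below, assuming a Y-up world and a uniform map scale; the function and parameter names are illustrative only.

    import numpy as np

    def map_to_world(map_point, map_origin_world, map_scale, eye_height):
        """Convert a 2D point picked on the map into a full-scale world target,
        keeping the viewpoint at a fixed height above the ground (Y-up)."""
        x = map_origin_world[0] + map_point[0] * map_scale
        z = map_origin_world[2] + map_point[1] * map_scale
        return np.array([x, eye_height, z])

    def interpolate_path(start, target, t):
        """Position along the straight-line path for t in [0, 1]."""
        return (1.0 - t) * np.asarray(start) + t * np.asarray(target)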

Figure 8.13 Map-based target specification. The darker dot on the lower right of the map indicates the user’s current position and can be dragged to a new location on the map to specify a travel target in the full-scale environment. (Bowman, Johnson et al. 1999; reprinted by permission of MIT Press and Presence: Teleoperators and Virtual Environments)

Dual-Target Techniques

Dual-target travel techniques allow the user to travel easily between two target locations. Normally, the user directly specifies the first target location by using a selection technique while the second target location is implicitly defined by the system at the time of that selection. For example, the ZoomBack technique (Zeleznik et al. 2002) uses a typical ray-casting metaphor (see Chapter 7) to select an object in the environment and then moves the user to a position directly in front of this object. Ray-casting has been used in other 3D interfaces for target-based travel as well (Bowman, Johnson et al. 2001). The novel feature of the ZoomBack technique, however, is that it retains information about the previous position of the user and allows users to return to that position after inspecting the target object.

Zeleznik and colleagues used this technique in the context of a virtual museum. The technique allowed users to select a painting on the wall, examine that painting up close, and then return to the original location where multiple paintings could be viewed. Their implementation used a specialized pop-through button device (see Chapter 6, “3D User Interface Input Hardware”). Users moved to the target object with light pressure on the button and could choose either to remain there by pressing the button firmly or to return to their previous location by releasing the button. This technique could also be implemented using two standard buttons, but the pop-through buttons provide a convenient and intuitive way to specify a temporary action that can then be either confirmed or canceled. This general strategy could be applied to other route-planning and target-based techniques as well.
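
The essence of ZoomBack—remembering the previous viewpoint so it can be restored—can be sketched as follows; the ray-casting selection itself and the pop-through button handling are assumed to happen elsewhere, and the fixed offset distance is an arbitrary placeholder.

    import numpy as np

    class ZoomBackSketch:
        """Move in front of a selected object, remembering where we came from."""
        def __init__(self):
            self.saved_position = None

        def zoom_to(self, user_position, hit_point, hit_normal, offset=1.5):
            # Store the current viewpoint, then place the user a fixed
            # distance in front of the selected object's surface.
            self.saved_position = np.asarray(user_position, dtype=float).copy()
            n = np.asarray(hit_normal, dtype=float)
            n /= np.linalg.norm(n)
            return np.asarray(hit_point) + n * offset

        def go_back(self):
            # Called when the user cancels (releases the button): return the
            # stored viewpoint, or None if nothing was stored.
            return self.saved_position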

8.6.2 Route-Planning Travel Techniques

A second category of selection-based travel techniques, called route planning, allows the user to specify a path or route through the environment, then moves the user along that path. The essential feature of the route-planning metaphor is this two-step process: the user plans, and then the system carries out the plan. This type of technique is much less common than continuous steering or target-based travel but has many uses. The techniques can allow the user to review, refine, or edit the path before its execution. For example, the user might want to define a camera path to be followed in an animation. He could do this with a steering technique, but the result would likely be a more erratic and less precise path.

Route-planning techniques also allow the user to focus on other tasks, such as information gathering, during the actual period of travel. Route-planning techniques still give users at least some degree of control over their motion, but they move that control to the beginning of the travel task. This section contains information on the following route-planning techniques:

drawing a path

marking points along a path

Drawing a Path

One way to specify a route is to draw the desired path. A continuous path ensures the highest level of user control. One published technique for desktop 3D UIs allows the user to draw directly in the 3D environment with the mouse by projecting the 2D mouse path onto the 3D geometry in the environment (Figure 8.14; Igarashi et al. 1998). This technique assumes that the camera should always move at a given height above a surface rather than fly through empty space. The technique includes intelligent mapping of the path: rather than simply projecting the 2D stroke onto the nearest 3D surface in the scene, the algorithm takes into consideration the continuity of the path and the surface the path has followed up to the current point. Thus a path can be drawn that goes through a tunnel even if all of the ground within the tunnel is not visible to the user.

In an immersive 3D UI, the user likely cannot reach the entire environment to draw in it directly, so drawing on a 2D or 3D map of the environment could be used to specify the path. This requires a transformation from the map coordinate system to the world coordinate system, and in the case of a 2D map, an inferred height.

Figure 8.14 Path-drawing system. (Igarashi et al. 1998, © 1998 ACM; reprinted by permission)

Figure 8.15 Route-planning technique using markers on a 3D map. (Bowman, Davis et al. 1999; reprinted by permission of MIT Press and Presence: Teleoperators and Virtual Environments)

Marking Points along a Path

Another method for specifying a path in a 3D environment is to place markers at key locations along the path. Again, these markers could be placed in the environment directly (using a mouse on the desktop or perhaps using a manipulation-at-a-distance technique; see Chapter 7) or on a 2D or 3D map of the environment. The system is then responsible for creating a continuous path that visits all of the marker locations. One simple implementation (Bowman, Davis et al. 1999) used a 3D map of the environment and moved the user along a straight-line path from one marker to the next (Figure 8.15). The path might also be more complex; the markers could be used as control points for a curve, for example. One advantage of this type of technique is that the user can vary the level of control by placing more (increased user control) or fewer (increased system control) markers. A key issue with marker placement techniques is feedback: how does the user know what path will be traversed? A well-designed technique will include interactive feedback to show the user the path in the environment or on the map.
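
One simple way to turn an ordered list of markers into a traversable path is shown in this sketch; rendering the sampled points as a polyline would also provide the interactive feedback discussed above. The sampling density is an arbitrary assumption.

    import numpy as np

    def path_from_markers(markers, samples_per_segment=30):
        """Piecewise-linear path visiting each marker in order, returned as a
        list of viewpoint positions that the system steps through over time."""
        path = []
        for a, b in zip(markers[:-1], markers[1:]):
            a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
            for i in range(samples_per_segment):
                t = i / samples_per_segment
                path.append((1.0 - t) * a + t * b)
        path.append(np.asarray(markers[-1], dtype=float))
        return path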

8.7 Manipulation-Based Travel Metaphors

Manipulation-based travel techniques are another type of cross-task technique that can be quite effective in some situations. These techniques use hand-based object manipulation metaphors, such as HOMER, Go-Go, and so on (see Chapter 7), to manipulate either the viewpoint or the entire world.

Manipulation-based travel techniques should be used in situations where both travel and object manipulation tasks are frequent and interspersed. For example, consider an interior layout application, where the user’s goal is to place furniture, carpets, paintings, and other items in a virtual room. This task involves frequent object manipulation tasks to place or move virtual objects and frequent travel tasks to view the room from different viewpoints. Moreover, the designer will likely move an object until it looks right from the current viewpoint and then travel to a new viewpoint to verify the object’s placement. If the same metaphor can be used for both travel and object manipulation, then the interaction with this environment will be seamless and simple from the user’s point of view.

8.7.1 Viewpoint Manipulation Techniques

There are several approaches to manipulating the viewpoint to achieve travel. Examples of viewpoint manipulation techniques described below are

camera manipulation

avatar manipulation

fixed-object manipulation

Camera Manipulation

A technique for travel in a desktop VE that still uses position trackers is called the camera-in-hand technique (Ware and Osborne 1990). A tracker is held in the hand, and the absolute position and orientation of that tracker in a defined workspace specifies the position and orientation of the camera from which the 3D scene is drawn. In other words, a miniature version of the world can be imagined in the work area. The tracker is imagined to be a virtual camera looking at this world (see Figure 8.16). Travel then is a simple matter of moving the hand in the workspace.

Figure 8.16 Camera-in-hand technique. The user’s hand is at a certain position and orientation within the workspace (left), producing a particular view of the environment (right).

The camera-in-hand technique is relatively easy to implement. It simply requires a transformation between the tracker coordinate system and the world coordinate system, defining the mapping between the virtual camera position and the tracker position.
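
For instance, given 4×4 homogeneous matrices, the mapping might look like the sketch below; workspace_to_world stands for whatever calibration transform relates the tracker’s working volume to the miniature world and is an assumed input, not a toolkit call.

    import numpy as np

    def camera_from_tracker(tracker_pose, workspace_to_world):
        """Map the handheld tracker's 4x4 pose (in workspace coordinates) to
        the virtual camera pose and the view matrix used for rendering."""
        camera_pose = workspace_to_world @ tracker_pose
        view_matrix = np.linalg.inv(camera_pose)   # view = inverse of camera pose
        return camera_pose, view_matrix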

This technique can be effective for desktop 3D UIs (if a tracker is available) because the input is actually 3D in nature, and the user can use her proprioceptive sense to get a feeling for the spatial relationships between objects in the 3D world. However, the technique can also be confusing because the user has an exocentric (third-person) view of the workspace, but the 3D scene is usually drawn from an egocentric (first-person) point of view.

Avatar Manipulation

Instead of manipulating the camera, the user can manipulate a virtual representation of himself in order to plan a route. For example, in the world-in-miniature (WIM) technique (see Chapter 7, section 7.7.2), a small human figure represents the user’s position and orientation in the miniature world (Figure 8.17). The user selects and manipulates this object in the miniature environment in order to define a path for the viewpoint to move along or simply a target location; then the system executes this motion in the full-scale environment. Pausch et al. (1995) found that this technique is most understandable when the user’s view actually flies into the miniature world, having it replace the full-scale world and then creating a new miniature. One major advantage of this technique relative to the other route-planning techniques is that the user representation has orientation as well as position so that viewpoint rotations, not just translations, can be defined.

Similarly, a path can be defined by moving a user icon on a 2D map of the environment, a technique that would apply equally well to desktop and immersive 3D UIs. Because this path is only 2D, the system must use rules to determine the height of the user at every point along the path. A common rule would keep the viewpoint at a fixed height above the ground.

Figure 8.17 WIM (in foreground) held in front of the corresponding full-scale environment. The user icon is at the bottom of the image. (Image courtesy of Doug A. Bowman)

Fixed-Object Manipulation

You can also use a manipulation technique for travel by letting a selected object serve as a focus for viewpoint movement. In other words, the user selects an object in the environment and then makes hand movements, just as he would to manipulate that object. However, the object remains stationary and the viewpoint is moved relative to that object. This is called fixed-object manipulation. Although this is hard to understand without trying it yourself, a real-world analogy may help. Imagine grabbing a flagpole. The pole is fixed firmly to the ground, so when you move your hand toward your body, the flagpole doesn’t move; rather, you move closer to it. Similarly, you might try to rotate the flagpole by turning your hand, but the effect instead will be to rotate your body in the opposite direction around the pole.

Let us consider a specific example of fixed-object manipulation in 3D UIs. Pierce et al. (1997) designed a set of manipulation techniques called image-plane techniques (see Chapter 7, section 7.5.1). Normally, an object is selected in the image plane, then hand movements cause that object to move within the environment. For example, moving the hand back toward the body would cause the selected object to move toward the user as well. When used in travel mode, however, the same hand motion would cause the viewpoint to move toward the selected object. Hand rotations can also be used to move the viewpoint around the selected object. The scaled-world grab technique (Mine et al. 1997) and the LaserGrab technique (Zeleznik et al. 2002) work in a similar way.
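
A simplified, translation-plus-yaw version of this idea is sketched below (Y-up assumed); actual image-plane and scaled-world grab implementations differ in their details, such as distance-dependent scaling of the motion, so this is only a minimal illustration.

    import numpy as np

    def fixed_object_travel(viewpoint, object_pos, hand_delta, hand_yaw_delta):
        """Apply the inverse of the would-be object manipulation to the viewpoint:
        pulling the hand back moves the viewpoint toward the fixed object, and
        turning the hand orbits the viewpoint around it in the opposite sense."""
        new_pos = np.asarray(viewpoint, dtype=float) - hand_delta
        c, s = np.cos(-hand_yaw_delta), np.sin(-hand_yaw_delta)
        offset = new_pos - np.asarray(object_pos, dtype=float)
        rotated = np.array([c * offset[0] + s * offset[2],
                            offset[1],
                            -s * offset[0] + c * offset[2]])
        return np.asarray(object_pos, dtype=float) + rotated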

Fixed-object manipulation techniques provide a seamless interaction experience in combined travel/manipulation task scenarios, such as the one described above. The user must simply be aware of which interaction mode is active (travel or manipulation). Usually, the two modes are assigned to different buttons on the input device. The two modes can also be intermingled using the same selected object. The user would select an object in manipulation mode, move that object, then hold down the button for travel mode, allowing viewpoint movement relative to that selected object, then release the button to return to manipulation mode, and so on.

8.7.2 World Manipulation Techniques

An alternative to manipulating the viewpoint to travel is to manipulate the entire world relative to the current viewpoint. A number of techniques take this approach. We categorize them by the number of points used to manipulate the world:

single-point world manipulation

dual-point world manipulation

Single-Point World Manipulation

One method for using manipulation techniques for travel tasks is to allow the user to manipulate the world about a single point. An example of this is the “grab the air” or “scene in hand” technique (Mapes and Moshell 1995; Ware and Osborne 1990). In this concept, the entire world is viewed as an object to be manipulated. When the user makes a grabbing gesture at any point in the world and then moves her hand, the entire world moves while the viewpoint remains stationary. Of course, to the user this appears exactly the same as if the viewpoint had moved and the world had remained stationary.

In order to integrate this travel technique with an object manipulation technique, the system must simply determine whether or not the user is grabbing a moveable object at the time the grab gesture is initiated. If an object is being grabbed, then standard object manipulation should be performed; otherwise, the grab gesture is interpreted as the start of a travel interaction. In this way, the same technique can be used for both tasks.

Although this technique is easy to implement, developers should not fall into the trap of simply attaching the world object to the virtual hand, because this will cause the world to follow the virtual hand’s rotations as well as translations, which can be quite disorienting. Rather, while the grab gesture is maintained (or the button held down), the system should measure the displacement of the virtual hand each frame and translate the world origin by that vector.
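
The per-frame displacement approach can be sketched as follows; hand positions are assumed to be reported in the fixed physical (tracker) coordinate system, and the class and attribute names are placeholders.

    import numpy as np

    class GrabTheAir:
        """Translate the world by the frame-to-frame hand displacement while the
        grab gesture is held, rather than parenting the world to the hand."""
        def __init__(self, world_origin):
            self.world_origin = np.asarray(world_origin, dtype=float)
            self.last_hand = None

        def update(self, grabbing, hand_position):
            hand_position = np.asarray(hand_position, dtype=float)
            if not grabbing:
                self.last_hand = None           # gesture released: reset
                return self.world_origin
            if self.last_hand is not None:
                # Only translations follow the hand; hand rotations are ignored.
                self.world_origin += hand_position - self.last_hand
            self.last_hand = hand_position.copy()
            return self.world_origin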

In its simplest form, this technique requires a lot of arm motion on the part of the user to travel significant distances in the VE. Enhancements to the basic technique can reduce this. First, the virtual hand can be allowed to cover much more distance using a technique such as Go-Go (Poupyrev et al. 1996). Second, the technique can be implemented using two hands instead of one, as discussed next.

Dual-Point World Manipulation

Manipulating the world has also been implemented by defining two manipulation points instead of one. The commercial product SmartScene, which evolved from a graduate research project (Mapes and Moshell 1995), allowed the user to travel by using an action similar to pulling oneself along a rope. The interface was simple—the user continuously pulled the world toward him by making a grab gesture with his hand outstretched and bringing the hand closer before grabbing the world again with the other hand. This approach distributed the strain of manipulating the world across both of the user’s arms rather than relying primarily on one.

Another advantage of dual-point manipulation is the ability to also manipulate the view rotation while traveling. When the user has the world grabbed with both hands, the position of the user’s nondominant hand can serve as a pivot point while the dominant hand defines a vector between them. Rotational changes in this vector can be applied to the world’s transformation to provide view rotations in addition to traveling using dual-point manipulations. Additionally, the distance between the two hands can be used to scale the world to be larger or smaller, as discussed in section 8.8.5.
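
The following sketch combines the pivot, rotation, and scaling described above into a single 4×4 update (Y-up, yaw-only rotation); sign conventions and the exact matrix order depend on the scene graph being used, so treat this as one possible formulation under those assumptions.

    import numpy as np

    def dual_point_update(world_matrix, pivot, prev_vec, curr_vec):
        """Rotate and scale the world about the nondominant hand (the pivot)
        according to the change in the hand-to-hand vector."""
        pivot = np.asarray(pivot, dtype=float)
        scale = np.linalg.norm(curr_vec) / max(np.linalg.norm(prev_vec), 1e-6)
        yaw = (np.arctan2(curr_vec[2], curr_vec[0])
               - np.arctan2(prev_vec[2], prev_vec[0]))
        c, s = np.cos(yaw), np.sin(yaw)

        def translation(t):
            m = np.eye(4)
            m[:3, 3] = t
            return m

        rot_scale = np.eye(4)
        rot_scale[:3, :3] = scale * np.array([[  c, 0.0,   s],
                                              [0.0, 1.0, 0.0],
                                              [ -s, 0.0,   c]])
        return translation(pivot) @ rot_scale @ translation(-pivot) @ world_matrix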

8.8 Other Aspects of Travel Techniques

In addition to the techniques covered in the previous sections, there are many other aspects of travel techniques that 3D UI designers should be concerned with. These include how to orient the viewpoint, how to specify the velocity of travel, how to provide vertical travel, whether to use automated or semiautomated travel, whether to scale the world while traveling, how to transition between different travel techniques, how to use multiple cameras and perspectives, and whether to use nonphysical inputs, such as brain signals.

8.8.1 Viewpoint Orientation

Thus far, we have focused almost exclusively on techniques for changing the position (xyz coordinates) of the viewpoint. Travel also includes, however, the task of setting the viewpoint orientation (heading, pitch, and roll). Here we discuss techniques specifically designed to specify the orientation of the viewpoint, including

head tracking

orbital viewing

nonisomorphic rotation

virtual sphere techniques

Head Tracking

For immersive VR and AR, there is usually no need to define an explicit viewpoint orientation technique, because the viewpoint orientation is taken by default from the user’s head tracker. This is the most natural and direct way to specify viewpoint orientation, and it has been shown that physical turning leads to higher levels of spatial orientation than virtual turning (Bakker et al. 1998; Chance et al. 1998).

Orbital Viewing

A slight twist on the use of head tracking for viewpoint orientation is orbital viewing (Koller et al. 1996). This technique is used to view a single virtual object from all sides. In order to view the bottom of the object, the user looks up; in order to view the left side, the user looks right; and so on. However, recent research by Jacob et al. (2016) has indicated that head roll motions, instead of head yaw motions, should be used for horizontal orbital viewing when interacting with a stationary screen.

Nonisomorphic Rotation

There are certain situations in immersive VR when some other viewpoint orientation technique is needed. The most common example comes from projected displays in which the display surfaces do not completely surround the user, as in a three-walled surround-screen display. Here, in order to see what is directly behind, the user must be able to rotate the viewpoint (in surround-screen displays, this is usually done using a joystick on the “wand” input device). The redirected walking technique (Razzaque et al. 2002) slowly rotates the environment so that the user can turn naturally but avoid facing the missing back wall. Research has produced nonisomorphic rotation techniques (LaViola et al. 2001) that allow the user of such a display to view the entire surrounding environment based on amplified head rotations (for an introduction to nonisomorphic mappings, see Chapter 7, section 7.3.1).

A number of different nonisomorphic mappings could be used for setting the virtual viewpoint orientation. For a CAVE-like display, LaViola et al. (2001) used a nonlinear mapping function, which kicks in only after the user has rotated beyond a certain threshold, dependent on the user’s waist orientation vector and position within the CAVE. A scaled 2D Gaussian function has been shown to work well. LaViola et al. (2001) offer more details on nonisomorphic viewpoint rotation.
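
As a much-simplified stand-in for the scaled Gaussian mapping, the sketch below amplifies yaw linearly once the user turns past a threshold; the threshold and gain values are arbitrary assumptions, and a production implementation would use a smooth mapping such as the one described above.

    import numpy as np

    def amplified_yaw(physical_yaw, threshold=np.radians(60.0), gain=2.0):
        """Isomorphic rotation within the threshold; beyond it, the excess
        rotation is amplified so the user can view the scene behind the
        missing back wall without physically turning all the way around."""
        excess = abs(physical_yaw) - threshold
        if excess <= 0.0:
            return physical_yaw
        return np.sign(physical_yaw) * (threshold + gain * excess)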

Virtual Sphere Techniques

For desktop 3D UIs, setting viewpoint orientation is usually a much more explicit task. The most common techniques are the Virtual Sphere (Chen et al. 1988) and a related technique called the Arcball (Shoemake 1992). Both of these techniques were originally intended to be used for rotating individual virtual objects from an exocentric point of view and are described in detail in Chapter 7, “Manipulation,” section 7.7.3. For egocentric points of view, the same concept can be used from the inside out. That is, the viewpoint is considered to be the center of an imaginary sphere, and mouse clicks and drags rotate that sphere around the viewpoint.

8.8.2 Velocity Specification

Next, we need to consider techniques for changing the speed of travel. Many 3D UIs ignore this aspect of travel and simply set what seems to be a reasonable constant velocity. However, this can lead to a variety of problems, because a constant velocity will always be too slow in some situations and too fast in others. When the user wishes to travel from one side of the environment to another, frustration quickly sets in if he perceives the speed to be too slow. On the other hand, if he desires to move only slightly to one side, the same constant velocity will probably be too fast to allow precise movement. Therefore, considering how the user or system might control velocity is an important part of designing a travel technique.

The user can control velocity in many different ways. We discuss the following approaches to defining or manipulating velocity:

discrete changes

continuous control

direct input

automated velocity

Discrete Changes

One way to allow the user to control velocity is through discrete changes based on predefined amounts. A simple example is using two buttons, one to increase the speed and the other to decrease it. These buttons could be on the input device held by the user or displayed on a menu within the 3D UI. While discrete changes in velocity can help alleviate most of the issues with traveling too slow or too fast, the user may not be able to set the velocity to the desired speed if that value falls between two of the discrete steps. An interaction designer may decide to use a smaller step value to decrease that likelihood, but this in turn can cause the user to press a button an unnecessarily large number of times to eventually reach the desired velocity. Hence, interaction designers must be careful in selecting step values for discrete changes.

Continuous Control

Considering the issues with discrete changes, a likely better solution is to afford the user continuous control over velocity. Often, continuous velocity control can be integrated with the direction control technique being used. For example, in gaze-directed steering, the orientation of the head is used to specify travel direction, so the position of the head relative to the body can be used to specify velocity. This is called lean-based velocity (Fairchild et al. 1993; LaViola et al. 2001; Song and Norman 1993). In LaViola’s implementation, this involves looking at the absolute horizontal distance between the head and the waist. Once this distance exceeds a threshold, then the user begins to translate in the direction of leaning, and the velocity is some multiple of the leaning magnitude. Similarly, a technique that bases velocity on hand position relative to the body (Mine 1995a) integrates well with a pointing technique.
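
A sketch of the lean-based velocity mapping described above is given below, assuming Y-up world coordinates for the tracked head and waist positions; the threshold and gain values are placeholders to be tuned per application.

    import numpy as np

    def lean_based_velocity(head, waist, threshold=0.1, gain=2.0):
        """Velocity from the horizontal head-waist offset: no motion below the
        threshold, then speed grows with the magnitude of the lean."""
        offset = np.asarray(head, dtype=float) - np.asarray(waist, dtype=float)
        offset[1] = 0.0                        # horizontal component only (Y-up)
        magnitude = np.linalg.norm(offset)
        if magnitude < threshold:
            return np.zeros(3)
        direction = offset / magnitude
        return direction * gain * (magnitude - threshold)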

Other methods for continuously controlling velocity make use of physical devices. A physical prop that includes acceleration and braking pedals can be used to control velocity similar to controlling the speed of a vehicle. Velocity may also be controlled with the use of physical force-based devices, such as a force-feedback joystick. In these cases, velocity is a function of the amount of force applied. For more information on such techniques, see MacKenzie (1995).

An obvious advantage to the continuous control techniques is that the user can achieve his desired travel speed. However, with some implementations, such as the techniques based on relative positions, it can be difficult to sustain a specific speed.

Direct Input

Another method of allowing the user to specify velocity is through direct input. For example, the user can enter a numeric value using a keyboard or specify the velocity through a voice command. While direct input techniques like these allow the user to specify and sustain a desired velocity, they are often inconvenient and distract the user from other tasks.

Automated Velocity

The main drawback to allowing the user to control velocity is that it adds complexity to the interface. In cases where velocity control would be overly distracting to the user, a system-controlled velocity technique may be appropriate. For example, to allow both short, precise movements with a small velocity and larger movements with a high velocity, the system could automatically change the velocity depending on the amount of time the user had been moving. In such techniques, travel starts slowly and gradually gets faster until it reaches some maximum speed. The shape of the velocity curve and the maximum velocity might depend on the size of the environment, the need for precise travel, or other factors. Of course, this technique decreases the precision of longer movements. Another potential technique uses the concept of “slow-in, slow-out” from animation (Mackinlay et al. 1990a), meaning that travel begins slowly, gets faster, and then slows again as the destination comes near. This implies that the destination is known, so this can be fully automated only with a target-based travel technique (section 8.6.1).
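
One common way to approximate “slow-in, slow-out” motion is a smoothstep easing function, sketched below; this is an illustrative stand-in rather than the specific formulation of Mackinlay et al. (1990a), and it assumes the destination and travel duration are already known.

    def slow_in_slow_out(t):
        """Fraction of the path covered at normalized time t in [0, 1];
        the smoothstep curve starts slowly, speeds up, and ends slowly."""
        t = min(max(t, 0.0), 1.0)
        return t * t * (3.0 - 2.0 * t)

    def viewpoint_at(start, target, elapsed, duration):
        """Eased position between the starting point and the known destination."""
        s = slow_in_slow_out(elapsed / duration)
        return [(1.0 - s) * a + s * b for a, b in zip(start, target)]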

8.8.3 Vertical Travel

Many of the techniques presented in this chapter are restricted to traveling within the horizontal plane (e.g., the walking metaphors and physical steering props). Some of the techniques afford traveling in vertical directions, such as the spatial steering metaphors, but this is often due to the ability to travel in all three dimensions, as opposed to only vertical movements. However, there have been a few techniques focused on vertical travel.

Slater, Usoh, and Steed (1994) used the walking-in-place technique and collisions with virtual ladders and stairs to provide vertical travel via climbing. The direction of climbing was determined by whether the user’s feet collided with the bottom step of a ladder or staircase (climbing up) or with the top step (climbing down). While on a ladder or staircase, the user could reverse the climbing direction by physically turning around.

More recently, researchers have developed other techniques for climbing ladders. Takala and Matveinen (2014) created a technique based on reaching for and grabbing the rungs of a virtual ladder. Once a rung is grabbed, users can travel up or down by moving the grabbed rung in the opposite direction (e.g., moving the rung down to climb up). To provide a more realistic technique for climbing ladders, Lai et al. (2015) developed the march-and-reach technique, in which the user marches in place to virtually step on lower ladder rungs while reaching to virtually grab higher rungs. While users found the technique more realistic, it was more difficult to use than walking in place or Takala’s technique of reaching and grabbing.

8.8.4 Semiautomated Travel

In certain applications, particularly in the areas of entertainment, storytelling, and education, the designer wants to give the user the feeling of control while at the same time moving the user toward an eventual goal and keeping her attention focused on important features of the environment. For example, in Disney’s Aladdin attraction (Pausch et al. 1996), the user needs to feel as if she is controlling her magic carpet, but the experience must be limited to a certain amount of time, and every user needs to reach the end of the story. In such applications, semiautomated travel techniques are needed.

The basic concept of semiautomated travel is that the system provides general constraints and rules for the user’s movement, and the user is allowed to control travel within those constraints. This idea is of course applicable to both immersive and desktop 3D UIs. A particular implementation of this concept is Galyean’s river analogy (Galyean 1995). He used the metaphor of a boat traveling down a river. The boat continues to move with the current whether the user is actively steering or not, but the user can affect the movement by using the rudder. In particular, he designed an application in which the designer defined a path through the environment (the river). The user was “attached” to this path by a spring and could move off the path by some amount by looking in that direction (Figure 8.18).

Figure 8.18 Galyean’s (1995) guided navigation technique. (© 1995 ACM; reprinted by permission)

8.8.5 Scaling the World

In section 8.4.1, we noted that the most natural and intuitive method for traveling in a 3D virtual world is real walking, but real walking is limited by the tracking range or physical space. One way to alleviate this problem is to allow the user to change the scale of the world so that a physical step of one meter can represent one nanometer, one kilometer, or any other distance. This allows the available tracking range and physical space to represent a space of any size.

There are several challenges when designing a technique for scaling the world and traveling. One is that the user needs to understand the scale of the world so that he can determine how far to move and can understand the visual feedback he gets when he moves. Use of a virtual body (hands, feet, legs, etc.) with a fixed scale is one way to help the user understand the relative scale of the environment. Another issue is that continual scaling and rescaling may hasten the onset of cybersickness or discomfort (Bowman, Johnson et al. 2001). In addition, scaling the world down so that a movement in the physical space corresponds to a much larger movement in the virtual space will make the user’s movements much less precise.

There are two common approaches to designing scaling-and-traveling techniques:

active scaling

automated scaling

Active Scaling

The most common approach to scaling and traveling is to allow the user to actively control the scale of the world. Several research projects and applications have used this concept. One of the earliest was the 3DM immersive modeler (Butterworth et al. 1992), which allowed the user to “grow” and “shrink” to change the relative scale of the world. The SmartScene application (Mapes and Moshell 1995) also allowed the user to control the scale of the environment in order to allow rapid travel and manipulation of objects of various sizes. The interface for changing the environment’s scale was simple—users wore Pinch Gloves (see Figure 6.23), made a simple pinch gesture, and either brought the hands together to signify scaling the world down or moved the hands apart to scale the world up. The scaled-world grab technique (Mine, Brooks, and Séquin 1997) scales the user in an imperceptible way when an object is selected (Figure 8.19). The user sees the same scene (disregarding stereo) before and after the selection, although the world has actually been scaled. Although this technique is meant for object manipulation, the scaling also allows the user to travel larger distances using physical movements.

Figure 8.19 Scaling in the scaled-world grab technique. (Mine et al. 1997, © 1997 ACM; reprinted by permission)

Automated Scaling

While active scaling allows the user to specify the scale of the world, it requires additional interface components or interactions to do so. Alternatively, 3D UIs can be designed to have the system change the scale of the world based on the user’s current task or position. This automated approach affords scaling and traveling without requiring the user to specify the scale. An example of automated scaling is the multiscale virtual environment (MSVE) interface developed by Kopper et al. (2006). Each MSVE contains a hierarchy of objects, with smaller objects nested within larger objects. As the user travels from a larger object to a smaller object, the system detects that the user is within the smaller object’s volume and scales the world up. For example, a medical student learning human anatomy could travel from outside the body and into an organ. During this travel, the system detects the travel and scales the world up to make the organ the same size as the medical student. Alternatively, when the student travels from the organ to outside the body, the system scales the world back down.

MSVEs allow the user to concentrate on traveling instead of scaling while still gaining the benefits of having the world scaled up or down. However, such VEs require careful design, as the hierarchy of objects and scales need to be intuitive and usable for the user.
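
The core of the automated approach can be sketched as a lookup over the nested-object hierarchy; contains and scale_up are hypothetical attributes of the application’s volume objects, not part of a published API.

    def world_scale_for(user_position, nested_volumes):
        """Return the accumulated scale-up factor of the innermost volume in the
        hierarchy (ordered outermost to innermost) that contains the user."""
        scale = 1.0
        for volume in nested_volumes:
            if volume.contains(user_position):    # hypothetical containment test
                scale *= volume.scale_up          # e.g., organ scaled to body size
            else:
                break
        return scale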

8.8.6 Travel Modes

Most of the travel techniques discussed in this chapter use a single mode for travel. However, some techniques require additional modes to transition among different travel methods. For example, many 3D desktop applications require several travel modes, since the user can control only two or three of the camera’s six DOF with a standard mouse and keyboard. When providing a number of travel modes like this, it is important that the modes are well integrated to allow easy transition from one to another. At the same time, travel modes must be clearly distinguished so that the user does not move the camera along unintended DOF.

An early novel approach to transitioning among travel modes was Zeleznik and Forsberg’s (1999) UniCam, a 2D camera-control mechanism (Figure 8.20) originally derived from the SKETCH camera-control metaphor (Zeleznik et al. 1996). In UniCam, the user controls travel through the 3D scene by making 2D gestural commands, using a mouse or stylus with a button, to manipulate the virtual camera. To facilitate camera translation and orbiting, the viewing window is broken up into two regions, a rectangular center region and a rectangular border region. If the user clicks within the border region, a virtual sphere rotation technique is used to orbit the viewer about the center of the screen. If the user clicks within the center region and drags the pointer along a horizontal trajectory, image-plane translation is invoked. If the user drags the pointer along a vertical trajectory, the camera zooms in or out. The user can also invoke orbital viewing about a focus point by making a quick click to define a dot and then clicking again elsewhere.

Figure 8.20 Gesture-based controls for camera translation and rotation. (Zeleznik and Forsberg 1999; © 1999 ACM; reprinted by permission)

A similar but more recent approach to integrating travel modes is the Navidget travel widget (Hachet et al. 2008). With the Navidget technique, the user can first zoom in by encircling, with a pointer or stylus, the area to keep within the camera’s field of view. If the user holds the input button after sketching a circle, instead of zooming, a virtual sphere appears to provide orbital viewing. Upon release of the input button, the camera moves to the final viewing position. Four additional widgets are attached to the virtual sphere and can manipulate the size of the sphere, which in turn controls the viewing distance that the camera is placed at. Finally, an outer region allows the user to orbit between the front and back of the virtual sphere by moving into the region and then back into the sphere.

8.8.7 Multiple Cameras

While most of the techniques described in this chapter rely on a single (usually virtual) camera, some travel techniques have been designed to specifically incorporate different perspectives of multiple cameras. An early example of such a technique is the through-the-lens metaphor developed by Stoev and Schmalstieg (2002). This metaphor provides two different viewpoints that can be used to simultaneously explore the virtual world. The primary viewpoint is a standard immersive view of the world surrounding the user. The secondary viewpoint is a sort of magic lens (Bier et al. 1993) that is seen within the surrounding world but displays a different perspective of the world, as if positioned elsewhere or viewing the same world in an alternate dimension. Kiyokawa and Takemura (2005) expanded upon the through-the-lens metaphor by adding the capability to create an arbitrary number of viewing windows with their tunnel-window technique.

Multi-camera techniques have also been used with AR. Veas and colleagues (2010) investigated techniques for observing video feeds from multiple cameras within an outdoor environment. They developed three techniques for transitioning to remote camera feeds with visible viewpoints of the same scene—the mosaic, the tunnel, and the transitional techniques. The mosaic and tunnel techniques provide egocentric transitions to the remote camera while the transitional technique provides an exocentric transition. Veas et al. (2012) later developed a multiview AR system by combining different views of remote cameras, views generated by other users’ devices, a view of 2D optical sensors, cameras on unmanned aerial vehicles, and predefined virtual views—more on these techniques can be found in the mobile AR case study below (section 8.11.2).

In a different multi-camera approach, Sukan et al. (2012) stored snapshots of augmented scenes that could later be virtually visited without physically returning to the locations where they were captured. Along with still images of the real world, the researchers also stored the position of the camera. This allowed for virtual augmentations and objects to be dynamically updated when the snapshots were virtually viewed later. This approach enabled users to quickly switch between viewpoints of the current scene.

8.8.8 Nonphysical Input

All of the techniques discussed in this chapter require some form of physical input, whether physically walking within a tracked space or using a mouse for orbital viewing. However, not all travel techniques require physical (motor) input. Researchers have recently begun investigating brain–computer interfaces (BCIs) for navigating VEs. In an early study, Pfurtscheller et al. (2006) used an electroencephalogram (EEG) device to determine when the user was thinking about walking forward, and in turn, moved the user’s view down a virtual street. More recently, Larrue et al. (2012) used an EEG device to provide the ability to turn left or right by thinking, in addition to walking forward by thought. While BCI techniques can be used to enable travel without physical actions, these techniques currently require a great deal of time to train the BCI system. Generically trained signal-processing algorithms can be used to reduce this time, but these algorithms can be unresponsive and induce false positives.

8.9 Wayfinding in 3D Environments

We now turn to the cognitive aspect of navigation—wayfinding. In general, the effectiveness of wayfinding depends on the number and quality of wayfinding cues or aids provided to users. The following sections on user-centered and environment-centered wayfinding cues will present a number of different wayfinding aids. These sections will address wayfinding in practice, such as how to include cues, when to include cues, and how the design of the VE affects wayfinding. See Golledge (1999) for a discussion of the theoretical foundations of wayfinding.

8.9.1 User-Centered Wayfinding Cues

User-centered wayfinding cues make use of the characteristics of human perception and can draw upon multiple human senses. Thus, most user-centered support is display-oriented. Because output devices still cannot deliver information that fully matches the capabilities of the human perceptual system (see Chapter 3, “Human Factors Fundamentals”), they can have a negative impact on wayfinding. However, there are certain strategies that developers can use to lessen these negative effects. In this section, we discuss various user-centered wayfinding cues:

field of view

motion cues

multisensory output

presence

search strategies

Field of View

A small field of view (FOV) may inhibit wayfinding. Because a smaller portion of the VE is visible at any given time, the user requires repetitive head movements to comprehend the spatial information obtained from the viewpoint. Using a larger FOV reduces the amount of head movement and allows the user to understand spatial relationships more easily. Some studies, such as Péruch et al. (1997) and Ruddle et al. (1998), do not fully support these claims, showing little difference in the orientation capabilities of a user between several small FOVs (40, 60, and 80 degrees, or in desktop environments). However, they have demonstrated the usefulness of larger FOVs when environments become more detailed and complex. Furthermore, wide FOVs closer to the FOV of the human visual system (like those in some surround-screen displays) were not considered in these studies.

Another negative side effect of a small FOV is the lack of optical-flow fields in users’ peripheral vision. Peripheral vision provides strong motion cues, delivering information about the user’s direction, velocity, and orientation during movement. In addition, it has been shown that small FOVs may lead to cybersickness (Stanney et al. 1998).

Motion Cues

Supplying motion cues enables the user to judge both the depth and direction of movement and provides the information necessary for dead reckoning (backtracking of the user’s own movement). Motion cues can be obtained from peripheral vision, as discussed above, but motion cues are not purely visual—it is important to supply the user with additional vestibular (inertia and balance) cues if possible. These cues generally fall in the category of embodied self-motion cues (Riecke et al. 2012). A lack of vestibular cues causes an intersensory conflict between visual and physical information. This may cause cybersickness and can affect judgments of egomotion, thus negatively impacting the formation of the cognitive map.

The effect of vestibular cues on the orientation abilities of users in VEs has been the subject of a range of studies. Usoh et al. (1999) compared real walking (section 8.4.1) against walking in place (section 8.4.2) and hand-directed steering (section 8.5.1). The two walking metaphors, which included physical motions, were found to be better than the steering technique, which only included virtual motions. Other studies of virtual and physical travel have shown positive effects of real motion on spatial orientation (Klatzky et al. 1998; Chance et al. 1998).

Our understanding of the proper balance between visual and vestibular input is still being formed. Harris et al. (1999) performed tests matching visual and vestibular input and concluded that developers should support vestibular inputs corresponding to at least one-quarter of the amount of visual motion. However, statically tilting the user’s body has been shown to have a positive effect on self-motion perception in some cases. In a small study, Nakamura and Shimojo (1998) found that vertical self-motion increases as the user is tilted further back in a chair. In a more recent study, Kruijff et al. (2015) found that statically leaning forward increased perceived distances traveled, likely due to an increase in perceived speed of self-motion. Additionally, Kruijff et al. (2016) found that dynamically leaning enhanced self-motion sensations.

Multisensory Output

In addition to the visual and vestibular systems, developers might want to experiment with multisensory output (i.e., adding auditory, tactile, or other multimodal stimuli) to deliver wayfinding cues. Audio can provide the user with useful directional and distance information (Davis et al. 1999). For example, the sound of trains can indicate the direction to the station, whereas the volume allows the user to estimate the distance to the station. Audio for wayfinding support is still a largely open question. Another form of multisensory support is the tactile map—a map whose contours are raised so they can be sensed by touch as well as sight. Initial experiments used tactile maps to fill in gaps in the spatial knowledge of visually impaired people. The tactile map was used as an additional cue, not as a substitute for another cue type (Jacobson 1996). Tan et al. (2002) showed that tactile cues can aid in the formation and usage of spatial memory, so tactile wayfinding aids are another area of great potential.

Presence

The sense of presence (the feeling of “being there”) is a much-explored but still not well-understood phenomenon that is assumed to have an impact on spatial knowledge. Briefly, the idea is that if the user feels more present in a virtual world, then real-world wayfinding cues will be more effective. Many factors influence the sense of presence, including sensory immersion, proprioception, and the tendency of the user to become immersed. For example, in the study conducted by Usoh et al. (1999), real walking increased the sense of presence considerably compared to walking in place. The inclusion of a virtual body—the user’s own virtual representation—may also enhance the sense of presence, which in turn has a positive effect on spatial knowledge acquisition and usage (Draper 1995; Usoh et al. 1999). See Chapter 11, “Evaluation of 3D User Interfaces,” for more discussion of presence.

Search Strategies

A final user-centered wayfinding technique is to teach the user to employ an effective search strategy. The choice of search strategy often depends on user skill. More skilled users, such as professional aviators, use different strategies than users with limited navigation experience. Not only do skilled users depend on other kinds of spatial knowledge and therefore on other cues in the environment, but they often use different search patterns as well. Whereas novice users depend largely on landmarks, skilled users make use of cues like paths (e.g., a coastline).

Adopting a search strategy inspired by expert navigators can make searching more effective. For example, search patterns such as those used during aviation search-and-rescue missions may aid a user during wayfinding (Wiseman 1995). The basic line search follows a pattern of parallel lines. The pattern search starts at a specific central point and moves progressively farther away from it in square or radial patterns. The contour search is designed to follow contours in a landscape, like a river or a mountain. Finally, the fan search starts from a center point and fans out in all directions until the target is found. Of course, the use of these search strategies depends on the content of the environment—they might work well in a large outdoor environment but would not make sense in a virtual building.
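
To make the line search concrete, the following Python sketch generates waypoints for a parallel-track (“lawnmower”) search over a rectangular area. The function name, parameters, and values are illustrative only.

# Sketch: generating waypoints for a basic line ("lawnmower") search over a
# rectangular area. Names and parameters are illustrative, not from any
# particular toolkit.

def line_search_waypoints(x_min, x_max, z_min, z_max, spacing):
    """Return a list of (x, z) waypoints covering the area in parallel tracks."""
    waypoints = []
    x = x_min
    forward = True
    while x <= x_max:
        if forward:
            waypoints.append((x, z_min))
            waypoints.append((x, z_max))
        else:
            waypoints.append((x, z_max))
            waypoints.append((x, z_min))
        forward = not forward
        x += spacing          # move over one track width
    return waypoints

# Example: cover a 100 m x 60 m site with tracks 10 m apart.
route = line_search_waypoints(0, 100, 0, 60, 10)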

Another important search strategy is to obtain a bird’s-eye view of the environment rather than performing all navigation on the ground. Users can be trained to employ this strategy quite easily, and it results in significantly better spatial orientation (Bowman et al. 1999). This can even be automated for the user. In the “pop-up” technique (Darken and Goerger 1999), users can press a button to temporarily move to a significant height above the ground, and then press the button again to go back to their original location on the ground.
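
A minimal sketch of the pop-up idea is shown below, assuming a hypothetical viewpoint object whose position can be read and written; pressing the button toggles between the saved ground position and a temporary elevated position.

# Sketch of a "pop-up" toggle: press a button to jump to a bird's-eye view,
# press again to return. The viewpoint object and its position attribute are
# hypothetical stand-ins for whatever the rendering engine provides.

POPUP_HEIGHT = 50.0   # meters above the ground position (illustrative value)

class PopUpController:
    def __init__(self, viewpoint):
        self.viewpoint = viewpoint          # assumed to have .position = [x, y, z], y up
        self.ground_position = None         # saved when popped up

    def toggle(self):
        if self.ground_position is None:
            # Save where we were and move straight up for an overview.
            self.ground_position = list(self.viewpoint.position)
            self.viewpoint.position[1] += POPUP_HEIGHT
        else:
            # Return to the saved ground location.
            self.viewpoint.position = self.ground_position
            self.ground_position = None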

Even though the pattern search strategies described above are seen primarily in expert navigators, we assume that novice users can learn them as well. Placing a rectangular or radial grid directly in the environment provides visual paths along which users can search. Although these grids may supply directional and depth cues, they do not force users to employ a good search strategy.

8.9.2 Environment-Centered Wayfinding Cues

Environment-centered wayfinding cues refer to the conscious design of the virtual world to support wayfinding. Beyond the technology and training support described above, most wayfinding aids for virtual worlds can be directly related to aids from the real world. These range from natural environmental cues, like a high mountain, to artificial cues, such as a map. In this section, we discuss several environment-centered wayfinding cues:

environment legibility

landmarks

maps

compasses

signs

trails

reference objects

Environment Legibility

Just as urban planners design cities in the real world to be navigable, virtual worlds can be designed to support wayfinding. In his book, Lynch (1960) describes several legibility techniques that serve as principles for urban design. These techniques allow the user to quickly obtain an understanding of an environment by understanding its basic structural elements. Lynch identified five basic building blocks that can be applied to design a legible environment: paths, edges, districts, nodes, and landmarks. Paths are elements or channels for linear movement, like streets or railways. People often view a city from the perspective of such paths. Edges are related to paths, but are focused on bordering spaces rather than on movement. These edges can be natural, like a river, or artificial, like a walled structure. Districts are areas that are uniquely identifiable because of their style (e.g., type of buildings), color, or lighting. Nodes are gathering points, such as a major intersection of streets, or the entrance to a certain district. Finally, landmarks are static objects that are easily distinguished and often placed near a node (Darken and Sibert 1996). Landmarks are crucial enough to wayfinding that we discuss them in further detail below.

Landmarks

Landmarks are easily distinguishable objects that can be used to maintain spatial orientation, develop landmark and route knowledge, and serve as foundations for distance and direction estimation. Although landmarks are naturally part of a legible environment design, artificial landmarks may be added to any environment to support wayfinding. When placing a landmark, it is important to position it where it can be easily spotted, such as on a street corner rather than in the middle of a city block. It is also important to consider whether the landmark will serve as a global or local landmark. Global landmarks are visible from practically any location, so they provide directional cues. Local landmarks help users in the decision-making process by providing useful information when a decision point is reached (Steck and Mallot 2000). Finally, users should be able to distinguish a landmark from other surrounding objects within the environment. This can be accomplished by altering its visual characteristics, such as using a different color, texture, lighting, shape, or size.

Maps

One of the most common wayfinding aids is the map. Although simple in concept, the design of maps for wayfinding in 3D UIs is surprisingly complex. First, the designer should realize that the map can be dynamic because it is virtual. This means that you-are-here markers can be continuously displayed. The map can be updated if the environment changes. Paths to the next destination can be overlaid on the map (or the world itself). The map can even be rotated to face the direction the user is facing or zoomed to show only the local area.

With all this added flexibility come some difficult design choices, as clutter and confusion should be avoided. In general, it is recommended to use you-are-here markers and to show an up-to-date version of the environment on the map. However, designers should be careful about automatically rotating or zooming the map, as this can cause confusion, depending on the user’s task, or even inhibit the formation of survey knowledge (Darken and Cevik 1999).
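
As a simple illustration, the following sketch computes a you-are-here marker position in map coordinates and a “forward-up” rotation for the map from the tracked user pose. The world-to-map scale, coordinate conventions, and the map widget calls in the comments are hypothetical assumptions.

# Sketch: placing a you-are-here marker on a 2D map and (optionally) rotating
# the map so the user's heading points "up". The scale, origin, and widget
# API are assumptions, not from a specific toolkit.

def world_to_map(user_x, user_z, map_origin, map_scale):
    """Map the user's position on the ground plane to map (u, v) coordinates."""
    u = (user_x - map_origin[0]) * map_scale
    v = (user_z - map_origin[1]) * map_scale
    return u, v

def forward_up_rotation(user_heading_radians):
    """Rotation to apply to the map image so the user's heading points up."""
    return -user_heading_radians

# Per-frame update (illustrative):
# u, v = world_to_map(tracker.x, tracker.z, map_origin=(0, 0), map_scale=0.01)
# map_widget.set_marker(u, v)
# map_widget.set_rotation(forward_up_rotation(tracker.heading))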

Another design choice is where and when to display the map. On small AR or VR displays, a legible map may take up a large percentage of the display area. At the least, a mechanism to display or hide the map should be provided. One effective technique is to attach the virtual map to the user’s hand or a handheld tool, so that the map can be viewed or put away with natural hand motions. Figure 8.21 illustrates a map attached to the user’s hand in this way. Unlike most maps, this map is not a spatial representation of the 3D environment, but rather a hierarchical representation showing objects at different levels of scale (Bacim et al. 2009).

Figure 8.21 A hierarchical map used as a wayfinding aid in a multiscale 3D environment. (Image courtesy of Felipe Bacim)

Compasses

A compass primarily serves to provide directional cues. For a trained navigator, a compass in combination with a map is an invaluable wayfinding tool. However, most users of 3D UIs will not be familiar with effective methods for using compass information. As a VE wayfinding aid, compasses are typically found in navigation training tools, such as those used in the military.

Figure 8.22 Examples of signs (left) and local landmarks (right) from the real world. (Photograph courtesy of Ernst Kruijff)

Signs

Signs are used extensively in real-world environments to provide spatial knowledge and directions (Figure 8.22), but surprisingly there is little research on the use of signs as a wayfinding cue in VEs. Signs can be extremely effective because of their directness, but signs can also become confusing in complex environments (think about badly designed airports). Signs should be placed in an easily observable location, should supply clear directions, and should be spaced far enough apart that multiple signs do not confuse the user.

Trails

In order to help the user “retrace his steps” in an environment or to show which parts of the world have been visited, trails can be included as an artificial wayfinding aid. A trail can be a simple line or a series of markers that include directional information, like footprints in the real world. A trail can be placed directly into the environment but can also be shown on a map (Darken and Peterson 2002; Grammenos et al. 2002).
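
A trail of this kind can be implemented by dropping a marker whenever the user has moved some minimum distance since the last one. The following sketch shows one way to do this; the spacing value and marker representation are illustrative assumptions, and rendering the footprints is left to the application.

import math

# Sketch of a breadcrumb trail: drop a directional marker whenever the user
# has moved a minimum distance since the last marker.

MARKER_SPACING = 2.0   # meters between footprints (illustrative)

class Trail:
    def __init__(self):
        self.markers = []            # list of (position, heading) tuples
        self.last_position = None

    def update(self, position, heading):
        if (self.last_position is None or
                self._distance(position, self.last_position) >= MARKER_SPACING):
            self.markers.append((position, heading))   # footprint with direction
            self.last_position = position

    @staticmethod
    def _distance(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))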

Reference Objects

Reference objects are objects that have a well-known size, such as a chair or a human figure, and aid in size and distance estimation. Users often have difficulty judging distances in large, mostly empty environments, and VR systems have well-known deficiencies in distance perception (Renner et al. 2013); distances are often significantly underestimated or overestimated. When reference objects are placed in such a space, estimating sizes and distances becomes easier.

8.9.3 Combining Travel and Wayfinding Techniques

Since travel and wayfinding are intimately linked, techniques for these two tasks should be integrated if possible. In some cases, hardware can provide this directly. For example, a treadmill couples a method of travel with a vestibular feedback component. Other techniques have inherent proprioceptive cues. Gaze-directed steering, for example, supplies directional information via head-centric cues. Wayfinding aids may actually be part of the travel technique. For example, the World-in-Miniature technique combines a 3D map with a route-planning travel metaphor (section 8.7.1). Finally, wayfinding aids can be placed in the environment near the focus of the user’s attention during travel. For example, a small compass can be attached to the (real or virtual) tip of a stylus when the pointing technique (section 8.5.1) is used.

8.10 Design Guidelines

This chapter has provided a large number of potential techniques for travel in 3D environments. One reason there are so many techniques for this and other 3D interaction tasks is that no single technique is best for all applications and situations. Therefore, as in the other chapters in Part IV, we present some general guidelines to help the designer choose an appropriate travel technique for a given application.


Tip

Match the travel technique to the application.


The authors are of the opinion that no set of 3D interaction techniques is perfect for all applications. Designers must carefully consider the travel tasks that will be performed in the application (section 8.2), what performance is required for travel tasks, in what environment travel will take place, and who will be using the application.

Example: A visualization of a molecular structure uses an exocentric point of view and has a single object as its focus. Therefore, an object-inspection technique such as orbital viewing is appropriate.


Tip

Consider both natural and magic techniques.


Many designers start with the assumption that the 3D interface should be as natural as possible. This may be true for certain applications that require high levels of realism, such as military training, but many other applications have no such requirement. Nonisomorphic “magic” travel techniques may prove much more efficient and usable.

Example: The task of furniture layout for an interior design application requires multiple viewpoints but not natural travel. A magic technique such as scaled-world grab (section 8.8.5) will be efficient and will also integrate well with the object manipulation task.


Tip

Use an appropriate combination of travel technique, display devices, and input devices.


The travel technique cannot be chosen separately from the hardware used in the system.

Example: If a personal surround-screen display is used, a vehicle-steering metaphor for travel fits the physical characteristics of this display. If a pointing technique is chosen, then an input device with clear shape and tactile orientation cues, such as a stylus, should be used instead of a symmetric device, such as a ball.


Tip

Choose travel techniques that can be easily integrated with other interaction techniques in the application.


Similarly, the travel technique cannot be isolated from the rest of the 3D interface. The travel technique chosen for an application must integrate well with techniques for selection, manipulation, and system control.

Example: A travel technique involving manipulation of a user representation on a 2D map suggests that virtual objects could be manipulated in the same way, providing consistency in the interface.


Tip

Provide multiple travel techniques to support different travel tasks in the same application.


Many applications include a range of travel tasks. It is tempting to design one complex technique that meets the requirements for all these tasks, but including multiple simple travel techniques is often less confusing to the user. Some VR applications already do this implicitly, because the user can physically walk in a small local area but must use a virtual travel technique to move longer distances. It will also be appropriate in some cases to include both steering and target-based techniques in the same application. We should note, however, that users will not automatically know which technique to use in a given situation, so some small amount of user training may be required.

Example: An immersive design application in which the user verifies and modifies the design of a large building requires both large-scale search and small-scale maneuvering. A target-based technique to move the user quickly to different floors of the building plus a low-speed steering technique for maneuvering might be appropriate.


Tip

Make simple travel tasks easier by using target-based techniques for goal-oriented travel and steering techniques for exploration and search.


If the user’s goal for travel is not complex, then the travel technique providing the solution to that goal should not be complex either. If most travel in the environment is from one object or well-known location to another, then a target-based travel technique is most appropriate. If the goal is exploration or search, a steering technique makes sense.

Example: In a virtual tour of a historical environment, the important areas are well known, and exploration is not required, so a target-based technique such as choosing a location from a menu would be appropriate. In a visualization of weather patterns, the interesting views are unknown, so exploration should be supported with a steering technique such as pointing.


Tip

Use a physical locomotion technique if user exertion or naturalism is required.


Real walking and redirected walking require large tracked areas, and physical locomotion devices can have serious usability issues. However, for some applications, especially training, where the physical motion and exertion of the user is an integral part of the task, such a device is required. First, however, consider whether your application might make do with something simpler, like walking in place.

Example: A sports training application will be effective only if the user physically exerts himself, so a locomotion device should be used.


Tip

The most common travel tasks should require a minimum amount of effort from the user.


Depending on the application, environment, and user goals, particular types of travel are likely to be common in a specific system, while others will only be used infrequently. The default navigation mode or controls should focus on the most common tasks.

Example: In a desktop 3D game in an indoor environment, most travel will be parallel to the floor, and users very rarely need to roll the camera. Therefore, a navigation technique that uses left and right mouse movement for camera yaw, up and down movement for camera pitch, and arrow keys to move the viewpoint forward, backward, left, and right would be appropriate for this application.
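
A sketch of this mapping, assuming a hypothetical camera and keyboard-state API, might look like the following; the sensitivity and speed constants are illustrative.

import math

# Sketch of the mapping described above: mouse movement controls yaw and
# pitch, arrow keys translate the viewpoint parallel to the floor. The
# `camera` and `keys` objects are hypothetical stand-ins for an engine API.

MOUSE_SENSITIVITY = 0.002   # radians per pixel (illustrative)
MOVE_SPEED = 3.0            # meters per second (illustrative)

def update_camera(camera, mouse_dx, mouse_dy, keys, dt):
    # Look around: horizontal mouse motion controls yaw, vertical controls pitch.
    camera.yaw += mouse_dx * MOUSE_SENSITIVITY
    camera.pitch = max(-math.pi / 2, min(math.pi / 2,
                       camera.pitch - mouse_dy * MOUSE_SENSITIVITY))

    # Movement stays parallel to the floor regardless of pitch.
    forward_amount = (1 if keys.up else 0) - (1 if keys.down else 0)
    strafe_amount = (1 if keys.right else 0) - (1 if keys.left else 0)
    fx, fz = math.sin(camera.yaw), math.cos(camera.yaw)      # forward direction
    rx, rz = math.cos(camera.yaw), -math.sin(camera.yaw)     # right direction
    camera.x += (fx * forward_amount + rx * strafe_amount) * MOVE_SPEED * dt
    camera.z += (fz * forward_amount + rz * strafe_amount) * MOVE_SPEED * dt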


Tip

Use high-speed transitional motions, not instant teleportation, if overall environment context is important.


Simply providing a smooth path from one location to another will increase the user’s spatial knowledge and keep her oriented to the environment (Bowman et al. 1997). The movement can be at high speed to help avoid cybersickness. This approach complements the use of wayfinding aids. Teleportation should only be used in cases where knowledge of the surrounding environment is not important.

Example: Unstructured environments, such as undeveloped terrain or information visualizations, can be quite confusing. Give users spatial context by animating their target-based travel with a brief, high-speed motion.
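
One way to realize such a transitional motion is to interpolate the viewpoint along a straight path over a short, fixed duration with eased acceleration. The following sketch assumes a hypothetical callback for setting the viewpoint position; the duration and easing are illustrative choices.

# Sketch of a brief, high-speed transitional motion for target-based travel:
# instead of instantly jumping, interpolate the viewpoint along a straight
# path over a short, fixed duration.

TRANSITION_TIME = 0.6   # seconds; short enough to feel fast, long enough to show the path

def lerp(a, b, t):
    return tuple(p + (q - p) * t for p, q in zip(a, b))

def transition(start, target, elapsed, set_viewpoint_position):
    """Call once per frame with the time elapsed since the transition began."""
    t = min(elapsed / TRANSITION_TIME, 1.0)
    # Ease in/out so acceleration is not abrupt at the endpoints.
    eased = t * t * (3.0 - 2.0 * t)
    set_viewpoint_position(lerp(start, target, eased))
    return t >= 1.0   # True when the transition is finished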


Tip

Train users in sophisticated strategies to help them acquire survey knowledge.


If spatial orientation is especially important in an application, there are some simple strategies that will help users obtain spatial knowledge (Bowman, Davis et al. 1999). These include flying up to get a bird’s-eye view of the environment, traversing the environment in a structured fashion (see section 8.9.1), retracing paths to see the same part of the environment from the opposite perspective, and stopping to look around during travel. Users can easily be trained to perform these strategies in unfamiliar environments.

Example: Training soldiers on a virtual mockup of an actual battlefield location should result in increased spatial knowledge of that location. If the soldiers are trained to use specific spatial orientation strategies, their spatial knowledge in the physical location should improve.


Tip

If a map is used, provide a you-are-here marker.


You-are-here (YAH) maps combine a map with a YAH marker. Such a marker helps the user gain spatial awareness by dynamically showing her viewpoint position and/or orientation on the map. This means that the marker must be continuously updated so that the user can match her egocentric viewpoint with the exocentric view provided by the map.

8.11 Case Studies

In this section, we discuss the design of the travel techniques and navigation interfaces for our two case studies. For background on the case studies, see Chapter 2, section 2.4.

8.11.1 VR Gaming Case Study

Navigation in our VR game is simultaneously the simplest and most complicated interaction to design. We want the primary navigation technique to be real walking, to give the user a heightened sense of presence in the world of the game. Of course, the tricky part is giving users the impression of a vast virtual space, when in reality, they may only have access to a small tracked physical play area. In addition, we don’t want users to constantly be reminded that they are having to avoid the physical boundaries of the space, since that would distract them from being engaged with the story.

The most common solutions to this problem are less than satisfactory. Walking in place is simply not realistic or controllable enough for the game we envision. Redirected walking isn’t possible in tracked areas the size of most people’s living rooms. Applying a scale factor to walking so that the physical space maps directly to the virtual space doesn’t make sense when the virtual space is so much larger (and when it contains multiple floors), and it’s also difficult to use for fine-grained maneuvering tasks. Teleportation, while popular in many VR games, can be jarring and disorienting. Players have to plan for where to teleport so that they will be able to move away from a physical boundary (wall) once they arrive, which is cognitively demanding and reminds players of the real world, breaking presence. So the typical behavior with teleportation is to always stand in the center of the space and teleport every time you need to move even a little bit.

Let’s break the problem down a bit. First, consider navigating within a single virtual room. Assume that the game requires a minimum play area of 2x2 meters but that virtual rooms can be bigger than that. To allow players to physically walk around the entire virtual room, we could use many of the travel techniques described in the chapter, all of which come with tradeoffs. In general, purely virtual techniques would cause users to become reliant on virtual travel, so that they rarely perform physical walking. Scaled walking techniques could work within a single room, but could lead to sickness and lack of control.

In the end, we decided on a modified form of teleportation that still encourages (actually, requires) physical walking. We will design our virtual rooms to be composed of multiple cells, with each cell being the size of the tracked play area. Obstacles on the floor (chasms, debris, etc.) will indicate that the user can’t walk beyond the border of the cell. To get to a different cell, users will gaze at a special object in the cell and perform some trigger action (a gesture or button press) that will cause a rapid movement to that cell.

The key is that the movement will start at the current location in the current cell and end at the corresponding location in the target cell. In this way, the mapping of the physical play area to the cell boundaries is still correct, and users can reach the entire cell by physically walking around it. Since users don’t get to choose the exact target location of the movement, they have to walk within the cell to get access to the virtual objects therein.
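
The core of this technique is a simple offset-preserving mapping: the user’s position relative to the current cell’s origin is reapplied relative to the target cell’s origin. The following sketch illustrates the idea; the coordinate conventions and cell origins are assumptions for illustration.

# Sketch of the cell-to-cell movement described above: the user's offset
# within the current cell (and thus within the tracked play area) is
# preserved, so the play-area-to-cell mapping stays valid after the move.

def move_to_cell(user_world_pos, current_cell_origin, target_cell_origin):
    """Return the user's new world position in the target cell."""
    # Offset of the user within the current cell = offset within the play area.
    offset = tuple(u - c for u, c in zip(user_world_pos, current_cell_origin))
    # Apply the same offset relative to the target cell's origin.
    return tuple(t + o for t, o in zip(target_cell_origin, offset))

# Example: the user stands 1.2 m along x and 0.5 m along z from the current
# cell's corner; after the triggered movement they are at the same offset in
# the target cell, and physical walking still maps onto the new cell correctly.
new_pos = move_to_cell((3.2, 0.0, 2.5), (2.0, 0.0, 2.0), (6.0, 0.0, 2.0))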

How does this technique fit into the story of the game? Many narratives are possible. For example, we could say that you have a bird as a companion, and when you raise your hand the bird comes and picks you up and takes you to the part of the room you’re looking at. This would explain how you get past the barriers and also why you’re not in complete control of where you end up.

So what about moving from room to room? Since doors will by definition be near the boundaries of the physical space, moving through a virtual door will put players in a position where they can’t move any further into the new room without running into the wall. Again, we could use teleportation to move from the center of one room to the center of an adjoining room, but we think a more creative solution that could actually enhance the story is possible.

Imagine that all the doors in our fictional hotel have been sealed shut by the bad guys. But the designers of the hotel, envisioning just such a scenario, included some secret passages that we can use. These take the form of those fake bookshelves you’ve seen in dozens of bad spy flicks, where pressing a hidden lever causes the bookcase to spin around into the room on the other side of the wall. So to go through a wall, players have to stand on a semicircular platform next to the bookshelf and activate it somehow. Then the bookshelf and platform rotate them into the room on the other side. In this way, the user-room relationship is correct, and the user can physically walk back into the play area (Figure 8.23). This should also be fun and engaging, and sound effects will help as well.

Figure 8.23 Rotating the virtual game environment relative to the physical tracking space, illustrated by the blue volume, using the bookshelf metaphor. (Image courtesy of Ryan P. McMahan)

When using the rotating bookshelf technique, we want to avoid making the user feel sick due to visual-vestibular mismatch. We suggest providing a clear view not only of the current room but also the room behind the bookshelf. For example, we could cut a couple of holes through the bookshelf so users can see the room beyond before they activate the technique. If the user knows where she’s about to go and sees the rotation through the holes (with the bookshelf as a fixed frame of reference), that should reduce feelings of sickness.
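
In implementation terms, because the physical player cannot be moved, the rotation can be realized by rotating the virtual environment around a pivot at the bookshelf over a short animation. The sketch below assumes a hypothetical world-transform call and illustrative timing; it is one possible realization, not the only one.

import math

# Sketch of the rotating-bookshelf transition: the virtual environment is
# rotated 180 degrees about a vertical axis through the bookshelf pivot,
# which is equivalent to the platform carrying the user into the adjacent
# room. `world.set_yaw_about` is a hypothetical engine call that applies an
# absolute rotation (relative to the start of the transition) about the pivot.

ROTATION_DURATION = 2.0   # seconds for the half-turn (illustrative)

def update_bookshelf(world, pivot_xz, elapsed):
    """Call once per frame; returns True when the 180-degree turn is complete."""
    t = min(elapsed / ROTATION_DURATION, 1.0)
    world.set_yaw_about(pivot_xz, math.pi * t)   # 0 .. 180 degrees
    return t >= 1.0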

The last piece we have to consider is vertical navigation from one floor to the next. In a play area with only a flat floor, it would be very difficult to do believable stairs, although there are some interesting techniques for climbing virtual ladders (Lai et al. 2015; see section 8.8.3). In a hotel environment, however, virtual elevators are the obvious choice.

Like all of the designs we’ve presented in our case study, this is only one of many possible ways to interact effectively in our VR game. But it does provide some unique ways to navigate the large environment of this virtual hotel in a way that should seem natural and fluid, despite the challenges of a limited physical play area.

Key Concepts

The key concepts that we used for travel in the virtual reality gaming case study were:

Natural physical movements for navigation can enhance the sense of presence.

Even with a limited tracking area, consider ways to allow and encourage the use of a physical walking metaphor.

If the application allows, use story elements to help users make sense of travel techniques.

8.11.2 Mobile AR Case Study

On-site exploration of environmental data implies that users may explore sites of different scales. In HYDROSYS, the scales varied widely, from small sites a couple of hundred meters across to sites spanning several kilometers. While you can certainly walk across such sites, larger sites come with inherent issues that may limit the awareness and understanding of the available augmented data. For example, consider a site with changes in elevation: it is unlikely you can see the whole environment from a single perspective due to occlusion. Furthermore, the cameras used for handheld AR often provide only a narrow view of the site.

We addressed these limitations by developing a multiview camera navigation system. In other words, we deployed multiple cameras to allow users to adopt different viewpoints on the site, rather than limiting them to the view from their own physical location. The system included the cameras of the handheld units, but also a pan-tilt unit and several cameras mounted underneath a large blimp. Our system enabled the user not only to view the site from different perspectives to gain better awareness, but also to share viewpoints with other users at the site, supporting collaborative work.

Implementing the right techniques to support multiview navigation encompasses both cognitive (wayfinding) and performance issues. To advance spatial awareness, we had to develop techniques that enhance overview possibilities and deal with occlusion issues, while conveying correct spatial relations (Veas et al. 2012). Users had to clearly understand the spatial features of the site to be able to interpret the augmented information correctly. We assumed that multiple camera viewpoints could provide a better overview of the site that could aid in creating a more accurate mental map of the environment. This principle is well known from the area of surveillance systems, but our system approach differs from such systems. In surveillance systems, the cameras are mostly static, while the observer is expected to have good knowledge of the site and the location of the cameras around the site. In contrast, the HYDROSYS setup also consisted of dynamic cameras and thus dynamic viewpoints, which may not easily be processed by the user, as she may not know the site, and the information displayed in a camera image might be difficult to match to the mental map.

Processing of the spatial information into the mental map is affected by discrepancies between the camera view and the user’s own first-person view (Veas et al. 2010), including differences in camera position and orientation and in the visibility of objects and locations. Clearly, camera views that differ from the user’s own point of view and that show another part of the scene are more difficult to process mentally. The key to an effective navigation technique is the approach used for traversing between the local and remote views. Such an approach might provide additional data, such as maps or 3D models, that convey the spatial configuration of a site, but that information might not always be readily available, or it might be difficult to match map or model data to ad hoc views.

In an initial evaluation (Veas et al. 2010), we showed that users prefer a technique that imposes lower workload if it allows them to perform reasonably well. Still, the evaluation also showed room for improvement in our initial techniques.

In the final system, we took a hybrid AR and 3D model exploration approach (Veas et al. 2012). Users could assess the locations of the different cameras by either looking through the lens of the selected camera or by switching to a 3D exploration mode. In the latter, the user could freely explore the underlying 3D model of the site at hand, similar to exploring Google Maps in 3D. This offers the advantages of resolving occlusion more easily and providing a good overview, since users do not need to stick to the available camera viewpoints. However, the 3D model has the disadvantage of being a static and potentially outdated representation of the environment.

The interface showed the camera viewpoints (Figure 8.24) with regularly updated thumbnails of their video footage, and a mini-map could also be displayed. The user could switch to another camera’s viewpoint by selecting it directly or in a menu.

We also experimented with a method called variable perspective. This mode is an AR visualization technique (Figure 8.25) developed to combine views from different perspectives in a single image. It seamlessly blends between two different cameras: the first-person point of view and a second camera that can be rotated to visually “flip” the scene at further distances. This effect is similar to a famous scene from the movie Inception, in which the world is bent towards the user. The environment simply “curls upwards” at the switch point between first- and third-person perspective, thus presenting a gradual transition between both viewpoints. If the third-person view comes from a bird’s-eye view, this helps users see parts of the site that might be occluded from the first-person view.

Figure 8.24 A viewpoint and corresponding video footage from a remote user is displayed in the 3D model. The user can travel to the remote viewpoint by clicking on the camera. (Image courtesy of Eduardo Veas and Ernst Kruijff)

Figure 8.25 Variable perspective visualization. The first-person viewpoint (mc in the diagram) is shown on the bottom, and the remote viewpoint (sc) is shown on the top. The remote viewpoint can be moved and “flipped” to define the (exocentric) angle at which the distant parts of the site can be viewed. (Image courtesy of Eduardo Veas and Ernst Kruijff)
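
The “curl upwards” warp can be sketched as a bend applied to geometry beyond a switch distance, rotating it toward the viewer so that distant parts of the site are seen more from above. The following Python sketch is a simplified illustration of this idea with made-up parameters; it is not the HYDROSYS implementation.

import math

# Sketch of the "curl upwards" idea behind variable perspective: geometry
# nearer than a switch distance is left untouched (first-person view), while
# geometry beyond it is progressively rotated upward about a horizontal axis
# at the switch point, so distant parts of the site face the viewer more as
# in a bird's-eye view. All parameters are illustrative.

def curl_point(depth, height, switch_distance, blend_length=10.0,
               max_angle=math.radians(70)):
    """Bend a point given its depth (distance along the view direction) and height."""
    if depth <= switch_distance:
        return depth, height                     # near field: first-person, unchanged
    # Ramp the bend angle up over a blend zone so the transition is gradual.
    excess = depth - switch_distance
    angle = max_angle * min(excess / blend_length, 1.0)
    bent_depth = switch_distance + excess * math.cos(angle) - height * math.sin(angle)
    bent_height = excess * math.sin(angle) + height * math.cos(angle)
    return bent_depth, bent_height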

Key Concepts

The key lessons that we learned from the mobile AR case study for travel were:

Situation awareness: creating a good mental map of the observed environment is crucial to adequately making use of the augmented information within. The acquisition and processing of spatial knowledge at larger-scale sites can be an issue, particularly if users have not visited the site before. The 3D UI needs to provide users with an overview, using techniques such as mini-maps and multi-camera setups.

Multiview: the use of multi-camera systems can help by providing an overview and resolving occlusions. However, we need to carefully consider how users switch between viewpoints, as switching can result in confusion. Combining AR and 3D exploration modes can be beneficial for providing contextual information about the spatial relationships between the cameras.

8.12 Conclusion

In this chapter, we have discussed how both 3D travel techniques and wayfinding affect navigation in 3D UIs. We have presented four categories of travel metaphors, including walking, steering, selection-based travel, and manipulation-based travel. Within each of these categories, we have presented multiple techniques, each with its own approach to realizing its travel metaphor. We have also discussed several design aspects of travel techniques, including viewpoint orientation, velocity specification, vertical travel, semiautomated travel, scaling the world, travel modes, handling multiple cameras, and nonphysical input. We then discussed how wayfinding affects navigation in 3D environments, including user-centered and environment-centered wayfinding cues. Finally, we discussed several guidelines for designing navigation interfaces for 3D UIs and how those guidelines influenced our two case studies. The upcoming chapter concludes Part IV with an in-depth treatment of system control in 3D UIs.

Recommended Reading

For an excellent overview of locomotion devices, we recommend the following:

Hollerbach, J. (2002). “Locomotion Interfaces.” In K. Stanney (ed.), Handbook of Virtual Environments: Design, Implementation, and Applications, 239–254. Mahwah, NJ: Lawrence Erlbaum Associates.

Steinicke, F., Y. Visell, J. Campos, and A. Lécuyer (2013). Human Walking in Virtual Environments: Perception, Technologies, and Applications. New York: Springer.

An informative presentation on the implementation of travel techniques for desktop 3D interfaces can be found in this text:

Barrilleaux, J. (2000). 3D User Interfaces with Java 3D. Greenwich, CT: Manning.

Readers interested in more details on the empirical performance of common travel techniques should take a look at the following:

Bowman, D., D. Johnson, and L. Hodges (2001). “Testbed Evaluation of VE Interaction Techniques.” Presence: Teleoperators and Virtual Environments 10(1): 75–95.

For an introduction to the effects of sense of presence on wayfinding, we recommend reading:

Regenbrecht, H., T. Schubert, and F. Friedman (1998). “Measuring the Sense of Presence and Its Relations to Fear of Heights in Virtual Environments.” International Journal of Human-Computer Interaction 10(3): 233–250.

Usoh, M., K. Arthur, M. Whitton, R. Bastos, A. Steed, M. Slater, and F. Brooks Jr. (1999). “Walking > Walking-in-Place > Flying in Virtual Environments.” Proceedings of SIGGRAPH ’99, 359–364.

For an example of a study on the effects of wayfinding in training transfer, we recommend reading:

Darken, R., and W. Banker (1998). “Navigating in Natural Environments: A Virtual Environment Training Transfer Study.” Proceedings of the 1998 IEEE Virtual Reality Annual International Symposium (VRAIS ’98), 12–19.
