Chapter 13. Metaphors, Idioms, and Affordances

Some interface designers speak of finding the right metaphor upon which to base their interface designs. They imagine that filling their interface with images of familiar objects from the real world will give their users a pipeline to easy learning. So, they create an interface masquerading as an office filled with desks, file cabinets, telephones, and address books, or as a pad of paper or a street of buildings. If you, too, search for that magic metaphor, you will be in august company. Some of the best and brightest designers in the interface world consider metaphor selection as one of their first and most important tasks.

We find this very literal approach to be limiting and potentially problematic. Strict adherence to metaphors ties interfaces unnecessarily to the workings of the physical world. One of the most fantastic things about digital products is that the working model presented to users need not be bound by the limitations of physics and the inherent messiness of real three-dimensional space.

User interfaces based on metaphors have a host of other problems as well: There aren’t enough good metaphors to go around, they don’t scale well, and the ability of users to recognize them is often questionable, especially across cultural boundaries. Metaphors, especially physical and spatial metaphors, have an extremely limited place in the design of most Information-Age, software-enabled products. In this chapter, we discuss the reasons for this, as well as the alternatives to design based on metaphors.

Interface Paradigms

There are three dominant paradigms in the conceptual and visual design of user interfaces: implementation-centric, metaphoric, and idiomatic. Implementation-centric interfaces are based on understanding how things actually work under the hood — a difficult proposition. Metaphoric interfaces are based on intuiting how things work — a risky method. Idiomatic interfaces, however, are based on learning how to accomplish things — a natural, human process.

The field of user-interface design progressed from a heavy focus on technology (implementation) to an equally heavy focus on metaphor. There is ample evidence of all three paradigms in contemporary software design, even though the metaphoric paradigm is the only one that has been named and described. Although metaphors are great tools for humans to communicate with each other (this book is filled with them), they are weak tools for the design of software, and all too often they hamper the creation of truly superior interfaces.

Implementation-centric interfaces

Implementation-centric user interfaces are widespread in the computer industry. These interfaces are expressed in terms of their construction, of how they are built. In order to successfully use them, users must understand how the software works internally. Following the implementation-centric paradigm means user-interface design based exclusively on the implementation model.

The overwhelming majority of software programs today are implementation centric in that they show us, without any hint of shame, precisely how they are built. There is one button per function, one dialog per module of code, and the commands and processes precisely echo the internal data structures and algorithms.
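
To make the contrast concrete, here is a minimal, hypothetical sketch (in TypeScript, with invented setting names) of the same mail-checking preference expressed two ways. The implementation-centric version forces the user to understand the mechanism; the goal-directed version leaves that translation to the program, where it belongs:

  // Implementation-centric: the UI mirrors the internals, so configuring
  // "check my mail" requires understanding protocols and polling.
  interface ImplementationCentricSettings {
    imapIdleEnabled: boolean;   // exposes the IMAP mechanism itself
    pollIntervalMs: number;     // the engineer's unit, not the user's
  }

  // Goal-directed: the UI reflects what the user wants to accomplish.
  type MailCheckPolicy = "as-it-arrives" | "every-few-minutes" | "manually";

  // Mapping the user's goal onto the mechanism is the program's job.
  function toImplementation(policy: MailCheckPolicy): ImplementationCentricSettings {
    switch (policy) {
      case "as-it-arrives":     return { imapIdleEnabled: true,  pollIntervalMs: 0 };
      case "every-few-minutes": return { imapIdleEnabled: false, pollIntervalMs: 300_000 };
      case "manually":          return { imapIdleEnabled: false, pollIntervalMs: -1 }; // sentinel: never poll
    }
  }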

We can see how an implementation-model interface ticks by learning how to run its program. The problem is that the reverse is also true: We must learn how the program works in order to use the interface successfully.

Clearly, implementation-centric interfaces are the easiest to build — every time a programmer writes a function he slaps on a bit of user interface to test that function. It’s easy to debug, and when something doesn’t behave properly, it’s easy to troubleshoot. Further, engineers like to know how things work, so the implementation-centric paradigm is very satisfying to them. Engineers prefer to see the gears and levers and valves because it helps them understand what is going on inside the machine. That those artifacts needlessly complicate things for users seems a small price to pay. Engineers may want to understand the inner workings, but most users don’t have either the time or desire. They’d much rather be successful than be knowledgeable, a preference that is often hard for engineers to understand.

A close relative of the implementation-centric interface worth mentioning is the “org-chart-centric” interface. This is the common situation in which a product, or most typically a Web site, is organized not according to how users are likely to think about information, but according to how the company it represents is structured. On such a site, there is typically a tab or area for each division, and a lack of cohesion between these areas. Like the implementation-centric product interface, an org-chart-centric Web site requires users to have an intimate understanding of how a corporation is structured in order to find the information they are interested in.

Metaphoric interfaces

Metaphoric interfaces rely on intuitive connections that users make between the visual cues in an interface and its function. There is no need to understand the mechanics of the software, so metaphoric interfaces are a step forward from implementation-centric ones, but their power and usefulness have been inflated to unrealistic proportions.

When we talk about metaphors in the context of user interface and interaction design, we really mean visual metaphors: a picture used to represent the purpose or attributes of a thing. Users recognize the imagery of the metaphor and, by extension, can presumably understand the purpose of the thing. Metaphors can range from the tiny images on toolbar buttons to the entire screen of some programs — from a tiny pair of scissors on a button indicating Cut to a full-size checkbook in Quicken. We understand metaphors intuitively, but what does that really mean? Webster’s Dictionary defines intuition like this:

  • in·tu·i·tion in-’tu-wi-shen n 1 : quick and ready insight 2 a : immediate apprehension or cognition b : knowledge or conviction gained by intuition c : the power or faculty of attaining to direct knowledge or cognition without evident rational thought and inference

This definition highlights the magical quality of intuition, but it doesn’t say how we intuit something. Intuition works by inference, where we see connections between disparate subjects and learn from these similarities, while not being distracted by their differences. We grasp the meaning of the metaphoric controls in an interface because we mentally connect them with other things we have already learned. This is an efficient way to take advantage of the awesome power of the human mind to make inferences. However, this method also depends on the idiosyncratic human minds of users, which may not have the requisite language, knowledge, or inferential power necessary to make those connections.

Limitations of metaphors

The idea that metaphors are a firm foundation for user-interface design is misleading. It’s like worshipping floppy disks because so much good software once came on them. Metaphors have many limitations when applied to modern, information-age systems.

For one thing, metaphors don’t scale very well. A metaphor that works well for a simple process in a simple program will often fail to work well as that process grows in size or complexity. Large file icons were a good idea when computers had floppies or 10 MB hard disks with only a couple of hundred files, but in these days of 250 GB hard disks and tens of thousands of files, file icons become too clumsy to use effectively.

Metaphors also rely on associations perceived in similar ways by both the designer and the user. If the user doesn’t have the same cultural background as the designer, it is easy for metaphors to fail. Even in the same or similar cultures, there can be significant misunderstandings. Does a picture of an airplane mean “check flight arrival information” or “make airline reservations?”

Finally, although a metaphor offers a small boost in learnability to first-time users, it exacts a tremendous cost after they become intermediates. By reflecting the physical world of mechanisms, most metaphors firmly nail our conceptual feet to the ground, forever limiting the power of our software. We’ll discuss this issue with metaphors later in this chapter.

Our definition of intuition indicates that rational thought is not required in the process of intuiting. In the computer industry, and particularly in the user-interface design community, the word intuitive is often used to mean easy-to-use or easy-to-understand. Ease-of-use is obviously important, but it doesn’t promote our craft to attribute its success to metaphysics. Nor does it help us to devalue the precise meaning of the word. There are very real reasons why people understand certain interfaces and not others.

Intuition, instinct, and learning

There are certain sounds, smells, and images that make us respond without any previous conscious learning. When a small child encounters an angry dog, she instinctively knows that bared fangs signal great danger even without any previous learning. The encoding for such recognition goes deep. Instinct is a hard-wired response that involves no conscious thought. Intuition is one step above instinct because, although it also requires no conscious thought, it is based on a web of knowledge learned consciously.

Examples of instinct in human-computer interaction include the way we are startled and made apprehensive by gross changes in the image on the screen, find our eyes drawn inexorably to the flashing advertisement on a Web page, or react to sudden noises from the computer or the smell of smoke rising from the CPU.

Intuition is a middle ground between having consciously learned something and knowing something instinctively. If we have learned that things glowing red can burn us, we tend to classify all red-glowing things as potentially dangerous until proven otherwise. We don’t necessarily know that the particular red-glowing thing is a danger, but it gives us a safe place to begin our exploration.

What we commonly refer to as intuition is actually a mental comparison between a new experience and the things we have already learned. You instantly intuit how to work a wastebasket icon, for example, because you once learned how a real wastebasket works, thereby preparing your mind to make the connection years later. But you didn’t intuit how to use the original wastebasket. It was just an extremely easy thing to learn. This brings us to the third type of interface, based on the fact that the human mind is an incredibly powerful learning machine that constantly and effortlessly learns new things.

Idiomatic interfaces

Idiomatic design, what Ted Nelson has called “the design of principles,” is based on the way we learn and use idioms — figures of speech like “beat around the bush” or “cool.” Idiomatic user interfaces solve the problems of the previous two interface types by focusing not on technical knowledge or intuition of function, but rather on the learning of simple, nonmetaphorical visual and behavioral idioms to accomplish goals and tasks.

Idiomatic expressions don’t provoke associative connections the way that metaphors do. There is no bush and nobody is beating anything. Idiomatically speaking, something can be both cool and hot and be equally desirable. We use the idiom simply because we have learned it and because it is distinctive, not because we understand it or because it makes subliminal connections in our minds. Yet we are all capable of rapidly memorizing and using such idioms: We do so almost without realizing it.

If you cannot intuit an idiom, neither can you reason it out. Our language is filled with idioms that, if you haven’t been taught them, make no sense. If we say, “Uncle Joe kicked the bucket,” you know what we mean even though there is no bucket or kicking involved. You can’t know this by thinking through the various permutations of smacking pails with your feet. You can only learn this from context in something you read or by being consciously taught it. You remember this obscure connection between buckets, kicking, and dying only because humans are good at remembering things like this.

The human mind has a truly amazing capacity to learn and remember large numbers of idioms quickly and easily without relying on comparisons to known situations or an understanding of how or why they work. This is a necessity, because most idioms don’t have metaphoric meaning at all, and the stories behind most others were lost ages ago.

Graphical interfaces are largely idiomatic

It turns out that most of the elements of intuitive graphical interfaces are actually visual idioms. Windows, title bars, close boxes, screen-splitters, hyperlinks, and drop-downs are things we learn idiomatically rather than intuit metaphorically. The Macintosh’s use of the trashcan to unmount an external FireWire disk before removing it is purely idiomatic (and many designers consider it a poor idiom), despite the visual metaphor of the trash can itself.

The ubiquitous mouse input device is not metaphoric of anything, but rather is learned idiomatically. There is a scene in the movie Star Trek IV where Scotty returns to 20th-century Earth and tries to speak into a mouse. There is nothing about the physical appearance of the mouse that indicates its purpose or use, nor is it comparable to anything else in our experience, so learning it is not intuitive. However, learning to point at things with a mouse is incredibly easy. Someone probably spent all of three seconds showing it to you the first time, and you mastered it from that instant on. We don’t know or care how mice work, and yet even small children can operate them just fine. That is idiomatic learning.

Ironically, many of the familiar GUI elements that are often thought of as metaphoric are actually idiomatic. Artifacts like resizable windows and endlessly nested file folders are not really metaphoric — they have no parallel in the real world. They derive their strength only from their easy idiomatic learnability.

Good idioms must be learned only once

We are inclined to think that learning interfaces is hard because of our conditioning based on experience with implementation-centric software. These interfaces are very hard to learn because you need to understand how the software works internally to use them effectively. Most of what we know, however, we learn without understanding: things like faces, social interactions, attitudes, melodies, brand names, and the arrangement of rooms and furniture in our houses and offices. We don’t understand why someone’s face is composed the way it is, but we know that face. We recognize it because we have looked at it and automatically (and easily) memorized it.

The key observation about idioms is that although they must be learned, they are very easy to learn, and good ones need to be learned only once. It is quite easy to learn idioms like “neat” or “politically correct” or “the lights are on but nobody’s home” or “in a pickle” or “take the red-eye” or “grunge.” The human mind is capable of picking up idioms like these from a single hearing. It is similarly easy to learn idioms like radio buttons, close boxes, drop-down menus, and combo boxes.

Branding and idioms

Marketing and advertising professionals understand well the idea of taking a simple action or symbol and imbuing it with meaning. After all, synthesizing idioms is the essence of product branding, in which a company takes a product or company name and imbues it with a desired meaning. The golden arches of McDonald’s, the three diamonds of Mitsubishi, the five interlocking rings of the Olympics, even Microsoft’s flying window are nonmetaphoric idioms that are instantly recognizable and imbued with common meaning. The example of an idiomatic symbol shown in Figure 13-1 illustrates this power.

Figure 13-1. Here is an idiomatic symbol that has been imbued with meaning from its use, rather than by any connection to other objects. For anyone who grew up in the 1950s and 1960s, this otherwise meaningless symbol has the power to evoke a shiver of fear because it represents nuclear radiation. Visual idioms, such as the American flag, can be just as powerful as metaphors, if not more so. The power comes from how we use them and associate them, rather than from any innate connection to real-world objects.

Further Limitations of Metaphors

If we depend on metaphors to create user interfaces, we encounter not only the minor problems already mentioned, but also two more major problems: Metaphors are hard to find, and they constrict our thinking.

Finding good metaphors

It may be easy to discover visual metaphors for physical objects like printers and documents. It can be difficult or impossible to find metaphors for processes, relationships, services, and transformations — the most frequent uses of software. It can be extremely daunting to find a useful visual metaphor for changing channels, purchasing an item, finding a reference, setting a format, changing a photograph’s resolution, or performing statistical analysis, yet these operations are precisely the type of processes we use software to perform most frequently.

Computers and digital products are so powerful because of their ability to manage incredibly complex relationships within very large sets of data. Their very utility is based upon the fact that the human mind is challenged by such multidimensional problems, so almost by definition, these processes are not well suited to a simple, physical analog that people “automatically” comprehend.

The problems with global metaphors

The most significant problem with metaphors, however, is that they tie our interfaces to Mechanical Age artifacts. An extreme example of this was Magic Cap, a handheld communicator interface introduced with some fanfare by General Magic in the mid-1990s. It relied on metaphors for almost every aspect of its interface. You accessed your messages from an inbox or a notebook on a desk. You walked down a hallway lined with doors representing secondary functions. You went outside to access third-party services, which, as you can see in Figure 13-2, were represented by buildings on a street. You entered a building to configure a service, and so on. The heavy reliance on this metaphor meant that you could intuit the basic functioning of the software, but the downside was that, after you understood its function, the metaphor added significantly to the overhead of navigation. You had to go back out onto the street to configure another service. You had to go down the hallway and into the game room to play Solitaire. This may be normal in the physical world, but there is no reason for it in the world of software. Why not abandon this slavish devotion to metaphor and give users easy access to functions? It turns out that a General Magic programmer later created a bookmarking shortcut facility as a kludgy add-on, but alas, too little too late.

Figure 13-2. The Magic Cap interface from General Magic was used in products from Sony and Motorola in the mid-1990s. It is a tour de force of metaphoric design. All the navigation in the interface, and most other interactions as well, were subordinated to the maintenance of spatial and physical metaphors. It was surely fun to design but was not particularly easy to use after you became an intermediate. This was a shame, because some of the lower-level, nonmetaphoric, data-entry interactions were quite sophisticated and well designed for the time.

General Magic’s interface relies on what is called a global metaphor. This is a single, overarching metaphor that provides a framework for all the other metaphors in the system. The desktop of the original Macintosh is also a global metaphor.

A hidden problem of global metaphors is the mistaken belief that other lower-level metaphors consistent with them enjoy cognitive benefits by association. The temptation is irresistible to stretch the metaphor beyond simple function recognition: That software telephone also lets us dial with buttons just like those on our desktop telephones. We see software that has address books of phone numbers just like those in our pockets and purses. Wouldn’t it be better to go beyond these confining, industrial-age technologies and deliver some of the real power of the computer? Why shouldn’t our communications software allow multiple connections or make connections by organization or affiliation, or just hide the use of phone numbers altogether?

It may seem clever to represent your dial-up service with a picture of a telephone sitting on a desk, but it actually imprisons you in a limited design. The original makers of the telephone would have been ecstatic if they could have created a phone that let you call your friends just by pointing to pictures of them. They couldn’t because they were restricted by the dreary realities of electrical circuits and Bakelite moldings. On the other hand, today we have the luxury of rendering our communications interfaces in any way we please — showing pictures of our friends is completely reasonable — yet we insist on holding these concepts back with representations of obsolete technology.

There are two snares involved in extending metaphors, one for the user and one for the designer. After the user depends on the metaphor for recognition, he expects consistency of behavior with the real-world object to which the metaphor refers. This causes the snare for the designer, who now, to meet user expectations, is tempted to render the software in terms of the metaphor’s Mechanical Age referent. As we discussed in Chapter 2, transliterating mechanical processes onto the computer usually makes them worse than they were before.

Take the example of the ubiquitous file folder in modern computer operating systems. As a mechanism for organizing documents, it is quite easy to learn and understand because of its similarity to a physical file folder in a file cabinet. Unfortunately, as is the case with many metaphoric user interfaces, it functions a bit differently than its real-world analog, which has the potential to create cognitive friction for users. For example, in the world of paper, no one nests folders 10 layers deep, which makes it difficult for novice computer users to come to terms with the navigational structures of an operating system.

There are also gravely limiting consequences to the implementation of this mechanism. In the world of paper, it is impossible for the same document to be located in two different places in the filing cabinet, and as a result, filing is executed with a single organization scheme (such as alphabetically by name or numerically by account number). Our digital products are not intrinsically bound by such limitations, but blind adherence to an interface metaphor has drastically limited our ability to file a single document according to multiple organization schemes.
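
The difference is easy to see in data-structure terms. Here is a minimal sketch (TypeScript; the names are invented for illustration): a strict folder hierarchy allows exactly one location per document, while a simple set of labels lets the same document appear under any number of organization schemes:

  // Folder metaphor: one document lives at exactly one path.
  interface FiledDocument {
    path: string;            // e.g. "/Clients/Acme/Invoices/inv-042.pdf"
  }

  // Freed from the metaphor: one document, many organization schemes.
  interface LabeledDocument {
    name: string;
    labels: Set<string>;     // e.g. "client:acme", "type:invoice", "year:2007"
  }

  // Retrieval by any scheme, not just the single filing order.
  function findByLabel(docs: LabeledDocument[], label: string): LabeledDocument[] {
    return docs.filter(d => d.labels.has(label));
  }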

As Brenda Laurel said, “Interface metaphors rumble along like Rube Goldberg machines, patched and wired together every time they break, until they are so encrusted with the artifacts of repair that we can no longer interpret them or recognize their referents.” It amazes us that designers, who can finally create that dream-phone interface, give us the same old telephone simply because they were taught that a strong, global metaphor is a prerequisite to good user-interface design. Of all the misconceptions to emerge from Xerox PARC, the global metaphor myth is the most debilitating and unfortunate.

Idiomatic design is the future of interaction design. Using this paradigm, we depend on the natural ability of humans to learn easily and quickly as long as we don’t force them to understand how and why. There is an infinity of idioms waiting to be invented, but only a limited set of metaphors waiting to be discovered. Metaphors give first-timers a penny’s worth of value but cost them many dollars’ worth of problems as they continue to use the software. It is always better to design idiomatically, using metaphors only when a truly appropriate and powerful one falls in our lap.

Use metaphors if you can find them, but don’t bend your interface to fit some arbitrary metaphoric standard.

Macs and metaphors: A revisionist view

In the mid-1970s, the modern graphical user interface (GUI) was invented at Xerox Palo Alto Research Center (PARC). The GUI — as defined by PARC — consisted of many things: windows, buttons, mice, icons, visual metaphors, and drop-down menus. Together they have achieved an unassailable stature in the industry by association with the empirical superiority of the ensemble.

The first commercially successful implementation of the PARC GUI was the Apple Macintosh, with its desktop metaphor: the wastebasket, overlapping sheets of paper (windows), and file folders. The Mac didn’t succeed because of these metaphors, however. It succeeded for several other reasons, including an overall attention to design and detail. The interaction design advances that contributed were:

  • It defined a tightly restricted but flexible vocabulary for users to communicate with applications, based on a very simple set of mouse actions.

  • It offered sophisticated, direct manipulation of rich visual objects on the screen.

  • It used square pixels at high resolution, which enabled the screen to match printed output very closely, especially the output of Apple’s other new product: the laser printer.

Metaphors helped structure these critical design features and made for good marketing copy but were never the main appeal. In fact, the early years were rather rocky for the Mac as people took time to grow accustomed to the new, GUI way of doing things. Software vendors were also initially gun-shy about developing for such a radically different environment (Microsoft being the exception).

However, people were eventually won over by the capability of the system to do what other systems couldn’t: WYSIWYG (what you see is what you get) desktop publishing. The combination of WYSIWYG interfaces and high-quality print output (via the LaserWriter printer) created an entirely new market that Apple and the Mac owned for years. Metaphors were but a bit player (no pun intended) in the Mac’s success.

Building Idioms

When graphical user interfaces were first invented, they were so clearly superior that many observers credited the success to the interfaces’ graphical nature. This was a natural, but incorrect, assumption. The first GUIs, such as the original Mac, were better primarily because the graphical nature of their interfaces required a restriction of the range of vocabulary by which the user interacted with the system. In particular, the input they could accept from the user went from an unrestricted command line to a tightly restricted set of mouse-based actions. In a command-line interface, users can enter any combination of characters in the language — a virtually infinite number. In order for a user’s entry to be correct, he needs to know exactly what the program expects. He must remember the letters and symbols with exacting precision. The sequence can be important, and sometimes even capitalization matters.

In modern GUIs, users can point to images or words on the screen with the mouse cursor. Most of these choices migrated from the users’ heads to the screen, eliminating any need to memorize them. Using the buttons on the mouse, users can click, double-click, or click and drag. The keyboard is used for data entry, but not typically for command entry or navigation. The number of atomic elements in users’ input vocabulary has dropped from dozens (if not hundreds) to just three, even though the range of tasks that can be performed by GUI programs isn’t any more restricted than that of command-line systems.

The more atomic elements there are in an interaction vocabulary, the more time-consuming and difficult the learning process is. A vocabulary like that of the English language takes at least 10 years to learn thoroughly, and its complexity requires constant use to maintain fluency, but it can be extraordinarily expressive for a skilled user. Restricting the number of elements in our interaction vocabulary reduces its expressiveness at the atomic level. However, more complex interactions can be easily built from the atomic ones, much the way that letters can be combined to form words, and words to form sentences.

A properly formed interaction vocabulary can be represented by an inverted pyramid. All easy-to-learn communications systems obey the pattern shown in Figure 13-3. The bottom layer contains primitives, the atomic elements of which everything in the language is composed. In modern GUIs, these primitives consist of pointing, clicking, and dragging.

Figure 13-3. One of the primary reasons that GUIs are easy to use is that they enforce a restricted interaction vocabulary that builds complex idioms from a very small set of primitives: pointing, clicking, and dragging. These primitives can build a larger set of simple compounds, which in turn can be assembled into a wide variety of complex, domain-specific idioms, all of which are based on the same small set of easily learned actions.

The middle layer contains compounds. These are more complex constructs created by combining one or more of the primitives. These include simple visual objects such as text display, actions such as double-clicking or clicking-and-dragging, and manipulable objects like pushbuttons, check boxes, hyperlinks, and direct manipulation handles.

The uppermost layer contains idioms. Idioms combine and structure compounds using domain knowledge of the problem under consideration: information related to the user’s work patterns and goals, and not specifically to the computerized solution. The set of idioms opens the vocabulary to information about the particular problem the program is trying to address. In a GUI, it includes things like labeled buttons and fields, navigation bars, list boxes, icons, and even groups of fields and controls, or entire panes and dialogs.
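
The three layers can be illustrated with a small sketch (TypeScript; the types are invented for illustration). A compound such as double-click is assembled from primitive clicks, and an idiom such as a labeled button adds domain meaning on top of the compound:

  // Primitives: the small set of atomic user actions.
  type Primitive = "point" | "click" | "drag";

  // Compound: double-click is just two clicks close together in time.
  function isDoubleClick(firstMs: number, secondMs: number, thresholdMs = 400): boolean {
    return secondMs - firstMs <= thresholdMs;
  }

  // Idiom: a labeled button binds the compound "click inside this
  // rectangle" to domain knowledge — what "Save" means in this program.
  interface ButtonIdiom {
    label: string;                                   // domain meaning lives here
    bounds: { x: number; y: number; w: number; h: number };
    onActivate: () => void;
  }

  function clickAt(button: ButtonIdiom, x: number, y: number): void {
    const inside = x >= button.bounds.x && x <= button.bounds.x + button.bounds.w
                && y >= button.bounds.y && y <= button.bounds.y + button.bounds.h;
    if (inside) button.onActivate();                 // a click becomes "press Save"
  }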

Any language that does not follow this form will be very hard to learn. Many effective communications systems outside the computer world follow similar vocabularies. Street signs in the United States follow a simple pattern of shapes and colors: Yellow diamonds are cautionary, red octagons are imperatives, and green rectangles are informative.

Similarly, there is nothing intuitive or metaphoric about text messaging on a phone. The compound interaction of tapping numeric buttons in specific sequences to enter alphabetic characters is entirely learned, and when combined with predictive text capabilities, it forms an incredibly effective idiom for writing brief notes from a mobile phone.
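
As a sketch of that learned compound (TypeScript; plain multi-tap, without the predictive layer): each digit key carries a letter group, and the number of presses selects a letter within the group. Nothing on the key face suggests the mapping; like any idiom, it simply has to be learned once:

  // The learned mapping from digit keys to letter groups.
  const MULTITAP: Record<string, string> = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
  };

  // Repeated presses cycle through the group: multiTap("4", 2) === "h".
  function multiTap(key: string, presses: number): string {
    const letters = MULTITAP[key];
    return letters[(presses - 1) % letters.length];
  }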

Manual Affordances

In his seminal book The Design of Everyday Things, Donald Norman gave us the term affordance, which he defines as “the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used.”

This concept is absolutely invaluable to the practice of interface design. For our purposes, the definition omits a key connection: How do we know what those properties offer to us? If you look at something and understand how to use it — you comprehend its affordances — you must be using some method for making the mental connection.

Therefore, we propose altering Norman’s definition by omitting the phrase “and actual.” By doing this, affordance becomes a purely cognitive concept, referring to what we think the object can do rather than what it can actually do. If a pushbutton is placed on the wall next to the front door of a residence, its affordances are 100% doorbell. If, when we push it, it causes a trapdoor to open beneath us and we fall into it, it turns out that it wasn’t a doorbell, but that doesn’t change its affordance as one.

So how do we know it’s a doorbell? Simply because we have learned about doorbells and door etiquette and pushbuttons from our complex and lengthy socialization and maturation process. We have learned about this class of pushable things by being exposed to electrical and electronic devices in our environs and because — years ago — we stood on doorsteps with our parents, learning how to approach another person’s home.

But there is another force at work here, too. If we see a pushbutton in an unlikely place such as the hood of a car, we cannot imagine what its purpose is, but we do recognize it as a finger-pushable object. How do we know this? Undoubtedly, we recognize it because of our tool-manipulating nature. We, as a species, see things that are finger-sized, placed within reach, and we automatically push them. We see things that are long and rounded, and we wrap our fingers around them and grasp them like handles. This is what Norman was getting at with his term affordance. For clarity, however, we’ll call this instinctive understanding of how objects are manipulated with our hands manual affordance. When artifacts are clearly shaped to fit our hands or feet, we recognize that they can be directly manipulated and require no written instructions. In fact, this act of understanding how to use a tool based on the relationship of its shape to our hands is a clear example of intuiting an interface.

Norman discusses at length how [manual] affordances are much more compelling than written instructions. A typical example he uses is a door that must be pushed open using a metal bar for a handle. The bar is just the right shape and height and is in the right position to be grasped by the human hand. The manual affordances of the door scream, “Pull me.” No matter how often someone uses this diabolical door, he will always attempt to pull it open, because the affordances are strong enough to drown out any number of signs affixed to the door saying Push.

There are only a few manual affordances. We pull handle-shaped things with our hands or, if they are small, we pull them with our fingers. We push flat plates with our hands or fingers. If they are on the floor we push them with our feet. We rotate round things, using our fingers for small ones — like dials — and both hands on larger ones, like steering wheels. Such manual affordances are the basis for much of our visual user-interface design.

The popular simulated-3D design of systems like Windows, Mac OS, and Motif relies on shading, highlighting, and shadows to make screen images appear more dimensional. These images offer virtual manual affordances in the form of buttonlike images that say “Push me” to our tool-manipulating brains.

Semantics of manual affordances

What’s missing from an unadorned, virtual manual affordance is any idea of what function it performs. We can see that it looks like a button, but how do we know what it will accomplish when we press it? Unlike mechanical objects, you can’t figure out a virtual lever’s function just by tracing its connections to other mechanisms — software can’t be casually inspected in this manner. Instead, we must rely either on supplementary text and images, or, most often, on our previous learning and experience. The affordance of the scrollbar clearly shows that it can be manipulated, but the only things about it that tell us what it does are the arrows, which hint at its directionality. In order to know that a scrollbar controls our position in a document, we either have to be taught or learn through experimentation.

Controls must have text or iconic labels on them to make sense. If the answer isn’t suggested by the control itself, we can learn what it does only through experimentation or training: Either we try it and see what happens, or we read about it somewhere or ask someone. We get no help from instinct or intuition. We can rely only on the empirical.

Fulfilling user expectations of affordances

In the real world, an object does what it can do as a result of its physical form and its connections with other physical objects. A saw can cut wood because it is sharp and flat and has a handle. A knob can open a door because it is connected to a latch. However, in the digital world, an object does what it can do because a programmer imbued it with the power to do something. We can discover a great deal about how a saw or a knob works by physical inspection, and we can’t easily be fooled by what we see. On a computer screen, though, we can see a raised, three-dimensional rectangle that clearly wants to be pushed like a button, but this doesn’t necessarily mean that it should be pushed. It could, literally, do almost anything. We can be fooled because there is no natural connection — as there is in the real world — between what we see on the screen and what lies behind it. In other words, we may not know how to work a saw, and we may even be frustrated by our inability to manipulate it effectively, but we will never be fooled by it. It makes no representations that it doesn’t manifestly live up to. On computer screens, canards and false impressions are very easy to create.

When we render a button on the screen, we are making a contract with the user that the button will visually change when she pushes it: It will appear depressed when the mouse button is clicked over it. Further, the contract states that the button will perform some reasonable work that is accurately described by its legend. This may sound obvious, but it is frankly astonishing how many programs offer bait-and-switch manual affordances. This is relatively rare for pushbuttons, but all too common for other controls, especially on Web sites, where a lack of affordances can make it difficult to differentiate between controls, content, and ornamentation. Make sure that your program delivers on the expectations it sets via the use of manual affordances.
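
As a minimal sketch of that contract (TypeScript; the shape of the types is assumed for illustration), a well-behaved button acknowledges the push visually, performs the work its legend promises, and then releases:

  interface UiButton {
    legend: string;        // must accurately describe the work performed
    pressed: boolean;      // the visual state the user sees change
    action: () => void;    // the "reasonable work" behind the legend
  }

  function pressButton(button: UiButton): void {
    button.pressed = true;    // 1. acknowledge the push visually
    try {
      button.action();        // 2. do what the legend promises
    } finally {
      button.pressed = false; // 3. release, completing the visual contract
    }
  }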
