Chapter 8

General and Specific Information in Representations

REPRESENTATION AND ABSTRACTION

In chapters 6 and 7, I turned from defining new representational types to examining how one particular type of representation, structured representations, can represent perceptual and conceptual information. Nonetheless, chapters 2 through 7 were primarily concerned with questions like “How are representations of different types structured?” and “What processes are often used with particular representations?” Chapters 8 and 9 focus primarily on the content of representations. Although I have discussed content to some degree already, some general issues about the content of representations are important to address. In this chapter, I examine the degree of abstractness associated with representations. Chapter 9 focuses on a particular kind of specific instance information: mental models.

The issue of abstraction is important in thinking about representation. In this book, I have characterized representations as involving a representing world that stands in a consistent relation to elements in a represented world. The degree of abstraction of an element in the representing world is simply the range of different items in the represented world represented by that element. For example, a symbol representing the concept catfish is less abstract than a symbol representing the concept fish because it refers to a narrower range of items in the represented world. Likewise, a symbol representing the concept fish is less abstract than a symbol representing the concept animal. On this view, all representations are abstractions to some degree; there are always aspects of the represented world not captured in the representing world (and hence distinct items in the represented world treated as the same in the representing world). Thus, the degree of abstraction focuses on how much information is captured by a representation.
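To make this definition concrete, it helps to treat each symbol in the representing world as standing for a set of items in the represented world; the degree of abstraction of a symbol is then simply the breadth of that set. The following Python sketch uses an invented miniature represented world to illustrate the idea:

    # Each symbol in the representing world is paired with the set of represented items it covers.
    extension = {
        "catfish": {"channel catfish", "blue catfish"},
        "fish": {"channel catfish", "blue catfish", "trout", "salmon"},
        "animal": {"channel catfish", "blue catfish", "trout", "salmon", "sparrow", "horse"},
    }

    def more_abstract(symbol_a, symbol_b):
        # A symbol is more abstract when it refers to a broader range of items in the represented world.
        return len(extension[symbol_a]) > len(extension[symbol_b])

    print(more_abstract("fish", "catfish"))   # True
    print(more_abstract("animal", "fish"))    # True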

For an extreme example, Braitenberg (1984) described some simple vehicles that exhibit primitive behaviors. The vehicle pictured in Figure 8.1 has a single wheel at the back and a sensor at the front. The sensor is sensitive to some aspect of the environment, say light. The more photons that hit the sensor, the more current it generates. The current is sent to a motor that turns the wheel. A vehicle like this moves when there is a light somewhere in front of it and does not move when there is no light in front of it. This vehicle exhibits only a crude behavior and has only a crude representation: the more photons, the more current that flows. This representation is transient; when the light source is taken away, the sensor stops producing current. There is no memory for past lights; the stopped vehicle does not daydream of a time when photons were plentiful.
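The entire control scheme of such a vehicle can be written in a few lines. The sketch below is my own toy rendering, not Braitenberg's specification; the function names and the gain parameter are invented.

    def sensor_current(photon_count, gain=0.01):
        # More photons striking the sensor produce more current; nothing about the
        # light source itself is stored, so the "representation" is one transient number.
        return gain * photon_count

    def wheel_speed(photon_count):
        # The current drives the motor, and the motor turns the wheel.
        return sensor_current(photon_count)

    print(wheel_speed(0))      # no light in front: the vehicle does not move
    print(wheel_speed(5000))   # a bright light in front: the vehicle moves quickly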

This representation is not only crude; it is also extremely abstract. The sensor is sensitive to any photon-producing thing in the environment. Thus, neon signs, distant stars, Zippo lighters, and welding torches are all treated equivalently. The information that allows people to distinguish between a Zippo lighter and a welding torch (a helpful distinction at the encore of a rock concert) is not part of the representing world of the Braitenberg vehicle.

At the other end of the spectrum, imagine a thumbprint recognition security system (of the type seen in many science fiction films). Such a machine has a detailed representation of a person’s thumbprint. When an object is placed on the machine’s sensor, its image is compared with stored representations, and the door opens only when the object on the sensor matches a stored thumbprint. In this case, the representation is highly specific: Only a few objects in the world correspond to this object (namely, one person’s thumb and really good replicas of it). This representation is still an abstraction. If it were not an abstraction, only one thumb would match the stored representation (and the plots of many science fiction films would be foiled).


FIG. 8.1. A Braitenberg vehicle that is sensitive only to the presence of light.

Although these examples are contrived, they do illustrate the notion of abstraction. What is interesting about research in cognitive science is that much of it has assumed that representations are rather abstract. For example, the scripts described in chapter 7 are generalized events. They do not refer to a specific event but to the sequence of occurrences that make up an event of that type in general. Likewise, the mental space representations of concepts described in chapter 2, like those described by Rips, Shoben, and Smith (1973), have points in a space corresponding to concepts like cat and horse (see Figure 2.4). These nodes are supposed to represent concepts at a level of abstraction specified by the label rather than by specific instances. For one more example, many researchers have suggested that categories are represented by prototypes (Posner & Keele, 1970; Reed, 1972; Rosch & Mervis, 1975). As discussed in chapter 3, a category prototype is supposed to be a “central tendency” of a category. The prototype need not be an actual member of the category; it is a generalization that captures the properties typical of the category. The prototypical bird, for instance, is small, has feathers, flies, and sings. A person need not have seen a single bird with precisely these characteristics, but the prototype is an average of the birds that the person has seen.

A fundamental assumption that appears to underlie this use of abstraction is that a representation is useful for future reasoning only if the incidental details of a situation are abstracted away, leaving its important elements. New situations are never identical to past occasions. If the representation of a category or event contains too many details about previous episodes, it is difficult to find past experiences that apply in a new situation. It is difficult to distinguish which differences between the current situation and past ones are important and which are not. This difficulty interferes with the use of past experiences that are retrieved. For example, a particular past experience of going to a restaurant may have involved getting crayons at the table to draw on the placemat during the wait for the food. If this instance was retrieved during a new trip to a restaurant, some mechanism would be needed to suppress the confusion that would occur when crayons were not brought to the table at the new restaurant.

Despite the apparent need for abstract representations, there has been a surge of interest among cognitive scientists in the use of specific instance information in cognition and reasoning. This work has suggested that in many situations people do not rely on abstract representations but use information about specific instances or scenarios. The basic assumption of this research is that specific instances are indeed useful for reasoning as long as processes are designed to find the relevant information in a situation at the time of processing. On this view, when new knowledge is acquired (and when new situations are experienced), it is impossible to foresee all possible uses of this information in advance. If a lot of specific information about an instance is stored at the time of acquisition, more information is available for future reasoning.

In this chapter, I begin with a discussion of some data that suggest cognitive processing operates over representations that contain specific instance information. Then, I discuss three proposals for the use of specific instance information in cognition. First, I review research on case-based reasoning, which assumes that new problems can be solved by referring back to specific previous instances. Second, I return to the role of concrete metaphors as ways to conceptualize abstract concepts (discussed briefly in chap. 6). Finally, I discuss exemplar models, which assume that categories are represented by specific examples encountered in the past.

THE PRIORITY OF THE SPECIFIC

In many areas of cognitive science, researchers have assumed that reasoning processes approach a logical ideal. The appeal of deductive logic, for example, is that a set of rules can allow a system to reason about any domain. The deductive reasoning schemas described in chapter 5 are powerful because they guarantee the truth of the conclusions when the premises are true for any set of premises with the structure specified by a schema. Thus, deductive schemas are ways of reasoning about anything. The appeal of logic in artificial intelligence (AI) is that if a logic is sufficiently powerful, people need not understand the domain about which they reason; they simply have to structure the knowledge about the domain in the appropriate way, and the highly abstract rules of logic do the rest. This appeal to abstract reasoning abilities is also implicit in other accounts of cognition. For example, Piaget assumed that the ultimate end state of cognitive development was formal operations, in which adults could reason logically about many different situations by applying a single set of abstract rules across domains.

Despite the appeal of logic, humans clearly are not good at even the simplest logical puzzles. The Wason selection task described in chapter 1 is a logical task in which people must successfully use the reasoning schemas modus ponens and modus tollens. Recall that in this task, investigators showed subjects four cards and told them to decide which cards must be turned over to test the rule “If there is a vowel on one side of the card, there is an odd number on the other side” (see Figure 1.1). Although subjects generally turn over the card showing a vowel (correctly recognizing that modus ponens applies), they rarely turn over the card showing an even number (which requires reasoning by modus tollens). This failure is striking, especially because (as discussed in chap. 1) subjects presented with an isomorphic problem set in a familiar context (e.g., people drinking in a bar) solve the problem with ease. Many studies have demonstrated that people are much better at solving problems with a particular logical structure when the content of the problems is in a familiar domain than when it is in an unfamiliar domain (Johnson-Laird, 1983; Johnson-Laird, Legrenzi, & Legrenzi, 1972).

What benefit can a familiar domain provide for reasoning? One possibility is that people recognize a new event as an instance of a familiar event and then do exactly what they did in that situation. The problem with this possibility is that people would have great difficulty when operating in situations not exactly like those they had encountered before. A second possibility is that people reason by using rules that are intermediate in abstraction between detailed representations of the events in specific situations and highly abstract domain-independent rules. Another proposal is that people form rules about common social interactions, and they bring their knowledge of social situations to bear when solving a new problem (Cheng & Holyoak, 1985, 1989; Cosmides, 1989). To demonstrate this point, Cheng and Holyoak (1985) gave people a version of the Wason selection task in an unfamiliar domain (entrance visas to a new country), but used a familiar social rule (permission). In this case, the rule was that if the visa says “entering” on one side, it includes cholera on the list of diseases on the other (the cover story had made clear that the list of diseases was a list of inoculations that a passenger had received). Thus, people could consider this task as an instance of needing permission to carry out an action. Cheng and Holyoak found that more subjects were able to solve the Wason task correctly in this situation than were subjects given the same rule without the rationale for the rule (i.e., without telling them that the list of diseases was a list of inoculations). This study suggests that people reason by using rules derived from familiar social situations, such as permission, even when the content of a problem is unfamiliar.1

Of course, logic is only one kind of abstract reasoning system. Data like those on the use of social reasoning suggest not that there are no abstract rules, but that the rules are not maximally abstract. Instead, there seems to be a set of rules tied to familiar situations. Researchers have shown that many cognitive processes that could function with abstract rules are actually carried out by using representations tied to specific episodes (Medin & Ross, 1989). Medin and Ross (1989) suggested that this use of instance information reflects a priority of the specific in cognitive processing; that is, the cognitive system is designed to operate by using representations of specific instances that have been encountered rather than by creating abstract rules from experience.

A striking case of the use of specific information came from a study of college students asked to develop word problems for addition and division problems (Bassok, Chase, & Martin, 1998). This domain is interesting, because intuition suggests that semantic content does not affect college students’ performance with basic arithmetic. In one study, researchers asked students to generate either addition or division problems and gave them a pair of objects to use in the problems. The objects either came from the same category (e.g., apples and oranges) or were associatively related (e.g., apples and baskets). When college students were asked to make addition word problems involving objects from the same category like apples and oranges, they wrote simple stories in which the objects were aggregated (say, finding the total number of pieces of fruit when given the number of apples and the number of oranges). In contrast, when asked to make addition word problems involving objects that were only associatively related, like apples and baskets, they were uncomfortable writing problems that required dissimilar items to be aggregated. Instead, they tended to use other strategies, such as introducing new objects from the same category as one of the original objects (e.g., having apples and oranges in a basket and finding the total number of pieces of fruit) or even writing the problem with a different arithmetic operation (e.g., using division instead of addition). The opposite pattern was observed for division problems. College students were happy to write division problems for objects that were associatively related (e.g., distributing apples among baskets), but not for objects from the same category (e.g., finding the number of apples per orange). This finding suggests that content influences the way college students think about even very basic abstract domains like arithmetic.

In another illuminating example of the spontaneous use of specific instance information, Regehr and Brooks (1993) gave people a classification task that involved a complex rule. Sample items from this classification task are shown in Figure 8.2. Subjects were shown the items in either Training Set A or Training Set B and were asked to identify the “builders,” which were defined by a complex rule like “six legs and an angular body and spots.” For this classification task, the complex rule was given explicitly, so that the task required only learning how to apply the rule to specific instances. After a set of training trials in which they practiced applying the rule, people were given a transfer phase in which a new set of items was presented. The transfer items were selected so that some of the items for which the rule did not apply were perceptually similar to the items in the training set for those subjects given Training Set A, but not for those given Training Set B.


FIG. 8.2. Training and transfer items from a study demonstrating the role of specific instances in a situation that required only application of an abstract rule. From G. Regehr and L. R. Brooks (1993). Copyright © 1993 by the American Psychological Association. Reprinted by permission.

If people use only the abstract rule, this perceptual similarity should not matter, because only the application of the rule is important. In contrast, if people use specific instance information, people given Training Set A would mistakenly say that the transfer items belonged in the category (because of their perceptual similarity to the training instances), whereas people given Training Set B would not. Indeed, people given Training Set A did make many errors on these transfer items, but people given Training Set B did not. Thus, this finding suggests that subjects did not simply follow a rule but instead answered the new classification questions on the basis of information about specific exemplars seen during training, even though the task required only following an abstract rule. The cognitive system is apparently designed to store specific information about instances that it is exposed to.

I now turn to three proposals for the ways that specific instances are used and address the issue of how to determine the right level of abstraction for representing properties of instances at the end of this chapter.

CASE-BASED REASONING

Perhaps the best worked-out models that use specific instance information for reasoning belong to a branch of AI called case-based reasoning (CBR). The central tenet of CBR models is that the knowledge of an intelligent reasoning system is in the form of cases, which describe specific prior episodes, typically other experiences in the same problem domain (see Kolodner, 1993; Schank, Kass, & Riesbeck, 1994, for overviews). CBR systems have been applied primarily to problems of planning and explanation, in which the cases in the knowledge base (typically called a case base) are previous plans or explanations. In this discussion of case-based reasoning, I first describe the basic components of a CBR system and then one example of a CBR system that illustrates these components in action. Finally, I discuss some limitations of current CBR models and some possible extensions.

Components of a Case-Based Reasoning System

CBR systems are typically implemented as computer programs that consist of four basic components (see Figure 8.3). First, there is a knowledge base that contains a set of relevant cases. Second, there is a mechanism that retrieves cases from the knowledge base when a new problem is encountered. Third, there is a mechanism that adapts or tweaks the case to make it work in the current situation. Finally, there is a mechanism that takes new situations and adds them to the case base to enhance the functionality of the system.


FIG. 8.3. The four major components of case-based reasoning systems.

The heart of a case-based reasoning system is the case base. Each case in the case base consists of a bundle of knowledge about a particular episode. Typically, cases are represented with some form of structured representation. In domains with a consistent set of attributes that one can expect across cases (as in medical diagnosis, where every patient is likely to have had the same set of diagnostic tests), the cases can be represented as frames (see chap. 5). In these systems, cases are described by slots and their fillers. As discussed in chapter 5, however, it is possible to augment these structured representations with relations that provide further information about relationships among elements in the case. For example, causal relations that describe how the value of one slot is related to the value of another can be added.

After a set of cases has been defined, there needs to be a mechanism for retrieving cases from memory. Case-based reasoning systems typically use an indexing scheme for this purpose. In an indexing system, particular aspects of memory items are defined in advance to be indices. These indices point directly to the item in memory, and if the index appears in the description of a new problem situation, the case can be retrieved. Many different aspects of a case can be used as indices, for example, the actors or objects in the case. Likewise, key elements of the case, such as the goals it satisfies or the failures it helps avoid, can be used as indices. The use of goals and execution failures helps ensure that the case is retrieved when it is likely to be relevant. Thus, the advantage of an indexing system is that indices allow items to be retrieved when they are relevant to important goals that the system must reach. The potential disadvantage is that the indices must be defined in advance when the case is entered into the case base. Thus, cases can be retrieved only for goals that are known in advance. If a new problem arises, cases relevant to that problem are not retrieved.
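A minimal sketch of an indexed case base may make these components concrete. The sketch assumes frame-like cases represented as dictionaries of slots and fillers, with the goal and object slots chosen as indices when a case is stored; the names are invented for illustration and are not drawn from any particular CBR system.

    from collections import defaultdict

    case_base = []             # frame-like cases: dictionaries of slots and fillers
    index = defaultdict(list)  # maps an index feature to the cases it points to

    def store_case(case, index_slots=("goal", "object")):
        # Indices are chosen in advance, when the case is entered into the case base.
        case_base.append(case)
        for slot in index_slots:
            if slot in case:
                index[(slot, case[slot])].append(case)

    def retrieve_cases(problem):
        # A case is retrieved whenever one of its precomputed indices appears in the new problem.
        hits = []
        for slot, value in problem.items():
            hits.extend(index.get((slot, value), []))
        return hits

    store_case({"goal": "cook pasta", "object": "spaghetti",
                "steps": ["boil water", "add spaghetti", "cook 9 minutes"]})
    print(retrieve_cases({"goal": "cook pasta", "object": "macaroni"}))   # retrieved via the goal index
    print(retrieve_cases({"goal": "plan a party"}))                       # no index matches: nothing retrieved

The second call illustrates the disadvantage noted above: a problem whose goal was not anticipated when the cases were indexed retrieves nothing.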

After a case has been retrieved from the case base, it must be applied to the current problem. Sometimes, the case can be applied without any further effort. For example, if a person who is cooking macaroni recalls a previous occasion of cooking macaroni, he or she can simply repeat the steps of the previous situation. If the new situation differs in any way from the previous situation, the latter must be altered to be used in the new situation. In many cases, the changes are minor, and the case need only be tweaked (Schank et al., 1994). For example, if a person is cooking macaroni and is reminded of a time when he or she cooked spaghetti, the person can substitute macaroni for spaghetti in the retrieved case and then carry out the events. This is not a substantial change, but it does require that the old case be properly aligned with the new case so that the proper substitution can be made.

Large-scale changes to a retrieved plan may also be required. Here, an existing case may be adapted to the new situation. For example, imagine cooking macaroni and retrieving a previous situation in which one cooked angel-hair pasta. Because it is so thin, angel hair cooks in only a few minutes; macaroni is thicker and requires more cooking time. If a person follows the angel-hair pasta plan, substitutes macaroni for angel hair, and cooks it for the same amount of time, the macaroni will be too hard. In this situation, other reasoning is necessary. One must know that pasta gets softer as it cooks, and one should change the plan by increasing the cooking time. In this case, one has adapted the old case to the new situation (Alterman, 1988; Hammond, 1990).
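Continuing the pasta example, the difference between a tweak and a more substantial adaptation can be sketched as follows. The substitution rule and the crude "thicker pasta cooks longer" adjustment stand in for the domain knowledge that real adaptation requires; all names and numbers are invented.

    def tweak(case, old_item, new_item):
        # A tweak: substitute one object for another, leaving the rest of the plan intact.
        steps = [step.replace(old_item, new_item) for step in case["steps"]]
        return {**case, "object": new_item, "steps": steps}

    def adapt_cooking_time(case, thickness_ratio):
        # An adaptation: use causal knowledge (thicker pasta cooks longer) to change the plan itself.
        return {**case, "cook_minutes": round(case["cook_minutes"] * thickness_ratio)}

    angel_hair = {"object": "angel hair", "steps": ["boil water", "add angel hair"], "cook_minutes": 3}
    macaroni = adapt_cooking_time(tweak(angel_hair, "angel hair", "macaroni"), thickness_ratio=3.0)
    print(macaroni)   # the substituted plan now also cooks long enough for the thicker pasta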

The idea that a retrieved example requires cognitive effort to be applied in a new situation has psychological support. Researchers have extensively explored how people solve insight problems (e.g., Sternberg & Davidson, 1995), an example of which is the nine-dot problem shown in Figure 8.4. The problem is to draw four lines through all nine dots in the three-by-three grid of dots without lifting the pencil. The solution to the problem, shown in Figure 8.5, involves drawing lines that go outside the boundaries of the grid. Studies have demonstrated that an important component of the insight experience (i.e., the “aha” experience) that goes along with solving an insight problem is retrieving a relevant case from memory. For example, other problems like the nine-dot problem involve larger grids; the solution to these problems also involves drawing lines that extend beyond the boundaries of the grid, but the specific lines are not the same. Thus, many people report having an “aha” experience with other versions of the nine-dot problem (if they have solved the nine-dot problem itself before), because they are reminded of the nine-dot problem and remember that problems of this type can be solved by drawing lines that go outside the boundaries of the grid. Nevertheless, they still need a lot of time to solve the problem even after having this insight experience; the insight provides an avenue for solving the problem, but does not provide a complete solution (Weisberg, 1995). This result supports the idea that recalling a relevant case from memory is only part of the process of using a previous experience. The retrieved case must also be adapted to the current situation to be used.


FIG. 8.4. The nine-dot problem. The goal of this problem is to draw four straight lines that pass through all nine points without lifting the pencil (answer in Figure 8.5).

When a new situation leads to a tweak or adaptation of an old case, this new situation is a candidate to be stored in the case base as well. In CBR systems, storing the case requires placing its representation into the case memory and indexing it. Typically, CBR systems have an automatic indexing system that specifies particular slots whose fillers become the indices to the new case. As discussed earlier, object, goal, and goal failure information are frequently used as retrieval indices.

Not all these mechanisms are fully implemented in the CBR models that have been developed. Many systems work with a fixed case base that is not updated with new cases. In these systems, the case base was selected to be representative of the problems in the domain, and updating this case base is deemed unnecessary. In many systems, the process of adapting and tweaking is carried out primarily by users of the systems. In these programs, the case-based system selects a relevant case and may even suggest possible changes to the case selected. These systems act only as advisers to a human user, who has ultimate authority over how the case is used. Case-based reasoning systems have been applied to real-world problems, including architectural design, autoclave loading, and case law retrieval (Kolodner, 1993).


FIG. 8.5. The solution to the nine-dot problem.

SWALE: A Case-Based Explainer

As an example of a system that does case-based explanation, I consider SWALE (Schank & Leake, 1989), a program designed to generate explanations of anomalous occurrences based on past events. SWALE contains the four components in Figure 8.3.2 The program was originally designed to explain the mysterious death of the racehorse Swale, who died after winning the Belmont Stakes. In casting about for an explanation for the death of an athlete, one can look for similar events. SWALE has in its case memory a number of possible similar events such as the death of Jim Fixx, a jogger and author who wrote about the joys of jogging. He had a heart defect and collapsed one day after running. Likewise, one can speculate that Swale had a similar hidden defect that caused him to die after exertion.

In SWALE, memory consists of objects and memory organization packets (MOPs). The object representations describe the attributes of objects and relate them to one another in abstraction hierarchies via class-inclusion relations (e.g., Swale was a racehorse, which is a kind of horse, which is a kind of animal). The MOPs (see chap. 7) describe events (e.g., the events involved in a horse race as well as extra information such as the fact that a horse race requires exertion on the part of the horse). One other memory structure in SWALE is the explanation pattern, a previous explanation that the system has stored. For example, SWALE might know about Jim Fixx. The explanation pattern for Jim Fixx would say that Fixx had a heart defect and that the heart defect caused Fixx’s untimely death.

The explanation patterns are the cases in memory that must be retrieved to generate an explanation. SWALE has an indexed case base. The primary indices in the case base are unexpected events. In the case of Swale, the unexpected event is that, whereas a young racehorse is expected to be alive, Swale died. This story is classified as involving an unexpected death, which is used as a cue in a search of memory. Explanation patterns in memory are retrieved if they were indexed as unexpected deaths as well. The case of Jim Fixx would be indexed this way (as might a situation like that of the singer Janis Joplin, who died of a drug overdose).

After a case is retrieved, it must be fitted to the current situation. SWALE adapts cases by using a set of tweakers. These tweakers allow objects and events in the explanation pattern to be changed if the current situation has different values for the objects and events, but they do not affect the causal connections between elements in the story. For example, SWALE compares the case of Swale to that of Jim Fixx and notices that Swale was a horse and Fixx was a person. The tweaker looks at its information about horses and people and decides that the difference is not important because horses and people are both animals and can die. After making this change, the program determines that Jim Fixx died after jogging, but that Swale did not jog. Because the MOPs have information that jogging is a kind of physical exertion and that horse races are a kind of race, which is also a physical exertion, this substitution is also allowed. Thus, the differences between the Swale and Fixx cases are not deemed fatal to the explanation, and the Jim Fixx explanation pattern is accepted as a possible explanation for Swale’s death. The explanatory aspect of this case would be the suggestion that Swale had a heart defect just as Jim Fixx did. This aspect is the explanation, because it is the relational information that binds together the parts of the Swale story connected to the expectation failure.
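The check that licenses substituting Swale for Jim Fixx can be caricatured as a walk up an abstraction hierarchy: a substitution is allowed when the two fillers share an ancestor. This is only a sketch in the spirit of SWALE's tweakers, not the program's actual code, and the hierarchy below is a fragment I have made up for illustration.

    kind_of = {"Swale": "racehorse", "racehorse": "horse", "horse": "animal",
               "Jim Fixx": "person", "person": "animal",
               "jogging": "physical exertion", "horse racing": "physical exertion"}

    def ancestors(concept):
        # Walk up the class-inclusion (abstraction) hierarchy from a concept.
        chain = [concept]
        while chain[-1] in kind_of:
            chain.append(kind_of[chain[-1]])
        return chain

    def substitution_allowed(old_filler, new_filler):
        # A tweak is licensed when the two fillers share an ancestor in the hierarchy.
        return bool(set(ancestors(old_filler)) & set(ancestors(new_filler)))

    print(substitution_allowed("Jim Fixx", "Swale"))         # True: both are animals
    print(substitution_allowed("jogging", "horse racing"))   # True: both are forms of physical exertion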

More elaborate tweaking may also be necessary. For example, if there was a case in memory for the death of Janis Joplin (who died of a drug overdose), a simple substitution is not possible. Nothing in the immediate story about Swale points to drug use. Instead, other knowledge would have to be brought to bear, such as that athletes sometimes use drugs to enhance their performance. Further, the system would have to know that horses cannot administer drugs to themselves, and another agent would have to be posited for giving the drugs to Swale. In fact, elaborate explanations like this can be generated by SWALE if enough object and event knowledge is given to the system. SWALE does not really evaluate the explanations. If it finds a few different explanation patterns that it can tweak, it returns more than one possible explanation for the observed anomaly. Presumably, however, some heuristics can be used to evaluate competing explanations. For example, explanations requiring extensive tweaking should probably be deemed less plausible than explanations requiring little tweaking.

Finally, after an explanation is generated, it can be incorporated into memory. One strategy that SWALE uses for incorporating new cases into memory is to generalize the initial explanation. After discovering that both horse racing and jogging are forms of physical exertion, SWALE can generate an explanation pattern that assumes that physical exertion is important and that the particular form of physical exertion is not. Likewise, if SWALE notes that Swale and Fixx were both animals but were not both human, the new explanation pattern can be stored as true of animals in general, not just of people. After generalizing the explanation pattern, SWALE can index it. SWALE uses two types of indices for new explanation patterns. First, the explanation patterns are indexed by the anomaly that they explain (in this case an unexpected death). Second, they are indexed by the causal preconditions in the situation. Thus, causally relevant features are assumed to be important for retrieval. In the case of Swale, physical exertion would become an index, as would death (Schank & Leake, 1989).

Discussion and Limitations

Case-based reasoning systems attack complex reasoning problems by looking for previous solutions that are similar rather than by reasoning from abstract principles. The advantage of using similar cases is that these cases are known to work. Jim Fixx really did die from the combination of a heart defect and exertion. Janis Joplin really did die from an overdose. Thus, the explanation in a case is already known to work in at least one domain. If explanations had to be constructed before they were applied, the explanations would first have to be tested to make sure they were plausible.

Case-based reasoning systems also solve a difficult combinatorial problem. Any anomalous situation or planning problem may have an infinite number of possible explanations or solutions. By limiting the search for answers to the set of items in memory, a case-based reasoning system is able to find solutions efficiently. This advantage also limits the system to explanations that bear some resemblance to episodes from past experience. In many real-world domains, however, the combinatorial problem is probably more severe than the possibility that there will be too little good information in memory, and so it is reasonable to use a case-based reasoning approach. Case-based reasoning systems can be augmented with other methods of generating explanations to provide even more powerful engines for solving problems.

The notion that previous cases can help to solve important combinatorial problems seems to be exploited by people in situations in which they are asked to be creative (Ward, 1994, 1995). In particular, people tend to extend existing categories when making novel inventions rather than seeking highly novel solutions to new problems. Ward (1995) pointed out that during the development of the railroad, conductors and brakemen initially rode on the outside of the train, because drivers sat outside on stagecoaches and stagecoaches were a dominant means of transportation when railroads were being developed. Thus, designers extended the design for a stagecoach to this new mode of transportation because the stagecoach had solved important problems for transporting people (e.g., where they should sit, how luggage should be stored). Unfortunately, this design also presented problems; railroad cars had a tendency to derail and tip over. Thus, the design of railroad cars was eventually modified to put the crew on the inside to avoid injuries. Such incremental creativity in invention is common.

Case-based reasoning systems are typically designed as programs that are applied in practical settings. They are not really cognitive models, and the mechanisms they use are not necessarily those that people use to solve the same problems. Although people also seem to use previous examples to solve difficult problems (Gick & Holyoak, 1983; Reed & Bolstad, 1991; Reed, Ernst, & Banerji, 1974; Ross, 1984, 1989), they do not appear to access these problems by using an indexing system like those implemented in CBR models (Gentner, Rattermann, & Forbus, 1993).

There has been significant research on the way that analogs in memory are accessed in the area of analogical retrieval and problem solving (see Reeves & Weisberg, 1994, for a review of this area). The generic research paradigm here is that people are given a previous case or example and some time later are asked to solve a new problem. The new problem is similar to the previous case in some way, and researchers look for evidence that the previous case was used to solve the new problem. In one classic set of studies, Gick and Holyoak (1980, 1983) examined people’s ability to use a previous example to solve a new problem. First, they told people a story that would later be useful in solving a problem. The story told of a general who wanted to attack a fortress surrounded by mines, and so he divided his army into small groups and had them converge on the fortress. Later, these people were given a problem to solve. This problem, Duncker’s (1945) ray problem, describes a patient with an inoperable tumor that can be destroyed by radiation, but radiation strong enough to destroy the tumor will also destroy the healthy tissue around the tumor. The subjects must find a way to destroy the tumor without destroying the healthy tissue.

The solution to this problem is similar to that in the story about the general. Weak rays should be directed at the tumor from different directions so that they converge on the tumor, destroy it, but spare the surrounding healthy tissue. The striking finding in these studies is that, even though people had just read a story containing a relevant solution, they made little use of it. In a typical study, 10% of the subjects solved the ray problem spontaneously without being given the story about the general. Gick and Holyoak found that only about 20% of the subjects solved the ray problem correctly when given the story about the general. There are two possible explanations for this finding. First, people may have failed to notice that the story about the general was relevant to the problem. Second, people may have recognized that the story was relevant, but were unable to apply the solution to the ray problem. To distinguish between these possibilities, Gick and Holyoak gave people time to solve the ray problem after reading the story about the general. If they failed to solve it, they were given a hint to use a story they had just read. Given this hint, 92% of the subjects solved the ray problem. This finding suggests that people simply did not notice that the story was relevant to the problem, because they could use it once they were told that it was relevant.

The story about the general is similar to the ray problem only in that both require a common solution to a problem. The troops in the military story are not similar to healthy tissue. The fortress is not similar to a tumor. Thus, this result suggests that people do not retrieve things from memory whose only similarities are in the relations between objects. This pattern of results has been confirmed in studies using similar procedures (Gentner et al., 1993; Holyoak & Koh, 1987; Novick, 1988; Ross, 1984, 1987, 1989).

If people do not retrieve things from memory on the basis of similarities in relations between objects, what do they use to guide retrieval? Much evidence seems to suggest that commonalities in the objects in pairs of situations drive retrieval. In one study demonstrating this point, Gentner et al. (1993) had people read stories. A week later, the same people read other stories and were asked whether they were reminded of any of the earlier stories. Stories with similar characters served as good retrieval cues whether or not they had a similar plot. Stories with dissimilar characters and similar plots served as poor retrieval cues. Related work by Wharton and colleagues (1994) found a small effect of matching relations between cue and memory items. In their study, people read two stories, only one of which had a plot similar to a cue story presented later. The story with the similar plot was more likely to be retrieved than was the story with the dissimilar plot, but only if the stories also had similar characters. Taken together, these studies suggest that the presence of common objects is very important for retrieval.3 Thus, people do not appear to index previous cases by goals and prominent relations in the way that many CBR models do.

This pattern of data is rather perplexing. It seems advantageous for a person in a new situation to be reminded of episodes with a similar relational structure, because such episodes are likely to be useful in constructing new plans and problem solutions. Although a complete review of this issue is beyond the scope of this chapter (see Forbus, Gentner, & Law, 1995; Hammond, Seifert, & Gray, 1991; Reeves & Weisberg, 1994), at least two important factors make purely relational remindings less attractive than remindings that involve some common objects. First, as discussed in chapter 5, processes (like analogical comparison) that operate over structured representations are computationally expensive. It takes substantial processing resources to place the arguments of matching relations in correspondence. This degree of processing is not feasible for a memory retrieval process that must take into account all items in memory. For this reason, many computational models of analogical retrieval have adopted a two-stage process in which the first stage is computationally inexpensive and operates over nonstructured representations; only later processing is affected by structural elements of the representations like relations (see chap. 10; Forbus et al., 1995).4

Second, as interesting as it is to find relational remindings that come from very different domains (like being reminded of the general’s story when given the ray problem), these remindings are often not useful relative to mundane remindings of highly similar situations. When a person goes to a supermarket to buy tomato sauce, it is probably best to be reminded of other experiences of buying tomato sauce rather than of situations in which the person was selecting a vacation spot, even though both involve a choice. The specific factors important to choosing a tomato sauce (e.g., spices, thickness) are more likely to be part of a stored tomato sauce purchase than of a stored vacation decision. A similar situation occurs in scientific reasoning, in which the best “analogy” to a current problem often comes from a similar domain rather than a dissimilar one. Dunbar (1997) demonstrated that microbiologists often make analogies from one bacterium to another. These analogies are very rich and useful, but they also involve pairs of domains with many similar surface elements. Thus, case-based reasoning systems that attempt to find far-flung analogies (like SWALE) may be aimed at a problem that is much harder than those that people are generally required to solve to use past experience in their daily life.
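Returning to the first of these factors, the two-stage retrieval scheme can be sketched in a few lines: a cheap first stage scores every item in memory with a flat feature-vector overlap, and only the few survivors are handed to a more expensive structural comparison. The sketch follows the spirit of two-stage proposals such as MAC/FAC (Forbus et al., 1995), but the code, the toy memory, and the crude relation-counting stand-in for structure mapping are my own simplifications.

    def feature_overlap(u, v):
        # Stage 1 measure: cheap, nonstructural overlap of flat feature vectors.
        return sum(weight * v.get(feature, 0.0) for feature, weight in u.items())

    def shared_relations(r1, r2):
        # Crude stand-in for a full structure-mapping comparison.
        return len(set(r1) & set(r2))

    def retrieve(probe, memory, top_n=2):
        # Stage 1: score every memory item; keep only the best few candidates.
        candidates = sorted(memory, key=lambda m: feature_overlap(probe["vector"], m["vector"]),
                            reverse=True)[:top_n]
        # Stage 2: align relational structure only for those few survivors.
        return max(candidates, key=lambda m: shared_relations(probe["relations"], m["relations"]))

    memory = [
        {"name": "buying tomato sauce", "vector": {"store": 1, "sauce": 1}, "relations": ["choose"]},
        {"name": "the general and the fortress", "vector": {"army": 1, "fortress": 1},
         "relations": ["divide", "converge"]},
    ]
    probe = {"vector": {"store": 1, "sauce": 1, "pasta": 1}, "relations": ["choose"]}
    print(retrieve(probe, memory)["name"])   # the mundane, surface-similar memory is returned

With these toy items, the surface-similar tomato-sauce memory survives the first stage and also wins the structural comparison, echoing the point that mundane remindings usually serve everyday reasoning well.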

METAPHORIC SYSTEM MAPPINGS AND ABSTRACT CONCEPTS

Concrete domains are also useful in the representation of abstract domains. In chapter 6, I briefly mentioned that cognitive linguists have speculated that abstract domains can be conceptualized in spatial terms (Gibbs, 1994; Lakoff & Johnson, 1980). For example, English has a system of meaning that follows the mapping MORAL is UP. One can say:

(8.1) [an “up” expression referring to a moral decision]

to mean a moral decision, and:

(8.2) [a “down” expression referring to an immoral act]

to refer to an immoral act. The speculation on the part of cognitive linguists is that space is easy to understand and so it helps people comprehend complex concepts.

Space is not the only concrete domain that has been suggested as a source domain for metaphors for talking about abstract concepts. Gibbs (1994) gives an extended discussion of the metaphor ANGER is HEATED FLUID IN A CONTAINER. In this metaphor, the abstract domain of anger is conceptualized in terms of a closed container filled with fluid. As the level of anger increases, the container heats up; the increased pressure on the container may cause it to burst. This metaphor can be seen in the following passage:

Mary’s blood was beginning to boil. The longer she watched John, the more the pressure began to build inside her. She was unable to vent her anger, and finally blew up at John, spewing her rage.

(8.3)

In this passage, Mary’s anger acts like heated fluid. This account assumes that the action of heated fluid is easier to understand than is that of anger, and it is easier to communicate about anger by using this metaphor than to use language that refers only to abstract emotional concepts. Cognitive linguists did not have to search very far to find metaphors like this one. As shown in Table 8.1, English is filled with metaphors in which physical systems refer to abstract concepts.

Such metaphors are assumed to set up mappings between representational systems (Fauconnier, 1994; Gentner, 1988; Gibbs, 1994; Lakoff & Johnson, 1980). Lakoff and Johnson (1980) and Gibbs (1994) were not very specific about how these metaphors are represented, but Gentner (1988) and Fauconnier (1994) both suggested that metaphors involve finding correspondences between structured relational representations. For Gentner, the comparison process is the structure-mapping process described in chapter 5. On this view, the relations among concrete objects can be transferred to the abstract domain.

TABLE 8.1
Some Metaphoric Systems for Abstract Concepts in English

[table entries not reproduced]

Fauconnier does not provide a specific comparison process but focuses on metaphor as one of several linguistic devices that involve setting up local domains in discourse. In each of these local domains, the representations are structured. Understanding a metaphor involves setting up one domain for the base of the metaphor and one domain for the target and tracking the correspondences between them.5 For Fauconnier, language comprehension involves creating subdomains and finding correspondences between them. These correspondences can be simple identifications, as in the case of metonymy, where one object stands for another. For example, understanding the sentence:

(8.4) [a metonymic sentence in which “the ham sandwich” refers to the customer who ordered it]

involves identifying the person who ate the ham sandwich with the thing he or she ate and setting up a domain in which the ham sandwich refers to the person. This domain is separate from the “real” world, in which the phrase “ham sandwich” refers to a ham sandwich. The distinction between the real world and the metonymic world can be seen in the sentence:

(8.5) [a sentence that uses “the ham sandwich” to refer to both the customer and the sandwich itself]

This sentence does not make sense; it uses the phrase “ham sandwich” to refer both to the person (in the metonymic world) and to a ham sandwich (in the real world).

Although there is no doubt that English uses the language of concrete domains to refer to abstract situations, what this fact reflects about ongoing cognitive processes is unclear (Murphy, 1996). Possibly, metaphoric language is a sign of active conceptual mappings between domains. On this view, when people use the metaphor ANGER is HEATED FLUID IN A CONTAINER, they activate their knowledge about anger and also about heated fluid and construct a mapping between these domains. A second possibility is that the observed metaphoric language is largely a collection of cognitive fossils enabling people to see metaphors that were active in the past. On this view, people interpret passages like Example 8.3 by using literal meanings of phrases that refer directly to anger, even though someone may have used these same phrases metaphorically before they were incorporated into the language.

Murphy (1996) pointed out several potential problems with the view that abstract concepts are understood only metaphorically. One problem is that many base domains used in metaphors are themselves rather complex and may not be well understood by people using the metaphor. These observed metaphorical systems may involve primarily literal meanings of terms that were once used metaphorically. On this view, many word senses that were once metaphorical are now literal. A second problem is that the metaphor view does not explain which aspects of the base domain are carried over to the abstract target domain. For example, the metaphor ANGER is HEATED FLUID IN A CONTAINER does not include the fact that hot fluid thrown from a container can burn those around it. Otherwise, it would make sense to say:

(8.6) [a sentence describing someone’s anger burning the people nearby]

Thus, although examples of language from concrete domains do seem to be used to express abstract concepts, there is no account of why some concepts are carried from the base to the target and others are not.

This argument should not be taken to mean that active use of metaphor is unimportant in understanding abstract concepts. Rather, the dispute is about the degree to which metaphors in which words for concrete domains refer to abstract concepts involve active processing of the metaphor rather than access of stored meanings relating to abstract concepts.

EXEMPLAR MODELS

Specific instance information has also been important in recent models in psychology. Researchers have examined whether human performance can be modeled by assuming that people store only previous instances (typically called exemplars) and that subsequent processing is a simple function of this stored information. In this section, I review exemplar models of classification, automaticity, and memory and then discuss the representational assumptions that these models have in common.

Exemplar Models of Classification

An important task in the study of category acquisition is classification. In the typical classification task, people are shown a new item and asked to classify it into one of a small number of categories. After making a response, subjects receive feedback and are then given another trial. This process continues until subjects reach some accuracy criterion or perform a particular number of trials. Subjects’ performance in these studies is often described by mathematical models. A prominent class of mathematical models for this task is the exemplar model (Estes, 1986, 1994; Kruschke, 1992; Medin & Schaffer, 1978; Nosofsky, 1986, 1987). Exemplar models assume that people store each instance that they are shown as well as the correct category label for the instance. On each new trial, the new instance is compared to all exemplars stored in memory, and a response based on the similarity of the new exemplar to previously seen exemplars is given.

One early exemplar model, which can serve as an example here, is Medin and Schaffer’s (1978) context model. The context model assumed that exemplars are stored in memory as collections of features. When a new instance is presented, each of its features is compared with those of all exemplars stored in memory. For each new instance–old exemplar pair, each feature of the items is compared, and a similarity value that ranges between 0 and 1 is given for each feature comparison. Identical features get a value of 1, and less similar features get lower similarity values. The feature similarity values for each dimension are multiplied together to form the overall similarity between exemplars. The probability that a new instance is classified in a particular category is a function of the sum of the similarity values of the new instance to all old exemplars in that category divided by the sum of the similarity values of the new instance to all old exemplars in memory. Most exemplar models of classification have this general structure, although they have been extended with other mechanisms, such as those that allow different dimensions to be given different attention weights (Kruschke, 1992; Nosofsky, 1986, 1987). In Nosofsky’s (1986) generalized context model, objects are represented as points in a multidimensional space (e.g., chap. 2), and the similarity of items is a function of the distance between points in that space. The spatial representation allows the dimensions of the space to be stretched or shrunk to reflect changes in the importance of the dimensions as a function of context.
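The core computation of the context model is compact enough to write out. In the sketch below (my own rendering of the idea, with arbitrary stimuli and an arbitrary mismatch parameter s), matching features contribute 1 to the similarity of a pair, mismatching features contribute s, the per-dimension values are multiplied, and the probability of a category response follows the ratio rule just described:

    def exemplar_similarity(item, exemplar, s=0.3):
        # Multiply per-dimension similarities: 1 for a match, s (0 < s < 1) for a mismatch.
        sim = 1.0
        for a, b in zip(item, exemplar):
            sim *= 1.0 if a == b else s
        return sim

    def prob_category(item, exemplars, labels, category, s=0.3):
        # Summed similarity to the target category, divided by summed similarity to all exemplars.
        totals = {}
        for exemplar, label in zip(exemplars, labels):
            totals[label] = totals.get(label, 0.0) + exemplar_similarity(item, exemplar, s)
        return totals.get(category, 0.0) / sum(totals.values())

    # Four binary-featured training exemplars, two per category (the values are arbitrary).
    exemplars = [(1, 1, 1, 0), (1, 0, 1, 1), (0, 0, 0, 1), (0, 1, 0, 0)]
    labels = ["A", "A", "B", "B"]
    print(round(prob_category((1, 1, 1, 1), exemplars, labels, "A"), 2))   # an unseen item close to the A exemplars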


FIG. 8.6. Two categories of faces with prototypes in the middle of each group.

Exemplar models can account for many findings in studies of classification. For example, it is well known that category prototypes are easy to classify. The prototype is typically the “average” exemplar. In Figure 8.6, the prototype is the middle face in categories A and B. Every other exemplar shares three features with this prototype and one feature with the prototype of the other category. As discussed in chapter 2, an interesting finding is that the category prototype may be well categorized, even if it has never been seen during learning (e.g., Posner & Keele, 1970). This finding had been taken as evidence that category acquisition involves the automatic generation of a prototype (that is, of an average exemplar), but exemplar models can also account for this phenomenon. In general, the prototype is similar to many of the instances that are members of the category. Thus, even if the prototype has never been seen before, it is more similar to the exemplars of one category than to those of the other, and hence it is classified correctly.

Exemplar Models of Automaticity

Exemplar models have also been applied to the study of automaticity. Automaticity is achieved in a cognitive skill when the skill can be performed without effort or conscious control. For example, most people can produce basic arithmetic facts automatically. When they must add 6 and 3, they can produce 9 without having to generate the answer through an algorithm (such as starting at 6 and counting up 3). Of course, this automaticity is not achieved easily; it is the result of extensive practice in the domain. For arithmetic facts, teachers often give children hundreds of practice trials for each fact with an emphasis on speed. An early school memory for me is a grade school test in which all 100 two-addend addition facts involving the numbers 0 through 9 had to be completed in 5 minutes. As I recall, the class could not continue to the next lesson until everyone successfully completed this assignment.

Logan (1988) suggested that such automatic processes are carried out by storing individual training instances paired with the response that goes with them. Performance on a task is a race between an algorithm (such as starting at the first addend and counting up to the number of the second addend) and retrieval of a fact from memory. Logan assumed that the more times a particular instance is stored in memory, the faster it can be retrieved. At some point, there are so many instances of a particular fact in memory that the fact can be retrieved faster than the algorithm can be performed. When memory retrieval becomes faster than the algorithm, the skill becomes automatic.
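The race can be simulated directly under two standard assumptions: each stored instance has its own retrieval time, and the response is produced by whichever finishes first, the algorithm or the fastest retrieval. The distribution and the parameter values in this sketch are arbitrary choices for illustration, not Logan's fitted model.

    import random

    def trial_time(n_stored_instances, algorithm_time=2.0):
        # Every stored instance races independently; the fastest retrieval competes with the algorithm.
        retrievals = [random.expovariate(1.0) for _ in range(n_stored_instances)]
        return min([algorithm_time] + retrievals)

    random.seed(1)
    for n in (1, 10, 100):
        mean_time = sum(trial_time(n) for _ in range(5000)) / 5000
        print(n, round(mean_time, 2))   # mean response time drops as practice adds more instances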

According to this initial instance theory of automaticity, a new instance retrieves only memory instances identical to it. Extensions to this type of model have proposed that merely similar instances can be used as well. In the exemplar-based random walk model (EBRW: Nosofsky & Palmeri, 1997; Palmeri, 1997), instances are represented as points in a multidimensional space (as in the generalized context model just described), in which the similarity of two items decreases with the distance between them in the space. In EBRW, a newly presented instance retrieves items from memory, where the probability of retrieving a given item in memory is larger the greater the similarity between the item and the new instance. This model differs from Logan’s instance-based theory in that the response made is not the one associated with the first instance retrieved from memory. Instead, the model accumulates retrievals in a random walk that must travel a particular distance before a response is made.

A diagram of a random walk is illustrated in Figure 8.7. Initially, the response criteria (labeled Response A and Response B) are some distance from the starting point. When an instance is retrieved, the model takes one step toward the response associated with that instance. Then, another instance is retrieved, and another step is taken. This process continues until one of the response criteria is reached. Thus, a response involves many memory retrievals rather than just one. The time needed to respond is based on the number of memory retrievals required before one of the response criteria is reached (or until a competing algorithm finishes processing). The model has been applied successfully to data from tasks designed to examine the development of automaticity (Palmeri, 1997) and classification ability (Nosofsky & Palmeri, 1997).
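A sketch of this decision process is given below: each retrieval samples a stored exemplar with probability proportional to its similarity to the presented item, and the walk steps toward that exemplar's response until one criterion is reached. The exponential similarity function, the criterion, and the toy exemplars are invented for illustration; the actual EBRW model is defined over a continuous multidimensional space with attention-weighted dimensions.

    import math, random

    def ebrw_response(item, exemplars, criterion=3):
        # Similarity falls off exponentially with distance in the feature space.
        sims = [math.exp(-sum(abs(a - b) for a, b in zip(item, ex))) for ex, _ in exemplars]
        counter, steps = 0, 0
        while abs(counter) < criterion:
            # Retrieve one exemplar, with probability proportional to its similarity.
            ex, label = random.choices(exemplars, weights=sims)[0]
            counter += 1 if label == "A" else -1   # take one step toward that exemplar's response
            steps += 1
        return ("A" if counter > 0 else "B"), steps   # the number of steps stands in for response time

    random.seed(0)
    exemplars = [((0, 0), "A"), ((0, 1), "A"), ((5, 5), "B"), ((5, 4), "B")]
    print(ebrw_response((0, 0), exemplars))   # a fast, consistent "A" response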


FIG. 8.7. Illustration of a random walk.

Exemplar Models of Memory

Both the classification and automaticity tasks described in the previous sections are essentially memory tasks. In both cases, appropriate previous elements must be retrieved from memory to make a response. As discussed in chapter 4, many models of memory tasks have adopted semantic network representations (J. R. Anderson, 1983a; Raaijmakers & Shiffrin, 1981), but some models make assumptions closer to those of the classification and automaticity models described in the previous two sections. A common memory task in cognitive psychology involves showing subjects a list of items (e.g., words or pictures) and then probing their memory for the items in some way. The probes can be direct, as in free recall, cued recall, and recognition tasks, or indirect, as in stem completion tasks (e.g., Squire, 1987).6 Some memory models are similar to the exemplar models described in the previous sections, in that each memory item is stored as an instance (which may include some information about the context of presentation). The representations used in these exemplar memory models may involve features or points in a multidimensional space. As in the previously described models, the probability that an item will be retrieved is a function of the similarity of the item and a cue for the item, although memory models incorporate many other factors as well.

Random walk models can be applied to data from recognition experiments. Ratcliff (1988) presented a model in which the recognition probe is compared to memory feature by feature. Each feature match pushes the model one step toward the boundary to respond “old,” and each feature mismatch pushes the model one step toward the boundary to respond “new.” In this case, the random walk model assumes a featural representation of information in memory. Ratcliff further demonstrated that the random walk model can be given a continuous formulation (the diffusion model). In this case, the representation of items is more like a multidimensional space than a set of discrete features.

Another model that focuses on the storage of individual items is MINERVA 2 (Hintzman, 1986). In this model, items in a studied list are represented as vectors of features, in which the value of any feature can range from –1 to 1. A sample memory of this type is shown in Figure 8.8. In models like this, it is possible to think of values of 1 as features that are present, values of –1 as features that are explicitly noted as absent, and values of 0 as features that are absent but not explicitly noted as absent (Markman, 1989). When a memory probe is presented to the model, it is compared with every vector in memory. For each comparison, the dot product (see Equation 2.17) is taken between the probe and the item in memory, and this value is divided by the number of features that are nonzero in at least one of the two vectors. This quantity, which ranges between –1 and 1, is a measure of the similarity of the vectors. The probe is assumed to activate each item in memory according to its similarity to the probe. In particular, the activation of each item is the cube of its similarity score, which has the effect of driving small values of similarity toward 0 while preserving the sign of the dot product. The activation of each item in memory for a sample probe is also shown in Figure 8.8.

Two things can be done with these activations. First, the familiarity of the probe can be determined by summing the activation it generates across all items in memory. The greater this summed activation, the more familiar the probe, so this value can serve as the basis of a judgment that the probe has been seen before. For example, in the memory shown in Figure 8.8, three memory items are strongly activated by the probe. When the activations of the memory items are added together, the probe is judged to be familiar, because there are some moderately high levels of activation. Second, one can generate a memory response by multiplying each feature value of every item in memory by that item’s activation and then summing the results for each feature across items. The features activated by the memory probe are also shown in Figure 8.8. This vector of activated features reflects an average across a variety of items in memory (in this case, the feature values are most strongly influenced by the three highly activated vectors). Because this vector is an average across items in memory, it acts like a prototype. For example, the fourth feature in the probe vector is a 1, but when the activated features from memory are computed, there is very little activation for this feature. This result suggests that a value of 1 was not the dominant value for this feature in memory. In this way, the storage of individual instances can give rise to an abstraction. The model calculates this abstraction at retrieval by using stored information about specific instances.


FIG. 8.8. An exemplar memory like that proposed in the MINERVA 2 model. When the probe is compared to the memory, the items in memory are activated to the degree shown in the column labeled “Activation.” The features activated in memory are shown at the bottom of the figure.
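
The computations just described are simple enough to be summarized in a short sketch (Python). This is only an illustration of the retrieval scheme discussed above; the memory contents and probe are hypothetical, and details of Hintzman’s simulations (such as probabilistic encoding of features) are omitted.

def similarity(probe, trace):
    # Dot product divided by the number of features that are nonzero in at
    # least one of the two vectors; the result ranges between -1 and 1.
    dot = sum(p * t for p, t in zip(probe, trace))
    relevant = sum(1 for p, t in zip(probe, trace) if p != 0 or t != 0)
    return dot / relevant if relevant else 0.0

def retrieve(probe, memory):
    # Activation is the cube of similarity (small values are driven toward 0,
    # and the sign is preserved). Returns the summed activation (familiarity)
    # and the activation-weighted sum of the stored feature values.
    activations = [similarity(probe, trace) ** 3 for trace in memory]
    familiarity = sum(activations)
    activated_features = [sum(a * trace[i] for a, trace in zip(activations, memory))
                          for i in range(len(probe))]
    return familiarity, activated_features

# A hypothetical memory of feature vectors with values in {-1, 0, 1}.
memory = [
    [1, -1, 1, 1, 0, -1],
    [1, -1, 1, -1, 0, -1],
    [-1, 1, 0, 1, 1, 1],
]
print(retrieve([1, -1, 1, 1, 0, -1], memory))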

Exemplar Representation and Retrieval

Exemplar models of classification, automaticity, and memory differ in many of their specific implementational details, but they all seem to share a common set of representational assumptions. In particular, all the models assume either a multidimensional space representation or a feature representation in which the features are independent. In these feature representations, the common and distinctive features are typically not treated differently. Instead, simple feature comparisons are made, and the results of a set of feature comparisons for a given pair are combined in a straightforward way (typically by multiplying the results of the different comparisons together).

This commonality among models is not surprising: All the models solve a fundamentally similar problem. All assume that memory consists of a large number of stored instances. In exemplar models of classification and memory, memory is assumed to consist of all the training stimuli. In exemplar models of automaticity, memory is even larger and consists of all training trials (which can exceed 10,000). For a newly presented instance to be compared against every item in memory, the comparison process must be computationally simple. The simple feature comparisons and spatial distance comparisons in the models described in this section all have that property.
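
The following fragment (Python) illustrates how cheap such a comparison can be. The multiplicative rule echoes the combination scheme mentioned above, but the mismatch value s and the stored items are assumptions chosen for the example rather than parameters of any particular model.

def feature_similarity(probe, instance, s=0.3):
    # Independent feature comparisons combined multiplicatively: matching
    # features contribute 1, mismatching features contribute s < 1.
    result = 1.0
    for p, q in zip(probe, instance):
        result *= 1.0 if p == q else s
    return result

# Comparing a probe against stored instances requires only one pass over the
# features of each instance, so the computation scales easily to thousands of
# stored instances.
stored = [("red", "round", "small"), ("red", "square", "small"), ("blue", "round", "large")]
probe = ("red", "round", "large")
print([feature_similarity(probe, item) for item in stored])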

These models highlight an interesting contrast in the use of representation. Exemplar models that require comparisons of new instances with a large number of previous cases involve simple representations (i.e., features, spaces, or semantic networks). Likewise, as discussed previously, the early stages of models of analogical retrieval also involve simple representations that can be easily compared with many other representations. In contrast, more complex processes of reasoning, explanation, and metaphor involve structured representations. The representations in case-based reasoning systems and those assumed by models of metaphor are structured and contain explicit relations among elements. At first glance, it might seem difficult to reconcile these two approaches, but the two kinds of representations are designed to carry out different functions and may indeed coexist in the cognitive system (see Sloman, 1996, for a discussion of two types of reasoning processes compatible with this suggestion; this issue is discussed in more detail in chap. 10).

FINDING THE RIGHT LEVEL OF ABSTRACTION

As discussed at the very beginning of this chapter, all representations are abstractions to some degree. A perceptual representation of a face incorporates some but not all of the information available in the light that hits the eye. If a symbolic representation is used, each symbol has a particular scope. A question of great importance to cognitive science, one that has received very little attention, is how the scope of a representation is determined.

This issue can be illustrated with the schematic bugs shown in Figure 8.9 (Yamauchi & Markman, 1998). How should the bug in Figure 8.9A be represented? Is the overall outline important? Should it be broken down into features? Even if it is agreed that the bug can be described by features for the head, tail, body, and legs, what features are appropriate? The overall shape of the legs may be important or perhaps the number of legs. Seeing another example, like the one in Figure 8.9B, does not necessarily help. How the features in these two bugs should be represented depends in large part on what is to be done with the bugs and the relation between them. If the bugs are to be classified, the way that the features are described depends on whether these bugs are in the same or different classes. If they are in the same class, perhaps the number of legs is important; the shape of the legs clearly is not. If they are in different classes, perhaps the shape of the legs should be represented. Deciding when two aspects of the world should be treated as manifestations of a single feature and when they should be treated as two distinct features is a complex process (Medin, Dewey, & Murphy, 1983).

Little research has addressed this issue. Studies of word learning in children have revealed some early biases in the information represented about objects. Some data have demonstrated that when children are given a novel noun in the presence of an unfamiliar object, they extend the label to other objects with a similar shape; this finding suggests that the overall shape of objects is represented (Imai, Gentner, & Uchida, 1994; Landau, Smith, & Jones, 1988). Researchers have focused broadly on the shapes of objects, however, and have not explored the components that may make up shape representations. The kinds of perceptual representations discussed in chapter 6 might be useful here.


FIG. 8.9. Two bugs that illustrate the difficulty of determining how features should be described.

Other investigations support the suggestion that the way one feature contrasts with another influences how the information is represented (Schyns & Rodet, 1997; Spalding & Ross, 1994). In one study, Schyns and Rodet (1997) showed people complex perceptual stimuli, which they called Martian cells. Stimuli like those used in their studies are shown in Figure 8.10. There were two classes of cells: One was defined by the presence of a perceptual feature X, and the other was defined by the presence of a feature that was a combination of features X and Y (which I call XY). An example of an XY stimulus is shown in the third row of Figure 8.10. Half the subjects in this study saw examples of the X category followed by examples of the XY category; the other half saw the examples in the reverse order (i.e., XY followed by X). After this training, subjects were shown a new stimulus that contained both features X and Y, but as separate elements that were not joined (I call this category X-Y).


FIG. 8.10. Perceptual stimuli like those used by Schyns and Rodet (1997).

When people saw the XY category first, they tended to categorize the new X-Y stimuli in the X category. This result was probably obtained because these subjects saw the XY feature first and treated it as a single indivisible feature; when they later saw the X feature alone, they treated it as a second indivisible feature. For these subjects, the XY and X features were diagnostic of the categories. Because they had no representation of the Y feature on its own, they did not treat it as relevant when the X-Y stimulus was shown, and so the X-Y item was categorized in the X category.

In contrast, when subjects saw the X category followed by the XY category during training, they tended to categorize the X-Y stimulus in the XY category. In this case, seeing the X category first gave them a representation of the X feature. Then, when the XY stimulus was shown, the complex XY feature could be broken down into its X and Y components. These subjects would have category representations in which features X and Y were both diagnostic of the categories. Thus, when the X-Y stimulus was presented, they were able to recognize that it contained both X and Y elements and hence categorized it in the XY category. This study suggests that people’s previous history with perceptual items influences the way they represent those items.

Finally, language may play a role in determining an appropriate level of abstraction for features. In many situations, properties of objects are described in different ways on the basis of how they should be represented. For example, I often take my son to the Museum of Natural History in New York and show him the dinosaur fossils. Pointing to an apatosaurus skeleton, I may say, “That dinosaur ate plants.” Later in the visit, we often pass displays of animals local to the northeastern United States, and I may point to a cow and say, “Cows eat grass.” When I say that the dinosaur ate plants, I am providing information about its diet but also information about the right level of abstraction for representing this property. I could have said that the dinosaur ate grass (as I did for the cow) or even that the dinosaur ate a particular species of plant. By using the general term plants, I signaled that the feature should be represented at a high level of abstraction for dinosaurs (but not for cows). In this way, language may be a powerful force for helping people to determine how features should be represented.

SUMMARY

As this chapter demonstrates, there are many cases in which people could use abstract concepts but instead seem to reason with specific information. In case-based reasoning, rather than storing abstractions that capture only the important relations that hold for a domain, people seem to store specific cases. New reasoning situations involve retrieving past cases and adapting them to the new situation. Analysis of language for abstract concepts suggests that people talk about abstract domains by using the language of concrete situations. This metaphoric language suggests that (at least at one time) these abstract domains were actually conceptualized in concrete terms. Finally, many models of memory-based processes (including classification, automaticity, and retrieval) involve the storage of individual cases that are retrieved directly and used to process new items.

It is clear that specific instance information is important in cognitive processing. Nonetheless, abstractions are also crucial. Generalized structures like MOPs help people see similarities across domains without getting caught up in the fine details of specific situations. Having abstract schemas can be helpful. Earlier in this chapter, I described one of Gick and Holyoak’s studies, in which people were unable to retrieve the story about the general’s armies converging on the fortress when given Duncker’s ray problem. In an extension of this study (Gick & Holyoak, 1983), people were given two analogous stories and were allowed to compare them. Later, they were given Duncker’s ray problem. Unlike the subjects in the first study described, the subjects in this study often solved the ray problem. This finding suggests that comparing the two analogous stories allowed people to form a schema for a “convergence” solution to the problem (i.e., that a force should converge on something to allow it to be destroyed). This schema was easier to recall than the specific case was.

The main point is that cognitive representations probably involve a mix of specific and abstract information. I have focused here primarily on the use of specific cases, because research on knowledge representation has historically focused on abstraction. This chapter is meant to demonstrate that, when thinking about representation, specific cases should be on a par with abstract information, not that cognitive representations should contain only information about specific instances.

1There has been considerable debate about where these rules come from. Cosmides (1989) suggested that humans are born equipped with a capacity to learn rules of social exchange because of evolutionary history. Cheng and Holyoak (1989) proposed that social interaction may be sufficient to create these rules. In general, it is difficult to resolve debates about the evolutionary basis of cognitive processes, because the evidence for the evolutionary explanation is the presence of a particular behavior (e.g., social reasoning ability) and a plausible evolutionary story.

2The LISP code for a small version of SWALE, called microSWALE, is presented by Schank, Kass, and Riesbeck (1994).

3These are not the only possible cues that can be used for retrieval. Many problem situations are connected to a goal that they address; for these problems, the goal may also serve as a retrieval cue (Seifert, 1994).

4A multistage model has also been developed by Thagard, Holyoak, Nelson, and Gochfeld (1990).

5Fauconnier (1994) referred to these domains as “mental spaces.” I avoid this term here to avoid confusion with the mental space representations discussed in chapter 2.

6In a stem completion task, subjects first read a list of words. Some time later, they are given the first three letters of a word (e.g., STR_______) and are asked to complete the stem with the first word that comes to mind. A word beginning with a given stem is more likely to be given as a completion to a stem if it appeared on a previous study list than if it did not. This task is thought to tap implicit memory, because even people with amnesia, who have no recollection of the previous list, show evidence of having seen the list in a stem completion task (Cohen & Eichenbaum, 1993; Squire, 1987).
