Concepts and Communication
When Washoe was 27 months old, she made a hole in the then flimsy inner wall of her house trailer. The hole was located high up in the wall at the foot of her bed. Before we repaired the hole, she managed to lose a toy in the hollow space between the inner and outer walls. When Allen Gardner arrived that evening, she attracted his attention to an area of the wall down below the hole at the level of her bed, signing OPEN OPEN many times over that area. He easily understood her problem and fished out the toy. It was exciting to realize that a chimpanzee had used a human language to communicate new and unexpected information. Soon situations of this sort became commonplace. For example, Washoe’s playground was in the garden behind a single-story house. High in her favorite tree, Washoe was often the first to know who had arrived at the front of the house and her companions on the ground learned to rely on her to tell them who was arriving and departing.
COMMUNICATION AND INFORMATION
Washoe could tell her human companions things that they did not already know. This is what Clever Hans could not do. Clever Hans was a German horse that seemed to do arithmetic by tapping out numbers with his hoof. Hardly anyone besides his owner believed that Clever Hans could really do arithmetic, but for a long time no one could figure out how he got the right answers. Not the circus trainers or the cavalry officers, not the veterinarians or the zoo directors, not even the philosophers and the linguists who studied the case could explain how Clever Hans did it. Eventually, an experimental psychologist, Oskar Pfungst (1911), unraveled the problem with the following test. Pfungst whispered one number into Clever Hans’s left ear and Herr von Osten, the trainer, whispered a second number into the horse’s right ear. When Clever Hans was the only one who knew the answer, he could not tap out the correct sums. He could not tell his human companions anything that they did not already know.
Now that he knew the source of information, Pfungst observed Herr von Osten more carefully. He soon noticed that the trainer always wore a hat with a large brim which he pointed down toward Clever Hans’s hoof when it was time to start tapping and raised up again when it was time to stop tapping. Pfungst soon demonstrated that he could get whatever number he wanted from Clever Hans by lowering and raising the brim of his hat.
The truly interesting thing that Clever Hans was doing has become clearer in recent times with studies of pragmatic devices in everyday human conversation. The speech of normal conversation is embedded in other sorts of behavior—tone of voice, gestures, facial expressions—called pragmatic devices. Most structural theories of language (e.g., Chomsky, 1972) ignore pragmatic devices even though they are obviously critical aspects of everyday communication. The meaning of a sentence can be completely altered by a lifted eyebrow or a shrugging shoulder, for example.
The glances of a human speaker and listener play a vital role in conversational turn taking. The listener looks at the face of the speaker, usually the lower face, presumably focusing on the mouth of the speaker. At the beginning of a turn, the speaker looks away from the listener and only looks back at the listener’s face (a) when checking for approval or understanding, after which the speaker looks away again and continues to talk, and (b) when turning the initiative over to the listener, after which the new speaker looks away and the new listener gazes at the face of the speaker (Argyle & Cook, 1976, pp. 98–124; Kendon, 1967).
Apparently, Clever Hans had learned how to take turns in a European conversation. At the end of a question, German speakers looked down at Clever Hans’ organ of communication, his hoof. At the end of an answer they looked up, often exclaiming their approval of the answer. All Clever Hans had to know was when to start his turn by tapping his hoof and when to end a turn by ending his tapping. This explains why perfect strangers, even hatless strangers, could get correct answers from Clever Hans.
Since Pfungst (1911), controls for Clever Hans errors have been standard procedure in comparative psychology. As late as 1997, forced choice tests of comprehension by human infants continue to omit such controls (for example, Girouard, Ricard, & Décarie, 1997; Hill, Collis, & Lewis, 1997; Shwe & Markman, 1997). Students of child development seem to believe that, whereas horses and chimpanzees may be sensitive to subtle nonverbal communication, human children are totally insensitive to pragmatics.
A classic study by Fraser, Bellugi, and Brown (1963) is typical of many studies of human children. Fraser et al. showed pairs of pictures to individual 3-year-old children. Each pair of pictures was untitled, of course, and illustrated a grammatical contrast, such as “the dog is biting the cat” versus “the cat is biting the dog.” As the experimenters showed each pair of pictures, they asked the child to point at the picture that illustrated one of the contrasts. Fraser et al. comment that “S sometimes pointed quickly, then reflected and corrected himself; the last definite pointing is the one we always scored” (p. 129). As they were in full view of the children, the experimenters could indicate the correct picture by gaze direction alone, or by approving and disapproving facial expressions. They could do this without uttering a word, without being any more aware of their hints than the baffled experts who tested Clever Hans.
What is wrong with the procedure of Fraser et al. is that the observers knew the correct answer before the children replied. The tests described in this chapter kept the observers from knowing the correct answer until after the chimpanzee responded. As in the case of Washoe telling us where she lost her toy and who was coming, in these tests the chimpanzees knew something that the observer did not know. They had information to communicate.
VOCABULARY TESTS
Early in Project Washoe, we devised vocabulary tests to demonstrate that chimpanzees could use the signs of ASL to communicate information. For Washoe’s first test, we mounted color photographs (mostly cut from magazine illustrations) on 8.5 × 5.5 inch cards. An experimenter selected a random sample of 6 to 10 cards and placed them facedown in a cupboard of Washoe’s trailer the night before a test session. Early in the morning an observer held up the cards one by one and asked Washoe to name them. For each card, the observer wrote down the first sign that Washoe made and then looked at the card and scored the response as correct or incorrect.
In order to extend the test to three-dimensional exemplars, we put objects into specially designed 12 × 13 × 9.5 inch plywood boxes. One side of each box was clear Plexiglas; the other sides were opaque. An experimenter selected exemplars randomly from a pool of photographs and objects and placed them one by one in a box. Again, the observer exposed the window side of the box to Washoe without looking inside and wrote down the first sign that Washoe made. We soon abandoned the box test because it was too cumbersome, but it gave us a valuable piece of information about the difference between photographs and three-dimensional exemplars.
Three-dimensional exemplars of objects like bibs and brushes fit easily into a small box, but exemplars of CAT, DOG, COW, BIRD, and CAR are another matter. In order to present both photographs and three-dimensional exemplars for all categories, we used high-quality figurines and models as well as photographs for the larger objects. We soon noticed that the expensive replicas produced significantly more errors than the photographs. Washoe’s favorite error for replicas of cats, dogs, cows, birds, and cars was BABY, but she rarely called any of the photographs of these items BABY. In a sense, she treated the three-dimensional replicas as less real than the photographic replicas (B. T. Gardner & R. A. Gardner, 1971, pp. 160–161).
On the basis of these results, we developed a procedure that used photographs exclusively. For a more detailed description of the testing procedures and results, see R. A. Gardner and B. T. Gardner (1984) and Campbell-Jones (1974).
Objectives
The first objective of these tests was to demonstrate that the chimpanzee subjects could communicate information under conditions in which the only source of information available to a human observer was the signing of the chimpanzees. To accomplish this, nameable objects were photographed on 35-mm slides. During testing, the slides were back-projected on a screen that could be seen by the chimpanzee subject, but not by the observer. The slides were projected in a random sequence that changed from test to test so that neither the observer nor the subject could memorize the sequence.
The second objective of these tests was to demonstrate that independent observers agreed with each other. To accomplish this, there were two observers. The first observer (O1) served as interlocutor in the testing room with the chimpanzee subject. The second observer (O2) was stationed in a second room and observed the subject from behind one-way glass, but could not see the projection screen. The two observers gave independent readings; they could not see each other and they could not compare observations until after the test.
The third objective of these tests was to demonstrate that the chimpanzees used the signs to refer to natural language categories—that the sign DOG could refer to any dog, FLOWER to any flower, SHOE to any shoe, and so on. This was accomplished by preparing a large library of slides to serve as exemplars. Some of the slides appeared in pretests that served to adapt subjects, observers, and experimenters to the testing procedure. The slides that were reserved for the tests never appeared in pretests so that the first time that a particular chimpanzee subject saw any one of the test slides was on a test trial and each test slide appeared on one and only one test trial. Consequently, it was impossible for a chimpanzee to memorize particular pairs of exemplars and signs. Scores on these tests depended on the ability to name new exemplars of natural language categories.
Teaching and Testing
In most laboratory studies of nonhuman beings, the same procedures serve both for teaching and for testing. A monkey or a rat, for example, learns trial-by-trial to associate one stimulus with reward and the other with an empty food dish, and the very same trials are scored for correct and incorrect choices and plotted to show learning curves. Washoe, Moja, Tatu, and Dar spent virtually all of their waking hours in the company of some human member of their foster families. During that time, their exposure to objects and the ASL names for objects was very large compared with the brief samples of the vocabulary tests. These tests were as different from the routines of the rest of their daily lives as similar testing would be for young children. For caged subjects, a session of testing is probably the most interesting thing that happens in the course of a laboratory day. For Washoe, Moja, Tatu, and Dar, most of the activities of daily life were more attractive than their formal tests. The cross-fostering regime precluded any attempt to starve them like rats or pigeons to make them earn their daily rations by taking tests.
Getting free-living, cross-fostered chimpanzees to do their best under these stringent conditions required ingenuity and patience. A basic strategy was to establish the testing routine by a regular program of pretests that were short, usually less than 30 minutes, and infrequent, rarely more than two sessions per week. We used the pretests to pilot variations in procedure and to ensure that experimenters, observers, and subjects were all highly practiced at the procedure before the tests.
Rewards
Early in Project Washoe, we learned that when she was too anxious to earn her reward—when she was too hungry or the reward too desirable—then we could expect no more from Washoe than the absolute minimum amount or quality of response necessary to get the reward. If used, food rewards had to be very small—half of a raisin or a quarter of a peanut—more symbolic than nourishing. Attempting to reward Washoe for correct replies in the testing situation created procedural difficulties, which we avoided with Moja, Tatu, and Dar by rewarding them for prompt, clear replies, regardless of correctness. Unlike Washoe, the other subjects were distracted by the treats and would often ask for the rewards by name at critical points in the procedure, so that O1 and O2 could not tell whether the chimpanzees were asking for a treat or naming a picture. Consequently, we abandoned this procedure entirely for Tatu and Dar, and they rarely received any rewards after we discovered this problem during the initial stages of pretesting.
Additional personnel frequently monitored both the procedure in the observation room and the procedure in the testing room (Fig. 15.1). Whether those serving as O1 were aware of it or not, they often revealed their approval or disapproval of a cross-fosterling’s performance by smiling or frowning and by nodding or shaking their heads as well as by signing such things as GOOD GIRL and SMART CHIMP. Gestures and signing evoked more communication from the cross-fosterlings, just as edible treats evoked begging (chap. 8).
ITEMS AND EXEMPLARS
Here the term vocabulary item refers to a category of objects—such as shoes, flowers, dogs—named by a particular sign; and the term exemplar refers to a unique member of such a category in these tests—the 16 × 11 inch back-projected image of a 35-mm color slide.
Photography
Outside of the testing situation, we could use the pictures in ordinary books and magazines—even when the pictures contained objects of many different kinds—because we could point to particular objects or ask questions such as, WHAT BIRD EAT? Under the blind conditions of the tests, however, the objects had to dominate the field of view. The backgrounds had to be very plain, since vividly colored backgrounds were often so distracting that the chimpanzees named the background colors instead of the objects. Note in the examples of Fig. 15.2 that the objects filled the screen regardless of their normal relative size, whether they were bugs, shoes, or cars.
Some of the things the chimpanzees often named, such as grass or water, seemed impossible to present in test slides because they lack any characteristic shape and appear as meaningless forms against a blank background. Some otherwise shapeless objects, such as coffee or facial tissues, made acceptable test slides because they appear in distinctive containers. Still other objects, such as houses and windows, seemed impossible to photograph without backgrounds or foregrounds that contain distracting extraneous detail. An important function of the pretests was to try out different photographic techniques. It was in this way that we learned to avoid vividly colored backgrounds because the chimpanzees sometimes named the color of the background instead of the featured object.
We had to learn to look at the slides with the eyes of our subjects. For example, dramatic slides of leafy trees yielded mixed results, but shortly after Christmas one year, Tatu suggested something better. As she played with her discarded Christmas tree she named it to herself many times. It turned out that Christmas trees, photographed close up against the sky, made highly acceptable exemplars both for Tatu and for Dar. The tops of live evergreens photographed with a telephoto lens were also highly acceptable. Even though evergreens come in some variety, we wanted to demonstrate that chimpanzees can also name deciduous trees under test conditions. Once more, Tatu showed us the way. That winter, on outings in the woods, Tatu frequently called our attention to trees by signing THAT TREE at their bare trunks. With this hint, we discovered that photographs of bare trees in winter made excellent exemplars.
Novelty
Each slide was unique and each slide appeared only once on each test or pretest. An exemplar could be unique because the chimpanzee subject had never seen that object before or because it was the first time that the subject had ever seen a photograph of that particular object or because it was the first time that the subject had ever seen that particular photograph of that particular object. When different exemplars consisted of different photographs of the same object, each was unique in that the object appeared (a) at a different distance, (b) at a different angle, (c) against a different background, (d) under different light, or (e) in a different arrangement of a group, such as fruits, nuts, or shoes. At least three, but usually four, of the five dimensions varied from slide to slide. All except four of the slides used in all of Tatu’s and Dar’s tests were photographs of objects that neither chimpanzee had ever seen before, either directly or in photographs.
Target Signs
The correct sign for each vocabulary item was designated in advance of the tests. That sign and that sign only was scored as correct for that item. Although there were aspects of the pictures for which superordinate terms, such as FOOD, or descriptive terms, such as BLACK, might be correct or incorrect, neither the presence or absence nor the correct or incorrect use of such terms was considered in the scoring of these tests.
Most of the replies consisted of a single sign which was the name of an object. Sometimes, the single noun in the reply appeared in a descriptive phrase, as when Tatu signed RED BERRY for a picture of cherries, or when Dar signed THAT BIRD for a picture of a duck. These replies contained only one object name and that was the sign that was scored as correct or incorrect. Occasionally, a test reply contained more than one object name, as when Washoe signed FLOWER TREE LEAF FLOWER for a picture of a bunch of daisies. In such cases, the observers designated a single sign for scoring (usually the first) without looking at the picture themselves. For each trial and each observer, then, one sign and one sign only in each report was used to score agreement between O1 and O2 and agreement between the reports of the observers and the name of the exemplar.
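The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trial` record, the `score` function, and the sample signs are hypothetical and are not taken from the test records. Each trial carries the target sign designated in advance and the single sign that each observer designated for scoring.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    target: str  # sign designated in advance as correct for the slide
    o1: str      # the one sign O1 designated for scoring
    o2: str      # the one sign O2 designated for scoring

def score(trials):
    """Return the proportion of trials on which O1 and O2 agree, and the
    proportion on which each observer's report matches the target sign."""
    n = len(trials)
    return {
        # Agreement counts matching reports whether or not they are correct.
        "agreement": sum(t.o1 == t.o2 for t in trials) / n,
        "o1_correct": sum(t.o1 == t.target for t in trials) / n,
        "o2_correct": sum(t.o2 == t.target for t in trials) / n,
    }

# Hypothetical reports for four slides (illustrative values only).
sample = [
    Trial("BERRY", "BERRY", "BERRY"),
    Trial("BIRD", "BIRD", "BIRD"),
    Trial("FLOWER", "FLOWER", "TREE"),
    Trial("DOG", "CAT", "CAT"),
]
result = score(sample)
```

Note that the last sample trial counts toward agreement even though both reports are wrong, which is why agreement and correctness are reported separately.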
Table 15.1 lists the vocabulary items that appeared in the tests of Washoe, Moja, Tatu, and Dar. Differences among the subjects in this table reflect differences in their vocabularies as well as a strategy of overlapping tests that sampled the range of picturable objects in the vocabularies without making the tests excessively long. For each test, we chose four exemplars of each vocabulary item to illustrate the range of objects that a subject could name with the same sign. Different breeds represented CAT and DOG, different species represented BIRD and BUG, different makes and models represented CAR, and so on (see Fig. 15.2). The number of vocabulary items and the resulting number of trials (items × exemplars) appear in Table 15.2.
Note to Table 15.1. W = Washoe, M = Moja, T = Tatu, D = Dar.
TEST RESULTS
Table 15.2 shows how the tests accomplished their major objectives. The agreement between O1 and O2 was high for all seven tests; except for Moja, the agreement ranged between 86% and 95% and all agreement was far beyond chance expectancy. Note that this is the agreement for both correct and incorrect signs. Clearly, the signs made by the chimpanzees were distinct and intelligible. The agreement between the signs reported by O1 and O2 and the correct names of the categories is also high; except for Moja, correct scores ranged between 71% and 88% and all scores were far beyond chance expectancy.
Note to Table 15.2. * Assuming that the observer was guessing on the basis of perfect memory for all previous trials that that observer had seen (see text).
a = based on 135 trials; O2 missed 5 trials. b = based on 132 trials; 8 unscorable trials.
Chance Expectancy
The line labeled expected in Table 15.2 needs some explanation. From the point of view of Washoe, Moja, Tatu, and Dar, the chance of being correct by guessing alone would be 1/N where N is the number of vocabulary items on a test and all items have the same number of exemplars. For example, Tatu’s first test had 25 items. With the usual four exemplars of each item, the total number of trials was 100. If Tatu was only guessing on each trial, but selected each guess from 1 of the 25 items on the test, then her average performance would be 1/25 of the 100 trials, or 4 correct, by chance alone. The 1/N estimate may be too low, because it only considers the guesses of the chimpanzees and fails to consider the possibility that the human observers could also be guessing. This is important because we can never know what the chimpanzees were signing; we only know what the observers reported. Indeed, a major objective of the tests was to verify the independent agreement between observers.
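As a rough illustration of the 1/N estimate, the following Python sketch computes the expected number of chance hits for a test with N items and a fixed number of exemplars per item. The function names are hypothetical, and the binomial tail assumes that guesses are independent from trial to trial, which is a simplifying assumption and not a property of the actual test sequences.

```python
from math import comb, sqrt

def chance_expectation(n_items, exemplars_per_item):
    """Expected chance hits and their standard deviation under the
    simple 1/N model (independent guesses among n_items signs)."""
    trials = n_items * exemplars_per_item
    p = 1 / n_items
    return trials * p, sqrt(trials * p * (1 - p))

def prob_at_least(k, trials, p):
    """Binomial probability of k or more hits in `trials` independent guesses."""
    return sum(comb(trials, i) * p**i * (1 - p) ** (trials - i)
               for i in range(k, trials + 1))

# Tatu's first test as described in the text: 25 items, 4 exemplars each.
expected_hits, sd_hits = chance_expectation(25, 4)
```

Calling `prob_at_least(k, 100, 1/25)` then gives the probability of k or more correct replies by guessing alone under this simplified model.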
After each and every trial, we could have randomly reshuffled the 100 exemplars on Tatu’s first test. That procedure is called random sampling with replacement and the odds of guessing correctly would remain the same 1/25 on each and every trial. The trouble with this version of random sampling is that some exemplars could reappear on more than one trial and some might never appear. To make sure that each exemplar appeared only once and also that each vocabulary item appeared exactly four times, we assigned all 100 exemplars to trials in a random but fixed sequence. This is called random sampling without replacement. The trouble with this procedure is that it makes probabilities shift during the course of the test. Because O1 and O2 knew that there were only four exemplars of each item on each test, they could have used this information to improve their guesses in later trials. The last trial is completely predictable, because there is only one exemplar of only one item left in the series. The next to the last trial may be completely predictable, but there are at most two vocabulary items still to appear, and so on.
In random sampling without replacement, the probabilities of later events in a fixed sequence depend on earlier events. Thus, gamblers who can remember the cards in the deck that have already appeared can win significant amounts at games such as blackjack. Dealers in casinos routinely reshuffle the cards at the 50% to 75% point in a deck to defeat “card counters.”
As small as the effect of sampling without replacement might be in these vocabulary tests, good scientific practice demands an exact estimate of chance. Accordingly, Patterson, B. T. Gardner, and R. A. Gardner (1986) developed a general mathematical expression to calculate the effect of random selection without replacement on chance expectancy under the conditions of these tests. As applied to the vocabulary tests, this expression assumes that both observers (a) saw each slide after each trial, (b) had perfect memory for the number of exemplars of each vocabulary item that had appeared before the beginning of each trial, and (c) guessed the correct sign on the basis of the number of exemplars of each vocabulary item that remained on the test. The expected chance scores for each test appear in Table 15.2. In all cases, this estimate is negligibly small compared to the number of correct responses. Since O1 and O2 reported extralist intrusions (signs that were not on the target lists), they were using a less efficient strategy. Small as they are, the values in the expected line of Table 15.2 overestimate chance expectancy.
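The three assumptions can be illustrated with a small Monte Carlo simulation in Python. This is a sketch only: it is not the mathematical expression of Patterson et al. (1986), which is not reproduced here, and the function name and the greedy guessing rule (always guess the item with the most exemplars still to come) are assumptions made for illustration.

```python
import random

def simulate_perfect_memory_guesser(n_items, exemplars_per_item,
                                    runs=2000, seed=0):
    """Average number of chance hits for an observer who sees each slide
    after the trial, remembers every count perfectly, and always guesses
    the item with the most exemplars remaining."""
    rng = random.Random(seed)
    deck = [item for item in range(n_items)
            for _ in range(exemplars_per_item)]
    total_hits = 0
    for _ in range(runs):
        rng.shuffle(deck)  # random but fixed sequence, no replacement
        remaining = [exemplars_per_item] * n_items
        for shown in deck:
            # Guess before the slide is revealed; ties go to the lowest index.
            guess = max(range(n_items), key=remaining.__getitem__)
            total_hits += (guess == shown)
            remaining[shown] -= 1  # the observer sees the slide afterward
    return total_hits / runs

# Hypothetical run with the dimensions of a 25-item, 100-trial test.
mean_hits = simulate_perfect_memory_guesser(25, 4, runs=300)
```

Under these assumptions the simulated guesser does somewhat better than the 4 hits of the 1/N estimate, mostly because the last few trials become predictable, but the total remains a small fraction of 100 trials.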
The expected score for Washoe’s first test is small but appreciably larger than the expected scores for the other tests in Table 15.2 for two reasons. First, this test was shorter than the other tests and predictability depends on the number of vocabulary items—the fewer the items the greater the predictability. Second, and more significantly for this discussion, predictability increases as we approach the end of the test. In all of the tests in Table 15.2, except for Washoe’s first test, each individual observer served for only half of the trials. We did this to demonstrate that at least four different human observers could read the signs of the chimpanzees. The effect on chance expectancy is the same as a dealer in a casino reshuffling the cards halfway through the deck. The smaller number of items and the assignment of the same two observers to all trials of Washoe’s first test account for the higher, but still quite small, expected score on that test.
PRODUCTIVE TESTS VERSUS FORCED CHOICES
Because these were productive tests, Washoe, Moja, Tatu, and Dar could respond with any item in their vocabularies on any trial. This reduces the chance probability of correct responses to a negligibly small number. Tests in other laboratories have often used forced choice tests of understanding with very few response alternatives, usually only two, rarely more than four. With a small number of alternatives, subjects can succeed at these tests with strategies that are irrelevant to the cognitive objectives of the experiment. Once again, the scientist must think like a detective eliminating prime suspects.
For example, a classic test for extrasensory perception (ESP) uses a standard deck of 25 ESP cards consisting of five exemplars of each of five symbols. An experimenter shuffles the deck and then makes a list of the sequence of symbols. A percipient in another room attempts to perceive the sequence without looking at the cards. By chance the percipient should average 1/N or five hits on each run through the deck. Anything significantly greater than that means that the percipient has ESP. Careful experimenters average the number of hits over many runs through the deck to reduce the possibility of chance runs of success.
Tart (1976), arguing for ESP as well as for positive reinforcement, reasoned that percipients would improve their powers if the experimenter gave them positive feedback for every hit. Sure enough, under these conditions percipients got more hits as they progressed through the deck and their average score was greater than chance indicating that positive reinforcement heightens ESP. There is another prime suspect for the results in this case, however. Running through the deck in a fixed order is random sampling without replacement so a percipient can use information about past hits to improve later scores (Read, 1962).
A standard ESP deck of 25 cards samples without replacement every 25 trials. Suppose that Tart (1976) had used a rigorous Skinnerian procedure that ended each trial with a hit and a reward. Suppose that he had allowed percipients to correct for misses. That is, when they missed a card Tart could have allowed a percipient to guess again until each trial ended with a hit. This is called a correction procedure. Suppose further that Tart had sampled without replacement every 5 trials instead of every 25 trials. It is easy to calculate the effect of the correction procedure together with sampling without replacement every five trials. On the first trial of any set of five, chance would be 1/5 or 20%. On the second trial there would be only four possibilities left so chance would be 1/4 or 25%. On the third trial chance would be 1/3 or 33.3%, on the fourth trial 1/2 or 50%, and on the last trial there would be only one possibility so a percipient with ordinary human memory could always get a hit on the last trial. Average chance expectancy under these rigorous reinforcement conditions is 20% + 25% + 33.3% + 50% + 100% divided by five or 45.7%. Under these conditions, a percipient would have to average significantly more than 45.7% hits before a scientist could attribute the results to ESP.
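The arithmetic in this example generalizes to any number of alternatives, as the following Python sketch shows. The function name is hypothetical, and the calculation assumes exactly the conditions stated above: every trial ends with a hit, each alternative appears once per block, and the subject remembers which alternatives have already appeared.

```python
from fractions import Fraction

def chance_with_correction(n_alternatives):
    """Average chance of a first-guess hit over one block of trials when
    each of n alternatives appears exactly once and a correction
    procedure reveals the answer on every trial."""
    # With k alternatives still to come, a guess among them hits with 1/k.
    per_trial = [Fraction(1, k) for k in range(n_alternatives, 0, -1)]
    return sum(per_trial) / n_alternatives

five_way = chance_with_correction(5)   # 137/300, about 45.7%
four_way = chance_with_correction(4)   # 25/48, about 52%
```

Exact fractions avoid rounding at the intermediate steps; converting with `float(five_way)` gives the percentage quoted in the text.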
It may seem more conventional to argue that pigeons can learn concepts from pictures by Skinnerian reinforcement than to argue that human clairvoyants can learn ESP by Skinnerian reinforcement, but the rules of evidence are the same. For example, Wasserman and his associates (Bhatt, Wasserman, Reynolds, & Knauss, 1988; Wasserman, 1993; Wasserman, Kiedinger, & Bhatt, 1988) presented four different conceptual categories of pictures—such as human beings, cats, flowers, or cars—and trained pigeons to peck at four different locations depending on which category appeared on a projection screen. They used trial-by-trial correction and sampling without replacement every four trials. They estimated chance as 1/N or 25% hits. The correct value, however, should be 25% + 33.3% + 50% + 100% divided by four, or 52%. Consequently, many of their demonstrations fail as evidence for conceptual transfer because correct performance was less than or indistinguishable from 52%. Some of their claims for transfer based on hierarchical categories actually show negative transfer when the pigeons were correct less than 52% of the time.
Chapter 3 described the original procedure that Skinner recommended to measure discrimination between stimuli. With pigeons, Skinner recommended presenting stimuli one at a time on a single response key. When S+ appears on the screen, the apparatus dispenses rewards for key-pecking during S+ periods and never dispenses rewards for key-pecking during S– periods. If the pigeon pecks at a higher rate during S+ periods than during S– periods, then, according to Skinnerian tradition, the pigeon is discriminating between S+ and S–. Jenkins (1965) showed that under these conditions pigeons can discriminate between S+ and S– periods without looking at the stimuli. All they have to do is peck after reward and stop pecking after nonreward. In the feed forward view of this book, food in S+ evokes pecking and the lack of food in S– evokes other behavior even if the pigeon is blind.
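Jenkins's point can be illustrated with a small simulation in Python. This is a sketch under simplifying assumptions that are not in the original studies: time is divided into discrete steps, every peck during S+ is rewarded, S+ and S– periods simply alternate, and the simulated pigeon occasionally probes with a single peck while paused. The function and parameter names are hypothetical. The simulated bird never receives the stimulus as input; it only pecks after reward and stops after nonreward.

```python
import random

def blind_pigeon_session(n_periods=40, period_len=60,
                         probe_rate=0.1, seed=0):
    """Count pecks in S+ and S- periods for an agent that cannot see
    the key and responds only to the outcome of its last peck."""
    rng = random.Random(seed)
    pecks = {"S+": 0, "S-": 0}
    pecking = False  # carried across period boundaries
    for period in range(n_periods):
        stimulus = "S+" if period % 2 == 0 else "S-"
        for _ in range(period_len):
            # Peck if the last peck was rewarded, or as an occasional probe.
            if pecking or rng.random() < probe_rate:
                pecks[stimulus] += 1
                # The apparatus, not the agent, knows which period it is.
                rewarded = (stimulus == "S+")
                pecking = rewarded  # keep pecking after reward, stop after nonreward
    return pecks

counts = blind_pigeon_session()
```

In this sketch the peck count is much higher in S+ than in S– periods even though the stimulus never enters the agent's decision, which is the pattern that the Skinnerian measure would score as discrimination.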
In spite of Jenkins’ (1965) analysis, respected experimenters frequently offer such measures as evidence of discrimination and even of categorical concepts as if Skinner’s approval guarantees scientific validity. In a series of experiments, Herrnstein and his associates (Herrnstein, 1979, 1985) projected photographs of natural scenes on the key in a Skinner box. Exemplars of the target category appeared in half of the scenes but were absent from the other half of the scenes. The pigeons received food reward for pecking at exemplars of the target category but never received food when the target category was absent. With training, the pigeons pecked more often at exemplars of the target category than at other pictures. The pigeons could have done quite well without looking at the photographs; all they had to do was peck when rewarded and stop pecking when not rewarded. Jenkins described how proper control groups and transfer tests could provide valid evidence of discrimination in the Skinner box.
Forced-choice tests with very few alternatives create serious problems of measurement that productive tests avoid by allowing a large number of alternatives. The smaller the number of response alternatives, the cruder the measurement and the worse the effect of trial-by-trial reinforcement and sampling without replacement. R. Gardner and Gardner (1978, pp. 61–65, 68) discuss the problems of interpretation raised by forced-choice tests of understanding in chimpanzees.
SIGNS OF ASL
The field records summarized in B. T. Gardner et al. (1989, Table 3.2) show how closely the signs in the vocabularies of Washoe, Moja, Tatu, and Dar approximated the target signs of ASL that human companions modeled for them, allowing of course for childish diction. Fluent signers frequently visited the cross-fostering laboratory and observed the cross-fosterlings signing under naturalistic conditions. The double-blind testing procedures permitted more rigorous confirmation. Two fluent deaf signers, both then recent graduates of Gallaudet College, each served as O2 in Washoe’s pretests in the summer of 1970. Each of these young men participated in two pretests at a time when each had observed Washoe for less than one hour. Their agreement with O1 (who had in each case years of experience with Washoe) rose from 67% and 71% on their first session to 89% for both on their second session.
The task that these fluent signers faced was the same as the task of fluent English speakers identifying words in the speech of equally immature human children after equally brief preexposure to the immature speakers and under equally stringent conditions. These two outside observers who were expert at ASL, but unfamiliar with Washoe, read her signs fairly well at first and then improved markedly in their second test session. The first and second sessions contained different items so the improvement did not depend on the deaf observers learning specific vocabulary items. The initially good agreement with O1 together with the improvement indicates that Washoe’s signs were intelligible to fluent signers with, perhaps, a childish or chimpanzee accent that they could learn fairly quickly.
CONCEPTS
To make sure that the signs referred to conceptual categories, all of the test trials were first trials; that is, the one and only test trial in which a slide appeared was the first time that that chimpanzee ever saw that slide. All of the specific stimulus values varied, as they do in natural language categories; that is to say, most human beings would agree that the exemplars in each set belong together. Apparently, Washoe, Moja, Tatu, and Dar agreed with this assignment of exemplars to conceptual categories.
Significant variation among exemplars and testing with true first trials are essential to the definition of natural language categories. More concerned with theoretical definitions of language than with conceptual behavior, the Rumbaughs and their associates (Rumbaugh, 1977; Savage-Rumbaugh, Pate, Lawson, Smith, & Rosenbaum, 1983) administered hundreds of trials of training and testing with identical exemplars or with minimally varied exemplars. To be sure, in their tests of chimpanzees the Rumbaughs concentrated on the arbitrariness of what they called “lexigrams” as used in arbitrarily fixed sequences. It seems likely that the Rumbaugh chimpanzees could have used natural language categories, given the opportunity to do so.
In 4 years of work with the chimpanzee Nim, Terrace (1979) never attempted any systematic tests at all. His work is unique in this field in that it was entirely restricted to adventitious naturalistic observation without any controls whatsoever for Clever Hans errors. Terrace’s claim that all of Nim’s sign language could have been cued by his human companions reflects only Terrace’s failure to conduct properly controlled tests.
COMMUNICATION AND LANGUAGE
If the development of human verbal behavior requires any significant expenditure of biological resources, then it must return some biological advantages to its possessors. Before it can return any profit, however, a biological trait must operate on the world in some way; it must be instrumental in obtaining benefit or avoiding harm. If clarifying one’s ideas is biologically profitable, it must be because in some way clarified ideas provide superior means for operating in the biological world. As for establishing social relations, a system of displays and cries is sufficient to maintain group cohesiveness in most animals. The advantage of a wider variety of signals would seem to be the communication of more information. But, unless verbal behavior refers to objects and events in the external world, it cannot communicate information and it cannot have any such advantage. From this point of view, reference is the biological function of verbal behavior, and the function of grammar or structure in verbal behavior must be to enlarge the scope and to increase the precision of reference.
In the naturally occurring languages of the world, the pairing of words and signs with conceptual categories is arbitrary. This is amply demonstrated by the mutual unintelligibility of languages and the well-documented history of shifts in forms and usage. Washoe, Moja, Tatu, and Dar, if they had been human children—if they had been angels for that matter—could succeed in these vocabulary tests only by associating the signs of ASL with their referents. Angels may have other ways of associating responses with stimuli, but children and chimpanzees must learn arbitrary associations. Thus, to the extent that the communication of information depends on the arbitrary connection of terms to conceptual categories, then that biological function of a natural language depends on rote learning.
DUALITY OF PATTERNING
Hockett (1978) describes duality of patterning as one of the design features of human languages. The meaningful units or morphemes are composed of smaller units such as vowel and consonant sounds that are meaningless in themselves. There are structural rules for combining sounds or phonemes into meaningful units and there are different structural rules for combining the morphemes into messages. Duality of patterning was a significant source of errors in these vocabulary tests.
Errors often tell a great deal about basic processes. Sometimes more can be learned from errors than from correct responses. In their vocabulary tests, the cross-fostered chimpanzees were free to use any sign in their vocabularies. They chose their errors for themselves rather than from a set of forced choices chosen by the experimenters. The probability of any particular error, like the probability of any particular correct response, was very low, permitting us to detect patterns of errors. Most errors fell into one of two patterns: conceptual errors and form errors. Thus, DOG was a common error for a picture of a cat, SODAPOP for a picture of ice cream, and so on, showing that conceptual groups such as animals and foods were a major source of confusion. Similarly, signs made on the nose such as BUG and FLOWER were confused with each other, as were signs made on the hand such as SHOE and SODAPOP, showing that the cheremic structure of ASL was a source of errors the way the phonemic structure of English is the source of errors in the verbal behavior of human beings (R. A. Gardner & B. T. Gardner, 1984, pp. 393–398).
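The two error patterns can be expressed as a simple classification rule. The sketch below is only an illustration of the logic, not the scoring procedure used in the tests. The group tables contain just the examples mentioned above, and the names `CONCEPT_GROUPS`, `PLACE_GROUPS`, `shares_group`, and `classify_error` are hypothetical.

```python
# Illustrative tables built only from the examples in the text.
CONCEPT_GROUPS = {
    "animals": {"CAT", "DOG"},
    "foods": {"SODAPOP", "ICECREAM"},
}
PLACE_GROUPS = {
    "nose": {"BUG", "FLOWER"},
    "hand": {"SHOE", "SODAPOP"},
}


def shares_group(groups, a, b):
    """True if signs a and b both belong to at least one group."""
    return any(a in members and b in members for members in groups.values())


def classify_error(target, reply):
    """Label an incorrect reply as conceptual, form, both, or other.

    Returns an empty set when the reply is correct.
    """
    if target == reply:
        return set()
    labels = set()
    if shares_group(CONCEPT_GROUPS, target, reply):
        labels.add("conceptual")
    if shares_group(PLACE_GROUPS, target, reply):
        labels.add("form")
    return labels or {"other"}


if __name__ == "__main__":
    print(classify_error("CAT", "DOG"))          # same conceptual group
    print(classify_error("BUG", "FLOWER"))       # same place of articulation
    print(classify_error("ICECREAM", "SODAPOP")) # same conceptual group
```

Because the chimpanzees could reply with any sign in their vocabularies, the proportion of errors that fall into either pattern by chance is small, which is what makes such a tally informative.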
The daily laboratory procedure discouraged the cross-fosterlings from answering a question with a string of guesses, particularly during testing. When a reply contained more than one sign for an object, we scored the first object sign as the only reply. Inspecting the 14% of the replies that contained two or more names for objects, we found that Washoe, Moja, Tatu, and Dar made more errors on these trials than on the trials in which they only offered a single object name as a reply. They seemed to be groping about for the correct sign in these cases as if unsure of themselves.
Pairs of signs within these indecisive replies formed patterns. Conceptual pairs, such as CAT and DOG, or SODAPOP and ICECREAM, and form pairs, such as CAT and APPLE, or BUG and FLOWER, were the most common. Sometimes, pairs were repeated as in, CAT DOG CAT DOG, or BUG FLOWER BUG FLOWER. Sometimes, replies contained a string of related signs as when Washoe signed CAT BIRD DOG MAN for a picture of a kitten, or FLOWER TREE LEAF FLOWER for a picture of daisies.
There are several signs that are made by grasping points along the edge of one hand with the thumb and index finger of the active hand; the end of the thumb is grasped in BERRY, the upper edge of the palm in MEAT, the lower edge of the palm in OIL. In a typical case, Washoe signed OIL BERRY MEAT for a picture of frankfurters, as if the correct sign was on the tip of her fingers. Thus, not only the correct replies, but the errors and the very dithering between alternatives depended on conceptual relations among the referents and cheremic relations among the signs of ASL.
Hockett (1978, pp. 275–276) points out that, when errors depend on phonology as well as on semantics, we have evidence for duality of patterning. Perhaps an example can make this point more clear. Warden and Warner (1928) demonstrated that the German shepherd dog Fellow, a star of movies and vaudeville, could understand instructions in spoken English. In the most critical tests, Fellow’s master spoke from behind a screen and instructed him to fetch objects from the next room. Fellow was good at this task, but he also made interesting errors—such as fetching a collar instead of a dollar—that depend on the sounds of English. We would argue that only a chimpanzee that had learned the shapes of the ASL signs would confuse CAT with APPLE or BUG with FLOWER. In their test errors as well as in their use of cheremic inflections (Rimpau, R. A. Gardner, & B. T. Gardner, 1989), the cross-fostered chimpanzees exhibited Hockett’s duality of patterning.
ETHOLOGY AND OPERATIONAL DEFINITION
Chapter 14 described the ethological considerations that dictated the procedure of cross-fostering and the use of ASL as a naturally occurring human language. In this rich environment and with this rich repertoire of ASL signs, the cross-fosterlings could name quite new examples of a wide variety of objects. In spite of the sharp break with traditional laboratory procedures, the ethologically sound laboratory was entirely compatible with rigorous testing.
Members of the human foster family served as testers who could get the chimpanzees to communicate information in signs. They could do this without forcing the animals to beg for food. They responded with social approval, which fed forward to more communication. By separating communication from extrinsic reward, they widened the range of communication. The chimpanzees could tell them about many things that had nothing to do with eating or drinking. Given the social relationship between the human adults and the cross-fostered chimpanzees, it was a relatively easy matter to construct a testing situation in which the only source of information available to the human testers was the signing of the chimpanzees.
Ethological considerations dictated the use of a naturally occurring human language that left Washoe, Moja, Tatu, and Dar free to use any of the signs in their ASL vocabularies at any time, including testing time, so their errors were their own errors rather than arbitrary alternatives forced upon them by the test. These open tests showed more than numbers of correct and incorrect choices. They showed how errors as well as correct replies depended on the conceptual structure of the vocabularies and the cheremic structure of ASL. The program of research that combined cross-fostering, naturalistic observation, and systematic experiments yielded operational definitions of communication, concepts, and structure. The ethological validity of the laboratory conditions contributed to the experimental operations.
Like all of the methods described in this book, sign language studies of cross-fostered chimpanzees are a tool for studying intelligent behavior. It seems unlikely that a phenomenon as rich as language could be based on an isolated, unitary biological trait, unrelated to the rest of human nature. It is more reasonable to suppose that language is embedded in a complex pattern that relates all aspects of human intelligence. This book argues further that, like other significant biological phenomena, the general laws that govern human intelligence are instances of the general laws that govern the intelligent behavior of all animals. The search for underlying patterns and general biological laws of intelligence led to sign language studies of cross-fostered chimpanzees.
Like the other tools described in this book, sign language studies aim to open up new fields of discovery from the bottom-up rather than to answer traditional philosophical questions from the top-down. The following comment of Bruner (1978) applies directly to the last two chapters. Hopefully, readers can see how it applies in spirit to the bottom-up, feed forward approach of the whole book.
A third trend is also discernible: the bridging of gaps that before were not so much empty as they were filled with corrosive dogmatism. The gaps between prelinguistic communication and language proper as the child develops, the gap between gesture and word, between holophrases and sentences, between chimps signing and man talking, between sign languages and spoken ones, between the structure of action and the structure of language. I think that the renewal of interest in language as an interactive, communicative system has made these ‘gaps’ less like battlegrounds where one fights and dies for the uniqueness of man and more like unknown seas to be mapped. (p. viii)