CHAPTER 9
Appetite, Aversion, and Conflict
Throughout the 20th century, the concept of motivation has served three functions in most psychological theories: 1) driving behavior, 2) directing and selecting behavior, and 3) rewarding behavior. With respect to the driving or energizing function, conventional psychological theory maintained that active behavior requires internal motive force. According to this view, an animal that was satiated for all incentives (hunger, thirst, sex, warmth, and so on) would become completely inactive and probably fall asleep unless aroused by pain or the threat of pain. With respect to the directing and selecting function, in order to eat when hungry and drink when thirsty, an animal must be able to tell when it is thirsty and when it is hungry. Otherwise, animals would drink when hungry and eat when thirsty, which would certainly be an awkward state of affairs. With respect to the rewarding function, if learning depends on contingent reinforcement then motivation determines learning, because motivation determines reward and punishment.
Earlier chapters of this book considered difficulties with the traditional view of contingent reinforcement as the basis of learning. This chapter considers further difficulties with the traditional view and shows how modern developments replace the Greek notion of motive force with the feed forward principles of appetite, aversion, and conflict.
MOTIVE AND DRIVE
In classical times, Greek scientists noted that celestial bodies move in circular patterns, with our earth at the center. Terrestrial bodies move downward by themselves, but human beings and beasts of burden have to push, pull, and carry inanimate objects. Human beings move by themselves, which classical philosophers attributed to divine gifts of internal power. Other animals move by themselves, endowed with a kind of robotic internal force that also required divine intervention. This model of motive forces remained essentially intact through Descartes’ time. It was late in the 18th century that chemists discovered the relationship between combustion and animal energy and only early in the 20th century that the chemistry and biology of this relationship reached its modern form (Mendelsohn, 1964).
The most dramatic technological advances of the 19th century were steam engines that did useful work, and actually moved themselves in contraptions called locomotives. The terms motivation and drive express the concern with sources of motion that comes down to us from classical Greece. During the first half of the 20th century, a series of influential studies seemed to support the traditional energizing view of drive. Animals, mainly rats, lived in an apparatus which consisted of a running wheel and a counter that recorded revolutions of the wheel. Animals lived in the wheel for days, sometimes without food, sometimes without water, and sometimes without either food or water.
The experimenter or a technician visited the apparatus regularly each day to record the number of revolutions on the counter. With so little effort required, experimenters could keep squads of these inexpensive devices running at the same time. Without a learning or perceptual task to be disturbed, they could set up the running wheels anywhere in the laboratory without the bother of isolating the subjects from visual and auditory disturbance. These activity wheel studies all showed the same pattern. Activity increased with hours and days of deprivation up to some maximum and then declined, presumably because the animals weakened from lack of food or water (Morgan & Stellar, 1950, pp. 368–381; Reed, 1947). This seemed to confirm the notion that deprivation causes an increase in drive and drive increases output. This steam engine model of motivation appears in the work of such diverse theorists as Freud, Hull, Skinner, and Tinbergen and remains influential at this writing.
Irritability
The energy output of 19th century steam engines made a great impression on psychologists of the early 20th century because this was the most dramatic technological advance of those times. The most dramatic advances of recent times have to do with systems that process information and govern energy. Modern psychobiologists can see that living animals make their way in the world by responding appropriately to external stimulation. Life depends more on government than on release of animal energy.
In a landmark experiment, Campbell and Sheffield (1953) studied the interaction between activity and external stimulation. First, they isolated their rat subjects in a special insulated cabinet that they placed in a soundproofed room. They further isolated the rats from disturbing stimuli by keeping them in the dark and running a fan that made the same steady hum all day long. They housed each rat separately in a specially designed activity cage, which had a round floor balanced on a central point. Silent mercury switches at four opposite points of the floor recorded all movements that tilted the cage in any way. The animals must have gotten some stimulation from their own movements. Apart from that their stimulus environment was virtually constant, with one exception. For 10 minutes every day, a timer automatically turned on the lights and turned off the fan. Water was always available to the rats from a drinking tube. For 4 days, food was also constantly available and then for 3 days all food was withdrawn.
The results in Fig. 9.1 show that activity depended on stimulation. During the 10 minutes of stimulus change each day (upper curve), activity was much higher than during the 10-minute constant period before stimulus change (lower curve). During the control days before food deprivation, activity remained fairly constant whether we look at the higher level evoked by stimulus change or the lower level during constant stimulation. During the days of food deprivation, activity remained virtually constant during the 10-minute periods of constant stimulation, but rose dramatically with each day of deprivation during the 10-minute stimulus change periods. Without stimulus change, increased hunger failed to increase activity. Increased hunger only increased responsiveness to stimulation—that is to say, it made the animals more irritable.
We can see now why hunger seemed to have a general energizing function when animals lived in running wheels in busy laboratory rooms. Hunger made them more and more responsive to the sights and sounds around them. The running wheels themselves made noise, so that running caused noise that stimulated more running. When Campbell and Sheffield kept the level of stimulation in the rooms nearly constant, and at the same time kept the activity cages themselves silent, the level of activity remained nearly constant even though the level of hunger rose with each passing day. Activity rose with hunger only when the experimenters changed the sounds and the sights that stimulated the rats. These results contradict the notion of hunger as energizing motivation. They show instead that hunger made the animals more responsive to changes in stimulation. Irritability together with the directing and selecting function of drive is enough to account for all evidence that activity or energy output rises with increases in deprivation (see review by Hall, 1976, pp. 212–213).
Campbell and Sheffield’s discovery that hunger increases irritability rather than energy output applies directly to the phenomenon called stimulus change reinforcement (Kish, 1966; Osborne & Shelby, 1975). Kish (1955) was the first to show that if lever-pressing caused changes in illumination, mice increased their lever-pressing. He and many others interpreted this as reinforcement by stimulus change (chap. 4). Suppose, however, that Campbell and Sheffield had arranged their stimulus change experiment somewhat differently. Suppose that instead of fixed periods of stimulus change, they wired the apparatus so that each small amount of activity started a brief stimulus change. We can see from the results they obtained in Fig. 9.1 that each brief stimulus change would evoke activity, which would cause the apparatus to deliver more stimulus change, which would evoke more activity, and so on. This version of the Campbell and Sheffield experiment would seem to show that contingent stimulus change reinforces general activity. We already know from Campbell and Sheffield’s results, however, that stimulus change feeds forward to activity without any contingency at all. Just as in the other cases of stimuli that evoke responses in chapter 8, we cannot attribute this result to contingent reinforcement.
How Many Drives?
Early theorists such as Hull, Skinner, and Freud claimed to be biological and materialistic because they assumed a small number of primary drives—hunger, thirst, sex, and a very few others. Secondary drives were supposed to develop by association with primary drives just as second-order conditioning and secondary rewards were supposed to develop by association between an arbitrary Sa and a biological S* (chaps. 4 and 7). In this view, an infant loves its mother because she brings food, drink, and warmth. In this way, the sight and sound of mothers become incentives that motivate behavior. In humans, her praising voice becomes a reward, her scolding voice, a punishment. Eventually, the sight and sound of other women who resemble her become rewarding or punishing, and so on.
Modern studies, however, reveal a wide variety of incentives that cannot be traced to a history of conditioning with one of the traditional short list of primary drives such as hunger or thirst. Monkeys, for example, are good at solving mechanical problems because they spontaneously manipulate objects such as sticks and straw (Westergaard, 1988; Westergaard, Greene, Babitz, & Suomi, 1995). In a particularly revealing study, H. F. Harlow, M. K. Harlow, and Meyer (1950) left mechanical latches (Fig. 9.2) in cages and showed that monkeys would work the latches open without any extrinsic reward. The latches were mounted on a wall of their cages and nothing else opened when the monkeys undid the latches. The monkeys acted as if the latches, themselves, evoked manipulation.
Later, H. F. Harlow (1958) showed that infant monkeys became attached to artificial dolls if the dolls had the right sort of cloth texture. They preferred a familiar cloth-covered doll to an equally familiar wire-covered doll, even when they got all of their food from the wire doll. After they became attached to a cloth doll, just the sight of the doll could comfort them even when they could only see the doll through a window.
It may be a small step to add a need for contact comfort to the list of primary motives, but a need to manipulate hardware on a wall begins to stretch the concept of primary biological needs. This sort of motivational psychology says that monkeys manipulate hardware on the wall because they need to manipulate hardware on the wall. Why not say they manipulate because they manipulate?
Evocative Stimuli
The latch problems of Harlow et al. (1950) evoked manipulation and the cloth dolls of Harlow (1958) evoked cuddling. The evidence indicates that you love your mother when you are an infant because her skin and her voice evoke cuddling, which is certainly a materialistic discovery.
From a feed forward point of view, motivation is responsiveness to stimuli. In Alice in Wonderland, Alice keeps finding cakes labeled “eat me” and liquids labeled “drink me.” She cannot seem to resist orders to eat or drink even though she is always sorry afterward. In a feed forward system, instead of inside forces that push there are outside stimuli that pull—drink me’s, eat me’s, cuddle me’s, unlatch me’s, and so on. It is very hard to resist touching newly painted walls, fences, and furniture in spite of warning signs and memories of past disasters. Wet paint seems to say “touch me.”
Selection
A working prototype of a modern robot cleans floors without human guidance and, when its batteries are low, seeks out specially designed power outlets to recharge the batteries by itself. Such robots are not yet commercially successful, but the technology for mass-producing them is already commonplace. When a meter attached to the batteries reads above a certain point, the robot cleans the floors; when the meter reads below that point, the robot goes to the nearest outlet and recharges its batteries. Early models had only two motives, but more could be added according to the same principle. They could, for example, grease their own bearings or chase off intruders.
The onboard computer regulates the robot; it does not drive it. Some tasks might require small, delicate movements. The robot would have to carry out such tasks with a low energy output no matter what the level of need of its batteries, its designer, or its present owner. A robot that had to operate at high energy levels when needs were high and low energy levels when needs were low would be unsuitable for many economically significant tasks.
Robots function quite well so long as their prevailing state (Q in the E-S-Q-R paradigm of chap. 4) determines their response to prevailing stimulation. They function without liking clean floors and charged batteries or disliking dirty floors and weak batteries. An engineer who added likes and dislikes to the system would be adding to the burdens of the onboard computer. Unless likes and dislikes added some economic advantage, robots governed by pleasure and pain would disappear from the commercial marketplace. In the same way, unless the likes and dislikes frequently imputed to living organisms add ecological advantages, it is prudent to assume that they cannot survive in evolutionary competition. To keep an animal alive, motivational states only have to increase and decrease the probabilities of competing responses in a way that selects now one type of behavior, now another.
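The selection principle that the robot illustrates can be sketched in a few lines of code. This is a hypothetical illustration, not any actual robot design: the threshold value, the function name, and the two-behavior repertoire are all assumptions made for the sketch.

```python
# Hypothetical sketch of threshold selection: the prevailing state (Q)
# selects which response the current situation evokes. It neither
# energizes behavior nor attaches likes and dislikes to outcomes.

CHARGE_THRESHOLD = 0.3  # meter reading below which recharging is selected

def select_behavior(battery_charge, at_outlet):
    """Return the behavior selected by the robot's state and situation."""
    if battery_charge < CHARGE_THRESHOLD:
        # Low charge does not make the robot work harder or feel anything;
        # it only switches which behavior the same environment evokes.
        return "recharge" if at_outlet else "seek_outlet"
    return "clean_floor"

print(select_behavior(0.9, at_outlet=False))  # clean_floor
print(select_behavior(0.1, at_outlet=False))  # seek_outlet
print(select_behavior(0.1, at_outlet=True))   # recharge
```

Adding a third motive, such as greasing the bearings, would mean adding another state variable and another branch, not adding any source of motive force.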
Summary
In a feed forward system, the effect of hunger and thirst is to increase feeding and drinking behavior and decrease other types of behavior. Rather than eating harder or drinking harder, animals eat or drink more exclusively when they are hungry or thirsty. In the same way, hormonal chemistry can increase responsiveness to sexual stimuli and lower responsiveness to other types of stimulation, such as food and even lectures and textbooks.
AVERSION
To make its way in the world, an organism must obtain food, water, and other goods. It must also escape and—preferably—avoid injury, poison, and other bads. A significant portion of the behavior of human beings and other animals consists of responses to pain or to the threat of pain. In human behavior, we see some of the most dramatic examples of persistent responses that have plainly negative consequences. When human responses to stress persist in spite of their negative consequences, we call them maladaptive and, if the consequences are serious, some even recommend psychotherapy.
This is why so many psychologists are so interested in studies of aversive conditioning in nonhuman animals. Some of the classic symptoms of anxiety resemble conditioned defensive behavior that is evoked in anticipation of impending painful experience. Other symptoms, such as eating and drinking disorders, resemble conditioned consummatory behavior that is evoked in the aftermath of painful experience. Experimenters have justified a great deal of outright torture of captive animals in the name of this cause.
Defensive Aggression
The notion that animals approach appetitive stimulation and withdraw from aversive stimulation is deeply ingrained in Western culture, but it hardly exhausts the ethological possibilities that we find in nature. For example, chapter 2 described how flocks of smaller birds can drive off larger predators by aggressive mobbing. Wild rodents, like squirrels and rats, nest in burrows that shelter them from predators as well as from weather. Some predators, like snakes and weasels, invade burrows and prey on rodents. An invader in the burrow is a highly aversive stimulus, but rodents approach rather than retreat. Ethological observations record a species-specific defensive approach: chest pressed against the floor of the burrow and forelimbs outstretched, hurling loose dirt from the floor at the face of the invader. With luck, the dirt obstructs the passage enough to halt the invasion. Dirt in the face should also feed the predator forward to nonfeeding behavior, so that the predator withdraws.
Pinel, Mana, and Ward (1989), Pinel, Symons, Christensen, and Tees (1989), and others reviewed by them, studied this phenomenon in the laboratory. If there is hurlable material on the floor, such as sawdust, rodents hurl it. If not, they assume the characteristic posture and make the characteristic forelimb movements anyway. Animals that have lived in the wild are more likely to engage in this form of defensive aggression (Heynen, Sainsbury, & Montoya, 1989).
Pinel, Mana, and Ward (1989) placed rodents in a U-shaped burrow. Instead of an invader, the experimenters placed a wire coiled around a small stick that was horizontally mounted on the back wall of one arm of the U. In the course of exploring the burrow, each rat eventually touched the coil and received a painful shock. Each rat immediately retreated to the middle section of the U, but soon returned, chest to the floor and forelimbs throwing sawdust or imaginary dirt at the coiled wire. Rather than avoiding the place where they were shocked, as we would expect from the pleasure-pain principle, rats spent most of their time in the shock section of the U. As usual, what a stimulus makes an animal do is much more significant than what a stimulus makes an animal feel—whether the feeling is pleasure or pain.
Suppression and Induction
In traditional systems of excitation and inhibition, appetites excite and aversions inhibit. In the feed forward analysis of this book, motivation makes animals selectively irritable—that is, more responsive to certain stimuli than to others and more likely to make certain responses than others. Painful shocks suppress eating by positively evoking aversive responses that are incompatible with eating rather than by inhibiting eating.
After an interruption, many forms of behavior resume with increased vigor. When they have free access to food, animals tend to consume a stable amount of food per hour or per day. After a period of deprivation, they tend to consume more than usual at first and then resume their stable rate of consumption. They overcompensate for the deprivation.
That animals should need food seems fairly obvious, but the biological significance of much behavior remains obscure. Dreaming is a good example. Human beings dream for a portion of each night of sleep. We can tell when they are dreaming by rapid eye movements (REM), in which the cornea moves under the closed eyelids; these movements are correlated with self-reports of dreaming. There are cycles of REM and non-REM sleep throughout the night, and individual human beings tend to dream for a stable portion of each night of sleep.
Experimenters have deprived subjects of REM sleep for several consecutive nights by waking them whenever REM started. On nights that follow nights of REM deprivation, subjects overcompensate with more than the usual amount of REM before returning to their usual amount of REM sleep (Vogel, 1975). To say that they have a need to dream is eminently reasonable, but it only tells us that they must have a biological need to dream because they dream. The supernormal increase in dreaming after dream deprivation is like the supernormal increase in feeding after food deprivation.
Stimuli that evoke incompatible responses also interrupt ongoing behavior. Painful electric shock evokes a pattern of behavior that is incompatible with feeding, and the interruption is followed by a supernormal increase in feeding in the aftermath of the shock (Tugendhat, 1960).
Rhythmic Patterns
In Pavlov’s laboratory, when the interval between the conditioned stimulus and the food
… is short, say 1–5 seconds, the salivary reaction almost immediately follows the beginning of the conditioned stimulus. On the other hand, in reflexes which have been established with a longer interval between the two stimuli the onset of the salivary response is delayed, and this delay is proportional to the length of the interval between the two stimuli and may even extend to several minutes. (Pavlov, 1927/1960, p. 88)
Presumably, temporal conditioning is possible because there are biological rhythms which correlate with time so well that animals can use them as stimuli. The cyclically repeating events of a conditioning experiment—a constant stimulus that begins each trial, followed by the S*, followed by an intertrial interval—are favorable conditions for temporal conditioning.
Temporal conditioning also appears in the Skinner box. If the apparatus delivers food at intervals, then the distribution of responses varies accordingly. Lever-pressing or key-pecking is absent soon after each delivery of food and begins again later on, reaching its peak toward the end of each interval, just before food usually arrives. This pattern appears whether or not the delivery of food is contingent on responding (Staddon & Simmelhag, 1971; Zeiler, 1968). The absence of lever-pressing and key-pecking immediately after receiving food is called the postreinforcement pause.
During the postreinforcement pause, subjects engage in nonfeeding behavior such as grooming or preening (Anderson & Shettleworth, 1977; Shettleworth & Juergensen, 1980). If there is a drinking tube in the chamber, rats drink during the pause in lever-pressing and, as conditioning progresses, they may drink three or four times as much as normal (Falk, 1971). This overdrinking is called polydipsia. If, instead of a drinking tube, there is a second (suitably restrained) pigeon in the chamber, the subject pigeon attacks it vigorously during the pause in key-pecking, beginning these attacks with elements of the classic ethological agonistic pattern, including a swaying, head-lowered approach, deep-throated growls, and wing striking (Azrin, Hutchinson, & Hake, 1966). Such nonfeeding behaviors, often called adjunctive behaviors, are evoked by the schedule of widely spaced food delivery; they fail to appear when food deliveries are frequent and they cease when food deliveries are stopped (Cohen & Looney, 1984).
An S* evokes some responses and suppresses other incompatible responses. When an S* ceases or is consumed, other responses recover, often at a supernormal level in the aftermath of the S*. If there are repeated, spaced intervals between deliveries of S*, then an S* itself becomes a stimulus for a period without any S*. As the animal synchronizes its behavior to the rhythm of the spacing, the animal makes prefeeding or preshock responses late in the interval when an S* is imminent, and makes otherwise incompatible responses early in the interval when an S* is unlikely to appear (Janssen et al., 1995).
An Sa that appears late in the interval is contiguous with prefeeding or preshock behavior, and an Sa that appears early in the interval is contiguous with nonfeeding or nonshock behavior. Consequently, prefeeding and preshock responses become conditioned to an Sa that precedes the S*, and nonfeeding or nonshock responses become conditioned to an Sa that follows the S*. In this way, the responses that become conditioned to the Sa depend on the phase of the inter-S* interval.
In a feed forward system, differentiation between the pre- and post-S* phases of the trial cycle should appear only after a certain amount of repetition of the cycle. At first, food should only evoke consummatory responses, and shock should only evoke defensive responses. With very few trials there should be little difference between the responses paired with an Sa that occurs before S* and the responses paired with an Sa that occurs after S*—perhaps no difference in response at all after only a single trial. Thus, backward conditioning should be roughly the same as forward conditioning after a single trial (chap. 4). With repetition of the trial cycle, the suppression and induction effects should condition opposite responses in the pre- and post-S* phases.
Avoidance
Painful electric shocks delivered through a grid in the floor of the conditioning chamber evoke running and jumping in many animals. If the shock ceases as soon as they respond, animals learn to run or jump more quickly. This is called escape. They also readily learn to run or jump during a warning interval that precedes shock. In a highly effective procedure, animals avoid shock altogether if they respond before the end of the warning period. This is called avoidance. It seems clear that the reward for escape conditioning is that pain stops when shock stops. But, what is the reward for avoidance?
A series of pioneering experiments by Solomon and his associates established the basic phenomena of avoidance conditioning. The subjects were usually dogs trained to avoid shock in a two-compartment apparatus. The conditioned stimulus, signaling the onset of a strong shock, was a change in illumination and the lowering of a door, permitting the dog to jump over a shoulder-high hurdle and get into the other compartment. A favorable warning interval between the onset of the warning signal and the onset of shock is about 10 seconds. Jumping the hurdle to the “cold” compartment escapes shock in the early stages of conditioning. The dogs avoid the shock entirely when they jump before the end of the warning interval.
Figure 9.3 shows the responses of a typical subject. Note that:
1. After seven trials with latencies longer than 10 seconds, the dog responded soon enough to avoid the shock on Trial 8 and every trial after that.
2. Instead of extinguishing in the absence of any additional shocks, the dog improved its performance throughout the 32 remaining trials, improving rapidly at first and then more slowly, producing the usual negatively accelerated learning curve. In fact, after 10 successive avoidances the dog responded so promptly that it never received another shock.
3. Signs of fear such as increased heart rate and breathing rate tend to appear during the first three or four trials of avoidance conditioning, followed by a marked decrease later in learning. During later trials, animals appear to be quite calm in the experimental apparatus. Liddell (1956) found this same calmness in the laboratory chamber when he conditioned sheep with electric shock. In his experiments, Liddell often shocked the same individual in the same laboratory chamber for several years. Ethologically interested in his animals, he continued to observe them outside of the laboratory. He reported dramatically disturbed behavior in the pasture and the barn after long periods of aversive conditioning with shock (1956, pp. 51–67).
4. Vigorous hurdle jumping persists for hundreds of trials, which completes the picture of the problem confronting the reinforcement theorist attempting to account for the facts of avoidance learning.
How is such rapid and persistent conditioning possible with so little reward or punishment? At first, the principle of persistence after partial reward (chap. 11) seemed to account for the persistence of conditioned avoidance. According to this view, conditioned avoidance resists extinction because it gets intermittently reinforced by occasional lapses and consequent punishments during acquisition. Typically, however, as in Fig. 9.3, there is shock rewarded by escape from shock on every trial until the first avoidance. From then on dogs avoid all shocks. Persistent avoidance develops without any period of intermittent reward.
Solomon and Wynne (1953) proposed an elaborate two-factor theory to explain avoidance conditioning. The main elements of this theory are similar to Mowrer’s (1960) two factors. Fear becomes conditioned to the warning signal by stimulus-response contiguity. After that, the animal escapes from the fear evoked by the warning stimulus. But how does conditioned fear, itself, resist extinction? This is a puzzle, since second-order conditioning is so weak and easily extinguished under laboratory conditions (chaps. 4 and 7). Nevertheless, versions of this two-factor theory remain popular among those who would preserve reinforcement theory at all costs (e.g., Schwartz & Reisberg, 1991, pp. 139–153). The problem with theories of conditioned fear is that the fear in these theories is an entirely hypothetical phenomenon that exists only to fill the gaps between reinforcement theory and the results of experiments. By adding new imaginary phenomena to account for each new experimental finding, reinforcement theories weaken their claim as down-to-earth, hardheaded scientific formulas and lose much of their credibility.
Compatibility
Defensive responses evoked by painful shock are incompatible with feeding. Since lever-pressing is a feeding behavior, it should be very difficult to condition a rat to press a lever to avoid shock. Meyer, Cho, and Wesemann (1960), for example, failed to obtain any appreciable amount of lever-pressing for the reward of shock avoidance under a variety of procedures that would have been quite favorable for conditioning running or jumping. D’Amato and Schiff (1964), using a similar procedure, gave rats 60 trials per night for a total of 123 nights, or 7,380 trials of avoidance conditioning altogether. And still, half of their subjects failed to develop any appreciable level of avoidance responding. Results of this kind were the point of departure for Bolles’ influential discussion of species-specific defense reactions in aversive conditioning (Bolles, 1970).
These studies used a discrete trial procedure—warning signal, followed by scheduled shock or avoided shock, followed by intertrial interval. In D’Amato and Schiff (1964), the intervals between shocks were about 3 minutes; in Meyer et al. (1960) the interval between shocks was about 1 minute; and both used a 5-second warning interval. These are favorable intervals for conditioning rats to run or jump to avoid shock. Reasoning that the association between warning and shock should be stronger if the onset of the warning signal were closer to the onset of the shock, Meyer et al. tried still shorter warning intervals but only obtained still less lever-pressing during the warning interval. Greater suppression is what we would expect, of course, if anticipatory defensive responses are conditioned to the warning signal while components of feeding behavior (e.g., lever-pressing) are suppressed by defensive responses (see discussion of conditioned emotional response, CER, in chap. 2). In that case, the closer the warning signal to the shock, the more the defensive effect should suppress feeding.
Apparently, D’Amato and Schiff did not record responses between trials—that is, between each shock and the next warning signal. Meyer et al. did record intertrial responses, and they reported “[intertrial] bar-pressing rates that often reached extremely high values. There were many instances, in fact, of rates above two thousand per hour; had these rates been evenly distributed in time, at least a third of all the subjects would have shown us ultimately perfect avoidance. We obtained, instead, a suppression of the rate when the warning stimulus came on” (p. 226). Again, intertrial lever-pressing is what we would expect if components of feeding behavior that were suppressed during the warning interval recovered at a supernormal level during the intertrial interval.
The longer the warning interval the more the onset of the warning signal predicts an interval without shock. Longer warning intervals should evoke less defensive behavior and permit more lever-pressing. The onset of very long warning intervals should seem like signals of safety and actually induce lever-pressing. Berger and Brush (1975) lengthened the warning interval and that is precisely what they found. The longer the warning interval the more often their rats avoided shock by pressing the lever. This increased as they lengthened the warning interval from 20 to 30 and even to 60 seconds. As we would expect—if lever-pressing is suppressed by signals close to shock and enhanced by signals remote from shock—most of the lever-pressing appeared shortly after the onset of the warning signal. The warning signal tells the subject that shock will not come for 20, 30, or 60 seconds. Denny (1971) discusses in detail a great deal of related evidence on the temporal distribution of defensive and nondefensive responses together with a theory that also attributes the observed distribution of responses to theoretical phases of relief and relaxation in the aftermath of shock.
Operant Avoidance
Skinner’s operant procedure (chap. 3) has neither trials nor intertrial intervals. The delivery of S* punctuates conditioning sessions according to a schedule that is contingent, noncontingent, or partially contingent on a criterion response such as lever-pressing or key-pecking. Sidman (1953) introduced an avoidance procedure for rats in which painful electric shocks are the S* and each lever-press lengthens the interval between shocks. A clock delivers shocks at the end of shock-shock intervals and response-shock intervals. If the animal presses the lever or pecks the key, the next shock comes at the end of the response-shock interval, typically 20 seconds. If an animal fails to respond after a shock, the next shock comes at the end of the shock-shock interval, typically 5 seconds. Consequently, the shorter the interval between responses the fewer the shocks. If an animal always responds within the response-shock interval, it never gets any shock at all. The vast number of successful experiments that have used variants of this method proves that Sidman’s procedure is highly effective for conditioning rats to press levers with shock as the S* (see Hineline, 1977, 1981, for extensive reviews of this literature).
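The two clocks of Sidman's procedure can be sketched as a short simulation. This is an illustrative reconstruction rather than code from any study cited here; the function name is hypothetical, the default intervals (5-second shock-shock, 20-second response-shock) come from the description above, and the assumption that the first shock runs on the shock-shock clock is ours.

```python
def sidman_shocks(response_times, session_end, ss=5.0, rs=20.0):
    """Return the times at which shocks are delivered under a Sidman
    free-operant avoidance schedule (illustrative sketch).

    ss -- shock-shock interval: delay from one shock to the next
          when the animal does not respond
    rs -- response-shock interval: delay from each response to the
          next scheduled shock
    """
    shocks = []
    responses = sorted(response_times)
    idx = 0
    next_shock = ss  # assume the first shock is scheduled on the shock-shock clock
    while next_shock <= session_end:
        if idx < len(responses) and responses[idx] < next_shock:
            # a response before the scheduled shock postpones it
            next_shock = responses[idx] + rs
            idx += 1
        else:
            shocks.append(next_shock)   # shock delivered
            next_shock += ss            # no response: shock-shock clock runs again
    return shocks
```

An animal that never responds is shocked every 5 seconds, whereas an animal that always responds within the response-shock interval is never shocked at all, exactly as the text describes.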
Where defensive responses, such as running and jumping, serve to avoid or postpone shock, animals respond most frequently at times when shock is imminent. By contrast, in Sidman’s operant procedure rats characteristically press the lever in bursts in the aftermath of each shock and make the majority of their responses early in the intervals between shocks when there is little or no probability that shocks will occur (Ellen & Wilson, 1964; Forgione, 1970; Herrnstein & Hineline, 1966; R. W. Powell, 1972; R. W. Powell & Peck, 1969; Sidman, 1958). This distribution of responses is clearly shown in Fig. 9.4 (Powell & Peck, 1969, p. 1059). Just as in the discrete trials procedure, lever-pressing and key-pecking in Sidman’s procedure appear in the aftermath of shock as if evoked by the absence of shock (R. A. Gardner & B. T. Gardner, 1988a, pp. 140–142).
Two Ways to Abolish Contingency
From a feed backward point of view, rats press levers and pigeons peck keys in operant avoidance because this earns them lower probabilities of shock. That is the role of contingency in operant avoidance. From a feed forward point of view, on the other hand, shock evokes lever-pressing and key-pecking as Powell and Peck (1969) plainly showed, regardless of any contingency. In a feed backward system, responses extinguish when contingency is abolished. There are two ways to abolish contingency in operant avoidance: 1) the experimenter continues to deliver shock no matter what the subject does or, 2) the experimenter stops delivering shock altogether no matter what the subject does. The results of this comparison are decisive. In Sidman’s procedure, lever-pressing persists indefinitely when the apparatus continues to deliver noncontingent shocks, particularly if the number and spacing of the shocks is roughly the same as it was in the contingent phase of the experiment (Hineline, 1977, pp. 377–381).
In Herrnstein and Hineline (1966), for example, a typical subject that received noncontingent shocks continued to press the lever for 170 daily sessions of 100 minutes each, making about 20,000 responses before meeting the criterion of extinction. Other experimenters, perhaps because they were less persistent than Herrnstein and Hineline, failed to find any appreciable drop in responding after less extended but still reasonably long periods of noncontingent shock. Under the same conditions, indeed in the same experiments, when shock ceased entirely, lever-pressing ceased abruptly (Hineline, 1977, pp. 377–381). Clearly, shock evokes operant responding in Sidman’s procedure regardless of contingency.
In aversive conditioning with shock as in appetitive conditioning with food, the responses that become conditioned are the responses that an S* evokes. At the same time that it evokes certain responses an S* suppresses other responses, which recover in the aftermath of S*. If an S* repeats at intervals, then the repeating cycles evoke different responses during different phases of a cycle. The S* evokes the responses, but the responses become conditioned to arbitrary stimuli that are correlated, either positively or negatively, with S*. The arbitrary conditioning of response to stimulus depends on the temporal contiguity between arbitrary stimulus and evoked response. Conditioning is independent of any change in contingency that the experimenter arranges between response and S*.
Earning Pain
In a feed forward system, responses become conditioned to an arbitrary stimulus regardless of any contingency between the responses and any S*. If this is true, then lever-pressing and key-pecking will become conditioned to an Sa that appears in the aftermath of shock, even if responses earn more shocks at other times. This is just what happens in Sidman’s operant avoidance procedure. Lights and tones that signal relatively shock-free periods evoke lever-pressing and key-pecking even when the apparatus delivers more shock for more responses—that is, even though the animals earn many additional painful shocks if they respond during shock-free periods (Badia, Coker, & Harsh, 1973; Badia, Culbertson, & Harsh, 1973; E.T. Gardner & Lewis, 1977; Hineline, 1977, pp. 393–398). Thus, in aversive conditioning with shock as in appetitive conditioning with food, rhythmic temporal patterns control conditioning, even when responding has negative consequences for the well-being of the subject.
The positive effects of shock parallel the negative effects of food (see Avoiding Food, chap. 8). These effects are only paradoxical, however, if learning depends on contingencies between responses and feelings of pleasure or pain. The paradox vanishes when we look directly at the responses that an S* evokes instead of trying to imagine the feelings that it might cause.
Summary
The traditional association of appetites with positive approach and aversion with negative withdrawal ignores prominent ecological and ethological aspects of behavior such as defense and aggression. In a feed forward system, stimuli evoke responses and responsiveness depends on the state, Q, of a human or nonhuman animal. Hormonal balance and depletion of resources as well as repeated experience determine responsiveness to particular stimuli. This system can deal with all aspects of motivation without making subjective conjectures about the experience of pleasure and pain. Specifically, it provides a rule for conditioning that fits the facts of experiments. Yes, the results of many experiments seem to agree with the notion that contingent pleasure and pain govern learning. These experiments all fail to evaluate the role of contingency because they fail to compare contingency with noncontingency. Experiments that actually compare contingency with noncontingent control conditions repeatedly show that contingency is irrelevant to conditioning whether the S* is food or painful shock.
CONFLICT
Traditional views of motivation are weakest when they attempt to deal with conflict between motives. This section describes recent developments in computer science that offer a means of dealing with the problem of conflict within a feed forward system.
Homeostasis
Cannon (1932) introduced the term homeostasis to apply to the self-regulation of vital constants in living systems. In this view, such values as temperature, hydration, saline and sugar in the bloodstream, and so on, have critical set points that a living organism must maintain to stay alive. Behaviors such as seeking shelter from the sun and foraging for food are an extension of physiological homeostasis. The traditional analogue of a homeostatic system is the regulation of temperature in a home by a thermostat (chap. 4).
Many successful industrial systems incorporate automatic devices, called comparators, to maintain vital set points. A thermostatic comparator, for example, responds to a temperature sensor by heating up the system when temperature falls below a set point or cooling it down when temperature rises above another set point. Comparators appear schematically in traditional theories (Plooij & Rijt-Plooij, 1994, pp. 359–361; Staddon, 1983, pp. 66–69) that attempt to use control theory to model behavior.
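A thermostatic comparator of this kind reduces to a few lines of code. The set points and the action names below are hypothetical, chosen only to illustrate the two-set-point logic:

```python
def comparator(temperature, low_set_point=19.0, high_set_point=23.0):
    """Classic two-set-point comparator: compare a sensor reading
    against fixed set points and command a corrective action."""
    if temperature < low_set_point:
        return "heat"
    if temperature > high_set_point:
        return "cool"
    return "idle"  # within the dead band between the two set points
```

The comparator knows nothing about the cost of heating or cooling, which is precisely the limitation discussed next.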
Set points and comparators are unlikely models for living animals, however, because they are designed for artificial systems that depend on outside supplies of vital resources. The thermostat in a home works well as long as someone pays for unlimited amounts of energy from external suppliers. When the price of energy rises, many homeowners allow the temperature to fall or rise to uncomfortable levels in preference to uncomfortably empty refrigerators or idle automobiles. Set points fail to control artificial systems when critical needs compete for limited resources.
The traditional analogy between home thermostats and body temperature fails because living systems must balance conflicting needs, or die. Body temperature drops when perspiration evaporates on the body surface, but this cooling device is strictly limited. Perspiration leads to dehydration, and dehydration leads to death. A live animal must replace the water lost in perspiration fairly soon. Finding water usually demands movement, energy expenditure, and further increases in temperature. Finding water often brings risk of death from predators that wait for prey at watering places. Autonomous systems must resolve conflicts.
Robots in Conflict
Figure 9.5 is a modified version of the robot vehicle of Fig. 8.3. In this case, the connections between the light sensors and the motors are ipsilateral so that the robot automatically avoids light and rests in dark places. In Fig. 9.5, there is an additional chemical sensor with contralateral connections to the motors so that the robot approaches chemicals that smell like fuel. An autonomous robot might need to approach sources of fuel to replenish its onboard stocks and yet avoid light because of enemy surveillance. A successful autonomous robot would have to make different decisions at different times depending on the amount of light and the depletion of onboard fuel, just as a living animal must balance danger of predation against depletion of onboard stocks of energy.
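The wiring of Fig. 9.5 can be sketched as a pair of weighted sums in the style of Braitenberg vehicles. The function and gain parameters are hypothetical; the point is only the pattern of connections: ipsilateral light connections steer the robot away from light, contralateral fuel connections steer it toward fuel, and raising the fuel gain as onboard stocks deplete is one way the state Q could shift the balance of the conflict.

```python
def motor_speeds(light_left, light_right, fuel_left, fuel_right,
                 light_gain=1.0, fuel_gain=1.0):
    """Ipsilateral light wiring (avoid) plus contralateral fuel
    wiring (approach). A higher fuel_gain models fuel depletion."""
    left_motor = light_gain * light_left + fuel_gain * fuel_right
    right_motor = light_gain * light_right + fuel_gain * fuel_left
    return left_motor, right_motor
```

Light sensed only on the left drives the left motor harder, so the robot turns right, away from the light; fuel sensed only on the left drives the right motor harder, turning the robot left, toward the fuel.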
The two conflicting motives in Fig. 9.5 underestimate the number of conflicting motives in a moderately useful autonomous robot and certainly underestimate the number of conflicting motives in a moderately complex animal. Under natural conditions, a best or even a relatively favorable balance of motives shifts constantly with shifts in external and internal conditions. The robot vehicle in Fig. 8.3 works well without a central processor. The robot in Fig. 9.5 needs a central processor to resolve conflict.
FUZZY CONTROL
Recent industrial developments require systems that are much more sophisticated than comparators. Fuzzy controllers are simple, effective, and economical systems currently in use under demanding industrial conditions. Fuzzy controllers can balance multiple conflicting demands in situations of flux and change that resemble the problems of living animals under natural conditions. In the following example, two computer scientists in the Research and Technology division of Boeing Aircraft Company, Kipersztok and Patterson (1995), solved a modern industrial problem with an intelligent fuzzy controller. Their solution offers a practical model for analogous problems faced by living animals.
Fuzzy Logic
To readers schooled in traditional Greek logic, fuzzy logic seems to be a contradiction in terms. By Aristotle’s law of the excluded middle, things must belong either to A or to not-A. They cannot belong partly to A and partly to not-A at the same time. Aristotle’s laws of logic are so deeply entrenched that the notion of fuzzy categories sounds downright illogical. On the other hand, the natural world of daily experience appears in shades of gray, described in overlapping, continuous variables rather than sharp divisions. To traditional logicians, this only shows that daily experience is inferior to logical ideals. In this time-honored view, a truer picture of nature will emerge when we obey Plato’s injunction to carve nature at the joints—that is, into mutually exclusive categories.
Following this Greek prescription, many social scientists and philosophers pursue endless searches for the dividing lines between intentional and unintentional, conscious and unconscious, human and nonhuman, and so on. Meanwhile, the natural sciences that have progressed since classical times soon discarded Aristotle’s mutually exclusive categories in favor of continuous variables. Fuzzy, overlapping categories are hardly new. What is new is the discovery that artificial computer systems work well with fuzzy categories. Not only that, but computer systems based on fuzzy categories are simpler, cheaper, and easier to program than traditional systems (Kosko, 1993; McNeill & Freiberger, 1993).
Parallel Processing
Boeing’s control problem arose from the introduction of a massively parallel processing system to meet the massive computational needs of this huge organization. Most computers solve problems sequentially, one step after another, the way human beings solve problems with pencil and paper. Even though individual steps may require only a fraction of the power of a large computer, the whole system must wait for the results of each step in a sequential procedure. More recently, computer scientists discovered how to program computers to divide large problems into individual parts and process the parts simultaneously, in parallel. Parallel processing dramatically increases efficiency in many common industrial and research applications. One of the advantages of parallel processing to a very large organization such as Boeing is that individual computers can be connected to work together in a network even though they are located in separate offices and laboratories scattered throughout an enormous facility. This reduces the idle evening and weekend time of many systems at the facility. Kipersztok and Patterson set up just such a massively parallel processing system for Boeing.
Queuing
An enormous organization consumes an enormous amount of computing power. At Boeing there are massive amounts of data to analyze regarding the aerodynamics and braking mechanisms for jumbo jets. Logistical problems of supply, storage, and shipping have to be solved. Global market trends have to be anticipated. These examples only suggest the enormous computer needs at Boeing. After consolidating scattered individual systems into one massive parallel processor, Kipersztok and Patterson had to decide how to assign priorities to the jobs that colleagues submitted to the new supersystem. Just as with living animals in nature, conflict arises because there are many jobs to accomplish with limited resources. In industrial situations, this sort of problem is called queuing.
First come, first served is a common queuing strategy, but it has many drawbacks in serious industrial situations. Suppose that the system is already processing many jobs simultaneously, but without fully occupying its resources. The next job submitted is so large that it requires more of the system than the total amount of resources that are currently idle. The large job must wait until there are enough available resources to serve its needs. Suppose that it takes a long time for enough of the smaller jobs in progress to finish and free the needed resources. In the meantime, quite large amounts of computer resources remain idle waiting until the system can accommodate the large job. This is obviously wasteful. Suppose that the system manager lets smaller jobs take advantage of idle resources while the large job continues to wait. This strategy could make the large job wait indefinitely even though the large job may be much more important to Boeing than many of the smaller jobs. To recognize this economic priority, the system manager should suspend enough smaller jobs in process to accommodate an important large job, but then how to decide which jobs to suspend? That is, how to resolve conflicts that arise when many tasks compete for limited resources?
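The waste of strict arrival-order admission is easy to see in a sketch. The function and the numbers below are hypothetical:

```python
def fcfs_admit(queue, free_machines):
    """Admit jobs strictly in arrival order. A large job at the head
    of the line blocks every job behind it, leaving machines idle."""
    started = []
    for machines_needed in queue:
        if machines_needed > free_machines:
            break                     # head-of-line blocking
        free_machines -= machines_needed
        started.append(machines_needed)
    return started, free_machines
```

With 10 free machines and a 12-machine job at the head of the queue, nothing runs and all 10 machines sit idle, even though the small jobs behind it would fit; with the same jobs in a different order, the small jobs run immediately.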
At the outset, Kipersztok and Patterson could specify the most important variables in their queuing problem, they knew that the problem has a mathematically optimum solution, and they knew that between them they had more than enough mathematical skills to solve it. Unfortunately, they could also estimate that it would take them many months, certainly more than a year, probably more than two years, to solve a problem involving so many complex interacting variables. From the point of view of Boeing Aircraft, that would be impractical. First, it would cost at least two years’ salary for two senior computer scientists. Second, the expensive super parallel processor would have to limp along inefficiently while it waited for the solution. Worst of all, things change swiftly both in computer science and in the aircraft industry. While Kipersztok and Patterson labored at the mathematical problem, many vital factors would probably change, making their solution obsolete before they found it.
The problem is very similar in evolutionary biology. Given enough time and generations some species can emerge perfectly adapted to any ecosystem. Adaptation fails, however, if it takes too many generations to produce the optimal solution for a particular ecological problem. Ecosystems are inherently unstable. If it takes too long to emerge, the optimal species will be perfectly suited to an ecosystem that is long gone.
With a fuzzy controller, Kipersztok and Patterson produced a practical solution to the queuing problem in a matter of weeks, without even breathing hard. The secret of fuzzy systems is that they produce good practical solutions without attempting to find optimum solutions. This strategy permits them to take advantage of crude, but effective, devices. In nature, species only have to survive. Crude devices will serve because optimality is unnecessary. This is a significant similarity between fuzzy systems and living animals.
Control Problem
Figure 9.6 is a schematic diagram of the controller that Kipersztok and Patterson (1995) developed. New jobs arrive unranked in the queue at the left of the diagram. The fuzzy controller assigns priorities to the jobs in the incoming queue on the basis of their resource requirements and the current availability of resources in the network. This creates a new queue with jobs ranked for priority. The controller allows the jobs in the ranked queue to enter the network for processing until the resources that remain are insufficient to accommodate the highest ranking job in the queue. At that point the controller ranks the jobs in process in the network and suspends enough lower ranking jobs in process to accommodate the highest ranking jobs that remain in the queue. When all of the jobs in process outrank all of the jobs in the queue, and idle resources are insufficient to accommodate the highest ranking job that remains in the queue, the system is stable and the controller rests.
The controller automatically resubmits suspended jobs to the queue along with new jobs and assigns priorities to the jobs in the newly formed incoming queue. As the network completes jobs in process, more resources become available, and the ranking of jobs in the waiting queue changes along with the ranking of jobs remaining in process in the network. Thus, the fuzzy system is continually reevaluating priorities, submitting waiting jobs, and suspending jobs in process. Like those of a living animal, its needs and resources are constantly in flux.
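One pass of this cycle can be sketched as follows. This is a simplified reconstruction of the logic just described, not Kipersztok and Patterson's code; the function names are hypothetical, and the fuzzy machinery that produces the rankings is collapsed into a given `rank` function.

```python
def controller_pass(waiting, running, total_machines, rank):
    """One scheduling pass: admit ranked jobs from the queue, then
    suspend lower-ranking jobs in process to make room for a
    higher-ranking job still waiting.

    waiting, running -- lists of (job_id, machines_needed)
    rank             -- function from job to priority (higher runs first)
    """
    waiting = sorted(waiting, key=rank, reverse=True)
    running = list(running)
    free = total_machines - sum(m for _, m in running)

    # admit the highest-ranking waiting jobs while they fit
    while waiting and waiting[0][1] <= free:
        job = waiting.pop(0)
        running.append(job)
        free -= job[1]

    # suspend lower-ranking jobs in process for the top waiting job
    if waiting:
        top = waiting[0]
        for victim in sorted(running, key=rank):
            if free >= top[1] or rank(victim) >= rank(top):
                break
            running.remove(victim)
            waiting.append(victim)    # suspended jobs rejoin the queue
            free += victim[1]
        if free >= top[1]:
            waiting.remove(top)
            running.append(top)
    return waiting, running
```

On a 10-machine network running two low-priority 3-machine jobs, a high-priority 8-machine arrival forces both small jobs back into the queue, exactly the suspend-and-resubmit behavior described above.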
Fuzzy Solution
The controller uses the requirements of jobs and the availability of resources to assign priorities. Two major requirements for the Boeing system are the number of machines and the communication load requested by a job. Communication is critical in parallel processing because the separate units that are working on separate parts of the same problem must communicate with each other, and different jobs require different amounts of communication.
In general, the less that a job requires in the way of resources, the higher its priority. The weight of each requirement in assigning priority depends on the current availability of that resource. Kipersztok and Patterson constructed the matrix shown in Table 9.1 to assign weights to the factor of requested machines under different conditions of available machines. The matrix is simple and straightforward. The first row in Table 9.1 applies when the number of available machines is very high. In that situation, the number of machines requested by a new job is less significant and usually has a relatively low weight, but this weight increases with the number of requested machines. The last row applies when the number of available machines is very low. In that situation, even the requirements of a small job may be too high to begin processing. Each cell in the matrix generates an inference rule. The cell in the second row and the third column generates the rule, “If available machines is high and requested machines is medium, then weight is medium.”
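A fragment of such a matrix, translated into code, might look like the following. The level names and every cell except one are hypothetical placeholders; only the rule quoted above, "if available machines is high and requested machines is medium, then weight is medium," comes from the text.

```python
# Rows: fuzzy levels of available machines; columns: requested machines.
# Every cell generates one inference rule. All values except
# WEIGHT["high"]["medium"] are hypothetical stand-ins for Table 9.1.
WEIGHT = {
    "very_high": {"low": "very_low", "medium": "low",       "high": "medium"},
    "high":      {"low": "low",      "medium": "medium",    "high": "high"},
    "very_low":  {"low": "high",     "medium": "very_high", "high": "very_high"},
}

def inference_rule(available, requested):
    """Read one cell of the matrix as an if-then inference rule."""
    weight = WEIGHT[available][requested]
    return (f"If available machines is {available} and requested machines "
            f"is {requested}, then weight is {weight}")
```

Changing the controller's behavior is then a matter of editing entries in a small table, which is why adjustment is so easy.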
The matrix is as easy to construct as it appears in this brief description. Some matrices are better than others and some are worse, of course, so expertise and familiarity with the system help, but the first matrix only needs to be a starting point. It is so easy to change all or part of any matrix that system managers can keep adjusting it indefinitely.
This ease of adjustment makes fuzzy controllers even more suitable as models of evolutionary adaptation. Designers have tested fuzzy systems by deliberately introducing erroneous inference rules or removing several rules entirely. The controllers continue to function although they limp along less efficiently (Kosko, 1993, pp. 339–361). On the positive side, this same property of fuzzy systems allows computer scientists to improve on their first attempts and to modify each part of the system easily and quickly as conditions change. This crude but powerful property of fuzzy systems is a promising model for rapid adaptation to dynamically changing natural ecosystems. A fuzzy system could survive without achieving the optimal solution to a unique, and very likely transient, ecological situation.
Each cell in the matrix of Table 9.1 represents a fuzzy category without sharply defined boundaries. A particular value of available or requested machines can belong partly to one category and partly to an adjacent category. In these common cases of overlap, the fuzzy system assigns a crudely defined intermediate weight to the output. These outputs combine with the outputs of other matrices in higher order matrices, which produce fuzzy outputs of the combinations. Eventually, the system converts the combined fuzzy weights to crisp values that determine the ranking of each job waiting in the queue and each job in process in the network. Again, the fuzzy system converts fuzzy values to crisp values by crude but effective approximation rather than by precise mathematical modeling.
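The overlap between categories and the final conversion to crisp values can be sketched with triangular membership functions and weighted-average defuzzification. The breakpoints and output values below are hypothetical:

```python
def triangular(x, a, b, c):
    """Membership rising from 0 at a to 1 at b, falling back to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Overlapping fuzzy categories for "available machines" (hypothetical).
MEMBERSHIP = {
    "low":    lambda x: triangular(x, -1.0, 0.0, 50.0),
    "medium": lambda x: triangular(x, 20.0, 50.0, 80.0),
    "high":   lambda x: triangular(x, 50.0, 100.0, 151.0),
}
OUTPUT = {"low": 0.2, "medium": 0.5, "high": 0.8}  # crisp value per category

def crisp_weight(x):
    """Weighted-average defuzzification: crude but effective."""
    mu = {name: f(x) for name, f in MEMBERSHIP.items()}
    total = sum(mu.values())
    return sum(mu[name] * OUTPUT[name] for name in mu) / total
```

A reading of 35 available machines belongs partly to "low" and partly to "medium," and the crisp output lands between the two categories' values rather than jumping from one to the other.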
Kipersztok and Patterson could have developed a precise mathematical equation for the output of Table 9.1 by dint of thorough experimentation with the system and elaborate mathematical modeling. The fuzzy matrix describes the main features of the problem in an intuitively reasonable and direct way, saving computer-scientist time, calendar time, and computer resources. The advantage of the fuzzy system multiplies as the system combines more and more complexly interacting factors to assign priorities. Without attempting to produce optimum results, fuzzy systems settle for good practical results. They exchange optimality for practical savings in time and effort.
The system designer must also define the degree of overlap between fuzzy categories, and the different boundaries can overlap in different ways. The details of these and other technical steps in constructing a fuzzy controller are beyond the scope of this book, but they are as simple and straightforward as described here. All of the details of Kipersztok and Patterson’s controller appear in their 1995 article. At this writing, Kosko’s (1993) book on fuzzy systems is a serious, yet accessible, technical introduction and many excellent new books on this subject will certainly appear soon. Kipersztok and Patterson (1995) used available commercial software to translate the crude rules into a working fuzzy controller.
Biological and Industrial Priorities
Assigning priorities on the basis of available resources and projected needs has clear parallels to the way living animals and autonomous robots must resolve conflicts. Kipersztok and Patterson had to incorporate factors into their industrial application that increase the resemblance to living animals under natural conditions. The system had to include, for example, a weight for economic importance to the health of the Boeing Aircraft Company. The most efficient controller would still fail if it failed to relate priority of a job to Boeing’s profits. Kipersztok and Patterson also had to be sure that the system would eventually run every job submitted to it. To do this, they introduced an aging factor that adds priority to a job according to the time it spends waiting in the queue. Living animals can also tolerate moderately critical needs for limited periods without serious consequences. An animal, for example, can survive at moderately higher than normal internal temperatures for a while. Eventually, however, the animal must do something to lower temperature, such as rest or seek shelter from the sun, even if it must suspend other critical activities.
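The aging factor amounts to letting priority grow with time spent in the queue. A minimal sketch, with a hypothetical linear aging rate:

```python
def aged_priority(base_priority, waiting_time, aging_rate=0.01):
    """Priority grows linearly with waiting time, so even a
    low-priority job is eventually served -- analogous to a
    tolerable need becoming urgent the longer an animal defers it."""
    return base_priority + aging_rate * waiting_time
```

A job with base priority 0.5 overtakes a fresh job of priority 1.0 once it has waited more than 50 time units.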
Summary
This section considered the modern view of motivation as jobs to be done with limited resources. The problem is one of balanced control rather than maximum output of energy. Limited resources create conflicts, which must be resolved. Modern industrial systems also focus increasingly on control, which makes them resemble living systems more and more. Fuzzy systems, like living systems, deal with job requirements and resources that are in a constant state of flux. Fuzzy systems are valuable in modern industry because they are simple and inexpensive, which makes them easy to improve and to modify rapidly as conditions change. This is a valuable property for a living system in an evolving natural world. Fuzzy logic of continuous overlapping categories is also a more appropriate way to describe nature, including the behavior of human and nonhuman animals.