12

Experts and Expert Systems

The study of expertise covers remarkably diverse domains, such as sports, chess, music, medicine, and the arts and sciences, and examines the entire range of mastery from beginners to world-class performers. … Very high levels of achievement in virtually all domains are mediated by mechanisms acquired during an extended period of training and development.

K. A. Ericsson
2005

INTRODUCTION

In the previous chapters in this section we have discussed the processes involved in attention, memory, and thought, with a special emphasis on how people are limited in their ability to process information. In Chapter 9, we discussed the limited capacity that people have for attending to multiple sources of information. In Chapter 10, we emphasized that similar capacity limitations influence our ability to retain and perform computations in working memory and retrieve information from long-term memory. In Chapter 11, we showed that a person’s ability to perform abstract reasoning is also limited, and because of these limitations, human reasoning relies heavily on simplifying heuristics and past experience.

Despite their information-processing limitations, people can develop expertise and become highly skilled in specific domains. We say that someone is an expert in a domain when they have achieved elite, peak, or exceptionally high levels of performance (Bourne, Kole, & Healy, 2014). An expert in a domain solves problems much faster, more accurately, and more consistently than does a novice. The question of how experts differ from novices, and therefore how novices can be trained and supported to perform like experts, is a central concern for human engineering. As we will see in this chapter, differences in performance between novices and experts arise to a large extent from the expert’s specialized knowledge acquired from years of practice (Ericsson, 2006a), although other traits and genetic factors also play a role (Ullén, Hambrick, & Mosing, 2016). Experts see problems differently from novices and use different strategies to obtain solutions.

The present chapter focuses on how people acquire specialized knowledge and how this knowledge affects their information processing and performance. We will examine the way that speed and accuracy of performance vary as a function of how a task is learned and practiced. In order to explain and understand the effects of training, it is useful to consider several different perspectives on skill acquisition. To help us understand what expertise really is, we can compare expert and novice performance on a task. These comparisons can reveal why experts are able to think more efficiently and how novices may best be trained.

Large, complex systems often require expertise to operate and to troubleshoot problems. However, experts are usually in high demand and may not be available when the need arises. Consequently, expert systems have been designed to help novices perform those tasks usually performed by experts. These computer-based systems are designed using our understanding of the knowledge and reasoning processes that experts use in problem solving. This understanding, of course, derives from research on the skilled performance of experts. Eberts, Lang, and Gabel (1987) noted, “To design more effective expert systems, we must understand the cognitive abilities and functioning of human experts” (p. 985). In this chapter, we will also describe the characteristics of expert systems and the crucial roles that human factors specialists can play in the development, implementation, and evaluation of such systems.

ACQUISITION OF COGNITIVE SKILL

How skill is acquired has been a topic of interest since the earliest research on human performance (e.g., see Bryan & Harter’s, 1899, study of telegraph skill in Chapter 1). However, much of this research has focused on the development of perceptual-motor skills (see Chapter 14). Today’s technologically specialized jobs require cognitive expertise rather than perceptual-motor expertise, although both are required to some extent in expert performance of essentially any task (Suss & Ward, 2015). Consequently, we are now more interested in how cognitive skills are acquired, and research is now focused on cognitive differences between experts and novices. This research has improved our understanding of how changes in cognitive processing occur as knowledge and skills in a specialized domain are acquired (e.g., Anderson, 1983; Healy & Bourne, 2012).

A person is skilled in a particular domain when her performance is relatively precise and effortless. Cognitive tasks can be as simple as pressing a key when a stimulus event occurs or as complex as air-traffic control, and therefore there may be many or only a few components of a task at which a person can become skilled (Johnson & Proctor, 2017). For these components, it is important that we distinguish between different task requirements and identify the kinds of information necessary to complete each one.

One dichotomy we can make is between convergent and divergent tasks. Tasks are said to be convergent if there is only one acceptable, predetermined response, whereas divergent tasks require novel responses. Another dichotomy is between algorithmic and non-algorithmic tasks. Algorithmic tasks can be performed in a sequence of steps that infallibly lead to a correct response, and so no deeper understanding of the task requirements is necessary. Tasks that are not algorithmic require an understanding of the principles that underlie the problem to be solved. Furthermore, task-performance skills can require deductive reasoning or inductive reasoning, and they can be performed in closed (predictable) or open (unpredictable) environments. Finally, there are highly specialized cognitive skills such as chess playing, or nearly universal skills such as reading. Given these distinctions, it should be clear that we must always qualify general principles of skill acquisition according to the particular task requirements.

POWER LAW OF PRACTICE

It is universally appreciated that practice has beneficial effects on performance. In particular, people get faster and more accurate the longer they practice a task. Across a wide variety of perceptual, cognitive, and motor tasks, speed-up in performance for a group of people is characterized by a power function (Newell & Rosenbloom, 1981). The simple form of the function is

T = BN^(−α),

(12.1)

where:

T represents the time to perform the task,

N represents the amount of practice, or the number of times the task has been performed, and

B and α are positive constants.

This function is the power law of practice. One characteristic of the power law is that the more people practice a task, the less effect a fixed amount of additional practice will have. Early on, when they have little experience with the task, performance time will decrease rapidly, but later on, improvements will not be so great. The rapidity with which improvement decreases is a function of the rate parameter α. For power law curves, the amount of improvement on each trial is a constant proportion of what remains to be learned.
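The constant-proportion property of the power law can be made concrete with a short numerical sketch. The parameter values below (B = 25, α = 0.5) are hypothetical, chosen only for illustration:

```python
# Illustrative sketch of the simple power law of practice (Equation 12.1):
# T = B * N**(-alpha). Parameter values are hypothetical.

def practice_time(n, b=25.0, alpha=0.5):
    """Time to perform the task on the n-th practice trial."""
    return b * n ** -alpha

times = [practice_time(n) for n in range(1, 6)]
improvements = [t1 - t2 for t1, t2 in zip(times, times[1:])]

# Performance time decreases on every trial, but each additional trial
# buys less improvement than the one before it.
assert all(t1 > t2 for t1, t2 in zip(times, times[1:]))
assert all(d1 > d2 for d1, d2 in zip(improvements, improvements[1:]))
```

With these values, the first trial takes 25 time units, the fifth only about 11, and the trial-to-trial improvements shrink from about 7.3 units to about 1.3 units.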

Neves and Anderson (1981) asked people to practice 100 geometry-like proofs. It took people 25 minutes on average to solve the first proof, and they got faster and faster with each proof. People took only 3 minutes to complete the last proof. Figure 12.1a shows the solution time as a function of the number of proofs completed, plotted on a linear scale. Two characteristics of the power law are apparent. First, the benefits of practice continue indefinitely: People were still speeding up even on the last proof. Second, the greatest benefits occur early in practice: Initial speed-ups are quite large, whereas later ones are small. Figure 12.1b shows the same data plotted on a log-log scale. The function should be linear if it follows the power law, because

FIGURE 12.1 The power law of practice, as illustrated by the time to generate geometry-like proofs, on (a) a linear scale and (b) a log-log scale.

ln(T) = ln(BN^(−α)) = ln(B) − α ln(N).

As shown by the straight line fit to the data, the effect of practice on the geometry-like proofs closely approximates a power function.
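The log-log fit described above can be sketched in a few lines of code. The data below are generated from a known power function (with hypothetical parameters), not taken from Neves and Anderson's experiment; an ordinary least-squares line fit to the log-transformed data then recovers the parameters:

```python
# Checking the power law by a straight-line fit in log-log coordinates,
# ln(T) = ln(B) - alpha * ln(N). Data are simulated from a known power
# function with hypothetical parameters, not real experimental data.
import math

B, alpha = 25.0, 0.65
trials = range(1, 101)
times = [B * t ** -alpha for t in trials]

# Ordinary least-squares fit of ln(T) against ln(N).
xs = [math.log(t) for t in trials]
ys = [math.log(t) for t in times]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# For noiseless power-law data the fit recovers the parameters:
assert abs(-slope - alpha) < 1e-6          # slope is -alpha
assert abs(math.exp(intercept) - B) < 1e-6 # intercept is ln(B)
```

With real data the fit would not be exact, but the closer the points lie to a straight line on log-log axes, the better the power law describes the practice effect.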

One problem with Equation 12.1 is that as N grows very large (i.e., practice is very extensive), the time to complete a task should approach zero. This makes no sense, because task performance, even at the most expert level, should take at least some time. For this reason, we might rewrite the power law with a nonzero asymptote. Also, we might want to take into account any previous experience a person has with the task. The generalized form of the power law that incorporates these factors is

T = A + B(N + E)^(−α),

where:

A is the asymptote, representing the fastest possible performance, and

E is the amount of practice that a person brings to the task, that is, the amount of previous experience.

This generalized function still yields a straight line with slope −α when ln(T − A) is plotted against ln(N + E).
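A small sketch shows what the asymptote A and prior-experience term E contribute; all parameter values are hypothetical:

```python
# Generalized power law: T = A + B * (N + E)**(-alpha).
# Parameter values are hypothetical, chosen only for illustration.

def practice_time(n, a=2.0, b=25.0, e=0.0, alpha=0.5):
    """Time on the n-th trial, with asymptote a and prior experience e."""
    return a + b * (n + e) ** -alpha

# With an asymptote, extensive practice approaches A rather than zero:
late = practice_time(10_000)      # 2 + 25/100 = 2.25, close to a = 2.0

# Prior experience shifts the learner along the curve: someone with
# e = 50 prior trials starts out much closer to the asymptote.
novice_first = practice_time(1)             # 27.0
experienced_first = practice_time(1, e=50)  # about 5.5
```

The asymptote captures the fact that even the most expert performance takes some minimum time, and E captures the head start that transfer from earlier experience provides.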

The family of generalized power functions characterizes performance across a wide range of tasks. For example, one study found that the time spent at particular e-commerce websites (e.g., Amazon) decreased with each successive visit following power functions (Johnson, Bellman, & Lohse, 2003). The researchers concluded that people can quickly learn to navigate a site, with the speed-up being faster for a well-designed website than for a poorly designed site. Thus, initial differences in usability may become even more pronounced after a few visits to the sites, leading users to prefer the better-designed website even more than they did originally.

An additional point to note is that the power law also describes increasing productivity in a production process that occurs as a result of a group of operators’ increasing experience with the system (Lane, 1987; Nanda & Adler, 1977). Such manufacturing process functions predict how quickly products will be produced, but they do not necessarily predict how quickly individual operators will perform. More generally, research has shown that individual improvements in performance may be described better by at least two power functions (Donner & Hardy, 2015), reflecting a shift in processes or strategies operating at different phases of skill acquisition, as discussed in the next section.

TAXONOMIES OF SKILL

The power law of practice suggests that improvement occurs continuously across time, but the fact that at least two functions seem to be needed to characterize an individual’s improvement implies that there are qualitative changes in performance as well. By this we mean that people seem to do things in completely different ways depending on their skill level. As expertise is acquired, people transition from one way of performing a task to another. Several taxonomies have been developed to capture these differences in performance. Two complementary and influential taxonomies are Fitts’s phases of skill acquisition and Rasmussen’s levels of behavior.

Phases of Skill Acquisition

Fitts (1964; Fitts & Posner, 1967) distinguished between three phases of skill acquisition, which are, from least skilled to most skilled, cognitive, associative, and autonomous. Performance in the initial cognitive phase is determined by how well instructions and demonstrations are given. Fitts used the term cognitive to refer to the fact that the novice learner is still trying to understand the task, and therefore must attend to cues and events that will not require attention in later phases. During the associative phase, the task components that have been learned in the cognitive phase begin to be related to each other. This is accomplished by combining these components into single procedures, much like the subroutines of a computer program. The final autonomous phase is characterized by the automatization of these procedures, which makes them less subject to cognitive control.

Automatic processes are those that do not require limited-capacity attentional resources for their performance. There are four general characteristics of automatic processes (Schneider & Chein, 2003; Schneider & Fisk, 1983; Shiffrin, 1988). They (1) occur without conscious intention during the performance of the task; (2) can be performed simultaneously with other attention-demanding tasks; (3) require little effort and may occur without awareness; and (4) are not affected much by high workload or stressors.

It is easy to demonstrate that with increasing practice, task performance appears to shift from being effortful and attention-demanding to requiring little effort and attention (Kristofferson, 1972; Schneider & Shiffrin, 1977). Most of these demonstrations use a simple task, like visual or memory search, in which a person is asked to determine whether an item (such as a letter or digit) is present in a visual display or a set of items memorized previously. As the number of items to be searched increases (usually by adding distracting items), response time increases, reflecting greater cognitive demands. However, with increased practice, response times become nearly independent of the number of items to be searched, as long as the items are consistently “mapped” to the same stimulus category, target or distractor, throughout the task. That is, if the digit “8” is ever a target to be searched for, it will never appear as a distractor among the items to be searched when some other target is being sought.

After task procedures have become automatic, it may be very difficult to perform the task in any other way, even after the task requirements change. Shiffrin and Schneider (1977) asked people to practice a memory search task with a consistent mapping for 2100 trials. Then the mapping was reversed so that former distractors became targets, and vice versa. Nine hundred trials of practice with the reversed task were required before people reached the level of accuracy they demonstrated on the original task without any practice at all (see Figure 12.2). Accuracy on the reversed task remained poorer than on the original task until 1500 trials after the reversal. Shiffrin and Schneider argued that the automatic procedures for target identification developed during the initial training apparently continued to “fire” when their stimulus conditions were present, even though the task requirements had changed.

FIGURE 12.2 Learning with initial and reversed consistent mappings.

Skill-Rule-Knowledge Framework

Whereas Fitts’s taxonomy focuses on different phases of skill acquisition, Rasmussen’s (1986) taxonomy, introduced in Chapter 3, focuses on three levels of behavioral control that interact to determine performance in specific situations. These levels (skill-, rule-, and knowledge-based) correspond approximately to Fitts’s phases of skill acquisition, except that Rasmussen acknowledges that even a skilled performer will revert to earlier levels of control in certain circumstances.

Skill-based behavior involves relatively simple relations between stimuli and responses. Task performance is determined by automatic, highly integrated actions that do not require conscious control. The performance of routine activities in familiar circumstances would fall within this category, as would the memory search performance described above. For some skills, such as simple assembly and repetitive drill operations, highly integrated, automatic routines are desirable to maximize performance efficiency (Singleton, 1978). However, many skills require not only fast and accurate performance but also a high degree of flexibility. Flexibility arises from an ability to organize many elemental skilled routines in different ways to accomplish different, sometimes novel, goals.

Rule-based behavior is controlled by rules or procedures learned from previous experiences or through instructions. This level of behavioral control arises when automatic performance is not possible, such as when the performer experiences a deviation from the planned or expected task conditions. Rule-based performance is goal-oriented and typically under conscious control.

Knowledge-based behavior is used in situations for which no known rules are applicable. People may engage in knowledge-based behavior after an attempt to find a rule-based solution to a problem has failed. Knowledge-based behavior relies on a conceptual model of the domain or system of interest. A person must formulate a concrete goal and then develop a useful plan. Knowledge-based behavior involves problem solving, reasoning, and decision making of the type we described in Chapter 11. Consequently, performance depends on the adequacy of the mental model of the performer and can be influenced by the many heuristics that people use to solve problems and make decisions.

According to Reason (1990; 2013), distinct types of failures can be attributed to each different performance level. For skill-based performance, most errors involve misdirected attention. Failures due to inattention often occur when there is intent to deviate from a normal course of action but “automatic” habits intrude. Conversely, errors of over-attention occur when the performer inappropriately diverts his or her attention to some component of an automatized sequence; that is, the performer “thinks too hard” about what he or she is trying to accomplish. At the rule-based level, failures can result from either the misapplication of good rules or the application of bad rules. At the knowledge-based level, errors arise primarily from fallibilities of the strategies and heuristics that problem solvers use to address their limited capacities for reasoning and representing the problem.

In sum, most skills require not only that routine procedures become automatized, but also that enough appropriate knowledge is learned for efficient rule-based and knowledge-based reasoning.

THEORIES OF SKILL ACQUISITION

Theories of skill acquisition and skilled performance are of value for several reasons. First, they help us understand why people do better in some situations than in others. Second, they provide us with a foundation for designing new experiments that may potentially lead to greater understanding of skill and expertise. With this greater understanding, the human factors engineer can help design and implement training programs to optimize skill acquisition. These theories are formalized in models of skill acquisition.

There are two major types of models of skill acquisition: production system models and connectionist models (Ohlsson, 2008). Production system models view skill acquisition as similar to problem solving and describe how production rules change and how people use them differently across different phases of practice. Connectionist models are based on networks of connected units, like the network memory models in Chapter 10. These units will be activated to greater or lesser degrees depending on task demands and the strength of their connection to other units. The result is a pattern of activation levels across the units of the network, which determines performance. Like learning and memory in other contexts, skill acquisition arises from changes in the connections within the network.

A Production System Model

Anderson’s ACT-R (Adaptive Control of Thought–Rational; Anderson et al., 2004) cognitive architecture distinguishes three phases of skill acquisition similar to those proposed by Fitts (1964). The model relies on a procedural memory that contains the productions used to perform tasks, a declarative memory that contains facts in a semantic network, and a working memory that is used to link declarative and procedural knowledge. The first phase of skill acquisition is called the declarative stage, because it relies on declarative knowledge. In this stage, performance depends on general problem-solving productions that use weak heuristic methods of the type we described in Chapter 11. The person learning a task encodes the facts necessary to perform that task in declarative memory. The learner must retain these facts in working memory, perhaps by rehearsal, to be useful for the general productions.

In the second, associative phase, the learner gradually detects and eliminates errors. He begins to develop domain-specific productions that no longer require declarative memory for their operation. The process that leads to the acquisition of domain-specific productions, called knowledge compilation, has two subprocesses: composition and proceduralization. The composition subprocess collapses several productions into a single, new production that produces the same outcome. The proceduralization subprocess removes from the productions those conditions that require declarative knowledge. Composition and proceduralization together can be referred to as production compilation.

The domain-specific productions acquired in the associative phase become increasingly specific and automatic in the third, procedural phase as performance becomes highly skilled. These productions are further tuned through subprocesses of generalization (development of more widely applicable productions), discrimination (narrowing the conditions in which a production is used to only those situations for which the production is successful), and strengthening through repeated application.

Imagine an air-traffic controller who must learn to direct her attention to the bottom left of a display screen to read a list of the planes that can be landed (Hold Level 1; Taatgen & Lee, 2003). Table 12.1 shows the three general rules initially needed to perform this step. These rules are (1) retrieve an instruction, then (2) move attention, and finally (3) move the eyes to the appropriate location on the display. You can see that each of these rules is composed of an “if statement,” which describes the conditions under which the rule should be applied, and a “then statement,” which lists the actions to be taken.

TABLE 12.1

Rules for Learning to Direct Attention to the Bottom Left of the Display Screen to Read a List of the Planes That Can Be Landed

Retrieve instruction

IF

You have to do a certain task

THEN

Send a retrieval request to declarative memory for the next instruction for this task

Move attention

IF

You have to do a task AND an instruction has been retrieved to move attention to a certain place

THEN

Send a retrieval request to declarative memory for the location of this place

Move to location

IF

You have to do a task AND a location has been retrieved from declarative memory

THEN

Issue a motor command to the visual system to move the eyes to that location

As the air-traffic controller becomes more skilled, production compilation combines these general procedures with the specific declarative instructions for the air-traffic controller task, producing the new set of rules in Table 12.2. You can see that the new rules are combinations of pairs of the original rules. At the highest level of skill, these combination rules will be compiled with the remaining rule from the original set (“move to location” with “instruction & attention” or “retrieve instruction” with “attention & location”). This results in the following single, task-specific rule:

TABLE 12.2

Set of Rules Developed from Production Compilation

Instruction and attention

IF

You have to land a plane

THEN

Send a retrieval request to declarative memory for the location of Hold Level 1

Attention and location

IF

You have to do a task AND an instruction has been retrieved to move attention to Hold Level 1

THEN

Issue a motor command to the visual system to move the eyes to the bottom left of the screen

All three

IF

You have to land a plane

THEN

Issue a motor command to the visual system to move the eyes to the bottom left of the screen

The process of production compilation produces a single, compact production rule that can be executed much more quickly than the original rules.
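The collapsing of rules can be sketched in code. The representation below is our own simplification, not ACT-R's actual production syntax: each rule is reduced to a list of "if" conditions and a list of facts ("adds") that firing the rule makes available, and the rule contents paraphrase Tables 12.1 and 12.2.

```python
# A toy illustration of production compilation. This representation is a
# simplified sketch, not ACT-R's actual production syntax; rule contents
# paraphrase Tables 12.1 and 12.2.

def compile_rules(first, second):
    """Collapse two productions that fire in sequence into one (composition),
    dropping the intermediate condition that the first rule's retrieval
    would have satisfied (proceduralization)."""
    conditions = []
    for c in first["if"] + second["if"]:
        if c not in first["adds"] and c not in conditions:
            conditions.append(c)
    return {"if": conditions, "adds": second["adds"]}

retrieve = {"if": ["have to land a plane"],
            "adds": ["instruction: attend to Hold Level 1"]}
attend = {"if": ["have to land a plane", "instruction: attend to Hold Level 1"],
          "adds": ["location: bottom left of screen"]}
move = {"if": ["have to land a plane", "location: bottom left of screen"],
        "adds": ["eyes moved to bottom left of screen"]}

# The "instruction and attention" rule of Table 12.2:
step1 = compile_rules(retrieve, attend)
# The single task-specific ("all three") rule:
final = compile_rules(step1, move)

assert final == {"if": ["have to land a plane"],
                 "adds": ["eyes moved to bottom left of screen"]}
```

The compiled rule goes directly from the goal of landing a plane to the eye movement, with no intermediate declarative retrievals, which is why it executes so much faster.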

A Connectionist Model

One early connectionist model of skill acquisition was proposed by Gluck and Bower (1988). Their model described the performance of students learning to make medical diagnoses based on descriptions of patients’ symptoms. The different symptoms had different probabilities of occurring with each disease, so it was impossible for the students to be 100% accurate. Students made a diagnosis for each patient, and each diagnosis was followed by feedback about the accuracy of the diagnosis.

The different symptoms were represented in the model by activations across a network of input units (see Figure 12.3). These activations are weighted and summed at an output unit. The activation of the output unit reflects the extent to which one disease is favored over the other. This activation is used to classify the disease, and the feedback about the accuracy of the diagnosis is used to modify the weights. Modifying the weights gives the network the ability to detect correlations between symptoms and diseases, and to use these correlations to arrive at a diagnosis.

FIGURE 12.3 Network model.
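The weight-modification scheme in this kind of model can be sketched with a simple delta (least-mean-squares) learning rule. The symptom probabilities, learning rate, and number of training trials below are hypothetical, chosen only to illustrate the principle:

```python
# A minimal delta-rule sketch in the spirit of Gluck and Bower's (1988)
# model: symptoms are input units, a single output unit's weighted sum
# drives the diagnosis, and feedback about accuracy adjusts the weights.
# All numerical values here are hypothetical.
import random

random.seed(1)
N_SYMPTOMS = 4
# Probability that each symptom is present given disease 1 vs. disease 0.
p_given = {1: [0.6, 0.4, 0.3, 0.2],
           0: [0.2, 0.3, 0.4, 0.6]}

weights = [0.0] * N_SYMPTOMS
rate = 0.05

for _ in range(5000):
    disease = 1 if random.random() < 0.5 else 0
    symptoms = [1 if random.random() < p else 0 for p in p_given[disease]]
    output = sum(w * s for w, s in zip(weights, symptoms))
    target = 1 if disease == 1 else -1    # feedback after the diagnosis
    error = target - output
    # Delta rule: adjust each weight in proportion to its input's
    # contribution to the error.
    weights = [w + rate * error * s for w, s in zip(weights, symptoms)]

# Symptoms that correlate with disease 1 acquire positive weights;
# symptoms that correlate with disease 0 acquire negative weights.
assert weights[0] > 0 and weights[3] < 0
```

After training, the sign and magnitude of each weight reflect the correlation between that symptom and the diseases, which is exactly the sensitivity to symptom-disease contingencies that the model uses to diagnose new patients.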

Some models incorporate properties of both production system and connectionist models (Schneider & Chein, 2003). These more complex models are implemented with connectionist components, such as a data matrix of input, internal-operation, and output modules, together with a control system of multiple processors that receive input and transmit output. Schneider and Chein’s model has automatic and controlled processing modes, and can explain many findings on controlled processing, automaticity, and improvement in performance as skill is acquired.

TRANSFER OF LEARNING

A significant issue in human factors is the extent to which the benefits of practice at one task or in one domain can “transfer” to related tasks and domains. By transfer, we mean the extent to which a person will be able to perform a new task because of his or her practice with a related task. Transfer has been studied in both basic and applied research (Cormier & Hagman, 1987; Healy & Wohldmann, 2012).

Views of Transfer

There are two extreme points of view regarding transfer (Cox, 1997). At one end of the continuum is the idea that expertise acquired in any domain should improve task performance in any other domain. This is the doctrine of formal discipline, originated by John Locke (Dewey, 1916). This doctrine attributes expertise in any area to general skills that are required for the performance of a broad range of tasks. From a production system perspective, extended practice at solving problems within a specific area allows the learner to acquire procedures related to reasoning and problem solving. These general procedures can then be used for novel problems in other areas.

At the other end of the continuum is Thorndike’s (1906) theory of identical elements. This theory states that transfer should occur only to the extent that two tasks share common stimulus-response elements. Practice at solving problems within a specific area should benefit problem-solving performance within a different area if the two areas share common elements. Thus, the extent to which transfer will occur will depend on the characteristics of the practiced and novel tasks, and could be very limited or nonexistent.

Results from experiments investigating the extent of transfer between different tasks indicate that neither of these extreme views is correct. The evidence for transfer of general problem-solving skills (the doctrine of formal discipline) has been primarily negative. For example, students in one study received several weeks of training in solving algebraic word problems using a general problem-solving procedure intended to teach heuristics that could be applied to a variety of problems. These students did no better at solving new problems on a subsequent test than students who had not received training, leading the authors to conclude, “The results of this study suggest that formal instruction in a heuristic model suggesting general components of the problem-solving process is not effective in promoting increased problem-solving ability” (Post & Brennan, 1976, p. 64).

The lack of evidence for transfer of general skills may be due to the fact that the training regimens used in these and similar experiments are focused on those generally applicable weak methods (see Chapter 11) that are already highly practiced for most adults (Singley & Anderson, 1989). Other evidence indicates that transfer is not as specific as envisioned by Thorndike. Studies using tasks such as those interpreted in terms of a permission schema (see Chapter 11) show that transfer can occur when the stimulus and response elements are not identical (Cheng, Holyoak, Nisbett, & Oliver, 1986). However, skill acquisition seems to occur more along the lines of Thorndike’s identical-elements view rather than the formal discipline view.

An alternative proposal made by Singley and Anderson (1989) relates the identical-elements view to mental representations. Focusing on the ACT architecture’s distinction between declarative and procedural phases of performance, they proposed that the specific productions developed with practice are the elements of cognitive skill. Transfer will occur to the extent that the productions acquired to perform one task overlap with those for a second task. In other words, the specific stimulus and response elements do not have to be identical for transfer to occur; rather, the acquired productions must be appropriate for the second task.

This point is emphasized by an experiment Singley and Anderson (1989) conducted on learning calculus problems. Students who were unfamiliar with freshman calculus learned to translate word problems into equations and select operations to perform on those equations. These operations included differentiation and integration, among seven possible operations. When problems stated as applications in economics were changed to applications in geometry, there was total transfer of the acquired skill of translating the problem into equations. They also observed transfer in operator selection from problems that required integration to ones that required differentiation, but only for the operations that were shared between the two problem types. In short, transfer occurred only when the productions required for integrating and differentiating economics and geometry problems were similar.
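Singley and Anderson's identical-productions view lends itself to a deliberately crude formalization: predicted transfer to a new task is the proportion of its required productions that training on another task has already supplied. The production names below are hypothetical labels, not actual productions from their model:

```python
# A toy formalization of the identical-productions view of transfer.
# Production names are hypothetical labels for illustration only.

def predicted_transfer(trained, new):
    """Fraction of the new task's productions already acquired."""
    return len(trained & new) / len(new)

economics = {"translate-words-to-equation",
             "select-integrate", "apply-integrate"}
geometry = {"translate-words-to-equation",
            "select-differentiate", "apply-differentiate"}

# Only the shared translation production transfers: 1/3.
overlap = predicted_transfer(economics, geometry)
```

On this account, the total transfer of equation translation between economics and geometry problems, and the partial transfer between integration and differentiation problems, both fall out of how many productions the two tasks share.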

Part-Whole Transfer

The operator of a human–machine system typically has to perform a complex task composed of many subtasks. The issue of part-whole transfer involves the question of whether the performance of the whole task can be learned by learning how to perform the subtasks. Training the subtasks is called part training, while training the whole task is called whole training. From a practical standpoint, there are many reasons why part training might be preferable to whole training. For example, (1) whole-task simulators are typically more complex and expensive than part-task simulators; (2) as a consequence of lack of salience or emphasis, subtasks critical to successful performance of the whole task may receive relatively little practice in the whole-task situation; (3) experienced operators could be trained more efficiently on only the new subtasks required for a new machine or task; and (4) relatively simple training devices could be used to maintain essential skills.

There are three ways that tasks can be broken into subtasks (Wightman & Lintern, 1985). Segmentation can be used for tasks that are composed of successive subtasks. The subtasks can be performed in isolation or in groups and then recombined into the whole task. Fractionation is similar to segmentation, but applies to tasks in which two or more subtasks are performed simultaneously. This procedure involves separate performance for each of the subtasks before combining them. Finally, simplification is a procedure used to make a difficult task easier by simplifying some aspect of the task. It is more applicable to tasks for which there are no clear subtasks.

When the use of the part method seems appropriate, it is important to plan how the components will be reassembled into the whole task once they have been individually mastered. There are three schedules for part-task training: pure-part, repetitive-part, and progressive-part (Wightman & Lintern, 1985). With a pure-part schedule, all parts are practiced in isolation before being combined in the whole task. With a repetitive-part schedule, subtasks are presented in a predetermined sequence and progressively combined with other components as they are mastered. A progressive-part schedule is similar to the repetitive-part schedule, but each part receives practice in isolation before being added to the preceding subtasks. In certain circumstances, the whole task may be presented initially to identify any subtasks that may be especially difficult. These subtasks are then practiced using the part method.

No single method of training is best for all situations. Part-task training is most beneficial for complex tasks composed of a series of subtasks of relatively long duration, but it can be detrimental if subtasks overlap or must be performed, at least in part, at the same time (Wickens, Hutchins, Carolan, & Cumming, 2013). The reason seems to be that part training does not permit development of a time-sharing skill. Wickens et al. concluded that for tasks in which time sharing is crucial, a training schedule in which the whole task is performed but with varying emphases on its component subtasks is most beneficial.

EXPERT PERFORMANCE

The research that we have discussed up to this point has focused on skill acquisition in laboratory tasks. These artificial, oversimplified tasks are easily mastered, but they bear little resemblance to most real-world tasks. To say that someone is an expert after performing a few sessions of a laboratory task stretches the definition of the word “expert.” An expert is someone who has acquired special knowledge of a domain (like an entomologist or physician) or a set of complex perceptual-motor skills (like a concert pianist or an Olympic athlete). Ten years of intensive practice and training is typically required before a person’s abilities reach an expert level in these real-world domains (Ericsson, Krampe, & Tesch-Romer, 1993).

The benefit of laboratory studies is that they give us the ability to see how simple skill acquisition occurs under controlled conditions (see, for example, Proctor & Vu, 2006a). However, because the acquisition of true expertise cannot be studied in the laboratory, studies of expertise focus on how experts think and behave differently from novices. Such studies have enhanced our understanding of cognitive skill and provided a foundation for the development of expert systems.

DISTINCTIONS BETWEEN EXPERTS AND NOVICES

There is no argument that experts are able to do things that novices cannot (Glaser & Chi, 1988). Table 12.3 summarizes some characteristics of experts’ performance. These characteristics reflect the expert’s possession of a readily accessible, organized body of facts and procedures that can be applied to problems in his or her domain. That is, the special abilities of experts arise mainly from the substantial amount of knowledge that they have about a particular domain and not from a more efficient general ability to think. For example, expert taxi drivers can generate many more secondary, lesser-known routes through a city than can novice taxi drivers (Chase, 1983). As another example, although both chemists and physicists are presumed to be of equal scientific sophistication, chemists solve physics problems much like novices do (Voss & Post, 1988).

TABLE 12.3

Characteristics of Experts’ Performance

1. Experts excel mainly in their domains.

2. Experts perceive large meaningful patterns in their domain.

3. Experts are fast; they are faster than novices at performing the skills of their domain, and they quickly solve problems with little error.

4. Experts have superior short-term and long-term memory for material in their domain.

5. Experts see and represent a problem in their domain at a deeper (more principled) level than novices; novices tend to represent a problem at a superficial level.

6. Experts spend a great deal of time analyzing a problem qualitatively.

7. Experts have more accurate self-monitoring skills.

8. Experts are good at selecting the most appropriate strategies to use in a situation.

Chess is a domain that lends itself well to the study of expertise (Gobet & Charness, 2007). Experts are ranked and designated by an explicit scoring system that is monitored by several national and international organizations. Chess is also a task that can be easily brought into the laboratory. Therefore, although we cannot observe the development of expertise over time, we can observe differences between experts and novices under controlled conditions. Some of the most influential research on expertise has compared the performance of chess masters with that of less skilled players (Chase & Simon, 1973; de Groot, 1966).

One famous experiment presented chess masters and novices with chess board configurations to remember (de Groot, 1966). The pieces on the board were placed either randomly or in positions that would arise during play. When the configuration was consistent with actual play, the masters demonstrated that they could remember the positions of more than 20 pieces, even when the board had been shown for only 5 seconds. In contrast, novices could remember the positions of only five pieces. However, when the configuration was random, both masters and novices could remember only five pieces.

Chase and Simon (1973) later examined how the masters “chunked” the pieces on the board together. They found that the chunks were built around strategic relations between the pieces. Chess masters can recognize approximately 50,000 board patterns (Simon & Gilmartin, 1973). We can hypothesize that each of these patterns has associated with it automatized procedures composed of all the moves that could be made in response to that pattern. Because chess masters have already learned the legitimate board configurations that could be presented and their associated procedures, they can access the configurations effortlessly and hold them in working memory, whereas novices cannot.

What we learned from the studies investigating chess mastery is that experts have elaborate mental representations that maintain the information and associations between objects and procedures. Other studies have demonstrated that an expert’s mental representation of their domain can be used as a scaffold to remember other kinds of information. Chase and Ericsson (1981) examined skilled memory in more detail for one man (S.F.), a long-distance runner. He practiced a simple memory task for over 250 hours during the course of 2 years. This task, a digit span task, required that he remember randomly generated lists of digits and recall the digits in order. In the beginning, his digit span was seven digits, about what we would expect given normal limitations of working memory (see Chapter 10). However, by the end of the 2-year period, his span was approximately 80 random digits.

How did S.F. accomplish this more than 10-fold increase in memory performance? Verbal protocols and performance analyses indicated that he did so by using mnemonics. S.F. began, as most people would, by coding each digit phonemically. However, on day 5, he started using a mnemonic of running times, exploiting his mental representation of long-distance running. S.F. first used three-digit codes, switching in later sessions to four-digit running times and decimal times. Much later in practice, he developed additional mnemonics for digit groups that could not be converted into running times.
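The effect of such recoding can be illustrated with a toy sketch. The grouping and “running time” reading below are hypothetical, not S.F.’s actual codes: grouping nine digits into three familiar mile times leaves only three meaningful chunks to hold in memory rather than nine separate digits.

```python
def chunk_as_times(digits, size=3):
    """Group a digit string into fixed-size chunks read as running times."""
    chunks = [digits[i:i + size] for i in range(0, len(digits), size)]
    # "349" is read as the mile time 3:49, "412" as 4:12, and so on
    return ["%s:%s" % (c[0], c[1:]) for c in chunks]

print(chunk_as_times("349412507"))  # ['3:49', '4:12', '5:07']
```

The digits themselves are unchanged; only the encoding differs, which is why the skill did not raise S.F.’s capacity for other materials.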

Based on the performance of S.F. and other people, Chase and Ericsson (1981) developed a model of skilled memory. According to this model, increases in a person’s memory span beyond that which we consider normal reflect not an increase in the capacity of short-term memory, but more efficient use of long-term memory. The model attributes five characteristics to skilled memory: (1) to-be-remembered information is encoded efficiently using existing conceptual knowledge, like S.F.’s knowledge of running times; (2) the stored information is rapidly accessed with retrieval cues; (3) the to-be-remembered information is stored in long-term memory; (4) the speed of encoding can be constantly improved; and (5) the acquired memory skill is specific to the stimulus domain that was practiced: in S.F.’s case, strings of digits.

Ericsson and Polson (1988) used this model as a framework to investigate the memory skills of a headwaiter at a restaurant who could remember complete dinner orders from more than 20 people at several tables simultaneously. Unlike S.F., this headwaiter did not rely on his expertise in some other domain, but only on his expertise as a waiter. However, like S.F., he used a highly organized mnemonic scheme. He organized his orders into groups of four and represented them in a two-dimensional matrix for the dimensions of location and course (e.g., entrée). In addition, he used imagery to relate each person’s face to her or his order and other special encoding schemes for different courses.

The headwaiter’s memory showed all the characteristics predicted by the skilled memory model, with the exception that his skills transferred to other stimulus materials. Ericsson and Polson (1988) attribute the relatively broad generality of the headwaiter’s memory skills to the wide range of situations that he had to remember. Consequently, he had developed not only considerable flexibility in encoding dinners, but also a more general understanding of his own memory structure, of long-term memory properties, and of broadly applicable “metacognitive” strategies.

We discussed the fact that experts have different mental representations for remembering information. Another way that experts seem to differ from novices is in the quality of their mental models. Recall that a mental model allows a person to simulate the outcome of different actions on the basis of a mental representation. Because experts have better mental representations and, therefore, better mental models, their performance is better. In one experiment, Hanisch, Kramer, and Hulin (1991) evaluated the mental models of novice users of a business phone system. They asked the users to rate the similarity between each pair of nine standard features of the phone. They then compared these ratings with those of system trainers, who were highly knowledgeable about the system features. The users’ mental models were quite different from those of the trainers. The trainers’ mental models corresponded closely to documentation about the system features, but the novice users’ mental models contained many deficiencies and inaccuracies. Hanisch et al. proposed that good training programs should highlight and explain clusters of system features in the same way that the trainers clustered them.

Up to this point, we have mostly discussed why experts are more accurate than novices. Experts also differ from novices in terms of how long they take to perform a task. Although experts are faster overall than novices in task performance, they take longer to analyze a problem qualitatively before attempting a solution. Experts may engage in this lengthy, qualitative analysis to construct a mental model incorporating relationships between elements in the problem situation. The extra time they spend on qualitative analysis may also allow them to add constraints to the problem and so reduce the scope of the problem. These analyses, while time-consuming, allow the expert to generate solutions efficiently.

One reason why experts spend more time analyzing a problem is that they are better able to recognize the conceptual structure of the problem and its correspondence to related problems. Experts are also better able to determine when they have made an error, failed to comprehend material, or need to check a solution. They can more accurately judge the difficulty of problems and the time it will take to solve them. This allows experts to allocate time among problems more efficiently. Chi, Feltovich, and Glaser (1981) found that physicists sorted physics problems into categories according to the physical principles on which they were based, whereas novices were more likely to sort the problems in terms of the similarities among the literal objects (balls, cannon, etc.) described in the problems. However, there are both good and poor experts (Dror, 2016): The best experts are unbiased by irrelevant contextual information, as were the physicists in Chi et al.’s study, and reliably reach the same conclusion from the same relevant information.

NATURALISTIC DECISION MAKING

In Chapter 11, we introduced the topic of decision making under uncertainty. Most of the research we described examined choices made by novices in relatively artificial problems with no real consequences. However, most decisions in everyday life are made under complex conditions, often with time pressure, that are familiar and meaningful to the individuals making the decisions. Consequently, reliance on expert knowledge seems to play a much larger role in natural settings than in most laboratory studies of decision making. Beginning in 1989, a naturalistic approach to decision making was developed, which emphasizes how experts make decisions in the field (Gore, Flin, Stanton, & Wong, 2015; Lipshitz, Klein, Orasanu, & Salas, 2001).

Klein (1989) conducted many field studies in which he observed how fireground commanders (leaders of teams of firefighters), platoon leaders, and design engineers made decisions. The following is one example of the thoughts and actions of a decision maker in the field:

The head of a rescue unit arrived at the scene of a car crash. The victim had smashed into a concrete post supporting an overpass, and was trapped unconscious inside his car. In inspecting the car to see if any doors would open (none would), the decision maker noted that all of the roof posts were severed. He wondered what would happen if his crew slid the roof off and lifted the victim out, rather than having to waste time prying open a door. He reported to us that he imagined the rescue. He imagined how the victim would be supported, lifted, and turned. He imagined how the victim’s neck and back would be protected. He said that he ran his imagining through at least twice before ordering the rescue, which was successful. (Klein, 1989, pp. 58–59)

This example illustrates how expert decision makers tend to be concerned with evaluating the situation and considering alternative courses of action. Klein concluded that mental simulation (“[running] his imagining through”) is a major component of the expert’s decisions. These mental simulations allow the expert to quickly evaluate possible consequences of alternative courses of action.

Most explanations of naturalistic decision making emphasize the importance of recognition-primed decisions (Klein, 1989; Lipshitz et al., 2001). A skilled decision maker must first recognize the conditions of a particular situation in making their judgments. The decision maker will recognize many situations as typical cases for which certain actions are appropriate. In such situations, the decision maker knows what course of action to take, because he or she has had to deal with very similar conditions in the past. However, many situations will not be recognized, and for these the decision maker may adopt a strategy of mental simulation to clarify the conditions of the situation and the appropriate actions to take.

These decision strategies rely heavily on expertise. As Meso, Troutt, and Rudnicka (2002) note, “Real life decision making requires expertise in the problem domain in which the problem being solved belongs” (p. 65). For example, police officers who were experts in firearms were shown to use similar processes as those who were novices, but the experts were able to use their experiential knowledge to perform much better (Bolton & Cole, 2016). Their knowledge allowed them to accurately categorize incidents, recognize irregularities, adapt rapidly to a dynamically changing environment, and use their training automatically—freeing up cognitive resources for mental simulation of the immediate situation. According to the recognition-primed decision model, expert decision makers should be trained by improving their recognition and mental simulation skills in a variety of contexts within their domain of expertise (Ross, Lussier, & Klein, 2005).

EXPERT SYSTEMS

Our comparisons between experts and novices demonstrated that many of the differences between them arise from the large amount of domain-specific knowledge that the experts possess. We mentioned earlier that experts are not always available for consultation when a problem arises and that they can also be very expensive. This has led to the development of artificial systems, called expert systems, designed to help nonexperts solve problems (Buchanan, Davis, & Feigenbaum, 2007). Expert systems, also known as knowledge-based systems, have been developed for problems as diverse as lighting energy management in school facilities (Fonseca, Bisen, Midkiff, & Moynihan, 2006), selection of software design patterns (Moynihan, Suki, & Fonseca, 2006), optimization of sites for waste incinerators (Wey, 2005), and financial performance assessment of healthcare systems (Muriana, Piazza, & Vizzini, 2016).

Unlike decision-support systems, which are intended to provide information to assist experts, expert systems are designed to replace the experts (Liebowitz, 1990). More specifically:

An expert system is a program that relies on a body of knowledge to perform a somewhat difficult task usually performed by only a human expert. The principal power of an expert system is derived from the knowledge the system embodies rather than from search algorithms and specific reasoning methods. An expert system successfully deals with problems for which clear algorithmic solutions do not exist. (Parsaye & Chignell, 1987, p. 1)

Most expert systems are not simply data bases filled with facts that an expert knows; they also incorporate information-processing strategies and heuristics intended to mimic the way an expert thinks and reasons. This design feature is called cognitive emulation (Slatter, 1987). That is, the expert system is intended to mimic the thoughts and actions of the decision maker in all respects (Giarratano & Riley, 2004).

A host of human factors issues are involved in designing an effective expert system. An expert system may contain technically accurate information but still be difficult to use and fail to enhance the user’s performance. A good expert system will be technically accurate (and make appropriate recommendations) and adhere to good human engineering principles (Madni, 1988; Preece, 1990; Wheeler, 1989). In this section, we review characteristics of expert systems, with a special emphasis on how the contribution of human factors is important.

CHARACTERISTICS OF EXPERT SYSTEMS

Expert systems have a modular structure (Gallant, 1988). The system modules include a knowledge base that represents the domain-specific knowledge on which decisions are based, an inference engine that controls the system, and an interactive user interface through which the system and the user communicate (Laita et al., 2007).

Knowledge Base

Knowledge can be represented in an expert system in many different ways (Buchanan et al., 2007; Ramsey & Schultz, 1989; Tseng, Law, & Cerva, 1992). Each choice of representation might correspond to alternative ways of representing knowledge in models of human information processing (see Chapters 10 and 11). Three such choices are production rules, semantic networks, and structured objects. Recall from Chapter 10 that production rule representations specify that if some condition is true, then some action is to be performed. A semantic network system is a connected group of node and link elements. Each node represents a fact, and each link a relation between facts. Structured objects represent facts in abstract schemas called frames. A frame is a data structure that contains general information about one kind of stereotyped event and includes nonspecific facts about, and actions performed in, that event. Frames are linked together into collections called frame systems.
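The three representation styles can be sketched with ordinary data structures. This is only an illustration, not code from any actual expert system; the rules, relations, and slot names are invented:

```python
# 1. Production rules: IF a set of conditions holds, THEN perform an action.
rules = [
    {"if": {"temperature": "high", "coolant": "low"}, "then": "check_radiator"},
    {"if": {"fuel_use": "high"}, "then": "check_air_filter"},
]

# 2. Semantic network: nodes are facts; each labeled link relates two facts.
network = {
    ("radiator", "part_of"): "cooling_system",
    ("cooling_system", "part_of"): "engine",
    ("radiator", "regulates"): "temperature",
}

# 3. Frame: a schema for one stereotyped kind of event, with slots that
#    hold defaults until filled in for a particular case.
engine_fault_frame = {
    "type": "engine_fault",
    "slots": {
        "symptom": None,
        "component": None,
        "default_action": "inspect_component",
    },
}

def match(rule, facts):
    """Return True if every condition in the rule holds in the facts."""
    return all(facts.get(k) == v for k, v in rule["if"].items())

facts = {"temperature": "high", "coolant": "low"}
actions = [r["then"] for r in rules if match(r, facts)]
print(actions)  # ['check_radiator']
```

Note how the production rules are procedural (what to do when), whereas the network and frame are declarative (what is true of what), mirroring the distinction drawn in the next paragraph.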

Which kind of representation a designer selects depends on three considerations:

1. Expressive power: Can experts communicate their knowledge effectively to the system?

2. Understandability: Can experts understand what the system knows?

3. Accessibility: Can the system use the information it has been given? (Tseng et al., 1992, p. 185)

None of the representations we have described satisfy these or any other criteria perfectly, so the best representation to use will depend on the purpose of a particular expert system. Production systems are convenient for representing procedural knowledge, since they are in the form of actions to be taken when conditions are satisfied (see examples earlier in this chapter). They also are easy to modify and to understand. Semantic networks are handy for representing declarative knowledge, such as the properties of an object. Frames and scripts are useful representations in situations where consistent, stereotypical patterns of behavior are required to achieve system goals. Sometimes an expert system will use more than one knowledge representation (like the ship design system we describe later on).

Inference Engine

The inference engine module plays the role of thinking and reasoning in the expert system. The inference engine searches the knowledge base and generates and evaluates hypotheses, often using forward or backward chaining (see Chapter 11; Liebowitz, 1990). The type of inference engine used is often closely linked to the type of knowledge representation used. For example, for case-based reasoning systems (see Chapter 11; Prentzas & Hatzilygeroudis, 2016), the knowledge base is previously solved cases; the inference engine matches a new problem against these cases, selecting for consideration the ones that provide the best matches.
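Forward chaining can be illustrated with a toy sketch (the rules here are hypothetical, not drawn from any cited system): working memory starts with the observed facts, and every rule whose premises are all present adds its conclusion, repeating until nothing new can be derived.

```python
# Each rule pairs a set of premises with a single conclusion.
rules = [
    ({"coolant_low", "temperature_high"}, "radiator_fault"),
    ({"radiator_fault"}, "recommend_radiator_service"),
]

def forward_chain(facts, rules):
    """Repeatedly fire rules whose premises are satisfied until quiescence."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain({"coolant_low", "temperature_high"}, rules)
print("recommend_radiator_service" in derived)  # True
```

Backward chaining would run the same rules in the opposite direction, starting from a candidate conclusion and seeking facts that support it.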

Because many decisions are made under situations of uncertainty, the inference engine, together with the fact base, must be able to represent those uncertainties and generate appropriate prescriptions for action when they are present (Hamburger & Booker, 1989). One function of the inference engine is to make computations of utility that account for preferences of outcomes, costs of different actions, and so on, and base final recommendations on that utility. One way to do this is to incorporate a “belief network” into the system. A belief network represents interdependencies among different facts and outcomes, so that each fact is treated not as a single, independent unit but as a group of units that systematically interact.
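The utility computation described above reduces, in its simplest form, to an expected-value calculation over uncertain outcomes. The actions, probabilities, and payoffs below are invented purely for illustration:

```python
# Each action maps to (probability, utility) pairs for its possible outcomes.
actions = {
    "replace_part": [(0.9, 100), (0.1, -50)],
    "do_nothing":   [(0.3, 100), (0.7, -200)],
}

def expected_utility(outcomes):
    """Expected utility = sum of probability-weighted utilities."""
    return sum(p * u for p, u in outcomes)

# Recommend the action with the highest expected utility.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # replace_part
```

A belief network generalizes this by letting the probabilities of outcomes depend systematically on one another rather than being treated as independent.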

User Interface

The user interface must support three modes of interaction between a user and the expert system. These modes are (1) obtaining solutions to problems, (2) adding to the system’s knowledge base, and (3) examining the reasoning process of the system (Liebowitz, 1990). The creation of a useful dialogue structure requires the designer to understand what information is needed by the user, and how and when to display it. The dialogue structure should be such that the information requested by the computer is clear to the user and the user entry tasks are of minimal complexity.

An important component of an expert system’s user interface is an explanation facility (Buchanan et al., 2007). The explanation facility outlines the system’s reasoning processes to the user when he or she requests this information. By examining the reasoning process, the user can evaluate whether the system’s diagnoses and recommendations are appropriate. Often, mistakes made while inputting information to the system can be detected using this part of the interface.

HUMAN FACTORS ISSUES

The development of an expert system usually involves several people (Parsaye & Chignell, 1987). A domain expert provides the knowledge that is collected in the knowledge base. An expert-system developer, or “knowledge engineer,” designs the system and its interface as well as programs for accessing and manipulating the knowledge base. Users are typically involved from the very beginning of the process, especially while designing and evaluating different user interfaces, to ensure that the final product will be usable by the people for whom it is intended.

Several human factors concerns must be addressed during the construction of an expert system (Chignell & Peterson, 1988; Nelson, 1988): (1) selecting the task or problem to be modeled; (2) determining how to represent knowledge; (3) determining how the interface is to be designed; and (4) validating the final product and evaluating user performance.

Selecting the Task

Although it seems as though selecting the task would be the easiest part of developing an expert system, it is not. Many problems that might be submitted to an expert system are very difficult to solve. An expert-system designer, when faced with an intractable problem, must determine how to break that problem up into parts and represent it in the system. Expert consultants and the system designer must agree on how the task in question is best performed. Once the design team agrees on the parts of the task, work can begin on how the system represents those tasks. Clearly, easily structured tasks that rely on a focused area of knowledge are the best candidates for representation in expert systems. Furthermore, to the extent that a task can be represented as a deductive problem, it will be easy to implement as a set of rules to be followed. Inductive problems are more difficult to implement (Wheeler, 1989).

Representation of Knowledge

The representation of knowledge and the accompanying inference engine must reflect the expert’s knowledge structure (Arevalillo-Herráez, Arnau, & Marco-Giménez, 2013). One way to ensure this is to acquire the knowledge and inference rules for the system from an expert (see Box 12.1). Most often, the knowledge is extracted through interviews, questionnaires, and verbal protocols collected while the expert performs the tasks to be modeled. As with any naturalistic study, certain factors will determine how useful the data collected with these instruments will be. These factors include (1) whether the knowledge engineer and subject matter expert share a common frame of reference that allows them to communicate effectively, (2) whether the instruments used to elicit information from the expert are compatible with the expert’s mental model of the problem, and (3) whether biases and exaggerations in the expert’s responses are detected and compensated for (Madni, 1988). These factors are important because much of an expert’s skill is highly automatized, so verbal reports may not produce the most important information for system design.

User-friendly expert system shells are available for developing specific expert systems. The shells include domain-independent reasoning mechanisms and various knowledge representation modes for building knowledge bases. One widely used expert system shell is CLIPS (C Language Integrated Production System; Giarratano & Riley, 2004; Hung, Lin, & Chang, 2015). CLIPS was developed by NASA in 1985, and version 6.30 is available as public domain software (www.clipsrules.net/). This expert system shell allows knowledge to be represented as production rules, which specify the actions to be performed when certain conditions are met. It also supports the representation of knowledge as an object-oriented hierarchical data base, which allows systems to be represented as interconnected, modular components.

BOX 12.1 KNOWLEDGE ELICITATION

If an expert system is to incorporate expert knowledge, we must first extract that knowledge from experts in the domain. Without an accurate representation of the information and the strategies used by the experts to solve specific problems, the expert system will not be able to carry out its tasks appropriately. There are two fundamental questions that define the core problem of knowledge elicitation (Shadbolt & Burton, 1995): “How do we get experts to tell us, or else show us, what they know that enables them to be experts at what they do?” and “How do we determine what constitutes the expert’s problem-solving competence?” These questions are difficult to answer because (1) much of an expert’s knowledge is tacit; that is, it is not verbalizable; (2) an expert often solves a problem quickly and accurately with little apparent intermediate reasoning; (3) the expert may react defensively to attempts at eliciting her or his knowledge; and (4) the knowledge elicitation process itself may induce biases in the extracted information (Chervinskaya & Wasserman, 2000).

A variety of techniques can be used to elicit knowledge from experts effectively. These include, for example, verbal protocol analysis, in which experts describe the hypotheses they are considering, the strategies they are using, and so on, as they perform tasks in their domains, and concept sorting, in which experts sort cards with various concepts into related piles. To illustrate, these two methods were used to elicit from information security experts their knowledge about risks in use of applications on mobile devices, from which three main dimensions were identified: personal information privacy, monetary risk, and device stability/instability (Jorgensen et al., 2015). Because different experts within a domain may have dissimilar knowledge representations, it is often a good strategy to elicit knowledge from more than one expert, as was done in that study.

Knowledge elicitation is also important for purposes other than building an expert system (Hoffman, 2008). For example, Peterson, Stine, and Darken (2005) describe knowledge elicitation and representation for military ground navigators, with the goal of using this information in design of training applications. Knowledge elicitation is also a necessary step in determining what content to include in an e-commerce website and how to manage that content (Proctor, Vu, Najjar, Vaughan, & Salvendy, 2003). Knowledge elicited from experts can reveal what information needs to be available to the user, how that information should be structured to allow easy access, and the strategies that different users may employ to search and retrieve information. When we elicit knowledge for usability concerns, as in the case of website design, we must obtain that knowledge not only from experts but also from a broad range of users who will be accessing and employing the information. Obtaining information from users is often conducted in the context of trying to understand their knowledge, capabilities, needs, and preferences, rather than of obtaining specifications for design.

Some knowledge elicitation techniques, such as verbal protocol analysis, are aimed at eliciting knowledge from experts, but others (such as questionnaires and focus groups) are targeted more toward novices and end users. Also, some methods are based on observations of behavior, whereas others are based on self-reports. Table B12.1 summarizes many of these methods along with their strengths and weaknesses. As a rule, we recommend that several methods be used to increase the quantity and quality of the relevant information that is elicited.

One benefit of expert system shells is that they allow an opportunity for the domain expert to be directly involved in the development of the expert system, rather than just serving as a source of knowledge. This may allow more information about the expert’s knowledge to be incorporated into the system, because the input is more direct. Naruo, Lehto, and Salvendy (1990) describe a case study in which this method was used to design an expert system for diagnosing malfunctions of a machine for mounting chips on an integrated circuit board. A detailed knowledge elicitation process was used to organize the machine designer’s knowledge. This process took several weeks, but the implementation as a rule-based expert system using the shell then took only about a week. On-site evaluation showed that the expert system successfully diagnosed 92% of the malfunctions of the chip-mounting machine.

An alternative to basing the knowledge of an expert system on an expert’s verbal protocols is to have the system develop the knowledge from the actions taken by the expert in a range of different situations. The connectionist approach to modeling is particularly well suited to this purpose, because connectionist (neural network) systems acquire knowledge through experience (Gallant, 1988). The system is presented with coded input and output, corresponding to the environmental stimuli and the action taken by the expert, for a series of specific problems. A learning algorithm adjusts the weights of the connections between nodes until the system’s output closely matches the behavior being simulated. Unlike the previous approaches, the connectionist approach does not rely on formal rules or an inference engine, but only on how often the expert performs certain tasks and how frequently different environmental situations occur.
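The weight-adjustment process can be sketched with a single-layer network trained by the delta rule. This is an illustrative example only, not Gallant's system: the coded readings, fault classes, learning rate, and training cases are all hypothetical.

```python
# A minimal sketch: a single-layer network that learns fault diagnoses
# from an expert's recorded decisions using the delta rule. The inputs
# (coded instrument readings) and fault labels are invented.
N_INPUTS, N_FAULTS = 4, 4          # e.g., 4 readings, 4 fault classes
weights = [[0.0] * N_INPUTS for _ in range(N_FAULTS)]

def predict(readings):
    """Return the activation of each fault node for the given readings."""
    return [sum(w * x for w, x in zip(row, readings)) for row in weights]

def train(cases, lr=0.1, epochs=50):
    """Adjust connection weights toward the expert's observed diagnoses."""
    for _ in range(epochs):
        for readings, expert_fault in cases:
            out = predict(readings)
            for f in range(N_FAULTS):
                target = 1.0 if f == expert_fault else 0.0
                for i in range(N_INPUTS):
                    weights[f][i] += lr * (target - out[f]) * readings[i]

# Hypothetical cases: (coded readings, fault index chosen by the expert)
cases = [([1, 0, 1, 0], 0), ([0, 1, 0, 1], 1),
         ([1, 1, 0, 0], 2), ([0, 0, 1, 1], 3)]
train(cases)

# Diagnose a new configuration: take the fault with highest activation.
diagnosis = max(range(N_FAULTS), key=lambda f: predict([1, 0, 1, 0])[f])
```

After training on the expert's recorded cases, the network needs no explicit rules; the diagnosis is simply the fault node most strongly activated by the current pattern of readings.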

Hunt (1989) describes the results of several experiments that illustrate the potential for the connectionist approach. In one, students were instructed to imagine that they were learning how to troubleshoot an internal combustion power plant. Readings of instruments, such as coolant temperature and fuel consumption, were displayed as in Figure 12.4. From the configuration of readings, one of four malfunctions (radiator, air filter, generator, gasket) had to be diagnosed. In a series of such problems, the students’ performance began at chance level (25% correct) and increased to 75% by the end of the experiment.

FIGURE 12.4A typical display shown in the diagnostic task.

Hunt (1989) developed a connectionist model for each student based on their responses. The models accurately approximated the performance of the individual students, with the mean classification accuracy being 72%. In contrast, a rule-based system for which the knowledge was acquired by interviews averaged only 55% correct. These results suggest that more objective methods of knowledge elicitation, coupled with a system that learns from the expert’s actions, may allow the development of better expert systems.

Interface Design

We have already discussed the importance of good interface design. If the interface is poorly designed, any benefit provided by the expert system may be lost. Errors may be made, and, ultimately, the users may stop using the system. The human factors specialist can provide guidelines for how the expert system and the user should interact. These guidelines will determine the way that information is presented to the user to optimize efficiency. Of particular concern, as we discussed earlier, is the presentation of the system’s reasoning in such a way that the user will be able to understand it.

Two modes of interaction are commonly used for expert system interfaces (Hanne & Hoepelman, 1990). Natural language interfaces present information in the user’s natural language and are the most common. Such interfaces are usable by almost everyone, but they can lead the user to overestimate what the system “understands.” Because the system seems to talk to the user, the user may tend to anthropomorphize the system.

An alternative to the natural language interface is a graphical presentation of the working environment. Graphical user interfaces are effective for communicating such things as system change over time and paths to solution. In many situations, a combination of language and graphical dialogue will be most effective. A good design strategy is to have users evaluate prototypes of the intended interface from early in the development process, so that interface decisions are not made only after the rest of the expert system is developed.

Validating the System

Even a perfectly designed expert system must be validated. There may be errors in the knowledge base or faulty rules in the inference engine that can lead to incorrect recommendations by the system. Incorrect recommendations can be hazardous, because many users tend to accept the system’s advice without question. Dijkstra (1999) had people read three criminal law cases together with the defense attorney’s arguments, which were always correct. After reading the materials for each case, the people consulted an expert system that always gave incorrect advice; the basis for the advice could be examined using three explanation functions. Seventy-nine percent of the decisions were in agreement with the incorrect advice provided by the expert system rather than the correct advice provided by the attorney, and slightly more than half of the people agreed with the expert system on all three cases. In contrast, only 28% of the decisions made by people who judged the cases without advice from either the attorney or the expert system were incorrect. Several measures indicated that the people who always agreed with the expert system did not put in the effort to study its advice but simply trusted it.

A system can be tested by simulating its performance with historical data and having experts assess its recommendations. Because the system ultimately will be used in a work environment by an operator, it is also important to test the performance of the operators. It can be difficult to modify an imperfect expert system once it is installed in the field, so tests of system performance and knowledge validity need to be performed prior to installation. These tests may be accomplished by establishing simulated conditions in a laboratory environment and evaluating operator performance with and without the expert system.

Unfortunately, expert-system designers often neglect the important step of evaluating the operator’s performance, and this can have negative consequences. Nelson and Blackman (1987) evaluated two variations of a prototype expert system developed for operators of nuclear reactor facilities. Both systems used response trees to help operators monitor critical safety functions and to identify a particular problem-solving route when a safety function became endangered. One system required the operator to provide input about failed components when they were discovered and to request a recommendation when one was desired. The other system automatically registered any failures that occurred, checked to determine whether a new problem-solving recommendation was necessary, and displayed this new recommendation without prompting.

Neither system significantly improved performance over that of operators who had no expert system at all. Even the automated system, which was much more usable than the operator-controlled system, did not improve performance. For the operator-controlled system, it was easy to enter incorrect information, which resulted in erroneous and confusing recommendations. Therefore, we cannot assume that an expert system will always improve operator performance, even when that system is easy to use.

An important part of the validation process is to assess how acceptable a system is to its users. The introduction of new technology in the workplace always has the potential to generate suspicion, resentment, and resistance among users, for many reasons. The expert-system designer can minimize acceptance problems by involving users in all phases of the development process. The designer must also develop training programs to ensure that operators understand how the expert system is to be integrated into their daily tasks, establish maintenance procedures that will ensure the reliability of the system, and, finally, evaluate possible extensions of the system into areas for which it was not initially designed.

EXAMPLE SYSTEMS

As noted earlier, expert systems have been used successfully in a variety of domains, including the diagnosis of device and system malfunctions (Buchanan et al., 2007). One of the earliest expert systems, MYCIN, was used for the diagnosis and treatment of infectious diseases. Digital Equipment Corporation used the XCON system successfully to configure a computer hardware/software system specific to the customer’s needs. Telephone companies have used the ACE system to identify faults in phone lines and cables that may need preventative maintenance. Although expert systems have their limitations, their uses and sophistication can be expected to continue to expand in the future. We describe in detail below two expert systems, one for diagnosing faults in the shapes of steel plates and the other for ship design.

DESPLATE

A system called DESPLATE (Diagnostic Expert System for Steel Plates) was developed to diagnose faults in shapes of rolled steel plates (Ng, Cung, & Chicharo, 1990). Slabs of reheated steel are rolled into plates of specified thickness and shape. The final products are to be rectangular, but perfect rectangular shapes rarely occur. Figure 12.5 illustrates five examples of faulty shapes. Some plates may be sufficiently deviant that they must be cut into smaller dimensions, which is a costly and time-consuming process. Therefore, DESPLATE is designed to locate the cause of particular faulty shapes and recommend adjustments to correct the problem.

FIGURE 12.5Examples of faulty shapes.

DESPLATE uses a mixture of forward and backward chaining to reach a conclusion. The user is prompted for a set of facts observed prior to or during the session in which faulty plates were produced. From this set of data, DESPLATE forward chains until the solution space is sufficiently narrow. If a cause can be assumed, backward chaining is then used to prove this cause; otherwise, forward chaining continues.
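The mixed chaining strategy can be sketched as follows. The rules, facts, and fault names here are invented for illustration; they are not DESPLATE's actual knowledge base, and real systems also guard against rule cycles.

```python
# Hypothetical rules: (set of premises, conclusion). Fault names echo
# the chapter's examples but the rule content is made up.
RULES = [
    ({"rolls_misaligned"}, "camber"),
    ({"uneven_heating"}, "edge_wave"),
    ({"uneven_heating", "edge_wave"}, "off_square"),
]

def forward_chain(facts):
    """Derive every conclusion reachable from the observed facts."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

def backward_chain(goal, facts):
    """Try to prove a candidate cause from the facts (depth-first)."""
    if goal in facts:
        return True
    return any(all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)

observed = {"uneven_heating"}
derived = forward_chain(observed)         # adds edge_wave, then off_square
proved = backward_chain("off_square", observed)
```

Forward chaining expands the observed facts until the solution space narrows; once "off_square" emerges as a candidate cause, backward chaining confirms that it follows from the original observations.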

DESPLATE searches a knowledge base that is arranged hierarchically. The entire knowledge base is organized according to the time required to test for a fault and the frequency of that fault. Observations or tests that are easily performed have priority over those that are more difficult, and faults that rarely occur are only tested when everything else has failed. Information is presented in order of these priorities. There are three kinds of information in the knowledge base: (1) the observations, or symptoms, that are used to identify different types of faults; (2) the tests used to diagnose faults; and (3) the faults themselves, which are hierarchically arranged according to their nature.
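The priority ordering described above can be sketched as a simple sort: cheap tests for common faults come first, and rare, expensive tests come last. The test times and fault frequencies below are invented for illustration, not DESPLATE's data.

```python
# Hypothetical entries: (fault, test time in minutes, faults per month).
tests = [
    ("roll_bearing_wear", 60, 1),
    ("camber", 2, 30),
    ("taper", 5, 12),
]

# Easily performed tests have priority (ascending time); among equally
# easy tests, more frequent faults are checked first (descending freq).
ordered = sorted(tests, key=lambda t: (t[1], -t[2]))

for fault, minutes, per_month in ordered:
    print(f"test for {fault}: {minutes} min, ~{per_month}/month")
```

Presenting tests in this order means the rare, hour-long bearing check is attempted only after the quick checks for common shape faults have failed.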

DESPLATE was installed in 1987 at the plate mill of the BHP Steel International Group, Slab and Plate Products Division, Port Kembla, Australia. It produced satisfactory solutions and recommendations for three faulty shapes: camber, off-square, and taper.

ALDES

ALDES (Accommodation Layout Design Expert System) was developed to provide expert assistance in the ship design process (Helvacioglu & Insel, 2005). Task modules were developed to provide expertise about three tasks involved in ship design: (1) generating a general arrangement plan; (2) determining the minimum number of crew members required; and (3) generating layouts of decks for the accommodations area. To develop ALDES, a visual programming interface shell was paired with a CLIPS expert system shell as the inference engine. The interface shell provides functions for accepting input from the user and displaying results; maintains a database of objects during the design process; depicts the layout visually; and performs some fundamental calculations (e.g., container capacity in the hull).

Knowledge in ALDES was acquired from interviews with ship designers, investigation of national and international regulations, examination of social rules and accommodation in ships, and databases from ships of the same general type. The ship was represented as a hierarchical database of objects, and the procedural knowledge acquired from experts was represented as production rules. Reasoning in ALDES proceeds through refinement and adaptation of an initial prototype of the ship. A prototype is selected and then decomposed into its main components. Each main component is decomposed into subcomponents, and so on, with the decomposition continuing until a stage is reached at which the design description can be generated using deductive logic.
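The decomposition step can be sketched as a recursive walk over a prototype hierarchy. The component names below are hypothetical, not ALDES's actual ship representation.

```python
# A toy prototype hierarchy: each component maps to its subcomponents.
# Components with no entry are leaves, i.e., the stage at which a design
# description can be generated directly.
PROTOTYPE = {
    "ship": ["hull", "accommodation"],
    "hull": ["cargo_hold"],
    "accommodation": ["deck_1", "deck_2"],
}

def decompose(component):
    """Return the component followed, recursively, by all subcomponents."""
    parts = [component]
    for sub in PROTOTYPE.get(component, []):
        parts.extend(decompose(sub))
    return parts

plan = decompose("ship")
```

Each call refines one component into its parts, mirroring the top-down refinement of the selected prototype until only primitive components remain.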

SUMMARY

Skill acquisition takes place in an orderly way across a range of cognitive tasks. Early in training, task performance depends on generic, weak problem-solving methods. With practice, a person acquires domain-specific knowledge and skills that can be brought to bear on the task at hand. It is this knowledge that defines an expert. The domain-specific knowledge possessed by experts, and how that knowledge is organized, allows them to perceive, remember, and solve problems better in that domain than nonexperts can. Expert behavior can be characterized as skill-based. When the expert encounters an unfamiliar problem to which a novel solution is required, he or she engages a range of general problem-solving strategies and mental models.

Expert systems are knowledge-based computer programs designed to emulate an expert. An expert system has three basic components: a knowledge base, an inference engine, and a user interface. The potential benefits of expert systems are limited by human performance issues. Human factors specialists can assist in design decisions by providing input about tasks that can be successfully modeled, the most appropriate methods for extracting knowledge from domain experts, the best way to represent this knowledge in the knowledge base, the design of an effective dialogue structure for the user interface, evaluations of the performance of the expert system, and the integration of the system into the work environment.

RECOMMENDED READINGS

Berry, D., & Hart, A. (Eds.) (1990). Expert Systems: Human Issues. Cambridge, MA: MIT Press.

Bourne, L. E., Jr., & Healy, A. F. (2012). Train Your Mind for Peak Performance: A Science-Based Approach for Achieving Your Goals. Washington, DC: American Psychological Association.

Chi, M. T. H., Glaser, R., & Farr, M. J. (Eds.) (1988). The Nature of Expertise. Hillsdale, NJ: Erlbaum.

Ericsson, K. A. (Ed.) (1996). The Road to Excellence: The Acquisition of Expert Performance in the Arts and Sciences, Sports, and Games. Hillsdale, NJ: Erlbaum.

Ericsson, K. A., Charness, N., Feltovich, P., & Hoffman, R. R. (Eds.) (2007). Cambridge Handbook of Expertise and Expert Performance. Cambridge, UK: Cambridge University Press.

Giarratano, J. C., & Riley, G. D. (2004). Expert Systems: Principles and Programming (4th ed.). Boston, MA: Course Technology.

Gluck, M. A., & Bower, G. H. (1988). Evaluating an adaptive network model of human learning. Journal of Memory and Language, 27, 166–195.

Healy, A. F., & Bourne, L. E., Jr. (Eds.) (2012). Training Cognition: Optimizing Efficiency, Durability, and Generalizability. New York: Psychology Press.

Johnson, A., & Proctor, R. W. (2017). Skill Acquisition and Training: Achieving Expertise in Simple and Complex Tasks. New York: Routledge.

Reason, J. (2013). A Life in Error: From Little Slips to Big Disasters. Burlington, VT: Ashgate.

Sternberg, R. J., & Grigorenko, E. L. (Eds.) (2003). The Psychology of Abilities, Competencies, and Expertise. Cambridge: Cambridge University Press.
