16
SUMMING UP: HOW TO GO BEHIND THE LABEL “HUMAN ERROR”

“So what is human error? Can you give us a definition, or, better still, a taxonomy?” Throughout the history of safety, stakeholder groups have asked researchers for definitions and taxonomies of human error that are supposedly grounded in science. The questions are animated by a common concern: Each organization or industry feels that their progress on safety depends on having a firm definition of human error. Each group seems to believe that such a definition will enable creation of a scorecard that will allow them to gauge where organizations or industries stand in terms of being safe.

But each organization’s search for the definition quickly becomes mired in complexity and terms of reference. Candidate definitions appear too specific for particular areas of operations, or too vague if they are broad enough to cover a wider range of activities. Committees are sometimes tasked with resolving the difficulties, but these produce ad hoc mixed-bag collections that only have a pretense of scientific standing. The definitions offered involve arbitrary and subjective methods of assigning events to categories. The resulting counts and extrapolations seem open to endless reassessment and debate. Beginning with the question “what is error?” misleads organizations into a thicket of difficulties where answers seem always just around the corner but never actually come into view.

The label “error” is used inconsistently in everyday conversations about safety and accidents. The term is used in at least three ways, often without stakeholders even knowing it:

• Sense 1 – error as the cause of failure: ‘This event was due to human error.’ The assumption is that error is some basic category or type of human behavior that precedes and generates a failure. It leads to variations on the myth that safety is protecting the system and stakeholders from erratic, unreliable people.

• Sense 2 – error as the failure itself, that is, the consequences that flow from an event: ‘The transplant mix-up was an error.’ In this sense the term “error” simply asserts that the outcome was bad, producing negative consequences (e.g., injuries to a patient).

• Sense 3 – error as a process, or, more precisely, as departure from the “good” process. Here, the sense of error is of deviation from a standard, that is, a model of what constitutes good practice. However, the enduring difficulty is that there are different models of the process that should be followed: for example, what standard is applicable, how standards should be described, and what it means when deviations from the standards do not result in bad outcomes. Depending on the model adopted, very different views of what counts as error result.

While you might think that it would always be clear from the context which of these senses people mean when they talk about error, in practice the senses are often confused with each other. Even worse, people sometimes slip from one sense to another without being aware that they are doing so.

As this book articulates, the search for definitions and taxonomies of error is not the first step on the journey toward safety; it is not even a useful step, only a dead end. Instead, the research on how individuals and groups cope with complexity and conflict in real-world settings has produced a number of insights about definitions of “error.”

The first is that defining error-as-cause (Sense 1) blocks learning by hiding the lawful factors that affect human and system performance. This is the critical observation that gave rise to the idea that errors are heterogeneous events, not directly comparable items that can simply be counted and tabulated. The standard way we say this today is that the label “error” should be the starting point of study and investigation, not the ending point.

Of course, it is tempting to stop the analysis of an adverse event when we encounter a person in the chain of events. Continuing the analysis through individuals requires workable models of the cognition of individuals and of coordinated activity between individuals. It turns out to be quite hard to decide where to halt the causal analysis of a surprising event. Although there are theoretical issues involved in this stopping-rule problem, the decision about when to stop most often reflects our roles as stakeholders and as participants in the system. We stop when we think we have a good enough understanding, and, not surprisingly, that point usually comes when we have identified human error as the source of the failure.

The idea of error-as-cause also fails and misleads because it trivializes expert human performance. Error-as-cause leaves us with human performance divided in two: acts that are errors and acts that are non-errors. But this distinction evaporates in the face of any serious look at human performance (see, for example, Klein, 1998). What we find is that the sources of successful operation of systems under one set of conditions can be what we label errors after failure occurs. Instead of finding error and non-error, when we look deeply into human systems at work we find that the behaviors there closely match the incentives, opportunities, and demands that are present in the workplace. Rather than a distinct class of behavior, we find that the natural laws that influence human systems are always at work, sometimes producing good outcomes and sometimes producing bad ones. Trying to separate error from non-error makes it harder to see these systemic factors.

Second, defining error-as-consequences (Sense 2) is redundant and confusing. Much of the time, the word “error” is used to refer to harm – generally preventable harm. This sort of definition is almost a tautology: it simply involves renaming preventable harm as error. But there are a host of assumptions packed into “preventable,” and these are almost never made explicit. We are not interested in harm itself but, rather, in how harm comes to be. The idea that something is preventable incorporates a complete (albeit fuzzy) model of how accidents happen, what factors contribute to them, and what sorts of countermeasures would be productive. But closer examination of “preventable” events shows that their preventability is largely a matter of wishing that things were other than they were.

To use “error” as a synonym for harm gives the appearance of progress where there is none. It would be better if we simply were clear in our use of language and referred to these cases in terms of the kind of harm or injuries. Confounding the label error with harm simply adds a huge amount of noise to the communication and learning process (for example, the label medication misadministration describes a kind of harm; the label medication error generates noise).

Third, defining error-as-deviation from a model of ‘good’ process (Sense 3) collides with the problem of multiple standards. The critical aspect of error-as-process-deviation is deciding how to determine what constitutes a deviation. Some have proposed normative models, for example Bayes’ Theorem, but these are rarely applicable to complex settings like health care, crisis management, or aviation, and efforts to use this approach to assess human performance are misleading.
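For reference, the normative standard invoked here is Bayes’ Theorem, which prescribes how belief in a hypothesis H should be revised in light of evidence E; under an error-as-deviation reading, a judgment that departs from the prescribed posterior would be scored as an “error”:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

The difficulty noted above is that real operational settings rarely supply the well-defined hypothesis spaces, priors, and likelihoods such a calculation requires, which is why the normative approach so rarely applies.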

Some have argued that strict compliance with standard operating practices and procedures can be used to define deviation. However, it was quickly discovered that standard operating practices capture only a few elements of work and often prescribe practices that cannot actually be sustained in real work worlds. In transportation systems, for example, where striking may be illegal, labor strife has sometimes led workers to adopt a “work-to-rule” strategy. By working exactly to rule, workers can readily make complex systems stop working. Attempts to write complete, exhaustive policies that apply to all cases create or exacerbate double binds, or simply make it easy to attribute adverse events to ‘human error’ and stop there. Expert performance is a lot more than following some set of pre-written guidance (Suchman, 1987).

Choosing among the many candidates for a standard changes what is seen as an error in fundamental ways. Using finer- or coarser-grained standards can give you a very wide range of error rates; in other words, by varying the standard seen as relevant, one can estimate hugely divergent ‘error’ rates. Some of the “standards” used in specific applications have been changed because too many errors were occurring, or to prove that a new program was working. To describe something as a “standard” when it can be changed in this way suggests that there is little that is standard about “standards.” This slipperiness in what counts as a deviation can lead to a complete inversion of standardizing on good process: rather than describing what people need to do to accomplish work successfully, we find ourselves relying on bad outcomes to specify what we want workers not to do. Although often couched in positive language, policies and procedures are often written and revised in just this way after accidents. Unfortunately, hindsight bias plays a major role in such activities.
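A purely hypothetical illustration of how the choice of standard swings the count (the numbers are invented for this sketch, not drawn from any study): score the same 100 performances of a task against different standards, where

$$\text{error rate} = \frac{\text{deviations counted}}{\text{opportunities counted}}$$

If the standard is “any departure from any of the task’s 20 written sub-steps,” each performance offers 20 opportunities, so 150 recorded departures give a rate of 150/2,000 = 7.5%. If the standard is “the performance departed from what peer experts would accept,” the same 100 performances might yield 2 such judgments, a rate of 2%; counting only departures that produced harm might yield none at all. The behavior is identical in every case – only the standard, and therefore the “error rate,” has changed.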

Working towards meaningful standards as a means for assessing performance and defining error as deviations might be a long-term goal, but it is fraught with hazard. To make standards work requires not only clear statements about how to accomplish work but also clear guidance about how conflicts are to be handled. Specifying performance standards for only part of the work to be done creates double binds that undermine expert performance, creating conditions for failure. To use standards as a basis for evaluating performance deviations requires continuous evaluation of performance against the standard, rather than (as is often the case) evaluation only after bad outcomes become apparent. One practical test of this is whether or not deviations from standards are actually detected and treated in the same way independent of the actual outcome.

To limit the damage from the multiple standards problem, anyone who tabulates deviations must carry forward the standard used to define them. This is absolutely essential! Saying some behavior was an error-as-process-deviation has no meaning without also specifying the standard used to define the deviation. There are three things to remember about the multiple standards problem:

• First, the standard chosen is a kind of model of what it means to practice before the outcome is known. A scientific analysis of human performance makes those models explicit and debatable. Without that background, any count is arbitrary.

• Second, a judgment of error is not a piece of data that can then be tabulated with other like data; instead, it is the end result of an analysis. Its interpretation rests on others being able to decompose and critique that analysis. The base data is the story of the particular episode – how multiple factors came together to produce that outcome. Effective systems of inquiry about safety begin with, and continually refer back to, these base stories of failure and of success to stimulate organizational learning.

• Third, being explicit about the standard used is also essential to be able to critique, contrast, and combine results across events, studies, or settings. When these standards are dropped or hidden in the belief that error is an objective thing in the world, communication and learning collapse.

The research described in this book has shown that ‘error’ is an example of an essentially contestable concept. In fact, any benefit from the search for error comes only from the chronic struggle to define how different standards capture, and fail to capture, our current sense of what expertise is and our current model of the factors that make the difference between success and failure.

So, fourth, labeling an act as “error” marks the end of the social and psychological process of causal attribution. Research on how people actually apply the term “error” shows that “error” is a piece of data about reactions to failure; that is, it serves as a placeholder for a set of socially derived beliefs about how things happen. As stakeholders, our judgments after the fact about causality are used to explain surprising events. Thus, in practice, the study of error is nothing more or less than the study of the psychology and sociology of causal attribution. There are many regularities and biases – for example, hindsight and oversimplifications – that determine how people judge causality. The heterogeneity and complexity of real-world work make these regularities and biases especially important: because the field of possible contributors includes so many items, biases may play an especially important role in determining which factors are deemed relevant.

This result is deeply unsettling for stakeholders because it points out that the use of the term “error” is less revealing about the performance of workers than it is about ourselves as evaluators. As researchers, advocates, managers, and regulators, we are at least as vulnerable to failure, susceptible to biases and oversimplifications, and prone to err as those other people at the sharp end. Fallibility has no bounds in a universe of multiple pressures, uncertainty, and finite resources.

Error is not a fixed, objective, stable category or set of categories for modeling the world. Instead, it arises from the interaction between the world and the people who create, run, and benefit (or suffer) from human systems for human purposes – a relationship between hazards in the world and our knowledge, our perceptions, and even our dread of the potential paths toward and forms of failure.

TEN STEPS

So if you feel you have a “human error” problem, don’t think for a minute that you have said anything meaningful about the causes of your troubles, or that a better definition or taxonomy will finally help you get a better grasp of the problem, because you are looking in the wrong place, and starting from the wrong position. You don’t have a problem with erratic, unreliable operators. You have an organizational problem, a technological one. You have to go behind the label human error to begin the process of learning, of improvement, of investing in safety. In the remainder of this concluding chapter, we walk through 10 of the most important steps distilled from the research base about how complex systems fail and how people contribute to safety. The 10 steps forward summarize general patterns about error and expertise, complexity, and learning. These 10 steps constitute a checklist for constructive responses when you see a window of opportunity to improve safety. Here they are:

1. Recognize that human error is an attribution.

2. Pursue second stories to find deeper, multiple contributors to failure.

3. Escape the hindsight bias.

4. Understand work as performed at the sharp end of the system.

5. Search for systemic vulnerabilities.

6. Study how practice creates safety.

7. Search for underlying patterns.

8. Examine how change will produce new vulnerabilities and paths to failure.

9. Use new technology to support and enhance human expertise.

10. Tame complexity.

1. RECOGNIZE THAT HUMAN ERROR IS AN ATTRIBUTION

“Human error” (by any other name: procedural violation, deficient management) is an attribution. It is not an objective fact that can be found by anybody with the right method or right way of looking at an incident. Given that human error is just an attribution, just one way of saying what the cause was, just one way of telling a story about a dreadful event (a first story), it is entirely justified for us to ask why telling the story that way makes sense to the people listening.

Attributing failure to “human error” has many advantages. The story that results, the first story, is short and crisp. It is simple. It is comforting because it shows that we can quickly find causes. It helps people deal with anxiety about the basic safety of the systems on which we depend. It is also a cheap story because not much needs to be fixed: just get rid of or remediate the erratic operators in the system. It is a politically convenient story, because people can show that they are doing something about the problem to reassure the public.

But if that story of a failure is only one attribution of its cause, then other stories are possible too. We are entirely justified to ask who stands to gain from telling a first story and stopping there. Whose voice goes unheard as a result? What stories, different stories, with different attributions about the failure, get squelched?

The first story after celebrated accidents tells us nothing about the factors that influence human performance before the fact. Rather, the first story represents how we, with knowledge of outcome and as stakeholders, react to failures. Reactions to failure are driven by the consequences of failure for victims and other stakeholders. Reactions to failure are driven, too, by the costs of changes that would need to be made to satisfy stakeholders that the threats exposed by the failure are under control. This is a social and political process about how we attribute “cause” for dreadful and surprising breakdowns in systems that we depend on and that we expect to be safe.

So a story of human error is merely a starting point to get at what went wrong, not a conclusion about what went wrong. A first story that attributes the failure to error is an invitation to go deeper, to find those other voices, to find those other stories, and to discover a much deeper complexity of what makes systems risky or safe.

What is the consequence of error being the result of processes of attribution after the fact? The belief that there is an answer to the question – What is error? – is predicated on the notion that we can and should treat error as an objective property of the world and that we can search for errors, tabulate them, count them. This searching and counting is futile.

The relationship between error and safety is mirage-like. It is as if you find yourself in a desert, seeing safety glimmering somewhere in the far distance. To begin the journey, you feel you must gauge the distance to your goal in units of “error.” Yet your presumption about the location of safety is illusory. Efforts to measure the distance to it are little more than measuring your distance from a mirage. The belief that estimates of this number are a necessary or even useful way of beginning an effort to improve safety is false.

The psychology of causal attribution reminds us that it is our own beliefs and misconceptions about failure and error that have combined to make the mirage appear where it does. The research on how human systems cope with complexity tells us that progress towards safety has more to do with the metaphorical earth underneath your feet than it does with that tantalizing image off in the distance. When you look down, you see people struggling to anticipate forms of, and paths toward, failure. You see them actively adapting to create and sustain failure-sensitive strategies, and working to maintain margins in the face of pressures to do more and do it quickly. Looking closely under your feet, you may begin to see:

• how workers and organizations are continually revising their approach to work in an effort to remain sensitive to the possibility of failure;

• how we and the workers are necessarily only partially aware of the current potential for failure;

• how change is creating new paths to failure and new demands on workers, and how revising their understanding of these paths is an important aspect of work on safety;

• how the strategies for coping with these potential paths can be either strong and resilient or weak and brittle;

• how the culture of safety depends on remaining dynamically engaged in new assessments and avoiding stale, narrow, or static representations of risk and hazard;

• how overconfident nearly everyone is that they have already anticipated the types and mechanisms of failure, and how overconfident nearly everyone is that the strategies they have devised are effective and will remain so;

• how missing side effects of change is the most common form of failure for organizations and individuals; and

• how continual effort after success in a world of changing pressures and hazards is fundamental to creating safety.

But for you to see all that, you have to go behind the label human error. You have to relentlessly, tirelessly pursue the deeper, second story.

2. PURSUE SECOND STORIES

First stories, biased by knowledge of outcome, are overly simplified accounts of the apparent “cause” of the undesired outcome. The hindsight bias narrows and distorts our view of practice after the fact. As a result, there is premature closure on the set of contributors that led to failure, and the pressures and dilemmas that drive human performance are masked. A first story always obscures how people and organizations work to recognize and overcome hazards and make safety.

Stripped of all the context, first stories are appealing because they are easy to tell and they locate the important “cause” of failure in practitioners closest to the outcome. First stories appear in the press and usually drive the public, legal, political, and regulatory reactions to failure. Unfortunately, first stories simplify the dilemmas, complexities, and difficulties practitioners face and hide the multiple contributors and deeper patterns. The distorted view they offer leads to proposals for “solutions” that are weak or even counterproductive and blocks the ability of organizations to learn and improve.

First stories cause us to ask questions like: “How big is this safety problem?”, “Why didn’t someone notice it before?” and “Who is responsible for this state of affairs?” The calls to action based on first stories have followed a regular pattern:

• Demands for increasing the general awareness of the issue among the public, media, regulators, and practitioners (“we need a conference…”);

• Calls for others to try harder or be more careful (“those people should be more vigilant about…”);

• Insistence that real progress on safety can be made easily if some local limitation is overcome (“we can do a better job if only…”);

• Calls for more extensive, more detailed, more frequent, and more complete reporting of problems (“we need mandatory incident reporting systems with penalties for failure to report…”); and

• Calls for more technology to guard against erratic people (“we need computer order entry, bar coding, electronic medical records, and so on…”).

First stories are not an explanation of failure. They merely represent, or give expression to, a reaction to failure that attributes the cause of accidents to narrow proximal factors, usually “human error.” They appear to be attractive explanations for failure, but they lead to sterile or even counterproductive responses that limit learning and improvement (e.g., “we need to make it so costly for people that they will have to …”).

When you see this process of telling first stories go on, the constructive response is very simple – in principle. Go beyond the first story to discover what lies behind the term “human error.” Your role could be to help others develop the deeper second story. This is the most basic lesson from past research on how complex systems fail. When you pursue second stories, the system starts to look very different. You can begin to see how the system moves toward, but is usually blocked from, accidents. Through these deeper insights learning occurs and the process of improvement begins. Progress on safety begins with uncovering second stories.

3. ESCAPE FROM HINDSIGHT BIAS

Knowledge of outcome distorts our view of the nature of practice in predictable and systematic ways. With knowledge of outcome we simplify the dilemmas, complexities, and difficulties practitioners face and how they usually cope with these factors to produce success. The distorted view leads people to propose “solutions” that actually can be counterproductive if they degrade the flow of information that supports learning about systemic vulnerabilities and if they create new complexities to plague practice. In contrast, research-based approaches try to use various techniques to escape from hindsight bias. This is a crucial prerequisite for learning to occur.

4. UNDERSTAND THE WORK PERFORMED AT THE SHARP END OF THE SYSTEM

When you start to pursue the second story, the way you look at people working at the sharp end of a system changes dramatically. Instead of seeing them and their work as the instigators of trouble, as the sources of failure, you begin to see the sharp end as the place where many of the pressures and dilemmas of the entire system collect, where difficult situations are resolved on a daily basis. Organizational, economic, human, and technological factors flow toward the sharp end and play out to create outcomes. Sharp-end practitioners who work in this setting face a variety of difficulties, complexities, dilemmas, and tradeoffs, and they are called on to achieve multiple, often conflicting, goals. Safety is created at the sharp end as practitioners interact with hazardous processes inherent in the field of activity, using the available tools and resources.

To really build and understand a second story, you have to look at more than just one incident or accident. You have to go more broadly than a single case to understand how practitioners at the sharp end function – the nature of technical work as experienced by the practitioner in context.

Ultimately, all efforts to improve safety will be translated into new demands, constraints, tools, or resources that appear at the sharp end. Improving safety depends on investing in resources that help practitioners meet the demands and overcome the inherent hazards in that setting. In other words, progress on safety depends on understanding how practitioners cope with the complexities of technical work.

When you shift your focus to technical work in context, you actually begin to wonder how people usually succeed. Ironically, understanding the sources of failure begins with understanding how practitioners create success and safety first; how they coordinate activities in ways that help them cope with the different kinds of complexities they experience. Interestingly, the fundamental insight that once launched the New Look was to see human performance at work as human adaptations directed to cope with complexity (Rasmussen, 1986; Woods and Hollnagel, 2006). Not understanding the messy details of what it means to practice in an operational setting, and being satisfied with only shallow, short forays into the real world of practitioners, carries big risks:

“The potential cost of misunderstanding technical work” is the risk of setting policies whose actual effects are “not only unintended but sometimes so skewed that they exacerbate the problems they seek to resolve. Efforts to reduce ‘error’ misfire when they are predicated on a fundamental misunderstanding of the primary sources of failures in the field of practice [systemic vulnerabilities] and on misconceptions of what practitioners actually do.” (Barley and Orr, 1997, p. 18)

Here are three ways to help you focus your efforts to understand technical work as it affects the potential for failure:

a. Look for sources of success. To understand failure, try to understand success in the face of complexities. Failures occur in situations that usually produce successful outcomes. In most cases, the system produces success despite opportunities to fail. To understand failure requires understanding how practitioners usually achieve success in the face of demands, difficulties, pressures, and dilemmas. Indeed, as Ernst Mach reminded us more than a century ago, success and failure flow from the same sources.

b. Look for difficult problems. To understand failure, look for difficult problems, and try to understand what makes them difficult. Patterson et al. (2010) provide the latest listing of facets of complexity. Aim to identify the factors that made certain situations more difficult to handle and explore the individual and team strategies used to handle these situations. As you begin to understand what made certain kinds of problems difficult, how expert strategies were tailored to these demands, and how other strategies were poor or brittle, you may begin to discover new ways to support and broaden the application of successful strategies.

c. Avoid the Psychologist’s Fallacy. Understand the nature of practice from practitioners’ point of view. It is easy to commit what William James called the Psychologist’s Fallacy in 1890. Updated to today, this fallacy occurs when well-intentioned observers think that their distant view of the workplace captures the actual experience of those who perform technical work in context. Distant views can miss important aspects of the actual work situation and thus can miss critical factors that determine human performance in that field of practice. To avoid the danger of this fallacy, cognitive anthropologists use research techniques based on a practice-centered perspective. Researchers on human problem-solving and decision-making refer to the same concept with labels such as process-tracing and naturalistic decision-making (Klein, Orasanu and Calderwood, 1993).

It is important to distinguish clearly that doing technical work expertly is not the same thing as expert understanding of the basis for technical work. This means that practitioners’ descriptions of how they accomplish their work often are biased and cannot be taken at face value. For example, there can be a significant gap between people’s descriptions (or self-analysis) of how they do something and observations of what they actually do. Successful practice-centered observation demands a combination of the following three factors:

• the view of practitioners in context,

• technical knowledge in that area of practice, and

• knowledge of general results/concepts about the various aspects of human performance that play out in that setting.

Since technical work in context is grounded in the details of the domain itself, it is also insufficient to be expert in human performance in general. Understanding technical work in context requires (1) in-depth appreciation of the pressures and dilemmas practitioners face and the resources and adaptations practitioners bring to bear to accomplish their goals, and also (2) the ability to step back and reflect on the deep structure of factors that influence human performance in that setting. Individual observers rarely possess all of the relevant skills, so progress on understanding technical work in context and the sources of safety inevitably requires interdisciplinary cooperation.

5. SEARCH FOR SYSTEMIC VULNERABILITIES

Practice-centered observation and studies of technical work in context show that safety is not found in a single person, device, or department of an organization. Instead, safety is created, and sometimes broken, in systems, not individuals. The issue is finding systemic vulnerabilities, not flawed individuals. Safety is an emergent property of systems, not of their components.

Examining technical work in context with safety as your purpose, you will notice many hazards, complexities, gaps, tradeoffs, dilemmas, and points where failure is possible. You will also begin to see how the practice of operational people has evolved to cope with these kinds of complexities. After elucidating complexities and coping strategies, one can examine how these adaptations are limited, brittle, and vulnerable to breakdown under differing circumstances. Discovering these vulnerabilities and making them visible to the organization is crucial if we are to anticipate future failures and institute change to head them off. Indeed, detection and recovery are critical to success. You should aim to understand how the system supports (or fails to support) detection of and recovery from failures.

Of course, this process of feedback, learning, and adaptation should go on continuously across all levels of an organization. As changes occur, some vulnerabilities decay while new paths to failure emerge. Tracking the shifting pattern requires getting information about the effects of change on sharp-end practice and about new kinds of incidents that begin to emerge. If the information is rich enough and fresh enough, it is possible to forecast future forms of failure and to share schemes for securing success in the face of changing vulnerabilities. Producing and widely sharing this sort of information may be one of the hallmarks of a culture of safety.

However, establishing a flow of information about systemic vulnerabilities is quite difficult because it is frightening to consider how all of us, as part of the system of interest, can fail. Repeatedly, research notes that blame and punishment will drive this critical information underground. Without a safety culture, systemic vulnerabilities become visible only after catastrophic accidents. In the aftermath of accidents, learning also is limited because the consequences provoke first stories, simplistic attributions, and shortsighted fixes.

Understanding the ‘systems’ part of safety involves understanding how the system itself learns about safety and responds to threats and opportunities. In organizations with a strong safety culture, this activity is prominent, sustained, and highly valued. The learning processes must be tuned to the future to recognize and compensate for negative side effects of change and to monitor the changing landscape of potential paths to failure. It is critical to examine how the organization, at different levels of analysis, supports or fails to support the process of feedback, learning, and adaptation. In other words, find out how well the organization is learning how to learn. Safe organizations deliberately search for and learn about systemic vulnerabilities.

6. STUDY HOW PRACTICE CREATES SAFETY

Typically, reactions to failure assume the system is “safe” (or has been made safe) inherently and that overt failures are only the mark of an unreliable component. But uncertainty about the future is irreducible, change is always active, and resources are always finite. As a result, all systems confront inherent hazards and tradeoffs, and all are vulnerable to failure. Second stories reveal how practice is organized to allow practitioners to create success in the face of threats. Individuals, teams, and organizations are aware of hazards and adapt their practices and tools to guard against or defuse these threats to safety. It is these efforts that “make safety.” This view of the human role in safety has been part of complex systems research since its origins, and it encourages you to study how practitioners cope with hazards and resolve tradeoffs; how they mostly succeed, yet sometimes fail.

Adaptations by individuals, teams, and organizations that have worked well in the past can become limited or stale. This means that feedback about how well adaptations are working or about how the environment is changing is critical. Examining the weaknesses and strengths, costs and benefits of these adaptations points to the areas ripe for improvement. As a result, progress depends on studying how practice creates safety in the face of challenges, what it takes to be an expert in context.

7. SEARCH FOR UNDERLYING PATTERNS

In discussions of some particular episode of failure, or of some “hot button” safety issue, it is easy for commentators to examine only the surface characteristics of the area in question. Progress has come from going beyond the surface descriptions (the phenotypes of failures) to discover underlying patterns of systemic factors (Hollnagel, 1993).

Genotypes are concepts and models about how people, teams, and organizations coordinate information and activities to handle evolving situations and cope with the complexities of that work domain. These underlying patterns are not simply about knowledge of one area in a particular field of practice. Rather, they apply, test, and extend knowledge about how people contribute to safety and failure, and about how complex systems fail, by addressing the factors at work in this particular setting. As a result, when you examine technical work, search for underlying patterns by contrasting sets of cases.

8. EXAMINE HOW ECONOMIC, ORGANIZATIONAL AND TECHNOLOGICAL CHANGE WILL PRODUCE NEW VULNERABILITIES AND PATHS TO FAILURE

As capabilities, tools, organizations, and economic pressures change, vulnerabilities to failure change as well. This means that the state of safety in any system always is dynamic, and that maintaining safety in that system is a matter of maintaining dynamic stability, not static stability. Systems exist in a changing world. The environment, organization, economics, capabilities, technology, management, and regulatory context all change over time. This backdrop of continuous systemic change ensures that hazards and how they are managed are constantly changing. Also, a basic pattern in complex systems is a drift toward failure as planned defenses erode in the face of production pressures and change. As a result, when we examine technical work in context, we need to understand how economic, organizational, and technological change can create new vulnerabilities in spite of or in addition to providing new benefits.

Research reveals that organizations that manage potentially hazardous technical operations remarkably successfully create safety by anticipating and planning for unexpected events and future surprises. These organizations did not take past success as a reason for confidence. Instead, they continued to invest in anticipating the changing potential for failure because of the deeply held understanding that their knowledge base was fragile in the face of the hazards inherent in their work and the changes omnipresent in their environment.

Under resource pressure, however, any safety benefits of change can get quickly sucked into increased productivity, which pushes the system back to the edge of the performance envelope. Most benefits of change, in other words, come in the form of increased productivity and efficiency and not in the form of a more resilient, robust, and therefore safer system (Rasmussen, 1986). Researchers in the field speak of this observation as the Law of Stretched Systems (Woods and Hollnagel, 2006):

We are talking about a law of systems development, which is every system always operates at its capacity. As soon as there is some improvement, some new technology, we stretch it. (Hirschhorn, 1997)

Change under resource and performance pressures tends to increase coupling, that is, the interconnections between parts and activities, in order to achieve greater efficiency and productivity. However, research has found that increasing coupling also increases operational complexity and increases the difficulty of the problems practitioners can face (Woods, 1988). Increasing the coupling between parts in a process changes how problems manifest, creating or increasing complexities such as more effects at a distance, more and faster cascades of effects and tighter goal conflicts. As a result, increased coupling between parts creates new cognitive and collaborative demands which contribute to new forms of failure.

Because all organizations are resource limited to one degree or another, you are probably naturally concerned with how to prioritize issues related to safety. Consider focusing your resources on anticipating how economic, organizational, and technological change could create new vulnerabilities and paths to failure. Armed with any knowledge produced by this focus, you can try to address or eliminate these new vulnerabilities at a time when intervention is less difficult and less expensive (because the system is already in the process of change). In addition, these points of change are at the same time opportunities to learn how the system actually functions.

9. USE NEW TECHNOLOGY TO SUPPORT AND ENHANCE HUMAN EXPERTISE

The notion that it is easy to get “substantial gains” through computerization or other forms of new technology is common in many fields. The implication is that new technology by itself reduces human error and minimizes the risk of system breakdown. Any difficulties that are raised about the computerization process or about the new technology become mere details to be worked out later.

But this idea, that a little more technology will be enough, has not turned out to be the case in practice. Those pesky details turn out to be critical in determining whether the technology creates new forms of failure. New technology can help and can hurt, often at the same time – depending on how the technology is used to support technical work in context. Basically, it is the underlying complexity of operations that contributes to the human performance problems. Improper computerization can simply exacerbate or create new forms of complexity to plague operations. The situation is complicated by the fact that new technology often has benefits at the same time that it creates new vulnerabilities. New technology cannot simply be thrown at a world of practice, as if it were merely one variable that can be controlled independently of all others. People and computers are not separate and independent, but are interwoven into a distributed system that performs cognitive work in context. Changing anything about that intricate relationship immediately changes that joint system’s ability to create success or forestall failure.

The key to skillful, as opposed to clumsy, use of technological possibilities lies in understanding the factors that lead to expert and team performance and the factors that challenge expert and team performance. The irony is that once you understand the factors that contribute to expertise and to breakdown, you will probably understand how to use the powers of the computer to enhance expertise. On the one hand, new technology creates new dilemmas, new demands, new knowledge and memory requirements, and new judgments. On the other hand, once the basis for human expertise and the threats to that expertise have been studied, technology can be an important means to enhance system performance. As a result, when you examine technical work, try once again to understand the sources of and challenges to expertise in context. This is crucial to guide the skillful, as opposed to clumsy, use of technological possibilities.

10. TAME COMPLEXITY THROUGH NEW FORMS OF FEEDBACK

Failures represent breakdowns in adaptations directed at coping with complexity. Success relates to organizations, groups, and individuals who are skillful at recognizing the need to adapt in a changing, variable world and at developing ways to adapt plans to meet these changing conditions despite the risk of negative side effects.

Recovery before negative consequences occur, adapting plans to handle variations and surprise, and recognizing side effects of change are all critical to high resilience in human and organizational performance. Yet all of these processes depend fundamentally on the ability to see the emerging effects of decisions, actions, and policies. That ability depends on the kind of feedback a team or an organization makes available about itself and to itself, especially feedback about the future. In general, increasing complexity can be balanced with improved feedback. Improving feedback is a critical investment area for improving human performance and guarding against paths toward failure.

The constructive response to issues on safety is to study where and how to invest in better feedback. This is a complicated subject since better feedback is:

• integrated to capture relationships and patterns, not simply a large set of available but disconnected data elements (like those gathered in many incident reporting or data monitoring systems);

• event-based to capture change and sequence over multiple time-scales, not simply the current values on each data channel;

• future-oriented to help organizational decision makers assess what could happen next, not simply what has happened; and

• context-sensitive and tuned to the interests and expectations of those looking at the data.

Feedback at all levels of the organization is critical. Remember, a basic pattern in complex systems is a drift toward failure as planned defenses erode in the face of production pressures, and as a result of changes that are not well-assessed for their impact on the cognitive work that goes on at the sharp end. Continuous organizational feedback is needed to support adaptation and learning processes. Paradoxically, feedback must be tuned to the future to detect the emergence of drift toward failure, to explore and compensate for negative side effects of change, and to monitor the changing landscape of potential paths to failure. To achieve this, you should help your organization develop and support mechanisms that create foresight about the constantly changing shape of the risks it faces.

CONCLUSION

Safety is not a commodity to be tabulated; it is a chronic value ‘under your feet’ that infuses all aspects of practice. People create safety under resource and performance pressure at all levels of socio-technical systems. They continually learn and adapt their activities in response to information about failure. Progress on safety does not come from hunting down those who err. Instead, progress on safety comes from finding out what lies behind attributions of error – the complexity of cognitive systems, the messiness of organizational life, and ultimately your own reactions, anxieties, hopes, and desires as a stakeholder and as a participant in human systems serving human purposes. Progress on safety comes from going behind the label human error, where you discover how workers and managers create safety, and where you find opportunities to help them do it even better.
