2.3. Variation and Statistics

In the previous section, we mentioned the following aspects of managing variation:

  • Identifying and quantifying sources of variation.

  • Controlling sources of variation.

  • Reducing sources of variation.

  • Anticipating sources of variation.

The first point, "Identifying and quantifying sources of variation," is a vital step and typically precedes the others. In fact, Six Sigma efforts aside, many businesses can derive useful new insights and better knowledge of their processes and products simply by understanding what their data represent and by interacting with their data to literally see what has not been seen before. Identification of sources of variation is a necessary step before starting any modeling associated with other Six Sigma steps. Even in those rare situations where there is already a high level of understanding about the data and the model, it would be very unwise to begin modeling without first investigating the data. Every set of data is unique, and in the real world, change is ubiquitous, including changes in the patterns of variation.

Given that the study of variation plays a central role in Six Sigma, it would be useful if there were already a body of knowledge that we could apply to help us make progress. Luckily, there is: statistics! One of the more enlightened definitions of statistics is learning in the face of uncertainty; since variation is a form of uncertainty, the relevance of statistics becomes immediately clear.

Yet, statistics tends to be underutilized in understanding uncertainty. We believe that one of the reasons is that the fundamental difference between an exploratory study and a confirmatory study is not sufficiently emphasized or understood. This difference can be loosely expressed as the difference between statistics as detective and statistics as judge. Part of the difficulty with fully appreciating the relevance of statistics as detective is that the process of discovery it addresses cannot fully be captured within an algorithmic or theoretical framework. Rather, producing new and valuable insights from data relies on heuristics, rules of thumb, serendipity, and contextual knowledge. In contrast, statistics as judge relies on deductions that follow from a structured body of knowledge, formulas, statistical tests, and p-value guideposts.

The lack of appreciation of statistics as detective is part of our motivation in writing this book. Much traditional Six Sigma training overemphasizes statistics as judge. This gives an unbalanced view of what Six Sigma should be and places unrealistic, overly time-consuming demands on practitioners and organizations.

Six Sigma is one of many applications where learning in the face of uncertainty is required. In any situation where statistics is applied, the analyst will follow a process, more or less formal, to reach findings, recommendations, and actions based on the data. There are two phases in this process:

  1. Exploratory Data Analysis.

  2. Confirmatory Data Analysis.

Exploratory Data Analysis (EDA) is nothing other than a fancy name for statistics as detective, whereas Confirmatory Data Analysis (CDA) is simply statistics as judge. In technical jargon, the emphasis in EDA is on hypothesis generation: the analyst searches for clues in the data that help identify theories about underlying behavior. In contrast, the focus of CDA is hypothesis testing and inference: CDA consists of testing, and ideally confirming, the theories that EDA suggests. CDA follows EDA, and together they make up statistical modeling. A recent paper by Jeroen de Mast and Albert Trip provides a detailed discussion of the crucial role of EDA in Six Sigma.
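To make the two phases concrete, here is a minimal sketch in Python (the data and names are hypothetical and simulated, not drawn from the paper or this book): summary statistics play detective and suggest a hypothesis, and a two-sample t-test then acts as judge:

```python
# A minimal sketch (hypothetical data and names) of the two phases:
# EDA generates a hypothesis; CDA then formally tests it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical process yields from two production lines.
line_a = rng.normal(94.0, 2.0, size=40)
line_b = rng.normal(92.5, 2.0, size=40)

# --- EDA (statistics as detective): explore, summarize, spot a pattern. ---
print(f"Line A: mean={line_a.mean():.2f}, sd={line_a.std(ddof=1):.2f}")
print(f"Line B: mean={line_b.mean():.2f}, sd={line_b.std(ddof=1):.2f}")
# The summaries suggest a hypothesis: Line A yields more than Line B.

# --- CDA (statistics as judge): test the hypothesis EDA generated. ---
t_stat, p_value = stats.ttest_ind(line_a, line_b)
print(f"Two-sample t-test: t = {t_stat:.2f}, p = {p_value:.4f}")
```

Note the order: the test in the second half is only meaningful because the hypothesis it judges was generated first, during exploration.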
