Background: Models, Data, and Variation


2.1. Background: Models, Data, and Variation

There is no doubt that science and technology have transformed the lives of many and will continue to do so. Like many fields of human endeavor, science proceeds by building pictures, or models, of what we think is happening. These models can provide a framework in which we attempt to influence or control inputs so as to provide better outputs. Unlike the models used in some other areas, the models used in science are usually constructed using data that arise from measurements made in the real world.

At the heart of the scientific approach is the explicit recognition that we may be wrong in our current world view. Saying this differently, we recognize that our models will always be imperfect, but by confronting them with data, we can strive to make them better and more useful. Echoing the words of George Box, one of the pioneers of industrial statistics, we can say, "Essentially, all models are wrong, but some are useful."

2.1.1. Models

The models of interest in this book can be conceptualized as shown in Exhibit 2.1. This picture demands a few words of explanation:

In this book, and generally in Six Sigma, the outcomes of interest to us are denoted by Ys. For example, Y1 in Exhibit 2.1 could represent the event that someone will apply for a new credit card after receiving an offer from a credit card company.
Causes that may influence a Y will be shown as Xs. To continue the example, X1 may denote the age of the person receiving the credit card offer.
Rather than using a lengthy expression such as "we expect the age of the recipient to influence the chance that he or she will apply for a credit card after receiving an offer," we can just write Y = f(X). Here, f is called a function, and Y = f(X) describes how Y changes as X changes. If we think that Y depends on more than one X, we simply write an expression like Y = f(X1, X2). Since the function f describes how the inputs X1 and X2 affect Y, the function f is called a signal function.
Note that we have two different kinds of causes: (X1, X2, X3), shown in the diagram with solid arrows, and (X4, X5, X6), shown with dotted arrows. The causes with dotted arrows are the causes that we do not know about or care about, or causes that we can't control. Often, these are called noise variables. In a sense, noise variables are not part of the model. For example, X4 could be the number of credit cards that the recipient of the offer already has, or the time since the recipient received a similar offer. The function that represents the combined effect of the noise variables on Y is called a noise function, and is sometimes referred to simply as error.
Just because we do not know about the noise variables does not mean that they do not influence Y. If X4, X5, or X6 change, as they typically will, then they will necessarily lead to some apparently inexplicable variation in the outcome Y, even when we do our best to keep X1, X2, and X3 fixed. For example, whether an offer recipient applies for a new credit card may well be influenced by the number of credit cards that the recipient already has.
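The ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients, the additive way the signal and noise functions combine, and the normally distributed noise variables are all hypothetical choices made for the sketch, not something Exhibit 2.1 prescribes. The sketch holds X1, X2, and X3 fixed, lets X4, X5, and X6 vary, and shows that Y still varies.

```python
import random
import statistics


def signal(x1, x2, x3):
    # Hypothetical signal function f: the effect of the causes we model.
    return 2.0 * x1 - 1.0 * x2 + 0.5 * x3


def noise(x4, x5, x6):
    # Hypothetical noise function: the combined effect of causes we do not
    # know about, do not care about, or cannot control.
    return 0.8 * x4 + 0.3 * x5 - 0.4 * x6


def outcome(x1, x2, x3, x4, x5, x6):
    # Assumption for this sketch: signal and noise simply add.
    return signal(x1, x2, x3) + noise(x4, x5, x6)


rng = random.Random(0)   # seeded so the sketch is repeatable
fixed = (1.0, 2.0, 3.0)  # we do our best to hold X1, X2, X3 constant

# X4, X5, X6 change from one observation to the next, outside our control.
ys = [
    outcome(*fixed, rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
    for _ in range(1000)
]

signal_value = signal(*fixed)  # what the model alone would predict
spread = statistics.stdev(ys)  # apparently inexplicable variation in Y

print("Predicted by the signal function:", signal_value)
print("Standard deviation of observed Y:", round(spread, 3))
```

Even though the modeled inputs never change, the observed Y values scatter around the value given by the signal function. That scatter is the variation contributed by the noise variables.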

Exhibit 2.1. Modeling of Causes before Improvement

As you can see in the exhibit, a key aspect of such a model is that it focuses on some specific aspects (i.e., X1, X2, and X3) in order to better understand them. By intention or lack of knowledge, the model necessarily omits some aspects that may actually be important (X4, X5, and X6).

Depending on whether you are being optimistic or pessimistic, Six Sigma can be associated with improvement or problem solving. Very often, an explicit model relating the Ys to Xs may not exist; to effect an improvement or to solve a problem, you need to develop such a model. The process of developing this model first requires arriving at a starting model, and then confronting that model with data to try to improve it. Later in this chapter, in the section "Visual Six Sigma: Strategies, Process, Roadmap, and Guidelines," we discuss a process for improving the model.

If you succeed in improving it, then the new model might be represented as shown in Exhibit 2.2. Now X4 has a solid arrow rather than a dotted arrow. When we gain an understanding of noise variables, we usually gain leverage in explaining the outcome (Y) and, hence, in making the outcome more favorable to us. In other words, we are able to make an improvement.
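To make the idea of gaining leverage concrete, here is a minimal sketch using simulated data. Everything in it is assumed for illustration: the true relationship, the coefficients, and the sample size are hypothetical. It also uses a simple stagewise fit (fit Y on X1, then fit what is left over on X4) rather than a full multiple regression, purely to keep the code short. The point is only that the unexplained variation in Y shrinks once X4 is brought into the model.

```python
import random


def fit_line(xs, ys):
    # Ordinary least squares for a single predictor: returns (slope, intercept).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    # The part of ys that a straight-line fit on xs leaves unexplained.
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(1)  # seeded so the sketch is repeatable
n = 500
x1 = [rng.uniform(0, 10) for _ in range(n)]  # a cause we already model
x4 = [rng.uniform(0, 5) for _ in range(n)]   # initially a noise variable

# Hypothetical truth: Y depends on both X1 and X4, plus residual noise.
y = [2.0 * a + 1.5 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x4)]

before = residuals(x1, y)      # starting model: only X1 is understood
after = residuals(x4, before)  # improved model: X4 now has a solid arrow

var_before = variance(before)
var_after = variance(after)

print("Unexplained variance with X1 only:  ", round(var_before, 3))
print("Unexplained variance with X1 and X4:", round(var_after, 3))
```

In this simulated setting, most of the variation that looked inexplicable under the starting model turns out to be attributable to X4.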

Exhibit 2.2. Modeling of Causes after Improvement

2.1.2. Measurements

The use of data-driven models to encapsulate and predict how important aspects of a business operate is still a new frontier. Moreover, there is a sense in which a scientific approach to business is more challenging than the pursuit of science itself. In science, the prevailing notion is that knowledge is valuable for its own sake. But for any business striving to deliver value to its customers and stakeholders—usually in competition with other businesses doing the same thing—knowledge does not necessarily have an intrinsic value. This is particularly so since the means to generate, store, and use data and knowledge are in themselves value-consuming, including database and infrastructure costs, training costs, cycle time lost to making measurements, and so on.

Therefore, for a business, the only legitimate driving force behind a scientific, data-driven approach that includes modeling is a failure to produce or deliver what is required. This presupposes that the business can assess and monitor what is needed, which is a non-trivial problem for at least two reasons:

A business is often a cacophony of voices, expressing different views as to the purpose of the business and needs of the customer.
A measurement process implies that a value is placed on what is being measured, and it can be very difficult to determine what should be valued.

It follows that developing a useful measurement scheme can be a difficult, but vital, exercise. Moreover, the analysis of the data that arise when measurements are actually made gives us new insights that often suggest the need for making new measurements. We will see some of this thinking in the case studies that follow.

2.1.3. Observational versus Experimental Data

Before continuing, it is important to note that the data we will use come in two types, depending on how the measurements of Xs and Ys are made: observational data and experimental data. Exhibit 2.1 allows us to explain the crucial difference between these two types of data.

Observational data arise when, as we record values of the Ys, the values of the Xs are allowed to change at will. This occurs when a process runs naturally and without interference.
Experimental data arise when we deliberately manipulate the Xs and then record the corresponding Ys.

Observational data tend to be collected with relatively little control over associated Xs. Often we simply assume that the Xs are essentially constant over the observational period, but sometimes the values of a set of Xs are recorded along with the corresponding Y values.

In contrast, the collection of experimental data requires us to force variation in the Xs. This involves designing a plan that tells us exactly how to change the Xs in the best way, leading to the topic of experimental design, or design of experiments (DOE). DOE is a powerful and far-reaching approach that has been used extensively in manufacturing and design environments. Today, DOE is finding increasing application in non-manufacturing environments as well. The book Introduction to Design of Experiments with JMP Examples guides readers in designing and analyzing experiments using JMP.
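As a rough illustration of what a plan for changing the Xs can look like, the sketch below builds one of the simplest designs, a two-level full factorial, in plain Python. The factor names and levels are hypothetical, chosen to echo the credit card example, and the code is not how JMP constructs designs. A full factorial is only one of many possible designs and is rarely the most economical one.

```python
import itertools
import random

# Hypothetical Xs for a credit card offer, each deliberately set at two levels.
factors = {
    "intro_rate": ["low", "high"],
    "annual_fee": ["none", "standard"],
    "channel": ["mail", "email"],
}

names = list(factors)

# Full factorial: every combination of factor levels appears exactly once.
design = [
    dict(zip(names, levels))
    for levels in itertools.product(*factors.values())
]

# Randomize the run order so that drifting noise variables are less likely
# to line up with any one factor. The seed only makes the sketch repeatable.
run_order = design[:]
random.Random(42).shuffle(run_order)

for run_number, settings in enumerate(run_order, start=1):
    print(run_number, settings)
```

With three factors at two levels each, the plan contains 2 x 2 x 2 = 8 runs, and each level of each factor appears in exactly half of them. The Y for each run would then be recorded against these deliberately chosen settings.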

In both the manufacturing and non-manufacturing settings, DOE is starting to find application in the Six Sigma world through discrete choice experiments. In such experiments, users or potential users of a product or service are given the chance to express their preferences, allowing you to take a more informed approach to tailoring and trading off the attributes of a design. Because one attribute can be price, such methods allow you to address an important question: What will users pay for? We note that JMP has extensive, easy-to-use facilities for both the design and analysis of choice models.

Even in situations where DOE is relevant, preliminary analysis of observational data is advised to set the stage for designing the most appropriate and powerful experiment. The case studies in this book deal predominantly with the treatment of observational data, but Chapters 6 and 8 feature aspects of DOE as well.
