10.1. USES OF PROBABILITY

This chapter on probability will help you problem-solve when the data aren't quite as obvious. The probability analysis included is not only the basis for many of the Six Sigma tools, but knowledge of probability can also be used independently on many problems.

We frequently estimate the likelihood of an event, its probability: for example, the chance of rain, of winning the lottery, or of being in a plane crash. Some probabilities are easy to calculate and intuitive, like the chance of getting a head on a coin flip (one in two, or 0.5). Others are neither easy to calculate nor intuitive, like the probability of an earthquake.

In Six Sigma we also need to estimate the probability of an event. In this way we can make some judgment as to whether something just happened due to random coincidence or if there is an assignable cause that we should address. Luckily the work you will be doing does not involve earthquakes!

We will start with problems where we know the mathematical probability of a single random event, verifying any answer with both the abbreviated binomial table (Figure 10-1) and Excel's BINOMDIST (binomial distribution) statistics option. Excel's BINOMDIST is the primary tool you will use to solve these problems, but if you use the Abbreviated Binomial Table first, you will get a more fundamental understanding of probability. This understanding will minimize the likelihood of an error when using any statistics software package.

NOTE

Probability (BINOMDIST)

n = the number of independent trials, like the number of coin tosses, the number of parts measured, etc.

Probability p (or probability s) = the probability of a "success" on each individual trial, like the likelihood of a head on one coin flip or a defect on one part. This is always a proportion and generally shown as a decimal, like 0.0156.

Number s (or x successes) = the total number of "successes" that you are looking for, like getting exactly three heads.

Probability P = the probability of getting a given number of successes from all the trials, like the probability of three heads in five coin tosses or 14 defects in a shipment of parts. This is often the answer to the problem.

Cumulative = the sum of the probabilities of getting "the number of successes or fewer," like getting three or fewer heads on five flips of a coin. This option is used on "less-than" and "more-than" problems.

These definitions and their uses will become apparent as you solve the following problems.

Problem #1

What are the chances of getting three heads in three flips of a coin?

Using the above definitions, recognize that n (number of flips) = three. The probability of getting one head on any one flip is p = 0.5. The number of "successes" is three (three heads). P is the probability of getting exactly three heads on three coin flips, which is the desired answer to the problem.

We can make a table of all possible equally likely outcomes of the three coin flips:

Outcome   Flip #1   Flip #2   Flip #3
1.        head      head      head
2.        head      head      tail
3.        head      tail      head
4.        head      tail      tail
5.        tail      head      head
6.        tail      head      tail
7.        tail      tail      head
8.        tail      tail      tail

As you can see, only one of the eight equally likely outcomes of three coin flips is three heads. So, P = 1/8, or 0.125.

Logically (or mathematically) you can get the same answer. The chance of getting heads on the first flip is 50%, or p = 0.5. On each successive flip, the chances of a head are also 0.5. So, the chance of getting two heads on two coin flips is 0.5 × 0.5 = 0.25. Similarly, since each trial is independent (not affected by earlier tosses), we can multiply the probabilities on three flips: 0.5 × 0.5 × 0.5 = 0.125.
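Both lines of reasoning above, counting the equally likely outcomes and multiplying the per-flip probabilities, can be checked with a few lines of code. The following is a minimal sketch in Python (the book itself uses only the table and Excel); the variable names are illustrative.

```python
from itertools import product

# Enumerate every equally likely outcome of three coin flips.
outcomes = list(product(["head", "tail"], repeat=3))

# Keep only the outcomes in which all three flips are heads.
three_heads = [o for o in outcomes if o.count("head") == 3]

p_enumerated = len(three_heads) / len(outcomes)  # 1 outcome out of 8
p_multiplied = 0.5 ** 3                          # 0.5 x 0.5 x 0.5

print(len(outcomes), p_enumerated, p_multiplied)  # 8 0.125 0.125
```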

Verify this answer by using the following abbreviated binomial table (Figure 10-1). Values within the table are the probability of getting exactly x successes on n trials. Note that more complete tables are available in most statistics books.

Figure 10-1. Abbreviated binomial table

Here's how to use the Abbreviated Binomial Table (Figure 10-1) on problem #1.

Find n = 3 in the leftmost column, which is the number of trials.

Find the # of successes (3 heads).

In the far column p = 0.500 (chance of a head on each flip), the value 0.1250 is P.

Now do it in Excel. After bringing up the Excel worksheet, click on the function (fx) button in the toolbar. Under "category," click on "statistical." Then, under "function," click on "BINOMDIST."

In the first box, enter the "number of successes" (number of heads) you want in these trials, which is "3." The second box asks for the "number of trials," which is "3." The third box asks for the probability of a "success" (head) on each trial, which is "0.5."

The fourth box asks if the problem requires the cumulative probability. If you answer "true," you get the sum of the probabilities up to and including three heads (the probability of zero heads + the probability of one head + the probability of two heads + the probability of three heads), which is the probability of getting three or fewer heads. In this problem you do not want the cumulative probability, so answer "false." You then get our desired probability of exactly three heads, which is P = 0.125.

So, here is a summary of what we just did:

Excel BINOMDIST
successes = 3
trials = 3
probability = 0.5
cumulative: false
The result is P = 0.125.
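For readers without Excel, the same calculation can be sketched in Python. The function name `binomdist` and its argument order are assumptions chosen here to mirror the four boxes described above; this is not Excel's implementation, only the standard binomial formula.

```python
from math import comb

def binomdist(successes, trials, probability, cumulative):
    """Hypothetical stand-in for Excel's BINOMDIST, with the same four inputs."""
    def exactly(k):
        # Number of ways to place k successes among the trials, times the
        # probability of any one such arrangement.
        return comb(trials, k) * probability**k * (1 - probability)**(trials - k)

    if cumulative:
        # "successes or fewer": add up 0, 1, ..., successes.
        return sum(exactly(k) for k in range(successes + 1))
    return exactly(successes)

print(binomdist(3, 3, 0.5, False))  # 0.125
```

With `cumulative` set to `True` and the same inputs, the function returns 1.0, since three or fewer heads covers every possible outcome of three flips.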

Problem #2

What is the probability of getting two or fewer heads in three flips of a coin?

We will show five ways to get the answer to this problem.

You could look at the table of possible outcomes from problem #1 and add up the number of outcomes with zero heads (one outcome), one head (three outcomes), and two heads (three outcomes), for a total of seven outcomes (out of eight possible outcomes). This is a probability of 7/8, or 0.875.

The same answer can be found using the Abbreviated Binomial Table (Figure 10-1) by adding the probabilities of zero successes (0.125), one success (0.375), and two successes (0.375), for a total = 0.875.

Equally, using the BINOMDIST in Excel, you do it three times, adding the results of zero, one, and two successes (with cumulative: false), and you get 0.125 + 0.375 + 0.375 = 0.875.

Or you can recognize that the only outcome that has more than two heads is three heads, which has a probability of 0.125 (from problem #1). Since the sum of the probabilities of all possible outcomes always equals 1 (see the abbreviated binomial table, Figure 10-1), we can subtract the probability of three heads (0.125) from 1 to get the answer = 0.875.

The following is the most direct way to the answer. Using Excel BINOMDIST, we can realize that the "cumulative true" gives the probability of getting two or fewer heads, which is what we want. So, here's how we can get the answer directly:

Excel BINOMDIST
successes = 2
trials = 3
probability = 0.5
cumulative: true
The result is P = 0.875.
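The agreement between adding individual probabilities, subtracting from 1, and using the cumulative option can be confirmed numerically. This is a small sketch in Python; the helper name `pmf` is an assumption introduced here for the probability of exactly k successes.

```python
from math import comb

def pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 3, 0.5
by_adding = pmf(0, n, p) + pmf(1, n, p) + pmf(2, n, p)    # 0.125 + 0.375 + 0.375
by_complement = 1 - pmf(3, n, p)                           # 1 - 0.125
by_cumulative = sum(pmf(k, n, p) for k in range(2 + 1))    # "cumulative: true"

print(by_adding, by_complement, by_cumulative)  # 0.875 0.875 0.875
```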

Use the Sum of Probabilities = 1

Since the sum of the probabilities of all possible outcomes always equals 1, we can often use this knowledge to simplify a problem.

For example, if we want to know the probability of getting one or more heads on 10 coin tosses, we can find the probability of getting zero heads, then subtract this probability from 1. This is much easier than adding the probabilities of 1 head + 2 heads + 3 heads + 4 heads + 5 heads + 6 heads + 7 heads + 8 heads + 9 heads + 10 heads.
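As an illustration of the shortcut just described, the sketch below computes the probability of one or more heads on 10 tosses both ways. It is written in Python with an assumed helper `pmf`; it is meant only to show that the two routes agree.

```python
from math import comb

def pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5

# Shortcut: 1 minus the probability of zero heads.
shortcut = 1 - pmf(0, n, p)

# Long way: add the probabilities of 1, 2, ..., 10 heads.
long_way = sum(pmf(k, n, p) for k in range(1, n + 1))

print(round(shortcut, 4), round(long_way, 4))  # 0.999 0.999
```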


Using Excel's BINOMDIST Cumulative Function

Do not use the cumulative function for the probability of a single outcome/success, like three heads out of five coin tosses. Enter "false" in the box for cumulative function.

Use the cumulative function for "less-than," "equal-to-or-less-than," "equal-to-or-greater-than," or "greater-than" a given outcome or success, as follows.

For "less-than" a given outcome, like fewer than three heads out of eight coin tosses, use the cumulative function "true" with the success at one less than the given value (3 - 1 = 2).

success = 2

trials = 8

p = 0.5

cumulative: true

The result is P = 0.1445.

For "equal-to-or-less-than" a given outcome, like three heads or fewer out of eight coin tosses, use the cumulative function "true" with the success at the given value (3).

success = 3

trials = 8

p = 0.5

cumulative: true

The result is P = 0.3633.

For "greater-than" a given outcome, like more than three heads out of eight coin tosses, use the cumulative function "true" with the success at the given value (3), then subtract the result from 1.

success = 3

trials = 8

p = 0.5

cumulative: true

The result is 0.3633. P would then equal 1.0000 – 0.3633 = 0.6367.

For "equal-to-or-greater-than" a given outcome, like three or more heads out of eight coin tosses, use the cumulative function "true" with the success at one less than the given outcome (3 – 1 = 2), then subtract the result from 1.

success = 2

trials = 8

p = 0.5

cumulative: true

The result is 0.1445. P would then equal 1.0000 – 0.1445 = 0.8555.

You can satisfy yourself that the above cumulative function examples on eight coin tosses make sense by seeing that the probability of "less than three heads" plus the probability of "three or more heads" equals 1. The same is true for "three heads or fewer" plus "more than three heads."
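The four cumulative cases for eight coin tosses can be laid side by side in a short script. This is a sketch in Python; the helper name `cumulative` is an assumption standing in for BINOMDIST with cumulative set to "true."

```python
from math import comb

def cumulative(s, n, p):
    """P(s or fewer successes in n trials), like 'cumulative: true'."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s + 1))

n, p, given = 8, 0.5, 3

less_than = cumulative(given - 1, n, p)      # fewer than three heads
at_most = cumulative(given, n, p)            # three heads or fewer
more_than = 1 - cumulative(given, n, p)      # more than three heads
at_least = 1 - cumulative(given - 1, n, p)   # three or more heads

print(f"{less_than:.4f} {at_most:.4f} {more_than:.4f} {at_least:.4f}")
# 0.1445 0.3633 0.6367 0.8555
```

The complementary pairs (`less_than` with `at_least`, and `at_most` with `more_than`) each add up to 1.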


Problem #3

What is the chance of getting eight or fewer tails on 10 flips of a coin?

Using the Abbreviated Binomial Table (Figure 10-1), we can add the probabilities of getting zero, one, two, three, four, five, six, seven, and eight tails. However, it is easier to add the probabilities of nine or 10 tails, then subtract from 1. The probability of nine tails is 0.0098 and the probability of 10 tails is 0.0010. Adding these and then subtracting the sum from 1 gives 1.0000 – 0.0108 = 0.9892. So, the P of getting eight or fewer tails on 10 coin flips is 0.9892, or 98.92%.

Or, here's the most direct way:

Excel BINOMDIST
successes = 8
trials = 10
probability = 0.5
cumulative: true
The result is P = 0.9893 (the slight difference from above is due to rounding error).

Case Study: Excessive Accident Rate

A sales force had an accident rate that was excessive and the sales manager was under a lot of pressure to reduce it. One of the salespeople thought that the accidents were higher near the end of the year, so he looked at 100 random accidents from each of several years, calculating the average accident rate for each month. He found that December had the highest average of any month, at 15 accidents. The other months all had lower rates, more or less similar.

The sales manager determined that getting 15 or more accidents in December due to random causes alone was very unlikely. On analyzing further, he also noted that January was not high, so he doubted that weather was the cause.

The next year the sales manager dictated giving gifts to the customers during the holiday season, rather than partying with them, and the accident rate went down to the same as for the other months.


Let's verify the sales manager's finding that December's 15 accidents out of 100 for a year was "very unlikely." First, some interpretation as to what was "very unlikely." He wasn't surprised that the December accident average was exactly 15; he was surprised that it was that high. 16 or 17 would also have surprised him! So, that is why he checked against the likelihood of 15 or more accidents happening in December due to random cause. This approach is more conservative and made him less likely to erroneously find that something happened due to other than random cause. The likelihood of getting an exact number is generally small and will bias results to show that the result was not random.

Excel BINOMDIST

successes = 14

trials = 100

probability = 0.08333 (1/12, which is what would be expected for a random month)

cumulative: true

The result is 0.9814. P would then equal 1.0000 – 0.9814 = 0.0186 or 1.86%.

So the sales manager was correct in saying that 15 or more accidents in December was very unlikely, since it would be expected to happen randomly only 1.86% of the time. He would therefore be 98.14% (100% – 1.86%) confident that the December accident rate was not random.
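The "15 or more" check can also be written as a small reusable function. The sketch below is in Python; the function name `at_least` is an assumption introduced here, and the inputs are the ones used in the text (100 accidents, p = 1/12). The text reports a result of about 0.0186.

```python
from math import comb

def at_least(successes, trials, p):
    """P(successes or more) = 1 - P(successes - 1 or fewer)."""
    below = sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
                for k in range(successes))
    return 1 - below

# Chance of 15 or more of the 100 accidents falling in one random month.
p_random = at_least(15, 100, 1 / 12)
print(f"P(15 or more) = {p_random:.4f}, confidence = {1 - p_random:.2%}")
```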

Note that it was not sufficient just to find that the December accident rate was higher than the 100/12 = 8.33 he would have expected for a random month. It had to be determined with some confidence that the December results were high enough that they were probably not random.

The sales manager did not try to determine the "root cause" of December's higher accident rate. It could have been because of excessive drinking, more miles driving visiting customers, driving more at night, etc. He only saw the correlation between the month and the accident rate. His chance of success on implementing a solution would have been higher if he could have identified the root cause.

Independent Trials Are Not Affected by Earlier Results

The probability on an independent trial is not affected by results on earlier trials. For example, someone could flip 10 heads in a row, but the probability of a head on the next coin flip is still p = 0.5—assuming that both the coin and the person tossing it are honest, etc.


Problem #4

What are the chances of getting at least eight tails on 10 coin flips?

We could use either the abbreviated binomial table (Figure 10-1) or Excel BINOMDIST to find the P of eight, nine, and 10 tails. We would then add these to get the total probability P = 0.0439 + 0.0098 + 0.0010 = 0.0547.

Or, using the Excel BINOMDIST and realizing that the chance of at least eight tails (or eight or more tails) is the same as 1 minus the chance of seven or fewer tails, we solve as follows:

Excel BINOMDIST
successes = 7
trials = 10
p = 0.5
cumulative: true
The result is 0.9453. P would then equal 1.0000 - 0.9453 = 0.0547, or 5.47%.

Problem #5

A vendor is making 25% defective product. In a box of 10 random parts from this vendor, what is the probability of finding two or fewer defects?

If we use the abbreviated binomial table (Figure 10-1), with n = 10, successes = 2, 1, and 0, and p = 0.25, we get 0.282, 0.188, 0.056. We add these for P = 0.526, or 52.6%.

Or, using Excel BINOMDIST:

success = 2
trials = 10
p = 0.25
cumulative: true
The result is P = 0.526, or 52.6%.

Problem #6

A salesperson has been losing 25% of potential sales. In a study of 10 random sales contacts from this salesperson, what is the probability of finding three or more successful sales?

We must be careful that what we call a "success" (successful sales in this case) is consistent with the rest of the problem statement (which is currently stated as % lost sales). We can restate the question as "getting 75% of potential sales" so the success is measured in the same terms as the rest of the problem statement. Then we have the following problem:

A salesperson has been successful in getting 75% of potential sales. In a study of 10 random sales contacts from this salesperson, what is the probability of finding three or more successful sales?

Again, to save work, we know that "three or more successful sales" is the same as 1 minus the probability of "two or fewer successful sales."

Excel BINOMDIST

success = 2

trials = 10

p = 0.75

cumulative: true

The result is 0.0004. P would then equal 1.0000 - 0.0004 = 0.9996, or 99.96%.

Or, we can restate the problem in terms of "lost sales":

A salesperson has been losing 25% of potential sales. In a study of 10 random sales contacts from this salesperson, what is the probability of finding seven or fewer lost sales?

Note that in the above problem restatement we had to recognize that three or more successful sales is the same as seven or fewer lost sales. If this is not obvious, consider that when you have three successes out of 10 potential sales, seven are lost; with four successes, six losses; with five successes, five losses; with six successes, four losses; and so on.

Excel BINOMDIST

success = 7

trials = 10

p = 0.25

cumulative: true

The result is P = 0.9996, or 99.96%, which is consistent with the answer above.

The above problem shows that if you think your problem through carefully and are careful that the definition you've chosen for "success" is consistent with the other values used in the solution, you will get a correct and consistent answer.
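The consistency of the two formulations can be checked directly. The following Python sketch uses an assumed helper `cumulative` (equivalent to "cumulative: true") and the values from the problem above.

```python
from math import comb

def cumulative(s, n, p):
    """P(s or fewer successes in n trials)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s + 1))

# "Success" = a sale won (p = 0.75): three or more won.
via_successes = 1 - cumulative(2, 10, 0.75)

# "Success" = a sale lost (p = 0.25): seven or fewer lost.
via_losses = cumulative(7, 10, 0.25)

print(f"{via_successes:.4f} {via_losses:.4f}")  # 0.9996 0.9996
```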

Case Study: Defects Primarily in One Quadrant

A high-speed production line was making a small number of critical defects. A quality engineer randomly collected 100 defects. He examined them and found that 22 defects were from the first quadrant of the product, 36 from the second, 21 from the third, and 21 from the fourth. He concluded that there was less than a 1% chance that 36 or more defects out of 100 would be coming from the second quadrant due to random causes alone.

So he went looking for anything suspicious on the production line that was exclusive to the second quadrant. He found a cooling nozzle from which some of the spray hit the second quadrant of the product. Since the product was very hot at that point in the process, he suspected that stress was being introduced.

When he asked the operator why he had put the spray at that location, he was told that it was being used to cool the tooling that was adjacent to the product at this point. The engineer then designed a more directed spray method that missed the product but hit the tooling, which allowed adequate cooling and also solved the problem of excess defects in that quadrant.


Let's see if we agree with the engineer's conclusion that the higher number of defects in the second quadrant was probably not due to random causes alone. ("Random" would mean that there was nothing peculiar about this quadrant: he just happened to get a sample with more defects in this location.) Note that he checked the chances of 36 or more defects happening randomly, since there was nothing special about being exactly 36. So, we want to calculate the chances of 36 or more defects occurring in the second quadrant from random causes only.

Excel BINOMDIST

successes = 35

trials = 100

p = 0.25 (1/4, the probability of the defect randomly occurring in the second quadrant)

cumulative: true

The result is 0.9906. P would then equal 1.0000 – 0.9906 = 0.0094, or 0.94%.

So, there is only a 0.94% chance of getting 36 or more of the 100 defects in the second quadrant due to random results. The engineer was right to be suspicious, and his conclusion enabled him to focus his attention on areas affecting only that quadrant, which minimized the areas in the process that he had to examine to identify the problem source.
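The same test can be applied to all four quadrants at once, which makes it easy to see that only the second one stands out. This is a sketch in Python using the defect counts from the case study; the function name `at_least` and the dictionary layout are assumptions made for illustration.

```python
from math import comb

def at_least(successes, trials, p):
    """P(successes or more) = 1 - P(successes - 1 or fewer)."""
    below = sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
                for k in range(successes))
    return 1 - below

defects = {"first": 22, "second": 36, "third": 21, "fourth": 21}
total = sum(defects.values())   # 100 defects collected
p = 1 / len(defects)            # 0.25 if location were purely random

# Chance of seeing at least the observed count in each quadrant.
tail = {name: at_least(count, total, p) for name, count in defects.items()}
for name, value in tail.items():
    print(f"{name} quadrant: P({defects[name]} or more) = {value:.4f}")
```

The three quadrants with counts below the expected 25 give large probabilities, as they should, since a count that low or higher is entirely ordinary.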

Excel BINOMDIST Trials Max at 1000

Excel BINOMDIST allows a maximum of 1000 trials. So, if you have more than 1000 trials, proportion the trials and number of successes to 1000.

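For example, if the data has 2000 trials and you are looking for 140 successes, use Excel BINOMDIST with 1000 trials and 70 successes. The p is not affected. Note that this scaling is an approximation: the result will differ somewhat from an exact calculation on all 2000 trials.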
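To get a feel for how much the proportioning changes the answer, the sketch below compares an exact calculation on 2000 trials with the scaled-down version on 1000 trials. It is written in Python and works with logarithms so that large trial counts do not overflow. The text does not give a value of p for this example, so p = 0.06 is a purely hypothetical choice, and the function names are assumptions.

```python
from math import exp, lgamma, log

def pmf(k, n, p):
    """Probability of exactly k successes, computed in log space."""
    log_value = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(p) + (n - k) * log(1 - p))
    return exp(log_value)

def at_least(successes, trials, p):
    """P(successes or more) = 1 - P(successes - 1 or fewer)."""
    return 1 - sum(pmf(k, trials, p) for k in range(successes))

p = 0.06  # hypothetical value, not from the text

exact = at_least(140, 2000, p)    # all 2000 trials
scaled = at_least(70, 1000, p)    # proportioned to 1000 trials

print(f"exact: {exact:.4f}   scaled: {scaled:.4f}")
```

With these assumed inputs the scaled version gives a larger tail probability than the exact one, so conclusions drawn from the proportioned numbers are the more cautious of the two.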


Problem #7

These are the results of 109 random erroneous orders processed by seven telephone operators, A through G:

Operator   Erroneous Orders
A          14
B          16
C          22
D          13
E          16
F          15
G          13

Is Operator C's performance worse than we should expect, since C's 22 error total is higher than that of any of the other six operators? We would want to be at least 95% confident that the performance was poor before we took any action. We want to check against the likelihood of getting 22 or more errors, rather than exactly 22. This will be more conservative and it's not important that there are exactly 22 order errors.

Excel BINOMDIST

successes = 21

trials = 109

p = 1/7 = 0.14286 (random chance since there are seven operators)

cumulative: true

The result is 0.9429. P would then equal 1.0000 – 0.9429 = 0.0571, or 5.7%.

This means that there is a 5.7% chance of this happening randomly without an assignable cause. The confidence level of the conclusion that this is not random would therefore be 100% – 5.7% = 94.3%, which is below the 95% test threshold.

Therefore, we can't conclude with a 95% confidence that Operator C is performing worse than any of the other operators.
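The comparison against the 95% threshold can be made explicit in code. This Python sketch uses the error counts from the table above; the function name `at_least` and the variable names are assumptions. The text reports a probability of about 0.0571 for Operator C.

```python
from math import comb

def at_least(successes, trials, p):
    """P(successes or more) = 1 - P(successes - 1 or fewer)."""
    below = sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
                for k in range(successes))
    return 1 - below

errors = {"A": 14, "B": 16, "C": 22, "D": 13, "E": 16, "F": 15, "G": 13}
total = sum(errors.values())    # 109 erroneous orders
p = 1 / len(errors)             # 1/7 if errors fell on operators at random

p_random = at_least(errors["C"], total, p)
confidence = 1 - p_random
required = 0.95

print(f"P(22 or more) = {p_random:.4f}, confidence = {confidence:.1%}")
print("Meets the 95% threshold:", confidence >= required)
```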

10.1.1. Additional Practice Problems

Problem #8

What is the likelihood of getting four heads in seven flips of a coin?

Problem #9

What is the likelihood of getting at least four heads in seven flips of a coin?

Problem #10

What is the probability of getting three fives on six rolls of a die?

Problem #11

What is the probability of getting three or more fives rolling six dice one time?

Problem #12

An automaker entered the marketplace with a car to compete with two other brands already in that market. In the first month after introduction, the new entry got 355 sales of a random sample of 1000 sales from the total market. Can the automaker with the new entry say with a 95% confidence that he got more than the projected 1/3 of the sales due to other than random causes?

Problem #13

An automatic transmission repair shop had an established standard to do a certain type of repair. They hired a new mechanic. After six months, they sampled 12 of the new mechanic's repair times versus the standard repair. The new mechanic took more than the standard time on eight of the 12 samples. How confident would the repair shop be in judging that the higher times were due to performance rather than to random causes?

10.1.2. Solutions to Additional Practice Problems

Problem #8

What is the likelihood of getting four heads in seven flips of a coin?

Excel BINOMDIST

successes = 4

trials = 7

p = 0.5

cumulative: false

The result is P = 0.2734, or 27.34%.

Problem #9

What is the likelihood of getting at least four heads in seven flips of a coin?

Excel BINOMDIST

successes = 3

trials = 7

p = 0.5

cumulative: true

The result is 0.500. P would then equal 1.000 – 0.500 = 0.500, or 50.0%.

Problem #10

What is the probability of getting three fives on six rolls of a die?

Excel BINOMDIST

successes = 3

trials = 6

p = 1/6 = 0.1667

cumulative: false

The result is P = 0.0536, or 5.36%.

Problem #11

What is the probability of getting three or more fives rolling six dice one time?

Excel BINOMDIST

successes = 2

trials = 6

p = 1/6 = 0.1667

cumulative: true

The result is 0.9377. P would then equal 1.0000 – 0.9377 = 0.0623, or 6.23%.

(Note that rolling six dice at one time is the same as rolling one die six times. This is because in both cases the result on each die is independent of the results on the others.)

Problem #12

An automaker entered the marketplace with a car to compete with two other brands already in that market. In the first month after introduction, the new entry got 355 sales of a random sample of 1000 sales from the total market. Can the automaker with the new entry say with a 95% confidence that he got more than the projected 1/3 of the sales due to other than random causes?

We want to check against the likelihood of 355 or more sales, since getting exactly 355 is not what is important!

Excel BINOMDIST

successes = 354

trials = 1000

p = 1/3 = 0.3333

cumulative: true

The result is 0.9220. P would then equal 1.0000 – 0.9220 = 0.0780, or 7.8%.

The automaker's confidence would therefore be 100% – 7.8% = 92.2%, which is below the 95% confidence target level the automaker wanted. Therefore, the automaker can't claim that the new entry gained more than 1/3 of the market due to other than random cause.

Note that if this problem had been done using exactly 355 as the test criterion, it would have shown that the automaker did get more than 1/3 of the market with a 99% confidence. This type of error occurs because it is unlikely to get any specific result randomly with this number of trials.
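The difference between testing "exactly 355" and "355 or more" is easy to see when both are computed together. The sketch below is in Python and works in log space to stay safe with 1000 trials; the helper name `pmf` is an assumption.

```python
from math import exp, lgamma, log

def pmf(k, n, p):
    """Probability of exactly k successes, computed in log space."""
    log_value = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(p) + (n - k) * log(1 - p))
    return exp(log_value)

n, p = 1000, 1 / 3

exactly_355 = pmf(355, n, p)                             # the misleading test
at_least_355 = 1 - sum(pmf(k, n, p) for k in range(355)) # the correct test

print(f"exactly 355: {exactly_355:.4f}")
print(f"355 or more: {at_least_355:.4f}")
```

The "exactly" probability is small simply because any single count out of 1000 trials is rare, which is why it overstates the confidence.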

Problem #13

An automatic transmission repair shop had an established standard to do a certain type of repair. They hired a new mechanic. After six months, they sampled 12 of the new mechanic's repair times versus the standard repair. The new mechanic took more than the standard time on eight of the 12 samples. How confident would the repair shop be in judging that the higher times were due to performance rather than to random causes?

Assume that an average mechanic's repair times would be above standard 50% of the time, so p = 0.5. As in the problem above, we want to check versus eight or more repair times over the standard, since being exactly eight is not the issue.

Excel BINOMDIST

successes = 7

trials = 12

p = 1/2 = 0.5

cumulative: true

The result is 0.8062. P would then equal 1.0000 – 0.8062 = 0.1938, or 19.38%.

So, this result would happen randomly 19.38% of the time. The repair shop would therefore be only 100% – 19.38% = 80.62% confident that the new mechanic was taking more time than the standard due to other than random causes.
