4.3 The Binomial Random Variable

Many experiments result in dichotomous responses (i.e., responses for which there exist two possible alternatives, such as Yes–No, Pass–Fail, Defective–Nondefective, or Male–Female). A simple example of such an experiment is the coin-toss experiment. A coin is tossed a number of times, say, 10. Each toss results in one of two outcomes, Head or Tail, and the probability of observing each of these two outcomes remains the same for each of the 10 tosses. Ultimately, we are interested in the probability distribution of x, the number of heads observed. Many other experiments are equivalent to tossing a coin (either balanced or unbalanced) a fixed number n of times and observing the number x of times that one of the two possible outcomes occurs. Random variables that possess these characteristics are called binomial random variables.

Public opinion and consumer preference polls (e.g., the CNN, Gallup, and Harris polls) frequently yield observations on binomial random variables. For example, suppose a sample of 100 students is selected from a large student body and each person is asked whether he or she favors (a Head) or opposes (a Tail) a certain campus issue. Suppose we are interested in x, the number of students in the sample who favor the issue. Sampling 100 students is analogous to tossing the coin 100 times. Thus, you can see that opinion polls that record the number of people who favor a certain issue are real-life equivalents of coin-toss experiments. We have been describing a binomial experiment, identified by the following characteristics:

Characteristics of a Binomial Random Variable

  1. The experiment consists of n identical trials.

  2. There are only two possible outcomes on each trial. We will denote one outcome by S (for Success) and the other by F (for Failure).

  3. The probability of S remains the same from trial to trial. This probability is denoted by p, and the probability of F is denoted by $q = 1 - p$.

  4. The trials are independent.

  5. The binomial random variable x is the number of S’s in n trials.
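The five characteristics can be mimicked with a short simulation. The sketch below is illustrative only: the function name `simulate_binomial`, the seed, and the choice of n = 10 and p = .5 are assumptions made for the example, not part of the text.

```python
import random

def simulate_binomial(n, p, rng):
    """Run n independent trials, each a Success with probability p,
    and return the number of Successes observed."""
    successes = 0
    for _ in range(n):            # characteristic 1: n identical trials
        if rng.random() < p:      # characteristics 2-4: S or F, same p, independent draws
            successes += 1
    return successes              # characteristic 5: x = number of S's in n trials

rng = random.Random(2024)             # arbitrary fixed seed so the run is repeatable
x = simulate_binomial(10, 0.5, rng)   # e.g., heads in 10 tosses of a balanced coin
print(x)
```

Each call produces one observed value of x; repeating the call many times would trace out the distribution that the rest of this section derives exactly.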

Biography Jacob Bernoulli (1654–1705)

The Bernoulli Distribution

Son of a magistrate and spice maker in Basel, Switzerland, Jacob Bernoulli completed a degree in theology at the University of Basel. While at the university, however, he studied mathematics secretly and against the will of his father. Jacob taught mathematics to his younger brother Johann, and they both went on to become distinguished European mathematicians. At first the brothers collaborated on the problems of the time (e.g., calculus); unfortunately, they later became bitter mathematical rivals. Jacob applied his philosophical training and mathematical intuition to probability and the theory of games of chance, where he developed the law of large numbers. In his book Ars Conjectandi, published in 1713 (eight years after his death), the binomial distribution was first proposed. Jacob showed that the binomial distribution is a sum of independent 0–1 variables, now known as Bernoulli random variables.

Example 4.9 Assessing Whether x Is a Binomial

Problem

  1. For the following examples, decide whether x is a binomial random variable:

    1. A university scholarship committee must select two students to receive a scholarship for the next academic year. The committee receives 10 applications for the scholarships—6 from male students and 4 from female students. Suppose the applicants are all equally qualified, so that the selections are randomly made. Let x be the number of female students who receive a scholarship.

    2. Before marketing a new product on a large scale, many companies conduct a consumer-preference survey to determine whether the product is likely to be successful. Suppose a company develops a new diet soda and then conducts a taste-preference survey in which 100 randomly chosen consumers state their preferences from among the new soda and the two leading sellers. Let x be the number of the 100 who choose the new brand over the two others.

    3. Some surveys are conducted by using a method of sampling other than simple random sampling. For example, suppose a television cable company plans to conduct a survey to determine the fraction of households in a certain city that would use its new fiber-optic service (FiOS). The sampling method is to choose a city block at random and then survey every household on that block. This sampling technique is called cluster sampling. Suppose 10 blocks are so sampled, producing a total of 124 household responses. Let x be the number of the 124 households that would use FiOS.

Solution

  1. In checking the binomial characteristics, a problem arises with both characteristic 3 (probabilities the same across trials) and characteristic 4 (independence). On the one hand, given that the first student selected is female, the probability that the second chosen is female is 3/9. On the other hand, given that the first selection is a male student, the probability that the second is female is 4/9. Thus, the conditional probability of a Success (choosing a female student to receive a scholarship) on the second trial (selection) depends on the outcome of the first trial, and the trials are therefore dependent. Since the trials are not independent, this variable is not a binomial random variable. (This variable is actually a hypergeometric random variable, the topic of optional Section 4.6.)

  2. Surveys that produce dichotomous responses and use random-sampling techniques are classic examples of binomial experiments. In this example, each randomly selected consumer either states a preference for the new diet soda or does not. The sample of 100 consumers is a very small proportion of the totality of potential consumers, so the response of one would be, for all practical purposes, independent of another.* Thus, x is a binomial random variable.

  3. This example is a survey with dichotomous responses (Yes or No to using FiOS), but the sampling method is not simple random sampling. Again, the binomial characteristic of independent trials would probably not be satisfied. The responses of households within a particular block would be dependent, since households within a block tend to be similar with respect to income, level of education, and general interests. Thus, the binomial model would not be satisfactory for x if the cluster sampling technique were employed.
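The dependence in the scholarship example (part 1 of the solution) can be checked by listing every equally likely pair of recipients. This is a minimal sketch; the variable names are hypothetical, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

applicants = ["M"] * 6 + ["F"] * 4          # 6 male and 4 female applicants
pairs = list(combinations(range(10), 2))    # all 45 equally likely pairs of recipients

# Exact distribution of x = number of female recipients
dist = {x: Fraction(0) for x in range(3)}
for i, j in pairs:
    x = (applicants[i] == "F") + (applicants[j] == "F")
    dist[x] += Fraction(1, len(pairs))

# Conditional probabilities for the second selection
p_f_given_f = Fraction(3, 9)   # a female chosen first: 3 of the remaining 9 are female
p_f_given_m = Fraction(4, 9)   # a male chosen first: 4 of the remaining 9 are female

for x, prob in dist.items():
    print(x, prob)
```

Because the two conditional probabilities differ, the trials are dependent, which is why x here is not a binomial random variable.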

Look Back

Nonbinomial variables with two outcomes on every trial typically occur because they do not satisfy characteristic 3 or characteristic 4 of a binomial distribution listed in the box on p. 183.

Example 4.10 Deriving the Binomial Probability Distribution—Passing a Physical Fitness Exam

Problem

  1. The Heart Association claims that only 10% of U.S. adults over 30 years of age meet the minimum requirements established by the President’s Council on Fitness, Sports, and Nutrition. Suppose four adults are randomly selected and each is given the fitness test.

    1. Use the steps given in Chapter 3 (box on p. 123) to find the probability that none of the four adults passes the test.

    2. Find the probability that three of the four adults pass the test.

    3. Let x represent the number of the four adults who pass the fitness test. Explain why x is a binomial random variable.

    4. Use the answers to parts a and b to derive a formula for p(x), the probability distribution of the binomial random variable x.

Solution

    1. The first step is to define the experiment. Here we are interested in observing the fitness test results of each of the four adults: pass (S) or fail (F).

    2. Next, we list the sample points associated with the experiment. Each sample point consists of the test results of the four adults. For example, SSSS represents the sample point denoting that all four adults pass, while FSSS represents the sample point denoting that adult 1 fails, while adults 2, 3, and 4 pass the test. The tree diagram, Figure 4.8, shows that there are 16 sample points. These 16 sample points are listed in Table 4.2.

      Figure 4.8

      Tree Diagram Showing Outcomes for Fitness Tests

      Table 4.2 Sample Points for Fitness Test of Example 4.10

      SSSS  FSSS  FFSS  SFFF  FFFF
            SFSS  FSFS  FSFF
            SSFS  FSSF  FFSF
            SSSF  SFFS  FFFS
                  SFSF
                  SSFF
    3. We now assign probabilities to the sample points. Note that each sample point can be viewed as the intersection of four adults’ test results, and assuming that the results are independent, the probability of each sample point can be obtained by the multiplicative rule as follows:

      $$
      \begin{aligned}
      P(SSSS) &= P[(\text{adult 1 passes}) \cap (\text{adult 2 passes}) \cap (\text{adult 3 passes}) \cap (\text{adult 4 passes})] \\
      &= P(\text{adult 1 passes}) \times P(\text{adult 2 passes}) \times P(\text{adult 3 passes}) \times P(\text{adult 4 passes}) \\
      &= (.1)(.1)(.1)(.1) = (.1)^{4} = .0001
      \end{aligned}
      $$

      All other sample-point probabilities are calculated by similar reasoning. For example,

      $$P(FSSS) = (.9)(.1)(.1)(.1) = .0009$$

      You can check that this reasoning results in sample-point probabilities that add to 1 over the 16 points in the sample space.

    4. Finally, we add the appropriate sample-point probabilities to obtain the desired event probability. The event of interest is that all four adults fail the fitness test. In Table 4.2, we find only one sample point, FFFF, contained in this event. All other sample points imply that at least one adult passes. Thus,

      $$P(\text{All four adults fail}) = P(FFFF) = (.9)^{4} = .6561$$
  1. The event that three of the four adults pass the fitness test consists of the four sample points in the second column of Table 4.2: FSSS, SFSS, SSFS, and SSSF. To obtain the event probability, we add the sample-point probabilities:

    $$
    \begin{aligned}
    P(\text{3 of 4 adults pass}) &= P(FSSS) + P(SFSS) + P(SSFS) + P(SSSF) \\
    &= (.1)^{3}(.9) + (.1)^{3}(.9) + (.1)^{3}(.9) + (.1)^{3}(.9) \\
    &= 4(.1)^{3}(.9) = .0036
    \end{aligned}
    $$

    Note that each of the four sample-point probabilities is the same because each sample point consists of three S’s and one F; the order does not affect the probability because the adults’ test results are (assumed) independent.

  2. We can characterize this experiment as consisting of four identical trials: the four test results. There are two possible outcomes to each trial, S or F, and the probability of passing, p=.1, is the same for each trial. Finally, we are assuming that each adult’s test result is independent of all the others’, so that the four trials are independent. Then it follows that x, the number of the four adults who pass the fitness test, is a binomial random variable.

  3. The event probabilities in parts a and b provide insight into the formula for the probability distribution p(x). First, consider the event that three adults pass (part b). We found that

    $$
    \begin{aligned}
    P(x = 3) &= (\text{Number of sample points for which } x = 3) \times (.1)^{\text{Number of successes}} \times (.9)^{\text{Number of failures}} \\
    &= 4(.1)^{3}(.9)^{1}
    \end{aligned}
    $$

In general, we can use combinatorial mathematics to count the number of sample points. For example,

$$
\begin{aligned}
\text{Number of sample points for which } x = 3 &= \text{Number of different ways of selecting 3 successes in the 4 trials} \\
&= \binom{4}{3} = \frac{4!}{3!\,(4-3)!} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1) \cdot 1} = 4
\end{aligned}
$$

The formula that works for any value of x can be deduced as follows:

$$P(x = 3) = \binom{4}{3}(.1)^{3}(.9)^{1} = \binom{4}{x}(.1)^{x}(.9)^{4-x}$$

The component $\binom{4}{x}$ counts the number of sample points with x successes, and the component $(.1)^{x}(.9)^{4-x}$ is the probability associated with each sample point having x successes.

For the general binomial experiment, with n trials and probability p of Success on each trial, the probability of x successes is $p(x) = \binom{n}{x} p^{x} (1 - p)^{n-x}$.

Look Ahead

In theory, you could always resort to the principles developed in this example to calculate binomial probabilities; just list the sample points and sum their probabilities. However, as the number of trials (n) increases, the number of sample points grows very rapidly. (The number of sample points is $2^{n}$.) Thus, we prefer the formula for calculating binomial probabilities, since its use avoids listing sample points.
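To make the comparison concrete, the following sketch lists all 16 sample points of Example 4.10, applies the multiplicative rule to each, and checks the totals against the counting formula. The names `points`, `point_prob`, `by_enumeration`, and `by_formula` are hypothetical and introduced only for this illustration.

```python
from itertools import product
from math import comb

p, q, n = 0.1, 0.9, 4

# All 2**n sample points, e.g. ('S', 'S', 'S', 'S'), ('S', 'S', 'S', 'F'), ...
points = list(product("SF", repeat=n))

def point_prob(point):
    """Multiplicative rule for independent trials: a factor p per S, q per F."""
    prob = 1.0
    for outcome in point:
        prob *= p if outcome == "S" else q
    return prob

# Add the sample-point probabilities, grouped by the number of S's
by_enumeration = [0.0] * (n + 1)
for point in points:
    by_enumeration[point.count("S")] += point_prob(point)

# Same probabilities from (number of sample points) * p**x * q**(n - x)
by_formula = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

for x in range(n + 1):
    print(x, round(by_enumeration[x], 4), round(by_formula[x], 4))
```

Enumeration is workable for n = 4, but the list doubles with every added trial, whereas the formula needs only one term per value of x.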

The binomial distribution* is summarized in the following box:

The Binomial Probability Distribution

$$p(x) = \binom{n}{x} p^{x} q^{n-x} \qquad (x = 0, 1, 2, \ldots, n)$$

where

$$
\begin{aligned}
p &= \text{Probability of a success on a single trial} \\
q &= 1 - p \\
n &= \text{Number of trials} \\
x &= \text{Number of successes in } n \text{ trials} \\
n - x &= \text{Number of failures in } n \text{ trials} \\
\binom{n}{x} &= \frac{n!}{x!\,(n-x)!}
\end{aligned}
$$

As noted in Chapter 3, the symbol 5! means $5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$. Similarly, $n! = n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1$. (Remember, $0! = 1$.)

Example 4.11 Applying the Binomial Distribution—Physical Fitness Problem

Problem

  1. Refer to Example 4.10. Use the formula for a binomial random variable to find the probability distribution of x, where x is the number of adults who pass the fitness test. Graph the distribution.

Solution

  1. For this application, we have n=4 trials. Since a success S is defined as an adult who passes the test, $p = P(S) = .1$ and $q = 1 - p = .9$. Substituting $n = 4$, $p = .1$, and $q = .9$ into the formula for p(x), we obtain

    $$
    \begin{aligned}
    p(0) &= \frac{4!}{0!\,(4-0)!}(.1)^{0}(.9)^{4-0} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(1)(4 \cdot 3 \cdot 2 \cdot 1)}(.1)^{0}(.9)^{4} = 1(.1)^{0}(.9)^{4} = .6561 \\
    p(1) &= \frac{4!}{1!\,(4-1)!}(.1)^{1}(.9)^{4-1} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(1)(3 \cdot 2 \cdot 1)}(.1)^{1}(.9)^{3} = 4(.1)(.9)^{3} = .2916 \\
    p(2) &= \frac{4!}{2!\,(4-2)!}(.1)^{2}(.9)^{4-2} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(2 \cdot 1)(2 \cdot 1)}(.1)^{2}(.9)^{2} = 6(.1)^{2}(.9)^{2} = .0486 \\
    p(3) &= \frac{4!}{3!\,(4-3)!}(.1)^{3}(.9)^{4-3} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(1)}(.1)^{3}(.9)^{1} = 4(.1)^{3}(.9) = .0036 \\
    p(4) &= \frac{4!}{4!\,(4-4)!}(.1)^{4}(.9)^{4-4} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1)(1)}(.1)^{4}(.9)^{0} = 1(.1)^{4}(.9)^{0} = .0001
    \end{aligned}
    $$

Look Back

Note that these probabilities, listed in Table 4.3, sum to 1. A graph of this probability distribution is shown in Figure 4.9.

Figure 4.9

Probability distribution for physical fitness example: Graphical form

Table 4.3 Probability Distribution for Physical Fitness Example: Tabular Form

x p(x)
0 .6561
1 .2916
2 .0486
3 .0036
4 .0001
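The entries of Table 4.3 can be reproduced directly from the binomial formula. The sketch below assumes a helper named `binomial_pmf`, which is a hypothetical name chosen for this example; it relies only on `math.comb` from the Python standard library.

```python
from math import comb

def binomial_pmf(x, n, p):
    """p(x) = C(n, x) * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Fitness example: n = 4 adults, p = .1 chance that each one passes
table = {x: binomial_pmf(x, 4, 0.1) for x in range(5)}

for x, prob in table.items():
    print(x, round(prob, 4))

print("total:", round(sum(table.values()), 4))   # the probabilities should add to 1
```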

Now Work Exercise 4.53

Example 4.12 Finding μ and σ—Physical Fitness Problem

Problem

  1. Refer to Examples 4.10 and 4.11. Calculate μ and σ, the mean and standard deviation, respectively, of the number of the four adults who pass the test. Interpret the results.

Solution

  1. From Section 4.2, we know that the mean of a discrete probability distribution is

    $$\mu = \sum x\,p(x)$$

    Referring to Table 4.3, the probability distribution for the number x who pass the fitness test, we find that

    $$
    \begin{aligned}
    \mu &= 0(.6561) + 1(.2916) + 2(.0486) + 3(.0036) + 4(.0001) = .4 \\
    &= 4(.1) = np
    \end{aligned}
    $$

    Thus, in the long run, the average number of adults (out of four) who pass the test is only .4.

    [Note: The relationship μ=np holds in general for a binomial random variable.]

    The variance is

    $$
    \begin{aligned}
    \sigma^{2} &= \sum (x - \mu)^{2} p(x) = \sum (x - .4)^{2} p(x) \\
    &= (0 - .4)^{2}(.6561) + (1 - .4)^{2}(.2916) + (2 - .4)^{2}(.0486) \\
    &\quad + (3 - .4)^{2}(.0036) + (4 - .4)^{2}(.0001) \\
    &= .104976 + .104976 + .124416 + .024336 + .001296 \\
    &= .36 = 4(.1)(.9) = npq
    \end{aligned}
    $$

    [Note: The relationship $\sigma^{2} = npq$ holds in general for a binomial random variable.]

    Finally, the standard deviation of the number who pass the fitness test is

    $$\sigma = \sqrt{\sigma^{2}} = \sqrt{.36} = .6$$

    Since the distribution shown in Figure 4.9 is skewed right, we should apply Chebyshev’s rule to describe where most of the x-values fall. According to the rule, at least 75% of the x values will fall into the interval $\mu \pm 2\sigma = .4 \pm 2(.6) = (-.8, 1.6)$. Since x cannot be negative, we expect (i.e., in the long run) the number of adults out of four who pass the fitness test to be less than 1.6.

Look Back

Examining Figure 4.9, you can see that all observations equal to 0 or 1 will fall within the interval $(-.8, 1.6)$. The probabilities corresponding to these values (from Table 4.3) are .6561 and .2916, respectively. Summing them, we obtain $.6561 + .2916 = .9477 \approx .95$. This result is closer to the value stated by the empirical rule. In practice, researchers have found that the proportion of observations that fall within two standard deviations of the mean for many skewed distributions will be close to .95.

We emphasize that you need not use the expectation summation rules to calculate μ and $\sigma^{2}$ for a binomial random variable. You can find them easily from the formulas $\mu = np$ and $\sigma^{2} = npq$.

Mean, Variance, and Standard Deviation for a Binomial Random Variable

  • Mean: $\mu = np$

  • Variance: $\sigma^{2} = npq$

  • Standard deviation: $\sigma = \sqrt{npq}$
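As a check on these shortcut formulas, the sketch below computes μ and σ² for the fitness example (n = 4, p = .1) from the Section 4.2 definitions and compares them with np and npq. The variable names are assumptions made for the illustration.

```python
from math import comb, sqrt

n, p = 4, 0.1
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Definitions for any discrete probability distribution
mu = sum(x * pmf[x] for x in range(n + 1))
var = sum((x - mu) ** 2 * pmf[x] for x in range(n + 1))
sigma = sqrt(var)

# Shortcut formulas for a binomial random variable
mu_short, var_short = n * p, n * p * q

print(round(mu, 4), round(mu_short, 4))
print(round(var, 4), round(var_short, 4))
print(round(sigma, 4))
```

The two routes agree up to floating-point rounding, so in practice only the shortcut formulas are needed.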

Using Tables and Technology for Binomial Probabilities

Calculating binomial probabilities becomes tedious when n is large. For some values of n and p, the binomial probabilities have been tabulated in Table I of Appendix B. Part of that table is shown in Table 4.4; a graph of the binomial probability distribution for n=10 and p=.10 is shown in Figure 4.10.

Table 4.4 Reproduction of Part of Table I of Appendix B: Binomial Probabilities for n=10

k \ p    .01    .05    .10    .20    .30    .40    .50    .60    .70    .80    .90    .95    .99
0       .904   .599   .349   .107   .028   .006   .001   .000   .000   .000   .000   .000   .000
1       .996   .914   .736   .376   .149   .046   .011   .002   .000   .000   .000   .000   .000
2      1.000   .988   .930   .678   .383   .167   .055   .012   .002   .000   .000   .000   .000
3      1.000   .999   .987   .879   .650   .382   .172   .055   .011   .001   .000   .000   .000
4      1.000  1.000   .998   .967   .850   .633   .377   .166   .047   .006   .000   .000   .000
5      1.000  1.000  1.000   .994   .953   .834   .623   .367   .150   .033   .002   .000   .000
6      1.000  1.000  1.000   .999   .989   .945   .828   .618   .350   .121   .013   .001   .000
7      1.000  1.000  1.000  1.000   .998   .988   .945   .833   .617   .322   .070   .012   .000
8      1.000  1.000  1.000  1.000  1.000   .998   .989   .954   .851   .624   .264   .086   .004
9      1.000  1.000  1.000  1.000  1.000  1.000   .999   .994   .972   .893   .651   .401   .096

Table I actually contains a total of nine tables, labeled (a) through (i), one each corresponding to n=5,6,7,8,9,10,15,20, and 25, respectively. In each of these tables, the columns correspond to values of p and the rows correspond to values of the random variable x. The entries in the table represent cumulative binomial probabilities. For example, the entry in the column corresponding to p=.10 and the row corresponding to x=2 is .930 (highlighted), and its interpretation is

$$P(x \le 2) = P(x = 0) + P(x = 1) + P(x = 2) = .930$$

This probability is also highlighted in the graphical representation of the binomial distribution with n=10 and p=.10 in Figure 4.10.

You can also use Table I to find the probability that x equals a specific value. For example, suppose you want to find the probability that x=2 in the binomial distribution with n=10 and p=.10. This probability is found by subtraction as follows:

$$
\begin{aligned}
P(x = 2) &= [P(x = 0) + P(x = 1) + P(x = 2)] - [P(x = 0) + P(x = 1)] \\
&= P(x \le 2) - P(x \le 1) = .930 - .736 = .194
\end{aligned}
$$

Figure 4.10

Binomial probability distribution for n=10 and p=.10, with P(x=2) highlighted

The probability that a binomial random variable exceeds a specified value can be found from Table I together with the notion of complementary events. For example, to find the probability that x exceeds 2 when n=10 and p=.10, we use

$$P(x > 2) = 1 - P(x \le 2) = 1 - .930 = .070$$

Note that this probability is represented by the unhighlighted portion of the graph in Figure 4.10.

All probabilities in Table I are rounded to three decimal places. Thus, although none of the binomial probabilities in the table is exactly zero, some are small enough (less than .0005) to round to .000. For example, using the formula to find P(x=0) when n=10 and p=.6, we obtain

$$P(x = 0) = \binom{10}{0}(.6)^{0}(.4)^{10-0} = .4^{10} = .00010486$$

but this is rounded to .000 in Table I of Appendix B. (See Table 4.4.)

Similarly, none of the table entries is exactly 1.0, but when the cumulative probabilities exceed .9995, they are rounded to 1.000. The row corresponding to the largest possible value for x, $x = n$, is omitted because all the cumulative probabilities in that row are equal to 1.0 (exactly). For example, in Table 4.4 with $n = 10$, $P(x \le 10) = 1.0$, no matter what the value of p.
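The three table operations described above (reading a cumulative entry, subtracting adjacent entries, and using the complement) can be sketched in a few lines. The function name `binomial_cdf` is a hypothetical helper written for this illustration, not a routine from any particular statistical package.

```python
from math import comb

def binomial_cdf(k, n, p):
    """Cumulative probability P(x <= k) = p(0) + p(1) + ... + p(k)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, p = 10, 0.10
p_at_most_2 = binomial_cdf(2, n, p)                           # cumulative entry for k = 2
p_exactly_2 = binomial_cdf(2, n, p) - binomial_cdf(1, n, p)   # difference of adjacent entries
p_more_than_2 = 1 - binomial_cdf(2, n, p)                     # complementary event

print(round(p_at_most_2, 3), round(p_exactly_2, 3), round(p_more_than_2, 3))
```

Rounded to three decimal places, these values match the .930, .194, and .070 obtained from Table 4.4.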

Of course, you can also use technology such as statistical software or a graphing calculator to find binomial probabilities. The following example further illustrates the use of Table I and statistical software.

Example 4.13 Using the Binomial Table and Software—Voting for Mayor

Problem

  1. Suppose a poll of 20 voters is taken in a large city. The purpose is to determine x, the number who favor a certain candidate for mayor. Suppose that 60% of all the city’s voters favor the candidate.

    1. Find the mean and standard deviation of x.

    2. Use Table I of Appendix B to find the probability that $x \le 10$. Verify this probability using MINITAB.

    3. Use Table I to find the probability that x>12.

    4. Use Table I to find the probability that x=11. Verify this probability using MINITAB.

    5. Graph the probability distribution of x, and locate the interval μ±2σ on the graph.

Solution

  1. The number of voters polled is presumably small compared with the total number of eligible voters in the city. Thus, we may treat x, the number of the 20 who favor the mayoral candidate, as a binomial random variable. The value of p is the fraction of the total number of voters who favor the candidate (i.e., p=.6). Therefore, we calculate the mean and variance:

    $$
    \begin{aligned}
    \mu &= np = 20(.6) = 12 \\
    \sigma^{2} &= npq = 20(.6)(.4) = 4.8 \\
    \sigma &= \sqrt{4.8} = 2.19
    \end{aligned}
    $$
  2. Looking in the row for k=10 and the column for p=.6 of Table I (Appendix B) for n=20, we find the value .245. Thus,

    $$P(x \le 10) = .245$$

    Figure 4.11

    MINITAB Output for Example 4.13

    This value agrees with the cumulative probability shaded at the top of the MINITAB printout shown in Figure 4.11.

  3. To find the probability

    $$P(x > 12) = \sum_{x=13}^{20} p(x)$$

    we use the fact that for all probability distributions,

    $$\sum_{\text{all } x} p(x) = 1.$$

    Therefore,

    $$P(x > 12) = 1 - P(x \le 12) = 1 - \sum_{x=0}^{12} p(x)$$

    Consulting Table I of Appendix B, we find the entry in row k=12, column p=.6 to be .584. Thus,

    $$P(x > 12) = 1 - .584 = .416$$
  4. To find the probability that exactly 11 voters favor the candidate, recall that the entries in Table I are cumulative probabilities and use the relationship

    $$
    \begin{aligned}
    P(x = 11) &= [p(0) + p(1) + \cdots + p(11)] - [p(0) + p(1) + \cdots + p(10)] \\
    &= P(x \le 11) - P(x \le 10)
    \end{aligned}
    $$

    Then

    $$P(x = 11) = .404 - .245 = .159$$

    Again, this value agrees with the probability shaded at the bottom of the MINITAB printout, Figure 4.11.

  5. The probability distribution for x is shown in Figure 4.12. Note that

    $$\mu - 2\sigma = 12 - 2(2.2) = 7.6 \qquad \mu + 2\sigma = 12 + 2(2.2) = 16.4$$

    The interval $\mu - 2\sigma$ to $\mu + 2\sigma$ also is shown in Figure 4.12. The probability that x falls into the interval $\mu \pm 2\sigma$ is $P(x = 8, 9, 10, \ldots, 16) = P(x \le 16) - P(x \le 7) = .984 - .021 = .963$. Note that this probability is very close to the .95 given by the empirical rule. Thus, we expect the number of voters in the sample of 20 who favor the mayoral candidate to be between 8 and 16.
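The interval calculation in part e can also be carried out without the table. The following is a minimal sketch under the same assumptions as the example (n = 20, p = .6); the variable names are hypothetical, and the unrounded σ is used rather than 2.2.

```python
from math import comb, sqrt

n, p = 20, 0.6
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

mu = n * p
sigma = sqrt(n * p * q)
low, high = mu - 2 * sigma, mu + 2 * sigma

# Add p(x) over the integer values of x that lie inside the interval (here 8 through 16)
p_within = sum(pmf[x] for x in range(n + 1) if low < x < high)

print(round(mu, 2), round(sigma, 2))
print(round(low, 1), round(high, 1))
print(round(p_within, 3))
```

Summing the individual p(x) values gives the same probability as subtracting the two cumulative table entries, up to rounding.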

Figure 4.12

The binomial probability distribution for x in Example 4.13; n=20 and p=.6

Now Work Exercise 4.54

Exercises 4.48–4.72

Understanding the Principles

  1. 4.48 Give the five characteristics of a binomial random variable.

  2. 4.49 Give the formula for p(x) for a binomial random variable with n=7 and p=.2.

  3. 4.50 Consider the following binomial probability distribution:

    $$p(x) = \binom{5}{x}(.7)^{x}(.3)^{5-x} \qquad (x = 0, 1, 2, \ldots, 5)$$
    1. How many trials (n) are in the experiment?

    2. What is the value of p, the probability of success?

Learning the Mechanics

  1. 4.51 Compute the following:

    1. $\dfrac{6!}{2!\,(6-2)!}$

    2. $\binom{5}{2}$

    3. $\binom{7}{0}$

    4. $\binom{6}{6}$

    5. $\binom{4}{3}$

  2. 4.52 Suppose x is a binomial random variable with n=3 and p=.3.

    1. Calculate the value of p(x),x=0,1,2,3, using the formula for a binomial probability distribution.

    2. Using your answers to part a, give the probability distribution for x in tabular form.

  3. 4.53 If x is a binomial random variable, compute p(x) for each of the following cases:

    1. n=5,x=1,p=.2

    2. n=4,x=2,q=.4

    3. n=3,x=0,p=.7

    4. n=5,x=3,p=.1

    5. n=4,x=2,q=.6

    6. n=3,x=1,p=.9

  4. 4.54 If x is a binomial random variable, use Table I in Appendix B or technology to find the following probabilities:

    1. P(x=2) for n=10,p=.4

    2. P(x5) for n=15,p=.6

    3. P(x>1) for n=5,p=.1

  5. 4.55 If x is a binomial random variable, use Table I in Appendix B or technology to find the following probabilities:

    1. P(x<10) for n=25,p=.7

    2. P(x10) for n=15,p=.9

    3. P(x=2) for n=20,p=.2

  6. 4.56 If x is a binomial random variable, calculate μ, $\sigma^{2}$, and σ for each of the following:

    1. n=25,p=.5

    2. n=80,p=.2

    3. n=100,p=.6

    4. n=70,p=.9

    5. n=60,p=.8

    6. n=1,000,p=.04

  7. 4.57 The binomial probability distribution is a family of probability distributions with each single distribution depending on the values of n and p. Assume that x is a binomial random variable with n=4.

    1. Determine a value of p such that the probability distribution of x is symmetric.

    2. Determine a value of p such that the probability distribution of x is skewed to the right.

    3. Determine a value of p such that the probability distribution of x is skewed to the left.

    4. Graph each of the binomial distributions you obtained in parts a, b, and c. Locate the mean for each distribution on its graph.

    5. In general, for what values of p will a binomial distribution be symmetric? skewed to the right? skewed to the left?

Applet Exercise 4.3

Use the applets entitled Simulating the Probability of a Head with an Unfair Coin (P(H) = 0.2) and Simulating the Probability of a Head with an Unfair Coin (P(H) = 0.8) to study the mean μ of a binomial distribution.

    1. Run each applet once with n=1,000 and record the cumulative proportions. How does the cumulative proportion for each applet compare with the value of P(H) given for the applet?

    2. Using the cumulative proportion from each applet as p, compute μ=np for each applet, where n=1,000. What does the value of μ represent in terms of the results obtained from running each applet in part a?

    3. In your own words, describe what the mean μ of a binomial distribution represents.

Applet Exercise 4.4

Open the applet entitled Sample from a Population. On the pull-down menu to the right of the top graph, select Binary. Set n=10 as the sample size and repeatedly choose samples from the population. For each sample, record the number of 1’s in the sample. Let x be the number of 1’s in a sample of size 10. Explain why x is a binomial random variable.

Applet Exercise 4.5

Use the applet entitled Simulating the Stock Market to estimate the probability that the stock market will go up each of the next two days. Repeatedly run the applet for n=2, recording the number of ups each time. Use the proportion of 2’s among your results as the estimate of the probability. Compare your answer with the binomial probability where x=2,n=2, and p=0.5.

Applying the Concepts—Basic

  1. 4.58 Working on summer vacation. An Adweek/Harris (July 2011) poll found that 35% of U.S. adults do not work at all while on summer vacation. In a random sample of 10 U.S. adults, let x represent the number who do not work during summer vacation.

    1. a. For this experiment, define the event that represents a “success.”

    2. b. Explain why x is (approximately) a binomial random variable.

    3. c. Give the value of p for this binomial experiment.

    4. d. Find P(x=3).

    5. e. Find the probability that 2 or fewer of the 10 U.S. adults do not work during summer vacation.

  2. 4.59 Superstitions survey. Are Americans superstitious? A Harris (Feb. 2013) poll of over 2,000 adult Americans was designed to answer this question. One survey item concerned the phrase “see a penny, pick it up, all day long you’ll have good luck.” The poll found that just one-third of Americans (33%) believe finding and picking up a penny is good luck. Consider a random sample of 20 U.S. adults and let x represent the number who believe finding and picking up a penny is good luck.

    1. For this experiment, define the event that represents a “success.”

    2. Explain why x is (approximately) a binomial random variable.

    3. Give the value of p for this binomial experiment.

    4. Find P(x<10).

    5. Find P(x=17).

    6. Find P(x>5).

  3. 4.60 Where will you get your next pet? According to an Associated Press/Petside.com poll, half of all pet owners would get their next dog or cat from a shelter (USA Today, May 12, 2010). Consider a random sample of 10 pet owners and define x as the number of pet owners who would acquire their next dog or cat from a shelter. Assume that x is a binomial random variable.

    1. For this binomial experiment, define a success.

    2. For this binomial experiment, what is n?

    3. For this binomial experiment, what is p?

    4. Find P(x=7).

    5. Find P(x3).

    6. Find P(x>8).

  4. 4.61 Chemical signals of mice. Refer to the Cell (May 14, 2010) study of the ability of a mouse to recognize the odor of a potential predator, Exercise 3.63 (p. 141). Recall that the sources of these odors are typically major urinary proteins (Mups). In an experiment, 40% of lab mice cells exposed to chemically produced cat Mups responded positively (i.e., recognized the danger of the lurking predator). Consider a sample of 100 lab mice cells, each exposed to chemically produced cat Mups. Let x represent the number of cells that respond positively.

    1. Explain why the probability distribution of x can be approximated by the binomial distribution.

    2. Find E(x) and interpret its value, practically.

    3. Find the variance of x.

    4. Give an interval that is likely to contain the value of x.

  5. 4.62 Caesarian births. The American College of Obstetricians and Gynecologists reports that 32% of all births in the United States take place by Caesarian section each year (National Vital Statistics Reports, Mar. 2010).

    1. In a random sample of 1,000 births, how many, on average, will take place by Caesarian section?

    2. What is the standard deviation of the number of Caesarian section births in a sample of 1,000 births?

    3. Use your answers to parts a and b to form an interval that is likely to contain the number of Caesarian section births in a sample of 1,000 births.
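Answers to the exercises in this group can be checked against a binomial table or with a short program. The sketch below shows one possible set of helpers, using Exercises 4.58 and 4.62 as examples. The function names are hypothetical, and the interval assumes that "likely to contain" is read as μ ± 2σ; a different multiple of σ may be intended.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(k, n, p):
    """Cumulative probability P(x <= k)."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))

def mean_and_sd(n, p):
    """Mean np and standard deviation sqrt(npq) of a binomial random variable."""
    return n * p, sqrt(n * p * (1 - p))

# Exercise 4.58: n=10 adults, p=.35
print(f"P(x=3)  = {binomial_pmf(3, 10, 0.35):.4f}")
print(f"P(x<=2) = {binomial_cdf(2, 10, 0.35):.4f}")

# Exercise 4.62: n=1,000 births, p=.32
mu, sigma = mean_and_sd(1000, 0.32)
low, high = mu - 2 * sigma, mu + 2 * sigma   # assumes the mu +/- 2 sigma convention
print(f"mu={mu:.1f}, sigma={sigma:.2f}, interval=({low:.1f}, {high:.1f})")
```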

Applying the Concepts—Intermediate

  1. 4.63 Fingerprint expertise. Psychological Science (Aug. 2011) published a study of fingerprint identification. The study found that when presented with prints from the same individual, a fingerprint expert will correctly identify the match 92% of the time. In contrast, a novice will correctly identify the match 75% of the time. Consider a sample of five different pairs of fingerprints where each pair is a match.

    1. What is the probability that an expert will correctly identify the match in all five pairs of fingerprints?

    2. What is the probability that a novice will correctly identify the match in all five pairs of fingerprints?

  2. 4.64 Immediate feedback to incorrect exam answers. Researchers from the Educational Testing Service (ETS) found that providing immediate feedback to students answering open-ended questions can dramatically improve students’ future performance on exams (Educational and Psychological Measurement, Feb. 2010). The ETS researchers used questions from the Graduate Record Examination (GRE) in the experiment. After obtaining feedback, students could revise their answers. Consider one of these questions. Initially, 50% of the students answered the question correctly. After providing immediate feedback to students who answered incorrectly, 70% answered correctly. Consider a bank of 100 open-ended questions similar to those on the GRE.

    1. In a random sample of 20 students, what is the probability that more than half initially answer the question correctly?

    2. Refer to part a. After providing immediate feedback, what is the probability that more than half of the students answer the question correctly?

  3. 4.65 Making your vote count. Democratic and Republican presidential state primary elections differ in the way winning candidates are awarded delegates. In Republican states, the winner is awarded all the state’s delegates; conversely, the Democratic state winner is awarded delegates in proportion to the percentage of votes. This difference led to a Chance (Fall 2007) article on making your vote count. Consider a scenario where you are one of five county commissioners voting on an issue, where each commissioner is equally likely to vote for or against.

    1. Your vote counts (i.e., is the decisive vote) only if the other four voters split, 2 in favor and 2 against. Use the binomial distribution to find the probability that your vote counts.

    2. If you convince two other commissioners to “vote in bloc” (i.e., you all agree to vote among yourselves first, and whatever the majority decides is the way all three will vote, guaranteeing that the issue is decided by the bloc), your vote counts only if these 2 commissioners split their bloc votes, 1 in favor and 1 against. Again, use the binomial distribution to find the probability that your vote counts.

  4. 4.66 Marital name changing. As a female, how likely are you to follow tradition and change your last name to your husband’s when you get married? This was one of the questions investigated in research published in Advances in Applied Sociology (Nov. 2013). According to the study, the probability that an American female will change her last name upon marriage is .9. Consider a sample of 25 American females. How many of these females would you expect to change their last name upon marriage? How likely is it that 20 or more of these females will change their last name upon marriage?

  5. 4.67 Victims of domestic abuse. According to researchers at Dan Jones & Associates, 1 in every 3 women has been a victim of domestic abuse (Domestic Violence: Incidence and Prevalence Study, Sept.–Dec., 2005). This probability was obtained from a survey of 1,000 adult women residing in Utah. Suppose we randomly sample 15 women and find that 4 have been abused.

    1. What is the probability of observing 4 or more abused women in a sample of 15 if the proportion p of women who are victims of domestic abuse is really p=1/3?

    2. Many experts on domestic violence believe that the proportion of women who are domestically abused is closer to p=.10. Calculate the probability of observing 4 or more abused women in a sample of 15 if p=.10.

    3. Why might your answers to parts a and b lead you to believe that p=1/3?

  6. 4.68 Underwater acoustic communication. A subcarrier is one telecommunication signal carrier that is piggybacked on top of another carrier so that effectively two signals are carried at the same time. Subcarriers are classified as data subcarriers (used for data transmissions), pilot subcarriers (used for channel synchronization), and null subcarriers (used for direct current). In the IEEE Journal of Oceanic Engineering (Apr. 2013), researchers studied the characteristics of subcarriers for underwater acoustic communications. Based on an experiment conducted off the coast of Martha’s Vineyard (MA), they estimated that 25% of subcarriers are pilot subcarriers, 10% are null subcarriers, and 65% are data subcarriers. Consider a sample of 50 subcarriers transmitted for underwater acoustic communications.

    1. How many of the 50 subcarriers do you expect to be pilot subcarriers? Null subcarriers? Data subcarriers?

    2. How likely is it to observe 10 or more pilot subcarriers?

    3. If you observe more than 25 pilot subcarriers, what would you conclude? Explain.

  7. 4.69 Testing a psychic’s ESP. Refer to Exercise 3.101 (p. 156) and the experiment conducted by the Tampa Bay Skeptics to see whether an acclaimed psychic has extrasensory perception (ESP). Recall that a crystal was placed, at random, inside 1 of 10 identical boxes lying side by side on a table. The experiment was repeated seven times, and x, the number of correct decisions, was recorded. (Assume that the seven trials are independent.)

    1. If the psychic is guessing (i.e., if the psychic does not possess ESP), what is the value of p, the probability of a correct decision on each trial?

    2. If the psychic is guessing, what is the expected number of correct decisions in seven trials?

    3. If the psychic is guessing, what is the probability of no correct decisions in seven trials?

    4. Now suppose the psychic has ESP and p=.5. What is the probability that the psychic guesses incorrectly in all seven trials?

    5. Refer to part d. Recall that the psychic failed to select the box with the crystal on all seven trials. Is this evidence against the psychic having ESP? Explain.
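Several of the intermediate exercises ask for "k or more" probabilities, which are upper-tail sums of the binomial formula. The following sketch shows one way to compute them, with Exercises 4.65, 4.67, and 4.69 as examples. The helper names are hypothetical, and the code is offered as a way to check hand or table calculations, not as the text's own method.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def upper_tail(k, n, p):
    """P(x >= k): sum of the binomial probabilities from k through n."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

# Exercise 4.65: the other voters split evenly
print("other four split 2-2:", binomial_pmf(2, 4, 0.5))
print("two bloc partners split 1-1:", binomial_pmf(1, 2, 0.5))

# Exercise 4.67: 4 or more of 15 under two competing values of p
for p in (1 / 3, 0.10):
    print(f"p={p:.3f}: P(x>=4) = {upper_tail(4, 15, p):.4f}")

# Exercise 4.69: no correct decisions in seven trials
for p in (0.1, 0.5):
    print(f"p={p}: P(x=0) = {binomial_pmf(0, 7, p):.4f}")
```

Comparing the two tail probabilities printed for Exercise 4.67 shows how strongly an observed count of 4 favors one value of p over the other.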

Applying the Concepts—Advanced

  1. 4.70 Assigning a passing grade. A literature professor decides to give a 20-question true–false quiz to determine who has read an assigned novel. She wants to choose the passing grade such that the probability of passing a student who guesses on every question is less than .05. What score should she set as the lowest passing grade? 

  2. 4.71 USGA golf ball specifications. According to the U.S. Golf Association (USGA), “The weight of the [golf] ball shall not be greater than 1.620 ounces avoirdupois (45.93 grams)…. The diameter of the ball shall not be less than 1.680 inches…. The velocity of the ball shall not be greater than 250 feet per second” (USGA, 2014). The USGA periodically checks the specifications of golf balls sold in the United States by randomly sampling balls from pro shops around the country. Two dozen of each kind are sampled, and if more than three do not meet size or velocity requirements, that kind of ball is removed from the USGA’s approved-ball list.

    1. What assumptions must be made and what information must be known in order to use the binomial probability distribution to calculate the probability that the USGA will remove a particular kind of golf ball from its approved-ball list?

    2. Suppose 10% of all balls produced by a particular manufacturer are less than 1.680 inches in diameter, and assume that the number of such balls, x, in a sample of two dozen balls can be adequately characterized by a binomial probability distribution. Find the mean and standard deviation of the binomial distribution.

    3. Refer to part b. If x has a binomial distribution, then so does the number, y, of balls in the sample that meet the USGA’s minimum diameter. [Note: x+y=24.] Describe the distribution of y. In particular, what are p, q, and n? Also, find E(y) and the standard deviation of y.

  3. 4.72 Does having boys run in the family? Chance (Fall 2001) reported that the eight men in the Rodgers family produced 24 biological children over four generations. Of these 24 children, 21 were boys and 3 were girls. How likely is it for a family of 24 children to have 21 boys? Use the binomial distribution and the fact that 50% of the babies born in the United States are male to answer the question. Do you agree with the statement, “Rodgers men produce boys”?
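Exercise 4.70 amounts to a search: find the smallest score k for which the probability that a guesser scores k or more falls below the chosen cutoff. The sketch below illustrates that search and also evaluates the single binomial probability needed for Exercise 4.72. It assumes a guesser answers each true–false question correctly with probability .5, independently; the function names are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def upper_tail(k, n, p):
    """P(x >= k): probability of k or more successes."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

def lowest_passing_score(n, p_guess, alpha):
    """Smallest k with P(x >= k) < alpha for a guesser, or None if no such k exists."""
    for k in range(n + 1):
        if upper_tail(k, n, p_guess) < alpha:
            return k
    return None

# Exercise 4.70: 20 true-false questions, guessing probability assumed to be .5
k = lowest_passing_score(20, 0.5, 0.05)
print(f"lowest passing score: {k}, P(guesser passes) = {upper_tail(k, 20, 0.5):.4f}")

# Exercise 4.72: exactly 21 boys among 24 children when P(boy) = .5
print(f"P(x=21) = {binomial_pmf(21, 24, 0.5):.7f}")
```

The search tests scores in increasing order, so the first score that satisfies the condition is the lowest passing grade that meets the professor's requirement.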
