Chapter Notes

Key Terms

Note: Starred (*) terms are from the optional sections in this chapter.

Key Symbols

μ Population mean
p Population proportion, P(Success), in binomial trial
σ² Population variance
x̄ Sample mean (estimator of μ)
p̂ Sample proportion (estimator of p)
s² Sample variance (estimator of σ²)
H0 Null hypothesis
Ha Alternative hypothesis
α Probability of a Type I error
β Probability of a Type II error
χ² Chi-square (sampling distribution of s² for normal data)
S *Test statistic for sign test
η *Population median

Key Ideas

Key Words for Identifying the Target Parameter

  • μ—Mean, Average

  • p—Proportion, Fraction, Percentage, Rate, Probability

  • σ²—Variance, Variability, Spread

  • η—Median

Elements of a Hypothesis Test

  1. Null hypothesis (H0)

  2. Alternative hypothesis (Ha)

  3. Test statistic (z, t, or χ²)

  4. Significance level (α)

  5. p-value

  6. Conclusion
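The six elements above can be traced through a small worked example. The following is only a sketch: it assumes a large-sample z test for a population mean, and the helper name `z_test_mean` and all numbers (x̄=52, s=10, n=100, hypothesized mean 50) are made up for illustration and do not come from any exercise in this chapter.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, s, n, mu0, tail="upper"):
    """Large-sample z test for a population mean (s stands in for sigma)."""
    z = (xbar - mu0) / (s / sqrt(n))
    cdf = NormalDist().cdf(z)
    if tail == "lower":
        p_value = cdf
    elif tail == "upper":
        p_value = 1 - cdf
    elif tail == "two":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z, p_value


# Elements 1 and 2: H0: mu = 50 versus Ha: mu > 50 (hypothetical values).
mu0 = 50
# Element 4: significance level, chosen before looking at the data.
alpha = 0.05
# Elements 3 and 5: test statistic and p-value from made-up summary statistics.
z, p_value = z_test_mean(xbar=52, s=10, n=100, mu0=mu0, tail="upper")
# Element 6: conclusion.
conclusion = "reject H0" if p_value < alpha else "do not reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, conclusion: {conclusion}")
```

The same function covers lower-tailed and two-tailed alternatives through the `tail` argument; a small-sample test would need a t distribution instead, which the Python standard library does not provide.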

Forms of Alternative Hypothesis

  • Lower tailed: Ha:μ<50

  • Upper tailed: Ha:μ>50

  • Two tailed: Ha:μ≠50

  • Type I Error = Reject H0 when H0 is true (occurs with probability α)

  • Type II Error = Accept H0 when H0 is false (occurs with probability β)
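To make α and β concrete, the sketch below computes β for one specific alternative. It assumes an upper-tailed z test with a known population standard deviation and a normal sampling distribution of x̄; the function name `beta_upper_tailed` and the numbers (hypothesized mean 50, true mean 52, σ=10, n=100) are hypothetical choices for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper_tailed(mu0, mu_true, sigma, n, alpha):
    """P(Type II error) for an upper-tailed z test when the true mean is mu_true."""
    se = sigma / sqrt(n)
    # Smallest sample mean that leads to rejecting H0 at level alpha.
    xbar_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: the sample mean falls below the cutoff although Ha is true.
    return NormalDist(mu=mu_true, sigma=se).cdf(xbar_crit)


alpha = 0.05  # probability of a Type I error, fixed by the analyst
beta = beta_upper_tailed(mu0=50, mu_true=52, sigma=10, n=100, alpha=alpha)
print(f"alpha = {alpha:.2f}, beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Note that β depends on which value of μ is actually true: the closer the true mean is to the hypothesized value, the larger β becomes for a fixed α and n.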

Using p-Values to Decide

  1. Choose significance level (α)

  2. Obtain p-value of the test

  3. If α>p-value, Reject H0
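The decision rule in the three steps above can be written as a one-line comparison. This is a minimal sketch; the function name `decide` and the p-value of .03 are hypothetical and are not taken from any exercise.

```python
def decide(p_value, alpha):
    """Return the test decision for a given p-value and significance level."""
    if not (0 <= p_value <= 1 and 0 < alpha < 1):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    return "reject H0" if alpha > p_value else "do not reject H0"


p_value = 0.03  # hypothetical p-value
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

With this p-value the decision changes between α=.05 and α=.01, which is why the significance level should be chosen before the p-value is obtained.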

Guide to Selecting a One-Sample Hypothesis Test

Supplementary Exercises 6.132–6.169

Note:

List the assumptions necessary for the valid implementation of the statistical procedures you use in solving all these exercises. Starred (*) exercises refer to the optional sections in this chapter.

Understanding the Principles

  1. 6.132 Specify the differences between a large-sample and small-sample test of hypothesis about a population mean μ. Focus on the assumptions and test statistics.

  2. 6.133 Which of the elements of a test of hypothesis can and should be specified prior to analyzing the data that are to be utilized to conduct the test?

  3. 6.134 Complete the following statement: The smaller the p-value associated with a test of hypothesis, the stronger is the support for the ____ hypothesis. Explain your answer.

  4. 6.135 Complete the following statement: The larger the p-value associated with a test of hypothesis, the stronger is the support for the ____ hypothesis. Explain your answer.

Learning the Mechanics

  1. *6.136 In a random sample of 10 observations selected from a nonnormal population, seven of the measurements exceed 150.

    1. Find the p-value for testing H0:η=150 against Ha:η>150.

    2. Make the appropriate conclusion at α=.05.

  2. 6.137 A random sample of n=200 observations from a binomial population yields p̂=.29.

    1. Test H0:p=.35 against Ha:p<.35. Use α=.05.

    2. Test H0:p=.35 against Ha:p≠.35. Use α=.05.

    3. Form a 95% confidence interval for p.

    4. Form a 99% confidence interval for p.

    5. How large a sample would be required to estimate p to within .05 with 99% confidence?

  3. 6.138 A random sample of 175 measurements possessed a mean of x̄=8.2 and a standard deviation of s=.79.

    1. Form a 95% confidence interval for μ.

    2. Test H0:μ=8.3 against Ha:μ≠8.3. Use α=.05.

    3. Test H0:μ=8.4 against Ha:μ≠8.4. Use α=.05.

  4. *6.139 A random sample of 41 observations from a normal population possessed a mean of x̄=88 and a standard deviation of s=6.9.

    1. Test H0:σ²=30 against Ha:σ²>30. Use α=.05.

    2. Test H0:σ²=30 against Ha:σ²≠30. Use α=.05.

  5. 6.140 A t-test is conducted for the null hypothesis H0:μ=10 versus the alternative hypothesis Ha:μ>10 for a random sample of n=17 observations. The test results are t=1.174 and p-value=.1288.

    1. Interpret the p-value.

    2. What assumptions are necessary for the validity of this test?

    3. Calculate and interpret the p-value, assuming that the alternative hypothesis was instead Ha:μ≠10.

Applying the Concepts—Basic

  1. 6.141 Use of herbal therapy. According to the Journal of Advanced Nursing (Jan. 2001), 45% of senior women (i.e., women over the age of 65) use herbal therapies to prevent or treat health problems. Also, senior women who use herbal therapies use an average of 2.5 herbal products in a year.

    1. Give the null hypothesis for testing the first claim by the journal.

    2. Give the null hypothesis for testing the second claim by the journal.

  2. 6.142 FDA mandatory new-drug testing. When a new drug is formulated, the pharmaceutical company must subject it to lengthy and involved testing before receiving the necessary permission from the Food and Drug Administration (FDA) to market the drug. The FDA requires the pharmaceutical company to provide substantial evidence that the new drug is safe for potential consumers.

    1. If the new-drug testing were to be placed in a test-of-hypothesis framework, would the null hypothesis be that the drug is safe or unsafe? What would the alternative hypothesis be?

    2. Given the choice of null and alternative hypotheses in part a, describe Type I and Type II errors in terms of this application. Define α and β in terms of this application.

    3. If the FDA wants to be very confident that the drug is safe before permitting it to be marketed, is it more important that α or β be small? Explain.

  3. 6.143 Sleep deprivation study. In a British study, 12 healthy college students deprived of one night’s sleep received an array of tests intended to measure their thinking time, fluency, flexibility, and originality of thought. The overall test scores of the sleep-deprived students were compared with the average score expected from students who received their accustomed sleep. Suppose the overall scores of the 12 sleep-deprived students had a mean of x̄=63 and a standard deviation of 17. (Lower scores are associated with a decreased ability to think creatively.)

    1. Test the hypothesis that the true mean score of sleep-deprived subjects is less than 80, the mean score of subjects who received sleep prior to taking the test. Use α=.05.

    2. What assumption is required for the hypothesis test of part a to be valid?

  4. 6.144 Accuracy of price scanners at Wal-Mart. Refer to Exercise 5.129 (p. 297) and consider the study of the accuracy of checkout scanners at Wal-Mart stores in California. Recall that the National Institute for Standards and Technology (NIST) mandates that, for every 100 items scanned through the electronic checkout scanner at a retail store, no more than 2 should have an inaccurate price. A study of random items purchased at California Wal-Mart stores found that 8.3% had the wrong price (Tampa Tribune, Nov. 22, 2005). Assume that the study included 1,000 randomly selected items.

    1. Identify the population parameter of interest in the study.

    2. Set up H0 and Ha for a test to determine whether the true proportion of items scanned at California Wal-Mart stores exceeds the 2% NIST standard.

    3. Find the test statistic and rejection region (at α=.05) for the test.

    4. Give a practical interpretation of the test.

    5. What conditions are required for the inference made in part d to be valid? Are these conditions met?

  5. 6.145 Accounting and Machiavellianism. Behavioral Research in Accounting (Jan. 2008) published a study of Machiavellian traits in accountants. (Machiavellian describes negative character traits that include manipulation, cunning, duplicity, deception, and bad faith.) A Machiavellian (“Mach”) rating score was determined for each in a sample of 122 purchasing managers with the following results: x̄=99.6, s=12.6. (Note: Scores range from a low of 40 to a high of 160, with the theoretical neutral Mach rating score of 100.) A director of purchasing at a major firm claims that the true mean Mach rating score of all purchasing managers is 85.

    1. Specify the null and alternative hypotheses for a test of the director’s claim.

    2. Define a Type I error for this test.

    3. Interpret the value, α=.10.

    4. Give the rejection region for the test using α=.10.

    5. Find the value of the test statistic.

    6. Use the result, part e, to make the appropriate conclusion.

    7. Do you need to make any assumptions about the distribution of Mach rating scores for the population of all purchasing managers? Explain.

  6. 6.146 The “Pepsi challenge.” “Take the Pepsi Challenge” was a marketing campaign used by the Pepsi-Cola Company. Coca-Cola drinkers participated in a blind taste test in which they tasted unmarked cups of Pepsi and Coke and were asked to select their favorite. Pepsi claimed that “in recent blind taste tests, more than half the Diet Coke drinkers surveyed said they preferred the taste of Diet Pepsi.” Suppose 100 Diet Coke drinkers took the Pepsi Challenge and 56 preferred the taste of Diet Pepsi. Test the hypothesis that more than half of all Diet Coke drinkers will select Diet Pepsi in a blind taste test. Use

  7. 6.147 Masculinizing human faces. Nature (Aug. 27, 1998) published a study of facial characteristics that are deemed attractive. In one experiment, 67 human subjects viewed side by side an image of a Caucasian male face and the same image 50% masculinized using special computer graphics. Each subject was asked to select the facial image they deemed more attractive. Fifty-eight of the 67 subjects felt that masculinization of face shape decreased attractiveness of the male face. The researchers used this sample information to test whether the subjects showed a preference for either the unaltered or the morphed male face.

    1. Set up the null and alternative hypotheses for this test.

    2. Compute the test statistic.

    3. The researchers reported a p-value≈0 for the test. Do you agree?

    4. Make the appropriate conclusion in the words of the problem. Use α=.01.

  8. 6.148 Identifying type of urban land cover. Geographical Analysis (Oct. 2006) published a study of a new method for analyzing remote-sensing data from satellite pixels in order to identify urban land cover. The method uses a numerical measure of the distribution of gaps, or the sizes of holes, in the pixel, called lacunarity. Summary statistics for the lacunarity measurements in a sample of 100 grassland pixels are x̄=225 and s=20. It is known that the mean lacunarity measurement for all grassland pixels is 220. The method will be effective in identifying land cover if the standard deviation of the measurements is 10% (or less) of the true mean (i.e., if the standard deviation is less than 22).

    1. Give the null and alternative hypotheses for a test to determine whether, in fact, the standard deviation of all grassland pixels is less than 22.

    2. A MINITAB analysis of the data is provided above. Locate and interpret the p-value of the test. Use α=.10.

  9. *6.149 Alkalinity of river water. The median alkalinity level of water specimens collected from the Han River in Seoul, South Korea, is 50 milligrams per liter (Environmental Science & Engineering, Sept. 1, 2000). Consider a random sample of 10 water specimens collected from a tributary of the Han River. Suppose that 8 of the 10 alkalinity levels for the sample are greater than 50 mpl. Is there sufficient evidence (at α=.01) to indicate that the population median alkalinity level of water in the tributary exceeds 50 mpl?

  10. 6.150 Teacher perceptions of child behavior. Developmental Psychology (Mar. 2003) published a study on teacher perceptions of the behavior of elementary school children. Teachers rated the aggressive behavior of a sample of 11,160 New York City public school children by responding to the statement “This child threatens or bullies others in order to get his/her own way.” Responses were measured on a scale ranging from 1 (never) to 5 (always). Summary statistics for the sample of 11,160 children were reported as x̄=2.15 and s=1.05. Let μ represent the mean response for the population of all New York City public school children. Suppose you want to test H0:μ=3 against Ha:μ≠3.

    1. In the words of the problem, define a Type I error and a Type II error.

    2. Use the sample information to conduct the test at a significance level of α=.05.

    3. Conduct the test from part b at a significance level of α=.10.

Applying the Concepts—Intermediate

  1. 6.151 Post-traumatic stress of POWs. Psychological Assessment (Mar. 1995) published the results of a study of World War II aviators captured by German forces after having been shot down. Having located a total of 239 World War II aviator POW survivors, the researchers asked each veteran to participate in the study; 33 responded to the letter of invitation. Each of the 33 POW survivors was administered the Minnesota Multiphasic Personality Inventory, one component of which measures level of post-traumatic stress disorder (PTSD). [Note: The higher the score, the higher is the level of PTSD.] The aviators produced a mean PTSD score of x̄=9.00 and a standard deviation of s=9.32. Conduct a test to determine if the true mean PTSD score for all World War II aviator POWs is less than 16. [Note: The value 16 represents the mean PTSD score established for Vietnam POWs.] Use α=.10.

  2. 6.152 Errors in medical tests. Medical tests have been developed to detect many serious diseases. A medical test is designed to minimize the probability that it will produce a “false positive” or a “false negative.” A false positive is a positive test result for an individual who does not have the disease, whereas a false negative is a negative test result for an individual who does have the disease.

    1. If we treat a medical test for a disease as a statistical test of hypothesis, what are the null and alternative hypotheses for the medical test?

    2. What are the Type I and Type II errors for the test? Relate each to false positives and false negatives.

    3. Which of these errors has graver consequences? Considering this error, is it more important to minimize α or β? Explain.

  3. 6.153 Inbreeding of tropical wasps. Refer to the Science study of inbreeding in tropical swarm-founding wasps, presented in Exercise 5.139 (p. 299). A sample of 197 wasps, captured, frozen, and subjected to a series of genetic tests, yielded a sample mean inbreeding coefficient of x̄=.044 with a standard deviation of s=.884. Recall that if the wasp has no tendency to inbreed, the true mean inbreeding coefficient μ for the species will equal 0.

    1. Test the hypothesis that the true mean inbreeding coefficient μ for this species of wasp exceeds 0. Use α=.05.

    2. Compare the inference you made in part a with the inference you obtained in Exercise 5.139, using a confidence interval. Do the inferences agree? Explain.

  4. 6.154 Colored string preferred by chickens. Refer to the Applied Animal Behaviour Science (Oct. 2000) study of domestic chickens exposed to a pecking stimulus, presented in Exercise 5.25 (p. 264). Recall that the average number of pecks a chicken takes at a white string over a specified time interval is known to be μ=7.5 pecks. In an experiment in which 72 chickens were exposed to blue string, the average number of pecks was x̄=1.13 pecks, with a standard deviation of s=2.21 pecks.

    1. On average, are chickens more apt to peck at white string than at blue string? Conduct the appropriate test of hypothesis, using α=.05.

    2. Compare your answer to part a with your answer to Exercise 5.25b.

    3. Find the p-value for the test and interpret it.

  5. 6.155 Single-parent families. Examining data collected on 835 males from the National Youth Survey (a longitudinal survey of a random sample of U.S. households), researchers at Carnegie Mellon University found that 401 of the male youths were raised in a single-parent family (Sociological Methods & Research, Feb. 2001). Does this information allow you to conclude that more than 45% of male youths are raised in a single-parent family? Test at

  6. 6.156 A dental bonding agent. When bonding teeth, orthodontists must maintain a dry field. A bonding adhesive (called “Smartbond”) has been developed to eliminate the necessity of a dry field. However, there is concern that the new bonding adhesive is not as strong as the current standard, a composite adhesive (Trends in Biomaterials & Artificial Organs, Jan. 2003). Tests on a sample of 10 extracted teeth bonded with the new adhesive resulted in a mean breaking strength (after 24 hours) of x̄=5.07 MPa and a standard deviation of s=.46 MPa, where MPa=megapascal, a measure of force per unit area.

    1. Orthodontists want to know if the true mean breaking strength of the new bonding adhesive is less than 5.70 MPa, the mean breaking strength of the composite adhesive. Conduct the appropriate analysis for the orthodontists. Use α=.01.

    2. *In addition to requiring a good mean breaking strength, orthodontists are concerned about the variability in breaking strength of the new bonding adhesive. Conduct a test to determine whether the breaking strength variance differs from .5 MPa.

  7. 6.157 PCB in plant discharge. The EPA sets a limit of 5 parts per million (ppm) on PCB (polychlorinated biphenyl, a dangerous substance) in water. A major manufacturing firm producing PCB for electrical insulation discharges small amounts from the plant. The company management, attempting to control the PCB in its discharge, has given instructions to halt production if the mean amount of PCB in the effluent exceeds 3 ppm. A random sample of 50 water specimens produced the following statistics: x̄=3.1 ppm and s=.5 ppm.

    1. Do these statistics provide sufficient evidence to halt the production process? Use α=.01.

    2. If you were the plant manager, would you want to use a large or a small value for α for the test in part a?

  8. *6.158 Weights of parrot fish. A marine biologist wishes to use parrot fish for experimental purposes due to the belief that their weight is fairly stable (i.e., the variability in weights among parrot fish is small). The biologist randomly samples 10 parrot fish and finds that their mean weight is 4.3 pounds and the standard deviation is 1.4 pounds. The biologist will use the parrot fish only if there is evidence that the variance of their weights is less than 4.

    1. Is there sufficient evidence for the biologist to claim that the variability in weights among parrot fish is small enough to justify their use in the experiment? Test at α=.05.

    2. State any assumptions that are needed for the test mentioned in part a to be valid.

  9. 6.159 Federal civil trial appeals. Refer to the Journal of the American Law and Economics Association (Vol. 3, 2001) study of appeals of federal civil trials, presented in Exercise 3.126 (p. 161). A breakdown of 678 civil cases that were originally tried in front of a judge and appealed by either the plaintiff or the defendant is reproduced in the table on p. 362. Do the data provide sufficient evidence to indicate that the percentage of civil cases appealed that are actually reversed is less than 25%? Test, using α=.01.

    Table for Exercise 6.159

    Outcome of Appeal                          Number of Cases
    Plaintiff trial win—reversed                            71
    Plaintiff trial win—affirmed/dismissed                 240
    Defendant trial win—reversed                            68
    Defendant trial win—affirmed/dismissed                 299
    Total                                                  678

  10. 6.160 Interocular eye pressure. Ophthalmologists require an instrument that can rapidly measure interocular pressure for glaucoma patients. The device now in general use is known to yield readings of this pressure with a variance of 10.3. The variance of five pressure readings on the same eye by a newly developed instrument is equal to 9.8. Does this sample variance provide sufficient evidence to indicate that the new instrument is more reliable than the instrument currently in use? (Use α=.05.)

  11. 6.161 Choosing portable grill displays. Refer to the Journal of Consumer Research (Mar. 2003) experiment on influencing the choices of others by offering undesirable alternatives, presented in Exercise 3.29 (p. 129). Recall that each of 124 college students selected three portable grills out of five to display on the showroom floor. The students were instructed to include Grill #2 (a smaller-sized grill) and select the remaining two grills in the display to maximize purchases of Grill #2. If the six possible grill display combinations (1–2–3, 1–2–4, 1–2–5, 2–3–4, 2–3–5, and 2–4–5) are selected at random, then the proportion of students selecting any display will be 1/6=.167. One theory tested by the researcher is that the students will tend to choose the three-grill display so that Grill #2 is a compromise between a more desirable and a less desirable grill. Of the 124 students, 85 students selected a three-grill display that was consistent with that theory. Use this information to test the theory proposed by the researcher at α=.05.

  12. 6.162 Study of lunar soil. Meteoritics (Mar. 1995) reported the results of a study of lunar soil evolution. Data were obtained from the Apollo 16 mission to the moon, during which a 62-cm core was extracted from the soil near the landing site. Monomineralic grains of lunar soil were separated out and examined for coating with dust and glass fragments. Each grain was then classified as coated or uncoated. Of interest is the “coat index”—that is, the proportion of grains that are coated. According to soil evolution theory, the coat index will exceed .5 at the top of the core, equal .5 in the middle of the core, and fall below .5 at the bottom of the core. Use the summary data in the accompanying table to test each part of the three-part theory. Use α=.05 for each test.

    Location (depth)            Top (4.25 cm)   Middle (28.1 cm)   Bottom (54.5 cm)
    Number of grains sampled    84              73                 81
    Number coated               64              35                 29

    Based on Basu, A., and McKay, D. S. “Lunar soil evolution processes and Apollo 16 core 60013/60014.” Meteoritics, Vol. 30, No. 2, Mar. 1995, p. 166 (Table 2).

  13. LICHEN 6.163 Radioactive lichen. Refer to the Lichen Radionuclide Baseline Research project to monitor the level of radioactivity in lichen, presented in Exercise 5.133 (p. 298). Recall that University of Alaska researchers collected nine lichen specimens and measured the amount of the radioactive element cesium-137 (in microcuries per milliliter) in each specimen.

    1. Assume that in previous years the mean cesium amount in lichen was μ=.003 microcurie per milliliter. Is there sufficient evidence to indicate that the mean amount of cesium in lichen specimens differs from this value? Use the SAS printout below to conduct a complete test of hypothesis at α=.10.

    2. Conduct a nonparametric test to determine if the population median η differs from .003 microcurie per milliliter. Use α=.10.

Applying the Concepts—Advanced

  1. 6.164 Parents who condone spanking. In Exercise 4.188 (p. 241), you read about a nationwide survey that claimed that 60% of parents with young children condone spanking their child as a regular form of punishment (Tampa Tribune, Oct. 5, 2000). In a random sample of 100 parents with young children, how many parents would need to say that they condone spanking as a form of punishment in order to refute the claim?

  2. 6.165 Polygraph test error rates. In a classic study reported in Discover magazine, a group of physicians subjected the polygraph (or lie detector) to the same careful testing given to medical diagnostic tests. They found that if 1,000 people were subjected to the polygraph and 500 told the truth and 500 lied, the polygraph would indicate that approximately 185 of the truth tellers were liars and that approximately 120 of the liars were truth tellers.

    1. In the application of a polygraph test, an individual is presumed to be a truth teller (H0) until “proven” a liar (Ha). In this context, what is a Type I error? A Type II error?

    2. According to the study, what is the probability (approximately) that a polygraph test will result in a Type I error? A Type II error?

  3. PCB 6.166 Solder joint inspections. X-rays and lasers are used to inspect solder-joint defects on printed circuit boards (PCBs). A particular manufacturer of laser-based inspection equipment claims that its product can inspect at least 10 solder joints per second, on average, when the joints are spaced .1 inch apart. The equipment was tested by a potential buyer on 48 different PCBs. In each case, the equipment was operated for exactly 1 second. The numbers of solder joints inspected on each run are listed in the table on the next page.

    Data for Exercise 6.166

    10 9 10 10 11 9 12 8 8 9 6 10
    7 10 11 9 9 13 9 10 11 10 12 8
    9 9 9 7 12 6 9 10 10 8 7 9
    11 12 10 0 10 11 12 9 7 9 9 10
    1. The potential buyer doubts the manufacturer’s claim. Do you agree?

    2. *Assume that the standard deviation of the number of solder joints inspected on each run is 1.2, and the true mean number of solder joints that can be inspected is really equal to 9.5. How likely is the buyer to correctly conclude that the claim is false?

  4. PONDS 6.167 Effectiveness of skin cream. Pond’s has discontinued the production of Age-Defying Complex, a cream with alpha-hydroxy acid, replacing it with Age-Defying Towelettes. Pond’s advertised that the product could reduce wrinkles and improve the skin. In a study published in Archives of Dermatology (June 1996), 33 middle-aged women used a product with alpha-hydroxy acid for 22 weeks. At the end of the study period, a dermatologist judged whether each woman exhibited any improvement in the condition of her skin. The results for the 33 women (where I=improved skin and N=no improvement) are listed in the accompanying table. Can you conclude that the cream will improve the skin of more than 60% of middle-aged women?

    I I N I N N I I I I I I
    N I I I N I I I N I N I
    I I I I I N I I N

Critical Thinking Challenges

  1. 6.168 The Hot Tamale caper. “Hot Tamales” are chewy, cinnamon-flavored candies. A bulk vending machine is known to dispense, on average, 15 Hot Tamales per bag. Chance (Fall 2000) published an article on a classroom project in which students were required to purchase bags of Hot Tamales from the machine and count the number of candies per bag. One student group claimed it purchased five bags that had the following candy counts: 25, 23, 21, 21, and 20. There was some question as to whether the students had fabricated the data. Use a hypothesis test to gain insight into whether or not the data collected by the students were fabricated. Use a level of significance that gives the benefit of the doubt to the students.

  2. 6.169 Verifying voter petitions. To get their names on the ballot of a local election, political candidates often must obtain petitions bearing the signatures of a minimum number of registered voters. According to the St. Petersburg Times, in Pinellas County, Florida, a certain political candidate obtained petitions with 18,200 signatures. To verify that the names on the petitions were signed by actual registered voters, election officials randomly sampled 100 of the signatures and checked each for authenticity. Only 2 were invalid signatures.

    1. Is 98 out of 100 verified signatures sufficient to believe that more than 17,000 of the total 18,200 signatures are valid? Use α=.01.

    2. Repeat part a if only 16,000 valid signatures are required.

References

  • Daniel, W. W. Applied Nonparametric Statistics, 2nd ed. Boston: PWS-Kent, 1990.

  • Hollander, M., and Wolfe, D. A. Nonparametric Statistical Methods, 2nd ed. New York: Wiley, 1999.

  • Snedecor, G. W., and Cochran, W. G. Statistical Methods, 7th ed. Ames, IA: Iowa State University Press, 1980.

  • Wackerly, D., Mendenhall, W., and Scheaffer, R. Mathematical Statistics with Applications, 7th ed. Belmont, CA: Thomson, Brooks/Cole, 2008.
