9.4 Assessing the Utility of the Model: Making Inferences about the Slope β1

Now that we have specified the probability distribution of ε and found an estimate of the variance σ², we are ready to make statistical inferences about the linear model’s usefulness in predicting the response y. This is step 4 in our regression modeling procedure.

Refer again to the data of Table 9.1, and suppose the reaction times are completely unrelated to the percentage of drug in the bloodstream. What could then be said about the values of β0 and β1 in the hypothesized probabilistic model

y=β0+β1x+ε

if x contributes no information for the prediction of y? The implication is that the mean of y—that is, the deterministic part of the model E(y)=β0+β1x—does not change as x changes. In the straight-line model, this means that the true slope, β1, is equal to 0. (See Figure 9.11.) Therefore, to test the null hypothesis that the linear model contributes no information for the prediction of y against the alternative hypothesis that the linear model is useful in predicting y, we test

Figure 9.11

Graph of the straight-line model when the slope is zero, i.e., y=β0+ε

H0:β1=0
Ha:β1≠0

If the data support the alternative hypothesis, we will conclude that x does contribute information for the prediction of y with the straight-line model [although the true relationship between E(y) and x could be more complicated than a straight line]. In effect, then, this is a test of the usefulness of the hypothesized model.

The appropriate test statistic is found by considering the sampling distribution of β^1, the least squares estimator of the slope β1, as shown in the following box:

Sampling Distribution of β^1

If we make the four assumptions about ε (see Section 9.3), the sampling distribution of the least squares estimator β^1 of the slope will be normal with mean β1 (the true slope) and standard deviation

σβ^1 = σ/√SSxx (see Figure 9.12)

Figure 9.12

Sampling distribution of β^1

We estimate σβ^1 by sβ^1 = s/√SSxx and refer to sβ^1 as the estimated standard error of the least squares slope β^1.

Since σ is usually unknown, the appropriate test statistic is a t-statistic, formed as:

t = (β^1 − Hypothesized value of β1)/sβ^1, where sβ^1 = s/√SSxx

Thus,

t = (β^1 − 0)/(s/√SSxx)

Note that we have substituted the estimator s for σ and then formed the estimated standard error sβ^1 by dividing s by √SSxx. The number of degrees of freedom associated with this t statistic is the same as the number of degrees of freedom associated with s. Recall that this number is (n − 2) df when the hypothesized model is a straight line. (See Section 9.3.) The setup of our test of the usefulness of the straight-line model is summarized in the following two boxes.

A Test of Model Utility: Simple Linear Regression

Test statistic: tc = β^1/sβ^1 = β^1/(s/√SSxx)

                    One-Tailed Tests                Two-Tailed Test
H0:                 β1 = 0         β1 = 0           β1 = 0
Ha:                 β1 < 0         β1 > 0           β1 ≠ 0
Rejection region:   tc < −tα       tc > tα          |tc| > tα/2
p-value:            P(t < tc)      P(t > tc)        2P(t > tc) if tc is positive
                                                    2P(t < tc) if tc is negative

Decision: Reject H0 if α > p-value or if the test statistic (tc) falls in the rejection region, where P(t > tα) = α, P(t > tα/2) = α/2, and t is based on (n − 2) degrees of freedom.

Conditions Required for a Valid Test: Simple Linear Regression

The four assumptions about ε listed in Section 9.3.
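The test in the box can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure any particular statistical package uses: the function names (`t_pdf`, `t_upper_tail`, `slope_t_test`) are hypothetical, and the tail area is approximated by numerically integrating the t density with Simpson's rule so that nothing beyond the standard library is needed. The inputs are the drug reaction values quoted in this section (β^1 = .7, s = .61, SSxx = 10, n = 5).

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    steps must be even; the result is 0.5 minus the area between 0 and t.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def slope_t_test(b1, s, ss_xx, n):
    """Return (t_c, df, two-tailed p-value) for testing H0: beta1 = 0."""
    se_b1 = s / math.sqrt(ss_xx)   # estimated standard error of the slope
    t_c = b1 / se_b1               # test statistic
    df = n - 2                     # degrees of freedom for a straight-line model
    p_two = 2 * t_upper_tail(abs(t_c), df)
    return t_c, df, p_two

# Drug reaction summary values quoted in this section.
t_c, df, p_two = slope_t_test(b1=0.7, s=0.61, ss_xx=10, n=5)
print(round(t_c, 2), df, round(p_two, 3))
```

Because s is entered here rounded to two decimals, the statistic and p-value can differ slightly in the last digit from values computed by software from the raw data.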

STIMULUS Example 9.4 Testing the Regression Slope, β1—Drug Reaction Model

Problem

  1. Refer to the simple linear regression analysis of the drug reaction data performed in Examples 9.1–9.3. Conduct a test (at α=.05) to determine whether the reaction time (y) is linearly related to the amount of drug (x).

Solution

  1. For the drug reaction example, n=5. Thus, t will be based on n − 2 = 3 df, and the rejection region (at α=.05) will be

    |t|>t.025=3.182

    We previously calculated β^1=.7,s=.61, and SSxx=10. Thus,

    t = β^1/(s/√SSxx) = .7/(.61/√10) = .7/.19 = 3.7

    Since this calculated t-value falls into the upper-tail rejection region (see Figure 9.13), we reject the null hypothesis and conclude that the slope β1 is not 0. The sample evidence indicates that the percentage x of drug in the bloodstream contributes information for the prediction of the reaction time y when a linear model is used.

    Figure 9.13

    Rejection region and calculated t value for testing H0:β1=0 versus Ha:β1≠0

    [Note: We can reach the same conclusion by using the observed significance level (p-value) of the test from a computer printout. The MINITAB printout for the drug reaction example is reproduced in Figure 9.14. The test statistic and the two-tailed p-value are highlighted on the printout. Since the p-value=.035 is smaller than α=.05, we will reject H0.]

    Figure 9.14

    MINITAB printout for the time–drug regression
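The quantities used in this example can be recomputed from raw data. The sketch below assumes that the Table 9.1 observations are the five (x, y) pairs shown in the code; that table is not reproduced in this section, so treat the data as an assumption. They were chosen because they give back the summary values quoted above (β^1 = .7, SSxx = 10, s ≈ .61). The shortcut SSE = SSyy − β^1·SSxy is the usual one for the straight-line model.

```python
import math

# Assumed Table 9.1 data: percent of drug in bloodstream (x), reaction time in seconds (y).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross products about the means.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_yy = sum((yi - y_bar) ** 2 for yi in y)

b1 = ss_xy / ss_xx               # least squares slope
b0 = y_bar - b1 * x_bar          # least squares intercept
sse = ss_yy - b1 * ss_xy         # sum of squared errors
s = math.sqrt(sse / (n - 2))     # estimate of sigma, with n - 2 df
se_b1 = s / math.sqrt(ss_xx)     # estimated standard error of the slope
t_c = b1 / se_b1                 # test statistic for H0: beta1 = 0

t_crit = 3.182                   # t.025 with 3 df, as given in the example
print(round(b1, 2), round(s, 2), round(t_c, 2), abs(t_c) > t_crit)
```

Working from unrounded s gives a test statistic slightly below the hand-rounded 3.7, but still well above the critical value 3.182, so the conclusion is the same.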

Look Back

What conclusion can be drawn if the calculated t-value does not fall into the rejection region or if the observed significance level of the test exceeds α? We know from previous discussions of the philosophy of hypothesis testing that such a t-value does not lead us to accept the null hypothesis. That is, we do not conclude that β1=0. Additional data might indicate that β1 differs from 0, or a more complicated relationship may exist between x and y, requiring the fitting of a model other than the straight-line model.

Now Work Exercise 9.62c

Interpreting p-Values for β Coefficients in Regression

Almost all statistical computer software packages report a two-tailed p-value for each of the β parameters in the regression model. For example, in simple linear regression, the p-value for the two-tailed test H0:β1=0 versus Ha:β1≠0 is given on the printout. If you want to conduct a one-tailed test of hypothesis, you will need to adjust the p-value reported on the printout as follows:

Upper-tailed test (Ha: β1>0): p-value = p/2 if t>0; p-value = 1 − p/2 if t<0

Lower-tailed test (Ha: β1<0): p-value = p/2 if t<0; p-value = 1 − p/2 if t>0

where p is the p-value reported on the printout and t is the value of the test statistic.
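The adjustment rule can be written as a small helper. This is a sketch only; the function name `one_tailed_p` and the labels "greater" and "less" are hypothetical choices, not names taken from any software package.

```python
def one_tailed_p(p_two, t, alternative):
    """Convert a printout's two-tailed p-value into a one-tailed p-value.

    p_two       -- two-tailed p-value reported by the software
    t           -- value of the test statistic
    alternative -- "greater" for Ha: beta1 > 0, "less" for Ha: beta1 < 0
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = p_two / 2
    # The sign of t agrees with Ha when it points in the hypothesized direction.
    agrees = (t > 0) if alternative == "greater" else (t < 0)
    return half if agrees else 1 - half

# Drug reaction printout values from Example 9.4: p = .035, t = 3.7.
print(one_tailed_p(0.035, 3.7, "greater"))
print(one_tailed_p(0.035, 3.7, "less"))
```

With a positive t, the upper-tailed p-value is half the reported value, while the lower-tailed p-value is close to 1, as the rule prescribes.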

Another way to make inferences about the slope β1 is to estimate it with a confidence interval, formed as shown in the following box:

A 100(1 − α)% Confidence Interval for the Simple Linear Regression Slope β1

β^1±(tα/2)sβ^1

where the estimated standard error of β^1 is calculated by

sβ^1 = s/√SSxx

and tα/2 is based on (n − 2) degrees of freedom.

Conditions Required for a Valid Confidence Interval: Simple Linear Regression

The four assumptions about ε listed in Section 9.3.

For the simple linear regression for the drug reaction (Examples 9.1–9.3), tα/2 is based on (n − 2) = 3 degrees of freedom. Therefore, a 95% confidence interval for the slope β1, the expected change in reaction time for a 1% increase in the amount of drug in the bloodstream, is

β^1 ± t.025 sβ^1 = .7 ± 3.182(s/√SSxx) = .7 ± 3.182(.61/√10) = .7 ± .61

Thus, the interval estimate of the slope parameter β1 is from .09 to 1.31. [Note: This interval can also be obtained with statistical software and is highlighted on the SPSS printout shown in Figure 9.15.] In terms of this example, the implication is that we can be 95% confident that the true mean increase in reaction time per additional 1% of the drug is between .09 and 1.31 seconds. This inference is meaningful only over the sampled range of x, that is, from 1% to 5% of the drug in the bloodstream.

Figure 9.15

SPSS printout with 95% confidence intervals for the time–drug regression betas

Now Work Exercise 9.67

Since all the values in this interval are positive, it appears that β1 is positive and that the mean of y, E(y), increases as x increases. However, the rather large width of the confidence interval reflects the small number of data points (and, consequently, a lack of information) used in the experiment. We would expect a narrower interval if the sample size were increased.
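The interval calculation above is easy to script. In this sketch the function name `slope_ci` is hypothetical, and the critical value tα/2 is passed in from a t table (3.182 for 3 df, as in the text) rather than computed.

```python
import math

def slope_ci(b1, s, ss_xx, t_crit):
    """Return (lower, upper) limits of the confidence interval for the slope beta1."""
    se_b1 = s / math.sqrt(ss_xx)   # estimated standard error of the slope
    margin = t_crit * se_b1        # half-width of the interval
    return b1 - margin, b1 + margin

# Drug reaction example: beta1-hat = .7, s = .61, SSxx = 10, t.025 = 3.182 (3 df).
lo, hi = slope_ci(b1=0.7, s=0.61, ss_xx=10, t_crit=3.182)
print(round(lo, 2), round(hi, 2))
```

Because the half-width is tα/2·s/√SSxx, the interval narrows when s is smaller, when SSxx is larger (more spread in x or more data points), or when the degrees of freedom increase so that tα/2 shrinks.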

We conclude this section with a comment on the other β-parameter in the straight-line model, the y-intercept β0. Why not conduct a test of hypothesis on β0? For example, we could conduct the test H0:β0=0 against Ha:β0≠0. The p-value for this test appears on the printouts for SAS, SPSS, MINITAB, and most other statistical software packages. The answer lies in the interpretation of β0. In the previous section, we learned that the y-intercept represents the mean value of y when x=0. Thus, the test H0:β0=0 is equivalent to testing whether E(y)=0 when x=0. In the drug reaction simple linear regression, we would be testing whether the mean reaction time (y) is 0 seconds when the amount of drug in the blood (x) is 0%. Typically, either x=0 is not a meaningful value (as in the drug reaction example) or x=0 lies outside the range of the sample data. In either of these cases, the test H0:β0=0 is not meaningful and should be avoided.

For those regression analyses where x=0 is a meaningful value, one may desire to predict the value of y when x=0. We discuss a confidence interval for such a prediction in Section 9.6.

Caution

In simple linear regression, the test H0:β0=0 is only meaningful if the value x=0 makes sense and is within the range of the sample data.

Statistics in Action Revisited

Assessing How Well the Straight-Line Model Fits the Dowsing Data

In the previous Statistics in Action Revisited, we fit the straight-line model E(y)=β0+β1x, where x=dowser's guess (in meters) and y=pipe location (in meters) for each trial. The MINITAB regression printout is reproduced in Figure SIA9.3. The two-tailed p-value for testing the null hypothesis H0:β1=0 (highlighted on the printout) is p-value=.118. Even for an α-level as high as α=.10, there is insufficient evidence to reject H0. Consequently, the dowsing data in Table SIA9.1 provide no statistical support for the German researchers’ claim that the three best dowsers have an ability to find underground water with a divining rod.

Figure SIA9.3

MINITAB simple linear regression for dowsing data

This lack of support for the dowsing theory is made clearer with a confidence interval for the slope of the line. When n=26, df=(n − 2)=24 and t.025=2.064. Substituting the latter value and the relevant values shown on the MINITAB printout, we find that a 95% confidence interval for β1 is

β^1 ± t.025(sβ^1) = .31 ± (2.064)(.19) = .31 ± .39, or (−.08, .70)

Thus, for every 1-meter increase in a dowser’s guess, we estimate (with 95% confidence) that the change in the actual pipe location will range anywhere from a decrease of .08 meter to an increase of .70 meter. In other words, we’re not sure whether the pipe location will increase or decrease along the 10-meter pipeline! Keep in mind also that the data in Table SIA9.1 represent the “best” performances of the three dowsers (i.e., the outcome of the dowsing experiment in its most favorable light). When the data for all trials are considered and plotted, there is not even a hint of a trend.

Exercises 9.53–9.76

Understanding the Principles

  1. 9.53 In the equation E(y)=β0+β1x, what is the value of β1 if x has no linear relationship to y?

  2. 9.54 What conditions are required for valid inferences about the β's in simple linear regression?

  3. 9.55 How do you adjust the p-value obtained from a computer printout when you perform an upper-tailed test of β1 in simple linear regression when t>0?

  4. 9.56 For each of the following 95% confidence intervals for β1 in simple linear regression, decide whether there is evidence of a positive or negative linear relationship between y and x:

    1. (22, 58)

    2. (30,111)

    3. (45,7)

Learning the Mechanics

  1. 9.57 Construct both a 95% and a 90% confidence interval for β1 for each of the following cases:

    1. β^1=31,s=3,SSxx=35,n=12

    2. β^1=64,SSE=1,960,SSxx=30,n=18

    3. β^1=8.4,SSE=146,SSxx=64,n=24

  2. L09058 9.58 Consider the following pairs of observations:

    x 1 5 3 2 6 6 0
    y 1 3 3 1 4 5 1
    1. Construct a scatterplot of the data.

    2. Use the method of least squares to fit a straight line to the seven data points in the table.

    3. Plot the least squares line on your scatterplot of part a.

    4. Specify the null and alternative hypotheses you would use to test whether the data provide sufficient evidence to indicate that x contributes information for the (linear) prediction of y.

    5. What is the test statistic that should be used in conducting the hypothesis test of part d? Specify the number of degrees of freedom associated with the test statistic.

    6. Conduct the hypothesis test of part d, using α=.05.

    7. Construct a 95% confidence interval for β1.

  3. L09059 9.59 Consider the following pairs of observations:

    y 4 2 5 3 2 4
    x 1 4 5 3 2 4
    1. Construct a scatterplot of the data.

    2. Use the method of least squares to fit a straight line to the six data points.

    3. Graph the least squares line on the scatterplot of part a.

    4. Compute the test statistic for determining whether x and y are linearly related.

    5. Carry out the test you set up in part d, using α=.01.

    6. Find a 99% confidence interval for β1.

Applying the Concepts—Basic

  1. 9.60 Congress voting on women’s issues. The American Economic Review (Mar. 2008) published research on how the gender mix of a U.S. legislator’s children can influence the legislator’s votes in Congress. Specifically, the researcher investigated how having daughters influences voting on women’s issues. The American Association of University Women (AAUW) uses voting records of each member of Congress to compute an AAUW score, where higher scores indicate more favorable voting for women’s rights. The researcher modeled AAUW score (y) as a function of the number of daughters (x) a legislator has. Data collected for the 434 members of the 107th Congress were used to fit the straight-line model, E(y)=β0+β1x.

    1. If it is true that having daughters influences voting on women’s issues, will the sign of β1 be positive or negative? Explain.

    2. The following statistics were reported in the article: β^1=.27 and sβ^1=.74. Find a 95% confidence interval for β1.

    3. Use the result, part b, to make an inference about the model.

  2. H2OPIPE 9.61 Estimating repair and replacement costs of water pipes. Refer to the IHS Journal of Hydraulic Engineering (Sept. 2012) study of water pipes susceptible to breakage, Exercise 9.28 (p. 515). Recall that civil engineers used simple linear regression to model y=the ratio of repair to replacement cost of commercial pipe as a function of x=the diameter (in millimeters) of the pipe. The MINITAB printout of the analysis is reproduced below. Are the engineers able to conclude (at α=.05) that the cost ratio increases linearly with pipe diameter? If so, provide a 95% confidence interval for the increase in cost ratio for every 1 millimeter increase in pipe diameter.

  3. 9.62 Generation Y’s entitlement mentality. The current workforce is dominated by “Generation Y”—people born between 1982 and 1999. These workers have a reputation as having an entitlement mentality (e.g., they believe they have a right to a high-paying job without the work ethic). The reasons behind this phenomenon were investigated in Proceedings of the Academy of Educational Leadership (Vol. 16, 2011). A sample of 272 undergraduate business students was administered a questionnaire designed to capture the behaviors that lead to an entitlement mentality. The responses were used to measure the following two quantitative variables for each student: entitlement score (y)—where higher scores indicate a greater level of entitlement—and “helicopter parents” score (x)—where higher scores indicate that the student’s parents had a higher level of involvement in their everyday experiences and problems.

    1. a. Give the equation of a simple linear regression model relating y to x.

    2. b. The researchers theorize that helicopter parents lead to an entitlement mentality. Based on this theory, would you expect β0 to be positive or negative (or are you unsure)?

      Would you expect β1 to be positive or negative (or are you unsure)? Explain.

    3. c. The p-value for testing H0:β1=0 versus Ha:β1>0 was reported as .002. Use this result to test the researchers’ entitlement theory at α=.01.

  4. TRAPS 9.63 Lobster fishing study. Refer to the Bulletin of Marine Science (April 2010) study of teams of fishermen fishing for the red spiny lobster in Baja California Sur, Mexico, Exercise 7.18 (p. 383). Two variables measured for each of 8 teams from the Punta Abreojos (PA) fishing cooperative were y=total catch of lobsters (in kilograms) during the season and x=average percentage of traps allocated per day to exploring areas of unknown catch (called search frequency). These data are listed in the table.

    Total Catch Search Frequency
    2,785 35
    6,535 21
    6,695 26
    4,891 29
    4,937 23
    5,727 17
    7,019 21
    5,735 20

    Source: From Shester, G. G. “Explaining catch variation among Baja California lobster fishers through spatial analysis of trap-placement decisions.” Bulletin of Marine Science, Vol. 86, No. 2, Apr. 2010 (Table 1). Reprinted with permission from the University of Miami-Bulletin of Marine Science.

    1. Graph the data in a scatterplot. What type of trend, if any, do you observe?

    2. A simple linear regression analysis was conducted using SAS. Find the least squares prediction equation on the SAS printout at the bottom of the page. Interpret the slope of the least squares line.

    3. Give the null and alternative hypotheses for testing whether total catch (y) is negatively linearly related to search frequency (x).

    4. Find the p-value of the test, part c, on the SAS printout.

    5. Give the appropriate conclusion of the test, part c, using α=.05.

  5. 9.64 Beauty and electoral success. Are good looks an advantage when running for political office? This was the question of interest in an article published in the Journal of Public Economics (Feb. 2010). The researchers focused on a sample of 641 non-incumbent candidates for political office in Finland. Photos of each candidate were evaluated by non-Finnish subjects; each evaluator assigned a beauty rating—measured on a scale of 1 (lowest rating) to 5 (highest rating)—to each candidate. The beauty ratings for each candidate were averaged; then the average was divided by the standard deviation for all candidates to yield a beauty index for each candidate. (Note: A 1-unit increase in the index represents a 1-standard-deviation increase in the beauty rating.) The relative success (measured as a percentage of votes obtained) of each candidate was used as the dependent variable (y) in a regression analysis. One of the independent variables in the model was beauty index (x).

    1. Write the equation of a simple linear regression relating y to x.

    2. Does the y-intercept of the equation, part a, have a practical interpretation? Explain.

    3. The article reported the estimated slope of the equation, part a, as 22.91. Give a practical interpretation of this value.

    4. The standard error of the slope estimate was reported as 3.73. Use this information and the estimate from part c to conduct a test for a positive slope at α=.01. Give the appropriate conclusion in the words of the problem.

    SAS Output for Exercise 9.63

Applying the Concepts—Intermediate

  1. PGA 9.65 Ranking driving performance of professional golfers. Refer to The Sport Journal (Winter 2007) study of a new method for ranking the total driving performance of golfers on the Professional Golf Association (PGA) tour, presented in Exercise 9.29 (p. 515). You fit a straight-line model relating driving accuracy (y) to driving distance (x) to the data.

    1. Give the null and alternative hypotheses for testing whether driving accuracy (y) decreases linearly as driving distance (x) increases.

    2. Find the test statistic and p-value of the test you set up in part a.

    3. Make the appropriate conclusion at α=.01.

  2. FCAT 9.66 FCAT scores and poverty. Refer to the Journal of Educational and Behavioral Statistics (Spring 2004) study of scores on the Florida Comprehensive Assessment Test (FCAT), first presented in Exercise 9.30 (p. 515). Consider the simple linear regression relating math score (y) to percentage (x) of students below the poverty level.

    1. Test whether y is negatively related to x. Use α=.01.

    2. Construct a 99% confidence interval for β1. Interpret the result practically.

  3. OJUICE 9.67 Sweetness of orange juice. Refer to Exercise 9.32 (p. 516) and the simple linear regression relating the sweetness index (y) of an orange juice sample to the amount of water-soluble pectin (x) in the juice. Find a 95% confidence interval for the true slope of the line. Interpret the result.

  4. HEIGHT 9.68 Ideal height of your mate. Refer to the Chance (Summer 2008) study of the height of the ideal mate, Exercise 9.33 (p. 516). You used the data to fit the simple linear regression model E(y)=β0+β1x, where y=ideal partner’s height (in inches) and x=student's height (in inches), for both males and females.

    1. Find a 90% confidence interval for β1 in the model for the male students. Give a practical interpretation of the result.

    2. Repeat part a for the female students.

    3. Which group, males or females, has the greater increase in ideal partner’s height for every 1 inch increase in student’s height?

  5. FRAG 9.69 Forest fragmentation study. Refer to the Conservation Ecology (Dec. 2003) study on the causes of fragmentation of 54 South American forests, presented in Exercise 2.166 (p. 97). Recall that researchers developed two fragmentation indexes for each forest—one index for anthropogenic (human development activities) fragmentation and one for fragmentation from natural causes. Data on 5 of the 54 forests saved in the FRAG file are listed in the following table.

    Ecoregion (forest) Anthropogenic Index, y Natural Origin Index, x
    Araucaria moist forests 34.09 30.08
    Atlantic Coast restingas 40.87 27.60
    Bahia coastal forests 44.75 28.16
    Bahia interior forests 37.58 27.44
    Bolivian Yungas 12.40 16.75

    Based on Wade, T. G., et al. “Distribution and causes of global forest fragmentation.” Conservation Ecology, Vol. 72, No. 2, Dec. 2003 (Table 6).

    1. Ecologists theorize that a linear relationship exists between the two fragmentation indexes. Write the model relating y to x.

    2. Fit the model to the data in the FRAG file, using the method of least squares. Give the equation of the least squares prediction equation.

    3. Interpret the estimates of β0 and β1 in the context of the problem.

    4. Is there sufficient evidence to indicate that the natural origin index (x) and the anthropogenic index (y) are positively linearly related? Test, using α=.05.

    5. Find and interpret a 95% confidence interval for the change in the anthropogenic index (y) for every 1-point increase in the natural origin index (x).

  6. BOXING2 9.70 Effect of massage on boxers. The British Journal of Sports Medicine (Apr. 2000) conducted a study of the effect of massage on boxing performance. Two variables measured on the boxers were blood lactate concentration (in mM) and the boxer’s perceived recovery (on a 28-point scale). On the basis of information provided in the article, the data shown in the accompanying table were obtained for 16 five-round boxing performances in which a massage was given to the boxer between rounds. Conduct a test to determine whether blood lactate level (y) is linearly related to perceived recovery (x). Use α=.10.

    Blood Lactate Level Perceived Recovery
    3.8 7
    4.2 7
    4.8 11
    4.1 12
    5.0 12
    5.3 12
    4.2 13
    2.4 17
    3.7 17
    5.3 17
    5.8 18
    6.0 18
    5.9 21
    6.3 21
    5.5 20
    6.5 24

    Based on Hemmings, B., Smith, M., Graydon, J., and Dyson, R. “Effects of massage on physiological restoration, perceived recovery, and repeated sports performance.” British Journal of Sports Medicine, Vol. 34, No. 2, Apr. 2000 (data adapted from Figure 3).

  7. 9.71 Eye anatomy of giraffes. Giraffes are believed to have excellent vision. The journal African Zoology (Oct. 2013) published a study of giraffe eye characteristics. Data were collected for a sample of 27 giraffes inhabiting southeastern Zimbabwe. Of interest to the researchers was how these eye characteristics relate to a giraffe’s body mass. They fit a simple linear regression equation of the form ln(y)=β0+β1ln(x)+ε, where y represents an eye characteristic and x represents body mass (measured in kilograms). For this model, the slope β1 represents the percentage change in y for every 1% increase in x.

    1. For the eye characteristic y=eye mass (grams), the regression equation yielded the following 95% confidence interval for β1:(.25,.30). Give a practical interpretation of this interval.

    2. For the eye characteristic y=orbit axis angle (degrees), the regression equation yielded the following 95% confidence interval for β1:(.14,.05). Give a practical interpretation of this interval.

  8. EMPATHY 9.72 Pain empathy and brain activity. Empathy refers to being able to understand and vicariously feel what others actually feel. Neuroscientists at University College of London investigated the relationship between brain activity and pain-related empathy in persons who watch others in pain (Science, Feb. 20, 2004). Sixteen couples participated in the experiment. The female partner watched while painful stimulation was applied to the finger of her male partner. Two variables were measured for each female: y=pain-related brain activity (measured on a scale ranging from −2 to 2) and x=score on the Empathic Concern Scale (0 to 25 points). The data are listed in the accompanying table. The research question of interest was “Do people scoring higher in empathy show higher pain-related brain activity?” Use simple linear regression analysis to answer this question.

    Couple Brain Activity (y) Empathic Concern (x)
    1 .05 12
    2 .03 13
    3 .12 14
    4 .20 16
    5 .35 16
    6 0 17
    7 .26 17
    8 .50 18
    9 .20 18
    10 .21 18
    11 .45 19
    12 .30 20
    13 .20 21
    14 .22 22
    15 .76 23
    16 .35 24

    Based on Singer, T. et al. “Empathy for pain involves the affective but not sensory components of pain.” Science, Vol. 303, Feb. 20, 2004. (Adapted from Figure 4.)

  9. RIBS 9.73 Characterizing bone with fractal geometry. In Medical Engineering & Physics (May 2013), researchers used fractal geometry to characterize human cortical bone. A measure of the variation in the volume of cortical bone tissue—called fractal dimension—was determined for each in a sample of 10 human ribs. The researchers used fractal dimension scores to predict the bone tissue’s stiffness index, called Young’s Modulus (measured in gigapascals). The experimental data are shown in the first column. Consider the linear model E(y)=β0+β1x, where y=Young’s Modulus and x=fractal dimension score. Find an interval estimate of the increase (or decrease) in Young’s Modulus for every 1-point increase in a bone tissue’s fractal dimension score. Use a confidence coefficient of .90.

    Young’s Modulus (GPa) Fractal Dimension
    18.3 2.48
    11.6 2.48
    32.2 2.39
    30.9 2.44
    12.5 2.50
    9.1 2.58
    11.8 2.59
    11.0 2.59
    19.7 2.51
    12.0 2.49

    Source: Sanchez-Molina, D., et al. “Fractal dimension and mechanical properties of human cortical bone.” Medical Engineering & Physics, Vol. 35, No. 5, May 2013 (Table 1).

  10. NAME2 9.74 The “name game.” Refer to the Journal of Experimental Psychology—Applied (June 2000) name-retrieval study, presented in Exercise 9.34 (p. 517). Recall that the goal of the study was to investigate the linear trend between proportion of names recalled (y) and position (order) of the student (x) during the “name game.” Is there sufficient evidence (at α=.01) of a linear trend?

Applying the Concepts—Advanced

  1. PARKS 9.75 Does elevation affect hitting performance in baseball? Refer to the Chance (Winter 2006) investigation of the effects of elevation on slugging percentage in Major League Baseball, Exercise 2.160 (p. 96). Data were compiled on players’ composite slugging percentages at each of 29 cities for the 2003 season, as well as on each city’s elevation (feet above sea level). The data are saved in the PARKS file. (Selected observations are shown in the table below.) Consider a straight-line model relating slugging percentage (y) to elevation (x).

    1. The model was fit to the data with the use of MINITAB, with the results shown in the printout below. Locate the estimates of the model parameters on the printout.

      City Slug Pct. Elevation
      Anaheim .480 160
      Arlington .605 616
      Atlanta .530 1,050
      Baltimore .505 130
      Boston .505 20
      Denver .625 5,277
      Seattle .550 350
      San Francisco .510 63
      St. Louis .570 465
      Tampa .500 10
      Toronto .535 566

      Based on Schaffer, J., and Heiny, E. L. “The effects of elevation on slugging percentage in Major League Baseball.” Chance, Vol. 19, No. 1, Winter 2006 (adapted from Figure 2).

    2. Is there sufficient evidence (at α=.01) of a positive linear relationship between elevation (x) and slugging percentage (y)? Use the p-value shown on the printout to make the inference.

    3. Construct a scatterplot of the data and draw the least squares line on the graph. Locate the data point for Denver on the graph. What do you observe?

    4. Recall that the Colorado Rockies, who play their home games in Denver, are annually among the league leaders in slugging percentage. Baseball experts attribute this to the “thin air” of Denver—called the Mile High city due to its elevation. Remove the data point for Denver from the data set and refit the straight-line model to the remaining data. Repeat parts a and b. What conclusions can you draw about the “thin air” theory from this analysis?

  2. LSPILL 9.76 Spreading rate of spilled liquid. Refer to the Chemical Engineering Progress (Jan. 2005) study of the rate at which a spilled volatile liquid will spread across a surface, Exercise 9.35 (p. 517). Is there sufficient evidence (at α=.05) to indicate that the mass of the spill tends to diminish linearly as elapsed time increases? If so, give an interval estimate (with 95% confidence) of the decrease in spill mass for each minute of elapsed time.
