CHAPTER 10


Traditional Statistical Methods

Statistics is an evolving, growing field. Consider that, at this moment, there are hundreds of scholars working on their graduate degrees in statistics. Each of those scholars must make an original contribution to the field of statistics, either in theory or in application. This is not to mention the statistics faculty and other faculty members in various research fields who are working on the cutting edge of statistical applications. Add to this total the statistical innovators in government, business, the biomedical field, and other organizations. You get the picture.

The statisticians of the 20th century gave us tools, and we are now in the process of making the tools better. This chapter will provide a quick run-through of the most common statistical procedures and hypothesis tests one might learn in an introductory statistics course, and then we will consider some modern alternatives in Chapter 11.

10.1 Estimation and Confidence Intervals

In this section, we will examine how to produce confidence intervals for means, proportions, and variances. The discussion of confidence intervals will lead directly into the treatment of the most common hypothesis tests.

10.1.1 Confidence Intervals for Means

For reasons we have discussed previously, we commonly use the t distribution to develop confidence intervals for means. We can determine the critical values of t using the qt function, and then we can multiply the critical value of t by the standard error of the mean to determine the width of one-half the confidence interval. Adding this margin of error to the sample mean produces the upper limit of the confidence interval, and subtracting the margin of error from the sample mean produces the lower limit of the confidence interval. The standard deviation of the sampling distribution of means for samples of size n is found as follows:

$$ SE_{\bar{x}} = \frac{s}{\sqrt{n}} $$

This quantity is known as the standard error of the mean. We then multiply the standard error of the mean by the critical value of t to determine the margin of error. The distance between the lower and upper limits of the confidence interval is thus twice the margin of error.

$$ \bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}} $$
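As a quick sketch of one piece of this formula, we can find the critical value of t for a 95% confidence interval with a sample of size n = 200 (the sample size used in the example that follows), which uses n − 1 = 199 degrees of freedom:

```r
# Two-tailed critical value of t for a 95% CI with n = 200 (df = 199)
tcrit <- qt(1 - 0.05/2, df = 199)
round(tcrit, 3)  # approximately 1.972, close to the z value of 1.96
```

With so many degrees of freedom, the t distribution is nearly indistinguishable from the standard normal distribution.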

The t.test function in R can be used to find a confidence interval for the mean, and it is often used exactly for that purpose. The default is a 95% confidence interval. As you may recall, we must determine the degrees of freedom when finding a critical value of t for a confidence interval or hypothesis test. Let us find a 95% confidence interval for the mean science scores of the 200 students in our hsb sample. First, just for fun, let’s write a simple function to take the preceding formulas and implement them to get our confidence interval.

> CI <- function(x, alpha = .05) {
+   sampMean <- mean(x)
+   stderr <- sd(x) / sqrt(length(x))
+   tcrit <- qt(1 - alpha/2, length(x) - 1)
+   margin <- stderr * tcrit
+   CI <- c(sampMean - margin, sampMean + margin)
+   return(CI)
+ }
> CI(hsb$science)
[1] 50.46944 53.23056

Now, compare the results with those from the one-sample t-test. As you see, the function and the t-test produced the same confidence limits.

> t.test(hsb$science)

        One Sample t-test

data:  hsb$science
t = 74.061, df = 199, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 50.46944 53.23056
sample estimates:
mean of x
    51.85

10.1.2 Confidence Intervals for Proportions

You are probably familiar with journalists announcing that some opinion poll had a margin of error of ± 3 percentage points. For proportions, we use the standard normal distribution, rather than the t distribution, to develop confidence intervals. This is because of the relationship between the binomial distribution and the normal distribution we discussed in Chapter 6. Let us define the confidence interval for a sample proportion $\hat{p}$ as follows:

$$ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} $$

As an example, suppose you wanted to determine a confidence interval for a poll of 200 randomly sampled people in your community of whom 135 expressed support for the death penalty. The sample proportion is 135/200 = .675. Assume it is known from national polls that 63% of the population is in favor of the death penalty. We can calculate a 95% confidence interval and then determine whether our confidence limits “capture” the population proportion.

$$ 0.675 \pm 1.96\sqrt{\frac{(0.675)(0.325)}{200}} = 0.675 \pm 0.065 $$
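The same hand calculation can be reproduced in R; this is a quick sketch using qnorm for the 95% critical value of z:

```r
phat <- 135/200                                  # sample proportion, 0.675
zcrit <- qnorm(0.975)                            # about 1.96
margin <- zcrit * sqrt(phat * (1 - phat)/200)    # margin of error
round(c(phat - margin, phat + margin), 3)
# [1] 0.61 0.74
```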

We see that the lower limit of our confidence interval is thus 0.61, and our upper limit is 0.74, so the population value is in fact within the bounds of our confidence interval (CI). Of course, R has a built-in function, prop.test(), that performs such calculations for us. Its confidence interval is slightly different from the preceding one because the standard error term is based on the hypothesized population proportion rather than the sample proportion. The default for this function is a 95% CI; however, we specify conf.level explicitly for readers who wish to use other confidence levels. Here, as always, the Help tab on the right-hand side of RStudio is very, ahem, helpful.

> prop.test(135,200, conf.level=0.95)

        1-sample proportions test with continuity correction

data:  135 out of 200, null probability 0.5
X-squared = 23.805, df = 1, p-value = 1.066e-06
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.6047423 0.7384105
sample estimates:
    p
0.675

10.1.3 Confidence Intervals for the Variance

When we use the t distribution and the standard normal distribution for confidence intervals, the intervals are symmetrical about the estimated value. This is not true when we use the chi-square distribution because the distribution itself is not symmetrical. As mentioned earlier, we can use the chi-square distribution to form a CI around a sample variance. The CI can be constructed as follows1:

$$ \frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L} $$

where $\chi^2_L$ is the left-tailed critical value of chi-square and $\chi^2_R$ is the right-tailed critical value. The degrees of freedom for these values of chi-square are n − 1. We could write a simple R function to calculate these values, as follows. This is a bare-bones function, and the reader is encouraged to adorn his or her own version with labels and other embellishments. Let us apply our confidence interval function to the variance of the city miles per gallon from the cars dataset we used extensively in Chapter 9.

> library(openintro)

> varInterval <- function(data, conf.level = 0.95) {
+   df <- length(data) - 1
+   chi_left <- qchisq((1 - conf.level)/2, df)
+   chi_right <- qchisq((1 - conf.level)/2, df, lower.tail = FALSE)
+   v <- var(data)
+   c((df * v)/chi_right, (df * v)/chi_left)
+ }

> var(cars$mpgCity)
[1] 43.88015

> varInterval(cars$mpgCity)
[1] 31.00787 66.87446

As the output shows, the interval is not symmetrical around the point estimate of the population variance, as explained earlier. To find a confidence interval for the sample standard deviation, simply take the square roots of the lower and upper limits of those for the variance.
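For example, applying the square roots to the variance limits just obtained for city miles per gallon:

```r
# CI for the standard deviation: square roots of the variance limits
sdLimits <- sqrt(c(31.00787, 66.87446))
sdLimits
# [1] 5.568471 8.177681
```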

10.2 Hypothesis Tests with One Sample

You have already learned in Chapter 7 how to do chi-square tests of goodness of fit with a single sample. In addition to testing frequencies, we can also test means and proportions for one sample.

We can test the hypothesis that a sample came from a population with a particular mean by using the one-sample t-test function shown previously for a confidence interval for the sample mean. The value of μ, the population mean, is the test value, and the sample value of t is found as follows:

$$ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} $$

When the null hypothesis that there is no difference between the sample mean and the hypothesized population mean is true, the resulting statistic is distributed as t with n − 1 degrees of freedom. To illustrate, let's test whether the city miles per gallon in the cars data could have come from a population with a mean city MPG of 25, adopting a traditional alpha level of 0.05.

> t.test(cars$mpgCity, mu = 25)

        One Sample t-test

data:  cars$mpgCity
t = -1.8694, df = 53, p-value = 0.06709
alternative hypothesis: true mean is not equal to 25
95 percent confidence interval:
 21.50675 25.12288
sample estimates:
mean of x
 23.31481

We can determine that we should not reject the null hypothesis in three equivalent ways. First, we can examine the confidence interval: the test value of 25 is "in" the confidence interval. Second, we might examine the p value and determine that it is greater than .05. Third, the older critical value method (CVM) would produce the same conclusion: the absolute value of our obtained t, 1.87, is less than the critical value of 2.01 with 53 degrees of freedom. All three approaches lead to the same conclusion: we do not reject the null hypothesis.
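The pieces of the critical value method are easy to compute directly; a quick sketch for our two-tailed test with 53 degrees of freedom:

```r
# Two-tailed critical value of t at alpha = .05 with df = 53
tcrit <- qt(0.975, df = 53)
round(tcrit, 2)       # approximately 2.01
abs(-1.8694) < tcrit  # TRUE: fail to reject the null hypothesis
```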

A special case of the t-test arises when we have matched, paired, or repeated measures data. In such cases, we do not have two independent samples but a single sample of difference scores. Our interest is to determine whether the average difference is zero. Failing to recognize the dependent nature of such data means that one is likely to apply a two-sample t-test when the appropriate test is a paired-samples test, also known as a dependent or correlated t-test. Performing the incorrect test naturally may result in drawing the wrong conclusion(s).

The t.test() function built into R has the default argument paired = FALSE, which should be set to TRUE when two vectors of paired data are supplied.

To illustrate, let us use the UScrime data from the MASS package, in which per capita police expenditure was measured for 47 states in 1960 (Po1) and 1959 (Po2). We will select only the 16 cases from southern states. Let us compare the t-test for paired data and the one-sample t-test applied to the difference scores, testing the hypothesis that the mean difference is zero. Note that omitting paired = TRUE in the first example would produce an incorrect independent-samples test!

> library(MASS)
> pairedPoliceExpenditure <- UScrime[UScrime$So == 1, ]

> t.test(pairedPoliceExpenditure$Po1, pairedPoliceExpenditure$Po2, paired = TRUE)

       Paired t-test

data:  pairedPoliceExpenditure$Po1 and pairedPoliceExpenditure$Po2
t = 6.4606, df = 15, p-value = 1.074e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 2.680336 5.319664
sample estimates:
mean of the differences
                      4
> PolExpenDiffs <- pairedPoliceExpenditure$Po1 - pairedPoliceExpenditure$Po2
> t.test(PolExpenDiffs, mu = 0)

        One Sample t-test

data:  PolExpenDiffs
t = 6.4606, df = 15, p-value = 1.074e-05
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 2.680336 5.319664
sample estimates:
mean of x
        4

It is clear that the two tests produced exactly the same results, apart from minor labeling differences. The values of t, the degrees of freedom, the p values, and the confidence intervals are identical. Some statistics textbooks do not have a separate presentation of the paired-samples t-test, and those that do, in our experience, often confuse students, many of whom have a difficult time understanding the difference between paired and independent samples. The best way to get a feel for this is to (incorrectly) run the test without paired = TRUE and notice the rather stark difference in the results.

10.3 Hypothesis Tests with Two Samples

We can test hypotheses that two independent means or two independent proportions are the same using the t.test and prop.test functions in base R. For the reasons discussed earlier, we use the standard normal distribution or an equivalent chi-square test to test the difference between proportions.

For each of two independent samples, the number of successes and the number of failures must each be at least 5; that is, np ≥ 5 and n(1 − p) ≥ 5 for both samples. When that is the case, and when the null hypothesis that the population proportions of success are equal is true, the statistic shown next approximately follows a standard normal distribution.

$$ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}} $$

where $\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$.

As an example, let us consider comparing the proportions of males and females in favor of the death penalty in two randomly selected samples of 200 each. When we pool the proportions and calculate the z statistic shown previously, the value of z² will be equal to the value of χ² from the prop.test when the continuity correction is not applied. To demonstrate, assume there are 136 men in favor of the death penalty and 108 women in favor. Are these proportions significantly different at an alpha level of 0.05? First, let us use the prop.test function both with and without the continuity correction.

> prop.test(x = c(136, 108), n = c(200, 200))

        2-sample test for equality of proportions with continuity correction

data:  c(136, 108) out of c(200, 200)
X-squared = 7.6608, df = 1, p-value = 0.005643
alternative hypothesis: two.sided
95 percent confidence interval:
 0.04039239 0.23960761
sample estimates:
prop 1 prop 2
  0.68   0.54

> prop.test(x = c(136, 108), n = c(200, 200), correct = FALSE)

        2-sample test for equality of proportions without continuity correction

data:  c(136, 108) out of c(200, 200)
X-squared = 8.2388, df = 1, p-value = 0.004101
alternative hypothesis: two.sided
95 percent confidence interval:
 0.04539239 0.23460761
sample estimates:
prop 1 prop 2
  0.68   0.54

Note that the value of chi-square without the continuity correction is 8.2388; the square root of this quantity is 2.8703. For fun, let's create a quick function that performs the same test as a z test using our previous formula and see whether our value of z is in fact the square root of chi-square. The results confirm this.

> zproptest <- function(x1, x2, n1, n2, conf.level = 0.95) {
+   ppooled <- (x1 + x2)/(n1 + n2)
+   qpooled <- 1 - ppooled
+   p1 <- x1/n1
+   p2 <- x2/n2
+   zstat <- round((p1 - p2)/sqrt((ppooled * qpooled)/n1 + (ppooled * qpooled)/n2), 4)
+   pval <- round(2 * pnorm(zstat, lower.tail = FALSE), 4)
+   print("two-sample z test for proportions", quote = FALSE)
+   print(c("value of z: ", zstat), quote = FALSE)
+   print(c("p-value: ", pval), quote = FALSE)
+ }
> zproptest(136, 108, 200, 200)
[1] two-sample z test for proportions
[1] value of z:  2.8703
[1] p-value:  0.0041

The independent-samples t-test has two versions. The one most commonly taught in the social and behavioral sciences uses a pooled variance estimate, while statisticians in other fields are more likely to favor the t-test that does not assume equality of the population variances. The t.test function in R covers both possibilities and, for convenience, accepts data coded by a factor, side-by-side data in a data frame or matrix, or data in two separate vectors.

Because the samples are independent, there is no constraint that the numbers in each sample must be equal. The t-test assuming unequal variances in the population makes use of the Welch-Satterthwaite approximation for the degrees of freedom, thus taking the different variances into account. The t-test assuming equal variances pools the variance estimates, as discussed earlier. We will illustrate both tests. One expedient is simply to test the equality of the two sample variances and choose the appropriate test on the basis of that test.
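The Welch-Satterthwaite approximation itself is easy to sketch. The helper below (welchDF is a hypothetical name for illustration, not part of base R) computes the adjusted degrees of freedom from the two sample variances and sizes:

```r
# Welch-Satterthwaite approximate degrees of freedom (a sketch)
welchDF <- function(v1, n1, v2, n2) {
  num <- (v1/n1 + v2/n2)^2
  den <- (v1/n1)^2/(n1 - 1) + (v2/n2)^2/(n2 - 1)
  num/den
}
# With equal variances and equal sizes, the result reduces to n1 + n2 - 2
welchDF(4, 10, 4, 10)  # 18
```

When the variances or sample sizes differ, the adjusted degrees of freedom fall below n1 + n2 − 2, making the test more conservative.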

Let us subset the UScrime data so that we have the southern states in one data frame and the not-southern in another. We will then perform the F test of equality of variances using the var.test function.

> southern <- UScrime[UScrime$So == 1, ]
> notSouthern <- UScrime[UScrime$So == 0, ]
> var.test(southern$Po1, notSouthern$Po1)

        F test to compare two variances

data:  southern$Po1 and notSouthern$Po1
F = 0.56937, num df = 15, denom df = 30, p-value = 0.2498
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.2467857 1.5052706
sample estimates:
ratio of variances
         0.5693726

The F test is not significant, so we will use the t-test for equal variances, which makes no adjustment to the degrees of freedom and pools the variance estimates. This test has a p value less than 0.05, and thus we conclude there is a difference in police expenditures in 1960 (Po1) for southern versus not-southern states (a quick look at the data via boxplot suggests this makes sense).

> t.test(southern$Po1, notSouthern$Po1, var.equal = TRUE)

Two Sample t-test

data:  southern$Po1 and notSouthern$Po1
t = -2.6937, df = 45, p-value = 0.009894
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -40.408529  -5.833406
sample estimates:
mean of x mean of y
 69.75000  92.87097

In contrast, the default t-test adjusts the degrees of freedom to account for an inequality of variance. In this case, the two tests produce similar results.

> t.test(southern$Po1, notSouthern$Po1)

        Welch Two Sample t-test

data:  southern$Po1 and notSouthern$Po1
t = -2.9462, df = 38.643, p-value = 0.005427
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -38.999264  -7.242671
sample estimates:
mean of x mean of y
 69.75000  92.87097

In the next chapter, we will introduce some additional tests which harness some of the compute power possible with the R language and modern computers.

References

1. Mario F. Triola, Elementary Statistics, 11th ed. (New York: Pearson Education, 2010).
