How to do it...

In this example, we will load a dataset containing heights for 199 individuals divided into two samples (of 100 and 99 observations). Each sample was obtained in a different area. We want to know whether the mean heights of the two samples are equal. It's worth noting that the sample sizes do not need to be the same:

  1. First, we need to load dplyr and car, and read the dataset. The dataset contains two columns: Sample (either 1 or 2) and Height (a numeric variable). To check whether the variances are equal, we first run leveneTest from the car package (its null hypothesis is that the variances are the same). Note that we pass a formula to this function, because it expects variable ~ group (where group should be a factor indicating which sample each observation belongs to). This is why we transform the Sample variable into a factor (it is read as numeric). Because we get a p-value of 0.42, we do not reject the null hypothesis, so we treat the variances as equal. Now, we may continue with our t-test:
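library(dplyr)  # filter() and select(), used in the next step
library(car)    # provides leveneTest()
data = read.csv("./heights.csv")
data$Sample = as.factor(data$Sample)  # the grouping variable must be a factor
leveneTest(Height ~ Sample, data = data)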

The preceding code displays the following output of leveneTest results:

  2. Now, we create two data frames containing the respective samples using dplyr, as follows:
sample1 = data %>% filter(Sample == 1) %>% select(Height)
sample2 = data %>% filter(Sample == 2) %>% select(Height)
  3. We are now ready to do our analysis using the t.test function. We pass the two samples with var.equal=TRUE (we have already tested the equality of the variances). We can also pass the confidence level that we want for the reported interval, which in this case is 95%. The statistic is equal to -0.141 with 197 degrees of freedom (= 199 - 2), yielding a p-value of 0.88. Consequently, we do not reject the null hypothesis: there is no statistical evidence that the means are different. As expected, the 95% confidence interval for the difference in means (from -3.09 to 2.68) includes zero:
t.test(sample1, sample2, var.equal = TRUE, conf.level = 0.95, alternative = "two.sided")

The preceding code displays the following output of the two sample t-test. The null hypothesis is not rejected:
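
To see what the two tests in this recipe compute, the following is a minimal sketch that reproduces both steps using only base R. It makes several assumptions that are not part of the recipe: the data is simulated (the heights data frame is a hypothetical stand-in for heights.csv, so its numbers will not match the results quoted above), and the variance check is written as a one-way ANOVA on absolute deviations from each group's median, which is the idea behind a median-centered Levene test rather than a call to car itself. The pooled-variance t statistic is then computed by hand and compared with what t.test returns:

```r
# Simulated stand-in for heights.csv (hypothetical values, not the book's data)
set.seed(42)
heights <- data.frame(
  Sample = factor(rep(c(1, 2), times = c(100, 99))),
  Height = c(rnorm(100, mean = 170, sd = 10), rnorm(99, mean = 170, sd = 10))
)

# Levene-type check in base R: one-way ANOVA on the absolute deviations
# of each observation from its own group's median
group_median <- ave(heights$Height, heights$Sample, FUN = median)
abs_dev <- abs(heights$Height - group_median)
levene_p <- anova(lm(abs_dev ~ heights$Sample))[1, "Pr(>F)"]

# Split the heights into two plain numeric vectors
x <- heights$Height[heights$Sample == 1]
y <- heights$Height[heights$Sample == 2]
n1 <- length(x)
n2 <- length(y)

# Pooled-variance t statistic computed by hand
df <- n1 + n2 - 2                                   # 100 + 99 - 2 = 197
pooled_var <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df
t_stat <- (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
p_value <- 2 * pt(-abs(t_stat), df)                 # two-sided p-value

# The same test through t.test, for comparison
res <- t.test(x, y, var.equal = TRUE, conf.level = 0.95,
              alternative = "two.sided")

print(levene_p)
print(c(by_hand = t_stat, t_test = unname(res$statistic)))
print(c(by_hand = p_value, t_test = res$p.value))
```

The hand-computed statistic and p-value should agree with the t.test output, which shows that var.equal=TRUE amounts to pooling the two sample variances and using n1 + n2 - 2 degrees of freedom.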
