For our last example in the chapter, we will perform a sort of Bayesian analogue to the two-sample t-test, using the same data and problem as the corresponding example in the previous chapter: testing whether the mean gas mileages of automatic and manual cars are significantly different.
As before, let's specify the model using non-informative flat priors:
the.model <- " model { # each group will have a separate mu # and standard deviation for(j in 1:2){ mu[j] ~ dunif(0, 60) # prior stddev[j] ~ dunif(0, 20) # prior tau[j] <- pow(stddev[j], -2) } for(i in 1:theLength){ # likelihood function y[i] ~ dnorm(mu[x[i]], tau[x[i]]) } }"
Notice that the construct that describes the likelihood function is a little different now; we have to use nested subscripts for the mu and tau parameters to tell JAGS that we are dealing with two different versions of mu and stddev.
Next, the data:
the.data <- list(
  y = mtcars$mpg,
  # 'x' needs to start at 1 so
  # 1 is now manual and 2 is automatic
  x = ifelse(mtcars$am==1, 1, 2),
  theLength = nrow(mtcars)
)
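Before we run the model, it's worth a quick sanity check that the recoding did what we intended (remember that in mtcars, am is 0 for automatic and 1 for manual). A minimal check looks like this:

> # cross-tabulate the original am column against
> # our recoded group variable
> table(mtcars$am, the.data$x)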
Finally, let's roll!
> results <- autorun.jags(the.model,
+                         data=the.data,
+                         n.chains = 3,
+                         monitor = c('mu', 'stddev'))
Let's extract the samples for both mu parameters and build a vector that holds the difference between the two groups' mu samples at each posterior draw.
> results.matrix <- as.matrix(results$mcmc)
> difference.in.means <- (results.matrix[,1] -
+                         results.matrix[,2])
Figure 7.16 shows a plot of the credible differences in means. The bulk of the distribution lies far above zero. We are all but certain that the means of the gas mileage for automatic and manual cars are significantly different.
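If you'd like to recreate a plot like Figure 7.16 yourself and quantify how much of the posterior lies above zero, a minimal sketch (assuming the difference.in.means vector we just built) looks like this:

> # density of the credible differences in means,
> # with a reference line at zero
> plot(density(difference.in.means),
+      main="Posterior of difference in means")
> abline(v=0, lty=2)
>
> # proportion of posterior samples in which group 1 (manual)
> # beats group 2 (automatic) on gas mileage
> mean(difference.in.means > 0)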
Notice that the decision to mimic the independent samples t-test made us focus on one particular part of the Bayesian analysis, and didn't allow us to appreciate some of the other very valuable information the analysis yielded. For example, in addition to a distribution illustrating credible differences in means, we have posterior distributions for the credible values of the means and standard deviations of both samples. The ability to decide whether the samples' means are significantly different is nice; the ability to look at the posterior distributions of the parameters is better.
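One way to inspect those other posteriors is a sketch using the standard runjags print and plot methods, and assuming the sample matrix carries the column names mu[1], mu[2], stddev[1], and stddev[2]:

> # summary statistics and convergence diagnostics
> # for all monitored parameters
> print(results)
>
> # trace and density plots for each mu and stddev
> plot(results)
>
> # a 95% credible interval for the standard deviation
> # of the first (manual) group, from the raw samples
> quantile(results.matrix[, "stddev[1]"], c(0.025, 0.975))

All of this comes for free once the sampler has run; no extra modeling work is required.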