In this chapter we consider fixed effects regression models for response variables that are categorical: dichotomous, unordered polytomous, and ordered polytomous. In chapter 2 we saw that linear fixed effects models for quantitative response variables could be estimated in several different ways, all producing the same results. Analogous methods are also available for categorical response variables, but these methods typically do not produce exactly the same results. So an important task of this chapter is to clarify the differences among the different methods and to develop appropriate interpretations of their coefficients.
In chapter 1 we saw that the paired-comparisons t-test could be interpreted as a fixed effects method. Let's begin this chapter with an analogous method for dichotomous variables observed at two points in time. Table 3.1, taken from Hu et al. (1998), is a cross-classification of responses by sixth and seventh graders to a question about whether they had smoked cigarettes in the preceding month. They were interviewed at baseline in 1984 and again one year later.
| | | One Year Later | |
|---|---|---|---|
| | | Yes | No |
| Baseline | Yes | 27 | 26 |
| | No | 63 | 566 |
At baseline, 8% of the respondents said they had smoked. One year later, the percentage had increased to 13. Is this change statistically significant? We can't use a conventional test for a difference between two proportions because we don't have two independent samples. McNemar's (1955) test is a simple solution to this problem. We ignore the 593 children who didn't change from baseline to one year and use only the two off-diagonal cell counts. A chi-square statistic is calculated as the squared difference of those two counts divided by their sum:
$$\chi^2 = \frac{(63 - 26)^2}{63 + 26} = 15.38$$
With 1 degree of freedom, this has a p-value less than .0001. We conclude that the probability of smoking increased over the one-year period.
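The hand calculation can also be checked in a short DATA step. This is only a sketch: the variable names b, c, chisq, and p are illustrative choices rather than anything from the original analysis, and the p-value is taken as the upper tail of a chi-square distribution with 1 degree of freedom by way of the PROBCHI function.

```sas
DATA _NULL_;
   b = 26;   /* smoked at baseline, not one year later (illustrative name) */
   c = 63;   /* did not smoke at baseline, smoked one year later (illustrative name) */
   chisq = (c - b)**2 / (c + b);   /* McNemar chi-square from the off-diagonal counts */
   p = 1 - PROBCHI(chisq, 1);      /* upper-tail probability, 1 degree of freedom */
   PUT chisq= p=;                  /* write both values to the SAS log */
RUN;
```

The log should show a chi-square of about 15.38 and a p-value below .0001, matching the figures above.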
While the hand calculation is simple enough, we can also let PROC FREQ do the work. Here's how to read in the data and get the McNemar statistic (requested with the AGREE option):
    DATA smoking;
       INPUT baseline $ oneyear $ count;
       DATALINES;
    yes yes 27
    yes no 26
    no yes 63
    no no 566
    ;
    PROC FREQ DATA=smoking;
       WEIGHT count;
       TABLE baseline*oneyear / AGREE NOROW NOCOL NOPCT;
    RUN;
The NOROW, NOCOL, and NOPCT options suppress the percentage calculations so that only the raw frequency counts appear in the table.
Table of baseline by oneyear

| baseline | oneyear | | |
|---|---|---|---|
| Frequency | no | yes | Total |
| no | 566 | 63 | 629 |
| yes | 26 | 27 | 53 |
| Total | 592 | 90 | 682 |
Statistics for Table of baseline by oneyear | |
---|---|
McNemar's Test | |
Statistic (S) | 15.3820 |
DF | 1 |
Pr > S | <.0001 |
The results in Output 3.1 confirm our hand calculations. Keep in mind that this test says nothing about the degree of association between the two responses. It merely tests the null hypothesis that the probability of a "yes" response is the same at the two time points, while allowing for any level of association.
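Suppose we also want to answer the question "How do the odds of smoking change over the one-year period?" A natural way to answer this question is to compute the estimated odds of smoking at each interview, and then take the ratio of those odds. Using the marginal totals, the odds of smoking are 53/629 at baseline and 90/592 one year later, so the ratio is
$$\frac{90/592}{53/629} = 1.80$$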
But a fixed effects approach leads to a different estimate, the ratio of the two off-diagonal counts: 63/26 = 2.42. Thus a fixed effects approach leads to a much higher estimate of the change over time. The conventional method is called a population-averaged estimate, whereas the fixed effects estimate is called a subject-specific estimate. I'll have more to say about the difference between these two kinds of estimates later in this chapter. Note that this distinction was not relevant for the paired-comparisons t-test in the last chapter, where the difference in the means across the two time periods was the same as the mean of the differences.
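To make the contrast concrete, both odds ratios can be computed from the cell counts in a DATA step. This is a minimal sketch, not part of the original analysis: the variable names (odds_base, odds_later, or_pa, or_ss) are illustrative, and the marginal totals 53, 629, 90, and 592 are taken from the cross-classification of baseline by one-year responses.

```sas
DATA _NULL_;
   /* Population-averaged: ratio of the marginal odds at the two interviews */
   odds_base  = 53 / 629;   /* yes vs. no at baseline */
   odds_later = 90 / 592;   /* yes vs. no one year later */
   or_pa = odds_later / odds_base;

   /* Subject-specific (fixed effects): ratio of the off-diagonal counts */
   or_ss = 63 / 26;         /* no-to-yes changers over yes-to-no changers */

   PUT or_pa= or_ss=;       /* write both odds ratios to the SAS log */
RUN;
```

The first ratio uses every respondent through the marginal totals. The second uses only the 89 children whose answers changed, which is why the two estimates can differ.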
While McNemar's test is well suited to its goal, it doesn't allow covariates to affect the response. In the next section we develop a logistic regression method that does just that. As we'll see, McNemar's approach can be seen as a special case of this more general method.