6.6. Adjacent Categories Model

Another general model for ordered categorical data is the adjacent categories model. As before, we let p_ij be the probability that individual i falls into category j of the dependent variable, and we assume that the categories are ordered in the sequence j = 1, ..., J. Now take any pair of adjacent categories, such as j and j+1. We can write a logit model for the contrast between these two categories as a function of explanatory variables:

Equation 6.5

$$\log\left(\frac{p_{ij}}{p_{i,j+1}}\right) = \alpha_j + \beta_j x_i, \qquad j = 1, \ldots, J-1$$

where β_j x_i = β_j1 x_i1 + ... + β_jk x_ik. There are J-1 of these paired contrasts. It turns out that this is just another way of writing the multinomial logit model for unordered categories. In other words, equation (6.5) is equivalent to equation (5.1). To get the adjacent categories model for ordered data, we impose a constraint on this set of equations. Specifically, we assume that β_j = β for all j. In other words, instead of having a different set of coefficients for every adjacent pair, there is only one set for the lot of them. So our adjacent categories model becomes
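The equivalence with the multinomial logit model can be checked by summing adjacent logits. The following is only a sketch: it assumes that equation (6.5) has the form log(p_ij / p_i,j+1) = α_j + β_j x_i and that equation (5.1) uses category J as the reference category. The starred symbols are introduced here for illustration and do not appear elsewhere in the text.

$$
\log\frac{p_{ij}}{p_{iJ}}
= \sum_{m=j}^{J-1} \log\frac{p_{im}}{p_{i,m+1}}
= \sum_{m=j}^{J-1} \left(\alpha_m + \beta_m x_i\right)
= \alpha_j^{*} + \beta_j^{*} x_i,
\qquad
\alpha_j^{*} = \sum_{m=j}^{J-1} \alpha_m, \quad
\beta_j^{*} = \sum_{m=j}^{J-1} \beta_m
$$

Because the starred coefficients are unrestricted whenever the β_m are unrestricted, the two sets of equations describe the same model. Under the constraint β_m = β, the same sum gives β*_j = (J - j)β, so the effect of x grows in equal steps with the distance between categories.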

Equation 6.6

$$\log\left(\frac{p_{ij}}{p_{i,j+1}}\right) = \alpha_j + \beta x_i, \qquad j = 1, \ldots, J-1$$

Notice that the right-hand side is identical to that of equation (6.1), which defines the cumulative model, but the left-hand side compares individual categories rather than grouped, cumulative categories.
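To make the constrained model concrete, here is a minimal Python sketch (outside SAS, purely for illustration) that turns a set of adjacent-category logits into category probabilities. It assumes the model is written as log(p_j / p_j+1) = α_j + βx with a single explanatory variable. The function name and the values of `alphas` and `beta` are hypothetical and are not estimates from the happiness data.

```python
import math

def adjacent_category_probs(alphas, beta, x):
    """Category probabilities under log(p_j / p_{j+1}) = alpha_j + beta * x."""
    # Work relative to the last category J, whose log-weight is fixed at 0.
    # log(p_j / p_J) is the sum of the adjacent logits from j up to J-1.
    log_weights = [0.0]
    for alpha in reversed(alphas):
        log_weights.append(log_weights[-1] + alpha + beta * x)
    log_weights.reverse()
    total = sum(math.exp(w) for w in log_weights)
    return [math.exp(w) / total for w in log_weights]

# Hypothetical values for a three-category outcome (J = 3).
alphas = [-0.5, 1.0]
beta = 0.8

for x in (0, 1):
    probs = adjacent_category_probs(alphas, beta, x)
    # Recover the adjacent logits from the probabilities as a check.
    logits = [math.log(probs[j] / probs[j + 1]) for j in range(len(alphas))]
    print(x, [round(p, 4) for p in probs], [round(v, 4) for v in logits])
```

Moving x from 0 to 1 shifts every adjacent logit by the same amount, `beta`, which is exactly what the constraint β_j = β requires.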

Although the adjacent categories model is a special case of the multinomial logit model, CATMOD doesn’t allow the imposition of the appropriate constraints, at least not with maximum likelihood estimation. If the data are grouped, however, as in a contingency table, CATMOD can estimate the model by weighted least squares. Here’s how to do it with the happiness data:

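PROC CATMOD DATA=happy;
  WEIGHT count;            /* cell frequencies of the contingency table */
  DIRECT married y84 y94;  /* treat these as quantitative, not classification, variables */
  RESPONSE ALOGIT;         /* adjacent-category logits */
  MODEL happy = _RESPONSE_ married y84 y94 / WLS;  /* one shared set of slopes */
RUN;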

The RESPONSE statement with the ALOGIT option invokes the adjacent categories function for the dependent variable, as in equation (6.5). Putting _RESPONSE_ in the MODEL statement tells CATMOD to estimate a single set of coefficients rather than a different set for each pair of categories, as in equation (6.6). Results are shown in Output 6.8.

The first thing to notice is that the residual chi-square test (35.74 with 7 degrees of freedom) indicates that the model doesn’t fit. In fact, the value is fairly close to the deviance and Pearson chi-squares we got for the cumulative logit model in Output 6.5. As in that output, we also find a strong effect of marital status but little evidence for an effect of calendar year. The signs are reversed from Output 6.5, but that’s just a consequence of the way CATMOD parameterizes the model. (On the left-hand side of equation (6.6), CATMOD puts j+1 on top and j on the bottom.) The adjusted odds ratio for MARRIED is exp(0.8035) = 2.23. This tells us that, whenever we compare adjacent happiness categories, the odds of being in the happier category are more than twice as high for married people as for unmarried people.
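The odds ratio arithmetic can be reproduced with a few lines of Python (a sketch outside SAS, for illustration only). The estimate and standard error are copied from the MARRIED row of Output 6.8. The 95 percent interval uses the usual normal approximation and the multiplier 1.96, which is an assumption added here rather than something CATMOD prints. The two-step odds ratio follows from adding two adjacent logits under the single-coefficient model.

```python
import math

b_married = -0.8035   # MARRIED estimate as printed in Output 6.8
se_married = 0.0453   # its standard error

# CATMOD's sign is flipped relative to equation (6.6), so negate first.
log_or = -b_married
odds_ratio = math.exp(log_or)

# Assumed normal-approximation 95% interval for the odds ratio.
ci_low = math.exp(log_or - 1.96 * se_married)
ci_high = math.exp(log_or + 1.96 * se_married)

# Comparing categories two steps apart doubles the log odds ratio.
two_step = math.exp(2 * log_or)

print(f"{odds_ratio:.2f}")              # 2.23
print(f"{ci_low:.2f} {ci_high:.2f}")    # 2.04 2.44
print(f"{two_step:.2f}")                # 4.99
```

The last figure shows what the equal-coefficients constraint implies for the happiness data: between the two extreme categories, the odds ratio for MARRIED is the square of the adjacent-category odds ratio.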

Output 6.8. CATMOD Output for Adjacent Categories Model, Happiness Data
              ANALYSIS-OF-VARIANCE TABLE

Source                   DF   Chi-Square      Prob
--------------------------------------------------

RESIDUAL                  7        35.74    0.0000

         ANALYSIS OF WEIGHTED-LEAST-SQUARES ESTIMATES

                                       Standard    Chi-
Effect            Parameter  Estimate    Error    Square   Prob
----------------------------------------------------------------
INTERCEPT                 1   -0.0437    0.0507     0.74  0.3885
_RESPONSE_                2    1.0737    0.0295  1323.91  0.0000
MARRIED                   3   -0.8035    0.0453   314.22  0.0000
Y84                       4   -0.0563    0.0588     0.92  0.3386
Y94                       5    0.0353    0.0516     0.47  0.4945

Because the model doesn’t fit, we can do what we did with the cumulative logit model: fit separate equations for each of the two adjacent pairs. In this case, we can easily accomplish that by removing _RESPONSE_ from the MODEL statement. The results are shown in Output 6.9.

Output 6.9. Adjacent Categories Model with Two Sets of Coefficients
             ANALYSIS-OF-VARIANCE TABLE

Source                     DF   Chi-Square      Prob
----------------------------------------------------

RESIDUAL                    4         4.16    0.3845


            ANALYSIS OF WEIGHTED-LEAST-SQUARES ESTIMATES

                                         Standard    Chi-
Effect              Parameter  Estimate    Error    Square   Prob
------------------------------------------------------------------
INTERCEPT                   1    0.9211    0.0770   143.03  0.0000
                            2   -0.8998    0.0939    91.90  0.0000
MARRIED                     3   -0.8619    0.0646   178.10  0.0000
                            4   -0.7087    0.0863    67.46  0.0000
Y84                         5    0.0295    0.0831     0.13  0.7230
                            6   -0.2105    0.1181     3.18  0.0747
Y94                         7    0.2900    0.0726    15.94  0.0001
                            8   -0.4036    0.1024    15.53  0.0001

These results are quite similar to what is shown in Output 6.7. The MARRIED coefficient is quite stable across the two equations, while the Y94 coefficients differ dramatically: they have opposite signs and both are highly significant. Again, the message is that in 1994, the middle category was more strongly favored over the two extremes.
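As a rough check on this reading, the Python sketch below (outside SAS, for illustration only) exponentiates the two Y94 estimates from Output 6.9 and recomputes the p-value of the residual chi-square. It assumes that parameter 7 belongs to the contrast between categories 1 and 2 and parameter 8 to the contrast between categories 2 and 3, with the higher-numbered category in the numerator, as CATMOD defines the adjacent logit. The helper `chi2_sf_even_df` is a hypothetical name; it uses a closed form that holds only for even degrees of freedom.

```python
import math

# Y94 estimates as printed in Output 6.9 (parameter order is an assumption).
y94 = {"2 vs 1": 0.2900, "3 vs 2": -0.4036}
odds_ratios = {pair: math.exp(b) for pair, b in y94.items()}
for pair, ratio in odds_ratios.items():
    print(pair, round(ratio, 2))

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half = x / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

# Residual chi-square of 4.16 with 4 degrees of freedom from Output 6.9.
p_residual = chi2_sf_even_df(4.16, 4)
print(round(p_residual, 3))
```

Under these assumptions, the 1994 odds of category 2 relative to category 1 are multiplied by about 1.34, while the odds of category 3 relative to category 2 are multiplied by about 0.67, so the middle category gains on both sides. The recomputed p-value agrees with the printed 0.3845 up to rounding of the chi-square.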

Despite clear-cut differences in the formulation and interpretation of the cumulative logit model and the adjacent categories model, the two models tend to yield very similar conclusions in practice. I prefer the cumulative model, both for its appealing latent variable interpretation and for its ready availability in software. But the adjacent categories model has one advantage, at least in principle: it’s easy to formulate a model with selective constraints on coefficients (although such models can’t be estimated in CATMOD). For example, we could force the MARRIED coefficient to be the same for all category pairs, but allow the year coefficients to be different. In fact, we’ll see how to do that in Chapter 10, when we estimate the adjacent categories model by using the equivalent loglinear model.
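For the happiness data, with J = 3 and therefore two adjacent pairs, such a selectively constrained model might be written as follows. This is only a sketch of the idea: it assumes the adjacent logit has the form log(p_ij / p_i,j+1), and the symbols γ_j and δ_j are introduced here for illustration.

$$
\log\left(\frac{p_{ij}}{p_{i,j+1}}\right)
= \alpha_j + \beta\,\mathrm{MARRIED}_i + \gamma_j\,\mathrm{Y84}_i + \delta_j\,\mathrm{Y94}_i,
\qquad j = 1, 2
$$

Here β has no subscript j, so the MARRIED effect is shared by both pairs, while the year effects γ_j and δ_j are free to differ between them.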
