4.1. Introduction

To many people, categorical data analysis means the analysis of contingency tables, also known as cross tabulations. For decades, the mainstay of contingency table analysis was the chi-square test introduced by Karl Pearson in 1900. However, things changed dramatically with the development of the loglinear model in the late 1960s and early 1970s. Loglinear analysis made it possible to analyze multi-way contingency tables, testing both simple and complex hypotheses in an elegant statistical framework. In Chapter 10, we’ll see how to estimate loglinear models by using the GENMOD procedure.

Although loglinear analysis is still a popular approach to the analysis of contingency tables, logit analysis can often do a better job. In fact, there is an intimate relationship between the two approaches. For a contingency table, every logit model has a loglinear model that is its exact equivalent. The converse doesn’t hold—there are loglinear models that don’t correspond to any logit models—but in most cases, such models have little substantive interest.

If logit and loglinear models are equivalent, why use the logit model? Here are three reasons:

  • The logit model makes a clear distinction between the dependent variable and the independent variables. The loglinear model makes no such distinction—all the conceptual variables are on the right-hand side of the equation.

  • The loglinear model has many more parameters than the corresponding logit model. Most of these are nuisance parameters that have no substantive interest, and their inclusion in the model can be confusing (and potentially misleading).

  • With larger tables, loglinear models are much more prone to convergence failure because of cell frequencies of 0.
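
To make the first two points concrete, here is a minimal sketch for the simplest case, a 2 x 2 table. It is written in Python rather than SAS purely for illustration, and the cell counts are hypothetical, not taken from any data set in this book. It assumes dummy (reference-cell) coding and saturated models, which reproduce the observed counts exactly, so the parameters can be computed in closed form without iterative estimation. All four counts must be positive for the logarithms to exist.

```python
import math

# Hypothetical 2 x 2 table of counts, indexed as counts[x][y],
# where x is the independent variable and y is the dependent variable.
counts = [[30, 10],
          [20, 40]]

def saturated_loglinear(counts):
    """Saturated loglinear model with dummy coding:
    log m_xy = const + a*x + b*y + g*x*y, solved exactly from the cells."""
    log_m = [[math.log(c) for c in row] for row in counts]
    const = log_m[0][0]
    a = log_m[1][0] - log_m[0][0]
    b = log_m[0][1] - log_m[0][0]
    g = log_m[1][1] - log_m[1][0] - log_m[0][1] + log_m[0][0]
    return {"const": const, "x": a, "y": b, "xy": g}

def saturated_logit(counts):
    """Saturated logit model for y given x:
    log(odds of y=1) = intercept + slope*x, solved exactly from the cells."""
    logit_x0 = math.log(counts[0][1] / counts[0][0])
    logit_x1 = math.log(counts[1][1] / counts[1][0])
    return {"intercept": logit_x0, "x": logit_x1 - logit_x0}

loglinear = saturated_loglinear(counts)
logit = saturated_logit(counts)

# The logit intercept matches the loglinear main effect of y, and the
# logit slope matches the loglinear x*y interaction.
print(len(loglinear), "loglinear parameters:", loglinear)
print(len(logit), "logit parameters:", logit)
```

Under these assumptions the loglinear model needs four parameters where the logit model needs two. The two extra loglinear parameters (the constant and the main effect of x) describe only the marginal distribution of the independent variable, while the logit slope equals the loglinear interaction, which is the log of the odds ratio (here log 6, about 1.79).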

In short, logit analysis can be much simpler than loglinear analysis even when estimating equivalent models. As we shall see, logit analysis of contingency tables has more in common with ordinary multiple regression than it does with traditional chi-square tests. In the remainder of this chapter, we’ll see how to analyze contingency tables by using either the LOGISTIC or the GENMOD procedure. The emphasis will be on GENMOD, however, because its CLASS statement and more flexible MODEL statement are particularly handy for contingency table analysis. In this chapter, we’ll look only at tables where the dependent variable is dichotomous, but later chapters will consider tabular data with dependent variables having more than two categories. Much of what we do in this chapter will involve the application of tools already developed in preceding chapters.
