3.11. Unobserved Heterogeneity

In Section 3.9, I explained how the logit model could be derived from a dichotomized linear model with a disturbance term that has a logistic distribution. There we saw that the logit coefficients were related to the coefficients in the underlying linear model by the formula β_j = α_j/σ, where β_j is the logit coefficient for x_j, α_j is the corresponding coefficient in the linear model, and σ is the coefficient of the disturbance term ε. This random disturbance term can be seen as representing all omitted explanatory variables that are independent of the measured x variables, commonly referred to as unobserved heterogeneity. Because σ controls the variance of the disturbance, we conclude that greater unobserved heterogeneity leads to logit coefficients that are attenuated toward 0. I’ll refer to this attenuation as heterogeneity shrinkage.

This assessment doesn’t depend on the plausibility of the latent variable model. Suppose we take an ordinary logit model and put a disturbance term in the equation:

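Equation 3.20

$$\log\left[\frac{\Pr(y=1\mid\varepsilon)}{\Pr(y=0\mid\varepsilon)}\right] = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \sigma\varepsilon$$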

We assume that ε is independent of the x’s and has a standard logistic distribution, as in equation (3.12). Notice that the probabilities on the left-hand side are expressed conditionally on ε. If we then express the model unconditionally, it can be shown (Allison 1987) that the result is closely approximated by

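Equation 3.21

$$\log\left[\frac{\Pr(y=1)}{\Pr(y=0)}\right] = \beta_0^* + \beta_1^* x_1 + \cdots + \beta_k^* x_k$$

where

$$\beta_j^* = \frac{\beta_j}{\sqrt{1+\sigma^2}}$$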

If we start with a probit model (and a standard normal disturbance) instead of a logit model, this result is exact rather than approximate. Again, we see that as the disturbance variance gets larger, the logit (or probit) coefficients get smaller.
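The shrinkage can be checked numerically without any simulation. The sketch below integrates a standard logistic disturbance out of a one-variable logit model and compares the resulting population-level slope with β/√(1+σ²), which is taken here to be the form of the approximation. The function name `marginal_logit`, the integration grid, and the values β = 1 and σ = 1 are illustrative assumptions, not taken from the text.

```python
import numpy as np

def marginal_logit(eta, sigma, half_width=40.0, n_grid=200_001):
    """Log-odds of Pr(y=1) once a standard logistic disturbance is integrated out.

    Conditional model: logit Pr(y=1 | eps) = eta + sigma * eps.
    """
    eps = np.linspace(-half_width, half_width, n_grid)
    # Standard logistic density, written with |eps| to avoid overflow in exp().
    tail = np.exp(-np.abs(eps))
    density = tail / (1.0 + tail) ** 2
    cond_prob = 1.0 / (1.0 + np.exp(-(eta + sigma * eps)))
    integrand = cond_prob * density
    # Trapezoid rule on an evenly spaced grid.
    step = eps[1] - eps[0]
    prob = step * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return np.log(prob / (1.0 - prob))

beta1 = 1.0   # coefficient in the model that conditions on eps (illustrative)
sigma = 1.0   # coefficient of the disturbance term (illustrative)

# Unconditional log-odds at x = -2 and x = 2, with the intercept set to 0.
logit_low = marginal_logit(beta1 * -2.0, sigma)
logit_high = marginal_logit(beta1 * 2.0, sigma)
pa_slope = (logit_high - logit_low) / 4.0
approx_slope = beta1 / np.sqrt(1.0 + sigma ** 2)

print(f"slope after integrating out eps: {pa_slope:.3f}")
print(f"beta / sqrt(1 + sigma^2):        {approx_slope:.3f}")
```

Under these assumptions the two printed values should be close but not identical, which is what one would expect from a result that is approximate for the logit model and exact only for the probit.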

Obviously, we can never measure and include all the variables that affect the dependent variable, so there will always be some heterogeneity shrinkage. What are the practical implications? First, in interpreting the results of a logit regression, we should keep in mind that the magnitudes of the coefficients and the corresponding odds ratios are likely to be conservative to some degree. Second, to keep such shrinkage to a minimum, it’s always desirable to include important explanatory variables in the model—even if we think those variables are uncorrelated with the variables already in the model. This advice is especially relevant to randomized experiments where there is a tendency to ignore covariates and look only at the effect of the treatment on the response. That may be OK for linear models, but logit models yield superior estimates of the treatment effect when all relevant covariates are included.
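The point about randomized experiments can be illustrated with a small simulation. This is a sketch under stated assumptions: the sample size, the coefficient values, the random seed, and the helper `fit_logit` (a plain Newton-Raphson fit written with NumPy) are all hypothetical choices made for illustration. Treatment is assigned at random, so the covariate `z` is independent of it, yet omitting `z` still pulls the estimated treatment coefficient toward 0.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Maximum likelihood logit coefficients via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

rng = np.random.default_rng(0)
n = 200_000

treat = rng.integers(0, 2, size=n).astype(float)  # randomized treatment
z = rng.normal(size=n)                            # covariate independent of treatment

# Assumed true model: logit Pr(y=1) = -0.5 + 1.0*treat + 2.0*z
true_logit = -0.5 + 1.0 * treat + 2.0 * z
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

ones = np.ones(n)
coef_full = fit_logit(np.column_stack([ones, treat, z]), y)
coef_reduced = fit_logit(np.column_stack([ones, treat]), y)

print(f"treatment coefficient, covariate included: {coef_full[1]:.3f}")
print(f"treatment coefficient, covariate omitted:  {coef_reduced[1]:.3f}")
```

With the covariate included, the treatment coefficient should land near the assumed value of 1.0. With it omitted, the estimate should be noticeably smaller, even though randomization rules out any confounding.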

A third implication is that differential heterogeneity can confound comparisons of logit coefficients across groups. If, for example, you want to compare logit coefficients estimated separately for men and women, you must implicitly assume that the degree of unobserved heterogeneity is the same in both groups. Elsewhere, I have proposed a method of adjusting for differential heterogeneity (Allison 1999).
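A simulated example may make the danger concrete. In the sketch below, two groups share exactly the same effect of x on the log-odds but differ in how much variation comes from an omitted variable. The group labels, sample sizes, standard deviations, seed, and the helpers `fit_logit` and `estimated_slope` are hypothetical; the sketch does not implement the adjustment method of Allison (1999), it only shows the problem that method addresses.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Maximum likelihood logit coefficients via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = p * (1.0 - p)
        beta = beta + np.linalg.solve(X.T @ (X * weights[:, None]), X.T @ (y - p))
    return beta

def estimated_slope(rng, n, beta_x, sd_omitted):
    """Fit a logit of y on x alone when an independent variable u is left out."""
    x = rng.normal(size=n)
    u = rng.normal(scale=sd_omitted, size=n)  # omitted, independent of x
    prob = 1.0 / (1.0 + np.exp(-(beta_x * x + u)))
    y = (rng.random(n) < prob).astype(float)
    return fit_logit(np.column_stack([np.ones(n), x]), y)[1]

rng = np.random.default_rng(1)

# Same assumed effect of x (1.0) in both groups; only the heterogeneity differs.
slope_group_a = estimated_slope(rng, 100_000, beta_x=1.0, sd_omitted=0.5)
slope_group_b = estimated_slope(rng, 100_000, beta_x=1.0, sd_omitted=2.5)

print(f"group A (little unobserved heterogeneity): {slope_group_a:.3f}")
print(f"group B (much unobserved heterogeneity):   {slope_group_b:.3f}")
```

A naive comparison of the two fitted slopes would suggest that x matters less in group B, although the effect built into the simulation is identical in both groups.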

Heterogeneity shrinkage is characteristic of a wide variety of non-linear regression models (Gail et al. 1984). The phenomenon is closely related to a distinction that is commonly made between population-averaged models and subject-specific models. The model that explicitly includes the disturbance term (equation 3.20) is called a subject-specific model, because its coefficients describe how the log-odds changes if the explanatory variables are changed for that particular individual. On the other hand, equation (3.21)—which integrates out the disturbance term—is called a population-averaged model because its coefficients describe what happens to a whole population when the explanatory variables are changed for all individuals. If our interest is in understanding the underlying causal mechanism, then subject-specific coefficients are of primary interest. If the aim is to determine the aggregate consequences of some policy change that affects everyone, then population-averaged coefficients may be more appropriate. Keep in mind, however, that there is only one true subject-specific model, but the population-averaged model may change with the population and the degree of unobserved heterogeneity.

Can we ever estimate the subject-specific model or can we only approximate it by including more and more explanatory variables? If we have only one observation per individual, the subject-specific model cannot be directly estimated. But if there are two or more observations per individual, the parameter σ is identified, which means that the subject-specific coefficients can be recovered. We’ll discuss methods for accomplishing that in Chapter 8. Unobserved heterogeneity is closely related to a phenomenon known as overdispersion, which arises when estimating a logit model that has grouped data. We’ll discuss overdispersion in the next chapter.
