6.5. Reciprocal Effects with Lagged Predictors

We have seen that many of the fixed and random effects models estimated in chapter 2 can also be estimated with PROC CALIS, and that there are both advantages and disadvantages to this approach. We are now going to consider some important fixed effects models that go considerably beyond those in chapter 2. These models involve reciprocal effects among two or more endogenous variables as well as lagged values of both endogenous and exogenous variables. The models are important because they offer the possibility of greatly advancing our ability to determine the direction of causality among variables that are associated with one another.

Let's suppose that we observe two variables, x and y, that are known to be correlated, and we would like to know whether x causes y or y causes x (or perhaps both). Both variables are observed at several points in time. Consider the following model:
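One way to write such a model, consistent with the description that follows, is sketched here. The time-specific intercepts \mu_t and \tau_t and the coefficient symbols \beta and \delta are notational assumptions, not symbols confirmed by the text; i indexes units and t indexes time points.

```latex
\begin{aligned}
y_{it} &= \mu_t + \beta\, x_{i,t-1} + \alpha_i + \varepsilon_{it} \\
x_{it} &= \tau_t + \delta\, y_{i,t-1} + \eta_i + \upsilon_{it}
\end{aligned}
```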


This model says that y is affected by x at an earlier time point, and x is affected by y at an earlier time point. The model also includes fixed effects α and η, representing the effects of any and all time-invariant covariates on each variable, along with time-specific disturbances ε and υ. Other lagged time-varying variables could be included, but that would unnecessarily complicate the discussion.

How can this model be estimated? If there are observations at exactly three time points, the model can be estimated by taking first differences and applying OLS to each equation separately:
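As a sketch, suppose the model is written as y_it = \mu_t + \beta x_{i,t-1} + \alpha_i + \varepsilon_it and x_it = \tau_t + \delta y_{i,t-1} + \eta_i + \upsilon_it, where the intercept and coefficient symbols are assumed notation. With time points 1, 2, and 3, subtracting the equation for t = 2 from the equation for t = 3 removes the fixed effects and would give something like:

```latex
\begin{aligned}
y_{i3} - y_{i2} &= (\mu_3 - \mu_2) + \beta\,(x_{i2} - x_{i1}) + (\varepsilon_{i3} - \varepsilon_{i2}) \\
x_{i3} - x_{i2} &= (\tau_3 - \tau_2) + \delta\,(y_{i2} - y_{i1}) + (\upsilon_{i3} - \upsilon_{i2})
\end{aligned}
```

Each equation then has a single difference score as its outcome and a single difference score as its predictor, with the difference in intercepts serving as the regression constant.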


When there are more than three time points, it might seem that the methods used in chapter 2 (dummy variables for individuals or deviations from the means) might do the job. Unfortunately, because of the reciprocal effects, the deviation scores used in fixed effects estimation are necessarily correlated with the error terms in the regressions, and that leads to biased estimation (Wooldridge 2001). Fortunately, the method used for handling fixed effects in PROC CALIS can circumvent those difficulties.
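A short sketch of where that correlation comes from may help. It assumes the model is written as y_it = \mu_t + \beta x_{i,t-1} + \alpha_i + \varepsilon_it and x_it = \tau_t + \delta y_{i,t-1} + \eta_i + \upsilon_it (the symbols \mu_t, \tau_t, \beta, and \delta are assumed notation), and that \varepsilon_it has variance \sigma_\varepsilon^2 and is uncorrelated with x_{i,t-1}, \alpha_i, \eta_i, and \upsilon_{i,t+1}:

```latex
\begin{aligned}
x_{i,t+1} &= \tau_{t+1} + \delta\, y_{it} + \eta_i + \upsilon_{i,t+1} \\
\operatorname{Cov}(x_{i,t+1}, \varepsilon_{it}) &= \delta \operatorname{Cov}(y_{it}, \varepsilon_{it}) = \delta\,\sigma_\varepsilon^2
\end{aligned}
```

The unit-specific mean of x that is subtracted to form deviation scores averages over later values such as x_{i,t+1}. Under these assumptions, the deviation score for x_{i,t-1} is therefore generally correlated with \varepsilon_it whenever \delta is not zero.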

Even more serious difficulties arise when the model is extended to allow for lagged values of the dependent (endogenous) variables:
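One way to write this extended model, which the text later refers to as (6.4), is sketched here. The notation is assumed: \mu_t and \tau_t are time-specific intercepts, subscript 1 marks the cross-lagged coefficient, and subscript 2 marks the coefficient on the variable's own lagged value, mirroring the b1 and b2 labels in the PROC CALIS programs later in this section.

```latex
\begin{aligned}
y_{it} &= \mu_t + \beta_1\, x_{i,t-1} + \beta_2\, y_{i,t-1} + \alpha_i + \varepsilon_{it} \\
x_{it} &= \tau_t + \delta_1\, y_{i,t-1} + \delta_2\, x_{i,t-1} + \eta_i + \upsilon_{it}
\end{aligned}
```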


Without the fixed effects, this model is well known in the social science literature as the two-wave, two-variable panel model, or as the cross-lagged panel model. In the econometric literature, models with lagged dependent variables are referred to as dynamic models. They are known to pose serious difficulties for conventional estimation methods, and several alternative methods have been proposed to deal with them (Baltagi 1995).

It turns out that dynamic fixed effects models can also be estimated in a straightforward way using PROC CALIS (or equivalent SEM software). Although the properties of this method have not been investigated analytically, my own simulation studies (Allison 2000) have shown that it does an excellent job of recovering the parameters for models such as the one in equations (6.4).

As an example, we shall analyze data for 178 occupations in the U.S. for the years 1983, 1989, 1995 and 2001 (labeled T1 through T4). The data come from the March "Current Population Survey: Annual Demographic File" (CPS). The observations in CPS data are individuals, but the analysis is based on occupational averages for each year on all the variables. For each year, I calculated the proportion of females and the median wage for females for each occupation. This was done only for the 178 occupations that had at least 50 sample members in each of the years. Further details can be found in England et al. (2004). For wages, the variables are labeled MDWGF1 through MDWGF4, and for the proportion of females we have PF1 through PF4.

For the model in equation (6.4), let y be median wage and let x be the proportion of females. In 1983, the correlation between these two variables was –.33, which was highly significant. There has been considerable controversy regarding the possible direction of causality between these two variables (England et al. 2004). One argument is that employers devalue occupations that have a high proportion of females and consequently pay lower wages. The rival hypothesis is that declining wages make occupations less attractive to men; as they leave for better paying work, women fill their vacant positions. I shall assume that changes in either of these variables show up in changes in the other variable six years later.

By estimating the equations in (6.4), we can assess each of the two possible causal effects. Although it's possible to estimate the two equations simultaneously, estimating them separately allows for considerably more flexibility in specifying the model. In addition to the fixed effects, the key device that allows for the reciprocal effects is this: the error term at each point in time must be allowed to correlate with future values of the time-varying covariate (Wooldridge 2001). Here is the PROC CALIS program to estimate the two equations:

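/* Wage equations: median female wage on lagged proportion female and lagged wage. */
/* UCOV AUG analyzes the uncorrected, intercept-augmented moment matrix so that    */
/* the INTERCEPT terms can be estimated.                                           */
PROC CALIS DATA=my.occ UCOV AUG;
LINEQS
   mdwgf4 = t4 INTERCEPT + b1 pf3 + b2 mdwgf3 + falpha + e4,
   mdwgf3 = t3 INTERCEPT + b1 pf2 + b2 mdwgf2 + falpha + e3,
   mdwgf2 = t2 INTERCEPT + b1 pf1 + b2 mdwgf1 + falpha + e2;
STD
   falpha = s1, e2 e3 e4 = sa:;
COV
   falpha*mdwgf1 pf1 pf2 pf3 = ca:,   /* fixed effect correlated with the predictors */
   e2*pf3 = cb;                       /* E2 correlated with the later cross-lagged variable */
RUN;

/* Proportion-female equations: same structure with the roles of the variables swapped. */
PROC CALIS DATA=my.occ UCOV AUG;
LINEQS
   pf4 = t4 INTERCEPT + b1 mdwgf3 + b2 pf3 + falpha + e4,
   pf3 = t3 INTERCEPT + b1 mdwgf2 + b2 pf2 + falpha + e3,
   pf2 = t2 INTERCEPT + b1 mdwgf1 + b2 pf1 + falpha + e2;
STD
   falpha = s1, e2 e3 e4 = sa:;
COV
   falpha*pf1 mdwgf1 mdwgf2 mdwgf3 = ca:,
   e2*mdwgf3 = cb;
RUN;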

The basic structure of this program should now be familiar. There is a separate equation for each dependent variable at each point in time, and those equations correspond directly to the equations in (6.4). Note that there is no equation predicting median wage or proportion of females at time 1 because we do not observe their lagged values six years earlier (1977).

The fixed effects are represented by FALPHA in each equation. The COV statement allows correlations between FALPHA and the time-varying covariates, thus implementing a fixed effects model. Note that for the lagged dependent variable, a correlation is allowed only between FALPHA and the value of the variable at time 1. That's because only the time 1 variable is exogenous, and correlations are only allowed among exogenous variables. There's actually no need to specify a correlation between FALPHA and the later values of the lagged dependent variable, because FALPHA is one of the predictors in the equation for each of these variables. The COV statement also allows a correlation between E2 and the cross-lagged variable at time 3, which is what accommodates the reciprocal effect of one variable on the other at a later point in time. No similar terms are needed for E3 and E4, because the later values they would affect (the cross-lagged variable at times 4 and 5) do not appear as predictors in these equations.

Output 6.6. Estimates for Reciprocal Model with Fixed and Lagged Effects



Results for the two equations are shown in Output 6.6. To save space, I've edited out everything that's redundant or nonessential. Not surprisingly, each variable has a positive, statistically significant effect on itself six years later. With respect to the "cross-lagged" coefficients, however, there is no evidence for an effect in either direction.

Elsewhere, I have questioned the desirability of including lagged values of the dependent variable as predictors when fixed effects are already in the model (Allison 1990). So I also estimated a model that removes the lagged dependent variables, and I got essentially the same results for the cross-lagged coefficients. Similarly, a model that includes the lagged dependent variables but does not include the fixed effects (the classic two-wave, two-variable panel model) yields no evidence for a cross-lagged effect in either direction.
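As a rough sketch of the first of these variations, the wage program might be modified by dropping the b2 terms for the lagged dependent variable. This is an assumed reconstruction based on the program shown earlier, not the program actually used; in particular, removing MDWGF1 from the COV statement is my assumption, made because that variable no longer appears in any equation.

```sas
/* Sketch (assumed form): wage equations without the lagged dependent variable */
PROC CALIS DATA=my.occ UCOV AUG;
LINEQS
   mdwgf4 = t4 INTERCEPT + b1 pf3 + falpha + e4,
   mdwgf3 = t3 INTERCEPT + b1 pf2 + falpha + e3,
   mdwgf2 = t2 INTERCEPT + b1 pf1 + falpha + e2;
STD
   falpha = s1, e2 e3 e4 = sa:;
COV
   falpha*pf1 pf2 pf3 = ca:,   /* fixed effect correlated with every predictor */
   e2*pf3 = cb;                /* E2 still allowed to correlate with PF at time 3 */
RUN;
```

The corresponding program for the proportion of females would swap the roles of the two sets of variables in the same way.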
