5.7. Summary

Fixed effects regression analysis of event history data is easily accomplished when each individual has multiple, usually repeated, events. As in logistic regression, and unlike linear or Poisson regression, using dummy variables to represent the fixed effects typically leads to biased coefficient estimates for the other variables. Instead, the preferred method for fixed effects event history analysis is Cox regression with stratification, which eliminates the fixed effects from the estimating equations. In PROC PHREG, this is implemented by using the STRATA statement with a variable that contains a common ID number for all records belonging to the same individual. This method is computationally efficient even for large numbers of strata and produces approximately unbiased estimates under most conditions.
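
To make the elimination concrete, the stratified partial likelihood can be sketched as follows. The notation is an assumption introduced here for illustration: i indexes individuals, k indexes the intervals (spells) of individual i, x_{ik} is a covariate vector, \alpha_i is the fixed effect, \mu(t) is the log baseline hazard, and R_i(t) is the set of individual i's intervals still at risk at duration t.

```latex
% Assumed model: log-hazard with an individual-specific fixed effect
\log h_{ik}(t) = \mu(t) + \beta x_{ik} + \alpha_i

% Stratifying on individuals gives each individual its own baseline hazard,
% which absorbs \alpha_i:
h_{ik}(t) = h_{0i}(t)\, e^{\beta x_{ik}}, \qquad h_{0i}(t) = e^{\mu(t) + \alpha_i}

% Each observed event of individual i at duration t_{ik} contributes a term
% in which h_{0i}(t_{ik}) cancels between numerator and denominator:
PL(\beta) = \prod_{i} \prod_{k\,:\,\text{event}}
  \frac{e^{\beta x_{ik}}}{\sum_{j \in R_i(t_{ik})} e^{\beta x_{ij}}}
```

Because every comparison is made among intervals of the same individual, anything constant within an individual, including \alpha_i, drops out of each ratio.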

As with other forms of fixed effects analysis, Cox regression with stratification can involve a substantial loss of statistical power. Of course, individuals with only one censored or uncensored observation contribute nothing to the analysis. Even individuals with one censored and one uncensored interval are eliminated if the censored interval is the shorter of the two, because the censored interval is then no longer at risk when the event occurs. Finally, only within-individual variation is used in estimating the coefficients. For reasons that are not fully understood, the hybrid method, which worked well for linear, logistic, and count data regression, does not produce correct results for Cox regression.
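
The elimination rules above can be checked with a small sketch. It is not part of any SAS procedure. The function names and the (duration, event) tuple layout are hypothetical, chosen only for illustration, and a censored interval is assumed to remain at risk through its censoring time.

```python
def informative_event_times(intervals):
    """Return the event times of ONE individual whose risk set holds
    at least one other interval from the same individual.

    intervals: list of (duration, event) tuples, where event is 1 if the
    interval ended in an event and 0 if it was censored (assumed layout).
    """
    times = []
    for duration, event in intervals:
        if event != 1:
            continue  # censored intervals never generate a likelihood term
        # Risk set at this event time: intervals lasting at least this long.
        at_risk = [d for d, _ in intervals if d >= duration]
        if len(at_risk) > 1:
            times.append(duration)
    return times


def contributes(intervals):
    """True if the individual's stratum has a non-trivial likelihood term.

    This checks risk sets only; a stratum also needs covariate variation
    among the intervals at risk to carry information about a coefficient.
    """
    return len(informative_event_times(intervals)) > 0


# One interval only: nothing to compare within the stratum.
single = [(5, 1)]
# Censored interval shorter than the uncensored one: eliminated.
censored_shorter = [(3, 0), (7, 1)]
# Censored interval longer than the uncensored one: retained.
censored_longer = [(9, 0), (7, 1)]

for label, data in [("single", single),
                    ("censored shorter", censored_shorter),
                    ("censored longer", censored_longer)]:
    print(label, contributes(data))
```

In the second case the only event occurs at duration 7, when the censored interval (duration 3) has already left the risk set, so the event's likelihood term equals 1 regardless of the coefficients.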

Serious difficulties arise in the attempt to do fixed effects regression analysis with nonrepeated events. The basic idea is to treat time as discrete and create a separate record for each discrete time point that is observed for each individual, from the beginning of observation to the time of the event or censoring. For each record, a dichotomous dependent variable is coded 1 if an event occurred at that time point, and otherwise is coded 0. Finally, one does a conditional logistic regression of this dependent variable with stratification on individuals, using predictors that vary across time points. The fundamental problem with this appealing approach is that if time (or any monotonic function of time) is used as a predictor, the model will not converge due to complete separation. The reason is that the event always occurs at the end of each individual's sequence of records, so time perfectly predicts the occurrence of the event.
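
The record construction and the separation problem can be illustrated with a minimal sketch. The function names, the dictionary layout, and the toy data are all hypothetical. The sketch only builds the person-period records and checks the separation condition. It does not fit a conditional logistic regression.

```python
def person_period_records(subjects):
    """Expand each individual into one record per discrete time point.

    subjects: dict mapping an ID to (last_time, event), where last_time is
    the final observed time point and event is 1 for an event, 0 for
    censoring (assumed layout). Time points run from 1 to last_time.
    """
    records = []
    for sid, (last_time, event) in subjects.items():
        for t in range(1, last_time + 1):
            # The dependent variable is 1 only at the time point of the event.
            y = 1 if (event == 1 and t == last_time) else 0
            records.append({"id": sid, "time": t, "event": y})
    return records


def time_separates_events(records):
    """True if, within every individual, each event record has a later
    time than every non-event record (complete separation on time)."""
    by_id = {}
    for r in records:
        by_id.setdefault(r["id"], []).append(r)
    for recs in by_id.values():
        event_times = [r["time"] for r in recs if r["event"] == 1]
        other_times = [r["time"] for r in recs if r["event"] == 0]
        if event_times and other_times and min(event_times) <= max(other_times):
            return False
    return True


# Toy data: A and B experience the event, C is censored.
subjects = {"A": (3, 1), "B": (5, 1), "C": (4, 0)}
records = person_period_records(subjects)
print(len(records))
print(time_separates_events(records))
```

With a nonrepeated event, the check cannot fail: the only record coded 1 is always the last one in its stratum, which is why time, or any monotonic function of it, separates events from non-events completely.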

Although models that do not include time can certainly be estimated, the resulting coefficient estimates might be biased because the effects of time on both the hazard and the covariates have not been controlled. One solution is the case-time-control method, which appears to work well for estimating the effect of a categorical covariate on the hazard. The innovation of this method is to reverse the roles of the dependent and independent variables in the conditional logistic regression, making it possible to include time as a covariate in the model. Again, this is accomplished in SAS by using PROC LOGISTIC with stratification.
