Besides tests of differences between groups, PROC LIFETEST can test whether quantitative covariates are associated with survival time. Given a list of covariates, PROC LIFETEST produces a test statistic for each one and ignores the others. It also treats them as a set, testing the null hypothesis that they are jointly unrelated to survival time and testing for certain incremental effects of adding variables to the set. The statistics are generalizations of the log-rank and Wilcoxon tests discussed earlier in this chapter. They can also be interpreted as nonparametric tests of the coefficients of the accelerated failure time model discussed in Chapter 4, “Estimating Parametric Regression Models with PROC LIFEREG.”
You can test the same sorts of hypotheses with PROC LIFEREG or PROC PHREG. In fact, the log-rank chi-square reported by PROC LIFETEST is identical to the score statistic given by PROC PHREG for the null hypothesis that all coefficients are 0 (when the data contain no tied event times). However, in most cases, you are better off switching to the regression procedures, for two reasons. First, PROC LIFETEST doesn’t give coefficient estimates, so there is no way to quantify the effect of a covariate on survival time. Second, the incremental tests do not really test the effect of each variable controlling for all the others. Instead, you get a test of the effect of each variable controlling for those variables that have already been included. Because you have no control over the order of inclusion, these tests can be misleading. Nevertheless, PROC LIFETEST can be useful for screening a large number of covariates before proceeding to estimate regression models. Because the log-rank and Wilcoxon tests do not require iterative calculations, they require relatively little computer time. (This is also true for the SELECTION=SCORE option in PROC PHREG.)
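As a rough sketch of the PROC PHREG screening alternative just mentioned, the following step requests best-subset selection based on the score statistic. It assumes the recidivism data set (recid) and the variable names used in this chapter. The BEST=3 value is an arbitrary choice made only for illustration.

```sas
/* Sketch only: screen covariates by score statistic, without iterative fitting. */
/* Data set and variable names are assumed to match the recidivism example.      */
proc phreg data=recid;
   model week*arrest(0) = fin age race wexp mar paro prio
         / selection=score best=3;   /* BEST=3 is an illustrative assumption */
run;
```

For each model size, this should list the subsets with the largest score chi-squares, which you can then compare with the PROC LIFETEST results.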
Let’s look at the recidivism data as an example. The covariate tests are invoked by listing the variable names in a TEST statement:
proc lifetest data=recid;
   time week*arrest(0);
   test fin age race wexp mar paro prio;
run;
Output 3.13 shows selections from the output. I have omitted the Wilcoxon statistics because they are nearly identical to the log-rank statistics for this example. I also omitted the variance-covariance matrix for the statistics because it is primarily useful as input to other analyses.
Univariate Chi-Squares for the LOG RANK Test

                 Test                                    Pr >
Variable    Statistic    Variance    Chi-Square    Chi-Square
FIN           10.4256     28.4744        3.8172        0.0507
AGE             233.2      4305.3       12.6318        0.0004
RACE          -2.7093     12.8100        0.5730        0.4491
WEXP          16.4141     27.3305        9.8580        0.0017
MAR            7.1773     13.1535        3.9164        0.0478
PARO           2.9471     26.7927        0.3242        0.5691
PRIO           -108.8       812.3       14.5602        0.0001

Forward Stepwise Sequence of Chi-Squares for the LOG RANK Test

                                      Pr >    Chi-Square          Pr >
Variable    DF    Chi-Square    Chi-Square     Increment     Increment
PRIO         1       14.5602        0.0001       14.5602        0.0001
AGE          2       25.4905        0.0001       10.9303        0.0009
FIN          3       28.8871        0.0001        3.3966        0.0653
MAR          4       31.0920        0.0001        2.2050        0.1376
RACE         5       32.4214        0.0001        1.3294        0.2489
WEXP         6       33.2800        0.0001        0.8585        0.3541
PARO         7       33.3828        0.0001        0.1029        0.7484
The top panel shows that age at release (AGE), work experience (WEXP), and number of prior convictions (PRIO) have highly significant associations with time to arrest. The effects of marital status (MAR) and financial aid (FIN) are more marginal, while race (RACE) and parole status (PARO) are apparently unrelated to survival time. The signs of the log-rank test statistics tell you the direction of the relationship. The negative sign for PRIO indicates that inmates with more prior convictions tend to have shorter times to arrest. On the other hand, the positive sign for AGE indicates that older inmates tend to have longer times to arrest. As already noted, none of these tests controls or adjusts for any of the other covariates.
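Each chi-square in the top panel is the squared test statistic divided by its variance, referred to a chi-square distribution with 1 degree of freedom. The DATA step below is a small sketch of that arithmetic for the FIN row. The variable names are made up for illustration, and the numbers are copied from the output rather than recomputed from the data.

```sas
/* Sketch only: reproduce the FIN chi-square and p-value from Output 3.13. */
data _null_;
   statistic = 10.4256;               /* log-rank test statistic for FIN */
   variance  = 28.4744;               /* its estimated variance          */
   chisq = statistic**2 / variance;   /* 1-df chi-square                 */
   p     = 1 - probchi(chisq, 1);     /* upper-tail p-value              */
   put chisq= 8.4 p= 8.4;             /* write both values to the log    */
run;
```

The values written to the log should agree closely with the 3.8172 and 0.0507 reported for FIN. Small discrepancies can arise for rows where the printed statistic and variance are rounded, such as AGE and PRIO.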
The lower panel displays results from a forward inclusion procedure. PROC LIFETEST first finds the variable with the highest chi-square statistic in the top panel (in this case PRIO) and puts it in the set to be tested. Since PRIO is the only variable in the set, the results for PRIO are the same in both panels. Then PROC LIFETEST finds the variable that produces the largest increment in the joint chi-square for the set of two variables (in this case AGE). The joint chi-square of 25.49 in line 2 tests the null hypothesis that the coefficients of AGE and PRIO in an accelerated failure time model are both 0. The chi-square increment of 10.93 is merely the difference between the joint chi-squares in lines 1 and 2. It is a test of the null hypothesis that the coefficient for AGE is 0 when PRIO is controlled. On the other hand, there is no test for the effect of PRIO controlling for AGE.
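To make the incremental test concrete, here is a sketch of the arithmetic for the AGE line. The variable names are hypothetical, and the two joint chi-squares are copied from the lower panel. The increment is treated as a chi-square with 1 degree of freedom because one variable was added.

```sas
/* Sketch only: chi-square increment for adding AGE to a set containing PRIO. */
data _null_;
   joint_prio     = 14.5602;                    /* joint chi-square, PRIO alone (1 df)   */
   joint_prio_age = 25.4905;                    /* joint chi-square, PRIO and AGE (2 df) */
   increment   = joint_prio_age - joint_prio;   /* incremental chi-square for AGE        */
   p_increment = 1 - probchi(increment, 1);     /* 1-df upper-tail p-value               */
   put increment= 8.4 p_increment= 8.4;
run;
```

The result should match the 10.9303 and 0.0009 shown in the Increment columns for AGE.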
This process is repeated until all the variables are added. For each variable, we get a test of the hypothesis that the variable has no effect on survival time controlling for all the variables above it (but none of the variables below it). For variables near the end of the sequence, the incremental chi-square values are likely to be similar to what you might find with PROC LIFEREG or PROC PHREG. For variables near the beginning of the sequence, however, the results can be quite different.
For this example, the forward inclusion procedure leads to some substantially different conclusions from the univariate procedure. While WEXP has a highly significant effect on survival time when considered by itself, there is no evidence of such an effect when other variables are controlled. The reason is that work experience is moderately correlated with age and the number of prior convictions, both of which have substantial effects on survival time. Marital status also loses its statistical significance in the forward inclusion test.
What is the relationship between the STRATA statement and the TEST statement? For a dichotomous variable like FIN, the statement TEST FIN is a possible alternative to STRATA FIN. Both produce a test of the null hypothesis that the survivor functions are the same for the two categories of FIN. In fact, if there are no ties in the data (no cases with exactly the same event time), the two statements will produce identical chi-square statistics and p-values. In the presence of ties, however, STRATA and TEST use somewhat different formulas, which may result in slight differences in the p-values. (If you’re interested in the details, see Collett 1994, p. 284.) In the recidivism data, for example, the 114 arrests occurred at only 49 unique arrest times, so the number of ties was substantial. The STRATA statement produces a log-rank chi-square of 3.8376 for a p-value of .0501, and a Wilcoxon chi-square of 3.7495 for a p-value of .0528. The TEST statement produces a log-rank chi-square of 3.8172 for a p-value of .0507, and a Wilcoxon chi-square of 3.7485 for a p-value of .0529. Obviously, the differences are minuscule in this case.
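The two approaches compared above can be sketched as a pair of PROC LIFETEST steps. This assumes the same recidivism data set and variable names used elsewhere in this chapter.

```sas
/* Sketch only: two ways to test the effect of the dichotomous variable FIN. */

/* STRATA: treats each value of FIN as a separate group. */
proc lifetest data=recid;
   time week*arrest(0);
   strata fin;
run;

/* TEST: treats FIN as a quantitative covariate. */
proc lifetest data=recid;
   time week*arrest(0);
   test fin;
run;
```

Comparing the log-rank and Wilcoxon chi-squares from the two steps shows how much the different handling of ties matters for a given data set.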
Other considerations should govern the choice between STRATA and TEST. While STRATA produces separate tables and graphs of the survivor function for the two groups, TEST produces only the single table and graph for the entire sample. With TEST, you can test for the effects of many dichotomous variables with a single statement, but STRATA requires a new PROC LIFETEST step for each variable tested. Of course, if a variable has more than two values, STRATA treats each value as a separate group while TEST treats the variable as a quantitative measure.
What happens when you include both a STRATA statement and a TEST statement? Adding a TEST statement has no effect whatever on the results from the STRATA statement. This fact implies that the hypothesis test produced by the STRATA statement in no way controls for the variables listed in the TEST statement. On the other hand, the TEST statement can produce quite different results, depending on whether you also have a STRATA statement. When you have a STRATA statement, the log-rank and Wilcoxon statistics produced by the TEST statement are first calculated within strata and then averaged across strata. In other words, they are stratified statistics that control for whatever variable or variables are listed in the STRATA statement. Suppose, for example, that for the myelomatosis data we want to test the effect of the treatment while controlling for renal functioning. We can submit these statements:
proc lifetest data=renal;
   time dur*censor(0);
   strata renal;
   test treat;
run;
The resulting log-rank chi-square for TREAT was 5.791 with a p-value of .016. This result is in sharp contrast with the unstratified chi-square of only 1.3126 that we saw earlier in this chapter (Output 3.5). As we’ll see in Chapter 5, “Estimating Cox Regression Models with PROC PHREG” (Output 5.17), you can obtain identical results using PROC PHREG with stratification and the score test.
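As a preview, a stratified PROC PHREG step along the following lines should report the corresponding score test in its table of tests of the global null hypothesis. This is only a sketch. It reuses the data set and variable names from the PROC LIFETEST step above and relies on the default handling of ties, which is an assumption here. Chapter 5 gives the details.

```sas
/* Sketch only: stratified Cox model whose score test parallels the stratified log-rank test. */
/* Data set and variable names are carried over from the PROC LIFETEST step above.            */
proc phreg data=renal;
   model dur*censor(0) = treat;   /* TREAT is the only covariate             */
   strata renal;                  /* stratify on renal functioning           */
run;
```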
Clearly, there is no point in listing a variable in both a STRATA and a TEST statement. If you do it anyway, the TEST statement will not give meaningful results for that variable.