Log Survival and Smoothed Hazard Plots

PROC LIFETEST produces two other plots that give useful information about the shape of the hazard function: the log-survival (LS) plot and the log-log survival (LLS) plot. In this section, we’ll see how these plots help determine whether the hazard function can be accurately described by certain parametric models discussed in Chapter 2. We’ll also see how to get smoothed estimates of the hazard function with ungrouped data.

Suppose we specify PLOTS=(S, LS, LLS) in the PROC LIFETEST statement. The S gives us the now-familiar survival curve. LS produces a plot of –log Ŝ(t) versus t. To explain what this plot is good for, we need a little background. From equation (2.6), you can readily see that

   –log S(t) = ∫₀ᵗ h(u) du

Because of this relationship, the negative log survivor function is commonly referred to as the cumulative hazard function, frequently denoted by Λ(t). Now, if h(t) is a constant with a value of λ (which corresponds to an exponential distribution), then the cumulative hazard function is just Λ(t) = λt. This result implies that a plot of –log Ŝ(t) versus t should yield a straight line through the origin. Moreover, an examination of the log-survival plot can tell us whether the hazard is constant, increasing, or decreasing with time.
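
The link between the shape of the log-survival plot and the behavior of the hazard can be made explicit with a short derivation. This is only a sketch; it uses nothing beyond the definition of the cumulative hazard as Λ(t) = –log S(t) and the assumption that h(t) is differentiable.

```latex
\begin{align*}
\Lambda(t) &= -\log S(t) = \int_0^t h(u)\,du, \\
\frac{d\Lambda(t)}{dt} &= h(t), \qquad
\frac{d^2\Lambda(t)}{dt^2} = h'(t), \\
h(t) = \lambda \;&\Longrightarrow\; \Lambda(t) = \int_0^t \lambda\,du = \lambda t .
\end{align*}
```

The slope of the log-survival plot at any time is therefore the hazard at that time. A curve that bends upward (h'(t) > 0) points to an increasing hazard, and a curve that flattens out (h'(t) < 0) points to a decreasing hazard.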

Output 3.14 displays the LS plot for the myelomatosis data. Instead of a straight line, the graph appears to increase at a decreasing rate. This fact suggests that the hazard is not constant, but rather declines with time. If the plot had curved upward rather than downward, it would suggest that the hazard was increasing with time. Of course, since the sample size is quite small, caution is advisable in drawing any conclusions. A formal test, such as the one described in the next chapter, might not show a significant decrease in the hazard.
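
For reference, a call along the following lines would request the survival, log-survival, and log-log survival plots for the myelomatosis data. It is a sketch only: the data set name MYEL and the variable names DUR (survival time) and STATUS (with 0 marking censored cases) are assumed here and are not given in this section. Only the PLOTS= option comes from the text.

```sas
/* Sketch only: MYEL, DUR, and STATUS are assumed names for the
   myelomatosis data set, its time variable, and its censoring
   indicator (0 = censored). */
proc lifetest data=myel plots=(s, ls, lls);
   time dur*status(0);
run;
```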

Output 3.14. Log-Survival Plot for Myelomatosis Data


The LLS keyword produces a plot of log[–log Ŝ(t)] versus log t. If survival times follow a Weibull distribution with a hazard given by log h(t) = α + β log t, then the cumulative hazard is proportional to t raised to the power β + 1, so the log-log survival plot (log cumulative hazard plot) should be a straight line with a slope of β + 1. Examining Output 3.15, we see a rather rough plot with a slight tendency to turn downward at later times. Again, however, the data are so sparse that this is probably not sufficient evidence for rejecting the Weibull distribution.
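
The straight-line property of the log-log survival plot can be checked by integrating the Weibull hazard directly. The sketch below assumes β > –1, so that the integral is finite, and writes Λ(t) = –log S(t) for the cumulative hazard.

```latex
\begin{align*}
\log h(t) &= \alpha + \beta \log t
  \;\Longrightarrow\; h(t) = e^{\alpha} t^{\beta}, \\
\Lambda(t) &= \int_0^t e^{\alpha} u^{\beta}\,du
  = \frac{e^{\alpha}}{\beta + 1}\, t^{\beta + 1}, \\
\log \Lambda(t) &= \bigl[\alpha - \log(\beta + 1)\bigr] + (\beta + 1)\log t .
\end{align*}
```

Plotted against log t, log Λ(t) is linear with intercept α – log(β + 1) and slope β + 1. A slope of 1 (β = 0) corresponds to the constant-hazard exponential case.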

In Chapter 4, we’ll see how to construct similar plots for other distributions such as the log-normal and log-logistic. We’ll also see how to apply the log-survival plot to residuals from a regression analysis, thereby testing the fit of the model.

Output 3.15. Log-Log Survival Plot for Myelomatosis Data


Although graphs based on the survivor function are certainly useful, they are ultimately rather frustrating. What we really want to see is a graph of the hazard function. We got that from the life-table method, but at the cost of grouping the event times into arbitrarily chosen intervals. While it’s possible to estimate hazards from ungrouped data, the estimates usually vary so wildly from one time to the next as to be almost useless. There are several ways to smooth these estimates by calculating some sort of moving average. One method, known as kernel smoothing, has been shown to have good properties for hazard functions (Ramlau-Hansen 1983). I’ve written a SAS macro called SMOOTH that calculates and graphs kernel estimates of the hazard function using output from PROC LIFETEST. See Appendix 1, “Macro Programs,” for detailed information.

For the recidivism data, you use the macro as follows:

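/* Kaplan-Meier estimates of the survivor function, saved to data set A */
proc lifetest outsurv=a data=recid;
   time week*arrest(0);
run;

/* Kernel-smoothed hazard estimate with a bandwidth of 8 weeks */
%smooth(data=a,time=week,width=8)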

In this example, PROC LIFETEST calculates Kaplan-Meier estimates of the survivor function and outputs them to data set A. Three parameters are passed to the SMOOTH macro: DATA=A gives the name of the input data set; TIME=WEEK provides the name of the variable containing event times; and WIDTH=8 sets the bandwidth for the smoothing function. This means that data points more than eight weeks away from the point being estimated are not used in the smoothing function. This is, of course, an arbitrary choice. In practice, you can try out various widths until you find a value that eliminates extreme choppiness but still leaves distinctive features of the curve. If you do not specify a value for WIDTH, SMOOTH uses one-fifth the range of the event times, which is often a pretty good start.
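
For readers who want to see the role of the bandwidth in symbols, here is a generic form of a kernel-smoothed hazard estimator. It is a sketch, not a transcription of the SMOOTH macro: the symbols b (bandwidth), K (kernel), t_j (distinct event times), d_j (number of events at t_j), and n_j (number at risk just before t_j) are introduced for illustration. The Epanechnikov kernel shown is one common choice, assumed here rather than specified in the text.

```latex
\begin{align*}
\hat h(t) &= \frac{1}{b} \sum_{j} K\!\left(\frac{t - t_j}{b}\right) \frac{d_j}{n_j}, \\
K(x) &= \tfrac{3}{4}\,(1 - x^2) \ \text{for } |x| \le 1, \qquad
K(x) = 0 \ \text{otherwise}.
\end{align*}
```

Because K vanishes outside [–1, 1], event times more than b away from t receive zero weight. With b = 8, only events within eight weeks of t contribute to the estimate, which matches the description of WIDTH=8 above.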

Output 3.16 shows the smoothed hazard function. The graph bears some resemblance to the grouped hazard estimates in Output 3.10, although here we see a rather pronounced trough at about 27 weeks. Note that the graph is truncated at about 8 weeks and 44 weeks. To avoid artifactual peaks or dips at the ends of the curve, the SMOOTH macro does not produce estimates for times less than one bandwidth above the minimum event time or more than one bandwidth below the maximum event time.

Output 3.16. Smoothed Hazard Function Estimate for Recidivism Data

