The Origin of Time

All models for survival data are fundamentally concerned with the timing of events. In assigning a number to an event time, we implicitly choose both a scale and an origin. The scale is just the units in which time is measured: years, days, hours, minutes, or seconds. We have already seen that the numerical value of the hazard depends on the units of measurement for time. In practice, however, the choice of units makes little difference for the regression models discussed in later chapters. Because those models are linear in the logarithm of the hazard or of the event time, a change in the units of measurement affects only the intercept, leaving the coefficients unchanged.
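The claim about units can be checked numerically. The following is a minimal sketch, not one of the models from later chapters: it fits an ordinary least-squares line to the logarithm of simulated, uncensored event times, first in years and then in days. The data, the seed, and the helper name ols_line are assumptions made for illustration only.

```python
import math
import random


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(1)

# Hypothetical data: one covariate and uncensored event times in years.
x = [random.uniform(0, 10) for _ in range(200)]
years = [math.exp(1.0 - 0.2 * xi + random.gauss(0, 0.5)) for xi in x]

# The same event times expressed in days.
days = [t * 365.25 for t in years]

a_years, b_years = ols_line(x, [math.log(t) for t in years])
a_days, b_days = ols_line(x, [math.log(t) for t in days])

print("slope (years):", b_years, " slope (days):", b_days)
print("intercept shift:", a_days - a_years, " log(365.25):", math.log(365.25))
```

Because log(365.25 t) = log(365.25) + log(t), the rescaling adds a constant to every response, so the slope is the same in both fits and the intercept shifts by log(365.25).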

The choice of origin (0 point) is more problematic, however, for three reasons. First, it does make a difference—often substantial—in coefficient estimates and fit of the models. Second, the preferred origin is sometimes unavailable, and you must use some proxy. Third, many situations occur in which two or more possible time origins are available, but there is no unambiguous criterion for deciding among them.

Consider the problem of unavailability of the preferred origin. Many medical studies measure time of death as the length of time between the point of diagnosis and death. Most medical researchers would prefer, if possible, to measure time from the point of infection or the onset of the disease. Since there is often wide variation in how long it takes before a disease is diagnosed, the use of diagnosis time as a proxy may introduce a substantial amount of random noise into the measurement of death times. A likely consequence is attenuation of coefficients toward 0. Worse yet, since variation in time of diagnosis may depend on such factors as age, sex, race, and social class, there is also the possibility of systematic bias. Thus, if African Americans tend to be diagnosed later than Caucasians, they will appear to have shorter times to death even if their survival time from disease onset is the same. Unfortunately, if the point of disease onset is unavailable (as it usually is), you cannot do much about this problem except to be aware of the potential biases that might result.

On the other hand, the point of disease onset may not be the ideal choice for the origin. If the risk of death is heavily dependent on treatment—which cannot begin until the disease is diagnosed—then the point of diagnosis may actually be a better choice for the origin. This fact brings up the third issue. What criteria can be used to choose among two or more possible origins? Before attempting to answer that question, consider some of the possibilities:

  • Age. Demographers typically study the age at death, implicitly using the individual’s birth as the point of origin.

  • Calendar time. Suppose we begin monitoring a population of deer on October 1, 1993, and follow them for one year, recording any deaths that occur in that interval. If we know nothing about the animals prior to the starting date, then we have no choice but to use that as the origin for measuring death time.

  • Time since some other event. In studying the determinants of divorce, it is typical to use the date of marriage as the origin time. Similarly, in studying criminal recidivism, the natural starting date is the date at which the convict is released from prison.

  • Time since the last occurrence of the same type of event. When events are repeatable, it is common to measure the time of an event as the time since the most recent occurrence. Thus, if the event is a hospitalization, we may measure the length of time since the most recent hospitalization.

In principle, the hazard for the occurrence of a particular kind of event can be a function of all of these times or any subset of them. Nevertheless, the continuous-time methods considered in this book require a choice of a single time origin. (The discrete-time methods discussed in Chapter 7 are more flexible in that regard.) Although you can sometimes include time measurements based on other origins as covariates, that strategy usually restricts the choice of models and may require more demanding computation.
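To make the distinction concrete, here is a minimal sketch of how one record yields different event times under different origins. The dates, the field names, and the helper duration_in_days are hypothetical. Date of marriage serves as the principal origin, and time measured from birth is kept as a candidate covariate.

```python
from datetime import date


def duration_in_days(origin, event):
    """Days elapsed from the chosen origin to the event."""
    days = (event - origin).days
    if days < 0:
        raise ValueError("event occurs before the chosen origin")
    return days


# Hypothetical record for one person in a divorce study.
record = {
    "birth": date(1960, 3, 15),
    "marriage": date(1985, 6, 1),
    "divorce": date(1993, 10, 1),
}

# Principal origin: date of marriage.
time_since_marriage = duration_in_days(record["marriage"], record["divorce"])

# Alternative origin: birth, which gives age at the event.
age_at_divorce = duration_in_days(record["birth"], record["divorce"])

# A time based on the other origin that could enter the model as a covariate.
age_at_marriage = duration_in_days(record["birth"], record["marriage"])

print(time_since_marriage, age_at_divorce, age_at_marriage)
```

The same divorce date produces two different event times, and the two differ by exactly the age at marriage. Once one origin is chosen, that difference is the quantity that would enter the model as a covariate.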

So how do you choose the principal time origin? Here are some criteria that researchers commonly use, although not always with full awareness of the rationale or the implications.

  1. Choose a time origin that marks the onset of continuous exposure to risk of the event. If the event of interest is a divorce, the natural time origin is the date of the marriage. Prior to marriage, the risk (or hazard) of divorce is 0. After marriage, the risk is some positive number. In the case of recidivism, a convict is not at risk of recidivating until he or she is actually released from prison, so the point of release is an obvious time origin.

    This criterion is so intuitively appealing that most researchers instinctively apply it. But the justification is important because there are sometimes attractive alternatives. The most compelling argument for this criterion is that it automatically excludes earlier periods of time when the hazard is necessarily 0. If these periods are not excluded, and if they vary in length across individuals, then bias may result. (Chapter 5 shows how to exclude periods of zero hazard using PHREG.)

    Often this criterion is qualified to refer only to some subset of a larger class of events. For example, people are continuously at risk of death from the moment they are born. Yet, in studies of deaths due to radiation exposure, the usual origin is the time of first exposure. That is the point at which the individual is first exposed to risk of that particular kind of death, namely a death due to radiation exposure. Similarly, in a study of why some patients die sooner than others after cardiac surgery, the natural origin is the time of the surgery. The event of interest is then death following cardiac surgery. On the other hand, if the aim is to estimate the effect of surgery itself on the death rate among cardiac patients, the appropriate origin is time of diagnosis, with the occurrence of surgery as a time-dependent covariate.

  2. In experimental studies, choose the time of randomization to treatment as the time origin. In such studies, the main aim is usually to estimate the differential risk associated with different treatments. It is only at the point of assignment to treatment that such risk differentials become operative. Equally important, randomization should ensure that the distribution of other time origins (for example, onset of disease) is approximately the same across the treatment groups.

    This second criterion ordinarily overrides the first. In an experimental study of the effects of different kinds of marital counseling on the likelihood of divorce, the appropriate time origin would be the point at which couples were randomly assigned to treatment modality, not the date of the marriage. On the other hand, length of marriage at the time of treatment assignment can be included in the analysis as a covariate. This inclusion is essential if assignment to treatment was not randomized.

  3. Choose the time origin for which elapsed time has the strongest effect on the hazard. The main danger in choosing the wrong time origin is that the effect of time on the hazard may be inadequately controlled, leading to biased estimates of the effects of other covariates, especially time-dependent covariates. In general, the most important variables to control are those that have the biggest effects. For example, while it is certainly the case that the hazard for death is a function of age, the percent annual change in the hazard is rather small. On the other hand, the hazard for death due to ovarian cancer is likely to increase markedly with time since diagnosis. Hence, it is more important to control for time since diagnosis (by choosing it as the time origin). Again, it may be possible to control for other time origins by including them as covariates.
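The cardiac surgery example under the first criterion, with diagnosis as the origin and surgery as a time-dependent covariate, can be sketched as a data layout. This is an illustrative sketch only. The function name split_at_surgery and the tuple layout (start, stop, event, surgery) are assumptions, modeled on the interval-per-row format that software for time-dependent covariates commonly accepts. All times are measured from diagnosis.

```python
def split_at_surgery(followup_end, died, surgery_time=None):
    """Split one patient's follow-up into (start, stop, event, surgery) rows.

    Times are measured from diagnosis (time 0). The surgery indicator is 0
    before the operation and 1 afterward. The event flag is set only on the
    row in which follow-up ends.
    """
    if followup_end <= 0:
        raise ValueError("follow-up must end after the origin")

    event = int(died)

    # No surgery observed during follow-up: a single untreated interval.
    if surgery_time is None or surgery_time >= followup_end:
        return [(0.0, followup_end, event, 0)]

    # Surgery at the origin: the whole interval is post-surgery.
    if surgery_time <= 0:
        return [(0.0, followup_end, event, 1)]

    # Otherwise split at surgery; death can occur only in the second row.
    return [
        (0.0, surgery_time, 0, 0),
        (surgery_time, followup_end, event, 1),
    ]


# Hypothetical patients, times in months since diagnosis.
treated = split_at_surgery(24.0, died=True, surgery_time=6.0)
untreated = split_at_surgery(10.0, died=False)

print(treated)
print(untreated)
```

Splitting the record at the time of surgery lets the months before the operation count as untreated exposure, which is what allows the effect of surgery itself to be estimated with diagnosis as the origin.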
