Interpretations of the Hazard Function

Before proceeding further, three clarifications need to be made:

  • Although it may be helpful to think of the hazard as the instantaneous probability of an event at time t, it’s not really a probability because the hazard can be greater than 1.0. This can happen because of the division by Δt in equation (2.1). Although the hazard has no upper bound, it cannot be less than 0.

  • Because the hazard is defined in terms of a probability (which is never directly observed), it is itself an unobserved quantity. We may estimate the hazard with data, but that’s only an estimate.

  • It’s most useful to think of the hazard as a characteristic of individuals, not of populations or samples (unless everyone in the population is exactly the same). Each individual may have a hazard function that is completely different from anyone else’s.
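
The first clarification can be checked numerically. The following is a minimal sketch, not part of equation (2.1) itself: it assumes a hazard that is constant at an illustrative 1.3 per year (which implies an exponential waiting time), and the function name interval_rate is hypothetical. It computes the probability of an event in a short interval and divides it by the length of that interval.

```python
import math

def interval_rate(hazard, dt):
    """Probability of an event in an interval of length dt, divided by dt.

    Assumes a constant hazard (exponential waiting time); this is an
    assumption of the sketch, not a general property of hazards.
    """
    prob = 1.0 - math.exp(-hazard * dt)  # a genuine probability, never above 1
    return prob / dt

hazard = 1.3  # illustrative value, events per year
for dt in (1.0, 0.1, 0.001):
    print(dt, interval_rate(hazard, dt))
```

Each probability stays below 1, but under these assumptions the ratio moves toward 1.3 as the interval shrinks, which is how a hazard can exceed 1.0.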

The hazard function is much more than just a convenient way of describing a probability distribution. In fact, the hazard at any point t corresponds directly to intuitive notions of the risk of event occurrence at time t. With regard to numerical magnitude, the hazard is a dimensional quantity that has the form number of events per interval of time, which is why the hazard is sometimes called a rate. To interpret the value of the hazard, then, you must know the units in which time is measured. Suppose, for example, that I somehow know that my hazard for contracting influenza at some particular point in time is .015, with time measured in months. This means that if my hazard stays at that value over a period of one month, I would expect to contract influenza .015 times. Remember, this is an expected number of events, not a probability. If my hazard were 1.3 with time measured in years, then I would expect to contract influenza 1.3 times over the course of a year (assuming that my hazard stays constant during that year).
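
As a rough illustration of why the time unit matters, the sketch below multiplies a constant hazard by a duration to get an expected number of events, and rescales a monthly hazard to a yearly one. The function names and the 12-months-per-year conversion are assumptions made for this sketch; the hazard values are the ones used in the text.

```python
def expected_events(hazard, duration):
    """Expected number of events over `duration`, assuming a constant hazard.

    `hazard` and `duration` must use the same time unit.
    """
    return hazard * duration

def monthly_to_yearly(hazard_per_month):
    """Re-express a per-month hazard as a per-year hazard (12 months per year)."""
    return hazard_per_month * 12

print(expected_events(0.015, 1))   # hazard per month, one month of exposure
print(expected_events(1.3, 1))     # hazard per year, one year of exposure
print(monthly_to_yearly(0.015))    # the same influenza hazard, per year
```

The same underlying risk is .015 when time is in months and about .18 when time is in years, so a hazard value is uninterpretable without its unit.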

To make this more concrete, consider a simple but effective way of estimating the hazard. Suppose that we observe a sample of 10,000 people over a period of one month, and we find 75 cases of influenza. If every person is observed for the full month, the total exposure time is 10,000 months. Assuming that the hazard is constant over the month and across individuals, an optimal estimate of the hazard is 75/10,000 = .0075. If some people died or withdrew from the study during the one-month interval, we have to subtract their unobserved time from the denominator.
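
The events-over-exposure calculation can be sketched in a few lines. The function name hazard_estimate is hypothetical, and the dropout scenario (200 people observed for only half the month) is invented purely to show how the denominator shrinks; it does not come from any data.

```python
def hazard_estimate(events, exposure_times):
    """Events divided by total exposure time, assuming a constant hazard
    over the interval and across individuals."""
    total_exposure = sum(exposure_times)
    if total_exposure <= 0:
        raise ValueError("total exposure time must be positive")
    return events / total_exposure

# Everyone observed for the full month: 10,000 person-months of exposure.
full_followup = [1.0] * 10_000
print(hazard_estimate(75, full_followup))  # 0.0075

# Hypothetical dropout: 200 people contribute only half a month each,
# so the denominator falls to 9,900 person-months.
with_dropout = [1.0] * 9_800 + [0.5] * 200
print(hazard_estimate(75, with_dropout))
```

With the same 75 cases, less exposure time yields a slightly larger estimate, which is why unobserved time must come out of the denominator.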

The assumption that the hazard is constant may bother some readers since one thing known about hazards is that they can vary continuously with time. That’s why I introduced the hazard function in the first place. Yet, this sort of hypothetical interpretation is one that is familiar to everyone. If we examine the statement “This car is traveling at 30 miles per hour,” we are actually saying that “If the car continued at this constant speed for a period of one hour, it would travel 30 miles.” But cars never maintain exactly the same speed for a full hour.

The interpretation of the hazard as the expected number of events in a one-unit interval of time is sensible when events are repeatable. But what about a nonrepeatable event like death? Taking the reciprocal of the hazard, 1/h(t), gives the expected length of time until the event occurs, again assuming that h(t) remains constant. If my hazard for death is .018 per year at this moment, then I can expect to live another 1/.018 = 55.6 years. Of course, this calculation assumes that everything about me and my environment stays exactly the same. Actually, my hazard of death will certainly increase (at an increasing rate) as I age. The reciprocal of the hazard is useful for repeatable events as well. If I have a constant hazard of .015 per month of contracting influenza, the expected length of time between influenza episodes is 1/.015 = 66.7 months.
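
The reciprocal rule is easy to verify with the two hazards mentioned above. This is only a sketch under the constant-hazard assumption stated in the text; the function name expected_waiting_time is hypothetical.

```python
def expected_waiting_time(hazard):
    """Expected time until the next event, assuming the hazard stays constant.

    The result is in the same time unit the hazard is expressed in.
    """
    if hazard <= 0:
        raise ValueError("hazard must be positive")
    return 1.0 / hazard

print(round(expected_waiting_time(0.018), 1))  # 55.6 (years, hazard of death)
print(round(expected_waiting_time(0.015), 1))  # 66.7 (months, influenza)
```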

In thinking about the hazard, I find it helpful to imagine that each of us carries around hazards for different kinds of events. I have a hazard for accidental death, a hazard for coronary infarction, a hazard for quitting my job, a hazard for being sued, and so on. Furthermore, each of these hazards changes as conditions change. Right now, as I sit in front of my computer, my hazard for serious injury (one requiring hospitalization) is very low, but not zero. The ceiling could collapse, my chair could tip over, etc. It surely goes up substantially as I leave my office and walk down the stairs. And it goes up even more when I get in my car and drive onto the expressway. Then it goes down again when I get out of my car and walk into my home.

This example illustrates the fact that the true hazard function for a specific individual and a specific event varies greatly with the ambient conditions. In fact, it is often a step function with dramatic increases or decreases as an individual moves from one situation to another. When we estimate a hazard function for a group of individuals, these micro-level changes typically cancel out so that we end up capturing only the gross trends with age or calendar time. On the other hand, by including changing conditions as time-dependent covariates in a regression model (Chapter 5, “Estimating Cox Regression Models with PROC PHREG”), we can estimate their effects on the hazard.
