Chapter 16

Cost and Cost-Effectiveness Analysis with Censored Data

Hongwei Zhao
Hongkun Wang

Abstract

16.1 Introduction

16.2 Statistical Methods

16.3 Example

16.4 Discussion

Acknowledgments

References

 

Abstract

Cost assessment and cost-effectiveness analysis are an essential part of the economic evaluation of medical interventions. In clinical trials and many observational studies, cost data as well as survival data are often incomplete because patients are lost to follow-up or the study is terminated administratively. There are numerous well-established statistical methodologies and software available for analyzing censored survival data. However, standard techniques for survival-type data are invalid for analyzing censored cost data because of induced informative censoring (dependence between censored costs and potential uncensored costs). In this chapter, we present some statistical methods that have been proposed for estimating medical costs and performing cost-effectiveness analysis with censored data. An example from a clinical trial comparing the effectiveness of implantable cardiac defibrillators with conventional therapy for individuals at high risk for ventricular arrhythmia is used to illustrate the methods. SAS code for performing the analysis is provided. The model assumptions are examined and further developments are discussed.

 

16.1 Introduction

With the advance of medicine, medical costs have escalated. Because resources are limited, health care organizations and health policy makers have great interest in evaluating the medical costs associated with different treatment options. In general, the mean cost per patient is of most concern because the total cost of an intervention can be derived from the mean, not from the mode, median, or other quantiles of the cost distribution (Ramsey et al., 2005).

Cost estimation in observational studies is challenging for many reasons. First, cost data are highly skewed. If a log transformation is applied to cost data, however, inference on the mean of the log-transformed cost translates back to the geometric mean, not the arithmetic mean of cost as desired. The smearing method (Duan, 1983) has been proposed to handle this type of problem. Another challenge is that a large amount of medical cost data can be missing, due to either missing visits or missing information on some types of medical costs (Briggs et al., 2003). Naive methods such as omitting missing data, carrying forward the last observation, or replacing the missing values with means of the observed data are often not satisfactory. Instead, multiple imputation (Rubin, 1987) and Bayesian simulation methods (Schafer, 1997; Van Buuren, 1999) have been advocated for imputing missing data. In general, missing data can be classified into three categories according to Little and Rubin (1987):

• missing completely at random (MCAR), where the missingness mechanism is independent of the variables of interest

• missing at random (MAR), where missingness depends only on observed variables

• not missing at random (NMAR), where missingness depends on unobserved variables

It is usually easier to handle the MCAR and MAR cases; under NMAR, inference relies on assumptions about the missingness mechanism that cannot be verified from the available data.

One type of missing data is caused by censoring, either due to dropout from the study or to administrative censoring from the design of the study. This chapter mainly concentrates on this problem. Similar to other missing mechanisms, censoring can be classified into three categories as well, according to whether

• censoring is independent of the survival and cost history process (censoring completely at random)

• censoring depends only on observed variables (censoring at random)

• censoring depends on some unobserved variables

We will show that it is challenging to estimate the mean cost even under the first scenario, censoring completely at random. Because costs are censored, we cannot estimate the mean cost by simply averaging the observed medical costs of all subjects. Doing so would underestimate the true cost, because it treats the costs incurred after the censoring time as zero. Averaging the costs of only the complete observations yields an estimator that is biased toward the costs of patients with shorter survival times. Methods based on standard survival techniques, such as the Kaplan-Meier estimator (Kaplan and Meier, 1958), also result in biased estimators, even when the assumption that censoring is independent of the survival time is valid. This is due to so-called induced informative censoring, first noted by Lin and colleagues (1997). For example, an individual with a higher cost accumulation rate tends to have a higher cumulative cost both at the censoring time and at the potential uncensored survival time, so the cost at censoring and the uncensored total cost are positively correlated even when the censoring time is completely independent of the failure time.

Different statistical methods have been proposed for estimating mean cost with censored data (Young, 2005). We will focus on estimating the mean costs without using covariate information. Further development on incorporating covariate information in cost estimation is discussed in the last section of this chapter. Realizing that it is impossible to estimate the lifetime cost nonparametrically because of censoring, Huang and Louis (1998) proposed a method to jointly estimate survival time and lifetime costs. An alternative approach for handling the censoring issue is to limit the medical cost estimation to a certain time, which is determined by the availability of the data. Because the focus of an economic study involving cost is often on the marginal distribution of the cost, not the cost distribution conditional on a certain survival time, we concentrate on the second approach in the remainder of this chapter.

All methods for using censored data to estimate mean cost within a time limit can be classified into two types:

1. approaches that use only the final observed cost of individuals whose cost data are complete (uncensored)

2. approaches that use additional information from a patient’s cost history for both complete and censored individuals

In general, the former approaches are simple but inefficient (that is, the estimator has a larger variance), while the latter are more complicated but produce asymptotically more efficient estimators.

Lin and colleagues (1997) proposed three different estimators for estimating mean costs using either patients’ total cost or cost history. Their methods provide consistent estimates only when the censoring times are discrete. Bang and Tsiatis (2000) employed the inverse probability weighting scheme and proposed several estimators that belong to a general class of consistent and asymptotically normal estimators for estimating mean cost with censored data. Their so-called partitioned estimator makes use of the cost accumulation data. Thus, it improves the efficiency of the simple weighted estimator, but it requires dividing the health history into subintervals. Later, a more convenient and efficient estimator was suggested by Zhao and Tian (2001), which also belongs to the general class of estimators of Bang and Tsiatis (2000). In a recent article, Zhao and colleagues (2007) established equivalency among the estimators that were introduced by Lin and colleagues (1997), Bang and Tsiatis (2000), and Zhao and Tian (2001). For each type of estimator (with or without utilizing cost history), the estimators are identical under the condition that partition boundaries are chosen at the censoring points.

Cost estimation is frequently used in cost-effectiveness analysis to compare different treatments and evaluate the economic impact of new treatment options. Willan and Lin (2001) and Willan and colleagues (2003) illustrate the use of incremental net benefit (INB), which depends upon a decision maker’s willingness to pay (WTP) for an additional unit of effectiveness, denoted as λ. Namely, INB=cost - λ effect. A major advantage of this measure is that it is more mathematically convenient to deal with simple differences, whereas a major disadvantage is that it depends on λ, which is usually unknown or not well defined. An alternative measure of cost-effectiveness is the incremental cost-effectiveness ratio (ICER), which is defined as the extra costs incurred for saving an additional year of (quality-adjusted) life. The ICER is very useful for comparing two treatments when one is more costly but more effective than the other. The ICER has long been considered a standard tool among decision makers (Gold et al., 1996). The ICER and INB approaches are also discussed in Chapters 14 and 15.

For the purpose of illustration, our data analysis uses an example from a clinical trial, the Multicenter Automatic Defibrillator Implantation Trial (MADIT), which compared the effectiveness of implantable cardiac defibrillators vs. conventional therapy in preventing death among people who had prior myocardial infarctions (Mushlin et al., 1998). Methods that use either only the total costs or the additional cost history are employed to estimate the mean cost and to calculate the incremental cost-effectiveness ratio and its confidence interval. SAS code is provided together with analysis results.

The outline of this chapter is as follows. Section 16.2 introduces the methodology for estimating the mean cost as well as estimating the ICER and its confidence intervals. It is followed by the MADIT data analysis using SAS in Section 16.3. In Section 16.4, we examine the model assumptions and discuss other approaches available for estimating medical cost and for obtaining confidence intervals for cost-effectiveness ratios.

 

16.2 Statistical Methods

 

16.2.1 Notation and Assumptions

We first confine our attention to patients in one arm of the study. For the $i$th person in the study, let $T_i$ denote the overall survival time and $C_i$ the censoring time. Censoring is assumed to be random and independent of survival time. This assumption is usually satisfied when censoring is mainly caused by administrative reasons, such as the limited duration of a clinical trial or survey. In the discussion section, we will comment on the case when this assumption is not valid. Due to censoring, $T_i$ and $C_i$ are observed only through the follow-up time $X_i = \min(T_i, C_i)$. Let the censoring indicator be $\Delta_i = I(T_i \le C_i)$. Then $\Delta_i = 1$ means the $i$th person’s death is observed and $\Delta_i = 0$ means the survival time is censored. Denote by $U_i(t)$ the cost accumulated from time 0 (the point when the patient entered the study) to time $t$. Because of the presence of censoring, it is impossible to estimate the cost over the entire health history without making some distributional assumptions. Therefore, we consider only cost accumulated up to a prespecified time horizon $L$, chosen so that a reasonable amount of data is available on the time period $[0, L]$. Hence, we will consider $T_i^L = \min(T_i, L)$. For ease of notation, we will suppress the superscript $L$ and continue to use $T_i$.

Our goal is to estimate the mean medical cost $\mu = E\{U_i(T_i)\}$ up to a maximum time of $L$ from the observed data $[X_i, \Delta_i, \{U_i(t), t \le X_i\}, i = 1, \ldots, n]$. If the cost history is not recorded, we see only the final cost $U_i(X_i)$ for each individual, and those who experience the event of interest before being censored have $U_i = U_i(T_i) = U_i(X_i)$.

 

16.2.2 Estimating Mean Cost

If every patient were followed up to time L or until death, we would have complete costs for each patient, and a standard statistical method such as the sample mean could be used for estimating mean costs. However, in most cases, the cost and the survival time are not completely observed for all patients due to censoring. A simple weighted estimator for the mean cost was proposed by Bang and Tsiatis (2000), which has the following form:

$$\hat{\mu}_{WT} = \frac{1}{n}\sum_{i=1}^{n}\frac{\Delta_i U_i}{\hat{K}(T_i)} \tag{1}$$

where $\hat{K}(T_i)$ is the Kaplan-Meier estimator of $K(t) = \Pr(C > t)$, the survival distribution of the censoring variable $C$, evaluated at time $T_i$. In this simple weighted estimator, censoring is taken into account by weighting each uncensored individual by the inverse of the estimated probability of being observed. This inverse probability weighting idea originated with Horvitz and Thompson (1952) in survey sampling. This estimator is shown by Bang and Tsiatis (2000) to be consistent and asymptotically normal. Its variance can be consistently estimated by

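$$\hat{V}(\hat{\mu}_{WT}) = \frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n}\frac{\Delta_i (U_i - \hat{\mu}_{WT})^2}{\hat{K}(T_i)} + \frac{1}{n}\sum_{j=1}^{n}\frac{(1-\Delta_j)}{\hat{K}(C_j)^2}\left\{G_j(U^2) - G_j^2(U)\right\}\right] \tag{2}$$

where

$$G_j(U) = \frac{1}{n\hat{S}(X_j)}\sum_{i=1}^{n}\frac{\Delta_i U_i(X_i) I(X_i \ge X_j)}{\hat{K}(X_i)},$$

$G_j(U^2)$ is defined in the same way with $U_i^2$ in place of $U_i$, and $\hat{S}$ is the Kaplan-Meier estimator of the survival function $S(t) = \Pr(T > t)$.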

This estimator has been shown by Zhao and colleagues (2007) to be equivalent to the estimator T of Lin and colleagues (1997), if the boundaries of the intervals were chosen to be at those censoring times.
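As a minimal sketch of equation (1), the following SAS/IML fragment computes the simple weighted estimator for four hypothetical subjects with follow-up times 2, 3, 5, and 6, of whom only the second is censored. The data are made up for illustration and are not taken from any study. The values of the censoring survival estimate are supplied directly (1 before the single censoring time and 2/3 from that time on) rather than computed by PROC LIFETEST, and the fragment is written for PROC IML rather than for the SAS/IML Studio environment.

```sas
proc iml;
  /* Toy data (hypothetical): four subjects */
  tdelta = {1, 0, 1, 1};       /* 1 = death observed, 0 = censored       */
  tcost  = {10, 12, 30, 40};   /* total observed cost, in $1,000         */
  /* Assumed Kaplan-Meier values of K(t) = Pr(C > t) at each follow-up   */
  /* time: 1 for the first subject, 2/3 for the others                   */
  kc     = {3, 2, 2, 2} / 3;
  n      = nrow(tcost);

  /* Equation (1): elementwise product and division, then the average    */
  mean_sw = sum(tdelta # tcost / kc) / n;
  print mean_sw;               /* (10/1 + 30/(2/3) + 40/(2/3)) / 4 = 28.75 */
quit;
```

Only the three uncensored subjects contribute, and the two who died after the censoring time are weighted up by 3/2. For comparison, the average over all four subjects is 23 and the average over the three complete observations is about 26.67.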

The simple weighted estimator utilizes only data with complete observations. Thus it cannot be efficient, especially when censoring is heavy. One way to improve it is by capturing information from censored observations or from available cost history for both censored and uncensored observations. To improve the efficiency of the simple weighted estimator, Bang and Tsiatis (2000) proposed the partitioned estimator. It partitions the interval [0, L] into smaller intervals, computes the simple weighted estimator for cost incurred in each interval, and then sums over all intervals. Later, a more convenient and efficient estimator was suggested by Zhao and Tian (2001), which belongs to the general class of estimators of Bang and Tsiatis (2000) for mean cost but which does not require partitioning the health history. Pfeifer and Bang (2005) suggested a user-friendly formula for this improved estimator, which can be written as:

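$$\hat{\mu}_{IMP} = \frac{1}{n}\sum_{i=1}^{n}\frac{\Delta_i U_i}{\hat{K}(T_i)} + \frac{1}{n}\sum_{j=1}^{n}\frac{(1-\Delta_j)}{\hat{K}(C_j)}\left\{U_j - \bar{U}(C_j)\right\} \tag{3}$$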

where $\bar{U}(C_j) = \sum_{i=1}^{n} U_i(C_j) I(X_i \ge C_j) / \sum_{i=1}^{n} I(X_i \ge C_j)$ is the average cost at $C_j$ of individuals who are still under observation at time $C_j$. This estimator has been shown by Zhao and colleagues (2007) to be equivalent to the partitioned estimator of Bang and Tsiatis (2000) and the Lin and colleagues (1997) estimators A and B, when the partition boundaries are chosen to be at the censoring times.
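To see how the second term in (3) works, consider a small hypothetical example (the numbers are invented for illustration only). Four subjects have follow-up times 2, 3, 5, and 6; deaths are observed for all but the second subject; the final observed costs are 10, 12, 30, and 40 (in $1,000); and the third and fourth subjects had accumulated costs of 18 and 15 by time 3. The only censoring time is 3, so the Kaplan-Meier estimate of the censoring distribution is 1 before time 3 and 2/3 from time 3 onward.

```latex
\begin{align*}
\hat{\mu}_{WT} &= \frac{1}{4}\left(\frac{10}{1} + \frac{30}{2/3} + \frac{40}{2/3}\right)
                = \frac{115}{4} = 28.75, \\
\bar{U}(C_2)   &= \frac{12 + 18 + 15}{3} = 15, \\
\hat{\mu}_{IMP} &= 28.75 + \frac{1}{4}\cdot\frac{12 - 15}{2/3}
                = 28.75 - 1.125 = 27.625 .
\end{align*}
```

In this example the censored subject had spent less by time 3 than the average of those still under observation, so the correction term pulls the estimate down.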

The variance estimator for the improved estimator of mean cost is given by

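$$\begin{aligned}
\hat{V}(\hat{\mu}_{IMP}) = \hat{V}(\hat{\mu}_{WT})
&- \frac{2}{n^2}\sum_{j=1}^{n}\frac{1-\Delta_j}{Y(C_j)\hat{K}(C_j)}\sum_{i=1}^{n}\frac{\Delta_i I(X_i \ge X_j)}{\hat{K}(T_i)}\left\{U_i - G_j(U)\right\}\left\{U_i(X_j) - \bar{U}(X_j)\right\} \\
&+ \frac{1}{n^2}\sum_{j=1}^{n}\frac{1-\Delta_j}{Y(C_j)\hat{K}(C_j)}\sum_{i=1}^{n}\frac{\Delta_i I(X_i \ge X_j)}{\hat{K}(T_i)}\left\{U_i(X_j) - \bar{U}(X_j)\right\}^2
\end{aligned} \tag{4}$$

where $Y(C_j) = \sum_{i=1}^{n} I(X_i \ge C_j)$ is the number of individuals still under observation at time $C_j$.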

This improved estimator is not guaranteed to always be more efficient than the simple weighted estimator, but under most realistic situations, it will perform better than the simple weighted estimator.

For both estimators, we assume that we can compute a subject’s total cost or accumulated cost at a certain time before his censoring or death occurs. When missing cost data are present, an appropriate method for handling them must be employed first (Briggs et al., 2003). Wang and Zhao (2006) discussed the special situation when censoring for cost happens earlier than censoring for survival time for some subjects.

 

16.2.3 Estimating the Incremental Cost-Effectiveness Ratio and Its Confidence Interval

We now consider a two-arm trial and estimate the incremental cost-effectiveness ratio. For arm $k$ ($k = 0, 1$), denote $\mu_k^U$ as the mean cost and $\mu_k^T$ as the mean survival time, each limited to a window of time $[0, L]$. The ICER is estimated by

$$\frac{\hat{\mu}_1^U - \hat{\mu}_0^U}{\hat{\mu}_1^T - \hat{\mu}_0^T}, \tag{5}$$

where $\hat{\mu}_k^U$ and $\hat{\mu}_k^T$ are the estimators for the mean cost and mean survival time for arm $k$ ($k = 0, 1$), respectively. The mean survival time can be estimated using the area under the Kaplan-Meier curve of the survival function over $[0, L]$, which can be shown to be equivalent to $\hat{\mu}^T = \frac{1}{n}\sum_{i=1}^{n} \Delta_i T_i / \hat{K}(T_i)$, where $T_i$ is truncated at time $L$.
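The equivalence between the area under the Kaplan-Meier curve and the inverse-probability-weighted form can be checked numerically on a small hypothetical data set. This is a single invented example, not a proof. Suppose four subjects have follow-up times 2, 3, 5, and 6, with deaths observed for all but the second, and take L = 6. The Kaplan-Meier estimate of the survival function is then 1 on [0, 2), 3/4 on [2, 5), and 3/8 on [5, 6), and the Kaplan-Meier estimate of the censoring distribution is 1 at time 2 and 2/3 at times 5 and 6.

```latex
\begin{align*}
\int_0^6 \hat{S}(t)\,dt &= 2 \times 1 + 3 \times \tfrac{3}{4} + 1 \times \tfrac{3}{8}
                         = 2 + 2.25 + 0.375 = 4.625, \\
\frac{1}{n}\sum_{i=1}^{n}\frac{\Delta_i T_i}{\hat{K}(T_i)}
                        &= \frac{1}{4}\left(\frac{2}{1} + \frac{5}{2/3} + \frac{6}{2/3}\right)
                         = \frac{2 + 7.5 + 9}{4} = 4.625 .
\end{align*}
```

Both routes give the same restricted mean survival time for these data.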

There are different approaches available to obtain confidence intervals for the ICER. Chapter 14 provides an example with the bootstrap method. Here we use Fieller’s Theorem (Fieller, 1954) because asymptotically the numerator $x = \hat{\mu}_1^U - \hat{\mu}_0^U$ and the denominator $y = \hat{\mu}_1^T - \hat{\mu}_0^T$ in (5) follow a bivariate normal distribution, which satisfies the requirement of this theorem. Hence, the $100(1-\alpha)$ percent confidence limits for the ICER are

$$\frac{xy - z_{\alpha/2}^2 S_{xy} \pm \left\{f(x, y, S_{xx}, S_{xy}, S_{yy})\right\}^{1/2}}{y^2 - z_{\alpha/2}^2 S_{yy}} \tag{6}$$

where $f(x, y, S_{xx}, S_{xy}, S_{yy}) = (xy - z_{\alpha/2}^2 S_{xy})^2 - (x^2 - z_{\alpha/2}^2 S_{xx})(y^2 - z_{\alpha/2}^2 S_{yy})$; $S_{xx}$, $S_{yy}$, and $S_{xy}$ are, respectively, the variance of $x$, the variance of $y$, and the covariance of $x$ and $y$; and $z_{\alpha/2}$ is the cutoff point with upper-tail area $\alpha/2$ for the standard normal distribution. Because we assume that the two samples are independent, we can obtain the variances of $x$ and $y$ from the previous results by adding the corresponding variance estimates from the two arms. Formulae that can be used to consistently estimate the covariance between costs and survival times are given in Zhao and Tian (2001), so they are not presented here.
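The limits in (6) can be read as the roots of a quadratic. The following is a brief sketch of the standard argument, under the simplifying assumption that x and y are exactly bivariate normal with known variances and covariance. The symbol R, denoting a candidate value of the true ratio, is introduced here for illustration only. When R equals the true ratio, x - Ry has mean zero, which gives the first line; the confidence set collects all values of R that are not rejected at level α.

```latex
\begin{align*}
& x - R y \sim N\!\left(0,\; S_{xx} - 2 R S_{xy} + R^2 S_{yy}\right), \\
& (x - R y)^2 \le z_{\alpha/2}^2 \left(S_{xx} - 2 R S_{xy} + R^2 S_{yy}\right), \\
& R^2 \left(y^2 - z_{\alpha/2}^2 S_{yy}\right)
  - 2 R \left(x y - z_{\alpha/2}^2 S_{xy}\right)
  + \left(x^2 - z_{\alpha/2}^2 S_{xx}\right) \le 0 .
\end{align*}
```

Solving the quadratic in R gives the two limits in (6). The set is a bounded interval between the two roots when the leading coefficient is positive, that is, when the difference in mean survival time y is significantly different from zero at level α.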

 

16.3 Example

 

16.3.1 Study Description

To illustrate the methods discussed here, we use data collected from the Multicenter Automatic Defibrillator Implantation Trial (MADIT). MADIT was a randomized, fully sequential clinical trial that examined the effectiveness of an implantable cardiac defibrillator (ICD) in prevention of sudden death for patients who were at high risk for ventricular arrhythmia (Moss et al., 1996). Altogether, 181 patients were enrolled from 36 centers, with 89 patients assigned to the treatment group to receive ICDs and 92 assigned to the control group to receive conventional drug therapy. The first enrolled patient was followed for 61 months and the last for less than 1 month, with an average follow up of 27 months. After completion of the study, Moss and colleagues (1996) showed that use of an ICD as prophylactic therapy leads to improved survival compared with conventional medical therapy. Because of the high initial cost associated with the ICDs, cost data were collected for patients from the United States as part of the study. All medical costs incurred during the study were recorded, as described by Mushlin and colleagues (1998).

The original cost analysis was restricted to a 4-year period and performed using a method similar to the one proposed by Lin and colleagues (1997). We reanalyze the data using both the simple weighted estimator and the improved estimator discussed earlier. Restricted to a 4-year period, the data were heavily censored, with a 70% censoring rate in the ICD arm and a 48% censoring rate in the conventional therapy arm. The improved estimator allowed us to capture the information from censored observations. For the cost-effectiveness analysis, the ICER was also calculated using the improved method in estimating the mean cost. As is customary in cost-effectiveness analysis, both costs and survival time were discounted at a 3% annual rate (Gold et al., 1996).
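As a worked illustration of discounting survival time, the expression assigned to dsurv in Program 16.3 corresponds to continuous discounting at annual rate r, with time t measured in days. The symbols $t^{d}$ (discounted time) and $s$ (integration variable) are introduced here for illustration only, and the numerical line uses the 4-year limit of 1,461 days with r = 0.03.

```latex
\begin{align*}
t^{d} &= \int_0^{t} e^{-r s / 365.25}\, ds
       = \frac{365.25}{r}\left(1 - e^{-r t / 365.25}\right), \\
t^{d}\,\Big|_{\,t = 1461,\; r = 0.03}
      &= 12175\left(1 - e^{-0.12}\right)
       \approx 12175 \times 0.11308 \approx 1376.7 \text{ days}.
\end{align*}
```

Under this form, a full 4-year window counts as roughly 3.77 discounted years.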

Although this example comes from a clinical trial, not an observational study, the same calculation can be used for a censored observational study as long as the assumption of independent censoring is still valid. In the discussion section, we talk about how to handle the case when censoring depends on observed variables and how to adjust for baseline imbalance for observational studies.

 

16.3.2 Data Analysis

The data for each treatment arm came from two separate files, one containing the survival information and the other containing the cost information. The survival data included three variables: subject ID, survival time (in days), and survival status (1=death; 0=censored). Because many costs, such as hospitalization costs, were accumulated over a certain period of time, they were recorded by start time, stop time, and total costs within the period (in dollars). These costs were already discounted at a 3% annual rate. The SAS code and output examining the first 10 observations of the two files from the ICD arm follow.

Program 16.1 SAS Code for Examining the Survival File

libname local "C:\Documents and Settings\My Documents\Example";

proc print data=local.surv1 (obs=10);
run;

 

Output from Program 16.1

Obs    id    delta    surv
  1     1      0       167
  2     2      0      1582
  3     3      0      1792
  4     4      0      1303
  5     5      0      1204
  6     6      0       763
  7     7      0       453
  8     8      0      1644
  9     9      0      1818
 10    10      0       804

Program 16.2 SAS Code for Examining the Cost File

proc print data=local.cost1 (obs=10);
run;

 

Output from Program 16.2

Obs    cid    start    stop        cost
  1      1        1      29      133.12
  2      1        1      29       16.44
  3      1        1     158      421.75
  4      1        1     158       28.94
  5      1        1     158       25.28
  6      1        1     158       29.80
  7      1        1     158        2.79
  8      1        5       7    32764.22
  9      1       29     158       79.48
 10      1       30     120      150.99

SAS/IML software was used in the analysis. Because our formulas involve the Kaplan-Meier estimator, we need to call the SAS LIFETEST procedure from within SAS/IML. SAS Stat Studio (now SAS/IML Studio) enables us to perform this task. The following program was run in this environment. Given a set of survival and cost data, a user-defined time limit L, and a discount rate r, the program calculates and prints the cost estimators (simple weighted and improved), the mean (discounted) survival time, and their estimated variances and covariances.

Program 16.3 SAS Code for Performing Cost Analysis

libname local "C:\Documents and Settings\hongwei\My Documents\Home\SASbook\Example";

/* Read survival data */
use local.surv1;
read all var {id delta surv};
/*Subject ID, death indicator, and survival time;*/
close local.surv1;

/* Read cost data */
use local.cost1;
read all var {cid start stop cost};
/* Subject ID, cost starting date, stop date, cost incurred */
show names;
close local.cost1;

/* Define global variables */
n=nrow(id); /* number of subjects */
nobs=nrow(cid); /* total number of observations for cost data*/
L=1461; /* time limit */
r=0.03; /* annual discount rate */

/* Truncate survival time to L, name new variables tsurv and tdelta */
run TrunSurv(delta, surv, tdelta, tsurv);

/* Make the largest observation a failure */
do i= 1 to n;
 if (tsurv[i] = L) then tdelta[i]=1;
end;

/* Calculate the percentage of data that is censored */
percens=CalCensor(tdelta);
print , "Percent of censoring =" percens;

/* Calculate the Kaplan Meier estimator for K(t)=Pr(C>t), name it kc */
run KmCal(tsurv,tdelta,kc);

/* Calculate the Kaplan Meier estimator for S(t)=Pr(T>t), name it s */
censor=j(n,1,0);
do i= 1 to n;
  censor[i]=1-tdelta[i];
end;
run KmCal(tsurv,censor,s);

/* Calculate the total cost for each subject, which is needed for the
simple weighted estimator */
run CalTCost(cid, start,stop, cost, id, tsurv, tcost);

/* Calculate the mean cost using the simple weighted estimator */
mean_sw=CalOurMean(tdelta, kc, tcost);
print , "Simple weighted estimator for mean cost =" mean_sw;

/* Calculate the standard error of the simple weighted estimator */
var_sw=CalOurVar(tsurv, tdelta, s, kc, tcost, mean_sw);
se_sw=sqrt(var_sw);
print , "Standard error estimate for the simple weighted estimator =" se_sw;

/* Calculate the mean discounted survival time and its standard error */
dsurv=j(n,1,0);
do i= 1 to n;
 dsurv[i] = 365.25/r * (1.0-exp(-r*tsurv[i]/365.25));
end;
mean_T = CalOurMean(tdelta, kc, dsurv);
var_T = CalOurVar(tsurv, tdelta, s, kc, dsurv, mean_T);
se_T = sqrt(var_T);
print , "Mean survival time =" mean_T;
print , "Standard error for the mean survival time =" se_T;

/* Calculate cumulative cost at each censored time, which is needed for the
improved estimator */
run CalCulCost(cid, start, stop, cost, id, tsurv, tdelta, culcost);

/* Calculate improved estimator for mean cost and its standard error */
run CalMeanAdd(tsurv, tdelta, kc, s, tcost, culcost, meanadd, varsub);
mean_imp = mean_sw+meanadd;
se_imp = sqrt(var_sw-varsub);
print , "Improved estimator for mean cost =" mean_imp;
print , "Standard error of the improved estimator =" se_imp;

/* Calculate the covariance between mean survival time and simple weighted cost estimator */
cov_sw = CalOurCov(tdelta, tsurv, s, kc, tcost, mean_sw, dsurv, mean_T);
print , "Covariance between mean survival time and the simple weighted cost estimator =" cov_sw;

/* Calculate the covariance between mean survival time and improved cost estimator */
covsub=CalCovSub(tdelta, tsurv, s, kc, culcost, dsurv);
cov_imp=cov_sw-covsub;
print , "Covariance between mean survival time and the improved cost estimator =" cov_imp;

/* Subroutine to truncate the survival time to L */
start TrunSurv(delta, surv, tdelta, tsurv) global (L,n);
 tsurv=surv;
 tdelta=delta;
 do i= 1 to n;
   if surv[i]>L then do;
     tsurv[i]=L;
     tdelta[i]=1;
   end;
 end;
finish TrunSurv;

/* Subroutine to calculate the percentage of data that is censored */
start CalCensor(tdelta) global (L,n);
 cens=1-tdelta;
 percens = sum(cens)/n;
 return(percens);
finish CalCensor;

/* Subroutine to calculate the Kaplan Meier estimator for K(t)=Pr(C>t) */
start KmCal(surv, delta,kc);
 create InputDataSet var {surv delta};
 append;
 close InputDataSet;
   
 submit;
   proc lifetest data=InputDataSet noprint outsurv=OutputData;
   time surv*delta(1);
   run;
   
   data Out;
   set OutputData;
   tpdelta=1-_CENSOR_;
  tpsurv=surv;
   run;
 endsubmit;

 use Out;
 read all var {tpsurv tpdelta survival};
 run ChangeKmSurv(tpsurv, tpdelta, survival, surv, delta, kc);
 close Out;
finish KmCal;

/* Subroutine to carry forward the survival function estimate at the last
failure time */
start ChangeKmSurv(tpsurv, tpdelta, survival, surv, delta, kc) global
(L,n);
 minkc=1000;
 nn=nrow(tpsurv);
 kc=j(n,1,0);

 do j= 1 to nn;
       if (survival[j]>=0 & survival[j]<minkc) then do;
     minkc=survival[j];
     maxtime=tpsurv[j];
       end;
 end;
       
 do i= 1 to n;
  if (surv[i]>maxtime) then kc[i]=minkc;
   else do;
     do j = 1 to nn;
       if (surv[i]=tpsurv[j]) then kc[i]=survival[j];
     end;
   end;
 end;
finish ChangeKmSurv;

/* Subroutine to calculate the total cost */
/* This subroutine takes less time to run compared to the routine
calculating cumulative cost */
start CalTCost(cid, start, stop, cost, id, tsurv, tcost) global (n, nobs);
 tcost=j(n,1,0);
 do i= 1 to n;
   do k=1 to nobs;
     if (cid [k] = id[i] & start [k] <= tsurv[i]) then do;
        if (stop[k] > tsurv[i]) then
           tcost[i]=tcost[i]+cost[k]*(tsurv[i]-start[k]+1.0)/(stop[k]-
start[k]+1.0);
        else tcost[i] = tcost[i] + cost[k];
     end;
   end;
 end;
finish CalTCost;

/* Subroutine to calculate the simple weighted estimator for the mean cost
*/
start CalOurMean(tdelta, kc, tcost) global (n);
 mymean=0.;
 do i= 1 to n;
   if (tdelta[i]=1) then mymean = mymean + tcost[i]/kc[i];
 end;
 mymean = mymean/n;
 return(mymean);
finish CalOurMean;

/* Subroutine to calculate the variance of the simple weighted estimator */
start CalOurVar(tsurv, tdelta, s, kc, tcost, mymean)global (n);
 temp1 = 0.; /* part 1 of equation (2) */
 temp2 = 0.; /* part 2 of equation (2) */
 do i= 1 to n;
   if (tdelta[i]=1) then temp1 =temp1 + (tcost[i]-mymean)**2/kc[i];
 end;
 temp1 =temp1/n;

 do j= 1 to n;
   e=0.;
   f=0.;
   if (tdelta[j]=0) then do;
     do i= 1 to n;
       if(tdelta[i]=1 & tsurv[i]>=tsurv[j]) then do;
         e =e + tcost[i]/kc[i];
         f = f + (tcost[i])**2/kc[i];
       end;
     end;
     e = e / (s[j]*n);
     f = f /(s[j]*n);
     temp2 = temp2 + (f-e*e)/(kc[j]*kc[j]);
   end;
 end;
 temp2=temp2/n;

 myvar = temp1+temp2;
 myvar = myvar/n;
 return(myvar);
finish CalOurVar;

/* Subroutine to calculate the cumulative cost */
/* This routine is needed for calculating the improved estimator */
start CalCulCost(cid, start, stop, cost, id, tsurv, tdelta, culcost) global
(n, nobs);
 culcost=j(n,n,0);
 do i= 1 to n;
   do j= 1 to n;
     if (tsurv[i]>=tsurv[j] & tdelta[j]=0) then do;
       do k = 1 to nobs;
         if (cid [k] = id[i] & start [k] <= tsurv[j]) then do;
           if (stop[k] > tsurv[j]) then
             culcost[i,j]=culcost[i,j]+cost[k]*(tsurv[j]-
start[k]+1.0)/(stop[k]-start[k]+1.0);
           else culcost[i,j] = culcost[i,j] + cost[k];
         end;
       end;
   end;
  end;
 end;
finish CalCulCost;

/* Subroutine to calculate the additional terms for the improved estimator
and its variance */
start CalMeanAdd(tsurv, tdelta, kc, s, tcost, culcost, meanadd, varsub)
global (n);

/* First calculate Ubar[j] and risk set y[j] at censoring places */
 Ubar=j(n,1,0);
 y=j(n,1,0);
 do j= 1 to n;
   if (tdelta[j]=0) then do;
     do i= 1 to n;
       if (tsurv[i]>=tsurv[j]) then do;
         Ubar[j]= Ubar[j]+ culcost[i,j];
         y[j] = y[j]+1;
       end;
     end;
     Ubar[j]= Ubar[j]/y[j];
   end;
 end;

 /* Next calculate the additional terms for the improved estimator and its
variance */
 part1=0.; /* Additional term for the improved estimator */
 part2=0.; /* Second term in the variance formula for the improved
estimator, equation (4) */
 part3=0.; /* Third term in the variance formula for the improved
estimator, equation (4) */
 do j= 1 to n;
   if (tdelta[j]=0) then do;
     part1 = part1+ (tcost[j]-Ubar[j])/kc[j];
    
     gu=0.;
     par2temp=0.;
     par3temp=0.;
     do i= 1 to n;
       if(tdelta[i]=1 & tsurv[i]>=tsurv[j]) then
       gu = gu + tcost[i]/kc[i];
     end;
     gu = gu/( s[j]*n);

     do i= 1 to n;
       if(tdelta[i]=1 & tsurv[i]>=tsurv[j]) then
         par2temp = par2temp+ (tcost[i]-gu)*(culcost[i,j]-Ubar[j])/kc[i];
     end;
     part2 = part2 + par2temp/(y[j]*kc[j]);

     do i= 1 to n;
       if(tsurv[i]>=tsurv[j])then
         par3temp = par3temp +(culcost[i,j]-Ubar[j])**2;
     end;
     part3 = part3+ par3temp/(y[j]*kc[j]*kc[j]);
   end;
 end;
 part1 = part1/n;
 meanadd=part1;
 varsub=(2.0*part2-part3)/(n*n);
finish CalMeanAdd; 

/* Subroutine to calculate the covariance between mean survival time and
simple weighted cost estimator */
start CalOurCov(tdelta, tsurv, s, kc, tcost, mymean, dsurv, tmean) global
(n);
 temp1 = 0.;
 temp2 = 0.;

 do i= 1 to n;
   if (tdelta[i]=1) then temp1 = temp1 + tcost[i]*dsurv[i]/kc[i];
 end;

 temp1 = temp1/n;
 temp1 = temp1 - mymean * tmean;

 do j= 1 to n;
   gtc=0.;
   gt=0.;
   gc=0.;
   if (tdelta[j]=0) then do;
     do i= 1 to n;
       if(tdelta[i]=1 & tsurv[i]>=tsurv[j]) then do;
         gtc = gtc + tcost[i]*dsurv[i]/kc[i];
         gt = gt + dsurv[i]/kc[i];
        gc = gc + tcost[i]/kc[i];
       end;
     end;
     gtc = gtc / (s[j]*n);
     gt = gt/(s[j]*n);
     gc = gc/(s[j]*n);
     temp2 = temp2 +(gtc-gt*gc)/(kc[j]*kc[j]);
   end;
 end;
 temp2 =temp2/n;
 mycov = temp1+temp2;
 mycov = mycov/n;
  return(mycov);
finish CalOurCov;

/* Subroutine to calculate the additional term for covariance between mean
survival time and improved cost estimator */
start CalCovSub(tdelta, tsurv, s, kc, culcost, dsurv) global (n);

 /* First calculate Ubar[j] and risk set y[j] at censoring places */
 Ubar=j(n,1,0);
 y=j(n,1,0);
 do j= 1 to n;
   if (tdelta[j]=0) then do;
     do i= 1 to n;
       if (tsurv[i]>=tsurv[j]) then do;
         Ubar[j] = Ubar[j] + culcost[i,j];
         y[j] = y[j] + 1;
       end;
     end;
     Ubar[j] =Ubar[j]/y[j];
   end;
 end;

 /* Next calculate the additional term for the covariance using improved
cost estimator */
 part2=0.;
 do j= 1 to n;
   if (tdelta[j]=0) then do;
     par2temp=0.;
     gt = 0.;
     do i= 1 to n;
       if(tdelta[i]=1 & tsurv[i]>=tsurv[j]) then
         gt = gt + dsurv[i]/kc[i];
     end;

     gt = gt/(s[j]*n);

     do i= 1 to n;
       if(tdelta[i]=1 & tsurv[i]>=tsurv[j]) then
         par2temp = par2temp+(culcost[i,j]-Ubar[j])*(dsurv[i]-gt)/kc[i];
     end;
     part2 = part2 + par2temp/(y[j]*kc[j]);
   end;
 end;
 covsub=part2/(n*n);
 return(covsub);
finish CalCovSub;

 

Output from Program 16.3 for the ICD Group

PERCENS

Percent of censoring = 0.6966292

          MEAN_SW

Simple weighted estimator for mean cost = 110108.86

                                                               SE_SW

Standard error estimate for the simple weighted estimator = 6929.7977

                       MEAN_T

Mean survival time = 1261.4032

                                                SE_T

Standard error for the mean survival time = 35.996617

                                   MEAN_IMP

Improved estimator for mean cost = 99311.725

                                             SE_IMP

Standard error of the improved estimator = 5481.115

                                                                                  COV_SW

Covariance between mean survival time and the simple weighted cost estimator = 24943.547

                                                                         COV_IMP

Covariance between mean survival time and the improved cost estimator = 30485.773

Output from Program 16.3 for the Conventional Treatment Group

                        PERCENS

Percent of censoring = 0.4782609

                                           MEAN_SW

Simple weighted estimator for mean cost = 70034.696

                                                               SE_SW

Standard error estimate for the simple weighted estimator = 9267.5059

                       MEAN_T

Mean survival time = 968.77197

                                                SE_T

Standard error for the mean survival time = 58.370098

                                   MEAN_IMP

Improved estimator for mean cost = 72544.906

                                             SE_IMP

Standard error of the improved estimator = 8529.8308

                                                                                  COV_SW

Covariance between mean survival time and the simple weighted cost estimator = 24852.951

                                                                         COV_IMP

Covariance between mean survival time and the improved cost estimator = 29764.298

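Program 16.4 calculates the ICER (in $1,000 per year of life saved) and its 95% confidence interval, using the improved estimators for the costs. Costs are converted from dollars to $1,000 and survival times from days to years before the ratio is formed.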

Program 16.4 SAS Code for Obtaining Estimate of ICER and Its 95% Confidence Interval

proc iml;

 /* Module: Fieller 95% confidence limits for the ICER.
    The module must be defined before it is called. */
 start CalCIiCER(cost1,secost1,survt1,sesurvt1,covcs1,
                 cost2,secost2,survt2,sesurvt2,covcs2,lowbd,upperbd);
   t=1.96;                         /* normal critical value, 95% interval */
   x=cost1-cost2;                  /* difference in mean cost */
   y=survt1-survt2;                /* difference in mean survival time */
   sxx=secost1**2+secost2**2;      /* variance of x (independent groups) */
   syy=sesurvt1**2+sesurvt2**2;    /* variance of y */
   sxy=covcs1+covcs2;              /* covariance between x and y */

   f=(x*y-t**2*sxy)**2-(x**2-t**2*sxx)*(y**2-t**2*syy);
   lowbd=(x*y-t**2*sxy-sqrt(f))/(y**2-t**2*syy);
   upperbd=(x*y-t**2*sxy+sqrt(f))/(y**2-t**2*syy);
 finish CalCIiCER;

 /* Conventional treatment group: cost in $1,000, survival time in years */
 cost1=72544.91/1000;
 secost1=8529.83/1000;
 survt1=968.77/365.25;
 sesurvt1=58.37/365.25;
 covcs1=29764.30/1000/365.25;

 /* ICD group: cost in $1,000, survival time in years */
 cost2=99311.73/1000;
 secost2=5481.11/1000;
 survt2=1261.40/365.25;
 sesurvt2=36.00/365.25;
 covcs2=30485.77/1000/365.25;

 icer=(cost2-cost1)/(survt2-survt1);
 run CalCIiCER(cost1,secost1,survt1,sesurvt1,covcs1,
               cost2,secost2,survt2,sesurvt2,covcs2,lowbd,upperbd);

 print , "Incremental cost-effectiveness ratio =" icer;
 print , "Lower 95% confidence limit for icer =" lowbd;
 print , "Upper 95% confidence limit for icer =" upperbd;
quit;

 

Output from Program 16.4

ICER

Incremental cost-effectiveness ratio = 33.40936

                                         LOWBD

Lower 95% confidence limit for icer = 8.6318453

                                       UPPERBD

Upper 95% confidence limit for icer = 73.552101

When restricted to a 4-year period, the average cost was $99,312 (standard error [s.e.] $5,481) for the ICD arm and $72,545 (s.e. $8,530) for the conventional therapy arm. The average survival time during the 4-year period was 1,261 (s.e. 36) days for the ICD arm and 969 (s.e. 58) days for the conventional therapy arm. The ICER comparing the ICD arm with the conventional arm was $33,400 per year of life saved, with a 95% confidence interval of ($8,600, $73,600). The estimated ICER was below $50,000 per year of life saved, an often-mentioned threshold under which a treatment can be considered cost-effective relative to the control (Gold et al., 1996).
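As a cross-check on these figures, the ICER and its Fieller limits can be recomputed outside SAS. The following is a minimal sketch in Python rather than SAS/IML, offered only so that the arithmetic can be verified independently; it is not part of the original program. The function name fieller_interval is hypothetical, and the inputs are the rounded values entered in Program 16.4.

```python
import math


def fieller_interval(x, y, sxx, syy, sxy, t=1.96):
    """Fieller limits for the ratio x / y.

    Returns None when the confidence set is not a bounded interval
    (that is, when y**2 - t**2 * syy is not positive).
    """
    a = y ** 2 - t ** 2 * syy
    b = x * y - t ** 2 * sxy
    c = x ** 2 - t ** 2 * sxx
    disc = b ** 2 - a * c
    if a <= 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    return (b - root) / a, (b + root) / a


# Inputs copied from Program 16.4: costs in $1,000, survival time in years.
cost_conv, se_cost_conv = 72544.91 / 1000, 8529.83 / 1000
surv_conv, se_surv_conv = 968.77 / 365.25, 58.37 / 365.25
cov_conv = 29764.30 / 1000 / 365.25

cost_icd, se_cost_icd = 99311.73 / 1000, 5481.11 / 1000
surv_icd, se_surv_icd = 1261.40 / 365.25, 36.00 / 365.25
cov_icd = 30485.77 / 1000 / 365.25

x = cost_icd - cost_conv            # incremental cost
y = surv_icd - surv_conv            # incremental survival time
sxx = se_cost_conv ** 2 + se_cost_icd ** 2
syy = se_surv_conv ** 2 + se_surv_icd ** 2
sxy = cov_conv + cov_icd

icer = x / y
low, high = fieller_interval(x, y, sxx, syy, sxy)
print(round(icer, 2), round(low, 2), round(high, 2))
```

The printed values should agree with the output from Program 16.4 (about 33.41, 8.63, and 73.55, in $1,000 per year of life saved). Taking the differences as ICD minus conventional, rather than the reverse as in the SAS module, leaves the ratio and its limits unchanged because both differences change sign together.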

Using the simple weighted estimator, the mean cost was $110,109 (s.e. $6,930) for the ICD arm and $70,035 (s.e. $9,268) for the conventional arm. In both arms the simple weighted estimator has a larger standard error than the improved estimator, and thus it is less efficient. Using a method similar to that of Lin and colleagues (1997), with a bootstrap method for the standard error, Mushlin and colleagues (1998) reported a mean cost of $99,310 for the ICD arm and $72,540 for the conventional arm. Their estimated ICER was $27,000 per year of life saved, with a 95% confidence interval of (0.2, 68.2) in thousands of dollars. These numbers are very close to those obtained with the improved estimator.

 

16.4 Discussion

Throughout this chapter, it is assumed that censoring is random and independent of the survival time and the cost-accumulating process. This assumption is usually met in well-conducted clinical trials, where censoring is mainly caused by administrative termination of the study. In observational studies, this assumption might not be reasonable. However, if the censoring process can be modeled through some known variables, it is still possible to use the inverse-probability weighted method. In that case, the survival probability for the censoring variable is not obtained from the nonparametric Kaplan-Meier estimator; instead, it can be estimated by a regression method such as the Cox proportional hazards model (Cox, 1972), provided the proportional hazards assumption is met. Another alternative is an adjusted Kaplan-Meier estimator, in which inverse probability of treatment weighting is used to adjust the survival distribution of the censoring variable for confounding factors (Xie and Liu, 2005).
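To make the role of the censoring distribution concrete, the sketch below computes the simple weighted estimator with Kaplan-Meier weights, which is the piece that a covariate-based model for censoring would replace. It is written in Python rather than SAS/IML so that it can be checked independently, and it is an illustration under stated assumptions, not the chapter's program: the function names are hypothetical, the data are made up, the weight is taken as the censoring survival probability just before each observed time, and ties between deaths and censoring times receive no special handling.

```python
def censoring_km_left(times, deltas):
    """Kaplan-Meier estimate of K(T_i-) = P(C >= T_i) for each subject.

    deltas[i] == 1 means the complete cost was observed; deltas[i] == 0
    means the subject was censored. For this estimator a censored
    observation plays the role of the "event".
    """
    censor_times = sorted({t for t, d in zip(times, deltas) if d == 0})
    k_left = []
    for t_i in times:
        surv = 1.0
        for u in censor_times:
            if u >= t_i:          # only censoring strictly before T_i counts
                break
            at_risk = sum(1 for t in times if t >= u)
            censored = sum(1 for t, d in zip(times, deltas) if t == u and d == 0)
            surv *= 1.0 - censored / at_risk
        k_left.append(surv)
    return k_left


def simple_weighted_mean(times, deltas, costs):
    """Inverse-probability-weighted mean cost using complete cases only."""
    k_left = censoring_km_left(times, deltas)
    n = len(times)
    return sum(d * m / k for d, m, k in zip(deltas, costs, k_left)) / n


# Made-up toy data: four subjects, the second one censored at time 2.
toy_times = [1, 2, 3, 4]
toy_deltas = [1, 0, 1, 1]
toy_costs = [10.0, 5.0, 30.0, 40.0]
print(simple_weighted_mean(toy_times, toy_deltas, toy_costs))
```

In the toy data, the two subjects observed after the censoring time each receive weight 1/(2/3) = 1.5, so the estimate is (10 + 45 + 60)/4, or about 28.75. When censoring depends on covariates, the list returned by censoring_km_left would be replaced by subject-specific probabilities from the fitted censoring model.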

The weight, which is the inverse of the estimated survival probability for the censoring variable, can become very large when there is considerable censoring near the time limit, L. Consequently, a few very small probabilities can inflate the estimator. In this situation, one may want to reduce the limit, L, because it is difficult to estimate costs when many observations are censored in the tail area.

Cost data are usually right-skewed, with a few patients accumulating very large costs while the majority of subjects incur relatively little cost. The methods discussed here are fully nonparametric, which means that no distributional assumption is made for either the cost history or the survival time. However, a reasonably large sample is needed because the nonparametric methods rely on large-sample theory.

We have discussed how to use Fieller’s method to obtain the confidence interval of the incremental cost-effectiveness ratio. Fieller’s method provides a confidence set that has the correct coverage probability as long as the numerator and the denominator of the ICER have a bivariate normal distribution, which is satisfied asymptotically for our method. An alternative is to use bootstrap methods (Efron and Tibshirani, 1986, 1993); see Chapter 14.
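For readers who want to see where the limits computed in Program 16.4 come from, the following derivation sketches Fieller's argument in the notation of the CalCIiCER module. It assumes only the bivariate normality stated above; R denotes the true ratio, and the symbols x, y, s_xx, s_yy, s_xy, and t match the variables of the same names in the module.

```latex
% x = difference in mean costs, y = difference in mean survival times,
% s_{xx}, s_{yy}, s_{xy} = estimated variances and covariance of (x, y),
% t = 1.96. Under bivariate normality, x - R y is normal with mean zero
% and variance s_{xx} - 2 R s_{xy} + R^2 s_{yy}.
\[
\frac{(x - R\,y)^2}{s_{xx} - 2R\,s_{xy} + R^2 s_{yy}} \le t^2
\;\Longleftrightarrow\;
(y^2 - t^2 s_{yy})\,R^2 - 2\,(xy - t^2 s_{xy})\,R + (x^2 - t^2 s_{xx}) \le 0 ,
\]
\[
R_{\pm} = \frac{(xy - t^2 s_{xy}) \pm
  \sqrt{(xy - t^2 s_{xy})^2 - (x^2 - t^2 s_{xx})(y^2 - t^2 s_{yy})}}
  {y^2 - t^2 s_{yy}} .
\]
```

The quantity under the square root is the variable f in the module, and the two roots are lowbd and upperbd. The set is the bounded interval between the roots when y^2 - t^2 s_yy is positive, that is, when the difference in mean survival time is significantly different from zero at the chosen level; otherwise the confidence set is unbounded.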

The methods demonstrated in this chapter are applicable to observational data when one is interested in estimating costs for a population of patients, or in comparing costs or cost-effectiveness between groups, and is not interested in causal inference. That is, use these methods when you are simply estimating naturalistic treatment differences without needing to adjust for selection bias between groups. Researchers interested in cost comparisons from observational data often need to incorporate covariate information because of baseline imbalance between treatment groups. The methods described in this chapter may also be helpful in these situations. For instance, if propensity score stratification is used to adjust for selection bias, the methods demonstrated here can be applied within each propensity score stratum, and a pooled estimator can then be obtained by averaging across strata. If one has a propensity score matched population, then the groups are balanced with respect to baseline covariates and the methods demonstrated may be applicable.
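The pooling step mentioned for propensity score stratification can be sketched as follows. This is an illustrative Python fragment, not taken from the chapter's SAS programs: the function name pool_across_strata is hypothetical, the weights are assumed to be proportional to stratum size and treated as fixed, the strata are assumed independent, and the numbers in the example are invented.

```python
import math


def pool_across_strata(estimates, std_errors, sizes):
    """Size-weighted average of stratum-specific estimates.

    estimates  : stratum-specific estimates (for example, cost differences)
    std_errors : their standard errors
    sizes      : number of subjects in each stratum
    Returns the pooled estimate and its standard error, assuming the
    stratum estimates are independent and the weights are fixed.
    """
    total = sum(sizes)
    weights = [m / total for m in sizes]
    pooled = sum(w * e for w, e in zip(weights, estimates))
    pooled_se = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, std_errors)))
    return pooled, pooled_se


# Invented example: two equally sized strata.
pooled, pooled_se = pool_across_strata([100.0, 200.0], [3.0, 4.0], [50, 50])
print(pooled, pooled_se)
```

With equal stratum sizes the pooled estimate is the plain average, 150, and its standard error is sqrt(1.5**2 + 2**2) = 2.5. Other weighting schemes, such as inverse-variance weights, are possible; the size-proportional choice here is only one reading of "averaging across strata".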

Researchers have also proposed regression models with direct covariate adjustment that are applicable to comparative observational research. Among them, Lin (2000a) considered a proportional means regression model; Jain and Strawderman (2002) proposed a model based on a flexible hazard function of the medical costs; and Lin (2000b) and Willan and colleagues (2005) proposed methods that directly model the mean, using the simple weighted estimator from inverse probability weighting.

 

Acknowledgments

The authors are very grateful to Dr. Alvin I. Mushlin and Dr. Arthur J. Moss for making the cost data of MADIT available to us.

 

References

Bang, H., and A. A. Tsiatis. 2000. “Estimating medical costs with censored data.” Biometrika 87: 329–343.

Briggs, A., T. Clark, J. Wolstenholme, and P. Clarke. 2003. “Missing…presumed at random: cost-analysis of incomplete data.” Health Economics 12(5): 377–392.

Cox, D. R. 1972. “Regression models and life-tables (with discussion).” Journal of the Royal Statistical Society B 34: 187–220.

Duan, N. 1983. “Smearing estimate: a nonparametric retransformation method.” Journal of the American Statistical Association 78: 605–610.

Efron, B., and R. Tibshirani. 1986. “Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy.” Statistical Science 1: 54–75.

Efron, B., and R. Tibshirani. 1993. An Introduction to the Bootstrap. New York: Chapman & Hall/CRC.

Fieller, E. C. 1954. “Some problems in interval estimation.” Journal of the Royal Statistical Society, Series B (Statistical Methodology) 16(2): 175–185.

Gold, M. R., J. E. Siegel, L. B. Russell, and M. C. Weinstein, eds. 1996. Cost-Effectiveness in Health and Medicine. New York: Oxford University Press.

Horvitz, D. G., and D. J. Thompson. 1952. “A generalization of sampling without replacement from a finite universe.” Journal of the American Statistical Association 47(260): 663–685.

Huang, Y., and T. A. Louis. 1998. “Nonparametric estimation of the joint distribution of survival time and mark variables.” Biometrika 85: 785–798.

Jain, A. K., and R. L. Strawderman. 2002. “Flexible hazard regression modeling for medical cost data.” Biostatistics 3: 101–118.

Kaplan, E. L., and P. Meier. 1958. “Nonparametric estimation from incomplete observations.” Journal of the American Statistical Association 53: 457–481.

Lin, D. Y. 2000a. “Proportional means regression for censored medical costs.” Biometrics 56: 775–778.

Lin, D. Y. 2000b. “Linear regression analysis of censored medical costs.” Biostatistics 1: 35–47.

Lin, D. Y., E. J. Feuer, R. Etzioni, and Y. Wax. 1997. “Estimating medical costs from incomplete follow-up data.” Biometrics 53: 419–434.

Little, R. J. A., and D. B. Rubin. 1987. Statistical Analysis with Missing Data. New York: John Wiley & Sons, Inc.

Moss, A. J., W. J. Hall, D. S. Cannom, J. P. Daubert, S. L. Higgins, H. Klein, J. H. Levine, S. Saksena, A. L. Waldo, D. Wilber, M. W. Brown, and M. Heo. 1996. “Improved survival with an implanted defibrillator in patients with coronary disease at high risk for ventricular arrhythmia.” The New England Journal of Medicine 335(26): 1933–1940.

Mushlin, A. I., W. J. Hall, J. Zwanziger, E. Gajary, M. Andrews, R. Marron, K. H. Zou, and A. J. Moss for the MADIT Investigators. 1998. “The cost-effectiveness of automatic implantable cardiac defibrillators: results from MADIT.” Circulation 97(21): 2129–2135.

Pfeifer, P. E., and H. Bang. 2005. “Non-parametric estimation of mean customer lifetime value.” Journal of Interactive Marketing 19(4): 48–66.

Ramsey, S., R. Willke, A. Briggs, R. Brown, M. Buxton, A. Chawla, J. Cook, H. Glick, B. Liljas, D. Petitti, and S. Reed. 2005. “Good research practices for cost-effectiveness analysis alongside clinical trials: the ISPOR RCT-CEA task force report.” Value in Health 8(5): 521–533.

Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons, Inc.

Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. London: Chapman & Hall/CRC.

van Buuren, S., H. C. Boshuizen, and D. L. Knook. 1999. “Multiple imputation of missing blood pressure covariates in survival analysis.” Statistics in Medicine 18(6): 681–694.

Wang, H., and H. Zhao. 2006. “Estimating incremental cost-effectiveness ratios and their confidence intervals with differentially censored data.” Biometrics 62: 570–575.

Willan, A. R., and D. Y. Lin. 2001. “Incremental net benefit in randomized clinical trials.” Statistics in Medicine 20: 1563–1574.

Willan, A. R., D. Y. Lin, and A. Manca. 2005. “Regression methods for cost-effectiveness analysis with censored data.” Statistics in Medicine 24: 131–145.

Willan, A. R., E. B. Chen, R. J. Cook, and D. Y. Lin. 2003. “Incremental net benefit in randomized clinical trials with quality-adjusted survival.” Statistics in Medicine 22: 353–362.

Xie, J., and C. Liu. 2005. “Adjusted Kaplan-Meier estimator and log-rank test with inverse probability of treatment weighting for survival data.” Statistics in Medicine 24: 3089–3110.

Young, T. A. 2005. “Estimating mean total costs in the presence of censoring: a comparative assessment of methods.” PharmacoEconomics 23(12): 1229–1242.

Zhao, H., and L. Tian. 2001. “On estimating medical cost and incremental cost-effectiveness ratios with censored data.” Biometrics 57: 1002–1008.

Zhao, H., H. Bang, H. Wang, and P. E. Pfeifer. 2007. “On the equivalence of some medical cost estimators with censored data.” Statistics in Medicine 26: 4520–4530.
