Chapter 5

Propensity Scoring with Missing Values

Yongming Qu
Ilya Lipkovich

Abstract

5.1 Introduction

5.2 Data Example

5.3 Using SAS for IPW Estimation with Missing Values

5.4 Sensitivity Analyses

5.5 Discussion

References

 

Abstract

Propensity scores have been used widely as a bias reduction method to estimate the treatment effect in nonrandomized studies. Because many covariates are generally included in the model for estimating propensity scores, the proportion of subjects with at least one missing covariate can be relatively large. In this chapter, we review existing methods for estimating propensity scores when missing values are present. The methods include a complete covariate (CC) method, an indicator variable (IND) method, various multiple imputation (MI) methods, a missingness pattern (MP) method, and a multiple imputation missingness pattern (MIMP) method. We provide SAS programs to implement all five methods for a data set from a clinical study in osteoporosis. We also provide a SAS macro for pooling small patterns of missing data to increase the stability and efficiency of MP and MIMP estimators. Because estimation may be sensitive to model misspecification for imputation and/or propensity score estimation as well as to the tuning parameters of associated algorithms, we also suggest various sensitivity analyses.

 

5.1 Introduction

Observational studies are becoming increasingly important because they allow us to observe treatment outcomes for large numbers of subjects in real-world treatment practice. Well-designed observational studies can provide valuable evidence that complements randomized controlled trials (Concato, Shah, and Horwitz, 2000). The propensity score, introduced by Rosenbaum and Rubin (1983), is a tool for estimating causal effects of alternative treatments in the presence of imbalance in baseline covariate (X) distributions between treatment groups due to lack of randomization.

Let T denote a binary treatment group indicator. Throughout this chapter, we restrict attention to only two treatment groups, with T=0 indicating the reference treatment group and T=1 indicating the investigational or active treatment group. The propensity score Pr(T=1|X), which is the probability of a subject being assigned to the active treatment given X, is essentially a mapping of multiple covariates onto a single, scalar-valued variable. Propensity scores are typically estimated using a multiple logistic regression model, as follows:

  log[Pr(T=1|X) / (1 − Pr(T=1|X))] = (1, Xᵀ)β.

It has been shown that using propensity scores results in substantial reduction of bias in estimating the treatment effect when treatment assignments are subject to selection bias (Rosenbaum and Rubin, 1983). Furthermore, the propensity score method provides advantages compared to simply incorporating all the covariates in the model for the treatment effect. For example, propensity score methods tend to be more robust than direct covariate adjustment with respect to model overparameterization (including too many covariates) and situations with different covariance matrices within treated and untreated groups (D’Agostino, 1998, page 2286).

Propensity score methods encourage use of many covariates because only the predicted probabilities of the alternative treatment choices end up being used, while the relative magnitudes, signs, and p-values of the fitted coefficients tend to be ignored. The generally larger number of missing values recorded in observational studies, compared with well-controlled trials, implies that a large proportion of subjects may have at least one missing covariate value. The simplest approach is to use only the observations without missing covariates; we call this the complete covariate (CC) method. Clearly, simply discarding patients with at least one missing covariate value is not a viable strategy. A simple and intuitive way of handling a missing value in a categorical variable is to treat it as an additional category. For a continuous covariate with missing values, we can impute the missing values with the marginal mean and create a new dummy variable to indicate the missingness. We call this the indicator variable (IND) method. However, this approach ignores any observed correlations among the covariates and, thus, is not efficient.

A more sophisticated method is to fit a separate regression for each distinct missingness pattern (MP) when estimating the propensity score (D’Agostino, 2001). Although this approach uses all nonmissing values for the subjects sharing the same MP, it increases the variability of the estimated propensity scores because only a subset of subjects is included in each propensity score model. A much more complicated and computationally intensive approach is to jointly model the propensity score and the missingness, and then use the EM/ECM algorithm (Ibrahim et al., 1999) or Gibbs sampling (D’Agostino et al., 2000) to estimate parameters and propensity scores. Because there is currently no SAS procedure to perform such analyses, the computational complexity makes this alternative less attractive in practice.

As a different approach, propensity scores in the presence of missing covariates can also be estimated using multiple imputation (MI) concepts proposed by Rubin (1978, 1987). Recently, Crowe, Lipkovich, and Wang (2009) studied the performance of different multiple imputation strategies in propensity score-based estimation through a simulation study. The central idea of MI is to fill in the missing values multiple times by sampling from the posterior predictive distribution of the missing values given the observed values, thereby creating a sequence of complete data sets. One advantage of this method is that each data set in the imputed sequence can be analyzed using standard complete-data methods. Another advantage is that MI procedures allow us to include ancillary variables that, although they do not directly affect propensity scores, may nonetheless contain useful information about the missing values in important variables.

Recently, Qu and Lipkovich (2009) developed a new method called multiple imputation missingness pattern (MIMP), which utilizes not only a multiple imputation method but also the pattern of missing data in the estimation of propensity scores. In this approach, missing data are imputed using a multiple imputation procedure. Then, the propensity scores are estimated from a logistic regression model including the covariates (with missing values imputed) and a factor (a set of indicator variables) indicating the missingness pattern for each observation. A simulation study showed that MIMP performs as well as MI and better than MP when the missingness mechanism is either “missing completely at random” or “missing at random,” and it performs better than MI when data are missing not at random (Qu and Lipkovich, 2009).

We will use a data example from a clinical trial in osteoporosis to show how to estimate propensity scores in the presence of missing data using the CC, IND, MI, MP, and MIMP methods. There are many ways to use estimated propensity scores to estimate treatment effects, including regression, stratification, inverse probability weighting, matching, and combinations of these (see Chapters 2–4 for detailed discussions of these methods). Note that each of the approaches for estimating propensity scores discussed in this chapter can be combined with any method of using the propensity scores to estimate treatment effects. In this chapter, for illustration, we use the inverse probability weighting (IPW) approach to estimate the treatment difference. It has been shown that standardizing the weights before performing the IPW estimation provides a more stable estimator of the treatment difference; therefore, we use the standardized IPW estimator throughout this chapter. Specifically, the treatment effect is estimated as the difference in mean outcome between the two treatment groups, θ̂_d = θ̂_1 − θ̂_0, where

  θ̂_0 = [Σ_{j=1}^n (1−T_j)/(1−p̂_j)]⁻¹ Σ_{j=1}^n Y_j(1−T_j)/(1−p̂_j),
  θ̂_1 = [Σ_{j=1}^n T_j/p̂_j]⁻¹ Σ_{j=1}^n Y_j T_j/p̂_j,

and the individual propensity scores p̂_j are estimated from the logistic regression model described previously (Lunceford and Davidian, 2004).
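For readers who want to check the arithmetic outside SAS, the standardized IPW estimators above can be sketched in a few lines of Python (our illustration, not part of the chapter's programs; the function and variable names are ours):

```python
def ipw_means(y, t, p):
    """Standardized IPW estimates of the mean outcome in each group.

    y: outcomes; t: 0/1 treatment indicators; p: estimated propensity
    scores Pr(T=1|X).  The weights are normalized to sum to 1 within
    each group, matching the standardized estimator in the text.
    """
    w1 = [ti / pi for ti, pi in zip(t, p)]              # treated weights T_j / p_j
    w0 = [(1 - ti) / (1 - pi) for ti, pi in zip(t, p)]  # untreated weights
    theta1 = sum(wi * yi for wi, yi in zip(w1, y)) / sum(w1)
    theta0 = sum(wi * yi for wi, yi in zip(w0, y)) / sum(w0)
    return theta0, theta1, theta1 - theta0
```

As a sanity check, with constant propensity scores of 0.5 the estimator reduces to the ordinary group means.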

 

5.2 Data Example

In this section, we apply the five methods of handling missing data when estimating propensity scores to data from an osteoporosis study, the Multiple Outcomes of Raloxifene Evaluation (MORE) trial (Delmas et al., 2002). In this study, 7,705 women with osteoporosis were randomly assigned to one of three treatment groups, with an intended ratio of 1:1:1, for placebo, raloxifene 60 mg/day, or raloxifene 120 mg/day and were followed for 4 years. After 3 years of follow-up, women were allowed to take other bone-active agents such as bisphosphonates. In this analysis, we compared the change in femoral neck bone mineral density (BMD) during the fourth year (Y) between women not taking bisphosphonates (referred to here as the untreated group, T=0) and women taking bisphosphonates (the treated group, T=1) among the 1,643 women originally randomized to placebo. Our primary analytic method for evaluating the treatment difference is an analysis of variance (ANOVA) model for the outcome (Y) against the assigned treatment group (T). However, because taking bisphosphonates was not a randomized factor, the response variable Y in the treated and untreated groups may be confounded due to selection bias. Therefore, we use a weighted ANOVA model with weights taken as the standardized inverse of the estimated probability of the treatment actually received. Because PROC MIXED in SAS automatically standardizes the weights before performing the analysis, there is no need to explicitly compute standardized weights in the SAS program.
The propensity scores were estimated using a logistic regression model with 16 covariates (Table 5.1): age at baseline (i.e., prior to randomization), body mass index (BMI) at baseline, family history of breast cancer, 5-year breast cancer risk score (Costantino et al., 1998) at baseline, whether a woman had had a hysterectomy at baseline, lumbar spine BMD at baseline, femoral neck BMD at baseline, change in lumbar spine BMD during the first 3 years, change in femoral neck BMD during the first 3 years, previous hormone replacement therapy status, prevalent vertebral fracture at baseline, new nonvertebral fracture in the first 3 years, new vertebral fracture in the first 3 years, weighted adverse event score during the first 3 years calculated as (1 × #mild AE + 2 × #moderate AE + 3 × #severe AE + 4 × #serious AE), smoking status at baseline, and baseline semi-quantitative vertebral fracture status (0=no fracture, 1=mild fracture, 2=moderate fracture, and 3=severe fracture).

A total of 1,643 women (1,512 with T=0 and 131 with T=1) were included in this sub-analysis and 603 women (36.7%) had at least one missing covariate. Table 5.1 displays the variable names and the numbers of missing values for women originally treated with the placebo. There were 14 distinct patterns of the missing data. The largest pattern included 1,040 subjects with no missing covariates, and each of the three smallest patterns consisted of only one subject.

Table 5.1 Variables in the Example Data Set

Variable Name Variable Description # Missing Observations
AGE Age at MORE baseline 0
BMIR Body mass index (BMI) at MORE baseline 1
GAILMORE 5-year breast cancer risk score 562
LSC Change in lumbar spine BMD during the first 3 years 6
LTOTBMDR Lumbar spine BMD at baseline 6
FNC Change in femoral neck BMD during the first 3 years 0
NECKBMDR Femoral neck BMD at baseline 0
AESCORE Weighted adverse event score during the first 3 years 0
SQ Baseline semi-quantitative vertebral fracture status (0=no fracture, 1=mild fracture, 2=moderate fracture, and 3=severe fracture) 9
FAMHXBCN Family history of breast cancer (Y/N) 41
KHYSYN Whether hysterectomized at MORE baseline (Y/N) 0
NVFX New nonvertebral fracture in the first 3 years (Y/N) 0
PREVHRT Previous hormone replacement therapy status (Y/N) 3
PREVVERT Prevalent vertebral fracture at MORE baseline (Y/N) 8
SMOKE Smoking status at MORE baseline (Y/N) 23
VFX New vertebral fracture in the first 3 years (Y/N) 8
FNBMD_C Change in the femoral neck BMD during the fourth year (Outcome variable) 0
BISMORE Taking bisphosphonates versus not taking (treatment group), (Y/N) 0

 

5.3 Using SAS for IPW Estimation with Missing Values

The SAS procedures used here (MEANS, GLM, and MIXED) automatically standardize the weights before performing the analysis. Therefore, although the SAS programs contain no explicit code for standardizing the weights (defined as the inverse of the estimated probability of the treatment actually received), standardized IPW estimation is in fact performed for all methods of handling missing data.
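The reason no explicit standardization is needed for the point estimates is that a weighted mean (and, more generally, a weighted least-squares fit) is unchanged when all weights are rescaled by a constant. A quick check in Python (ours, not part of the chapter's programs):

```python
def wmean(y, w):
    """Weighted mean of y with weights w."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

y = [0.2, -0.1, 0.4, 0.0]
w = [2.0, 1.0, 4.0, 3.0]           # raw inverse-probability weights
w_std = [wi / sum(w) for wi in w]  # standardized weights (sum to 1)

# Rescaling all weights by a constant leaves the weighted mean unchanged.
assert abs(wmean(y, w) - wmean(y, w_std)) < 1e-12
```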

 

5.3.1 Complete Covariates (CC) Analysis

The CC method simply omits observations with at least one missing covariate. Program 5.1 shows the SAS code to perform the CC analysis and Output from Program 5.1 shows the results.

Program 5.1 Complete Covariate (CC) Analysis

PROC FORMAT;
   VALUE FORMATYN 0 = 'NO'
                  1 = 'YES';
RUN;

*****************************************************;
* COMPLETE COVARIATE (CC) ANALYSIS;
*****************************************************;

PROC LOGISTIC DATA = ANALDATA NOPRINT;
    MODEL BISMORE =  &VARLIST;
    OUTPUT OUT=PRED PREDICTED=P;
RUN;

 

DATA PRED;
   SET PRED;
   IF BISMORE =  0 THEN PROB = P;
   IF BISMORE =  1 THEN PROB = 1-P;
   W = 1/PROB;
RUN;

 

PROC SORT DATA=PRED;
   BY BISMORE;
RUN;

 

TITLE 'ANALYSIS RESULTS USING THE COMPLETE COVARIATE  (CC) METHOD';
PROC MIXED DATA = PRED;
   CLASS BISMORE;
   MODEL FNBMD_C  = BISMORE;
   WEIGHT W;
   FORMAT BISMORE FORMATYN.;
   LSMEANS BISMORE/DIFF=ALL;
RUN;

 

Output from Program 5.1

ANALYSIS RESULTS USING THE COMPLETE COVARIATE (CC) METHOD

The Mixed Procedure

Least Squares Means

Effect    Bisphosphonates use in the 4th yr of MORE    Estimate    Standard Error    DF      t Value    Pr > |t|

BISMORE   No                                           -0.00225    0.001224          1038    -1.84      0.0668
BISMORE   Yes                                           0.006102   0.001180          1038     5.17      <.0001

 

Differences of Least Squares Means

 

Effect    Bisphosphonates use in the 4th yr of MORE    Estimate    Standard Error    DF      t Value    Pr > |t|

BISMORE   No vs. Yes                                   -0.00835    0.001700          1038    -4.91      <.0001

 

5.3.2 Indicator Variable (IND) Analysis

The IND approach is a straightforward method for categorical data where the missing value is treated as a special category. For continuous variables, we generally impute the missing values with the marginal means and create a dummy variable to indicate missingness. Program 5.2 shows the SAS code to perform the IND analysis and Output from Program 5.2 shows the results.

Program 5.2 Indicator Variable (IND) Analysis

**********************************************************************;
* INDICATOR VARIABLE (IND) ANALYSIS;
**********************************************************************;
PROC MEANS DATA = ANALDATA NOPRINT;
   VAR AGE BMIR GAILMORE LSC LTOTBMDR FNC NECKBMDR AESCORE SQ;
   OUTPUT OUT = ANALDATA_MEAN MEAN=AGE_M BMIR_M GAILMORE_M LSC_M  
LTOTBMDR_M FNC_M NECKBMDR_M AESCORE_M SQ_M;
   BY STUDY;
RUN;

DATA  ANALDATA_2;
   MERGE ANALDATA ANALDATA_MEAN;
   BY STUDY;
RUN;

DATA  ANALDATA_IV;
   SET ANALDATA_2;
   ARRAY X{9} AGE BMIR GAILMORE LSC LTOTBMDR FNC NECKBMDR AESCORE SQ;
   ARRAY M{9} M1-M9;
   ARRAY XM{9} AGE_M BMIR_M GAILMORE_M LSC_M LTOTBMDR_M FNC_M NECKBMDR_M   
               AESCORE_M SQ_M;
   DO I  =  1  TO  9;
       IF X{I}  =  .  THEN DO;
             M{I} = 1;
                X{I} = XM{I};
         END;
         ELSE M{I} = 0;
   END;

      ARRAY XC{7}  FAMHXBCN KHYSYN NVFX PREVHRT PREVVERT SMOKE VFX;
      DO I  =  1  TO  7;
         IF XC{I}  =  .  THEN XC{I}  =  -1;
      END;
RUN;

PROC  LOGISTIC  DATA  =  ANALDATA_IV NOPRINT;
     CLASS FAMHXBCN KHYSYN NVFX PREVHRT PREVVERT SMOKE VFX;
     MODEL BISMORE  =  &VARLIST M1-M9;
     OUTPUT OUT=PRED PREDICTED=P;
RUN;

DATA PRED;
   SET PRED;
   IF BISMORE  =  0  THEN PROB  =  P;
   IF BISMORE  =  1  THEN PROB  =  1-P;
   W  =  1/PROB;
RUN;

PROC SORT DATA=PRED;
   BY BISMORE;
RUN;

TITLE 'ANALYSIS RESULTS USING THE INDICATOR VARIABLE (IND) METHOD';
PROC MIXED DATA  =  PRED;
   CLASS BISMORE;
   MODEL FNBMD_C  =  BISMORE;
   WEIGHT W;
   LSMEANS BISMORE/DIFF=ALL;
   FORMAT BISMORE FORMATYN.;
RUN;

 

Output from Program 5.2

ANALYSIS RESULTS USING THE INDICATOR VARIABLE (IND) METHOD

The Mixed Procedure

Least Squares Means

 

Effect    Bisphosphonates use in the 4th yr of MORE    Estimate    Standard Error    DF      t Value    Pr > |t|

BISMORE   No                                           -0.00217    0.001031          1641    -2.11      0.0354
BISMORE   Yes                                           0.008131   0.000995          1641     8.17      <.0001

 

Differences of Least Squares Means

 

Effect    Bisphosphonates use in the 4th yr of MORE    Estimate    Standard Error    DF      t Value    Pr > |t|

BISMORE   No vs. Yes                                   -0.01030    0.001433          1641    -7.19      <.0001

 

5.3.3 Multiple Imputation (MI) Analysis

Applying MI for categorical predictors may be challenging because most commercially available statistical software for MI works with continuous data under an assumption of normality. Unfortunately, there is no current procedure in SAS to perform multiple imputation easily for categorical covariates. As an alternative, we could create dummy variables for categorical variables and perform MI treating the dummy variables as continuous variables, which is readily justified for binary variables. For example, let X=1 indicate a subject who smoked regularly at baseline, and X=0 indicate a subject who did not. The imputed value for X could then be, say, 0.4, indicating a subject who has a 40% chance of smoking. This might provide better information about a patient than rounding the probability down to 0 (no smoking). In this chapter, we impute missing binary predictors with probabilities without rounding down or up.
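The idea of leaving an imputed binary covariate as a probability rather than rounding it to 0 or 1 can be illustrated with a toy sketch (ours, not the chapter's method; a real MI run would draw from a posterior predictive distribution rather than plug in an observed proportion):

```python
from collections import defaultdict

def impute_binary_fractional(x, strata):
    """Replace missing (None) 0/1 values with the observed proportion of
    1s in the subject's stratum, keeping the fractional value instead of
    rounding it to 0 or 1."""
    tot = defaultdict(int)
    ones = defaultdict(int)
    for xi, s in zip(x, strata):
        if xi is not None:
            tot[s] += 1
            ones[s] += xi
    return [xi if xi is not None else ones[s] / tot[s]
            for xi, s in zip(x, strata)]
```

A subject with a missing smoking status in a stratum where 40% of observed subjects smoke would be assigned the value 0.4, reflecting the uncertainty rather than forcing a hard classification.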

Another difficulty in applying MI inference in the context of propensity-based estimation of treatment effects is that using combining rules (Rubin, 1987) may result in variance estimators that are not valid because the uncertainty in estimated weights has not been accounted for. Another reason why Rubin’s variance estimator may not be appropriate is that one’s imputation model (for baseline covariates) and analysis model (for treatment effects) are unlikely to be compatible. See Meng (1994), Wang and Robins (1998), and Robins and Wang (2000) for more information. One general recipe for improving the variance estimator is bootstrapping the entire estimation procedure. This can be easily done using the available SAS macro suite for bootstrapping and implementing various bootstrap-based confidence intervals (http://cuke.hort.ncsu.edu/cucurbit/wehner/software/pathsas/jackboot.txt). In Section 5.3.6, we outline how to use this macro.
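The structure of such a bootstrap is simple, whichever macro or language is used: resample subjects with replacement, rerun the entire pipeline (imputation, propensity model, IPW estimate) on each resample, and take percentiles of the resulting estimates. A minimal Python sketch of the percentile interval (ours; `estimate_effect` is a hypothetical stand-in for the full estimation procedure):

```python
import random

def bootstrap_ci(data, estimate_effect, n_boot=1000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for a treatment effect.

    `estimate_effect` must rerun the *whole* estimation pipeline on each
    resampled data set so that all sources of uncertainty (imputation,
    propensity model fitting, weighting) are reflected in the interval.
    """
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(
        estimate_effect([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot))
    return (stats[int(n_boot * alpha / 2)],
            stats[int(n_boot * (1 - alpha / 2)) - 1])
```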

Program 5.3 shows the SAS code to perform the MI analysis, and Output from Program 5.3 shows the results. The point estimator is simply the average of the estimates obtained from the data sets generated by multiple imputation.

Program 5.3 The Multiple Imputation (MI) Analysis

**********************************************************************;
* ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD;
**********************************************************************;
PROC MI DATA = ANALDATA ROUND=.001 NIMPUTE=5 SEED=6731205
OUT=IMPUTED_DATA NOPRINT;
  VAR &VARLIST FNBMD_C BISMORE;
RUN;

 

PROC LOGISTIC DATA = IMPUTED_DATA NOPRINT;
    MODEL BISMORE = &VARLIST;
    OUTPUT OUT=PRED PREDICTED=P;
    BY _IMPUTATION_;
RUN;

 

DATA PRED;
   SET PRED;
   IF BISMORE = 0 THEN PROB = P;
   IF BISMORE = 1 THEN PROB = 1-P;
   W = 1/PROB;
RUN;

 

PROC SORT DATA=PRED;
   BY _IMPUTATION_ BISMORE;
RUN;

 

ODS LISTING CLOSE;
ODS OUTPUT LSMEANS = LSM DIFFS=DIFFS;

PROC MIXED DATA = PRED;
   CLASS BISMORE;
   MODEL FNBMD_C = BISMORE;
   WEIGHT W;
   BY _IMPUTATION_;
   LSMEANS BISMORE/ DIFF=ALL;
RUN;
ODS LISTING;

 

TITLE 'ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD';
TITLE2 'POINT ESTIMATES BY TREATMENT GROUP';
PROC MEANS DATA=LSM;
   CLASS BISMORE;
   VAR ESTIMATE;
   FORMAT BISMORE FORMATYN.;
RUN;

 

TITLE2 'POINT ESTIMATE FOR THE TREATMENT DIFFERENCE';
PROC MEANS DATA = DIFFS;
   VAR ESTIMATE;
RUN;

 

Output from Program 5.3

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD
POINT ESTIMATES BY TREATMENT GROUP

The MEANS Procedure

Analysis Variable : Estimate

 

Bisphosphonates use in the 4th yr of MORE    N Obs    N    Mean          Std Dev        Minimum       Maximum
--------------------------------------------------------------------------------------------------------------
No                                           5        5    -0.0021784    0.000011247    -0.0021881    -0.0021606
Yes                                          5        5     0.0083319    0.000191791     0.0081308     0.0086499
--------------------------------------------------------------------------------------------------------------

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD
POINT ESTIMATE FOR THE TREATMENT DIFFERENCE

The MEANS Procedure

 

Analysis Variable : Estimate

 

N Mean Std Dev Minimum Maximum
----------------------------------------------------------------------------------
5 -0.0105103 0.000189049 -0.0108240 -0.0103159
----------------------------------------------------------------------------------

The following SAS code produces the MI estimators of treatment effect with naïve estimates of standard error using PROC MIANALYZE. Specifically, it proceeds in three steps: first one creates multiple data sets without missing baseline values by imputation using a multivariate normal model (PROC MI), then one computes IPW estimates of treatment effects for each completed data set (PROC MIXED), and finally one computes a single MI estimator of treatment effect and associated approximate 95% confidence interval (CI) using Rubin’s combining rules (PROC MIANALYZE).

Note that PROC MIANALYZE uses different formats for the input data sets PARMS and COVB in SAS versions 8 and 9. Our example assumes use of SAS 9. Remember also that the variance estimator from PROC MIANALYZE may not be valid for the reasons mentioned earlier. Program 5.4 shows the SAS code to summarize the estimates from the MI method using PROC MIANALYZE and Output from Program 5.4 shows the results.

Program 5.4 Summarize the Estimates from the MI Method Using PROC MIANALYZE

TITLE2 'ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE';
DATA FOR_MI_EST (KEEP = _IMPUTATION_ EFFECT ESTIMATE RENAME=(EFFECT=
                 PARAMETER));
   SET DIFFS;
   EFFECT = 'DIFF';
RUN;
 
DATA FOR_MI_COV (KEEP = _IMPUTATION_ ROWNAME BISMORE DIFF );
   SET DIFFS;
   DIFF = STDERR**2;
   ROWNAME = "DIFF";
RUN;

 

PROC MIANALYZE PARMS=FOR_MI_EST COVB=FOR_MI_COV;
   MODELEFFECTS DIFF;
   ODS OUTPUT PARAMETERESTIMATES=MI_EST
             VARIANCEINFO=MI_VAR;
RUN;

 

Output from Program 5.4

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD
ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE

The MIANALYZE Procedure

Model Information

 

PARMS Data Set WORK.FOR_MI_EST
COVB Data Set WORK.FOR_MI_COV
Number of Imputations 5

Multiple Imputation Variance Information

 

  -----------------Variance-----------------  
Parameter Between Within Total DF
 
DIFF 3.5739427E-8 0.000002072 0.000002115 9729.1

Multiple Imputation Variance Information

 

Parameter    Relative Increase in Variance    Fraction Missing Information    Relative Efficiency

DIFF         0.020696                         0.020478                        0.995921

Multiple Imputation Parameter Estimates

 

Parameter Estimate Std Error 95% Confidence Limits DF
 
DIFF -0.010510 0.001454 -0.01336 -0.00766 9729.1

Multiple Imputation Parameter Estimates

 

Parameter    Minimum      Maximum      Theta0    t for H0: Parameter=Theta0    Pr > |t|

DIFF         -0.010824    -0.010316    0         -7.23                         <.0001

The two data sets passed to PROC MIANALYZE contain estimated treatment effects (data set FOR_MI_EST) and their associated variance-covariance matrices (squared estimated standard errors in data set FOR_MI_COV) for each imputation. The output of PROC MIANALYZE contains information on partitioning of the total variance associated with estimated treatment effect into between- and within-imputation pieces, multiple imputation parameter estimates, approximate 95% confidence intervals based on t-distributions, and total variance. Note that for a finite number of imputations (M), Rubin’s variance estimator is inconsistent because it has a non-degenerate chi-squared limiting distribution. Therefore, a standard Wald-type inference based on the normal distribution is invalid and a t-distribution is used (Rubin, 1987). The impact of missing covariates on the final estimates of treatment effect can also be assessed by the fraction of missing information (about treatment effect), which is fairly low in this case (only 2%).
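The variance partitioning in the PROC MIANALYZE output follows Rubin's combining rules, which are easy to reproduce by hand. A small Python sketch (ours, not from the chapter) takes the M per-imputation estimates and variances and returns the pooled quantities:

```python
def rubin_combine(estimates, variances):
    """Combine M completed-data analyses by Rubin's rules.

    Returns the pooled estimate, within-imputation variance, between-
    imputation variance, total variance, and the degrees of freedom used
    for the t-based interval.
    """
    m = len(estimates)
    qbar = sum(estimates) / m
    wbar = sum(variances) / m                              # within-imputation
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    total = wbar + (1 + 1 / m) * b
    r = (1 + 1 / m) * b / wbar         # relative increase in variance
    df = (m - 1) * (1 + 1 / r) ** 2
    return qbar, wbar, b, total, df
```

The quantity `r` here corresponds to the "Relative Increase in Variance" column reported by PROC MIANALYZE.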

 

5.3.4 Missingness Pattern (MP) Analysis

The MP method essentially estimates propensity scores separately for each missingness pattern by including only variables without missing values within a missingness pattern. As a result, the independent variables included in the propensity score estimation models differ across missingness patterns. One challenge in implementing MP analysis is that usually there are some missingness patterns with a small number of observations, which renders estimation using the described model unstable. To address this problem, we developed an algorithm for pooling small missingness patterns according to their similarities to reach a prespecified minimum number of observations in each pattern (Qu and Lipkovich, 2009). After combining similar patterns, we impute all missing values within each pooled pattern with the marginal means (to avoid “holes”) and estimate the propensity scores using a logistic regression model. In this example, we pooled missingness patterns with a minimum of 100 observations for each pooled cell. Program 5.5 provides a macro to create these pooled patterns.

Program 5.5 Macro to Pool Small Missingness Pattern

***********************************************************************;
* Input parameters:
*    MSDATA = input data set;
*    OUTDATA = output data set;
*    VARLIST = a list of variables to be included in the propensity score estimation;
*    N_MP_MIN = minimum number of observations for each missingness pattern.
*             Patterns with fewer than N_MP_MIN observations will be pooled;
***************************************************************************;
%MACRO MP_ASSIGN(MSDATA = , OUTDATA =, VARLIST =, N_MP_MIN = 100);

     /* Determine how many variables to include in the propensity score estimation */
     %LET N = 1;
       %LET VARINT = ;
     %DO %UNTIL(%QSCAN(&VARLIST., &N. , %STR( )) EQ %STR( ));
           %LET VAR = %QSCAN(&VARLIST. , &N. , %STR( ));
            %LET VARINT = &VARINT  &VAR.*MP;
          %LET N = %EVAL(&N. + 1);
     %END;
       %LET KO = %EVAL(&N-1);
       %LET M_MISSING = %EVAL(&N-1);
       %PUT &VARINT;
       %PUT &KO;
       %PUT &M_MISSING;

  /* Create indicators for missing values and missingness patterns */
  DATA MS;
     SET &MSDATA;
       ARRAY MS{&M_MISSING} M1-M&M_MISSING.;
       ARRAY X{&M_MISSING} &VARLIST;
       MV = 0;
       DO I = 1 TO &M_MISSING;
          IF X{I} = . THEN MS{I} = 1;
             ELSE MS{I} = 0;
          MV = 2*MV + MS{I};
       END;
       MV = MV + 1;
       DROP I;
   RUN;

  /* Only keep one record for each missingness pattern */
  PROC SORT DATA = MS OUT = PATTERN NODUPKEY;
     BY MV;
  RUN;

  /* Calculate the number of observations in each missingness pattern */
  PROC FREQ DATA = MS NOPRINT;
     TABLES MV / OUT = M_MP(KEEP = MV COUNT);
  RUN;

  DATA PATTERN;
     MERGE PATTERN M_MP;
       BY MV;
  RUN;

  PROC SORT DATA = PATTERN;
     BY DESCENDING COUNT;
  RUN;

  /* Assign missingness pattern to new index from the largest to the smallest */
  DATA PATTERN;
     RETAIN M1-M&M_MISSING MV COUNT MV_S;
     SET PATTERN;
       KEEP M1-M&M_MISSING MV COUNT MV_S;
       MV_S = _N_;
  RUN;

PROC IML;
   USE PATTERN;
      READ ALL INTO A;
   CLOSE PATTERN;
   MS = A[, 1:&M_MISSING];
   MV = A[, 1+&M_MISSING];
   N_MP = A[, 2+&M_MISSING];
   MV_S = A[, 3+&M_MISSING];

   M_MP = NROW(MS);
   M = NCOL(MS);

  /* Calculate the distance between missingness patterns */
  DISTANCE = J(M_MP, M_MP, 0);
  DO I = 1 TO M_MP;
     DO J = 1 TO I-1;
          D = 0;
          DO L = 1 TO M;
             D = D + ( (MS[I,L]-MS[J,L])*(MS[I,L]-MS[J,L]) );
            END;

            DISTANCE[I,J] = D;
            DISTANCE[J,I] = D;
      END;
  END;

  I = 0;
  K_MV_POOL = 0;
  MV_POOL = J(M_MP, 1, 0);

    /*Pooling small missingness patterns according to their similarities to
     reach a prespecified minimum number of observations (&N_MP_MIN) in each
     pattern */

  DO WHILE( I < M_MP);
     I = I + 1;
     IF MV_POOL[I] = 0 THEN
      DO;
        K_MV_POOL = K_MV_POOL + 1;
            N_MP_POOL = N_MP[I];
        IF N_MP_POOL >= &N_MP_MIN THEN
         DO;
               MV_POOL[I] = K_MV_POOL;
          END;
         ELSE
         DO;
              IF I < M_MP THEN
                 DO;
                   A = DISTANCE[(I+1):M_MP, I];
                       B = MV[(I+1):M_MP];
                       C = N_MP[(I+1):M_MP];
                       D = MV_S[(I+1):M_MP];
                       E = MV_POOL[(I+1):M_MP];
                       TT = A || B || C || D || E;
                       CALL SORT( TT, {1 3});
                       J = 0;
                       DO WHILE( (N_MP_POOL < &N_MP_MIN) & (I+J < M_MP) );
                     J = J+1;
                         IF (TT[J,5] = 0) THEN
                         DO;
                            N_MP_POOL = N_MP_POOL + TT[J,3];
                              TT[J,5] = K_MV_POOL;
                         END;
                       END;
                 END;
                 IF ( N_MP_POOL >= &N_MP_MIN ) THEN
                 DO;
                   MV_POOL[I] = K_MV_POOL;
                   DO K = 1 TO J;
                      MV_POOL[TT[K,4]] = K_MV_POOL;
                   END;
                 END;
                 ELSE
                 DO J = I TO M_MP;
                   SGN_TMP = 0;
                   K = 1;
                   DO WHILE(SGN_TMP = 0 & K <= M_MP);
                      DO L = 1 TO M_MP;
                           IF (DISTANCE[J,L] = K) & (MV_POOL[J]=0) &
                                                  (MV_POOL[L]>0) THEN
                           DO;
                                MV_POOL[J] = MV_POOL[L];
                                   SGN_TMP = 1;
                           END;
                        END;
                        K = K + 1;
                   END;
                 END;

         END;
     END;
  END;

  MV_FINAL = MV || MV_POOL;

  VARNAMES={'MV' 'MV_POOL'};
  CREATE MVPOOL FROM MV_FINAL[COLNAME=VARNAMES];
  APPEND FROM MV_FINAL;
QUIT;

PROC SORT DATA = MVPOOL;
  BY MV;
RUN;

PROC SORT DATA = MS;
  BY MV;
RUN;

/* The variable MVPOOL in the &OUTDATA set indicates the pooled missingness pattern */

DATA &OUTDATA(RENAME=(MV=MP_ORIG MV_POOL=MP));
   MERGE MS MVPOOL;
   BY MV;
RUN;

%MEND MP_ASSIGN;
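The pooling logic of %MP_ASSIGN can be summarized compactly: visit patterns from the most to the least frequent, and let each pattern that falls below the threshold absorb its nearest unpooled neighbors (by Hamming distance between the 0/1 missingness indicators) until the pooled cell is large enough. A simplified Python sketch of this idea (ours; the macro's handling of leftover small pools is more elaborate):

```python
def pool_patterns(patterns, counts, min_n):
    """Greedily pool missingness patterns (0/1 tuples) so that each pooled
    cell contains at least min_n subjects.  Patterns are visited from the
    most to the least frequent; a small pattern absorbs its nearest
    unpooled neighbors by Hamming distance."""
    order = sorted(range(len(patterns)), key=lambda i: -counts[i])
    pool = [None] * len(patterns)
    next_id = 0
    for i in order:
        if pool[i] is not None:
            continue
        members, n = [i], counts[i]
        # candidate patterns not yet pooled, nearest (in Hamming distance) first
        neighbors = sorted(
            (p for p in order if p != i and pool[p] is None),
            key=lambda p: sum(a != b for a, b in zip(patterns[i], patterns[p])))
        for p in neighbors:
            if n >= min_n:
                break
            members.append(p)
            n += counts[p]
        for p in members:
            pool[p] = next_id
        next_id += 1
    return pool
```

For example, with four patterns of counts 100, 10, 10, and 5 and a threshold of 20, the large pattern stays on its own and the three small ones are pooled together.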

Program 5.6 shows the SAS code to perform the MP analysis after the macro %MP_ASSIGN is applied. Output from Program 5.6 shows the results.

Program 5.6 The Missingness Pattern (MP) Analysis

**********************************************************************;
* MISSINGNESS PATTERN (MP) METHOD;
**********************************************************************;

%MP_ASSIGN(MSDATA = ANALDATA, OUTDATA = ANALDATA2, VARLIST = &VARLIST, N_MP_MIN = 100);

PROC MEANS DATA = ANALDATA2 NOPRINT;
   VAR &VARLIST;
   OUTPUT OUT = MN MEAN = XM1-XM16;
   BY STUDY;
RUN;

DATA TEMP;
   MERGE ANALDATA2 MN;
   BY STUDY;
RUN;

DATA TEMP;
   SET TEMP;
   ARRAY X{16} &VARLIST;
   ARRAY XM{16} XM1-XM16;

   DO I = 1 TO 16;
      IF X{I} = . THEN X{I} = XM{I};
   END;
   DROP I;
RUN;

PROC SORT DATA = TEMP;
  BY MP;
RUN;

PROC LOGISTIC DATA = TEMP NOPRINT;
   CLASS MP;
   MODEL BISMORE = &VARLIST;
   OUTPUT OUT=PRED PREDICTED=P;
   BY MP;
RUN;

DATA PRED;
   SET PRED;
   IF BISMORE = 0 THEN PROB = P;
   IF BISMORE = 1 THEN PROB = 1-P;
   W = 1/PROB;
RUN;

PROC SORT DATA=PRED;
   BY BISMORE;
RUN;

TITLE 'ANALYSIS RESULTS USING THE MISSINGNESS PATTERN (MP) METHOD';
PROC MIXED DATA = PRED;
   CLASS BISMORE;
   MODEL FNBMD_C = BISMORE;
   WEIGHT W;
   LSMEANS BISMORE/DIFF=ALL;
   FORMAT BISMORE FORMATYN.;
RUN;

 

Output from Program 5.6

ANALYSIS RESULTS USING THE MISSINGNESS PATTERN (MP) METHOD

The Mixed Procedure

Least Squares Means

Effect    Bisphosphonates use in the 4th yr of MORE   Estimate   Standard Error   DF     t Value   Pr > |t|
BISMORE   No                                          -0.00222   0.001021         1641   -2.18     0.0295
BISMORE   Yes                                          0.007187  0.000987         1641    7.28     <.0001

 

Differences of Least Squares Means

 

Effect    Bisphosphonates use in the 4th yr of MORE   Estimate   Standard Error   DF     t Value   Pr > |t|
BISMORE   No vs. Yes                                  -0.00941   0.001420         1641   -6.63     <.0001
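The weight construction in Program 5.6 (the inverse of the estimated probability of the treatment actually received) is the standard IPW estimator. As a language-neutral illustration, here is a minimal Python sketch of the same weighted-means calculation on synthetic data (not the MORE study):

```python
import numpy as np

def ipw_means(y, t, ps):
    """Normalized IPW treatment-group means: weights are 1/ps for treated
    subjects and 1/(1 - ps) for untreated, mirroring the PROB and W
    variables in Program 5.6."""
    w = np.where(t == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    m1 = np.sum(w * (t == 1) * y) / np.sum(w * (t == 1))
    m0 = np.sum(w * (t == 0) * y) / np.sum(w * (t == 0))
    return m1, m0, m1 - m0
```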

 

5.3.5 Multiple Imputation Missingness Pattern (MIMP) Analysis

The MIMP method is essentially a combination of the MI and MP methods. First, missing values are multiply imputed. Then, for each imputed data set, propensity scores are estimated using the baseline covariates plus the categorical variable for the missingness pattern. As in the MP method, we pool small missingness patterns to reach a minimum of 100 observations per pattern. Once we create data sets with indicators for pooled patterns (as shown in Section 5.3.4), we simply apply PROC MI to this data set (as shown in Section 5.3.3). Program 5.7 shows the SAS code to perform the MIMP analysis, and Output from Program 5.7 shows the results.
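To make the idea concrete, here is a small Python sketch (hypothetical names) of how the design matrix for the MIMP propensity model is formed for one imputed data set: the imputed covariates are augmented with dummy variables for the pooled missingness pattern, and both enter the logistic model together, as in the MODEL statement of Program 5.7.

```python
import numpy as np

def mimp_design(x_imputed, pattern_ids):
    """Append missingness-pattern dummy columns (first pattern as the
    reference level) to an imputed covariate matrix, so the propensity
    model conditions on both the imputed covariates and the pattern."""
    pats = sorted(set(pattern_ids))
    dummies = np.column_stack([[1.0 if p == q else 0.0 for p in pattern_ids]
                               for q in pats[1:]])
    return np.hstack([x_imputed, dummies])
```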

Program 5.7 The Multiple Imputation Missingness Pattern (MIMP) Analysis

**********************************************************************;
* Multiple Imputation Missingness Pattern (MIMP) Method;
**********************************************************************;

PROC MI DATA = ANALDATA2 ROUND=.001 NIMPUTE=5 SEED=6731205 OUT=IMPUTED_DATA NOPRINT;
   VAR &VARLIST FNBMD_C BISMORE;
RUN;

 

PROC LOGISTIC DATA = IMPUTED_DATA NOPRINT;
   CLASS MP;
   MODEL BISMORE = &VARLIST MP;
   OUTPUT OUT=PRED PREDICTED=P;
   BY _IMPUTATION_;
RUN;

DATA PRED;
   SET PRED;
   IF BISMORE = 0 THEN PROB = P;
   IF BISMORE = 1 THEN PROB = 1-P;
   W = 1/PROB;
RUN;

PROC SORT DATA=PRED;
   BY _IMPUTATION_ BISMORE;
RUN;

ODS OUTPUT LSMEANS = LSM DIFFS=DIFFS;
PROC MIXED DATA = PRED;
   CLASS BISMORE;
   MODEL FNBMD_C = BISMORE;
   WEIGHT W;
   BY _IMPUTATION_;
   LSMEANS BISMORE/ DIFF=ALL;
RUN;

 

TITLE 'ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD';
TITLE2 'POINT ESTIMATES BY TREATMENT GROUP';
PROC MEANS DATA=LSM;
   CLASS BISMORE;
   VAR ESTIMATE;
   FORMAT BISMORE FORMATYN.;
RUN;

TITLE2 'POINT ESTIMATE FOR THE TREATMENT DIFFERENCE';
PROC MEANS DATA = DIFFS;
   VAR ESTIMATE;
RUN;

 

Output from Program 5.7

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD
POINT ESTIMATES BY TREATMENT GROUP

The MEANS Procedure

Analysis Variable : Estimate

 

Bisphosphonates use in the 4th yr of MORE   N Obs   N   Mean         Std Dev       Minimum      Maximum
No                                          5       5   -0.0021775   0.000011208   -0.002187    -0.0021596
Yes                                         5       5    0.0083296   0.000190528    0.0081254    0.0086443

 

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD
POINT ESTIMATE FOR THE TREATMENT DIFFERENCE

The MEANS Procedure

Analysis Variable : Estimate

 

N Mean Std Dev Minimum Maximum
5 -0.0105070 0.000187855 -0.0108178 -0.0103096

One can also use PROC MIANALYZE to obtain the point estimate for the treatment difference. However, the variance estimate from PROC MIANALYZE is not valid here because the combining rules do not account for the additional variability in the estimated weights. This approach is shown in Program 5.8, and Output from Program 5.8 shows the results.
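For reference, the combining rules that PROC MIANALYZE applies to the per-imputation estimates and variances (Rubin, 1987) can be sketched as:

```python
def rubin_pool(estimates, variances):
    """Rubin's rules: pooled point estimate, total variance, and degrees
    of freedom from m per-imputation estimates and variances."""
    m = len(estimates)
    qbar = sum(estimates) / m                               # pooled estimate
    ubar = sum(variances) / m                               # within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    t = ubar + (1 + 1 / m) * b                              # total variance
    r = (1 + 1 / m) * b / ubar                              # relative increase in variance
    df = (m - 1) * (1 + 1 / r) ** 2                         # Rubin's degrees of freedom
    return qbar, t, df
```

As noted above, the resulting total variance is not valid for the IPW treatment difference because it ignores the variability of the estimated weights; the bootstrap of Section 5.3.6 is used instead.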

Program 5.8 Summarize the Estimates from the MIMP Method Using PROC MIANALYZE

TITLE2 'ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE';
DATA FOR_MIMP_EST (KEEP = _IMPUTATION_ EFFECT ESTIMATE RENAME=(EFFECT= PARAMETER));
   SET DIFFS;
   EFFECT = 'DIFF';
RUN;

DATA FOR_MIMP_COV (KEEP = _IMPUTATION_ ROWNAME BISMORE DIFF );
   SET DIFFS;
   DIFF = STDERR**2;
   ROWNAME = "DIFF";
RUN;

PROC MIANALYZE PARMS=FOR_MIMP_EST COVB=FOR_MIMP_COV;
   MODELEFFECTS DIFF;
   ODS OUTPUT PARAMETERESTIMATES=MI_EST
             VARIANCEINFO=MI_VAR;
RUN;

 

Output from Program 5.8

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE

The MIANALYZE Procedure

Model Information

 

PARMS Data Set WORK.FOR_MIMP_EST
COVB Data Set WORK.FOR_MIMP_COV
Number of Imputations 5

Multiple Imputation Variance Information

 

Parameter   Between Variance   Within Variance   Total Variance   DF
DIFF        3.5289507E-8       0.000002072       0.000002115      9972.9

Multiple Imputation Variance Information

 

Parameter   Relative Increase in Variance   Fraction Missing Information   Relative Efficiency
DIFF        0.020436                        0.020224                       0.995972

Multiple Imputation Parameter Estimates

 

Parameter Estimate Std Error 95% Confidence Limits DF
 
DIFF -0.010507 0.001454 -0.01336 -0.00766 9972.9

Multiple Imputation Parameter Estimates

 

Parameter   Minimum     Maximum     Theta0   t for H0: Parameter=Theta0   Pr > |t|
DIFF        -0.010818   -0.010310   0        -7.23                        <.0001

 

5.3.6 Obtaining Bootstrap Confidence Intervals

Direct estimation of standard errors for the point estimators in all of these propensity score-based IPW methods is challenging because the additional variability in the estimated weights is difficult to account for. Therefore, we use bootstrap methods to estimate the standard error and the confidence interval for the point estimates. The SAS macros %BOOT and %BOOTCI (http://cuke.hort.ncsu.edu/cucurbit/wehner/software/pathsas/jackboot.txt) provide nonparametric estimates of standard errors and various bootstrap confidence intervals (including the popular bias-corrected and accelerated [BCa] method; Efron, 1987) for the IPW treatment difference. First, we need to create a user-defined macro that must be named %ANALYZE; it computes the point estimate of the IPW treatment difference and is repeatedly called from the %BOOT and %BOOTCI macros. Next, the %BOOT and %BOOTCI macros are called to compute bootstrap estimates. Note that the %ANALYZE macro must have two parameters: data to identify the input data set and out to name the output data set. Program 5.9 illustrates the %ANALYZE macro and the calls to the %BOOT and %BOOTCI macros.
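For readers outside SAS, the resampling loop inside %BOOT amounts to the following Python sketch of a nonparametric percentile bootstrap (the BCa interval produced by %BOOTCI additionally applies bias-correction and acceleration adjustments to these percentiles; the function names here are hypothetical):

```python
import random

def boot_ci(data, stat, n_boot=1000, alpha=0.05, seed=123):
    """Percentile bootstrap confidence interval: resample the data with
    replacement, recompute the statistic, and take empirical quantiles."""
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = reps[int(alpha / 2 * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

In the chapter's application, stat would be the IPW treatment difference computed by %ANALYZE on each resampled data set.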

Program 5.9 Estimation of the Variance and Confidence Interval Using the Bootstrap Method

/*****************************************************************************
     MIMP_ANALYSIS is a macro which calculates CC, MI, MP and MIMP estimates;
     Q_METH indicates the method
*****************************************************************************/

%INCLUDE 'BOOTS.SAS'; /* the file can be found at http://cuke.hort.ncsu.edu/cucurbit/wehner/software/pathsas/jackboot.txt */

%MACRO ANALYZE(DATA=BMDPS, OUT= );
   %MIMP_ANALYSIS(INDATA = &DATA, VARLIST = &VARLIST, Y = FNBMD_C, G = BISMORE, M_MP_MIN = 100);

  PROC SORT DATA = EST OUT = &OUT;
     BY Q_METH;
  RUN;
%MEND ANALYZE;
PROC PRINTTO LOG=NOLOG;
RUN;

TITLE 'BOOTSTRAP: NORMAL ("STANDARD") CONFIDENCE INTERVAL WITH BIAS CORRECTION';
TITLE2;
%BOOT(DATA=BMDPS, ALPHA=.05, SAMPLES=1000, RANDOM=123, ID=Q_METH);

TITLE 'BOOTSTRAP BCA';
%BOOTCI(BCA, ID=Q_METH);

PROC PRINTTO;
RUN;

 

Carpenter and Bithell (2000) used simulation to show that the BCa bootstrap method generally produces reliable results, even if the distribution of the test statistic is far from symmetric. In this example, we use BCa bootstrapping to estimate confidence intervals for the data set in Section 5.2 (see Table 5.2). The naive estimator (a direct unweighted estimator of treatment effect using a t-test) appears to overestimate the treatment difference. The CC method had the widest confidence interval and the MP method had wider confidence intervals than the MI and MIMP methods, whereas the last two methods produced similar results. It should be noted that for this example, the 95% confidence intervals produced by the CC and MP methods contained 0 while those for the MI and MIMP methods did not.

Table 5.2 The Mean (95% CI) of the Change in Femoral Neck BMD (g/cm2) During the Fourth Year of MORE Study for Women Assigned to Placebo (n = 1,643)*.

Method Untreated (n=1512) Treated (n=131) Treated vs. Untreated
NAIVE -0.002(-0.004,-0.001) 0.011( 0.006, 0.017) 0.013( 0.007, 0.019)
CC -0.002(-0.004,-0.001) 0.006(-0.002, 0.020) 0.008( 0.000, 0.022)
MP -0.002(-0.004,-0.001) 0.007(-0.003, 0.016) 0.009(-0.001, 0.018)
MI -0.002(-0.003,-0.001) 0.009( 0.002, 0.017) 0.011( 0.004, 0.019)
MIMP -0.002(-0.003,-0.001) 0.009( 0.002, 0.017) 0.011( 0.004, 0.019)

* Results are from Qu and Lipkovich (2009). The estimates for MI and MIMP were slightly different from the SAS output presented previously due to the choice of different seeds in PROC MI.

 

5.4 Sensitivity Analyses

Estimation of treatment effects using propensity scores may be sensitive to the misspecification of imputation and propensity score models, as well as to the tuning parameters of the associated algorithms. To address this issue, we discuss several strategies for performing sensitivity analysis in the context of propensity-based analyses with missing data.

 

5.4.1 Varying Analytic Methods

Examples of alternative analytic methods were illustrated in Section 5.3. Within each method, one can also vary parameters. For example, one can vary the minimum pattern size to test how sensitive the results are to the selection of the minimum size of the pooled pattern in MP and MIMP. We also recommend, as illustrated here, computing propensity-stratified estimates of treatment effects in addition to IPW estimates.

The SAS code here constructs five strata (subgroups, bins) based on fitted values of the propensity for treatment and computes the corresponding stratified estimators of treatment effects. This can be done in various ways. For example, one can compute treatment effects within each stratum and combine them by weighting inversely to the square of the estimated standard errors. In general, this approach is not recommended because it always down-weights results from strata with highly unequal observed treatment fractions and up-weights results from strata with nearly equal observed treatment fractions. Here we follow a stratified regression (SR) approach described in D’Agostino (1998) and Lunceford and Davidian (2004) that combines stratification on propensity scores with regression and computes the overall estimator of treatment effect as an average of regression-adjusted estimators within each stratum. Specifically, for stratum s, a regression-based estimator of treatment effect is computed as follows:

θ̂_d^(s) = n_s^{-1} Σ_{j ∈ Q_s} { m_j^(s)(1, X_j, α̂^(s)) − m_j^(s)(0, X_j, α̂^(s)) },

where Q_s is the set of indices of subjects with estimated propensity scores falling within stratum s; n_s is the size of stratum s; and m_j^(s)(1, X_j, α̂^(s)) denotes the predicted outcome for subject j from a regression of the form m(T, X, α) = E(Y|T, X) fitted to the data within stratum s, with predictors T and X and α̂^(s) the vector of estimated parameters. When a linear regression model is used, as in the following example, the estimated treatment effect within a stratum is simply the estimated regression coefficient for treatment, α̂_T^(s).

In this example, shown in Program 5.10, the estimated propensity scores are divided into K = 5 strata, and the overall SR estimator of treatment effect is computed as the average of the strata-specific estimators based on linear regression within each stratum, θ̂_d^SR = K^{-1} Σ_{s=1}^{K} θ̂_d^(s). The estimate of the standard error can be obtained using bootstrapping, as explained in Section 5.3.6.

This alternative stratified estimator may be more robust compared to the IPW estimator when the estimated propensity for the treatment actually received is close to 0 for a few patients, resulting in these few subjects having undue impact on the overall estimated treatment difference. Program 5.10 shows the SAS code to perform the SR method and Output from Program 5.10 shows the results. The SR estimate for the treatment effect is 0.0099.
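Under the same assumptions (quintile strata on the propensity score; a linear outcome model within each stratum), the SR estimator can be sketched in Python as follows (synthetic data, not the chapter's macros):

```python
import numpy as np

def sr_estimate(y, t, x, ps, k=5):
    """Stratified regression (SR) estimator: split subjects into k strata
    by propensity-score quantiles, fit a linear model y ~ t + x within
    each stratum, and average the treatment coefficients across strata."""
    edges = np.quantile(ps, np.linspace(0, 1, k + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, k - 1)
    effects = []
    for s in range(k):
        idx = strata == s
        design = np.column_stack([np.ones(idx.sum()), t[idx], x[idx]])
        coef, *_ = np.linalg.lstsq(design, y[idx], rcond=None)
        effects.append(coef[1])  # coefficient on the treatment indicator
    return float(np.mean(effects))
```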

Program 5.10 Illustration of the Sensitivity Analysis Using Propensity Score Stratified Regression Estimator

/* THIS STEP CAN BE REPLACED WITH ANY METHOD FOR OBTAINING PROPENSITY SCORES */
PROC LOGISTIC DATA = ANALDATA DESC NOPRINT;
    MODEL BISMORE = &VARLIST;
    OUTPUT OUT=PRED PREDICTED=P;
RUN;

/* DEFINE THE QUINTILES WITH PROC RANK (GROUP=5) */
PROC RANK DATA = PRED OUT=PRED GROUPS=5;
   VAR P;
   RANKS PS_STRATA;
RUN;

PROC SORT DATA=PRED; BY PS_STRATA; RUN;

ODS LISTING CLOSE;
ODS OUTPUT SOLUTIONF=SF;
PROC MIXED DATA = PRED (WHERE=(PS_STRATA NE .));
    BY PS_STRATA; 
    MODEL FNBMD_C = BISMORE &VARLIST/SOLUTION;
RUN;
ODS LISTING;

 

TITLE 'ESTIMATING TREATMENT EFFECT BY ADJUSTING FOR BASELINE COVARIATES WITHIN EACH STRATA';
PROC MEANS DATA = SF N MEAN;
   CLASS PS_STRATA;
   TYPES PS_STRATA ();
   VAR ESTIMATE;
   WHERE EFFECT = 'BISMORE';
RUN;

 

Output from Program 5.10

ESTIMATING TREATMENT EFFECT BY ADJUSTING FOR BASELINE COVARIATES WITHIN EACH STRATA

The MEANS Procedure

Analysis Variable : Estimate

N Obs   N   Mean
5       5   0.0099243

Analysis Variable : Estimate

 

Rank for Variable P   N Obs   N   Mean
0                     1       1   0.0144995
1                     1       1   0.0102909
2                     1       1   0.0023628
3                     1       1   0.0143711
4                     1       1   0.0080970

 

5.4.2 Using Different Imputation Strategies

For methods involving multiple imputation, try several imputation models using different sets of predictors for imputation. It has been suggested (Meng, 1994; Schafer, 1997) that it is better to include as many variables as possible in one’s imputation model, because omitting an important covariate results in an incorrect analysis while keeping an irrelevant predictor may only lower efficiency. Also, a recent simulation study comparing performance of different imputation strategies for propensity-based estimation (Crowe, Lipkovich, and Wang, 2009) showed that including treatment (T) and treatment outcome (Y) in imputation models improves estimation of the treatment effect. While it may appear counterintuitive to incorporate future outcomes in the imputation model for baseline covariates, it makes the imputation model more compatible with the analysis model (ANOVA for Y and T, in our case) and, therefore, helps reduce bias in estimating the treatment effect (Meng, 1994). Note that, in our examples of multiple imputation in Sections 5.3.3 and 5.3.5, we include both treatment and outcome variables in addition to baseline covariates in the VAR statement of PROC MI.

It is useful to evaluate the impact of missingness by examining the fraction of missing information (FMI), which is part of the output from PROC MIANALYZE. While the FMI may be substantial when estimating some of the coefficients of the propensity score model, the overall impact of missing covariates may be fairly low when estimating treatment effects (which, after all, is the ultimate goal of one’s analysis). This is the case because the variables that have the largest FMI may contribute little to the probability of treatment assignment. In our example, the FMI for estimating the treatment difference was only about 2% (see the output in Section 5.3.3); for some variables included in the logistic model, it was fairly large, such as for FAMHXBCN (30%) and GAILMORE (31%). However, these variables were not significant predictors of treatment assignment, given the other covariates, and, therefore, their missing values did not contribute much to the uncertainty associated with estimating treatment effects. On the other hand, the FMIs for the most significant predictors of treatment assignment, LTOTBMDR, FNC, and NECKBMDR, were estimated as <0.1%, 0.1%, and 0.1%, respectively. Program 5.11 shows the SAS code that uses PROC MIANALYZE to obtain the FMI for the logistic regression coefficients. Output from Program 5.11 shows the results.
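The quantities in the Multiple Imputation Variance Information tables can be reproduced from the between- and within-imputation variances. A Python sketch of the standard formulas (Rubin, 1987), illustrated with the DIFF variances shown in Output from Program 5.8 (Between = 3.5289507E-8, Within = 0.000002072, m = 5 imputations):

```python
def mi_variance_info(b, ubar, m):
    """Relative increase in variance (r), fraction of missing information
    (FMI), and relative efficiency, computed from the between- (b) and
    within- (ubar) imputation variances of a pooled estimate."""
    r = (1 + 1 / m) * b / ubar            # relative increase in variance
    df = (m - 1) * (1 + 1 / r) ** 2       # Rubin's degrees of freedom
    fmi = (r + 2 / (df + 3)) / (r + 1)    # fraction of missing information
    re = 1 / (1 + fmi / m)                # relative efficiency
    return r, fmi, re
```

For the treatment difference this gives r of about 0.0204, FMI of about 0.0202, and relative efficiency of about 0.996, matching Output from Program 5.8.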

Program 5.11 Calculation of the Fraction Missing Information

/**** ASSESSING FRACTION MISSING INFORMATION FOR COEFFICIENTS IN PS MODEL *****/
PROC MI DATA = ANALDATA ROUND=.001 NIMPUTE=5 SEED=6731205 OUT=IMPUTED_DATA NOPRINT;
   VAR &VARLIST FNBMD_C BISMORE;
RUN;

ODS LISTING;
ODS OUTPUT PARAMETERESTIMATES = _EST
           COVB=_COV;
PROC LOGISTIC DATA = IMPUTED_DATA;
   MODEL BISMORE = &VARLIST/COVB;
   OUTPUT OUT=PRED PREDICTED=P;
BY _IMPUTATION_;
RUN;
TITLE 'ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD';
DATA _EST;
  SET _EST;
  RENAME VARIABLE=PARAMETER;
  KEEP _IMPUTATION_ VARIABLE ESTIMATE;
RUN;

 

PROC MIANALYZE PARMS= _EST COVB=_COV;
  MODELEFFECTS INTERCEPT &VARLIST;
  ODS OUTPUT PARAMETERESTIMATES=MI_EST
             VARIANCEINFO=MI_VAR;
RUN;

 

Output from Program 5.11

Multiple Imputation Variance Information

Parameter   Relative Increase in Variance   Fraction Missing Information   Relative Efficiency
 
intercept 0.003791 0.003784 0.999244
AGE 0.041368 0.040481 0.991969
BMIR 0.003756 0.003749 0.999251
FAMHXBCN 0.380552 0.301688 0.943096
GAILMORE 0.395063 0.310298 0.941567
KHYSYN 0.000864 0.000863 0.999827
LSC 0.008437 0.008401 0.998323
LTOTBMDR 0.000264 0.000264 0.999947
FNC 0.001002 0.001001 0.999800
NECKBMDR 0.001434 0.001433 0.999713
NVFX 0.001993 0.001991 0.999602
PREVHRT 0.003607 0.003601 0.999280
PREVVERT 0.008648 0.008610 0.998281
AESCORE 0.007936 0.007904 0.998422
SMOKE 0.073996 0.071100 0.985979
SQ 0.010930 0.010870 0.997831
VFX 0.002259 0.002256 0.999549

Multiple Imputation Parameter Estimates

 

Parameter   Estimate    Std Error   95% Confidence Limits    DF
intercept   -3.910976   1.415616    -6.68554   -1.13641      280408
AGE          0.032916   0.016281     0.00099    0.06484      2534.8
BMIR         0.011069   0.027360    -0.04256    0.06469      285733
FAMHXBCN    -0.131857   0.410049    -0.95444    0.69073      52.643
GAILMORE    -0.130776   0.168963    -0.47017    0.20862      49.879
KHYSYN      -0.042621   0.230992    -0.49536    0.41011      5.37E6
LSC          5.866674   1.922807     2.09796    9.63539      57142
LTOTBMDR     2.713227   0.893752     0.96150    4.46495      5.73E7
FNC          9.155874   2.772187     3.72249   14.58926      3.99E6
NECKBMDR     4.221311   1.430553     1.41748    7.02515      1.95E6
NVFX        -0.255861   0.294621    -0.83331    0.32159      1.01E6
PREVHRT     -0.245808   0.210377    -0.65814    0.16652      309625
PREVVERT    -0.338755   0.372863    -1.06957    0.39206      54417
AESCORE     -0.015502   0.007301    -0.02981   -0.00119      64523
SMOKE        0.562806   0.311964    -0.04951    1.17512      842.65
SQ          -0.134405   0.184790    -0.49660    0.22779      34218
VFX         -0.600087   0.282802    -1.15437   -0.04581      787654

 

5.4.3 Handling Extreme Weights

One major drawback of the IPW estimator is that it can be unstable when some weights are extremely large or small, that is, when the estimated propensity scores are close to 0 or 1. One immediate remedy is to bound the weights: propensity scores below δ are set to δ, and propensity scores above 1 − δ are set to 1 − δ, where δ is a small empirical constant. In our experience, δ = 0.001 worked well in simulations. For the data example presented in Section 5.3, there were no extremely small or large weights, so we did not apply this technique. Program 5.12 shows how to set boundaries for the weights to produce stable estimators.

Program 5.12 Setting Boundaries for the Weights Derived from Propensity Scores

%LET DELTA = 0.001;
DATA PRED;
    SET PRED;
    IF . < P < &DELTA. THEN P = &DELTA.;
    IF P > 1-&DELTA. THEN P = 1-&DELTA.;
RUN;

 

5.5 Discussion

We have presented several methods of estimating propensity scores in the presence of missing values for some covariates. The CC method simply includes only subjects without any missing covariates. The IND method creates dummy variables to indicate the missingness. The MP method estimates the propensity scores within each missingness pattern. For some missingness patterns with a small number of observations, we proposed a method to pool small missingness patterns according to their similarities. The MI method estimates propensity scores with missing values imputed by multiple imputations. The MIMP method essentially combines the MI and MP methods by first imputing missing values with multiple imputations and then including information on the missingness pattern as a covariate in estimating propensity scores.

These approaches to estimating propensity scores can be combined with any propensity score-based method to estimate the treatment difference in the presence of treatment selection bias. In this chapter, all the analytic methods presented were based on the assumption of two treatment groups and used standardized inverse-probability weighted ANOVA with estimated propensities from a logistic regression. When more than two treatment arms are present, propensities are vectors of probabilities that sum to 1. The IPW approach is thus easily extended, with probabilities estimated using a multinomial logistic model (e.g., in SAS using PROC LOGISTIC with GLOGIT in its MODEL statement). Sometimes multiple treatment groups can be naturally ordered (e.g., when representing increasing levels of exposure/dose), and propensity scores can be estimated using ordinal logistic regression as shown in Leon and Hedeker (2005).
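For the multi-arm extension, each subject's weight is the inverse of the estimated probability of the arm actually received. A minimal Python sketch (the probability matrix would come from a multinomial logistic model; the numbers here are illustrative):

```python
import numpy as np

def multiarm_ipw_weights(probs, assigned):
    """IPW weights for K >= 2 treatment arms: probs is an n x K matrix of
    estimated propensities (rows sum to 1) and assigned gives the index
    of the arm each subject actually received."""
    return 1.0 / probs[np.arange(len(assigned)), assigned]
```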

If the proportion of subjects with at least one missing value is low (e.g., <10%), one would intuitively expect all approaches to handling missing values to produce similar results. In that case, one can choose the most convenient method, such as the complete covariate analysis.

 

References

Carpenter, J. and J. Bithell. 2000. “Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians.” Statistics in Medicine 19: 1141–1164.

Concato, J., N. Shah, and R. I. Horwitz. 2000. “Randomized, controlled trials, observational studies, and the hierarchy of research designs.” The New England Journal of Medicine 342: 1887–1892.

Costantino, J., M. Gail, S. Pee, et al. 1999. “Validation studies for models projecting the risk of invasive and total breast cancer incidence.” Journal of the National Cancer Institute 91: 1541–1548.

Crowe, B., I. Lipkovich, and O. Wang. 2009. “Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome.” Pharmaceutical Statistics. In Press. DOI: 10.1002/pst.389.

D’Agostino, R. B. 1998. “Tutorial in biostatistics: propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group.” Statistics in Medicine 17(19): 2265–2281.

D’Agostino, Jr., R. B., and D. B. Rubin. 2000. “Estimating and using propensity scores with partially missing data.” Journal of the American Statistical Association 95: 749–759.

D’Agostino, R., W. Lang, M. Walkup, and T. Morgan. 2001. “Examining the impact of missing data on propensity score estimation in determining the effectiveness of self-monitoring of blood glucose (SMBG).” Health Services & Outcomes Research Methodology 2: 291–315.

Delmas, P. D., K. E. Ensrud, J. D. Adachi, et al. 2002. “Efficacy of raloxifene on vertebral fracture risk reduction in postmenopausal women with osteoporosis: four-year results from a randomized clinical trial.” The Journal of Clinical Endocrinology & Metabolism 87(8): 3609–3617.

Efron, B. 1987. “Better bootstrap confidence intervals.” Journal of the American Statistical Association 82: 171–185.

Horvitz, D. G., and D. J. Thompson. 1952. “A generalization of sampling without replacement from a finite universe.” Journal of the American Statistical Association 47(260): 663–685.

Ibrahim, J., S. Lipitz, and M. Chen. 1999. “Missing covariates in generalized linear models when the missing data mechanism is non-ignorable.” Journal of the Royal Statistical Society, Series B (Statistical Methodology) 61: 173–190.

Leon, A. C., and D. Hedeker. 2005. “A mixed-effects quintile-stratified propensity adjustment for effectiveness analyses of ordered categorical doses.” Statistics in Medicine 24: 647–658.

Little, R. J., and D. B. Rubin. 2002. Statistical Analysis with Missing Data. 2d ed. New York: Wiley-Interscience.

Lunceford, J. K., and M. Davidian. 2004. “Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study.” Statistics in Medicine 23: 2937–2960.

Meng, X. L. 1994. “Multiple-imputation inferences with uncongenial sources of input.” Statistical Science 9: 538–558; discussion 558–573.

Qu, Y., and I. Lipkovich. 2009. “Propensity score estimation with missing values using a multiple imputation missingness pattern (MIMP) approach.” Statistics in Medicine. 28:1402–1414.

Robins, J. M., A. Rotnitzky, and L. P. Zhao. 1994. “Estimation of regression coefficients when some regressors are not always observed.” Journal of the American Statistical Association 89: 846–866.

Robins, J. M., and N. Wang. 2000. “Inference for imputation estimators.” Biometrika 87: 113–124.

Rosenbaum, P. R. 1987. “Model-based direct adjustment.” Journal of the American Statistical Association 82: 387–394.

Rosenbaum, P. R., and D. B. Rubin. 1983. “The central role of the propensity score in observational studies for causal effects.” Biometrika 70: 41–55.

Rosenbaum, P. R., and D. B. Rubin. 1984. “Reducing bias in observational studies using subclassification on the propensity score.” Journal of the American Statistical Association 79: 516–524.

Rubin, D. B. 1978. “Multiple imputations in sample surveys: a phenomenological Bayesian approach to nonresponse.” In Proceedings of the Survey Research Methods Section, 20–28. Alexandria, VA: American Statistical Association.

Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons, Inc.

Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. London: Chapman and Hall/CRC.

Wang, N., and J. M. Robins. 1998. “Large-sample theory for parametric multiple imputation procedures.” Biometrika 85: 935–948.

Xie, J., and C. Liu. 2005. “Adjusted Kaplan-Meier estimator and log-rank test with inverse probability of treatment weighting for survival data.” Statistics in Medicine 24: 3089–3110.
