12.5. MAR Methods

In this section, we give an overview of the most commonly used methods that are valid under the MAR assumption. See Dmitrienko et al. (2005, Chapter 5) for details.

12.5.1. Direct Likelihood Analyses

As stated earlier, likelihood-based inference is valid whenever the mechanism is MAR, provided the technical condition holds that the parameters describing the nonresponse mechanism are distinct from the measurement model parameters (Little and Rubin, 1987). The log-likelihood then partitions into two functionally independent components, one describing the measurement model and the other the missingness model. This implies that likelihood-based software with facilities to handle incomplete records provides valid inferences. Such a likelihood-based ignorable analysis is also termed a likelihood-based MAR analysis or, as we will call it further on, a direct likelihood analysis.
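To make this partitioning explicit, we can use notation of our own choosing (not introduced in the chapter): let Y_o and Y_m denote the observed and missing parts of the outcome vector, R the vector of response indicators, \theta the measurement-model parameters, and \psi the missingness-model parameters. Under MAR, the density of the observed data factors as

   f(y_o, r \mid \theta, \psi) = \int f(y_o, y_m \mid \theta)\, f(r \mid y_o, y_m, \psi)\, dy_m = f(y_o \mid \theta)\, f(r \mid y_o, \psi),

so that, when \theta and \psi are distinct, inference about \theta can be based on the first factor alone and the missingness model can be ignored.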

Turning to SAS software, this implies that likelihood-based longitudinal analyses, conducted by means of the MIXED, NLMIXED, and GLIMMIX procedures, are valid, given that the MAR assumption holds or is considered a reasonable approximation. This is useful even when inferences are conducted at the end of the planned measurement sequence. An important example is the assessment of treatment effect at the end of a clinical trial. In practice, such an analysis can be implemented by specifying a sufficiently general treatment-by-time mean profile model, supplemented with an unstructured variance-covariance structure. Appropriate use of the CONTRAST or ESTIMATE statement then leads to the required test. Technically, the score equations take the expected value of the incomplete measurements, given the observed ones, into account (Beunckens, Molenberghs, and Kenward, 2005). This implies that all information on a subject is used to assess treatment effect. If the treatment assignment is used as randomized, the method is fully consistent with the intention-to-treat principle.
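As an illustration, an analysis of this kind might be coded along the following lines; the data set name, variable names, and the assumption of two treatment arms measured at three post-baseline visits are ours, chosen purely for illustration, and would need to be adapted to the trial at hand.

   proc mixed data=trial method=reml;
      class subject trt visit;
      model y = trt visit trt*visit / solution ddfm=kenwardroger;
      repeated visit / subject=subject type=un;   /* unstructured covariance */
      /* treatment difference at the last (third) visit; the coefficient
         ordering assumes two treatment levels and three visit levels */
      estimate 'trt effect, visit 3' trt 1 -1  trt*visit 0 0 1 0 0 -1;
   run;

Subjects with incomplete outcome sequences contribute all of their observed measurements to this analysis; no records need to be deleted or imputed.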

12.5.2. Multiple Imputation

Apart from direct likelihood, other methods that are valid under MAR include multiple imputation (Rubin, 1978, 1987; Rubin and Schenker, 1986), discussed here, and the Expectation-Maximization algorithm, discussed in the next subsection.

The key idea of the multiple imputation (MI) procedure is to replace each missing value with a set of M plausible values. More precisely, such values are drawn from the conditional distribution of the unobserved values, given the observed ones. The M imputed data sets are then analyzed using standard complete-data methods and software. Finally, the M inferences thus obtained must be combined into a single one by means of the method proposed by Rubin (1978). Ample detail can be found in Dmitrienko et al. (2005, Chapter 5).
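In the simplest case of a scalar parameter, Rubin's combination rules take a well-known closed form (notation ours): if \hat{Q}_m denotes the estimate from the m-th completed data set and W_m its estimated variance, then

   \bar{Q} = \frac{1}{M}\sum_{m=1}^{M} \hat{Q}_m, \qquad
   \bar{W} = \frac{1}{M}\sum_{m=1}^{M} W_m, \qquad
   B = \frac{1}{M-1}\sum_{m=1}^{M} \bigl(\hat{Q}_m - \bar{Q}\bigr)^2,

and the total variance attached to the pooled estimate \bar{Q} is T = \bar{W} + (1 + 1/M)\,B, combining within-imputation and between-imputation variability.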

Given the availability and ease of direct likelihood, it is worth asking when multiple imputation should be used. First, the method can be used to conduct checks on direct likelihood. Second, MI is really useful when there are incomplete covariates along with missing outcomes. Third, when several users want to conduct a variety of analyses on the same incomplete set of data, it is sensible to provide all of them with the same multiply imputed data sets. Finally, multiple imputation can be used within the context of sensitivity analysis.

The SAS MI procedure creates multiply imputed data sets for incomplete p-dimensional multivariate data, using methods that incorporate appropriate variability across the M imputations. Once the M completed data sets have been analyzed using standard procedures, the MIANALYZE procedure can be used to combine the M sets of results into valid statistical inferences about the parameters of interest. See also Dmitrienko et al. (2005, Chapter 5).
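A minimal sketch of this three-step workflow is shown below. The data set and variable names are hypothetical, the treatment indicator is assumed to be coded numerically (e.g., 0/1), and the analysis step is deliberately simplified to a regression on the last visit; in practice, the analysis step could equally well be the mixed model of the previous subsection.

   /* Step 1: create M = 5 imputed data sets */
   proc mi data=trial out=trial_mi nimpute=5 seed=20230;
      var trt basval y1 y2 y3;
   run;

   /* Step 2: analyze each completed data set with a standard procedure */
   proc reg data=trial_mi outest=est covout noprint;
      model y3 = basval trt;
      by _Imputation_;
   run;

   /* Step 3: combine the M sets of results using Rubin's rules */
   proc mianalyze data=est;
      modeleffects basval trt;
   run;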

12.5.3. The EM Algorithm

The Expectation-Maximization (EM) algorithm has been described in detail in Dmitrienko et al. (2005, Chapter 5). It is an alternative to direct likelihood and MI, in the sense that, in its basic form, it is valid under the same conditions. Dempster, Laird, and Rubin (1977) provided a very general description of the algorithm, showing its use in broad classes of missing data, latent variable, latent class, random effects, and other data augmentation settings.

Within each iteration, there are two steps. In the E step, the expectation of the complete data log-likelihood is calculated. In exponential family settings, this is particularly easy since it reduces to the calculation of complete-data sufficient statistics. In the M step, the so-obtained function is maximized. Little and Rubin (1987), Schafer (1997), and McLachlan and Krishnan (1997) provide detailed descriptions and applications of the EM algorithm.
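In notation of our own choosing, with \theta^{(t)} the current parameter value, Y_o the observed data, and Y_m the missing data, one iteration can be written as

   E step:  Q(\theta \mid \theta^{(t)}) = E\bigl[\ell(\theta; Y_o, Y_m) \mid Y_o, \theta^{(t)}\bigr],
   M step:  \theta^{(t+1)} = \arg\max_{\theta}\, Q(\theta \mid \theta^{(t)}),

and each iteration is guaranteed not to decrease the observed-data log-likelihood.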

While the algorithm is elegant, its basic version does not provide precision estimates; a number of proposals to rectify this have been made over the years and are summarized in McLachlan and Krishnan (1997). A number of applications of the EM algorithm are possible with the SAS MI procedure (Dmitrienko et al., 2005, Chapter 5).
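For example, under a multivariate normal model the EM statement in PROC MI can be used to obtain maximum likelihood estimates of the mean vector and covariance matrix from the incomplete data without producing any imputations; the data set and variable names below are ours, for illustration only.

   /* EM estimates of means and covariances under multivariate normality;
      nimpute=0 requests the EM results only, no imputed data sets */
   proc mi data=trial nimpute=0;
      em itprint outem=emest;
      var basval y1 y2 y3;
   run;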
