12.1. Introduction

In a longitudinal clinical trial, each unit is measured on several occasions. It is not unusual in practice for some sequences of measurements to terminate early for reasons outside the control of the investigator. Any unit so affected is called a dropout. It might therefore be necessary to accommodate the dropout in the modeling process.

When referring to the missing-value (or non-response) process, we will use the terminology of Little and Rubin (1987, Chapter 6). A non-response process is said to be missing completely at random (MCAR) if the missingness is independent of both unobserved and observed data. A process is said to be missing at random (MAR) if, conditional on the observed data, the missingness is independent of the unobserved measurements. A process that is neither MCAR nor MAR is termed missing not at random, or non-random (MNAR). In the context of likelihood inference, and when the parameters describing the measurement process are functionally independent of the parameters describing the missingness process, MCAR and MAR are ignorable, while a non-random process is non-ignorable.
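
The three mechanisms can be written compactly. The notation below is an assumption made for this sketch and is not taken from the text: y^o and y^m denote the observed and missing parts of a unit's measurements, r the vector of missingness indicators, theta the measurement parameters, and psi the missingness parameters. The second display shows why, under MAR and with functionally independent parameters, likelihood inference about theta can ignore the missingness model.

```latex
% Assumed notation: y^o observed data, y^m missing data, r missingness indicators,
% \theta measurement parameters, \psi missingness parameters.
\begin{align*}
\text{MCAR:}\quad & f(r \mid y^{o}, y^{m}, \psi) = f(r \mid \psi), \\
\text{MAR:}\quad  & f(r \mid y^{o}, y^{m}, \psi) = f(r \mid y^{o}, \psi), \\
\text{MNAR:}\quad & f(r \mid y^{o}, y^{m}, \psi) \text{ depends on } y^{m}.
\end{align*}

% Observed-data likelihood; the second equality uses the MAR condition.
\begin{align*}
f(y^{o}, r \mid \theta, \psi)
  &= \int f(y^{o}, y^{m} \mid \theta)\, f(r \mid y^{o}, y^{m}, \psi)\, dy^{m} \\
  &= f(y^{o} \mid \theta)\, f(r \mid y^{o}, \psi).
\end{align*}
```

Because the factor involving psi does not depend on theta, maximizing the observed-data likelihood over theta only requires f(y^o | theta).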

Many methods are formulated as selection models (Little and Rubin, 1987) as opposed to pattern-mixture models (PMM) (Little 1993, 1994a). A selection model factors the joint distribution of the measurement and response mechanisms into the marginal measurement distribution and the response distribution, conditional on the measurements. This is intuitively appealing since the marginal measurement distribution would be of interest also with complete data. Little and Rubin's taxonomy is most easily developed in the selection setting. Parameterizing and making inference about the effect of treatment and its evolution over time is straightforward in the selection model context.
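
The two factorizations can be contrasted as follows. The symbols are assumptions made for illustration: y is the full vector of measurements for a unit, r the response (missingness) indicators, and theta and psi generic parameter vectors whose interpretation differs between the two frameworks.

```latex
% Assumed notation: y full measurement vector, r response indicators,
% \theta and \psi generic parameter vectors.
\begin{align*}
\text{Selection model:}\quad
  & f(y, r \mid \theta, \psi) = f(y \mid \theta)\, f(r \mid y, \psi), \\
\text{Pattern-mixture model:}\quad
  & f(y, r \mid \theta, \psi) = f(y \mid r, \theta)\, f(r \mid \psi).
\end{align*}
```

In the selection factorization the first factor is the marginal measurement distribution, which is the quantity that would be modeled with complete data. In the pattern-mixture factorization the measurement distribution is specified separately within each response pattern.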

12.1.1. Incomplete Data in Clinical Trials

In the specific case of a clinical trial setting, standard methodology used to analyze longitudinal data subject to non-response is mostly based on such methods as last observation carried forward (LOCF), complete case analysis (CC), or simple forms of imputation. This is often done without questioning the possible influence of the assumptions underlying these methods on the final results, even though several authors have written about this topic. A relatively early account is given in Heyting, Tolboom, and Essers (1992). Mallinckrodt et al. (2003a,b) and Lavori, Dawson, and Shera (1995) propose direct-likelihood and multiple-imputation methods, respectively, to deal with incomplete longitudinal data. Siddiqui and Ali (1998) compare direct-likelihood and LOCF methods.

It is unfortunate that there is such a strong emphasis on methods like LOCF and CC, since they are based on extremely strong assumptions. In particular, even the strong MCAR assumption does not suffice to guarantee that an LOCF analysis is valid. On the other hand, under MAR, valid inference can be obtained through a likelihood-based analysis, without the need for modeling the dropout process. As a consequence, we can simply use, for example, linear or generalized linear mixed models (Verbeke and Molenberghs, 2000), without additional complication or effort. We will argue that such an analysis not only enjoys much wider validity than the simple methods but is also simple to conduct with such tools as the SAS MIXED or NLMIXED procedures, without additional data manipulation. Thus, clinical trial practice should shift away from the ad hoc methods and focus on likelihood-based, ignorable analyses instead. As will be argued further, the cost involved in having to specify a model will arguably be mild to moderate in realistic clinical trial settings. Thus, we promote the use of direct-likelihood ignorable methods and argue against the use of the LOCF and CC approaches.
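
The claim that MCAR does not rescue LOCF can be illustrated with a small simulation. The following Python sketch is not from the text, and every setting in it is an assumption chosen for illustration: a linear mean profile of 10 + 2t over four visits, a random intercept, a 20% chance of dropping out before each post-baseline visit, and the function name `simulate_locf_under_mcar`. Dropout is generated independently of all measurements, so the mechanism is MCAR by construction.

```python
import random
import statistics


def simulate_locf_under_mcar(n_subjects=20000, n_visits=4,
                             dropout_prob=0.2, seed=12345):
    """Compare LOCF and completers-only means at the final visit under MCAR.

    All settings are hypothetical: the true mean at visit t is 10 + 2*t,
    and dropout is drawn independently of the outcomes (MCAR).
    """
    rng = random.Random(seed)
    true_final_mean = 10.0 + 2.0 * (n_visits - 1)
    locf_final = []        # final-visit value after carrying forward
    completer_final = []   # final-visit value for completers only

    for _ in range(n_subjects):
        # Full (partly unobserved) profile: linear trend + random intercept + noise.
        intercept = rng.gauss(0.0, 1.0)
        profile = [10.0 + 2.0 * t + intercept + rng.gauss(0.0, 1.0)
                   for t in range(n_visits)]

        # MCAR dropout: before each post-baseline visit the subject leaves
        # with a fixed probability that ignores the measurements entirely.
        last_seen = 0
        for t in range(1, n_visits):
            if rng.random() < dropout_prob:
                break
            last_seen = t

        # LOCF replaces the missing final value by the last observed value.
        locf_final.append(profile[last_seen])
        if last_seen == n_visits - 1:
            completer_final.append(profile[-1])

    return (true_final_mean,
            statistics.mean(locf_final),
            statistics.mean(completer_final))


if __name__ == "__main__":
    truth, locf_mean, completer_mean = simulate_locf_under_mcar()
    print(f"true final-visit mean : {truth:.2f}")
    print(f"LOCF estimate         : {locf_mean:.2f}")
    print(f"completers-only mean  : {completer_mean:.2f}")
```

With these assumed settings the carried-forward values come from earlier visits where the mean is lower, so the LOCF estimate of the final-visit mean falls well below the true value, while the completers-only mean stays close to it (at the price of discarding about half of the subjects). The direction and size of the LOCF bias depend on the shape of the mean profile, which is why the bias cannot be assumed to be conservative in general.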

At the same time, we cannot avoid a reflection on the status of MNAR approaches. In realistic settings, the reasons for dropout are varied, and it is therefore difficult to fully justify on a priori grounds the assumption of MAR. For example, the rate of and the reasons for dropout varied considerably across eleven clinical trials of similar design, of the same drug in the same indication. In one study, completion rates were 80% for drug and placebo. In another study, two-thirds of the patients who were taking a drug completed the study, while only one-third did so on placebo. In yet another study, 70% finished on placebo but only 60% on drug. Reasons for dropout also varied, even within the drug arm. For example, at low doses, more patients on drug dropped out due to lack of efficacy, whereas at higher doses dropout due to adverse events was more common. At first sight, this calls for a further shift towards MNAR models. However, some careful considerations have to be made, the most important of which is that no modeling approach, whether MAR or MNAR, can recover the information that is lost due to incompleteness of the data.

First, under MAR, a standard analysis would follow if one could be entirely sure of the MAR nature of the mechanism. However, it is only rarely the case that such an assumption is known to hold (Murray and Findlay, 1988). Nevertheless, ignorable analyses may provide reasonably stable results, even when the assumption of MAR is violated, in the sense that such analyses constrain the behavior of the unseen data to be similar to that of the observed data (Mallinckrodt et al., 2001a,b). A discussion of this phenomenon in the survey context has been given in Rubin, Stern, and Vehovar (1995). These authors argue that, in well-conducted experiments (some surveys and many confirmatory clinical trials), the assumption of MAR is often to be regarded as a realistic one. Second, and very important for confirmatory trials, an MAR analysis can be specified a priori without additional work relative to a situation with complete data. Third, while MNAR models are more general and explicitly incorporate the dropout mechanism, the inferences they produce are typically highly dependent on the untestable and often implicit assumptions built in regarding the distribution of the unobserved measurements given the observed ones. The quality of the fit to the observed data need not reflect at all the appropriateness of the implied structure governing the unobserved data. This point holds irrespective of the MNAR route taken, whether a parametric model of the type of Diggle and Kenward (1994) is chosen, or a semiparametric approach such as in Robins, Rotnitzky, and Scharfstein (1998). Hence in any incomplete-data setting there cannot be anything that could be termed a definitive analysis. Based on these considerations, we recommend, for primary analysis purposes, the use of ignorable likelihood-based methods. In many examples, however, the reasons for dropout will be many and varied. It is therefore difficult to justify on a priori grounds the MAR assumption. Arguably, in the presence of MNAR missingness, a wholly satisfactory analysis of the data is not feasible.

In fact, modeling in this context often rests on strong (untestable) assumptions and relatively little evidence from the data themselves. Glynn, Laird, and Rubin (1986) indicated that this is typical for selection models. It is somewhat less the case for pattern-mixture models (Little, 1993; 1994a; Hogan and Laird, 1997), although caution should be used (Thijs, Molenberghs, and Verbeke, 2000). This awareness and the resulting skepticism about fitting MNAR models initiated the search for methods to investigate the sensitivity of the results with respect to model assumptions and for methods that allow us to assess influences on the parameters describing the measurement process, as well as on the parameters describing the non-random part of the dropout mechanism. Several authors have suggested various types of sensitivity analyses to address this issue (Molenberghs, Kenward, and Goetghebeur, 2001; Scharfstein, Rotnitzky, and Robins, 1999; Van Steen et al., 2001; Verbeke et al., 2001). Verbeke et al. (2001) and Thijs, Molenberghs, and Verbeke (2000) developed a local influence-based approach for the detection of subjects that strongly influence the conclusions. These authors focused on the Diggle and Kenward (1994) model for continuous outcomes. Van Steen et al. (2001) adapted these ideas to the model of Molenberghs, Kenward, and Lesaffre (1997), for monotone repeated ordinal data. Jansen et al. (2003) focused on the model family proposed by Baker, Rosenberger, and DerSimonian (1992, henceforth referred to as BRD).

Thus, to explore the impact of deviations from the MAR assumption on the conclusions, we should ideally conduct a sensitivity analysis, within which MNAR models can play a major role, together with, for example, pattern-mixture models (Verbeke and Molenberghs, 2000, Chapters 18-20).

12.1.2. Outline

Three case studies used throughout this chapter, including the exercise bike data and the mastitis in dairy cattle data, are introduced in Section 12.2. The general data setting is introduced in Section 12.3, as well as a formal framework for incomplete longitudinal data. A brief overview of the problems associated with simple methods is presented in Section 12.4. Next, methods valid under the MAR assumption are described in Sections 12.5 and 12.6, for continuous and categorical outcomes, respectively. MNAR modeling is covered in Section 12.7, while Section 12.8 covers sensitivity analysis.

To save space, some SAS code has been shortened and some output is not shown. The complete SAS code and data sets used in this book are available on the book's companion Web site at http://support.sas.com/publishing/bbu/companion_site/60622.html.
