9.4. Allocations Balanced on Baseline Covariates

When certain subject characteristics (baseline covariates) are known to affect the response, it is often important to have the treatment groups balanced with respect to these covariates:

  • A balanced allocation helps minimize bias due to covariate imbalances unaccounted for in the analysis model.

  • A balance with respect to important covariates improves the efficiency of the analysis (Senn, 1997). Note, however, that in large trials the gain in power resulting from the use of a balanced allocation is typically small compared to post-stratified analysis without balanced allocation (McEntegart, 2003). At the same time, as pointed out by McEntegart, even a large trial can suffer from a substantial loss of power due to an unbalanced allocation in the presence of many small strata. Also, an allocation balanced by baseline predictors can considerably improve the efficiency of interim and subgroup analyses.

  • The results of a trial with treatment groups balanced with respect to major covariates are more convincing to the medical community. A serious imbalance in important covariates might raise concerns even if it is adjusted for in the analysis.

It is debatable to what extent balance in prognostic factors should be pursued when the statistical analysis can account for any imbalances in covariates; see, for example, the heated discussion following Atkinson (1999). The decision to balance or not to balance in a particular study can be guided by an assessment of the probability of ending up with an imbalance that is either perceived as dangerously high by the clinical community or exceeds the boundaries within which the model assumptions can be trusted. More insights can be gained from McEntegart (2003).

It is recommended that the baseline predictors balanced upon be included in the analysis model (International Conference on Harmonisation, 1998; Gail, 1988; Simon, 1979; Kalish and Begg, 1987; Senn, 1997; Scott et al., 2002). A failure to do so might result in an inflated Type I error rate. When the randomization-based analysis is performed, it should follow the randomization procedure (Rosenberger and Lachin, 2002).

9.4.1. Stratified Permuted Block Randomization

The most common way to achieve balance in a given factor is stratified randomization, which relies on creating a separate randomization schedule for each level of the factor. If there are several baseline covariates to balance upon, a separate restricted randomization schedule (most commonly, a permuted block schedule) is prepared for each stratification cell defined by a combination of factor levels. For example, in a study balanced by gender (male or female) and smoking status (smoker, ex-smoker or never-smoker), a separate schedule is generated for each of the six strata formed by a combination of the factor levels.
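A stratified permuted block schedule for the six strata in this example could be sketched as follows. This is only an illustration: the block size of 4, the three blocks per stratum, the seed, and the names SCHEDULE, STRATUM, BLOCK and U are assumptions rather than values taken from the text.

```sas
/* Sketch only: block size, number of blocks and seed are assumed values */
data schedule;
    do stratum=1 to 6;              /* 2 genders x 3 smoking levels */
        do block=1 to 3;            /* assumed number of blocks per stratum */
            do i=1 to 4;            /* assumed block size of 4 */
                if i<=2 then treatment=1;   /* Treatment A */
                else treatment=-1;          /* Treatment B */
                u=ranuni(2718);     /* random sort key */
                output;
            end;
        end;
    end;
    drop i;
run;

/* Sorting by the random key permutes the assignments within each block */
proc sort data=schedule;
    by stratum block u;
run;
```

Each block contains two assignments to A and two to B, so every stratum is balanced after each completed block. Subjects entering a stratum would take the assignments of that stratum in the sorted order.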

The stratified randomization approach has its limitations: only a small number of factors can be balanced upon. If the number of strata is large, some strata will have few or no subjects, resulting in an inadequate balance across factor levels (Therneau, 1993; Rosenberger and Lachin, 2002; Hallstrom and Davis, 1988).

9.4.2. Covariate-Adaptive Allocation Procedures

When there are too many important prognostic factors for stratification to handle, one of the covariate-adaptive allocation procedures can be used to provide a balance in selected covariates. Such procedures are dynamic in nature—the treatment assignment of a subject depends on the subject's vector of covariates and thus is determined only when the subject arrives. It is conceptually different from the regular stratified randomization method that relies on a fixed allocation schedule prepared for each stratum prior to the study start.

Minimization, pioneered by Taves (1974) and expanded by Pocock and Simon (1975), is the most commonly used covariate-adaptive allocation procedure. Minimization produces a marginal balance in each individual factor but not in individual factor interaction cells. For example, when balancing on gender and smoking status, the balance in treatment assignments is achieved across factor levels (across males, females, smokers, ex-smokers and never-smokers) but not within each cell, e.g., male smokers, as is the case with the stratified randomization. Because of that, minimization can simultaneously balance on a large number of factors even in a moderate size trial.

Minimization achieves the balance in treatment assignments across factor levels by choosing the allocation for the new subject that would result in the least possible imbalance (in some sense) across the set of his or her baseline characteristics.

In what follows, we describe the minimization algorithm proposed by Taves with a variance imbalance function popularized by Freedman and White (1976). Due to its simplicity, this algorithm is frequently used in clinical trials (McEntegart, 2003) and is often described in the literature (Senn, 1997; Scott et al., 2002).

9.4.3. Taves Minimization Algorithm

Consider a parallel study in which subjects are to be allocated equally to two treatment groups (Treatment A and Treatment B) and the allocation needs to be balanced in gender (male or female) and smoking status (smoker, ex-smoker or never-smoker).

When describing the minimization algorithm, it is convenient to assign scores of 1 and −1 to treatment groups A and B, respectively. Suppose a male ex-smoker arrives for allocation when there are 20 allocated subjects in the trial. To select the treatment assignment for this subject, we count the number of males and the number of ex-smokers in each of the treatment arms. Assume that there are five males in the Treatment A group and seven males in the Treatment B group. The imbalance across males, defined as the difference in the number of males allocated to A versus B, is 5 − 7 = −2. Also, there are three ex-smokers in the Treatment A group and two ex-smokers in the Treatment B group. The imbalance across ex-smokers is 3 − 2 = 1. The total imbalance, defined as the sum of the imbalances across males and across ex-smokers, is (−2) + 1 = −1. The negative total imbalance indicates that, overall, Treatment A is underrepresented among males and ex-smokers combined. Thus, to improve the balance, the male ex-smoker is allocated to Treatment A. When the group totals are equal (that is, the total imbalance is zero), the subject is allocated to one of the treatment arms at random with equal probability.
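The arithmetic of this example can be reproduced in a short DATA step. The sketch below is illustrative only; the variable names (MALES_A, EXSM_A and so on) are assumptions introduced here and are not part of the programs in this section.

```sas
data _null_;
    /* Counts taken from the worked example in the text */
    males_a=5;  males_b=7;
    exsm_a=3;   exsm_b=2;
    imb_males=males_a-males_b;      /* imbalance across males */
    imb_exsm=exsm_a-exsm_b;         /* imbalance across ex-smokers */
    totimb=imb_males+imb_exsm;      /* total imbalance */
    /* Negative total: A is underrepresented, so assign A (score 1).
       Positive total: assign B (score -1). Zero: fair coin toss. */
    if totimb<0 then treatment=1;
    else if totimb>0 then treatment=-1;
    else treatment=2*rantbl(0,0.5,0.5)-3;
    put imb_males= imb_exsm= totimb= treatment=;
run;
```

The log line shows imbalances of −2 and 1, a total imbalance of −1, and TREATMENT=1, that is, an assignment to Treatment A.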

Given the covariates and treatment allocations of the subjects already in the study, the treatment assignment of a new subject is fully determined by his or her set of covariates, except when a tie in group totals is encountered. This is why Scott et al. (2002) referred to the Taves minimization algorithm as "largely nonrandom": deterministic assignments occur more often than random ones. Nevertheless, simulations show that assignments at random occur often enough to provide a reasonably rich set of possible allocation sequences for a given sequence of covariates (Kuznetsova and Troxell, 2004). Thus, in a masked trial the largely deterministic nature of the Taves minimization algorithm does not lead to selection bias. However, in a single-center unmasked trial, a large share of the treatment assignments will be predictable, providing a considerable opportunity for selection bias.

9.4.4. Pocock-Simon Minimization Algorithm

Pocock and Simon (1975) extended the Taves minimization algorithm to make treatment assignments less predictable. This is achieved by using an additional random element at each treatment assignment. A subject is allocated to the treatment that results in the least imbalance with probability p < 1 rather than p = 1. If p is close to 1 (p = 0.9 or p = 0.95 are often used) the Pocock-Simon procedure still has good balancing properties and somewhat less potential for selection bias in an unmasked trial. The ICH E9 guidance recommends using a random element at each allocation.

To define the Pocock-Simon minimization algorithm, consider a clinical study with subjects allocated in a 1:1 ratio to Treatments A and B. The allocation scheme in this study needs to be balanced by gender (male or female) and smoking status (smoker, ex-smoker or never-smoker). As explained above, the Pocock-Simon algorithm extends the Taves algorithm by introducing a random element at each allocation step. In this trial, a subject will be assigned to the treatment that results in a smaller imbalance with probability p = 0.9 and to the opposite treatment with probability 1 − p = 0.1. When a tie in total imbalances is encountered, the treatment will be assigned by the toss of a fair coin.
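The random element can be illustrated with a small simulation. The sketch below assumes a fixed total imbalance of −1 (Treatment A underrepresented) and repeats the assignment 10,000 times. The seed, the number of repetitions and the data set name SIM are arbitrary choices made for illustration.

```sas
data sim;
    p=0.9;          /* probability of the imbalance-reducing assignment */
    totimb=-1;      /* assumed: Treatment A is currently underrepresented */
    do i=1 to 10000;
        /* rantbl returns 1 with probability 1-p and 2 with probability p */
        treatment=-sign(totimb)*(2*rantbl(4321,1-p,p)-3);
        output;
    end;
    keep treatment;
run;

proc freq data=sim;
    tables treatment;
run;
```

Roughly 90% of the simulated assignments should be to Treatment A (TREATMENT=1), with the remainder going to Treatment B.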

The Pocock-Simon procedure can be extended to more than two treatment groups and allows the use of different measures of imbalance across a factor level. If some of the factors are considered more important than others, they can be included in the combined imbalance with a higher weight. If an interaction between two factors is known to affect the response, the interaction should be included as a factor in the minimization algorithm (Pocock and Simon, 1975).
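As a rough sketch of a weighted combined imbalance, the DATA step below reuses the male ex-smoker example from Section 9.4.3 and weights the gender imbalance twice as heavily as the smoking status imbalance. The weights and variable names are hypothetical and chosen only for illustration.

```sas
data _null_;
    /* Hypothetical weights: gender counts twice as much as smoking status */
    w_smoke=1;
    w_gender=2;
    /* Marginal imbalances from the male ex-smoker example */
    m_exsm=1;       /* imbalance across ex-smokers */
    m_males=-2;     /* imbalance across males */
    totimb=w_smoke*m_exsm+w_gender*m_males;
    put totimb=;
run;
```

With these weights the total imbalance becomes 1 − 4 = −3, so the imbalance-reducing assignment is still Treatment A, but the imbalance among males now contributes more to the decision.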

The minimization approach has been shown to provide a good marginal balance in a large number of factors simultaneously (Taves, 1974; Pocock and Simon, 1975; Therneau, 1993; Begg and Iglewicz, 1980; Birkett, 1985; Zielhuis et al., 1990; Weir and Lees, 2003). By McEntegart's (2003) estimate, minimization has been used in more than 1000 trials, including several prestigious mega-trials.

9.4.5. Implementation of Minimization Algorithms in SAS

Below we describe the %ASSIGN macro that needs to be invoked each time a new subject is available for allocation. The set of the subject's covariates is specified through the macro parameters. The macro assigns a treatment to the new subject and also updates the ALL_ANS data set that stores the allocation numbers, covariates and treatment assignments of all allocated study subjects. This data set will have one observation for each study subject and will include the following variables:

  • The study subjects will be identified by their allocation numbers (AN variable), assigned to them in the order they were allocated.

  • The treatment assignments of the study subjects will be stored in the TREATMENT variable. Treatments A and B will be coded by 1 and −1, respectively.

  • The set of covariates of each subject will be described by five 0/1 variables, C1 to C5. The first three variables, C1, C2 and C3, are 0/1 indicators of the subject's level of the smoking status (smoker, ex-smoker or never-smoker, respectively), while C4 and C5 are the indicators of the subject's gender (male or female, respectively). For example, a male ex-smoker will have the following set of variables: C1=0, C2=1, C3=0, C4=1, C5=0.

  • Lastly, there will be five variables, M1 to M5, to store the marginal imbalances, that is, the differences in the number of subjects allocated to A versus B across smokers (M1), ex-smokers (M2), never-smokers (M3), males (M4), and females (M5), respectively, that result after the subject is allocated.

Before the first subject is allocated, the ALL_ANS data set needs to be initialized by setting all of the variables to 0, as shown in Program 9.7.

Example 9-7. Initialize the parameters in the ALL_ANS data set
data all_ans;
    label an='Allocation Number'
        c1='Smoker'
        c2='Ex-smoker'
        c3='Never-smoker'
        c4='Male'
        c5='Female'
        m1='Imbalance across Smokers'
        m2='Imbalance across Ex-smokers'
        m3='Imbalance across Never-smokers'
        m4='Imbalance across Males'
        m5='Imbalance across Females'
        treatment='Treatment';
    input an c1-c5 m1-m5 treatment;
    datalines;
    0 0 0 0 0 0 0 0 0 0 0 0
    ;
    run;

Each time a new subject arrives for allocation, the %ASSIGN macro is called to determine the treatment assignment of the new subject and update the ALL_ANS data set. The macro works in the following way:

  • The macro reads the current marginal imbalances into a one-observation data set (ASSIGN data set). It creates indicator variables C1-C5 that describe the covariates of the new subject and assigns them the values of the macro parameters &C1-&C5. The last macro parameter, &P, is the probability of assigning the new subject to the treatment that reduces the total imbalance.

  • The treatment assignment for the new subject is determined by the scalar product of the vectors (C1, C2, C3, C4, C5) and (M1, M2, M3, M4, M5), the so-called total imbalance (TOTIMB variable). If TOTIMB=0, the TREATMENT variable is set to 1 or −1 with probability 0.5 each. If TOTIMB is positive, TREATMENT is set to −1 (Treatment B) with probability &P and to 1 (Treatment A) with probability 1-&P. If TOTIMB is negative, TREATMENT is set to 1 (Treatment A) with probability &P and to −1 (Treatment B) with probability 1-&P.

  • After the new subject has been assigned to a treatment group, the marginal imbalances M1 to M5 are updated in the ASSIGN data set. This data set, which contains the allocation number, covariate indicators C1 to C5, treatment assignment for the new subject and updated marginal imbalances, is appended to the ALL_ANS data set.

We need to go through these steps every time a subject arrives for allocation. The %ASSIGN macro is defined in Program 9.8.

Example 9-8. The %ASSIGN macro
%macro assign(c1=,c2=,c3=,c4=,c5=,p=);
data assign;
    set all_ans(keep=m1-m5 an) end=lastobs;
    if lastobs;
    c1=&c1; c2=&c2; c3=&c3; c4=&c4; c5=&c5;
    totimb=c1*m1+c2*m2+c3*m3+c4*m4+c5*m5;
    /* If totimb=0, assign -1 or 1 with probability 0.5 each */
    if totimb=0 then treatment=2*rantbl(0,0.5,0.5)-3;
    /* If totimb>0, assign -1 with probability &p and 1 with probability 1-&p;
       if totimb<0, assign 1 with probability &p and -1 with probability 1-&p */
    else treatment=-sign(totimb)*(2*rantbl(0,1-&p,&p)-3);
    an=an+1;
    m1=m1+c1*treatment;
    m2=m2+c2*treatment;
    m3=m3+c3*treatment;
    m4=m4+c4*treatment;
    m5=m5+c5*treatment;
    keep an c1-c5 m1-m5 treatment;
data all_ans;
    set all_ans assign;
    if an>0;
    run;
%mend assign;

Program 9.9 invokes the %ASSIGN macro to allocate the first three subjects in a study, who happen to be a never-smoking male, a never-smoking female and a male smoker, in order of arrival. These subjects will be assigned to the treatment that produces better balance with probability 0.9, and thus &P=0.9.

Example 9-9. Pocock-Simon minimization algorithm with p = 0.9
* 1st subject: never-smoking male;
%assign(c1=0,c2=0,c3=1,c4=1,c5=0,p=0.9);
* 2nd subject: never-smoking female;
%assign(c1=0,c2=0,c3=1,c4=0,c5=1,p=0.9);
* 3rd subject: smoker male;
%assign(c1=1,c2=0,c3=0,c4=1,c5=0,p=0.9);
proc print data=all_ans noobs;
    var an treatment c1-c5 m1-m5;
    run;

Output from Program 9.9
an    treatment    c1    c2    c3    c4    c5    m1    m2    m3    m4    m5

 1        −1        0     0     1     1     0     0     0    −1    −1     0
 2         1        0     0     1     0     1     0     0     0    −1     1
 3         1        1     0     0     1     0     1     0     0     0     1

Output 9.9 lists the allocation numbers, treatment assignments and values of the C1 to C5 and M1 to M5 variables. The three subjects were assigned to Treatments B, A and A, respectively.
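To inspect the marginal imbalances at any point in the trial, it is enough to look at the last observation in the ALL_ANS data set, in the same way the %ASSIGN macro does. The sketch below assumes that ALL_ANS has already been created by the programs above; the data set name CURRENT_IMB is introduced here for illustration only.

```sas
/* Keep only the most recent observation, which holds the current imbalances */
data current_imb;
    set all_ans end=lastobs;
    if lastobs;
    keep an m1-m5;
run;

proc print data=current_imb noobs;
run;
```

For the run shown in Output 9.9, this lists AN=3 with an imbalance of one subject in favor of Treatment A among smokers (M1) and among females (M5), and no imbalance at the other factor levels. Because the assignments are random, another run of Program 9.9 can give different values.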

To change the probability of assigning subjects to the treatment that produces better balance, one needs to change the &P macro parameter. For example, increasing the value of &P to 0.95 will result in a tighter balance with respect to baseline covariates.

To implement the Taves minimization algorithm, the &P macro parameter is set to 1.
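For instance, a call of the following form would allocate a male ex-smoker by the Taves algorithm. It is only a usage sketch and assumes that the ALL_ANS data set and the %ASSIGN macro have been set up as above.

```sas
* Male ex-smoker allocated with p=1 (Taves algorithm);
%assign(c1=0,c2=1,c3=0,c4=1,c5=0,p=1);
```

With &P=1 the assignment is deterministic whenever the total imbalance is nonzero. If this call followed the three allocations in Output 9.9, the imbalances across ex-smokers (M2) and males (M4) would both be 0, so the total imbalance would be 0 and the treatment would still be chosen by a fair coin toss.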

There are other approaches to balancing an allocation on baseline covariates. Atkinson (1982) proposed an approach based on optimal design considerations that focuses on minimizing the variance of treatment contrasts in the presence of covariates rather than on balancing over the covariates to minimize the bias. A different approach (Miettinen, 1976) is based on stratifying by a single risk score that accounts for the effect of all known covariates.

The CPMP Points to Consider document on adjustment for baseline covariates (EAEMP CPMP, 2003) discourages the use of covariate-adaptive allocation procedures. The issues involved are discussed by Roes (2004). He shows that some of the arguments against dynamic allocation procedures (e.g., predictability of the assignments) apply to stratified randomization as well, and might be even more pronounced with stratified randomization. The utility of dynamic allocation procedures is well described by McEntegart (2003).
