Chapter 49

Subgroup Analysis

Richard M. Simon

49.1 Introduction

Subgroup analysis refers to the practice of attempting to determine whether and how treatment effects vary among subgroups of subjects who are studied in intervention studies. This article reviews the variety of statistical methods that have been used and developed for subgroup analyses and the criticisms that have been made of subgroup analyses. The article provides guidance for the conduct of subgroup analyses in a manner that limits the opportunity for misleading claims.

49.2 The Dilemma of Subgroup Analysis

Most clinical trials have as primary objective the evaluation of an intervention (treatment) relative to a control for a representative sample of patients who satisfy prospectively defined eligibility criteria. The evaluation is made with regard to a prospectively defined end point, that is, a measure of patient outcome. This experimental paradigm is frequently used for laboratory and observational studies. The focus on a single treatment effect is consistent with the Neyman–Pearson theory of hypothesis testing that is the general basis for analysis of such studies. Subgroup analysis, which is also called subset analysis, refers to examination of treatment effects in subgroups of study patients. For clinical trials, the subgroups should be defined based on baseline covariates measured before randomization.

Subgroup analysis is frequently observed in even the most prominent journals of medicine and science. Assmann et al. [1] reviewed 50 clinical trials published in major medical journals in 1997 and noted that 70% reported subgroup analyses, with a median of four subgroup analyses per trial. Humans with a given disease are heterogeneous with regard to numerous characteristics, some of which may influence treatment effect. Hence, investigators are interested in examining treatment effect within subgroups of patients [2]. Temple and Ellenberg [3] indicated that approved treatments for many conditions fail to produce significantly better results than placebo in a large proportion of clinical trials and attributed this to heterogeneity of effectiveness among subgroups of patients. Subgroup analysis has, however, long been criticized by statisticians and clinical trialists [4–7]. The criticisms have generally focused on investigators attempting to transform “negative” studies into “positive” studies, that is, taking studies in which the overall treatment effect is not statistically significant and finding a subgroup of patients for which the treatment effect seems significant. Despite substantial criticism, however, subgroup analysis is widely practiced because of the desire of practicing physicians to use results of clinical trials of heterogeneous diseases for the treatment of individual patients.

49.3 Planned Versus Unplanned Subgroup Analysis

It is important to distinguish planned from unplanned subgroup analyses [8,9]. The subgroup analyses that are commonly criticized and commonly practiced are unplanned. They often involve “data dredging,” in which treatment effects are examined in many subgroups defined with regard to a long list of baseline covariates. In some cases, the list of covariates is limited to the factors used to stratify the randomization of the clinical trial, but generally no strong biological rationale holds for expecting treatment effect to be different for the subgroups examined, and generally no prespecified analysis plan exists for performing the subgroup analysis or for controlling the experiment-wise type I error rate. Unplanned subgroup analysis has been described as analogous to betting on a horse after watching the race [10]. Rather than testing predefined hypotheses, unplanned subgroup analysis amounts to finding the covariate and cut-point that partition the data in a manner that maximizes differences in treatment effect estimates. Because the subgroups are not defined prospectively, multiplicity cannot be controlled. Authors often do not report all of the subgroups examined [11]. Even when they do, it is not clear how many more subgroups might have been examined had they not found the differences that they identify in their report.

From a Bayesian viewpoint, prior distributions cannot be properly defined post hoc in a data-driven manner [12]. If no prespecified subgroup analysis plan exists, one must suspect that important real variation in treatment effect among subgroups was not expected. This implicitly low prior probability reduces the plausibility of “significant” findings.

Unplanned subgroup analysis should be viewed as exploratory hypothesis generation for future studies. The use of statistical significance tests, confidence intervals, multiplicity corrections, or Bayesian modeling can obscure the fact that no valid inference can really be claimed. The most essential requirement for conducting a subgroup analysis that is intended to be more than completely exploratory is that the analysis should be prospectively planned and should not be data driven. The subgroups to be examined and the manner in which the analysis will be conducted should be prospectively defined. A strong basis must exist for expecting treatment effect to differ in those subgroups, based either on previous studies or on the biology of the treatment and the disease. It is best that this planning occur during the process of study planning and be documented in the study protocol. In some cases, however, it may still be possible to “prospectively” plan the subgroup analysis after the patients have been treated but before the data have been examined. With prospective planning of subgroup analyses, it is important that the subgroups be limited to a very few for which strong reasons exist to believe that treatment effect may differ from the overall effect.

In pharmacogenomic studies, evaluating treatment effect in a predefined subgroup may have importance equal to the overall comparison. It is becoming increasingly clear that many diseases traditionally defined based on morphology and symptomatology are in fact molecularly heterogeneous. This heterogeneity makes it less rational to expect that all subgroups of patients will benefit from the same treatment [13,14]. Traditional clinical trials with broad eligibility criteria and subgroup analysis viewed as exploratory are of less relevance for this setting, which is particularly the case with treatments developed for particular molecular targets that may be driving disease progression in only a subset of the patients. For example, Simon and Wang [15] describe a clinical trial setting in which it is of interest to test treatment effect overall and for one prespecified subgroup. They allocate the conventional 0.05 level between the two tests to preserve the experiment-wise type I error rate at 0.05. The sample size could be planned to ensure adequate power for both analyses. This approach can be used more generally for an overall test and G subgroup tests in a prospective subset analysis design. It is prospective in that the subsets and analysis plan are prespecified, the experiment-wise error rate is preserved, and the power for subset analysis is considered in the planning of the study. Preplanned subgroups also occur in 2 × 2 factorial designs. Those trials are generally planned assuming that interactions between the treatment factors do not exist. Consequently, it is important to plan the trial to test this assumption [16,17].
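
As a rough illustration of sizing such a prospective subset design (this sketch is not taken from Simon and Wang [15]), the following Python code finds the per-arm sample size needed so that an overall test at the 0.04 level and a single prespecified subgroup test at the 0.01 level each have 90% power, using a two-sample normal approximation. The standardized effect sizes, subgroup prevalence, and power are hypothetical values chosen only for illustration.

```python
import math
from scipy.stats import norm

def n_per_arm(delta, alpha, power, sigma=1.0):
    """Per-arm sample size for a two-sided two-sample z-test detecting
    a mean difference `delta` (normal approximation)."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return math.ceil(2 * (sigma * (z_a + z_b) / delta) ** 2)

# Split the 0.05 experiment-wise error: 0.04 for the overall test,
# 0.01 for one prespecified subgroup comprising ~40% of patients.
delta_overall, delta_subgroup, prevalence = 0.30, 0.40, 0.40

n_overall = n_per_arm(delta_overall, alpha=0.04, power=0.90)
# The subgroup test needs this many subgroup patients per arm ...
n_sub = n_per_arm(delta_subgroup, alpha=0.01, power=0.90)
# ... which translates into this total per-arm accrual:
n_total_for_sub = math.ceil(n_sub / prevalence)

print(n_overall, n_sub, max(n_overall, n_total_for_sub))
```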

In subsequent sections, we review methods that have been used for subgroup analysis.

49.4 Frequentist Methods

49.4.1 Significance Testing Within Subgroups

The most commonly used method of subgroup analysis involves testing the null hypothesis of no treatment effect separately for several subgroups S1, …, SG. The subgroups may or may not be disjoint. The results are often considered “statistically significant” if the significance level is less than the conventional 0.05 threshold. At least three problems exist with this approach. First, no control is implemented for the fact that multiple null hypotheses are being tested. Hence, the probability of obtaining at least one false-positive conclusion will be greater than the nominal 0.05 level of each test and may approach 1 rapidly as the number of subgroups G increases. Table 1 shows the probability of observing a nominally significant (P < 0.05) treatment effect in at least one subgroup if treatment is evaluated in G independent subgroups. With only four comparisons, the chance of at least one false-positive conclusion is 18.5%. Four subsets can be defined by two binary covariates. Although the comparisons in subsets determined by binary covariates are not independent, the dependence does little to ameliorate the problem. For example, Fleming and Watelet [18] performed a computer simulation to determine the chance of obtaining a statistically significant treatment difference when two equivalent treatments are compared in six subgroups determined by three binary variables. The chance of obtaining a P < 0.05 difference in at least one subgroup was 20% at the final analysis and 39% in the final or one of the three interim analyses.

Table 1: Probability of Seeing a Significant (P < 0.05) Treatment Effect in at Least One of G Independent Subgroups

Number of Subgroups G    Probability
2                        0.097
3                        0.143
4                        0.185
10                       0.40
20                       0.64
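
The entries of Table 1 follow directly from the expression 1 − (1 − 0.05)^G for G independent tests; the short Python sketch below reproduces them (up to rounding).

```python
# Chance of at least one P < 0.05 result among G independent subgroup
# tests when no true treatment effect exists in any subgroup.
for G in (2, 3, 4, 10, 20):
    print(G, round(1 - (1 - 0.05) ** G, 3))
```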

A second problem with the approach of significance testing within subgroups is that the power for detecting a real treatment effect in a subgroup may be small. The sample size of most studies is planned to be sufficient only for the primary overall analysis of all eligible subjects. Consequently, only in a meta-analysis of multiple studies is the sample size within a subgroup likely to be large enough for adequate within-subgroup statistical power.

The third problem with this approach is that usually no strong reason exists to expect that the treatment would be effective in specific subgroups and not effective in other specific subgroups. If strong reasons existed to suspect that, then it would not have been appropriate to plan and size the trial for the overall analysis of the eligible subjects as a whole. This problem is often stated as “qualitative interactions are a priori unlikely,” as clarified below in this section. All three of these considerations work in the direction of making a P < 0.05 result for a subgroup more likely to represent a false positive than a true positive [19].

The approach of using statistical significance tests within prespecified subgroups can be modified by reducing the threshold for declaring statistical significance for subgroup effects. The simplest approach to controlling multiple comparisons is to use the Bonferroni adjustment based on the number G of subgroups examined. If the G subgroups are disjoint, then the probability of obtaining at least one false-positive subgroup effect significant at the α′ level is 1 − (1 − α′)^G. This probability can be limited to be no greater than α by setting α′ = 1 − (1 − α)^(1/G). The experiment-wise type I error rate, which is conventionally set at 0.05, could also be allocated between the overall analysis and the subgroup analyses. For example, the overall analysis could be performed at a significance level of 0.04 and, if it is nonsignificant, the remaining α = 0.01 could be allocated among the G disjoint subgroups using the Bonferroni adjustment.
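
The following Python sketch illustrates these adjustments. The per-subgroup thresholds and the 0.04/0.01 allocation are the ones described above; the particular values of G are arbitrary.

```python
# Per-subgroup significance thresholds that keep the chance of any
# false-positive subgroup claim at alpha, for G disjoint subgroups.
alpha = 0.05
for G in (2, 4, 10):
    exact = 1 - (1 - alpha) ** (1 / G)   # exact for independent tests
    bonferroni = alpha / G               # slightly more conservative
    print(G, round(exact, 4), round(bonferroni, 4))

# Allocation described in the text: overall test at 0.04, and the
# remaining 0.01 split across G disjoint subgroups if it is nonsignificant.
G = 4
print(1 - (1 - 0.01) ** (1 / G))         # per-subgroup threshold, about 0.0025
```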

For nondisjoint subgroups, the within-subgroup analyses are not independent and the Bonferroni adjustment is conservative. Many somewhat more efficient approaches are available [e.g., References 8, 20, and 21].
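
As an illustration, the sketch below adjusts a set of hypothetical subgroup P-values with the Bonferroni method and with the Simes–Hochberg step-up procedure, one of the less conservative alternatives, using the multipletests function from the Python statsmodels package.

```python
# Adjusting subgroup p-values: Bonferroni versus the Simes-Hochberg
# step-up procedure. The p-values are hypothetical.
from statsmodels.stats.multitest import multipletests

pvals = [0.012, 0.034, 0.21, 0.47]

for method in ("bonferroni", "simes-hochberg"):
    reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method=method)
    print(method, [round(p, 3) for p in adjusted], list(reject))
```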

Although the multiplicity adjustments can protect the experiment-wise type I error rate, they render the problem of lack of statistical power within subgroups even more severe, because the significance tests are performed at a more stringent level.

49.5 Testing Treatment by Subgroup Interactions

Statisticians generally recommend testing a global hypothesis of no treatment by subgroup interaction as a prelude to any testing of treatment effect within subgroups [22]. If the true treatment effects in the G subgroups are equal, then there is said to be no treatment by subgroup interaction. The null hypothesis of no interaction does not stipulate whether the common treatment effect is null or nonnull. The interaction test used depends on the structure of the subgroups and the distribution of the data.

For model-based analysis, one general approach is first to fit the model that contains a single treatment effect and the main effects for the categorical covariates that define the subgroups. One then fits a model that also contains all of the treatment by covariate interactions. Twice the difference in log likelihoods provides an approximate test of the null hypothesis of no interactions.
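
A minimal sketch of this likelihood-ratio approach is shown below, using a logistic model fitted to simulated data with the Python statsmodels package. The variable names (treat, sex, stage, response) and the simulated coefficients are hypothetical.

```python
# Likelihood-ratio test of the global null hypothesis of no
# treatment-by-covariate interactions in a logistic model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 800
df = pd.DataFrame({
    "treat": rng.integers(0, 2, n),
    "sex": rng.integers(0, 2, n),
    "stage": rng.integers(0, 2, n),
})
logit_p = -0.3 + 0.4 * df.treat + 0.2 * df.sex - 0.3 * df.stage
df["response"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

main = smf.logit("response ~ treat + sex + stage", df).fit(disp=0)
full = smf.logit("response ~ treat * (sex + stage)", df).fit(disp=0)

lr = 2 * (full.llf - main.llf)            # twice the log-likelihood difference
df_diff = full.df_model - main.df_model   # number of interaction terms (2 here)
print(lr, chi2.sf(lr, df_diff))           # global interaction p-value
```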

Gail and Simon [23] developed a test for no qualitative interactions for a categorical covariate with G levels. A qualitative interaction is said to exist when the true treatment effects in the different subgroups are not all of the same sign. Having the treatment effect positive in some subgroups and null in other subgroups also represents a qualitative interaction. Qualitative interactions are the interactions of real interest because ordinary interactions are scale dependent. That is, different subgroups may have the same treatment effect when measured on the logit scale, but not on the difference in response probability scale. Russek-Cohen and Simon [24] extended the Gail–Simon test to a single continuous covariate or to data cross-classified by several categorical covariates, and other tests for qualitative interactions have been developed [25,26].
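
The sketch below computes the Gail–Simon statistics for hypothetical subgroup effect estimates and standard errors. The P-value is evaluated from the binomial mixture of chi-square tail probabilities that arises under the least favorable null configuration; treat this as a sketch rather than a validated implementation of the published test.

```python
# Sketch of the Gail-Simon test for qualitative interaction, given
# subgroup treatment-effect estimates and their standard errors.
import numpy as np
from scipy.stats import chi2
from scipy.special import comb

def gail_simon(estimates, std_errors):
    est = np.asarray(estimates, dtype=float)
    z2 = (est / np.asarray(std_errors, dtype=float)) ** 2
    q_pos = z2[est > 0].sum()   # evidence that effects are positive
    q_neg = z2[est < 0].sum()   # evidence that effects are negative
    q = min(q_pos, q_neg)
    G = len(est)
    # Least-favorable null distribution: binomial mixture of chi-squares.
    p = sum(comb(G - 1, i) * 0.5 ** (G - 1) * chi2.sf(q, i)
            for i in range(1, G))
    return q_pos, q_neg, p

# Hypothetical estimates (log hazard ratios, say) and standard errors:
print(gail_simon([0.5, 0.3, -0.4, 0.6], [0.2, 0.25, 0.2, 0.3]))
```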

Many authors use interaction tests one at a time for each categorical covariate used to define subgroups. This approach, however, does not really protect the experiment-wise type I error rate. A better approach is to structure the analysis so that significance tests for treatment effect within subgroups are performed only if a global test of no interaction or of no qualitative interaction is rejected in the context of a model that includes all of the categorical covariates simultaneously. Consider, for example, a proportional hazards model

λ(t | z, x) = λ₀(t) exp(αz + x′β + zx′γ)    (1)

where z denotes a binary treatment indicator, x is a vector of binary covariates, α is the main effect of treatment, β is the vector of main covariate effects, and γ is the vector of treatment by covariate interactions. A global test of no interactions tests the hypothesis that all of the components of γ are simultaneously zero. If the global interaction test is performed at the 0.05 level, and the within-subgroup tests are also performed at the 0.05 level, then the experiment-wise type I error rate is preserved under the global null hypothesis that all subset effects are null.
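
A corresponding sketch for the proportional hazards model (1), fitted with the Python lifelines package to simulated data, is shown below. The covariates x1 and x2, the simulated hazard ratios, and the censoring time are hypothetical.

```python
# Global likelihood-ratio test for treatment-by-covariate interactions
# in a Cox model of the form of Equation (1).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 600
df = pd.DataFrame({
    "z": rng.integers(0, 2, n),    # treatment indicator
    "x1": rng.integers(0, 2, n),   # binary covariates defining subgroups
    "x2": rng.integers(0, 2, n),
})
hazard = np.exp(-0.4 * df.z + 0.3 * df.x1 - 0.2 * df.x2)
df["time"] = rng.exponential(1 / hazard)
df["event"] = (df["time"] < 3).astype(int)
df["time"] = df["time"].clip(upper=3)   # administrative censoring at t = 3

df["z_x1"] = df.z * df.x1               # treatment-by-covariate interactions
df["z_x2"] = df.z * df.x2

reduced = CoxPHFitter().fit(df[["time", "event", "z", "x1", "x2"]],
                            duration_col="time", event_col="event")
full = CoxPHFitter().fit(df, duration_col="time", event_col="event")

lr = 2 * (full.log_likelihood_ - reduced.log_likelihood_)
print(lr, chi2.sf(lr, 2))               # 2 interaction parameters
```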

Interaction tests tend to have low power because they involve a comparison of multiple treatment effect estimates [27,28]. Brookes et al. [27] reported that if a trial has 80% power to detect the overall effect of treatment, then reliable detection of an interaction of the same magnitude as the overall effect between two equal-sized subgroups would require approximately a fourfold greater sample size. Brookes et al. [27] and Peterson and George [16] have studied sample size requirements for detecting interactions. Qualitative interaction tests are considered to be even more demanding with regard to sample size requirements. Some authors view the limited power of interaction tests as an advantage, however, as it makes it more difficult to justify evaluation of treatment effects within subgroups. For these reasons, the best setting for subgroup analysis is a meta-analysis of clinical trials. Certainly for unplanned subgroup analyses, replicability of findings is more important than the uninterpretable “statistical significance” frequently reported, and replicability can best be addressed in meta-analyses [10].
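
The fourfold figure can be seen from standard errors alone: splitting the trial into two equal subgroups inflates each subgroup standard error by √2, and taking the difference of the two subgroup effects inflates it by another √2, so the interaction contrast has twice the standard error of the overall effect. The short sketch below makes this explicit with a normal-approximation power calculation; the unit standard error and 80% benchmark are illustrative.

```python
# Why detecting an interaction needs roughly four times the sample size.
from scipy.stats import norm

def power(delta, se, alpha=0.05):
    z = norm.ppf(1 - alpha / 2)
    return norm.sf(z - delta / se)

se_overall = 1.0                         # arbitrary unit for the overall effect
se_subgroup = se_overall * 2 ** 0.5      # half the patients per subgroup
se_interaction = se_subgroup * 2 ** 0.5  # difference of two subgroup effects

delta = norm.ppf(0.975) + norm.ppf(0.80)  # effect giving 80% overall power
print(power(delta, se_overall))           # ~0.80
print(power(delta, se_interaction))       # ~0.29 at the original sample size
print(power(delta, se_interaction / 2))   # ~0.80 again with 4x the sample size
```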

49.6 Subgroup Analyses in Positive Clinical Trials

It is suggested frequently that subgroup analysis should be disregarded unless the treatment effect is significant for the subjects as a whole. This viewpoint is motivated by a wish to prevent authors from reporting “negative” studies as positive based on false-positive subgroup findings. Subgroup analyses, however, are generally no more reliable for positive trials than for trials that are negative overall. If the true treatment effect is of size δ for all subgroups, then it is likely that in some subgroups the estimated treatment effect will look more like 0 or 2δ than the true value δ.

Grouin et al. [29] point out that it is a standard regulatory requirement to examine treatment effects within subgroups in trials for which the overall treatment effect is significant “to confirm that efficacy benefits that have been identified in the complete trial are consistently observed across all subgroups defined by major factors of potential clinical importance.” Unfortunately, however, this requirement itself may result in misleading findings. Peto [30] showed that if the patients in a trial that is just significant at the P < 0.05 level overall are randomly divided into two equal-sized subgroups, then a one in three chance exists that the treatment effect will be large and significant in one of the subgroups (P < 0.05) and negligible in the other (P > 0.50). Hence the regulatory requirement of demonstrating that efficacy benefits are observed consistently across all subgroups is likely to lead to false conclusions that the efficacy is concentrated in only some subgroups. This is true even for factors such as center, which have no a priori clinical relevance and should be viewed as random effects rather than subjected to subgroup analysis. It is ironic that statistical practice is to be suspicious of subgroup analyses for studies that are negative overall and to require such analyses for studies that are positive overall; the problems of multiplicity and inadequate sample size limit validity of inference in both settings. Subgroup analyses that are not of sufficient a priori importance to be planned and allocated some experiment-wise type I error rate should generally not be considered definitive enough to influence post-study decision making.
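
Peto's one-in-three figure can be checked by simulation: conditional on an overall z-statistic of exactly 1.96, the two half-sample z-statistics can be written as 1.96/√2 ± u with u normal with variance 1/2. The sketch below uses this representation; the thresholds taken for “significant” and “negligible” (two-sided P < 0.05 and P > 0.5) are assumptions about the intended interpretation.

```python
# Monte Carlo illustration of Peto's random-split observation.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n_sim = 200_000
overall_z = norm.ppf(0.975)              # 1.96, trial just significant overall

# Half-sample z-statistics given the overall z-statistic:
u = rng.normal(0.0, np.sqrt(0.5), n_sim)
z1 = overall_z / np.sqrt(2) + u
z2 = overall_z / np.sqrt(2) - u

def significant(z):          # two-sided P < 0.05, effect in the right direction
    return z > norm.ppf(0.975)

def negligible(z):           # two-sided P > 0.5
    return np.abs(z) < norm.ppf(0.75)

split = (significant(z1) & negligible(z2)) | (significant(z2) & negligible(z1))
print(split.mean())          # roughly 1/3
```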

49.7 Confidence Intervals for Treatment Effects within Subgroups

In some studies, it may be more feasible to identify the subsets prospectively and to plan the analysis so that the experiment-wise type I error rate is preserved than to increase the sample size sufficiently for adequately powered subgroup analyses. The limitations of inadequate sample size can be made explicit by using confidence intervals for treatment effects within subgroups instead of or in addition to significance tests [31]. Reporting a wide confidence interval for a treatment effect within a subgroup is much less likely to be misinterpreted as a lack of efficacy of the treatment within that subgroup than a “non-significant” P-value. The confidence coefficients used should be based on the “multiplicity-corrected” significance levels that preserve the experiment-wise type I error rate. If the subgroups are disjoint, then the multiplicity adjustment may be of the simple Bonferroni type described above. Otherwise, model-based confidence intervals and multiplicity adjustments can be used to account for the dependence structure. Model-based frequentist confidence intervals for treatment effects in subgroups were illustrated by Dixon and Simon [32]. Consider, for example, a generalized linear model

L(E[y | z, x]) = αz + x′β + zx′γ    (2)

where L is the link function, x is a vector of categorical covariates that define the subgroups, z is a binary treatment indicator with z = 1 for the experimental treatment and z = 0 for control, α is the main effect of treatment, β is a vector of main effects of covariates, and γ is a vector of treatment by subgroup interactions. The treatment effect for a subgroup specified by x can be defined as δ(x) = α + x′γ. This subgroup effect can be estimated by δ̂(x) = α̂ + x′γ̂. Asymptotically, we generally have that the parameter estimates (α̂, β̂, γ̂) are multivariate normal. Consequently, the estimate of treatment effect for a specific subgroup is approximately univariate normal, and the estimates for several subgroups are approximately multivariate normal with a covariance matrix that can be estimated from the data. An approximate confidence interval for treatment effect within each subgroup can be constructed based on the multivariate normal distribution [32].
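
As an illustration of such a model-based interval, the sketch below computes a multiplicity-corrected confidence interval for the subgroup treatment effect α + x′γ from a hypothetical parameter estimate and covariance matrix; only the block corresponding to α and the two interaction parameters is needed, and all numbers are invented for illustration.

```python
# Confidence interval for a subgroup treatment effect expressed as a
# linear combination a' theta of model parameters.
import numpy as np
from scipy.stats import norm

theta_hat = np.array([0.45, 0.20, -0.30])    # (alpha, gamma_1, gamma_2), say
C = np.array([[0.040, -0.020, -0.020],
              [-0.020, 0.050,  0.005],
              [-0.020, 0.005,  0.055]])       # estimated covariance of theta_hat

a = np.array([1.0, 1.0, 0.0])                 # subgroup with x1 = 1, x2 = 0
estimate = a @ theta_hat                      # alpha_hat + gamma_1_hat
se = np.sqrt(a @ C @ a)

# Experiment-wise 95% coverage over, say, 4 disjoint subgroups via a
# simple Bonferroni split of 0.05.
z = norm.ppf(1 - 0.05 / 4 / 2)
print(estimate, (estimate - z * se, estimate + z * se))
```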

It is common to report confidence intervals for the average outcome on each treatment arm in each subgroup presented. These intervals, however, are much less useful than confidence intervals for treatment effects within subgroups.

49.8 Bayesian Methods

Several authors [33–35] have studied empirical Bayes methods for estimating treatment effects δi in G disjoint subgroups of patients. Often, the subgroup-specific estimates δ̂i are taken as independent N(δi, σi²) with estimates of the σi² available, and the subgroup-specific treatment effects δi are taken as exchangeable draws from a common distribution N(μ, τ²). The mean μ and variance τ² are estimated from the data, often by maximizing the marginal likelihood. This technique is similar to frequentist James–Stein estimation [36] and to mixed-model analysis of variance such as is used in random-effects meta-analysis [37].
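
A simplified sketch of this kind of empirical Bayes shrinkage is given below, using the DerSimonian–Laird moment estimator of τ² [37] and hypothetical subgroup estimates; a fuller implementation would re-estimate μ with weights 1/(σi² + τ²) and propagate the uncertainty in μ and τ².

```python
# Empirical Bayes shrinkage of subgroup treatment-effect estimates,
# with the between-subgroup variance tau^2 estimated by the
# DerSimonian-Laird moment method. Data are hypothetical.
import numpy as np

delta_hat = np.array([0.05, 0.55, 0.25, 0.70])   # subgroup effect estimates
var = np.array([0.04, 0.05, 0.03, 0.06])         # their estimated variances

w = 1 / var
mu_hat = np.sum(w * delta_hat) / np.sum(w)       # precision-weighted mean
Q = np.sum(w * (delta_hat - mu_hat) ** 2)
k = len(delta_hat)
tau2 = max(0.0, (Q - (k - 1)) / (w.sum() - (w ** 2).sum() / w.sum()))

# Shrunken (approximate posterior mean) estimate for each subgroup:
shrinkage = tau2 / (tau2 + var)
posterior_mean = mu_hat + shrinkage * (delta_hat - mu_hat)
print(tau2, posterior_mean.round(3))
```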

Several authors have discussed the use of fully Bayesian methods for evaluating treatment effects within subgroups of patients [e.g., References 12, 32, 38–40]. Dixon and Simon [40] presented perhaps the most general and extensively developed method. For a generalized linear model such as Equation (2) with subgroups determined by one or more binary covariates, they assumed that γ ~ MVN(0, ξ²I), with I denoting the identity matrix, and they used a modified Jeffreys prior for the variance component ξ². They used flat priors for the main effects and derived an expression for the posterior density of any linear combination of the parameters θ = (α, β, γ).

Simon et al. [41] simplified the Dixon–Simon method by using a multivariate normal prior θ ~ MVN(0, D). They show that the posterior distribution of the parameters can be approximated by MVN(Bb, B), where B = (C⁻¹ + D⁻¹)⁻¹, b = C⁻¹θ̂, and θ̂ denotes the usual maximum likelihood (or maximum partial likelihood) estimate of the parameters obtained by frequentist analysis, with C the estimated covariance matrix of θ̂. With independent priors for the interaction effects and flat priors for the main effects, D⁻¹ is diagonal with p + 1 zeros along the main diagonal corresponding to the main effects, followed by diagonal elements 1/d1, …, 1/dp corresponding to the reciprocals of the variances of the priors for the p treatment by covariate interactions. They specify these variances to represent the a priori belief that qualitative interactions are unlikely. For any linear combination of parameters η = a′θ, the posterior distribution is univariate normal N(a′Bb, a′Ba). They define linear combinations of parameters to represent subgroups based on simultaneous specification of all covariates and subgroups determined by a single covariate, averaged over the other covariates.
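
The posterior approximation is straightforward to compute directly; the sketch below uses hypothetical values of θ̂, C, and a common prior interaction variance d, with θ = (α, β1, β2, γ1, γ2) so that the p + 1 = 3 main effects receive flat priors.

```python
# Sketch of the normal-approximation posterior of Simon et al. [41]
# with flat priors on the main effects and variance d on each of the
# p = 2 treatment-by-covariate interactions. Numbers are hypothetical.
import numpy as np

theta_hat = np.array([0.40, 0.25, -0.10, 0.35, -0.20])
C = np.diag([0.030, 0.040, 0.040, 0.060, 0.070])  # estimated covariance of theta_hat

d = 0.10                                          # prior variance of each interaction
D_inv = np.diag([0.0, 0.0, 0.0, 1 / d, 1 / d])    # zeros encode the flat priors

B = np.linalg.inv(np.linalg.inv(C) + D_inv)       # posterior covariance
b = np.linalg.inv(C) @ theta_hat
posterior_mean = B @ b                            # shrinks the interaction estimates

# Posterior for the treatment effect in the subgroup with x1 = 1, x2 = 0,
# i.e., the linear combination a'theta = alpha + gamma_1:
a = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
print(a @ posterior_mean, np.sqrt(a @ B @ a))
```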

Simon [42] applied the Bayesian approach of Simon et al. [41] to the requirement that randomized clinical trials sponsored by the U.S. National Institutes of Health include an analysis of male and female subgroups. He showed that the mean of the posterior distribution of treatment effect in each sex subgroup is a weighted average of both sex-specific estimates, with the weights determined by the variance of the prior distribution of the interaction effect. As the variance goes to infinity, the posterior distribution of treatment effect for women depends only upon the data for the women. This analysis is equivalent to a frequentist analysis that includes an interaction term. Most clinical trials, however, are not conducted in a context where large interactions are as a priori likely as no interactions. If the variance of the prior is specified as zero, then the posterior distribution of the treatment effect for women weights the estimates for both subgroups in inverse proportion to their variances. It is essentially equivalent to the usual frequentist analysis without interactions. This limit seems equally extreme, however, as it corresponds to an assumption that treatment by subgroup interactions are impossible. The Bayesian approach enables an analysis to be performed under the assumption that large treatment by subgroup interactions are possible but unlikely.

References

[1] S. F. Assmann, S. J. Pocock, L. E. Enos, and L. E. Kasten, Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000; 355: 1064–1069.

[2] D. L. Sackett, Applying overviews and meta-analyses at the bedside. J. Clin. Epidemiol. 1995; 48: 61–70.

[3] R. Temple and S. S. Ellenberg, Placebo-controlled trials and active-control trials in the evaluation of new treatments. Part 1: ethical and scientific issues. Ann. Int. Med. 2000; 133: 455–463.

[4] S. J. Pocock, S. E. Assmann, L. E. Enos, and L. E. Kasten, Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stats. Med. 2002; 21: 2917–2930.

[5] I. F. Tannock, False-positive results in clinical trials: multiple significance tests and the problem of unreported comparisons. J. Natl. Cancer Inst. 1996; 88: 206.

[6] T. R. Fleming, Interpretation of subgroup analyses in clinical trials. Drug Informat. J. 1995; 29: 1681S–1687S.

[7] D. I. Cook, V. J. Gebski, and A. C. Keech, Subgroup analysis in clinical trials. Med. J. Australia 2004; 180: 289–291.

[8] D. R. Bristol, P-value adjustments for subgroup analyses. J. Biopharm. Stats. 1997; 7: 313–321.

[9] G. G. Koch, Discussion of p-value adjustments for subgroup analyses. J. Biopharm. Stats. 1997; 7: 323–331.

[10] P. M. Rothwell, Subgroup analysis in randomised controlled trials: importance, indications, and interpretation. Lancet 2005; 365: 176–186.

[11] S. Hahn, P. R. Williamson, L. Hutton, P. Garner, and E. V. Flynn, Assessing the potential for bias in meta-analysis due to selective reporting of subgroup analysis within studies. Stats. Med. 2000; 19: 3325–3336.

[12] D. A. Berry, Multiple comparisons, multiple tests and data dredging: a Bayesian perspective. In: J. M. Bernardo, M. H. DeGroot, D. V. Lindley, and A. F. M. Smith (eds.), Bayesian Statistics 3. Oxford: Oxford University Press; 1988.

[13] R. Simon, New challenges for 21st century clinical trials. Clin. Trials 2007; 4: 167–169.

[14] R. Simon, A roadmap for developing and validating therapeutically relevant genomic classifiers. J. Clin. Oncol. 2005; 23: 7332–7341.

[15] R. Simon and S. J. Wang, Use of genomic signatures in therapeutics development. Pharmacogenom. J. 2006; 6: 1667–1673.

[16] B. Peterson and S. L. George, Sample size requirements and length of study for testing interaction in a 2 × k factorial design when time-to-failure is the outcome. Control. Clin. Trials 1993; 14: 511–522.

[17] R. Simon and L. S. Freedman, Bayesian design and analysis of 2 by 2 factorial clinical trials. Biometrics 1997; 53: 456–464.

[18] T. R. Fleming and L. Watelet, Approaches to monitoring clinical trials. J. Natl. Cancer Inst. 1989; 81: 188.

[19] R. Simon, Commentary on Clinical trials and sample size considerations: Another perspective. Stats. Science 2000; 15: 95–110.

[20] R. J. Simes, An improved Bonferroni procedure for multiple tests of significance. Biometrika 1986; 73: 751–754.

[21] Y. Hochberg and Y. Benjamini, More powerful procedures for multiple significance testing. Stats. Med. 1990; 9: 811–818.

[22] R. Simon, Patient subsets and variation in therapeutic efficacy. Br. J. Clin. Pharmacol. 1982; 14: 473–482.

[23] M. Gail and R. Simon, Testing for qualitative interactions between treatment effects and patient subsets. Biometrics 1985; 41: 361.

[24] E. Russek-Cohen and R. M. Simon, Qualitative interactions in multifactor studies. Biometrics 1993; 49: 467–477.

[25] D. Zelterman, On tests for qualitative interactions. Statist. Probab. Lett. 1990; 10: 59–63.

[26] J. L. Ciminera, J. F. Heyse, H. H. Nguyen, and J. W. Tukey, Tests for qualitative treatment-by-centre interaction using a ‘pushback’ procedure. Stats. Med. 1993; 12: 1033–1045.

[27] S. T. Brookes, E. Whitley, T. J. Peters, P. A. Mulheran, M. Egger, and G. Davey-Smith, Subgroup analyses in randomised controlled trials: quantifying the risks of false-positives and false-negatives. Health Technol. Assessm. 2001; 5: 1–56.

[28] R. F. Potthoff, B. L. Peterson, and S. L. George, Detecting treatment-by-centre interaction in multi-centre clinical trials. Stats. Med. 2001; 20: 193–213.

[29] J. M. Grouin, M. Coste, and J. Lewis, Subgroup analyses in randomized clinical trials: statistical and regulatory issues. J. Biopharm. Stats. 2004; 15: 869–882.

[30] R. Peto, Statistical aspects of cancer trials. In: Halnan KE (ed.), Treatment of Cancer. London: Chapman & Hall, 1982: 867–871.

[31] R. Simon, Confidence intervals for reporting results from clinical trials. Ann. Int. Med. 1986; 105: 429.

[32] D. O. Dixon and R. Simon, Bayesian subset analysis in a colorectal cancer clinical trial. Stats. Med. 1992; 11: 13–22.

[33] T. A. Louis, Estimating a population of parameter values using Bayes and empirical Bayes methods, J. Am. Stats. Assoc. 1984; 79: 393–398.

[34] R. Simon, Statistical tools for subset analysis in clinical trials. In: M. Baum, R. Kay, H. Scheurlen (eds.), Recent Results in Cancer Research, vol. 3. New York: Springer-Verlag, 1988.

[35] C. E. Davis and D. P. Leffingwell, Empirical Bayes estimation of subgroup effects in clinical trials. Control. Clin. Trials 1990; 11: 37–42.

[36] B. Efron and C. Morris, Stein’s estimation rule and its competitors—An empirical Bayes approach. J. Am. Stats. Assoc. 1973; 68: 117–130.

[37] R. DerSimonian and N. Laird, Meta-analysis in clinical trials. Control. Clin. Trials 1986; 7: 177–188.

[38] J. Cornfield, Sequential trials, sequential analysis and the likelihood principle. Am. Statistician 1966; 20: 18–23.

[39] A. Donner, A Bayesian approach to the interpretation of subgroup results in clinical trials. J. Chron. Dis. 1982; 34: 429–435.

[40] D. O. Dixon and R. Simon, Bayesian subset analysis. Biometrics 1991; 47: 871.

[41] R. Simon, D. O. Dixon, and B. A. Freidlin, A Bayesian model for evaluating specificity of treatment effects in clinical trials. In: Thall PF (ed.), Recent Advances in Clinical Trial Design. Norwell, MA: Kluwer Academic Publishers, 1995.

[42] R. Simon, Bayesian subset analysis: application to studying treatment-by-gender interactions. Stats. Med. 2002; 21: 2909–2916.
