Preface

1. Who Is This Book For?

This book is intended for anyone who has an interest in the synthesis, or ‘pooling’, of evidence from randomised controlled trials (RCTs) and particularly in the statistical methods for network meta-analysis. A standard meta-analysis is used to pool information from trials that compare two interventions, while network meta-analysis extends this to the comparison of any number of interventions.

Network meta-analysis is one of the core methodologies of what has been called comparative effectiveness research (Iglehart, 2009), and, in view of the prime role accorded to trial evidence over other forms of evidence on comparative efficacy, it might be considered the most important.

The core material in this book is largely based on a 3-day course that we have been running for several years. Based on the spectrum of participants we see on our course, we believe the book will engage a broad range of professionals and academics. Firstly, it should appeal to all statisticians who have an interest in evidence synthesis, whether from a methodological viewpoint or because they are involved in applied work arising from systematic reviews, including the work of the Cochrane Collaboration.

Secondly, the methods are an essential component of health technology assessment (HTA) and are routinely used in submissions not only to reimbursement agencies such as the National Institute for Health and Care Excellence (NICE) in England but also, increasingly, to similar organisations worldwide, including the Canadian Agency for Drugs and Technologies in Health, the US Agency for Healthcare Research and Quality and the Institute for Quality and Efficiency in Health Care in Germany. Health economists involved in HTA in academia and those working in pharmaceutical companies, or for the consultancy firms who assist them in making submissions to these bodies, comprise the largest single professional group for whom this book is intended.

Clinical guidelines are also making increasing use of network meta-analysis, and statisticians and health economists working with medical colleges on guideline development represent a third group who will find this book highly relevant.

Finally, the book will also be of interest, we believe, to those whose role is to manage systematic reviews, clinical guideline development or HTA exercises and those responsible at a strategic level for determining the methodological approaches that should underpin these activities. For these readers, who may not be interested in the technical details, the book sets out the assumptions of network meta-analysis, its properties, when it is appropriate and when it is not.

The book can be used in a variety of ways to suit different backgrounds and interests, and we suggest some different routes through the book at the end of the preface.

2. The Decision-Making Context

The contents of this book have their origins in the methodology guidance that was produced for submissions to NICE. This is the body in England and Wales responsible for deciding which new pharmaceuticals are to be used in the National Health Service. This context has shaped the methods from the beginning.

First and foremost, the book is about an approach to evidence synthesis that is specifically intended for decision-making. It assumes that the purpose of every synthesis is to answer the question ‘for this pre-identified population of patients, which treatment is “best”?’ Such decisions can be made on any one of a range of grounds: efficacy alone, some balance of efficacy and side effects, perhaps through multi-criteria decision analysis (MCDA), or cost-effectiveness. At NICE, decisions are based on efficacy and cost-effectiveness, but whatever criteria are used, the decision-making context impacts evidence synthesis methodology in several ways.

Firstly, the decision maker must have in mind a quite specific target population, not simply patients with a particular medical condition but also patients who have reached a certain point in their natural history or in their referral pathway. These factors influence a clinician’s choice of treatment for an individual patient, and we should therefore expect them to impact how the evidence base and the decision options are identified. Similarly, the candidate interventions must also be characterised specifically, bearing in mind the dose, mode of delivery and concomitant treatments. Each variant has a different effect and also a different cost, both of which might be taken into account in any formal decision-making process. It has long been recognised that trial inclusion criteria for the decision-making context will tend to be more narrowly drawn than those for the broader kinds of synthesis that aim for what may be best described as a ‘summary’ of the literature (Eccles et al., 2001). In a similar vein, Rubin (1990) has distinguished between evidence synthesis as ‘science’ and evidence synthesis as ‘summary’. The common use of random effects models to average over important heterogeneity has attracted particular criticism (Greenland, 1994a, 1994b).

Recognising the centrality of this issue, the Cochrane Handbook (Higgins and Green, 2008) states: ‘meta-analysis should only be considered when a group of studies is sufficiently homogeneous in terms of participants, interventions and outcomes to provide a meaningful summary’. However, perhaps because of the overriding focus on scouring the literature to secure ‘complete’ ascertainment of trial evidence, this advice is not always followed in practice. For example, an overview of treatments for enuresis (Russell and Kiddoo, 2006) pooled studies of treatments for younger, treatment-naïve children with studies of older children who had failed on standard interventions. Not surprisingly, extreme levels of statistical heterogeneity were observed, reflecting the clinical heterogeneity of the populations included (Caldwell et al., 2010). This throws doubt on any attempt to achieve a clinically meaningful answer to the question ‘which treatment is best?’ based on such a heterogeneous body of evidence. Similarly, one cannot meaningfully assess the efficacy of biologics in rheumatoid arthritis by combining trials on first-time use of biologics with trials on patients who have failed on biologic therapy (Singh et al., 2009). These two groups of patients require different decisions based on analyses of different sets of trials. A virtually endless list of examples could be cited. The key point is that the immediate effect of the decision-making perspective, in contrast to the systematic review perspective, is to greatly reduce the clinical heterogeneity of the trial populations under consideration.

The decision-making context has also made the adoption of Bayesian Markov chain Monte Carlo (MCMC) methods almost inevitable. The preferred form of cost-effectiveness analysis at NICE is based on probabilistic decision analysis (Doubilet et al., 1985; Critchfield and Willard, 1986; Claxton et al., 2005b). Uncertainty in parameters, whether arising from statistical sampling error or from other sources, can be propagated through the decision model and reflected in uncertainty about the decision. The decision itself is made on a ‘balance of evidence’ basis: it is an ‘optimal’ decision, given the available evidence, but not necessarily the ‘correct’ decision, because it is made under uncertainty.

Simulation from Bayesian posterior distributions therefore gives a ‘one-step’ solution, allowing proper statistical estimation and inference to be embedded within a probabilistic decision analysis, an approach sometimes called ‘comprehensive decision analysis’ (Parmigiani and Kamlet, 1993; Samsa et al., 1999; Parmigiani, 2002; Cooper et al., 2003; Spiegelhalter, 2003). This fits perfectly not only with cost-effectiveness analyses where the decision maker seeks to maximise the expected net benefit, seen as monetarised health gain minus costs (Claxton and Posnett, 1996; Stinnett and Mullahy, 1998), but also with decision analyses based on maximising any objective function. Throughout the book we have used the flexible and freely available WinBUGS software (Lunn et al., 2009) to carry out the MCMC computations required.
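To give a flavour of how this works in code, the fragment below is a minimal sketch, not taken from the book: the names lambda, cost[], E[] and nt are hypothetical. It shows how an expected net benefit calculation can be embedded directly in a WinBUGS model, so that the net benefit of each treatment is evaluated at every MCMC iteration and uncertainty in the treatment effects is propagated automatically:

    # Sketch only: net benefit computed inside the same MCMC run that
    # estimates the treatment effects. lambda (willingness-to-pay threshold),
    # cost[k] and the effectiveness measure E[k] (derived from the synthesis
    # model) are assumed to be defined elsewhere in the model or as data.
    for (k in 1:nt) {
        nb[k] <- lambda * E[k] - cost[k]    # net benefit of treatment k
    }
    # The posterior mean of nb[k] is the expected net benefit of treatment k;
    # the 'optimal' decision is the treatment with the highest expected value.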

3. Transparency and Consistency of Method

Decisions on which intervention is ‘best’ are increasingly decisions that are made in public. They are scrutinised by manufacturers, bodies representing the health professions, ministries of health and patient organisations, often under the full view of the media. As a result, these decisions, and by extension the technical methods on which they are based, must be transparent, open to debate and capable of being applied in a consistent and fair way across a very wide range of circumstances. In the specific context of NICE, there is not only a scrupulous attention to process (National Institute for Health and Clinical Excellence, 2009b, 2009c) and method (National Institute for Health and Care Excellence, 2013a) but also an explicit rationale for the many societal judgements that are implicit in any health guidance (National Institute for Health and Clinical Excellence, 2008c).

This places quite profound constraints on the properties that methods for evidence synthesis need to have. It encourages us to adopt the same underlying models, the same way of evaluating model fit and the same model diagnostics, regardless of the form of the outcome. It also encourages us to develop methods that will give the same answers whether trials report the outcomes on each arm or just the difference between arms, and whether they report the number of events and person-years of exposure or the number of patients reaching the endpoint in a given period of time. Similarly, we should aim to cope with situations where results from different trials are reported in more than one format. Meeting these objectives is greatly facilitated by the generalised linear modelling framework introduced in Chapter 4 and by the use of shared parameter models. The extraordinary flexibility of MCMC software pays ample dividends here, as shared parameter models cannot be readily estimated by frequentist methods.
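To make the shared parameter idea concrete, the following fragment is an illustrative sketch, with hypothetical variable names, and is not the book’s code. It shows a fixed effect model for a single pairwise comparison in which some trials report arm-level binomial data while others report only a log odds ratio and its precision; both likelihoods contribute to the same effect parameter d:

    for (i in 1:ns.arm) {                        # trials with arm-level data
        for (k in 1:2) {
            r[i,k] ~ dbin(p[i,k], n[i,k])        # binomial likelihood
            logit(p[i,k]) <- mu[i] + d * (k - 1) # trial baseline + shared effect
        }
        mu[i] ~ dnorm(0, 0.0001)                 # vague prior on trial baselines
    }
    for (j in 1:ns.sum) {                        # trials reporting only a LOR
        lor[j] ~ dnorm(d, prec[j])               # the same d enters this likelihood
    }
    d ~ dnorm(0, 0.0001)                         # vague prior on the shared effect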

An even stronger requirement is that the same methods of analysis are used whether there are just two interventions to be compared or whether there are three, four or more. Similarly, the same methods should be used for two-arm trials or for multi-arm trials, that is, trials with three or more arms. For example, manufacturers of treatments B and C, which have both been compared with placebo A in RCTs, would accept that recommendations might have to change, following the addition of new evidence from B versus C trials, but not if this was because a different kind of model had been used to synthesise the data. The methods used throughout this book can be applied consistently: a single piece of software code can analyse any connected network of trials, with two or more treatments, consisting of any combination of indirect comparisons, pairwise comparisons and multi-arm trials, without distinction. This is not a property shared by several other approaches (Lumley, 2002; Jackson et al., 2014), in which fundamentally different models are proposed for networks with different structures. This is not to deny, of course, that these models could be useful in other circumstances, such as when checking assumptions.

4. Some Historical Background

The term ‘network meta-analysis’, from Lumley (2002), is relatively recent. Other terms used include mixed treatment comparisons (MTCs) and multiple-treatments meta-analysis (MTM). The idea of network meta-analysis as an extension of simple pairwise meta-analysis goes back at least to David Eddy’s Confidence Profile Method (CPM) (Eddy et al., 1992), particularly to a study of tissue plasminogen activator (t-PA) and streptokinase involving both indirect comparisons and what was termed a ‘chain of evidence’ (Eddy, 1989) (see Chapter 11). Another early example, from 1998, was a four-treatment network on smoking cessation from Vic Hasselblad (1998), another originator of the CPM. This much-studied dataset has been used by many investigators and lecturers to illustrate their models, and it continues to do service in this book.

A second strand of work can be found in the influential Handbook of Research Synthesis edited by Cooper and Hedges (1994), which came from an educational and social psychology tradition rather than from medicine. A chapter by Gleser and Olkin (1994) describes a method for combining data from trials of several interventions, including some multi-arm studies.

Two other groups, working independently of both the CPM and of the educational psychology statisticians, also published seminal papers. Heiner Bucher, working with Gordon Guyatt and others, discussed ‘indirect comparisons’ (Bucher et al., 1997a, 1997b). A year earlier, Julian Higgins and Anne Whitehead (1996) had proposed two separate ideas. One was to use external data to ‘borrow strength’ to inform the heterogeneity parameter in a random effects model; the other was to pool ‘direct’ and ‘indirect’ evidence on a particular comparison, subject to the assumption that they both estimate the same parameter. The statistical model that they proposed is essentially identical to the core model used throughout this book.
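The assumption can be stated in one line: if d_AB and d_AC denote the effects of treatments B and C relative to a common comparator A, then the ‘indirect’ estimate of the B versus C effect is

    d_BC = d_AC - d_AB

and pooling direct and indirect evidence on the B versus C comparison is legitimate only if both estimate this same parameter.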

In the course of time, we have re-parameterised the Higgins and Whitehead model in several ways (Lu and Ades, 2004, 2006; Ades et al., 2006), adapting it to changes in WinBUGS and recoding it to make it as general as possible. The form in which it is used throughout this book is the same as in the NICE Decision Support Unit Technical Support Documents (Dias et al., 2011a, 2011b, 2011c, 2011d, 2011e), which were written to support the NICE guide to the methods of technology appraisal in its 2008 (National Institute for Health and Clinical Excellence, 2008a) and 2013 editions (National Institute for Health and Care Excellence, 2013a). These documents were reproduced in shorter form as a series of papers in the journal Medical Decision Making (Dias et al., 2013a, 2013b, 2013c, 2013d). Finally, it is worth mentioning that the same coding with different variable names was adopted by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Task Force (Hoaglin et al., 2011; Jansen et al., 2011).

5. How to Use This Book

This book is designed to be used in a variety of different ways to suit readers with different interests and backgrounds. The first chapter, ‘Introduction to Evidence Synthesis’, may serve as a miniature version of the core chapters in the book, touching on the assumptions of network meta-analysis and its key properties, showing what it does that other approaches cannot do and mentioning some of the policy implications. These themes are all picked up in greater detail in later chapters.

The second chapter, ‘The Core Model’, introduces the key network meta-analysis concepts for binomial data. It not only sets out the core model in some detail for dichotomous data but also shows readers how to set up data and obtain outputs from WinBUGS. Chapter 3 introduces the basic tools for model comparison, model critique, leverage analysis and outlier detection. In Chapter 4 a generic network meta-analysis model is presented based on generalised linear models (McCullagh and Nelder, 1989). This brings out the highly modular nature of the WinBUGS coding: once readers are familiar with the basic structure of the core model and its coding (a simplified skeleton is sketched below), they will be able to adapt the model to carry out the same operations on other outcomes, or an extended set of operations, such as meta-regression and inconsistency analysis. This modularity is exploited throughout this book to expand the core model from Chapters 2 and 4 in subsequent chapters. A large number of example WinBUGS codes are provided as online material. To complete the core material, Chapter 5 discusses the conceptual and computational issues that arise when a network meta-analysis is used to inform a cost-effectiveness analysis or other model for decision making. Chapter 6 briefly suggests some workarounds that can be useful in situations with sparse data or zero cell counts.
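For orientation only, the skeleton below sketches what such a model can look like in WinBUGS: a random effects network meta-analysis for binomial data, simplified here to networks of two-arm trials (the code developed in Chapter 2 additionally handles the correlations induced by multi-arm trials). The data names r, n, t, na, ns and nt are illustrative, not the book’s:

    # Simplified sketch only; not the code presented in the book.
    model {
        for (i in 1:ns) {                           # loop over trials
            mu[i] ~ dnorm(0, 0.0001)                # vague prior, trial baseline
            delta[i,1] <- 0                         # no relative effect in arm 1
            for (k in 1:na[i]) {
                r[i,k] ~ dbin(p[i,k], n[i,k])       # binomial likelihood
                logit(p[i,k]) <- mu[i] + delta[i,k] # model on the logit scale
            }
            for (k in 2:na[i]) {
                delta[i,k] ~ dnorm(md[i,k], tau)    # trial-specific random effect
                md[i,k] <- d[t[i,k]] - d[t[i,1]]    # consistency equations
            }
        }
        d[1] <- 0                                   # effects relative to treatment 1
        for (k in 2:nt) { d[k] ~ dnorm(0, 0.0001) } # vague priors, basic parameters
        sd ~ dunif(0, 2)                            # prior for between-trial SD
        tau <- pow(sd, -2)                          # precision of random effects
    }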

From our perspective (see Chapter 12), all forms of heterogeneity are undesirable. Chapters 7, 8 and 9 look at the issues arising from specific types of heterogeneity: inconsistency, covariate adjustment through meta-regression and bias adjustment. Although not listed as ‘core material’, readers should not regard these chapters as optional extras: a thorough understanding of these issues is fundamental to securing valid and interpretable results.

Chapters 10 and 11 take synthesis methodology to a further level of complexity, looking at the possibilities for network meta-analysis of survival outcomes based on data in life-table form and at the synthesis, within and between trials, of multiple (correlated) outcomes.

A final chapter on the validity of network meta-analysis attempts to address the many concerns about this method that have appeared in the literature. In this final chapter we try to clarify the assumptions of network meta-analysis and the terminology that has been used; we review empirical studies of the consistency assumption and present some ‘thought experiments’ to throw light on both the origins of ‘bias’ in evidence synthesis and the differences – if any – between direct and indirect evidence.

For the more experienced WinBUGS user, the book can be used as a primer on Bayesian synthesis of trial data using WinBUGS, or as a source of WinBUGS code that will suit the vast majority of situations usually encountered, with little or no amendment. The chapters on bias models, survival analysis and multiple outcomes may be read simply out of interest, or as a source of ideas the reader might wish to develop further.

For readers who wish to learn how to use network meta-analysis in practice, but who are not familiar with WinBUGS, a grounding in Bayesian methods (Spiegelhalter et al., 2004) and WinBUGS (Spiegelhalter et al., 2007) is essential. The WinBUGS manual, its tutorial, the BLOCKER and other examples in the manual are key resources, as is The BUGS Book (Lunn et al., 2013). Such readers should start with pairwise meta-analysis of binomial data (Chapter 2). They should assure themselves that they understand the WinBUGS outputs and that they are fully aware of the technical details that have to be attended to, such as convergence and the use of a suitably large post-convergence sample. Once all this has been mastered, more complex structures and other outcomes can be attempted (Chapter 4).

Throughout the book more difficult sections are asterisked (*). These sections are not essential, but may be of interest to the more advanced reader.

There are several routes through the book. For a health economist engaged in decision modelling, working with a network meta-analysis generated by, for example, a statistician, Chapter 5 will be particularly relevant. A project manager supervising a synthesis or an HTA exercise, or someone involved in methodological guidance, will want to read the relevant parts of the core material and the heterogeneity sections, ignoring the WinBUGS coding and algebra, glance through the extensions to complex outcomes and give careful attention to Chapter 12 on validity.

We have tried, above all, not to be prescriptive. We have also avoided the modern trend of issuing ‘guidance’. Instead, our priority is to be absolutely clear about the properties that alternative models have. We have already set out some of the properties that we believe evidence synthesis models need to have in the context of transparent, rational decision making. In the course of the book, we will add further to this list of desiderata. Readers can decide for themselves how important they feel these properties are and can then make an informed choice.

Finally, we would like to acknowledge the many colleagues and collaborators who have contributed to, supported, disagreed with and inspired our work. It is not possible to mention them all, but we would like to record our special thanks to statisticians Keith Abrams, Debbi Caldwell, Julian Higgins, Ian White, Chris Schmid and Tom Trikalinos; to health economists Nicola Cooper, Mark Sculpher, Karl Claxton and Neil Hawkins; in the commercial and government sectors, to Jeanni van Loon at Mapi Group, Rachael Fleurence at PCORI, Meindert Boysen and Carole Longson at NICE and Tammy Clifford at CADTH; and, above all, to our dear colleague Guobing Lu.
