14.3. The Structure of a Decision Analysis

The foundations of statistical decision theory were laid by Wald (1950) and Savage (1954). The theory was mainly concerned with problems within statistical inference, such as estimation and hypothesis testing, rather than with real-life decision problems. A standard formulation (Berger, 1985; Lehmann, 1986; French and Ríos Insua, 2000) is that θ denotes the unknown parameter (or "state of nature"). After observing a random vector X, whose distribution is determined by θ, the decision-maker is to make a decision (take an action) denoted by d(X). Statistical decisions may be related to accepting or rejecting a hypothesis or to choosing a point estimate of θ. The consequences of different decisions are modelled by a fully specified loss function L(θ, d(X)). One may proceed by calculating the risk function

r(θ, d) = Eθ[L(θ, d(X))],

where the expectation is taken over the distribution of X for the parameter θ.

Since the risk function depends on the unknown parameter, it is not clear which decision rule is optimal. Two possible criteria for optimality are the minimax and Bayes criteria defined below.

Minimax criterion. The minimax rule chooses the decision d = d(X) that minimises maxθ r(θ, d).
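As an illustration, the minimax choice can be computed directly for a small discrete problem. The decision rules and risk values below are hypothetical, invented for this sketch:

```python
# Minimax choice for a small discrete decision problem.
# risk[d][theta] plays the role of r(theta, d); all values hypothetical.
risk = {
    "d1": {"theta1": 1.0, "theta2": 4.0},
    "d2": {"theta1": 3.0, "theta2": 3.0},
    "d3": {"theta1": 0.5, "theta2": 5.0},
}

# Worst-case risk max_theta r(theta, d) for each rule.
worst_case = {d: max(r.values()) for d, r in risk.items()}

# The minimax rule minimises the worst-case risk.
minimax_rule = min(worst_case, key=worst_case.get)
print(minimax_rule)  # d2: its worst-case risk, 3.0, is the smallest
```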

Bayes criterion. The Bayes solution is based on a prior distribution π(θ) for θ and chooses the decision rule that minimises

∫ r(θ, d) π(θ) dθ.
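For a finite parameter space the Bayes criterion reduces to a prior-weighted sum. Again, the risk values and the prior below are hypothetical illustrations:

```python
# Bayes choice for a small discrete decision problem; the risks and
# the prior are hypothetical illustrations.
risk = {
    "d1": {"theta1": 1.0, "theta2": 4.0},
    "d2": {"theta1": 3.0, "theta2": 3.0},
    "d3": {"theta1": 0.5, "theta2": 5.0},
}
prior = {"theta1": 0.8, "theta2": 0.2}   # pi(theta)

# Bayes risk: the prior-weighted average of r(theta, d), the
# finite-sum analogue of the integral of r(theta, d) pi(theta).
bayes_risk = {d: sum(prior[t] * r[t] for t in prior) for d, r in risk.items()}
bayes_rule = min(bayes_risk, key=bayes_risk.get)
print(bayes_rule)  # d3: its Bayes risk, 1.4, is the smallest
```

With this prior the Bayes rule is d3, even though d3 has the largest worst-case risk; the Bayes and minimax criteria need not agree.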
Statistical decision theory can also be applied to decision problems which we do not necessarily think of as "statistical", such as deciding whether or not to invest (Raiffa, 1961). In these problems the temporal order is often different from that of the typical statistical problem: it is common that a decision d (e.g., a go/no-go decision in a Phase III trial) has to be taken before observing a random outcome X (e.g., trial results). The decision-maker's preferences are often modelled as a utility u = u(θ, d, X). Modelling a utility u instead of a loss L is purely conventional, as the two are easily interchangeable via the relation

u(θ, d, X) = −L(θ, d, X).
It is also common to see sequential decision problems, in which several decisions d1, d2, ... are to be taken. Between the decision points, more and more observations X1, X2, ... can be collected. Each Xk may be a random vector whose distribution is determined by θ. The random outcome Xk can also depend on earlier decisions d1, ..., dk-1, as these may determine which experiments are run. Conversely, the decisions depend on previously collected information: dk+1 = dk+1(X1, ..., Xk). This structure can, at least for simple discrete problems, be illustrated as a decision tree with a number of decision nodes as well as chance nodes.

By writing d = {d1, d2, ...} and X = {X1, X2, ...}, we still have a utility function of the form u = u(θ, d, X). Given a prior distribution for θ, say, π(θ), and using expected utility as the optimality criterion, the problem is to find the decision strategy d which maximises

E[u(θ, d, X)] = ∫ Eθ[u(θ, d, X)] π(θ) dθ,

where Eθ again denotes expectation over the distribution of X given θ.
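When θ or X is continuous, this expectation rarely has a closed form and is often approximated by simulation. The following Monte Carlo sketch uses an invented model (the prior, the distribution of X given θ, and the utility function are all hypothetical) and searches a small grid of decisions:

```python
import random

random.seed(1)

def expected_utility(d, n=100_000):
    # Monte Carlo estimate of E[u(theta, d, X)]: average u over draws
    # of theta from the prior and of X from its distribution given theta.
    total = 0.0
    for _ in range(n):
        theta = random.gauss(0.5, 1.0)   # hypothetical prior pi(theta)
        x = random.gauss(theta, 1.0)     # hypothetical model for X given theta
        total += x * d - 0.15 * d * d    # hypothetical utility u(theta, d, x)
    return total / n

# Choose, on a small grid, the decision that maximises estimated utility.
best_d = max(range(11), key=expected_utility)
print(best_d)
```

Under this invented model the exact expected utility is 0.5d − 0.15d², so the best decision on the grid is d = 2; the simulation recovers this up to Monte Carlo noise.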
Backward induction is often useful for solving this problem (Bather, 2000). The minimax optimality criterion, although fruitful for some statistical problems, is usually of limited value in a real-life decision problem. For example, the minimax solution to the investment problem is almost invariably not to invest. This philosophy could be parodied as one which leads one to spend one's life in bed for fear of being run over by a car if one leaves the house.
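Backward induction can be made concrete on a tiny two-stage go/no-go tree. Everything in the sketch below (the trial cost, payoffs, and probabilities) is hypothetical:

```python
# Backward induction on a tiny two-stage decision tree; all numbers
# (cost, payoffs, probabilities) are hypothetical.
# Stage 1 decision d1: run a trial (cost 10) or stop now.
# Chance node: the trial result X1 is positive with probability 0.6.
# Stage 2 decision d2(X1): launch or abandon after seeing the result.

trial_cost = 10.0
p_positive = 0.6          # prior predictive probability of a positive trial
p_works_given_pos = 0.9   # posterior probability the drug works, given a positive result
p_works_given_neg = 0.1   # ... given a negative result

def launch_value(p_works):
    # Expected utility of launching: payoff 100 if the drug works, -30 if not.
    return p_works * 100.0 + (1.0 - p_works) * (-30.0)

# Step 1: solve the *last* stage first -- the optimal d2 at each chance branch.
v_pos = max(launch_value(p_works_given_pos), 0.0)  # launch (87) beats abandon (0)
v_neg = max(launch_value(p_works_given_neg), 0.0)  # abandon (0) beats launch (-17)

# Step 2: fold the stage-2 values back to the stage-1 decision.
v_trial = p_positive * v_pos + (1.0 - p_positive) * v_neg - trial_cost
v_stop = 0.0
best = "run trial" if v_trial > v_stop else "stop"
print(best, round(v_trial, 1))
```

Working from the leaves back to the root in this way evaluates each decision node only once, which is what makes backward induction tractable for larger trees.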

A main criticism of fully Bayesian methods that rely on informative priors has been their subjective component. Results that depend strongly on the experimenter's prior beliefs are unlikely to be fully accepted by other stakeholders or by the scientific community. This criticism, however, loses its strength when the conclusions are not meant to convince anyone other than those who make the assumptions. A semi-Bayesian approach is therefore reasonable: use Bayesian methods internally but classical frequentist methods externally. For example, a company might use its prior opinion of the effect of a new drug to decide how many resources, if any, to put into its development, how clinical trials should be designed and dimensioned, and so on. Still, the company may present traditional frequentist statistical analyses to the regulatory bodies and to the scientific community when publishing the results. As we will see, the utility may even be constructed as a function of the result of the frequentist analysis.

The term "Decision Analysis" was coined in 1965 by Howard (Howard and Matheson, 1984, pages 3 and 97). Decision analysis is partly the application of statistical decision theory, but it also pays attention to the structuring and modelling of a problem. These issues are typically more complicated than the computational optimisation given a specified model. Good accounts of decision analysis are provided by Raiffa (1968) and in the collection of papers edited by Howard and Matheson (1984).
