2 Coherence

Decision making under uncertainty is about making choices whose consequences are not completely predictable, because events will happen in the future that will affect the consequences of actions taken now. For example, when deciding whether to play a lottery, the consequences of the decision will depend on the number drawn, which is unknown at the time when the decision is made. When deciding treatment for a patient, consequences may depend on future events, such as the patient’s response to that treatment. Political decisions may depend on whether a war will begin or end within the next month. In this chapter we discuss de Finetti’s justification, the first of its kind, for using the calculus of probability as a quantification of uncertainty in decision making.

In the lottery example, uncertainty can be captured simply by the chance of a win, thought of, at least approximately, as the long-term frequency of wins over many identical replications of the same type of draw. This definition of probability is generally referred to as frequentist. When making a prognosis for a medical patient, chances based on relative frequencies are still useful: for example, we would probably be interested in knowing the frequency of response to therapy within a population of similar patients. However, in the lottery example, we could work out properties of the relevant frequencies on the basis of plausible approximations of the physical properties of the draw, such as independence and equal chance. With patients, that is not so straightforward, and we would have to rely on observed populations. Patients in a population are more different from one another than repeated games of the lottery, and differ in ways we may not understand. To trust the applicability of observed relative frequencies to our decision we have to introduce an element of judgment about the comparability of the patients in the population. Finally, events like “Canada will go to war within a month” are prohibitive from a relative frequency standpoint. Here the element of judgment has to be preponderant, because it is not easy to assemble, or even imagine, a collection of similar events of which the event in question is a typical representative.

The theory of subjective probability, developed in the 1930s by Ramsey and de Finetti, is an attempt to develop a formalism that can handle quantification of uncertainty in a wide spectrum of decision-making and prediction situations. In contrast to frequentist theories, which interpret probability as a property of physical phenomena, de Finetti suggests that it is more useful and general to define probability as a property of decision makers. An intelligent decision maker may recognize and use probabilistic properties of physical phenomena, but can also go beyond. Somewhat provocatively, de Finetti often said that probability does not exist—meaning that it is not somewhere out there to be discovered, irrespective of the person, or scientific community, trying to discover it. De Finetti posed the question of whether there could be a calculus for these more general subjective probabilities. He proposed that the axioms of probability commonly motivated by the frequency definition could alternatively be justified by a single rationality requirement now known as coherence. Coherence amounts to avoiding loss in those situations in which the probabilities are used to set betting odds. De Finetti’s proposal for subjective probability was originally published in 1931 (de Finetti 1931b) and in English, in condensed form, in 1937 (de Finetti 1937).

In Section 2.1.1 we introduce the simple betting game that motivates the notion of coherence, the fundamental rationality principle underlying the theory. In Section 2.1.2 we present the so-called Dutch Book argument, which shows that an incoherent probability assessor can be made a sure loser, and establishes a connection between coherence and the axioms of probability. In Section 2.1.3 we show how to derive conditional probability, the multiplication rule, and Bayes’ theorem from coherence conditions. In Section 2.2 we present a temporal coherence theory (Goldstein 1985) that extends de Finetti’s to situations where personal or subjective beliefs can be revised over time. De Finetti’s 1937 paper is important in statistics for other reasons as well. For example, it was a key paper in the development of the notion of exchangeability.

Featured article:

de Finetti, B. (1937). Foresight: Its logical laws, its subjective sources, in H. E. Kyburg and H. E. Smokler (eds.), Studies in Subjective Probability, Krieger, New York, pp. 55–118.

Useful general readings are de Finetti (1974) and Lindley (2000). For a comprehensive overview of de Finetti’s contributions to statistics see Cifarelli and Regazzini (1996).

2.1 The “Dutch Book” theorem

2.1.1 Betting odds

De Finetti’s theory of subjective probability is usually described using the metaphor of betting on the outcome of an as yet unknown event, say a sports event. This is a stylized situation, but it is representative of many simple decision situations, such as setting insurance premiums. The odds offered by a sports bookmaker, or the premiums set by an insurance company, reflect their judgment about the probabilities of events. So this seems like a natural place to start thinking about quantification of uncertainty.

Savage worries about a point that matters a great deal to philosophers and surprisingly less so to statisticians writing on foundations:

The idea of facts known is implicit in the use of the preference theory. For one thing, the person must know what acts are available to him. If, for example, I ask what odds you give that the fourth toss of this coin will result in heads if the first three do, it is normally implicitly not only that you know I will keep my part of the bargain if we bet, but also that you will know three heads if you see them. The statistician is forever talking about what reaction would be appropriate to this or that set of data, or givens. Yet, the data never are quite given, because there is always some doubt about what we have actually seen. Of course, in any applications, the doubt can be pushed further along. We can replace the event of three heads by the less immediate one of three tallies-for-head recorded, and then take into our analysis the possibility that not every tally is correct. Nonetheless, not only universals, but the most concrete and individual propositions are never really quite beyond doubt. Is there, then, some avoidable lack of clarity and rigor in our allusion to known facts? It has been argued that since indeed there is no absolute certainty, we should understand by “certainty” only strong relative certainty. This counsel is provocative, but does seem more to point up, than to answer, the present question. (Savage 1981a, p. 512)

Filing away this concern under “philosophical aches and pains,” as Savage would put it, let us continue with de Finetti’s plan.

Because bookmakers (and insurance companies!) make a profit, we will, at least for now, dissect the problem so that only the probabilistic component is left. So we will look at a situation where bookmakers are willing to buy and sell bets at the same odds. To get rid of considerations that come up when one bets very large sums of money, we will assume, like de Finetti, that we are in a range of bets that involves enough money for the decision maker to take things seriously, but not big enough that aversion to potentially large losses may interfere. In the next several chapters we will discuss how to replace monetary amounts with a more abstract and general measure, built upon ideas of Ramsey and others, that captures the utility to a decision maker of owning that money. For now, though, we will keep things simple and follow de Finetti’s assumptions. With regard to this issue, de Finetti (1937) notes that:

Such a formulation could better, like Ramsey’s, deal with expected utilities; I did not know of Ramsey’s work before 1937, but I was aware of the difficulty of money bets. I preferred to get around it by considering sufficiently small stakes, rather than to build up a complex theory to deal with it. (de Finetti 1937, p. 140)

So de Finetti separates the derivation of probability from consideration of utility, although rationality, more broadly understood, is part of his argument. In later writings (de Finetti 1952, de Finetti 1964b), he discussed explicitly the option of deriving both utilities and probability from a single set of preferences, and seemed to consider it the most appropriate way to proceed in decision problems, but maintained that the separation is preferable in general, giving two reasons:

First, the notion of probability, purified from the factors that affect utility, belongs to a logical level that I would call “superior”. Second, constructing the calculus of probability in its entirety requires vast developments concerning probability alone. (de Finetti 1952, p. 698, our translation)

These “vast developments” begin with the notion of coherent probability assessments. Suppose we are interested in predicting the result of an upcoming tennis match, say between Fisher and Neyman. Bookmakers are generally knowledgeable about this, so we are going to examine the bets they offer as a possible way of quantifying uncertainty about who will win. Bookmakers post odds. If the posted odds in favor of Fisher are, say, 1:2, one can bet one dollar, and win two dollars if Fisher wins and nothing if he does not. In sports you often see the reciprocal of the odds, also called odds “against,” and encounter expressions like “Fisher is given 2:1” to convey the odds against.

To make this more formal, let θ be the indicator of the event “Fisher wins the game.” We say, equivalently, that θ occurred or θ is true or that the true value of θ is 1. A bet is a ticket that will be worth a stake S if θ occurs and nothing if θ does not occur. A bookmaker generally sells bets at a price $\pi_\theta S$. The price $\pi_\theta$ is expressed in units of the stake; when there is no ambiguity we will simply use π. The ratio π : (1 – π) is the betting odds in favor of the event θ. In our previous example, where odds in favor are 1:2, the stake S is three dollars, the price πS is one dollar, and π is 1/3.
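The conversion between odds in favor and the price per unit stake can be sketched in a few lines of Python. The function names below are ours and purely illustrative; exact fractions are used so that the 1:2 example reproduces π = 1/3 without rounding.

```python
from fractions import Fraction

def price_from_odds(in_favor, against):
    """Price per unit stake implied by odds in_favor:against on an event."""
    return Fraction(in_favor, in_favor + against)

def odds_from_price(price):
    """Odds in favor, price : (1 - price), reduced to lowest integer terms.

    Expects an exact value (a Fraction or an int), not a float.
    """
    ratio = Fraction(price) / (1 - Fraction(price))
    return ratio.numerator, ratio.denominator

pi = price_from_odds(1, 2)      # the 1:2 example from the text
stake = 3                       # dollars
print(pi, pi * stake)           # price per unit stake, and ticket price in dollars
print(odds_from_price(pi))      # back to odds in favor
```

With odds of 1:2 and a stake of three dollars, the ticket price is one dollar, as in the text.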

The action of betting on θ, or buying the ticket, will be denoted by $a_{S,\theta}$. This action can be taken by either a client or the bookmaker, although in real life it is more often the client’s. What are the consequences of this action? The buyer will have a net gain of (1 – π)S, that is the stake S less the price πS, if θ = 1, or a net gain of –πS, if θ = 0. These net gains are summarized in Table 2.1. We are also going to allow for negative stakes. The action $a_{-S,\theta}$, whose consequences are also shown in Table 2.1, reverses the role of the buyer and seller compared to $a_{S,\theta}$. A stake of zero will represent abstaining from the bet. Also, action $a_{-S,\theta}$ (selling the bet) has by definition the same payoff as buying a bet on the event θ = 0 at stake S and price $\pi_{1-\theta} = 1 - \pi_\theta$.

Table 2.1 Payoffs for the actions corresponding to buying and selling bets at stake S on event θ at odds π : (1 – π).

Action                                States of the world
                                      θ = 1           θ = 0
Buy bet on θ    $a_{S,\theta}$        (1 – π)S        –πS
Sell bet on θ   $a_{-S,\theta}$       –(1 – π)S       πS
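The payoffs of Table 2.1 can be captured by a single expression, (θ – π)S, in which a negative stake stands for selling. The following sketch, with a function name of our choosing, tabulates both actions for the running example (π = 1/3, S = 3) and checks the equivalence between selling a bet on θ and buying a bet on θ = 0 at price 1 – π.

```python
from fractions import Fraction

def net_gain(price, stake, theta):
    """Net gain to the holder of a bet on theta at the given price and stake.

    A positive stake means buying the ticket; a negative stake means selling it.
    theta is 1 if the event occurs and 0 otherwise.
    """
    return (theta - price) * stake

pi, S = Fraction(1, 3), 3
for label, stake in [("buy", S), ("sell", -S)]:
    print(label, net_gain(pi, stake, 1), net_gain(pi, stake, 0))

# Selling a bet on theta pays the same as buying a bet on (1 - theta) at price 1 - pi.
for theta in (0, 1):
    assert net_gain(pi, -S, theta) == net_gain(1 - pi, S, 1 - theta)
```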

We will work in the stylized context of a bookmaker who posts odds π : (1 – π) and is willing to buy or sell bets at those odds, for any stake. In other words, once the odds are posted, the bookmaker is indifferent between buying and selling bets on θ, or abstaining. The expression we will use for this is that the odds are fair from the bookmaker’s perspective. It is implicitly assumed that the bookmaker can assess his or her willingness to buy and sell directly. Assessing odds is therefore assumed to be a primitive, as opposed to derivative, way of expressing preferences among actions involving bets. We will need a notation for binary comparisons among bets: $a_1 \sim a_2$ indicates indifference between two bets. For example, a bookmaker who considers odds on θ to be fair is indifferent between $a_{S,\theta}$ and $a_{-S,\theta}$, that is $a_{S,\theta} \sim a_{-S,\theta}$. Also, the symbol ≻ indicates a strict preference relation. For example, if odds in favor of θ are considered too high by the bookmaker, then $a_{-S,\theta} \succ a_{S,\theta}$.

2.1.2 Coherence and the axioms of probability

Before proceeding with the formal development, let us illustrate the main idea using a simple example. Lindley (1985) has a similar one, although he should not be blamed for the choice of tennis players. Let us imagine that you know a bookmaker who is willing to take bets on the outcome of the match between Fisher and Neyman. Say the prices posted by the bookmaker are 0.2 (1:4 odds in favor) for bets on the event θ: “Fisher wins,” and 0.7 (7:3 odds in favor) for bets on the event “Neyman wins.” In the setting of the previous section, this means that this bookmaker is willing to buy or sell bets on θ for a stake S at those prices. If you bet on θ, the bookmaker cashes 0.2S and then gives you back S (a net gain to him of –0.8S) if Fisher wins, and nothing (a net gain of 0.2S) if Fisher loses. In tennis there are no ties, so the event “Neyman wins” is the same as “Fisher loses” or θ = 0. The bookmaker has posted separate odds on “Neyman wins,” and those imply that you can also bet on that, in which case the bookmaker cashes 0.7S and returns S (a net gain of –0.3S) if Neyman wins and nothing (a net gain of 0.7S) if Neyman loses. Let us now see what happens if you place both bets. Your gains are:

 

                     Fisher wins     Neyman wins
Bet 1                0.8S            –0.2S
Bet 2                –0.7S           0.3S
Both bets (total)    0.1S            0.1S

So by placing both bets you can make the bookmaker lose money irrespective of whether Fisher or Neyman will win! If the stake is 10 dollars, you win one dollar either way. There is an internal inconsistency in the prices posted by the bookmaker that can be exploited to an economic advantage. In some bygone era, this used to be called “making Dutch Book against the bookmaker.” It is a true embarrassment, even aside from financial considerations. So the obvious question is “how to avoid Dutch Book?” Are there conditions that we can impose on the prices so that a bookmaker posting those prices cannot be made a sure loser? The answer is quite simple: prices π have to satisfy the axioms of probability. This argument is known as the “Dutch Book theorem” and is worth exploring in some detail.
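The arithmetic of this example is easy to reproduce. The sketch below uses the prices 0.2 and 0.7 from the text and a stake of 10 dollars; the function name is ours. It shows the client’s total gain under each outcome.

```python
from fractions import Fraction

# Prices posted by the bookmaker in the example.
price_fisher = Fraction(2, 10)
price_neyman = Fraction(7, 10)
S = 10  # stake on each bet, in dollars

def client_gain(price, stake, wins):
    """Client's net gain from a bought bet: the stake if the event occurs, minus the price paid."""
    return (int(wins) - price) * stake

for outcome in ("Fisher", "Neyman"):
    total = (client_gain(price_fisher, S, outcome == "Fisher")
             + client_gain(price_neyman, S, outcome == "Neyman"))
    print(outcome, "wins: client gains", total)
```

In general the total gain from buying both bets at stake S is (1 – (0.2 + 0.7))S, whatever the outcome. Had the two prices summed to more than one, the client could secure a sure gain by selling both bets instead.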

Avoiding Dutch Book is the rationality requirement that de Finetti had in mind when introducing coherence.

Definition 2.1 (Coherence) A bookmaker’s betting odds are coherent if a client cannot place a bet or a combination of bets such that no matter what outcome occurs, the bookmaker will lose money.

The next theorem, due to de Finetti (1937), formalizes the claim that coherence requires prices that satisfy the axioms of probabilities. For a more formal development see Shimony (1955). The conditions of the theorem, in the simple two-event version, are as follows. Consider disjoint events θ1 and θ2, and assume a bookmaker posts odds (and associated prices) on all the events in the algebra induced by these events, that is on 1 – Θ, θ1, θ2, (1 – θ1)(1 – θ2), θ1 + θ2, 1 – θ1, 1 – θ2, Θ, where Θ is the indicator of the sure event. As we discussed, there are two structural assumptions being made:

DBT1: The odds are fair to the bookmaker, that is the bookmaker is willing to both sell and buy bets on any of the events posted.

DBT2: There is no restriction about the number of bets that clients can buy or sell, as long as this is finite.

The first condition is required to guarantee that the odds reflect the bookmaker’s knowledge about the relevant uncertainties, rather than desire to make a profit. The second condition is used, in de Finetti’s words, to “purify” the notion of probability from the factors that affect utility. It is strong: it implies, for example, that the bookmaker values the next dollar just as much as if it were his or her last dollar. Even with this caveat, this is a very interesting set of conditions for an initial study of the rational underpinnings of probability.

Theorem 2.1 (Dutch Book theorem) If DBT1 and DBT2 hold, a necessary condition for a set of prices to be coherent is to satisfy Kolmogorov’s axioms, that is:

Axiom 1 $0 \leq \pi_\theta \leq 1$, for every θ.

Axiom 2 $\pi_\Theta = 1$.

Axiom 3 If $\theta_1$ and $\theta_2$ are such that $\theta_1\theta_2 = 0$, then $\pi_{\theta_1} + \pi_{\theta_2} = \pi_{\theta_1+\theta_2}$.

Proof: We assume that the odds are fair to the bookmaker, and consider the gain gθ made by a client buying bets from the bookmaker on event θ. If the gain is strictly positive for every θ in a partition, then the bookmaker can be made a sure loser and is incoherent.

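Axiom 1: Suppose, by contradiction, that $\pi_\theta > 1$. When S < 0, the gain $g_\theta$ to a client is

$$
g_\theta = \begin{cases} (1 - \pi_\theta)S & \text{if } \theta = 1 \\ -\pi_\theta S & \text{if } \theta = 0 \end{cases}
$$

which is strictly positive for both values of θ. Similarly, $\pi_\theta < 0$ and S > 0 also imply a sure loss.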
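A quick numerical check of this step, using hypothetical prices of 6/5 and –1/5 chosen only for illustration, confirms that the client gains in both states when selling at a price above one or buying at a price below zero.

```python
from fractions import Fraction

def client_gains(price, stake):
    """Client's net gain in the two states (theta = 1, theta = 0)."""
    return (1 - price) * stake, -price * stake

# Hypothetical incoherent prices, chosen only for illustration.
above_one = client_gains(Fraction(6, 5), -10)    # price above 1, client sells (S < 0)
below_zero = client_gains(Fraction(-1, 5), 10)   # price below 0, client buys (S > 0)

print(above_one, all(g > 0 for g in above_one))
print(below_zero, all(g > 0 for g in below_zero))
```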

Axiom 2: Let us assume that Axiom 1 holds. Say, for a contradiction, that 0 ≤ πΘ < 1. For any S > 0 the gain gΘ is (1 – πΘ)S > 0, if Θ = 1. Because Θ = 1 by definition, this implies a sure loss.

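Axiom 3: Let us consider separate bets on $\theta_1$, $\theta_2$, and $\theta_3 = \theta_1 + \theta_2 - \theta_1\theta_2 = \theta_1 + \theta_2$. Here $\theta_3$ is the indicator of the union of the two events represented by $\theta_1$ and $\theta_2$. Say stakes are $S_{\theta_1}$, $S_{\theta_2}$, and $S_{\theta_3}$, respectively. Consider the partition given by $\theta_1$, $\theta_2$, and $(1 - \theta_1)(1 - \theta_2)$. The net gains to the client in each of those cases are

$$
\begin{aligned}
g_{\theta_1} &= S_{\theta_1} + S_{\theta_3} - (\pi_{\theta_1}S_{\theta_1} + \pi_{\theta_2}S_{\theta_2} + \pi_{\theta_3}S_{\theta_3}) \\
g_{\theta_2} &= S_{\theta_2} + S_{\theta_3} - (\pi_{\theta_1}S_{\theta_1} + \pi_{\theta_2}S_{\theta_2} + \pi_{\theta_3}S_{\theta_3}) \\
g_{(1-\theta_1)(1-\theta_2)} &= -(\pi_{\theta_1}S_{\theta_1} + \pi_{\theta_2}S_{\theta_2} + \pi_{\theta_3}S_{\theta_3}).
\end{aligned}
$$

These three equations can be rewritten in matrix notation as Rs = g, where

$$
R = \begin{pmatrix}
1 - \pi_{\theta_1} & -\pi_{\theta_2} & 1 - \pi_{\theta_3} \\
-\pi_{\theta_1} & 1 - \pi_{\theta_2} & 1 - \pi_{\theta_3} \\
-\pi_{\theta_1} & -\pi_{\theta_2} & -\pi_{\theta_3}
\end{pmatrix}, \quad
s = \begin{pmatrix} S_{\theta_1} \\ S_{\theta_2} \\ S_{\theta_3} \end{pmatrix}, \quad
g = \begin{pmatrix} g_{\theta_1} \\ g_{\theta_2} \\ g_{(1-\theta_1)(1-\theta_2)} \end{pmatrix}.
$$

If the matrix R is invertible, the system can be solved to get $s = R^{-1}g$. This means that a client can set g to be a vector of positive values, corresponding to losses for the bookmaker. Thus coherence requires that the matrix R be singular, that is |R| = 0, which in turn implies, after a little bit of algebra, that $\pi_{\theta_1} + \pi_{\theta_2} - \pi_{\theta_3} = 0$.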
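To see the argument at work, the following sketch builds the gain matrix for a set of hypothetical prices that violate additivity (0.3 + 0.4 ≠ 0.6), checks that its determinant equals the sum of the first two prices minus the third, and solves for stakes that give the client a gain of one in every state. The helper functions are ours; Cramer’s rule with exact fractions is used only to keep the example self-contained.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve m x = rhs by Cramer's rule; m must be nonsingular."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution

def gain_matrix(p1, p2, p3):
    """Gain matrix of the proof: rows are the states theta1, theta2, neither."""
    return [[1 - p1, -p2, 1 - p3],
            [-p1, 1 - p2, 1 - p3],
            [-p1, -p2, -p3]]

# Hypothetical prices on theta1, theta2 and their union that violate additivity.
p1, p2, p3 = Fraction(3, 10), Fraction(4, 10), Fraction(6, 10)
R = gain_matrix(p1, p2, p3)
print(det3(R) == p1 + p2 - p3)           # the determinant is p1 + p2 - p3

target = [Fraction(1)] * 3               # client asks for a gain of 1 in every state
stakes = solve3(R, target)
gains = [sum(r * s for r, s in zip(row, stakes)) for row in R]
print(stakes, gains)
```

Here the client sells bets on the two separate events and buys a bet on their union. When the prices are additive the determinant is zero and no such stakes exist.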

We stated and proved this theorem in the simple case of two disjoint events. This same argument can be extended to an arbitrary finite set of disjoint events, as done in de Finetti (1937). From a mathematical standpoint it is also possible, and standard, to define axioms in terms of countable additivity, that is additivity over a denumerable partition. This also permits us to talk about coherent probability distributions on both discrete and continuous random variables.

The extension of coherence results from finite to denumerable partitions is controversial, as some find it objectionable to state a rationality requirement in terms of a circumstance as abstract as taking on an infinite number of bets. In truth, the theory as stated allows for any finite number of bets, so this number can easily be made to be large enough to be ridiculously unrealistic anyway. But there are other reasons as well why finite-bets theory is fun to explore. Seidenfeld (2001) reviews some differences between the countably additive theory of probability and the alternative theory built solely using finitely additive probability.

A related issue concerns events of probability zero more generally. Shimony (1955), for example, has criticized the coherence condition we discussed in this chapter as too weak, and prefers a stricter version that would not consider it rational to choose a bet whose return is never positive and sometimes negative. This version implies that no possible event can have probability zero, a requirement sometimes referred to as “Cromwell’s Rule” (Lindley 1982a).

2.1.3 Coherent conditional probabilities

In this section we present a second Dutch Book theorem that applies to coherence of conditional probabilities. The first step in de Finetti’s development is to define conditional statements. These are more general logical statements based on a three-valued logic: statement A conditional on B can be either true, if both are true, or false, if A is false and B is true, or void if B is false. In betting terminology this idea is operationalized by the so-called “called-off” bets. A bet on θ1, with stake S at a price π, called off if θ2 does not occur, means buying at price πS a ticket worth the following. If θ2 does not occur the price πS will be returned. If θ2 occurs the ticket will be worth S if θ1 occurs and nothing if θ1 does not occur, as usual. We denote by πθ1|θ2 the price of this bet. The payoff is then described by Table 2.2. Under the same structural conditions of Theorem 2.1, we have that:

Theorem 2.2 (Multiplication rule) A necessary condition for coherence of prices of called-off bets is that $\pi_{\theta_1|\theta_2}\pi_{\theta_2} = \pi_{\theta_1\theta_2}$.

Table 2.2 Payoffs corresponding to buying a bet on θ1 = 1, called off if θ2 = 0.

Action                             States of the world
                                   θ1θ2 = 1                            (1 – θ1)θ2 = 1                  θ2 = 0
Bet on θ1, called off if θ2 = 0    $(1 - \pi_{\theta_1|\theta_2})S$    $-\pi_{\theta_1|\theta_2}S$     0

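Proof: Consider bets on $\theta_2$, $\theta_1\theta_2$, and $\theta_1|\theta_2$ with stakes $S_{\theta_2}$, $S_{\theta_1\theta_2}$, and $S_{\theta_1|\theta_2}$, respectively, and the partition $\theta_1\theta_2$, $(1 - \theta_1)\theta_2$, $(1 - \theta_2)$. The net gains are

$$
\begin{aligned}
g_{\theta_1\theta_2} &= S_{\theta_2} + S_{\theta_1\theta_2} + S_{\theta_1|\theta_2} - (\pi_{\theta_2}S_{\theta_2} + \pi_{\theta_1\theta_2}S_{\theta_1\theta_2} + \pi_{\theta_1|\theta_2}S_{\theta_1|\theta_2}) \\
g_{(1-\theta_1)\theta_2} &= S_{\theta_2} - (\pi_{\theta_2}S_{\theta_2} + \pi_{\theta_1\theta_2}S_{\theta_1\theta_2} + \pi_{\theta_1|\theta_2}S_{\theta_1|\theta_2}) \\
g_{1-\theta_2} &= -(\pi_{\theta_2}S_{\theta_2} + \pi_{\theta_1\theta_2}S_{\theta_1\theta_2} + 0).
\end{aligned}
$$

These three equations can be rewritten in matrix notation as Rs = g, where

$$
R = \begin{pmatrix}
1 - \pi_{\theta_2} & 1 - \pi_{\theta_1\theta_2} & 1 - \pi_{\theta_1|\theta_2} \\
1 - \pi_{\theta_2} & -\pi_{\theta_1\theta_2} & -\pi_{\theta_1|\theta_2} \\
-\pi_{\theta_2} & -\pi_{\theta_1\theta_2} & 0
\end{pmatrix}, \quad
s = \begin{pmatrix} S_{\theta_2} \\ S_{\theta_1\theta_2} \\ S_{\theta_1|\theta_2} \end{pmatrix}, \quad
g = \begin{pmatrix} g_{\theta_1\theta_2} \\ g_{(1-\theta_1)\theta_2} \\ g_{1-\theta_2} \end{pmatrix}.
$$

The requirement of coherence implies that |R| = 0, which in turn implies that

$$
\pi_{\theta_1|\theta_2}\pi_{\theta_2} - \pi_{\theta_1\theta_2} = 0.
$$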
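The same construction can be illustrated numerically. In the sketch below the prices are hypothetical and deliberately violate the multiplication rule (0.5 × 0.5 ≠ 0.2). The closed-form stakes come from solving the three gain equations above for equal gains, a derivation of ours rather than a formula from the text.

```python
from fractions import Fraction

def called_off_gains(p2, p12, pc, s2, s12, sc):
    """Client's net gains in the states theta1*theta2, (1-theta1)*theta2, 1-theta2.

    p2, p12, pc are prices for theta2, theta1*theta2 and theta1 | theta2;
    s2, s12, sc are the corresponding stakes.
    """
    cost_if_theta2 = p2 * s2 + p12 * s12 + pc * sc
    cost_if_not_theta2 = p2 * s2 + p12 * s12      # the called-off bet is refunded
    return (s2 + s12 + sc - cost_if_theta2,
            s2 - cost_if_theta2,
            -cost_if_not_theta2)

def sure_gain_stakes(p2, p12, pc, gain=1):
    """Stakes giving the client the same gain in all three states.

    Requires pc * p2 != p12, that is, prices violating the multiplication rule.
    """
    sc = Fraction(gain) / (p12 - pc * p2)
    return pc * sc, -sc, sc   # stakes on theta2, theta1*theta2, theta1 | theta2

# Hypothetical incoherent prices.
p2, p12, pc = Fraction(1, 2), Fraction(1, 5), Fraction(1, 2)
stakes = sure_gain_stakes(p2, p12, pc)
print(stakes, called_off_gains(p2, p12, pc, *stakes))
```

The division fails exactly when the prices satisfy the multiplication rule, which mirrors the singularity of R in the proof.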

At this point we have at our disposal all the machinery of probability calculus. For example, a corollary of the law of total probability and the conditioning rule is Bayes’ rule. Therefore, coherent probability assessments must also obey Bayes’ rule.
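As a sketch of how this corollary follows, and under the added assumption (ours, for the illustration) that $\pi_{\theta_1} > 0$, Theorem 2.2 can be applied twice, once conditioning on $\theta_2$ and once on $\theta_1$, and combined with Axiom 3 applied to the disjoint events $\theta_1\theta_2$ and $\theta_1(1 - \theta_2)$:

$$
\begin{aligned}
\pi_{\theta_2|\theta_1}\pi_{\theta_1} &= \pi_{\theta_1\theta_2} = \pi_{\theta_1|\theta_2}\pi_{\theta_2}, \\
\pi_{\theta_1} &= \pi_{\theta_1\theta_2} + \pi_{\theta_1(1-\theta_2)} = \pi_{\theta_1|\theta_2}\pi_{\theta_2} + \pi_{\theta_1|1-\theta_2}(1 - \pi_{\theta_2}), \\
\pi_{\theta_2|\theta_1} &= \frac{\pi_{\theta_1|\theta_2}\pi_{\theta_2}}{\pi_{\theta_1|\theta_2}\pi_{\theta_2} + \pi_{\theta_1|1-\theta_2}(1 - \pi_{\theta_2})}.
\end{aligned}
$$

The second line also uses $\pi_{1-\theta_2} = 1 - \pi_{\theta_2}$, which follows from Axioms 2 and 3.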

If we accept countable additivity, we can use continuous random variables and their properties. One can define a parallel set of axioms in terms of expectations of random variables that falls back on the case we studied if the random variables are binary. An important case is that of conditional expectations. If θ is any continuous random variable and θ2 is an event, then the “conditional random variable” θ given θ2 can be defined as

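$$
\theta|\theta_2 = \theta\theta_2 + (1 - \theta_2)E[\theta|\theta_2]
$$

where θ is observed if θ2 occurs and not otherwise. Taking expectations,

$$
E[\theta|\theta_2] = E[\theta\theta_2] + (1 - \pi_{\theta_2})E[\theta|\theta_2]
$$

and solving gives

$$
E[\theta|\theta_2] = \frac{E[\theta\theta_2]}{\pi_{\theta_2}}. \tag{2.1}
$$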

We will use this relationship in Section 2.2.

2.1.4 The implications of Dutch Book theorems

What questions have we answered so far? We definitely answered de Finetti’s original one, that is: assuming that we desire to use probability to represent an individual’s knowledge about unknowns, is there a justification for embracing the axioms of probability, as stated a few years earlier by Kolmogorov? The answer is: yes, there is. If the axioms are not satisfied the probabilities are “intrinsically contradictory” and lead to losing in a hypothetical game in which they are put to the test by allowing others to bet at the implied odds.

The laws of probability are, in de Finetti’s words, “conditions which characterize coherent opinions (that is, opinions admissible in their own right) and which distinguish them from others that are intrinsically contradictory.” Within those constraints, a probability assessor is entitled to any opinion. De Finetti continues:

a complete class of incompatible events θ1, θ2, ..., θn being given, all the assignments of probability that attribute to π1, π2, ..., πn any values whatever, which are non-negative and have a sum equal to unity, are admissible assignments: each of these evaluations corresponds to a coherent opinion, to an opinion legitimate in itself, and every individual is free to adopt that one of these opinions which he prefers, or, to put in more plainly, that which he feels. The best example is that of a championship where the spectator attributes to each team a greater or smaller probability of winning according to his own judgment; the theory cannot reject apriori any of these judgments unless the sum of the probabilities attributed to each team is not equal to unity. This arbitrariness, which any one would admit in the above case, exists also, according to the conception which we are maintaining, in all other domains, including those more or less vaguely defined domains in which the various objective conceptions are asserted to be valid. (de Finetti 1937, pp. 139–140)

The Dutch Book argument provides a calculus for using subjective probability in the quantification of uncertainty and gives decision makers great latitude in establishing fair odds based on formal or informal processing of knowledge. With this freedom come two important constraints. One is that probability assessors be ready, at least hypothetically, to “put their money where their mouth is.” Unless ready to lie about their knowledge (we will return to this in Chapter 10), the probability assessor does not have an incentive to post capricious odds. The other is implicit in de Finetti’s definition of event as a statement whose truth will become known to the bettors. That truth of events can be known and agreed upon by many individuals in all relevant scientific contexts is somewhat optimistic, and reveals the influence of the positivist philosophical school on de Finetti’s thought. From a statistical standpoint, though, it is healthy to focus controversies on observable events, rather than theoretical entities that may not be ultimately measured, such as model parameters. The latter, however, are important in a number of scientific settings. De Finetti’s theory of exchangeability, also covered in his 1937 article, is a formidable contribution to grounding parametric inference in statements about observables. Covering it here would take us too far afield. A good entry point to the extensive literature is Cifarelli and Regazzini (1996).

A second related question is the following: is asking someone for their fair betting odds a good way to find out their probability of an event? Or, stated more technically, is the betting mechanism a practical elicitation tool for measuring subjective probability? The Dutch Book argument does not directly mention this, but nonetheless this is an interesting possibility. Discussions are in de Finetti (1974), Kadane and Winkler (1988), Seidenfeld et al. (1990a) and Garthwaite et al. (2005) who connect statistical considerations to the results of psychological research about “how people represent uncertain information cognitively, and how they respond to questions about that information.”

A third question is perhaps the most important: do all rational decision makers facing a decision under uncertainty (say the betting problem) have to act as though they represented their uncertainty using a coherent subjective probability distribution? There is a little bit of extra work that needs to be done before we can answer that, and we will postpone it to our discussion of Ramsey’s ideas in Chapter 5.

Lastly, we need to consider the question of temporal coherence. We have seen that the Bayes rule and conditional probabilities are derived in terms of called-off bets, which are assessed before the conditioning events are observed. As such they are static constraints among probabilities of events, all of which are in the future. Much of statistical thinking is about what can be said about unknowns after some data are observed. Ramsey (1926, p. 180) first pointed out that the two are not the same. Hacking (1976) draws a distinction between conditional probability and a posteriori probability, the latter being the statement made after the conditioning event is observed. The dominant view among Bayesian statisticians has been that the two can be equated without resorting to any additional principle. For example, Howson and Urbach (1989) argue that unless relevant additional background knowledge accrues between the time the conditional probability is stated and the time the conditioning event occurs, it is legitimate to equate conditional probability and a posteriori probability. And one can often make provisions for this background knowledge by incorporating it explicitly in the algebra being considered.

Others, however, have argued that the leap to using the Bayes rule for a posteriori probability is not justified by the Dutch Book theorem. Goldstein writes:

As no coherence principles are used to justify the equivalence of conditional and a posteriori probabilities, this assumption is an arbitrary imposition on the subjective theory. As Bayesians rarely make a simple updating of actual prior probabilities to the corresponding conditional probabilities, this assumption misrepresents Bayesian practice. Thus Bayesian statements are often unclear. ... The practical implication is that Bayesian theory does not appear to be very helpful in considering the kind of question that we have raised about the expert and his changing judgments. (Goldstein 1985, p. 232)

There are at least a couple of options for addressing this issue. One is to add to the coherence principle the separate principle, taken at face value, that conditional and a posteriori probabilities are the same. This is sometimes referred to as the conditionality principle. For example, Pratt et al. (1964) hold that conditional probability before and after the conditioning event are two different behavioral principles, though in their view, equating the two is a perfectly reasonable additional requirement. We will revisit this in Chapter 5. Another, which we briefly examine next, is to formalize a more general notion of coherence that would apply to the dynamic nature of updating.

2.2 Temporal coherence

Let us start with a simple example. You are about to go to work on a cloudy morning and your current degree of belief about the event θ that it will rain in the afternoon is 0.9. If you ask yourself the same question at lunchtime you may state a different belief, perhaps because you hear the weather forecast on the radio, or because you see a familiar weather pattern develop. We will denote by $\pi^0_\theta$ the present assessment, and by $\pi^T_\theta$ the lunchtime assessment. The quantity $\pi^T_\theta$ is unknown at the present time and, therefore, it can be thought of as a random quantity.

Goldstein describes the dynamic nature of beliefs. Your beliefs, he says, are:

temporally coherent at a particular moment if your current assessments are coherent and you also believe that at each future time point your new current assessments will be coherent. (Goldstein 1985, p. 232)

To make this more formal, one needs to think about degrees of belief about future probability assessments. In our example, we would need to consider beliefs about our future probability $\pi^T_\theta$. We will denote the expected value of this probability, computed at time 0, by $E_0[\pi^T_\theta]$. Goldstein (1983) proposes that in order for one to be considered coherent over time, his or her expectation for an event’s prevision ought to be his or her current probability of that event.

Definition 2.2 (Temporal coherence) Probability assessments on event θ at two time points 0 and T are temporally coherent iff

$$
E_0[\pi^T_\theta] = \pi^0_\theta. \tag{2.2}
$$

This condition establishes a relation between one’s current assessments and those to be asserted at a future time. Also, this relation assures that one’s change in prevision, that is $\pi^T_\theta - \pi^0_\theta$, cannot be systematically predicted from current beliefs. In Goldstein’s own words:

I have beliefs. I may not be able to describe “the precise reasons” why I hold these beliefs. Even so, the rules implied by coherence provide logical guidance for my expression of these beliefs. My beliefs will change. I may not now be able to describe precise reasons how or why these changes will occur. Even so, just as with my beliefs, the rules implied by coherence provide logical guidance for my expression of belief as to how these beliefs will change. (Goldstein 1983, p. 819)
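A small numerical illustration of temporal coherence may help. In the sketch below, the two lunchtime scenarios and their weights are hypothetical numbers of ours, chosen so that the time-0 expectation of the lunchtime probability of rain equals the morning assessment of 0.9.

```python
from fractions import Fraction

def expected_future_assessment(scenarios):
    """Time-0 expectation of the time-T probability.

    scenarios is a list of (probability of scenario, time-T assessment) pairs.
    """
    assert sum(p for p, _ in scenarios) == 1
    return sum(p * q for p, q in scenarios)

current = Fraction(9, 10)   # this morning's probability of afternoon rain
# Hypothetical lunchtime scenarios: (how likely you now think it is, what you would then say).
scenarios = [
    (Fraction(3, 4), Fraction(19, 20)),   # the forecast confirms rain
    (Fraction(1, 4), Fraction(3, 4)),     # the forecast is more optimistic
]
print(expected_future_assessment(scenarios) == current)
```

If the second scenario instead led to an assessment of 0.5, the expected lunchtime probability would drop below 0.9, and the morning assessment would systematically predict a downward revision, which temporal coherence rules out.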

The following theorem, due to Goldstein (1985), extends the previous result to conditional previsions.

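Theorem 2.3 If you are temporally coherent, then your previsions must satisfy

$$
\pi^0_{\theta_1|\theta_2} = E_0[\pi^T_{\theta_1|\theta_2} \,|\, \theta_2]
$$

where $\pi^T_{\theta_1|\theta_2}$ is the revised value for $\pi^0_{\theta_1|\theta_2}$ declared at time T, still before θ2 is obtained, and $E_0$ denotes expectation computed at time 0.

Proof: Consider the definition of conditional expectation given in equation (2.1). If we choose θ to be the random $\pi^T_{\theta_1|\theta_2}$, then

$$
E_0[\pi^T_{\theta_1|\theta_2} \,|\, \theta_2] = \frac{E_0[\pi^T_{\theta_1|\theta_2}\theta_2]}{\pi^0_{\theta_2}}.
$$

Next, applying the definition of temporal coherence (2.2) to the random quantity $\pi^T_{\theta_1|\theta_2}\theta_2$, with $E_T$ denoting expectation computed at time T, and substituting, we get

$$
E_0[\pi^T_{\theta_1|\theta_2} \,|\, \theta_2] = \frac{E_0\big[E_T[\pi^T_{\theta_1|\theta_2}\theta_2]\big]}{\pi^0_{\theta_2}}.
$$

At time T, $\pi^T_{\theta_1|\theta_2}$ is constant, so

$$
E_T[\pi^T_{\theta_1|\theta_2}\theta_2] = \pi^T_{\theta_1|\theta_2}\pi^T_{\theta_2} = \pi^T_{\theta_1\theta_2}
$$

from the definition of conditional probability. Finally, from temporal coherence, $E_0[\pi^T_{\theta_1\theta_2}] = \pi^0_{\theta_1\theta_2}$, so

$$
E_0[\pi^T_{\theta_1|\theta_2} \,|\, \theta_2] = \frac{\pi^0_{\theta_1\theta_2}}{\pi^0_{\theta_2}} = \pi^0_{\theta_1|\theta_2}.
$$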

Consider events θi, i = 1, ..., k, forming a partition of Θ; that is, one and only one of them will occur. Formally, θ1 + ··· + θk = 1 and θiθj = 0 for all i ≠ j. Also, choose θ to be any event in Θ, not necessarily one of the θi above. A useful consequence of Theorem 2.3 above is that

Image

Similarly, at a future time T,

Image

Our next goal is to investigate the relationship between Image and Image. The next theorem establishes that Image cannot systematically predict the change Image in the prevision of conditional beliefs.

Theorem 2.4 If an agent is temporally coherent, then

  1. Image,

  2. Q and Image are uncorrelated,

  3. Image is identically zero.

Proof: Following Goldstein (1985) we will prove part 3 by showing that Image for every i:

Image

Summing over i gives the desired result.

Although the discussion in this section utilizes previsions of event indicators (and, therefore, probabilities of events), the arguments hold for more general bounded random variables. In this general framework, one can directly assess the prevision or expectation of the random variable, which is only done indirectly here, via the event probabilities.

2.3 Scoring rules and the axioms of probabilities

In Chapter 10 we will study measures used for the evaluation of forecasters, after the events that were being predicted have actually taken place. These are called scoring rules and typically involve the computation of a summary value that reflects the correspondence between the probability forecast and the observation of what actually occurred.

Consider the case where a forecaster must announce probabilities π = (πθ1, πθ2, ..., πθk) for the events θ1, ..., θk, which form a partition of the possible states of nature. A popular choice of scoring rule is the negative of the sum of the squared differences between predictions and events:

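$$
s(\theta_j, \pi) = -\sum_{i=1}^{k} (\pi_{\theta_i} - \theta_i)^2 = -(\pi_{\theta_j} - 1)^2 - \sum_{i \neq j} \pi_{\theta_i}^2, \qquad (2.3)
$$

where each θi is read as the indicator of the corresponding event, so that θj = 1 and θi = 0 for i ≠ j when θj occurs.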

It turns out that the score s above of an incoherent forecaster can be improved upon regardless of the outcome. This can provide an alternative justification for using coherent probabilities, without relying on the Dutch Book theorem and the betting scenario. This argument is in fact used in de Finetti’s treatise on probability (de Finetti 1974) to justify the axioms of probability. In Section 2.4 we show how to derive Kolmogorov’s axioms when the forecaster is scored on the basis of such a quadratic scoring rule.
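The claim that an incoherent forecaster can be outscored under every outcome can be checked numerically. The sketch below implements the quadratic score as minus the squared distance between the forecast and the indicator vector of the event that occurs. The helper names `quadratic_score` and `dominates` and the forecast values are hypothetical, chosen only for illustration.

```python
# Sketch: quadratic score and a dominance check between two forecasts.

def quadratic_score(j, pi):
    """Score when event j (0-based) occurs: minus the squared distance
    between the forecast pi and the indicator vector of event j."""
    return -sum((p - (1.0 if i == j else 0.0)) ** 2 for i, p in enumerate(pi))

def dominates(pi_new, pi_old):
    """True if pi_new scores at least as well as pi_old under every outcome
    and strictly better under at least one."""
    diffs = [quadratic_score(j, pi_new) - quadratic_score(j, pi_old)
             for j in range(len(pi_old))]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

incoherent = [0.5, 0.4, 0.4]  # hypothetical forecast; entries sum to 1.3
coherent = [0.4, 0.3, 0.3]    # same forecast shifted so the entries sum to 1

for j in range(3):
    print(j, quadratic_score(j, incoherent), quadratic_score(j, coherent))
print(dominates(coherent, incoherent))
```

With these numbers the coherent forecast has the higher score whichever of the three events occurs.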

2.4 Exercises

Problem 2.1 Consider a probability assessor being evaluated according to the scoring rule (2.3). If the assessor’s probabilities π violate either of the following two conditions,

  1. 0 ≤ πθj ≤ 1 for all j = 1, ..., k,

  2. πθ1 + ··· + πθk = 1,

then there is a vector π′, satisfying conditions 1 and 2, and such that s(θj, π) ≤ s(θj, π′) for all j = 1, ..., k and s(θj, π) < s(θj, π′) for at least one j.

We begin by building intuition about this result in low-dimensional cases. When k = 2, in the (πθ1, πθ2) plane the quantities s(θ1, π) and s(θ2, π) represent the negative of the squared distances between the point π = (πθ1, πθ2) and the points e1 = (1, 0) and e2 = (0, 1), respectively. In Figure 2.1, points C and D satisfy condition 2 but violate condition 1, while point B violates condition 2 and satisfies condition 1. Can we find values which satisfy both conditions and have smaller distances to both canonical vectors e1 and e2? In the cases of C and D, e1 and e2 do the job. For B, we can find a point b that does the job by looking at the projection of B on the πθ1 + πθ2 = 1 line. The scores of B are hypotenuses of the Be1b and Be2b triangles. The scores of the projection are the sides of the same triangles, that is the line segments e1b and e2b. The point E violates both conditions. If you are not yet convinced by the argument, try to find a point that does better than E in both dimensions.

Figure 2.2 illustrates a similar point for k = 3. In that case, to satisfy the axioms, we have to choose points in the shaded area (the simplex on ℜ3). This result can be generalized to ℜk by using the triangle inequality in k dimensions. We now make this more formal.

Figure 2.1 Quadratic scoring rule for k = 2. Point B satisfies the first axiom of probability and violates the second one, while points C and D violate the first axiom and satisfy the second. Point E violates them both, while points b, e1, and e2 satisfy both axioms.

Figure 2.2 Quadratic scoring rule for k = 3.

We begin by considering a violation of condition 1. Suppose that the first component of π is negative, that is πθ1 < 0, and 0 ≤ πθj ≤ 1 for j = 2, ..., k. Consider now π′ constructed by setting π′θ1 = 0 and π′θj = πθj for j = 2, ..., k. Then π′ satisfies condition 1. If event θ1 occurs,

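$$
s(\theta_1, \pi) = -(\pi_{\theta_1} - 1)^2 - \sum_{j=2}^{k} \pi_{\theta_j}^2 < -1 - \sum_{j=2}^{k} \pi_{\theta_j}^2 = s(\theta_1, \pi'),
$$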

since πθ1 < 0. Furthermore, for j = 2, ..., k,

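$$
s(\theta_j, \pi) = -\pi_{\theta_1}^2 - (\pi_{\theta_j} - 1)^2 - \sum_{i \neq 1, j} \pi_{\theta_i}^2 < -(\pi_{\theta_j} - 1)^2 - \sum_{i \neq 1, j} \pi_{\theta_i}^2 = s(\theta_j, \pi').
$$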

Therefore, if πθ1 < 0, s(θj, π) < s(θj, π′) for every j = 1, ..., k.
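The construction for a violation of condition 1 can be checked on a small example. In this sketch the forecast values are hypothetical; the negative first component is replaced by zero and the score improves strictly under every outcome.

```python
# Sketch: replacing a negative forecast component by zero improves the
# quadratic score whichever event occurs. Forecast values are hypothetical.

def quadratic_score(j, pi):
    """Minus the squared distance between pi and the indicator vector of event j."""
    return -sum((p - (1.0 if i == j else 0.0)) ** 2 for i, p in enumerate(pi))

pi = [-0.2, 0.7, 0.5]         # first component violates condition 1
pi_prime = [0.0] + pi[1:]     # set it to zero, keep the other components

# Gain in score from switching to pi_prime, for each possible outcome.
gains = [quadratic_score(j, pi_prime) - quadratic_score(j, pi)
         for j in range(len(pi))]
print(gains)
```

When the first event occurs the gain is (πθ1 − 1)² − 1; otherwise it is πθ1². Both are positive whenever πθ1 < 0.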

We now move to the case where the predictions violate condition 2, that is 0 ≤ πθi ≤ 1 for i = 1, ..., k, but πθ1 + ··· + πθk ≠ 1. Take π′ to be the orthogonal projection of π on the plane defined to satisfy both conditions 1 and 2. For any j we have

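$$
s(\theta_j, \pi) - s(\theta_j, \pi') = -\left[ \sum_{i \neq j} \pi_{\theta_i}^2 + (\pi_{\theta_j} - 1)^2 \right] + \left[ \sum_{i \neq j} \pi_{\theta_i}'^2 + (\pi_{\theta_j}' - 1)^2 \right].
$$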

The term

$$
\sum_{i \neq j} \pi_{\theta_i}^2 + (\pi_{\theta_j} - 1)^2
$$

corresponds to the squared Euclidean distance between π and ej, the k-dimensional canonical point with value 1 in its jth coordinate, and 0 elsewhere. Similarly, the term

$$
\sum_{i \neq j} \pi_{\theta_i}'^2 + (\pi_{\theta_j}' - 1)^2
$$

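is the squared Euclidean distance between π′ and ej. Since π′ is an orthogonal projection of π and satisfies conditions 1 and 2, and since ej also satisfies both conditions, the segment from π to π′ is perpendicular to the segment from π′ to ej. By Pythagoras’ theorem it follows that ||π, π′|| + ||π′, ej|| = ||π, ej||, for any j = 1, ..., k. Here ||π1, π2|| denotes the squared Euclidean distance between π1 and π2. Now ||π, π′|| ≥ 0, with ||π, π′|| = 0 if, and only if, π = π′. Because π violates condition 2 while π′ satisfies it, π ≠ π′ and so ||π, π′|| > 0. We conclude that ||π′, ej|| < ||π, ej||. Therefore s(θj, π) – s(θj, π′) < 0 for any j = 1, ..., k.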
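The projection argument can be verified numerically. The sketch below projects a hypothetical forecast onto the hyperplane where the components sum to one, then checks the Pythagorean identity against each canonical point ej. The helper names are illustrative. One caveat: the hyperplane projection does not land inside the unit cube for every starting point, and the numbers here were chosen so that it does.

```python
# Sketch: orthogonal projection onto {sum = 1} and the Pythagorean identity
# for squared distances. Forecast values are hypothetical.

def project_to_sum_one(pi):
    """Orthogonal projection onto the hyperplane where the entries sum to 1:
    subtract the same amount from every component."""
    shift = (sum(pi) - 1.0) / len(pi)
    return [p - shift for p in pi]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

pi = [0.5, 0.4, 0.4]                 # violates condition 2 (sum is 1.3)
pi_prime = project_to_sum_one(pi)    # approximately [0.4, 0.3, 0.3]
k = len(pi)

checks = []
for j in range(k):
    e_j = [1.0 if i == j else 0.0 for i in range(k)]
    lhs = sq_dist(pi, pi_prime) + sq_dist(pi_prime, e_j)
    rhs = sq_dist(pi, e_j)
    checks.append((lhs, rhs, sq_dist(pi_prime, e_j) < rhs))

print(pi_prime)
print(checks)
```

In each row the two squared distances on the left add up to the one on the right, so the projected forecast is strictly closer to every ej than the original forecast.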

Problem 2.2 We can extend this result to conditional probabilities by a slight modification of the scoring rule. We will do this next, following de Finetti (1972, chapter 2) (English translation of de Finetti 1964). Suppose that a forecaster must announce a probability $\pi_{\theta_1|\theta_2}$ for an event θ1 conditional on the occurrence of the event θ2, and is scored by the rule $-(\pi_{\theta_1|\theta_2} - \theta_1)^2\,\theta_2$. This implies that a penalty of $-(\pi_{\theta_1|\theta_2} - \theta_1)^2$ occurs if θ2 is true, but the score is 0 otherwise. Suppose the forecaster is also required to announce probabilities $\pi_{\theta_1\theta_2}$ and $\pi_{\theta_2}$ for the events θ1θ2 and θ2 subject to quadratic scoring rules $-(\pi_{\theta_1\theta_2} - \theta_1\theta_2)^2$ and $-(\pi_{\theta_2} - \theta_2)^2$, respectively. Overall, by announcing $\pi_{\theta_1|\theta_2}$, $\pi_{\theta_1\theta_2}$, and $\pi_{\theta_2}$ the forecaster is subject to the penalty

$$
s(\theta_1|\theta_2, \theta_1\theta_2, \theta_2, \pi_{\theta_1|\theta_2}, \pi_{\theta_1\theta_2}, \pi_{\theta_2}) = -(\pi_{\theta_1|\theta_2} - \theta_1)^2\,\theta_2 - (\pi_{\theta_1\theta_2} - \theta_1\theta_2)^2 - (\pi_{\theta_2} - \theta_2)^2.
$$

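Under these conditions $\pi_{\theta_1|\theta_2}$, $\pi_{\theta_1\theta_2}$, and $\pi_{\theta_2}$ must satisfy

$$
\pi_{\theta_1\theta_2} = \pi_{\theta_1|\theta_2}\,\pi_{\theta_2}
$$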

or else the forecaster can again be outscored irrespective of the event that occurs.

Let x, y, and z be the values taken by s under the occurrence of θ1θ2, (1 – θ1)θ2, and (1 – θ2), respectively:

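$$
\begin{aligned}
x &= -(\pi_{\theta_1|\theta_2} - 1)^2 - (\pi_{\theta_1\theta_2} - 1)^2 - (\pi_{\theta_2} - 1)^2,\\
y &= -\pi_{\theta_1|\theta_2}^2 - \pi_{\theta_1\theta_2}^2 - (\pi_{\theta_2} - 1)^2,\\
z &= -\pi_{\theta_1\theta_2}^2 - \pi_{\theta_2}^2.
\end{aligned}
$$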

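To prove this, let $(\pi_{\theta_1|\theta_2}, \pi_{\theta_1\theta_2}, \pi_{\theta_2})$ be a point at which the gradients of x, y, and z are not in the same plane, meaning that the Jacobian

$$
\frac{\partial(x, y, z)}{\partial(\pi_{\theta_1|\theta_2}, \pi_{\theta_1\theta_2}, \pi_{\theta_2})}
$$

is not zero. Then it is possible to improve x, y, and z simultaneously by moving $(\pi_{\theta_1|\theta_2}, \pi_{\theta_1\theta_2}, \pi_{\theta_2})$. For a geometrical interpretation of this idea, look at de Finetti (1972). Therefore, the Jacobian, which equals $8(\pi_{\theta_1|\theta_2}\pi_{\theta_2} - \pi_{\theta_1\theta_2})$ up to sign, has to be equal to zero, which implies $\pi_{\theta_1\theta_2} = \pi_{\theta_1|\theta_2}\pi_{\theta_2}$.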
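The Jacobian argument can be checked numerically. The sketch below writes a, b, c for the three announced values (conditional, joint, and marginal) and computes x, y, z directly from the scoring rule with its minus signs. Under that sign convention, which is an assumption of this sketch, the determinant works out to 8(b − ac); its sign depends on the convention, but it vanishes exactly when b = ac. The function names, the numerical values, and the nearby improving point (found by hand) are all illustrative.

```python
# Sketch: Jacobian determinant of (x, y, z) with respect to
# a = pi_{theta1|theta2}, b = pi_{theta1 theta2}, c = pi_{theta2}.

def scores(a, b, c):
    """Scores under the three possible outcomes."""
    x = -(a - 1) ** 2 - (b - 1) ** 2 - (c - 1) ** 2  # theta1 and theta2 both occur
    y = -a ** 2 - b ** 2 - (c - 1) ** 2              # theta2 occurs, theta1 does not
    z = -b ** 2 - c ** 2                             # theta2 does not occur
    return x, y, z

def jacobian_det(a, b, c):
    """Determinant of the 3x3 matrix whose rows are the gradients of x, y, z."""
    (p, q, r), (s, t, u), (v, w, m) = [
        [-2 * (a - 1), -2 * (b - 1), -2 * (c - 1)],  # gradient of x
        [-2 * a, -2 * b, -2 * (c - 1)],              # gradient of y
        [0.0, -2 * b, -2 * c],                       # gradient of z
    ]
    return p * (t * m - u * w) - q * (s * m - u * v) + r * (s * w - t * v)

incoherent = (0.8, 0.3, 0.5)    # b differs from a * c = 0.4
coherent = (0.6, 0.3, 0.5)      # b equals a * c
nearby = (0.75, 0.35, 0.46)     # hand-picked small move away from `incoherent`

print(jacobian_det(*incoherent), jacobian_det(*coherent))
print(scores(*incoherent), scores(*nearby))
```

At the incoherent point the determinant is nonzero and the nearby point scores better under all three outcomes; at the coherent point the determinant is zero, so no such local improvement is guaranteed.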

Problem 2.3 Next week the Duke soccer team plays American University and Akron. Call θ1 the event Duke beats Akron and θ2 the event Duke beats American. Suppose a bookmaker is posting the following betting odds and is willing to take stakes of either sign on any combination of the events:

| Event | Odds |
| --- | --- |
| θ1 | 4:1 |
| θ2 | 3:2 |
| θ1 + θ2 − θ1θ2 | 9:1 |
| θ1θ2 | 2:3 |

Find stakes that will enable you to win money from the bookmaker no matter what the results of the games are.

Problem 2.4 (Bayes’ theorem) Let θ1 and θ2 be two events. A corollary of Theorem 2.2 is the following. Provided that πθ2 ≠ 0, a necessary condition for coherence is that πθ1|θ2 = πθ2|θ1πθ1/πθ2. Prove this corollary without using Theorem 2.2.

Problem 2.5 An insurance company, represented here by you, provides insurance policies against hurricane damage. One of your clients is especially concerned about tree damage (θ1) and flood damage (θ2). In order to state the policies he is asked to elicit (personal) rates for the following events: θ1, θ2, θ1 | θ2, and θ2 | θ1. Suppose his probabilities are as follows:

| Event | Rates |
| --- | --- |
| θ1 | 0.2 |
| θ2 | 0.3 |
| θ1 \| θ2 | 0.8 |
| θ2 \| θ1 | 0.9 |

Show that your client can be made a sure loser.
