Chapter 3

Probability

3.1 Measurement

In this chapter the systematic study of uncertainty begins. Recall that there is a person “you”, contemplating an “event”, and it is desired to express your uncertainty about that event, which uncertainty is called your “belief” that the event is true. The tool to be used is reason (§2.1) or rationality, based on a few fundamental premises and emphasizing simplicity (§2.6). The first task is to measure the intensity of your belief in the truth of the event; to attach to each event a number that describes your attitude to the statement. Many people object to the assignment of numbers, seeing it as an oversimplification of what is rightly a complicated situation. So let us be quite clear why we choose to measure and what the measurement will accomplish. One field in which numbers are used, despite being highly criticized by professionals, is wine-tasting, where a bottle of wine is given a score out of 100, called the Parker score after its inventor, the result being that a wine with a high score such as 96 commands a higher price than a mere 90. Some experts properly object that a single number cannot possibly capture all the nuances that are to be found in that most delectable of liquids. Nevertheless, numbers do have a role to play in wine-tasting, where a collection of different wines is tasted by a group of experts, the object being to compare the wines, which naturally vary; variation, as we shall see in §9.1, gives rise to uncertainty. In addition to the wines being different, so are the tasters, and in a properly conducted tasting, it is desirable to sort out the two types of variation and any interaction between wines and tasters, such as tasters of one nationality preferring wines of their own country. If tasting results in comments like “a touch of blackcurrant to a background of coffee with undertones of figs”, sensible comparisons are almost impossible. A useful procedure is for each taster to score each wine, the usual method employing a score out of 20 devised at the University of California at Davis. It is then possible by standard statistical methods to make valuable judgments both about the wines and their tasters.

The point here is that, whether it is the Parker or Davis score that is employed, the basic function is to compare wines and tasters. Whether a wine with an average score of 19 is truly better than one with an average of 16 will depend on the variation found in the tasting. (Notice that there are uncertainties here, but wine tasters do not always mention them.) What is not true is that the scores for different wines are combined in any way; a Chablis at 17 is not diluted with a claret at 15 to make a mixture at 16. The numbers are there only for comparison; 17 is bigger than 15. The situation is different with uncertainty where, in any but the simplest scenario, you have to consider several uncertainties and necessarily need to combine them to produce an overall assessment. A doctor has several beliefs about aspects of the patient, which need to be put together to provide a belief about the treatment. It is this combination that makes measurement of uncertainty different from that of wine, where only comparison is required. Now numbers combine very easily and in two distinct ways, by addition and by multiplication, so it is surely sensible to exploit these two simple procedures by associating numbers with your beliefs. How else is the doctor to combine beliefs about the various symptoms presented by the patient?

We aim to measure separate uncertainties in order to combine them into an overall uncertainty, so that all your beliefs come together in a sensible set of beliefs. In this chapter, only one event will be discussed and the combination aspect will scarcely appear, so bear with me while we investigate the process of measurement itself for a single event, beginning with some remarks about measurement in general. A reader who is unconvinced may like to look at §6.4, which concerns the uncertainty of someone who has just been tested for cancer. Without numbers, it is hard to see how to persuade the person of the soundness of the conclusion reached there.

Take the familiar concept of a distance between two points, where a commonly used measure is the foot. What does it mean to say that the distance is one foot? All it means is that somewhere there is a metal bar with two thin marks on it. The distance between these two marks is called a foot, and to say that the width of a table is one foot means only that, were the table and the bar placed together, the former would sit exactly between the two marks. In other words, there is a standard, a metal bar, and all measurements of distance refer to a comparison with this standard. Nowadays the bar is being replaced by the wavelength of krypton light and any distance is compared with the number of waves of krypton light it could contain. The key idea is that all measurements ultimately consist of comparison with a standard with the result that there are no absolutes in the world of measurement. Temperature was based on the twin standards of freezing and boiling water. Time is based on the oscillation of a crystal, and so on. Our first task is therefore to develop a standard for uncertainty.

Before doing this, one other feature of measurement needs to be noticed. There is no suggestion that, in order to measure the width of the table, we have to get hold of some krypton light; or that to measure temperature, we need some water. The direct comparison with the standard is not required. In the case of distance, we use a convenient device, like a tape measure, that has itself been compared with the standard or some copy of it. The measurement of distances on the Earth's surface, needed for the production of maps, was, before the use of satellites, based on the measurement of angles, not distances, in the process known as triangulation, and the standard remains a conceptual tool, not a practical one. So do not be surprised if you cannot use our standard for uncertainty, any more than you need krypton light to determine your height. It will be necessary to produce the equivalent of tape measures and triangulation, so that belief can be measured in reality and not just conceptually.

In what follows, extensive references will be made to gambles and there are many people who understandably have strong moral objections to gambling. The function of this paragraph is to assure such sensible folk that their views need not hinder the development here presented. A gamble, in our terminology, refers to a situation in which there is an event, uncertain for you, and where it is necessary for you to consider both what will happen were it true, and also were it false. Webster's dictionary expresses our meaning succinctly in the definition of a gamble as “an act…having an element of…uncertainty”. Think of playing the game of Trivial Pursuit and being asked for the capital of Liberia, with the trivial outcomes of an advance on the board or of the move passing to your opponents. Your response is, in Webster's sense and ours, a gamble. The examples of §1.2 show how common is uncertainty and therefore how common are gambles in our sense. We begin by contemplating the act, mentioned by Webster, and it is only later, when decision analysis is developed in Chapter 10, that action, following on from this contemplation, is considered. In §14.5 we have a little to say about gambling, in the sense of monetary affairs in connection with activities such as horse racing, and will see that the moral objections mentioned above can easily be accommodated using an appropriate utility function.

3.2 Randomness

The simplest form of uncertainty arises with gambles involving physical objects such as playing cards or roulette wheels, as we saw in Example 7 of §1.2. The standard to be used is therefore based on a simple type of gamble. Take an urn containing 100 balls that, for the moment, are as similar as modern mass-production methods can make them. There is no significance in 100; any reasonably large number would do and a mathematician would take n balls, where n stands for any number, but we try to avoid unnecessary math. An urn is an opaque container with a narrow neck, so that you cannot see into the urn but can reach into it for a ball, which can then be withdrawn but not seen until it is entirely out of the urn.

Suppose that the balls are numbered consecutively from 1 to 100, with the numbers painted on the balls, and imagine that you are to withdraw one ball from the urn. In some cases you might feel that every ball, and therefore every number, had the same chance of being drawn as every other; that is, all numbers from 1 to 100 had the same uncertainty. To put it another way, suppose that you were offered a prize of 10 dollars were number 37 to be withdrawn; otherwise you were to receive nothing. Suppose that there was a similar offer, but for the number 53. Then if you are indifferent between these two gambles, in the sense that you cannot choose between them, you think that 37 is as uncertain as 53. Here your feeling of uncertainty is being translated into action, namely an expressed indifference between two gambles, but notice that the outcomes are the same in both gambles, namely 10 dollars or nothing, only the circumstances of winning or losing differ. We shall discuss later in Chapter 10 the different types of gambles, where the outcomes differ radically and where additional problems arise.

There are circumstances where you would not exhibit such indifference. You might feel that the person offering the gamble on 37 was honest, and that on 53 a crook, or you might think that 37 is your lucky number and was more likely to appear than 53. Or you might think that the balls with two digits painted on them weighed more than those with just one, so would sink to the bottom of the urn, thereby making the single-digit balls more likely to be taken. There are many occasions on which you might have preferences for some balls over others, but you can imagine circumstances where you would truly be indifferent between all 100 numbers. It might be quite hard to achieve this indifference, but then it is difficult to make the standard meter bar for distance, and even more difficult to keep it constant in length. The difficulties are less with krypton light, which is partly why it has replaced the bar.

If you think that each number from 1 to 100 has the same chance of being drawn; or if a prize contingent on any one number is as valuable as the same prize contingent on any other, then we say that you think the ball is taken at random, or simply, random. More formally, if your belief in the event of ball 37 being drawn is equal to your belief in the event of ball 53, and similarly for any pair of distinct numbers from 1 to 100, then you are said to believe that the ball is drawn at random. This formal definition avoids the word “chance”, which will be given a specific meaning in §7.8, and embraces only the three concepts, “you”, “event”, and “belief”.

The concept of randomness has many practical uses. In the British National Lottery, there are 49 balls and great care is taken to make a machine that will deliver a ball in such a way that each has the same chance of appearing; that is, to arise at random. You may not believe that the lottery is random and that 23 is lucky for you; all we ask is that you can imagine a lottery that is random for you. Randomness is not confined to lotteries, thus, with the balls replaced by people, not in an urn but in a population, it is useful to select people at random when assessing some feature of the population such as intention to vote. We mentioned in §1.5, and will see again in §8.5, how difficulties in comparing two methods can be avoided by designing some features of an experiment at random, such as when patients are randomly assigned to treatments. Computer scientists have gone to a great deal of trouble to make machines that generate numbers at random. Many processes in nature appear to act randomly, in that almost all scientists describe their beliefs about the processes through randomness, in the sense used here. The decay of radioactive elements and the transfer of genes are two examples. There is a strong element of randomness in scientific appreciation of both the physical and the biological worlds and our withdrawal of a ball from the urn at random, although an ideal, is achievable and useful.
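
For the curious reader, the following is a minimal sketch in Python of a pseudo-random stand-in for these ideas; it assumes that the library's generator is random enough for you, and it is an illustration only, not part of the definition of randomness.

import random

# A pseudo-random stand-in for withdrawing one ball at random from an
# urn of 100 numbered balls.
ball = random.randint(1, 100)

# Selecting a few people at random from a (hypothetical) population of
# 1000, as in assessing intention to vote.
population = list(range(1, 1001))
sample = random.sample(population, k=5)

print(ball, sample)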

3.3 A Standard for Probability

We have an urn containing 100 balls, from which one ball is to be drawn at random. Imagine that the numbers, introduced merely for the purpose of explaining the random concept, are removed from the balls but instead that 30 of them are colored red and the remaining 70 are left without color as white, the removal or the coloring not affecting the randomness of the draw. The value 30 is arbitrary; a mathematician would have r red, and n − r white, balls. Consider the event that the withdrawn ball is red. Until you inspect the color of the ball, or even before the ball is removed, this event is uncertain for you. You do not know whether the withdrawn ball will be red or white but, knowing the constitution of the urn, have a belief that it will be red, rather than white.

We now make the first of the premises, the simple, obvious assumptions upon which the reasoned approach is based, and measure your belief that the random withdrawal of a ball from an urn with 100 balls, of which 30 are red, will result in a red ball, as the fraction 30/100 (recall the mathematical notation explained in §2.9) of red balls, and call it your probability of a red ball being drawn. Alternatively expressed, your belief that a red ball will be withdrawn is measured by your probability, 30/100. Sometimes the fraction is replaced by a percentage, here 30%, and another possibility is to use the decimal system and write 0.3, though, as explained in §2.8, it pays to stay with one system throughout a discussion. There is nothing special in the numbers, 30 and 100; whatever is the fraction of red balls, that is your probability.

Reflection shows that probability is a reasonable measure of your belief in the event of a red ball being taken. Were there more than 30 red balls, the event would be more likely to occur, and your probability would increase; a smaller number would lessen the probability. If all the balls were red, the event would be certain and your probability would take its highest possible value, one; all white, and the impossible event has the lowest value, zero. Notice that all these values are only reasonable if you think that the ball is drawn at random. If the red balls were sticky from the application of the paint, and the unpainted, white ones, not, then the event of being red might be more likely to occur and the value of 0.3 would be too low.

In view of its fundamental importance, the definition is repeated with more precision. If you think that a ball is to be withdrawn at random from an urn containing only red and white balls, then your probability that the withdrawn ball will be red is defined to be the fraction of all the balls in the urn that are red.

The simple idea extends to other circumstances. If a die is thrown, the probability of a five is 1/6, corresponding to an urn with six balls of which only one is red. In European roulette, the probability of red is 18/37, there being 37 slots of which only 18 are red. In a pack of playing cards, the probability of a spade is 13/52, or 1/4, and of an ace, 4/52 = 1/13. These considerations are for a die that you judge to be balanced and fairly thrown, a roulette wheel that is not rigged and a pack that has been fairly shuffled, these restrictions corresponding to what has been termed random.
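
As a worked illustration of the definition, here is a minimal sketch in Python expressing each of these probabilities as the fraction of favorable balls in the matching urn; the function name urn_probability is introduced purely for this illustration.

from fractions import Fraction

def urn_probability(red: int, total: int) -> Fraction:
    # Your probability is the fraction of red balls in the matching urn:
    # favorable outcomes over the total number of balls.
    return Fraction(red, total)

print(urn_probability(1, 6))    # a five with a balanced die: 1/6
print(urn_probability(18, 37))  # red at European roulette: 18/37
print(urn_probability(13, 52))  # a spade from a well-shuffled pack: 1/4
print(urn_probability(4, 52))   # an ace: 1/13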

The first stage in the measurement of uncertainty has now been accomplished; we have a standard. The urn is our equivalent of the metal bar for distance, perhaps to be replaced by some improvement in the light of experience, as light is used for distance. Other standards have been suggested but will not be considered here. The next stage is to compare any uncertain event with the standard.

3.4 Probability

Consider any event that is uncertain for you. It is convenient to fix ideas and take the event of rain tomorrow (Example 1 of §1.2), but the discussion that follows applies to any uncertain event. Alongside that event, consider a second event that is also uncertain for you, namely, the withdrawal at random of a red ball from an urn containing 100 balls, of which some are red, the rest white. For the moment, the number of red balls is not stated. Were there no red balls, you would have higher belief in the event of rain than in the impossible extraction of a red ball. At the other extreme, were all the balls red, you would have lower belief in rain than in the inevitable extraction of a red ball. Now imagine the number of red balls increasing steadily from 0 to 100. As this happens, you have an increasing belief that a red ball will be withdrawn. Since your belief in red was less than your belief in rain at the beginning, yet was higher at the end with all balls red, there must be an intermediate number of red balls in the urn such that your beliefs in rain and in the withdrawal of a red ball are the same. This value must be unique, because if there were two values, then they would have the same beliefs, being equal to that for rain tomorrow, which is nonsense as you have greater belief in red with the higher fraction. So there are two uncertain events in which you have the same belief: rain tomorrow and the withdrawal of a red ball. But you have measured the uncertainty of one, the redness of the ball; therefore, this must be the uncertainty of the other, rain. We now make a very important definition:

Your probability of the uncertain event of rain tomorrow is the fraction of red balls in an urn from which the withdrawal of a red ball at random is an event of the same uncertainty for you as that of the event of rain.

This definition applies to any uncertain event, not just to that about the weather. To measure your belief in the truth of a specific uncertain event, you are invited to compare that event with the standard, adjusting the number of red balls in the urn until you have the same beliefs in the event and in the standard. Your probability for the event is then the resulting fraction of red balls.

Some minor comments now follow before passing to issues of more substance. The choice of 100 balls was arbitrary. As it stands, every probability is a fraction out of 100. This is usually adequate, but any value between 0 and 1 can be obtained by increasing the total number of balls. When, as with a nuclear accident (Example 13 of §1.2), the probability is very low, perhaps less than 1/100, yet not zero, the number of balls needs to be increased. As we have said, a mathematician would have r red, n − r white, balls and the probability would be r /n.

The following point may mean nothing to some readers, but some others will be aware of the frequency theory of probability, and for them it is necessary to issue a warning: there is no repetition in the definition. The ball is to be taken once, and once only, and the long run frequency of red balls in repeated drawings is irrelevant. After its withdrawal, the urn and its contents can go up in smoke for all that it matters. Repetition does play an important role in the study of probability (see §7.3) but not here in the basic definition.

Some writers deny the existence, or worth, of probability. We have to disagree with them, feeling convinced by the measuring techniques just proposed. Others accept the concept of probability but distinguish between cases where the probability is known, and those where it is unknown, the probability in the latter case being called ambiguous. For example, the probability of a coin falling heads is unambiguous at ½, whereas the probability of your candidate winning the election is ambiguous. (Rather than referring to “the” probability, we would prefer “your” probability.) In the descriptive mode, the distinction is important because, as Ellsberg's paradox (§9.11) shows, people make different choices depending solely on whether the probability is ambiguous or not. In the normative mode adopted here, the distinction is one of measurement, an ambiguous probability being harder to measure than one with no ambiguity. The paradox disappears in the normative view, though your action may depend on how good your measurement processes are. As will be seen in Chapter 13, they are not as good as one would wish.

3.5 Coherence

In the last section, we took a standard, or rather a collection of standards depending on the numbers of red balls, and compared any uncertain event with a standard, arranging the numbers such that you had the same beliefs in the event and in the standard. In this way, you have a probability for any uncertain event.

You immediately, and correctly, respond, “I can't do it”. You might be able to say that the number of red balls must be at least 17 out of 100 and not more than 25, but to get closer than that is impossible for most uncertain events, even simple ones such as rain tomorrow. A whole system has been developed on the basis of lower (17/100) and upper (25/100) probabilities, both of which go against the idea of simplicity and confuse the concept of measurement with the practice of measurement. Recall the metal bar for length; you cannot take the table to the institution where the bar is held and effect the comparison. It is the same with uncertainty, as it is with distance; the standard is a conceptual comparison, not an operational one. We put it to you that you cannot escape from the conclusion that, as in the last section, some number of red balls must exist to make the two events match for you. Yes, the number is hard to determine, but it must be there. Another way of expressing the distinction between the concept and the practice is to admit that reasoning persuades you that there must exist, for a given uncertain event, a unique number of red balls that you ought to be able to find, but that, in practice, you find it hard to determine. Our definition of probability provides a norm to which you aim; only measurement problems hinder you from exactly behaving like the norm. Nevertheless, it is an objective toward which you ought to aim.

When it comes to distance, you would use a tape measure for the table, though even there, marked in fractions of inches, you might have trouble getting an accuracy beyond the fraction. With other distances, more sophisticated devices are used. Some, such as those used to determine distances on the Earth's surface between places far apart, are very elaborate and have only been developed in the last century, despite the concept of distance being made rigorous by the Greeks. So please do not be impatient at your inability, for we do not as yet have a really good measuring device suitable for all circumstances. Nevertheless, you are entitled to wonder how such an apparently impossible task can be accomplished. How can you measure your belief in practice? Much of the rest of this book will be devoted to this problem. For the moment, let me try to give you a taste of one solution by means of an example.

Suppose you meet a stranger. Take the event that they were born on March 4 in some year. You are uncertain about this event but the comparison with the urn is easy and most of you would announce a probability of 1/365, ignoring leap years and any minor variations in the birth rate during the year. And this would hold for any date except February 29. The urn would contain 365 balls, each with a different date, and a ball drawn at random. Now pass to another event that is uncertain for you. Suppose there are 23 unrelated strangers and consider the uncertain event that, among the 23, there are at least two of them who share the same day for the celebration of their births. It does not matter which day, only that they share a day. Now you have real difficulty in effecting the comparison with the urn. However, there exist methods, analogous to the use of a tape measure with length, that demonstrate that your probability of a match of birthdays is very close to 1/2. These methods rely on the use of the rules of the calculus of probability to be developed in later chapters. Once you have settled on 1/365 for one person, and on the fact that the 23 are unrelated, the value of 1/2 for the match is inevitable. You have no choice. That is, from 23 judgments of probability, one for each person, made by comparison with the standard, you can deduce the value of 1/2 in a case where the standard was not easily available. The deduction will be given in §5.5.
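
To anticipate the deduction of §5.5, here is a minimal sketch in Python of the calculation, under the assumptions already stated: each birthday is judged equally probable over 365 days and the 23 people are unrelated.

from fractions import Fraction

def birthday_match_probability(people: int, days: int = 365) -> Fraction:
    # Probability that at least two of the people share a birthday:
    # one minus the probability that all the birthdays are distinct.
    no_match = Fraction(1)
    for k in range(people):
        # The (k + 1)-th person must avoid the k days already taken.
        no_match *= Fraction(days - k, days)
    return 1 - no_match

print(float(birthday_match_probability(23)))  # about 0.507, close to 1/2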

The principle illustrated here is called coherence. A formal definition of coherence appears in §5.4. The value of 1/2 coheres with the values of 1/365. Coherence is the most important tool that we have today for the measurement of uncertainty, in that it enables you to pass from simple, measurable events to more complicated ones. Coherence plays a role in probability similar to the role that Euclidean geometry plays in the measurement of distance. In triangulation, the angles and a single distance, measured by the surveyor, are manipulated according to geometrical rules to give the distance, just as the values of 1/365 are manipulated according to the rules of probability to give 1/2. Some writers use the term “consistent”, rather than “coherent”, but it will not be adopted here. The birthday example was a diversion; let us return to the definition of probability in §3.4.

3.6 Belief

The definition of probability holds, in principle, for any event, the numerical value depending not only on the event but also on you. Your uncertainty for rain tomorrow need not be the same as that of the meteorologist, or of any other person. Probability describes a relationship between you and the world, or that part of the world involved in the event (see §1.7). It is sometimes said to be subjective, depending on the subject, you, making the belief statement. Unfortunately, subjectivity has connotations of sloppy thinking, as contrasted with objectivity. We shall therefore use the other common term, personal, depending on the person, you, expressing the probability. Throughout this book, probability expresses a relationship between a person, you, and the real world. It is not solely a feature of your mind; it is not a value possessed by an event but expresses a relationship between you and the event and is a basic tool in your understanding of the world. There are many uncertainties upon which most people agree, such as the 1/365 for the birthday in the last section, though there is no complete agreement here. I once met a lady at a dinner party who, during the course of the evening, in which birth dates had not been mentioned, turned to me and said, “you are an Aries”. She had a probability greater than 1/365 for dates with that sign, a value presumably based on her observation of my conversation. She is entitled to her view and, considered alone, it is not ridiculous, although, in combination with other beliefs she might hold, she may be incoherent. Note: I am not an Aries.

Similarly, there are events over which there is a lot of disagreement. Thus, the nuclear protester and the nuclear engineer may not agree over the probability of a nuclear accident. One of the matters to be studied in §§6.9 and 11.6 is how agreement between them might be reached, essentially both by obtaining more information and by exposing incoherence.

Probability therefore depends both on the event and on you. There is equally something that it should not depend on—the quality of the event for you. Consider two uncertain events: a nuclear accident and winning a lottery. The occurrence of the first is unpleasant, that of the second highly desirable. These two considerations are not supposed to influence you in your expressions of belief in the two events through probability. This is important, so let us spell it out.

We suppose that you possess a basic notion of belief in the truth of an uncertain event that does not depend on the quality of the event. Expressed differently, you are able to separate in your mind how plausible the event is from how desirable it is. We shall see in Chapter 10 that plausibility and desirability come together when we make a decision, and strictly it is not necessary to separate the two. Nevertheless, experience seems to show that people prefer to isolate the two concepts, appreciating the advantages gained from the separation, so this view is taken here.

To reinforce this point, consider another method that has been suggested for comparing your uncertainty of an event with the standard. In comparing the nuclear accident with the extraction of a red ball from an urn in order to assess your probability for the former, suppose that you were invited to think about two gambles. In the first, you win $100 if the accident occurs; in the second, you win the same amount if the ball is red. The suggestion is that you choose the number of red balls in the urn so that you feel the two gambles are equivalent. The comparison is totally different from our proposal because the winning of $100 would be trivial if there were an accident and you might not be alive to receive it, whereas the red ball would not affect you and the prize could be enjoyed. In other words, this comparison confuses the plausibility of the accident with its desirability, or here, horror. Gambling for reward is not our basis for the system and, where it was mentioned in §3.2, the rewards were exactly the same in all the gambles considered, so desirability did not enter. Some aspects of gambling are raised in §14.5.

3.7 Complementary Event

Consider the event of rain tomorrow (Example 1 of §1.2). Associated with this event is another event that it will not rain tomorrow; when the former is true, the latter is false and vice versa. Generally, for any event, the event that is true when the first is false, and false when it is true, is called the event that is complementary to the first. Just as we have discussed your belief in the event, expressed through a probability, so we could discuss your probability for the complementary event. How are these two probabilities related? This is easily answered by comparison with the withdrawal of a red ball from an urn. The event complementary to the removal of a red ball is that of a white one. The probability of red is the fraction of red balls in the urn and similarly, the probability of white is the fraction of white balls. But these two fractions always add to one, for there are no other colors of ball in the urn; if 30 are red out of 100, then 70 are white. Hence, the standard event and its complement have probabilities that add to one. It follows by the comparison of any event with the urn that this will hold generally. If your belief in the truth of an event matches the withdrawal of a red ball, your belief in the falsity matches with a white ball. Stated formally, it means the following:

Your probability of the complementary event is one minus your probability of the original event. If your probability of rain tomorrow is 0.3, then your probability of no rain tomorrow is 0.7.

This is our first example of a rule of probability; a rule that enables you to calculate with beliefs and is the first stage in developing a calculus of beliefs. Since calculation is involved, it is convenient to introduce a simple piece of mathematics, effectively rewriting the above statement in another language.

Instead of using the word “event”, it is often useful to use a capital letter of the Roman alphabet. E is the natural one to use, being the initial letter of event, and thereby acting as a mnemonic. Later it will be necessary to talk about several events and use different letters to distinguish them, thus E, F, G, and so on. When we want to state a general rule about events, it is not necessary to spell out the meaning of E, which can stand for any event; whereas in an application of the rule, we can still use E, but then it will refer to the special event in the application. Your probability for the event E is written p (E). Here the lower-case letter p replaces probability and the brackets encompass the event, so that in a sense they replace “of” in the English language equivalent. Some writers use P or Pr or prob but we will use the simple form p. Notice that p always means probability, whereas E, F, G, and so on refer to different events and p (E) is simply a translation of the phrase “your probability for the event E”. It might be thought that reference should also be made to “you” but since we will only be talking about a single person, this will not be necessary; see §3.6.

Let us have a bit of practice. If R is the event of rain tomorrow, then the statement that your probability of rain is 0.3 becomes p (R) = 0.3. If C is the event of a coincidence of birthdays with 23 people (see §3.5), then p (C) = 1/2 to a good approximation. Notice that R is an event, r a number of balls, and mathematicians make much more use of a distinction between upper- and lower-case than does standard English. If E is any event, then the event that is complementary to E is written Ec, the raised c standing for “complement” and again, the initial letter acting as a mnemonic. Complement being such a common concept, many notations besides the raised c are in use.

With this mathematical language, the rule of probability stated above can be written as follows:

p (Ec) = 1 − p (E),

this being a mathematical translation of the English sentence “The probability of the complementary event is equal to one minus the probability of the event”. Mathematics has the advantage of brevity and, with some practice, has the benefit of increased clarity. Notice that in stating the rule, we have not said what the event E is since the statement is true for any event.

Let us perform our first piece of mathematical calculation and add p (E) to both sides of this equation (see §2.9) with the result

p (E) + p (Ec) = 1,

or in words, your probability for an event and your probability for the complementary event add to one. Several important rules of probability will be encountered later, but they have one point in common—they prescribe constraints on your beliefs. While you are free to assign any probability to the truth of the event, once this has been done, you are forced to assign one minus that probability to the truth of the complementary event. If your probability for rain tomorrow is 0.3, then your probability for no rain must be 0.7. This enforcement is typical of any rule in that there is great liberty with some of your beliefs, but once they are fixed, there is no freedom with others that are related to them and we have an example of the coherence mentioned in §3.5. You are familiar with this phenomenon for distance. If the distance from Exeter to Bristol is 76 miles, and that from Bristol to Birmingham is 81 miles, then that from Exeter to Birmingham, via Bristol, is inevitably 157 miles, the sum of the two earlier distances. Mathematically, if the distance from A to B is x, and that from B to C is y, then the distance from A to C, via B, is x + y, a statement that is true for any A, B, and C and any x and y compatible with geography.
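
For readers who like to see the rule mechanized, the following is a minimal sketch in Python; it merely restates the constraint, and the function name is chosen for illustration only.

def complement_probability(p_event: float) -> float:
    # The rule p(Ec) = 1 - p(E): once p(E) is chosen, coherence leaves
    # no freedom in the probability of the complementary event.
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    return 1.0 - p_event

print(complement_probability(0.3))  # no rain tomorrow, given p(rain) = 0.3: 0.7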

3.8 Odds

Although probability is the usual measure for the description of your belief, some people prefer to use an alternative term, just as some prefer to use miles instead of kilometers for distance, and we will find that an alternative term has some convenience for us in §6.5. To introduce the alternative measure, let us return to any uncertain event E and your comparison of it with the withdrawal at random of a red ball from an urn containing r red and w white balls, making a total of n = r + w in all. Previously, we had 100 balls in total, n = 100, purely for ease of exposition. As before, suppose r is adjusted so that you have the same belief in E as in the random withdrawal of a red ball, then your probability for E, p (E), is the ratio of the number of red balls to the total number of balls, r /(r + w). The alternative to probability as a measure is the ratio of the number of white balls to that of red w /r and is called the odds against a red ball and therefore, equally the odds against E. Alternatively, reversing the roles of the red and white balls, the ratio of the number of red balls to that of white r /w is termed the odds on a red ball or the odds on E. We now encounter a little difficulty of nomenclature and pause to discuss it.

The concept of odds arises in the following way. Suppose that, in circumstances where E is an event favorable to you, such as a horse winning, you have arranged the numbers of balls such that your belief in E equals that in a red ball being withdrawn; then there are w possibilities corresponding to E not happening because the ball was white, and r corresponding to the pleasant prospect of E. Hence, it makes sense to say w against E and r for E, or simply “w to r against”, expressed as a ratio w /r. As an example, suppose your probability is a quarter, 1/4, that High Street will win the 2.30 race at Epsom (Example 8 of §1.2), then a quarter of the balls in the matching urn will be red, or equivalently, for every red ball there will be three white; the odds against High Street are 3 to 1, the odds on are 1 to 3; as ratios, 3 against, 1/3 on.

Odds are commonly used, at least in Britain, in connection with betting (§14.5). Odds in betting are always understood as odds against; in the few cases where odds on are used, they say “odds on”. Thus “against” is omitted but “on” is included. As a way through this linguistic tangle, we shall always use odds in the sense of odds on. If we do need to use odds against, the latter word will be added. This is opposite to the convention used in betting and is weakly justified by the fact that our probabilities will commonly be larger than those encountered in sporting events, also because a vital result in §6.5 is slightly more easily expressed using odds on. It will also be assumed that you are comfortable with using fractions.

There is no standard notation for odds and we will use o (E), o for odds on replacing p for probability. There is a precise relationship between probability and odds, which is now obtained as follows. p (E) is the fraction r /(r + w) and equally, p (Ec) is the fraction w /(w + r). The ratio of the former fraction to the latter is r /w, which is the odds on, o (E). The reader is invited to try it with the numbers appropriate to High Street. Consequently, we have the general result that the odds on an event is the ratio of the probability of the event to that of its complement. That sentence translates to

         p (E)
o (E) = -------
         p (Ec)

Because a ratio printed this way takes up a lot of vertical space, it is usual to rewrite this as

(3.1)   o (E) = p (E)/p (Ec),

keeping everything on one line, as in English (see §2.9). Recall that p (Ec) = 1 − p (E), so that (3.1) may be written

(3.2)   o (E) = p (E)/[1 − p (E)].

The square brackets are needed here to show that the whole content, 1 − p (E), divides p (E); the round brackets having been used in connection with probability.

Equation (3.2) enables you to pass from probability on the right to odds on the left. The reverse passage, from odds to probability, is given by

(3.3)   p (E) = o (E)/[1 + o (E)].

To see this, note that p (E) = r /(r + w), so that dividing every term on the right of this equality by w, p (E) = (r /w)/(1 + r /w) and the result follows on noting that o (E) = r /w. Thus, if the odds on are 1/3, the probability is 1/3 divided by (1 + 1/3) or 1/4. The change from 3 in odds to 4 in probability, caused by the addition of 1 to the odds in (3.3), can be confusing. Historians have a similar problem where dates in the 16 hundreds are in the seventeenth century; and musicians have four intervals to make up a fifth, so we are in good company. Notice that if your probability is small, then the odds are small, as is clear from (3.2). Similarly, a large probability means large odds. Probability can range from 0, when you believe the event to be false, to 1, when you believe it to be true. Odds can take any positive value, however large, and probabilities near 1 correspond to very large odds; thus, a probability of 99/100 gives odds of 99.
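
The two passages between probability and odds are easy to mechanize; the following minimal sketch in Python uses exact fractions, with the function names chosen purely for illustration.

from fractions import Fraction

def odds_on(p: Fraction) -> Fraction:
    # Equation (3.2): the odds on E from your probability of E.
    return p / (1 - p)

def probability_from_odds(o: Fraction) -> Fraction:
    # Equation (3.3): your probability of E from the odds on E.
    return o / (1 + o)

p_high_street = Fraction(1, 4)
print(odds_on(p_high_street))                 # 1/3 on, that is, 3 to 1 against
print(probability_from_odds(Fraction(1, 3)))  # back to 1/4
print(odds_on(Fraction(99, 100)))             # 99, as in the text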

Odds against are especially useful when your probability is very small. For example, the organizers of the National Lottery in Britain state that their probability that a given ticket (yours?) will win the top prize is 0.000 000 071 511 238, a value that is hard to appreciate. The equivalent odds against are 13,983,815 to 1. There is only one chance in about 14 million that your ticket will win. Think of 14 million balls in the urn and only one is red. Another example is provided by the rare event of a nuclear accident.
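
The lottery figures can be verified with a short calculation; the sketch below assumes the familiar format of drawing 6 balls from 49, an assumption consistent with the odds quoted above.

import math

# One ticket matches exactly one of the equally likely ways of choosing
# 6 balls from 49.
total = math.comb(49, 6)
print(total)                        # 13,983,816
print(1 / total)                    # about 7.1511e-08, the organizers' figure
print(f"{total - 1} to 1 against")  # 13,983,815 to 1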

In everyday life, odds mostly occur in connection with betting, and it is necessary to distinguish our usage from that of bookmakers (§14.5). If a bookmaker quotes odds of 3 to 1 against High Street winning, the quotation describes a commercial transaction that is being offered and has little to do with his belief that the horse will win. All it means is that for every 1 dollar you stake, the bookmaker will pay you 3 dollars and return your stake if High Street wins; otherwise you lose the stake. The distinction between odds as a commercial transaction and odds as belief is important and should not be forgotten. You would ordinarily bet at odds of 3 to 1 against only if your odds against were smaller, or in probability terms, if your probability of the horse winning exceeded 1/4.

3.9 Knowledge Base

Considerable emphasis has been placed on simplicity, for we believe that the best approach is to try the simplest ideas and only to abandon them in favor of more complicated ones when they fail. It is now necessary to admit that the concept of your probability for an event, p (E) as just introduced, is too simple and a complication is forced upon us. The full reason for this will appear later but it is perhaps best to introduce the complication here, away from the material that forces it onto our attention. Our excuse for duping the reader with p (E) is a purely pedagogical one of not displaying too many strange ideas at the same time.

Suppose that you are contemplating the uncertain event of rain tomorrow and carrying through the comparison with the balls in the urn, arriving at the figure of 0.3. It then occurs to you that there is a weather forecast on television in a few moments, so you watch this and, as a result, revise your probability to 0.8 in the light of what you see. Just how this revision should take place is discussed in §6.3. So now there are two versions of your belief in rain tomorrow, 0.3 and 0.8. Why do they differ, for they are probabilities for the same event? Clearly because of the additional information provided by the forecast, which changes the amount of knowledge you have about tomorrow's weather. Generally, your belief in any event depends on your knowledge at the time you state your probability and it is therefore oversimple of us to use the phrase “your probability for the event”. Instead, we should be more elaborate and say “your probability for the event in the light of your current knowledge”. What you know at the time you state your probability will be referred to as your knowledge base.

The idea being expressed here can alternatively be described as saying that any probability depends on two things, the uncertain event under consideration and what you know, your knowledge base. It also depends on the person whose beliefs are being expressed, you, but as we have said, we are only thinking about one person, so there is no need to refer to you explicitly. We say that probability depends on two things, the event and the knowledge. Some writers on probability fail to recognize this point, with a resulting confusion in their thoughts. One expert produced a wrong result, which caused confusion for years, the expert being so respected that others thought he could not be wrong. In the light of this new consideration, the definition of probability in §3.4 can be rephrased. Your probability of an uncertain event is equal to the fraction of red balls in an urn of red and white balls when your belief in the event with your present knowledge is equal to your belief that a single ball, withdrawn at random from the urn, will be red. The change consists in the addition of the words in italics.

This necessary complexity means that the mathematical language has to be changed. The knowledge base will be denoted by 𝒦, the initial letter of knowledge, but written in script to distinguish it from an event. In place of p (E) for your probability of the event E, we write p (E |𝒦). The vertical line, separating the event from your knowledge, can be translated as “given” or “conditional upon”. The whole expression then translates as “your probability for the event E, given that your knowledge base is 𝒦”. In the example, where the event is “rain tomorrow” and your original knowledge was what you possessed before the forecast, p (E |𝒦) was 0.3. With the addition of the forecast, denoted by F, the probability changes to p (E |F and 𝒦), namely 0.8. Your knowledge base has been increased from 𝒦 to F and 𝒦.

Despite the clear dependence on how much you know, it is common to omit 𝒦 from the notation because it usually stays constant throughout many calculations. This is like omitting “you” because there is only one individual. Thus, we shall continue to write p (E) when the base is clear. In the example, after the forecast has been received, we shall write p (E |F). Although the knowledge base is often not referred to, it must be remembered that it, like you, is always present. The point will arise in connection with independence in §8.8.

Some people have put forward the argument that the only reason two persons differ in their beliefs about an event is that they have different knowledge bases, and that if the bases were shared, the two people would have the same beliefs, and therefore the same probability. This would remove the personal element from probability and it would logically follow that all with knowledge base 𝒦 for an uncertain event E would have the same uncertainty, and therefore the same probability p (E |𝒦), called a logical probability. We do not share this view, partly because it is very difficult to say what is meant by two knowledge bases being the same. In particular, it has proved impossible to say what is meant by being ignorant of an event or having an empty knowledge base, and although special cases can be covered, the general concept of ignorance has not yielded to analysis. People often say they know nothing about an event but all attempts to make this idea precise have, in my view, failed. In fact, if people understand what is under discussion, such as rain tomorrow, then, by the mere fact of understanding, they know something, albeit very little, about the topic. In this book, we shall take the view that probability is your numerical expression of your belief in the truth of an event in your current state of knowledge; it is personal, not logical.

3.10 Examples

Let us return to some of the examples of Chapter 1 and see what the ideas of this chapter have to say about them. With almanac questions, such as the capital of Liberia, the numerical description of your uncertainty as probability would not normally be a worthwhile exercise, though notice how, in the context of “Trivial Pursuit” your probability would change as you consulted with other members of your team and, as a result, your knowledge base would be altered. A variant of the question would present you with a number of possible places, often four, that might be the capital, one of which is correct. This multiple-choice form does admit a serious and worthwhile use of probability by asking you to attach probabilities to each of the four possibilities rather than choosing one as being correct, which is effectively giving a value 1 to one possibility and 0 to the rest. An advantage of this proposal in education is that the child being asked could face up to the uncertainty of their world and not be made to feel that everything is either right or wrong. There are difficulties in making such probability responses to multiple-choice examination questions, but these have been elegantly overcome and the method has been made a real, practical proposal.

The legal example of guilt (Example 3) is considered in more detail in §10.14 but for now, just note how the uncertain event remains constant throughout the trial but the knowledge base is continually changing as the defense and prosecution present the evidence.

Medical problems (selenium, Example 4, and fat, Example 15) are often discussed using probability and therefore raise a novel aspect because you, when contemplating your probability, may have available one or more probabilities of others, often medical experts in the field. You may trust the expert and take their probability as your own but there is surely no obligation on you to do so since the expert opinion has to be combined with other information you might have, such as the view of a second expert or features peculiar to you. There exists some literature, which is too technical for inclusion here, on how one person can use the opinions of others when these opinions are expressed in terms of probability.

Historians only exceptionally embrace the concept of probability, as did one over the princes in the tower (Example 5) but they are enthusiasts for what we have called coherence, even if their form is less numerate than ours. When dealing with the politics of a period, historians aim to provide an account in which all the features fit together; to provide a description in which the social aspects interact with the technological advances and, together with other features, explain the behavior of the leading figures. Their coherence is necessarily looser than ours because there are no rigid rules in history, as will be developed for uncertainty, but the concepts are similar. Whether probability will eventually be seen to be of value in historical research remains to be seen though, since so much in the past is uncertain, the potential is there.

The three examples, of card play (Example 7), horse racing (Example 8), and investment in equities (Example 9) are conveniently taken together because they are all intimately linked with gambling, though that term seems coarse in connection with the stock market; nevertheless, the placing of a stake in anticipation of a reward is fundamentally what is involved. Games of chance have been intimately connected with probability, and it is there that the calculus began and where it still plays an important role, so that many cardplayers are knowledgeable about the topic. There is a similar body of experts in odds, namely, bookmakers, but here the descriptive results seem to be at variance with the normative aspects of §2.5 (see §14.5). Bookmakers are very skilled and it would be fascinating to explore their ideas more closely, though this is hindered by their understandable desire to be ahead of the person placing the bet, indulging in some secrecy. A descriptive analysis of stockbrokers would be even more interesting since they use neither odds nor probabilities. There is a gradation from games of chance, where the probabilities and rewards are agreed and explicit; to horse racing, where the rewards are agreed but probabilities are not, since your expectation of which horse is going to win typically differs from mine; to the stock market, where nothing is exposed except the yields on bonds.

Some of the examples, but especially that of opinion polls before an election (Example 12), are interesting because the open statement of uncertainty itself can affect that uncertainty. An obvious instance of this arises when a poll says that the incumbent is 90% certain of winning the election, with the result that her supporters will tend not to vote, deeming it unnecessary, and your probability of her victory will drop. The other examples either raise no additional issues, or, if they do, the issues are better discussed when we have more familiarity with the calculus of probability to be developed in the following chapters. Instead, let us recapitulate and see how far we have got.

3.11 Retrospect

It has been argued that the measurement of uncertainty is desirable because we need to combine uncertainties and nothing is better and simpler at combination than numbers. Measurement must always involve comparison with a standard and here we have chosen balls in urns for their simplicity. Other standards have been used, perhaps the best being some radioactive phenomena, which seem to be naturally random and, like krypton light, have the reliability of physics behind them. It has been emphasized that the role of a standard is not that of a practicable measuring tool but rather a device for producing usable properties of uncertainty. From these ideas, the notion of probability has been developed and one rule of its calculus derived, namely that the probabilities of an event and its complement add to one.

So what has been achieved? Quite frankly, not much, and you are little better at assessing or understanding uncertainty than you were when you began to read. So has it been a waste of time? Of course, my answer is an emphatic “No”. The real merit of probability will begin to appear when we pass from a single event to two events, because then the two great rules of combination will arise and the whole calculus can be constructed, leading to a proper appreciation of coherence. Future chapters will show how new information, changing your knowledge base, also changes your uncertainty, and, in particular, explains the development of science with its beautiful blend of theory and experimentation. When we pass from two events to three, no new rules are needed, but surprising features arise that have important consequences.
