CHAPTER 8

When Are Crowds Wise?

We have noted that on many questions, the average or majority view of group members turns out to be stunningly accurate. One reason is that by polling all members and by avoiding social influences, groups make very efficient use of information. And if the groups are diverse and the judgments are independent, there is a lot of information to aggregate.

Imagine that a company is attempting to project its sales for certain products in the following year; maybe the company needs an accurate projection to know how much to spend on advertising and promotion. Might it do best to poll its salespeople and trust the average number?1 Or suppose that a public official is deciding on whether to embark on a new program in the hope that the program would reduce unemployment. Should the official poll her advisers and take the average answer?

Many people, including most executives, would say, “No! Why settle for an average answer? We want the best answer.” But this view, which is often based on the illusion of omniscience in hindsight, is wrong. The problem is that in foresight, before the judgment has been rendered, no one knows which individual has the best answer. In hindsight, it seems (too) easy. “We should have just relied on Fred; he had the answer.” But of course, before the answer was known, no one knew for sure that Fred would be right. We return to this bad habit of chasing the expert in chapter 9.

The Condorcet Jury Theorem and the Law of Large Numbers

To understand why group averages and majority-rule judgments might be so accurate, we can learn much from a long historical tradition. A large part of the answer comes from the Condorcet Jury Theorem.2 The Marquis de Condorcet, after whom the theorem is named, is one of the greatest social theorists of all time. Working in the eighteenth century, he used a little mathematics to produce some exceptionally important results. Condorcet was especially interested in how groups might aggregate the judgments of individuals. Writing around the time of the French Revolution, he was obsessed with the question of how to justify a shift from a dictatorial monarchy to a democracy. How could rule via the “voice of the people” be justified rationally?

To see how the Jury Theorem works, suppose that group members are all answering the same question, which has two possible answers, one false and one true. Assume too that the probability that each member will answer correctly exceeds 50 percent. The Jury Theorem says that the probability of a correct answer, by a majority of the group, increases toward 100 percent as the size of the group increases. This result holds even if the members are only slightly more likely to be correct than incorrect and are far from 100 percent accurate as individuals. The central point is that groups will do better than individuals, and big groups better than little ones, so long as two conditions are met: majority rule is used and each group member is more likely than not to be correct.

The theorem is not exactly intuitive, but it is based on some pretty simple arithmetic. Suppose, for example, that an employer has appointed a three-person team and that each member has a 67 percent probability of being right. The probability that a majority vote will choose the correct answer is 74 percent. As the size of the group increases, this probability increases too, at least so long as each member is more likely to be right than wrong. It should be clear that as the likelihood of a correct answer by individual members increases (say, from 60 percent to 70 percent), the likelihood of a correct answer by the group increases too, at least if majority rule is used. If group members are 80 percent likely to be right, and the group contains ten or more people, the probability of a correct answer by the majority is very close to 100 percent.
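For readers who want to check the arithmetic, here is a minimal sketch in Python. It assumes that members vote independently and that "majority" means more than half of the group; the function name majority_correct is ours, chosen for illustration.

```python
from math import comb

def majority_correct(n: int, p: float) -> float:
    """Probability that more than half of n independent voters are right,
    when each voter is right with probability p."""
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

# Three-person team, each member 67 percent likely to be right.
print(round(majority_correct(3, 0.67), 3))  # 0.745

# Larger groups and more competent members both push the figure up.
for n in (3, 11, 101):
    print(n, round(majority_correct(n, 0.60), 3), round(majority_correct(n, 0.80), 3))
```

With an even-sized group, a tie counts as a failure under this definition, which is one reason the illustration sticks to odd sizes.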

The importance of the Jury Theorem lies in its demonstration that so long as deliberation is not involved and the groups are purely statistical (in the sense that members are being polled independently, rather than influencing one another), they are likely to do better than individuals. Moreover, large groups are likely to do better than small ones, if majority rule is used and if each person is more likely than not to be correct. The point bears on decisions of all kinds of groups, including businesses, religious organizations, and of course political institutions.

The Condorcet Jury Theorem is essentially a more specific version of a very general statistical principle: the law of large numbers. The basic idea is that if the inputs have some kernel of truth to them (in the case of the Jury Theorem, the individuals are more likely to be right than wrong) and if they are relatively independent, then the more inputs, the more accurate the average or the majority answer. This was exactly the conclusion reached by Francis Galton about a century after Condorcet (we mentioned Galton’s study in chapter 1). Interestingly, Galton was also motivated by the philosophical question of when “the crowd” will be wise (his paper was titled Vox Populi)—although Galton, who was something of an elitist, was surprised by the accuracy of the average estimate.3 We have mentioned his study of fairgoers estimating the weight of a prize ox, where the average was within two pounds of the truth, and the median, the measure Galton favored for mathematical reasons, was within ten pounds; both crowd estimates beat the best individual estimator.

In addition to the benefit of many individual estimates, different perspectives or evidence also promote better answers or solutions. This kind of diversity is highly desirable and is almost certain to increase the quality of the solution selected or calculated.

In the case of estimating numerical answers, the value of diversity is easy to see. The key property of a sample of estimates of a single value is that the estimates are unbiased and equally likely to fall above or below the true value. If the individual estimates “bracket” the truth, then any degree of variability can be corrected when an average value is calculated, and the errors cancel one another out.4

To demonstrate the point, we have to use a little arithmetic.5 Consider the number of nation states in the United Nations. Suppose two individuals estimate there are 250 and 150 (in 2014, the correct answer was 192). Those two estimates “bracket” the truth; 250 is 58 too high, while 150 is 42 too low. So their absolute errors (the difference between the individual’s estimate and the actual number) are 58 and 42, and the average individual error is 50, or (58 + 42)/2.

But consider the average of the two estimates, or 200, with its much lower absolute error of only 8. If the estimates bracket the truth, the error of the average estimate will always be lower than the average error of the individual estimates. This means that if two or more individual estimates bracket the truth, the average of the estimates will be more accurate than the typical individual estimate, where the right benchmark for the typical individual is the average of the individual errors.

Now what if the individual estimates do not bracket the truth and hence are all on the same side of the truth (either too high or too low)? In this situation, the error of the average estimate will always equal the average of the individual errors; it will be no better, but also no worse. In the UN example, suppose the individual estimates were 200 and 250 (both too high and both on the same side of the truth). Then the individual errors would be 8 and 58, and the average of those individual errors would be 33. This is the same as the error of the average (the average is 225, and the error of that average is 225 – 192 = 33).

The conclusion is that the average estimate is always better (smaller error) if the individual estimates bracket the truth, and it is equal if all the individual estimates are over- or underestimates. It follows that the average is always at least as good as a typical, representative individual. Bracketing is, of course, likelier when the estimates come from diverse sources that differ in background information, perspectives, or strategies to reach an answer. A more general version of this principle is the intuition that triangulation is a useful method to determine the validity of facts or procedures. If different lines of analysis or independent sources of evidence converge on a single answer, then that answer is more likely to be correct.6
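The two UN cases can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; the function names are ours, and the figure of 192 is the one used in the example.

```python
def error_of_average(estimates, truth):
    """Absolute error of the group's average estimate."""
    average = sum(estimates) / len(estimates)
    return abs(average - truth)

def average_error(estimates, truth):
    """Average of the individuals' absolute errors (the 'typical' individual)."""
    return sum(abs(e - truth) for e in estimates) / len(estimates)

TRUTH = 192  # figure used in the example

# Estimates that bracket the truth: the average beats the typical individual.
print(error_of_average([250, 150], TRUTH), average_error([250, 150], TRUTH))  # 8.0 50.0

# Estimates on the same side of the truth: the two measures coincide.
print(error_of_average([200, 250], TRUTH), average_error([200, 250], TRUTH))  # 33.0 33.0
```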

But what if the group’s estimators are not diverse? The answer is that the group average will still usually beat most individuals, but the advantage of averaging is reduced. The full power of error canceling, achieved when a truly diverse sample is obtained, with lots of bracketing of the truth, is lost. But although the average is biased, it is no more biased than the typical (also biased) individual.

The disadvantage of lack of diversity (that is, a dependency between the individual estimates) is compounded when the estimators communicate with one another or seek additional information. Then we see the polarization effect kicking in, and the individuals can be locked into a spiral of self-confirmatory discussions that make the group even more biased than it was when the individual estimates were not subjected to discussion.

Lots of Numbers

In the context of what we are calling statistical groups, several of Condorcet’s stringent assumptions are met. Indeed, the likelihood that they will be met is higher with statistical groups than with deliberating ones. Condorcet assumed that (1) people would be unaffected by whether their votes would be decisive, (2) people would be unaffected by one another’s votes, and (3) the probability that one group member would be correct would be statistically unrelated to the probability that another group member would be correct.7

The first two assumptions plainly hold for statistical groups, such as those assessing the number of beans in a jar, the weight of prize livestock, or the number of purchases of a new product. People do not know what others are saying, and hence they cannot be influenced by social pressure to conform with other members of the group. The third assumption may or may not be violated. Those who have similar training or who work closely together will be likely to see things in the same way, possibly leading to a violation of the third assumption. On the other hand, the Condorcet Jury Theorem has been shown to work even in the face of violations of this third assumption; we put the technical complexities to the side here.8

To get a clearer sense of why statistical groups often perform so well, note that even if not everyone in the group is more than 50 percent likely to be right, the theorem’s optimistic predictions may well continue to hold. Suppose, for example, that 60 percent of people are 51 percent likely to be right and that 40 percent of people are 50 percent likely to be right; or that 45 percent of people are 40 percent likely to be right and that 55 percent of people are 65 percent likely to be right; or even that 51 percent of people are 51 percent likely to be right and that 49 percent of people are merely 50 percent likely to be right. Even under these conditions, the likelihood of a correct answer, via majority vote, will move toward 100 percent as the size of the group increases. It will not move as quickly as it would if every group member were highly likely to be right, but it will nonetheless move.
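To illustrate one of these mixed cases, the sketch below computes the exact probability of a correct majority when members differ in accuracy. It assumes independent votes, uses the mix in which 45 percent of members are 40 percent likely to be right and 55 percent are 65 percent likely to be right, and picks odd group sizes to avoid ties. The function names are ours.

```python
def majority_correct_mixed(probs):
    """Exact probability that more than half of the voters are right,
    given each voter's own probability of being right (votes independent)."""
    dist = [1.0]  # dist[k] = probability that exactly k voters so far are right
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this voter is wrong
            nxt[k + 1] += mass * p     # this voter is right
        dist = nxt
    need = len(probs) // 2 + 1
    return sum(dist[need:])

def mixed_group(n, share_weak=0.45, p_weak=0.40, p_strong=0.65):
    """Build a group of n voters with the mix described in the text."""
    n_weak = round(share_weak * n)
    return [p_weak] * n_weak + [p_strong] * (n - n_weak)

for n in (11, 101, 1001):
    print(n, round(majority_correct_mixed(mixed_group(n)), 3))
```

The printed probabilities rise with group size, although more slowly than they would if every member were 65 percent likely to be right.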

We could imagine endless variations on these particular numbers. Even if some or many group members are more likely to be wrong than right, a majority vote can produce the correct answer if many other members are more likely to be right and if the group is big enough. Consider another possibility, one with great practical importance: 40 percent of group members are more likely than not to be right, and 60 percent are only 50 percent likely to be right, but the errors of the 60 percent are entirely random. Because those who blunder do so randomly, the group, if it is big enough, will still end up with the right answer.

Here is the reason, which follows from our account of the law of large numbers: if a core of people have some insight into what’s right, and if the rest of the group make genuinely random errors, the majority will be driven in the direction set by the core. Suppose that a thousand people are asked the name of the actor who played Katniss in the Hunger Games movies. If 40 percent of the group choose Jennifer Lawrence (the right answer, of course), and if the other 60 percent make random guesses, Jennifer Lawrence will emerge with the highest number of votes.
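A small simulation makes the same point. It is a sketch under assumptions that are ours, not part of the example: the uninformed 60 percent guess uniformly among five candidate names, four of which are placeholders, and the random seed is fixed only to make the run repeatable.

```python
import random
from collections import Counter

CORRECT = "Jennifer Lawrence"
# Hypothetical answer options; only the first name comes from the text.
OPTIONS = [CORRECT, "Actor B", "Actor C", "Actor D", "Actor E"]

def poll(n_people=1000, share_informed=0.40, seed=0):
    """Tally votes when an informed core answers correctly and the rest guess at random."""
    rng = random.Random(seed)
    n_informed = round(n_people * share_informed)
    votes = [CORRECT] * n_informed
    votes += [rng.choice(OPTIONS) for _ in range(n_people - n_informed)]
    return Counter(votes)

tally = poll()
print(tally.most_common())  # the correct answer leads by a wide margin
```

Because the random guesses spread roughly evenly across the options, no wrong answer can accumulate enough votes to overtake the informed core.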

Lots of Possibilities

In studies of statistical groups, most judgments do not involve a binary choice, that is, a choice between just two possibilities. The easiest cases for use of the Jury Theorem ask a simple yes-or-no question. Compare the very different questions of how many beans are in a jar, how many pounds a given object weighs, how many bombs a nation has, how many copies of a certain book will sell, or how much money an investment will make in the following year. When many options are offered, will the Jury Theorem hold?

Note first that in answering such questions, each person is effectively being asked to answer a long series of binary questions—ten beans or a thousand, twenty beans or five hundred, fifty beans or one hundred, and so on. If a big-enough group is asked to answer such questions, and if most individual answers are better than chance, the average answer is likely to be highly accurate.

Unfortunately, the combination of probabilities for a series of binary results might mean that things will turn out poorly. If someone is 51 percent likely to answer each of two questions correctly, the probability that he will answer both questions correctly is only slightly higher than 25 percent. Suppose that you are 51 percent likely to be right on whether there are 800 or 780 beans in a jar and 51 percent likely to be right on whether there are 780 or 760 beans in a jar. Sad to say, you’re almost 75 percent likely to get either one or both of the two questions wrong, and things get rapidly worse as the number of questions increases.

If someone is 51 percent likely to answer each of five questions correctly, the likelihood that he will answer all five questions correctly is very small, a little over 3 percent. But with many large groups, the average answer will nonetheless be quite accurate. Here is the key point, again a product of the law of large numbers: if a significant number of people are more likely to choose the correct answer than any of the incorrect ones, and if errors are randomly distributed, then the average judgment will be quite reliable.
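The individual-level arithmetic for two and five questions is simple multiplication, sketched here in Python on the assumption that the answers to the separate questions are independent. The function name is ours.

```python
def all_correct(p: float, n_questions: int) -> float:
    """Probability of answering every one of n independent questions correctly."""
    return p ** n_questions

print(round(all_correct(0.51, 2), 4))      # 0.2601: both of two questions right
print(round(1 - all_correct(0.51, 2), 4))  # 0.7399: at least one of the two wrong
print(round(all_correct(0.51, 5), 4))      # 0.0345: all five questions right
```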

More-technical analysis offers a nice result for people who are interested in using statistical groups, which is that even with a range of options, the correct outcome is more likely to attract plurality support than any of the others.9 The central idea is that if group members face three or more choices, the likelihood that the best option will win a plurality increases with the size of the group if each individual member is more likely to vote for the best option than for any of the others. As the number of members expands to infinity, the likelihood that the correct answer will be the plurality’s choice increases to 100 percent.

The mathematical details are not important. What matters is that there is good reason to think that employers and others might reasonably conclude that it makes sense to ask a lot of people and to rely on the average or majority answer.

The Dark Side of the Jury Theorem

Unfortunately, the Jury Theorem has a dark side, which also has important implications. Most important, there is a variation on a problem we have already encountered: “garbage in, garbage out.” Suppose that most people in a group are more likely to be wrong than right, maybe because they suffer from some kind of behavioral bias. If so, the likelihood that the group’s majority will decide correctly falls to zero as the size of the group increases!

Here is an important warning: groups should not be carried away by the seductive illusion of accuracy of statistical averages in every circumstance. Accuracy can be found only under certain conditions—most importantly, those in which many or most people are likely to be right. If group members are inclined in the wrong direction, it would be a mistake to rely on statistical averages.

To see the problem more concretely, imagine that a group consists of a number of people, each of whom is at least 51 percent likely to be mistaken. The group might be a small business, a political party, a religious organization, or a law firm. The probability that the group will err approaches 100 percent as its size expands. Condorcet himself explicitly signaled this possibility and its source: “In effect, when the probability of the truth of a voter’s opinion falls below ½, there must be a reason why he decides less well than one would at random. The reason can only be found in the prejudices to which this voter is subject.”10
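The same binomial arithmetic that supports the optimistic result shows the dark side. The sketch below assumes independent voters and odd group sizes; the 49 percent case corresponds to members who are 51 percent likely to be mistaken, and the 45 percent case is our own added illustration of a stronger bias.

```python
from math import comb

def majority_correct(n: int, p: float) -> float:
    """Probability that more than half of n independent voters are right,
    when each voter is right with probability p."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

# When each member is more likely wrong than right, bigger groups do worse.
for n in (11, 101, 501):
    print(n, round(majority_correct(n, 0.49), 3), round(majority_correct(n, 0.45), 3))
```

Each column falls as the group grows, and it falls faster when the individual bias is stronger.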

If group members are prejudiced in some way, the average answer is likely to be way off. Moreover, the errors might stem not only from literal prejudices, but also from confusion, ignorance, poor information-processing, or simple incompetence. On hard questions requiring specialized knowledge, there is no reason to think that the average answer of group members will be right. So too on complex issues involving business or politics. And even if people are competent, they might well be led astray, especially if they are dealing with highly technical questions.

We have emphasized that individuals are subject to a wide range of biases uncovered by behavioral scientists. As noted earlier, the optimistic bias means that most people show a systematic tendency toward unrealistic optimism.11 There is good reason to think that in both business and politics, optimistic bias is a pervasive problem. If group members are unrealistically optimistic, then statistical groups will reflect that bias, producing mistaken judgments. If a company’s senior managers think that a product is bound to succeed and if they are unrealistically optimistic, the group is not well served by relying on the view of the average member. And if those involved in a political campaign think that their candidate is best and that voters will inevitably concur with this opinion, the systematic bias will be reflected in the average view within the group. (Recall that in 2012, many supporters of Mitt Romney believed that Romney would win, even though all the empirical evidence suggested otherwise.)

Of course, falsehoods are often the conventional wisdom. And in many contexts, group members are in fact likely to blunder. Joseph Henrich and his coauthors offer some examples of misguided beliefs: “Many Germans believe that drinking water after eating cherries is deadly; they also believe that putting ice in soft drinks is unhealthy. The English, however, rather enjoy a cold drink of water after some cherries; and Americans love icy refreshments.”12 At least one group must be wrong. If most people in certain countries believe that the United States was itself responsible for the terrorist attacks in New York on 9/11—and they do—then large groups have badly blundered. In short, the optimistic conclusion of the Jury Theorem holds only if we assume a certain level of accuracy on the part of the people involved. That assumption might be heroic.

Biases and Blunders

With the dark side of the Jury Theorem in mind, we should see that a systematic bias in one or another direction will create problems for the answers of statistical groups. Suppose that Arab terrorists were asked about the recent history of the United States, or that men with misogynist attitudes were asked to evaluate women’s abilities. Major errors would be inevitable, however large the number of people involved. Or suppose that some bias is leading group members to favor one or another product (say, a new tablet or automobile) or even a candidate for public office; perhaps most of the members have been manipulated or misled. The dark side of the Jury Theorem ensures that big mistakes will result from relying on averages. Condorcet was fully alert to this problem, saying that it is “necessary, furthermore, that voters be enlightened; and that they be the more enlightened, the more complicated the question upon which they decide.”13

If they rely on statistical averages, legislatures and executive officers may make bad decisions for this very reason. Speaking of democracy in general, Condorcet made the point clearly, calling it “a rather important observation”: “A very numerous assembly cannot be composed of very enlightened men. It is even probable that those comprising this assembly will on many matters combine great ignorance with many prejudices. Thus there will be a great number of questions on which the probability of the truth of each voter will be below 1/2. It follows that the more numerous the assembly, the more it will be exposed to the risk of making false decisions.”14

Condorcet contended that because of the risk of pervasive prejudice and ignorance, “it is clear that it can be dangerous to give a democratic constitution to an unenlightened people.” And even in societies with relatively enlightened people, he believed that citizens should not make decisions themselves, but should generally be restricted to the role of electing representatives, “those whose opinions will have a large enough probability of being true.”15 Condorcet was speaking of very large groups, but his observation also holds for small ones whenever the average view may be wrong.

The Payoff: When Should Groups Use Averages?

For groups, it would be a big mistake to suggest that the best approach to hard questions is always to ask a large number of people and to take the average answer. That approach is likely to work only under particular circumstances: those in which many or most people are more likely than not to be right. Such circumstances might be found when, for example, a company president is asking a group of informed advisers about the proper course of action, or when a dean at a university is asking the faculty whether to hire a certain job candidate, or when a head of a government agency is consulting a group of scientists about whether a particular pollution problem is likely to be serious. In all of these cases, there might well be reason to trust the people who are being asked, and if so, the average answer is likely to be right.

The law of large numbers has another implication. If you know little about the truth, cannot correctly select the most accurate individual judges beforehand, and suspect systematic biases in the individual estimates, then you should still go with the average estimate. In the absence of better information (including knowledge of the direction of the individuals’ biases), it is better to rely on the average. An average will be biased, but still more accurate than a typical individual member. And again, diversity is your friend.

The implications for group behavior are mixed. To the extent that the goal is to arrive at the correct judgments on facts, the Condorcet Jury Theorem affords no guarantees. In numerous domains, too many people are likely to blunder in systematic ways. Wise groups fully recognize this point. Now let’s turn to the question of expertise and how groups might best harness it—a question that has a close connection to the Jury Theorem.
