21 What's in a Game? The Start of Probability Theory

In 1654 the Chevalier de Méré, a wealthy French nobleman with a taste for gambling, proposed a gaming problem to the mathematician Blaise Pascal. The problem was how to distribute the stakes in an unfinished game of chance. The "stakes" are the amounts of money each gambler bets at the start of a game. By custom, as soon as these bets are made, that staked money belongs to no one until the game is over, at which time the winner or winners get it all. De Méré's question, now known as the "Problem of Points," was how to divide the stakes of an unfinished game if the partial scores of the players are known. In order to be "fair," the answer should somehow reflect each player's likelihood of winning the game if it were to be finished. Here's a simple version of De Méré's Problem of Points.1

Xavier and Yvon have staked $10 each on a coin-tossing game. Each player tosses the coin in turn. If it lands heads up, the player tossing the coin gets a point; if not, the other player gets a point. The first player to get three points wins the $20. Now suppose the game has to be called off when Xavier has 2 points, Yvon has 1 point, and Xavier is about to toss the coin. What is a fair way to divide the $20?

The actual Problem of Points considered by Pascal asks that question for all the possible scores in an interrupted game of this kind. Pascal communicated the problem to Pierre de Fermat, another prominent French mathematician, and from their correspondence a new field of mathematics emerged. Using somewhat different methods, these two mathematicians arrived at the same answer to the problem. Here is Pascal's way of answering our simple case:2

A fair coin is equally likely to turn up heads or tails. Thus, if each player had two points, each would be equally likely to win the game on the next toss, so it would be fair for each player to get $10, half of the staked amount at that stage. In this case Xavier has 2 points and Yvon 1. If Xavier tosses the coin and wins, he has 3 points and hence gets the $20. If Xavier loses, then each player has 2 points and hence each is entitled to $10. So Xavier is guaranteed at least $10 at this stage. Since it is equally likely that Xavier would win or lose on this toss, the other $10 should be split equally between the players. Therefore, Xavier should get $15 and Yvon $5.
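Pascal's step-by-step reduction can be checked by brute force. The short Python sketch below (a modern illustration, of course, not anything from the 1654 correspondence) works backward through every way the interrupted game could finish, assuming a fair coin, and computes each player's expected share of the $20 stakes:

```python
from fractions import Fraction

def share(x, y, target=3, stakes=20):
    """Expected share of the stakes for the player with x points,
    against an opponent with y points, when each remaining toss is a
    fair 50-50 and the first player to reach `target` takes it all."""
    if x == target:
        return Fraction(stakes)   # this player has already won
    if y == target:
        return Fraction(0)        # the opponent has already won
    # Average the two equally likely outcomes of the next toss.
    return Fraction(1, 2) * (share(x + 1, y, target, stakes)
                             + share(x, y + 1, target, stakes))

print(share(2, 1))  # Xavier's fair share: 15
print(share(1, 2))  # Yvon's fair share: 5
```

The recursion mirrors Pascal's reasoning exactly: each undecided position is worth the average of the two positions one toss later, and the shares at any stage always total the full stakes.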

Pascal handled the other cases of the interrupted game in turn, reducing each one to a previously solved situation and dividing the money accordingly. Once Pascal and Fermat found that both of their methods led to the same answers, their correspondence petered out. Unfortunately, their work did not become known until much later. But the question was in the air, and soon other scholars took up the challenge of analyzing gambling games.

Placing a numerical measure on the likelihood that something unknown might happen or might have happened is the central idea of probability. The key to understanding this process begins with the idea of equally likely outcomes, as Pascal's solution suggests. If a situation can be described in terms of possible outcomes that are equally likely, then the probability that one of them might happen is just 1 divided by their total number. This principle was recognized and explored by Girolamo Cardano more than a century before the Chevalier de Méré was playing dice, but his book on the subject, Liber de Ludo Aleae (Handbook on Games of Chance), was not published until nine years after Pascal and Fermat had solved the Chevalier's problem. In that book, Cardano recognized a related principle that we now call the Law of Large Numbers. In terms of equally likely outcomes, this principle is simply an affirmation of our common sense:

If a game (or other experiment) with n equally likely outcomes is repeated a large number of times, then the number of times each outcome actually occurs will tend to be close to 1/n of the total number of repetitions. The more times the game is played, the closer the results will come to matching this ratio.

If a single fair die is thrown, any one of its six faces is equally likely to turn up. Thus, for example, we have one chance in six of throwing a 5; its probability is 1/6. This doesn't guarantee that a 5 will turn up exactly once if we throw the die six times, but the Law of Large Numbers says that, if we throw the die 100 or 1000 or 1,000,000 times, the number of times that a 5 appears will tend to get closer and closer to 1/6 of the total number of throws.
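A quick simulation makes the Law of Large Numbers visible. This sketch (a modern aid, with an arbitrary seed chosen only for reproducibility) counts how often a 5 appears in longer and longer runs of throws of a fair die:

```python
import random

random.seed(0)  # arbitrary seed, so the run is reproducible

# Count how often a 5 appears in increasingly long runs of throws.
for n in (100, 1000, 1_000_000):
    fives = sum(1 for _ in range(n) if random.randint(1, 6) == 5)
    print(n, fives / n)  # the ratio drifts toward 1/6, about 0.1667
```

The short runs can wander noticeably away from 1/6, but by a million throws the observed ratio sits within a fraction of a percent of it, just as Cardano's principle predicts.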

Assigning probabilities to the outcomes of such situations depends on being able to count accurately the total number of equally likely possibilities. This can be a bit tricky sometimes. For instance, there are eleven possible outcomes for throwing a pair of dice (getting 2 or 3 or ... or 12 spots), but they're not all equally likely. If you count up all 36 possible pairings of the numbers 1 through 6 for each die, six of them add up to 7, but only one of them adds up to 12. Thus, the probability of throwing 7 is 6/36, or 1/6, but the probability of throwing 12 is only 1/36. In some sense, each of the different numerical outcomes must be "weighted" by accounting for the number of different equally likely ways it might occur.
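The counting argument for a pair of dice can be carried out mechanically. This sketch (a modern illustration) enumerates all 36 equally likely pairings and tallies how many produce each total:

```python
from collections import Counter
from fractions import Fraction

# All 36 equally likely (first die, second die) pairings.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

for total in (7, 12):
    print(total, counts[total], Fraction(counts[total], 36))
# 7 occurs in 6 of the 36 ways (probability 1/6);
# 12 occurs in only 1 way (probability 1/36).
```

The tally also confirms that the eleven possible totals, weighted this way, account for all 36 pairings, so their probabilities add up to exactly 1.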

The principle of weighting outcomes in this way can be extended to situations in which counting equally likely possibilities won't work. For instance, a spinner at the center of a disk with red, yellow, and blue segments of different sizes is not equally likely to stop on any particular color. Rather, the size of each colored segment should "weight" the likelihood of its occurrence; if half the disk is colored red, then the probability of a fair spinner stopping on red should be ½, and so on. The underlying principle of probability, recognized by Cardano, Pascal, and Fermat, is that the probability measurement assigned to each possible outcome should be a number less than 1 (reflecting its relative likelihood of occurring) and that the numbers for all the possible outcomes in a situation should add up to exactly 1. (If you think of probabilities as "likelihood percentages," then the total of all of them should be 100%.)

In 1657, the Dutch scientist Christiaan Huygens became aware of the ideas of Pascal and Fermat and began to work more systematically on the question. The result was On Reasoning in a Dice Game, which extended the theory to games involving more than two players. Huygens's approach started from the idea of "equally likely" outcomes. His central tool was not the modern notion of probability, but rather the idea of expectation or "expected outcome." Here is a simple example.

You are offered one chance to throw a single die. If 6 comes up, you get $10; if 3 comes up, you get $5; otherwise, you get nothing. What is a fair price to pay for playing this game?

From the modern point of view, the mathematical expectation of a game is found by multiplying each possible reward by the probability of receiving it and adding the results. In this case, each of the six faces is equally likely to turn up (assuming the die is fair), so you have one chance in six of getting $10, one chance in six of getting $5, and four chances in six of getting nothing. Therefore, the mathematical expectation is

(1/6 × $10) + (1/6 × $5) + (4/6 × $0) = $2.50
This means that, if a casino were to offer this game to its customers for a fee of $2.50, it would expect to break even in the long run. If it charged $3 to play the game, it should expect to make $.50 per player, in the long run. (If you buy a $1 lottery ticket and use the data on its back to calculate your mathematical expectation, you'll find that it's considerably less than $1. That's why states run lotteries.) Huygens reversed this process, using the expectation to compute the probability instead of the other way around. But the fundamental idea was the same: Equally likely outcomes mean equal expectations.
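The expectation computation for this game is a one-liner in modern terms. The sketch below (an illustration only) sums reward × probability over the six equally likely faces, using exact fractions to avoid rounding:

```python
from fractions import Fraction

# Reward for each face of a fair die; faces not listed pay nothing.
rewards = {6: 10, 3: 5}

# Expectation: sum of (probability of face) x (reward for face).
expectation = sum(Fraction(1, 6) * rewards.get(face, 0)
                  for face in range(1, 7))
print(expectation)         # 5/2, i.e. $2.50
print(float(expectation))  # 2.5
```

Changing the `rewards` dictionary recomputes the fair price for any game of this kind, which is exactly the quantity Huygens took as fundamental.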

Mathematical expectation, like most of probability theory, applies to far more than lotteries and casino gambling. Among other things, it is fundamental to the way insurance companies assess their risks when they underwrite policies. Jakob Bernoulli recognized the wide-ranging applicability of probability in his book Ars Conjectandi ("The Art of Conjecture"), which was published in 1713, eight years after his death. The book covers a great deal of ground. In its fourth part, Bernoulli examined the relevance of theoretical probability to various practical situations. In particular, he recognized that the assumption of equally likely outcomes was a serious limitation when discussing human life spans, health, and the like, and he suggested instead an approach based on statistical data. In so doing, Bernoulli also sharpened Cardano's idea of the Law of Large Numbers. He asserted that, if a repeatable experiment had a theoretical probability p of turning out in a certain "favorable" way, then for any specified margin of error, the ratio of favorable to total outcomes in a sufficiently large number of repeated trials of that experiment would almost certainly be within that margin of error of p. By this principle, observational data can be used to estimate the probability of events in real-world situations.

If we want to do this with some precision, we need to be able to decide how many observations are needed. Bernoulli attempted to do this in his book but ran into serious problems. The mathematics was just very hard! He did manage to estimate a number of trials that he could show was enough, but the number he got was huge — so big that it must have been a great disappointment. If getting a reasonable estimate of a probability requires a ridiculously large number of trials, then it can't really be done in practice. Perhaps this is why Bernoulli's book remained unpublished until after he died. Once it was published, however, other mathematicians managed to improve on his methods and show that the number of observations didn't need to be as large as Bernoulli had thought.

The probabilistic point of view was not easily accepted. Consider life insurance, for example. We think it's obvious that probabilistic thinking will help companies make money selling life insurance. In particular, because we believe in the Law of Large Numbers, we understand that a company is better off if it sells many policies. The more policies it sells, the more likely it is that the death rates will be as expected, so that the company will make a profit. In the 18th century, however, many companies seemed to feel that each new policy sold increased the risk to the company. Hence, they felt that selling too many policies was positively dangerous!

Interest in probability questions led to a variety of results by a variety of people during the 18th century. Towards the end of that century, Pierre Simon Laplace, a French mathematician of wide-ranging interests and prodigious talent, became interested in probability questions. He wrote a series of papers on the subject between 1774 and 1786, before focusing his efforts on the mathematics underlying the workings of the solar system. In 1809, Laplace returned to probability by way of a statistical question, the analysis of probable error in scientific data gathering. Three years later, he published Théorie Analytique des Probabilités ("Analytical Theory of Probabilities"), an encyclopedic tour de force that pulled together everything he and others had done in probability and statistics up to that point. It was truly a masterwork, but its technical, dense style made much of it inaccessible to all but the most determined, mathematically sophisticated reader. British mathematician Augustus De Morgan wrote of it:3

The Théorie Analytique des Probabilités is the Mont Blanc of mathematical analysis; but the mountain has this advantage over the book, that there are guides always ready near the former, whereas the student has been left to his own method of encountering the latter.

To make his ideas more accessible to a wider audience, Laplace wrote an expository 153-page preface to the second edition in 1814. This preface, which contained very few mathematical symbols or formulas, was also published as a separate booklet, entitled Philosophical Essay on Probabilities. In it, Laplace argued for the applicability of mathematical probability to a wide range of human activities, including politics and what we now think of as the social sciences.

In this respect he was echoing the ideas of Jakob Bernoulli, whose Ars Conjectandi of a century earlier had suggested ways of applying probabilistic principles to government, law, economics, and morality. As the study of statistics has developed in the past two centuries, it has provided the means by which the vision of Bernoulli and Laplace has become a reality. Today the ideas of probability are applied not only to the fields they suggested, but to education, business, medicine, and many other areas. (For information about the history of statistics, see Sketch 22.)

For a Closer Look: See [44] for an account of the Fermat-Pascal correspondence and its outcome. There are several scholarly accounts of the history of probability: [170] is more mathematical, while [36] and [83] take (each in a different way) a broader view.


1 Adapted from [1], p. 14.

2 [1], p. 243.

3 From an 1837 review, [39, p. 347].