Golf by the Numbers

FIVE
Handicap Systems and Other Hustles

I could beat you with a shovel, a baseball bat, and a rake.
—John Montague to Bing Crosby

When a single player asks to join your threesome on the first tee, you usually want to find out how good he or she is. You might ask for the player’s handicap. A more direct approach is, “What do you usually shoot?” My brother George once asked this question and received the unusual answer of “94.”

This guy was a pleasant playing companion but turned out not to be as good as advertised, failing to break 50 on the front nine. After a decent drive on the 15th hole, he picked up his ball, thanked everybody for an enjoyable round, and walked away. It turns out that he was not mad or late for an appointment. This was his unique strategy for coping with the frustrations of being a novice golfer: he always stopped after hitting his 94th shot. Instead of fretting about breaking 100 or having to record some astronomical number, he could enjoy his round and stop before it got ugly. His goal was to play the 18th hole.

I have wondered how a match against this fellow would work. I know I could give him a handicap (let him subtract strokes from his score on certain holes), but how many strokes would I give him? It might not help him much to receive strokes on the 17th hole.

One of the most ingenious handicaps ever offered was famously suggested by amateur John Montague to Bing Crosby, in the early 1930s, after a series of the crooner’s “19th-hole” complaints about not receiving enough strokes. Bing was, at the time, a 3-handicapper, getting 5 or 6 shots but losing consistently to Montague. They headed back out to the 10th tee, Crosby with his golf clubs and Montague with a shovel, baseball bat, and rake retrieved from his car trunk. While Crosby was making a routine par, Montague tossed his ball into the air and hit it with the bat into a greenside bunker on the 366-yard hole. He then shoveled the ball on to the green, lay down with the rake, and used the handle like a pool cue to run the ball into the hole for birdie. Game over.

Undoubtedly, the Crosby family enjoyed a much prouder moment some 50 years later, when Bing’s son Nathaniel won the 1981 U.S. Amateur.

Handicap systems of a more standard nature are the subject of this chapter.

A Wee Wager

Betting has been a part of golf since its beginning. A lively account of some of this history can be found in Michael Bohn’s Money Golf: 600 Years of Bettin’ on Birdies. One of the earliest references to golf is an edict from King James II of Scotland banning golf in 1457. The official reason for the ban was that the young lads of Scotland were being distracted from their archery practice (i.e., military training). Bob Cupp’s book The Edict gives an entertaining alternative motivation for the edict, while describing what golf was like in its infancy.

The purpose of a handicap is to allow two players of different abilities to play and bet on equal footing. That is, each player should be equally likely to win their bet. For example, if I usually shoot 80 and you usually shoot 74, you might give me a 6-stroke handicap. Then, our average scores would tie, but a better-than-average 77 by me (with 6 strokes, a net 77 − 6 = 71) would beat an average 74 by you.

The challenge is to determine how many strokes would be fair. With the exception of the Man Who Shoots 94, none of us have the same score every time out. Even if we did, a 94 from the white tees at the local municipal is not the same as a 94 from the back tees at Bethpage Black. So, a handicap system must take into account a variety of scores played on a variety of courses to produce a number that “fairly” represents a golfer’s ability.

The USGA Handicap System

The handicap system used in the U.S. is administered by the United States Golf Association.¹ At first glance, it might seem impossibly complicated. One goal of this chapter is to explain each component of the system so that you can see its logic and its flaws.

Start with a score (S) for a round. The course that you played and the tee locations that you used have been assigned a course rating (CR) and a slope rating (SR) that indicate how difficult the course is. Essentially, the course rating tries to define the true par for the course. The slope rating is a correction factor that will be explained later.

The following computation gives you a differential (D) for your round.

$$D = (S - \text{CR}) \times \frac{113}{\text{SR}}$$

As an example, suppose that you shoot 82 on a course with rating 68 and slope 106. The differential for this round is

$$D = (82 - 68) \times \frac{113}{106} \approx 14.9,$$

using the convention that the differential is always rounded to the nearest tenth. For your USGA handicap, you take the differentials of your last 20 rounds and drop the 10 highest differentials. Your handicap is 96% of the average of the 10 smallest differentials.
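The differential arithmetic can be sketched in a few lines of Python. This is only an illustration, not official USGA code: the function name `differential` is made up here, and it assumes the form D = (S − CR) × 113 / SR, which is the form consistent with the 113/130 scaling used in the slope example later in this chapter.

```python
def differential(score, course_rating, slope_rating):
    """Strokes over the course rating, rescaled to an average slope of 113.

    Assumed form: D = (S - CR) * 113 / SR, rounded to the nearest tenth.
    """
    return round((score - course_rating) * 113 / slope_rating, 1)


# The example round: 82 on a course rated 68 with slope 106.
print(differential(82, 68, 106))
```

On a course with the average slope of 113 the scaling factor is 1, so the differential is simply the score minus the course rating.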

The most striking feature of the USGA’s system is the use of the best 10 rounds of the last 20. Different systems could use the middle 10 rounds (throw out the best 5 and the worst 5) or the worst 10 or even all 20 rounds. The choice of which scores to count has a large effect on the handicaps produced. The following example illustrates a flaw in the handicap system.

Suppose that golfers A and B post the following scores.

A: 71, 77, 71, 74, 74, 76, 79, 73, 73, 74, 74, 71, 74, 73, 74, 78, 73, 69, 77, 74

B: 80, 66, 79, 75, 78, 89, 68, 76, 72, 72, 80, 74, 72, 82, 78, 75, 74, 69, 77, 71

To keep things simple, suppose that for each round the course rating is par 71 and the slope rating is an average 113. Then the differentials are simply the actual scores with par 71 subtracted; in other words, the number of strokes above or below par.

A: 0, 6, 0, 3, 3, 5, 8, 2, 2, 3, 3, 0, 3, 2, 3, 7, 2, −2, 6, 3

B: 9, −5, 8, 4, 7, 18, −3, 5, 1, 1, 9, 3, 1, 11, 7, 4, 3, −2, 6, 0

The next step is to drop the 10 highest differentials for each golfer.

A: 0, x, 0, x, x, x, x, 2, 2, x, x, 0, x, 2, 3, x, 2, −2, x, 3

B: x, −5, x, x, x, x, −3, x, 1, 1, x, 3, 1, x, x, 4, 3, −2, x, 0

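Then average the remaining differentials and multiply by 0.96.

$$\text{A: } 0.96 \times \frac{0 + 0 + 2 + 2 + 0 + 2 + 3 + 2 - 2 + 3}{10} = 0.96 \times 1.2 \approx 1.15$$

$$\text{B: } 0.96 \times \frac{-5 - 3 + 1 + 1 + 3 + 1 + 4 + 3 - 2 + 0}{10} = 0.96 \times 0.3 \approx 0.29$$

Rounded to whole numbers, player A would have a handicap of 1, and player B would have a handicap of 0 (a “scratch” golfer).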
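The drop-and-average steps for players A and B can be checked with a short Python sketch. The function name `handicap` and its parameters `keep` and `factor` are illustrative choices, and the code relies on the simplifying assumption stated above (course rating 71, slope 113), so that each differential is just the score minus 71.

```python
scores_a = [71, 77, 71, 74, 74, 76, 79, 73, 73, 74,
            74, 71, 74, 73, 74, 78, 73, 69, 77, 74]
scores_b = [80, 66, 79, 75, 78, 89, 68, 76, 72, 72,
            80, 74, 72, 82, 78, 75, 74, 69, 77, 71]

COURSE_RATING = 71  # with slope 113, differential = score - 71


def handicap(differentials, keep=10, factor=0.96):
    """Average the `keep` lowest differentials and scale by `factor`."""
    best = sorted(differentials)[:keep]
    return factor * sum(best) / len(best)


diffs_a = [s - COURSE_RATING for s in scores_a]
diffs_b = [s - COURSE_RATING for s in scores_b]

for name, diffs in (("A", diffs_a), ("B", diffs_b)):
    h = handicap(diffs)
    print(name, round(h, 2), "->", round(h))
```

Changing `keep` to 20 shows what a handicap based on all rounds would look like for the same two players.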

There are some important observations to make before moving on. First, notice that player A receives a 1 handicap in spite of shooting 2 strokes or more over par in 16 out of 20 rounds. Second, player B is a scratch golfer in spite of recording 4 rounds in the 80s, including an 89! This explains the following description of the USGA handicap system: the USGA handicap measures potential, not average score.

You may have recognized the scores used in this example as being from the previous chapter, where player A is a fairly consistent player (p = 0.1), and player B is a more erratic player (p = 0.2). The erratic nature of B’s play creates more potential for low scores, and this is reflected in the lower handicap. Player A matched or bettered his handicap of 1 only 4 times out of 20, while player B matched or bettered his handicap of 0 only 4 times out of 20. You can only expect to match or beat your USGA handicap 20 to 25% of the time.

Is It Fair?

If the purpose is to even out a match between two players, you can see that the USGA system does not work.² In the above example, player A beats player B 11 out of 20 times with 2 ties, yet player A would actually get a stroke from player B! With the stroke, player A now wins 13 out of 20 matches, with 1 tie. Player A’s average score is more than one stroke better than player B’s average score (73.95 versus 75.35), but player A gets the handicap stroke. The handicap does the exact opposite of its objective, which is to even up the match.
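The head-to-head record is easy to tally round by round. The sketch below pairs the two players' scores in the order listed, which is an assumption about how the matches are formed; the function name `tally` and the parameter `strokes_to_a` are made up for illustration.

```python
from statistics import mean

scores_a = [71, 77, 71, 74, 74, 76, 79, 73, 73, 74,
            74, 71, 74, 73, 74, 78, 73, 69, 77, 74]
scores_b = [80, 66, 79, 75, 78, 89, 68, 76, 72, 72,
            80, 74, 72, 82, 78, 75, 74, 69, 77, 71]


def tally(a_scores, b_scores, strokes_to_a=0):
    """Return (wins, ties, losses) for player A, round by round."""
    wins = ties = losses = 0
    for a, b in zip(a_scores, b_scores):
        net_a = a - strokes_to_a  # A's score after receiving strokes
        if net_a < b:
            wins += 1
        elif net_a == b:
            ties += 1
        else:
            losses += 1
    return wins, ties, losses


print(tally(scores_a, scores_b))                  # no strokes given
print(tally(scores_a, scores_b, strokes_to_a=1))  # A receives one stroke
print(mean(scores_a), mean(scores_b))             # scoring averages
```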

You can object that the above numbers apply only to one special, made-up case. What can we expect for real golfers? The answer, as always, depends on the nature of the golfers. The lesson to learn from players A and B here is that the USGA handicap is strongly influenced by how erratic the golfer is. If a player has a 6 handicap, you should not expect that person to shoot 6 over par. If the player is unusually consistent, you might expect to see a score of 6 or so over the course rating (depending on the slope rating, of course). However, a more realistic expectation is that the player’s “A” game will produce a score of about 6 over the course rating.³ Golfers who do not bring their A game will score much higher than their handicaps would predict.

A couple of generic conclusions can be drawn. If two players have the same handicap, the more consistent player will have a lower score more often than not. It is then only a small step to this conclusion: for two players of different handicaps, the more consistent player will win more than half the time. Generally, a 10-handicap player will be more consistent than a 20-handicap player, since consistency is part of what makes a good golfer good. So, with appropriate “ifs” and “buts,” we can say: in general, the USGA handicap favors the better player in head-to-head competition. One case, however, in which the “better” player is not favored is if the higher handicap player is more consistent.

Why would the USGA choose a handicap system like this? Part of the answer could be to discourage the reporting of bogus high scores. I have always wanted two handicaps: a low one to brag about and a high one to help win matches. A dishonest person could intentionally throw away shots on the course or shoot a bad round to try to artificially raise his handicap, but the USGA system requires such a person to cheat on the best 10 rounds of the year, which most golfers would not be willing to ruin.

More importantly, handicaps are not used just for head-to-head matches. They are also used for tournaments. A “captain’s choice” (“scramble”) tournament, where each group plays the best of four shots in the group, is more about potential than average. The group plays the best shot and picks up the bad shots.

Some tournaments are based on net score (actual score minus handicap). In this case, fair might mean that there is an equal chance of a low handicap player and a high handicap player winning. Recall from chapter 4 that, in a tournament of 90 player As and 50 player Bs, one of the inconsistent player Bs would pull off a career round and win 56% of the time. While most player As would have a better score than their player B counterparts, player As are generally too consistent to go low enough to win. In this case, it makes sense for player A to have a higher handicap as it makes up for the absence of player A’s potential to match player B’s best score.

So, what is the bottom line? The USGA handicap system does not work very well for evening up head-to-head matches, but it works well in large, stroke-play tournaments. The better, more consistent player has an advantage in head-to-head competition: keep this in mind when making bets.

Regardless of these biases in the system, the USGA works very hard to administer its system as fairly as possible. This is the reason that the system appears to be so complicated. In particular, the slope rating of courses is the result of a much-needed correction to the old system.

Different Courses

It is obvious that a good handicap system needs to rate the difficulty of a variety of courses. It is probably not so clear why there is both a course rating and a slope rating. You might be surprised to learn the reason: the USGA recognizes that professional golfers and average golfers have different needs.⁴

Imagine two courses, one Easy and one Hard. When pros play the two courses, the average scores are 70 on Hard and 64 on Easy. So, Hard should have a course rating that is 6 above Easy’s rating. This makes sense, but wait. The drive on the third hole on Hard requires a 220-yard carry over water. This is no problem for the pros but would be a nearly automatic 2-stroke penalty for the average player. Hard has narrow fairways and tall rough, making it tough for the pros to reach some greens and costing them half a stroke on some holes when they drive into the rough. However, average golfers are doing well to reach the fairway chipping straight out from the rough. Each shot into the rough costs them at least a stroke, and they may chip out to the fairway from a wayward drive, only to have to chip out again when they flail a long iron right back into the rough.

The point is that hazards and features that increase scores a little for the professional can increase scores a lot for the average golfer. The purpose of the slope rating is to adjust the course rating for all levels of golfers. Compared to the Easy course, the Hard course may play 6 strokes higher for pros, but it might be 16 strokes harder for the average golfer and 26 strokes harder (almost unplayable) for a weak golfer. The handicap system tries to take this into account.

A Slippery Slope Rating

A thought experiment illustrates the logic of the slope rating system.⁵ Imagine a course on which a scratch golfer averages 70 for the best 10 out of 20 rounds. Then 70 becomes the course rating. You might expect that a 5-handicapper would average 75 on this course. However, as noted above, the handicap of 5 counts only the best 10 out of 20 scores and is further reduced by a factor of 0.96. Taking this into account, it might turn out that 5-handicappers actually average 75.6, 10-handicappers average 81.3, 15-handicappers average 87, and 20-handicappers average 92.6.⁶ We can visualize these results graphically, plotting the handicap on the horizontal axis and the average score on the vertical axis. When the points are plotted, they fall on a line with a slope of 1.13, as shown in figure 5.1.

The slope of 1.13 becomes the course’s slope rating of 113, a rating considered “average” by the USGA. A harder course, with long carries over water and other punitive features, would have a different distribution, with some scores skyrocketing for high-handicap players. The assumption is that the scores would line up somewhat like those in figure 5.1. The slope rating for the course is named for, and equals, the slope of the line that the scores form, multiplied by 100. In practice, a USGA rating team will establish the course rating and a rating for a bogey golfer (one who shoots 1 over par on most holes). These two data points determine a line, and the slope of that line determines the slope rating.

Figure 5.1 Hypothetical average scores for players of various handicaps on a course of average difficulty
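To see where the 1.13 comes from, you can fit a line to the four hypothetical handicap and average-score pairs quoted above. The least-squares fit below is an illustrative choice, not the USGA's procedure (which, as described, uses two rated points); the helper name `fitted_slope` is made up.

```python
handicaps = [5, 10, 15, 20]
averages = [75.6, 81.3, 87.0, 92.6]  # hypothetical averages from the text


def fitted_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


slope = fitted_slope(handicaps, averages)
print(round(slope, 3), round(100 * slope))  # slope, and slope times 100
```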

The hypothetical scores shown above are based on a general rule of thumb that the average of a golfer’s top 10 out of 20 scores will equal about 92% of the average of all 20 scores. If the golfer’s scores are normally distributed (that is, if they follow a bell curve), then the average of the best half of the scores should be about 0.8 standard deviations below the mean. For a golfer with an average score of 80 and a standard deviation of 8, the average of the best 10 out of 20 would be about 80 − 0.8 * 8 = 73.6, which equals exactly 92% of 80.⁷ Using the 92% rule, a golfer whose average score is A will have a handicap of (0.92)(0.96)A = 0.8832A. Conversely, if the player’s handicap is H, then the average score would be H/0.8832 ≈ 1.13H. The average slope rating of 113 therefore corresponds to a course on which the 92% rule holds.
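The two numbers in this argument, the 0.8 standard deviations and the factor of 1.13, can be checked directly. The sketch below uses the standard result that the better half of a normal distribution averages sqrt(2/π) standard deviations below the mean; this is an idealized large-sample figure, so treat it as an approximation for a best-10-of-20 average. The name `best_half_average` is made up for illustration.

```python
import math

# Mean of the lower half of a normal distribution lies
# sqrt(2/pi) (about 0.80) standard deviations below the overall mean.
BEST_HALF_SHIFT = math.sqrt(2 / math.pi)


def best_half_average(mean, sd):
    """Idealized average of the better (lower) half of normal scores."""
    return mean - BEST_HALF_SHIFT * sd


best = best_half_average(80, 8)
print(round(best, 1), round(best / 80, 3))  # best-half average and its ratio to 80

# Handicap = 0.96 * 0.92 * average, so average = handicap / 0.8832.
print(round(1 / (0.96 * 0.92), 3))
```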

As mentioned earlier, the differential that is actually used to compute your handicap is

$$D = (S - \text{CR}) \times \frac{113}{\text{SR}},$$

where S is the recorded score, CR is the course rating, and SR is the course’s slope rating. As an example, suppose that you shoot 85 on a course with course rating 70 and slope 130. To compute your differential for this round, start with 85 − 70 = 15, meaning that your score is 15 strokes above the course rating. The slope rating of 130 means that the course is more challenging than average. The expectation is that you would have had a better score playing on an easier course. The fraction 113/130 ≈ 0.87 estimates how much better the score would have been. Since 15 * 0.87 ≈ 13, the prediction is that you would have scored 83 on a course of average difficulty, 2 strokes better than your actual 85. (Perhaps you did not make the 220-yard carry on the third hole.) Your differential for the round is 13, and it counts as one of the 20 differentials that determine your handicap.

The process is reversed when you play a match. That is, if you bring a 13 handicap to a course with a slope rating of 130, your effective handicap for the round will be 13 * 130/113 ≈ 15. Playing a harder course, as defined by the slope rating of 130, you need more strokes to compete on an even basis with better golfers.
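The reverse conversion is a one-line calculation. In this sketch the function name `course_handicap` is an illustrative label, and the code assumes the effective handicap is the handicap multiplied by the slope rating over 113, the inverse of the scaling used for the differential.

```python
def course_handicap(handicap, slope_rating):
    """Scale a handicap for the slope of the course being played.

    Assumed form: handicap * SR / 113 (the inverse of the differential scaling).
    """
    return handicap * slope_rating / 113


print(round(course_handicap(13, 130)))  # harder-than-average course
print(round(course_handicap(13, 113)))  # average course: unchanged
```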

Variants

There are numerous proposals for alternative handicap systems, and there are numerous handicapping systems in use around the world. The USGA system is used only in the United States and Mexico. For instance, the Australian Men’s Handicapping System starts with an initial handicap determined by the results of the most recent 5 rounds, all of which are counted. This is done on the basis of differentials similar to those in the method described above, except that slope ratings are not used. From this starting value, adjustments are made to the handicap as more scores are recorded.⁸ Although the rules are somewhat complicated, essentially a score that is higher than predicted by the handicap is used to increase the handicap by a small amount. In the same way, a score that is lower than predicted by the handicap is used to decrease the handicap by a small amount.

The USGA system has undergone quite a few changes in the past 100 years. The 96% factor used in computing handicaps was set at 85% until 1976.⁹ The change to 96% was one of many compromises as the pooh-bahs of golf debated whether the handicap should reflect potential scores or average scores. The 96% figure splits the difference between the previous figure of 85% and that from a study commissioned by the USGA showing that it would take an adjustment of 107% to even out head-to-head matches.¹⁰

A commonly used handicap system is to self-report a handicap on the first tee and then adjust it at the turn if the match has become one-sided. The effectiveness of this system depends on what caused the imbalance in the first place. If one player had an unusually good nine holes, then the fairness of the adjustment depends on whether that person continues to play above average, returns to his or her normal game, or gets carried away by a temporary burst of competence and tries several risky plays that backfire.

The examples of Hardy golfers in the previous chapter show that more inconsistent golfers tend to have a higher scoring average but a greater likelihood of a very low score. This creates the apparent paradox of a player with a lower handicap having a higher scoring average. Recognizing the importance of variability in a golfer’s scores, Bingham and Swartz have proposed a handicap system that takes both average and variability into account.¹¹

The Back Tee: Variance

To see how variability affects results, consider two golfers of different ability, one averaging 80 and the other 90. Bingham and Swartz estimate that the scores of such golfers might have standard deviations of 3.27 and 3.8, respectively. Figure 5.2 shows graphs of the probability density functions (pdf’s) for these golfers’ scores.¹²

Figure 5.2 Graphs of probability density functions for scores of two golfers, one with mean 80 and standard deviation 3.27, the other with mean 90 and standard deviation 3.8

The bell-shaped curves assume that each player’s scores follow a normal distribution. There is evidence that this is a reasonable assumption.¹³ The numbers on the vertical axis are significant only in a relative sense. The higher the curve is for a given score, the larger the probability of the player achieving that score. The average scores of 80 and 90 locate the peaks of the curves. The standard deviation measures how “spread out” the curves are. Notice that the curve centered at 80 visibly separates from the axis only between 70 and 90, indicating that this golfer is very unlikely to score lower than 70 or higher than 90. The 90-average golfer has a little more variability, with likely scores extending down about 12 strokes to 78 and up 12 strokes to 102.¹⁴

Given the averages of 80 and 90, it might be that our two golfers have handicaps of 10 and 20. Subtracting these handicaps gives us the net scores of the two golfers (shown in figure 5.3). This graph shows more clearly that one distribution is more spread out than the other. The taller curve represents the low-handicap, consistent player who is not likely to stray very far from a net 70. The high-handicap player is less consistent and has a wider range of possible scores.

Figure 5.3 Probability density functions for net scores of the golfers in figure 5.2, assuming handicaps of 10 and 20

The scenario that Bingham and Swartz explore is the wonderful day on which both players are playing well. The big question is how we should precisely define two players of different ability as both playing “well.” This could mean 10 strokes better than average, but certainly an improvement from 80 to 70 is more meaningful than an improvement from 90 to 80.

Probability theory gives us a nice way to think about how to define “well.” Let’s say that playing well means having a score in the top 16% of all scores. In other words, only about 1 in 6 rounds is going to be this good. This hypothetical value is convenient because of the property of the normal distribution, whereby 68% of the scores are within one standard deviation of the average. For the low-handicap player, 68% of the scores are in the range 80 ± 3.27, placing the net scores between 66.73 and 73.27. For the high-handicap player, 68% of the scores are in the range 90 ± 3.8, placing the net scores between 66.2 and 73.8. For each player, about 16% of the scores will be less than the lower boundary. So, the 16% mark for the 10-handicapper is at 66.7, and the 16% mark for the 20-handicapper is at 66.2. The 20-handicapper is more likely to have a lower net score when playing well.
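The 16% marks can be reproduced with the normal distribution in Python's standard library. This sketch takes the averages, standard deviations, and handicaps of 10 and 20 from the text and assumes, as the text does, that net scores are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

# Net scores: gross average minus handicap, same standard deviation.
low = NormalDist(mu=80 - 10, sigma=3.27)   # 10-handicapper
high = NormalDist(mu=90 - 20, sigma=3.8)   # 20-handicapper

for label, dist in (("10-handicapper", low), ("20-handicapper", high)):
    mark = dist.mean - dist.stdev          # one standard deviation better than average
    share = dist.cdf(mark)                 # fraction of rounds at or below the mark
    print(label, round(mark, 2), round(share, 3))
```

The share printed for each player is the same, about 16%, while the mark itself is lower for the less consistent player.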

If the handicaps effectively equalize the scoring averages, then the less consistent player has a better chance of having the lower net score. The best 20% of net scores for the inconsistent player are better than the best 20% of scores for the consistent player. Of course, the exact same advantage flips to the consistent player if we look at the worst 20% of scores.

As we have seen, the USGA handicap is not designed to even out the scoring averages. A player averaging 90 might have an 18 handicap,¹⁵ while a player averaging 80 might have a 9 handicap. Adjusting figure 5.2 by these amounts gives figure 5.4.

Figure 5.4 Probability density functions for net scores of the golfers in figure 5.2, assuming handicaps of 9 and 18

The advantage has now shifted primarily to the low-handicapper. The average net score is lower, and only for scores lower than 64 is there even a slight advantage for the high-handicapper. However, if many such players are competing in a tournament, this slight advantage translates into a good chance that one of the high-handicappers will post the lowest score. Bingham and Swartz computed the minimum net score out of 20 scores for a number of golfers at a club in British Columbia. They found that, on average, an increase in handicap of 10 reduced the minimum net score by one stroke.¹⁶ This is strong evidence that, for players having their best round of the year, the USGA handicap system favors the high-handicapper.
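A rough numerical check of this shift is possible under the same normal model. The sketch below uses handicaps of 9 and 18 and compares each player's chance of posting a given net score or better; the field size of 50 and the assumption that players' rounds are independent are illustrative choices, not figures from Bingham and Swartz.

```python
from statistics import NormalDist

low = NormalDist(mu=80 - 9, sigma=3.27)    # net scores, 9-handicapper
high = NormalDist(mu=90 - 18, sigma=3.8)   # net scores, 18-handicapper

# Chance of a net score at or below each threshold.
for net in (63, 66, 69):
    print(net, round(low.cdf(net), 4), round(high.cdf(net), 4))

# Chance that at least one of n independent players nets 63 or better
# (n = 50 is a hypothetical field size).
n = 50
field_low = 1 - (1 - low.cdf(63)) ** n
field_high = 1 - (1 - high.cdf(63)) ** n
print(round(field_low, 2), round(field_high, 2))
```

The low-handicapper is ahead at the ordinary thresholds, and the order reverses only in the far-left tail, which is exactly the region that decides a large tournament.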

The problem identified here is that for a tournament with a large number of players a handicap system should equalize players of different abilities who are playing their best rounds of the year. To know what the best round might be, you need a graph like figure 5.2 showing the range of possible scores and their likelihood. Then, you need a way of equalizing different players.

Statistically, there is a simple fix for this problem. In any introductory statistics class, you learn about standard scores (or z-scores). For a normally distributed random variable X, the average and the standard deviation uniquely determine the distribution. The z-score subtracts the average A and divides out the standard deviation SD. Then the quantity

$$z = \frac{X - A}{\text{SD}}$$

is normally distributed with average 0 and standard deviation 1. A comparison of z-values from different normal random variables is “fair” in the sense that a z-value of −1 always represents a score at the 16% mark, regardless of the original distribution.
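A small sketch shows why standardizing makes the comparison fair. It reuses the net-score figures from the 10- and 20-handicap example (average 70 for both, standard deviations 3.27 and 3.8); the function name `z_score` is illustrative.

```python
def z_score(x, average, sd):
    """Standard score: how many standard deviations x lies from the average."""
    return (x - average) / sd


# Two different net scores, each one standard deviation better than average.
print(round(z_score(66.73, 70, 3.27), 2))  # consistent player
print(round(z_score(66.2, 70, 3.8), 2))    # less consistent player
```

Both rounds come out with the same standard score even though the raw net scores differ by half a stroke.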

In terms of golf handicaps, Bingham and Swartz equate X with the differentials computed above, A with the player’s (unknown) true average, and SD with the player’s (unknown) true standard deviation. They develop estimates for A and SD based on the player’s USGA handicap H, ending with a formula of the form

$$T = \frac{D - \hat{A}(H)}{\widehat{\text{SD}}(H)},$$

where D is the differential computed from S, CR, and SR, and $\hat{A}(H)$ and $\widehat{\text{SD}}(H)$ are the handicap-based estimates of the average and the standard deviation.

Each player would plug in the score S, handicap H, course rating CR, and slope rating SR. The smaller the T-value, the better. Simulations run by Bingham and Swartz indicate that this calculation produces a match that is fair in the sense that there is no bias toward high- or low-handicap golfers for either a head-to-head match or a tournament.

And some people think that the USGA handicap system is complicated! For casual rounds, most golfers will probably continue to determine handicaps through negotiations on the first tee, but beware of anyone who just happens to have a baseball bat, shovel, and rake in his car.
