Golf by the Numbers

SIX
The ShotLink Revolution
Golf Statistics

Some of these numbers acquire a kind of poetry to them.
—Tim Wiles, Baseball Hall of Fame

Televised golf has changed in numerous ways, as technology has given us spectacularly detailed slow-motion replays, swing analyses, and shot tracers that show the curve of a ball in flight. The level of precision provided is also revolutionary, if not as visually dramatic. Instead of an on-course announcer guessing that a putt is “at least 30 to 35 feet,” the announcer in the booth can authoritatively say that the putt is 33 feet, 5 inches. The source of this precision is ShotLink, and it is rapidly transforming the world of golf statistics.

I should hasten to emphasize that this is statistics with a lowercase “s.” My dad was a Ph.D. statistician and practiced Statistics with an uppercase “S.” Research statisticians employ sophisticated, highly mathematical techniques to build accurate models from carefully collected experimental data. In the early 1960s, my dad was also a statistician for the Dallas Cowboys. This meant that he carefully accounted for each play in a game and did the arithmetic to compute Don Meredith’s passing percentage. The golfing equivalent of such statistics is discussed in the next several chapters.

Baseball’s Poetry

Baseball fans have a special affinity for the statistics of their game. Certain numbers are so recognizable that the Baseball Hall of Fame will not correct players’ plaques when new information creates a change in the record books. The Hall of Fame does not want to disrupt the “poetry” of the museum, and indeed baseball stats could make a good Jeopardy topic, perhaps called “This Year in Baseball.” Try your luck with these (answers are in the notes):

The year of .406 and 56.

The year of 70, 66, and 56.

The year of 1.12, 31, , and .301.

The year of 61 and 54.¹

The popularity of baseball cards owes a large debt to the statistics on the back of each card,² stats that generations of fans learned and memorized. The main reason why baseball has such great number recognition is the discrete nature of the game. This is different from “discreet,” which baseball players have only rarely been but reporters used to be. In this setting, discrete means that each pitch, at bat, and so on is a separate event that can be listed and quantified. Compare this to European football or interior line play in American football, where the action is fluid and continuous and, therefore, difficult to quantify. Golf is also discrete, since a round consists of 70 or so separate strokes—for professionals, anyway. However, baseball plays always start with the ball in the pitcher’s hand and the batter at home plate, while a golf stroke can be taken from almost anywhere on the course and (sadly) can end almost anywhere on or off the course. The limitless options for ball location make golf much more difficult to describe numerically.

Baseball statistics were revolutionized by Bill James. The annual Bill James Baseball Abstracts, published in the 1980s, introduced a new way of looking at baseball. James found improvements to classic statistics like batting average and RBI (runs batted in), quantified the number of runs an individual player contributed offensively, and invented ways to measure a player’s defensive contribution to a team. Frustrated with his inability to obtain detailed baseball data, James organized Project Scoresheet, which collected data on every pitch and batted ball of each new season. This data opened the floodgates as thousands of fans could now work out their own special “sabermetrics”³ theories.

By contrast, the golf revolution came from above. There was no Bill James rattling the saber (so to speak) for better data. The world of golf statistics was completely revolutionized by ShotLink (see logo, figure 6.1), the PGA Tour’s system of lasers and volunteers who record data. ShotLink determines the starting and finishing locations of each shot to the inch. We can know that Davis Love III’s second shot on the first hole of the 2007 Mercedes-Benz Championship started in the left fairway 8,156 inches from the hole and finished in the right fairway 1,090 inches from the hole.

Before we dive into the data, there is an important disclaimer to be made. ShotLink is administered by the PGA Tour,⁴ which runs many of the professional tournaments in the United States but not all. The statistics that follow come from ShotLink only, so some tournaments are not included. The most important absences are the four men’s major golf championships⁵ and tournaments from the LPGA, European, Japanese, and other tours.

Those Good Old-time Statistics

If you track your own golf statistics, you probably record your number of putts, fairways hit, greens hit in regulation, and up-and-downs for each round. These basic statistics are relatively easy to mark on a scorecard. Similar statistics have been available for PGA golfers for years.

Greens hit in regulation is clearly a useful measure of solid ball-striking. In 2009, ShotLink recorded data for 307,836 holes. On 199,161 of these, the golfer hit the green in regulation (that is, after a standard number of strokes). This computes to a 64.7% rate for hitting greens, a percentage that has changed very little since 2004 (figure 6.2). Even with the magnified vertical scale of this graph (showing only 60 to 70), it is clear that the percentage has not deviated much from 64%. While the overall Tour average has stayed nearly constant, percentages for individual golfers can increase or decrease significantly from year to year. Table 6.1 shows the top five percentages of greens hit in regulation during 2004 through 2009. Some players repeat on the list, but others rank near the top one year and then drop back into the pack.

Figure 6.1 ShotLink, a revolutionary data collection system

Figure 6.2 Percentage of greens hit in regulation, 2004–2009

The consistency of a statistic will be a concern for us. When a statistic varies wildly from one year to the next, we question whether that statistic measures a skill level or is simply random noise.

Nuggets and Flakes

Bill James is said to have written, “Do we need to have 280 brands of breakfast cereal? No, probably not. But we have them for a reason—because some people like them. It’s the same with baseball statistics.”⁶ With ShotLink, we have thousands of statistics available: percentage of putts made from 12 feet, average approach distance for shots from the fairway 100–110 yards from the hole, and so on. Which of these statistics provide important information? One goal of this chapter is to separate the statistics with good nutritional value from the ones that are just empty calories.

Table 6.1 Top fives in greens in regulation, 2005–2009

The ultimate goal in a round of golf is to post the lowest score possible, so a criterion for a useful statistic is that it must relate to scoring. As a first attempt at evaluating the available statistics, I computed the correlation between several statistics and scoring. For this study, the data were organized by tournament, so that each data point is one player’s result in one tournament; the data set for a given year contains more than 6,000 of these player-tournaments. For example, the list of putt totals for the player-tournaments in a given year was correlated with the corresponding list of scores. In 2009, this correlation rounds to 0.238. I then created a list of the number of greens hit in regulation for each player-tournament. The correlation between greens hit in regulation and score in 2009 is −0.579. A scatter plot of the data for greens in regulation is shown in figure 6.3.

The correlation between two variables gives information about how the variables change from data point to data point. The sign (+ or −) of the correlation relates to whether the values of the variables increase or decrease. The negative sign for greens hit in regulation and score means that, in general, when the number of greens hit goes up, the score goes down. This certainly makes sense and is clearly visible in figure 6.3. The positive sign for putts and score means that, when the number of putts increases, the score increases. This is also what we would expect. With equal validity, we can say that the correlation of −0.579 means that, when the score goes up, the number of greens hit goes down. That is, correlation simply relates the two variables, in either order. Correlation does not imply causation. In this case, it makes sense that hitting more greens can cause your score to drop and that taking more putts clearly causes your score to increase. However, it is very important to realize that a high correlation in no way proves that changes in one variable cause changes in the other.⁷

Figure 6.3 Scatter plot of average tournament score versus greens hit in regulation, 2009

The numerical value of the correlation is also meaningful, although the meaning can be subtle. A correlation is always between −1 and 1. A correlation of 0 indicates that there is no measurable linear trend in how the quantities increase or decrease. Figure 6.4a shows a plot of data with the coordinates having a correlation of 0.09. An increase in one variable is equally accompanied by increases and decreases in the other variable. Correlating with score, a correlation close to 0 indicates a statistic that does not predict scoring well (using a linear equation). A high correlation (close to 1 or close to −1) indicates a statistic that could be used to accurately predict scores and, in this sense, is an important statistic. Figure 6.4b illustrates a correlation of 0.36. This is not very close to 1, but you can see a general trend for the points to be higher (increased y) as you move from left to right (increased x). You might imagine a line through the middle of the data points which could be used to predict scores.

To get a little more technical, correlation measures the extent to which there is a linear (straight line) relationship between the variables. The points in figure 6.4b come closer to forming a line than do the points in figure 6.4a. The square of the correlation gives the percentage of variation in the data that can be explained by the best-fit line. In figure 6.4b, there is a line (through the “middle” of the data) that explains 0.36² ≈ 0.13, or about 13% of the variation in the data. The best-fit line in figure 6.4a explains a mere 0.09² = .0081, or less than 1% of the variation in the data. The best-fit line in figure 6.3 explains 0.579² ≈ 0.335, or about 33.5% of the variation. You can see that, as you move to the right (more greens hit in regulation), there is a tendency for the average score to drop, but there is a fairly wide band of scores at each value of greens hit in regulation. That is, a significant amount of variation in scoring (about 66.5%) is not explained by greens hit in regulation.
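For readers who like to check such numbers themselves, here is a minimal Python sketch of the two ideas just described: computing a correlation from its definition and squaring it to get the fraction of variation explained. The function names and the short lists of greens and scores are hypothetical, invented for illustration; they are not ShotLink data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def explained_fraction(r):
    """Fraction of variation explained by the best-fit line (r squared)."""
    return r ** 2


# Hypothetical data: more greens hit tend to go with lower scores.
greens = [40, 44, 47, 50, 53]
scores = [288, 285, 284, 280, 279]
r = pearson(greens, scores)
print(f"correlation: {r:.3f}")  # negative, since scores fall as greens rise

# Correlations quoted in the text, squared.
for quoted in (0.09, 0.36, 0.579):
    print(quoted, round(explained_fraction(quoted), 4))
```

Because the correlation is squared, its sign drops out: a correlation of −0.579 explains the same share of the variation as one of +0.579.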

Figure 6.4 Low versus higher correlations: (a) low correlation (.09); (b) higher correlation (.36)

For our purposes, we will use the guideline that bigger correlations (ignoring the plus or minus sign) are better. I computed six-year (2004–2009) averages of correlations for a large number of statistics. Ranked by size of correlation, the top six statistics are given in table 6.2. Here, “scrambling” reflects only those holes on which the golfer misses the green and equals the percentage of times the player makes par or better.

The first three statistics are completely reasonable. To score well, you want to hit many greens and take few putts. If you miss the green, you want to get up-and-down to save par. Greens hit in regulation and putts per green hit in regulation (often called “putting average”) are among the few statistics that are readily available online. Week by week, they are the two statistics that best predict overall score.⁸ Notice that they are the only statistics with correlations above 0.5.

You might be surprised that the number of putts per green hit in regulation correlates to scoring better than the total number of putts does. The problem with the total number of putts is that it is a “combination” statistic that has multiple influences working at cross purposes. While it is never good to take a large number of putts, one way to minimize the number of putts taken is to miss every green and chip close. It is easy to improve your total putts statistic without improving your overall scoring.

Table 6.2 Correlations of statistics to scoring, 2004–2009

One type of statistic that is conspicuously missing from table 6.2 is a driving statistic. Modern technology allows players to adopt a “bomb and gouge” approach. Even if the drive misses the fairway, they have bombed it close enough to the green that they can gouge a wedge out of the rough and onto the green. Four common driving statistics have the following correlations (2004–2009) with scoring: driving distance (−0.193), fairways hit (−0.184), average distance (−0.151), and longest drive (−0.101).

Driving distance has a slightly higher correlation than percentage of fairways hit, but the correlation to scoring is not very high for either.⁹ There are actually two driving distance statistics. Distance is measured on all holes, but values are separated out for holes on which most players hit driver. Driving distance is for the special driving holes. The correlation to scoring for tee shot distance on all holes (average distance) is slightly less.

Driving distance, then, does not seem to have much to do with scoring well.¹⁰ The bottom line seems to be that, on the PGA Tour, driving is important only insofar as it increases or decreases your ability to hit the green in regulation. A drive that leaves you stuck behind a tree obviously affects your score adversely. However, on the larger scale of multiple rounds, the evidence does not support the theory that driving is a critical factor in scoring well.

The sixth best statistic in table 6.2 is “proximity of approach shots,” which is the average distance to the hole after an approach shot. This statistic would clearly influence each of the top four statistics, since the closer you are to the hole the more likely you are to be on the green and have a chance to make a birdie putt. I find it interesting that proximity of approach shots has a much smaller correlation to scoring than does greens hit in regulation.

The proximity of approach shots can be broken down by distance. Aggregated into 25-yard intervals, the distance range with the highest correlation to scoring is 150–175 yards (a correlation of 0.267 for 2004–2009), followed by 125–150 yards (0.244) and 175–200 yards (0.217). The distances shorter than 125 yards and longer than 200 yards all had lower correlations, as shown in figure 6.5. One confounding factor for the shorter distances is the occasional need to hit these shots after laying up from a bad drive. That is, a good drive followed by a poor shot from 60 yards could result in a par, the same as a great shot from 60 yards after a horrible drive and pitch out. The lower correlations do not necessarily mean that the skills are less important, only that a smaller average approach distance for a tournament does not always correspond to a lower score for the tournament.

“Scrambling” ranks third on the list of statistics in table 6.2. Scrambling is defined as the percentage of times the golfer made par or better on holes on which the golfer did not hit the green in regulation. The Tour average for 2004–2008 was 56.5%. Percentages of par saves from different categories are illustrated in figure 6.6. Notice that sand saves are about the same as saves from 20–30 yards (60–90 feet). Which of these scrambling categories is the most important? The category with the highest correlation to score is saving par from 10–20 yards, followed by sand saves. However, the correlations are all fairly small (see figure 6.7).

Figure 6.5 Correlations of approach distances to score

Figure 6.6 Percentages of par saves

Three of the top six statistics (ranked by correlation with score) are putting statistics. This is one of several indicators that putting is the most important skill in golf. Putting statistics are the focus of the next chapter, however. For now, I want to complete the study of correlations by reporting correlations for several putting statistics. Figure 6.8 shows correlations for six common putting statistics. To make comparisons more visual, the absolute values of the correlations are shown. Here is a conclusion worth repeating: Putts per green hit in regulation is the best basic putting statistic. Percentages of 1-putts, 3-putts, and putts per round have significantly lower correlations with scoring. Distance is the total length of all made putts. Approach is the average distance to the hole after first putts that did not go in. As you would expect, the correlations for 1-putts and distance are negative, while the other correlations are positive.

Figure 6.7 Correlations of par saves to score

Figure 6.8 Absolute value of correlations of putting statistics with score

Some golfers are solid on short putts but cannot make long putts. Others make more than their share of long putts but are shaky on short putts. What can be said about which distance is the most important? The answer is “not much” using correlations. Broken down using the PGA Tour’s distance ranges, the highest correlation is for putts from less than 10 feet. Direct comparisons are not entirely valid, since there are vastly different numbers of putts taken from the different distance ranges. Some of the dangers of small sample sizes are discussed next.

When Correlations Don’t Relate

The following discussion is intended as a caution against overusing correlations. The story begins, unfortunately, with an overuse of correlations. The first time I ran the correlations for figure 6.9, the values for the ranges 15′–20′, 20′–25′, and 25′+ were 0.111, 0.196, and 0.158, respectively. My first thought was that they were too large, and then I noticed that all three correlations were positive. In other words, the higher the percentage of long putts made, the higher the score!

Figure 6.9 Correlations of percent of putts made from different distance ranges to score

Some detective work was required to figure it out. Here’s what happened: I was using all tournament lines, whether the golfer made the cut or not. Some golfers who missed the cut happened to make the only putt they took from one of these distance ranges. If they had played more rounds or hit more greens in the rounds they played, they would have faced more putts from that distance and would have missed some. As it was, there were several 100% entries in the data, almost all associated with golfers who missed the cut (and, obviously, had high scores). There were enough such entries to shift the balance of the relationship to a positive correlation.
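The effect is easy to reproduce on a small scale. The following Python sketch uses a handful of made-up (score, long-putt make rate) pairs, not ShotLink data, to show how a few 1-for-1 entries from golfers with high scores can flip the sign of a correlation. All names and numbers here are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (average score, fraction of long putts made) pairs.
# Golfers who made the cut faced many long putts, so their rates are modest.
made_cut = [(68, 0.20), (69, 0.15), (70, 0.15), (71, 0.10), (72, 0.05)]
# Golfers who missed the cut and holed the only long putt they faced: 100%.
missed_cut = [(75, 1.0), (76, 1.0), (77, 1.0)]


def rate_score_correlation(rows):
    """Correlation between make rate and score for a list of (score, rate)."""
    scores = [score for score, _ in rows]
    rates = [rate for _, rate in rows]
    return pearson(rates, scores)


r_cut = rate_score_correlation(made_cut)
r_all = rate_score_correlation(made_cut + missed_cut)
print(f"made the cut only: {r_cut:+.3f}")  # negative
print(f"all golfers:       {r_all:+.3f}")  # positive
```

Three extreme entries out of eight are enough to reverse the apparent relationship in this toy data set, which is the same mechanism described above.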

The values shown in figure 6.9 were compiled from tournament data using only those golfers who made the cut, which may affect interpretations of the correlations. Certainly, the correlations cannot be used to support any theories about which aspects of the game are most associated with missing the cut.

Given the above discussion, you may well question the value of −0.08 in figure 6.9 for putts of length greater than 25 feet. The correlations for 10′–15′, 15′–20′, and 20′–25′ show a nice trend whereby the longer the putt, the smaller the correlation with scoring. The argument that this trend should continue is based on my assumption that holing a long putt is a rare event, most often accidental. You do not have to make long putts to get a good score, and you can have a bad score in a round in which you make a long putt.

By this logic, the correlation for distances greater than 25 feet should be extremely small. However, it is not. A possible explanation is related to the argument for excluding golfers who miss the cut. The number of putts attempted from long distance is small enough that a lucky made putt could produce an unnaturally large percentage and distort the correlation. Another possibility is that the correlation (which is not large) is giving us some causal information. If you make a 30-foot putt, you have lowered your score one stroke below any reasonable expectation of what you should make on that hole. Holing a 60-footer not only beats a 2-putt by one; it also beats a likely 3-putt by two. That is, there is a very real savings when you make a long putt.

The primary lesson here is to be wary of placing too much importance on a single correlation between two statistics, especially if the number of data points is small.

The Back Tee: Tailoring the Basic Statistics

One objective of the second half of this book is to rate the PGA golfers. The work in this chapter sets the stage for a logical rating system. The bad news is that it is not a great rating system. The good news is that we will be able to improve on it in the chapters to come.

The correlations computed above identify three statistics as being especially good predictors of scoring: greens hit in regulation, putts made per green in regulation, and scrambling. These statistics have the advantage of being readily available online. The three variables can be combined in a linear best-fit model (also called linear regression) to give a better predictor of score. If GIR equals the fraction of greens hit in regulation, PUTT equals the average number of putts per green hit in regulation, and SCR equals the fraction of par saves per missed green, then

S = 65.98 − 13.00 GIR + 10.29 PUTT − 7.94 SCR

gives a good predictor of score. In turn, S (predicted score) can also be used to rate golfers. Plug in a golfer’s stats, and the predicted score indicates the golfer’s level of performance. The lower the predicted score, the better the golfer is.

The derivation of the formula for S starts with a general equation for a linear combination of the variables GIR, PUTT, and SCR. That means we want an equation of the form S = a + b * GIR + c * PUTT + d * SCR for (unknown) numbers a, b, c, and d. Our task is to figure out the “best” values for the parameters a, b, c, and d. “Best” is in quotes because there are numerous ways to define precise criteria for optimality. The least squares criterion is commonly used.

To define the least squares criterion, start by imagining a golfer who hits 60% of the greens, averages 1.8 putts per green in regulation, and saves par 70% of the time. The predicted score for the golfer is S = a + .6b + 1.8c + .7d. If the golfer actually averages a score of 70, then the error in the prediction is |a + .6b + 1.8c + .7d − 70|, the difference between the predicted score and the actual score. Square the error and add up the squares of the errors for all of the golfers in the data set. The values of a, b, c, and d identified by the least squares criterion are the ones that make the sum of the squares of the errors as small as possible.

The description may sound complicated, but the mathematics for solving the least squares problem is surprisingly straightforward.¹¹ Spreadsheets and graphing calculators can do this easily. In the case of PGA tournament data for the 2004 through 2009 seasons, the values that minimize the total (squared) error turn out to be a = 65.98, b = −13.00, c = 10.29, and d = −7.94. The correlation between the predicted scores and the actual scores is slightly over 0.9, which is far higher than the correlation for any individual statistic.
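As a rough sketch of what a spreadsheet does behind the scenes, the Python code below solves a least squares problem through the so-called normal equations. That choice of method is an assumption made for illustration; other routes give the same answer. The seven golfers are hypothetical, and their scores are generated directly from the formula above with no random variation, so the fit should simply recover a, b, c, and d.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def least_squares(rows, targets):
    """Return [a, b, c, d] minimizing the sum of squared prediction errors."""
    # Each design row is [1, GIR, PUTT, SCR]; the 1 multiplies the constant a.
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (X^T X) coefficients = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical golfers as (GIR, PUTT, SCR); not real Tour data.
golfers = [
    (0.60, 1.80, 0.70),
    (0.68, 1.74, 0.68),
    (0.65, 1.78, 0.55),
    (0.70, 1.82, 0.60),
    (0.62, 1.76, 0.50),
    (0.66, 1.72, 0.62),
    (0.58, 1.84, 0.58),
]
# Noise-free scores generated from the chapter's formula.
scores = [65.98 - 13.00 * g + 10.29 * p - 7.94 * s for g, p, s in golfers]

a, b, c, d = least_squares(golfers, scores)
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}, d = {d:.2f}")
```

With real tournament data the predicted and actual scores would not match exactly, and the fitted values would be the ones that make the total squared error as small as possible.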

So, what is wrong with this system? Both its strength and its weakness derive from the fact that it uses simple statistics. There are better statistics, which we will develop in later chapters. For now, let’s look at how the system rates the PGA Tour golfers in 2009, using statistics from the regular season. For example, to get Tiger’s predicted score, you plug in GIR = 0.6849, PUTT = 1.738, and SCR = 0.6809 and get 65.98 − 13(.6849) + 10.29(1.738) − 7.94(.6809) = 69.554, which rounds to 69.6. The top five ratings for the 2009 season are shown in table 6.3.
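If you would rather not do the arithmetic by hand, the formula fits in a few lines of Python. This is only a convenience sketch; the function name predicted_score is my own label for illustration, and the coefficients and Tiger's 2009 inputs are the ones given in the text.

```python
def predicted_score(gir, putt, scr):
    """Predicted score from the chapter's regression formula.

    gir:  fraction of greens hit in regulation (for example, 0.6849)
    putt: average number of putts per green hit in regulation
    scr:  fraction of par saves per missed green
    """
    return 65.98 - 13.00 * gir + 10.29 * putt - 7.94 * scr


# Tiger Woods, 2009 regular season, using the values quoted in the text.
tiger_2009 = predicted_score(gir=0.6849, putt=1.738, scr=0.6809)
print(round(tiger_2009, 1))  # 69.6
```

Plugging in any other golfer's three statistics gives that golfer's rating, with lower values indicating better performance.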

Table 6.3 Top five ratings from regression, 2009

There is not much arguing to be done with the top two; Tiger Woods and Steve Stricker finished one-two in the regular season FedEx Cup standings (a point system that the PGA Tour uses to rank Tour golfers). Zach Johnson also ranked in the top five in FedEx Cup points. However, Kevin Na was 18th and David Toms 19th in FedEx Cup points, and no experts ranked them in the top ten, much less the top five. The ratings in table 6.3 are therefore suspect.

Subjectively, then, this rating system is not bad, but neither is it great. It is not going to replace any of the official rating systems. The advantage of this system is that it uses readily available statistics. You can go online, grab some numbers, and plug them into the system at any point in the season. For this reason, it might be a useful system for choosing fantasy golfers.