3

Tournament Rules

In athletic-related data analysis, there are two more or less overarching narratives. The first concerns the strength of relationship, if any, between particular measurable skills and performance. The first two chapters answered the basics of that question. The second concerns the ability to compare golfers across generations. This is by far the more complex question, given the constantly evolving circumstances within which the game is played. But it is also by far the more intriguing, since a fair match between the great players of yesterday and today can occur only on paper.

If one hopes to make sense of the relative abilities of golfers who played in different times and conditions, one first needs some commonly understood and applied guidelines. One can think of these as the rules of an “all-time tournament.” The guidelines will be designed to answer three broad questions: Who will be measured, what will be measured, and how will the measurements be done? The first two questions can offer leeway; the third cannot.

Who’s in the Field, and Which Events Should Be Measured?

The essays that follow are based on examinations of the career records of about two hundred men and women from every era. Most of those whose stories are told herein would most likely appear in any consensus ranking of the greatest players of all time, although a few who do not attain that status are included because their stories are worth telling.

Today’s touring pros play between twenty and twenty-five events in a season. That’s a generalization, of course. If you’re Jason Day, you can make enough money for groceries and the mortgage in as few as fifteen events. Then there are workaholics who play forty-some tournaments a season because the tour schedules only that many. In confronting the question of which tournaments to rate, the variable quality of the fields those events draw is one excellent reason to exercise discretion. After all, should one view the Booz Allen Classic on a par with the Masters?

For cross-era comparison, far and away the most logical approach is to focus on performance in the major events. In addition to field quality—the best players almost always show up—there are several other good reasons for doing so.

  1. They’re familiar. Every 20-handicapper can name the men’s majors: the Masters, U.S. Open, British Open, and PGA Championship.
  2. The men’s majors are stable and the women’s events reasonably so. Whatever did happen to the Booz Allen, anyway? The Masters, the youngest of the men’s majors, has been around for more than eight decades. Like the majors, the PGA Tour’s FedEx Cup series also draws the best players . . . but that event has been played since only 2007. As a tool for ranking all but the current generation of players, it is useless.
  3. They present a consistent and manageable amount of data.

Sadly for symmetry, the history of major women’s tournament golf is less ordered than the history of major men’s golf. To its credit, the LPGA long ago recognized the problem its occasionally checkered history had created. That checkered history began in 1930 when the Western Golf Association—which had been conducting a widely recognized tournament for men since the nineteenth century—anticipated gender equity and created the Women’s Western Open. It was won by somebody named Mrs. Lee Mida, and that is virtually everything on record about Mrs. Lee Mida. Initially, the Women’s Western did not draw many big names, probably because there weren’t any. June Beebe won in 1931, and when she did it again in 1933 she became the first repeat champion. Opal Hill won back-to-back tournaments in 1935–36. The first winner there’s actually an outside chance you’ve heard of was Betty Hicks, a longtime pro in the seminal days of the women’s tour who took the 1937 championship and remained active for fifteen years thereafter.

Also in 1937 a group of women golf enthusiasts organized an event they called the “Titleholders Classic.” There weren’t many titles to hold if you were a woman in the late 1930s, but that didn’t stop the game’s best from journeying to Augusta, Georgia, each spring to compete at the most prestigious women’s competition of its like at the time. It was held at the Augusta Country Club, where in addition to the golf the women could look at, but not touch, the all-male Augusta National situated directly through the azaleas. Patty Berg, then a nineteen-year-old phenom, won the Titleholders in 1937 and liked it so much she won again in 1938. She also took the third event in 1939, before Betty Hicks diversified the list of champions in 1940.

At the time, there was nothing “major” about those two events, either in prize money (which was negligible enough to go unreported) or in public attention. Occasionally, such as when Babe Zaharias won the Western in 1940 or when Louise Suggs took the Titleholders in 1946, the event would merit a few paragraphs in the New York Times, but that was about it. It was, after all, women’s golf. The whole concept was so little a deal that the notion of a national championship for women didn’t even coalesce until after World War II, when a loose entity of female professional golfers, styling themselves as the Women’s PGA, pitched the notion to the Spokane Athletic Round Table, which offered facilities at the local country club for a match-play event. Berg won, defeating Betty Jameson 5 and 4 to claim the first prize: $5,600 in bonds. The entire purse was $19,000.

In 1947 a second Open, this one at medal play, was scheduled for Greensboro, North Carolina. But there was so little public interest in competitive women’s golf that the purse won by Betty Jameson was 60 percent smaller. (It would be two decades before the Open purse again exceeded $19,000.)

In 1949 the Women’s PGA collapsed, and a new entity, the Ladies Professional Golf Association, took charge of both the Open and tour play. The arrival of this entity, driven largely by Zaharias and Berg, marks the functional beginning of an ongoing oversight body. But the LPGA’s interests lay as much in promoting the entire women’s tour as in promoting the Open itself. So in 1953 it surrendered management of the Women’s Open to the United States Golf Association, which had been running the men’s U.S. Open since its creation in 1895. Two years later, the LPGA created its own branded event, the LPGA Championship (now known as the Women’s PGA).

By the late 1950s, with Arnold Palmer’s ascendancy driving interest in men’s golf, the notion of four “men’s majors” had locked into the public psyche. But that raised a question among those interested in the women’s game: Were there also women’s “majors”? To answer the question, the LPGA essentially decreed it so. The Open and LPGA, so closely paralleling men’s majors, were obvious choices. The Titleholders, with a prestigious location that suggested status as the women’s Masters and a prestigious list of winners dating to the 1930s, was the third. Since there was no obvious parallel to the British Open, the LPGA went to its oldest, retroactively designating the Women’s Western Open as its fourth major.

That worked well enough until the late 1960s, when two of those majors hit the rocks. The Titleholders folded in 1967 (not counting a one-year comeback effort in 1972), the Women’s Western in 1968. That left the women’s tour with just three majors in 1967 and just two in 1968.

What it needed, the brains finally decided, was something with an international cachet that would counter the presence of the British Open in the men’s schedule. The women’s tour had just one regular international event at the time, the Peter Jackson Classic, which had been staged in Canada since 1973. Presto, in 1979 the Peter Jackson Classic became a major. Unlike the LPGA’s handling of the Titleholders and Women’s Western, though, previous winners were not elevated to major status retroactively. In 1983 du Maurier bought naming rights to the event, injecting cash into the pot that made it a “major” in that sense as well.

Also in 1983, the LPGA elevated the tournament previously known as the Colgate Dinah Shore—and rebranded over the previous winter by Nabisco as the Nabisco Dinah Shore—to major status. Shore, a golf enthusiast as well as a nationally known entertainer, had created the event in 1972. In all ways other than label, the Dinah Shore was widely considered a major well prior to 1983. It had star status, TV attention, and corporate sponsorship, factors that allowed Shore to offer a purse that was the envy of other tournaments. The inaugural event offered champion Jane Blalock a $20,500 first prize, nearly double the previous richest prize in golf history. The Open champion that year got $6,000.

Thus did the LPGA rota of majors enter a seventeen-year window of stability. But in the mid-1990s the European tour had begun pouring resources, including good sponsorship money, into the Women’s British Open. By 2001 the lure of an actual women’s major played in Britain was too great to bypass, and the British Open bumped the du Maurier. Finally, in 2013 the LPGA decreed that the Evian Masters, a tournament played since 1994 in Évian-les-Bains, France, would be considered a fifth major.

Over time, then, eight different events have been recognized as women’s majors for at least part of their experience, although just two—the U.S. Open and the LPGA—have constituted a reliable core. Two—the Titleholders and Women’s Western—were majors for a time without knowing it. Two others, the Peter Jackson/du Maurier and the Shore/Nabisco/Kraft/ANA, weren’t, then were (and, in the case of the du Maurier, then wasn’t again). The last two, the British Open and the Evian, weren’t, then were.

How do you handle all this coming and going? You declare that if it’s good enough for the LPGA, it’s good enough for us. You will find a year-by-year breakdown of the women’s majors in the appendix.

One also needs to consider the odd cases that are the amateur championships. In the modern game, the U.S. Amateur is essentially a college tournament. But there was a time—think of the Bobby Jones era—when the Amateur was a huge-enough deal, attracting major media coverage, that its results should not be ignored. The same is true of its antecedent, the British Amateur, and of the U.S. Women’s Amateur. At the same time, even in their most significant days, those events were weak by comparison with concurrent professional events. When Jones famously won the U.S. Amateur as part of his 1930 Grand Slam, there were only three other players in the thirty-two-person field who either had or eventually would make a national mark on golf. Those three were 1913 U.S. Open champion Francis Ouimet and future U.S. Open champions Johnny Goodman and Lawson Little. Ouimet was on the far fringes of his prime, Goodman and Little had not yet reached theirs, and as it happened Jones never faced any of them anyway. Instead, he defeated a fellow named Eugene Homans of Englewood, New Jersey, 8 and 7 in the finals. At that year’s British Amateur, Jones won seven matches, none of them against an opponent even the most rabid golf researcher is likely to have heard of. For those reasons, included here are results from the various men’s and women’s Amateur tournaments with two conditions attached: Only players who were career amateurs, or who competed prior to World War II, the Amateur’s precollege period, are considered. And because amateur-only fields were generally weaker than professional or mixed fields, results from amateur-only events are devalued by 50 percent.

Who Really Won at Oakmont?

In rating the competitors, one thing should be pretty clear: one can’t just use scores from tournament leaderboards. Way too many variables would have to be overlooked to make such raw comparisons valid.

Consider that the U.S. Open has been played at the Oakmont Country Club outside Pittsburgh on nine occasions spanning nine decades. In 1927 Tommy Armour won with a four-round score of 301. Eight years later, the trophy went to Sam Parks Jr. at 299. In 1953 Ben Hogan won by shooting 283, a score matched by Jack Nicklaus in 1962. In 1973 Johnny Miller won at 279, the same total Ernie Els posted in 1994. In between, in 1983, Larry Nelson won with 280. In 2007 Angel Cabrera won by shooting 285. In 2016 Dustin Johnson won with a score of 276. Which of the nine was the superior accomplishment?

Certainly, 276 sounds better than 301. But Armour and Parks used more primitive clubs than Hogan and Nicklaus, who in turn used less advanced clubs than Nelson or Els, whose equipment Johnson probably wouldn’t touch. The rubber-core ball used by Armour and Parks was out of fashion by Nicklaus’s day and would be a museum piece today. Then there was the course itself. Els played a longer Oakmont than Armour, but Armour and Parks had to contend with those famously furrowed Oakmont bunkers. The course was basically bare and windswept until the mid-1960s, when members undertook a tree-planting campaign. By the time Nelson won in 1983, parts of Oakmont looked like a forest course. By Els’s 1994 victory, many of the trees had been removed; by Cabrera’s 2007 win, they were all gone.

Same tournament, same course, but nine different setups, nine different sets of equipment, and nine vastly different outcomes. And one must not forget weather or agronomic advances. The task of any serious intergenerational rating system is to ascertain a fair basis on which to normalize all those variables.

There’s actually a surprisingly comprehensive mathematical tool that can do what needs to be done. It’s called standard deviation, and it’s designed to ascertain how unusual a performance is compared to related performances. For a full explanation of standard deviation, see the appendix. For this book’s immediate purpose, it’s sufficient to understand that standard deviation automatically adjusts for all the variables that would otherwise confound us. The following paragraph, which is critical to everything that follows, explains why.

Although the equipment used by Tommy Armour to win the 1927 tournament was obviously inferior by today’s standards, it was not inferior to the equipment used by his fellow competitors; in fact, like Armour, most were probably equipped with the state of the art. The weather, the course, and the ball would have changed markedly from year to year, but probably not very much each day. Because it measures Armour’s performance strictly in relation to his peers who were competing under the same general conditions on the same course at the same time, standard deviation minimizes all the cross-era variables and reduces the generational comparison to a question of relative skill. It focuses on this pivotal question: In the conditions in force at that moment, how much better than their competitors were Armour, Els, Nicklaus, Hogan, Johnson, and all the rest?

If both the average and the standard deviation of a normally distributed set of data are known, it’s a simple matter to calculate the number of standard deviations any individual bit of data is from the average. Mathematicians have an exotic term for that measurement of exceptionality, whose most familiar application today is probably in standardized testing. It’s called a Z score, and in simplest terms it’s an expression of the number of standard deviations an event lies outside the average. When the U.S. Open was played at Oakmont in 2016, Johnson’s four-round total of 276 was three strokes better than his closest competitors and about 12 strokes below the field average for players completing four rounds. But more meaningfully for this book’s purposes, it was 2.26 standard deviations better than that 288.30 field average. (One standard deviation that week amounted to 5.44 strokes.) Johnson, then, had a Z score of –2.26. Jason Day tied for eighth with a Z score of –1.16. Jordan Spieth tied for 37th with a Z score of +0.13. Spieth shot 289, marginally above the field average for players completing four rounds.
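
The arithmetic behind those Oakmont figures can be sketched in a few lines of Python. This is only an illustration, not the book's own worksheet: the function names are invented for the example, and the use of the population standard deviation (rather than the sample version) is an assumption, since the text does not say which was applied.

```python
from statistics import mean, pstdev


def z_score(score, field_mean, field_sd):
    """Standard deviations between a score and the field average.

    Negative is better in golf, because the lowest total wins.
    """
    return (score - field_mean) / field_sd


def field_z_scores(totals):
    """Z score for every four-round total in a field.

    Assumption: population standard deviation (pstdev); the text
    does not specify population versus sample.
    """
    mu = mean(totals)
    sd = pstdev(totals)
    return [z_score(t, mu, sd) for t in totals]


# Figures quoted for the 2016 U.S. Open at Oakmont:
# field average 288.30, one standard deviation 5.44 strokes.
johnson = z_score(276, 288.30, 5.44)
spieth = z_score(289, 288.30, 5.44)
print(round(johnson, 2), round(spieth, 2))
```

Run against the quoted average and standard deviation, the two results round to the –2.26 and +0.13 given for Johnson and Spieth.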

Who actually had the best of it in the nine U.S. Opens played at Oakmont? The answer is Ben Hogan in 1953. When he shot 283, it was 19 shots below the field average of 302. Given a 6.38 standard deviation of the field performance that week, Hogan’s 283 translates to a Z score of –2.98. Larry Nelson’s 1983 total of 280—which produced a Z score of –2.69—was second best. Third best? That belonged not to any of the nine winners but to the man who finished runner-up to Nelson in 1983. Tom Watson shot 281, producing a –2.54 Z score that would have won any 1983 major except the one he happened to be playing in at the time. In fact, a month later Watson won the British Open with a –2.36 Z score.

Because standard deviation allows us to normalize all of the cross-time variables that would otherwise confound this process, it becomes the basis for the player rankings.

Peaks and Careers

Inevitable in the devising of any sort of rating system is the question of what one is trying to rate. Bill James laid this issue out so insightfully in his popular Historical Baseball Abstract that there is no need to do more than quote him. There will need to be one major adjustment to James’s approach, which follows Bill’s explanation of the basics: “When you ask who was a greater player than whom, do you want to know which was more valuable at some moment in his career, or do you want to know which was more valuable over the course of his career? . . . There is no standard or consensus answer to the question; some people mean one thing, some mean the other. The answer that it’s some of one and some of the other won’t do . . . because if you don’t decide, you’ve got two correct answers to every question.”

In baseball analysis, James frames the question by imagining a player’s career path as a line graph. Those more interested in James’s first posit—value at a particular point in a career—focus on the highest points of the line: what can be easily understood as a player’s “peak value.” Those interested in the second topic—essentially the player’s “career value”—are effectively measuring the area below the line.

As James notes, you can do both, as long as you recognize that you are providing two separate and distinct answers. But in analyzing the performance of golfers, it seems to me that “peak value” is somewhat the truer, sexier number. Why? Keep in mind that unlike a peak Z score—which is an average of a player’s best performances—career Z scores are cumulative. That is, you calculate a player’s career Z score by the simple process of adding up all of his or her performances in the majors. That process begins when the player turns pro (except, obviously, in the case of career amateurs), and it ends only when the player retires or turns fifty.

When one applies statistical models that develop concepts such as relative winners and losers—as this book will—a player whose performance declines with age “gives back” career achievement he has previously banked. His career value, in other words, tends to retreat toward (and in some cases beyond) zero. In team sports, the issue of declining performance takes care of itself because at a certain point, the declining athlete loses his place on the team. Golf, however, is not a team sport; players can and often do continue to play in championship-tour events well into their forties and even in a few cases into their fifties. Not well, usually, but they play.

This means career-value calculations can confer an advantage on players who retire before their skills begin to recede. In the 1930s there was a fellow named Ralph Guldahl—good player, won a couple of U.S. Opens back to back. His record will be studied in greater detail in a couple of chapters. No one would assert he was as good a player as Tom Watson, who won eight majors and contested Jack Nicklaus for domination of the game between 1975 and 1985. But for a few years in the 1930s, Guldahl was considered the equal of Byron Nelson, which is saying something.

If—applying the methods to be outlined later in this chapter—you assess Guldahl and Watson strictly in terms of their career value, Guldahl rates higher. Given that everyone agrees Watson was the better player, how can this be? It can be because when Guldahl tired of competitive golf at age thirty-nine in 1950, he quit. His career Z score for the majors at that point was –31.24. When he was thirty-nine at the conclusion of the 1988 season, Watson’s career Z score was –48.14, substantially better than Guldahl’s. But Watson did not retire at age thirty-nine; he continued to play seriously for eleven years until qualifying for the senior tour. Along the way, he suffered recurring putting problems, his long game shortened, new challengers arose, and his scores climbed. Watson played forty-three majors after his thirty-ninth birthday, and his Z score in those forty-three was +31.66, eroding his career Z score to –16.48.
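
Because a career Z score is a running sum, the Guldahl and Watson comparison reduces to simple addition. The sketch below uses only the totals quoted above; the function name and the four-event sample list are hypothetical, included to show how later, weaker results give back ground gained earlier.

```python
from itertools import accumulate


def running_career(z_scores):
    """Cumulative career Z score after each major, in playing order."""
    return list(accumulate(z_scores))


# Totals quoted in the text.
guldahl_career = -31.24
watson_through_39 = -48.14
watson_after_39 = 31.66
watson_career = round(watson_through_39 + watson_after_39, 2)

# Lower (more negative) is better, so Guldahl's career total outranks Watson's.
print(watson_career, guldahl_career < watson_career)

# Hypothetical four-event career: two strong showings, then two weak ones.
sample = running_career([-1.5, -2.0, 0.5, 1.0])
print(sample)
```

In the hypothetical sample, the running total bottoms out after the second event and then retreats toward zero, which is the pattern described for Watson after age thirty-nine.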

Determining both a player’s “peak” and “career” begins with a couple of definitional adjustments.

As noted, golf is a game of lows, not highs, so when one expresses interest in a player’s peak performance, graphically one is actually talking about who fashioned the deepest, widest trough, not the highest peak. Next question: How long should a peak be? Among the players constituting this field of study whose careers are complete or nearly so, the average career length was about 23.6 seasons. A half dozen played competitively for at least three decades. And of those with shorter careers—Willie Anderson and Tony Lema come to mind—the reason was generally an early death. In this study, a player’s peak is defined as his or her period of five best consecutive seasons; that’s about 22 percent of the average career for all the players under consideration. An exception is made for women players in the “five-major” era since 2013 to negate the statistical inequity created by selecting their ten best scores from among twenty-five. In the cases of post-2012 women, the standard will be four seasons, again encompassing twenty majors. There is a judgmental aspect to all of this. One could as easily make the case for an assessment period covering four consecutive years or six, fifteen scores, twenty, or more.

Match Play and Data Sufficiency

If one wants to evaluate historically important amateurs—and what’s the point of an all-time evaluation that doesn’t include Bobby Jones?—then one has to devise a method of handling match play, the dominant amateur form of competition. The problem is that in match play, there is no final “score” . . . at least not in the sense one is used to thinking about it. In match play, the length of a match varies, and a player’s score is measured relative only to his or her opponent’s rather than to par. That’s consequential. In two matches, one player might “shoot” a round of 69 yet lose 3 and 2, while another might “shoot” a round of 75 yet win 3 and 2. Same final result, yet far different performance levels. In fact, neither competitor would have actually “shot” the projected number, since neither completed the requisite eighteen holes.

If one wants to incorporate match play, one has to accept the fact that one can only estimate, not precisely determine, Z scores.

Despite those problems, the lure of developing some means of at least estimating standard deviations for match play remains compelling because of the substantial difference it makes to the ratings of many of the great players. Between 1916 and 1957, the PGA Championship was contested at match play. Those were also the years when many of the great career amateurs performed. The U.S. and British Amateurs were, for part of that time, viewed as “majors” on a par with the Opens, and the U.S. Women’s Amateur was until 1930 the only meaningful tournament competition for American women. Ratings of players of the stripe of Bobby Jones, Walter Hagen, and Gene Sarazen are all materially affected by the inclusion or exclusion of match play. For women, the issue turns on the Amateur, on the Women’s Western Open from its inception in 1930 until 1954, and on the 1946 U.S. Open. The two women whose ratings are most affected by the handling of match play are Patty Berg and Babe Zaharias.

The method employed to convert match-play results into a “score” involves looking at every possible match-play outcome and converting it to a stipulated stroke-play equivalent. For instance, in matches ending 1-up (that is, one player winning by one hole with none to play), a score of 71 is assigned to the winner and 72 to the loser. That step is repeated for each possible outcome, the designated stroke margins increasing with the decisiveness of the match-play result. These results are not necessarily representative of the actual stroke-play score at the time of the match, but the scores themselves are not important. In calculating standard deviation, what is important is the margin of the victory. Suppose three matches are each decided 1-up, with one pair of players shooting 65 and 66, a second pair 71 and 72, and a third pair 77 and 78. In every case the critical element is that the winner was (approximately) 1 stroke superior to the loser.

Here are the assigned scores used:

Win    Result         Lose
72     extra holes    72
71     1-up           72
71     2-up           72
70     2 and 1        73
70     3 and 1        73
69     3 and 2        73
69     4 and 2        74
69     4 and 3        74
68     5 and 3        74
68     5 and 4        75
68     6 and 4        75
67     6 and 5        75
67     7 and 5        76
67     7 and 6        76
66     8 and 6        76
66     8 and 7        77
65     9 and 7        77
65     9 and 8        78
64     10 and 8       78
63     10 and 9       79
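
For anyone who wants to apply the conversion mechanically, the assigned scores listed above fit naturally into a lookup table. The sketch below is one possible encoding; the dictionary and function names are invented for the illustration, and only the score pairs themselves come from the list.

```python
# Assigned (winner, loser) stroke-play equivalents, keyed by match result.
MATCH_PLAY_SCORES = {
    "extra holes": (72, 72),
    "1-up": (71, 72),
    "2-up": (71, 72),
    "2 and 1": (70, 73),
    "3 and 1": (70, 73),
    "3 and 2": (69, 73),
    "4 and 2": (69, 74),
    "4 and 3": (69, 74),
    "5 and 3": (68, 74),
    "5 and 4": (68, 75),
    "6 and 4": (68, 75),
    "6 and 5": (67, 75),
    "7 and 5": (67, 76),
    "7 and 6": (67, 76),
    "8 and 6": (66, 76),
    "8 and 7": (66, 77),
    "9 and 7": (65, 77),
    "9 and 8": (65, 78),
    "10 and 8": (64, 78),
    "10 and 9": (63, 79),
}


def assigned_scores(result):
    """Return the (winner, loser) scores assigned to a match result."""
    return MATCH_PLAY_SCORES[result]


def stroke_margin(result):
    """Stroke margin implied by a match result (loser minus winner)."""
    winner, loser = assigned_scores(result)
    return loser - winner


print(assigned_scores("1-up"), stroke_margin("5 and 4"))
```

Read in the order listed, the implied margins never shrink as the match result becomes more decisive, which is the property the method depends on.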

The second problem involves data sufficiency. Since 1958 on the men’s tour, and for much of the women’s tour, there have been four recognized major events played annually. That means in any five-year peak period, a player’s record (unless he or she missed events) would comprise twenty major performances. But as previously noted, for much of the 1970s the LPGA got along with just three majors, and for a time there were just two. Today there are five. Go back far enough, and the same problem surfaces on the men’s tour. True, as of the institution of the Masters in 1934 there were four “majors,” but the British Open was separated by an ocean.

That means that prior to 1934, even a star-quality player could have competed in no more than fifteen presently recognized majors within a five-year period and then only if they made an ocean crossing. That was asking too much for many of the game’s stars. True, Walter Hagen made it to ten British Opens over eighteen seasons and Gene Sarazen to eight (not counting his famous return to Troon as a septuagenarian in 1973). But Byron Nelson played in only two, and Bobby Jones, for all his emotional connection to St. Andrews and Scotland, completed just three British Opens his whole life. (Jones actually teed it up in four but withdrew during the 1921 event.) Willie Anderson, the greatest U.S.-based player of the first decade of the twentieth century, emigrated from Scotland but never returned to play there. The east-to-west crossings were even more rare; Harry Vardon played in just three U.S. Opens over two decades, J. H. Taylor in just two, and Ted Ray in three. Henry Cotton, the greatest British player between Vardon and Faldo, played in only four U.S.-based majors, none until he was in his forties. James Braid never made it to an American major.

If one uses five-year windows and considers only the recognized modern majors, the records of many of the men’s game’s greats prior to the mid-1930s become distressingly inadequate. But one can supplement this record. Into the second half of the century, the Western Open was widely viewed as major quality. (In fact, during the 1960s when the tour’s World Series of Golf consisted only of the winners of the four major tournaments, the Western Open winner fleshed out the foursome if one man had won two of the others.) So prior to 1958, when the Atlantic Ocean ceased being a severe impediment to American participation in the British Open, the Western Open makes a perfectly valid supplement to the database.

Life also gets in the way of the assessment. Players of an earlier age simply competed in fewer events than pros do today. Travel, records availability, scheduling, and lack of financial incentive created too many obstacles. The following table shows the average number of major tournaments played in by more than two hundred of the game’s stars, grouped in twenty-year increments by the season in which they began their full careers. It covers the opportunities they had both during their peaks and across their careers. The difference in major opportunities is clear.

Career began    Peak opportunities    Career opportunities
Pre-1900        9.62                  19.23
1901–20         10.27                 22.91
1921–40         14.15                 45.67
1941–60         15.84                 51.64
1961–80         15.67                 61.87
1981–2000       18.19                 70.61

There is an obvious inequity in attempting to compare Tiger Woods’s performance over eighteen majors (for a peak rating) or seventy (for a career) to Harry Vardon’s performance over ten or nineteen. Golf has its own way of dealing with this; it’s called the handicapping system, and those of you with an official handicap already know how it works. But for the others, here’s a brief course. Take your most recent twenty scores, throw out the ten worst, and average the ten best. With a couple of minor adjustments, the difference between your average and par is your handicap. With an exception noted below, the rule will be that only a player’s ten best scores among the twenty within his or her peak window are averaged to produce a peak rating.
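
The ten-best-of-twenty rule, combined with the five-consecutive-season window, can be sketched as follows. Treat this as an illustration under stated assumptions rather than the book's actual procedure: the function name and sample data are hypothetical, and choosing the window by whichever produces the lowest ten-best average is an assumption, since the text does not say how competing windows are compared.

```python
def peak_rating(z_by_season, window=5, best_n=10):
    """Return (rating, first_season) for the best run of consecutive seasons.

    z_by_season maps a season (int year) to that season's major Z scores.
    The rating is the average of the best_n lowest Z scores in the window.
    Windows holding fewer than best_n scores are skipped in this sketch.
    """
    seasons = sorted(z_by_season)
    best = None
    for start in range(seasons[0], seasons[-1] - window + 2):
        scores = []
        for year in range(start, start + window):
            scores.extend(z_by_season.get(year, []))
        if len(scores) < best_n:
            continue
        rating = sum(sorted(scores)[:best_n]) / best_n
        if best is None or rating < best[0]:
            best = (rating, start)
    return best


# Hypothetical six-season record, four majors a season.
example = {
    2000: [0.5, 0.5, 0.5, 0.5],
    2001: [-1.0, -1.0, 0.0, 0.0],
    2002: [-2.0, -2.0, 0.0, 0.0],
    2003: [-1.0, -1.0, 0.0, 0.0],
    2004: [-2.0, -2.0, 0.0, 0.0],
    2005: [-1.0, -1.0, 1.0, 1.0],
}
result = peak_rating(example)
print(result)
```

In the hypothetical record, the window beginning in 2001 beats the one beginning in 2000 because its ten best scores average lower, even though it also contains two poor showings that the rule throws out.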

Special Exemptions

A few of the game’s early greats—notably the nineteenth-century British pros whose only available major was the Open championship—never competed in as many as ten recognized majors during a five-year period. Under a rigid interpretation of the tournament rules used here, these players could not be included because of the impossibility of developing a peak performance rating.

For such players, the five-year window will be extended until it takes in the requisite ten stroke-play majors. Young Tom Morris and John McDermott, whose entire careers consisted of fewer than ten major tournaments, will be graded based on the number of tournaments they did play.

Outliers, Gaps, and Other Adjustments

From time to time throughout the history of major tournament golf, there have been participants who, to be blunt, should not have been allowed in the championship without a ticket. This was especially true in the early days of tournaments (pre–World War I for the men and pre-1973 for the women).

In 1967 Kathy Whitworth won the Women’s Western Open with a score of 289, 11 strokes under par on the 6,505-yard par-75 Pekin, Illinois, Country Club course. There were forty-two contestants in the event, among them a woman named Mona Erickson. Two years earlier, Mrs. Erickson had teed it up in the Women’s Western Open—the first professional event of her life—and shot rounds of 87, 86, 93, and 88 for a four-round total of 354, 62 strokes above par and 64 behind the winner. In 1967, having sharpened her game not very much, Mrs. Erickson recorded scores of 88, 87, 95, and 92 for a four-round total of 362. That, as you might guess, was good for dead last in the field, 73 strokes behind Whitworth and 11 behind her nearest competitor.

Based solely on her golf skills, Mrs. Erickson did not belong in the field for either the 1965 or the 1967 event. But for reasons known only to history (a need to flesh out the field, a desire to include host club members), even big tournaments have from time to time accepted entries from relative duffers . . . players who produce scores plainly out of step with the skills of the better players of the period in question. Such players can be considered outliers, a term that derives from the fact that their data (scores) lie outside the normal range of expected performance. In major tournament golf, outliers haven’t been a problem for decades. The last true outlier was the grand old lady Patty Berg, who at age sixty-one completed four rounds in the 1979 Dinah Shore in 343 strokes—an average of just under 86 per round—because Dinah didn’t have the heart to cut a legend.

For this book’s purposes, the problem with outliers is that while they were technically part of the field, the inclusion of their results distorts the full-field data and thus any data-based conclusions. If I were ruled eligible to compete in the U.S. Open, I might shoot 450, a performance that all by itself would raise the overall field average a couple of strokes and also increase the standard deviation of that week’s performance. Statistically, in other words, my presence would be meddlesome and effectively misleading. It’s the same principle that would apply if a musician of little skill played second violin in the philharmonic. The negative effect would accrue to an assessment of the whole performance, even though the real problem lay with one performer.

Back to golf. By their mere presence in the 1967 Women’s Western Open, Mrs. Erickson and a handful of fellow outliers raised the tournament average 2 strokes, from 313.4 to 315.4, and raised the standard deviation from 10.32 to 13.69. These changes are consequential for the Z score of every player in the field. If such scores are accepted as legitimate, Ms. Whitworth’s winning 289 translates to a Z score of –1.93. With the outlier showings discounted, the reduced field average and standard deviation make Ms. Whitworth’s Z score –2.36.
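The arithmetic behind those two figures can be checked in a few lines. This is only a sketch: the function name `z_score` is a label chosen here for illustration, and the inputs are the field averages and standard deviations quoted above.

```python
def z_score(score, field_mean, field_sd):
    """Standard deviations between a score and the field average.

    In golf a lower score is better, so a negative value is good.
    """
    return (score - field_mean) / field_sd


# Figures quoted in the text for the 1967 Women's Western Open.
with_outliers = z_score(289, 315.4, 13.69)
without_outliers = z_score(289, 313.4, 10.32)

print(round(with_outliers, 2))     # -1.93
print(round(without_outliers, 2))  # -2.36
```

The winning score is the same 289 in both cases. Only the yardstick changes, which is why the outliers matter for every player in the field.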

Because they exist chiefly as mischief makers, the solution for us—as it would be for the symphony—is pretty straightforward: boot the outliers. Hence, for this book’s purposes, the various Mona Ericksons of major tournament golf, male and female, do not exist.

Outliers are a modest problem, but players who miss the cut are a more serious one. For most of professional tournament history, the practice has been to lop from the field those who fall far behind the front-runners after a certain point, usually halfway. These days that usually reduces the number of weekend competitors to the top seventy and ties. Obviously, if a player fails to make the cut, he or she does not have a four-round score for the tournament. This isn’t much of a concern as one tries to calculate a peak-performance rating. If in twenty majors a player doesn’t have at least ten scores better than “missed cut,” there’s little need to rate that player’s “peak.” But career ratings are cumulative, so missed cuts have to be counted. One way to deal with this problem would be to base the calculations on per-round scores rather than four-round scores. But that would muddy the calculations since some conditions (weather and hole locations being the most obvious) could have changed between rounds, and the validity of the calculations hinges on environmental neutrality. Since there is no way to know what a player would have shot if he or she had not missed the cut, a rule must be declared for assigning a score. Such a player will be assigned a “four-round score” that is three strokes higher than the highest four-round score recorded in that tournament, and his or her Z score will be calculated accordingly.
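The missed-cut rule can be sketched as follows. The five totals are hypothetical, invented purely for illustration, and `missed_cut_total` is a name chosen here. The text does not say whether the field average and standard deviation are computed before or after the stand-in scores are added, nor which form of standard deviation is used. The sketch assumes the players who finished four rounds and the population standard deviation.

```python
from statistics import mean, pstdev


def missed_cut_total(completed_totals, penalty=3):
    """Stand-in four-round total: worst completed total plus the penalty."""
    return max(completed_totals) + penalty


# Hypothetical four-round totals for the players who made the cut.
completed = [280, 284, 288, 291, 297]
stand_in = missed_cut_total(completed)

# Assumption: Z is measured against the players who finished four rounds,
# using the population standard deviation.
field_mean = mean(completed)
field_sd = pstdev(completed)
stand_in_z = (stand_in - field_mean) / field_sd

print(stand_in)  # 300
```

Whatever the field looks like, the stand-in total is the worst in the tournament, so a missed cut always produces the highest Z score of the week.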

One must also confront what can be thought of as the Lee Trevino provision. Trevino, whose predominant shot was a left-to-right fade, was noted for his aversion to the Masters, played on a course that strongly favors a right-to-left ball flight. Four times in the 1970s he did not even accept his invitation to compete, and when he did he rarely contended. Only once in eighteen Masters appearances did Trevino reach the top ten, and that finish, a tie for tenth, came in 1985, on the forty-six-year-old’s fourteenth attempt.

Golfers worthy of consideration as the best of all time cannot have too many jinx courses by tour standards. Accordingly, every major tournament in which a player competes during his or her five-year peak period must be represented at least once among the scores used to calculate his or her peak rating. If a major in which the player participated is not represented among the ten best scores, then better scores must be struck in favor of scores from the missing major until all majors the player participated in during that peak period are represented at least once. This rule does not penalize players who never played in a particular major during their peak. For example, an American player during the 1930s who never competed in the British Open would accrue no penalty. But if that same player made one appearance in the British Open during his five-year peak, his Z score from that one appearance must be among the ten factored into the peak score.

For the record, Trevino’s peak rating includes his nineteenth-place finish at the 1969 Masters in which he registered a Z score of –0.18. That replaces his tenth-place showing and –0.68 Z score at the 1973 British Open. (It also changes Trevino’s average Z score for the period from –1.70 to –1.63.)
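The representation rule can be sketched in code. Everything in the example is hypothetical: the results are invented, and `peak_scores` is a name chosen here. The text does not spell out which score gets struck or which score from the missing major replaces it. The sketch assumes the weakest selected score whose major remains otherwise represented is struck, and that the player's best score from the missing major takes its place.

```python
def peak_scores(results, n=10):
    """Pick n peak scores so every major played is represented.

    results: list of (major, z) pairs from the peak window; lower z is better.
    """
    ranked = sorted(results, key=lambda r: r[1])
    chosen, rest = ranked[:n], ranked[n:]
    played = {major for major, _ in results}
    for major in sorted(played - {m for m, _ in chosen}):
        # rest is sorted best-first, so the first match is the best score.
        best_missing = next(r for r in rest if r[0] == major)
        # Assumption: strike the weakest chosen score whose major
        # would still be represented after the strike.
        for victim in reversed(chosen):
            if sum(1 for m, _ in chosen if m == victim[0]) > 1:
                chosen.remove(victim)
                break
        chosen.append(best_missing)
        rest.remove(best_missing)
        chosen.sort(key=lambda r: r[1])
    return chosen


# Hypothetical peak-window results: twelve majors, one weak Masters.
results = [
    ("US Open", -2.5), ("US Open", -2.0), ("US Open", -1.8), ("US Open", -1.2),
    ("British Open", -2.4), ("British Open", -1.9), ("British Open", -1.5),
    ("British Open", -0.7), ("PGA", -2.1), ("PGA", -1.6), ("PGA", -1.0),
    ("Masters", -0.2),
]
chosen = peak_scores(results)
peak_average = sum(z for _, z in chosen) / len(chosen)
print(round(peak_average, 2))  # -1.72
```

In this made-up case the ten best scores average –1.80, but none of them comes from the Masters. Striking the –1.0 at the PGA in favor of the –0.2 at the Masters drops the peak average to –1.72, the same kind of adjustment described for Trevino.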

Converting Standard Deviation to Stroke Average

In this book, excellence is measured by standard deviation. In golf, however, excellence is measured by a stroke average. That makes it desirable to convert the former to the latter.

As a rule of thumb, 1 stroke in a major professional golf tournament is equal to between 0.16 and 0.17 of a standard deviation in performance. Armed with this information, and assuming that 0.0 standard deviations equals a score slightly less than 288 on a par 72 course, one can assign an estimated stroke value to every increment of standard deviation. A tabular breakdown of all the major intervals is in the appendix.
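A rough sketch of the conversion follows. It rests on several assumptions not stated in the text: that the "1 stroke" is measured on a four-round total, that the midpoint 0.165 stands in for the 0.16 to 0.17 range, and that the baseline is a flat 288 (the text says only "slightly less than 288"). The function names are chosen here for illustration, and the appendix table remains the authority.

```python
SD_PER_STROKE = 0.165    # assumed midpoint of the 0.16-0.17 range in the text
BASELINE_TOTAL = 288.0   # assumed; the text says "slightly less than 288"


def z_to_total(z, sd_per_stroke=SD_PER_STROKE, baseline=BASELINE_TOTAL):
    """Estimated four-round total for a given Z score."""
    return baseline + z / sd_per_stroke


def z_to_stroke_average(z, rounds=4):
    """Estimated per-round stroke average for a given Z score."""
    return z_to_total(z) / rounds


# A Z score of -1.65 is about 10 strokes better than the baseline.
print(round(z_to_total(-1.65), 1))           # 278.0
print(round(z_to_stroke_average(-1.65), 2))  # 69.5
```

Under these assumptions each additional 0.165 of a standard deviation is worth one stroke over four rounds, or a quarter of a stroke per round.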

There is, however, a catch: converting standard deviation to an estimated stroke value works well only in determining peak ratings. That’s because a peak rating is a representation of performance at a particular point in time. Career ratings, however, are volume measures. That’s why a player’s career rating is not translated to an estimated stroke average.