The Hole Truth: Determining the Greatest Players in Golf Using Sabermetrics

by Bill Felber

Hardcover

$29.95 

Overview


Ever wonder whether Tiger Woods in his prime would have beaten Bobby Jones, Ben Hogan, or Jack Nicklaus in their primes? And could any of them have beaten Babe Zaharias? Obviously, if Bobby Jones were returned to life and health and then given his old hickory-shafted mashie, persimmon-headed driver, and rubber-core ball in a match against Jordan Spieth, the outcome would be foreordained. But what if the impact of the training, equipment, courses, and traveling conditions could be neutralized in order to create a measurement? Now for the first time, questions are answered about the relative abilities of the greatest players in the history of professional golf.  

In The Hole Truth Bill Felber provides a relativistic approach for evaluating and comparing the performance of golfers while acknowledging the game’s changing nature. The Hole Truth analyzes the performances of players relative to their peers, creating an index of exceptionality that automatically factors the changing nature of the game through time. That index is based on the standard deviation of the performances of players in golf’s recognized major championships dating back to 1860. More than two hundred players are rated in comparison with one another, more than sixty of them in detail with profiles providing context on their ranking. For the dedicated golf fan, The Hole Truth is an engaging way to see in the numbers where their favorite golfers rank across eras and where current players like Rory McIlroy and Inbee Park compare to the game’s greats.

Product Details

ISBN-13: 9781496206541
Publisher: Nebraska
Publication date: 01/01/2019
Pages: 328
Product dimensions: 6.30(w) x 9.00(h) x 1.40(d)

About the Author


Bill Felber is the author of several books, including The Book on the Book: A Landmark Inquiry into Which Strategies in the Modern Game Actually Work; Under Pallor, Under Shadow: The 1920 American League Pennant Race That Rattled and Rebuilt Baseball (Nebraska, 2011); and A Game of Brawl: The Orioles, the Beaneaters, and the Battle for the 1897 Pennant (Nebraska, 2014). He was executive editor at the Manhattan Mercury in Manhattan, Kansas, from 1986 to 2013 and is a former member of the Board of Directors of the Associated Press Managing Editors.
 

Read an Excerpt

CHAPTER 1

The 3 Percent Game

The role of statistical analysis in golf can be explored to a reasonable degree of certainty. But to do so, one must first recognize and then adjust for the impact of a variety of natural changes that have occurred to golf — and, for that matter, to almost every other form of human endeavor — over a period that covers centuries. The groundwork to do so has already been laid — not in golf but in baseball. In a 1989 essay titled "The Changing Game" published in Total Baseball, I noted the ways that changes over time in numerous aspects of life — the numbers of people available to fill rosters, improved equipment, technological and sociological advances, educational level, and strategy — affected the national pastime. "By what context does one measure [Rogers] Hornsby's feats of the 1920s relative to [Wade] Boggs's of today?" I asked at that time, answering, "By the context of the technological, strategic, societal and cultural changes that have wrought both of them."

Even more so than baseball, golf suits itself to this sort of statistical approach because while baseball is substantially a statistical game, golf is almost entirely so. In baseball the team-wide object may be victory, but individual players contribute in varying and disparate ways, some by hitting, some by pitching, some by fielding. Each skill requires a separate technique of measurement and a separate field of reference, and those can be imprecise, even unquantifiable. A batter may contribute to victory even by making a well-placed and well-timed out. There are many career .125 hitters in the Hall of Fame; they generally had 95 mph fastballs and great control of that pitch. But because the tasks performed by members of a baseball team are often different, the difficulty with baseball analysis lies in determining how all those disparate aspects can be merged into a single accurate and meaningful expression.

That's the easy part in golf, where one is measuring just one player with just one goal. The goal is called his or her score. The task is merely to relativize and, if possible, explain it.

Such statistical analysis is also a lot more important to the understanding of golf than of other sports for the simple reason that golf is more competitive. The gap in determinable talent level among golf professionals is smaller than it is among those in other sports. Nor, among other sports where such a thing can be quantified, is there an especially close second.

In a typical modern Major League Baseball (MLB) game, the average winning team scores about 5.5 runs; the average losing team scores about 3.5. In other words, one would expect the winning team on any given day to score runs at a rate about 22 percent above the average and expect the losing team to score runs at a rate about 22 percent below the average. That's subject to wide variation in the particulars, but it's accurate as a generalization.

In other popular professional sports, similar results emerge. In professional football, an average modern score is about 27–15. That means the winning team scores points at a rate about 29 percent above the league average, while the loser's score rate is correspondingly reduced. The score of an average National Hockey League (NHL) game is about 4–2, meaning that the winning and losing teams over- or underperform league averages by about 34 percent. Because National Basketball Association (NBA) games are by their nature higher scoring, the differential between winning and losing is smaller as a percentage. But it still amounts to about 5.2 percent (10 points). One sees even more striking disparities when looking at individual skills. As measured by batting average and considering all those with enough plate appearances to qualify for consideration, the difference between the best (.348, D. J. LeMahieu) and worst (.209, Danny Espinosa) hitters in baseball during 2016 was 67 percent. Measured by yards per game, the difference between the best passer in the National Football League (NFL) in 2016 (Drew Brees, 325.5) and the worst (Brock Osweiler, 197.1) was 65 percent.

Such distinctions are unfathomable in golf. On the 2016 Professional Golfers' Association (PGA) Tour, the difference between the lowest stroke average per round (Jordan Spieth, 68.85) and the average (70.93) was 2.08 strokes, or about 2.9 percent. The difference between Spieth and the worst player (Steven Bowditch) was just 7.46 percent. The spread on the Ladies Professional Golf Association (LPGA) Tour was a bit higher (Lexi Thompson, 69.02, average 71.62, worst Ssu-Chia Cheng 76.00), but the differences still amounted to 3.6 percent and 9.2 percent, respectively.

Reduced to tabular form:

League    Average team/individual score    Winning team/individual score    Percentage difference

NHL       3                                4                                33.3
NFL       21                               27                               28.6
MLB       4.5                              5.5                              22.2
NBA       101                              106                              5.2
LPGA      71.62                            69.02                            3.6
PGA       70.93                            68.85                            2.9
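
The percentage column in the table above can be approximated with a few lines of Python. This is only a sketch: it assumes the percentage is the gap between the winning and average score divided by the average, which the text implies but does not state as a formula. The function names are illustrative. Under that assumption the rounded scores reproduce every row to within rounding except the NBA, where they give roughly 5.0 rather than the 5.2 shown.

```python
# Average and winning scores exactly as quoted in the table above.
SCORES = {
    "NHL": (3.0, 4.0),
    "NFL": (21.0, 27.0),
    "MLB": (4.5, 5.5),
    "NBA": (101.0, 106.0),
    "LPGA": (71.62, 69.02),
    "PGA": (70.93, 68.85),
}


def percent_gap(average, winning):
    """Gap between the winning and average score, as a percentage of the average.

    Assumed formula; abs() is used because in golf the winning score is lower.
    """
    return abs(winning - average) / average * 100.0


def ranked_gaps(scores):
    """Return (league, gap) pairs sorted from the widest gap to the narrowest."""
    gaps = [(league, percent_gap(avg, win)) for league, (avg, win) in scores.items()]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


for league, gap in ranked_gaps(SCORES):
    print(f"{league:5s}{gap:6.1f}%")
```

Sorting by the computed gap puts the two golf tours at the bottom of the list, which is the point the chapter title makes.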

Because both games are so statistically oriented, many of the sabermetric techniques developed for baseball are easily translatable to golf. The question becomes: What does one want to learn? There may be an infinite number of answers to that question, but two should suffice to begin the dialogue. The first: What tangible skills are most important to success on the professional tour? The second — can one compare and rank the greatest professional golfers of all time? — is set aside for succeeding chapters.

As with virtually every professional sport today, the by-products of sabermetric research spilled over into golf many years ago. More than a decade ago, the PGA Tour's website began making available second-level or third-level statistical tools. Initially, there were fewer than a dozen such measurements. Today it is possible to study 60 different measurements related to the striking of the ball by a driver alone. The tour offers 97 statistical measurements related to play from the fairway or rough, 29 analyzing play from around (but not on) the green, and 95 related to the seemingly simple act of putting. That's an obsessive 281 stats designed to analyze performance that during an average round probably encompasses only about seventy strokes.

The basic tool for quantifying relationships between sets of numbers — say, scoring average and any of the 281 available statistics — is called regression analysis. You can think of it as correlation. It asks a fundamental question: How strong is the relationship between the two numbers? In other words, if one goes up (or, in the case of scoring average, down), is the other likely to follow? Correlations can be thought of as running from 0.0 to 1.0, with zero indicating no correlation whatsoever and 1.0 indicating a perfect correlation. When comparative sets of data normally flow in opposite directions — does your score get lower as you drive the ball farther? — correlations are calculated negatively, with –1.0 indicating the strongest correlation and 0.0 indicating no correlation. For purposes of simplicity in this analysis, all correlations — even naturally negative ones — will be expressed on a 0.0 to 1.0 basis. The significance of the findings will be unaffected.
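
For readers who want to see the mechanics, here is a minimal sketch of the calculation just described. It assumes the ordinary Pearson correlation coefficient, which the text does not name explicitly, and the five players' figures are invented purely for illustration. The `strength` helper drops the sign so that a naturally negative relationship is reported on the 0.0 to 1.0 scale used in this chapter.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


def strength(xs, ys):
    """Correlation strength on the 0.0 to 1.0 scale, ignoring direction."""
    return abs(pearson(xs, ys))


# Hypothetical players (not tour data): driving distance in yards, scoring average.
distance = [285.0, 292.0, 298.0, 305.0, 311.0]
scoring = [71.6, 71.1, 71.3, 70.4, 70.2]

r = pearson(distance, scoring)
print(f"signed r = {r:+.2f}, strength = {strength(distance, scoring):.0%}")
```

In this made-up sample the signed coefficient is negative because longer drives go with lower scores; the strength figure is the same number with the sign removed.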

As one talks about correlation, honesty compels acknowledgment of a mathematical axiom: correlation may or may not indicate causation. The stereotypical illustration of this is the fact that an extremely high percentage of those convicted of violent crimes brush their teeth every day. It does not, however, follow that tooth brushing causes violent crime. In correlation studies, as elsewhere in life, logic must be allowed to intervene.

An important cautionary note, though, is merely that if everyone can do it, correlation studies may not point to causation. Regarding golf, that forces some questions: Can everybody hit a 300-yard drive? Can everybody stick a 5-iron within 5 feet of a 4.25-inch hole from 200 yards away? Can everyone make a good percentage of their 20-foot putts? No, no, and no. So if strong correlations exist between those abilities and a player's final score, one is safe in giving at least some weight to the causal prospects of them.

The table that follows lists the correlations between scoring average and six measures of player performance during the 2017 PGA Tour season, one in which 195 pros played enough rounds to meet tour minimums for analysis. Again, the average score of a qualifying PGA Tour player during 2017 was 71.05. Here are the results, showing the skill, the average performance, and the strength of the correlation. Because one is more used to seeing correlations expressed as percentages rather than decimal points, that's how it will be done henceforward. Just keep in mind that 0.0 percent is meaningless and 100 percent is perfect.

The data appear to suggest that only a couple of the six aspects of play correlated to any meaningful degree with a player's stroke average. The strongest correlation, with percentage of greens hit in regulation, measured only about 59 percent. The only other correlation above 50 percent involved scrambling ability, measuring 52 percent. The correlation between scoring and driving distance, surely the most discussed aspect of tour play, registered 44 percent.

That so few skills track closely with scoring is not only possible but confirmed by analyzing additional data back to 1980. Here are the correlations for each of those six categories for every fifth year dating to the beginning of record keeping. Where a space is blank, data for that season were not kept. The final column shows the average for all seasons since the statistic in question was first kept.

On average, two of the six skills — greens hit in regulation and scrambling — correlate with a player's score to a level above 50 percent. The rest appear to have a modest to insignificant historical relationship to scoring, although in saying that, a caveat is in order with respect to distance off the tee. For the past four seasons, the correlation between driving distance and scoring has basically shot directly up, from just 14 percent in 2013 to 24 percent, then 33 percent, then 35 percent, and finally to an all-time high of 44 percent in 2017. More on that in a few paragraphs.

If that is all one had, one would conclude that winning on tour is more a matter of artistry than talent, that there is no formula for excellence. It is not, however, all one has; there are the 281 categories mentioned earlier. The data push has come in several stages. Between 2001 and 2004, the PGA Tour accelerated its use of lasers to measure player distance and accuracy. A select couple of categories became 30, then 50. In 2007 the tour first measured the actual mechanics of ball flight, giving us such micro-measurements as clubhead speed, spin rate, launch angle, carry and hang time ... basically, the same things your sporting-goods retailer uses to sell you a new driver.

Statistically, however, the most important of those new categories was made possible in 2003. That is when tour officials began utilizing a system they called ShotLink to more precisely and more thoroughly record every shot actually hit on tour. They did more than that; they made the data available to researchers. A few years later, Mark Broadie, research director of the program for financial studies at Columbia University, took the tour up on its data availability, eventually developing a performance measurement system that came to be called Strokes Gained.

The Strokes Gained system is sufficiently intricate that it cannot be explained in any detail here. Suffice it to say, it analyzes each shot against the average result of all shots taken from similar circumstances in order to attach a positive or negative value. In that sense, it is similar to baseball's wins above average, an offshoot of the better-known wins above replacement, which measures the relative contribution of each game-related act. In both cases, a zero-based norm results, which is a good thing. The second good thing is that Strokes Gained improves the correlational relationship between skills and scoring. Here are the correlations for the four major "Strokes Gained" categories at five-year intervals since the formula's first application, which dates back to 2004:

                                        2005    2010    2015

Strokes Gained off the tee              46%     46%     52%
Strokes Gained approaching the green    76%     60%     64%
Strokes Gained around the green         31%     49%     31%
Strokes Gained putting                  37%     46%     42%
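
The shot-by-shot idea behind Strokes Gained, as summarized before the table, can be illustrated with a rough sketch. The formula used here (expected strokes from the starting position, minus expected strokes from the ending position, minus the one stroke just played) is the commonly published form of the calculation rather than anything taken from this excerpt, and the baseline numbers are invented for illustration, not tour data.

```python
# Hypothetical baseline: average strokes needed to hole out from each situation.
# These numbers are invented for illustration and are not tour data.
BASELINE = {
    "tee, 440 yards": 4.10,
    "fairway, 150 yards": 2.90,
    "green, 20 feet": 1.90,
    "holed": 0.00,
}


def strokes_gained(start, end, baseline=BASELINE):
    """Value of one shot: expected strokes saved, minus the stroke just spent."""
    return baseline[start] - baseline[end] - 1.0


def hole_summary(positions, baseline=BASELINE):
    """Strokes gained for each shot on a hole, given the sequence of positions."""
    return [strokes_gained(a, b, baseline) for a, b in zip(positions, positions[1:])]


# One hypothetical hole: drive to the fairway, approach to 20 feet, one putt.
hole = ["tee, 440 yards", "fairway, 150 yards", "green, 20 feet", "holed"]
for shot, value in enumerate(hole_summary(hole), start=1):
    print(f"shot {shot}: {value:+.2f}")
```

A shot that leaves the player exactly where the baseline expects is worth zero, which is the zero-based norm mentioned above. The per-shot values on a hole add up to the baseline expectation from the tee minus the strokes actually taken.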

If one compares the strength of those correlations with the strength of correlations of the six basic tasks already examined, the comparison generally favors the "Strokes Gained" approach. This is true in the most significant category — greens in regulation (GIR) versus Strokes Gained approaching the green — but is also true in Strokes Gained putting versus putts per round and in the emerging category, driving distance versus Strokes Gained off the tee. Since its inception, the correlation between Strokes Gained approaching the green and scoring has never fallen below 59 percent and has averaged 70 percent. That is a noteworthy correlation indeed, and it is also noticeably stronger than either the 47.5 percent correlation between GIR and scoring or the 37 percent correlation between proximity and scoring over the same years. Since 2004 the average correlation between Strokes Gained off the tee and scoring has been 49 percent; the average correlations for distance and accuracy for that same period are 19 percent and 15 percent, respectively. The average correlation between Strokes Gained putting and scoring has been 38 percent; substitute putts per round, and the correlation falls to 26 percent.

Where the Strokes Gained approach fares less favorably than the tried-and-true method is in assessing a player's work around the greens, chiefly his ability to scramble for par or better. Since 2004 the average correlation between Strokes Gained around the green and scoring has been a relatively modest 38 percent, measurably less than the 52 percent correlation between basic scrambling ability and scoring. There are two possible explanations for this: first, Strokes Gained doesn't work particularly well the closer you get to the green, and, second, success around the green doesn't correlate well to winning.

Back to putting ... what's going on here? In 2017 tour pros averaged 29.07 putting strokes, 41 percent of their average score. Yet neither the correlation between Strokes Gained putting and scoring nor the one between putts per round and scoring has measured higher than 50 percent in thirty years, and in 2017 those correlations were just 25 percent and 20 percent, respectively. If one particular skill amounts to 41 percent of the game, shouldn't its correlation to score be something beyond incidental? The explanation is probably pretty logical, if one thinks about it. In 2017 the average pro hit his approach shot 36.74 feet from the hole. For all putts longer than 25 feet, the best putter on tour in 2017, Xander Schauffele, made just 5.6 percent ... but that only amounted to 30 makes in 540 attempts. Three-putts are even rarer. There were 7,846 of them in 2017, but that's across more than a quarter-million holes, amounting to just 2.2 three-putts in an average player's seventy-two-hole tournament.
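
The arithmetic in the paragraph above is easy to check. The sketch below uses only figures quoted in this chapter, with one loudly labeled assumption: the number of holes played is set to exactly 250,000, the lower bound implied by "more than a quarter-million." Because the true count is higher, the three-putt figure computed here is an upper bound and lands slightly above the 2.2 quoted.

```python
# Figures quoted in the chapter for the 2017 PGA Tour season.
AVG_PUTTS_PER_ROUND = 29.07
AVG_SCORE = 71.05          # average score of a qualifying player, quoted earlier
LONG_PUTTS_MADE = 30       # best putter's makes from beyond 25 feet
LONG_PUTTS_TRIED = 540
THREE_PUTTS = 7846

# Assumption: the text says only "more than a quarter-million holes",
# so 250,000 is used here as a lower bound, not an actual count.
HOLES_PLAYED = 250_000


def share(part, whole):
    """Express part as a percentage of whole."""
    return part / whole * 100.0


putting_share = share(AVG_PUTTS_PER_ROUND, AVG_SCORE)
long_make_rate = share(LONG_PUTTS_MADE, LONG_PUTTS_TRIED)
three_putts_per_72 = THREE_PUTTS / HOLES_PLAYED * 72

print(f"putting share of score:     {putting_share:.1f}%")
print(f"make rate beyond 25 feet:   {long_make_rate:.1f}%")
print(f"three-putts per 72 holes:   at most {three_putts_per_72:.2f}")
```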

Speaking broadly, in other words, tour pros pretty much always two-putt from any significant distance. So, no, putting isn't as broadly decisive as one might think. (That does not rule out its being decisive on an event-by-event basis; more on that shortly.)

To this point, the book has been dealing only with general, season-long data sets. When working from the general to the specific, a host of variables come into play. Those include, but are not limited to, the depth or quality of a tournament field, the playing conditions, weather, and innumerable idiosyncratic factors. Jordan Spieth may be the best putter on tour, but how's his tummy feeling this morning? Did a contender celebrate a good round with a late night on the town? And by far the biggest variable, not at all idiosyncratic, is the competitive closeness of the field ... in a word, luck.

What all of that means is that on an event-by-event basis, the relationships between skills and scores may be more meaningful. Is there a correlation between excellence in certain skills and scoring that may come and go over time but may influence results on a week-to-week basis?

Here's a secret for fantasy players: Given the competitive nature of the tour, predictive analysis of a golf tournament is almost impossible to do in a statistically reliable fashion. That in turn means even excellent data are essentially worthless for predictive purposes. Sorry, DraftKings fans. If those data make any sense at all, it is only retrospectively, which is how they are used here.

The difficulty with predictive analysis is even more challenging on the women's tour. That's because LPGA data-collection techniques are primitive for this data-driven age, generally encompassing just six rudimentary skills and only dating back to 1993. The six are driving distance, driving accuracy, greens in regulation, sand saves, putting, and putts per GIR. Here are the correlations for the four most pertinent in 2017:

Skill                   Correlation

Distance                0.30
Accuracy                0.30
Greens in regulation    0.86
Putting                 0.37

(Continues…)


Excerpted from "The Hole Truth"
by Bill Felber.
Copyright © 2019 Board of Regents of the University of Nebraska.
Excerpted by permission of UNIVERSITY OF NEBRASKA PRESS.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents


Acknowledgments    
Introduction    
1. The 3 Percent Game    
2. Dominance and Chance    
3. Tournament Rules    
4. Pioneers    
5. Coming to America    
6. Interwarriors    
7. Bantam Ben and Slammin’ Sam    
8. The King, Some Queens, and a Black Prince    
9. The Golden Bear Market    
10. Metallurgy    
11. Millennials    
12. Still on the Course    
Afterword    
Appendix    
Glossary    
Notes    
Index    