September 22, 2010

Statistics in sports

There are three kinds of lies – lies, damned lies, and statistics.  Although the expression is usually attributed to Mark Twain, Twain himself credited it to British politician Benjamin Disraeli.  I don’t think statistics in sports have reached that level of mendacity, but they are often misleading.

I’ve been a statistics aficionado ever since I was a kid collecting hundreds of baseball cards.  (I could have retired from USAA earlier if I hadn’t misplaced those cards sometime along the way.)  One of my favorite games when I was a kid was to compare one baseball card against another to determine which player had the better year.  Because I gave each category of statistic on the card equal value (e.g., triples and RBIs), I remember certain non-star players with a large number of doubles, triples, stolen bases, and runs would stack up surprisingly well against slugging stars.  I also gave the same credit for winning a category by one or by 50.  Thus, having one more triple would be worth the same as 50 more RBIs.  These rules made the game fun and a bit unpredictable, but even then I realized that baseball-card statistics could be misleading.
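For anyone who wants to replay my old card game, here is a rough sketch of the scoring rule as I remember it: one point per category won, no matter the margin.  The function name and the two stat lines are made up for illustration; they are not real players or real cards.

```python
def compare_cards(card_a, card_b):
    """Return (categories won by A, categories won by B).

    Every category counts the same, and winning by 1 counts the
    same as winning by 50. Ties award no point to either card.
    """
    wins_a = sum(1 for cat in card_a if card_a[cat] > card_b[cat])
    wins_b = sum(1 for cat in card_a if card_b[cat] > card_a[cat])
    return wins_a, wins_b


# Hypothetical stat lines, invented for this example.
slugger = {"doubles": 20, "triples": 2, "home_runs": 40,
           "rbi": 120, "stolen_bases": 3, "runs": 90}
speedster = {"doubles": 30, "triples": 10, "home_runs": 5,
             "rbi": 55, "stolen_bases": 40, "runs": 95}

print(compare_cards(slugger, speedster))  # (2, 4)
```

The speedster wins four categories to two even though the slugger has 35 more home runs and 65 more RBIs, which is exactly the kind of misleading result I noticed as a kid.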

Statistician Bill James has done more than anyone to make sports statistics meaningful.  Starting in 1977, he self-published The Bill James Baseball Abstract, which contains his statistical analysis of baseball strategy, productivity, and effectiveness, and his influence in the baseball world has grown continually ever since.  James has developed countless statistics that reveal a player’s offensive and defensive effectiveness (e.g., runs created and range factor), and he has shown that certain time-honored managerial strategies are incorrect (e.g., when to hit-and-run or bunt).  In 2006, James was named by Time magazine as one of the hundred most influential people in the world (in the Thinking category).  He has also been the subject of a profile on “60 Minutes.”

The reliance on statistics in baseball is called sabermetrics, a term derived from the acronym of the Society for American Baseball Research (SABR), of which Bill James is the most prominent member.  Another famous practitioner of sabermetrics is Oakland GM Billy Beane, about whom the book Moneyball was written in 2003.  According to Wikipedia:

  • The central premise of Moneyball is that the collected wisdom of baseball insiders (including players, managers, coaches, scouts, and the front office) over the past century is subjective and often flawed. Statistics such as stolen bases, runs batted in, and batting average, typically used to gauge players, are relics of a 19th century view of the game and the statistics that were available at the time. The book argues that the Oakland A’s’ front office took advantage of more empirical gauges of player performance to field a team that could compete successfully against richer competitors in Major League Baseball.   

Examples of sabermetrics include:

  • OPS – on-base plus slugging
  • LIPS – late-inning pressure situations
  • DIPS – defense-independent pitching statistics
  • WHIP – walks plus hits per inning pitched
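Two of these, OPS and WHIP, reduce to simple arithmetic, so here is a minimal sketch of how they are commonly computed.  The function and parameter names are my own, and the sample numbers are invented; LIPS and DIPS depend on game context and more involved formulas, so I have left them out.

```python
def on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage: times on base per plate appearance counted."""
    return (hits + walks + hit_by_pitch) / (
        at_bats + walks + hit_by_pitch + sac_flies)


def slugging_pct(total_bases, at_bats):
    """Slugging percentage: total bases per at-bat."""
    return total_bases / at_bats


def ops(hits, walks, hit_by_pitch, at_bats, sac_flies, total_bases):
    """OPS: on-base percentage plus slugging percentage."""
    return (on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies)
            + slugging_pct(total_bases, at_bats))


def whip(walks_allowed, hits_allowed, innings_pitched):
    """WHIP: walks plus hits allowed per inning pitched."""
    return (walks_allowed + hits_allowed) / innings_pitched


# Invented season totals, for illustration only.
print(round(ops(hits=150, walks=50, hit_by_pitch=0, at_bats=500,
                sac_flies=0, total_bases=250), 3))
print(round(whip(walks_allowed=50, hits_allowed=150,
                 innings_pitched=200), 3))
```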

As sabermetrics became generally accepted in baseball, it made gradual inroads into other sports, too, like basketball and football.  In fact, I was prompted to write about this subject by a recent Happy Hour with Kevin Brown during which he complained about some misleading basketball statistics.  He thinks scoring, rebounds, and assists should be reported on a per-minute basis.  (I think Kevin’s favorite player doesn’t play a lot of minutes, but generates a lot of numbers in those minutes.)  Kevin also complained that quarterback ratings in football should account for dropped passes and for interceptions that come off deflections.

I responded to Kevin that I had some pet-peeve statistics, too:

  • Basketball – points per possession consumed.  Whereas Kevin wants statistics to show points per minute, I want to show points per possession consumed.  You might think that the number of shots taken would fully reflect this, but it doesn’t because it fails to count possessions that end in free throws.  For example, Kobe Bryant may shoot 4 of 14 and score 14 points because he made 6 of 8 free throws.  According to my statistic, he scored 14 points out of 18 possessions (14 shots plus four trips to the line, or 20 possessions if you count his two turnovers).  I would be interested in comparing Kobe, LeBron, and Durant with this statistic.
  • Football – time-of-possession.  Virtually every commentator thinks that time-of-possession is a significant statistic, whereas I think it is clearly inferior to the total number of plays.  I concede that a football team wants the offense on the field because an offense can score more easily than a defense can and because defenses tend to get worn down, but it shouldn’t matter whether the clock is running (because of running plays and short passes over the middle) or is stopped (because of sideline passes or incompletions).  Furthermore, running the clock may be good if you are ahead, but it also reduces the number of possessions in a game, and only the weaker team wants to reduce the number of possessions.
  • Baseball – errors.  A batter who gets on base because of an error is charged with an out for purposes of his batting average.  That always struck me as misleading (and unfair) because, as a relatively fast softball player, I felt that my ability to run would cause the infielder to make an error and this should be reflected in my batting average. 
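To make the basketball pet peeve concrete, here is a rough sketch of my points-per-possession-consumed statistic.  It assumes every trip to the free-throw line is worth two attempts, which ignores and-ones, technicals, and three-shot fouls; the function names are my own invention, not an official formula.

```python
def possessions_consumed(fg_attempts, ft_attempts, turnovers=0,
                         fts_per_trip=2):
    """Shots taken, plus trips to the line, plus (optionally) turnovers.

    Assumption: each free-throw trip is exactly `fts_per_trip` attempts.
    """
    return fg_attempts + ft_attempts / fts_per_trip + turnovers


def points_per_possession(points, fg_attempts, ft_attempts, turnovers=0):
    """Points scored per possession consumed."""
    return points / possessions_consumed(fg_attempts, ft_attempts, turnovers)


# The Kobe example from the list above: 4 of 14 shooting, 6 of 8 free
# throws, 14 points, two turnovers.
print(possessions_consumed(14, 8))                   # 18.0
print(possessions_consumed(14, 8, turnovers=2))      # 20.0
print(round(points_per_possession(14, 14, 8), 3))    # 0.778
print(round(points_per_possession(14, 14, 8, 2), 3)) # 0.7
```

By shots alone, 14 points on 14 attempts looks like a point per shot; counting the trips to the line and the turnovers drops it to 0.7 points per possession.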

Interestingly, football and basketball number crunchers use the term sabermetrics even though, technically, the term is based on the name “Society for American Baseball Research.”

In my opinion, managerial hunches have been vastly overrated; they are not much more than an excuse for not doing the necessary study and thinking.  With all of the money involved in professional sports, statistical analysis should be a part of Management 101.