Joe Posnanski has written another thoughtful piece on the divide between writers of a statistical bent and those who prefer the evidence of their eyes. I highly recommend it; Posnanski distills the argument into one about stories. Do statistics ruin them? His answer is no: one can use statistics to tell other stories, if not necessarily better ones. He approaches this by examining how one statistic, “Win Probability Added”, helped him look at certain games with fresh eyes.
My only comment is that I’ve noticed, on his and other sites (such as Dave Berri’s Wages of Wins Journal), that one difficulty in getting non-statisticians to look at numbers is that they tend to desire certainty. What they usually get from statisticians, economists, and scientists is reams of ambiguity. The problem comes not when someone is able to label Michael Jordan as the greatest player of all time*; the problem comes when one is left trying to rank merely great players against each other.
* Interestingly enough, it turns out the post I linked to was one where Prof. Dave Berri was defending himself against a misperception. It seems writers such as Matthew Yglesias and King Kaufman had mistaken Prof. Berri’s argument using his Wins Produced and WP48 statistics, thinking that Prof. Berri wrote that other players were “more productive” than Jordan. To which Prof. Berri replied, “Did not”, but also gave some nuanced approaches to how one might look at statistics. In summary, Prof. Berri focused on how far Jordan’s performance stood above that of his contemporary peers.
The article I linked to about Michael Jordan shows that, when one compares numbers directly, care should be taken to place them into context. For example, Prof. Berri writes that, in the book Wages of Wins, he devoted a chapter to “The Jordan Legend.” At one point, though, he writes that
in 1995-96 … Jordan produced nearly 25 wins. This lofty total was eclipsed by David Robinson, a center for the San Antonio Spurs who produced 28 victories.
When we examine how many standard deviations each player is above the average at his position, we have evidence that Jordan had the better season. Robinson’s WP48 of 0.449 was 2.6 standard deviations above the average center. Jordan posted a WP48 of 0.386, but given that shooting guards have a relatively small variation in performance, MJ was actually 3.2 standard deviations better than the average player at his position. When we take into account the realities of NBA production, Jordan’s performance at guard is all the more incredible.
If one simply looks at the numbers, it does seem like a conclusive argument that Robinson, having produced more “wins” than Jordan, should be the better player. The nuance comes when Prof. Berri places that into context. Centers, working closer to the basket, ought to have more high-percentage shooting opportunities, rebounds, and blocks. His metric of choice, WP48, takes these into consideration. When one then looks at how far Robinson performed above his proper comparison group (i.e. other centers), we see that Robinson’s performance looks exceptional next to players at other positions but is less extraordinary when compared to other centers. Jordan’s performance, when compared to other guards, shows him to be in a league of his own.
That argument was accomplished by taking absolute numbers (generated for all NBA players, at all positions) and placing them into context (comparing each to a specific set of averages, such as by position).
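To make the "standard deviations above the average at his position" idea concrete, here is a minimal sketch in Python. The ratings below are invented toy numbers, not real WP48 values, and the function name `z_scores_by_group` is my own; I also assume a plain population standard deviation, which may not be exactly what Prof. Berri used.

```python
from statistics import mean, pstdev

def z_scores_by_group(values_by_group):
    """For each group, express every member's value as the number of
    standard deviations above (or below) that group's own average."""
    result = {}
    for group, players in values_by_group.items():
        vals = list(players.values())
        mu = mean(vals)
        sigma = pstdev(vals)  # assumption: population standard deviation
        result[group] = {name: (v - mu) / sigma for name, v in players.items()}
    return result

# Hypothetical ratings, invented purely for illustration.
# Centers are spread widely; guards are bunched together.
toy = {
    "center": {"C1": 0.00, "C2": 0.10, "C3": 0.20, "C4": 0.30, "C5": 0.40},
    "guard":  {"G1": 0.05, "G2": 0.05, "G3": 0.05, "G4": 0.05, "G5": 0.30},
}

z = z_scores_by_group(toy)
best_center = z["center"]["C5"]  # highest raw rating overall (0.40)
best_guard = z["guard"]["G5"]    # lower raw rating (0.30)
print(f"best center: {best_center:.2f} SD above the average center")
print(f"best guard:  {best_guard:.2f} SD above the average guard")
```

In this toy data the best guard has the smaller raw number but sits further above his own group, because guards vary less. That is the same shape as the Robinson and Jordan comparison quoted above.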
This is where logic, math, and intuition can get you. I don’t think most people would have trouble understanding how Prof. Berri constructed his argument. He tells you where his numbers came from, why there might be issues, and where they go against “conventional wisdom”. In this case, the way he structured his analysis resolved the difference (it isn’t always the case that he’ll confirm conventional wisdom; see his discussions on Kobe Bryant).
However, I would like to focus on the fact that Prof. Berri’s difficulties came when his statistics generated larger numbers for players not named Michael Jordan. (I will refer people to a recent post listing a top-50 of NBA players on Wages of Wins Journal.*)
* May increase blood pressure.
In most people’s minds, that clearly leads to a contradiction: how can this guy, with smaller numbers, be better than the other guy? Another way of putting this is that people assume differences in numbers always matter, and that they matter in the way “intuition” tells us.
In this context, it is understandable why people give such significance to 0.300 over 0.298. One is larger than the other, and it’s a round number to boot. Over 500 at-bats, the difference between a .300-hitter and a .298-hitter translates to 1 hit. For most people who work with numbers, such a difference is non-existent. However, if one were to perform “rare-event” screening, such as for cells in the bloodstream that were marked with a probe that “lights” up for cancer cells, then a difference of 1 or 2 might matter. In this case, the context is that, over a million cells, one might expect to see, by chance, 5 or so false positives in a person without cancer. However, in a person with cancer, that number may jump to 8 or 10.
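The batting-average arithmetic is easy to check, and it helps to set the 1-hit gap next to how much a hit total would wobble by chance alone. This is only a sketch: I am assuming each at-bat is an independent trial with a fixed chance of a hit (a simple binomial model, which real hitting surely is not), and the function names are my own.

```python
import math

def expected_hits(avg, at_bats):
    """Hits implied by a batting average over a number of at-bats."""
    return avg * at_bats

def binomial_sd(avg, at_bats):
    """One standard deviation of the hit total, assuming independent
    at-bats with a fixed hit probability (a simplifying assumption)."""
    return math.sqrt(at_bats * avg * (1 - avg))

AT_BATS = 500
gap = expected_hits(0.300, AT_BATS) - expected_hits(0.298, AT_BATS)
noise = binomial_sd(0.300, AT_BATS)

print(f"gap between .300 and .298: {gap:.1f} hit(s)")
print(f"chance variation (one SD): {noise:.1f} hits")
```

Under that assumption the chance wobble is roughly 10 hits over 500 at-bats, so a 1-hit gap is well inside the noise.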
For another example, try Bill Simmons’s ranking of the top 100 basketball players in his book, The Book of Basketball. Frankly, a lot of the descriptions, justifications, arguments, and yes, statistics that Simmons cites look similar. However, my point here is that, in his mind, Simmons’s ranking scheme matters. The 11th best player of all time lost something by not being in the top-10, but he is still better off than the 12th best player. Again, as someone who works with numbers, I think it might make a bit more sense to just class players into cohorts. The interpretation here is that, at some level, any group of 5 (or even 10) players ranked near one another are practically interchangeable in terms of practicing their craft. The difference between two teams of such players is only good for people forced to make predictions, like sportswriters and bettors. With that said, if one is playing GM, it is absolutely a valid criterion to put a team of these best players together based on some aesthetic consideration. It’s just as valid to simply go down a list and pick the top-5 players as ordered by some statistic.* If two people pick their teams in a similar fashion, then it is likely a crap shoot as to which will be the better team in any one-off series. Over time (like an 82-game season), such differences may become magnified. Even then, the win difference between the two teams may be 2 or 3.
* Although some statistics are better at accounting for variance than others.
How this leads back to Posnanski is as follows. In a lot of cases, he does not simply rank numbers; after all, he’s a writer and storyteller. The numbers are not the point; the numbers illustrate. Visually, there isn’t always a glaring difference between them, especially when one looks at the top performances.
Most often, the tie-breaker comes down to the story, or, rather, what Posnanski wishes to demonstrate. He’ll find other reasons to value them. In the Posnanski post I mentioned, I don’t think the piece would have made a good story, even if it highlighted his argument well, had it ended differently.