Punditry was something quite removed from my work life and home life. I avoid the detritus that passes for political analysis in the United States, choosing instead to focus on long-form articles in The Atlantic, the NYT Magazine, and The New Yorker. I am surprised at one “holy war” (*nix vs. Win vs. Mac style) that has cropped up regarding e-readers.
Emma Silver is one of the latest to defend paper books against in silico texts. My acquaintance, Chris Meadows, has written a response to it; these two provide a snapshot of the types of arguments slung by both sides.
Generally, most partisans talk up the virtues of either paper or e-books. That is, they defend the form used by readers to engage authors.
My problem with these arguments is that neither side focuses on the real issue. Reading is not a competition between old-school curmudgeons and bleeding-edge tech heads. Reading is being assaulted by the demands on our attention from video games, movies, television, music, and time spent with friends and family. Whether one goes to a concert or a theater, sits on a couch or in a bar, or uses the Internet is beside the point. Again, it is not how one obtains entertainment that matters, only that, with the limited time we have, we seek other types of entertainment.
In this context, I do not see e-readers (whether a Kindle, a Nook, or a software reader on an iPhone/Android phone/netbook/PDA) competing against paper. E-readers are competing against the devices people use to listen to music and watch movies on the go. That is why I think it is in every book lover’s interest to promote long-form reading, and to defend this form from subversion.
No one can predict how devices like the Kindle will affect the novel and historical scholarship, the two types of writing I would classify as most endangered. There will always be a demand for light fiction. There will always be people who seek out information and political interpretation from sources with which they already agree. There will always be a demand for hack-and-slash biographies detailing the salacious drug and sex habits of the rich and famous.
Novels and histories require an immense amount of attention. I can see that histories will become more “multimedia” in the future. In a sense they already are: photographic plates and maps are generally included, along with charts, even in paper versions. As the recent past recedes into history, we will be able to include more news footage and sound. And why is this a bad thing? For instance, why wouldn’t we want to hear Churchill speak? He was a brilliant writer and speaker; how wonderful would it be if a discussion of his service during World War II also provided aural examples of the rousing speeches he gave to raise British morale?
The problem with anti-technology screeds is that they ignore the prescriptive phase of the argument. The solution will never be to ignore the device. It is already too late: the devices are too popular. I see the Kindle becoming the paperback of the ebook world: it cannot yet do video. The iPad and Android tablets will drive ebook development, not the Nook or the Kindle. These will provide the basic platform for how texts are presented to the public.
And there is a real fear here that long-form reading will be lost, since it is so attention-intensive. The defense of reading will be successful only if we can persuade youth to turn to long-form books (paper and electronic) when they desire knowledge and thoughtful analysis. That is where we all need to focus our efforts: to teach the young that, for some things, they need to sit, read, and think. We need to increase the exposure of historians who write with brio and panache. We need to convince future readers that long-form books are still relevant, providing the best method of compressing knowledge and complex ideas (as opposed to the fact- and information-based content stored in databases and across the Internet). If an e-reader is how the youth of today will engage with long texts, then we need to do more to insert ourselves into the processes by which books and their presentation are brought to the public. The paper versus electronic format debate is a diversion. We cannot afford to lose to the perception that long books belong with the dinosaurs.
Although this blog is ostensibly about books, I’ve written a lot about sports, mostly dealing with how non-scientist readers perceive statistical analysis of athlete productivity. This issue fascinates me; I think how people think about sports statistics provides a microcosm of how they may respond to similar treatments in the scientific realm. Economists, mathematicians, engineers, and physicists will provide a better explanation of the analysis than I can. Instead, I want to focus on the people who draw (shall we say) interesting conclusions about research.
In a recent podcast, Bill Simmons interviewed Buzz Bissinger on the BS Report (July 28, 2010). Bissinger gained some negative exposure after he railed against the blogosphere and sports analysis. In this podcast, Bissinger was given some time to elaborate on his thoughts. He most certainly is not a raving lunatic, but he did say a few things that I find representative of how statistical analyses are often misinterpreted by non-scientists (and even scientists).
Bissinger took the opportunity to trash Michael Lewis’s Moneyball, mostly by pointing out that Billy Beane isn’t so smart and that, in the end, the statistical techniques didn’t work: of the players mentioned in the book, only Kevin Youkilis had proven to be a success. I think that misses the point. Yes, the book documents the tension between the scouts and the stat-heads. I think Lewis chose this approach to make the book more appealing, taking the human-interest angle rather than simply writing a technical description of Beane’s “new” approach. Perhaps Lewis overstates the case in showing how entrenched baseball GMs were in relying on eyeball and qualitative skill assessments, but the point I took from the book was this: Beane worked under money constraints. He needed a competitive edge. Most baseball organizations relied on scouts. Beane thought that to be successful, he needed to do something different (something that presumably still had some relevance to baseball success).
Beane could have used fortune tellers; I think the technique in Moneyball (i.e., statistical analysis) is beside the point. Beane found something that was different and based more of his decisions on this new evaluation method. This is a separate issue from how well the new techniques performed. The first issue is whether the new technique told him something different. As it happens (as documented in Moneyball, Bill James’s Baseball Abstracts, and by many sports writers and analysts), it did. The result is that Beane was able to leverage that difference – in this case, he valued some abilities that others did not – and signed those players to his roster. The assumption is that if his techniques couldn’t give him anything different from previous methods of evaluation, then he would have had nothing to exploit.
The second point is whether the techniques told him something that was correct. And again, the stats did provide him with a metric that has a high correlation with winning baseball games: on-base percentage. So one thing he was able to exploit was the gap in perceived value between batting average (BA) and on-base percentage (OBP). He couldn’t sign power hitters: GMs – and fans – like home runs, so power comes at a premium. He avoided signing hitters with high BA and instead signed those with high OBP.
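To make the BA versus OBP distinction concrete, here is a minimal sketch using the standard definitions of the two statistics. The two hitters and their stat lines are invented purely for illustration (they are not real players and do not come from Moneyball), and the class and function names are my own.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One hitter's season totals (all values here are made up)."""
    at_bats: int
    hits: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def batting_average(line: BattingLine) -> float:
    # BA = H / AB. Walks do not count as at-bats, so BA ignores them.
    return line.hits / line.at_bats


def on_base_percentage(line: BattingLine) -> float:
    # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    times_on_base = line.hits + line.walks + line.hit_by_pitch
    chances = (line.at_bats + line.walks
               + line.hit_by_pitch + line.sacrifice_flies)
    return times_on_base / chances


# Hypothetical hitters: one makes a lot of contact, one draws a lot of walks.
contact_hitter = BattingLine(at_bats=600, hits=180, walks=20,
                             hit_by_pitch=0, sacrifice_flies=5)
patient_hitter = BattingLine(at_bats=500, hits=130, walks=100,
                             hit_by_pitch=5, sacrifice_flies=5)

for name, line in [("contact", contact_hitter), ("patient", patient_hitter)]:
    print(f"{name}: BA={batting_average(line):.3f} "
          f"OBP={on_base_percentage(line):.3f}")
```

With these invented numbers, the contact hitter bats .300 with a .320 OBP, while the patient hitter bats only .260 but reaches base at a .385 clip. A GM who shops by BA alone would pass over the second hitter, which is the kind of mispricing the paragraph above describes.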
This leads to a third point: Beane could only leverage OBP to find cheap players (and still win) so long as few GMs were doing the same. Of course the cost of OBP will increase if others come on board and have deep pockets (like the Yankees and the Red Sox). So Beane – and other GMs – would have to become more sophisticated in how they draft and sign players, especially if they work under financial constraints. As my undergraduate advisor said, “You have to squeeze the data.”
One valid point Bissinger made was that the success of the Oakland A’s coincided with the Big Three pitchers. So clearly, Bissinger attributed a significant amount of Oakland’s success to the three. That’s fine, as the question can be settled by looking at data. What annoys me is when readers do not pay attention to the argument. I felt that Moneyball was more about how one can find success by examining what everyone else is doing, and then doing something different. The only constraint is whether that something different would bring success.
I feel that Bissinger is projecting when he assumes that using stats means the rejection of visual experience. The importance of Moneyball is in demonstrating that one can find success by simply finding out what people have overlooked. Once the herd follows, it makes sense to seek out alternative measures, or, more likely, to find out what others are ignoring. If the current trend is to chase high OBP and to ignore pitchers with a high win count, then a smart GM needs to exploit what is currently undervalued. Statistics happens to be one such tool – but it isn’t the only tool.
And part of the reason I write this is, again, to highlight the fact that people usually have unvoiced assumptions about the metrics they use. The frame of reference is important. In science, we explicitly create yardsticks for every experiment we perform. We assess whether things differ from a control. It is a powerful concept. And even if the yardstick is simply another yardstick, we can still draw conclusions based on differences (or even similarities, if one derives the same answer by independent means).
This brings me to recent Joe Posnanski and David Berri posts. The three posts I selected all demonstrate the internal yardsticks (hidden or otherwise) that people use when they make comparisons. I am a fan of these writers. I think Posnanski has provided a valuable service in bridging the gap between analysis and understanding, facts and knowledge. Whether one agrees or disagrees with his posts, I think Posnanski is extremely thoughtful and clear about his assumptions and conclusions, which facilitates discussion. The post has a simple point. Posnanski had written about “seasons for the ages.” A number of readers immediately wrote to him, complaining that just about anyone who hits 50 home runs in a season would qualify. In response, Posnanski coined a new term (kind of like a sniglet): obviopiphany. He realized that most people simply associate home runs with a fantastic season for a hitter. That isn’t what Posnanski meant, and in the post he offers some correction.
The Posnanski post has a simple theme and an interesting suggestion: the outrage over steroids may be due to the fact that people assume that home run hitters are good hitters. Since steroids help power, the assumption is that steroids make hitters good – which in most cases simply means more home runs. But Posnanski – and other sabermetricians – propose that one must hit home runs in the context of striking out less and walking more. The liability involved in striking out more, and not walking, is too great and washes out the gains made from hitting the ball far. Thus Posnanski’s post names five players who are not in the Hall of Fame, and aren’t home run hitters, but who nevertheless produced at the plate according to some advanced hitting metrics. I won’t go into this more, except to say that here, Posnanski makes his assumptions clear. He uses OBP+, wins above replacement player, and other advanced metrics to make his point. But it is telling that Posnanski had to stitch together the assumptions his readers had: that the yardstick for good hitting simply boils down to home runs.
The Berri posts describe something similar. One of them is from a guest contributor, Ben Gulker, writing about how Rajon Rondo was not going to be selected for Team USA in the world championship because he doesn’t gather enough points. The other highlights how the perception of Bob McAdoo changed as a function of the fortunes of his team. Interestingly enough, McAdoo became a greater point getter while becoming a less efficient shooter and turning the ball over more; at the same time, his reputation was burnished by the championships his teams won.
The story has been told many times by Berri. It seems that in general, basketball writers and analysts identify good players as those who score points (in the literal sense, regardless of shooting percentage) and who played on championship teams. There are several problems here. Point getting must take place in the context of a high shooting percentage. One must not turn the ball over, one must rebound, one must not commit an above-average number of fouls, and hopefully one gets a few steals and blocks. I don’t think anyone would disagree that such a player is a complete player and ought to be quite desirable, regardless of how many championship rings he has or whether he scores only 12 points a game. Berri has examined this issue of yardsticks, and he has found that what sports writers, coaches, and GMs think of players has an extremely high correlation with, simply, how many points they get (this is shown by what the writers write and how they vote for player awards, how often coaches play someone, and how much GMs pay players). The verbiage about defensive prowess and the “little things” is ignored when the awards are given and the fat contracts handed out. Point getters get the most accolades and the most money.
And the other point is how easily point getters reflect the luster of championships. Never mind that no player can win alone; this again is an example of how people not only end up with unspoken yardsticks, but also choose a frame of reference without analyzing whether it is the correct one. The reference point is a championship ring. As has been documented, championships are not good indicators of good teams. The regular season is. This is simply due to sample sizes. More games are played in the regular season. Teams are more likely to arrive at their “true” performance level over a long season than in a championship tourney with a variable number of games, where, frankly, streaks matter. A good team might lose four games in a row in the regular season, but they may lose only 10 for the year. In a tournament, they would be bounced out if they lost four in a series.
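The sample-size argument can be sketched with a little binomial arithmetic. The code below is a rough illustration, not an analysis of real data: it assumes the better team wins each game independently with a fixed probability (0.55 is an invented figure), and it compares a best-of-seven series with a hypothetical 81-game stretch against the same opponent. The function name is my own.

```python
from math import comb


def prob_wins_majority(p_game: float, n_games: int) -> float:
    """Chance that a team wins more than half of n_games (n_games odd).

    Assumes every game is independent and won with probability p_game.
    For odd n this equals the chance of winning a best-of-n series,
    because playing out the remaining games after a series is decided
    cannot change who won the majority.
    """
    if n_games % 2 == 0:
        raise ValueError("use an odd number of games so there are no ties")
    needed = n_games // 2 + 1
    return sum(
        comb(n_games, wins) * p_game**wins * (1 - p_game)**(n_games - wins)
        for wins in range(needed, n_games + 1)
    )


# Assumed edge for the better team: an illustrative number, not a measurement.
P_BETTER = 0.55

series = prob_wins_majority(P_BETTER, 7)    # best-of-seven playoff series
season = prob_wins_majority(P_BETTER, 81)   # long run of games

print(f"better team wins a best-of-7: {series:.3f}")
print(f"better team wins most of 81:  {season:.3f}")
```

Under these assumptions the better team takes a best-of-seven only about 61 percent of the time, and it comes out ahead noticeably more often over 81 games. The short series leaves a lot of room for the weaker team to advance, which is why a ring is a noisy yardstick.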
In this context, the Premier League system in soccer makes sense. The best teams compete in a regular season; the team with the best record is the champion. So people who assume that a point-getter who plays on a championship team is better than a player on a non-champion team who shoots efficiently (but scores fewer points), rebounds, steals, blocks, and does not turn the ball over more than average make two errors. They have selected the wrong metric twice over.
With that said, I could only have made that point because of newer metrics that provide another frame of reference. Moreover, the new metrics tend to have improved predictive abilities over simply looking at point-getting totals. Among the new metrics, there are some that show a higher correlation with the scoring difference (and thus win/loss record) of teams. It doesn’t matter what they are, but an important point is that one can derive these conclusions about which metric is better or worse.
This is the main difference between scientific (in which I include athlete productivity analysis) and lay discourse. In the former, the assumptions are laid bare and frame the discussion. A good scientific paper (and trust me, there are bad ones) makes excruciatingly detailed descriptions of controls, the points of comparison, any algorithms/formulae, and how things are compared. In lay discourse, this isn’t the standard one would use, because communicating scientific findings to other scientists uses a stylized convention. Using such a mode of communication with friends would make one a bore and a pedant – not to mention one would become lonely real quick.
One recent meme making the rounds on the Internet is the site “I write like…” I haven’t looked into the algorithm yet, but I’m not sure if I can. It isn’t obvious on the website what the statistical analysis entails. But of course, I was curious about my writing style. Some preliminary findings:
1) Repeated submissions of the same text result in the same author
2) Of the 12 samples I submitted (all from this blog), I got the following results:
The Arthur Conan Doyle hit is an interesting one. It came from my post on James Patterson’s King Tut book. Part of the algorithm must account for theme/genre, probably based upon a concordance. There’s no reason to think that I changed my style so much when I wrote about crime. The algorithm might have narrowed the field down using certain keywords, and then selected an author.
What I write here are essays. I’m not sure what it means to bear similarities to (mostly) fiction authors.
I am mildly insulted by the HP Lovecraft: perhaps that post rambled and didn’t come to a point?
All in all, a nice bit of fun.
Update: It turns out I was on the right track regarding “keywords”. Here’s a report from the Huffington Post, which contains a few words from the author of I Write Like.