In many ways, Strangers on a Train is a much more satisfying work than Crime and Punishment. In broad strokes, both follow a guilt-wracked protagonist after he commits murder. Guy Haines is browbeaten into committing murder, which seemed a questionable plot point. But what struck me as eminently believable was the way Guy's mind grew distraught, even as his life continued apace.

It seems in vogue to write about murder as if any one of us could commit it. From my reading, Highsmith takes the opposite thesis: people who kill are a little bit off. Charles Bruno is the son of a rich man; he's indolent and insolent. He is a little too close to his mother, and he probably harbors homosexual tendencies (unremarkable now, but in the '50s it was). He certainly has a strong sense of the fantastical. He feels that Guy is the only person who can understand him and that they can escape together and recount their crime.
Guy and Bruno meet on a train. They talk, and Bruno senses some hint of tension in Guy. Guy has a wife whom he wishes to divorce (the very reason for the train trip), and Bruno has a father who apparently is an ogre. Bruno proposes what he considers the perfect crime, one that has since become a detective-genre cliché: Bruno would kill Miriam, Guy's soon-to-be ex-wife, and Guy would kill Bruno's father. It would be the perfect crime because the killers would have no obvious links to the victims; each murder would look random. Guy is disturbed and appalled by Bruno; I think he senses something off-kilter about him.

Needless to say, Bruno is crazy, and he decides to force matters and kills Miriam. Part of this might be because Bruno hates women. He says he hates his father because the father is an adulterer, but it turns out that Bruno's mother gives as good as she gets, with her own stable of men to toy with. As another hint that Bruno lives in his own head, he tells Guy that his mother is an example of the purity of women. Further evidence of Bruno's instability is that he cannot leave Guy alone. He drops hints to the detective who is following him. He involves himself in Guy's life. In other words, he establishes the very connections that make it much easier to link the two men.

What was interesting to me is how Highsmith handled Guy's slow descent into madness and his eventual turn to murder. It is as if Guy's motive is to shut Bruno up: not to avoid being framed for a murder he didn't commit, nor to indulge some animal urge, but to kill so that Bruno would stop bothering him. Guy is portrayed as a depressed individual. He can't take joy in his success. He is divorcing Miriam because she cheated on him, often; there is the element that Guy feels ill-used and played for a fool. He can't be happy with his new girlfriend. He cannot confide in her, certainly not the murder, but very little else either. It doesn't take much to disrupt Guy's life, because he is already on the edge. He couldn't be happy with his life before the murder, and he lets guilt take over after it. That is the one thing Guy can do extremely well: play the martyr.

But I thought the best part of the novel was how it slowly developed that others began to notice Guy's odd behavior. It was a neat trick to portray this subtly, as others begin to see that something is not quite right with Guy. This is especially true in how Guy's fiancée notices him shift from depression into something wilder.

In similar fashion, Highsmith's short stories, collected in The Selected Stories of Patricia Highsmith, show that she has little sympathy for humankind. Although one set of character sketches paints women in a terrible light (Little Tales of Misogyny), in truth no one comes off looking too sympathetic. Well, almost no one: the collection opens with a number of stories about animals that commit murder, and Highsmith portrays these murderers as eminently justified. Everyone else is selfish, ugly, and dark. We see murder committed in cold blood, as an afterthought, for the joy of it, and from negligence and indifference. It's impressive that, to my eyes, the stories are distinctive enough that they don't seem repetitive.

My favorite story is "The Romantic". It is about a secretary who gets stood up on a date. Eventually, she starts going on made-up dates, where she sits and enjoys her time at a bar, imagining the men she is waiting for. Knowing that these men will never show up, she feels liberated and happy. She comes to realize that she much prefers these pretend dates, so much so that when she is actually asked out, she stands her date up. Her imagination gives her more satisfaction than men (and perhaps even other companions). While it isn't quite the slamming of the door in Ibsen's A Doll's House, I think it is a strong statement to make: Fuck them; I don't need them.

Lev Grossman’s novel scares me. I have two boys, and I worry about their reading books like these. The magic doesn’t bother me. The sex and violence do not bother me. What bothers me is that Grossman never resolves the question of how Quentin deals with the dark void that is his heart.

For some reason, I read a lot of fantasy novels. In some ways, I suppose I like the pristine-wilderness ideal that is so prominent in these books. But I find plots involving prophecies especially compelling, for the same reason that puzzle books (Masonic mysteries, crime novels) interest me: I like the chase to figure out how the clues resolve the deeply buried secret. The Magicians has the mystery element, but the main point is how Quentin deals with the pointlessness of life.

Quentin is unhappy. He trudges to school. He is bright, but he is not as smart or popular as the two friends he hangs out with, one of whom is a girl he has an unrequited crush on. Quentin is depressed because "magic" is lacking from his life. Not real magic, necessarily. Life just isn't fanciful enough for him. It holds no wonder. He is jaded.

One day, he arrives at his college entrance interview with a Princeton alumnus only to find the man dead. On the way home, he is handed a folder with the tantalizing title of The Magicians, a heretofore unknown sixth book of a childhood favorite, the fictional Christopher Plover's Fillory series. This is probably a Harry Potter/Lord of the Rings type of series, a seminal book in the characters' lives. A slip of paper comes loose from the folder, and he chases it. He chases and chases and chases, until he winds up in front of a manor house. He senses something magical, follows his gut, gives in, and enters the manor. He sits down for a test. It is the strangest test, but as it turns out, he passes, by giving an extremely vibrant display of magic: pulling a sword from a stack of cards.

He faints, and when he awakes, he finds himself that much closer to the magical world described in his favorite fantasy novels, Christopher Plover's Fillory. There's a good bit of plot, but the one thing that grabbed my attention was Quentin's continued slide into despair. This book is a bit more serious than Harry Potter. Grossman does a great job of turning magic mundane, which makes sense: if one is born into this world, it is not in fact "magical" but somewhat... run of the mill. Magic isn't special to a sorcerer, and Grossman does his best to get the point across: in his world, magic is a learned skill. The power may in fact be particular to the student, but the actual invocation is work. That part was nice.

However, as Quentin’s time at school drags on, we realize that his heart remains empty. Even as he has entered this world of magic, he dares ask, “Is this all there is?” In fact, that’s the theme that is never resolved in this book. Part of the appeal of the fantasy genre is escapism. As if our lives would be better if somehow we could live in a simpler time – but with magic. I have never liked this aspect of fantasy. I do not like how writers treat this world (the real one!) as something so bereft of wonder that they seek retreat into a place with literal magic. This is especially apparent in a so-called urban fantasy such as The Magicians.

Grossman plays the trick out. It is clear that Quentin is a broken boy. He yearned for something greater than the mundane world, and he found it. But it isn't a matter of the inadequacy of a wish coming true; he just is not in a position to appreciate where he is or what he has. He understands it intellectually, but he doesn't feel it. In a way, the magic loses its luster once he realizes that it still doesn't answer the basic questions: what good is his magic, and what will he do when he graduates?

He finds a girlfriend whose parents are magicians. She tries to warn him about the purposeless lives magicians can fall into once they realize that their lives are not any better than a mundane's. After graduation, Quentin falls into a life of dissipation, exactly the scenario his girlfriend, Alice, wanted to avoid.

Until the characters find out that Fillory, one of the many magic worlds the students grew up reading about, is real. There is a whole second novel here, in which the characters journey to Fillory and save the land from the domination of an evil king. Again, what bothered me was that Quentin remained unhappy. Throughout the whole book, he gets everything he desired, and it is still not enough. This might be Grossman's indictment of fantasy worlds: if your heart is empty, no amount of magic will act as a salve.

At the same time, there is no resolution, because Quentin does not snap out of his funk until the climactic battle, when his estranged girlfriend sacrifices herself. Only after this dramatic event does Quentin come to terms with himself. But how often do we get this big reset event? That's what worries me about this book. Life isn't that fantastic; life is mundane and full of so-called little moments. My worry is that readers will draw the wrong lesson, waiting for the big dramatic event and avoiding the difficult work of coming to terms with oneself.

I started reading this book because it was about education. My wife and I have two boys, a five-year-old and a 16-month-old. I've been thinking a great deal about their education. My wife and I are both scientists. We feel that while this line of work is intellectually rewarding, the road is hard; to reach the top, one has to make sacrifices. We are more interested in making sure that the boys grow up to make a comfortable living.

I would consider myself a lifelong student. I spent 12 years in primary and secondary school, then four and a half years of collegiate learning (the extra semester came because I spent a yearlong exchange in Germany and decided to take some more courses to receive at least an International Studies minor). This was followed by 10 years of doctoral and post-doctoral training.

I have had the opportunity to learn in many settings. The modes of learning included both defined coursework and independent study. I did fine with both, although I think I had the advantage of being extremely interested in just about everything. So much so that on occasion I welcomed the structure imposed by instructors and their syllabi, which plotted out a course of study that I may not have bothered with on my own.

My graduate and post-doctoral work focused on olfaction, specifically on the physiology of olfaction as assessed using optical indicators of neural activity. I basically recorded responses in the brain's smell-processing pathways.

I didn't have any particular affinity for the sense of smell. I applied to a neuroscience graduate program because I wanted to understand how the brain worked. It mattered not one whit whether the system was smell, taste, hearing, vision, or touch. I had a general question that I wanted to answer, and the specifics did not matter to me.

More recently, and this has some bearing on the book by Daniel Wolff, I had to find a second post-doctoral position because I decided that my skills were not marketable. I guess you can argue that I failed to convince human resources that I could be productive for their company. However, it is also the case that biotechnology companies are not looking for a neurophysiologist who records in vivo neural responses using conventional microscopy. Instead, they look for electrophysiologists, scientists who do imaging in cell cultures, or scientists who do deep-tissue scanning using fMRI, CT, or PET.

Regardless, I couldn't make myself fit into their bucket, and they weren't willing to accommodate someone with my skill set who could possibly bring something unique to their company.

The point is that I consider myself a professional student. Since I started my new post-doc in flow cytometry, I have been learning new techniques and a new system, and getting to know the intimate lives of single cells. The strategy is simple: my boss has certain ideas he wants implemented, and he left it to me to work out the specifics. So I am currently identifying attributes of my technique (UV spectroscopy), understanding the life cycle of a cell, identifying subcellular organelles, and learning how to make them stick to slides so I can look at them under a microscope. Most of these things I can find in the published literature. The key thing is that I look for ways to combine my new technique (involving UV microscopy) with established ways of looking at cells.

In my previous post-doc, I had to determine the best way to preserve a "cranial window" through which I could look at brain responses, in the same animal, over a period of months. I adapted and extended previous work, and brought in some newer techniques, to accomplish this goal. I also established a method to mimic natural breathing patterns in anesthetized mice, which meant learning LabView and MatLab to write software to control various devices and to analyze data.

So yes, I have some experience with learning new things.

And for the life of me I cannot think of a so-called best way for my boys to learn.

Thus I became interested in books like How Lincoln Learned to Read. From my reading, the book confirmed a few things about how kids learn: namely, that it may not be clear until much later what exactly kids learned in school. Daniel Wolff took a snapshot of how 12 famous Americans were educated, selecting one child from each era and writing about the formative years of each. Of course, these people may have become famous despite, and not because of, their education. One could tell that all of the young people were driven: driven to achieve greatness, or driven to just do whatever it is that they became famous for.

In a way, formal education was not all that important for each child, though some did in fact thirst for knowledge for its own sake. Wolff shaped his descriptions in terms of both pragmatic outcome and post-hoc analysis: we know where they ended up, so it becomes a way for us to interpret and identify the steps that led the children to their destiny. What he also noted was that the kids had the balls to chase the learning they needed. There is certainly an individualist streak, strongly evident in Lincoln's and Henry Ford's backgrounds; both boys avoided farm work, and both were considered lazy for their time, given the era's emphasis on the importance of farm labor. But Lincoln and Ford were not layabouts. Lincoln read, and Ford tinkered. They learned what they wanted to know.

And even when the children were forced into limited opportunities for learning, as Thocmetony, Abigail Adams, and Sojourner Truth were, they did not let that education define them. They, as they should have, got what they needed from books or their teachers, or even from their everyday observations. They did not let the limits of their so-called education prevent them from achieving their ends.

Of all the figures, I felt the strongest affinity for W.E.B. Du Bois, especially the view he took, in his younger days, of education as uplift. A learned man is a rational man. The emphasis on books and abstract learning is the hallmark of civilization. Through reason and the sheer force of intellect and action, others cannot help but look past surface appearance to admire the man beneath. Education, simply, gives men and women choice, so they do not have to sell their work cheaply.

But the mass education system seems broken to me. I have thought for a long time now, and I hope this doesn't turn readers off, that education, and especially higher education, is not suited for everyone. I don't think I am being elitist; I just happen to think that higher education is as useful to most people as knowing carpentry is to a plumber. It might be handy to know, but one certainly doesn't require it to succeed in the job at hand. I will elaborate.

I've concluded that a university education is something that prepares students... to do research. For scientific research, the goal is to identify mechanisms underlying observed phenomena. The use may in fact not be obvious; this is more the knowledge-for-the-sake-of-knowledge mode. To me, it seems a destructive idea to use a university degree as a form of uplift. It cheapens education to the point that one thinks a degree is a commodity to be bought. This could not be further from the truth: to pay for an education should mean that one has decided that the resources of a university help expand one's knowledge, whether by working with specific machines and tools or with specific professors. This is active knowledge-seeking.

The alternative is to think of college as a paid-for experience, at the end of which one is conferred a piece of paper that acts as a passport to a job. In this mode, I can see how students and parents may impinge on faculty, outraged that an instructor dared to fail a lazy student. I detest students who blame their own lack of academic success on a teacher who did not engage their interest, when in fact all they have paid for is access. The rest is up to the student to provide.

And so I was left with this: each of the subjects Wolff discusses had the common attribute of being a presenter. These were all men and women who grew famous as politicians, orators, writers, or entrepreneurs. In a sense, each of these people excelled at internalizing and then espousing what they knew. Rachel Carson wrote about Romantic-era visions of her bucolic home, an ideal that was nowhere near the reality of growing up near a glue factory. Elvis Presley spent time in school, sure, but he certainly did not shy from joining quartets or cutting demos. Ben Franklin, Helen Keller, and Andrew Jackson learned a fair bit about finding popular topics to write or talk about.

This theme, whether Wolff intended it or not, has dovetailed with my own thinking about the type of education I want my sons to have. I think, most of all, they need to read proficiently. Not in the cheap post-modern way, where all language simply reflects one's preconceived notions and thoughts. No, I mean to read and to understand and internalize the writer's point of view, to engage him honestly. The second thing would be to then tell an audience what he understood, and how he extends or refutes the idea. I think this second point needs to be emphasized explicitly: by telling, one learns.

It is a bit of a cliché, in graduate school, that the best way to learn is to be forced to teach someone else. Only by actually thinking about the audience will one truly begin to understand. I have found the hard way that this is in fact true. One should aspire not merely to understand something, but to be able to help a second person understand what he now professes to know. I realize I have been implementing this in a soft way: I keep asking my older son questions, helping him develop details for his stories. I am always amazed and gratified when he can put together sentences with subclauses, declaring a proper sequence of events.

Part of it is just joy in hearing him talk, in seeing him learn. I am glad to see that there are precedents for such a type of education.


First, a digression (and I haven't even gotten to the official topic sentence for the post that pertains to the title!): I am currently reading The Numbers Game by Michael Blastland and Andrew Dilnot. The book is something in the mode of what I'd want to write myself: a guide for helping non-mathematicians, non-economists, and non-scientists (and perhaps those very people) deal with numbers. I've written in this blog (and commented a number of times on Dave Berri's Wages of Wins blog) about how sports fans and journalists misunderstand and misinterpret sports productivity measures. The greater theme is that I think there is a lack of perspective in how laymen incorporate scientific information into their own worldviews. The book I'd write would deal with this topic, and this is the book that Blastland and Dilnot wrote.

A lot of the book presents numbers within a context. Actually, Blastland and Dilnot exhort readers to develop and build the proper context around numbers in order to make them more manageable. This is especially salient in the opening chapters about large and small numbers. In some sense, a number like 800,000,000 might not be so large if it represents the amount overspent by Britain's National Health Service; assuming the budget for this agency is in the $80 billion range, the overspend amounts to about 1% of the budget. As another example, the well-memed "six degrees of separation" might imply that members of a peer group are only about five intermediaries away, but that number may as well be infinite if you are linked to the President of the United States by your knowing a neighbor who knows a councilman who knows the mayor who knows a state rep who knows the senator who knows the President. The impact of your linkage to the President, at a personal level, is clearly small.

At any rate, there is another chapter on "chance." The example Blastland and Dilnot use is cancer clusters. Most humans have some innate sense of how "random" ought to look. If one throws rice up in the air and lets it land on flat ground, one might imagine that some parts of the ground contain more rice than others. This is value-neutral, and no one disputes the appearance of random clusters; or rather, we do not think anything sinister lies behind them. But replace rice with cancer incidence, and the interpretation changes. No longer do humans accept that a cluster might just mean the chance co-occurrence of many events that result in a higher number of cancer patients. There must be some environmental cause that led to the cancer cases. Never mind that the number of cases may not take into account length of habitation (what if all the cancer cases were from people who moved recently into the town? The case for environmental factors falls apart), the types of cancers, or the genetic background of the patients.
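To see how readily pure chance produces clusters, here is a minimal sketch in Python (my addition, with invented numbers, not from the book): it scatters "rice grains" uniformly at random over a grid and counts how many land in each cell.

```python
import random
from collections import Counter

# Scatter grains uniformly at random over a 10x10 grid.
random.seed(1)
GRID = 10
GRAINS = 500

counts = Counter()
for _ in range(GRAINS):
    cell = (random.randrange(GRID), random.randrange(GRID))
    counts[cell] += 1

expected = GRAINS / GRID**2  # 5 grains per cell on average
print(f"expected grains per cell: {expected:.1f}")
print("densest cells:", counts.most_common(3))  # typically 2-3x the average
```

Every cell is equally likely, yet a few cells reliably end up two to three times denser than average. Swap grains for cancer cases, and those cells are the ones that make headlines.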

The specific example happens to involve a cell phone mast built in Wishaw, England. Citizens in the area, upon finding out they were in a "cancer cluster", were outraged and angry enough to knock down the mast. Obviously, the citizens keyed in on the mast as the cause of the cancer. Of course, the personal involvement of the townspeople tends to skew their perception, and a dispassionate observer might be needed to ask simply, "If the cell phone mast was responsible for cancer in this town, shouldn't all cell phone masts be at the center of cancer clusters?"

The reaction of the townspeople to the Wishaw cancer cases is illustrative of the same symptoms shown, in a less significant way, by sports fans and journalists who base their conclusions about athletic productivity on so-called observational "evidence" and not on controlled, rigorous studies. The dispassionate observer who asks whether all cell phone towers should be at the center of clusters would try to overlay the distribution of towers onto a map of cancer cases. He might slice the cancer cases further, trying to isolate cancers that have a higher likelihood of being caused by electromagnetic fields. He tries to address the hypothesis that cell phone towers cause cancer. The Wishaw denizens, in contrast, didn't bother to look past the idea that the towers caused their cancer. This highlights the difference between the so-called statistical approach and the eyeball approach to evaluating athletic performance. The first method is valid for an entire population of athletes, while the second may or may not be valid for even the few athletes used to make the observation. A huge part of science is making sure that the metrics being used are actual, useful indicators of the observed system.

This brings me to the Science review of The Trauma Myth. A key reason humans go wacky over cancer clusters and not rice clusters is that cancers are more personal; it becomes more difficult for humans to let go. Case in point: one of the criticisms leveled at investigators of the Wishaw cancer cluster was that they took away hope. I suppose what the critics meant was that the certainty of cause-and-effect was lost. The Trauma Myth sounds like an interesting book. It takes a view contrary to "conventional wisdom": Clancy provides evidence suggesting that young victims, at the time of their abuse by pedophiles, might not look upon the episode as traumatic, as they did not have enough experience to classify it as such. Of course the children were discomforted and hurt, but they did not quite understand what exactly was wrong. The problem wasn't in how the children felt; the problem would be if how they felt interfered with their coming forward to report the crime.

Clancy's book then aims to address how best to guide these abused children to come forward to report the crime and receive the help they need. Apparently, conventional wisdom suggests there is a single reaction after sexual abuse: trauma. Clancy might have oversold how much this affected the number of children who came forward, but the reviewer notes that it is entirely unfair to portray Clancy as somehow sympathetic to pedophiles.

And yet that is what Clancy is accused of. The inability to countenance criticism as constructive and useful is problematic, and it is not limited to laymen: even her colleagues have thought the worst about her work.

It is cynical, but I am glad my research is not in such an emotionally charged field. Of course, I have seen strong personalities argue over arcane points, and rather vehemently, but in no case could any researcher be accused of abetting pedophiles and murderers.

The obvious lesson here is that science gives voice to even the wildest of ideas. The objectivity that science enjoys is based exclusively on the gathering of evidence. That's it. The framing of the question, which methods to use, and how one draws conclusions are all subject to biases and politics. However, all scientists expect that once a method is selected, the data were in fact obtained exactly as stated and in the most complete and rigorous way possible. This is what allows one scientist to look on another's work and criticize it. The reviewer of The Trauma Myth noted that Clancy did not dwell on this idea, which is a shame. Intellectual honesty can often be at odds with political expediency or comfort. It seems that laymen and Clancy's colleagues would do well to focus on those subjects, however many or few of them, who had not reported these sexual assaults, regardless of whether Clancy is correct or not.

The reviewer noted that the main point of The Trauma Myth is that sex crimes are underreported, possibly because children were confused by the fact that they had not felt traumatized (and thus somehow thought they were not victims). I hope for their sakes that Clancy, her colleagues, and her current opponents can work to ensure that all victims of child abuse can come forward and obtain justice against the perpetrators.

This was a disappointing read, since I had wanted more details on Patterson's research and thoughts about the evidence that Tutankhamen was killed. While I wasn't expecting a work of historical scholarship, I did not anticipate that he was going to dramatize his interpretation of this slice of Egyptian history. This would have been fine, but I will be honest and admit that I wasn't in the mood for it, especially since the writing style is clipped, with a Dick-and-Jane cadence. I do not care for it.

There were two reasons I did not like the book. The first is that Patterson talks up his research. Due to the expository style, it was unclear how much research he had done compared to pure invention. I don't mean that Patterson did not get the dates and major events right, but since drama requires a bit more flavor, there are certainly liberties he took with constructing the details of Egyptian life. The dialogue is one example; the thoughts and motivations he ascribes to the pharaoh, the queen, and the court functionaries are another. Still, this wouldn't be so bad if the research leading to his thesis, that King Tut was murdered, weren't so weak.

The weakness of the evidence and the long build-up make up the second fault. As far as I could tell, Patterson calls this a homicide based on a cranial wound (as determined from CT scans of the mummy's skull), the elevation of three pharaohs from Tut's court, the small tomb, and the lack of hieroglyphic records of Tut. The fresh piece of evidence is in fact the head wound. The rest of the evidence had been known, and certainly the circumstances described do not rule out murder. The fact that all three subsequent rulers came from Tut's court is consistent with foul play: first Tut's wife/sister succeeded him, then his court advisor, then his general. Human ambition being what it is, one can construct all sorts of stories about Tut's wife and the court advisor. The lack of mention in the hieroglyphic record may be due to incompleteness in the record, although it could also be interpreted as the systematic obliteration of Tut's legacy. Burying Tut in a small tomb could indicate carelessness, or at least indifference, in how the pharaoh was laid to rest. But it might just mean that Tut was not liked. Or it could mean the murderer was merely going through the motions of the burial. But then why would the murderer line the small tomb with treasure? One might think the head wound would prove the crucial piece of Patterson's case, the one that tips the theory in favor of murder.

Yet Patterson, in his dramatization, documents the wound as stemming from a chariot fall. Hmm. And, during the assassination scene, the killer supposedly suffocates the pharaoh (fine, that was fiction; I suppose Patterson found it weird to have the killer strike the pharaoh on the exact spot injured in the fall, given there was only a single wound to the head). So the smashing new bit of insight wasn't even used to weave a consistent story regarding the murder of Tut. That I found strange. The lead-up to the supposed new piece of evidence did not pay off. That would be fine for any writer but Patterson: he is a writer of detective stories. Are his other books so poorly tied together?

Although I had been expecting something a bit more serious (it certainly makes for good copy for a detective story writer to do a bit of crime investigation), the fact that the historical tidbits were translated into a story didn't bother me in and of itself. Yes, there are issues concerning the provenance of each detail, but as a whole it works as one amateur's interpretation of how Egypt's ruling class lived. At some point, given the difficulty of translating hieroglyphics and the length of time separating us from the pharaohs, a scholar's educated reconstruction of how these Egyptians lived may not fare any better than what Patterson can invent based on his research.

There were also other, minor problems. Patterson wove three stories together: the story of the pharaohs, Patterson's modern-day research, and Howard Carter's excavations in Egypt and his finding of Tut's tomb. Patterson, on two occasions, wrote of Carter's removal from active excavation, and merely alluded to Carter's personality clashes with his superiors. But somehow, Patterson did not recount the details of the arguments that led to Carter's removal. He simply wrote that Carter was about to cross the wrong people... and left it at that.

So, the major problem was that Patterson played up the historical research he and his co-author performed. It may have been submerged into the background details of the pharaoh's story, but Patterson didn't describe in clear terms what new evidence he had, and the story he wrote differed in interpretation, not substance, from what was already known. And given the circumstantial evidence surrounding Tut's tomb and succession, it seems strange that no one before had posited that Tut was murdered, as Patterson seems to be suggesting.

What a strange book. Its whole point is to trash intellectuals who idealize the pursuit of freedom (in behavior, in intellectual pursuits, from society). Paul Johnson admits that it is unfair to use the private lives of individuals to judge the strength of their thought, but he nonetheless spends the entire book documenting the deficiencies of men who talked big and lived meanly. The quality of the men never matched the beauty of their vision, prose, or poetry.

The futility of such an exercise is evident early, in the chapter about Shelley. Johnson shows that this cad was a wastrel who had no compunction about writing mean letters detailing the failures of his parents while concurrently asking them for money. Shelley used people, seeing his family as nothing but a source of income and women as no more than a means to physical pleasure. Naturally, he thought himself liberal, dispensing with the archaic institution of monogamy. He expected his wife to accept his mistress into their apartment, though he graciously extended the same privilege to his wife (who apparently complained about the arrangement).

Regardless, all this is peripheral: Johnson thinks Shelley wrote beautifully, and Shelley's poetry moved him. Johnson writes,

The truth, however, is fundamentally different and to anyone who reveres Shelley as a poet (as I do) it is deeply disturbing. It emerges from a variety of sources, one of the most important of which is Shelley's own letters.

Great. But why should the gap between artistic accomplishment and the empty lives of artists be so surprising, in an age when starlets, athletes, politicians, authors, musicians, and entertainers behave as if they were competing for the favor of the Borgias? Johnson already conceded the point that he can appreciate the artistry, if not the artist.

There was one high point in the book, though: Johnson destroys Karl Marx on both a personal and a professional level. In this instance, it seems that elements of Marx's personality might have directly resulted in the shoddy intellectual quality of his work. Marx was a better short-form than long-form writer; the long form exposed his deficiencies as a researcher and investigator. Das Kapital contained a number of misuses of evidence. Marx did do a spectacular job of digging up dirt on his enemies, though.

In a coda, Johnson links 20th-century atrocities both to secular intellectuals ignoring atrocities committed in their name and to the social milieu they created that promoted nihilism (namely, the excesses of Communist regimes). It seems to me a simpler case that these mass murderers were ambitious, ruthless, and disposed to murder even before they encountered post-modern philosophy. As much as I detest social relativism, post-modernism, and religious dogma, I can't fault these ideas for causing mass atrocities. I can, however, fault the men who, upon gaining the power to commit atrocities, cloak their acts in the trappings of a recognizable philosophy. To suggest that terrorists or dictators valued life until they read a book seems to place the cart before the horse.

In the end, I do agree with Johnson that it is disappointing how rarely philosophers reach the ideals they espouse. So what else is new?

I read Bill Simmons's The Book of Basketball. I enjoyed it; it is a fun survey of NBA history. The book isn't just a numbers game or play breakdowns. It includes enough human interest that it should appeal to casual fans and even indifferent parties (like me; I can count the number of basketball games I've seen, on TV or live, on both hands). Simmons does a fantastic job of conveying his love of basketball. For me, he really brought the different basketball eras to life, inserting comments from players, coaches, and sportswriters. He also seems fairly astute in breaking down plays and describing the flow of the game.

Yes, I bought the book because I like Bill Simmons's writing. If you enjoy his blog, you will find the same breezy conversational style here. The man has a gift for dropping pop culture references and making them germane to his arguments. But what I like most is that he is earnest in trying to understand, and to make his readers appreciate, the people who play a game for a living.

His segment on Elgin Baylor was moving, in showing how racism affected this one man; in some ways, it was probably more effective than if he just talked in general terms about the 1960’s. His whole book works because it stays at the personal level. Even in his discussion of teams and individual players, he takes pains to discuss how this person was and is regarded by his peers and teammates.

In this way, I think Simmons did a fantastic job of making the case that basketball can contain as much historical perspective as baseball. This is something that should not have to be argued. Baseball has a lock on "the generational game by which history can be measured" status. What seems important is that there are human elements that make a game accessible between generations: fathers taking their sons to games, talking about the games and players, the excitement of watching breathtaking physical acts that expand how one views the human condition, and the joy and agony of championship wins and losses. While baseball's slow pace lends itself to the way history moves (periods where nothing seems to happen, punctuated by drama), it doesn't mean other sports happen in a vacuum. Style of play, the way the players are treated, and the composition of the player demographic all reflect the times. These games can be a reflection of society, and one can see the influence of racial injustice in something as mundane as box scores as integration occurred.

Simmons blends basketball performance, history, and social environment effectively; examples can be found in his discussions of Dr. J, Russell, Baylor, Kareem, and Jordan. In discussing why there probably won't be another Michael Jordan (or Hakeem, or Kevin McHale), he takes inventive routes. Most of his points relate to societal and basketball-environment pressures: players are drafted sooner, the high pay scale for draft picks lowers the motivation to prove their worth, and perhaps society itself would actively discourage players from behaving as competitively as Jordan did. I suppose it's interesting, but I'm not sure it matters so much if the player is perceived to be excellent. Regardless, it seems to me that Simmons has been thinking about these things for some time, and I found it fun to read his take on basketball.

And I liked this book because it gives the lie to the weird view that someone who hasn't done something cannot make reasonable, intelligent statements about it. Simmons wasn't a professional basketball player, but he certainly uses every resource available to absorb the history and characters populating the game. He read a fair bit, he watched and rewatched games, he talked to players, he talked to people who covered basketball, and he watched some more. And he isn't afraid to raise the issues that occur to readers; you'll see what I mean when you read his footnotes.

The book (and his podcast) confirms my opinion of Simmons as the smart friend who'd be a blast to have around (one who bleeds Celtics green, watches sports for a living, and must keep up with Hollywood gossip, gambling, and pop culture because they give him ammunition for columns).

***

There are some issues with the book, mainly in how statistical analysis of basketball is portrayed. I should be upfront and say that these issues did not detract from his arguments (for reasons that will become clear later), but I wish he would reconcile eyeball and statistical information. And because I've decided one focus of this blog should be how non-scientists deal with science (and scientists), I thought I should offer some thoughts on these issues.

I am somewhat undecided about how Simmons (and I suppose I am using him as a proxy for all "non-scientists") actually feels about statistics. He claims that team sports like basketball and football are fundamentally different from baseball: the team component of the former increases the number of additive and subtractive interactions, while the latter game is composed of individual units of performance. Thus the increase in complexity makes the game difficult to model. So he discards so-called simple measures of NBA player performance like WP48, PER, and adjusted plus-minus.

His rationale is that these indicators ought to back up existing observations about NBA players. So Kobe Bryant needs to be ranked as a top-20 player of all time (WP48 ranks Bryant as a superior player, like Paul Pierce, but not a step or two behind Michael Jordan). It seems he wants statistics to tell him what he wants to hear, when in fact statistics help you see things you otherwise wouldn't.

But then that leads to my second point about Simmons: why does he need the model to back up his mental model of player performance? Put differently, why can't he accept differences in rankings calculated by some turn-the-crank, spit-out-a-value model? I think Simmons lacks a nuanced view of how these numbers ought to be interpreted, and he refuses to see that a simple model can capture a great many things about a complex system. Sure, once you've set up your criteria (like some level of significance you are willing to accept), you align everything by them, but there is room for judgement as to where that line is drawn.

Another way of describing a complex system is to say that there are many things going on at once, all interacting in some way. There are 10 players on a basketball court. One player, with the ball, has options to pass, to shoot, or to move the ball. Within each of these options, he has a set of sub-options: which of the other four guys do I pass to? Who's open? Which open player has a good shot from where he is? Am I in my optimal position to shoot? Do I need to drive to the basket or kick the ball out to the perimeter? There are many more possibilities than these.

***

At one level, Simmons is right; it is useful to break things down into "hyperintelligent" stats: identifying the tendencies of players (whether a player likes breaking to his left or right when he starts driving from the top of the key, whether he is equally good shooting with his left or right hand, how often he does a turnaround, a fadeaway, or drives to the hoop), trying to figure out how many forced errors a defender creates, how often unforced turnovers happen (like someone dribbling off his foot), how many blocks get slapped out of bounds vs. tipped to gain possession, and so on.

But isn’t it just as intelligent to find an easy way of collapsing the complex game into a simple “x + y” formula? On several occasions, Simmons uses a short quote (and praises the person who said it) that captures everything he wanted to say in 15 pages. A simple model is analogous to that short quote.

More importantly, what if we didn’t need all these hyperintelligent stats to capture the essence of the game?

I just switched the problem from one of identifying player performance and productivity to one of capturing the game in broad strokes. The two ideas are of course related, but they are still distinct and should not be confused.

This gets back to the original motives of the person who does the modeling.

If it’s a scientist or economist, I’ll tell you now that he is interested in getting the most impact with the least amount of work. He probably has to teach, run a lab/research program, and write grants and publications. He doesn’t have time to break game film down. And he certainly does not have the money to hire someone to look at game film (although I am sure he’ll have no lack of applicants for the job.) He spends his money finding people to do research and teach. If his research program is into finding ways to measure worker productivity, he will probably start with existing resources. So fine; he now has a database of NBA player box scores.

He'll want to link these simple measures of player output to wins and losses. But players score points, not wins; thankfully, the difference between points scored and points given up correlates extremely well with wins and losses.

From there, it is relatively simple to do a linear regression across all players and teams, finding how each of the box score stats relates to the overall points scored for each team. And as noted, some metrics have a higher correlation to the point difference (I will not use the term differential to mean difference; differential belongs to diff EQs). Regardless, it seems an affliction of males that they rank things; so the researchers have these numbers, and it's trivial to list players from high to low.
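Here is a minimal sketch of that kind of regression, with fabricated team-season numbers; this illustrates the method, not Berri's actual model or data.

```python
import numpy as np

# Fabricated per-game team aggregates: points, rebounds, steals, turnovers.
box_scores = np.array([
    [102.1, 44.3, 8.1, 13.2],
    [ 98.4, 41.0, 7.4, 14.8],
    [105.7, 45.9, 9.0, 12.1],
    [ 99.9, 42.5, 7.9, 15.0],
    [107.3, 46.8, 8.6, 11.7],
])
# Fabricated point difference (points scored minus points given up).
point_diff = np.array([3.2, -2.5, 5.1, -1.8, 6.4])

# Ordinary least squares: point_diff ~ intercept + box-score stats.
X = np.column_stack([np.ones(len(box_scores)), box_scores])
coefs, *_ = np.linalg.lstsq(X, point_diff, rcond=None)
print("intercept and per-stat weights:", coefs.round(3))
```

With the fitted weights in hand, any player's box-score line collapses into a single number, which is all a "turn-the-crank" productivity metric is.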

Now, here's another consideration. In this, and in other branches of science, the data are not "clean". We scientists (generally) assume that the phenomenon we are observing conforms to a "normal" distribution: there is some true state for the thing we observe (estimated by taking the average of our observations), and the individual observations hover around this true state. So there is variation around the mean.
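A toy illustration of that idea, with invented numbers:

```python
import numpy as np

# Twenty noisy measurements of a quantity whose "true state" is 10.0.
rng = np.random.default_rng(42)
true_value = 10.0
measurements = true_value + rng.normal(0, 1.5, size=20)

print(f"mean of observations: {measurements.mean():.2f}")  # hovers near 10
print(f"spread (std dev):     {measurements.std(ddof=1):.2f}")  # near 1.5
```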

In my research, for example, I can measure neural responses in the olfactory bulb. I use optical indicators of neural activity; essentially, the olfactory bulb lights up with odor stimulation. The more the neurons respond, the brighter things get. The olfactory bulb is separated into these circular structures called glomeruli. Each glomerulus receives connections from the sensory neurons situated in the nose and the output neurons of the olfactory bulb (some other cells are also present, but they aren’t important for this story.)

When a smell is detected by humans (or animals and insects), what we mean is that some chemical from the odor source has been carried, through the air, into the nose and neurons become active (they fire “action potential spikes”). And the pattern of this activity, at the olfactory bulb, is quite similar – but not exactly the same – from animal to animal.

Sometimes, we see fewer responses to the same smell. Other times, we see a few more responses. Sometimes we see a different pattern from what we expect. Sometimes, we see no responses. This might happen once every 15 animals. Not a whole lot to take away from our general, broad stroke understanding of how this part of the brain processes smell information. In most cases, some of these things might be explained technically; the animal was in poor health, or our stimulus apparatus has a leak, or the smell compound is degraded. We know this because we can improve the signal by fixing the equipment or giving the animal a drug to clear up its nose (mucus secretion – snot! – is a problem).

And as a direct analogy to the WP48 vs. "hyperintelligent stats" problem, we find that a complex smell (composed of hundreds of different chemicals) may be "recreated" using just a few of those chemicals. There is good empirical evidence that this is the case: prepared food manufacturers and fragrance makers can mimic smells and flavors reasonably well. This is akin to capturing the essence of the smell (or sport) with a few simple chemicals (or box scores). And generally, we don't even need people to describe what they smell to figure this out (i.e., to break down game film to create detailed stats). We can simply make them answer a simple question: do these two things smell the same to you, yes or no? Thus "complex" brain processes and decision making can be boiled down into forced-choice test results. Do we lose information? Yes, but everyone realizes this is a start. As we know more, and as new technology becomes available, we can do more and ask more with less effort. Then we will be able to better use the information we have. As far as I know, most statheads have access only to box scores (although there is nothing to stop them from breaking down game film aside from time and money).

But that's the broad-strokes view. If we get into the details (that is, as if we started working with the "hyperintelligent" stat breakdowns), we find that of course there is more going on, and that the differences we see are not only technical issues. For example, the pattern of activity we see differs slightly from animal to animal, because the cells that form connections with the olfactory bulb do not hit the same spot. And even if we use a few chemicals to recreate a smell, the result is still different enough that humans generally can tell something is missing. So the other chemicals are in fact detected and contribute information that the brain uses to form the sensation of smell. And we know that the way neurons respond to a single chemical differs from how they respond to a mixture, confirming that additional information is in fact being transmitted.

The important point is that the simple model captures an important part, but not all, of the complex system. One problem that can occur with increasing the complexity of models is overfitting: the model becomes applicable to one small part of the system, rather than the whole. Even game-film breakdown hurts if it gives you so many options that you are back where you started. You'd probably avoid focusing on rare events and just concentrate on the things that happen often, which, again, is the point of a simple model.
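Here is a toy demonstration of the overfitting point, with fabricated data: a straight line and a degree-9 polynomial are fit to the same noisy points. The complex model matches the training data exactly and falls apart everywhere else.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = 2.0 * x + rng.normal(0, 0.2, size=x.size)  # true relationship is linear

simple = np.polyfit(x, y, 1)    # 2 parameters
complex_ = np.polyfit(x, y, 9)  # 10 parameters: one per data point

x_new = np.array([0.05, 0.55, 0.95])  # points between the training data
print("simple model: ", np.polyval(simple, x_new).round(2))
print("complex model:", np.polyval(complex_, x_new).round(2))
# The degree-9 fit swings wildly between training points; it describes
# one small part of the system, not the whole.
```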

The intense breakdown of game film to provide detailed portraits of player effectiveness could be combined with the broad-strokes analysis. A metric like WP48 can tell a coach where a player is deficient; the coach can then use the detailed breakdown to figure out why the player isn't rebounding, passing, or shooting well. That's where things like defensive pressure, help defense, and positional analysis can be used for further evaluation. And I'm not sure stat heads ever argued otherwise.

Deficiencies of statistical models

As in, the things that models explicitly ignore.

One thing statistical models do not address is the fan's enjoyment of a player. Actually, I suppose one might simply chart the percent capacity of stadiums when a particular player comes to town, but I don't think that's an argument Simmons would make. There's something to be said about how a player scores: Simmons pays tribute to Russell and Baylor, the first players to make basketball a vertical game. He cites Dr. J as introducing the urban playground style into basketball. He loves talking about the egos of players, especially when a player takes an MVP snub personally and then dominates the so-called MVP in a subsequent game.

Simmons also offers a rebuttal to PER, adjusted plus/minus, and "wages of wins" metrics in his ranking of Allen Iverson: he simply doesn't care. It's sufficient for him that Iverson is a presence on the court, that his emotions are acted out as basketball plays. Simmons finds Iverson's toughness and anger on the court fascinating to watch.

But Simmons does use metrics: the standard box scores. I would ask this: if Iverson didn't score as much as he did, would Simmons still care? As Berri has noted, the rankings by sportswriters, the salaries given to scorers, and PER rankings all correlate highly with volume scoring (i.e., total points, not field-goal percentage). Despite the tortured arguments writers might make, and the lip service given to building a lineup with complete players, "good" players are players who score a lot.

However, I should be clear: Simmons's approach does not detract from his defense of his rankings. He uses player and coach testimony, historical relevance, the visual appeal of playing styles, sportswriters, and the box scores to generate a living portrait of these players as people. Outside of the box scores, there is enough grist for the mill. I would suggest that it is these arguments that make the whole ranking process fun. Even in baseball, supposedly the sport with the most statistically validated models of player performance (and Berri would argue that basketball players and their contributions to team records are even more consistent), there are enough differences of opinion concerning impact, playing styles, and relevance to confound Hall of Fame/MVP arguments (see Joe Posnanski).

Because Simmons is upfront about his criteria (even if the judgement of each might not be as "objective" as a number), it is fine for him to weight non-statistical arguments for greatness. It's how he defined the game, just as Berri defined "player productivity" in terms of his WP48 metric. Because Berri publishes in peer-reviewed journals, he needs methods that are reproducible. Science, and the peer review process in general, is a different game from writing books or making Hall-of-Fame arguments or historical rankings. The implicit understanding of peer review is that the work is technically sound and reproducible. Berri cannot take the chance of publishing a Simmons-like set of criteria and having other sports economists "turn the crank" and come out with different rankings. But Berri can publish an algorithm, and proper implementation will yield the same results.

Does this mean that Berri is right? Or that a formula is better than Simmons's criteria? Mostly no. The one time it is "better" is when one is preparing the analysis for peer review. In that case, it is nicer to have a formula, or a process, or a set of instructions, that yields the same result each and every time the experiment is run. In other words, we try to remove our bias as much as possible. Bias here does not mean anything pernicious; it is just a catch-all term for how we think a certain way (with our own gut feelings about the validity of ideas and research directions). Being objective simply means we try to make sure that our interpretation conforms to the data, and that the work is good enough that other researchers come to the same general conclusions.

I think Simmons actually doesn't need to trash statistics, nor does he need to ignore them. Once he establishes ground rules, he can emphasize or deemphasize how important box scores are in his evaluation. As it is, I found his arguments compelling. His strength, again, is making basketball history an organic thing. He does his best to eliminate the "you had to be there" barrier and tries to place the players in the context of their time.

Now, one might ask why stats can't be used to resolve these arguments about all-time greats. Leaving aside the issue of different eras (frankly, this can be addressed by normalizing performance scores to the standard deviation for a given time period, as Berri does), there is the issue of what differences in these metrics mean. In the article I cited, Berri reports that the standard deviation for the performance of all power forwards, defined by his WP48 metric, is about 0.110. His average basketball player has a WP48 of 0.100. Kevin Garnett, for example, has a WP48 (2002-2003) of 0.443. That translates roughly to Garnett being more than four times as productive as an average player; normalized to the standard deviation, he is about three standard deviations above average.
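A quick check of that arithmetic, using only the WP48 numbers quoted above:

```python
avg, sd, garnett = 0.100, 0.110, 0.443  # WP48 figures quoted from Berri

ratio = garnett / avg     # raw productivity ratio: ~4.4x the average player
z = (garnett - avg) / sd  # distance from average in standard deviations: ~3.1

print(f"Garnett vs. average: {ratio:.1f}x")
print(f"z-score: {z:.1f} standard deviations above the mean")
```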

But how different is a power forward from Kevin Garnett if the other forward has a WP48 of 0.343? One might interpret this to mean that Garnett is still nearly 1 standard deviation better than the other player, but it could also mean that their performances fall within 1 standard deviation of each other. Depending on the variation of each player's performance for a given year, compared to his career mean, they could be statistically similar. That is, the difference might be accounted for by the “noise” of slight upticks and downticks in rebounds, assists, steals, turnovers, shooting percentages, and blocks. If you prefer, consider the difference between a .300 hitter and a .330 hitter. Over 500 at-bats, the .300 hitter has 150 hits and the .330 hitter has 165; the difference is 15 hits over the course of a season. Are the two hitters really that different? The answer depends on the variability of batting average (for the compared players) and on how the numbers look with a larger sample (over a career of 5000 at-bats, say). The context for the difference must be analyzed.
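
One crude way to put numbers on that intuition: if you model each at-bat as an independent coin flip (a simplification, to be sure), the standard deviation of a batting average shrinks with the square root of the number of at-bats. A minimal sketch:

```python
import math

def batting_avg_sd(p, n):
    """SD of a batting average over n at-bats, treating each
    at-bat as an independent Bernoulli trial with hit rate p."""
    return math.sqrt(p * (1 - p) / n)

season = batting_avg_sd(0.300, 500)    # ~0.020
career = batting_avg_sd(0.300, 5000)   # ~0.006

print(f"one season: {season:.3f}")  # .330 sits ~1.5 SDs from .300
print(f"career:     {career:.3f}")  # over 5000 ABs it is ~4.6 SDs away
```

Over a single season, the gap between a .300 and a .330 hitter is murky; over a career, it becomes hard to explain away as noise.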

Here's another example: let's assume that Simmons's list and Berri's metric turned out similar rosters, perhaps in a different order (one difference is that Iverson would be nowhere near Berri's top 96). And further, let us assume that the career WP48 scores are essentially within 1.5 standard deviations of one another. How might Simmons break with the WP48 rankings?

Let us tackle how Berri would have constructed his ranking: he would simply list players from highest to lowest WP48. That's probably because he is in peer-review article mode. And frankly, if you profess to have a metric, why would you throw it out? You might if, like Simmons, you defined the argument differently. For his Pyramid of Fame rankings, he lists several criteria that do not encompass basketball productivity. Again, historical relevance, player/coach testimony, and the style and flair of the players enter into Simmons's arguments. So all things being equal, and if the difference in rankings by metric is slight, there really is no reason for him to weigh the statistics more than any other attribute. Heck, even if the metric differences are large, it wouldn't matter. Simmons likes his other arguments more anyway.

But if you do talk about the actions on the court, then I believe you are in fact constrained. Of the metrics I mentioned, WP48 correlates highly with point differential and thus with win-loss records. Further, some of the other metrics actually correlate with points scored by players, suggesting that there is no difference between those metrics and simply looking at the aggregate point total. So there are models that do reasonably well in predicting and “explaining” the mechanics of how teams win and lose.

In a way, I think the power of a proper metric is not in ranking similarly “productive” players, but in identifying the surprisingly bad or good ones. Iverson is an example of the former; Josh Smith (of the 2009-2010 Hawks) of the latter. A metric is not as powerful a separator of players with similar scores, because their means essentially fall within 1 standard deviation of one another; in essence, they are statistically the same. In that case, it helps to have other information to aid evaluation (and this isn't easy; as Malcolm Gladwell has written, and Steven Pinker has taken issue with, some measuring sticks are less reliable than others).

Another example of where statistics are powerful is in determining, in the aggregate, whether player performance varies from year to year. Berri found that it doesn't, suggesting that the impact of coaching and teammate changes may not be as large as one thinks. Such a finding in no way precludes coaches and teammates from affecting individual players; it just means those cases are too few to move the mean. Or perhaps it suggests that coaches are not using information properly to make adjustments that meaningfully change player performance. Overall, I suppose, one reason Simmons hates advanced stats and rankings is that he isn't sensitive to the importance of the standard deviation, and, ironically enough, he applies the mean tyrannically when there is such a concept as statistical insignificance.

But Berri has never pushed his work as a full explanation of the game of basketball. First, he doesn't present in-game summaries: he only looks at averages over time. There's nothing in his stat to indicate the ups and downs (i.e. the standard deviation in performance) a player experiences from game to game. Even in baseball, hitting .333 does not guarantee a hit every third at-bat. It just means that over time, a hitter's streaks and lulls add up to some number that is a third of his at-bats. Berri's metric (like any other work that proposes to measure player performance) certainly cannot predict what the box score will be for a given player in a given game.

Regardless, I do not see a problem with Simmons ranking his players. Simply put, he values entertainment as much as production. I would say he values the swings in performance just as much, if not more (more on this later). Yes, he says stats do not matter, but of course they do. It's telling that the scoring lines he cites in admiration all lead with a high point total or points per game. And if you can't shoot, rebound, pass, steal, or block, and you cough the ball up a lot, it doesn't matter how pretty you make everything look.

No-no’s

Joe Posnanski has pointed out that whenever someone trashes stats, that person tends to offer some other, supplemental numbers to back up his point. In other words, the disagreement isn't about statistics per se, but about the distinction between “obvious” stats and “convoluted” stats.

Even if one disagrees with basketball statistics, one can at least believe that the statheads came up with a formula first and turned the crank before comparing the readout with their perceptions of players. Hence Simmons blowing up when PER or WP48 doesn't rank his favorites highly.

Simmons approaches this from the opposite direction. He has an outcome in mind and “builds” a stat/model to fit it (like his 42-Club). But he mistakes his way of tinkering for what modelers actually do. Berri arrived at his model by regressing box-score statistics against point differential and seeing which ones mattered. It isn't an arbitrary way of deriving some easy-to-use formulation. The regression coefficients are meaningful: they say that if you increase shooting percentage by this amount, the point differential goes up by that amount. It so happens that raw points scored by a player did not increase the point differential. And he built the model using all players; it's strange to decide beforehand which players are great and then build a metric around that. Why even bother in the first place?

And for Berri to report differently on these aggregate data because Kobe isn't ranked high enough would actually amount to scientific fraud. But as I noted above, applying these WP48 rankings isn't as hard-and-fast a process as Simmons thinks. There is some room for flexibility, depending on what one is trying to accomplish.

In general, I agree that more breakdowns of the game would be useful, in the sense that more data is always nice. The problem, for academics, is that these stats might remain proprietary, making them difficult to apply across all teams. Even if we could get all the “hyperintelligent” stat breakdowns from a single team, it is unclear whether other teams would view the breakdowns the same way. The utility for examining general questions about worker (i.e. player) productivity in academic publication becomes less clear. The databases ought to help the teams – assuming they are intellectually honest enough to verify that their stats produce a better picture of player productivity, and aren't just impressed by the gee-whiz-ness of it all. My guess is that they won't be entirely successful, as Simmons still has a job trashing bad GM decisions.

Standard Deviations

Why do I watch sports? It seems to be for the same reason Simmons does. He watches over a thousand hours of sports each year, waiting for the chance to see something he has never seen before. Something that stretches the imagination and the realm of human physical achievement.

I feel the same way. I am team- and sport-agnostic, and although I used to follow Boston Bruins hockey religiously, I left that behind in high school. Although I have lived in Boston from the age of 7 onwards, I was never infected by the Red Sox or Celtics bug (even during the Celtics' mid-80's run). I did root for the Red Sox in 2003 and 2004, but that was because of the immense drama of the playoff games against the Yankees. And Bill Simmons's blog for the season.

Perhaps I prove Simmons's point about stat-heads; I like to say that I am interested in sports in the abstract. I like the statistical analysis for the same reason Dave Berri points out in his books: there is a wealth of data to be mined. One good example of the type of research that can come from these data is the study finding evidence of racial bias in the way basketball referees call games.

However, what got me interested in watching professional sports was Simmons's writing about it. Although I didn't watch football, basketball, or baseball for a long time, I did watch the Olympics and, believe it or not, televised marathons. Partly it was because my wife and I were running, but mostly I saw the track-and-field-type sports as a wonderful spectacle. So it wasn't that much of a stretch to fall into a stereotypical male activity.

At any rate, I was amazed by Usain Bolt's performance in the 2008 Summer Olympics. I was disappointed when Paula Radcliffe injured herself during the Athens Olympics, and then relieved when she won the NYC marathon, setting a record in the process. I rooted for Lance Armstrong to win his seventh Tour. I rooted for the Patriots to get their perfect season. And until the Colts laid down and the Saints lost a couple of weeks ago, I wanted the Colts and the Saints to meet in the Super Bowl, both sporting 18-0 records. I was glad that the Yankees won the World Series, and with that fantasy-baseball lineup, I hope they continue to win. I want to see the best teams win, and win often. And yes, I wish the regular-season records lined up with the championship winners for a given season. Then we wouldn't have arguments about best regular-season records versus championship winners.

This isn't because I'm a bandwagon fan; I watch sports now for the same reason that Simmons does: to see the best of the best do great things. And sometimes to see the best fail, because a competitor wanted it more. This drama is the power of sports.

And I can see why Simmons argues so passionately against stats. He likes the visceral impact of sports. I can say that Bolt ran a 9.69 s 100 m. But the line is nothing compared to seeing Bolt accelerate, distance himself from the other runners, and then slow down as he cruised across the finish line. He blew away the competition. My eyes were wide and my mouth hung open: he slowed down! And he was 2 strides ahead of everybody. And he set a new record. Even if Bolt hadn't set the record, he still made it look easy. On the field, on that particular day, he outclassed his competitors. It is watching the struggle of the competitors (like Phelps winning the 100 m fly by 10 milliseconds), on that day, that matters. If one didn't watch that particular heat, then the line “World Record: Usain Bolt, 100 m, 9.69 s” doesn't hit you the same way.

But then, there is this. What if, instead of looking at the single race, you looked at the athlete performing in 8 or 20 or 50 events in a year? And at these events, the same set of athletes compete over and over?

Here are some possible outcomes: Phelps and Bolt lose all their other meets, essentially giving us a single transcendental moment each; Phelps and Bolt win half their meets; or Phelps and Bolt utterly dominate the field, winning 65% or more of their meets.

In the first case, we would probably admit that the Phelps and Bolt phenomenon was a one-off. For whatever reason, the contingencies (no sports gods or stars aligning here!) lined up such that they pulled off highly improbable feats (improbable, but not impossible; that distinction is the point of this section). The third case proves our point: they are not perfect, but they sure are good. The second case is trickier: since they are right on the borderline, we need some analysis to help us decide. One way might be to sum up our individual observations of the two; being .500 while giving us a single breathtaking moment might be persuasive. Or one might look at how everybody else did: Phelps and Bolt might have won only 50% of the time, but if the remainder is split among their competitors, they have still dominated the field.
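
That last comparison is easy to formalize. Here is a minimal sketch; the event count and field size are made-up numbers purely for illustration. Against an eight-man field, even a .500 record is wildly better than chance.

```python
from scipy.stats import binom

# Made-up numbers for illustration: 20 head-to-head events,
# 8 entrants each, so "chance" is winning 1/8 of the time.
n_events, p_chance = 20, 1.0 / 8.0

wins = 10  # the "win half their meets" scenario
# Probability of winning 10 or more of 20 events by luck alone:
p_tail = binom.sf(wins - 1, n_events, p_chance)
print(f"P(>= {wins} wins by chance) = {p_tail:.1e}")
# ~5e-05: a .500 athlete has still crushed an eight-man field.
```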

But then, what if Bolt and Phelps won 49% of the time, and some other competitor won 50% of the time? What then? Here, criteria are important. Most of the time, we say “better” meaning, well, that something is better; we aren't specific about what we mean by it.

In the book, Simmons ranks his top 96 players in a pyramid schematic. He is rather specific about what he wants in a player, and, as one expects, about the intangibles his basketball player should have (basically, basketball sense – i.e. The Secret – whether he made his teammates better, winnability, and whether you would choose him “if your life depended on this one guy winning you a title”). The evaluation of those intangibles, however, is not as precise as he'd like. The advantage is that one might be able to answer “why” questions. In some cases, Simmons seemingly ranked two players differently while making the same arguments for them (like the consistency of Tim Duncan and John Stockton: somehow, Stockton just rubbed Simmons the wrong way, while Duncan's consistency makes him the seventh-best player of all time). And his emphasis on projecting Bill Russell's game into the modern era suggested Russell should have ranked lower. On occasion, I was left with the feeling that the arguments did not match the ranking. From what he said about stat inflation and how Wilt didn't get The Secret, I thought Wilt would be ranked lower than sixth.

Dave Berri has the opposite problem: he has a mathematically defined metric, and when he says better or worse, he means this metric is higher or lower between the players being compared. He can further break the stat down to show where a player is good or deficient (whether shooting percentage, blocks, turnovers, fouls, steals, and assists are above or below average). He can tell you the hows, with his model spitting out a number that combines these different performance stats into a metric of productivity. But he simply ranks players numerically, without talking about how one might actually see these differences between players (and one might not be able to see them: it could be one more missed shot or one fewer rebound every couple of games).

I am amazed that Simmons cannot reconcile eyeball and statistical information. Just about every time Simmons bitches out scorers, he talks about how this player didn't get “The Secret.” It isn't about scoring; it's about having a complete game. It is about making the team better with the skills you have. To top it off, Simmons then says that point-getters are one-dimensional. You can't shy away from rebounds. It's great to have a few steals and blocks. Sure, not every athlete can do it all, and certainly not as prolifically as the superstars, but you can't avoid doing those things.

I'm sure Berri is nodding his head, agreeing with Simmons. Point-getting isn't the same as being an efficient shooter (at least average field-goal and free-throw percentages). And you certainly can't be below average in the other areas if you want to help your team.

But Berri generally writes about the average. Simmons focuses on the standard deviations. He doesn’t just care about the scoring line; he focuses on Achilles-wreaking-havoc-on-the-Trojans type of performances. He loves the stories of Jordan’s pathological competitiveness. In other words, Simmons lives for the outlier moments.

And I think that's the nutshell (to borrow a Simmons device, I could have said this 5,500 words ago and shortened this review). Simmons views the out-of-normal performance as transcendent, as the mark of players who wanted something more or had something to prove. He treats the extreme as significant; he attaches a back story to give the event meaning. That's fine. It's also fine when Berri (and the stat-heads) are constrained to treat outliers as noise (possibly) or as irrelevant to the general scope of the model, if what they want is a model of what usually happens and they are not concerned with doing the job of a GM and a coach for free. They have both defined the game they wish to play in.

I swear I never meant for this blog to focus so much on sports. But Dave Berri has a post that dovetails neatly with some thoughts I have regarding experts, expertise, and how the public should handle them. I think it can be interesting to approach science issues from the side, rather than head on. Specifically, three authors (Berri, Malcolm Gladwell, and Steven Pinker), all of whom I admire, have had a minor verbal tussle about the issue of expertise.

First, a digression. I was already planning to comment on the interface between experts and laymen. The original impulse came about because I had just finished reading Trust Us, We're Experts! by Sheldon Rampton and John Stauber. Like other books of this ilk, the authors spend many chapters recounting the failures of authority figures and the exploitation of those failings by people who follow the profit motive to an extreme degree. Although the title hints at a broadside against the arrogance of scientists, the book is really about the appropriation of the authority, rigor, and analysis of science to sell things. The targets are mainly PR companies and the corporations that hire them, with a few choice words for scientists who become corporate flacks.

The book was lacking in presentation, mostly because the authors avoided analyzing how one can tell good science from bad. The presentation leans on linkages between instances of corporate malfeasance; there is no data on how many companies engage PR firms this way. There is no analysis of the amount of research from company scientists versus independent ones. The authors focus on the motives of corporate employees, but somehow ignore the possibility of bias within the academy. There is no attempt to identify if and when corporate research can be solid. In broad brush strokes, then, chemists who discover compounds with therapeutic potential are suspect; the same people working in academia (and presumably not capitalizing on the finding financially) can be trusted.

This is actually a huge problem for the book. One of the techniques that Rampton and Stauber document is name-calling (good old-fashioned “going negative”; ironically enough, the PR firms would simply label all opposing research junk science) directed at research and scientists who publish findings contrary to whatever the corporations happen to be pushing. But by avoiding the central issue of identifying good and bad science, the two merely stitch together examples of corporate and public-relations collusion. Now, the evidence they present is good; they hoist PR and corporate employees by their own petards, quoting from interviews, articles written for PR workers, and internal memos. But the ultimate effect is that Rampton and Stauber tarnish corporate research simply because the scientists work for corporations. I believe this is a weak argument, and ultimately a useless one. Consider: what if two groups with different ideologies present contrary findings? If the so-called profit motive applies equally to both, or to neither, readers will have lost the major tool Rampton and Stauber push in this book. And as I will show, the situation is not always as stark as corporate shills versus academics, or creationists versus biologists. There is enough research of varied quality, published by honest actors, to cause plenty of head-scratching about how solid a given scientific finding is.

Let's be clear, though. Of course the follow-the-money strategy is straightforward and, I would think, more likely than not correct. But that cannot be the only analysis one does; if the thesis is that PR firms use name-calling as a major tactic in discrediting good, rational, scientific research, it seems bad form to use funding source as a way to argue that investigators funded by corporations do bad research. It's just another instance of name-calling. I expected more analysis, so that we could move past that.

And that's the unfortunate thing about a book like this. Why wouldn't I want a book that causes outrage? Why, in essence, am I asking for an intellectually “pure” book, one that deals with corporate strong-arm tactics in a more methodical, scientific way? Doesn't this smack of political posturing, where somehow the result matters less than the means – and no, I do not mean the ends justify the means. I am just pointing out that there might be multiple ways of doing something (like taking route A vs. B, or cutting costs by choosing between vendor C and vendor D). Workplace politics might elevate these mundane differences into managerial warfare. Why should I care what the politics are, so long as they lead to a desirable end result?

One problem with a book like Trust Us is that it appeals to emotion with rhetoric, without a corresponding appeal to logic. I think analytical rigor is important because it provides the tools for lasting impact. As written, the book (published in 2000) provides catchy examples of corporate malfeasance. The most basic motif is as follows: activists use studies that, for example, correlate lung cancer with smoking to drive legislation to decrease smoking. Corporations and interested parties attack by calling this bad science, by calling the researchers irresponsible, by calling the activists socialist control freaks who wish to moralize on what is really a matter of personal choice. They have a considerable war chest for this sort of thing. Frankly, if that's what Rampton and Stauber are worried about, then their focus should have been on the herd mentality of people, not the fact that PR firms use negative ads.

But that is only one weapon; the other is the recruitment or outright purchase of favorable scientific articles. An example would be the studies published by scientists working for tobacco companies, refuting the claims of independent investigators. Rampton and Stauber focus on simply pointing out that these favorable findings come from researchers who are paid by Philip Morris. That's nice, but how is this different from the name-calling Philip Morris engages in? The real issue is how one goes about identifying what bad research is.

They do throw a sop to analytical tools at the end of the book. The discussion is cursory; the focus again is on helping the reader dissociate the emotional rhetoric from the arguments (such as they are). The appeal is that the analysis is simple: just question the motives of the spokesmen and experts. Worst of all, their discussion of the difficulties of science gives the impression that the whole enterprise is a bit of a crapshoot anyway. They point out that peer review is a recent phenomenon, that grant disbursal depends upon critiques from competing scientists, and that the statistically significant differences reported are, more often than not, mundane rather than dramatic. Their discussion of p-values makes scientific conclusions sound like so much guesswork, rather than the end result of hard work. Day-to-day science isn't as bad as the pair portray it.

It is a real skill to take a broad question (“How does the brain work?”), break it down into a model (“Let us use the olfactory system as a brain-network lite”), identify a technique that can answer a specific question (“Is the intensity of a smell related to the amount of neural activity in the olfactory system? We expect to see more synaptic transmission from the primary neurons that detect smells”), do different experiments to get at this single question, analyze the data, and write up the results.

Forget the fact that different scientists have different abilities to ask and answer scientific questions; nature doesn't often give a clear answer. So yes, it is hard to arrive at conclusive statements. To confound the issue further, even good research can have flaws: unclear experimental design, incorrect analysis, and distressingly minor differences between control and test conditions. Which leads us to the question, what exactly does good research look like?

I am not going to answer that now; frankly, I can't. This blog will, eventually, attempt to deal with that very issue by presenting papers and research that I read, in addition to book reviews. But my point here is that Rampton and Stauber didn't address the issue either. The very end of the book is a populist appeal, one that emphasizes “common sense” over jargon and statistics. They even appeal to our civic duty: we should become more politically active and associate with (my term, not theirs) “lay-experts.” At some point, however, even well-informed non-scientists and non-experts must turn to experts for the original research. Rather than disregard that research, then, one must learn to parse the scientific literature and gain a comfort level with it.

It took a while, but we return to the Gladwell-Pinker-Berri flap. The setup is simple: Berri is a sports economist who specializes in creating models that predict athletic performance. He has tackled multi-player games (basketball and American football), which, presumably, would lead to complex models, or perhaps something computationally intractable. Surprisingly, he found that neither was the case. The important point this time is that he was able to show that where quarterbacks are selected in the NFL draft doesn't fit their subsequent performance (assessed using the QB Score metric developed by Berri and the economist Rob Simmons). Gladwell wrote an essay that presented the Berri and Simmons argument favorably. Pinker made a short comment refuting this, saying that QBs drafted high do perform better.

Both Pinker's review and Gladwell's response seemed snippy to me. What I found interesting was that while Pinker questioned Gladwell's ability as an analyst (while paying him the backhanded compliment that he is a gifted essayist, just not a researcher), Gladwell, in turn, questioned the background of Pinker's sources. I think Gladwell's highlighting the faults in the arguments was sufficient, as Pinker's sources are somewhat weak. It really wasn't necessary to impugn their backgrounds.

This is ironic, as Pinker raised some peripheral issues regarding Gladwell's suitability to review the research and observations of experts. Just as with Gladwell, I think Pinker gave a reasonable counter-argument to Gladwell's generally gung-ho, favorable presentation of his subjects. For example, there is a flip side to imperfect predictors: while they may not be useful for picking the most suitable candidates, they help remove the worst ones from the pool in a cost-effective way. That's an interesting point, and I think one “system” scientists could use to study it is... sports (because of the wealth of performance data).

There really is no need to trash an expositor just because he is a better essayist than a scientist, for instance. Isn’t Gladwell in fact an expert in conveying novel research to the public (and effectively)?

In this case, I think both the “expert” and the “lay person” gave a good accounting of their (intellectual) problems with each other. However, they both engaged in what amounted to look-at-the-source “analysis” (Pinker says Gladwell doesn't know what he writes about; Gladwell trashes Pinker's football sources for things they did that are unrelated to football). The only thing the ad hominem attacks achieved was to raise the blood pressure of both participants.

Strangely enough, I find myself writing again about Bill Simmons. I found his latest article interesting and well thought out, with conclusions generally supported by his arguments. So why am I writing? Simmons did a great job breaking down film and the problems with the type of statistics used. I took issue with his concluding that this “proves” the lack of predictive power of statistics, when I thought he should have concluded that he had used statistical and observational analysis correctly. Simmons missed a golden opportunity to show readers how to synthesize statistics and low-sample-number observations.

The setup: Week 10, Patriots at the Colts, 34-28. The Patriots had the ball on their own 28-yard line, 2 min 3 s left to play, and it was 4th-and-2. Belichick decided to go for the first down rather than punting. There might have been some issue with the ball being spotted in the wrong place, but essentially, the Colts stopped the Patriots. Turnover on downs. The Colts scored on the ensuing series, after dragging out the clock, and won the game by a point.

First, Simmons does what I like sports writers to do: combine on-the-field observation with the context of what one usually sees from football teams in the aggregate (i.e. some group analysis, which usually does mean statistical analysis). I happen to think his argument against going for it, on this specific play, is stronger than, for example, Joe Posnanski's and Gregg Easterbrook's posts about the statistical analyses that generally supported Belichick's decision. Simmons's arguments were stronger because he specifically placed his observation of the game, and of the Patriots' performance leading up to that last offensive call, in the context of the aggregate statistics. True to form, however, he followed this by trashing the statistical analysis, rather than concluding that he had properly evaluated a singular performance and identified how the Patriots deviated from the aggregate.

Simmons's argument is that most stat-heads used the wrong set of probabilities. Posnanski, Easterbrook, and Simmons all presented the statistical arguments that the Patriots had a greater chance of winning by going for the conversion than by punting. To be fair, the difference might have been slight; numerically, of course, one probability was higher than the other (Tim Graham of ESPN arriving at a 1.5% difference in win probability). Had Simmons focused on reconciling the statistical assumptions with how Belichick's play calling lowered the Patriots' chances of converting, I believe he would have provided a wonderful illustration of how one reconciles statistical estimates with actual events. Unfortunately, Simmons ignores the probability of winning, focuses on the probability of losing, and asserts that punting was the unequivocally correct call.
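
For readers who want the bookkeeping spelled out, here is a minimal sketch of the expected-win-probability calculation. Every number below is a placeholder of my own, roughly in line with the analyses circulating at the time, not a figure from any of the cited pieces:

```python
# Placeholder probabilities, for illustration only.
p_convert = 0.60         # assumed 4th-and-2 conversion rate
p_win_if_convert = 1.00  # converting all but ends the game
p_win_if_stopped = 0.47  # Colts still have to score from ~the 28
p_win_if_punt = 0.70     # Colts have to drive ~70 yards instead

p_win_go = p_convert * p_win_if_convert + (1 - p_convert) * p_win_if_stopped
print(f"go for it: {p_win_go:.2f}, punt: {p_win_if_punt:.2f}")
# -> 0.79 vs 0.70. Simmons's real argument is that bad play calling
# pushes p_convert down, which shrinks or reverses this gap.
```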

Simmons had a contrary opinion to Easterbrook's and Posnanski's on the punting issue, but all three found problems with Belichick's coaching in the minutes preceding the 4th-down conversion attempt. All three pointed out issues with game management (such as two timeouts called just to make sure the right players were on the field) and with play calling (rushing on first down, passing on the next two downs). That sequence suggested the decision to play out the fourth down rather than punt was spontaneous. Simmons broke this down nicely, noting that rushing on third down makes more sense if one is in fact going for a 4th-down conversion. Finally, the actual play on 4th down was atrocious, as the Patriots limited their options drastically by going with an empty backfield. In that formation there was no running option, and the Colts simply jammed Brady to hurry his throw. As it happens, he connected with Kevin Faulk, but short of the first down.

I don't think anything here contradicts the aggregate story (such as a greater-than-even chance of gaining 2 yards). The fact is, there was much circumstantial evidence that Belichick might have flubbed the play. After all, there are no guarantees; just because the average play nets 5 yards doesn't mean the players can stand around waiting for the refs to spot the ball up field. You need to select a play and then execute it. As the saying goes, that's why they play the game. The players still need to give their fullest effort.

What one should consider is how Belichick reduced the Patriots' chance of converting by using a bad strategy. And Simmons actually did this. He noted that the play was essentially a 2-point conversion attempt, as both offense and defense were lined up to attack and defend a short field (i.e. defending the end zone with the line of scrimmage at the 2-yard line). There seemed to be some confusion between the special teams and the offense, as it wasn't clear to the players whether they were punting or not, necessitating a timeout that could have been used later to challenge the Faulk bobble (see Posnanski's post). Simmons presented some stats showing that 2-point conversions have a lower success rate on the road (I have issues with Simmons's selective stat-picking, but that piece wasn't exactly a peer-reviewed article). He also thought it unreasonable to assume the Colts would roll back down the field and score with under 2 minutes to go and only one timeout (despite the fact that the Colts did exactly that on their preceding drive; presumably that was an aberration that wouldn't happen again. A stat would be nice here, comparing the distance and duration of an average NFL drive). The Colts also had an inexperienced, young receiver corps, which might have increased the Patriots' chances of stopping them after a punt.

So, even if the average 4th-down conversion rate in this situation is around 60%, the Patriots did not maximize their likelihood of success. The stat-heads, in essence, should have altered the assumptions of their calculations based on on-field observations from the last couple of minutes of the game. Maybe the Patriots should have punted.

There are also arguments against punting. Easterbrook focused on the specific offense/defense matchups in this particular game. He wrote that, on the previous possession, the Colts drove 79 yards in 1:40, without a timeout, for a touchdown, and noted that, to his eyes, the Patriots' defense seemed a step behind the Colts' offense. Also, the Patriots were playing against a weak secondary; as it happened, Brady and company rolled up 370 yards on the night. It seemed like they should have had a greater-than-league-average chance of converting the 4th down, and perhaps a slightly lower-than-average chance of defending ~70 yards had they punted, since they had just shown they could give up a long drive (although Simmons pointed out that the Patriots stopped the Colts in 5 of the last 7 defensive series in that game).

Again, the two arguments are whether the Patriots could stop Manning with under 2 minutes left, and whether Brady plus Faulk, Welker, and Moss could gain 2 yards. On the field, there are probably enough game-related distractions and observations for Belichick. As Posnanski said, there might have been a lot going on in Belichick's mind. It might have taken him until the last second to come to some conclusion about what to do on that fourth down. He probably knew, in general terms, the arguments above, but they might not have led to a clear-cut answer. He might have simply decided that there was a very good chance his QB would find a way to gain the 2 yards. Although I support Simmons's argument (and only because I think the win probability shades slightly toward punting once Simmons's modifications are taken into account), I'm not sure punting is a clear answer with so much time left on the clock, against a quarterback like Manning.

I think both the punt and no-punt observational arguments are valid. And the whole point of statistics is to help you weigh these alternatives against some benchmark (i.e. the league average). Where statistics actually detract from the analysis (to the non-statistician's mind) is when the likelihoods of a positive outcome for the considered alternatives are rather similar.

The two points here are that, 1) contrary to Simmons's claim that observations are somehow better, observation also led to two contradictory yet sound conclusions about the overall strategy, and 2) with the situation as stated, punting was still not a guarantee of a win (though punting would have become the better option as the time left to play decreased).

The problem with the former is that we have a tendency to shoehorn anecdotes into fitting the conclusions we want to draw. That's why having some statistics provides a context for evaluating single-sample observations. You can't do what Simmons did, which is to say that the aggregate is wrong because of the details of this situation (wrong play selection, no strategy leading up to the 4th-down attempt), just as you can't argue against the punt because a punt-return touchdown happened. In the aggregate, those things are aberrations. Even if Simmons's argument for punting was strong, it should only have nudged the win probability for punting above 50%, not to the certainty Simmons implies. You can't turn a 60% win probability into 100% just because you chose it. In the aggregate, both calls would yield a win more than half the time.

Some other criticisms of Simmons's piece: not all stats are created equal. An example of what not to do is Simmons's use of spurious stats, like how often a team scores 3 TDs in the 4th quarter, to bolster his point. Why limit it to the 4th quarter? Why not look at how often 3 TDs are scored in any quarter? Or why look only at 2-point conversion plays on the road? I know Simmons made a point about how this particular play was set up like one, but the proper comparison is still against all 2-yard attempts, or against all 2-point conversion plays. The problem is that he made no attempt to establish the validity of the stat in general before analyzing the breakdowns. In some regards, it might be simpler to prove the general case before the specific one. And it certainly helps to present all the splits, not just the ones that support your case.

Part of the issue with probability and statistics is that people do not have the luxury of the long run or of multiple trials. We only have this one trial. Which brings us to the asymmetry referred to in the title of this post. Models are one-way: one builds them by collecting multiple observations, but it is a mug's game to apply them to predict a specific event. Until something happens, it only might happen; the model is probabilistic, but the outcome is binary. That is part of the difficulty in accepting statistical models.

I thought the Simmons piece indicated that he did not separate the overall strategy from the details of its execution. As he is so fond of arguing, the details cannot be captured by a measure as simple as “conversion.” There are many ways of getting there: is a recovered fumble an ideal way of converting a 4th down? How about a penalty against the defense? Was it a 4th-and-inches grind forward? An 8-yard pass against a weak opponent? Did the coach rest the first-string defense in the fourth quarter, with the game well in hand? Here, though, the context was a Brady-plus-Welker-Faulk-Moss offense that had gained nearly 400 yards on the night. That is a detail Simmons did not dwell on. The players gave the Patriots a legitimate shot at converting the 4th down; it was the play calling from Belichick that failed them. I thought it was unfair of Simmons to trash the strategy based on the example of this particular play.

And to spread the criticism around a bit: I don't think it makes sense to never punt, as Easterbrook maintains (though he argues this from an aesthetic perspective). The contribution of any particular play to the overall win probability depends on the situation. It is the coach's job to identify the most significant factors in the aggregate (i.e. across NFL results) and then apply them to an analysis of how his particular offensive and defensive play calling maximizes the actual performance of his players.

Simmons missed a great opportunity to show how a proper analysis should be done. He could have supported the obvious point that, hey, to cash in on that 60% success rate, you need to treat this like a normal play in a scripted series, not like a 2-point conversion. He even said as much; another of his points is that Belichick did not treat the whole series like a four-down set, which would have enhanced the overall chance of success. Instead, he raised the metaphorical equivalent of the “blogger-in-Mom's-basement” attack against stat-heads: that they don't watch the games, and that watching the game would have told you what the correct strategy was. I don't think that was the case at all, as the contrary view can be derived from Easterbrook's assumptions.

I got to thinking about a difference between writers and commenters. One crucial difference is skill, naturally. But I am thinking of the emails that sportswriters such as Joe Posnanski, Dave Berri, Peter King, and Bill Simmons get. The best correspondence they publish tends to follow up on a thought, often giving an example of some tragedy the pundits had written about.

Considering this small and selective sample, I concluded that the main difference between lay writers and professionals is context. Professionals establish the context in which lay writers tend to work. That is, professional writers organize examples by their themes, while lay writers (i.e. commenters) write single examples. This leads, first, to the difference in length. The commenters provide an example or a vignette that refers to an established idea; I suppose one-graf bloggers tend to fall into this category, no matter how good the actual prose is. The professional writer will have developed the context for his main argument before using examples to emphasize his own point. While longer is not always better, developing ideas takes up space, which leads to longer pieces. It takes a bit of skill to compress ideas into a paragraph (try reading abstracts from science papers; the good ones make sense even to someone outside the field).

For now, I want to focus on the difference between a professional writer's and a scientist's mode of writing. At the level of sports punditry and analysis, there are the Joe Posnanskis and Bill Simmonses of the world, and there are popularizers of research, like Dave Berri. All three are wonderful writers for their fields, though I would rather read Posnanski and Simmons before Berri if considering only the literary aspects of their writing. Nevertheless, the main difference between the two modes is not the scope but the details that provide context for their pieces.

Recently, Posnanski wrote about his desire to adopt a baseball stat for his blog. He hinted at reasons for disliking OPS (simply, on-base percentage + slugging average) and presented an argument for his “hitting average.” That's all fine and good; readers of Dave Berri's blog and his book The Wages of Wins will note that Berri likewise tries to find statistical measures of athlete “productivity” that relate to point production and thus to wins. Now, here's the difference between Posnanski's and Berri's approaches. It certainly isn't scope, since both are ostensibly doing the same thing. Rather, Berri's approach is scientifically sound where Posnanski's isn't, despite Posnanski dealing with objective mathematical measures.

A caveat: I am not saying that Posnanski's stat or approach is wrong. Posnanski has made every attempt to say that what he is doing is more for aesthetic reasons than to find THE stat, the single model that explains MOST aspects of baseball. Again, I am merely considering their styles of presentation, which are partially limited by their scope and how they approach the details.

In any case, Posnanski details how stat-geek readers of his blog, led by Tom Tango, generated a new stat called “linear weights ratio.” Posnanski tests this stat by checking the rankings of a number of players; of course, there is some alignment with the more traditional advanced baseball stats. He also presents the formula for his hitting average, for readers to play with. Again, there's nothing intrinsically wrong with this; Posnanski isn't doing econometrics. If anything, he is doing a great service by getting various readers to think mathematically. But Posnanski doesn't provide a context for evaluating the new metric. Mainly, he doesn't compare it to established metrics. In contrast, Berri's approach is, in essence, scientific, since his arguments are constrained by the context of describing and comparing these metrics.

This context is the difference between a layman's approach and a scientist's approach. Berri did much the same thing as Posnanski suggests when researching basketball players' productivity. Berri looked at the linear regression of box-score stats – shooting percentage, rebounds, turnovers, and so forth – against team point differential. Based on these stats and the weights identified by the regression analysis, he generated a linear model. He placed this stat, Wins Produced, into context by first applying it to all NBA players through all years for which stats are available; he compared its correlation with efficiency differential to that of existing NBA statistical models; and he generated points of comparison for each NBA player against the “mean” player at his position. In this way, he was able to determine that his measure has a higher correlation with the efficiency differential (points scored minus points given up) than the other stats. He was also able to identify the main difference between his model and the others: the other models tend to use raw points scored, as opposed to the ratio of points scored to shots attempted.
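
As a toy sketch of that procedure (and only a sketch: the data below is randomly generated, and these are not Berri's actual model or weights), the workflow is a regression at the team level, after which the fitted coefficients become the weights for scoring individual stat lines:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_teams = 120  # e.g. 30 teams over 4 seasons, entirely made up

# Columns: shooting efficiency, rebounds, turnovers (standardized)
X = rng.normal(size=(n_teams, 3))
# Fake "true" relationship: efficiency and rebounds help, turnovers hurt
eff_diff = (4.0 * X[:, 0] + 2.0 * X[:, 1] - 3.0 * X[:, 2]
            + rng.normal(scale=2.0, size=n_teams))

model = LinearRegression().fit(X, eff_diff)
print("weights:", model.coef_)          # recovered coefficients
print("R^2:", model.score(X, eff_diff))

# A player's "productivity" is then his stat line dotted with the
# team-level weights -- the same crank turned for every player.
player_line = np.array([1.2, 0.5, -0.3])
print("player score:", player_line @ model.coef_)
```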

The weights Berri used are not arbitrary; he didn't pull them out of the air to emphasize some difference between NBA players that he thought should exist. Naturally, he might have dropped some measures from his model because their weights weren't high enough, but that's a different matter from “fine tuning” the weights. Regardless, the most important point is that he made a model from the aggregates that significantly correlated with efficiency differential before applying the model to the players. In this way, he created rankings of NBA player productivity that have generated some arguments in the sports-pundit community (for examples, see here, here, here and here.)

While the particulars aren't important, the conflict is illustrative of a scientific versus a more laid-back (though still potentially rigorous) analytical approach. Berri simply sets up a model, cranks out the numbers, and then organizes his views of the players by examining the stats. In the laid-back approach, one checks whether the stat feels properly associated with a player. Again, the latter approach is fine, within its domain. Sports writers are not scientists, nor do they control the purse strings of a sports team. Even within a sports franchise, one does not need to rely on statistics. As Berri notes, the stats comprise merely one component of NBA player evaluation. They are a shortcut for organizing players' performance. In no case do they substitute for ways of identifying why certain players are not rebounding, or not generating enough assists, or not reducing their turnovers.

In the Posnanski example, he presented a stat which correlates with runs scored in baseball. He didn't say whether this correlation is higher than that of other measures (such as OPS). This is a subtle point that is often missed. If the correlations of both measures are similar, then there really is no difference between them. One may involve a lot more numbers than the other, but most scientists would simply choose the one with fewer terms; it's probably easier to calculate, and the extra numbers add no value. I have seen people talk about complex stats as if complexity (lots of math squigglies) were somehow better or more correct. That is not the case.
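
The missing comparison is a one-liner once the data is lined up. A sketch, with made-up numbers standing in for real team-season data:

```python
import numpy as np

rng = np.random.default_rng(1)
runs = rng.normal(700, 60, size=30)                # fake team runs scored
ops_like = runs / 1000 + rng.normal(0, 0.02, 30)   # fake OPS-style stat
new_stat = runs / 1000 + rng.normal(0, 0.03, 30)   # fake candidate stat

r_ops = np.corrcoef(ops_like, runs)[0, 1]
r_new = np.corrcoef(new_stat, runs)[0, 1]
print(f"OPS-like vs runs: r = {r_ops:.3f}")
print(f"new stat vs runs: r = {r_new:.3f}")
# If the two r's are close, the more complicated stat buys you nothing.
```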

So, how does this relate to writing styles? Well, if laymen write in examples, and professional writers extract themes and trends from examples, then scientists try to extract ideas, themes, and trends that apply to all examples (ideally all; in general they try to capture data from a meaningful sample that is indicative of the whole population).

However, there is a limitation in the presentation of a scientific finding: the conclusions are bound by the premise of the hypothesis and by the methods and measures used. Thus Berri presents arguments about NBA players' productivity in terms of his measure (or other measures, if he's interested in comparing metrics). He is constrained by that, less so on his blog, but certainly in his peer-reviewed papers. As a matter of fact, Berri's blog tends to be a bit dry, breaking down a player's deficiencies by examining how low his shooting percentage, rebounds, assists, etc. are relative to the league or position average. Just as importantly, Berri suggests that the metric is best used as an entry point into proper player evaluation and development: it's shorthand for identifying players who might be improved. If, as Berri suggests, players don't change much from year to year, team to team, or coach to coach, it may be because no one has tailored a practice program to players based on this simple evaluation, or it may reflect the ceiling imposed by a player's talent. Aside from these straightforward analyses of why players have below-, above-, or near-average productivity, Berri doesn't write about how he might enjoy watching certain NBA players. I think this gives the unfair impression that he is a bloodless machine who doesn't know what a basketball looks like. His model does not account for the flair, style, or aesthetics that are probably the raison d'être for watching sports in the first place.

Sports writers like Simmons and Posnanski approach it from the aesthetic domain first. The assumption is that they have an eye for talent and style, and that this is applicable to how everyone else enjoys watching a player or game. I don't mean that they are interested in a so-called objective way to rank the entertainment or productive value of these players. I mean that they want to identify an essence of a player, one that can be applied without qualification or exception and is easily demonstrable, and they are frustrated that they can't always do so. The clearest example is in the way some describe and compare Kobe Bryant to Michael Jordan. Dave Berri can rank the two, not only in absolute terms but as standard deviations above the league average for their eras; in that comparison, not only is Jordan more “productive” than Kobe, he is nearly twice so. Simmons would argue that Kobe is the best there is now; he might be a cut below Jordan, but no player is closer.

One solution here is to recognize that there is a difference between the professional and the scientific presentation of ideas. Berri started from the metrics, whatever he might think about the players. Simmons cannot, or will not, separate the aesthetics and productivity of the players he enjoys watching. There is nothing wrong with either approach. The difference is that Berri's work translates easily into a scientific publication format: its details all concern finding some measure, defending that measure, identifying the advantages of using it, and discussing how it may be insufficient. In other words, Berri and other scientists are biased toward finding “measurables.” For better or for worse, the basic scientific hypothesis is “how much.” How much did this drug improve patient outcomes? How much did the tumor shrink? How much is a photon deflected from its path by a massive body? How many molecules of this do we have?

This isn't necessarily a reductionist approach; at its best, finding quantifiables is a way of creating a reference point so we can start to discuss things. Thus, the proper angle of attack against a scientist (i.e. Berri) is to identify and improve on his assumptions, find a different metric that gives a higher correlation, or improve his metric by finding terms that add value and enhance the correlation. In other words, scientific discussion is limited by the context of the methods, which acts as a framework for subsequent arguments.

The sports writers do not have this limitation. They can segue between stats and aesthetics. Like Simmons, they can also sprinkle in pop-culture references that actually advance the argument. But because they approach things from an aesthetic angle first, they tend to provide contexts based on motifs rather than metrics. This is what allows Simmons to focus on the literary spin of a piece, relating the NBA offseason to lines from the movie Almost Famous. It allows Posnanski to say that he wants a new stat because he doesn't like how OPS is pronounced “ops” and not “Oh-Pee-Ess.” There is a lot of room for literary flourish, which doesn't make the argument any more objective, but does make it much more enjoyable.

Interestingly enough (and ironically, since I haven't checked this in all cases), I think Simmons and Berri mostly emphasize the same attributes in their ideal basketball player: someone who shoots well (i.e. a high shooting percentage), scores a lot of points, makes passes for assists, doesn't cough the ball up, and grabs rebounds. Where they differ is in how they rank the so-called top players. Berri has noted that most conventional player evaluation centers on points scored (without regard to the number of misses a player piles up); he has noted that player rankings and player salaries track points scored with a correlation of 0.99. And strangely enough, Berri's work showed that scoring points, by itself, does not lead to higher efficiency differentials. Despite what writers and general managers profess about wanting complete basketball players, they put their money on the point-getters. In other words, for all the verbiage devoted to arguing how smooth and graceful players are, and how much one should enjoy their talent before it fades with age, “aesthetics” and “points” turn out to be no different. It is telling, as Berri notes, that an implicit metric (points) may be doing the work behind the supposedly explicit appreciation of a player's style, gracefulness, and aesthetics.