I was going to tweet a question to Rob Neyer, who happens to be one of the most respected (by me) baseball writers in the world. He clearly loves and is passionate about baseball, which will surprise people who think stats have taken all the fun out of the game, because Rob is also one of the most intelligent baseball thinkers you will come across. And now you see why I didn’t just tweet this, because I’m already well over 140 characters and I haven’t even finished sucking up yet.
So anyway, as the baseball season wore down, Rob wrote periodic updates about the state of the National League Cy Young race. On September 27, he wrote his last one, which included this bit:
The real action happened Sunday, when both Clayton Kershaw and Roy Halladay pitched.
Of course, because they’re Clayton Kershaw and Roy Halladay, they both won.
How close are they?
Halladay’s 19-6 and Kershaw’s 21-5.
Halladay’s got a 2.35 ERA, Kershaw’s is 2.28.
Halladay pitched 233⅔ innings, Kershaw 233⅓.
Kershaw struck out 248 batters, Halladay “only” 220.
Halladay has one big edge: strikeout-to-walk ratio. While Kershaw’s command of the strike zone has been excellent, his 4.6 strikeout-to-walk ratio pales next to Halladay’s MLB-best 6.3 mark.
I majored in English in college, and one of the things I remember is the distinction between two kinds of dictionaries. Prescriptive dictionaries tell you how things should be; descriptive dictionaries tell you how things are. For example, some dictionaries now list, as an acceptable definition of “literally,” a definition that is actually the opposite: “in effect; in substance; very nearly; virtually.” This is because people have misused the word so much that descriptive dictionaries have included the misuse.
I believe that there are two kinds of stats, too, with similar names. The difference is that while we can all agree that descriptive dictionaries that tell me “literally” means “virtually” are dumb and wrong, the two kinds of stats are not good and bad; they are both good, but have different applications. I call them “descriptive stats” and “predictive stats.” Descriptive stats tell you what has happened; predictive stats help you guess what will happen in the future.
These days, there are a lot of really smart people thinking about stats, and they understandably focus mostly on the predictive stats, because one of the main points in being smart about baseball is to more accurately guess what might happen in the future. Of course, it’s all still just guessing, but there’s a lot more value in an educated guess than in a blind guess (or a guess based on bad education).
So there’s a stat called strikeout-to-walk ratio (K/BB), which is a wonderful predictive stat. With the disclaimer that I am not an expert, a sabermetrician, or anything like a smart guy, my understanding of the thinking on the subject is this: there are a lot of things a pitcher can’t control, but the three things that he can control more than others are strikeouts, walks, and home runs, because those three things remove the defense from the equation. So stats like FIP (Fielding-Independent Pitching) are popular and useful ways to see how well a pitcher does the things he can control. K/BB is a similar stat, with the thinking that if a guy is four times as good at the good thing (striking out hitters) as he is bad at the bad thing (walking hitters), that’s a good sign for the future.
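For anyone who wants to see the arithmetic, here is a minimal sketch of K/BB in Python. The function name `k_per_bb` and the 200-strikeout, 50-walk pitcher are made up purely for illustration; they are not anyone's real stat line.

```python
def k_per_bb(strikeouts: int, walks: int) -> float:
    """Strikeout-to-walk ratio: strikeouts divided by walks."""
    if walks == 0:
        # The ratio has no defined value for a pitcher who walked nobody.
        raise ValueError("K/BB is undefined with zero walks")
    return strikeouts / walks


# Hypothetical pitcher (illustrative numbers only): 200 strikeouts, 50 walks.
ratio = k_per_bb(200, 50)
print(ratio)  # 4.0, i.e. four strikeouts for every walk
```

A ratio of 4.0 is the "four times as good at the good thing" case described above.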
Let me be very clear: as a baseball fan, I want my team’s general manager to know about these types of stats, both pitching and offensive, and to use them intelligently. Predictive stats can help you predict how well a player will do in the future, because they more accurately portray a player’s underlying skillset. And that predictive power is very useful when figuring out, say, who to offer a contract to.
But when we’re talking about postseason awards, I think predictive stats should take a bit of a back seat to descriptive stats. K/BB doesn’t tell the whole story about how good a pitcher was in the past season. Let’s look at the Halladay/Kershaw comparison:
Just looking at K/BB, Halladay is clearly ahead. But what if we don’t care about the predictive value in the stats? What if we just care about how successful a pitcher was? At that point, with the season over, can’t we look at ALL baserunners, and not just the ones from walks?
When all is said and done, for something like the Cy Young Award, which is clearly based on this past season with no regard for how well either pitcher might pitch in the future, I think Kershaw’s 34 fewer hits allowed more than makes up for his 19 more walks. He had more strikeouts than baserunners allowed, which is insane. Kershaw’s 2011 is only the 18th season (by 12 different pitchers) where a starting pitcher allowed fewer than one baserunner per inning and had more than one strikeout per inning.
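The comparison can be checked with a few lines of Python, using only numbers quoted in this post (the innings and strikeout totals, plus the 34-hit and 19-walk gaps). This is a rough sketch, and it makes one assumption: it counts only hits and walks as baserunners and ignores hit batters and errors. The variable names are mine.

```python
from fractions import Fraction

# Innings pitched as exact fractions: 233 1/3 and 233 2/3.
kershaw_ip = Fraction(700, 3)
halladay_ip = Fraction(701, 3)

kershaw_k, halladay_k = 248, 220

# Strikeouts per inning for each pitcher.
kershaw_k_per_ip = kershaw_k / kershaw_ip
halladay_k_per_ip = halladay_k / halladay_ip

# Kershaw relative to Halladay: 34 fewer hits, 19 more walks.
# Assumption: baserunners = hits + walks only.
net_baserunner_diff = -34 + 19

print(f"Kershaw K per inning:  {float(kershaw_k_per_ip):.3f}")
print(f"Halladay K per inning: {float(halladay_k_per_ip):.3f}")
print(net_baserunner_diff)  # -15
```

Under that assumption, Kershaw put 15 fewer runners on base than Halladay in nearly identical innings, and he is the only one of the two above one strikeout per inning.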
I believe that Clayton Kershaw was the best pitcher in the National League in 2011. I admit that I am a Dodger fan and a Kershaw fan, and the fact that I am also a Halladay (and Cliff Lee) fan doesn’t eliminate the possibility that I am blinded and using whatever stats I can find to justify what I already believe. But I don’t think that’s the case. I think the descriptive stats tell us that, while Halladay and Lee had excellent seasons, Kershaw’s season was slightly better and that he deserves the Cy Young Award.