The trouble with stats

I’m not gonna lie to you: I love baseball statistics. I find them fascinating and sometimes extremely useful. I think that, in general, well-used statistics paint an accurate picture of history and can be reasonably useful predictors of the future. But in the wrong hands, they can be very dangerous. A few thoughts about the potential problems with stats:

Problem #1: The Wrong Stats

Anyone who has read much of what I have written knows that one of my biggest pet peeves is people using the wrong stats. People think the guy with the highest batting average is the best hitter in the league, when really he’s just the guy with the highest batting average. Pitchers win the Cy Young Award based on the number of wins they have in a season, rather than on any number of statistics that actually measure, to some degree, how good the pitcher is.

Back in 2003, when I was a college student, I walked into my apartment and stumbled upon a heated argument between my roommate and a neighbor. They were watching the MTV Music Awards, and Pink was performing. My roommate, Deck, was arguing that she was unattractive; my neighbor, Levi, took the position that she was hot. As I listened closely, though, I discovered that the gist of each man’s argument was: “She’s hardly wearing any clothes!” They were using the SAME evidence to prove two opposite points!

They say you can use stats to prove anything; like the debate between Deck and Levi, I’ve seen arguments where both sides were “proving” their points with stats. I don’t think everything is as ambiguous as the Pink Debate, though. While there’s technically no for-sure answer to whether Pink is hot, many (or most) statistical debates can be solved by looking at which side is using stats that actually, you know, mean something.

I can tell you, for a fact, that in 2005, Roger Clemens was a better pitcher than Dontrelle Willis, even though Willis finished ahead of Clemens in the Cy Young voting. You can “prove” that Willis was better by pointing out how many more victories he had, but I can actually prove that Clemens was better by focusing on other stats that more accurately measure a pitcher’s quality. Even if you don’t believe in fancy new stats like WAR (where Clemens had a 7.2 to 6.4 edge over Willis, and winner Chris Carpenter was way down at 4.8), an old-fashioned stat like ERA, which is definitely imperfect but also clearly MORE indicative of how well a pitcher pitched than W/L record, shows that Clemens was the better pitcher.

So the first key to responsible stat usage is to use the right stats.

Problem #2: Cause and Effect

Once upon a time, when LeBron James still played in Cleveland, I saw a scrolling message that said that the Cleveland Cavaliers had not lost a game in the 2008 postseason when LeBron scored 21 or more points. That stat, on its own, was a very accurate portrayal of history. If someone were to ask you if the Cavs won or lost a particular game, you can ask how many points LeBron scored, and if it’s 21 or more, you can safely say that they won. (Or at least, you COULD, until the Cavs lost that very night despite LeBron’s 25 points.)

But it’s a question of cause and effect. Did the Cavs win BECAUSE LeBron scored 21 or more points? Well, unless they won some of those games 21-20 (I checked; they didn’t), then his points don’t tell the whole story. The Dodgers have won every game they played in the last 24 hours that ended while I was sitting at my computer in my underwear, but you’ll never see that stat rolling across ESPN’s BottomLine.

(Funny side note: I actually started writing this post on March 30, 2008, and it was this LeBron “stat” that got me thinking about it. Now here we are, more than three years later, and I am finishing it up. But I didn’t have to change the Dodgers line, because I kid you not, right before I started in on this, I sat in my underwear at my computer and watched the Dodgers beat the Padres.)

Why does an understanding of cause and effect matter? Sometimes it doesn’t. But what if LeBron or his teammates heard that stat and actually took it to be meaningful? What if they spent the whole game feeding LeBron the ball, trying to get him to 21 points to guarantee victory? Is getting LeBron to 21 points actually the key to victory? No, and of course no one with a brain thinks it is. So why do we report these “stats” as if they mean anything?

Problem #3: Some stats are just terrible

I know, this is kind of a combination of points 1 and 2, but it deserves its own section. There was a time when pitcher wins were more meaningful than they are now, back when a starting pitcher would often throw every pitch for his team. There was also a time when win/loss record was the best indicator we had available to us. But I have a “but” for each of those:

1) BUT even when pitchers threw complete games most of the time, they still had little impact on the offensive side of the ball, so W/L was still an imperfect reflection of how well they pitched.

2) BUT the fact that W/L used to be the best indicator we had is a sign of how bad things were back then when it comes to actually being able to tell how good a pitcher is.

We have all sorts of stats now that tell us how good a pitcher is (WAR, FIP, xFIP, SIERA, etc. — if you aren’t familiar with these, go to FanGraphs.com’s library and read up). When someone defends win/loss record as a reflection of how well a guy pitched, they will often say things like, “He had to have pitched well to get that many wins.” Well guess what? You can look at OTHER stats that actually tell you how well he pitched.

For batters, runs scored and RBIs are my biggest pet peeves. Runs scored is a combination of two things: how well you got on base, and how well the guys behind you in the lineup hit. Only one of those is a reflection of your skills as a hitter, and there’s actually a stat that tells how often you get on base. It’s called on-base percentage. It’s been around for a million years.
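In case you have never seen it spelled out, here is a quick Python sketch of the standard on-base percentage formula. The function name is my own, and the stat line at the bottom is made up purely for illustration; it is not any real player's season.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: times on base divided by the plate appearances that count.

    OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    """
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    if chances == 0:
        return 0.0  # no qualifying plate appearances yet
    return times_on_base / chances


# Hypothetical stat line, invented for this example only.
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sac_flies=5)
print(f"OBP: {obp:.3f}")
```

Notice that nothing in there depends on what the guys behind you in the lineup did, which is the whole point.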

RBIs reflect three things: how well the guys in front of you got on base, how well you hit, and how many home runs you hit. The two of those that you control, yeah, there are stats for those. They’re not even new-fangled stats. Home runs have been a stat forever. Slugging percentage has been mainstream for decades. If you want to know how good a hitter is, look at his stats that reflect his skills and his alone.
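Slugging percentage is just as easy to compute as it is to look up. Here is a minimal Python sketch of the standard formula (total bases divided by at-bats). The function name and the stat line are hypothetical, chosen only to show the arithmetic.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Standard SLG: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    if at_bats == 0:
        return 0.0  # avoid dividing by zero for a hitter with no at-bats
    return total_bases / at_bats


# Hypothetical stat line, invented for this example only.
slg = slugging_percentage(singles=100, doubles=30, triples=5,
                          home_runs=15, at_bats=500)
print(f"SLG: {slg:.3f}")
```

Runners on base never show up in the inputs, so the number belongs to the hitter and nobody else.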

Problem #4: Descriptive vs. Predictive

I am not a sabermetrician. I want to be someday, but for now I think I just qualify as an appreciator of stats and sabermetric analysis. From this view as a respectful outsider, most of my criticisms are of the anti-saber crowd. But I have one criticism of the basement-dwellers, and it is an occasional failure to appreciate the difference between the time to tell what happened and the time to predict what will happen. (It’s not every sabermetrician, and it’s not ANY sabermetrician all of the time, as far as I can tell. But it happens more than I wish it did.)

Simply put, it’s the difference between a writer voting for the MVP and a general manager looking to make a big free agent signing in the offseason. And I think “clutch hitting” is a great example of a stat that is both overrated and underrated.

If we’re talking about clutch hitting, of course we will talk about David Ortiz, the poster child for clutch hitting. In 2005, Papi posted a 1.226 OPS with two outs and runners in scoring position and a 1.293 OPS in “late and close” situations, compared with his 1.001 overall OPS for the season. The fact is, he was a great clutch hitter that season.

Here’s where things get tricky. Stats people will sometimes say “clutch hitting is a myth.” A Google search of that phrase has 4,110 results. It gets said. But what people MEAN is that clutch hitting ABILITY is a myth. The myth is not that clutch hitting exists. It clearly does, since nearly every single game has a winning run scored, often driven in by another player. The myth is that a particular player, like David Ortiz, possesses some clutch-hitting ability that sets him apart from other players.

In 2007, in those same two clutch categories, Big Papi’s OPSes fell to .994 and .766, compared to his 1.066 overall OPS. In David Ortiz’s best offensive season, his statistics in clutch situations were worse than his overall stats. That’s because for all the myth and legend, there is nothing about David Ortiz that makes him a great clutch hitter. If you were to take those clutch situations for his entire career, his numbers would look eerily similar to his overall numbers. That’s because hitting is the actual skill, not clutch hitting.
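To put the two seasons side by side, here is a small Python sketch that subtracts Ortiz’s overall OPS from each clutch split, using only the figures quoted in this post. The dictionary layout and the function name are my own, purely for illustration.

```python
# OPS figures as quoted in this post; the structure is an assumption.
ORTIZ_OPS = {
    2005: {"overall": 1.001, "two_out_risp": 1.226, "late_close": 1.293},
    2007: {"overall": 1.066, "two_out_risp": 0.994, "late_close": 0.766},
}


def clutch_gaps(season_splits):
    """Return each clutch split minus the overall OPS for one season."""
    overall = season_splits["overall"]
    return {
        split: round(ops - overall, 3)
        for split, ops in season_splits.items()
        if split != "overall"
    }


for year, splits in ORTIZ_OPS.items():
    for split, gap in clutch_gaps(splits).items():
        print(f"{year} {split}: {gap:+.3f}")
```

The gaps are positive in 2005 and negative in 2007: same hitter, same situations, opposite sign.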

BUT … and this is a big but … that doesn’t always matter. If I am a GM and I sign Big Papi to a huge free agent contract based on his clutch hitting ability, I should be fired on the spot. But if I am a baseball writer tasked with voting for the AL MVP, I absolutely SHOULD take into account how well he hit in the clutch that season. Is it the only thing to look at? Of course not. But when you are looking at how much value a guy provided to his team, how well he hit with the game on the line is a pretty useful thing to know.

Clutch stats are descriptive stats. They don’t tell me a darn thing about what might happen in the future, and I would be foolish to base projections on them. But they tell me a lot about what happened in the past, and if my particular task is about the past, I would be foolish to ignore it.

So there you have it. My issues with statistics, which I love. I think Joe Posnanski, as usual, got it exactly right earlier today when he wrote about stats and stories: “I want to tell you one reason why I love baseball numbers. I love them because I believe advanced numbers can help us tell better stories.”
