For the past few weeks, I have been mesmerized by the wormhole that is BABIP. In the confounding and fascinating world of baseball statistics, BABIP is an outsider. It stands against a sea of more powerful statistics, most aimed at assigning a numerical value to the product on the field. Compared to statistics like WAR, wRC and UZR, BABIP is the ugly stepchild. A part of the numerical family? Yes. But the potential for its use is limited, if not already reached.
BABIP is an acronym for Batting Average on Balls In Play. The number tells how often a ball that is hit into the field of play (i.e., not a foul ball, and not a home run either, since no fielder can make a play on it) results in a base hit versus any other result. The magic number to know regarding BABIP is .300. The league average BABIP hovers around .300 every year with little fluctuation. So when a batter (or pitcher) deviates in any extreme way in a given season from the .300 average, it is assumed that the player was simply lucky that year and will soon regress to the league average.
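For the curious, here is a minimal sketch of the calculation, assuming the commonly used formula: hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The function name and the stat line are made up for illustration; they are not any real player's numbers.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play, using the commonly cited formula.

    Home runs and strikeouts are removed because no fielder had a chance
    to make a play; sacrifice flies are added back because the ball was
    put in play even though no at-bat was charged.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, purely for illustration.
example = babip(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")
```

Anything well above or below .300 from a calculation like this is what usually gets labeled luck.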
Easy enough, right?
Well, my odyssey aimed at understanding the stat was prompted by a different number: .404. In Joey Votto’s injury-plagued 2012 season—one that saw a drop in his traditional power numbers—he had a BABIP of .404, more than 100 points higher than the league average. Maybe that doesn’t seem particularly outrageous, but here’s an idea of how rare the feat is.
Out of the 11,646 individual seasons since 1920 in which a batter had at least 450 plate appearances, only 13 have ended with a BABIP of more than .400. Thirteen. Out of 11,646. For a reference point, a pitcher is more than five times as likely to record a sub-2.00 ERA (.57 percent chance) as a batter is to record a .400 BABIP season (.11 percent).
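If you want to check the math, the arithmetic is short. This sketch uses only the counts quoted above; the .57 percent figure for sub-2.00 ERA seasons is taken from the text as given, not recomputed, and the variable names are mine.

```python
# Counts quoted in the article.
qualifying_seasons = 11_646   # batter seasons with 450+ plate appearances since 1920
babip_400_seasons = 13        # seasons with a BABIP above .400

# Share of qualifying seasons that cleared .400, as a percentage.
babip_400_pct = 100 * babip_400_seasons / qualifying_seasons

# Sub-2.00 ERA rate, taken as quoted in the article (not derived here).
sub_2_era_pct = 0.57

# How many times more common the sub-2.00 ERA season is.
ratio = sub_2_era_pct / babip_400_pct

print(f"{babip_400_pct:.2f} percent of seasons")
print(f"{ratio:.1f} times as likely")
```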
But it wasn’t until I saw the other names of hitters that reached the mythical .400+ BABIP mark that I was really intrigued. Here’s a sampling: Ruth. Cobb. Hornsby (the only one to do it twice). Carew. Clemente. Manny.
Altogether the list consists of 12 names, nine of whom are enshrined in Cooperstown, and another in Ramirez who should be one day. (Though the likelihood of that happening seems low at this point, given the track record of others implicated in PED usage.) The only real outlier is Jose Hernandez, a lifetime .252 hitter who translated his one outrageous BABIP season into a career-best batting average of .288 in 2002. Let’s chalk that up to dumb luck. (Though he did have a career-best line drive percentage and home run/fly ball ratio, for what it’s worth.)
But the others, including Votto, are regarded as some of the best hitters of their time. So my question was this: how is it that 11 of the 12 players who have achieved this statistical anomaly, which is almost always attributed to “good luck,” happen to be Hall of Fame-caliber hitters?
This can’t be a coincidence, right? There has to be something that these great hitters understood about pitching tendencies, or a shift they made to their swing, or an altered stance—something that contributed to these outrageous numbers.
But alas, pitch-by-pitch numbers and film analysis don’t exist for Babe, Clemente, or anyone on the list other than Hernandez and Votto. (Fangraphs pitch-by-pitch numbers only go back to 2002.) So while analyzing the statistics that are available for players a few generations back could provide a sliver of an explanation, it would be nothing more than a semi-educated guess. And comparing Votto’s numbers to Hernandez’s does not make much sense either—the one season of .400 BABIP is about the only thing their careers have in common.
But I was not going to let that deter me. I needed to have some explanation other than luck for Votto’s absurd numbers.
So for two weeks, I traversed the labyrinth of statistical data that Fangraphs offers and scoured a few other sites as well, hoping that with a mix of good fortune and perseverance and more good fortune, I could crack the code John Nash-style and eliminate luck as a viable reason for the anomaly. I holed myself up in a shed, consulted my old college buddy Charles, and quickly developed a few theories.
My first theory focused on the idea of plate discipline. It seemed logical to conclude that if Votto only swung at balls in the strike zone, he would be more likely to make solid contact. Votto’s Outside-the-Strike-Zone Swing percentage (O-Swing%) was the lowest of his career at 21.2 percent, nearly 10 percentage points below league average. (Translation: he was not chasing as many pitches.) Maybe I was onto something. Urban legends linger about Ted Williams and his ability to call out the number written on baseballs in mid-flight before hitting them. Maybe Votto had developed these same Jedi-mind cognition skills.
But this theory was quickly thwarted when I realized that Votto’s Zone Swing percentage (Z-Swing%) actually dropped to slightly below league average. So while he was swinging less at bad pitches, he was also swinging less at good pitches. (In fact, Votto swinging less explains his increased walk percentage (BB%), but it’s not particularly useful for what I was searching for.) I needed to focus on statistics more pertinent to batted balls.
Let’s call this second theory the solid contact theory. My hypothesis was this: perhaps Votto was just hitting the ball harder, thus making his batted balls more difficult to field, which translated into a higher BABIP. To test it, I set aside the plate discipline statistics. It made more sense to attack this BABIP conundrum using batted ball statistics, which correlate most directly with BABIP.
I ultimately decided to focus on three main areas: line drive, ground ball, and fly ball percentages. Each batted ball falls into one of these three categories. As you may expect, line drives are the most valuable types of hits, followed by fly balls and ground balls. The fine folks at Fangraphs have assigned a “run per out value” to each type of hit:
TYPE OF HIT | RUNS PER OUT
It makes sense to assume that line drives are the most valuable type of hit, but to see just how valuable they are blew me away. A line drive hit (and remember, in this case, the word hit is synonymous with batted ball) is worth more than one run per out. So if a player like Votto consistently hits line drives, he is statistically likely to produce more runs than outs.
Let’s allow that fact to really marinate for a second. More runs than outs? So, in theory, an elite squadron of line-drive hitting robots would produce more than 27 runs per game. That’s a lot, in my expert opinion. It seems, then, that a premium should be placed on the ability to consistently smack line drives rather than hitting the long ball or simply making contact.
Of course, batters are not line-drive hitting robots. For most professional hitters, it is really, really difficult to hit line drives. Few hitters have the capability to determine how they are going to attack each pitch. Only a select few, such as Pujols and Cabrera or the aforementioned Hornsby and Clemente, can adjust their swing to the pitch thrown. Most are set in their swing and hope that it proves successful against whatever pitch is offered.
But not Joey Votto. Votto is, by all accounts, a hitting savant. His calculated approach to each at bat has been detailed by various outlets. In the batter’s box, Votto is surgical. He accounts for every possible scenario, recalculating his approach after each pitch. Unlike many, he actively thinks about grip, angle, ball rotation, and myriad other bits of information typically reserved for analytics-starved scouts, general managers, and analysts. It would make sense that he, much more so than other hitters, would be aware of the statistical value of line drives.
The numbers reinforce that he has embraced hitting liners. In 2010, a season that saw Votto take home National League MVP honors, his LD% was just 20.0, less than two percent above league average and a 3.5 percent drop from his 2009 total. In 2011, his LD% increased to 27.5 percent. By 2012, he was at 30.2 percent, the highest mark in the league among those with 450+ plate appearances. (Note: Votto’s 2012 numbers technically didn’t qualify among league leaders. To qualify, a player must have 502 plate appearances. Votto had 475 in 2012 due to injury.)
I was finally on to something. Perhaps it was Votto’s altered approach, which placed more emphasis on line drives than ground balls, that explained his increased BABIP.
But I needed to find a number to back that up, some figure that accounted for the importance of LD%.
It was then that I found xBABIP, or expected BABIP. It’s an advanced statistic that estimates what a hitter’s BABIP should be given his batted ball profile, which shows how much of his deviation from league average can be attributed to his approach rather than to luck. (If you’re really curious about the formula, here it is:
xBABIP = 0.392 + (LD% * 0.287709436) + ((GB% - (GB% * IFH%)) * -0.152) + ((FB% - (FB% * HR/FB%) - (FB% * IFFB%)) * -0.188) + ((IFFB% * FB%) * -0.835) + ((IFH% * GB%) * 0.500))
Easy enough, right?
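For anyone who would rather not do that by hand, here is a rough sketch of the formula above as a function. It assumes every rate is entered as a fraction (0.302 rather than 30.2), which is my reading of the coefficients and not something the formula states. The function name and the sample inputs are hypothetical and are not Votto’s actual numbers.

```python
def xbabip(ld, gb, fb, ifh, hr_fb, iffb):
    """Expected BABIP from batted ball rates, per the formula quoted above.

    All arguments are assumed to be fractions between 0 and 1:
      ld, gb, fb  - line drive, ground ball and fly ball rates
      ifh         - infield hit rate (share of ground balls)
      hr_fb       - home runs per fly ball
      iffb        - infield fly ball rate (share of fly balls)
    """
    return (
        0.392
        + ld * 0.287709436                        # line drives raise the estimate
        + (gb - gb * ifh) * -0.152                # ground balls that were not infield hits
        + (fb - fb * hr_fb - fb * iffb) * -0.188  # fly balls that were not homers or pop-ups
        + (iffb * fb) * -0.835                    # pop-ups are nearly automatic outs
        + (ifh * gb) * 0.500                      # infield hits
    )


# Hypothetical batted ball profile, for illustration only.
estimate = xbabip(ld=0.20, gb=0.44, fb=0.36, ifh=0.065, hr_fb=0.10, iffb=0.10)
print(f"{estimate:.3f}")
```

Holding everything else fixed, a higher line drive rate is the only batted ball input with a large positive weight, which is why LD% matters so much here.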
Fortunately there are spreadsheets available that calculate an individual’s xBABIP versus his BABIP. This idea focuses on determining an individual’s expected BABIP, taking into account their batted ball statistics. Only when xBABIP is determined does BABIP become really useful. I thought it best to chart out Votto’s career numbers to see if his progression as a hitter, and his 2012 BABIP, could be explained by a combination of BABIP, xBABIP, and the factors accounted for in xBABIP: LD%, FB%, and GB%. (For good measure, I included his BB%—an unrelated statistic numerically, but still useful when looking at his overall approach.) This is what I came up with.
(For league averages over this stretch, click here.)
As you can see, Votto’s first two seasons were all over the place. His FB% and GB% rose and fell dramatically while his LD% began a decline.
But something happened between 2010 and 2011. Votto’s LD% increased while his GB% and FB% decreased. That type of transition does not happen by accident. This Fangraphs piece hints at a less risky, more level swing coupled with a concerted effort to hit to the opposite field, among other factors. (In addition, he became increasingly selective at the plate and saw his BB% continue to rise.)
The fruit of whatever mechanical changes Votto made can be seen in the chart. The numbers paint a picture of a hitter who has become remarkably efficient. He is attempting to limit his ground balls and fly balls in favor of line drives, a far more valuable type of hit.
Nevertheless, there is one more lingering factor that can make Joey Votto perhaps the most dangerous hitter in baseball. On top of this cerebral understanding of the value of each type of hit, Votto still has power. Fans may point to his decreased home run numbers as evidence to the contrary, but I’m proposing that the power is not gone; it has merely shifted. Instead of throwing his weight behind fly balls in hopes of going yard, Votto has thrown that heft behind line drives, valuing gap shots laced to the outfield wall instead of deep flies. Even with just 14 home runs in 2012 (and none after returning from his knee injury), Votto still managed a top-five slugging percentage (.567). And consider this: in 2010, when Votto belted 37 home runs and totaled 114 RBI, he had a 172 wRC+. (wRC+ stands for weighted runs created plus and is arguably the most accurate calculation of a player’s offensive value.) That year, Votto created 72 percent more runs than league average. In 2012, when Votto’s power numbers supposedly slipped, his wRC+ was 177. To give you some context, Triple Crown winner Miguel Cabrera’s wRC+ was 166. Mike Trout also had a wRC+ of 166. National League MVP Buster Posey had a wRC+ of 162. Fantasy numbers-wise, Votto may have slipped. But in 2012, he was more valuable than ever in terms of offensive production for the Reds.
Was I able to solve the mystery of Joey Votto’s absurdly high 2012 BABIP? Well, yes and no. Votto is unlike any other hitter in baseball. His BABIP should not be measured against league average, but rather against the standard each player sets for himself. And without looking at xBABIP or LD% or any of the other batted ball statistics, BABIP is a relatively meaningless statistic. Last year, Votto’s BABIP was .404, which was .042 higher than his xBABIP of .362 (itself the fifth-highest in the league). Sure, that’s still a significant deviation, but what contributed to it is beyond me. I guess you can chalk it up to good luck after all.
Adam Flango is a video producer for CBSSports.com and weirdly enjoys the world of sports statistics. For tips on numbers to look out for, tweet them to @adam_flango or e-mail them to firstname.lastname@example.org