BABIP - Cliff Notes review

ToddZ
Posts: 2798
Joined: Sat May 22, 2004 6:00 pm

BABIP - Cliff Notes review

Post by ToddZ » Sat Jun 15, 2013 1:07 pm

The defenders of BABIP were posed a couple of questions in another thread and it was asked that I start a new thread to explain BABIP and its utility, so here goes.

BABIP (batting average on balls in play) is designed to quantify the fate of batted balls the defense has a chance to make a play on. There are a couple of flaws: balls that stay in the park but hit too high on the outfield wall to catch are counted, and so are foul pop-ups (if a foul pop is caught, it gets counted, but if it lands safely it is just a strike). By and large, though, it is effective in measuring what it is intended to measure.

The formula is (hits - HR) / (AB - K - HR) and is the same for hitters and pitchers.
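
For anyone who wants to plug in numbers, here is a minimal Python sketch of the formula exactly as written above. The function name and the sample stat line are made up for illustration. Note that many published versions also add sacrifice flies to the denominator, which this simplified form leaves out.

```python
def babip(hits, home_runs, at_bats, strikeouts):
    """Batting average on balls in play, using the simplified formula
    (H - HR) / (AB - K - HR) from the post (no sacrifice flies)."""
    balls_in_play = at_bats - strikeouts - home_runs
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line: 150 H, 20 HR, 500 AB, 100 K -> 130 hits on 380 balls in play
print(round(babip(150, 20, 500, 100), 3))
```

The same function works for a pitcher if you feed it hits, homers, at bats and strikeouts allowed.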

BABIP is an offshoot of DIPS Theory (Voros McCracken). Long story short, McCracken discovered that the BABIP for all pitchers clustered around .300 regardless of the quality of the pitcher. The BABIPs for individual pitchers would fluctuate above and below .300, suggesting (at the time) the pitcher did not have control over the fate of a batted ball put in play. Keep in mind this was circa 1999 and data collection had not yet included tracking specific batted ball types.

From an analytic perspective, since a pitcher did not exhibit control over his BABIP, if it was below .300, he was considered lucky that more of the balls he allowed to be put in play became outs while he was unlucky if more BIP than normal went for hits.
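
To get a feel for how much a pitcher's BABIP can wander on luck alone, here is a rough sketch. It assumes every ball in play is an independent event with the same .300 chance of becoming a hit, which is a simplification of mine and not something the data guarantees. The 500 balls in play is a made-up, roughly season-sized sample.

```python
import math


def babip_noise_sd(true_babip, balls_in_play):
    """Standard deviation of observed BABIP if each ball in play is an
    independent trial with the same hit probability (an assumption)."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


sd = babip_noise_sd(0.300, 500)  # 500 BIP is a hypothetical sample size
print(f"one standard deviation: {sd:.3f}")
print(f"typical range: {0.300 - sd:.3f} to {0.300 + sd:.3f}")
```

Under those assumptions, a pitcher with a true .300 mark can land well above or below it without anything having changed in his skill.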

The obvious follow-up was to look at hitters, where it was discovered that hitters DID exert control over their BABIP, each establishing his own baseline around which he would fluctuate.

As data collection began to include batted ball types (fly balls, ground balls and line drives), it was shown that BABIP is driven by line drive rate since approximately 72 percent of LD go for hits while only 24 percent of GB and 14 percent of non-HR FB are hits. This is pretty intuitive but something that may not be so intuitive is that batters exert the most control over whether they hit a line drive or a non-line drive, while pitchers exert the most control over whether a non-line drive is a GB or FB.

This may be confusing so I'll try to say it another way -- a pitcher does not have much control whether a hitter hits a LD or something else.

Since LD are the primary driving force behind BABIP, this explains why pitchers cluster around .300 while hitters establish their own baseline. The more LD a hitter has, the higher the BABIP. The analogous statement for pitchers doesn't exist since they don't exert sufficient control over whether they give up a line drive or not. This could be one of the more troubling aspects of BABIP as intuitively better pitchers should be able to allow fewer LD, but the fact is they don't - or they don't do it to an extent that is significant enough to trump the randomness.

The next revelation in BABIP analysis is that since more GB go for hits than FB, a GB pitcher should have a slightly higher BABIP than a FB pitcher (but a FB pitcher gives up more homers and induces fewer GIDP). The same is true for hitters -- more GB than FB will help increase BABIP.
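
The hit rates quoted above (72 percent of LD, 24 percent of GB, 14 percent of non-HR FB) can be turned into a back-of-the-envelope expected BABIP. This is only a sketch: the function name and the batted-ball mixes are hypothetical, and it ignores pops, bunts, speed and everything else.

```python
# Share of each batted-ball type that goes for a hit, as quoted in the post.
HIT_RATE = {"LD": 0.72, "GB": 0.24, "FB": 0.14}  # FB means non-HR fly balls


def expected_babip(line_drives, ground_balls, fly_balls):
    """Expected BABIP from counts (or percentages) of LD, GB and non-HR FB."""
    total = line_drives + ground_balls + fly_balls
    if total <= 0:
        raise ValueError("need at least one ball in play")
    expected_hits = (
        line_drives * HIT_RATE["LD"]
        + ground_balls * HIT_RATE["GB"]
        + fly_balls * HIT_RATE["FB"]
    )
    return expected_hits / total


# Hypothetical mixes, given as LD / GB / FB percentages.
print(round(expected_babip(20, 45, 35), 3))  # baseline hitter
print(round(expected_babip(25, 45, 30), 3))  # five more LD, five fewer FB
print(round(expected_babip(20, 50, 30), 3))  # GB-heavy pitcher
print(round(expected_babip(20, 30, 50), 3))  # FB-heavy pitcher
```

Swapping a handful of fly balls for line drives moves the number far more than swapping fly balls for grounders, which is the whole point about LD driving BABIP.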

Keep in mind at this point all that is being collected is FB, GB and LD. That said, analysis could be refined a bit to help discern the reason behind a high or low BABIP. If a hitter's BABIP was higher than his baseline, but his hit distribution was basically the same, he was considered lucky - a higher percentage of batted balls found holes in the infield or grass in the outfield. It was assumed that running speed had something to do with BABIP as well, but there was no data to back it up - more circumstantial evidence.

The next improvement in data collection was breaking out infield pops from FB and tracking the number of bunt hits and infield hits. Bunt hits and infield hits helped support the argument that speedy guys sustain a higher BABIP.

Starting just a couple of years ago, how hard a ball was struck was identified and deemed hard, medium or soft. As expected, hard hit balls resulted in the highest BABIP within each class but what may be surprising is soft is next with medium hit balls resulting in the lowest BABIP. But when you think about it, a medium hit ground ball gives the fielder time to both get to it and throw the runner out while a medium hit fly ball is not as likely to fall in so it makes sense after all.

The real key here is how much control does a pitcher have with respect to how HARD a ball is struck. We're still in the embryonic stage of the research since the sample of data collection is still small (only a few years) but there is some indication a pitcher can induce weaker contact with the following caveat - there's no proof this is sustainable (not enough data) and the randomness still masks the control. That is, the luck overrides the skill.

Similarly, we're getting a handle on whether a hitter can sustain an elevated hard hit ball rate. As an aside, while speed is important, a hard hit ball is more important.

The next major refinement of data collection will be measuring the speed and trajectory of a batted ball electronically as right now it is still mostly subjective. There are some interesting studies that show the same ball in the air can be scored as a LD from some viewing angles and a FB from others - it has to do with parallax.

Anyway...

That's a brief review of what BABIP is. But that's only one-third of the story. But before I get to the next part, a comment was made that any stat that doesn't include HR isn't of interest. No single stat will tell the whole story. I like to look at several distinct stats then blend them together to get a better idea of the big picture. K%, BB%, HR% and BABIP in tandem offer that global view. None of the metrics are useful as a stand-alone, but together they help tell the tale.
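
As a quick illustration of looking at those four numbers side by side, here is a small sketch. The stat line is invented, and the choice of plate appearances as the denominator for K%, BB% and HR% is my assumption; different sites use different denominators.

```python
def rate_profile(pa, ab, hits, home_runs, strikeouts, walks):
    """Return K%, BB%, HR% (per plate appearance, an assumed convention)
    and BABIP (simplified formula, no sacrifice flies)."""
    balls_in_play = ab - strikeouts - home_runs
    if pa <= 0 or balls_in_play <= 0:
        raise ValueError("need plate appearances and balls in play")
    return {
        "K%": strikeouts / pa,
        "BB%": walks / pa,
        "HR%": home_runs / pa,
        "BABIP": (hits - home_runs) / balls_in_play,
    }


# Hypothetical season line
profile = rate_profile(pa=600, ab=540, hits=160, home_runs=25,
                       strikeouts=120, walks=55)
for name, value in profile.items():
    print(f"{name}: {value:.3f}")
```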

The final part of the story will be the actual utility of BABIP in analysis. But before that, we need to be on the same page with respect to what is being analyzed. At the end of the day, we want an idea of the player's future performance. Right now, we care how a player will produce the final 3 1/2 months of the season. As soon as the season is over (or maybe even before for those of us in Doughy's Premature League) we will care about how we feel they will do in the 2014 campaign.

There are two important things to consider. Expected results are driven by skills and a player's skill is not a static number but a range, hence the expected performance should be a range.

In other words, a player's expectations are a weighted average of several plausible outcomes. The static projection those of us in the business of prognostication offer is the weighted average of all those plausible outcomes, but what is sometimes called a bad year (or bad projection) is nothing more than a less likely plausible outcome while a good year (and also a bad projection) is really just a less likely plausible outcome as well.
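
Here is a toy version of that weighted average, just to make the idea concrete. The scenarios and their weights are invented for the example and are not anyone's actual projection.

```python
def weighted_projection(outcomes):
    """outcomes: list of (value, weight) pairs. Weights need not sum to 1."""
    total_weight = sum(weight for _, weight in outcomes)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(value * weight for value, weight in outcomes) / total_weight


# Hypothetical batting-average scenarios: (outcome, how likely we think it is)
scenarios = [
    (0.250, 0.2),  # down year
    (0.275, 0.5),  # most likely
    (0.300, 0.3),  # up year
]
print(round(weighted_projection(scenarios), 4))
```

The single number that comes out is the static projection. A .250 or .300 season from this player would not make the projection wrong, since both were in the range all along.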

A true increase or decrease in skill can alter a projection as can just plain old dumb luck. One of the tricks of prognostication is identifying when a skill change is real. Will a pitcher repeat an improved walk rate? Will a hitter continue to make better contact?

The proper use of BABIP is to help eliminate the randomness of performance, leaving just the skill. We then use the skill to anticipate future performance.

As data collection has been refined, it is getting more possible to separate the skill from the luck. So a high or low BABIP is no longer being brush-stroked as good or bad luck (or at least it isn't by those that understand how to break BABIP into its components). Unfortunately, some analysts are still stuck in the time where we didn't have the trajectory data.

The key is looking at the component data for each individual player - be it a hitter or a pitcher - to decide what is skill and what is luck. The luck should be considered to be neutral going forward, leaving the skills.

One of the more difficult aspects of BABIP to accept is that really good pitchers are really good because they strike guys out, don't walk many and limit homers. They have very little control over the fate of balls in play. I have learned NOT to say NO CONTROL, because they have SOME - but it is a lot less than many want to believe. There just isn't any data that shows it is significantly harder to hit one pitcher over another, even if the eye and sniff test suggest otherwise. The data shows it doesn't exist - the happenstance involved with a round bat striking a round ball masks any ability to induce weaker contact (or whatever). Over a game or two - yeah. But this is not sustainable.

Another fact that is hard to accept is that luck is not discerning -- good players can get lucky too. This is the Mike Trout Syndrome. There were a couple of elements of Trout's 2012 season that were positively influenced by Lady Luck. The shame of it is this has completely hidden the fact that Trout is a better baseball player this season since he has IMPROVED his contact rate - it sucks that no one is even talking about that but instead focusing on "regression" and "on pace". We're seeing a special player get even more special but we're dwelling on an element of his production that was out of his control.

I suspect I missed some things I intended to include or missed a question or two that was asked but hopefully they'll come up in ensuing discussion and I can address them at that point.

But the take home message is BABIP is just one of a number of tools that helps build the complete structure. For me, its specific use is to flush out the happenstance of performance so we can minimize the variance associated with projected performance. The key word is minimize since a projection should always be thought of as a weighted average of plausible outcomes, not a static number.

One more quickie...

Regression is the reversion to a mean. In terms of analysis, I like to think of it as a correction of things out of a player's control. It's luck becoming neutral. Every player regresses. Sometimes metaphors can be confusing which is what I think happened with the whole Trout defying gravity stuff. Gravity is not discerning. If you drop me and Mike the Mouth from the top of a building in a vacuum, we're going to hit the ground at the same time, even though one of us is morbidly obese and the other a flawless physical specimen of manhood. To say Trout will not defy gravity is simply to say the luck element of his production will turn normal - just like it has.

Finally - there is no schedule for regression. Some like Paul Maholm have it kick them in the nuts right away. For others like Matt Moore it takes a while. And for others like Pat Corbin, we're still effing waiting.
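
One common way to sketch "luck becoming neutral" in code is to blend what a player has done so far with his baseline, leaning on the baseline more when the sample is small. This is a generic shrinkage sketch and not how any particular projection system does it. The `stabilizer` value of 800 balls in play is a number I made up for illustration.

```python
def regress_to_mean(observed, sample_size, baseline, stabilizer):
    """Blend an observed rate with a baseline rate.

    stabilizer is a hypothetical tuning knob: the sample size at which the
    observed rate and the baseline get equal weight.
    """
    return (observed * sample_size + baseline * stabilizer) / (sample_size + stabilizer)


# Two hypothetical hitters with a .300 baseline, 200 balls in play each.
print(round(regress_to_mean(0.380, 200, 0.300, 800), 3))  # hot start pulled down
print(round(regress_to_mean(0.240, 200, 0.300, 800), 3))  # cold start pulled up
```

Note the estimate moves toward the baseline from both sides. It says nothing about WHEN the correction shows up, only what to expect going forward.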
2019 Mastersball Platinum

5 of the past 6 NFBC champions subscribe to Mastersball

over 1300 projections and 500 player profiles
Standings and Roster Tracker perfect for DC and cutline leagues

Subscribe HERE

Outlaw
Posts: 1498
Joined: Sun Mar 27, 2011 6:00 pm

Re: BABIP - Cliff Notes review

Post by Outlaw » Sat Jun 15, 2013 4:24 pm

Todd- thank you for information - good stuff. I understand better the whole balls in play as it relates to home runs now.

ToddZ
Posts: 2798
Joined: Sat May 22, 2004 6:00 pm

Re: BABIP - Cliff Notes review

Post by ToddZ » Sat Jun 15, 2013 4:50 pm

Outlaw wrote:Todd- thank you for information - good stuff. I understand better the whole balls in play as it relates to home runs now.
Wait until you find out there's luck involved with HR's as well 8-)

Atlas
Posts: 598
Joined: Wed Feb 28, 2007 6:00 pm

Re: BABIP - Cliff Notes review

Post by Atlas » Sat Jun 15, 2013 7:29 pm

"Regression is the reversion to a mean. In terms of analysis, I like to think of it as a correction of things out of a players control. It's luck becoming neutral. Every player regresses"

Isn't progression also a reversion to a mean? Not to nit pick, but not "every" player regresses. Some will pick up their batting average, for example, to get to the back of the baseball card...no?

ToddZ
Posts: 2798
Joined: Sat May 22, 2004 6:00 pm

Re: BABIP - Cliff Notes review

Post by ToddZ » Sat Jun 15, 2013 7:46 pm

Atlas wrote:"Regression is the reversion to a mean. In terms of analysis, I like to think of it as a correction of things out of a players control. It's luck becoming neutral. Every player regresses"

Isn't progression also a reversion to a mean? Not to nit pick, but not "every" player regresses. Some will pick up their batting average, for example, to get to the back of the baseball card...no?
Regression goes in both directions. The problem is regression has become synonymous with "play worse".

The "re" part is misleading.

I think some connote negative to re and positive to pro, but regression is the term (far as I know) for both directions. Progression really doesn't mean a positive correction to the mean - I think it's been made up.

Outlaw
Posts: 1498
Joined: Sun Mar 27, 2011 6:00 pm

Re: BABIP - Cliff Notes review

Post by Outlaw » Sat Jun 15, 2013 8:08 pm

ToddZ wrote:
Outlaw wrote:Todd- thank you for information - good stuff. I understand better the whole balls in play as it relates to home runs now.
Wait until you find out there's luck involved with HR's as well 8-)
As long as the hitter fails 2/3 of the time I'll be happy... See ball - Hit Ball...

Atlas
Posts: 598
Joined: Wed Feb 28, 2007 6:00 pm

Re: BABIP - Cliff Notes review

Post by Atlas » Sat Jun 15, 2013 8:46 pm

ToddZ wrote:
Atlas wrote:"Regression is the reversion to a mean. In terms of analysis, I like to think of it as a correction of things out of a players control. It's luck becoming neutral. Every player regresses"

Isn't progression also a reversion to a mean? Not to nit pick, but not "every" player regresses. Some will pick up their batting average, for example, to get to the back of the baseball card...no?
Regression goes in both directions. The problem is regression has become synonymous with "play worse".

The "re" part is misleading.

I think some connote negative to re and positive to pro, but regression is the term (far as I know) for both directions. Progression really doesn't mean a positive correction to the mean - I think it's been made up.

Progression can be defined as the act of forward movement. Doesn't have to be positive. We "progress" in age...not necessarily positive. But you're right, most of the time it is connoted positively. Progressive thinking, for example.

Meh...now you know I'm bored on a Saturday night. :D

ToddZ
Posts: 2798
Joined: Sat May 22, 2004 6:00 pm

Re: BABIP - Cliff Notes review

Post by ToddZ » Sat Jun 15, 2013 8:58 pm

Atlas wrote:
ToddZ wrote:
Atlas wrote:"Regression is the reversion to a mean. In terms of analysis, I like to think of it as a correction of things out of a players control. It's luck becoming neutral. Every player regresses"

Isn't progression also a reversion to a mean? Not to nit pick, but not "every" player regresses. Some will pick up their batting average, for example, to get to the back of the baseball card...no?
Regression goes in both directions. The problem is regression has become synonymous with "play worse".

The "re" part is misleading.

I think some connote negative to re and positive to pro, but regression is the term (far as I know) for both directions. Progression really doesn't mean a positive correction to the mean - I think it's been made up.

Progression can be defined as the act of forward movement. Doesn't have to be positive. We "progress' in age...not necessarily positive. But you're right, most of the time it is connoted positively. Progressive thinking, for example.

Meh...now you know I'm bored on a Saturday night. :D
We're on the same page, but I wasn't clear. In the instance of reversion to the mean, progression is not really a word - but some use it as such.

I'm not bored - just waiting for overtime to begin :)

Deadheadz
Posts: 1963
Joined: Mon Mar 25, 2013 12:16 pm

Re: BABIP - Cliff Notes review

Post by Deadheadz » Sun Jun 16, 2013 8:45 am

A very interesting read.

I have only one dispute:
Trout came into the season 15+ lbs heavier than last season, but I wouldn't call him "morbidly obese".
:roll:
The Bill Buckner of FAAB
Deadheadz

ToddZ
Posts: 2798
Joined: Sat May 22, 2004 6:00 pm

Re: BABIP - Cliff Notes review

Post by ToddZ » Sun Jun 16, 2013 8:53 am

Deadheadz wrote:A very interesting read.

I have only one dispute:
Trout came into the season 15+ lbs heavier than last season, but I wouldn't call him "morbidly obese".
:roll:
Mike Trout did not write the post. I did

If you drop me and Mike the Mouth from the top of a building in a vacuum, we're going to hit the ground at the same time, even though one of us is morbidly obese and the other a flawless physical specimen of manhood.

Deadheadz
Posts: 1963
Joined: Mon Mar 25, 2013 12:16 pm

Re: BABIP - Cliff Notes review

Post by Deadheadz » Sun Jun 16, 2013 12:09 pm

<sarcasm>
So I was wrong to assume YOU were the flawless specimen of manhood?
</sarcasm>

I hoped my use of the rolling eyes would convey my attempt at ironic humor. Too subtle perhaps.


Cheers!

COZ
Posts: 715
Joined: Tue Nov 29, 2011 11:48 pm
Location: Rolling Meadows, IL

Re: BABIP - Cliff Notes review

Post by COZ » Sun Jun 16, 2013 9:14 pm

WOW. Awesome explanation, very well written. Thank you for that, very informative and educational. The word "luck" always annoyed me in describing performance, especially given how analytical sabermetricians are. I prefer the word you used: variance.

COZ
COZ

"Baseball has its share of myths, things that blur the line between fact & fiction....Abner Doubleday inventing the game, Babe Ruth's Called Shot, Sid Finch's Fastball, the 2017 Astros...Barry Bonds's 762 HR's" -- Tom Verducci

King of Queens
Posts: 3602
Joined: Wed Feb 04, 2004 6:00 pm

Re: BABIP - Cliff Notes review

Post by King of Queens » Mon Jun 17, 2013 7:56 am

Good read and thanks for posting, Todd.

With that said, can I get the Cliff Notes of the Cliff Notes review? :lol:
