Why Line Drive Rate Isn’t Sticky



Joe Nicholson-USA TODAY Sports

“An egregious error of Umpire Hurst in construing the rules helped Boston to two runs and added to the confusion of the Orioles. In the fourth inning Boston had three men on bases and one out. Ryan came to the bat and scratched out a short fly over third base. Jennings ran for the ball, got under it and muffed it. According to Rule 45, Section 9, a batter is out ‘if he hits a fly ball that can be handled by an infielder while first base is occupied with only one out.’ Ryan should have been declared out whether the ball was muffed or not…

When seen at the club-house after the game he started in defense of his position by attempting a distinction between the outfield and infield, claiming that the ball was not hit to the infield, but when his attention was called to the wording of the rule, which does not state that the ball must be hit to the infield, but simply that it shall be such a ball as an infielder can handle, he abandoned that position, and argued that it was not a fly ball, but a line drive. He soon saw the absurdity of that argument, as a line drive which does not touch the ground is as much a fly ball as if it were hit 100 feet up into the air.”

– “Errors Lost the Game,” The Morning Herald, April 26, 1894

The graph below has been haunting me for weeks now. I made it, but there’s nothing unique about it. You can find an identical graph in this Alex Chamberlain piece, this Tom Tango blog post, or any number of other articles. It shows the batting average and wOBA for every batted ball, based on launch angle.

I cut off 20 degrees from either side, but you get the point. Worthless groundballs and popups are on the sides, and valuable line drives and fly balls make up a narrow sliver in the middle. It occurred to me a few weeks ago that we’ve been splitting batted balls into those same four categories for a very long time now. Moreover, one of those categories is suspect. If you’ve been reading FanGraphs for a while, you know that line drive rate is considered fluky rather than sticky. Only a handful of elite players – Luis Arraez, Freddie Freeman, maybe Steven Kwan – are capable of consistently putting up top-10 line drive rates. According to Baseball Savant, batters have a .639 wOBA on line drives this year. Hitting line drives is what every single batter is trying to do, and yet somehow what Russell Carleton wrote seven years ago still holds true: “There is some skill in hitting line drives, but it is hard to repeat, and how many line drives you hit seems to be unrelated to where you fall on the ground-ball/fly-ball spectrum.” I set out to find some new way to look at this old puzzle, figuring that with all of the tools at our disposal, there had to be a better way to slice this particular pie. I failed, but I came across some interesting things along the way, and that (I have decided after the fact) is what’s really important.

Let’s start with Sports Info Solutions, which began categorizing balls in play in 2002 and provides the data on our batted ball leaderboards. I reached out to Mark Simon, who has been with SIS since 2018. Although he couldn’t reveal any specifics about the criteria SIS uses to determine batted ball type, he did relate some information that was already available publicly: first, that hang time is an important part of their criteria, and second, that SIS has even finer categories than the core four. (For example, they might have additional categories for balls that fall somewhere between the traditional definitions of line drive and fly ball.) I pulled data for every qualified player season, then calculated the correlation coefficient of each player’s performance from one year to the next.
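For anyone who wants to try the same kind of calculation, here is a minimal sketch of a year-over-year correlation in plain Python. It is not the exact code behind the tables in this article. The data layout (a dictionary keyed by player and season) and the function names are assumptions made for illustration, and the sample rates are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one side never varies")
    return cov / sqrt(var_x * var_y)


def year_over_year_r(rates):
    """Correlate each player season with the same player's next season.

    `rates` maps (player, season) -> rate for one batted ball type.
    Only players with two consecutive seasons in `rates` contribute a pair.
    """
    pairs = [
        (rate, rates[(player, season + 1)])
        for (player, season), rate in rates.items()
        if (player, season + 1) in rates
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two consecutive-season pairs")
    first_year, next_year = zip(*pairs)
    return pearson_r(first_year, next_year)


# Invented line drive rates, for illustration only (not real players).
sample_ld_rates = {
    ("Player A", 2022): 0.20, ("Player A", 2023): 0.24,
    ("Player B", 2022): 0.25, ("Player B", 2023): 0.21,
    ("Player C", 2022): 0.22, ("Player C", 2023): 0.23,
}
print(round(year_over_year_r(sample_ld_rates), 2))
```

Restricting the dictionary to qualified seasons before calling `year_over_year_r` would mirror the filter described above.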

Year-Over-Year Batted Ball Type R-Values (SIS)

GB LD FB IFFB
.80 .42 .79 .58

As you can see, groundballs and fly balls are much stickier year-over-year than line drives and popups (or as SIS refers to them, infield fly balls). SIS’ numbers are also different from Statcast’s. I told you before that batters have a .639 wOBA on line drives this year, but that was according to Statcast. According to SIS, that number is .681. Statcast’s numbers go back to 2015. Here are its year-over-year correlations.

Year-Over-Year Batted Ball Type R-Values (Statcast)

GB LD FB PU
.77 .42 .75 .68

SOURCE: Baseball Savant

Everything’s pretty much the same except for popups. Statcast has a stricter definition of a popup than SIS. Because it’s more extreme, Statcast’s league-wide popup rate is always two or three percentage points below SIS’, and on an individual player basis, it’s a bit stickier. Although Statcast measures the launch angle, exit velocity, and distance of every batted ball, batted balls are not categorized according to a formula. They are categorized by a human, namely, by the stringer at each game. That caught me by surprise, especially because Baseball Savant’s glossary lays out rough guidelines for which launch angles constitute which batted ball type:

  • Groundball: Less than 10 degrees
  • Line drive: 10-25 degrees
  • Fly ball: 25-50 degrees
  • Popup: Greater than 50 degrees

If those four groups of numbers look familiar, it’s because they represent the gray and white boxes in the graph at the top of this article. At first blush, it seems like divvying things up according to these numbers would make a lot of sense. It might be rough around the edges and allow a few misclassifications to slip through, but it would certainly be more exact than using estimations made by odd, inconsistent human beings. Here are the correlation coefficients we’d get if we used those criteria.

Year-Over-Year Batted Ball Type R-Values (MLB Glossary)

GB LD FB PU
.73 .28 .65 .63

They’re not terrible, but this method is the least sticky of our three options in every category except for popups. I tried other combinations of launch angles, but in the end, I couldn’t settle on a formula for using launch angle alone that was better than what SIS or Statcast already provides.
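The glossary cutoffs are simple enough to write down as code. The sketch below is one way to do it, not an official Statcast routine. The glossary ranges share their endpoints, so sending a ball at exactly 25 degrees to the fly ball bucket and one at exactly 50 degrees to the fly ball bucket is my assumption, as are the function names and the two-letter labels.

```python
from collections import Counter


def classify_by_launch_angle(launch_angle_deg):
    """Bucket a batted ball using only the rough glossary launch angle ranges."""
    if launch_angle_deg < 10:
        return "GB"  # groundball: less than 10 degrees
    if launch_angle_deg < 25:
        return "LD"  # line drive: 10-25 degrees (exactly 25 treated as FB, an assumption)
    if launch_angle_deg <= 50:
        return "FB"  # fly ball: 25-50 degrees
    return "PU"      # popup: greater than 50 degrees


def batted_ball_rates(launch_angles):
    """Return each bucket's share of a batter's batted balls."""
    if not launch_angles:
        raise ValueError("need at least one batted ball")
    counts = Counter(classify_by_launch_angle(a) for a in launch_angles)
    return {bucket: counts[bucket] / len(launch_angles)
            for bucket in ("GB", "LD", "FB", "PU")}
```

Running `batted_ball_rates` on each player season and feeding the results into a year-over-year correlation is the general shape of the test described above.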

Here’s the real reason Statcast doesn’t just let the machines handle things. The video below contains four batted balls. The first was classified as a groundball, the second as a line drive, the third as a fly ball, and the fourth as a popup. All four were hit at a launch angle of 30 degrees.

Now, you could argue that any one of these balls should be in a different bucket, but none is so egregiously misclassified that you can’t understand what the person who made the decision was thinking. There are plenty of edge cases, and humans are pretty good at those. The real reason this method doesn’t work as well is that more than launch angle goes into determining the type of batted ball. I’ll show you what I mean using another example. Here are two balls that were both hit at a launch angle of nine degrees. The first one came off the bat of Yordan Alvarez at 106.3 mph. It was crushed and it traveled 218 feet down the line. It could have been classified only as a line drive. The second came off the bat of Colt Keith at 82 mph, resulting in a bouncer to the second baseman and a double play. There’s no way it could have been classified as anything other than a groundball.

These two balls had the same launch angle, but the exit velocities determined their batted ball types. In edge cases, of which there are plenty, a harder-hit ball will end up categorized as a line drive rather than a groundball or a fly ball, which means that all of a sudden, we’re dealing with a combination of launch angle and contact quality. That’s a whole second factor to incorporate, so no wonder line drives are harder to predict year-over-year. This season, nearly a quarter of the balls that Statcast has classified as line drives had launch angles that fell above or below the general 10-25 degree range in the glossary (and although I don’t have access to SIS’ data, those numbers must be pretty similar). That’s a huge number of edge cases, exceptions, and balls where factors other than launch angle helped determine the classification. All of this makes line drive rate much messier and less consistent than the other batted ball types.
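Checking how often the human label disagrees with the glossary range takes only a few lines. This is a rough sketch under assumed inputs: each batted ball is a (label, launch angle) pair, and the label string "line_drive" is a placeholder of mine, not a documented Statcast field value.

```python
def share_of_line_drives_outside_range(batted_balls, low=10.0, high=25.0):
    """Fraction of labeled line drives whose launch angle is outside [low, high].

    `batted_balls` is an iterable of (label, launch_angle_deg) pairs.
    """
    ld_angles = [angle for label, angle in batted_balls if label == "line_drive"]
    if not ld_angles:
        raise ValueError("no line drives in the sample")
    outside = sum(1 for angle in ld_angles if angle < low or angle > high)
    return outside / len(ld_angles)
```

A result near 0.25 for a full season of data would match the "nearly a quarter" figure above.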

I spent hours testing the data, and I was able to come up with some passable definitions for batted ball types using a combination of launch angle and distance. For just one example, you could make rules like this:

  • Groundballs: Less than 25 degrees and less than 155 feet
  • Line Drives: Less than 25 degrees and more than 155 feet
  • Fly Balls: More than 25 degrees and more than 250 feet
  • Popups: More than 25 degrees and less than 250 feet

Those aren’t perfect rules by any means, but they give you the year-over-year correlations below.

Year-Over-Year Batted Ball Type R-Values (Davy’s Version)

GB LD FB PU
.72 .57 .67 .69

SOURCE: Baseball Savant

Fly ball rate becomes a little less sticky, but line drive rate becomes a whole lot stickier. I think that’s really interesting and that it has a chance of being useful, but it takes a bit of work to calculate, so I doubt it will catch on any time soon. Besides, by tying our metric to distance, we’re just incorporating contact quality by another means. Further, we’re leaving our determinations to the vicissitudes of the atmospheric conditions, so at best we’re trading human biases for less pronounced meteorological ones.
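The launch-angle-and-distance rules listed earlier can be sketched as a small function. I am reading each rule as applying to both numbers (so a groundball is under 25 degrees and under 155 feet), and the rules don't say what happens at exactly 25 degrees, 155 feet, or 250 feet, so the tie-breaking here is an assumption. The function name and labels are mine as well.

```python
def classify_by_angle_and_distance(launch_angle_deg, distance_ft):
    """Bucket a batted ball using launch angle plus distance."""
    if launch_angle_deg < 25:
        # Low launch angle: distance separates grounders from line drives.
        # Exactly 155 feet is treated as a line drive (assumption).
        return "GB" if distance_ft < 155 else "LD"
    # High launch angle (25 degrees or more, an assumption at exactly 25):
    # distance separates popups from fly balls.
    # Exactly 250 feet is treated as a fly ball (assumption).
    return "PU" if distance_ft < 250 else "FB"


# The Yordan Alvarez ball described earlier: nine degrees, 218 feet.
print(classify_by_angle_and_distance(9, 218))
```

Under this rule, the Alvarez ball lands in the line drive bucket even though its launch angle alone would put it with the groundballs.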

My favorite way to measure this would be strictly using launch angle, except not at the moment the ball leaves the bat, but at some point out in front of the plate, maybe 20 feet or so. I’ll let Baseball Savant’s little launch angle chart explain what that might look like.

If we wait to measure the launch angle until the ball has had a chance to travel a bit, then contact quality will naturally be making its effect felt. A rocket hit at eight degrees will count as a line drive, while a jam shot hit at eight degrees will count as a grounder because it will already be falling. I would be extremely curious to see how sticky our batted ball types would be if we measured them this way, but my assumption that it would work better than our current methods is just an educated guess. Besides, although I’m sure Statcast could measure them this way, it’s not set up to do so, and I can’t imagine the small amount of knowledge we might gain about the predictability of line drive rates would be worth all the effort that would take. For now, I think we are where we are. Line drive rate is not as sticky as we’d like, but at least we now have a better idea of why.
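To get a feel for the idea, here is a back-of-the-envelope sketch of the flight angle 20 feet from contact. It treats the ball as a simple projectile with no drag and no spin, and it ignores the possibility that the ball hits the ground first, so it is a toy model and not how Statcast measures anything. The slope of a drag-free trajectory at horizontal distance x is tan(θ) − g·x / (v₀·cos θ)², which is all the function computes. The function and constant names are mine.

```python
from math import atan, cos, degrees, radians, tan

G_FT_PER_S2 = 32.174            # gravitational acceleration in ft/s^2
MPH_TO_FT_PER_S = 5280 / 3600   # miles per hour -> feet per second


def flight_angle_at_distance(exit_velo_mph, launch_angle_deg, distance_ft=20.0):
    """Angle of the ball's path (degrees) after `distance_ft` of horizontal travel.

    Assumes drag-free, spin-free projectile motion and a launch angle
    well short of vertical.
    """
    v0 = exit_velo_mph * MPH_TO_FT_PER_S
    theta = radians(launch_angle_deg)
    # Derivative of y(x) = x*tan(theta) - g*x^2 / (2*(v0*cos(theta))^2)
    slope = tan(theta) - G_FT_PER_S2 * distance_ft / (v0 * cos(theta)) ** 2
    return degrees(atan(slope))


# Two balls leaving the bat at eight degrees; the exit velocities are illustrative.
hard_hit = flight_angle_at_distance(105, 8)
soft_hit = flight_angle_at_distance(60, 8)
print(hard_hit > soft_hit)
```

Even in this simplified model, the softer ball has lost more of its launch angle by the 20-foot mark than the harder one. Real ball flight adds drag and spin, so the exact magnitudes shouldn't be taken literally.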


