Fantasy Football Red Flags: The Injury Bug
It seems like every year in the NFL, one team catches ‘the injury bug’.
Last year it was the Baltimore Ravens. Starting running back J.K. Dobbins went down before the season started, followed quickly by Gus Edwards. Former MVP Lamar Jackson missed time as well, and the Ravens finished 8-9 (with six straight losses to end the year).
This is, of course, difficult and frustrating for the players and teams themselves. It also has an impact on fantasy since, just like in real football, availability is an ability. The label ‘injury risk’ is bandied about for specific players, but it’s a complicated and often contentious term. Just as impactful is when you draft a player who ends up on a ‘cursed’ NFL team, where Murphy’s Law is in full effect and multiple players go down during a season.
It’s just bad luck, a total coincidence…right? Is it possible that injuries cluster among teams? Perhaps certain teams have a very aggressive playing style, or a training staff that is using a new and unproven technique. What if one player getting injured is a canary in the coal mine for the rest of the team? This is the very question I hope to answer in this article; all data is from nflfastR.
We can look at play-by-play, regular season data going back to 1999. Every single play, in addition to crucial variables like down and yardage, has a text description of what exactly happened during the play. One example might look like this:
(15:00) D.Pederson pass to B.Finneran to PHI 29 for 10 yards (T.Knight).
Fortunately for our purposes, the descriptions also mention when a player is injured on a given play. Here are two examples:
(15:00) L.Smith up the middle to NO 38 for 4 yards (M.Barrow). CAR-M.Barrow was injured during the play.
(2:56) C.Batch pass to G.Crowell to DET 33 for 10 yards (J.Bellamy). Ball placed on 23 yard line for start of drive. DET#84 Moore injured on play – assisted of field.
This is the perfect set-up for text extraction, or picking out the relevant bits that we want from ‘unstructured’ data. In this case, we want to keep track of the injured player’s team: the Carolina Panthers and Detroit Lions in the examples above. To do this, I set up some code* that loops through the text descriptions and does its best to isolate which NFL team is mentioned.
I say that the code ‘does its best’ because we don’t successfully extract all of the team names. Some of the descriptions are in a wonky format, and it would be painstaking to go through and find a workaround for every example. As it stands, some relatively simple code recovers the team for over 5,000 of the 13,000 injury plays. This is a large enough sample size to move forward, especially because the roughly 8,000 data points we are missing probably aren’t systematically different (although they certainly would be nice to have!).
Now that we have data that tells us when each team has an injury, in each week of each year, we can do some analysis…
First, we can try to answer this question: are early-season injuries predictive of later-season injuries? If the answer is yes, then fantasy managers have a legitimate red flag to consider. Early-season injuries to teammates could be indicative of a higher injury risk for your RB in the second half of the year.
To check this, we can run a regression that predicts the number of injuries on a team from Week 5 onwards using the number of injuries from Weeks 1 – 4. We control for the year, which helps adjust for any broader time trends (i.e., more or fewer injuries in the league over time). The result: early-season injuries are highly significant predictors of later-season injuries!
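For readers who want to try something similar, here is a rough Python sketch of that set-up. It is an illustration under a few assumptions, not the code behind the numbers in this article: the function names `count_early_late` and `early_late_slope` are made up, the year is controlled for by subtracting each season's average (the article only says that year is controlled for, so the exact specification may differ), and the toy rows are invented rather than taken from nflfastR. It only computes the slope, not its statistical significance.

```python
from collections import defaultdict


def count_early_late(injuries, cutoff_week=4):
    """Count injuries per team-season, split at cutoff_week.

    injuries: iterable of (season, team, week), one tuple per injury.
    Returns {(season, team): [early_count, late_count]}.
    """
    counts = defaultdict(lambda: [0, 0])
    for season, team, week in injuries:
        counts[(season, team)][0 if week <= cutoff_week else 1] += 1
    return dict(counts)


def early_late_slope(rows):
    """OLS slope of late injuries on early injuries, within season.

    rows: iterable of (season, early_injuries, late_injuries).
    Subtracting season means from both variables is equivalent to
    including a separate intercept for every season (an assumption
    about how 'controlling for the year' is done).
    """
    by_season = defaultdict(list)
    for season, early, late in rows:
        by_season[season].append((early, late))

    sxy = sxx = 0.0
    for pairs in by_season.values():
        mean_early = sum(e for e, _ in pairs) / len(pairs)
        mean_late = sum(l for _, l in pairs) / len(pairs)
        for e, l in pairs:
            sxy += (e - mean_early) * (l - mean_late)
            sxx += (e - mean_early) ** 2
    if sxx == 0:
        raise ValueError("no within-season variation in early injuries")
    return sxy / sxx


# Invented toy data: (season, Weeks 1-4 injuries, Week 5+ injuries)
toy_rows = [(2020, 0, 10), (2020, 2, 11), (2020, 4, 12),
            (2021, 1, 20), (2021, 3, 21)]
print(early_late_slope(toy_rows))
```

In the toy data the second season has more injuries across the board, but because each season is compared only against its own average, that league-wide jump does not inflate the slope.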
More specifically, for every Week 1 – 4 injury a team suffers, we expect 0.6 more injuries from Week 5 onwards, and this number is incredibly statistically significant. This chart provides a visual; notice the (general) upward trend line.
That is a perhaps surprising result. Can we find an even stronger result? Do injuries persist across years? I could imagine a situation where a team plays a stretch of physical games within one season, but does this relationship hold across seasons?
Turns out it does. Injuries the year prior are highly significant predictors of injuries in the next year. The coefficient is a bit smaller, though still highly significant: each injury in Year T – 1 is associated with 0.5 more injuries in Year T. This chart depicts those two variables:
Last year, according to this methodology, the three most injured teams were the New York Jets, San Francisco 49ers and – you guessed it – the Baltimore Ravens. The Kansas City Chiefs were by far the least injured.
I must admit, these results really surprised me. I didn’t expect the ‘injury bug’ to be anything but random within a season, and I certainly didn’t expect the effect to persist across years. The upshot is that team injuries are something we should be paying attention to, since they could be an early warning sign that other players will go down. If two fantasy players are relatively similar, go with the one on the less injured team.
Now, it’s unclear what specific ‘mechanisms’ are causing this. It’s possible that certain teams are resting and recovering better, or that certain teams are playing a really physical, aggressive style. Another possibility is that injuries tend to be re-aggravated, so if you start out with a lot of injuries they are likely to continue. That is a whole separate question, one that is likely much more difficult to answer!
Lastly, it’s important to note that not all injuries are created equal. Some are extremely minor: players might return in the same game, or a few plays later. Further, this data picked up injuries to all players, not just fantasy-relevant ones (QB/RB/WR/TE). Still, because of our large sample size (5,000+ injuries), these marginal deviations should ‘wash out’ and still give us a good view of the overall picture. I’ll close where I opened: teammate injuries are a serious red flag!
Want to hear more? Message me on Twitter
*If you’re interested, here’s the basic process for grabbing the team name. We look for descriptions with the word ‘injured’, split the sentence by periods, and grab the last fragment, since the injury announcement is usually at the end of the description. Then we remove all non-alpha characters (numbers and signs like #), and finally we grab the first word in the sentence.
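The steps above can be sketched in Python roughly as follows. Treat this as an illustration rather than the exact code used for the article: the function name `extract_injured_team` is made up, the split happens on a period followed by a space (so that initials like M.Barrow stay in one piece), non-letters are replaced with spaces rather than deleted, and the final check for a two- or three-letter all-caps token is an added guard that is not part of the process described above.

```python
import re


def extract_injured_team(desc):
    """Return the team abbreviation from an injury note, or None."""
    # Only look at plays whose description mentions an injury.
    if "injured" not in desc.lower():
        return None

    # Split into sentences on a period followed by whitespace
    # (assumption: a bare "." would also split names like "M.Barrow").
    fragments = [f for f in re.split(r"\.\s+", desc.strip()) if f]
    last = fragments[-1]

    # Replace every non-letter (digits, '#', '-', '.') with a space,
    # then take the first word of what remains.
    words = re.sub(r"[^A-Za-z]", " ", last).split()
    if not words:
        return None

    # Added guard (not in the original process): accept only short
    # all-caps tokens, which is what team abbreviations look like.
    token = words[0]
    return token if re.fullmatch(r"[A-Z]{2,3}", token) else None


examples = [
    "(15:00) L.Smith up the middle to NO 38 for 4 yards (M.Barrow). "
    "CAR-M.Barrow was injured during the play.",
    "(2:56) C.Batch pass to G.Crowell to DET 33 for 10 yards (J.Bellamy). "
    "Ball placed on 23 yard line for start of drive. "
    "DET#84 Moore injured on play – assisted of field.",
    "(15:00) D.Pederson pass to B.Finneran to PHI 29 for 10 yards (T.Knight).",
]
for text in examples:
    print(extract_injured_team(text))  # CAR, then DET, then None
```

A description where the injury note is not the final sentence comes back as None, which is one reason a simple approach like this misses a share of the injury plays.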