On August 25th, 2020, on NFL.com, the league’s “Next Gen Stats Analytics Team” posted “Next Gen Stats: Intro to new Route Recognition Model”. In continuing the NFL’s fascination with the data pouring out of its new player-tracking technology, the team has proven that the old computer industry saying “garbage in, garbage out” still applies today. The Next Gen Stats Route Recognition Model is evidence of that.
First, in explaining the raison d’être behind the Route Recognition Model, the Next Gen Stats Analytics Team wrote this (which is a little like driving from Downtown Oakland down to San Jose, then up the 101 in the West Bay, past Stanford, Redwood City, and SFO Airport, to get to San Francisco, rather than just using the Bay Bridge):
Conventional counting stats like receptions and receiving yards provide a way to measure an individual player’s ability to catch and move the football, but they only tell part of the story. Advanced stats like depth of target, separation window and completion probability provide greater insight, but they still leave out an important factor. Namely, which route did the pass catcher run to get open before catching the ball?
With the help of player-tracking technology, the Next Gen Stats Analytics team set out to answer that exact question, decoding one of the key elements of an offensive play call by using player-tracking data to measure which routes pass catchers are running on any given pass play.
The Next Gen Stats player-tracking system records the x-y location, speed, acceleration, direction and orientation of all 22 players on the field in real time. Our new Route Recognition model leverages this data as inputs into a model that assigns a route type to every eligible receiver on every pass play, including tight ends and running backs. Our architectural approach uses a combination of convolutional neural networks (CNNs) and long short-term memory (LSTM) networks trained on Amazon’s SageMaker platform. CNNs allow us to engage with the spatial nature of our dataset (that is, where each player is on the field in a given play), while LSTM networks allow us to engage with the temporal nature of our dataset (what happens as the play develops over time).
We approached routes run by players aligned in the backfield separately from routes run by players aligned out wide, in the slot or tight, because of clear differences in route archetypes. Below are the 15 unique route types assigned to all route runners, based on their location when the ball is snapped. Note that while NFL playbooks have hundreds of variations of routes, we’ve narrowed it down to these high-level categories, including 10 routes for those in typical wideout alignments and five for those aligned in the backfield:
Wideout Routes (10): Screen, flat, slant, crossing, out, in, hitch, corner, post, go
Backfield Routes (5): Screen, flat, angle, out, wheel
Real-time route classification enables us to contextualize the passing game in new ways. We can study league-wide trends to gain a new understanding of offensive strategy and tendencies, and we can break down and rank individual players by advanced performance metrics.
The last claim in that passage, “we can break down and rank individual players by advanced performance metrics,” is one I completely disagree with.
Got that? OK, I digress here to point out that since the Halfback in my play “Eagle Right H Fly Motion Up, Shortout Swing Pass” (part of my soon-to-be-introduced “Black Panther Offense”) runs an “up” pattern, and “up” is not among the five backfield routes listed above, the Next Gen Stats Route Recognition Model would have a problem fitting it into its analysis parameters. And that’s where I start my critique.
The Saints’ Michael Thomas Was The NFL’s Leading Receiver For 2019, Before The Next Gen Stats Route Recognition Model
Second, the Next Gen Stats Route Recognition Model was introduced alongside a post whose headline says, in effect, that Michael Thomas is the best route runner. The post goes on to say that he runs the crossing pattern better than anyone else, because his catch rate is higher. Right?
Well, if you actually look at Michael Thomas’ catches for all of 2019, you will see that when he did catch a crossing pattern, most of the time he came out of a slot or a bunch set, or was helped by a “clear out” pattern. The problem with the Next Gen Stats Route Recognition Model, and its attempt to simplify routes for the purpose of the model, is that one loses any understanding of what the Saints do to get Thomas into a position where he’s effective so often. In other words, the reason for Michael Thomas’ great 2019 campaign was what head coach Sean Payton and his staff did to put Mr. Thomas in position to be great, and that was enabled by Teddy Bridgewater and Drew Brees as the quarterbacks for that year.
What I take away from Michael Thomas’ record-setting 2019 effort is the continued advance in using wide receivers as the focus of a ball-control passing game. (That is something pioneered by Bill Walsh, but improved on by Bill Belichick during the 2007 NFL season, when the Pats nearly achieved a perfect record but were stopped by the NY Giants in Super Bowl XLII.) I also take away the importance of designing plays that cause the quarterback to get rid of the ball faster.
The main question is this: can Michael Thomas’ record-setting 2019 effort be topped in 2020? I think it can, if head coach Sean Payton introduces new formations that continue to give Thomas the “rub” and space from defenders to make catches. The Saints arguably have the best passing game when judged on the timing between quarterback and receiver. Don’t discount the role formations play in that outcome.
The missing element in the Next Gen Stats Route Recognition Model is any real consideration for formations. There’s no ability to determine which route came from a bunch set, for example, or a tight-slot lineup, to offer another.
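To make that point concrete, here is a minimal Python sketch of what pairing a route label with a formation tag might look like. Everything in it is hypothetical: the `RouteRecord` fields, the formation names, and the sample rows are made up for illustration, and none of it reflects how the Next Gen Stats model actually stores or scores its output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRecord:
    """One target on a pass play (hypothetical schema, not the NFL's)."""
    receiver: str
    route: str      # e.g. "crossing", one of the model's route types
    formation: str  # e.g. "bunch" or "tight-slot"; the piece I say is missing
    caught: bool


def catch_rate_by_formation(records, route):
    """Return {formation: catch rate} for a single route type."""
    targets = defaultdict(int)
    catches = defaultdict(int)
    for rec in records:
        if rec.route != route:
            continue
        targets[rec.formation] += 1
        if rec.caught:
            catches[rec.formation] += 1
    return {formation: catches[formation] / targets[formation]
            for formation in targets}


# Invented sample rows, for illustration only.
SAMPLE = [
    RouteRecord("WR1", "crossing", "bunch", True),
    RouteRecord("WR1", "crossing", "bunch", True),
    RouteRecord("WR1", "crossing", "tight-slot", True),
    RouteRecord("WR1", "crossing", "wide", False),
    RouteRecord("WR1", "slant", "wide", True),
]

rates = catch_rate_by_formation(SAMPLE, "crossing")
for formation, rate in sorted(rates.items()):
    print(f"crossing from {formation}: {rate:.0%} catch rate")
```

A single league-wide catch rate on crossing routes would hide exactly this kind of split. Once the formation travels with the route label, you can at least ask whether the receiver or the play design is doing the work.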
In closing, as the only media member (other than NFL Media) to attend the NFL’s Big Data Event at the 2020 NFL Combine, I applaud the league’s effort to shine a light, a bright one, on game analytics. My only concern is that, in doing so, there is a tendency to forget the importance of knowing how the game works. For example, the winner of the 2020 Big Data Event was a very nice man named Matt Ploenzke, whose model was such that no one, not even he, could clearly articulate how it could help the NFL beyond what we already know.
What he found was “Among roughly 40 input variables, a ball carrier’s ‘effective acceleration’ was the most important for estimating yards gained on a handoff play.” Look, anyone who’s seen Dallas Cowboys NFL Hall of Fame Running Back Tony Dorsett hit the hole fast can tell you that. Dorsett will always be my favorite NFL ball carrier. Watch this:
Meanwhile, Graham Pash and Walker Powell of NC State, who…
used kinematic data such as player positions and velocity to determine zones of control for both the offensive and defensive teams at the time of the handoff. These zones of control predict the probabilities of yards lost or gained and quantifies the risk involved with plays.
Key stat: Robert Woods (Los Angeles Rams) and Raheem Mostert (San Francisco 49ers) outperformed the model predictions the most, averaging nearly three more yards than predicted over the 2017 and 2018 seasons.
…Came in second!
And Namrata Ray and Jugal Marfatia (Washington State University), who…
measured the open space of the rusher at three time intervals — handoff, after a half-second, and after one second — to understand the association between open space and yards gained. Results indicated that the difference in the open space between the time of handoff and after a half-second or full second was a strong predictor of the number of yards gained.
Key Stat: Yards gained by the rusher increases by four yards on average for every one percent increase in the additional open area created within a half-second of the handoff.
…Came in third!
That was nuts.
What Namrata Ray and Jugal Marfatia learned, for me, should revolutionize running play design and formation use. Right now, too many NFL teams pay zero attention to line split variations in play design. Just saying.
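To show the arithmetic behind that key stat, here is a small Python sketch. It takes the quoted figure of four yards per one percent of added open area at face value and applies it linearly. The function names and the sample areas are my own assumptions, and I am assuming the percentage is measured relative to the open area at the handoff, which the quoted summary does not spell out.

```python
# Quoted key stat: about four yards gained, on average, per one percent
# increase in open area within a half-second of the handoff.
YARDS_PER_PERCENT = 4.0


def percent_change(area_at_handoff, area_half_second_later):
    """Percent change in open area, relative to the area at the handoff.

    Assumption: this is the baseline the percentage refers to.
    """
    if area_at_handoff <= 0:
        raise ValueError("area_at_handoff must be positive")
    return (area_half_second_later - area_at_handoff) * 100.0 / area_at_handoff


def implied_extra_yards(area_at_handoff, area_half_second_later):
    """Linear reading of the key stat; an average association, not a rule."""
    change = percent_change(area_at_handoff, area_half_second_later)
    return YARDS_PER_PERCENT * change


# Invented numbers: 50 square yards of open space at the handoff,
# 51 square yards a half-second later.
example_change = percent_change(50.0, 51.0)
example_yards = implied_extra_yards(50.0, 51.0)
print(f"{example_change:.1f}% more open area, about {example_yards:.1f} extra yards")
```

Read that way, even a small widening of the running lane in the first half-second is worth real yardage, which is why line splits deserve a place in play design. Treat the output as a rough average and not as a prediction for any single carry.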
Matt Ploenzke’s now working for the Niners. Remember, Matt: formations matter!
Don’t Forget The Name Ermal Allen, Dallas Cowboys Quality Control Coach
At the same 2020 NFL Combine, I noticed that media types seem to think we’re in some kind of golden age of data. Even someone with the Dallas Cowboys (name left out) drank that Kool-Aid. Then I mentioned that the real pioneer of analytics was Ermal Allen, Dallas Cowboys Quality Control Coach under Head Coach Tom Landry. Allen used computers to scout team tendencies, and did a remarkable job given how limited the technology of that age was compared to today. But the point is, he set the tone for the analytics departments of NFL teams in the 21st Century.
This is from United Press International in 1981:
Allen played his college football as a quarterback at the University of Kentucky and coached there from 1948 through 1961. He and assistant head coach Jim Myers have the longest tenure under Cowboys head coach Tom Landry, who has coached the team since its inception in 1960.
NFL teams exchange films of their last four games for scouting purposes and Allen dissects those films in an attempt to learn everything there is to know about the opponent.
All the information he gleans from his studies is fed into a computer to try to predict what a club is most likely to do in certain situations. All of that information is ready for the rest of the staff by the first of each week to assist Landry and his assistants in the formulation of their game plan.
On game days Allen stays in the press box with three other assistants to chart the opposition’s plays with Tubbs.
Stay tuned.