Over the past few weeks we’ve taken a close look at “What makes teams win” in the NFL. Most coaches and traditionalists will tell you that teams must be able to run the football, stop the run and win the turnover battle. All three are certainly important, but they are not the most important factors in predicting success in the National Football League. Statistics can be used for a number of different reasons in professional football, but a General Manager or Director of Player Personnel should use them to help guide personnel decisions, not necessarily to justify them.
Brian Burke of Advanced NFL Stats has done his own extensive study over an eight-season stretch to see just which stats correlate with winning and which are most predictive of it. Burke’s work inspired the following posts on The Football Educator:
- A “NY” GIANT of a problem – Player Personnel, Injuries & Passing Efficiency
- Denver’s football scouting can learn from the math – Orton not Broncos only problem
- Winning & Offensive Run Efficiency; Football Scouts looking for top RB’s
- NFL Player Personnel – Find your “shutdown” DC and a stronger “equation” to winning
- Fumblitis – Is that on a football player evaluation form?
- NFL GM to his scout, “Aim straight, aim true, don’t miss”
- Scouting to win in the NFL? It all adds up in the end
- Draft picks & NFL free agents: Smart + Quicks = WIN
Here’s Burke’s descriptive explanation of the entire equation. Read it carefully and really think it through before formulating your opinion.
Success Rate (SR) is a simple measure of whether or not a play improves an offense’s expected net point potential. It essentially ignores the magnitude of a play’s result, and instead focuses only on whether a play was simply a good outcome or bad outcome.
Although team SR statistics ignore important information in terms of explaining past wins, they may be able to predict future outcomes better than other measures. A team’s SR on run plays is particularly informative, because it is not sensitive to the low-frequency but high-impact events that are largely subject to randomness, such as long broken runs or turnovers.
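To make the idea concrete, here is a minimal sketch of how a success rate might be computed. It assumes each play already carries a change in expected points and that a play counts as a success when that change is positive; the function names and the numbers are made up for illustration and are not Burke’s data or code.

```python
def success_rate(ep_changes):
    """Share of plays whose expected-points change is positive.

    Assumption: "success" means a strictly positive change. The size of
    the change is ignored, which is the whole point of the stat.
    """
    if not ep_changes:
        raise ValueError("need at least one play")
    successes = sum(1 for delta in ep_changes if delta > 0)
    return successes / len(ep_changes)


def mean_change(ep_changes):
    """Average expected-points change per play (sensitive to big plays)."""
    if not ep_changes:
        raise ValueError("need at least one play")
    return sum(ep_changes) / len(ep_changes)


# Hypothetical expected-points changes for four run plays.
steady = [0.3, -0.2, 0.4, -0.5]
# The same plays, except one of them becomes a long broken run.
one_big_run = [0.3, -0.2, 5.0, -0.5]

print("SR:  ", success_rate(steady), success_rate(one_big_run))
print("Mean:", mean_change(steady), mean_change(one_big_run))
```

The one long run moves the per-play average a great deal but leaves the success rate untouched, which is why SR is less exposed to rare, high-impact plays.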
Compared to simple running efficiency, run SR correlates better with winning (0.39 compared to 0.15). This is telling and helpful, but it only accounts for past outcomes. How well a statistic predicts future outcomes is not just about the parlor game of picking winners. Stats that predict future outcomes measure the signal of how good a team really is, underneath all the noise of randomness.
In other words, there is no ‘right now.’ There is no is. There is the known past, clouded by randomness, and there is the unknown future, clouded by uncertainty. Now is merely the ephemeral intersection between the past and future. When trying to measure team strength or player ability, the focus should be on how well a team or player is likely to play in the future.
That’s why stats that minimize noise, even while sacrificing some of the signal, can be more predictive and better measures of true ability than stats that include all the signal and all the noise.
Consistency is the key to measuring how much signal is in a stat. A stat, say total TDs, explains a lot about past wins, but unless it can be counted on as a consistent measure of a team’s future output, it isn’t helpful. But if teams consistently scored the same number of TDs each game, we’d know a lot about which teams were truly better than the others.
I selected several team stats known to correlate well with winning and tested how consistent they were within a season. Consistency was measured by how well the stat correlates with itself. I broke each team’s season into two alternating sets of eight games, with set A consisting of a team’s #1, #3, #5, … games and set B consisting of its #2, #4, … games. A statistic’s correlation coefficient between the two sets of games measures its consistency and how well we can rely on it as a predictor. The data consist of team stats from the 2002-2009 regular seasons.
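The alternating-games split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stat is averaged per game within each set (Burke may have aggregated differently), the team names and values are invented, and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def split_half_means(game_values):
    """Average a stat over odd-numbered games (set A) and even-numbered games (set B)."""
    set_a = game_values[0::2]  # games #1, #3, #5, ...
    set_b = game_values[1::2]  # games #2, #4, #6, ...
    if not set_a or not set_b:
        raise ValueError("need at least two games")
    return sum(set_a) / len(set_a), sum(set_b) / len(set_b)


def self_correlation(team_games):
    """Correlate set-A values with set-B values across all team-seasons.

    team_games maps a team-season label to its per-game stat values,
    listed in schedule order.
    """
    a_values, b_values = [], []
    for games in team_games.values():
        a, b = split_half_means(games)
        a_values.append(a)
        b_values.append(b)
    return pearson(a_values, b_values)


# Invented per-game values for three hypothetical team-seasons (4 games each).
sample = {
    "Team X 2009": [6.1, 7.0, 5.8, 6.5],
    "Team Y 2009": [4.9, 5.2, 6.3, 4.4],
    "Team Z 2009": [7.2, 5.9, 6.8, 7.5],
}
print(self_correlation(sample))
```

With real data each list would hold 16 games and there would be one entry per team per season, so the correlation would run over a few hundred pairs rather than three.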
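A statistic that both correlates with winning and correlates with itself would be a reliable predictor of future wins. Put mathematically:
Predictivity = |r(stat, wins)| × r(stat in set A, stat in set B)
where r is the correlation coefficient.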
For example, team offensive net pass efficiency correlates with team win totals at 0.66, which is about as strong a correlation as you’ll find. It is also fairly consistent throughout the season, correlating with itself at 0.55. The product of the two correlations is a 0.36 predictivity factor, which is the highest of the stats I measured.
In contrast, defensive net pass efficiency correlates with winning at -0.56, nearly as strong as its offensive counterpart. (The negative sign does not indicate a weak correlation. Instead it simply indicates that lower is better.) But defensive net pass efficiency isn’t very consistent throughout the season, correlating with itself at only 0.17. The resulting predictivity factor is only 0.09, one quarter of the predictive power of offensive pass efficiency.
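As a quick check on the arithmetic, the sketch below multiplies the two correlations for each example. It assumes the predictivity factor is the magnitude of the win correlation times the self-correlation, which matches the quoted figures to within rounding; the function name is made up for illustration.

```python
def predictivity(win_corr, self_corr):
    """Magnitude of the correlation with wins times the self-correlation.

    Assumption: the sign of win_corr is dropped, since a negative value
    only means that lower is better for that stat.
    """
    return abs(win_corr) * self_corr


offense = predictivity(0.66, 0.55)   # offensive net pass efficiency
defense = predictivity(-0.56, 0.17)  # defensive net pass efficiency

print(f"offense: {offense:.3f}")
print(f"defense: {defense:.3f}")
print(f"ratio:   {defense / offense:.2f}")
```

Because the inputs here are already rounded to two decimals, a product can land a hundredth away from the published factor, as it does for the defensive figure.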
The table below lists the full results of each of the selected team statistics. The stats are listed from most to least predictive.
| Stat | Correlation with wins | Self-correlation | Predictivity |
|---|---|---|---|
| O Pass Eff | 0.66 | 0.55 | 0.36 |
| D Pass SR | 0.37 | 0.29 | 0.11 |
| D Pass Eff | -0.56 | 0.17 | 0.09 |
| O Fum Rate | -0.33 | 0.24 | 0.08 |
| D Run SR | 0.21 | 0.32 | 0.07 |
| D Int Rate | 0.34 | 0.15 | 0.05 |
| O Run Eff | 0.15 | 0.33 | 0.05 |
| O Int Rate | -0.47 | 0.08 | 0.04 |
| D Fum Rate | 0.20 | 0.18 | 0.04 |
These results point to ways of improving a prediction model. For example, replacing run efficiency with run SR would likely be a significant improvement. There are other implications too. Previously, I had found that offensive interceptions self-correlated much more strongly than defensive interceptions. That finding appears to have been a spurious result of using too few seasons’ worth of data, as defensive interception rate appears more consistent than offensive interception rate.
At the very least, this tells us that if you had to choose only one stat to measure a team, it should be offensive pass efficiency. It’s both highly predictive and simple to calculate.