by W. Casan Scott - Baylor University
Traditionally we value teams based on yards gained or allowed, and points scored or allowed. How often does the offense score, or the defense stop opponents from scoring? A team may gain 5000 yards in a season and score 50 touchdowns. How does this compare to other teams in the conference? Or other teams in the country? And what implications does the offense have for defensive statistics? If an offense is extremely efficient, that may lead to more plays per game for the defense. More defensive plays generally correlate with more points allowed and a worse overall defense. Tracking team statistics individually can be useful, but only if they are understood within the complete context of the team. Standardizing a statistic against the population and the complete data set can correct for misleading biases. This article will introduce the use of standardized scores, or z scores, in quantifying NCAA football team statistics. I will also explore the ability to predict synchronous responses in z scores using a team’s recruiting class rating.
A standard score, or z score, is the number of standard deviations a measurement is above or below the mean. So, a z score of 1 is assigned to a measurement that is 1 standard deviation above the mean. The figure below shows z scores and the associated percentage of a normally distributed population. About 68% of the population lies between z scores of -1 and +1, about 95% lies between -2 and +2, and about 99.7%, virtually the entire population, lies between -3 and +3.
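The 68%, 95%, and 99.7% figures can be checked numerically. The following is a minimal sketch that assumes the statistic is normally distributed; the function name coverage_within is a hypothetical helper for illustration, not something used in the original analysis.

```python
import math

def coverage_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    # For a normal distribution, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {coverage_within(k):.1%}")
```

Real football statistics are rarely perfectly normal, so these percentages are best read as a rule of thumb.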
For example: Suppose Johnny Manziel rushed for 1000 yards last year and the average rushing yards for all NCAA QBs was 250 yards. If the standard deviation, a measure of variability in the population, is 250 yards, you would solve for Manziel’s z score like this:
Z = (measurement – mean) / standard deviation of the population
Z = (1000 – 250) / 250
Manziel’s rushing Z score = 3.00
In this example, Johnny Manziel’s rushing Z score would place him at roughly the 99.9th percentile (only about 0.13% of a normally distributed population sits above a z score of +3)… elite as it gets. These are hypothetical numbers, but the rest of the article won’t be.
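The same arithmetic can be written as a short script. This is a sketch using the hypothetical Manziel numbers from the example above; the percentile conversion assumes a normal distribution, and the function names z_score and percentile_from_z are illustrative.

```python
import math

def z_score(measurement: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations a measurement sits above or below the mean."""
    return (measurement - mean) / std_dev

def percentile_from_z(z: float) -> float:
    """Percentile of a z score, assuming a normal distribution (an assumption)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical numbers from the example: 1000 yards, mean 250, SD 250.
z = z_score(1000, 250, 250)
print(z)                                 # 3.0
print(f"{percentile_from_z(z):.1f}")     # 99.9
```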
I collected offensive and defensive data from 120 FBS teams in 2013. I wanted to understand what the cumulative z score for all statistics told us about the 2013 season. My hope was that a cumulative, or synchronous, response in all statistics would tell us more about the true performance of a team than reducing everything down to expected points or yards. Although expected points is ultimately what we care about, I feel that this approach allows a truer comparison between teams. I calculated z scores for all offensive and defensive team statistics for the 120 teams, normalized the scores to strength of schedule, and ranked teams according to their cumulative z score. The figure below shows all 120 teams and the distribution of their offensive and defensive z scores. I also plotted each school according to its recruiting points in 2011 (i.e., the highest point total corresponds to the highest-rated recruiting class).
Team statistics include the following for BOTH offense and defense: points per game, pass completions per game, pass attempts per game, pass percentage per game, pass yards per game, pass TDs per game, rush attempts per game, rushing yards per game, rushing TDs per game, total plays per game, total yards per game, yards per play, passing first downs per game, rushing first downs per game, first downs on penalty, total first downs per game, number of penalties per game, penalty yards per game, fumbles per game, and interceptions per game.
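Standardizing each of these statistics across all teams might look something like the sketch below. The team names and numbers are hypothetical placeholders, not the 2013 data, and the strength-of-schedule adjustment is left out because its details are not described here. The population standard deviation (pstdev) is used to match the formula given earlier.

```python
from statistics import mean, pstdev

# Hypothetical per-game numbers for illustration only (not the article's data).
teams = {
    "Team A": {"points_per_game": 40.0, "yards_per_play": 7.0},
    "Team B": {"points_per_game": 30.0, "yards_per_play": 6.0},
    "Team C": {"points_per_game": 20.0, "yards_per_play": 5.0},
}

def standardize(table):
    """Return {team: {stat: z}} with each statistic standardized across all teams."""
    stat_names = list(next(iter(table.values())).keys())
    z_table = {team: {} for team in table}
    for stat in stat_names:
        values = [row[stat] for row in table.values()]
        mu, sigma = mean(values), pstdev(values)
        for team, row in table.items():
            # Guard against a statistic with zero spread across teams.
            z_table[team][stat] = (row[stat] - mu) / sigma if sigma else 0.0
    return z_table

z_scores = standardize(teams)
for team, row in z_scores.items():
    print(team, {stat: round(z, 2) for stat, z in row.items()})
```

Because every statistic ends up on the same scale (standard deviations from the national mean), points, yards, and turnovers can be compared or combined directly.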
This figure is a bit much to take in, but let’s look at Florida State. Offensive z scores (green) are largely positive while defensive z scores (red) are largely negative. Because defensive statistics measure what a team allows, a negative defensive z score means the defense gave up less than the national average. So both the offense and the defense performed very well in relation to the rest of the country. Teams on the low end of the recruiting scale have much less polarity in offensive and defensive z scores, meaning both measures were closer to average than those of Florida State. I wanted to make this information more digestible, so I placed the Top 25 cumulative (Offense and Defense) Z scores in the FBS in 2013 in the table below.
We can see that some teams had dominant defenses (Alabama) and others dominant offenses (Baylor). By taking the cumulative Z score of all team statistics, we can get a sense of the overall team performance and how each offense and defense complement each other. Visually, the top 30 cumulative Z scores in 2013 are displayed below.
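One way to combine per-statistic z scores into a single cumulative number is sketched below. The exact combination rule is not spelled out in this article, so the sign convention here is an assumption: each z score is oriented so that "good for the team" is positive (for example, a negative defensive points-allowed z score counts in the team's favor) and then everything is summed. All values are hypothetical.

```python
# Hypothetical z scores for one team, for illustration only (not the article's data).
offense_z = {"points_per_game": 1.8, "yards_per_play": 1.5, "interceptions_per_game": -0.9}
defense_z = {"points_per_game": -1.6, "yards_per_play": -1.2, "interceptions_per_game": 1.1}

# Assumed orientation: +1 if a higher raw value helps the team, -1 if it hurts.
OFFENSE_SIGN = {"points_per_game": 1, "yards_per_play": 1, "interceptions_per_game": -1}
DEFENSE_SIGN = {"points_per_game": -1, "yards_per_play": -1, "interceptions_per_game": 1}

def oriented_sum(z_by_stat, sign_by_stat):
    """Sum z scores after flipping each so that positive means 'good for the team'."""
    return sum(sign_by_stat[stat] * z for stat, z in z_by_stat.items())

offense_total = oriented_sum(offense_z, OFFENSE_SIGN)
defense_total = oriented_sum(defense_z, DEFENSE_SIGN)
cumulative_z = offense_total + defense_total

print(f"offense {offense_total:.1f}, defense {defense_total:.1f}, cumulative {cumulative_z:.1f}")
```

Keeping the offensive and defensive totals separate before adding them makes it easy to see whether a team's score comes from one dominant unit or from both.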
This gives an idea of how offense and defense contribute to overall team performance. Florida State is “closest” to the ideal Great Offense and Great Defense, and they ended up winning the BCS Championship. But how do these cumulative Z scores relate to actual collegiate football rankings? As it turns out, 22 of the top 25 teams in ESPN’s 2013 final rankings had a cumulative Z score above 10. So a threshold of 10 is somewhat comparable to the conventional top 25 threshold. Earlier I mentioned each team’s 2011 recruiting class and the influence it had on 2013 performance. This idea builds on a previous article of mine that detailed the correlation between a team’s performance and its recruiting class 3 years prior. Using recruiting class as a predictor for Cumulative Z score, I saw a pretty compelling relationship, seen below.
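A relationship like this can be quantified with a simple least-squares line and a correlation coefficient. The sketch below is one plausible way to do that, not necessarily the method behind the figure; the recruiting points and cumulative z scores are hypothetical values chosen for illustration.

```python
import math
from statistics import mean

# Hypothetical (recruiting points, cumulative z score) pairs, for illustration only.
recruiting_points = [100, 150, 200, 250, 300]
cumulative_z = [-5, 0, 4, 11, 15]

def fit_line(x, y):
    """Ordinary least-squares slope, intercept, and Pearson correlation r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = fit_line(recruiting_points, cumulative_z)
print(f"predicted z = {slope:.3f} * points + {intercept:.1f}, r = {r:.3f}")

# Predicted cumulative z score for a hypothetical class worth 220 points.
print(round(slope * 220 + intercept, 2))
```

The correlation r summarizes how tightly teams cluster around the fitted line; a value near 1 would correspond to the kind of compelling relationship described above.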
Recruiting class does a good job of predicting Cumulative Z scores, which track fairly well with conventional rankings. I do not expect Cumulative Z scores to replace conventional measures of team performance or rankings. I do, however, feel that normalizing and standardizing team statistics and observing synchronous responses in them is integral to understanding team analytics. Z scores express a measurement relative to the mean, in units of standard deviation. Standardizing and normalizing data to better understand what is truly happening can only help analytics.