*UPDATE: Dan in the comments correctly points out that the numbers I listed are actually double what they should be. All the attempt numbers should be cut in half. Thanks, Dan.*
On September 7, 2008, rookie Matt Ryan made his debut, launching a 62-yard touchdown strike to Michael Jenkins on his first ever NFL pass. It was quite the start for the 3rd overall pick, but while the future was bright for the young signal-caller, nobody expected him to average 62 yards per attempt. He would most certainly come back to earth.
So we can all agree that one pass attempt is not enough data to draw conclusions about a player’s true ability. In his debut, Ryan went on to complete 9 of his 13 passes (69.2%) without an interception; in his second start, he completed just 39.4% of his passes with no TDs and 2 INTs. Again, most of us will accept that a game or two is still too small a sample. So at what point can we start to accept the results as indicative of a player’s true talent?
To answer that question, I will use a technique championed by Tom Tango in baseball to determine how long it takes for specific skills to stabilize. The idea is to find out how many attempts (or whatever denominator you are using) it takes before the observed data is half real (skill or true variance) and half noise (luck or random variance).
To find the point at which the data is half real and half noise, I need to determine where the correlation (r) between two random sets of attempts is 0.5. Let’s take pass yards per attempt as an example. I’ll start with, say, 200 attempts. For all QBs in my set (2000-2009) who have at least 200 attempts, I randomly select two sets of 100 attempts each, calculate the YPA for each set for each QB, and then find the correlation between the two sets. To make sure the correlation coefficient converges, I repeat this process 25 times. I then choose a new number of attempts and repeat the process again, producing a new “r” for each number. For each bucket, we can predict the point at which r = 0.5 by plugging into the formula ((1-r)/r) * Attempts. Below is the table for Yards Per Attempt:
| Attempts | r | n | r = 0.5 |
|---|---|---|---|
As you can see, the projected number of attempts for where r = 0.5 hovers around 800. So when a QB reaches approximately 800 pass attempts, our best prediction of his true YPA talent would be half his current YPA and half league average (we could use the average for some other population besides the league if we choose, but league average is generally a good starting point).
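The split-half procedure behind that table can be sketched in Python. This is a minimal illustration, not the code used for the analysis. The input layout (a dict mapping each QB to a list of yards gained on each attempt) and the names `pearson_r`, `split_half_r`, and `projected_half_point` are assumptions of mine. Averaging r over the 25 repetitions is my reading of "repeat this process 25 times".

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_r(yards_by_qb, total_attempts, repeats=25, rng=None):
    """Average correlation between two random, non-overlapping halves.

    yards_by_qb is a hypothetical input: {qb_name: [yards on each attempt]}.
    """
    rng = rng or random.Random()
    half = total_attempts // 2
    # Keep only QBs with enough attempts to fill both halves.
    eligible = [y for y in yards_by_qb.values() if len(y) >= total_attempts]
    rs = []
    for _ in range(repeats):
        first, second = [], []
        for yards in eligible:
            # Draw without replacement, then split the draw in two.
            drawn = rng.sample(yards, 2 * half)
            first.append(sum(drawn[:half]) / half)   # YPA in the first half
            second.append(sum(drawn[half:]) / half)  # YPA in the second half
        rs.append(pearson_r(first, second))
    return sum(rs) / len(rs)


def projected_half_point(r, attempts):
    """The formula from the text: ((1 - r) / r) * attempts."""
    return (1 - r) / r * attempts
```

For example, `projected_half_point(0.2, 200)` works out to 800. Per the update at the top, the published attempt numbers are double what they should be. That would be consistent with passing the combined total (200) as `attempts` rather than the size of each half (100), though that is my inference rather than something stated in the correction.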
Let’s now look at the results for six important QB metrics.
| Stat | Definition | Stabilizes at | Seasons |
|---|---|---|---|
| Sack% | Sack / Dropback | around 400 dropbacks | 0.75 |
| Comp% | Comp / Att | around 500 attempts | 1.00 |
| YPA | Yards / Att | around 800 attempts | 1.60 |
| YPC | Yards / Comp | around 650 completions | 2.15 |
| TD% | Pass TD / Att | around 2250 attempts | 4.50 |
| INT% | INT / Att | around 5000 attempts | 10.00 |
To interpret this table, we can say: “At around 500 attempts, a QB’s completion percentage is half real and half noise.” I added a “seasons” column to give a sense of roughly how many seasons each stat takes to stabilize. For example, yardage stabilizes in fewer completions (around 650) than attempts (around 800). However, it takes a little over two seasons for a QB to pile up 650 completions, compared to around a season and a half to reach 800 attempts, meaning a QB’s YPA will actually stabilize faster in real time than his YPC.
A bit of a surprise to some is that Sack% actually stabilizes the fastest. This doesn’t prove that QBs themselves are mainly responsible for how often they are sacked (offensive line, scheme, and opponents are also involved), but it does suggest it. On the other end of the spectrum is INT%, which takes approximately 5000 attempts to stabilize. That means that until a QB has played nearly 10 seasons in the league, his true interception rate is probably closer to league average than to his observed interception rate. The implication that QBs have much more control over their sack rate than their interception rate is worth further research. At the very least, sack percentage is much more stable than interception percentage, whatever the cause.
This provides an interesting look at what things a QB most controls. Sack rate, yards per attempt or completion, and completion percentage are all things that quarterbacks have sizable control over. Touchdowns and especially interceptions are much more susceptible to luck and other extraneous factors.
Tom Brady, for example, is coming off a fantastic MVP season in which his 0.8% INT% lowered his career rate to 2.2%. However, at 4700 career attempts and using 3.0% as the league average, we’d expect Brady to be around 2.6% going forward, a far cry from his incredible 2010. On the other side, Eli Manning saw his INT% climb to 4.6%, the highest of his career (3.4% career rate on 3300 attempts). This analysis suggests we should expect him to return to near league average next season. To make actual predictions, we’d want to do a much more rigorous analysis including many more factors, but this gives us a sense that guys like Brady and Eli are much more likely to see their interception rates regress heavily towards league average than stay at the extreme levels we saw last season.
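The arithmetic behind those expectations can be sketched as a weighted blend of a QB's own rate and league average. This assumes the usual regression-to-the-mean weighting, where a QB with n attempts gets weight n / (n + k) on his own rate and k is the stabilization point from the table (5000 attempts for INT%). The function name `regressed_rate` and its parameters are my own labels, not part of the original analysis.

```python
def regressed_rate(observed, attempts, stabilization_attempts, league_avg):
    """Blend an observed rate with league average.

    Assumed weighting: attempts / (attempts + stabilization_attempts)
    goes on the observed rate, and the remainder on league average.
    """
    weight = attempts / (attempts + stabilization_attempts)
    return weight * observed + (1 - weight) * league_avg


# Career INT% and attempts as quoted in the text; 3.0% league average.
brady = regressed_rate(2.2, 4700, 5000, 3.0)
eli = regressed_rate(3.4, 3300, 5000, 3.0)

print(f"Brady: {brady:.1f}%")  # Brady: 2.6%
print(f"Eli: {eli:.1f}%")      # Eli: 3.2%
```

When `attempts` equals `stabilization_attempts`, the weight is exactly 0.5, which is the "half real, half noise" point. The Brady figure matches the 2.6% quoted above, which suggests this weighting is at least close to the one used.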
We can use this type of analysis to look at team-level statistics, or players at other positions. The idea is to help us know when statistics stabilize and become reliable, and when we should take them with a grain of salt (or league average).