Blue Bombers Forum
October 22, 2017, 12:23:00 PM *
Author Topic: QUAR - Analysis of the CFL's QB rating system  (Read 192 times)
Stats Junkie
Hero Member
*****
Posts: 1317


Unofficial Blue Bombers Historian


« on: October 12, 2017, 07:56:51 PM »

Earlier this season the Canadian Football League unveiled QUAR, the league's new quarterback rating system. So what is QUAR and how effective is it?

QUAR is based on CFL data for the period 2009-2016. We have been told that QUAR incorporates the existing pass efficiency rating and then adds 8 new categories which results in a rating scored out of 100. As the season progressed, we have been given sample data and other details about QUAR. The one thing that is sorely lacking is the formula.

Fortunately, enough data has been provided to backwards engineer the QUAR formula - it looks like this:


How effective is it?
One of the first considerations is how a formula like QUAR deals with statistical outliers, the data points that are so far from the average that they skew the end results. The method that the CFL chose is to establish ranges of relevant data and apply minimums and maximums to each aspect of the formula. This is very effective - in fact it is the same method that is used in the Pass Efficiency Rating.

Pass Efficiency Rating (40%)
Speaking of Pass Efficiency Rating (PER), perhaps we should start our analysis with that component of QUAR - PER comprises 40% of the end result for QUAR.

PER was developed in the early 1970s and takes into account four categories: completion percentage, average gain per pass attempt, TD frequency and interception frequency. For each of these categories, a score of one represented an average score (average in the 1970s). Each component is also subject to a minimum score of zero and a maximum score of 2.375.

The minimum possible score for PER is zero; this is achieved by scoring 0 in each of the 4 components. A perfect score for PER is 158.33; this is achieved by scoring a maximum 2.375 in each of the 4 categories.


When it comes to pass efficiency rating and the CFL, it should be noted that there are 3 variations of the formula in play. There are 2 versions used by the CFL and the third version is the correct one (detailed above).

The more common version of the formula that the CFL uses was described in the weekly stats package earlier this season as an "uncapped" version. That is to say, the min/max parameters that prevent statistical outliers from skewing the results are ignored by the CFL.


Various articles and stats packages issued by the CFL also indicate that the maximum score for PER is a "mysterious" or "theoretical" value. It is also suggested that the 158.33 score has been exceeded several times in recent years - of course this is NOT possible if the correct formula is used.

The second variation of the PER used by the CFL shows up in the post game stats packages. This version of the formula is simply the "uncapped" version with an arbitrary maximum set at 158.3 and an arbitrary minimum set at 0.

To demonstrate the difference between the 3 formulas, let's look at a simple example. A quarterback throws one pass which is completed for 19 yards; no TDs and no interceptions.


For completion percentage a score of 3.5 results which should be reduced to the maximum score of 2.375. For average gain a score of 4.0 is the result which again should be reduced to 2.375. For the TD calculation, the minimum score of 0 is attained (no adjustment required) and for the interception calculation, the maximum score of 2.375 results (again, no adjustment needed).

The "uncapped" version of the formula yields a PER of 164.58. The post-game version of the formula yields 158.33. The correct formula results in a PER of 118.75.

Here we have 3 variations of the same formula with 3 very different results. The only difference between the formulas is how the min/max parameters are applied.
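For anyone who wants to check the numbers, here is a minimal sketch of the three variations in Python. The component coefficients are the standard passer rating ones, which reproduce the 3.5, 4.0, 0 and 2.375 scores in the example; the function names are mine, and the 158.33 ceiling in the post-game version is an assumption based on the result quoted above.

```python
def clamp(value, low=0.0, high=2.375):
    """Limit a component score to the range [low, high]."""
    return max(low, min(high, value))


def per_components(attempts, completions, yards, touchdowns, interceptions):
    """Raw (uncapped) scores for the four PER categories."""
    completion = (100.0 * completions / attempts - 30.0) / 20.0
    average_gain = (yards / attempts - 3.0) * 0.25
    td_frequency = (100.0 * touchdowns / attempts) * 0.2
    int_frequency = 2.375 - (100.0 * interceptions / attempts) * 0.25
    return [completion, average_gain, td_frequency, int_frequency]


def per_uncapped(*stats):
    """'Uncapped' version: no min/max applied anywhere."""
    return sum(per_components(*stats)) / 6.0 * 100.0


def per_post_game(*stats):
    """Post-game version: uncapped total, then limited to 0..158.33 (assumed)."""
    return max(0.0, min(950.0 / 6.0, per_uncapped(*stats)))


def per_correct(*stats):
    """Correct version: each category limited to 0..2.375 before summing."""
    return sum(clamp(c) for c in per_components(*stats)) / 6.0 * 100.0


# One pass, completed for 19 yards, no TDs, no interceptions.
example = (1, 1, 19, 0, 0)
print(round(per_uncapped(*example), 2))   # 164.58
print(round(per_post_game(*example), 2))  # 158.33
print(round(per_correct(*example), 2))    # 118.75
```

The only thing that changes between the three functions is where the clamp is applied: nowhere, on the final total, or on each category.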

I would argue that since the CFL uses a broken version of the PER that QUAR is also a broken formula.

Let's assume that the CFL was to use the correct version of PER in its QUAR formula. The PER portion of QUAR would yield a score of 46.23 which is reduced to the maximum score of 40.


Instead of using PER as a single score, let's look at the 4 categories of PER as separate calculations with a maximum score of 10 for each. In the example above, each score of 2.375 would become a score of 10 while the score of zero would remain a zero. That means that the 3 scores of 2.375 become 3 scores of 10 for a net score of 30.

Why is there a difference? Well 118.75 is 75% of 158.33 just as 30 is 75% of 40. The PER portion of QUAR applies min/max parameters to a calculation that has already had min/max parameters applied. By applying a second set of parameters, the PER component of QUAR becomes unbalanced.
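The per-category alternative can be sketched the same way. This is only an illustration of the idea in the two paragraphs above, not the CFL's method; `category_points` is a made-up helper name, and the raw scores are the ones from the one-pass example.

```python
MAX_COMPONENT = 2.375  # cap on each PER category


def category_points(component_score, max_points=10.0):
    """Cap one PER category at 0..2.375, then rescale it to 0..max_points."""
    capped = max(0.0, min(MAX_COMPONENT, component_score))
    return capped / MAX_COMPONENT * max_points


# Raw category scores from the example: completion %, average gain, TD, INT.
raw_scores = [3.5, 4.0, 0.0, 2.375]
points = [category_points(score) for score in raw_scores]
total = sum(points)

print(points)        # [10.0, 10.0, 0.0, 10.0]
print(total)         # 30.0
print(total / 40.0)  # 0.75
```

Because each category is capped once and only once, the result stays at 75% of the maximum, the same proportion as 118.75 out of 158.33.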

We now know that the pass efficiency rating portion of QUAR is poorly constructed; perhaps the remainder of QUAR has some redeeming qualities. Let's take a look at the other components.

Winning Percentage (10%)
Many stats are used as a predictive measure for winning percentage. Incorporating winning percentage into the QUAR calculation removes that predictive value from the formula.

-   QUAR is NOT valid as a single game calculation. In statistics, a large sample size is required to make a result significant. When it comes to winning percentage, there is a sample of one for a single game which makes this portion of the formula statistically insignificant.

-   For most of the components of QUAR, the average score is 70-80% of the maximum possible score for the category. For winning percentage, the best average that can be achieved is 50% of the maximum. An average of less than 50% is actually possible.

Example: Earlier this season, Calgary defeated Hamilton 60-1. In that game, 5 quarterbacks each led at least one drive to qualify for a QUAR rating. Bo Levi Mitchell recorded the win (10 points) and Zach Collaros was tagged with the loss (0 points). The three other QBs (Jeremiah Masoli, Andrew Buckley & Ricky Stanzi) each received a no decision which results in a score of zero - the same as a loss. As a result, the average score for winning percentage from that game was 2.0 or 20% of maximum (hardly consistent with the other QUAR categories).
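The arithmetic for that game is short enough to write out. The point values (10 for a win, 0 for a loss or a no decision) are the ones described above; the dictionary layout is just for illustration.

```python
# Winning percentage points for the 5 QBs who qualified in Calgary's 60-1 win.
POINTS = {"win": 10.0, "loss": 0.0, "no decision": 0.0}

decisions = {
    "Bo Levi Mitchell": "win",
    "Zach Collaros": "loss",
    "Jeremiah Masoli": "no decision",
    "Andrew Buckley": "no decision",
    "Ricky Stanzi": "no decision",
}

scores = [POINTS[result] for result in decisions.values()]
average = sum(scores) / len(scores)

print(average)         # 2.0
print(average / 10.0)  # 0.2
```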

Drive based calculations
Points per drive (15%)
1st downs per drive (5%)
Yards per drive (10%)
Sacks taken per drive (5%)
Fumbles per drive (5%)
QB rushing yards per drive (5%)

I commend the CFL for delving into drive based stats for the QUAR formula.

1.   It is possible for two quarterbacks to have the same stats for a game. If one QB achieves his stats on fewer drives, then he is actually more efficient when it comes to stats based on yards, first downs, points scored, etc. The QB with a greater number of drives is better for stats based on sacks taken, turnovers, etc.
2.   In recent years, we have seen CFL teams use a specialist QB for short yardage situations and goal line packages. It makes sense to have all statistics accumulated on a drive attributed to the drive QB.

Three components of QUAR are well designed - 1st downs per drive, yards per drive and points per drive.

There are three other QUAR components which are also described as drive based - QB rushing yards per drive, fumbles per drive and sacks taken per drive. Although they are described as a per drive calculation, they are actually individual stats disguised as drive based numbers.

Example: Earlier this season, Andrew Buckley entered the game on a 3rd and 1 situation and he galloped 60 yards for a TD. Bo Levi Mitchell was the drive QB for Calgary and he was credited with the 1st down attained on the play, 60 yards of net offence and 6 points for the unconverted TD. It was Andrew Buckley and not Mitchell who received credit for the QB rushing yards (60) although Buckley was not credited with the drive.

This means that the specialist quarterbacks are accumulating stats to be applied against other drives that these men lead. When these specialist QBs are tagged with a sack or fumble, these stats are applied against other drives as well.

To put this into perspective, Andrew Buckley would have the maximum score (5) in the QB rushing yards component of QUAR for his next 40 drives.

2nd Down Conversion Rate (5%)
The final component of QUAR is 2nd down conversion rate. It makes sense to have this stat apply to the drive QB because in many instances it is the drive QB who accumulates the first 8-9 yards. In practice, it is the QB who plays on the 2nd down play who accumulates the stats. As a result, these specialist QBs often achieve a maximum score in the 2nd down conversion category because they are always attempting to convert 2nd and short situations.

In conclusion, my initial concerns about QUAR were that no formula was provided and that there was no historical data for comparison - it was based on data from 2009-2016. After breaking QUAR down into its components, I am just as concerned about the poor execution of the formula in general.
 
Logged

@Stats_Junkie
BomberPride
Hero Member
*****
Posts: 2308


« Reply #1 on: October 12, 2017, 08:03:29 PM »

My nose started bleeding when I read this.

Thanks for breaking this down Stats Junkie.
Logged

What is more gentle than water, yet who can control the raging flood?
TecnoGenius
Hero Member
*****
Posts: 524


« Reply #2 on: October 13, 2017, 01:36:20 AM »

Fortunately, enough data has been provided to backwards engineer the QUAR formula - it looks like this:

I minored in math and there's no way someone derived this formula from just the final number and misc stats they've been assigning QB's.  Someone had to have inside info.

For one thing, not even knowing what exact inputs were used would make it nearly impossible by itself.  Second, even if you knew exactly what stats were used as input values, you're talking about solving a polynomial of order 10, with multiple constant coefficients involved, and some inverse values.  Maybe that could be done, but it's approaching NP-complete level difficulty.  And it might take more than just 3/4 a season's worth of data to calculate.  So I doubt someone actually reverse engineered the formula.

That said, it's a fascinating formula and kudos to whoever came up with it, and it's interesting to see it started in Canada (right?).  Thanks for posting.  Hopefully the black CFL helicopters won't be circling your house at night.  Having a formula which has a max of 100 is so much better than the lame PER with its arbitrary maximum.
Logged
ModAdmin
Administrator
*****
Posts: 8801


Reaves,Cameron,Riley,Walby - Blue Bomber Legends


« Reply #3 on: October 13, 2017, 01:42:13 AM »

I minored in math and there's no way someone derived this formula from just the final number and misc stats they've been assigning QB's.  Someone had to have inside info.

For one thing, not even knowing what exact inputs were used would make it nearly impossible by itself.  Second, even if you knew exactly what stats were used as input values, you're talking about solving a polynomial of order 10, with multiple constant coefficients involved, and some inverse values.  Maybe that could be done, but it's approaching NP-complete level difficulty.  And it might take more than just 3/4 a season's worth of data to calculate.  So I doubt someone actually reverse engineered the formula.

That said, it's a fascinating formula and kudos to whoever came up with it, and it's interesting to see it started in Canada (right?).  Thanks for posting.  Hopefully the black CFL helicopters won't be circling your house at night.  Having a formula which has a max of 100 is so much better than the lame PER with its arbitrary maximum.


I was right there with you until the bolded portion!  Grin
Logged

"You can't let praise or criticism get to you. It's a weakness to get caught up in either one." - John Wooden
Tiger
Hero Member
*****
Posts: 3735



« Reply #4 on: October 13, 2017, 03:18:50 AM »

Thanks Stats Junkie.  That was a dense but interesting review.  I would agree that if the CFL uses a broken version of the PER, then "QUAR is also a broken formula."
Logged

Football is easy if you're crazy as hell
 Bo Jackson

We are inclined to think that if we watch a football game or a baseball game, we have taken part in it
John Fitzgerald Kennedy

BC Sucks
Tiger
Stats Junkie
Hero Member
*****
Posts: 1317


Unofficial Blue Bombers Historian


« Reply #5 on: October 13, 2017, 03:49:26 AM »

I minored in math and there's no way someone derived this formula from just the final number and misc stats they've been assigning QB's.  Someone had to have inside info.

For one thing, not even knowing what exact inputs were used would make it nearly impossible by itself.  Second, even if you knew exactly what stats were used as input values, you're talking about solving a polynomial of order 10, with multiple constant coefficients involved, and some inverse values.  Maybe that could be done, but it's approaching NP-complete level difficulty.  And it might take more than just 3/4 a season's worth of data to calculate.  So I doubt someone actually reverse engineered the formula.
It really wasn't terribly complicated to figure out the QUAR formula. The CFL was providing a weekly QUAR package (up to week #13) as well as detailed QUAR information in the weekly team stats packages.

QUAR is really just 9 mini formulas added together to provide a final tally out of 100. None of the 9 variables is interdependent on any of the other variables.  So, at no time was I solving for more than 1 variable at a time.
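
To show the overall shape, here is a rough sketch of "9 mini formulas added together". The maximums are the percentages listed in the original post; the mini formulas themselves are not reproduced here, so the function simply takes each component's score as an input. The key names and the sample scores are made up for illustration, apart from the 46.23 PER portion mentioned earlier.

```python
# Maximum points for each of the 9 QUAR components (they add up to 100).
QUAR_MAXIMUMS = {
    "pass_efficiency_rating": 40,
    "winning_percentage": 10,
    "points_per_drive": 15,
    "first_downs_per_drive": 5,
    "yards_per_drive": 10,
    "sacks_taken_per_drive": 5,
    "fumbles_per_drive": 5,
    "qb_rushing_yards_per_drive": 5,
    "second_down_conversion_rate": 5,
}


def quar_total(component_scores):
    """Cap each component at 0..its maximum, then add the 9 results."""
    total = 0.0
    for name, maximum in QUAR_MAXIMUMS.items():
        score = component_scores[name]
        total += max(0.0, min(float(maximum), score))
    return total


# Hypothetical component scores for one QB (not real data).
sample = {
    "pass_efficiency_rating": 46.23,  # over the max, so it counts as 40
    "winning_percentage": 10.0,
    "points_per_drive": 12.0,
    "first_downs_per_drive": 4.0,
    "yards_per_drive": 8.0,
    "sacks_taken_per_drive": 4.0,
    "fumbles_per_drive": 5.0,
    "qb_rushing_yards_per_drive": 7.5,  # over the max, so it counts as 5
    "second_down_conversion_rate": 3.5,
}

print(sum(QUAR_MAXIMUMS.values()))  # 100
print(quar_total(sample))           # 91.5
```

Since each component is capped on its own and none of them feeds into another, each mini formula can be worked out separately from the published breakdowns.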
Logged

@Stats_Junkie
TecnoGenius
Hero Member
*****
Posts: 524


« Reply #6 on: October 13, 2017, 04:00:46 AM »

It really wasn't terribly complicated to figure out the QUAR formula. The CFL was providing a weekly QUAR package (up to week #13) as well as detailed QUAR information in the weekly team stats packages.

QUAR is really just 9 mini formulas added together to provide a final tally out of 100. None of the 9 variables is interdependent on any of the other variables.  So, at no time was I solving for more than 1 variable at a time.

Oh!  They break it down in their public info.  I didn't know that.  That changes it completely.  I thought you meant you were taking just the final number, like "90.5" and figuring out the whole formula from that without any other knowledge!!!  Grin

Ya, with the components laid out individually it becomes somewhat trivial (though you still need some decent algebra skills)  Cheesy  Thanks for all the info Junkie.
Logged
GOLDMEMBER
Hero Member
*****
Posts: 17711


R.I.P. BLUE BONGER


« Reply #7 on: October 13, 2017, 11:51:05 AM »

Ouch my brain hurts!
Logged

I LOSHT MY MEMBER IN AN UNFORTUNATE SHMELTING ACCSHIDENT!
rubanski
Hero Member
*****
Posts: 822


« Reply #8 on: October 13, 2017, 11:55:54 AM »

Can't believe they still quote PER without the appropriate cap some of the time. Like you said, it's been around a long time and is not complicated to figure out.
Logged