Monday, December 12, 2011

Oversigning meets yppd

So, I can’t help my love for yppd (yards per play differential). I think it’s a great indicator of relative team strength. I thought it would be interesting to compare a team’s yppd vs the size of their signing classes – so does oversigning help or hurt a team’s performance?

On the one hand, the data is clearly all over the place.  There are bad teams that sign a lot of players and there are good teams that sign a lot of players. 

That said, check this out – here are yppd scores (x-axis) mapped against the 4-year average size signing class (y-axis, using data from oversigning.com).  This is data for BCS schools only for 2006-2011 seasons.  The dots are specific “YPPD scores” (actual yppd + a SOS adjustment), the red line is a 5 data-point rolling average.
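A minimal sketch of how a 5-point rolling average like the red line could be computed. The points below are made up for illustration (they are not the chart's data), and two things are assumptions: that the points are ordered by yppd score before averaging, and that the window is trailing rather than centered.

```python
def rolling_average(values, window=5):
    """Trailing rolling mean; returns one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical (yppd score, 4-year avg class size) points, NOT the post's data.
points = [(1.2, 25), (-0.8, 22), (0.1, 23), (2.1, 27), (-1.5, 21), (0.6, 24), (1.7, 26)]
points.sort(key=lambda p: p[0])          # order by yppd score (x-axis) first
sizes = [size for _, size in points]
smoothed = rolling_average(sizes, window=5)
print(smoothed)
```

With only 7 points there are just 3 full windows, which is why a rolling line always starts a few points in from the edge of the chart.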



Interestingly, the trend is close to a straight line – the better the team is from a yppd perspective, the more players they sign per year on average. That said, it trends up by only about 1.5 signees/yr (about 23 at a 0 yppd score [average] and about 24.5 as you move towards the top end).

Also interesting is how, for the most part, top teams sign a high number of players. Consider the list below – showing all teams with a yppd score of 2 or greater – and how large their signing classes are.

So, the intuition that oversigning helps teams looks to be true, at least at a high level.



(the teams below are the top 27 yppd teams & avg # of signees over a 4 year period... the average class size is 24.7, the average class size of the bottom 27 is 22.8)
2006 West Virginia  29
2011 Alabama  28
2009 Alabama  28
2010 Auburn  28
2010 Arkansas  28
2010 Alabama  27
2007 Texas Tech  27
2011 LSU  26
2006 Arkansas  26
2008 Oregon  26
2011 Oregon  25
2010 Oregon  25
2007 Oklahoma  24
2007 West Virginia  24
2006 Texas Tech  24
2007 Mizzou  24
2008 Florida  24
2009 Florida  24
2007 Florida  24
2008 Oklahoma  24
2006 LSU  23
2007 LSU  23
2007 USC  23
2006 Florida  23
2011 Wisconsin  22
2009 Texas  21
2007 Ohio State  20
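
As a quick check on the list above, here is a small sketch that averages the class sizes exactly as printed. Because the printed sizes are rounded to whole players, the result lands near, but not exactly on, the 24.7 quoted above; that the quoted figure was computed from unrounded averages is an assumption on my part.

```python
from statistics import mean

# Rounded class sizes exactly as listed above (top 27 yppd teams).
top27_class_sizes = [
    29, 28, 28, 28, 28, 27, 27, 26, 26, 26, 25, 25,
    24, 24, 24, 24, 24, 24, 24, 24,
    23, 23, 23, 23, 22, 21, 20,
]

avg = mean(top27_class_sizes)
print(len(top27_class_sizes), round(avg, 1))
```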

Wednesday, December 7, 2011

Does running = winning?

Here's how much a team rushes relative to what % of their games they win (2008-2011 data)

1st column = % of yards gained through running (for the whole season)
2nd column = % of games won

10% 32%
20% 42%
30% 49%
40% 51%
50% 55%
60% 56%
70% 54%
80% 42%

Note: the first column is actually a range, so 10%, really equals 10-19.9%
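
A minimal sketch of the bucketing described in the note above: each team-season's rushing share is floored to its 10-point range, and wins and games are pooled within each bucket. The sample seasons are hypothetical, and pooling games (rather than averaging each team's win %) is an assumption; the post does not say which was used.

```python
def rush_bucket(rush_yards, total_yards):
    """Lower edge of the 10-point bucket, e.g. a 38% rushing share -> 30."""
    share = 100.0 * rush_yards / total_yards
    return int(share // 10) * 10

def win_pct_by_bucket(seasons):
    """seasons: iterable of (rush_yards, total_yards, wins, games)."""
    totals = {}
    for rush, total, wins, games in seasons:
        bucket = rush_bucket(rush, total)
        w, g = totals.get(bucket, (0, 0))
        totals[bucket] = (w + wins, g + games)
    return {b: w / g for b, (w, g) in sorted(totals.items())}

# Hypothetical team-seasons, for illustration only.
seasons = [(1500, 5000, 8, 12), (1900, 5000, 6, 12), (2600, 5000, 10, 13)]
result = win_pct_by_bucket(seasons)
for bucket, pct in result.items():
    print(f"{bucket}% {pct:.0%}")
```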

Next, the distribution of teams across the rushing categories, by number of wins.

For teams that have won 10 or more games:

1st column = % of yards gained through running (for the whole season)
2nd column = % of teams in that bucket

10% 0%
20% 7%
30% 30%
40% 27%
50% 28%
60% 6%
70% 1%
80% 0%


Then teams with 7-9 wins

10% 1%
20% 7%
30% 31%
40% 37%
50% 16%
60% 4%
70% 3%
80% 1%


Then teams with 6 or fewer
10% 1%
20% 13%
30% 39%
40% 26%
50% 15%
60% 3%
70% 1%
80% 1%


A quick look at this tells me that you can be a very successful team either running or throwing the ball, although a higher proportion of the winningest teams run the ball more.

ND runs the ball more with Kelly than it did with Weis, but it is in the mid-to-high 30%s - within the range where successful teams commonly are.

Also - one more cut of data, teams winning 12 or more games over the last 4 years:
10% 0%
20% 8%
30% 32%
40% 20%
50% 36%
60% 4%
70% 0%
80% 0%

YPPD - Bowl edition

I’ve taken a few cuts of my yppd scoring applied to the upcoming bowl games – hopefully this is fun/interesting for some here.

Best bowls to watch

Maximizes both quality of the teams and closeness of the games
(#'s after the team are their 'yppd scores')

#10 - Orange Bowl Clemson 0.6 v West Virginia 1.3
0.9 yppd (#9 in quality), 0.8 gap (tied for #9)

#9 - Independence Bowl Missouri 0.7 v North Carolina 0.9
0.8 yppd (#11), 0.2 gap (tied for #3)

#8 - Sugar Bowl Michigan 1.4 v Virginia Tech 0.8
1.1 (#7), 0.6 gap (tied for #7)

#7 - Capital One Bowl Nebraska 0.9 v South Carolina 1.2
1.1 (#8), 0.3 gap (tied for #4)

#6 - TicketCity Bowl Houston 1.6 v Penn St. 1.1
1.4 (#5), 0.5 gap (tied for #6)

#5 - Outback Bowl Georgia 1.1 v Michigan St. 1.6
1.4 (#4), 0.5 gap (tied for #6)

#4 - Champs Sports Bowl Florida St. 1.1 v Notre Dame 1.4
1.2 (#6), 0.5 gap (tied for #4)

#3 - Title game Alabama 2.9 v LSU 2.2
2.5 (#1), 0.7 gap (tied for #8)

#2 - Fiesta Bowl Oklahoma St. 1.7 v Stanford 1.4
1.5 (#3), 0.3 gap (tied for #4)

#1 - Rose Bowl Oregon 2.0 v Wisconsin 1.6
1.5 (#2), 0.3 gap (tied for #5)
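
The two numbers under each matchup can be sketched as the average of the two yppd scores (quality) and the absolute difference between them (gap). How the two are combined into the final 1-10 ordering is not stated in the post, so this sketch only computes the inputs. The displayed scores are rounded, so recomputing from them will not always reproduce the listed figures exactly.

```python
def matchup_summary(score_a, score_b):
    """Return (quality, gap) for a matchup of two yppd scores."""
    quality = (score_a + score_b) / 2
    gap = abs(score_a - score_b)
    return quality, gap

# Sugar Bowl from the list above: Michigan 1.4 v Virginia Tech 0.8
quality, gap = matchup_summary(1.4, 0.8)
print(f"{quality:.1f} yppd, {gap:.1f} gap")
```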

Least Competitive Bowls
Games with the largest gaps in skill between the teams.

Cotton Bowl
Arkansas 1.3 v Kansas St. (0.4)

Texas Bowl
Northwestern (0.5) v Texas A&M 1.0

Gator Bowl
Florida 1.4 v Ohio St. 0.1

Las Vegas Bowl
Arizona St. (0.1) v Boise St. 1.0

New Mexico Bowl
Temple 0.1 v Wyoming (1.0)

Armed Forces Bowl
BYU 0.1 v Tulsa 1.1

Most competitive bowls
Games with the most closely matched teams

Holiday Bowl
California 0.6 v Texas 0.5

Pinstripe Bowl
Iowa St. (0.4) v Rutgers (0.4)

Kraft Fight Hunger
Illinois 0.5 v UCLA 0.5

Independence Bowl
Missouri 0.7 v North Carolina 0.9

GoDaddy.Com Bowl
Arkansas St. 0.5 v Northern Ill. 0.7

Famous Idaho Potato Bowl
Ohio 0.3 v Utah St. 0.6

Beef O’Brady’s
FIU (0.0) v Marshall (0.2)

New Orleans Bowl
La.-Lafayette (0.0) v San Diego St. (0.3)

Little Caesar’s Pizza
Purdue (0.7) v Western Mich. (0.4)

Friday, December 2, 2011

Tuesday, November 29, 2011

There is still hope!

This is a very long post. If you want to skip reading it, the point is – all is well!

(Now, stop thinking about Animal House)

Seriously though, this is a long, data-based look at ND performance. There is a lot of skepticism/negativity on this board and I don’t think it’s warranted. When I (or anyone, doesn’t have to be me) look at a data-based view of performance, there is great cause for optimism. I understand, as I’m sure everyone here does as well, that we tend to make judgments in life based on perception and piecing together our experiences – and rarely change those perceptions based on data. With that in mind, I don’t expect this post to convince anyone – and perhaps this post is no more than a mix of a test to see how quickly people give up on reading a long post and an ND fan-style Rorschach test. But I enjoy playing with data and the journey of looking at this data has been fun for me – perhaps some here will find it interesting as well.

Preamble (yes, this is long enough to need a preamble)
I started my yards per play differential analysis at the start of the year, trying to find an objective way to measure performance. After playing with data and looking at a variety of stats, I’ve come to believe it is an excellent indicator of how good a team is or is not. Using this metric to assess ND’s progress shows some good news – a pretty optimistic story actually. Let me show you some more of the data and the thinking behind it.

The theory
For anyone who’s done a lot of data analytics, you’ll know that it’s very easy to get data to lie to you, particularly when you take small cross sections of data – there can be a lot of noise that creates misleading stats. I was looking for a simple, effective measure of team strength. I also wanted a way to adjust that metric for strength of schedule – as SOS plays a very significant role in how a team’s football stats would look.

I settled on yards per play differential. I had started off with total yards differential – inspired by omahadomer’s pts differential. I wanted to do yards because points can have significant swings due to things I thought were just as much luck as skill (like turnovers returned for TDs), granted better teams may do this more – but, in general, a team that moves the ball better will win games more frequently.

In looking at yards, I saw that the number of plays you ran had a very significant impact on yardage totals. There was just a lot of variability – but I noticed that the yards per play differential seemed to be far more consistent, after doing some adjustment for strength of schedule. It still addresses that core concept – that better teams move the ball better – and is simple, so I went with it. (I should also note that while this metric says ND is getting better, I started looking at this metric last year – well before I could know that it tells an optimistic story about ND.)

Does the theory work?
Using yppd with a few adjustments (like home field advantage), I can “predict” winners to games about 75% of the time. It takes about 5-6 games into the season for the stats to start smoothing out, but it becomes a very good indicator (although, my strength of schedule adjustment needs to get better).

Not only can I predict winners, I can look at how people have performed in the past – and this shows a very strong linkage between yppd and overall winning %. Using my “yppd score” (which includes a strength of schedule adjustment), I get the following distribution of teams (using 2008-2011 data):

yppd score, % of teams, win %
2.5, 4% , 93%
2, 9% , 87%
1.5, 19% , 77%
1, 33% , 69%
0.5, 48% , 63%
0, 79% , 50%
-0.5, 91% , 41%
-1, 96% , 27%
-1.5, 99% , 25%
-2, 99% , 12%
-2.5, 100% , 11%

The way to read this table – all yppd scores are a range: 0 is really -0.4 to +0.4, 0.5 is really 0.5 to 0.9, etc. The % of teams column is cumulative (the share of teams at that score or above). An example – 4% of teams get a yppd score of 2.5 or above – and they win 93% of their games.
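
Reading the table can be sketched as a lookup: truncate a yppd score toward zero to the nearest 0.5 (so anything from -0.4 to +0.4 lands in 0, and 0.5 to 0.9 lands in 0.5), then read off the two percentages. The function names are mine, and capping scores beyond the ends of the table at ±2.5 is an assumption.

```python
# yppd score bucket -> (% of teams at or above this bucket, win %), from the table above
YPPD_TABLE = {
    2.5: (4, 93), 2.0: (9, 87), 1.5: (19, 77), 1.0: (33, 69), 0.5: (48, 63),
    0.0: (79, 50), -0.5: (91, 41), -1.0: (96, 27), -1.5: (99, 25),
    -2.0: (99, 12), -2.5: (100, 11),
}

def yppd_bucket(score):
    """Truncate toward zero in 0.5 steps, capped to the table's range."""
    tenths = round(score * 10)           # work in whole tenths to avoid float drift
    bucket = int(tenths / 5) * 0.5       # int() truncates toward zero
    return max(-2.5, min(2.5, bucket))

def lookup(score):
    return YPPD_TABLE[yppd_bucket(score)]

pct_teams, win_pct = lookup(1.3)
print(f"top {pct_teams}% of teams, {win_pct}% win rate")
```

For example, a score of 1.3 falls in the 1 bucket (top 33% of teams), while 0.9 still falls in the 0.5 bucket (top 48%).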

The math here actually turns out to be pretty good. Thanks to another poster (I forget your handle, take credit for helping me out!), I have data from 2000-2007 as well. This data does not have my strength of schedule adjustment, but you can see that comparing yppd against top 25 finishing position shows this same story:

AP Final Ranking, YPPD
1 to 4, 1.8
5 to 9, 1.4
10 to 14, 1.1
15 to 19, 0.8
20 to 24, 0.9

If this doesn’t convince you of the validity of the statistic, nothing will – but I am a big fan.

What does this mean for ND?

For me, it means a lot of optimism. Why? Check this out. Since 2000 (thanks again other poster!), here is ND’s unadjusted yppd:

2000 (0.5)
2001 (0.6)
2002 0.3
2003 (0.4)
2004 (0.3)
2005 0.2
2006 0.2
2007 (1.3)
2008 0.2
2009 0.2
2010 0.5
2011 1.0

Using my “yppd score” (which would correspond to the tables above), ND has been:
2008 0
2009 0.3
2010 0.9
2011 1.3

Put differently, ND was in the top 79% of teams in 2008 & 2009 (although on the cusp of the top 48% in 2009), then in the top 48% of teams in 2010 (although, on the cusp of the top 33%), and is in the top 33% of teams in 2011 (although on the cusp of the top 19%).

This is a clear, significant upwards trend. We’ve gone from being decidedly average to just around the top 25 teams.

What I regret is that I haven’t found a way to get the yppd score to truly account for strength of schedule. For example, Houston has one of the highest yppd scores this year. But if I split their schedule into thirds, here is the average difficulty of each third:
1. top 4 0.7
2. top 8 (0.8)
3. rest (1.7)
Overall (0.5)

The overall average for D-IA is:
1. top 4 1.1
2. top 8 (0.1)
3. rest (1.2)
Overall (0.0)

Wow, so their schedule is easy – and their yppd score looks better than it should because of this.

Compare this with Alabama:
1. top 4 1.6
2. top 8 0.4
3. rest (1.2)
Overall 0.4

Wow, what a night-and-day difference in schedules. Alabama’s best 4 opponents are, on average, in the top 20% of teams, compared to Houston’s toughest 4 being in the top 48% on average.
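
The schedule breakdowns above can be sketched as follows: rank a team's opponents by yppd score, then average the four toughest, the next four, and everyone else. Reading "top 8" as opponents ranked 5 through 8 is an assumption on my part, and the schedule below is hypothetical.

```python
def schedule_tiers(opponent_scores):
    """Average opponent yppd score by difficulty tier, plus the overall average."""
    ranked = sorted(opponent_scores, reverse=True)   # toughest first
    tiers = {"top 4": ranked[:4], "top 8": ranked[4:8], "rest": ranked[8:]}
    summary = {name: sum(vals) / len(vals) for name, vals in tiers.items() if vals}
    summary["overall"] = sum(ranked) / len(ranked)
    return summary

# Hypothetical 12-game schedule of opponent yppd scores, for illustration only.
opponents = [1.6, 1.0, 0.8, 0.6, 0.4, 0.2, 0.0, -0.2, -0.5, -1.0, -1.5, -2.0]
tiers = schedule_tiers(opponents)
for name, value in tiers.items():
    print(f"{name}: {value:+.2f}")
```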

But strength of schedule is another reason I’m optimistic. Check out 2009 v 2011 yppd score SOS for ND:

2009
1. top 4 1.1
2. top 8 0.3
3. rest (0.9)
Overall 0.2

2011
1. top 4 1.6
2. top 8 0.1
3. rest (0.5)
Overall 0.5

Granted, my yppd score is supposed to account for strength of schedule, but it doesn’t go far enough. You can see how our schedule is harder overall this year (and comparable to Alabama’s), so the 8-4 record doesn’t look as bad (I could go a lot more into how SOS impacts W’s… and it has a very significant impact on how many games a team wins, but this post is long enough already).

What about next year?

For fun, I did these same calculations with our schedule for next year, based on our opponent’s performance this year and came up with:
2012
1. top 4 1.6
2. top 8 0.5
3. rest (0.5)
Overall 0.5

Wow, this is a killer schedule – likely to be one of the toughest in the nation.

Why this data makes me feel good?

I know I get accused of playing with data – but this data and analysis is very good, I promise ;)

Seriously though, data analytics has been part of my career and I’ve seen all sorts of good and bad analysis. If I were to be critical of what I’ve just shown you, I would say that SOS is not quite accounted for – and there is a chance that some noise exists around this. Otherwise, this is really good stuff. And the conclusions are very clear – ND is making good progress with Kelly. Is it great progress? No, I would say great progress would be more W’s. But things are moving in the right direction.

And I can also promise you that I’ve looked at more data than likely anyone on this board (or anywhere?) when it comes to evaluating a team’s performance based on numbers. I’ve looked at point differential, wins against different qualities of teams, yards, turnovers, contribution of a running-focused offense to winning %... and everything I look at tells the very same story that is here. We are making progress.

Why this data makes me feel sad?

Because we just might be killing a good thing. Go look at the yppd numbers for ND since 2000. Wow, we are clearly doing better now relative to the last 10 years. If you want, go check out point differential, two year win totals, strength of schedule, have at it – but you’ll get the same story.

I look at this board as the loudest ND fan base site on the net (props to the board ops). I don’t know if that’s the case, but it appears to be so. I just assume that some players, recruits, and maybe even coaches read this site. I also believe that, with time, people become what they are perceived as. And I am genuinely concerned that the negativity on this board could be the most likely to derail the team’s success.

Why?

For Kelly to be successful, players and recruits need to believe in him. It sounds cheesy, but I have never seen a leader be successful when people don’t believe. And if the negativity on here continues to swell, I wager it will trickle into the program (if it hasn’t already).

Granted, we’ll never know if this happens, just like we’ll never know if Crist would have beaten UM, USC, and Stanford. But – just in case, would you want to give some support, say – until signing day this year?

Because I can tell you next year will be a more difficult schedule than this and who knows what that will bring. But it would be some great irony if the school’s most passionate fans drive away a very capable coach because we were asking for too much too soon.

Final thought
For full disclosure, I’m a big believer that you win games when you out-prepare the other team. For ND, this means recruiting, strength training, practice, player development, etc. I think Kelly is building the program by making these work.

As fans, we don’t see this part of the program. What we do see are the W’s and the L’s, the playcalls, the helmets. But the data I’ve shown you here, which measures a lot of this, shows that progress is happening. Now, is it happening due to luck? No (three goal line fumbles for TDs, none in our favor? When has that ever happened?). It’s not happening because we’re playing easier teams. It’s not happening because we have a stud QB winning extra games for us.

Maybe, just maybe, it’s happening because we have a coach that identified and recruited talent (thank you Weis and Kelly) and have a coach that is developing talent and putting a program together (thank you Kelly).

In any case, rather than calling our coach shanty or coach 30 or something else, I think a little more optimism (perhaps optimism is the wrong word – it’s just seeing it as it is and showing some patience) may go a lot further than any of us can appreciate.

Sunday, October 30, 2011

A look at ND performance 2008-2010

For those that have been following, I’ve been playing with yards per play differential (yppd) – or if ND gets 6.0 yds/play and our opponents get 5.0 yds/play, we have a +1.0 yppd. I’ve been using it as a way to measure a team’s strength as well as a way of picking winners.
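
The definition above as a small sketch: yards per play gained minus yards per play allowed. The yardage and play totals are hypothetical, chosen so they reproduce the 6.0 vs 5.0 example.

```python
def yards_per_play(yards, plays):
    return yards / plays

def yppd(off_yards, off_plays, def_yards, def_plays):
    """Yards per play gained minus yards per play allowed."""
    return yards_per_play(off_yards, off_plays) - yards_per_play(def_yards, def_plays)

# Hypothetical totals: 420 yds on 70 plays (6.0/play) vs 350 yds allowed on 70 plays (5.0/play)
value = yppd(420, 70, 350, 70)
print(f"{value:+.1f} yppd")
```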

I’ve tried a new cut of the data – looking at ND opponents based on their yppd “score” (I call it a score b/c it’s not only their yppd, but I adjust for strength of schedule). The results look like this:




Basically, above the x-axis is a win, below the x-axis is a loss. I’ve colored Weis and Kelly games differently.

What is interesting to me – looking at this, we look a lot better under Kelly than we did under Weis. Kelly has played a tougher schedule and, almost across the board, has done better across every quality of opponent. Consider:
- 3 of Weis’s 13 wins came against teams worse than any we’ve played under Kelly
- Kelly’s won ~80% of games against average teams (in the 0 yppd bucket), and is 4-4 against good teams (teams in the 1 or 1.5 yppd bracket… these teams typically win 70% of their games)
- Kelly has played 90% of his games vs average or better teams, Weis only 70%

Looking at these numbers, it’s hard not to see progress from Weis, at least for me. We are winning more games against better opponents.

Peeling back the numbers a bit more, Kelly’s two losses against average teams (0 yppd) were Navy & Tulsa last year. The two losses against 0.5 teams were UM last year and USF this year. USC is in the 1 yppd bucket this year. UM and MSU are both in the 1.5 yppd bucket this year. Our remaining opponents range from average (Wake, MD are both in 0) to bad (BC is in -0.5) to very good (Stanford is in 2).

If we beat Wake, MD, and BC, ND will have won all games against 0 yppd or worse opponents for the first time in 4 years. Which will bring us to Stanford, which will be our 2nd toughest game in this four year span.

One last interesting stat. Against 0 yppd teams, Weis teams scored on average 23-23. Kelly's teams have scored 32-19. Against teams in the 0.5-1.5 categories, Weis's average score was 31-29, Kelly's has been 25-24.

Saturday, October 22, 2011

Picking today's winners - yppd fun

I've had a request to put together my "yards per play differential" (yppd) analysis for this week - so I just pulled together some numbers. I continue to evolve how I do these calculations, so it changes from week to week - but once I get to a place where I think it all comes together, I'll stop tweaking the approach.

With the model as it currently is, I develop % chance of winning based on historical performance against a set of statistics. Right now, the statistics I’m using are all yppd based and I’m using three different measures.

Across these three measures, ND is at +1.1 yppd, +2.4 yppd, and +1.4 yppd. These correspond to a 76%, 87%, and 83% chance of winning (the way each of these are calculated, I can apply 2008-2010 data to find historical win %). One estimate I’ve done has ND at 7.2 ypp and USC at 5.7 ypp – suggesting that we essentially have our way on offense and USC can move the ball, but not as well as we can. Some more details are below, but this looks like it translates into a ~35-17 win. Other ways I've crunched these numbers in the past suggest that the game may be closer than that - but in this iteration of how I'm crunching the numbers, it looks like ND should do quite well.

For some context, here are how other teams with a +1.1 yppd have fared (the first team is the one with +1.1).
In 2011, wins:
Marshall 24 Rice 20
Utah St. 63 Wyoming 19
South Carolina 14 Mississippi St. 12
Boise St. 41 Tulsa 21
Texas 17 BYU 16
Troy 24 UAB 23
Oklahoma St. 38 Texas 26
Tulane 49 UAB 10
Hawaii 44 Louisiana Tech 26
La.-Lafayette 20 Kent St. 12
New Mexico St. 28 Minnesota 21

In 2011, losses:
Nevada 34 Texas Tech 35
Middle Tenn. 33 Western Ky. 36
Iowa 41 Iowa St. 44
Central Mich. 13 Kentucky 27
Florida 6 Auburn 17

Ones that were at +2.4 yppd by the 2nd measure in 2011 (only wins listed, no losses occurred):
Stanford 44 Duke 14
Oklahoma 47 Kansas 17
Southern Miss. 48 Rice 24
Georgia 27 Ole Miss 13
Clemson 43 Troy 19

Ones at +1.4 yppd by the third measure (again, no losses):
LSU 19 Mississippi St. 6
Boise St. 40 Toledo 15
Utah St. 63 Wyoming 19
Utah 54 BYU 10
Rice 28 Memphis 6
Oklahoma 38 Missouri 28
Nevada 17 San Jose St. 14
Northern Ill. 51 Western Mich. 22
Michigan 28 San Diego St. 7
Cincinnati 27 Miami (OH) 0

In short, for the first measure (+1.1):
11-5 this year,
Avg win: 33-19
Avg loss: 25-32
For the second measure, all wins (5-0), 42-17 score.
For the third measure, all wins (10-0), 37-13 score.
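
The first-measure summary above can be re-derived from the listed games. This sketch tallies the record and average scores from the 16 games at +1.1 yppd; the function name is mine.

```python
# (team score, opponent score) for the +1.1 yppd games listed above
games = [
    (24, 20), (63, 19), (14, 12), (41, 21), (17, 16), (24, 23),
    (38, 26), (49, 10), (44, 26), (20, 12), (28, 21),        # wins
    (34, 35), (33, 36), (41, 44), (13, 27), (6, 17),         # losses
]

def summarize(games):
    """Return the W-L record and average (for, against) scores in wins and losses."""
    wins = [g for g in games if g[0] > g[1]]
    losses = [g for g in games if g[0] < g[1]]

    def avg_score(rows):
        if not rows:
            return None                   # e.g. a measure with no losses
        n = len(rows)
        return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

    return {
        "record": (len(wins), len(losses)),
        "avg_win": avg_score(wins),
        "avg_loss": avg_score(losses),
    }

summary = summarize(games)
print(summary["record"])
print("avg win:  %.1f-%.1f" % summary["avg_win"])
print("avg loss: %.1f-%.1f" % summary["avg_loss"])
```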

I see the % as more accurate than the specific game outcomes because they pull from a larger data set. But right now, it looks like a 35-17 win should not be out of the question. And by most measures, we look to be the better team. Before we get too excited, other measures I’ve done show ND at more of a 60% chance. But we should be feeling good overall.

Now – for the rest of the league. Last week, this methodology was 40-12 (I may have double counted a game.. I’m too lazy to check). When the chance of winning was 60% or higher, the model was 35-7.

This week, here are the predictions, listed from blowouts to close games (the first team is the one the % is relevant to). Some interesting games in the list, such as Clemson/UNC being close, MSU/UW being neck and neck, Washington giving Stanford a run for its money, or Alabama blowing out TN.

Institution Opponent Name % chance of winning
TCU New Mexico 100%
Oregon Colorado 99%
Oklahoma Texas Tech 99%
Tulsa Rice 99%
Nebraska Minnesota 97%
Temple Bowling Green 96%
Penn St. Northwestern 94%
Virginia Tech Boston College 95%
Alabama Tennessee 97%
Tulane Memphis 95%
Texas A&M Iowa St. 93%
Arkansas Ole Miss 94%
Vanderbilt Army 89%
Toledo Miami (OH) 89%
Notre Dame Southern California 82%
La.-Monroe North Texas 88%
Northern Ill. Buffalo 88%
Houston Marshall 88%
LSU Auburn 84%
Iowa Indiana 86%
Virginia North Carolina St. 84%
Boise St. Air Force 86%
UTEP Colorado St. 66%
Utah St. Louisiana Tech 82%
Wake Forest Duke 82%
La.-Lafayette Western Ky. 75%
Nevada Fresno St. 78%
Central Mich. Ball St. 74%
Illinois Purdue 74%
SMU Southern Miss. 69%
Ohio Akron 80%
Florida St. Maryland 72%
Kansas St. Kansas 71%
Oklahoma St. Missouri 66%
Utah California 66%
Georgia Tech Miami (FL) 64%
Middle Tenn. Fla. Atlantic 64%
Stanford Washington 64%
Washington St. Oregon St. 63%
South Fla. Cincinnati 54%
Michigan St. Wisconsin 52%
Clemson North Carolina 50%
Western Mich. Eastern Mich. 54%
East Carolina Navy 49%
New Mexico St. Hawaii 50%

...
...

PS Not that I have to say it - but by no means is this meant for betting - I'm just having fun with numbers and there's a good chance my excel model is buggy and this is all wrong.

Tuesday, October 11, 2011

Predicting remaining W's on ND's schedule

For those that have noticed, I've been playing with a variety of stats this year doing some analysis for fun. I've been a bit fascinated with yppd - yards per play differential (e.g. if we average 6.0 yds/play and our opponents average 5.0 yds/play, we are at +1 yppd).

Well, I've decided to expand my model and I've included a number of things, including total yards/game, home field, and winning % - to come up with a predictor model. Essentially, by combining these variables, I've created a formula that correctly 'predicts' the winner of a game approximately 80% of the time (based on data from 2008-2010). For 2011 so far, this formula would predict the winner 85% of the time (likely due to the tougher part of the schedule not yet being played).

The predictor works by having a % chance of winning - 0 to 100%. If it's above 50%, I say it predicts a win, below 50%, a loss.

Some more detail on its accuracy -



[% chance of winning] [% of times right] [# of Games in sample]
0-19 94% 950
20-39 77% 709
40-49 54% 538
50-59 56% 405
60-79 76% 756
80-100 94% 940



So, in the middle of its range (40-59% predictions), it's right about 55% of the time. But it is right from about 3/4 to more than 9/10 of the time once the predictions get outside that range.
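
As a sanity check, the band-level accuracy in the table can be rolled up into one overall number by weighting each band by its sample size. This sketch uses only the figures from the table; the result comes out close to the "approximately 80%" quoted earlier.

```python
# (probability band, share of picks that were right, games in sample), from the table above
bands = [
    ("0-19", 0.94, 950), ("20-39", 0.77, 709), ("40-49", 0.54, 538),
    ("50-59", 0.56, 405), ("60-79", 0.76, 756), ("80-100", 0.94, 940),
]

total_games = sum(n for _, _, n in bands)
overall = sum(acc * n for _, acc, n in bands) / total_games   # sample-weighted accuracy
print(f"{overall:.1%} over {total_games} games")
```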

Anyway - the point of everything I wrote before this is... the numbers in this model are complex, but are a decent predictor of who will win.

So - let's apply this to ND's schedule this year. We get:


[Opponent] [Prediction] [% chance]
South Fla. win 64%
Michigan loss 19%
Michigan St. loss 40%
Pittsburgh win 78%
Purdue win 82%
Air Force win 81%
Southern California win 55%
Navy win 88%
Wake Forest win 51%
Maryland win 81%
Boston College win 103%
Stanford loss 19%



BC is above 100% because there's a home field advantage factor that sometimes pushes % above 100% and below 0%.
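
A small sketch of the win/loss rule applied to the 2011 table above: anything over 50% is called a win, and raw values that spill outside 0-100% because of the home field factor can be clamped for display. How an exact 50% is handled is not stated in the post; calling it a loss here is an assumption, as are the function names.

```python
def clamp_pct(pct):
    """Pull raw model output back into the 0-100 range for display."""
    return max(0, min(100, pct))

def predict(pct):
    # Assumption: exactly 50 is treated as a loss; the post only covers above/below 50.
    return "win" if pct > 50 else "loss"

schedule_2011 = [
    ("South Fla.", 64), ("Michigan", 19), ("Michigan St.", 40), ("Pittsburgh", 78),
    ("Purdue", 82), ("Air Force", 81), ("Southern California", 55), ("Navy", 88),
    ("Wake Forest", 51), ("Maryland", 81), ("Boston College", 103), ("Stanford", 19),
]

picks = [(opp, predict(pct), clamp_pct(pct)) for opp, pct in schedule_2011]
wins = sum(1 for _, call, _ in picks if call == "win")
losses = len(picks) - wins
print(f"predicted record: {wins}-{losses}")
```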

Now, the model gets better as the year goes on, so the numbers will change as we and everyone else plays more games. But, based on what's been played so far, Michigan and Stanford will be our toughest games. USC and (gasp) Wake are our hardest games left outside of Stanford.

For fun, here is what the model would predict for past years:

2008 (predicted 7-6)

San Diego St. win 94%
Michigan win 92%
Michigan St. loss 32%
Purdue win 77%
Stanford win 75%
North Carolina loss 35%
Washington win 94%
Pittsburgh loss 46%
Boston College loss 33%
Navy loss 42%
Syracuse win 96%
Southern California loss 3%
Hawaii win 65%


2009 (predicted 3-9)

Nevada loss 46%
Michigan win 56%
Michigan St. loss 49%
Purdue loss 42%
Washington win 70%
Southern California loss 33%
Boston College loss 50%
Washington St. win 98%
Navy loss 33%
Pittsburgh loss 11%
Connecticut loss 50%
Stanford loss 15%


2010 (predicted 8-5)

Purdue win 87%
Michigan win 66%
Michigan St. loss 28%
Stanford loss 21%
Boston College win 52%
Pittsburgh win 56%
Western Mich. win 83%
Navy loss 27%
Tulsa win 50%
Utah loss 46%
Army win 64%
Southern California loss 44%
Miami (FL) win 58%

Sunday, October 9, 2011

Defense vs AFA - drive compression

A term I've made up for playing against an option offense is "drive compression" - the idea that what you look for is your defense getting better over the course of the game, so that the opponent's drives become "compressed" (worse and worse) as the game goes on.

If I take a 2-drive, rolling average of AFA drives against us in terms of total yards and yards per play, we get:
[total yards] [yards per play]
92 10.2
144 6.5
106 4.8
106 5.3
106 6.6
29 4.1
21 2.1
72 2.9
60 2.9
84 12.0
145 13.2
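
A minimal sketch of a 2-drive rolling window like the one behind the table above. The per-drive numbers are hypothetical (the raw drives are not listed here), and two choices are assumptions: yards are totaled across the window, and yards per play is pooled across both drives rather than averaged drive by drive.

```python
def rolling_drive_stats(drives, window=2):
    """drives: list of (yards, plays) per drive, in game order.

    Returns (total yards, yards per play) for each full window of drives.
    """
    out = []
    for i in range(window - 1, len(drives)):
        chunk = drives[i - window + 1 : i + 1]
        yards = sum(y for y, _ in chunk)
        plays = sum(p for _, p in chunk)
        out.append((yards, yards / plays if plays else 0.0))
    return out

# Hypothetical drives, for illustration only.
drives = [(60, 6), (40, 4), (20, 6), (10, 4)]
stats = rolling_drive_stats(drives)
for yards, ypp in stats:
    print(yards, round(ypp, 1))
```

A falling second column from one window to the next is what "compression" would look like in the numbers.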

You can see the first two drives, they really ripped into us - over 10 yards/play. Then, they continued to gain yards (over 100 yds per two drives), but the yds per play dropped significantly.

Then, our defense really shut their offense down, that is - until our subs came in and they started gaining a lot of yards again.

This could be due to them being worn down by our size or us getting used to their execution speed. But it is good to see the team get better as the game went on. Hopefully, the team is able to carry over this performance into Navy so they don't get off to as strong of a start.

Sunday, October 2, 2011

Are we getting better? A progress report

Some fun with statistics -

This year, 4 of our 5 games have been against familiar opponents (UM, MSU, Pitt, PU). Looking at stats from 2008-2011, this has been our best year by a significant margin. Looking at 4 measures:
1) Record
2) Points differential
3) Yards differential
4) Yards per play differential

1) Record:
This is our best year by one game. In the last three years, we’ve split games with these four teams, going 2-2 each year. Always beating PU, going 1-2 with the other three. This year, we are 3-1.

2) Point differential
This is our best year by a significant amount – we are +11/game. 2008 was +4, 2009 was -1, 2010 was +2.5.

3) Yards differential
This is our best year by a significant amount: +96 yards differential per game. About 100 yds/game better than last year – with 10-15 yds due to offense, and 90 due to defense. 2010 and 2009 were pretty much the same (-5/6 yds/game). 2008 was the worst, -40/game. An interesting note - last year, only 15 teams averaged +96 yards/game differential or better.

4) Yards per play differential
Also our best year (and we are running the ball more!). We are at +0.6. Up from +0.3 last year and -0.8 in ’09 and -0.3 in ’08.

So, all in all, by the numbers, we are looking much improved.

Wednesday, September 28, 2011

ND is better than every team on our schedule

Well, at least that's what some "fun with numbers" says.

I've been enamored with looking at yards per play differential as an indication of a team's strength. Basically, if two teams play and they both average about 5 yds/play, they are equal. If one averages 6 yds/play and the other averages 5, the one with 6 is better, and the yds per play differential is 1 yppd.

So, I decided to look at our schedule's yppd. And not only the teams we play, but the teams they play. So, for example, ND's average yppd is currently 1.0. Michigan's is 2.4. If I average our opponents (only the ones we've played), our opponents are at 1.4 yppd. Or, on average, they are 1.4 yppd better than the teams they've played.

Then, I thought looking just at our opponents wasn't enough - what if someone's played a very easy schedule? So, I looked at our opponents' opponents' yppd (confused yet?). Not too surprisingly, our opponents have played relatively easy schedules: their opponents have a -0.5 yppd. In other words, the opponents of USF, UM, MSU, and Pitt typically gain 0.5 yds less per play than the teams they play.

I took these three numbers - our yppd + our opponents' yppd + our opponents' opponents' yppd - and added them to create a single number for how good ND (and other teams) are. For ND, that is 1.0 + 1.4 + (-0.5) = 1.9. You could think of this as how well ND wins the line of scrimmage relative to our opponents, adjusted for strength of schedule.
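
The three-part sum described above, as a small sketch using the ND figures quoted earlier in this post (own yppd 1.0, opponents' average 1.4, opponents' opponents' average -0.5). The function name is mine.

```python
def composite_score(own_yppd, opp_avg_yppd, opp_opp_avg_yppd):
    """Own yppd plus two layers of schedule context, simply added together."""
    return own_yppd + opp_avg_yppd + opp_opp_avg_yppd

nd_score = composite_score(1.0, 1.4, -0.5)
print(round(nd_score, 1))
```

Since the three terms are simply added, an easy opponents' schedule (a negative third term) pulls the total down.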

Based on this, here's what the numbers say - ND is better than any team on our schedule:

Notre Dame 1.9
Stanford 1.7
Michigan 1.5
Michigan St. 1.2
Southern California 1.1
Navy 0.9
South Fla. 0.6
Pittsburgh 0.5
Air Force 0.2
Maryland (0.0)
Wake Forest (0.1)
Boston College (0.4)
Purdue (1.3)

And Purdue is the worst team we'll play. Granted, this is on a limited set of data... and should become more relevant as the season goes on. But if this is any indication, we have played as well as any team we're going to play (save the turnovers and boneheaded mistakes).

For fun, here are the top teams according to this logic for this year:
Alabama 2.8
Georgia Tech 2.7
Nebraska 2.1
Virginia Tech 2.0
Florida 1.9
Wisconsin 1.9
Notre Dame 1.9
Texas A&M 1.8
Stanford 1.7
South Carolina 1.7
UCF 1.7
Baylor 1.5
LSU 1.5
Michigan 1.5
Penn St. 1.4
North Carolina 1.4
Illinois 1.2
Michigan St. 1.2
Tennessee 1.2
San Diego St. 1.2
Washington 1.1
Texas 1.1
Southern California 1.1
Georgia 0.9
Navy 0.9

Looking at the list and then some of the underlying data, I think this analysis would be better served with more game data (for example - Air Force only has one data point, the game against TCU; Tenn has only 2 games of data, Baylor only 2, etc... b/c I excluded all non-D-IA games from the analysis).

But, this would suggest we should feel very good about Purdue (although Purdue only has two data points...). And for all that can be said about a night game, them having a week off, etc - we should have the ability to beat them pretty badly. It will be interesting to see what happens.