Bracketology talk at the Crossroads of ideas
1. The Math Behind the
March Madness Tournament and
College Football Playoff
Laura Albert McLay
Associate Professor, ISYE
laura@engr.wisc.edu
@lauramclay
@badgerbrackets
http://bracketology.engr.wisc.edu/
2. Let's start with the 2-minute
version of my talk
https://www.facebook.com/UWMadison/videos/10154004638653114/
3. First of all...
I'm an industrial and systems
engineering professor by day and a bracketologist by night!
4. I study systems
A system is a set of things (people, cells, vehicles,
basketball teams, or whatever) interconnected in
such a way that they produce their own pattern of
behavior over time.
My discipline is operations research: the science of
making decisions using advanced analytical methods
5. Our world is becoming increasingly
complex and increasingly connected
Systems matter!
Mathematical models
and systems thinking
help us study systems
and navigate the
complex, interconnected
world we live in.
6. What do we hope to learn from
probability models like Markov
chains?
• How do we draw conclusions from limited data?
• How can we make data-driven decisions in the
presence of uncertainty?
7. How I got started in bracketology
In 2014 someone suggested I examine
bracketology in the context of the first
College Football Playoff...
...and so began Badger Bracketology
My objective: forecast which teams
would make the first college football
playoff before the season was over.
8. Markov chains:
The Little Engine that Could
Markov chains:
A type of math model for understanding how a
system can evolve over time.
Uses: finance, epidemiology, queues, zombies
9. Markov chains for ranking teams in a nutshell
Each team is a state. A team "votes" for the teams that it loses to
http://sumnous.github.io/blog/2014/07/24/gephi-on-mac/
Graph of 2014
college football season
10. Simple yet powerful idea
Automatically rates and ranks teams by
taking advantage of the network structure
of the matchups
• Use Markov chains to account for strength of schedule
• Do not need a human in the loop
Simple data requirements:
1. Game outcomes (score differentials),
2. Home/away status
Takes difficulty of future games into account in football playoff
forecasts
• Polls give the ranking right now, which only gives insight into a playoff held
today
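The voting idea can be sketched in a few lines. This is a minimal illustration, not the talk's actual model: the game results are made up, each loss counts as one full vote (no score differentials or home/away adjustment), and the `damping` term is a PageRank-style assumption added so the chain always has a single steady state.

```python
def markov_ratings(games, damping=0.85, iters=200):
    """Rate teams by the steady state of a 'loser votes for winner' chain."""
    teams = sorted({team for game in games for team in game})
    n = len(teams)
    index = {team: k for k, team in enumerate(teams)}

    # votes[i][j] = number of times team i lost to team j
    votes = [[0.0] * n for _ in range(n)]
    for winner, loser in games:
        votes[index[loser]][index[winner]] += 1.0

    # Normalize each row into transition probabilities.
    # An undefeated team has no one to vote for, so it keeps its own vote.
    transition = []
    for i, row in enumerate(votes):
        total = sum(row)
        if total == 0:
            transition.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            transition.append([v / total for v in row])

    # Power iteration with damping (an assumption, see the lead-in).
    rating = [1.0 / n] * n
    for _ in range(iters):
        rating = [
            (1 - damping) / n
            + damping * sum(rating[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return dict(zip(teams, rating))


# Hypothetical results as (winner, loser) pairs.
TOY_GAMES = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]
ratings = markov_ratings(TOY_GAMES)
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 3))
```

In this toy season, team C ends up rated above team B even though both have two wins, because C's wins include one over the top-rated team. That is the strength-of-schedule effect the slide refers to.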
14. Transitions
Rutgers 52 @ Wisconsin 72
[Diagram: two-state Markov chain with states Wisconsin and Rutgers; the arrows are labeled with transition probabilities $r$ and $1 - r$]
How much credit should Wisconsin get for beating Rutgers by
20 at home?
$r$ = effective wins (fraction of a vote), which help us compute
our Markov chain transition probabilities
15. Let's find a data-driven answer!
Given that team $i$ beat team $j$ by $x$ points at home, what is the
probability that $i$ is a better team than $j$ on a neutral court?
Data: Some teams play twice per season (home and away)
Given that team $i$ beat team $j$ by $x$ points at home, what is the
probability that $i$ is a better team than $j$ on $j$'s home court?
$r_x^H$ ($r_x^A$) = probability that a team that outscores its opponent by $x$
points at home $H$ (away $A$) is better than its opponent at a
neutral site $N$
Developed by Sokol, Kvam, Nemhauser, and Brown at Georgia Tech to rank NCAA menâs basketball teams
https://www2.isye.gatech.edu/~jsokol/lrmc/
16. What is the probability you win your next
game (on the road) given that you win by 20 at
home?
17. Logistic regression to the rescue!
Problem 1: must win by 50+ points to get a lot of credit for a win!
Winning/losing close games gives you the same amount of "credit"
[Plot: probability of winning on the road next time vs. margin of victory $x$]
Problem 2: We need to get neutral site win probabilities
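A small sketch shows why Problem 1 arises with a plain logistic curve. The `intercept` and `slope` values are hypothetical, chosen only to give a curve with the shape described on the slide; they are not the fitted coefficients from the talk.

```python
import math


def road_win_probability(home_margin, intercept=-0.6, slope=0.03):
    """Logistic estimate of winning the road rematch after a home result.

    intercept and slope are illustrative placeholders, not fitted values.
    """
    z = intercept + slope * home_margin
    return 1.0 / (1.0 + math.exp(-z))


for margin in (-1, 1, 10, 20, 50):
    print(margin, round(road_win_probability(margin), 3))
```

With a shallow slope like this, a 1-point win and a 1-point loss earn almost the same credit, and only very large margins push the probability well above one half.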
18. Logistic regression for
NCAA men's basketball
• Use log(point differentials) instead!
• Do not truncate point differentials
[Plot: effective wins vs. point differential, from -30 to 30]
19. Winning matters
• Average in a pure win/loss model to give more credit for winning the
game
[Two plots: effective wins vs. point differential, from -30 to 30]
20. Putting it all together
• End up with the red line!
[Two plots: effective wins vs. point differential, from -30 to 30]
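The averaging step can be sketched as a weighted blend of a logistic curve and a pure win/loss step function. The function names, the `slope`, and the default `win_weight` of 1/3 are assumptions for illustration; the slides show only the resulting curves.

```python
import math


def logistic_credit(point_diff, slope=0.1):
    """Smooth credit from the margin alone (slope is a made-up value)."""
    return 1.0 / (1.0 + math.exp(-slope * point_diff))


def win_credit(point_diff):
    """Pure win/loss model: full credit for any win, none for any loss."""
    if point_diff > 0:
        return 1.0
    if point_diff < 0:
        return 0.0
    return 0.5


def effective_wins(point_diff, win_weight=1 / 3, slope=0.1):
    """Average the win/loss model into the logistic curve."""
    return (win_weight * win_credit(point_diff)
            + (1 - win_weight) * logistic_credit(point_diff, slope))


for diff in (-20, -1, 1, 20):
    print(diff, round(effective_wins(diff), 3))
```

The blend introduces a jump at zero, so a 1-point win is worth clearly more than a 1-point loss while bigger margins still earn more credit.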
21. Markov chain transition probabilities
Rutgers 52 @ Wisconsin 72 *
[Diagram: two-state Markov chain with states Wisconsin and Rutgers; the arrows are labeled with transition probabilities $r$ and $1 - r$]
How much credit should Wisconsin get for beating Rutgers by 20 at home?
P(UW beats Rutgers on a neutral court) = 0.6255
$r$ = 0.6817 effective wins (fraction of a vote)
* Wisconsin 61 @ Rutgers 54 later on 1/28/2017
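To see how the effective wins turn into transition probabilities, here is a two-team sketch using the slide's value of 0.6817. The arrow layout is an assumption in the style of LRMC (each team sends a share `R` of its vote toward Wisconsin and `1 - R` toward Rutgers); the original diagram did not survive extraction.

```python
R = 0.6817  # effective wins for Wisconsin, taken from the slide

# Rows and columns are ordered (Wisconsin, Rutgers).
transition = [
    [R, 1 - R],  # Wisconsin keeps R of its vote, sends 1 - R to Rutgers
    [R, 1 - R],  # Rutgers sends R of its vote to Wisconsin, keeps 1 - R
]


def stationary(matrix, iters=50):
    """Approximate the steady-state distribution by repeated multiplication."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iters):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


wisconsin_share, rutgers_share = stationary(transition)
print(round(wisconsin_share, 4), round(rutgers_share, 4))
```

With only this one game, the steady state simply returns the vote split itself. The ratings become informative once every game in the season contributes its own pair of arrows.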
22. Transitions
Same idea for the rest of the games...
Wisconsin
Minnesota
Northwestern
Rutgers
Illinois
23. Current rankings
3/12/2017 Selection Sunday
1 Gonzaga
2 Villanova
3 Kentucky
4 SMU
5 Wichita St
6 Arizona
7 UCLA
8 Duke
9 Cincinnati
10 Oregon
11 MTSU
12 North Carolina
13 St Marys CA
14 West Virginia
15 Kansas
16 Nevada
17 Purdue
18 Vermont
19 UNC Wilmington
20 Michigan
21 Florida St
22 VA Commonwealth
23 Notre Dame
24 Bucknell
25 Wisconsin
24. The B1G, ranked.
3/12/2017
17 Purdue
20 Michigan
25 Wisconsin
41 Northwestern
43 Minnesota
54 Maryland
78 Indiana
87 Michigan St
121 Iowa
130 Illinois
141 Ohio St
176 Penn St
187 Rutgers
242 Nebraska
25. How did we do last year?
3/13/2016 Selection Sunday
1. North Carolina
2. Kansas
3. Villanova
4. Michigan St
5. Virginia
6. West Virginia
7. Oklahoma
8. Kentucky
9. Oregon
10. Purdue
11. Xavier
12. Miami FL
13. Duke
14. Utah
15. Texas A&M
16. Louisville
17. Maryland
18. Arizona
19. Seton Hall
20. Iowa St
21. Indiana
22. California
23. Baylor
24. St Josephs PA
25. Iowa
27. College Football Playoff
Objective: determine which teams would make the first
college football playoff.
Goal: to forecast the top 4 teams weeks before the season
ends.
Solution method: a ranking method.
Challenge: need to simulate the remainder of the season and
rank the teams at the end of the (simulated) season.
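The simulate-then-rank loop can be sketched as a small Monte Carlo experiment. This is a simplified stand-in, not the talk's method: teams are ranked by simulated wins (with ties broken by rating) instead of being re-rated with a Markov chain each week, and all team names, ratings, records, schedules, and coefficients are hypothetical.

```python
import math
import random
from collections import Counter


def win_probability(rating_home, rating_away, a=0.2, b=1.0):
    """Logistic win probability for the home team (a and b are made up)."""
    return 1.0 / (1.0 + math.exp(-(a + b * (rating_home - rating_away))))


def simulate_top_four(ratings, wins, remaining_games, n_sims=1000, seed=0):
    """Count how often each team finishes in the top 4 across simulations."""
    rng = random.Random(seed)
    top_four_counts = Counter()
    for _ in range(n_sims):
        sim_wins = dict(wins)
        for home, away in remaining_games:
            if rng.random() < win_probability(ratings[home], ratings[away]):
                sim_wins[home] += 1
            else:
                sim_wins[away] += 1
        # Stand-in for re-rating: sort by wins, break ties by rating.
        ranked = sorted(sim_wins, key=lambda t: (sim_wins[t], ratings[t]),
                        reverse=True)
        top_four_counts.update(ranked[:4])
    return top_four_counts


toy_ratings = {"A": 1.2, "B": 0.9, "C": 0.5, "D": 0.1, "E": -0.3, "F": -0.8}
toy_wins = {"A": 6, "B": 6, "C": 5, "D": 5, "E": 5, "F": 4}
toy_remaining = [("A", "B"), ("C", "D"), ("E", "F"),
                 ("B", "C"), ("D", "E"), ("F", "A")]

counts = simulate_top_four(toy_ratings, toy_wins, toy_remaining)
for team, count in counts.most_common():
    print(team, count, "of 1000")
```

The output has the same form as a "number of times to make the playoff out of 1000" forecast, which is why simulating the remaining schedule matters: a team's chances depend on who it still has to play.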
28. Giant assumption
• We assume the selection committee will pick the four
top-ranked teams for the playoff.
• History suggests that humans prefer the most deserving
teams rather than the best teams in the national
championship game.
• E.g., 2013 Alabama lost on
a fluke play.
• ...but the College Football
Selection Committee might
have changed this!
2013 BCS Rankings just before bowl bids
30. How we did last year
2016 Playoff Rankings Badger Bracketology rankings
1 Alabama
2 Ohio State
3 Clemson
3 Washington
5 Michigan
6 Penn State
7 Western Michigan
8 Louisville
9 Oklahoma
10 Wisconsin :(
31. Model: two parts
0. Observe a few (7-8) weeks of game outcomes
1. Ranking.
• Assign a rating to each team to rank the teams.
• Similar to what we had before but with college football data
2. Game simulation.
• Determine who wins a game based on the team ratings.
Simulate the next week's game outcomes.
• Combine these:
  • Re-rate and re-rank after each week of games.
  • Simulate the remainder of the season.
  • Report teams most likely to be in the top 4
32. Score differentials
Yes, running up the score matters, mathematically.
[Histogram: frequency of score differentials (home score - away score), 2012-2014]
33. Capped score differentials
38% of conference games fall beyond the cap
[Histogram: frequency of score differentials (home score - away score) capped at +/-21, 2012-2014]
Note: Rating systems used by College Football Playoff committee must use wins/losses
only (not score differentials). Running up the score makes a difference!
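Capping is a one-line clamp. The sketch below uses made-up score differentials; only the cap of 21 comes from the slide.

```python
def cap_differential(diff, cap=21):
    """Clamp a score differential to the range [-cap, +cap]."""
    return max(-cap, min(cap, diff))


def share_beyond_cap(diffs, cap=21):
    """Fraction of games whose margin exceeds the cap in either direction."""
    return sum(1 for d in diffs if abs(d) > cap) / len(diffs)


# Hypothetical home-minus-away differentials.
toy_diffs = [3, -7, 24, 35, -28, 10, -14, 45]
capped = [cap_differential(d) for d in toy_diffs]
print(capped)
print(share_beyond_cap(toy_diffs))
```

Every game beyond the cap is treated identically after clamping, which is the information the model gives up in exchange for dampening blowouts.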
34. Build the Markov chain for football
• Used 3 seasons of data (truncate scores by +/-21)
• Use games played in consecutive years to identify win
probabilities to feed into the Markov chain
[Plot: effective wins vs. point differential, from -20 to 20; curves labeled S_x^H, r_x^H, and r_x^N]
[Plot, from -20 to 20; curves: logistic regression; logistic regression averaged with win (weight = 2/3); logistic regression averaged with win (weight = 1/3)]
35. Modified Log Logistic Regression
Markov Chain (ln(mLRMC))
• Same as mLRMC except that we consider log point differentials to
dampen big score differentials
• Do not truncate point differentials
[Plot: effective wins vs. point differential, from -20 to 20; curves: logistic regression (home team); logistic regression averaged with win]
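One way to dampen large margins without truncating them is a signed logarithm. The exact transform used in ln(mLRMC) is not given on the slide, so `sign(x) * ln(1 + |x|)` below is an assumption chosen because it is defined at zero and treats wins and losses symmetrically.

```python
import math


def log_differential(diff):
    """Signed log transform: sign(diff) * ln(1 + |diff|)."""
    return math.copysign(math.log1p(abs(diff)), diff)


for diff in (-42, -21, -3, 0, 3, 21, 42):
    print(diff, round(log_differential(diff), 3))
```

Unlike a hard cap at +/-21, a 42-point win still counts for more than a 21-point win, but the extra 21 points add far less than the first 21 did.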
37. Simulate the rest of the season!
0. Observe a few (7-8) weeks of game outcomes
1. Ranking.
• Assign a rating to each team to rank the teams.
2. Game simulation.
• Determine who wins a game based on the team ratings.
Simulate the next week's game outcomes.
• Combine these:
  • Re-rate and re-rank after each week of games.
  • Simulate the remainder of the season.
  • Report teams most likely to be in the top 4
38. Win probability parameters
The win probability between teams $i$ and $j$, where $i$ is the home
team, is captured by the best-fit logistic regression model using
two years of game data:
$$p_{ij} = \frac{e^{a + b(r_i - r_j)}}{1 + e^{a + b(r_i - r_j)}}$$
where
$r_i - r_j$ = the difference in ratings between the two teams.
Then assign a point differential to the winner.
Game prediction accuracy (averaged per game)
Statistic            Model      Training set  Test set
Mean Absolute Error  mLRMC      0.2043        0.3152
                     ln(mLRMC)  0.2026        0.3162
Mean Squared Error   mLRMC      0.1006        0.1885
                     ln(mLRMC)  0.0999        0.1897
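The win probability and the two accuracy statistics can be sketched as follows. The parameter names `a` and `b`, the toy ratings, and the scoring convention (error = actual outcome of 0 or 1 minus the predicted home-win probability, averaged per game) are assumptions; the slide does not spell out how the errors were computed.

```python
import math


def win_probability(rating_home, rating_away, a, b):
    """Logistic win probability for the home team given a rating difference."""
    z = a + b * (rating_home - rating_away)
    return 1.0 / (1.0 + math.exp(-z))  # same as e^z / (1 + e^z)


def mean_absolute_error(probs, outcomes):
    return sum(abs(o - p) for p, o in zip(probs, outcomes)) / len(probs)


def mean_squared_error(probs, outcomes):
    return sum((o - p) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical games: (home rating, away rating, 1 if the home team won).
toy_games = [(0.9, 0.4, 1), (0.2, 0.7, 0), (0.5, 0.5, 1), (0.3, 0.8, 1)]
probs = [win_probability(h, w, a=0.2, b=2.0) for h, w, _ in toy_games]
outcomes = [won for _, _, won in toy_games]
print(round(mean_absolute_error(probs, outcomes), 4))
print(round(mean_squared_error(probs, outcomes), 4))
```

Under this convention, a model that always predicts 0.5 scores a mean absolute error of 0.5 and a mean squared error of 0.25, which gives a baseline for reading the table.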
43. 2015 Results:
Forecasted number of times to make playoff
(out of 1000)
Team      Method     Week 7  Week 8  Week 9  Week 10  Week 11  Week 12  Week 13
Clemson   mLRMC      667     897     931     905      915      949      956
          ln(mLRMC)  749     840     897     893      955      923      976
Alabama   mLRMC      361     209     166     837      913      943      995
          ln(mLRMC)  427     240     197     858      847      931      996
MSU       mLRMC      179     213     261     54       24       569      675
          ln(mLRMC)  226     349     354     115      162      573      706
Oklahoma  mLRMC      20      46      71      119      393      758      1000
          ln(mLRMC)  12      73      16      63       142      247      1000
Annotations on the slide:
• Nebraska beats MSU
• MSU beats The OSU
• No Big12 championship
• Slight difference in rankings: 3rd/4th vs. 5th/6th
44. 2015 Results:
Forecasted ranking of likelihood to make
playoff (any seed, out of 1000)
Team      Method     Week 7  Week 8  Week 9  Week 10  Week 11  Week 12  Week 13  Week 14
Clemson   mLRMC      2       1       1       1        1        1        3        2
          ln(mLRMC)  2       1       1       1        2        2        3        2
Alabama   mLRMC      5       7       6       2        2        2        2        1
          ln(mLRMC)  4       6       8       2        1        1        2        1
MSU       mLRMC      7       6       7       11       10       4        4        3
          ln(mLRMC)  6       5       5       9        6        3        4        3
Oklahoma  mLRMC      18      13      13      9        4        3        1        4
          ln(mLRMC)  21      15      14      12       8        6        1        4
Annotations on the slide:
• No Big12 championship
• No simulation: the season is over. We think the committee got it right!
• Ranked 2nd & 7th after week 7
• Ranked 5th & 4th after week 9
45. 2015 Results:
What happened to The Ohio State University?
Rankings after week 12    Forecasted rankings after week 12
1. Clemson                1. Clemson
2. Alabama                2. Alabama
3. Oklahoma               3. Oklahoma
4. Notre Dame             4. Michigan State
5. Michigan State         5. Iowa
6. Ohio State             6. Notre Dame
7. Iowa                   7. Stanford
8. Florida                8. Florida
9. Michigan               9. Ohio State
10. Stanford
(no other teams have >1% chance of making the playoff)
47. Picking the perfect bracket
There are about 9.2 quintillion ways to fill out a bracket...
And 1 way to fill out a perfect bracket
The odds of filling out a perfect bracket are not
9-quintillion-to-1 because:
(a) the tournament isn't like the lottery where every
outcome is equally likely, and
(b) monkeys are not randomly selecting game outcomes.
Instead, people are purposefully selecting outcomes.
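The 9.2 quintillion figure is the number of ways to pick the winners of the 63 games in a 64-team bracket. The second half of the sketch is only an illustration of point (a): the 70% per-game accuracy and the independence between games are assumptions, not figures from the talk.

```python
GAMES = 63  # a 64-team single-elimination bracket has 63 games

total_brackets = 2 ** GAMES
print(f"{total_brackets:,} possible brackets")

# Chance of a perfect bracket if every game were a fair coin flip.
coin_flip_chance = 1 / total_brackets

# Hypothetical: each pick is right 70% of the time, independently.
informed_chance = 0.7 ** GAMES
print(f"coin flips: 1 in {1 / coin_flip_chance:.3g}")
print(f"70% picker: 1 in {1 / informed_chance:.3g}")
```

Even a modest edge on each game shrinks the odds by many orders of magnitude, which is why published estimates are in the billions or trillions instead of the quintillions.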
48. Can math help our odds?
FiveThirtyEight notes that the typical bracket has
2.5 trillion-to-1 odds of being perfect:
• https://fivethirtyeight.com/features/march-madness-perfect-bracket-odds/
BracketOdds at Illinois estimates that a historical
average winning bracket performs at 4.4 billion-to-1
• Warren Buffett may have to pay out!
49. The thing with perfect brackets
They depend on the year.
Let's only look at how many people correctly selected all Final Four teams:
– 1140 of 13 million brackets correctly picked all Final Four teams in 2016
– 182,709 of 11.57 million brackets correctly picked all Final Four teams
in 2015 *
– 612 of 11 million brackets correctly picked all Final Four teams in 2014
– 47 of 8.15 million brackets correctly picked all Final Four teams in 2013
– 23,304 of 6.45 million brackets correctly picked all Final Four teams
in 2012
– 2 of 5.9 million brackets correctly picked all Final Four teams in 2011
* Only 1 bracket emerged from the round of 64 with all 32 correct picks
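Converting the counts above to "1 in N" rates makes the year-to-year swing easier to compare. The numbers are the ones listed on the slide; only the variable names are new.

```python
# year: (brackets with all Final Four teams correct, total brackets)
final_four_hits = {
    2016: (1_140, 13_000_000),
    2015: (182_709, 11_570_000),
    2014: (612, 11_000_000),
    2013: (47, 8_150_000),
    2012: (23_304, 6_450_000),
    2011: (2, 5_900_000),
}

one_in = {year: total / hits for year, (hits, total) in final_four_hits.items()}
for year in sorted(one_in, reverse=True):
    print(year, f"1 in {one_in[year]:,.0f}")
```

The easiest and hardest years in this list differ by a factor of tens of thousands, which is the sense in which perfect brackets "depend on the year".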
51. 1. Don't use RPI
• Badger Bracketology (my favorite tool!)
• Logistic Regression Markov Chain (LRMC)
• FiveThirtyEight rankings of tournament teams
• Ken Pomeroy's rankings
• Sagarin rankings
• Massey Ratings
• ESPN's BPI rankings
Rankings clearinghouse: http://www.masseyratings.com/cb/compare.htm
52. 2. Pay attention to the seeds
Some seeds generate more upsets than others
• 7/10 seeds and 5/12 seeds
Historically, 6/11 seeds go the longest before facing a
1 or 2 seed.
53. 3. Don't pick Kansas
• Be strategic. The point is NOT to maximize your
points, it's to get more points than your opponents
• Differentiate your Final Four
• Check ESPN for the top picked teams. Some top teams
are overvalued and others are undervalued
• Last year:
  • Kansas was selected as the overall winner in 27% of brackets
  (and in 62% of Final Fours) with a 19% chance of winning (538)
  • UNC was selected as overall winner in 8% of brackets (with a 15%
  win probability) and Villanova in 5.5%
http://games.espn.com/tournament-challenge-bracket/2016/en/whopickedwhom
https://projects.fivethirtyeight.com/2016-march-madness-predictions/
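One rough way to spot overvalued and undervalued champions is to divide a team's win probability by the share of brackets that picked it. The ratio is an illustrative heuristic, not a method from the talk; the percentages are the 2016 figures quoted above (Villanova is left out because its win probability is not given).

```python
# team: (share of brackets picking it as champion, estimated win probability)
champion_picks = {
    "Kansas": (0.27, 0.19),
    "UNC": (0.08, 0.15),
}

value_ratio = {
    team: win_prob / pick_share
    for team, (pick_share, win_prob) in champion_picks.items()
}
for team, ratio in sorted(value_ratio.items(), key=lambda kv: kv[1], reverse=True):
    label = "undervalued" if ratio > 1 else "overvalued"
    print(f"{team}: {ratio:.2f} ({label})")
```

A ratio above 1 means the team wins more often than the pool expects, so picking it helps separate your bracket from your opponents' when it does win.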
54. 4. It's totally random
A good process yields good outcomes on average
• It does not guarantee the best outcome in any given
tournament
Small pools are better if you have a good process
• Scoring can be random
• The more brackets, the higher the chance that a "random"
bracket will be the best
55. Topics in Sports Analytics
ISYE 601 in Spring 2017!
• Goal: teach students data-driven methods for making
better decisions using sports as a vehicle
• Course topics:
  • Linear regression
  • Logistic regression
  • Empirical Bayes
  • Ranking methods
  • Probability models and Markov chains
  • Forecasting
  • Game theory
  • Tournament scheduling
  • Networks (is my team mathematically eliminated from the
  playoffs?)
...and more!