POLS 3220: How to Predict the Future
What is the best way to evaluate probabilistic predictions?
Introduce Brier Scores, the method we’ll use to assess accuracy in our forecasting challenge this semester.
Was this a bad prediction?
“There’s no chance that the iPhone is going to get any significant market share. No chance.”
—Steve Ballmer, Microsoft CEO, 2007
What about this prediction?
What about this one?
Evaluating individual probability statements is quite tricky.
If something happens that the forecaster said was unlikely, was the forecaster wrong or unlucky?
Much better to evaluate a forecaster’s track record over many predictions.
Across thousands of predictions about politics and sports, Nate Silver has a pretty impressive track record…
Calibration isn’t the only thing we care about, though.
A doctor who tells his pregnant patient there is a 50% chance the baby will be a boy is perfectly calibrated…but the forecast isn’t terribly useful.
The ultrasound tech who can predict the baby’s sex with 100% confidence is much more useful. Same level of calibration, though!
We want forecasts that are both well-calibrated and as confident as possible.
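As a rough sketch of that contrast (assuming, purely for illustration, 1,000 simulated births and a hypothetical `calibration` helper), both forecasters below come out perfectly calibrated even though only one of them is confident:

```python
import random

random.seed(0)

# Simulate 1,000 births; each baby is a boy with probability 0.5.
babies = [random.random() < 0.5 for _ in range(1000)]

# Doctor: always says 50%. Ultrasound tech: says 100% or 0% and is right.
doctor_forecasts = [0.5 for _ in babies]
tech_forecasts = [1.0 if boy else 0.0 for boy in babies]

def calibration(forecasts, outcomes):
    """For each stated probability, what fraction of those events occurred?"""
    buckets = {}
    for p, o in zip(forecasts, outcomes):
        buckets.setdefault(p, []).append(o)
    return {p: sum(os) / len(os) for p, os in buckets.items()}

print(calibration(doctor_forecasts, babies))  # ~{0.5: 0.5}            -> calibrated, but vague
print(calibration(tech_forecasts, babies))    # {1.0: 1.0, 0.0: 0.0}   -> calibrated and confident
```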
Prediction | Outcome | Absolute Error |
---|---|---|
70% | 1 | 0.3 |
20% | 0 | 0.2 |
70% | 0 | 0.7 |
40% | 1 | 0.6 |
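A minimal sketch of how that table is computed, assuming the Absolute Error column is |prediction − outcome|, with the outcome coded 1 if the event happened and 0 if it didn’t:

```python
# Reproduce the table above: error = |prediction - outcome|.
rows = [(0.70, 1), (0.20, 0), (0.70, 0), (0.40, 1)]

for prediction, outcome in rows:
    error = abs(prediction - outcome)
    print(f"{prediction:.0%} | {outcome} | {error:.1f}")

average_error = sum(abs(p - o) for p, o in rows) / len(rows)
print(f"Average absolute error: {average_error:.2f}")  # 0.45
```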
Suppose you are a TV meteorologist.
Your weather model says it’s going to be a rainy week. 80% chance of rain each day.
What do you report to your viewers if you want the lowest average absolute error for the week?
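To see why honesty doesn’t pay under average absolute error, here is a quick sketch of the expected error for a single day with a true 80% chance of rain (the `expected_abs_error` helper is just for illustration):

```python
# Expected absolute error for one day with a true 80% chance of rain,
# as a function of the probability you report.
def expected_abs_error(reported, true_prob=0.8):
    # With prob. true_prob it rains (outcome 1); otherwise it doesn't (outcome 0).
    return true_prob * abs(reported - 1) + (1 - true_prob) * abs(reported - 0)

print(expected_abs_error(0.8))  # 0.32  (honest report)
print(expected_abs_error(1.0))  # 0.20  (exaggerated report scores better!)
```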
We would like a scoring rule that encourages honesty (a strictly proper scoring rule).
Penalizing extremely wrong predictions helps.
The Brier Score does this by taking the average squared error (Brier, 1950).
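In symbols, for $N$ forecasts $f_t$ with binary outcomes $o_t \in \{0, 1\}$:

$$\text{Brier Score} = \frac{1}{N} \sum_{t=1}^{N} (f_t - o_t)^2$$

Lower scores are better; a perfect forecaster scores 0.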
Notice that the penalty is particularly steep when a prediction is both wrong and overconfident.
Brier Scores are a sort of mathematical truth serum.
Your optimal strategy in the forecasting challenge is to report your honest beliefs.
A forecaster’s expected Brier Score cannot be improved by exaggerating or hedging probabilities.
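A small sketch of that property, using a hypothetical single event whose true probability you believe to be 70%: the expected Brier Score is lowest when you report exactly that belief.

```python
# Expected Brier Score for a single event with true probability 0.7,
# as a function of the probability you report. Honest reporting wins.
def expected_brier(reported, true_prob=0.7):
    # With prob. true_prob the event occurs (outcome 1); otherwise outcome 0.
    return true_prob * (reported - 1) ** 2 + (1 - true_prob) * (reported - 0) ** 2

for reported in [0.5, 0.7, 0.9, 1.0]:
    print(f"report {reported:.1f} -> expected Brier {expected_brier(reported):.3f}")

# report 0.5 -> expected Brier 0.250  (hedged)
# report 0.7 -> expected Brier 0.210  (honest report: lowest expected score)
# report 0.9 -> expected Brier 0.250  (exaggerated)
# report 1.0 -> expected Brier 0.300  (very overconfident)
```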