Cambridge Prediction Game
post by NoSignalNoNoise (AspiringRationalist) · 2020-01-25T03:57:59.721Z · LW · GW
To improve our prediction and calibration skills, the Cambridge, MA rationalist community has been running a prediction game (and keeping score) at a succession of rationalist group houses since 2013.
Below are the rules:
General
The game consists of a series of prediction markets. Each market consists of a question that will (within a reasonable timeframe) have a well-defined binary or multiple-choice answer, an initial probability estimate (called the "house bet"), and a sequence of players' probability estimates. Each prediction is scored on how much more or less accurate it is than the preceding prediction. We score relative to the preceding prediction because it is evidence, and one of the skills this game is meant to develop is updating properly based on what other people think.
Creating Markets
Any player can create a market. To create a market, a player writes the question to be predicted on a whiteboard or on a giant sticky note on the wall with a house bet. The house bet should be something generally reasonable, but does not need to be super well-informed (this is abusable in theory but has not been abused in practice).
Making Predictions
To make a prediction, a player writes their name and their probability estimate under the most recent prediction (or the house bet if there are no predictions so far). The restrictions on predictions are:
- The player who set the house bet cannot make the first prediction (otherwise they could essentially award themself points by setting a bad house bet).
- No predicted probability can be < 0.01.
- A player without a positive score cannot lower any predicted probability by more than a factor of 2 (in order to avoid creating too many easy points from going immediately after an inexperienced player).
Scoring
When a market is settled (i.e. the correct answer becomes known), each prediction is given points equal to:

100 * log2(probability given to the correct answer / previous probability given to the correct answer)

In a binary market where the correct answer is "no", each prediction's implied probability of "no" is used (e.g. if a player predicted 0.25, that is treated as p(no) = 0.75).
This is a strictly proper scoring rule, meaning that the optimal strategy (strategy with the highest expected points) is to bet one's true beliefs about the question.
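For concreteness, here is a minimal Python sketch of this scoring rule (the function names are illustrative, not part of any actual tooling we use):

```python
import math

def prediction_score(p_current, p_previous):
    """Points for one prediction: 100 * log2 of the ratio of the
    probability it gave the correct answer to the probability the
    preceding prediction (or the house bet) gave it."""
    return 100 * math.log2(p_current / p_previous)

def binary_score(p_yes_current, p_yes_previous, outcome_yes):
    """Binary markets: if the answer is no, score the implied
    probabilities of no, i.e. 1 - p(yes)."""
    if outcome_yes:
        return prediction_score(p_yes_current, p_yes_previous)
    return prediction_score(1 - p_yes_current, 1 - p_yes_previous)
```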
The points from each market are tracked in a spreadsheet, along with the date each market settled. Points decay by a factor of e every 180 days.
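In other words, a market's contribution to a player's total is multiplied by exp(-days_since_settled / 180). A minimal sketch, again with an illustrative function name:

```python
import math

def decayed_points(points, days_since_settled):
    """Points decay by a factor of e every 180 days after settlement."""
    return points * math.exp(-days_since_settled / 180)
```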
The score of each player with a positive score is written on one of our whiteboards and is updated semi-regularly.
Example Markets
Example binary outcome market:
Does the nearest 7-11 sell coconut water? | p(yes) | Points
---|---|---
House | 0.5 |
Alice | 0.4 | -32
Bob | 0.2 | -100
Alice | 0.3 | +58
Carol | 0.6 | +100
Outcome | Yes |
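Using the binary_score sketch from the Scoring section, the point values above can be reproduced (the outcome is yes, so predictions are scored directly against p(yes)):

```python
>>> round(binary_score(0.4, 0.5, outcome_yes=True))  # Alice
-32
>>> round(binary_score(0.2, 0.4, outcome_yes=True))  # Bob
-100
>>> round(binary_score(0.3, 0.2, outcome_yes=True))  # Alice
58
>>> round(binary_score(0.6, 0.3, outcome_yes=True))  # Carol
100
```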
Example multiple-choice market:
Faithless electors in 2016 | 0 | 1-5 | 6-36 | 37+ | Points
---|---|---|---|---|---
House | 0.4 | 0.4 | 0.1 | 0.1 |
Alice | 0.2 | 0.4 | 0.1 | 0.3 | 0
Bob | 0.2 | 0.5 | 0.2 | 0.1 | +100
Carol | 0.25 | 0.55 | 0.15 | 0.05 | -42
Bob | 0.1 | 0.3 | 0.58 | 0.02 | +195
Outcome | 6-36 | | | |
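Scoring a multiple-choice market uses only the probability each prediction assigned to the realized bucket (6-36; there were 7 faithless electors in 2016), so the prediction_score sketch from the Scoring section reproduces the points column:

```python
>>> round(prediction_score(0.1, 0.1))    # Alice, p(6-36) vs. house
0
>>> round(prediction_score(0.2, 0.1))    # Bob
100
>>> round(prediction_score(0.15, 0.2))   # Carol
-42
>>> round(prediction_score(0.58, 0.15))  # Bob
195
```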
Comments
comment by Trevor Hill-Hand (Jadael) · 2020-01-25T19:55:42.927Z · LW(p) · GW(p)
Can you elaborate more on whether there have been noticeable results in either A) taking successful actions based on the most recent predictions or B) improving the forecasting skills of the players? And if so, how were these things measured? How would you prefer to measure them?
↑ comment by NoSignalNoNoise (AspiringRationalist) · 2020-01-26T16:18:49.962Z · LW(p) · GW(p)
We don't have well-defined stats on how well people's prediction skills have improved over time. From my anecdotal observations, pretty much everyone (myself included) starts out vastly overconfident, and then after losing a lot of points in their first few predictions, reaches an appropriate level of confidence. I'm not sure if anyone goes from ok to great though.
↑ comment by NoSignalNoNoise (AspiringRationalist) · 2020-01-26T16:13:40.726Z · LW(p) · GW(p)
There have been only a few times that we've been able to take action based on the predictions, because doing so requires the following combination of factors, which tends not to occur together:
- The answers are sufficiently clear-cut to make the prediction scorable
- There are specific actions that depend on those well-defined answers
- Enough people in our local community have enough insight to get some sort of wisdom of crowds.
The examples where the predictions led to decisions are:
- When Scott Alexander visited Boston, we hosted a meetup that he came to, and we wanted to run a survey at the meetup. We took predictions on how many people would attend, along with a conditional prediction market on the survey response rate for paper vs. tablet. Based on this, we went with paper. Attendance was much higher than anticipated, and we ended up running out of forms (but were able to make copies, so it worked out ok).
- When our apartment had some maintenance issues and the landlord was giving us mixed signals about whether we'd be able to renew the lease, we took predictions on when/whether the issues would be resolved and whether the landlord would offer a renewal. Based on these, we decided not to look for a new place. The issue was in fact resolved and we were able to renew the lease.
- One of our housemates has moved away, with some ambiguity around whether it's temporary or permanent, and we have predictions on whether they will return. TBD how this one will go.