2020 Election: Prediction Markets versus Polling/Modeling Assessment and Postmortem

zvi

post by Zvi · 2020-11-18T23:00:01.166Z · LW · GW · 17 comments

    Predictions Over Time versus Predictions Once
  Easy Mode Evaluations
    Method One: Money Talks, Bull**** Walks
    Methods Two and Three: Trading Simulation, Where Virtual Money Talks, and The Green Knight Test
    Method Four: Log Likelihood (Effectively Similar to Brier Score)
    Method Five: Calibration Testing
    Method Six: The One Mistake Rule
  Hard Mode
    Aren’t Elections Inherently Unpredictable And Thus Always Close to 50/50?
    Is There Still a Chance Trump Stays in Office? Would Wagers Win If He Did?
    Was the market predicting the election would often be stolen? If so, was that prediction reasonable?
    Which was the better prediction for the mean result, that the popular vote margin would be about 4%, or that the popular vote margin would be 7.8%?
    What dynamics slash thinking were causing the market to make its prediction? Should we expect these dynamics to give good results in the future?
  Conclusion

Moderation/Commenting: I tried to minimize it, but this post necessarily involves some politics. See note at end of post for comment norms. I hope for this to be the last post I make that has to say anything about the election.

Previously: Evaluating Predictions in Hindsight, PredictIt: Presidential Market is Increasingly Wrong

There have been several posts evaluating FiveThirtyEight’s model and comparing its predictions to those of prediction markets. I believe that those evaluations so far have left out a lot of key information and considerations, and thus have been far too generous to prediction markets and overly critical of the polls and models. General Twitter consensus and other reactions seem to be doing a similar thing. This post gives my perspective.

The summary is:

The market’s movements over time still look crazy before the election, they don’t look great during the election, and they look really, really terrible after the election.
The market’s predictions for safe states were very underconfident, and it looks quite stupid there, to the extent that one cares about that.
According to the Easy Mode evaluation methods, Nate’s model defeats the market.
The market’s predictions on states relative to each other were mostly copied from the polls slash Nate Silver’s model, which was very good at predicting the order, and the odds of an electoral college versus popular vote split at various margins of the popular vote.
The market’s claim to being better is that it gave Trump a higher chance of winning, roughly matching what Nate’s model would have predicted given a final popular vote of Biden +4, which is about where it will land.
My model of the market says that it was not doing the thing we want to give it credit for, but rather representing the clash between two different reality tunnels, and the prices over time only make sense in that context.
The market priced in a small but sane chance of a stolen election before the election, but seems to have overestimated that chance after the election and not adjusted it over time.

We will go over the methods given in Evaluating Predictions in Hindsight for using Easy Mode, then ask remaining questions in Hard Mode.

First, we need to mention the big thing that most analysis is ignoring. Estimates were not made once at midnight before election day. All parties made estimates continuously until then. Prediction markets have kept making them continuously ever since, and modelers and other traditional sources have continued to give them informally and incompletely.

Predictions Over Time versus Predictions Once

Even when we look at Easy Mode, we have to decide which method we are using to judge predictions. Are we evaluating only on the morning of the election? Or are we evaluating versus the continuous path of predictions over time? Are we taking the sum of the accuracy of predictions at each point, or are we evaluating whether the changes in probability made sense?

One method asks, does Nate Silver’s 90% chance for Biden to win, and his probabilities of Biden to win various states, look better than the prediction market’s roughly 60%, together with its chances for Trump to win various states?

The other method takes that as one moment in time, and one data point in a bigger picture. That second method seems better to me.

It seems especially better to look at the whole timeline when trying to evaluate what is effectively a single data point. We need to think about the logic behind the predictions. Otherwise, a stopped clock will frequently look smart.

And even more so than we saw last time, the market is looking an awful lot like a largely stopped clock.

FiveThirtyEight’s slash Nate Silver’s predictions are consistent and make sense of the data over time. Over the course of the year, Biden increases his polling lead, and time passes, while events seem to go relatively poorly for Trump versus the range of possible outcomes.

You can certainly argue that 70% for Biden on June 1 was too high, and perhaps put it as low as 50%. But given we started at 70%, the movements from there seem highly sensible. If there was a bias in the polls, it seems logical to presume that bias was there in June. Movement after that point clearly favored Biden, and given Biden was already the favorite, that makes Biden a very large favorite.

As I laid out previously, when we compare this to prediction markets, we find a market that treated day after day of good things for Biden and bad things for Trump, in a world in which Trump was already the underdog, as not relevant to the probability that Trump would win the election. Day after day, the price barely moved, and Trump was consistently a small underdog. If you start Trump at 38% and more or less keep him there from July 20 all the way to November 3, that’s not a prediction that makes any sense if you interpret it as an analysis of the available data. That’s something else.

Now we get to extend that into election night and beyond.

During election night, we saw wild swings to Trump. There should definitely have been movement towards Trump as it became clear things would be closer than the polls predicted, but the markets clearly overreacted to early returns. Part of that was that they focused on Florida, and didn’t properly appreciate how much of the swing was in Miami-Dade and therefore was unlikely to fully carry over elsewhere. But a lot of it was basic forgetting about how blue versus red shifts worked in various places, and typical market night-of overreaction. You could get 3:1 on Biden at one point, and could get 2:1 for hours.
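To make those prices concrete, here is a minimal sketch of converting quoted odds into break-even probabilities. It assumes "3:1 on Biden" means a winning bet pays 3 units of profit per unit staked, and it ignores fees; the function name is illustrative only.

```python
def implied_probability(odds_against: float) -> float:
    """Break-even win probability for a bet that pays `odds_against`:1.

    Assumes no fees or withdrawal costs, which real sites do charge.
    """
    return 1.0 / (odds_against + 1.0)


for odds in (3, 2):
    p = implied_probability(odds)
    print(f"{odds}:1 on Biden -> break-even probability {p:.1%}")
```

Under those assumptions, 3:1 corresponds to a 25% chance for Biden and 2:1 to roughly 33%.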

When things were swinging back during the early morning hours, and Biden was an underdog despite clearly taking the lead in Wisconsin, it was clear that the market was acting bonkers. Any reasonable understanding of how the ballots were coming in would make Biden a large favorite to win Michigan, Wisconsin, Pennsylvania and Nevada, and the favorite in Arizona, with Georgia still up for grabs. Basic electoral math says that Biden was a large favorite by that point.

One could defend the market by saying that Trump would find a way to stop the count or use the courts, or something, in many worlds. That didn’t happen, and in hindsight seems like it was basically never going to happen, but it’s hard to make a good evaluation of how likely we should have considered that possible future given what was known.

What you can’t defend is that the market kept trading Trump at substantial prices long after the election was already over. Even after the networks called the election, a day or two later than necessary, the markets didn’t collapse, including at BetFair, where the market will be evaluated on projected winners rather than on who gets electoral college votes. Other places graded the bets, Trump supporters went into an uproar, and new markets on who would be president in 2021 were created in several places. None of this was a good look.

As I write this, Trump is still being given over a 10% chance of winning by PredictIt, and a 5.5% chance by BetFair. This is obvious nonsense. The majority of the money I’ve bet on the election, I wagered after the election was over.

Before the election, markets gave an answer that didn’t change when new information was available. After the election, they went nuts. You can’t evaluate markets properly if you neglect these facts.

Easy Mode Evaluations

Method One: Money Talks, Bull**** Walks

I bet against the prediction markets. I made money. Others in the rationalist sphere also made money. Some made money big, including multiple six figure wins. Money talks, bull**** walks.

Yes, I am assuming the markets all get graded for Biden, because the election is over. Free money is still available if you want it.

You would have lost money betting on Florida or North Carolina, or on Texas which I heard discussed, but according to the model’s own odds, those were clearly worse bets than betting on the election itself or on relatively safe states. There were better opportunities, and mostly people bet on the better opportunities.

I also made money betting on New York, California, Maryland, Wisconsin and Michigan.

Methods Two and Three: Trading Simulation, Where Virtual Money Talks, and The Green Knight Test

Normally we let models bet into markets and not vice versa. Seth Burn observed correctly that Nate Silver was talking some major trash and evaluating markets as having no information, so he advanced them straight to The Green Knight Test. That's where you bet into the market, but the market also gets to bet into your fair values. That makes it a fair fight.
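As a rough illustration of the two-way structure, here is a sketch of a single Green Knight contest on one binary question. It is one possible reading, not Seth Burn’s actual methodology: it assumes both sides have equal bankrolls of 1, that each side sizes its bet by the Kelly criterion using its own probability, and that there are no fees. All function names are hypothetical. The example numbers are the headline probabilities discussed above (model 90%, market roughly 60%).

```python
def kelly_fraction(belief: float, price: float) -> float:
    """Kelly stake, as a fraction of bankroll, for buying a $1 binary
    contract at `price` (0 < price < 1) when you believe it pays off
    with probability `belief`. Returns 0 when there is no edge."""
    if belief <= price:
        return 0.0
    return (belief - price) / (1.0 - price)


def bettor_pnl(belief: float, price: float, happened: bool) -> float:
    """Profit or loss for a Kelly bettor who may buy YES at `price`
    or NO at `1 - price`, whichever side their belief favors."""
    if belief > price:  # buy YES
        stake = kelly_fraction(belief, price)
        return stake * (1.0 - price) / price if happened else -stake
    if belief < price:  # buy NO
        stake = kelly_fraction(1.0 - belief, 1.0 - price)
        return -stake if happened else stake * price / (1.0 - price)
    return 0.0


def green_knight(model_p: float, market_p: float, happened: bool) -> float:
    """Model's net result when it bets into the market's price AND takes
    the other side of the market's bet into the model's fair value."""
    model_leg = bettor_pnl(model_p, market_p, happened)
    market_leg = bettor_pnl(market_p, model_p, happened)
    return model_leg - market_leg  # the model is the market's counterparty


print(green_knight(0.90, 0.60, happened=True))
print(green_knight(0.90, 0.60, happened=False))
```

In this sketch the counterparty leg can expose the model to more than one full bankroll, because the market is betting a longshot at the model’s own price. That asymmetry is what makes the test harder to pass than simply betting into the market.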

On the top line, Nate obviously won, and given liquidity, that’s where most of the money was. We can then see what happens if we treat all states equal to the presidency and enforce Kelly betting. Using Seth’s methodology we get this as his Green Knight test, where the market bets into Silver’s odds and Silver bets into the market’s odds:

STATE	Nate Risks	Nate Wins	Outcome
AK	2.56%	16.96%	-2.56%
AZ	29.49%	16.96%	16.96%
FL	49.21%	42.10%	-49.21%
GA	29.71%	29.50%	29.50%
IA	9.94%	14.03%	-9.94%
ME	3.17%	0.39%	-3.17%
MI	205.84%	23.45%	23.45%
MN	200.16%	20.03%	20.03%
MT	7.66%	65.51%	-7.66%
NC	31.58%	20.90%	-31.58%
NH	68.99%	12.32%	12.32%
NM	128.52%	10.28%	10.28%
NV	83.15%	15.63%	15.63%
OH	20.71%	34.64%	-20.71%
PA	107.33%	31.84%	31.84%
TX	15.14%	33.84%	-15.14%
WI	168.20%	22.70%	22.70%

Which adds up to a total of +42.7% for Nate Silver, soundly defeating the market. If Nate had been allowed to use market odds only, he would have won a resounding victory.
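The total can be checked directly from the Outcome column of the table above. This is only an arithmetic check on the listed figures, not a reconstruction of how the Risks and Wins columns were sized.

```python
# Outcome column from the presidential Green Knight table, in percent.
outcomes = {
    "AK": -2.56, "AZ": 16.96, "FL": -49.21, "GA": 29.50, "IA": -9.94,
    "ME": -3.17, "MI": 23.45, "MN": 20.03, "MT": -7.66, "NC": -31.58,
    "NH": 12.32, "NM": 10.28, "NV": 15.63, "OH": -20.71, "PA": 31.84,
    "TX": -15.14, "WI": 22.70,
}

total = sum(outcomes.values())
wins = sum(1 for v in outcomes.values() if v > 0)
losses = sum(1 for v in outcomes.values() if v < 0)

print(f"Net result: {total:+.2f}% across {wins} wins and {losses} losses")
```

The sum comes to +42.74%, matching the +42.7% quoted above, from 9 winning and 8 losing wagers.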

Also note that this excludes the non-close states. Those states are a massacre. PredictIt let me buy New York for Biden at 93% odds, which Nate Silver evaluated as over 99%, and so on. Nate Silver wins all of the wagers not listed, no exceptions.

On the presidential level, Nate Silver passes the Green Knight Test with flying colors.

How about for the Senate, which Seth also provides?

Race	Nate Risks	Nate Wins	Result
AK-Dem Senate	5.34%	20.59%	-5.34%
AL-Dem Senate	6.03%	36.75%	-6.03%
AZ-Dem Senate	4.84%	1.21%	1.21%
CO-GOP Senate	0.93%	5.79%	-0.93%
IA-Dem Senate	3.86%	3.58%	-3.86%
KS-Dem Senate	2.60%	9.54%	-2.60%
ME-GOP Senate	8.76%	15.45%	15.45%
MI-Dem Senate	42.78%	13.43%	13.43%
MN-Dem Senate	82.02%	11.01%	11.01%
MS-Dem Senate	2.49%	17.67%	-2.49%
MT-Dem Senate	1.73%	3.44%	-1.73%
NC-Dem Senate	9.30%	5.50%	-9.30%
TX-GOP Senate	0.44%	0.05%	0.05%
Result So Far			8.87%
GA-Dem Special	40.85%	35.97%

Georgia’s special election is a special case and a very strange election, where I think Nate made a mistake. Other than that, these disagreements were relatively small, and Nate once again comes out ahead even in a Green Knight test.

Passing a Green Knight test is not easy. You know what’s even harder? Passing a Green Knight test when the opponent took your predictions as a huge consideration when setting their odds.

And again, were Nate to go around picking off free money in the not-close races, he looks even better. Nate wins again.

Method Four: Log Likelihood (Effectively Similar to Brier Score)

Predicting the headline result with higher confidence scores better than lower confidence, and the model’s higher confidence in a lot of small states helps a ton if they count. So evaluating either the main prediction, or all predictions together, Nate wins again, despite the election being closer than expected.
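For the headline result alone, the comparison can be sketched as follows. The inputs are the approximate election-morning probabilities discussed earlier (90% for the model, roughly 60% for the market); the function names are illustrative, and a full evaluation would score every state and race rather than this single number.

```python
import math


def log_score(p_assigned_to_outcome: float) -> float:
    """Log likelihood of what actually happened (closer to 0 is better)."""
    return math.log(p_assigned_to_outcome)


def brier_score(p_event: float, happened: bool) -> float:
    """Squared error of a probability forecast (lower is better)."""
    return (p_event - (1.0 if happened else 0.0)) ** 2


biden_win_forecasts = {"FiveThirtyEight": 0.90, "market": 0.60}

for name, p in biden_win_forecasts.items():
    print(
        f"{name}: log score {log_score(p):.3f}, "
        f"Brier {brier_score(p, True):.3f}, "
        f"log score had Trump won {log_score(1.0 - p):.3f}"
    )
```

With Biden winning, the model scores about -0.105 against the market’s -0.511 on log likelihood, and 0.01 against 0.16 on Brier. The last column shows the price of that confidence: had Trump won, the model would have scored about -2.303 against the market’s -0.916.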

If you evaluate only on the ‘swing’ states, for the right definition of swing, it can be argued that the market finally wins one. That’s kind of a metric chosen to be maximally generous to the market if you’re looking at binary outcomes only.

It’s also worth noting that the market mostly knew which states were the swing states because of the polling and people like Nate Silver. Then the market adjusted the odds in those states towards Trump. Effectively this is a proxy for ‘the election went off Biden +4.5 rather than Biden +7.8’ rather than anything else.

Method Five: Calibration Testing

Nate claims that historically his models are well-calibrated or even a bit underconfident. The market’s predictions come out looking very underconfident, especially if you look at the 90%+ part of the distribution. Nate wins big.
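A calibration check of this kind can be sketched by bucketing forecasts and comparing the average stated probability in each bucket to how often the event actually happened. The input below is made-up toy data, chosen only to show what underconfidence looks like (ten favorites priced at 93%, like the New York price mentioned earlier, all of which win). It is not the actual market or model data.

```python
from collections import defaultdict


def calibration_table(forecasts, n_buckets=10):
    """Group (probability, happened) pairs into equal-width buckets.

    Returns {bucket_index: (mean_forecast, hit_rate, count)}.
    A well-calibrated forecaster has hit_rate close to mean_forecast.
    """
    buckets = defaultdict(list)
    for p, happened in forecasts:
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, happened))

    table = {}
    for idx in sorted(buckets):
        items = buckets[idx]
        mean_p = sum(p for p, _ in items) / len(items)
        hit_rate = sum(1 for _, h in items if h) / len(items)
        table[idx] = (mean_p, hit_rate, len(items))
    return table


# Toy data, NOT real prices: ten "safe" favorites priced at 93% that all win.
toy_forecasts = [(0.93, True)] * 10

for idx, (mean_p, hit_rate, n) in calibration_table(toy_forecasts).items():
    print(f"bucket {idx}: forecast {mean_p:.2f}, observed {hit_rate:.2f}, n={n}")
```

A hit rate well above the stated probability in the top bucket is the signature of underconfidence. With a sample this small it would prove little on its own, which is why the 90%+ bucket across all states and races is the interesting one.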

Method Six: The One Mistake Rule

We talked about this above. The market has made several absurd predictions. It gives Trump chances in deep blue states, and Biden chances in deep red states. It gives Trump major chances after the election is over. The market fails to adjust over time prior to the election. The market gave a 20%+ chance of a popular vote versus electoral college split, which is a proxy prediction of absurdly low variance in vote shares. The market made a lot of other small absurd predictions.

Nate Silver got some predictions wrong, but can you point to anything and say it was clearly absurd or unreasonable? The Georgia special senate election is a weird special case, but that would be an isolated error and time will tell. Other than that, I can’t find a clear mistake. Maybe he was overconfident in Florida, but the difference there was Miami-Dade and that was largely a surprise, plus Nate is running a hands-off model, and has to be evaluated in light of that.

Once again, seems to me like Nate wins big.

Hard Mode

We’ve already covered many of the issues we need to consider in pure hard mode.

The remaining five questions are:

What about the Taleb position that things like elections are basically unpredictable, so the odds should almost always be very close to 50/50?
Is there still a chance Trump stays in office? If he did, would bets on Trump win?
Was the market predicting the election would often be stolen? If so, was that prediction reasonable?
Which was the better prediction for the mean result, that the popular vote margin would be about 4%, or that the popular vote margin would be 7.8%?
What dynamics slash thinking were causing the market to make its prediction? Should we expect these dynamics to give good results in the future?

One way to see that the market and Nate strongly agreed on the ordering of the states was to look at the market on which state would be the tipping point. Throughout the process, that market not only broadly agreed with Nate, it agreed with changes in Nate’s model, and roughly on the distribution of lower probability tipping point states.

In light of that fact, we can broadly say that markets mostly gave credit to the polls for giving vote shares in various states relative to other states. All they were asserting was a systematic bias, or the potential thereof, or the existence of factors not covered by the polls (e.g. cheating, stealing, voter suppression, additional turnout considerations, etc, all by either or both sides).

Thus, I think these five questions sum up our remaining disagreements.

Aren’t Elections Inherently Unpredictable And Thus Always Close to 50/50?

Nope. Nonsense. Gibberish. Long tails are important and skin in the game is a vital concept, and Taleb has taught me more than all but a handful of people, but this position is some combination of nonsense and recency bias, and a pure failure to do Bayesian reasoning.

If you have evidence that suggests one side is more likely to win, that side is more likely to win. The position is silly, and models have done much, much better than you would do predicting 50/50 every time. Yes, elections have recently been in some sense reliably close, but if you had the 2008 or 1996 elections remotely close, you made a terrible prediction. If you had the 2012 election as close in probabilistic terms going into election day, you made another terrible prediction. The prediction markets were being really dumb in 2008 and 2012 (and as far as I know mostly didn’t exist in 1996).

Even when something is close that doesn’t make it 50/50. An election that is 70/30 to go either way despite all the polling data, such as the 2016 election, is pretty darn close! So is a football game that’s 70/30 at the start, or a chess game, or a court case, or a battle in a war, or a fistfight at a bar. Most fights, even fights that happen, aren’t close. That’s a lot of why most potential fights are avoided.

That doesn’t mean that Taleb hasn’t made valid critiques of Silver’s model in the past. His criticisms of the 2016 model were in large part correct and important. In particular, he pointed to the absurdly high probabilities given by the model at various points during the summer. With so much time left, I think the model was clearly overconfident when it gave Clinton chances in the mid-high 80s at various points during the summer, and panicked too much when it made Trump a favorite on July 30. Nine days later Clinton was at 87% to win. Really?

I’m not sure if that has been fixed, or if there was much less volatility in this year’s polls, or both. This year, the movements seemed logical.

Is There Still a Chance Trump Stays in Office? Would Wagers Win If He Did?

There’s always some chance. Until the day a person gives up power, there is some probability they will find a way to retain that power. Never underestimate the man with the nuclear codes who continues to claim he should keep them, and who is also commander in chief of the armed forces, and who a large minority of the population stands behind.

But every day, what slim chances remain get slimmer. Lawsuits get thrown out and dropped, and evidence of fraud fails to materialize. Pushback gets stronger. More and more results are certified and more and more people acknowledge what happened, shutting down potential avenues. State legislatures are making it clear they are not going to throw out the votes and appoint different electors, but even if they did, those electors would lack the safe harbor provision, so they’d need approval from the House that they wouldn’t get, and that results in Acting President Pelosi (or Biden, if they make Biden the new Speaker of the House in anticipation).

Everything that has happened has reinforced my model of Trump as someone who is likely to never admit defeat, and who claims that everything is rigged against him (just like the Iowa Caucuses in 2016, and the general election in 2016 which he won), but who never had a viable plan to do anything about it. Sure, lawsuits will be filed, but they’re about creating a narrative, nothing more.

It wasn’t a 0% plan, for two reasons. One is that thinking of Trump as having a plan is a category error. There are no goals, only systems that suggest actions. The other reason is that it had a non-zero chance of success. Convince enough Republican voters, put enough pressure on lawmakers at various levels, and hope they one by one fall in line and ignore the election results. To some extent it did happen, it could have happened in full, and if it had, who knows what happens next. Or alternatively, someone else could have come up with an actual tangible plan. Or maybe even someone could have found real fraud.

But at this point, all of that is vanishingly unlikely and getting less likely every day, and it seems even Trump has de facto acknowledged that Biden has won.

I’ve actually looked into this stuff in some detail, largely so I could sleep better at night, but also to better inform my real actions. It’s over. I will reiterate, as I did on my previous post, that I am open to discussing in more detail privately, if there are things worth discussing, as I have done with several people to help me understand and game out the situation.

What would happen if Trump pulled off this shot, however unlikely it may seem now? That depends on how he does it, and what rules operate on a given website.

BetFair’s rules seem clear to me, in that they are based on projected winners. Biden has won, no matter what happens. Bets should pay out. That’s why many markets have already paid out, and presumably BetFair isn’t doing so to avoid trouble rather than for any real reason. But if Trump did somehow stay in office, then they are still going to have a big problem, because both sides will say they won.

PredictIt’s rules are effectively ‘it’s our website and we’ll decide it however we want to.’ So it would presumably depend on the exact method. State elections refer to the ‘popular vote’ in those states, so if those votes got ignored by the legislature, it’s hard to say that Trump won those markets. Plenty of ambiguity there all around.

I feel very comfortable grading these wagers now.

Was the market predicting the election would often be stolen? If so, was that prediction reasonable?

Certainly that is what the market is predicting now and what it has been predicting since at least November 5. Otherwise the prices don’t make sense. So the question is, was that something it recently discovered after the election, when things were unexpectedly on edge? Or was it priced in to a large extent all along?

It was priced in to some extent because some participants adjusted their probabilities accordingly. Certainly I priced it in when deciding what my fair values were, and I chose states to bet on with that in mind (and did that somewhat poorly, in hindsight the right places were more like Minnesota). But it couldn’t have been priced in much, because the relative prices of the different states didn’t reflect it. In particular, Pennsylvania was always the obvious place to run a steal, since it was the likely tipping point, had a Republican legislature, existing legal disputes and a lot of rhetoric about Philadelphia already in place. But Pennsylvania was exactly where it should have been relative to other states.

The counterargument is that the market assigned a 20% chance of a split between electoral college and popular vote, and presumably this would be how you get that answer?

My guess is that worries about theft moved the market on the order of 2%, which then expanded to 5%-10% as things proved close, which is roughly consistent, but also proved to be higher than it needed to be. But did we have the information to know that? Were such efforts essentially doomed to fail?

It’s hard to say, even in hindsight. I wagered as if there was a substantial chance of this, and now consider my estimate to have been too high, but also perhaps emotionally necessary. An obviously stolen election would have been much worse for my life than an apparent clean victory, no matter the outcome on who won.

Which was the better prediction for the mean result, that the popular vote margin would be about 4%, or that the popular vote margin would be 7.8%?

This is the real crux of it all to many people.

We now know that the margin will be something around 4%-5%, and the effective margin in the electoral college (the margin in the tipping-point state) around 0.6%. FiveThirtyEight’s model thought the break-even point for Biden being a favorite in the electoral college was a popular vote win of around 3.5%. That seems like a great prediction now, and we have no reason to think the market substantially disagreed.

Thus, it mostly comes down to this question. Assume the market was not pricing in much theft or fraud, and thus expected Trump to only lose the popular vote by an average of 4%. Was that a good prediction?

A lot of people are saying it was a good prediction because it happened. That’s suggestive, but no. That’s not how this works.

There were stories about why we should expect Trump to beat his polls, sure. But there were also stories of why Biden should expect to beat his polls instead.

One theory is ‘they messed up in 2016 so they’ll mess up again the same way.’ This is especially popular in hindsight. I do not think that is fair at all. This result is well within the margin of error. Once all the votes get counted, we are talking about a 3-4% polling error, which is well within historical norms. And when we look at details, the stories about why Trump obviously beat his polls don’t seem right to me at all.

Trump’s turnout operation did a great job relative to expectations, and Biden’s day-of operation did not do as well. Good, but that’s relative to expectations. Polls have to account for Democrats doing lots of mail ballots and Republicans going for election day, and it’s hard to get right. When you get it wrong, you can be wrong in either direction, and also it’s possible that the error was that not all mail ballots were counted. The Covid-19 situation was pretty unique, and trying to correct for that is hard. Nate has speculated that perhaps different types of people were at home more or less often this time, and that wasn’t corrected for sufficiently, and that had interesting effects on error. That’s not something people talked about pre-election anywhere I noticed, even in hindsight.

Shy Trump voters! Every time a populist right-wing candidate with views the media finds unacceptable runs, we have to debunk this, and note that the historical pattern is for such candidates not to beat their polls. In addition, if this was a shy voter effect, then you would expect blue state Trump voters to be shy (social disapproval) but red state Trump voters to not be as shy (social approval). Thus, you’d expect that the bluer the state, the more Trump would beat his polls, if what was happening was that Biden supporters were scaring Trump supporters into keeping things secret. Instead we had the opposite. Trump outperformed his polls more in red areas.

Historically, polls were accurate in 2018, and Obama beat his polls in 2012, and so on. It goes both ways.

You could also view this as Republicans and undecideds inevitably coming home to Trump late, and the model underestimating the amount of tightening that would happen. That seems like a reasonable argument, but not big enough to account for predicting outright a 4% margin.

Nate’s explanation is that the model gave Biden a 90% chance of victory exactly because a ‘normal’ polling error, like the one we saw, would still likely mean Biden wins, and that’s exactly what happened.

Does it feel right in hindsight that Trump only lost by 4%? Sure. But suppose he had lost by 12%. Would that have felt right too? Would we have a different story to tell? I think strongly yes.

So why does this feel so different than that?

I think it’s the sum of several different things.

One is that people set expectations for polls too high. A 4% miss, even with the correct winner, is no longer acceptable with stakes and tensions so high, even though it is roughly expected.

Two is that the blue shift meant that the error looked much bigger than it was. The narratives get set long before we get accurate counts. There’s plenty of blame to go around on this, as California and New York take forever to count ballots.

Three is that we now have distinct reality tunnels, so a lot of people thought Trump would obviously win regardless of the evidence, either because they supported Trump (MAGA!) or because they were still traumatized by 2016 and assumed he’d somehow win again. Whereas others had never met a Trump supporter and couldn’t imagine how anyone could be one, and assumed Biden victory, so everyone was of course standing by to say how terrible the polls were.

Four is that polls involve numbers and are run by nerds, who are low status, so we are always looking to tear such things down when they dare to make bold claims.

Five is that polls are being evaluated, as I’ve emphasized throughout, against a polls plus humans hybrid. They are not being evaluated against people who don’t look at polls. That’s not a fair comparison.

All of this leads into the final question, which to me is the key thing to observe.

What dynamics slash thinking were causing the market to make its prediction? Should we expect these dynamics to give good results in the future?

I think the best model is to consider several distinct groups. Here are some of them.

We have the gamblers who are just having fun and doing random stuff.

We have the partisans. There are Democrats and Republicans who will back their candidate no matter what because they are convinced they will win. They don’t do math or anything like that. Yes, the market overall was freaking huge, but with only one market every four years, partisans can wager big in the dark and not go bankrupt ever.

We have the hedgers who are looking for a form of insurance against a win by the wrong party.

We have the modelers who are betting largely on the polls and models.

We have the sharps who are trying to figure things out based on all available information, potentially including institutional traders like hedge funds.

We have the campaigners who bet money to move the market to give the impression their candidate is winning.

One big actor is enough to move the market a lot, and campaigning in this way is an efficient thing to do even with this year’s deeper market.

My experience suggests that most sharps have much bigger and better and more liquid markets to worry about, especially given the annoyance of moving money to gambling sites. They mostly prefer to let the market be a source of information, and sometimes trade privately among themselves, often at prices not that close to the market.

If the sharps and modelers had been in charge, prices would move more over time. They exist, but there are not that many of them and they are mostly size limited.

We see this a lot on major events, as I’ve noted before, like the Super Bowl or the World Cup. If you are on the ball, you’ll bet somewhat more on a big event, but you won’t bet that much more on it than on other things that have similarly crazy prices. So the amount of smart money does not scale up that much. Whereas the dumb money, especially the partisans and gamblers, comes out of the woodwork and massively scales up.

What I think essentially was happening was that we had partisans making dumb bets on Trump, other partisans making dumb bets on Biden (which were profitable, but for dumb reasons), and then modelers having enough money to move things somewhat but not all that much until election night.

Then after election night, the people who are buying the whole ‘Trump won!’ line see a chance to buy Trump cheap and do so, combined with most people who would bet on Biden already having bet what they’re willing to wager, and there needing to be a 10:1 or higher ratio of dollars wagered to balance out. So the price stays crazy.

Similar dynamics create the longshot bias problem, and the ‘adds to far more than 100%’ problem that allows pure free money, which are very real issues with these markets.
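The ‘adds to far more than 100%’ problem can be sketched with a few lines of arithmetic. The prices below are hypothetical, chosen only for illustration, and the sketch assumes the outcomes are mutually exclusive and exhaustive, that you can sell (or buy ‘No’ on) every outcome at the quoted prices, and that there are no fees or position limits. Real sites have both, which is part of why the gap persists.

```python
def sell_every_outcome_profit(yes_prices: dict) -> float:
    """Guaranteed profit per $1 set from selling one YES contract on every
    outcome: you collect the sum of the prices and pay out exactly $1,
    because exactly one outcome happens. Ignores fees and limits."""
    return sum(yes_prices.values()) - 1.0


# Hypothetical prices for a three-way market, NOT real quotes.
hypothetical_prices = {"Candidate A": 0.62, "Candidate B": 0.42, "Field": 0.03}

profit = sell_every_outcome_profit(hypothetical_prices)
print(f"Prices sum to {sum(hypothetical_prices.values()):.2f}, "
      f"locked-in profit per set: {profit:.2f}")
```

Whenever the prices sum to more than 1, the excess is risk-free under these assumptions, which is what makes it pure free money rather than a disagreement about probabilities.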

Conclusion

Who comes out ahead? If you want to know what will happen in 2024, or in any other election, where should you look? To the market, or to the models? Can you make money betting in prediction markets using models?

My conclusion is that the FiveThirtyEight model comes out better than the markets on most metrics. The market implicitly predicted an accurate final popular vote count, on election morning, but that’s largely a coincidence in multiple ways. Markets still have value and should be part of how you anticipate the future, but until their legal structures improve, they should not be your primary source, and you shouldn’t hesitate to make money off them where it is legal.

If you literally can only look at one number, then any model will lag when events happen, which is a problem. There will also be events that don’t fit into neat patterns, where the market is the tool you have, and it will be much better than nothing. Those are the biggest advantages of markets.

If I had to choose one source for 2024, and I knew there had been no major developments in the past week, I would take FiveThirtyEight’s model over the prediction markets.

If I were allowed to look at both the models and markets, and also at the news for context, I would combine all sources. If the election is still far away, I would give more weight to the market, while correcting for its biases. The closer to election day, the more I would trust the model over the market. In the final week, I expect the market mainly to indicate who the favorite is, but not to tell me much about the degree to which they are favored.

If FiveThirtyEight gives either side 90%, and the market gives that side 62%, in 2024, my fair will again be on the order of 80%, depending on how seriously we need to worry about the integrity of the process at that point.

The more interesting scenario for 2024 is FiveThirtyEight is 62% on one side and the market is 62% on the other, and there is no obvious information outside of FiveThirtyEight’s model that caused that divergence. My gut tells me that I have that one about 50% each way.
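One way to read the two scenarios above is to ask how much weight on the model each gut number implies. The sketch below assumes a simple linear blend, fair = w * model + (1 - w) * market. That blend is an assumption made for illustration; the numbers in the text are gut estimates, not the output of a formula.

```python
def implied_model_weight(model_p, market_p, fair_p):
    """Weight w on the model such that fair = w * model + (1 - w) * market.

    Linear blending is an illustrative assumption, not a stated method.
    """
    if model_p == market_p:
        raise ValueError("model and market agree; weight is undefined")
    return (fair_p - market_p) / (model_p - market_p)


# Scenario one: model 90%, market 62%, fair on the order of 80%.
w1 = implied_model_weight(0.90, 0.62, 0.80)
# Scenario two: model 62%, market 38% for the same side, fair about 50%.
w2 = implied_model_weight(0.62, 0.38, 0.50)
print(f"scenario one implies weight {w1:.2f} on the model")
print(f"scenario two implies weight {w2:.2f} on the model")
```

Both fair values are rough, so the implied weights are rough too. The point is only that in each case the estimate lands between the two sources rather than deferring to either one.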

A scenario I very much do not expect is the opposite of the first one. What if the market says 90% for one candidate, and FiveThirtyEight says 62%, with no obvious reason for the disagreement? My brain says ‘that won’t happen’ but what if it does? Prediction markets tend to be underconfident and now they’re making a super strong statement despite structural reasons it’s very hard to do that. But the polls somehow are pretty close. My actual answer is “I would assume the election is likely not going to be free and fair.” Another possibility is someone is pumping a ton of money in to manipulate the market or one fanatic has gone nuts, but I think that getting things much above 70% when the situation doesn’t call for it would get stupidly expensive.

In all these cases I’d also be looking at the path both sources took to get to that point.

I think that you can definitely make money betting with models into prediction markets. You can also make money betting without models into prediction markets! Sell things that are trading at 5%-15% and don’t seem likely to increase further, and sell things that have gotten less likely without the price moving. Sell things that seem silly when the odds add up to over 110%. Rinse and repeat. But you can make even more if you let models inform you.
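As a rough sketch of why selling overpriced longshots pays, the function below computes the expected profit from selling one contract that pays $1 if the event happens. The price and the believed probability are hypothetical, and fees and the time value of locked-up capital are ignored.

```python
def short_edge(price, true_prob):
    """Expected profit and return on capital from selling one $1 contract.

    The seller collects `price` up front and pays $1 with probability
    `true_prob`. Capital at risk is 1 - price. Fees are ignored.
    """
    expected_profit = price - true_prob
    capital_at_risk = 1.0 - price
    return expected_profit, expected_profit / capital_at_risk


# Hypothetical: a contract trades at 10 cents, you believe 3% is right.
ev, roi = short_edge(price=0.10, true_prob=0.03)
print(f"expected profit per contract: ${ev:.2f}")
print(f"expected return on capital at risk: {roi:.1%}")
```

The edge per contract is small relative to the capital tied up, which is part of why size-limited smart money does not fully correct these prices.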

I’d love to see prediction markets improve. If the market had been fully legal and trading on the NYSE, I’d expect vastly different behavior the whole way through. Until then, we need to accept these markets for what they are. When 2024 (or perhaps even 2022) rolls around, you may wish to be ready.

Moderation Note: This topic needs to be discussed by those interested in markets, prediction markets, predictions, calibration and probability, but inevitably involves politics. In order to evaluate prediction markets this election, we need to talk about areas where reality tunnels radically differ, because both things those tunnels disagree about and also the tunnels themselves are at the core of what happened. If you are discussing this on LessWrong, your comments will show up in recent discussion. I’d like to keep such comments out of recent discussion, so to avoid that, if you feel the need for your comment to be political, please make the comment on the Don’t Worry About the Vase version of the post. For this post and this post only, I’m going to allow election-relevant political statements and claims on the original version, to the extent that they seem necessary to explore the issues in question. For the LessWrong comment section, the norms against politics will be strictly enforced, so stick to probability and modeling.

17 comments

comment by knite · 2020-11-21T00:31:37.224Z · LW(p) · GW(p)

Would love to hear more about your specific bets, as well as your "multiple six figure wins" friends! Personally, after maxing several PredictIt markets for a total of low thousands bet, I didn't feel like there was a good market place to bet larger size (tens of thousands USD or higher in crypto).

comment by Anon User (anon-user) · 2020-11-19T00:25:37.295Z · LW(p) · GW(p)

Random data point - https://ftx.com/trade/TRUMPFEB ("Trump is the President on Feb 1st, 2021") is currently at 0.142 (14.2% probability it will happen)...

Replies from: Zvi

↑ comment by Zvi · 2020-11-19T13:07:22.366Z · LW(p) · GW(p)

Looking at that page, I wonder whether the price would be different if it was in BTC rather than USD. Right now, if I'm confident enough in blockchain to put a large amount on this exchange, holding that amount in USD rather than BTC for several months seems like a substantial cost to such a person, perhaps on the order of what you can win in this market.

Replies from: Pablo_Stafforini, Isma

↑ comment by Pablo (Pablo_Stafforini) · 2020-11-19T16:33:02.962Z · LW(p) · GW(p)

I'm not sure I understand your argument, given that FTX allows traders to keep balances in both USD and BTC, but in any case historically FTX prices have been in line with Betfair/PredictIt prices, so I doubt this consideration is relevant.

Replies from: Zvi

↑ comment by Zvi · 2020-11-19T17:07:53.574Z · LW(p) · GW(p)

I haven't used FTX because illegal in USA slash usual worries. So you could deposit BTC, bet on the election, and if you win always get back your BTC+15%? Or not?

Replies from: Pablo_Stafforini, Isma, Isma

↑ comment by Pablo (Pablo_Stafforini) · 2020-11-19T18:06:05.258Z · LW(p) · GW(p)

The contracts are denominated in USD, and they pay in that currency. But you trade on margin, and the collateral can be in any currency (crypto or fiat). In your example, you get back the BTC plus 15% of what that BTC was worth in USD when you made the trade.

Incidentally, TRUMPFEB is now trading at 0.16 (i.e. implied 16% chance that Trump is president next February). This looks insane to me (and I have bet accordingly). I'd be curious if you or others have further thoughts on what might be going on.

Replies from: knite

↑ comment by knite · 2020-11-21T00:33:29.626Z · LW(p) · GW(p)

What were (or are now) the best places for US persons interested in betting significant crypto sums?

↑ comment by Isma · 2021-03-23T19:24:31.258Z · LW(p) · GW(p)

Side note: the counterparty risk (either from hacks or scams) of FTX seems extremely low right now. The team behind FTX/Alameda has built a flawless reputation in the crypto industry over the past few years, and is considered among the strongest technically. They basically went from non-existent to top ~3 of crypto exchanges worldwide in volumes in a matter of ~1 year. I think FTX will continue to grow, as (1) crypto grows as a whole; and (2) FTX innovates and grows more than the competition.

For US residents, www.ftx.us is available (but more restricted).

↑ comment by Isma · 2021-03-23T19:24:05.308Z · LW(p) · GW(p)

The short answer is yes: you could bet in BTC because FTX pools collateral across all positions: https://help.ftx.com/hc/en-us/articles/360027946371-Margin-Collateral

↑ comment by Isma · 2021-03-23T19:25:06.271Z · LW(p) · GW(p)

Side note: the counterparty risk (either from hacks or scams) of the FTX exchange seems extremely low right now. The team behind FTX/Alameda has built a flawless reputation in the crypto industry over the past few years, and is considered among the strongest technically. They basically went from non-existent to top ~3 of crypto exchanges worldwide in volumes in a matter of ~1 year. I think FTX will continue to grow, as (1) crypto grows as a whole; and (2) FTX innovates and grows more than the competition.

For US residents, www.ftx.us is available (but more restricted).

comment by Rafael Harth (sil-ver) · 2020-11-19T13:56:24.298Z · LW(p) · GW(p)

Given my previous noises on the topic, it won't be surprising that I agree with this post. However,

The more interesting scenario for 2024 is FiveThirtyEight is 62% on one side and the market is 62% on the other, and there is no obvious information outside of FiveThirtyEight’s model that caused that divergence. My gut tells me that I have that one about 50% each way.

it's possible that I didn't read the post carefully enough, but I don't quite get this. My instinct would be to bet against the market if this happens. Why is 37/63 different from 63/90?

Replies from: SimonM

↑ comment by SimonM · 2020-11-19T14:41:40.939Z · LW(p) · GW(p)

I think Zvi would also bet against the market if that happened. If he thinks the probability is 50% and the market is offering 38%, that's a great bet.

He's completely consistent in that he puts the probabilities of these events between the markets and Nate (which inevitably means betting against the market in the direction of the models).

Replies from: sil-ver

↑ comment by Rafael Harth (sil-ver) · 2020-11-19T15:19:30.284Z · LW(p) · GW(p)

Oh, duh. I got confused there, but you're right that there's no inconsistency to explain.

comment by habryka (habryka4) · 2020-11-19T00:44:11.058Z · LW(p) · GW(p)

Method Two: Trading Simulation, Where Virtual Money Talks

I am not yet done with the post, but this heading is completely empty. Is that intentional?

Replies from: Zvi

↑ comment by Zvi · 2020-11-19T01:19:56.974Z · LW(p) · GW(p)

Was meant to fix that, will shortly. Should be fixed now - it's a joint title that I fixed in the draft but I forgot to copy over the edit.

comment by redlizard · 2020-11-19T15:56:52.677Z · LW(p) · GW(p)

We see this a lot on major events, as I’ve noted before, like the Super Bowl or the World Cup. If you are on the ball, you’ll bet somewhat more on a big event, but you won’t bet that much more on it than on other things that have similarly crazy prices. So the amount of smart money does not scale up that much. Whereas the dumb money, especially the partisans and gamblers, come out of the woodwork and massively scale up.

This sounds like it should generalize to "in big events, especially those on which there are vast numbers of partisans on both sides, the prediction markets will reliably be insane and therefore close to useless for prediction purposes". Is that something one sees in practice for things like the World Cup?

One way the World Cup and Super Bowl differ from the election is that in a championship match, only a modest fraction of the people interested in the tournament as a whole will be partisans for the two contestants participating in the championship match; whereas in the election, I would expect the partisan coefficient to be much higher than that. Would that affect the degree to which the dumb money will overwhelm the smart money? I would expect so.

Replies from: Zvi

↑ comment by Zvi · 2020-11-19T17:06:48.210Z · LW(p) · GW(p)

That goes too far. What happens is that the line is less accurate than normal, and there will almost always be good value if you look around the various offerings. But there are a lot of forces that will come in hard if the number gets super wrong, and a lot of ways for even relatively dumb money to know more or less what the odds should be. So if a WC match is actually 65-35, it might be 60-40 or 70-30 instead, which is a great opportunity, but it's not going to be useless. It depends what you already know - you should have a fair value in mind, expect a second value that's different, then look at what you find. And if it's not what you expected, maybe you modeled the public wrong, but also maybe you're missing something.

Basically, if you want to know who the favorite is, the line is trustworthy unless it's super close (52-48 or something). For exactly how big a favorite, and especially for various secondary lines, it gets less trustworthy.

In an election, you don't have those kinds of anchors, so it's easier to get far out of whack.
