Prediction Contest 2018

post by jbeshir · 2018-04-30T18:26:32.104Z

Contents

  About
  How to Enter
  When It Ends
  The Predictions
    World Politics
    Technology
    Effective Altruism
    Rationalist Community
  Wrapping Up

Summary: Assign a probability to every listed prediction before the end of June 2018; if your predictions turn out to be the best, you'll win $200 after they all settle in January 2019.

About

This is happening! As discussed in the earlier post, I'm running a prediction contest for the rationalist and EA communities. There are 20 predictions: seven about world politics over the next year, five about technology, six about the upcoming year for EA organisations, and two about rationalist websites, for a mix of topics with a lean towards things it would be helpful to know about. They were inspired by a variety of sources, particularly Socratic Form Microblogging's world politics predictions, Slate Star Codex's predictions for the year, the SSC community's reddit thread of predictions, and a handful of EA organisation blog posts about their past and future plans.

How to Enter

Each entry in the list below links to a prediction on PredictionBook. Using a new or existing PredictionBook account, assign a probability to each of them sometime before the end of June (the enforced deadline is 12:00 midday on the 1st of July UTC, to accommodate variation in timezones). Then submit a contact email and your PredictionBook account name through this form, and you're done!

You can assign a probability more than once, and I'll take whichever probability was assigned most recently before the deadline, so you can feel safe making predictions early and updating them later if new information arrives (both will affect your PredictionBook calibration calculation, but that doesn't feed into the contest). In particular, you can do this even after submitting the form; I won't collect assigned probabilities until after the deadline.

Tip: If the full list feels daunting, you don't need to do it all in one go; you can make a handful of predictions now and then throughout the two months and they'll all be done by the end. Maybe just do one and see afterwards if you feel like doing more.

When It Ends

All the predictions settle by the 1st of January 2019. At some point before the end of January 2019, I'll check that all the predictions have settled accurately, calculate the log score for each entrant, make a new post announcing the results and listing the scores and ranks of the entrants (by PredictionBook account; contact emails remain private), and email the winner at their contact address.
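To be concrete about the scoring: the log score is just the sum, over all the predictions, of the log of the probability you assigned to the outcome that actually happened, and the entrant with the highest total wins. Here's a minimal sketch of that calculation in Python, assuming natural logs and binary outcomes; the names and numbers are illustrative, not the exact script I'll run:

```python
import math

def log_score(assigned, outcomes):
    """Sum of log(probability assigned to what actually happened).

    assigned: dict of prediction name -> probability given to "yes"
    outcomes: dict of prediction name -> True/False once it settles
    Higher (closer to zero) is better; probabilities of exactly 0 or 1
    aren't handled here.
    """
    total = 0.0
    for name, happened in outcomes.items():
        p = assigned[name]
        total += math.log(p if happened else 1.0 - p)
    return total

# Illustrative example: 0.8 on something that happened and 0.3 on
# something that didn't gives log(0.8) + log(0.7) ≈ -0.58.
print(log_score({"libya": 0.8, "bitcoin": 0.3},
                {"libya": True, "bitcoin": False}))
```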

I'll send the winner a random string and ask them to post it as a comment on a particular one of the predictions using their PredictionBook account, to verify account ownership, and then ask them to provide me with a PayPal account to send the prize to (I might be able to do other payment methods; we can discuss at the time).

The Predictions

Without further ado, the predictions:

World Politics

Africa - Libya still has two rival governments on January 1, 2019

Middle East - Fatah and Hamas do not meaningfully reconcile in 2018 (e.g. Fatah still doesn't control Gaza by January 1, 2019)

Middle East - Iran withdraws from the deal limiting its nuclear program before the end of 2018

Middle East - Saudi Arabia does not conduct any airstrikes in Yemen between the start and end of December 2018

South America - FARC peace deal remains in place on January 1, 2019

US - A department of the Federal Government is eliminated before the end of 2018

US - The investigation run by Special Counsel Robert Mueller is still ongoing at the end of 2018 (and is still run by Mueller, i.e. Mueller has not died, been removed, or formally concluded his investigation)

Technology

Tesla will deliver at least 180,000 Model 3s to customers in 2018

New iPhone model released with lowest option priced over $800 in the US before the end of 2018

A fake picture, video, or audio sample of a famous person doing/saying “something awful” causes a scandal reported by the BBC in 2018

The price of a bitcoin is over $10,000 at the end of 2018

Ethereum market cap is below Bitcoin market cap at end of 2018

Effective Altruism

GiveWell publishes at least two reports on interventions to influence policy by end of 2018

GiveWell's end of year donation recommendations for 2018 once again recommend a majority of direct donations go to Against Malaria Foundation

Schistosomiasis Control Initiative remains a GiveWell top charity at end of 2018

Evidence Action's No Lean Season remains a GiveWell top charity at end of 2018

Machine Intelligence Research Institute raises more in a 2018 fundraiser than in its 2017 fundraiser

Over 4500 people have signed up to the Giving What We Can Pledge by the end of 2018

Rationalist Community

Slate Star Codex gets mentioned in the New York Times (by someone other than Ross Douthat) between the 1st of July and the end of 2018

LessWrong.com has at least 20 front page posts approved during December 2018

Wrapping Up

The expected value of entering obviously depends on how many people enter, which is something I'm curious to see; it could be only a couple of people or it could be tens of people (larger numbers seem unlikely, but would be great... from my perspective as a non-entrant, anyway). I'm considering this first run an experiment, burning $200 to try the idea out. If it works well, I'd hope to run a similar one next year, maybe incorporating any feedback I get on the concept here. If not, we try things.

Questions, feedback, ideas, etc, are very welcome.

4 comments


comment by zulupineapple · 2018-04-30T20:11:47.245Z

Measuring accuracy is a good way to assess the quality of our models. But not all accurate models are inherently good to have.

I'm wondering about the questions you picked. Do you feel that there is some utility for you in being able to predict, e.g. the future situation in Libya? I don't really think there is, but then I struggle to come up with more useful alternatives, at least ones that aren't personal.

I appreciate what you're doing here, and I'd like to be doing it myself on some level (though I don't currently intend to participate). But I'm unsure to what extent this sort of contest is useful to have, and to what extent it is just a game.

comment by jbeshir · 2018-05-01T08:59:15.154Z

The usefulness of a model of the particular area was something I considered in choosing between questions, but I had a hard time finding a set of good non-personal questions with very high value to model. I tried to pick questions which in some way depended on interesting underlying questions. For example, the Tesla one hinges on the ability to predict the performance of a known-to-overpromise entrepreneur more precisely than either maximum cynicism or full trust, and the ability to predict the ongoing manufacturing ramp-up of tech facing manufacturing difficulties, both of which I think have value.

World politics is, I think, the weakest section in that regard. That's a big part of why, rather than just taking twenty questions from the various sources of world politics predictions I had available, I looked for other questions and made a bunch of my own EA-related ones by going through EA org posts looking for uncertain pieces of the future, reducing the world politics questions to only a little over a third of the set.

That said, I think the world politics questions do have transferability in calibration if not precision (you can learn to be accurate on topics you don't have a precise model for by having a good grasp of how confident you should be), and they exercise the general skill of skimming a topic, arriving at impressions about it, and knowing how much to trust those impressions. There are general skills of rationality being practiced here, beyond gaining specific models.

And while it is the weakest section, I think it does have some value. There's utility in having a reasonable grasp of how governments behave, and in particular how quickly they change under various circumstances: the way governments behave and react in the future will set the regulatory environment for technological development, and the way they behave in geopolitics affects risk from political instability, both as a civilisational risk in itself and as something that could require mitigation in other work. There was an ongoing line of questioning about how good it is, exactly, to have a massive chunk of AGI safety orgs in one coastal American city (in particular during the worst of the North Korea situation), and a good model for that is useful for deciding whether it's worth trying to fund the creation, expansion, and focusing of orgs elsewhere as a "backup", for example. That's a decision that can be taken individually on the basis of a good grasp of how concerned you should be, exactly, about particular geopolitical issues.

These world politics questions are probably not perfectly optimised for that (I had to avoid anything on North Korea in particular due to the current rate of change), and it'd be nice to find better ones, along with more useful questions in other areas, and shrink the section further next year. I think they probably have some value to practice predicting on, though.

comment by zulupineapple · 2018-05-01T10:24:07.891Z

I like the EA section. I think grouping people by specific goals/interests and preparing questions for those goals is the right way. If I cared about EA, then being able to predict which charities will start/stop being effective, before they actually implement whatever changes they're considering, would allow me to spend money more efficiently. It would be good not only to have an accurate personal model, but also to see other people with better models make those predictions, and know how reliable they really are.

Likewise, we could have something about AGI, e.g. "which AGI safety organization will produce the most important work next year", so that we can fund them more effectively. Of course, "most important" is a bit subjective, and there is also a self-fulfilling component to this (if you don't fund an organization, then it won't do anything useful). But in theory being able to predict this would be a good skill for someone who cares about AGI safety.

Problem is, I don't really know what else we commonly care about (to be honest, I don't care about either of those much).

"I think the world politics questions do have transferability in calibration"

I would also like this to be true, but I wonder if it really is. There is a very big difference between political questions and personal questions. I'd ask if someone has measured whether they experience any transfer between the two, but then I'm not even sure how to measure it.

comment by jbeshir · 2018-05-01T11:54:10.923Z

It might be nice to have a set of twenty EA questions, a set of twenty ongoing-academic-research questions, a set of twenty general tech industry questions, and maybe a set of twenty world politics questions for the people who like them, and at some point run multiple contests that each refine predictive ability within a particular domain, yeah.

It'd be tough to source that many, and I feel that twenty is already about the minimum sample size I'd want to use; for research questions it'd probably require some crowdsourcing of interesting upcoming experiments to predict on. But particularly if help turns out to be available, it'd be worth considering if this smaller version works.