Predict responses to the "existential risk from AI" survey

post by Rob Bensinger (RobbBB) · 2021-05-28T01:32:18.059Z · LW · GW · 6 comments

I sent a short survey to ~117 people working on long-term AI issues, asking about the level of existential risk from AI; 44 responded.

In ~6 days, I'm going to post the anonymized results. For now, I'm posting the methods section of my post so anyone interested can predict what the results will be.

[Added June 1: Results are now up, though you can still make predictions below before reading the results.]

Methods

You can find a copy of the survey here. The main questions (including clarifying notes) were:

1. How likely do you think it is that the overall value of the future will be drastically less than it could have been, as a result of humanity not doing enough technical AI safety research?
2. How likely do you think it is that the overall value of the future will be drastically less than it could have been, as a result of AI systems not doing/optimizing what the people deploying them wanted/intended?
_________________________________________
Note A: "Technical AI safety research" here means good-quality technical research aimed at figuring out how to get highly capable AI systems to produce long-term outcomes that are reliably beneficial.
Note B: The intent of question 1 is something like "How likely is it that our future will be drastically worse than the future of an (otherwise maximally similar) world where we put a huge civilizational effort into technical AI safety?" (For concreteness, we might imagine that human whole-brain emulation tech lets you gather ten thousand well-managed/coordinated top researchers to collaborate on technical AI safety for 200 subjective years well before the advent of AGI; and somehow this tech doesn't cause any other changes to the world.)
The intent of question 1 *isn't* "How likely is it that our future will be astronomically worse than the future of a world where God suddenly handed us the Optimal, Inhumanly Perfect Program?". (Though it's fine if you think the former has the same practical upshot as the latter.)
Note C: We're asking both 1 and 2 in case they end up getting very different answers. E.g., someone might give a lower answer to 1 than to 2 if they think there's significant existential risk from AI misalignment even in worlds where humanity put a major civilizational effort (like the thousands-of-emulations scenario) into technical safety research.

I also included optional fields for "Comments / questions / objections to the framing / etc." and "Your affiliation", and asked respondents to

Check all that apply:
☐ I'm doing (or have done) a lot of technical AI safety research.
☐ I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.

I sent the survey out to two groups directly: MIRI's research team, and people who recently left OpenAI (mostly people suggested by Beth Barnes of OpenAI). I sent it to five other groups through org representatives (who I asked to send it to everyone at the org "who researches long-term AI topics, or who has done a lot of past work on such topics"): OpenAI, the Future of Humanity Institute (FHI), DeepMind, the Center for Human-Compatible AI (CHAI), and Open Philanthropy.

The survey ran for 23 days (May 3–26), though it took time to circulate and some people didn't receive it until May 17.
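For anyone who wants the headline numbers behind these methods, here is a minimal sketch of the arithmetic. It uses only figures quoted above; the variable names are illustrative, the recipient count is approximate ("~117"), and the year 2021 is assumed from the post date. It works out to a response rate of roughly 38%.

```python
from datetime import date

# Figures quoted in the post; the recipient count is approximate ("~117").
recipients_approx = 117
responses = 44

# Response rate as a fraction of the approximate recipient count.
response_rate = responses / recipients_approx

# Survey window from the post (May 3 to May 26); the year is assumed from the post date.
survey_start = date(2021, 5, 3)
survey_end = date(2021, 5, 26)
duration_days = (survey_end - survey_start).days

# Days remaining for people who only received the survey on May 17.
late_receipt = date(2021, 5, 17)
days_left_for_late_recipients = (survey_end - late_receipt).days

print(f"Response rate: {response_rate:.1%}")
print(f"Survey duration: {duration_days} days")
print(f"Days left after May 17: {days_left_for_late_recipients}")
```

Because the denominator is approximate, the response rate should be read as a rough figure rather than an exact one.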

Results

[Image redacted]

Each point is a response to Q1 (on the horizontal axis) and Q2 (on the vertical axis). Circles denote technical safety researchers and squares denote strategy researchers; triangles mark respondents who said they were neither, and diamonds mark those who said they were both.

Purple represents OpenAI, red FHI, brown DeepMind, green CHAI or UC Berkeley, orange MIRI, blue Open Philanthropy, and black "no affiliation specified". (Black includes unaffiliated people, as well as people who chose to leave their affiliation out.)

[Rest of post redacted]

Added: I've included some binary predictions below on request, though I don't necessarily think these are the ideal questions to focus on. E.g., I'd expect it to be more useful to draw a rough picture of what you expect the distribution to look like (or, say, what you expect the range of MIRI views is, or the range of strategy researchers' views).

Q1:

Elicit Prediction (forecast.elicit.org/binary/questions/spoTvvw5a)

Elicit Prediction (forecast.elicit.org/binary/questions/I6jsnozIu)

Elicit Prediction (forecast.elicit.org/binary/questions/W01xAVOHY)

Q2:

Elicit Prediction (forecast.elicit.org/binary/questions/olmYp4NLl)

Elicit Prediction (forecast.elicit.org/binary/questions/OxoKXy-Tu)

Elicit Prediction (forecast.elicit.org/binary/questions/ZJldiMcgq)

(Cross-posted to the Effective Altruism Forum.)

6 comments

comment by Ben Pace (Benito) · 2021-06-03T08:00:07.945Z · LW(p) · GW(p)

1. How likely do you think it is that the overall value of the future will be drastically less than it could have been, as a result of humanity not doing enough technical AI safety research?

The clarification note was helpful, because this is an odd question to me. There's lots of things that could prevent x-risk from AI, including e.g. better world governance. It's not as a result of not doing technical research, even if technical research is a great way to prevent it.

Replies from: WilliamKiely

↑ comment by WilliamKiely · 2021-06-03T08:44:07.822Z · LW(p) · GW(p)

I agree. For me, the clarification note completely changed my interpretation of the question (and the answer I would give to my understanding of the question). I decided to record my answer as 50% for this reason.

comment by duck_master · 2021-05-28T22:36:16.382Z · LW(p) · GW(p)

Since this is literally a question about soliciting predictions, it should have one of those embedded-interactive-predictions-with-histograms gadgets* to make predicting easier. Also, it might be worth it to have two prediction gadgets, since this is basically a prediction: one gadget to predict what Recognized AI Safety Experts (tm) predict about how much damage unsafe AIs will do, and one gadget to predict about how much damage unsafe AIs will actually do (to mitigate weird second-order effects having to do with predicting a prediction).

*I'm not sure what they're supposed to be called.

Replies from: RobbBB

↑ comment by Rob Bensinger (RobbBB) · 2021-05-29T16:33:29.140Z · LW(p) · GW(p)

I think it might be more interesting to sketch what you expect the distribution of views to look like, as opposed to just giving a summary statistic. I can add probability Qs, but I avoided it initially so as not to funnel people into doing the less informative version of this exercise.

Replies from: RobbBB

↑ comment by Rob Bensinger (RobbBB) · 2021-05-29T20:25:37.877Z · LW(p) · GW(p)

I've added six prediction interfaces: two for your own answers to the two Qs, two for your guess at the mean survey respondent answers, and two for your guess at the median respondent answers.

comment by Dustin · 2021-05-28T02:07:58.510Z · LW(p) · GW(p)

Complete aside here and not a dig on this post at all (which I think is proposing a cool and interesting idea):

I feel like AI researchers must spend 10% of their time answering surveys about the future of AI!
