Are extreme probabilities for P(doom) epistemically justified?

post by NathanBarnard, Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2024-03-19T20:32:04.622Z · LW · GW · 11 comments
Alexander Gietelink Oldenziel

Can you post the superforecaster report that has the 0.12% P(doom) number? I haven't actually read anything, of course, and might be talking out of my behind.

In any case, there have been several cases where OpenPhil or somebody or other has brought in 'experts' of various ilks to debate P(doom), the probability of existential risk [usually in the context of AI].

Many of these experts give very low percentages. One percentage I remember was 0.12%.

In the latest case these were Superforecasters, Tetlock's anointed. Having 'skin in the game', they outperformed the fake experts in various prediction markets.

So we should (partially) defer to them on the big questions of x-risk as well. Since they give very low percentages, that is good news. So the argument goes.

Alex thinks these percentages are ridiculously, unseriously low. I would even go so far as to say that the superforecasters aren't really answering the actual question when they give these percentages for x-risk (both AI and general x-risk).

NathanBarnard

Yeah - there are two relevant reports. This is the first one, which has estimates for a range of x-risks including from AI, where superforecasters report a mean estimate of 0.4% for AI causing extinction; and this one, which has the 0.12% figure and which specifically selected forecasters who were sceptical of AI extinction and experts who were concerned, with the purpose of finding cruxes between the two groups.

NathanBarnard

I think that we should defer quite a lot to superforecasters, and I don't take these percentages as being unseriously low.

NathanBarnard

I also think it's important to clarify that the 0.4% number is for human extinction in particular, rather than something broader like loss of control of the future, over 1bn people dying, outcomes as bad as extinction, etc.

Alexander Gietelink Oldenziel

One issue with deferring to superforecasters on x-risk percentages is that a true x-risk is a prediction question that never gets resolved, or at least never pays out to the forecaster.

The optimal strategy when playing a prediction market on a question that doesn't get resolved (or that only resolves in worlds where you can't collect your winnings) is to give a 0% probability.
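
To make the incentive concrete, here is a toy sketch (my own illustration, with made-up prices) of a binary market on 'extinction by year X', under the assumption that payouts only happen in worlds where traders survive to collect:

```python
# Toy sketch (illustrative prices, not from any real market): a market on
# "extinction by year X" only ever pays out in worlds where traders survive,
# so we condition on that branch when computing profits.

def profit_if_collectable(side: str, price: float) -> float:
    """Profit per share, conditional on being in a world where payouts occur
    (i.e. extinction did not happen, so the question resolves NO)."""
    if side == "YES":
        return 0.0 - price            # YES share pays nothing; stake lost
    return 1.0 - (1.0 - price)        # NO share costs (1 - price), pays 1

for price in (0.01, 0.12, 0.50):
    print(price,
          profit_if_collectable("YES", price),
          profit_if_collectable("NO", price))

# Whatever your true credence, every world in which you can spend your
# winnings rewards holding NO, so traded probabilities get pushed towards 0%.
```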

NathanBarnard

I think I'm basically not worried about that - I predict that the superforecasters in the report took the task seriously and forecasted the same way they would other questions. It also hasn't been made public who the forecasters were, and there's no money on the line, or even internet points.

Alexander Gietelink Oldenziel

Those are some good points. 

We're getting to another issue. Why is it appropriate to defer to forecasters who do well on short-horizon predictions in fairly well-understood domains, on a question about something that has literally never happened (this covers both x-risks and catastrophic risks)?

I would even say these are anti-correlated. By ignoring black swans you will do a little better over the time horizons in which black swans haven't happened yet ('picking up pennies in front of a steamroller').
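
A toy illustration of this (my own numbers: a black swan with a true 2% chance per question, scored with the Brier rule):

```python
# Toy numbers (mine, not from any report): Brier scores for an honest 2%
# forecast versus a dismissive 0% forecast on a rare event.

def brier(forecast: float, outcome: int) -> float:
    return (forecast - outcome) ** 2

honest, dismissive = 0.02, 0.0

# In the (common) histories where the black swan has not happened yet:
print(brier(honest, 0), brier(dismissive, 0))   # 0.0004 vs 0.0 -- dismissing wins
# It only costs you in the rare history where it does happen:
print(brier(honest, 1), brier(dismissive, 1))   # 0.9604 vs 1.0
```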

NathanBarnard

Yep, that's reasonable, and I wouldn't advocate deferring entirely to superforecasters - I think the appropriate strategy is using a number of different sources to predict both the probability of human extinction from AI and AI timelines.

Superforecasters, though, have among the most clearly transferable skills for predicting rare events, because they're good at predicting near-term events. It's plausible that ignoring black swans makes you do a bit better on short time horizons and that superforecasters are using this as a heuristic.

If this effect were sufficiently strong that we shouldn't defer to superforecasters on questions over long time horizons, that would imply that people with lots of forecasting practice would do worse on long-timeline questions than people without it, and that seems basically implausible to me.

So I think the relatively weak performance of forecasters on long-term predictions means that we should defer less to them on these kinds of questions, while still giving their views some weight. I think the possibility of the black-swan heuristic means that deference should go down a bit further, but only by a small degree.

Alexander Gietelink Oldenziel

I find it highly suspicious that superforecasters are giving low percentages for all types of x-risks. 

Obviously, as an AI x-risk person, I am quite biased towards thinking AI x-risk is a big deal and P(doom|AGI) is substantial. Now perhaps I am the victim of peer pressure, crowd-thinking, a dangerous cult, etc. Perhaps I am missing something essential.

But for other risks superforecasters are giving extremely low percentages as well.

To be sure, one sometimes feels that in certain ('doomer') circles there is a race to be as doomy as possible. There is some social desirability bias here, and maybe some other complicated Hansonian signalling reasons. We have also seen this in various groups concerned about nuclear risks, environmental risks, etc.

Alexander Gietelink Oldenziel

But it's so low (0.4%, 0.12%, whatever) that I am wondering how they obtain so much confidence about these kinds of never-before-seen events.
Are they starting with a very low prior on extinction? Or are they updating on something?

These kinds of all-things-considered percentages are really low. 0.1-1% is getting towards the epsilon threshold of being able to trust social reasoning.

It is about the level of my credence in really far-off stuff like aliens, conspiracy theories, 'I am secretly mentally ill' - wacky stuff about which I would normally say 'I don't believe this'.

Alexander Gietelink Oldenziel

Just to clarify: I obviously think that in many cases very low (or high) percentages are a real and valid epistemic state.

But I see these as conditional on 'my reasoning process is good, I haven't massively missed some factor, I am not crazy'. 

It feels quite hard to have all-things-considered percentages on one-off events that may be decades in the future.

NathanBarnard

I don't find it suspicious that the probabilities for extinction from other causes are also really low - for bioweapon-mediated extinction events we have some empirical evidence from natural pandemics and from rates of death in terrorism and wars. We have the lack of any nuclear weapons being launched for nuclear risks, and really quite a lot of empirical evidence for climate stuff.

We also have the implied rate of time-discounting from financial instruments that are heavily traded and pose very few other risks, like US and Swiss bonds, which also implies a low risk of everyone dying - for instance, there have been recent periods where these bonds have been trading at negative real rates.
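
Here's a stylized back-of-envelope version of that argument (heavily simplified, and my own gloss rather than anything from the reports: it assumes extinction makes the bond worthless to its holder, who otherwise demands some fixed real return):

```python
# Stylized back-of-envelope (my simplification, not from the reports).
# Assumption: extinction makes the bond worthless to its holder, who otherwise
# requires expected gross return (1 + r). Then the observed real yield y
# satisfies (1 + y) * (1 - p) = (1 + r), i.e. p = 1 - (1 + r) / (1 + y).

def implied_annual_extinction_prob(observed_real_yield: float,
                                   required_survival_return: float) -> float:
    return 1.0 - (1.0 + required_survival_return) / (1.0 + observed_real_yield)

# With recent negative real yields (e.g. -0.5%) and even a 0% required return
# in surviving worlds, there is essentially no room left for a large annual p:
print(implied_annual_extinction_prob(-0.005, 0.0))   # ≈ -0.005, i.e. p ≈ 0
```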

I also think that superforecasters are probably much, much better than other people at taking qualitative considerations and intuitions and translating those into well-calibrated probabilities.

Alexander Gietelink Oldenziel

I really like Rafael Harth's comment [LW(p) · GW(p)] on Yudkowsky's recent piece on empiricism as anti-epistemology.

"It is not the case that an observation of things happening in the past automatically translates into a high probability of them continuing to happen. Solomonoff Induction actually operates over possible programs that generate our observation set (and in extension, the observable universe), and it may or not may not be the case that the simplest universe is such that any given trend persists into the future. There are no also easy rules that tell you when this happens; you just have to do the hard work of comparing world models."

And perhaps this is a deeper crux between you and me that underlies all of this.
I am quite suspicious of linearly extrapolating various trendlines. 
To me the pieces of evidence you name - while very interesting! - are fundamentally limited in what they can say about the future. 

Stated in Bayesian terms - if we consider the history of the earth as a stochastic process then it is highly non-IID, so correlations in one time-period are of limited informativeness about the future. 

I also feel this kind of linear extrapolation would have done really badly in many historical cases.

NathanBarnard

I agree that the time series of observations of the history of the earth is deeply non-IID - but shouldn't this make us more willing to extrapolate trends, because we aren't facing a time series composed of noise but instead one where we can condition on the previous realisations of that series? E.g. we could imagine the time series as some process with an autoregressive component, meaning that we should see persistence from past events.

(this comment isn't very precise, but probably could be made more precise with some work)
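
Something like this minimal sketch is what I have in mind (an AR(1) process with illustrative parameters): non-IID, but persistent, so the recent past genuinely constrains the near future.

```python
import numpy as np

# Minimal sketch (illustrative parameters): an AR(1) process is non-IID, but
# its persistence is exactly what makes the recent past informative about the
# near future, unlike pure noise.
rng = np.random.default_rng(0)

def ar1(phi: float, n: int = 2000) -> np.ndarray:
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

persistent = ar1(0.95)   # strong autocorrelation: trends carry information
memoryless = ar1(0.0)    # IID noise: yesterday tells you nothing about today

print(np.corrcoef(persistent[:-1], persistent[1:])[0, 1])  # close to 0.95
print(np.corrcoef(memoryless[:-1], memoryless[1:])[0, 1])  # close to 0
```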

NathanBarnard

Why would these linear models (in the generalised linear model sense) have done badly in the past?

Alexander Gietelink Oldenziel

This kind of superficial linear extrapolation of trendlines can be powerful, perhaps more powerful than usually accepted in many political/social/futurist discussions. In many cases, successful forecasters, by betting on some high-level trend lines, outpredict 'experts'.

But it's a very non-gears-level model. I think one should be very careful about using this kind of reasoning for tail events.
e.g. this kind of reasoning could have led one to dismiss the development of nuclear weapons as impossible.

I think mechanistic, gears-level stories about the future can give lower bounds on tail events that are more reliable than linear trend extrapolation.
e.g. I see a clear 'mechanistic' path to catastrophic (or even extinction-level) risk from human-engineered plagues in the next 100 years. The technical details of engineered plagues are being suppressed, but as far as I can tell it is either already possible, or will be soon, to make engineered plagues that are many, many times more infectious and deadly, that kill after a delay, that are difficult to track, etc.
Scenario: some terrorist group, weird dictator, or great-power conflict produces a monstrous weapon - an engineered virus that spreads like measles or covid but kills >10% of those infected after a long incubation period. We've seen how powerless the world's governments were at containing covid. It doesn't seem that enough lessons have been learned since then.

I can't imagine any realistic evidence based on market interest rates, past records of terrorist deaths, or anything else economists would like, ever convincing me that this is not a realistic (>1-5%) possibility.

Alexander Gietelink Oldenziel

Linear extrapolation of chemical explosive yields would have predicted that nuclear weapons were totally out of distribution.
But in fact, just looking at past data simply isn't a substitute for knowing the secrets of the universe.

NathanBarnard

I think the crux here might be how we should convert these qualitative considerations into numerical probabilities, and basically my take is that superforecasters have a really good track record of doing this well, while the average person is really bad at it (e.g. the average American thinks something like 30% of the population is Jewish, these sorts of things).

NathanBarnard

On the chemical explosives one, AI Impacts has maybe 35 of these case studies on whether there are breakpoints in technological development, and I think explosive power is the only one where they found a break that trend extrapolation wouldn't have predicted.

Alexander Gietelink Oldenziel

I am aware of the AI Impacts research and I like it.

I think what it suggests is that trend breaks are rare.

1/35 if you will. 

(Of course, one can get into some reference class tennis here. Homo sapiens are also a trend break compared to other primates and animals. Is that a technology? I don't know. It's all very vulnerable to reference-class choice and low-N examples.)

NathanBarnard

FWIW, the average probability given for AI killing 10%+ of the population was 2.13% in the general x-risk forecasting report, which isn't very different from 1/35 (≈ 2.9%).

NathanBarnard

I'm not sure where it's useful to go from here. I think maybe the takeaway is that our crux is how to convert qualitative considerations, combined with base-rate reasoning, into final probabilities, and I'm much more willing to defer to superforecasters on this than you are?

Alexander Gietelink Oldenziel

I feel this is a good place to end. Thank you for your time and effort!

I would summarize my position as:
- I am less impressed by the superforecaster track record than you are. [We didn't get into this.]
- I feel linear trend extrapolation is limited in what it can say about tail risk.
- I think good short-horizon predictors will predictably underestimate black swans.
- I think there is a large irreducible uncertainty about the future (and the world in general) that makes very low or very high percentages not epistemically justified. 

If I were epistemically empathetic I would be able to summarize your position. I am not.
But if I were to try, I would say you are generally optimistic about forecasting, past data, and empirics.

11 comments

Comments sorted by top scores.

comment by Jacob Pfau (jacob-pfau) · 2024-03-21T00:33:06.579Z · LW(p) · GW(p)

The Metaculus community strikes me as a better starting point for evaluating how different the safety inside view is from a forecasting/outside view. The case for deferring to superforecasters is the same as the case for deferring to the Metaculus community--their track record. What's more, the most relevant comparison I know of scores Metaculus higher on AI predictions. Metaculus as a whole is not self-consistent on AI and extinction forecasting across individual questions (links below). However, I think it is fair to say that Metaculus as a whole has significantly faster timelines and higher P(doom) compared to superforecasters.

If we compare the distribution of safety researchers' forecasts to Metaculus (maybe we have to set aside MIRI...), I don't think there will be that much disagreement. I think remaining disagreement will often be that safety researchers aren't being careful about how the letter and the spirit of the question can come apart and result in false negatives. In the one section of the FRI studies linked above that I took a careful look at, the ARA section, I found that there was still huge ambiguity in how the question is operationalized--this could explain up to an OOM of disagreement in probabilities.

Some Metaculus links: https://www.metaculus.com/questions/578/human-extinction-by-2100/ Admittedly in this question the number is 1%, but compare to the below. Also note that the forecasts date back to as old as 2018. https://www.metaculus.com/questions/17735/conditional-human-extinction-by-2100/ https://www.metaculus.com/questions/9062/time-from-weak-agi-to-superintelligence/ (compare this to the weak AGI timeline and other questions)

comment by Daniel Murfet (dmurfet) · 2024-03-20T23:44:10.780Z · LW(p) · GW(p)

This kind of superficial linear extrapolation of trendlines can be powerful, perhaps more powerful than usually accepted in many political/social/futurist discussions. In many cases, successful forecasters, by betting on some high-level trend lines, outpredict 'experts'.

But it's a very non-gears-level model. I think one should be very careful about using this kind of reasoning for tail events.
e.g. this kind of reasoning could have led one to dismiss the development of nuclear weapons as impossible.

 

Agree. In some sense you have to invent all the technology before the stochastic process of technological development looks predictable to you, almost by definition. I'm not sure it is reasonable to ask general "forecasters" about questions that hinge on specific technological change. They're not oracles.

comment by Garrett Baker (D0TheMath) · 2024-03-20T07:30:15.704Z · LW(p) · GW(p)

I am far less impressed by the superforecaster track record than you are. [We didn't get into this.]

I'm interested in hearing your thoughts on this.

comment by Mikhail Samin (mikhail-samin) · 2024-03-22T09:28:24.442Z · LW(p) · GW(p)

My expectation is that the superforecasters weren't able to look into detailed arguments that represent the x-risk case well, and that they would update after learning more.

Replies from: NathanBarnard
comment by NathanBarnard · 2024-03-22T11:33:37.416Z · LW(p) · GW(p)

I think this proves too much - this would predict that superforecasters would be consistently outperformed by domain experts, when typically the reverse is true.

Replies from: alexander-gietelink-oldenziel
comment by Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2024-03-22T12:11:25.641Z · LW(p) · GW(p)

I think I agree. 

For my information, what's your favorite reference for superforecasters outperforming domain experts?

Replies from: technicalities
comment by technicalities · 2024-03-29T10:34:49.557Z · LW(p) · GW(p)

As of two years ago, the evidence for this was sparse. Looked like parity overall, though the pool of "supers" has improved over the last decade as more people got sampled.

There are other reasons [LW · GW] to be down on XPT in particular.

comment by Michael Tontchev (michael-tontchev-1) · 2024-03-21T17:57:13.873Z · LW(p) · GW(p)

I quite enjoyed this conversation, but imo the x-risk side needs to sit down to make a more convincing, forecasting-style prediction to meet forecasters where they are.  A large part of it is sorting through the possible base rates and making an argument for which ones are most relevant. Once the whole process is documented, then the two sides can argue on the line items.

Replies from: alexander-gietelink-oldenziel
comment by Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2024-03-22T11:03:18.548Z · LW(p) · GW(p)

Thank you! Glad you liked it. ☺️

LessWrong & EA is inundated with repeating the same.old arguments for ai x-risk in a hundred different formats. Could this really be the difference ?

Besides, aren't superforecasters supposed to be the kung fu masters of doing their own research? ;-)

I agree with you that a crux is base-rate relevance. Since there is no base rate for x-risk, I'm unsure how to translate this into superforecaster language, though.

Replies from: michael-tontchev-1
comment by Michael Tontchev (michael-tontchev-1) · 2024-03-23T21:40:43.397Z · LW(p) · GW(p)

Well, what base rates can inform the trajectory of AGI?

  • dominance of h sapiens over other hominids
  • historical errors in forecasting AI capabilities/timelines
  • impacts of new technologies on animals they have replaced
  • an analysis of what base rates AI has already violated
  • rate of bad individuals shaping world history
  • analysis of similarity of AI to the typical new technology that doesn't cause extinction
  • success of terrorist attacks
  • impacts of covid
  • success of smallpox eradication

Would be an interesting exercise to do to flesh this out.