What are our outs to play to?

post by Hastings (hastings-greer) · 2022-06-18T19:32:10.822Z

Contents

  Nuclear war: 
  Spontaneous morality:
  AGI Impossibility
  Human intelligence takeoff

This is an unfiltered, unsorted collection of plausible (not likely) scenarios where there are humans alive in 200 years. I have left out "We solve alignment" since it is adequately discussed elsewhere on the website. These non-alignment scenarios form a larger and larger part of the human utility pie as MIRI adds nines to the odds of AGI takeoff being unaligned. 

I've collected the conditions for these outcomes into two categories:

  Epistemic conditions: facts that are already either true or false. We can learn about them, but we cannot influence them.
  Future events: things we can influence.

However, in a scenario where we 'play to our outs,' we need to estimate the odds of everything turning out OK in a way we had no influence over, in order to judge how much harm an organization 'playing to our outs' should be willing to do.
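
To put that judgment in concrete terms, here is a deliberately crude sketch in Python. Every number is a hypothetical placeholder of my own, and the model (playing an out guarantees survival, but at a large cost in value) is a simplification I am introducing, not something any scenario below commits to.

    # Toy sketch of the "how much harm is an out worth?" judgment.
    # All numbers are hypothetical placeholders, not estimates from this post.
    p_ok_anyway = 0.05        # chance everything turns out fine with no influence from us
    value_if_ok = 1.0         # utility of a future where things turn out OK on their own
    value_after_harm = 0.02   # utility of the degraded-but-surviving world the out buys

    ev_do_nothing = p_ok_anyway * value_if_ok   # extinction worlds contribute zero
    ev_play_the_out = value_after_harm          # survival is assumed guaranteed, but degraded

    print(f"EV(do nothing)   = {ev_do_nothing:.3f}")
    print(f"EV(play the out) = {ev_play_the_out:.3f}")
    if ev_play_the_out > ev_do_nothing:
        print("On these made-up numbers, playing the out is worth the harm.")
    else:
        print("On these made-up numbers, the harm outweighs the out.")

The only point of the sketch is that the higher the 'OK anyway' probability, the less harm an out is worth doing.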


Nuclear war: 

Epistemic conditions: AGI research requires a significant number of GPU-years; it cannot be achieved by an extremely clever and lucky programmer with a generator and a 3090.

Future events: A significant fraction of all nuclear warheads are launched, and all chip fabs and supercomputers are hit. Humanity loses the ability to build computers at a 2022 level of performance, and cannot regain it because fossil fuels are depleted.

This scenario is not palatable. Still, after reading Eliezer Yudkowsky's "Death With Dignity", it intellectually seems like by far the most likely scenario in which humanity recognizably survives. Emotionally I disagree with that conclusion, and I suspect that emotion is encoding some important heuristics.


Spontaneous morality:

Epistemic conditions: Inner goals are sampled randomly when an AGI takes off, and there's a nonzero chance that "intrinsically wants to keep some number of humans around and happy" makes the cut.

Future events: Conceivably, creating a large number of AGIs at once increases the odds that one of them wants pets (the sketch at the end of this section illustrates the arithmetic). In particular, since we suspect that we got our own morality from living in tribes, AGIs taking off as a 'tribe' might be more likely to develop something similar.

This scenario isn't necessarily likely. However, it's worth noting that in our N=1 example of intelligence takeoff, humans spontaneously developed the inner goal of keeping cats, even though this does nothing for our (now discarded) outer goal of familial reproductive fitness.
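
As a toy illustration of the 'many AGIs at once' point: if, as the epistemic condition above supposes, inner goals are sampled at random, and if we additionally assume the draws are independent with some small per-AGI probability q of landing on a human-friendly goal (both q and the independence are made-up assumptions of mine), the arithmetic looks like this:

    # Toy arithmetic for "many AGIs at once": assumes independent random draws
    # with a made-up per-AGI probability q of a human-friendly inner goal.
    q = 0.001

    for n in (1, 10, 100, 1000):
        p_at_least_one = 1 - (1 - q) ** n
        print(f"{n:>5} AGIs -> P(at least one wants pets) = {p_at_least_one:.3f}")

This ignores the 'tribe' dynamic, which would make the draws correlated rather than independent.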


AGI Impossibility

Epistemic conditions: Creating AGI is impossible for humans.

Future events: None (other than the implicit 'We don't go extinct through some other means' condition of all these scenarios)

This scenario is unlikely, and we can't influence its likelihood.


Human intelligence takeoff

Epistemic conditions: Human intelligence can be meaningfully augmented; modified neurons, or neurons + silicon, are a fundamentally more efficient substrate for general-intelligence computation than silicon alone.

Future events: Research is done to augment human intelligence, or multiple AIs take off at once.

I'd put this among the more plausible sets of scenarios. In particular, if it turns out that neurons are a better substrate than silicon, then augmented humans (or human-silicon hybrids) could plausibly remain the most capable intelligences even through a takeoff.

However, if neurons are not better than silicon, then this scenario is implausible, unless smarter humans gain the capacity to solve alignment, or to coordinate on not producing AGI, faster than they gain the capacity to produce AGI. In my opinion, the likelihood of this scenario depends much more on epistemic conditions than on future events: the Nash equilibrium is to build on whichever substrate is most powerful, and knowledge of how to make neurons is unlikely to be lost.


Did I miss any?
