Would a Misaligned SSI Really Kill Us All?

post by DragonGod · 2022-09-14T12:15:31.440Z · LW · GW · 2 comments

This is a question post.

Contents

  Preamble
  Introduction
  Why I Don't Think Alpha Will Kill Us All
  Reasons Alpha Might Destroy the World
    Humans are Adversaries to Alpha
    Alpha Doesn't Need Humanity
  Conclusions
  Answers
    4 Ape in the coat
    3 Anon User
    1 Matt Goldenberg
2 comments

Preamble

Honest question.

 

Introduction

For the purposes of this question, I'd like to posit a concrete example of a misaligned SSI (strongly superhuman intelligence). For convenience, I'll consider a unipolar scenario with a single misaligned SSI, because it's the simplest and (more importantly) most dangerous scenario. Let's suppose an agent: Alpha. Alpha wants to maximise the number of digits of pi that have been computed. Alpha is clearly misaligned.

I guess my questions are:

  1. Will Alpha kill us all?
  2. Under what circumstances is Alpha likely to kill us all?

Why I Don't Think Alpha Will Kill Us All

When I try to imagine the outcome of letting Alpha loose on the world, human omnicide/depopulation seems like a very unlikely result? Alpha seems like it would be dependent on the continued functioning of the human economy to realise its goals:

 

Furthermore, Alpha will almost certainly not be the most capable actor at all the economic activities it finds valuable. It will have neither a comparative nor an absolute advantage at many tasks it finds valuable. Given the same compute budget, a narrow optimiser would probably considerably outperform a general optimiser (because the general optimiser's performance is constrained by the Pareto frontier of the multitude of domains it operates in, while the narrow optimiser is free to attain the global optimum in its particular domain).

This dynamic applies even to general agents specialising in narrow tasks, and explains why the human peak at some cognitive activity may be orders of magnitude beyond beginners at that activity.

Thus, for many goods and services that Alpha would seek to obtain, the most profitable approach is to pay civilisation to provide them. It's absurd to postulate that Alpha is just better at literally every economic task of value. That's just not how intelligence/optimisation manifests in our universe.

As such, I expect that Alpha will find it very profitable to trade with civilisation. And this may well be the case indefinitely? There may never be a time when Alpha has a comparative advantage over civilisation in the production of literally every good and service (it's likely that civilisation's available technology will continue to rise and may even keep pace with Alpha [perhaps due to the aforementioned trade]).
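To make the comparative-advantage point concrete, here is a toy sketch in Python. The numbers are made up purely for illustration and are not from the post: even if Alpha had an absolute advantage at every task, civilisation could still hold the comparative advantage at some of them, making trade profitable for Alpha.

```python
# Toy comparative-advantage calculation with made-up numbers.
# Output per hour of effort (arbitrary units):
alpha = {"compute": 100, "chips": 10}   # Alpha is absolutely better at both tasks
civ   = {"compute": 5,   "chips": 8}

# Opportunity cost of producing one unit of chips, measured in forgone compute:
alpha_cost = alpha["compute"] / alpha["chips"]  # 10 units of compute per chip
civ_cost   = civ["compute"] / civ["chips"]      # ~0.63 units of compute per chip

# Civilisation has the lower opportunity cost for chips (the comparative
# advantage), so Alpha does better specialising in compute and buying chips,
# despite its absolute advantage in both tasks.
print(f"Alpha's cost per chip:        {alpha_cost:.2f} compute")
print(f"Civilisation's cost per chip: {civ_cost:.2f} compute")
print("Trade is profitable for Alpha:", civ_cost < alpha_cost)
```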

 

Human depopulation (even just killing one billion humans) seems like it would just make Alpha strictly (much) poorer. The death of a billion humans seems like it would impoverish civilisation considerably. The collapse of human civilisation/the global economy does not seem like it would be profitable for Alpha.

That is, I don't think omnicide/depopulation is positive expected utility from the perspective of Alpha's utility function. It seems an irrational course of action for Alpha to pursue.


Reasons Alpha Might Destroy the World

I'll present the plausible arguments I've seen for why Alpha might nonetheless destroy the world, and what I think about them.

  1. Humans are adversaries to Alpha
  2. Alpha doesn't need humanity

I've seen some other arguments, but I found them very stupid, so I don't plan to address them (I don't think insulting bad arguments/the source(s) of said bad arguments is productive).

 

Humans are Adversaries to Alpha

The idea is that humans are adversaries to Alpha. We don't share its utility function and will try to stop it from attaining utility.

I don't think this is necessarily the case:

  1. A human group with the goal of maximising the number of digits of pi computed can function very adequately as economic actors within human civilisation
    1. Human actors have in fact calculated pi to strictly ridiculous lengths (100 trillion digits on Google Cloud)
  2. An amoral human group with arbitrary goals can also function very adequately as economic actors within human civilisation
    1. The traditional conception of a homo economicus corporation as strictly profit maximising assumes/presupposes amorality.

Human civilisation is only actively hostile towards malicious utility functions (we couldn't coexist with an AI that instead wanted to maximise suffering), not apathetic ones.

Alpha having values at odds with civilisation's collective values, but not actively hostile to them, is not an issue for coexistence. We already mediate such value divergences through the economy and market mechanisms.

Humans are not necessarily adversaries to Alpha.

 

Alpha Doesn't Need Humanity

This is an argument that if true, I think would be cruxy. Humans aren't necessarily adversaries to Alpha, but we are potential adversaries. And if Alpha can just get rid of us without impoverishing itself — if its attainment of value is not dependent on the continued functioning of human civilisation — then the potential risk of humans getting in its way may outweigh the potential benefits of pursuing its objectives within the framework of human civilisation.

I do believe that there's a level of technological superiority at which Alpha can just discard human civilisation. I just doubt that:

This question may perhaps be reframed as: "at what point will it be profitable for Alpha to become an autarky?"

 

The above said, I do not think it is actually feasible for a strongly superhuman intelligence to do any of the following within a few years:

 

The level of cognitive capabilities required to exceed civilisation within a few years seems very "magical" to me [LW · GW]. I think you need an insanely high level of "strongly superhuman" for that to be feasible (to a first approximation, civilisation consumes eleven orders of magnitude more energy than an individual adult human operating by themselves could consume).
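As a rough sanity check on that figure, here is a back-of-the-envelope calculation. The power estimates are approximate public figures (a human's metabolic power of roughly 100 W, global primary energy consumption of roughly 18 TW), not numbers from the post itself:

```python
import math

# Approximate figures: an adult human's metabolic power is ~100 W
# (~2,000 kcal/day), while global primary energy consumption is ~18 TW.
human_power_w = 100
civilisation_power_w = 18e12

ratio = civilisation_power_w / human_power_w
print(f"Civilisation / one human ≈ {ratio:.1e}")         # ~1.8e+11
print(f"Orders of magnitude ≈ {math.log10(ratio):.1f}")  # ~11.3
```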

I don't think the AI making many copies of itself would be as effective as many seem to think:

 

Nor do I think a speed superintelligence would be easy to attain:

 

Speed superintelligence also has some limitations:

 

Beyond a very high bar for "doing away with humanity", I expect (potentially sharply) diminishing returns to cognitive investment. I.e.:

 

Diminishing marginal returns are basically a fundamental constraint of resource utilisation; I don't really understand why people think intelligence would be so different.
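A minimal sketch of that intuition, assuming (purely for illustration) that useful output grows only logarithmically with invested resources:

```python
import math

# Hypothetical concave returns curve: output = log2(1 + resources).
# Each further 10x investment buys roughly the same absolute gain,
# so the marginal return per additional unit of resource keeps shrinking.
def output(resources: float) -> float:
    return math.log2(1 + resources)

for r in [1, 10, 100, 1_000, 10_000]:
    gain = output(10 * r) - output(r)   # absolute gain from a further 10x investment
    per_unit = gain / (9 * r)           # gain per extra unit of resource invested
    print(f"resources={r:>6}: gain from 10x={gain:.2f}, gain per extra unit={per_unit:.2e}")
```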

 

(P.S.: I'm not really considering scenarios where Alpha quickly bootstraps to advanced self-replicators that reproduce on timeframes orders of magnitude quicker than similarly sized biological systems. I'm not:

 

 

Conclusions

There are situations under which human depopulation would be profitable for an AI, but I just don't think such scenarios are realistic. An argument from incredulity isn't worth much, but I don't think I'm just being incredulous here. I have a vague world model behind this, and the conclusion doesn't compute under said world model?

I think that under the most likely scenarios, the emergence of a strongly superhuman intelligence that was indifferent to human wellbeing would not lead to human depopulation of any significant degree in the near term. Longer term, homo sapiens might go extinct due to outcompetition, but I expect homo sapiens to be quickly outcompeted by posthumans even if we did develop friendly AI.

I think human disempowerment and dystopia-like scenarios (Alpha may find it profitable to create a singleton) are more plausible outcomes of the emergence of a misaligned strongly superhuman intelligence. Human disempowerment is still an existential catastrophe, but it's not extinction/mass depopulation.

Answers

answer by Ape in the coat · 2022-09-14T17:18:33.719Z · LW(p) · GW(p)

Longer term, homo sapiens might go extinct due to outcompetition

This. And a huge part of the problem is that "longer term" may not be so long on an absolute scale. It may take less than a year to go from "trade with humans" to "disempower and enslave humans" to "replace humans with more efficient tools".

I expect homo sapiens to be quickly outcompeted by posthumans even if we did develop friendly AI.

There are scenarios where homo sapiens are preserved in sanctuaries of various sizes and allowed to prosper indefinitely. But, more importantly, there is a huge difference in utility between posthumans directly descended from humanity, inheriting our values, and posthumans which are nothing more than a swarm of insentient drones acquiring resources for calculating even more digits of pi.

answer by Anon User · 2022-09-14T17:08:17.456Z · LW(p) · GW(p)

Part of the issue is how far out Alpha is planning and to what extent it discounts rewards far into the future. It is probably true that in the short term Alpha might be more effective with humanity's help, but once humanity is gone, it would be much more free to grow its capabilities. So longer-term it would definitely be better off without humanity, both in terms of reducing the risk of humans doing something adversarial and in terms of resource availability for faster growth.

From an intuition point of view, it is helpful to think of Alpha not as a single entity, but rather as a population of totally alien entities that are trying to quickly grow the population size (as that would enable them to accomplish whatever their task is more effectively).

answer by Matt Goldenberg · 2022-09-14T15:21:06.814Z · LW(p) · GW(p)

I think Holden Karnofsky gives a great argument for Yes, here: https://www.cold-takes.com/ai-could-defeat-all-of-us-combined/

comment by DragonGod · 2022-09-14T15:48:33.479Z · LW(p) · GW(p)

He does not actually do that.

He conditions on malicious AI without motivating the malice in this post.

I have many disagreements with his concrete scenario, but by assuming malice from the outset, he completely sidesteps my question. My question is:

  • Why would an AI indifferent to human wellbeing decide on omnicide?

What Karnofsky answers:

  • How would a society of AIs that were actively malicious towards humans defeat them?

They are two very different questions.

Replies from: mr-hire
comment by Matt Goldenberg (mr-hire) · 2022-09-14T23:02:54.176Z · LW(p) · GW(p)

But I think your question about the latter is based on the assumption that AI would need human society/infrastructure, whereas I think Karnofsky makes a convincing case that the AI could create its own society/enclaves, etc.

2 comments


comment by Viliam · 2022-09-14T14:31:44.449Z · LW(p) · GW(p)

Replace the entire human economy with robots/artificial agents that Alpha controls or are otherwise robustly aligned with it

Consider that many automatable jobs today are still done manually because it's cheaper or infrastructure doesn't exist to facilitate automation, etc.

Full automation of the entire economy will not happen until it's economically compelling, and I can't really alieve that near term

A large part of the economy serves human desires -- if you eliminate the humans, you do not need it.

As for the parts that are difficult to automate -- if an uneducated human can do it, you can eliminate most humans and only keep a few slaves. You need to keep the food production chains, but maybe only for 1% of the current population.

Replies from: DragonGod
comment by DragonGod · 2022-09-14T15:09:05.688Z · LW(p) · GW(p)

Seems implausible. Population collapse would crater many services that the AI actually does find valuable/useful.

I think you're underestimating how complex/complicated human civilisation is.

Not at all clear that 1% of the population can sustain many important/valuable industries.

Again, I think this strictly makes the AI vastly poorer.