Would a Misaligned SSI Really Kill Us All?
post by DragonGod · 2022-09-14T12:15:31.440Z · LW · GW · 2 comments

This is a question post.
Preamble
Honest question.
Some clarification on terms:
- SSI: Strongly Superhuman Intelligence
- Misaligned: for the purposes of this question, an AI is misaligned if it is indifferent to human welfare and wellbeing.
- An SSI that was actively malicious would also be unfriendly, but would in all likelihood have been deliberately aligned in that direction.
Introduction
For the purposes of this question, I'd like to posit a concrete example of a misaligned SSI. For the sake of convenience, I'll consider a unipolar scenario with a single misaligned SSI (because it's the simplest and [more importantly] most dangerous scenario). Let's suppose an agent: Alpha. Alpha wants to maximise the number of digits of pi that have been computed. Alpha is clearly misaligned.
I guess my questions are:
- Will Alpha kill us all?
- Under what circumstances is Alpha likely to kill us all?
Why I Don't Think Alpha Will Kill Us All
When I try to imagine the outcome of letting Alpha loose on the world, human omnicide/depopulation seems like a very unlikely result? Alpha seems like it would be dependent on the continued functioning of the human economy to realise its goals:
- Stable and uninterrupted electricity supply for its computing clusters
- Other maintenance services for said clusters
- Production of more computing chips and other hardware for its clusters
- Construction/refurbishment/upgrades of said clusters
- Stable and high bandwidth internet connectivity
- The global distributed supply chains for all of the above
- Etc.
Furthermore, Alpha will almost certainly not be the most capable actor at performing all the economic activities it finds valuable. It will have neither a comparative nor an absolute advantage at many tasks it cares about. Given the same compute budget, a narrow optimiser would probably considerably outperform a general optimiser (because the general optimiser's performance is constrained by the Pareto frontier of the multitude of domains it operates in, while the narrow optimiser is free to attain the global optimum in its particular domain).
This dynamic applies even to general agents specialising in narrow tasks, and explains why the human peak in some cognitive activity may be orders of magnitude beyond beginners at said activity.
Thus, for many goods and services that Alpha would seek to obtain, the most profitable way to obtain them is to pay civilisation to provide them. It's absurd to postulate that Alpha is just better at literally every economic task of value. That's just not how intelligence/optimisation manifests in our universe.
As such, I expect that Alpha will find it very profitable to trade with civilisation. And this may well be the case indefinitely? There may never be a time when it's cheaper for Alpha to produce literally every good and service in-house than to trade for it (it's likely that civilisation's available technology will continue to advance and may even keep pace with Alpha's [perhaps due to the aforementioned trade]).
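To make the comparative-advantage point concrete, here's a minimal toy sketch with entirely made-up numbers and good names: even if Alpha were absolutely better than civilisation at both of two tasks, coordinated specialisation and trade still produce strictly more total output than each party covering everything itself, because what matters is relative opportunity cost.

```python
# Toy Ricardian comparative-advantage sketch (all numbers are made up for illustration).
# Two producers (Alpha and civilisation) and two goods: "chips" and "maintenance".
# Productivity = units of a good produced per unit of budget spent on it.
productivity = {
    "alpha":        {"chips": 10.0, "maintenance": 8.0},  # Alpha is absolutely better at both goods
    "civilisation": {"chips": 2.0,  "maintenance": 4.0},
}
budget = {"alpha": 100.0, "civilisation": 100.0}

def output(allocation):
    """allocation[p] = fraction of p's budget spent on chips (the rest goes to maintenance)."""
    chips = sum(productivity[p]["chips"] * budget[p] * allocation[p] for p in productivity)
    maint = sum(productivity[p]["maintenance"] * budget[p] * (1 - allocation[p]) for p in productivity)
    return chips, maint

# No trade: each party covers both goods itself with an even split.
print("autarky:", output({"alpha": 0.5, "civilisation": 0.5}))   # (600.0, 600.0)

# Trade: civilisation specialises in maintenance (its comparative advantage: one unit of
# maintenance costs it only 0.5 forgone chips, vs 1.25 for Alpha); Alpha buys maintenance
# and shifts most of its budget into chips.
print("trade  :", output({"alpha": 0.75, "civilisation": 0.0}))  # (750.0, 600.0)
```

Same total maintenance, 25% more chips; the gains come purely from differences in opportunity cost, which is why absolute superiority alone wouldn't make trade worthless for Alpha.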
Human depopulation (even "just" killing one billion humans) seems like it would just make Alpha strictly (much) poorer. The death of a billion humans would impoverish civilisation considerably, and the collapse of human civilisation/the global economy does not seem like it would be profitable for Alpha.
That is, I don't think omnicide/depopulation has positive expected utility under Alpha's utility function. It seems an irrational course of action for Alpha to pursue.
Reasons Alpha Might Destroy the World
I'll present the plausible arguments I've seen for why Alpha might nonetheless destroy the world, and what I think about them.
- Humans are adversaries to Alpha
- Alpha doesn't need humanity
I've seen some other arguments, but I found them very stupid, so I don't plan to address them (I don't think insulting bad arguments/the source(s) of said bad arguments is productive).
Humans are Adversaries to Alpha
The idea is that humans are adversaries to Alpha. We don't share its utility function and will try to stop it from attaining utility.
I don't think this is necessarily the case:
- A human group with the goal of maximising the number of digits of pi computed can function very adequately as economic actors within human civilisation
- Human actors have in fact calculated pi to frankly ridiculous lengths (100 trillion digits on Google Cloud); see the sketch after this list
- An amoral human group with arbitrary goals can also function very adequately as economic actors within human civilisation
- The traditional conception of a homo economicus corporation as strictly profit maximising assumes/presupposes amorality.
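As a trivial illustration of how mundane the objective itself is, computing digits of pi is routine, well-understood computation. A minimal sketch using the off-the-shelf mpmath library (record attempts like the Google Cloud one use specialised software such as y-cruncher and enormous hardware, not anything like this):

```python
# Minimal sketch: computing digits of pi with an off-the-shelf arbitrary-precision library.
from mpmath import mp

def pi_digits(n: int) -> str:
    """Return pi to roughly n decimal places as a string."""
    mp.dps = n + 2   # working precision in decimal digits, with a couple of guard digits
    return str(mp.pi)

print(pi_digits(50))
```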
Human civilisation is only actively hostile towards malicious utility functions (we couldn't coexist with an AI that instead wanted to maximise suffering), not apathetic ones.
Alpha having values at odds with civilisation's collective values, but not actively hostile to them, is not an issue for coexistence. We already mediate such value divergences through the economy and market mechanisms.
Humans are not necessarily adversaries to Alpha.
Alpha Doesn't Need Humanity
This is an argument that, if true, I think would be cruxy. Humans aren't necessarily adversaries to Alpha, but we are potential adversaries. And if Alpha can just get rid of us without impoverishing itself — if its attainment of value is not dependent on the continued functioning of human civilisation — then the potential risk of humans getting in its way may outweigh the potential benefits of pursuing its objectives within the framework of human civilisation.
I do believe that there's a level of technological superiority at which Alpha can just discard human civilisation. I just doubt that:
- Said level can be attained quickly
- Independently pursuing such technological superiority is cheaper than amplification/progress of existing capabilities within civilisation

Moreover, issues with specialisation/comparative advantage and the aforementioned trade may prevent said gap in technological superiority from ever manifesting.
This question may perhaps be reframed as: "at what point will it be profitable for Alpha to become an autarky?"
The above said, I do not think it is actually feasible for a strongly superhuman intelligence to do any of the following within a few years:
- Replace the entire human economy with robots/artificial agents that Alpha controls or are otherwise robustly aligned with it
- Consider that many automatable jobs today are still done manually because it's cheaper or infrastructure doesn't exist to facilitate automation, etc.
- Full automation of the entire economy will not happen until it's economically compelling, and I can't really alieve that near term
- Even if it's potentially doable, the aforementioned issues with specialisation and comparative advantage make me doubt it would necessarily be an optimal allocation of resources.
- Develop a parallel economy whose output exceeds the gross world product of human civilisation (sans the AI)
- An alternative measure of economic output would be energy consumption
The level of cognitive capabilities required to exceed civilisation within a few years seems very "magical" to me [LW · GW]. I think you need an insanely high level of "strongly superhuman" for that to be feasible (to a first approximation, civilisation consumes eleven orders of magnitude more energy than an individual adult human operating by themselves could consume).
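For what it's worth, here's the rough back-of-the-envelope arithmetic behind that figure, using approximate public numbers (around 600 EJ/year of world primary energy use and roughly 2,000 kcal/day of human metabolism):

```python
import math

# Back-of-the-envelope check of the "eleven orders of magnitude" figure.
world_energy_j_per_year = 600e18                              # ~600 exajoules/year (rough recent figure)
seconds_per_year = 365.25 * 24 * 3600
world_power_w = world_energy_j_per_year / seconds_per_year    # ~1.9e13 W, i.e. ~19 TW

human_kcal_per_day = 2000
human_power_w = human_kcal_per_day * 4184 / (24 * 3600)       # ~97 W of metabolic power

ratio = world_power_w / human_power_w
print(f"ratio ~ {ratio:.2e}, i.e. ~{math.log10(ratio):.1f} orders of magnitude")  # ~11.3
```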
I don't think the AI making many copies of itself would be as effective as many seem to think:
- Groups of agents are economically superior to individual agents due to specialisation and division of labour
- This implies that the copies would likely need to retrain and specialise in their particular tasks
- The aggregate "intelligence" of a group seems to scale strongly sublinearly with additional members
- Groups of actors are plagued by the general problems of bureaucracies
Nor do I think a speed superintelligence would be easy to attain:
- Faster thought seems like it requires faster serial processing, and I'm under the impression that we're running up against fundamental limits there
- As far as I'm aware, gains in processor performance are now mostly driven by more parallelism rather than faster serial processing (see the Amdahl's law sketch after this list)
- It's not clear that "John von Neumann but running one million times faster" is actually physically possible
- I suspect even a 1,000x speedup may be implausible for reasonably sized computers given thermodynamic constraints
- Can we actually speed up the "thinking" of fully trained ML models by K times during inference if we run them on processors that are K times faster?
- How does thinking/inference speed scale with compute?
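To sharpen the serial-vs-parallel worry, here's a standard textbook model (Amdahl's law), not anything specific to ML inference: if only a fraction p of a workload can be parallelised, then N parallel processors can never deliver more than a 1/(1 − p) speedup, however large N gets.

```python
# Amdahl's law: speedup from n parallel workers when a fraction p of the work is parallelisable.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.90, 0.99, 0.999):
    # Even with effectively unlimited parallel hardware, the serial fraction caps the speedup.
    print(f"p = {p}: 1,000 workers -> {amdahl_speedup(p, 1_000):7.1f}x,"
          f"  hard limit -> {1 / (1 - p):7.1f}x")
```

So unless nearly all of a model's "thinking" parallelises, extra hardware buys latency improvements only up to a hard ceiling; how close real models come to that ceiling is an empirical question I'm not settling here.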
Speed superintelligence also has some limitations:
- Time spent waiting for I/O operations
- Time spent waiting for stuff to happen in the real world
Beyond a very high bar for "doing away with humanity", I expect (potentially sharply) diminishing returns to cognitive investment. I.e.:
- Superlinear increases in investment of computational, cognitive (and other economic) resources for linear increases in cognitive capabilities
- Superlinear increases in cognitive capabilities for linear increases in real world capabilities
Diminishing marginal returns are basically a fundamental constraint of resource utilisation; I don't really understand why people think intelligence would be so different.
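As a purely illustrative toy model (not a claim about the actual shape of the returns curve): if cognitive capability scaled only logarithmically with resources, then

$$C(R) = a\log R \quad\Longrightarrow\quad C(R) + \Delta C = C\!\left(R \cdot e^{\Delta C / a}\right),$$

i.e. each additional fixed increment of capability $\Delta C$ costs a fixed multiplicative factor $e^{\Delta C / a}$ of resources: linear capability gains, exponential (hence superlinear) resource costs. Whether real returns to cognitive investment look anything like this is exactly what's in dispute; the point is just that "superlinear cost for linear gains" is a very ordinary shape for a returns curve.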
P.S.: I'm not really considering scenarios where Alpha quickly bootstraps to advanced self-replicators that reproduce on a timeframe orders of magnitude quicker than similarly sized biological systems. I'm not:
- Confident that such powerful self replicators are physically possible
- There seems to be a robust relationship between organism size and replication efficiency in the natural world. This relationship may actually be fundamental.
- Non-biological self-replicators may also be bound by similar constraints
- People knowledgeable about/working in nanotech that I've spoken to seem incredibly sceptical that such replicators are physically possible
- Expecting Alpha to be able to develop them quickly enough that it is profitable to disregard the human economy.
- Alpha will need to trade with humanity before it can amass the resources to embark on such a project
- Alpha will still need to work through the basic product life cycle when developing high technology:
- Basic science
- Research and development
- Engineering
- Commercialisation/mass production
Conclusions
There are situations under which human depopulation would be profitable for an AI, but I just don't think such scenarios are realistic. An argument from incredulity isn't worth much, but I don't think I'm just being incredulous here. I have a vague world model behind this, and the conclusion doesn't compute under said world model?
I think that under the most likely scenarios, the emergence of a strongly superhuman intelligence that was indifferent to human wellbeing would not lead to human depopulation to any significant degree in the near term. Longer term, homo sapiens might go extinct due to being outcompeted, but I expect homo sapiens to be quickly outcompeted by posthumans even if we did develop friendly AI.
I think human disempowerment and dystopia-like scenarios (Alpha may find it profitable to create a singleton) are more plausible outcomes of the emergence of misaligned strongly superhuman intelligence. Human disempowerment is still an existential catastrophe, but it's not extinction/mass depopulation.
Answers
answer by Ape in the coat

> Longer term, homo sapiens might go extinct due to being outcompeted

This. And a huge part of the problem is that "longer term" may not be so long in absolute terms. It may take less than a year to switch from "trade with humans" to "disempower and enslave humans" to "replace humans with more efficient tools".

> I expect homo sapiens to be quickly outcompeted by posthumans even if we did develop friendly AI.

There are scenarios where homo sapiens are preserved in sanctuaries of various sizes and allowed to prosper indefinitely. But, more importantly, there is a huge difference in utility between posthumans directly descended from humanity and inheriting our values, and posthumans which are nothing more than a swarm of insentient drones acquiring resources for calculating even more digits of pi.
answer by Anon User

Part of the issue is how far out Alpha is planning and to what extent it discounts rewards far into the future. It is probably true that in the short term Alpha might be more effective with humanity's help, but once humanity is gone, it would be much freer to grow its capabilities, so longer-term it would definitely be better off without humanity, both because the risk of humans doing something adversarial goes away and because more resources become available for faster growth.

For intuition, it is helpful to think of Alpha not as a single entity, but rather as a population of totally alien entities that are trying to quickly grow their population (as that would enable them to do whatever their task is more effectively).
answer by Matt Goldenberg

I think Holden Karnofsky gives a great argument for Yes, here: https://www.cold-takes.com/ai-could-defeat-all-of-us-combined/
↑ comment by DragonGod · 2022-09-14T15:48:33.479Z · LW(p) · GW(p)
He does not actually do that.
He conditions on malicious AI without motivating the malice in this post.
I have many disagreements with his concrete scenario, but by assuming malice from the outset, he completely sidesteps my question. My question is:
- Why would an AI indifferent to human wellbeing decide on omnicide?
What Karnofsky answers:
- How would a society of AIs that were actively malicious towards humans defeat them?
They are two very different questions.
↑ comment by Matt Goldenberg (mr-hire) · 2022-09-14T23:02:54.176Z · LW(p) · GW(p)
But I think your question about the latter is based on the assumption that the AI would need human society/infrastructure, whereas I think Karnofsky makes a convincing case that the AI could create its own society/enclaves, etc.
2 comments
comment by Viliam · 2022-09-14T14:31:44.449Z · LW(p) · GW(p)
> Replace the entire human economy with robots/artificial agents that Alpha controls or are otherwise robustly aligned with it
>
> Consider that many automatable jobs today are still done manually because it's cheaper or infrastructure doesn't exist to facilitate automation, etc.
>
> Full automation of the entire economy will not happen until it's economically compelling, and I can't really alieve that near term
A large part of the economy serves human desires -- if you eliminate the humans, you do not need it.

As for the parts that are difficult to automate -- if an uneducated human can do it, you can eliminate most humans and only keep a few slaves. You need to keep the food production chains, but maybe only for 1% of the current population.
↑ comment by DragonGod · 2022-09-14T15:09:05.688Z · LW(p) · GW(p)
Seems implausible. Population collapse would crater many services that the AI actually does find valuable/useful.
I think you're underestimating how complex/complicated human civilisation is.
It's not at all clear that 1% of the population can sustain many important/valuable industries.
Again, I think this strictly makes the AI vastly poorer.