AI takeoff and nuclear war

post by owencb · 2024-06-11T19:36:24.710Z · LW · GW · 6 comments

This is a link post for https://strangecities.substack.com/p/ai-takeoff-and-nuclear-war

Contents

  Summary
  Why do(n’t) people go to war?
    Rational reasons to go to war
    Irrational reasons to go to war
  Impacts of AI takeoff on reasons to go to war
    Impacts on rational reasons for war
      Commitment issues
      Private information
      Issue indivisibility
    Impacts on irrational reasons for war
      Irrational decision-making
      Misaligned decision-makers
      National pride
  Strategies for reducing risk of war
    Strategies for averting failure to navigate takeoff
      Research & dissemination
      Spreading “we’re all in this together” frames
      Agreements/treaties about sharing power of AI
    Differential technological development
    What about an AI pause?
  Closing thoughts
    What about non-nuclear warfare?
    How big a deal is this?
6 comments

Summary

As we approach and pass through an AI takeoff period, the risk of nuclear war (or other all-out global conflict) will increase.

An AI takeoff would involve the automation of scientific and technological research. This would lead to much faster technological progress, including military technologies. In such a rapidly changing world, some of the circumstances which underpin the current peaceful equilibrium will dissolve or change. There are then two risks[1]:

  1. Fundamental instability. New circumstances could produce a situation in which there is no peaceful equilibrium that it is in everyone’s interests to maintain.
    • e.g. —
      • If nuclear calculus changes to make second strike capabilities infeasible
      • If one party is racing ahead with technological progress and will soon trivially outmatch the rest of the world, with no way to credibly commit not to completely disempower everyone else once it has done so
  2. Failure to navigate. Despite the existence of new peaceful equilibria, decision-makers might fail to reach one.
    • e.g. —
      • If decision-makers misunderstand the strategic position, they may hold out for a more favourable outcome they (incorrectly) believe is fair
      • If the only peaceful equilibria are convoluted and unprecedented, leaders may not be able to identify or build trust in them in a timely fashion
      • Individual leaders might choose a path of war that would be good for them personally as they solidify power with AI; or nations might hold strongly to values like sovereignty that could make cooperation much harder

Of these two risks, it is likely simpler to work to reduce the risk of failure to navigate. The three straightforward strategies here are: research & dissemination, to ensure that the basic strategic situation is common knowledge among decision-makers; spreading positive-sum frames; and crafting and getting buy-in to meaningful commitments about sharing the power from AI, to reduce incentives for anyone to initiate war.

Additionally, powerful AI tools could change the landscape in ways that reduce either or both of these risks. A fourth strategy, therefore, is to differentially accelerate risk-reducing applications of AI. These could include:

Why do(n’t) people go to war?

To date, the world has been pretty good at avoiding thermonuclear war. The doctrine of mutually assured destruction means that it’s in nobody’s interest to start a war (although the short timescales involved mean that accidentally starting one is a concern).

The rapid development of powerful AI could disrupt the current equilibrium. From a very outside-view perspective, we might think that this is as likely to result in, say, a 10x decrease in risk as a 10x increase. Even this would be alarming: the annual probability seems fairly low right now, so a big decrease in risk is merely nice-to-have, while a big increase could be catastrophic.
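The asymmetry in this outside-view argument can be checked with a little arithmetic. The sketch below assumes a baseline annual probability of 1%, which is purely a placeholder and not an estimate from this post, and an even chance of a 10x decrease or a 10x increase. The function name and parameters are hypothetical.

```python
def expected_risk(baseline: float, factor: float = 10.0, p_increase: float = 0.5) -> float:
    """Expected annual risk if risk is multiplied by `factor` with
    probability `p_increase`, and divided by `factor` otherwise."""
    increased = min(1.0, baseline * factor)  # a probability cannot exceed 1
    decreased = baseline / factor
    return p_increase * increased + (1.0 - p_increase) * decreased


# Hypothetical baseline, chosen only to illustrate the shape of the argument.
baseline = 0.01
risk = expected_risk(baseline)
print(f"baseline annual risk: {baseline:.4f}")
print(f"expected annual risk: {risk:.4f}")
print(f"ratio to baseline:    {risk / baseline:.2f}")
```

Under these assumptions the expected risk comes out at roughly five times the baseline, because the upside of a tenfold decrease is bounded by the small baseline while the downside of a tenfold increase is not.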

To get more clarity than that, we’ll look at the theoretical reasons people might go to war, and then look at how an AI takeoff period might impact each of these.

Rational reasons to go to war

War is inefficient; for any war, there should be some possible world which doesn’t have that war in which everyone is better off. So why do we have war? Fearon’s classic paper on Rationalist Explanations for War explains that there are essentially three mechanisms that can lead to war between states that are all acting rationally:

  1. Commitment problems
    • If you’re about to build a superweapon, I might want to attack now. We might both be better off if I didn’t attack, and I paid you to promise not to use the superweapon. But absent some strong commitment mechanism, why should I trust that you won’t break your promise and use the superweapon to take all my stuff?
    • This is the main mechanism behind expecting war in the case of the Thucydides Trap
  2. Private information, plus incentives to misrepresent that information
    • If each side believes itself to have a military advantage, and cannot trust the other side’s self-reports of their strength, they may go to war to resolve the issue
  3. Issue indivisibility
    • If there is a single central issue at stake, and we can’t make side-payments or agree to abide by a throw of the dice, we may have no other choice than to determine it via war
    • I side with Fearon in viewing this as a less important mechanism, although for completeness I will discuss it briefly below
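The claim that war is inefficient, and the way private information can undo that, can be illustrated with a toy version of the standard bargaining setup. This is a minimal sketch rather than a reproduction of Fearon's paper: it assumes two risk-neutral sides, A and B, disputing a prize normalised to 1, with each side paying a fixed cost if war occurs. All names and numbers are hypothetical, and the commitment-problem mechanism is not modelled here.

```python
from typing import Optional, Tuple


def bargaining_range(
    p_a_view: float,  # A's belief about the probability that A wins a war
    p_b_view: float,  # B's belief about the probability that A wins a war
    cost_a: float,    # A's cost of fighting, as a share of the prize
    cost_b: float,    # B's cost of fighting, as a share of the prize
) -> Optional[Tuple[float, float]]:
    """Return the shares for A that both sides prefer to war, or None.

    A accepts any share at least as large as its expected war payoff.
    B accepts any share for A that leaves B at least its expected war payoff.
    """
    lowest_a_accepts = max(0.0, p_a_view - cost_a)
    highest_b_accepts = min(1.0, p_b_view + cost_b)
    if lowest_a_accepts > highest_b_accepts:
        return None  # no peaceful split looks acceptable to both sides
    return (lowest_a_accepts, highest_b_accepts)


# Shared beliefs: because war is costly, some peaceful split always exists.
print(bargaining_range(p_a_view=0.6, p_b_view=0.6, cost_a=0.1, cost_b=0.1))

# Mutual optimism: each side privately thinks it is likely to win, and the
# gap in beliefs exceeds the total cost of war, so the range disappears.
print(bargaining_range(p_a_view=0.8, p_b_view=0.4, cost_a=0.1, cost_b=0.1))
```

With shared beliefs the range of acceptable splits has width equal to the combined costs of war, so it is never empty. It vanishes only when the two sides' estimates of A's chances differ by more than those combined costs, which is the private-information case.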

Irrational reasons to go to war

Alternatively, as Fearon briefly explains, there are reasons states may go to war even though it is not in their rational interest to do so:

Finally, I want to note that an important contributory factor may be:

(My understanding is that Fearon takes a neo-realist stance which wouldn’t classify this as irrational, but from my perspective it’s an important source of misalignment between states as decision-makers and what would be good for the people who live in them, and so worth mentioning. It won’t by itself suffice to explain a war, but it could be a contributory factor.)

Impacts of AI takeoff on reasons to go to war

We’ll consider in turn the effects on each of the possible reasons for war. The rapidity of change during an AI takeoff looks to increase the risk both of people starting a nuclear war for rational reasons (i.e. fundamental instability) and of people starting a nuclear war for irrational reasons (i.e. failure to navigate to a peaceful equilibrium).

(Note that this section is just an overview of the effects of things speeding up a lot; I’ll get to the effects of particular new AI applications later.)

Impacts on rational reasons for war

Commitment issues

At present a lot of our commitment mechanisms come down to being in a repeated game. If a party violates things that are expected of them, they can receive some appropriate sanctions. If a state began nuclear war, its rivals could retaliate.

A fast-changing technological landscape threatens to upend this. An actor who got far ahead, especially if they developed technologies their rivals were unaware of, could potentially take effective control of the whole world, preventing those affected from retaliating. But a lack of possible retaliation means that they might face no disincentive to do so. And so the actor who was behind, reasoning through possible outcomes, might think they had no better option than starting a war before things reached that stage.

Other concerns include the possibility that the military landscape might move to one which was offence-dominant. Then even an actor who was clearly in the lead might attack a rival to stop them developing any potentially-destructive technologies. Or if new technology threatened to permit a nuclear first-strike to eliminate adversaries’ second-strike capabilities, the clear incentive to initiate war after that technology was possible could translate into some actors having an incentive to initiate war even before the technology came online.

Private information

States may have private information about their own technological base, and about future technological pathways (which inform both their strategic picture and their research strategy). If underlying technologies are changing faster, the amount and value of private information will probably increase.

While not conclusive, an increase in private information seems concerning. It could precipitate war, e.g. from someone who believes they have a technological advantage, but cannot deploy this in small-scale ways without giving their adversaries an opportunity to learn and respond; or from a party worried that another state is on course to develop an insurmountable lead in research into military technologies (even if this worry is misplaced).

Issue indivisibility

Mostly I agree with Fearon that this is likely rarely a major driver of war. Most likely that remains true during an AI takeoff. However, novel issues might arise, on which (at least in principle) there might be issue indivisibility. e.g.

Impacts on irrational reasons for war

Irrational decision-making

During an AI takeoff, the world may feel highly energetic and unstable, as technological capabilities are developed at a rapid pace. People may not grasp the strategic implications of the latest technologies, and are even less likely to fully understand the implications of expected future developments — even if those will come online within the next year.

If the situation becomes much harder to understand, and there is no track record of similar situations to learn from, it will become much easier to act less-than-fully-rationally. People might make big errors, even while acting in ways that might look reasonable to us.

Of course, less-than-fully-rational decision-making doesn't imply that there will be war, but it weakens the arguments against it. People might initiate war if they mistakenly believe themselves to be in one of the situations where there is rational justification for war. Or they might initiate war if they believe that other parties are acting irrationally in ways damaging enough that war becomes the best option for containing them.

Many people would have a moral aversion to the idea of starting a nuclear war. It is a hopeful thought that this would bias even irrational action against initiating war. However, this consideration feels a bit thin to count on.

(Also, all of these situations could be very stressful, and stress can inhibit good decision-making in a normal sense.)

Misaligned decision-makers

I'm not sure takeoff will have a big effect on the extent to which decision-makers are misaligned. But there are a couple of related considerations that give some cause for alarm:

National pride

It is quite plausible that the unsettling nature of a takeoff period will make things feel unsafe to people in ways that push their mindset towards something like national pride — binding up their notion of what acting well and selflessly is with protecting the dignity and honour of their civilization or nation. This could occur at the level of the leadership, or the citizenry, or both.

Generally high levels of national pride seem to make the situation more fraught, because they narrow the space of globally-acceptable outcomes — it becomes necessary not only to find outcomes that are good for all of the people, but also for the identity of the nations (as projected by the people running them). This could, for example, be a blocker on reaching agreements which avert war by giving up certain sorts of sovereignty to an international body.

Strategies for reducing risk of war

Strategies for averting failure to navigate takeoff

Nuclear war seems pretty bad[2]. It may therefore be high leverage to pursue strategies to reduce the risk of war. The straightforward strategies are research and dissemination, spreading positive-sum frames, and getting buy-in to meaningful commitments.

Research & dissemination

A major driver of risk is the possibility that the rate of change will mean that decision-makers are out of their depth, and acting on partially-wrong models about the strategic situation.

An obvious response is to produce, and disseminate, high-quality analysis which will help people to better understand the strategic picture.

This seems likely to be a good idea. While there are some possible worlds where things reach better outcomes because some people don't understand the situation and are blindsided, a strategy of deliberately occluding information feels very non-robust.

Spreading “we’re all in this together” frames

The more people naturally think of this challenge as a contest between nations, the more likely they are to make decisions on the basis of national pride, and the harder it may be to get people to come together to face what may be the grandest challenge for humanity — preserving our dignity as we move into a world where human intelligence is not supreme.

On the other hand, I think that getting people unified around frames which naturally put us all in the same boat is likely to have some effect reducing the impact of national pride on decision-making, and hence reduce the risk of war. Of course this is dependent on how far a reach these frames could have — but I think that as the world becomes stranger people will naturally be reaching for new frames, so there may be some opportunity for good frames to have a very wide reach.

Agreements/treaties about sharing power of AI

The risks are driven by the possibility that some nuclear actor, at some point, may not perceive better options than initiating nuclear war. An obvious mitigating strategy is to work to ensure that there always are such options, and they are clear and salient.

Since the potential benefits from AI are large, it seems likely that there are possible distributions of benefits and of power which look robustly better to all parties than war. The worry is that things may move too fast for people to identify these, or that differing views about what is fair will lead each party to hold out obstinately for its preferred outcome and thereby walk into war. Working early on possible approaches for such distributions, and on how best to reach robust agreement on them, could thereby help to reduce risk.

I say “sharing power” rather than just “sharing benefits” because it seems like a good fraction of people and institutions ~terminally value having power over things. They might not be satisfied with options which just give them a share in the material benefits of AI, without any meaningful power.

Differential technological development

Strong and trusted AI tools targeted at the right problems could help to change the basic situation in ways that reduce risks of (rational or irrational) initiation of nuclear war. This could include both development of the underlying technologies, and building them out so that they are actually adopted and have time to come to be trusted.

To survey how AI applications could help with the various possible reasons for war:

By default, I expect the increases in risk to occur before we have strong (& sufficiently trusted) effective tools for these things. But accelerating progress for these use-cases might meaningfully shrink the period of risk.

I am uncertain which of these are the most promising to pursue, but my guesses would be:

What about an AI pause?

If AI takeoff is a driver of risk here, would slowing down or pausing AI progress help?

My take is that:

Closing thoughts

What about non-nuclear warfare?

This analysis is about all-out war. Right now this probably means nuclear, although that could change with time. (Bioweapons could potentially be even more concerning than nuclear.)

How big a deal is this?

On my current impressions, destabilizing effects from AI takeoff leading to all-out global war are very concerning. I’m not very confident in any particular estimates of absolute risk, but I think it's fair to say that, having thought about all of them for some time, it's not clear to me which are the biggest risks associated with AI, between risk from misaligned systems, risk of totalitarian lock-in, and risk of nuclear war.

Given this, it does seem clear that each of these areas deserves significant attention. I think the world should still pay more attention to misaligned AI, but I think it should pay much more attention than at present to risks of things ending in catastrophe for other reasons as people navigate AI takeoffs. I'm less confident that any of my specific ideas of things to do are quite right.

Acknowledgements: Thanks to Eric Drexler, who made points in conversation which made me explicitly notice a bunch of this stuff. And thanks to Raymond Douglas, Fynn Heide, Max Dalton, and Toby Ord for helpful comments and discussion.

  1. ^

     There is also a risk of nuclear war initiated deliberately by misaligned AI agents. But as the risks of misaligned AI agents receive significant attention elsewhere, and as the mechanisms driving the risk of nuclear war are quite different in that case, I do not address it in my analysis here.

  2. ^

     Obviously nuclear war is a terrible outcome on all normal metrics. But is there a galaxy-brained take where it’s actually good, for stopping humanity before it goes over the precipice?

    This is definitely a theoretical possibility. But it doesn’t get much of my probability mass. It seems more likely that:

    1) Nuclear war would not wipe out even close-to-everyone.

    2) While it would set the world economy back quite a way, it wouldn’t cause the loss of most technological progress.

    3) In the aftermath of a nuclear war, surviving powers would be more fearful and hostile.

    4) There would be greater incentives to rush for powerful AI, and less effort expended on going carefully or considering pausing.

6 comments

Comments sorted by top scores.

comment by Akash (akash-wasil) · 2024-06-21T20:24:51.018Z · LW(p) · GW(p)

Interesting analysis! I think it'll be useful for more folks to think about the nuclear/geopolitical implications of AGI development, especially in worlds where governments are paying more attention & one or more nuclear powers experience a "wakeup" or "sudden increase in situational awareness."

Some specific thoughts:

Of these two risks, it is likely simpler to work to reduce the risk of failure to navigate. 

Can you say more about why you believe this? At first glance, it seems to me like "fundamental instability" is much more tied to how AI development goes, so I would've expected it to be more tractable [among LW users]. Whereas "failure to navigate" seems further outside our spheres of influence: it seems to me like there would be a lot of intelligence agency analysts, defense people, and national security advisors who are contributing to discussions about whether or not to go to war. Seems plausible that maybe a well-written analysis from folks in the AI safety community could be useful, but my impression is that it would be pretty hard to make a splash here since (a) things would be so fast-moving, (b) a lot of the valuable information about the geopolitical scene will be held by people working in government and people with security clearances, making it harder for outside people to reason about things, and (c) even conditional on valuable analysis, the stakeholders who will be deferred to are (mostly) going to be natsec/defense stakeholders.

3) In the aftermath of a nuclear war, surviving powers would be more fearful and hostile.

4) There would be greater incentives to rush for powerful AI, and less effort expended on going carefully or considering pausing.

There are lots of common-sense reasons why nuclear war is bad. That said, I'd be curious to learn more about how confident you are in these statements. In a post-catastrophe world, it seems quite plausible to me that the rebounding civilizations would fear existential catastrophes and dangerous technologies and try hard to avoid technology-induced catastrophes. I also just think such scenarios are very hard to reason about, such that there's a lot of uncertainty around whether AI progress would be faster (because civs are fearful of each other and hostile) or slower (because civs are fearful of technology-induced catastrophes and generally have more of a safety/security mindset).

Replies from: owencb
comment by owencb · 2024-06-21T21:47:02.886Z · LW(p) · GW(p)

Can you say more about why you believe this? At first glance, it seems to me like "fundamental instability" is much more tied to how AI development goes, so I would've expected it to be more tractable [among LW users].

Maybe "simpler" was the wrong choice of word. I didn't really mean "more tractable". I just meant "it's kind of obvious what needs to happen (even if it's very hard to get it to happen)". Whereas with fundamental instability it's more like it's unclear if it's actually a very overdetermined fundamental instability, or what exactly could nudge it to a part of scenario space with stable possibilities.

In a post-catastrophe world, it seems quite plausible to me that the rebounding civilizations would fear existential catastrophes and dangerous technologies and try hard to avoid technology-induced catastrophes.

I agree that it's hard to reason about this stuff so I'm not super confident in anything. However, my inside view is that this story seems plausible if the catastrophe seems like it was basically an accident, but less plausible for nuclear war. Somewhat more plausible is that rebounding civilizations would create a meaningful world government to avoid repeating history.

comment by VojtaKovarik · 2024-06-12T14:01:58.449Z · LW(p) · GW(p)

Nitpick on the framing: I feel that thinking about "misaligned decision-makers" as an "irrational" reason for war could contribute to (mildly) misunderstanding or underestimating the issue.

To elaborate: The "rational vs irrational reasons" distinction talks about the reasons using the framing where states are viewed as monolithic agents who act in "rational" or "irrational" ways. I agree that for the purpose of classifying the risks, this is an ok way to go about things.

I wanted to offer an alternative framing of this, though: For any state, we can consider the abstraction where all people in that state act in harmony to pursue the interests of the state. And then there is the more accurate abstraction where the state is made of individual people with imperfectly aligned interests, who each act optimally to pursue those interests, given their situation. And then there is the model where the individual humans are misaligned and make mistakes. And then you can classify the reasons based on which abstraction you need to explain them.

comment by Inosen Infinity (Inosen_Infinity) · 2024-06-12T07:51:41.103Z · LW(p) · GW(p)
  • Good AI tools could help people to make better sense of the world, and make more rational decisions.

I have a feeling this may go both ways. If AI development is nation-led (which may become true at some point somewhere), the nation's leaders would perhaps want the AI to be aligned with their own values. There is some risk that in this way biases could be solidified instead of overcome, and the AI would recommend even more irrational (in terms of the common good) decisions -- or rather, rational decisions based on irrational premises, which could lead to an increased risk of conflict. It might be especially true for authoritarian countries.

  • AI could potentially give new powerful tools for democratic accountability, holding individual decisions to higher standards of scrutiny (without creating undue overhead or privacy issues)

The way I understand it could work is that democratic leaders with "democracy-aligned AI" would get more effective influence on nondemocratic figures (by fine-tuned persuasion or some kind of AI-designed political zugzwang or etc), thus reducing totalitarian risks. Is my understanding correct? 

(I also had a thought that maybe you meant a yet-misaligned leader would agree to cooperate with an aligned AI, but that sounds unlikely -- such a leader would probably refuse because their values would differ from the AI's)

Replies from: owencb
comment by owencb · 2024-06-12T10:09:42.049Z · LW(p) · GW(p)

The way I understand it could work is that democratic leaders with "democracy-aligned AI" would get more effective influence on nondemocratic figures (by fine-tuned persuasion or some kind of AI-designed political zugzwang or etc), thus reducing totalitarian risks. Is my understanding correct? 

Not what I'd meant -- rather, that democracies could demand better oversight of their leaders, and so reduce the risk of democracies slipping into various traps (corruption, authoritarianism).

Replies from: Inosen_Infinity
comment by Inosen Infinity (Inosen_Infinity) · 2024-06-12T10:43:51.828Z · LW(p) · GW(p)

Thanks!

The idea sounds nice, but practically it may also turn out to be a double-edged sword. If there is an AI that could significantly help in oversight of decision-makers, then there is almost surely an AI that could help the decision-makers drive public opinion in their desired direction. And since leaders usually have more resources (network, money) than the public, I'd assume that this scenario has a larger probability than the successful oversight scenario. Intuitively, way larger.

I wonder how we could achieve oversight without getting controlled back in the process. Seems like a tough problem.