An illustrative model of backfire risks from pausing AI research

post by Maxime Riché (maxime-riche) · 2023-11-06T14:30:58.615Z · LW · GW · 3 comments

Contents

  TLDR
  Summary
  Introduction
      Acknowledgments
      Definition: "Takeoff speed" in this post equals "Maximum explosive takeoff speed"
      Definition: Training/inference asymmetry = Number of AI researchers you can run just after training a model
      Motivation: Why look at the takeoff speed? Answer: Because most of the X-risks concentrate around the time of the explosive takeoff.
  Model
    Main dynamics
    Building the model step-by-step
    Guessing the parameters
    Illustrations
      Baseline (without 3. The training/inference asymmetry effect)
      Increasing investment (without 3. The training/inference asymmetry effect)
      Increasing investment (with 3. The training/inference asymmetry effect)
  Scenarios
      Scenario - Long timelines
      Scenario - Short timelines
  Takeaways
    General
    Surprising updates
    How do these arguments and/or this model fail?
  Comparison with the takeoff speed report's model
        Results expected if the model in this post is right:
        Results observed:
        Conclusion:
      Best guess scenario
      Best guess scenario - low training requirement only
        Computing Overhangs LW's wiki page
        Guessing the parameters - Details
3 comments

TLDR

The post contains a simple model of the maximum explosive takeoff speed (simplified as "takeoff speed"), using as its main parameters: 

The post concludes that:

Summary

This post is a contribution to the AI pause debate. It illustrates a few effects that explain why pausing progress on software efficiency or increasing investment can have negative consequences. 

The model in this post predicts similar effects on the takeoff speed from increasing investment or from pausing software efficiency improvement. I will mainly illustrate the effects by looking at an increase in early investment. Still, the effects will be similar to a pause in AI development if it mostly slows down progress on software efficiency.

This post also tries to address the claim that shortening timelines can lead to a softer takeoff. The model concludes that while short timelines can have softer takeoffs, shortening timelines by investing more will likely increase the takeoff speed.

I describe a naive model of AI capability acceleration whose main features are the different types of overhangs[1], their sizes, their consumption rates[2], and how these rates are influenced by the training/inference asymmetry[3].

This model only looks at a few effects, and their impacts on the maximum speed of capability increase during the explosive takeoff period.[4]

The model includes the following effects:

The model concludes the following:

The model also leads to a few indirect conclusions:

Introduction

Acknowledgments

Thanks to @Jaime Sevilla [EA · GW] and @Tom_Davidson [EA · GW] for their generous feedback on this post. All errors are mine.

Thanks to @Tristan Cook [EA · GW] for prompting my work on this, by asking me why I thought increasing investment early may soften the takeoff. In the end, this model shows the opposite.

Definition: "Takeoff speed" in this post equals "Maximum explosive takeoff speed"

By takeoff, I mean the "explosive takeoff", when an explosive capability increase is triggered by creating capable enough AI (e.g., automating >90% of cognitive work). I don't mean the "pre-takeoff" period during which the speed-up in capability increase is still strongly limited.

The post focuses on the maximum speed of capability increase during the explosive takeoff. I often call this the “takeoff speed” while meaning the “maximum speed of capability increase during the explosive takeoff”.

Illustration of the explosive takeoff period (in red only). Taken and modified from: Robin Hanson: I Still Don’t Get Foom.

Definition: Training/inference asymmetry = Number of AI researchers you can run just after training a model

The training/inference asymmetry is the fact that after finishing training one model, you can start running N inferences by using the same supercomputer used for training. 

I.e., as soon as you finish training, you can use the trained model many times using the same computing infrastructure.

E.g., if you always train your model in the same amount of clock time, and you increase the number of data points you train on by x10, then you will need a supercomputer x10 as large[6], and after training, you will be able to run x10 more inferences. 
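The scaling in this example can be sketched in a few lines of Python. This is a minimal sketch, not the author's code: the function names are hypothetical, and it assumes that training compute is proportional to (number of weights × number of data points) and that the cost of one inference is proportional to the number of weights.

```python
def cluster_scale(data_scale, weights_scale=1.0, clock_time_scale=1.0):
    """Relative size of the training supercomputer.

    Assumption: training compute ~ weights * data points, and the cluster
    must deliver that compute within the given training clock time.
    """
    return weights_scale * data_scale / clock_time_scale


def parallel_inference_scale(cluster, weights_scale=1.0):
    """Relative number of inferences the same cluster can run after training.

    Assumption: the cost of one inference ~ number of weights.
    """
    return cluster / weights_scale


# Example from the text: x10 data points, same weights, same clock time.
cluster = cluster_scale(data_scale=10.0)
inferences = parallel_inference_scale(cluster)
print(cluster, inferences)
```

Under these assumptions, scaling the data by x10 at constant weights and clock time gives a x10 larger supercomputer and x10 more inferences after training, as in the example above.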

Motivation: Why look at the takeoff speed? Answer: Because most of the X-risks concentrate around the time of the explosive takeoff.

Given recent developments in AI safety, it is now more likely that we will manage to align below or around human-level capabilities. You can find a list of advances in AI safety that push in that direction in Don't Dismiss Simple Alignment Approaches [LW · GW] or Thoughts on sharing information about language model capabilities [LW · GW]. Furthermore, over the last year, there has been surprising progress in AI governance (e.g., the UK AI safety summit [LW · GW] or [7] and the US executive order on AI safety [EA · GW]). All of these are updates towards a lower probability of X-risk before takeoff. 

The default plan looks more and more like OpenAI's Superalignment plan: Leveraging aligned AIs to align/oversee higher capability AIs. A critical parameter influencing the probability of success of this plan is the speed at which AI capabilities will increase.

Thus, the model in this post will look at the maximum speed of capability increase during the explosive takeoff ("takeoff speed") and how a few parameters influence it.

Model

Main dynamics

There are three main dynamics in this model, and the later sections will sometimes only look at the first two without looking at the third.

  1. The software efficiency overhang drives takeoff speed.
    • The software efficiency overhang is the overhang that influences the takeoff speed the most, because of its large size and fast consumption rate.
  2. No free lunch on decreasing overhangs.
    • Decreasing one overhang increases some of the other overhangs (at the start of the takeoff), because the capability threshold at which the takeoff starts is fixed.
  3. The training/inference asymmetry effect.
    • The consumption rate of the software efficiency overhang increases quickly with the investment in training costs, because of the increase in the training/inference asymmetry.

Building the model step-by-step

The following model is NOT a predictive model; some parameters are set arbitrarily. The model is here to illustrate the main dynamics presented above and to give a rough sense of the directions and intensity of the different effects.

(I omit some of the "Overhang" terms below, but most of the terms are about overhangs. And I set t = start of takeoff.)

TakeoffSpeed(t=start of takeoff)

= SumOverOverhangs(InitialSize * %Remaining * ConsumptionRate)

= (InitialSize(invest.) * %Remaining(invest.) * C1 * 1/Delay(invest.)) + 

(InitialSize(hard.eff.) * %Remaining(hard.eff.) * C1 * 1/Delay(hard.eff.)) + 

(InitialSize(soft.eff.) * %Remaining(soft.eff.) * C1 * 1/Delay(soft.eff.) * C2 * SQRT(SQRT(investment)))

TakeoffThreshold 

= 1e36 

= InitialSize(invest.) * (1-%Remaining(invest., t=start of takeoff)) +

InitialSize(hard.eff.) * (1-%Remaining(hard.eff., t=start of takeoff)) +

InitialSize(soft.eff.) * (1-%Remaining(soft.eff., t=start of takeoff))
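The two formulas above can be sketched in Python as follows. This is a hedged illustration rather than the author's implementation: the `Overhang` class, the function names, and all numeric values are hypothetical, and sizes are expressed in OOM of compute equivalent (an assumption that follows the illustrations, where the threshold is 36 for 1e36 compute equivalent). `C1`, `C2`, and `Delay` play the same roles as in the formulas.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Overhang:
    initial_size: float  # InitialSize, in OOM of compute equivalent (assumed unit)
    remaining: float     # %Remaining at the start of takeoff, as a fraction in [0, 1]
    delay: float         # Delay; only relative values matter


def consumption_term(o, c1=1.0):
    # InitialSize * %Remaining * C1 * 1/Delay
    return o.initial_size * o.remaining * c1 / o.delay


def takeoff_speed(invest, hard_eff, soft_eff, investment,
                  c1=1.0, c2=1.0, asymmetry=True):
    """Takeoff speed at t = start of takeoff, as a sum over the three overhangs."""
    soft = consumption_term(soft_eff, c1)
    if asymmetry:
        # Dynamic 3: training/inference asymmetry, C2 * SQRT(SQRT(investment))
        soft *= c2 * sqrt(sqrt(investment))
    return consumption_term(invest, c1) + consumption_term(hard_eff, c1) + soft


def consumed_capability(overhangs):
    """Right-hand side of the TakeoffThreshold equation."""
    return sum(o.initial_size * (1.0 - o.remaining) for o in overhangs)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
invest = Overhang(initial_size=4.0, remaining=0.5, delay=2.0)
hard = Overhang(initial_size=10.0, remaining=0.5, delay=5.0)
soft = Overhang(initial_size=20.0, remaining=0.5, delay=1.0)

print(takeoff_speed(invest, hard, soft, investment=16.0, asymmetry=False))
print(takeoff_speed(invest, hard, soft, investment=16.0, asymmetry=True))
print(consumed_capability([invest, hard, soft]))
```

With these made-up values, the software efficiency term dominates the sum, and switching on the asymmetry factor multiplies only that term.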

Guessing the parameters

For the Delay(overhang) parameters, I will only look at their relative values.

A few explanations are available in [15].

Illustrations

Baseline (without 3. The training/inference asymmetry effect)

This is a baseline, with semi-arbitrary numbers for the overhang consumption rates, and the initial values. You can see the formulas in [16]. The capability is the sum of the consumption of the 3 overhangs. The takeoff starts when capability reaches 36 (1e36  compute equivalent).
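As a rough illustration of how such a baseline could be simulated, here is a minimal Python sketch. It is not the set of formulas from [16]: it only assumes the definition of the consumption rate used in this post (the fraction of the remaining overhang consumed per year, hence an exponential decay of the remainder), and the function names and all parameter values are hypothetical.

```python
def remaining_fraction(rate_per_year, years):
    """Fraction of an overhang still unconsumed after `years`."""
    return (1.0 - rate_per_year) ** years


def capability(overhangs, years):
    """Capability = sum of what has been consumed from each overhang.

    `overhangs` is a list of (initial_size, rate_per_year) pairs, with sizes
    in OOM of compute equivalent (assumed unit).
    """
    return sum(size * (1.0 - remaining_fraction(rate, years))
               for size, rate in overhangs)


def takeoff_start(overhangs, threshold=36.0, step=0.01, horizon=200.0):
    """First time (in years) at which capability reaches the threshold, or None."""
    for i in range(int(horizon / step) + 1):
        t = i * step
        if capability(overhangs, t) >= threshold:
            return t
    return None


# Hypothetical overhangs: (initial size, fraction of the remainder consumed per year).
example = [(12.0, 0.10), (20.0, 0.05), (40.0, 0.08)]
print(takeoff_start(example))
```

The function returns `None` when the combined overhangs are too small to ever reach the threshold within the horizon.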

Increasing investment (without 3. The training/inference asymmetry effect)

Here, I do not yet add the effect of the training/inference asymmetry on the consumption rate of the software efficiency overhang.

I compare the baseline with a scenario in which the investment overhang is consumed x3 faster during pre-takeoff. The maximum takeoff speed increases from 25.5 to 28 (OOM of compute equivalent/year).

Note: these values 25.5 and 28 OOM/y are much more illustrative than predictive. The relative increase is more important than the absolute numbers.

Comparison of takeoff speed, when consuming the investment overhang x3 faster during pre-takeoff. Not including the effect of the training/inference asymmetry.

Increasing investment (with 3. The training/inference asymmetry effect)

Finally, I add the effect of the training/inference asymmetry on the consumption rate of the software efficiency overhang.

I compare a new baseline (with dynamic 3) with a scenario in which the investment overhang is consumed x3 faster during pre-takeoff. The maximum takeoff speed increases from 24 to 52 (OOM of compute equivalent/year).

Note: these values 24 and 52 OOM/y are much more illustrative than predictive. The relative increase is more important than the absolute numbers.

Comparison of takeoff speed, when increasing investment x3 faster during pre-takeoff. Including the effect of the training/inference asymmetry.
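Since the relative increase matters more than the absolute numbers, it can help to compute it explicitly for the two comparisons above. The short Python sketch below uses only the four values quoted in this section; the helper name is hypothetical.

```python
def relative_increase(baseline, scenario):
    """Relative change of the maximum takeoff speed versus the baseline."""
    return scenario / baseline - 1.0


# Without the training/inference asymmetry effect: 25.5 -> 28 OOM/y.
without_asymmetry = relative_increase(25.5, 28.0)

# With the training/inference asymmetry effect: 24 -> 52 OOM/y.
with_asymmetry = relative_increase(24.0, 52.0)

print(f"without asymmetry: {without_asymmetry:+.1%}")
print(f"with asymmetry:    {with_asymmetry:+.1%}")
```

This gives roughly +10% without the asymmetry effect and roughly +117% with it, so in this illustration most of the increase comes from dynamic 3.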

Scenarios

Scenario - Long timelines

Description: 

AI labs scale investments spent on training costs early, but because timelines are long and there are large economic incentives to invest in AI, this boost is balanced by lower investment growth later. I.e., this early increase in investment does not change the level of investment at the start of takeoff because training costs were already going to be maxed out at the start of takeoff.

Assumption (1): I assume no increase in investments in hardware and software efficiencies. 

Or assumption (2): Assumption (1) is unrealistic. Let's assume instead that there is an increase, but that it applies equally strongly to both hardware and software efficiencies.

Consequences: 

Scenario - Short timelines

Description: 

AI labs invest more heavily in training costs early. Because timelines are short, we see a counterfactual change in the distribution of investment, hardware efficiency, and software efficiency overhangs when the takeoff starts. The early increase in investment is not balanced out later.

Consequences:

Takeaways

General

Given the very limited scope of this model:

Surprising updates

After working on this question, I changed my mind a bit towards the following:

How do these arguments and/or this model fail?

The updates just above should not be made if you think:

Comparison with the takeoff speed report's model

You can find the takeoff speed report in the following post: What a compute-centric framework says about takeoff speeds [EA · GW]. You can play with their model at https://takeoffspeeds.com/.

I played with increasing the investment rates in the model. I changed both the "Growth rate fraction GWP compute" and "Wake-up growth rate fraction of GWP buying compute" at the same time and kept their values equal. You can find the extracted data here.

Results expected if the model in this post is right:

Results observed:

Conclusion:

Best guess scenario

Best guess scenario - low training requirement only

  1. ^
  2. ^

    I define the consumption rate of some capability overhang by the percentage of the remaining overhang that is consumed in a year. This leads to the remaining overhangs at time t  following an exponential decay function.

  3. ^

    The training/inference asymmetry is the fact that after finishing training one model, you can start running N inferences by using the same supercomputer used for training. I.e., as soon as you finish training, you can use the trained model many times using the same computing infrastructure.

  4. ^

    Scope of the post: This post is limited in scope and is about the question: “Does accelerating AI capabilities reduce the maximum explosive takeoff speed?”. It is NOT about the questions: “Does accelerating AI capabilities have a positive or negative impact?”, etc.

  5. ^

    Obviously, the reduced takeoff timeline also needs to be taken into account, and this is not modeled here.

  6. ^

    With the number of weights held constant.

  7. ^
  8. ^

    There is a threshold of capability after which the explosive takeoff starts (e.g. 90% automation of cognitive tasks).

  9. ^

    The takeoff ends when there are no more overhangs to easily consume for fast capability improvement (or if self-improvement is stopped, e.g. by governance or by the loss of self-improvement capabilities, but this is neglected).

  10. ^

    The size of the pool is given as an initial parameter, except for the investment overhang, whose pool size increases with the GWP (World GDP). I ignore this effect.

  11. ^

    This is not the case for the hardware efficiency and investment overhangs, which have bottlenecks preventing most of this effect.

  12. ^

    But this function would largely be wrong for the pre-takeoff period, which should be much more governed by returns on investments.

  13. ^

    Because of scaling laws, and assuming that the training time for SOTA models stays the same, we have something like this:

     

    E.g., each time investment is scaled by x100, the training run uses x10 more data points to train a model x10 larger, and the supercomputer used for training the model will be able to run x10 more inferences after training (assuming training clock time constant).
     

    How is this linked to the consumption rate of the software efficiency overhang? I will assume that:
    - when you have x100 more clones of the same researcher
    - and when you have a lot of potential research directions (e.g. at the start of takeoff when a lot of the overhang is remaining)
    - then your research outputs are simply multiplied by x10. 
    So, I am assuming some small diminishing returns (SQRT) because researchers are clones of each other and because I look at a time (start of takeoff) when a lot of capability overhangs remain.

  14. ^
  15. ^
  16. ^
  17. ^

    Excluding the effects of GWP growth.

  18. ^

    Using huge scaffoldings that are OOMs larger to push SOTA could slow the takeoff (during its most dangerous time) by reducing the number of AI researchers that we will be able to run when we first unlock this level of capability. For example, instead of running 10000 AI-equivalent researchers in parallel for the first time in 2030, it could be safer to be able to run 10 expensive AI-equivalent researchers in 2029, by paying a huge 3 OOM premium for massive scaffolding techniques.

  19. ^
  20. ^

    This is NOT an argument for reducing timelines. It is an argument for being more optimistic about the outcomes in a world with short timelines. 


    The model presented in this post says that actively reducing timelines only reduces the takeoff speed if you consume relatively more of the software efficiency overhang while also not increasing the training/inference asymmetry.

3 comments

Comments sorted by top scores.

comment by Tom Davidson · 2023-12-02T19:19:30.630Z · LW(p) · GW(p)

I think your model will underestimate the benefits of ramping up spending quickly today. 

You model the size of the $ overhang as constant. But in fact it's doubling every couple of years as global spending on producing AI chips grows. (The overhang relates to the fraction of chips used in the largest training run, not the fraction of GWP spent on the largest training run.) That means that ramping up spending quickly (on training runs or software or hardware research) gives that $ overhang less time to grow.

Replies from: maxime-riche
comment by Maxime Riché (maxime-riche) · 2023-12-04T03:28:19.766Z · LW(p) · GW(p)

Interesting! I will see if I can correct that easily.

comment by Vladimir_Nesov · 2023-11-06T14:52:34.808Z · LW(p) · GW(p)

Technical constraints on speed of takeoff only determine actual speed of takeoff when there is no control over what's happening, instead of going orders of magnitude slower than feasible. Pause and strong coordination on development might be useful for securing such control.