Musings on Scenario Forecasting and AI

post by Alvin Ånestrand (alvin-anestrand) · 2025-03-06T12:28:19.662Z · LW · GW

This is a link post for https://forecastingaifutures.substack.com/p/musings-on-ai-scenario-forecasting


I have yet to write detailed scenarios for AI futures, which definitely seems like something I should do considering the title of my blog (Forecasting AI Futures). I have speculated, pondered and wondered much in the recent weeks—I feel it is time. But first, I have some thoughts about scenario forecasting.

The plan:

  1. Write down general thoughts about scenario forecasting with special focus on AI (this post).
  2. Write one or two specific scenarios for events over the coming months and years.
  3. Wait a few months, see what comes true, and update my scenario forecasting methods.

Other work

In 2021, Daniel Kokotajlo published a scenario titled What 2026 looks like [LW · GW]. He managed to predict many important aspects of the progression of AI between 2021 and now, such as chatbots, chain-of-thought, and inference compute scaling.

Now he is collaborating with other forecasters—including Eli Lifland, a superforecaster from the Samotsvety forecasting group—to develop a highly detailed and well-researched scenario forecast under the AI Futures Project. Designed to be as predictively accurate as possible, it illustrates how the world might change as AI capabilities evolve. The scenario is scheduled to be published in Q1 of this year.

I also recommend reading Scale Was All We Needed, At First [LW · GW] and How AI Takeover Might Happen in 2 Years [LW · GW], two brilliant stories exploring scenarios with very short timelines to superintelligence.

Using and Misusing Scenario Forecasting

People may hear about a specific superintelligence disaster scenario, and then confidently say something like “That seems entirely implausible!” or “AIs will just be tools, not agents!” or “If it tried that, humans could just [insert solution].”

There is a fundamental issue here: those who see significant risks struggle to convey their concerns to those who think everything is fine.

And I think this problem at least partly has to do with scenario forecasting. One side is envisioning specific bad scenarios, which can easily be refuted. The other side is envisioning specific good scenarios, which can also be easily refuted.

The question we should consider is something more like “Out of all possible scenarios, how many lead to positive outcomes versus negative ones?”. But this question is harder to reason about, and any reasoning about it takes longer to convey.
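To make that question a bit more concrete, here is a minimal sketch of sampling a toy scenario space and counting positive versus negative outcomes. The branch structure and every probability in it are placeholders for illustration, not actual forecasts.

```python
import random

# Toy Monte Carlo over a branching scenario space.
# All branch probabilities are made-up placeholders, not forecasts.
def sample_scenario(rng: random.Random) -> str:
    """Sample one rough scenario and return its outcome label."""
    alignment_easy = rng.random() < 0.4   # placeholder: alignment turns out tractable
    coordination = rng.random() < 0.3     # placeholder: major powers coordinate on AI
    if alignment_easy and coordination:
        return "positive"
    if alignment_easy or coordination:
        # Mixed worlds: the outcome hinges on further, unmodeled details.
        return "positive" if rng.random() < 0.5 else "negative"
    return "negative"

def estimate_positive_share(n: int = 100_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(sample_scenario(rng) == "positive" for _ in range(n)) / n

print(f"Share of sampled scenarios with positive outcomes: {estimate_positive_share():.2f}")
```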

We can start by considering what avoiding the major risks would mean. The world needs to reach a stable state with minimal risks. For example:

  1. Major powers agree to never develop dangerously sophisticated AI. All significant datacenters are monitored for compliance, and any breach results in severe punishment.
  2. Superintelligent tool AI is developed—a system incapable of agentic behavior and without goals of its own. Like the above scenario, there are extremely robust control mechanisms and oversight; no one can ask the AI to design WMDs or develop other potentially dangerous AI systems.
  3. There is a single aligned superintelligence that follows human instructions—whether through a government, a shadow government, the population of a nation, or even a global democratic system. There are advanced superintelligence-powered security measures ensuring that no human makes dangerous alterations to the AI. There are reliable measures for avoiding authoritarian scenarios where some human(s) take control over the future and direct it in ways the rest of humanity would not agree to.
  4. There are several superintelligent AIs, perhaps acting in an economy similar to the current one. More superintelligences may occasionally be developed. Humans are still alive and well, and in control of the AIs. There are mechanisms that ensure that all superintelligences are properly aligned, or can’t take any action that would harm humans, e.g. through highly advanced formal verification of the AIs and their actions.

There are certainly other relatively stable states. Imagine, for instance, a scenario where AIs are granted rights—such as property ownership and voting. Strict regulation and monitoring ensure that no superintelligence can succeed in killing most or all humans with e.g. an advanced bioweapon. This scenario could, however, lead to AIs outcompeting humans. Unless human minds are simulated in large quantities, AIs would far outnumber humans and have basically all voting power in a democratic system.

For those arguing that there are no significant risks, I ask: What specific positive scenario(s) do you anticipate? Will one of them simply happen by default?

A single misaligned superintelligence might be all it takes to end humanity. Some think the first AI to reach superintelligence will have a sharp left turn [? · GW]: capabilities generalize across domains while alignment properties fail to generalize. My impression is that those who think AI-caused extinction is highly probable consider this the major threat [LW · GW], or at least one of the major threats. By default, alignment methods break apart when capabilities generalize, rendering them basically useless, and we lose control over an intelligence much smarter than us.

But what if alignment turns out to be really easy? Carelessness, misuse, conflicts, and authoritarian control risks remain. How do you ensure everyone aligns their superintelligences and uses them responsibly? Some have suggested pivotal acts [? · GW], such as using the first (hopefully aligned) superintelligence to ensure that no other potentially unsafe superintelligences are ever developed. Others argue that the most advanced AIs should be developed in an international collaborative effort and controlled according to international consensus, hopefully leading to a stable scenario like scenarios 2 and 3 above. See What success looks like [EA · GW] for further discussion on how these scenarios may be reached.

When considering questions like “Will AI kill us all?” or “Will there be a positive transition to a world with radically smarter-than-human artificial intelligence?”, I try to imagine stable scenarios like the ones above and estimate the probability that such a state is achieved before some catastrophic disaster occurs.
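As a toy illustration of that kind of estimate, the stable state and the catastrophe can be treated as a race between two constant yearly hazard rates. The rates below are placeholders, not my actual estimates.

```python
# Toy "race" model with constant hazard rates for reaching a stable state
# versus a catastrophic disaster. Under exponential hazards,
# P(stable state comes first) = lambda_stable / (lambda_stable + lambda_catastrophe).
def p_stable_first(stable_rate: float, catastrophe_rate: float) -> float:
    return stable_rate / (stable_rate + catastrophe_rate)

# Placeholder rates: stable state expected in ~10 years, catastrophe in ~30 years.
print(p_stable_first(stable_rate=1 / 10, catastrophe_rate=1 / 30))  # -> 0.75
```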

Please comment with the type of stable scenario you find most likely! Which one should we aim for?

Some Basic World Modeling

Predictively accurate forecasting scenarios should not be written the way you write fiction—they should follow the rules of probability, as well as cause and effect. They should tell you about things you might actually see, which requires that all details of the scenario are consistent with each other and with the current state of the world.

This requires some world modeling.

I will provide an example. While it might not be the best model, or entirely comprehensive, it should serve to illustrate my way of thinking about forecasting. For a more thorough world modeling attempt, see Modeling Transformative AI Risk (MTAIR) [? · GW].

When forecasting, I categorize facts and events. For instance, benchmark results fall under AI Capabilities, while AI misuse cases fall under AI Incidents. Let’s call these categories variables—things that feel especially important when thinking about AI futures. These variables affect each other—often in complex ways, as detailed below. The variables can in turn be categorized into Research & Development (R&D) Variables and Circumstantial Variables. Under each variable, I have included the other variables that it affects and described their relationships.

Research and Development (R&D) Variables

Circumstantial Variables

We can analyze more complex interactions between the variables. For instance, a misaligned AI with sufficiently advanced capabilities can circumvent its control mechanisms, which increases incident risks. An AI lab that is confident in the alignment of its AI will be more confident in its control, motivating the lab to use the AI to further automate its research and development.
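As a rough sketch of how this bookkeeping might look in practice, here is one way to represent variables and the effects between them in code. The variable names and effect descriptions are illustrative examples, not a complete world model.

```python
from dataclasses import dataclass, field

@dataclass
class Variable:
    name: str
    category: str                                           # "R&D" or "Circumstantial"
    affects: dict[str, str] = field(default_factory=dict)   # affected variable -> how

# Example variables (illustrative, not exhaustive).
ai_capabilities = Variable(
    name="AI Capabilities",
    category="R&D",
    affects={
        "AI Incidents": "more capable systems can circumvent their control mechanisms",
        "R&D Automation": "more capable systems accelerate further AI research",
    },
)
perceived_alignment = Variable(
    name="Perceived Alignment",
    category="R&D",
    affects={"R&D Automation": "labs confident in alignment automate more of their R&D"},
)

for var in (ai_capabilities, perceived_alignment):
    for target, effect in var.affects.items():
        print(f"{var.name} -> {target}: {effect}")
```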

With these variables and their interactions in place, we can craft plausible scenarios:

The future will surely be a confusing combination of an innumerable number of scenarios such as these.

Crossroads and Cruxes

Scenarios may involve small details with far-reaching implications—things that could be called ‘crossroads’ or ‘cruxes’.

Consider these example scenarios:

If you consider it highly likely that a certain crossroads / crux will shape the future, you can target it to have greater impact. You could aim to present valuable information or advice to the committee in the first example, work on security at the leading AI labs to avoid exfiltration, or work on interpretability to discover scheming and misalignment hidden in the AI weights or internal processes.

I’m not saying these particular scenarios are necessarily very likely; they are just for illustration.

Say you want to contribute towards solving a large, complicated problem. You could tackle the central issues or contribute in some way that is helpful regardless of how the central problems are solved [LW · GW]. Find and work on a subproblem that occurs in most scenarios. Or alternatively, instead of solving the central parts or sub-problems, consider actions that improve the overall circumstances in most scenarios—e.g. providing valuable resources and information—such as forecasting!

Dealing with Uncertainty

I think it is quite likely that there will be autonomous self-replicating AI proliferating over the cloud at some point before 2030 (70% probability). But what would be the consequences? I could imagine that it barely affects the world; the AIs fail to gain significant amounts of resources due to fierce competition and generally avoid attracting attention, since they don’t want to be tracked and shut down. I could also imagine that there will be thousands or millions of digital entities, circulating freely and causing all types of problems—causing billions or trillions of USD in damage—and it’s basically impossible to shut them all down.

I know too little about the details. How easy is it for the AIs to gain resources? How hard is it to shut them down? How sophisticated could they be? I’ll have to investigate these things further. The 70% probability estimate largely reflects genuine randomness in future events. The same would not be true of any probability estimates I might make about the potential effects: those would mostly reflect my own uncertainty due to ignorance, not randomness in the world.

Or consider the consequences of AI-powered mass manipulation campaigns—I have no idea how easy it is to manipulate people! People are used to being bombarded with things that compete for their attention and try to influence their beliefs and behavior, but AIs open up a new space of persuasion opportunities. Could humans avoid manipulation by AI friends that are really nice and share all their interests? Again, my uncertainty doesn’t reflect randomness in the world, but a lack of understanding of how effective manipulation attempts may be.

Inevitably, when creating scenarios, there will be many things like this—things that you don’t know enough about yet. Perhaps no one does.

So, let’s separate these different forms of uncertainty (basically aleatoric and epistemic uncertainty). Ignorance about certain details should not deter us from constructing insightful scenarios. I may include proliferation in many scenarios but imagine vastly different consequences—and be clear about my uncertainty regarding such effects.
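Here is a minimal sketch of how that separation could look in a simulation: an epistemic parameter (how easily proliferating AIs acquire resources) is drawn from a wide prior reflecting my ignorance, while outcomes within each sampled world remain random. The prior and the damage mapping are made-up placeholders.

```python
import random

def simulate(n_worlds: int = 2_000, n_runs: int = 500, seed: int = 0) -> None:
    rng = random.Random(seed)
    per_world_estimates = []
    for _ in range(n_worlds):
        # Epistemic uncertainty: I don't know how easily rogue AIs gain resources,
        # so draw that parameter from a wide prior.
        ease = rng.uniform(0.0, 1.0)
        damage_prob = 0.05 + 0.6 * ease   # placeholder mapping from "ease" to damage risk
        # Aleatoric uncertainty: even within one world, outcomes are random.
        hits = sum(rng.random() < damage_prob for _ in range(n_runs))
        per_world_estimates.append(hits / n_runs)
    overall = sum(per_world_estimates) / n_worlds
    low, high = min(per_world_estimates), max(per_world_estimates)
    print(f"overall P(major damage): {overall:.2f}, range across worlds: {low:.2f}-{high:.2f}")

simulate()
```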

There are a few clear trends: better benchmark performance, steadily improving chips, and increasingly large training runs, for example. Unless there are major interruptions, you can expect these trends to continue, forming a backbone for possible scenarios. But even these trends will break at some point—maybe in a few years, maybe in a few weeks.
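As one way to use a trend as a scenario backbone while acknowledging that it may break, here is a small sketch that extrapolates a quantity with a fixed yearly probability of the trend ending. The growth rate, break probability, and the choice to hold the value flat after a break are all illustrative assumptions.

```python
import random

def extrapolate(start: float, yearly_growth: float, break_prob: float,
                years: int, seed: int = 0) -> list[float]:
    """Extrapolate a trend that has a fixed chance of breaking each year."""
    rng = random.Random(seed)
    values, value = [], start
    for _ in range(years):
        if rng.random() < break_prob:
            yearly_growth = 1.0   # one simple assumption: the quantity plateaus after a break
        value *= yearly_growth
        values.append(value)
    return values

# Placeholder numbers: a compute-like quantity growing ~4x per year,
# with a 15% chance per year that the trend breaks.
print(extrapolate(start=1.0, yearly_growth=4.0, break_prob=0.15, years=5))
```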

In some scenarios I've encountered, it's unclear which parts are well-founded and which are wild guesses—and I want to know! At times, I have had seemingly large disagreements with people that, upon closer inspection, came down to slightly different intuitions about uncertain details that neither party fully understood. We focused our attention on unimportant points, trying to disentangle disagreements that did not really exist.

I hope that by clearly formulating the reasoning behind these scenarios and identifying which parts are mostly guesses, we can avoid this pitfall and use scenario forecasting as a powerful tool for constructive debate.

Thank you for reading!

 

P.S. For updates on future posts, consider subscribing to Forecasting AI Futures!
