The Game Board has been Flipped: Now is a good time to rethink what you’re doing

post by Alex Lintz (alex-lintz) · 2025-01-28T23:36:18.106Z · LW · GW · 18 comments

Contents

  Introduction
  Implications of recent developments
    Updates toward short timelines
      Tentative implications:
    The Trump Presidency
      Tentative implications:
    The o1 paradigm
      Tentative implications:
    Deepseek
      Tentative implications:
    Stargate/AI data center spending
      Tentative implications:
    Increased internal deployment
      Tentative implications:
    Absence of AI x-risk/safety considerations in mainstream AI discourse
      Tentative implications:
  Implications for strategic priorities
    Broader implications for US-China competition
    What seems less likely to work?
    What should people concerned about AI safety do now?
    Acknowledgements

Cross-posted on the EA Forum here [EA · GW]

Introduction

Several developments over the past few months should cause you to re-evaluate what you are doing. These include:

  1. Updates toward short timelines
  2. The Trump presidency
  3. The o1 (inference-time compute scaling) paradigm
  4. Deepseek
  5. Stargate/AI datacenter spending
  6. Increased internal deployment
  7. Absence of AI x-risk/safety considerations in mainstream AI discourse

Taken together, these are enough to render many existing AI governance strategies obsolete (and probably some technical safety strategies too). There's a good chance we're entering crunch time and that should absolutely affect your theory of change and what you plan to work on.

In this piece I try to give a quick summary of these developments and think through their broader implications for AI safety. At the end of the piece I give some quick initial thoughts on how these developments affect what safety-concerned folks should be prioritizing. These are early days and I expect many of my takes will shift; I look forward to discussing in the comments!

Implications of recent developments

Updates toward short timelines

There’s general agreement that timelines are likely to be far shorter than most expected. Both Sam Altman and Dario Amodei have recently said they expect AGI within the next 3 years. Anecdotally, nearly everyone I know or have heard of who was expecting longer timelines has updated significantly toward short timelines (<5 years). E.g. Ajeya’s median estimate [LW(p) · GW(p)] is 99% automation of fully-remote jobs in roughly 6-8 years, 5+ years earlier than her 2023 estimate. On a quick look, prediction markets seem to have shifted to short timelines (e.g. Metaculus [1] and Manifold appear to have roughly 2030 median timelines to AGI, though they haven’t moved dramatically in recent months).

We’ve consistently seen performance on benchmarks far exceed what most predicted. Most recently, Epoch was surprised to see OpenAI’s o3 model achieve 25% on its FrontierMath benchmark (though there’s some controversy [LW · GW]). o3 also performed surprisingly well on coding. In many real-world domains we’re already seeing AI match top experts, and it seems poised to exceed them soon.

With AGI looking so close, it's worth remembering that capabilities are unlikely to stall around human level. We may see far more capable systems very soon (perhaps months, perhaps years) after we get systems that match or exceed humans in most important domains.

While nothing is certain, and there’s certainly potential for groupthink, I believe these bits of evidence should update us toward timelines being shorter.

Tentative implications:

The Trump Presidency

My sense is that many in the AI governance community were preparing for a business-as-usual case: they either implicitly expected another Democratic administration or built plans around one because it seemed more likely to deliver regulation of AI. It’s likely not enough to just tweak these strategies for the new administration; building policy for the Trump administration is a different ball game.

We still don't know whether the Trump administration will take AI risk seriously. In the first days of the administration, we've seen signs on both sides, with Trump pushing Stargate but also announcing that the US may levy tariffs of up to 100% on Taiwanese semiconductors. So far Elon Musk has apparently done little to push for action to mitigate AI x-risk (though it’s still possible and could be worth pursuing), and we have few, if any, allies close to the administration. That said, it’s still early, and there's nothing inherently partisan about preventing existential risk from AI (as opposed to, e.g., AI ethics), so I think there’s a reasonable chance we could convince Trump or other influential figures that these risks are worth taking seriously (e.g. Trump made promising comments about ASI recently and seemed concerned in his Logan Paul interview last year).

Tentative implications:

Important caveat: Democrats could still matter a lot if timelines aren’t extremely short or if we have years between AGI & ASI.[4] Dems are reasonably likely to take back control of the House in 2026 (70% odds), somewhat likely to win the presidency in 2028 (50% odds), and there's a possibility of a Democratic Senate (20% odds). That means the AI risk movement should still be careful about increasing polarization or alienating the Left. This is a tricky balance to strike and I’m not sure how to do it. Luckily, the community is not a monolith and, to some extent, some can pursue the long game while others pursue near-term change.

The o1 paradigm

Alongside scaling up training runs, it appears that inference compute will be key to attaining human-level AI and beyond. Compared to the previous paradigm, compute can be turned into capabilities much more directly, simply by running models for longer at inference time.
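To make the mechanism concrete, here is a minimal toy sketch (in Python) of one way inference compute can be traded for capability: drawing many samples and majority-voting over them (sometimes called self-consistency). The simulated model and its error rate are illustrative assumptions of mine, not a description of how o1 actually works.

```python
import random
from collections import Counter

def sample_answer(correct: int, p_correct: float = 0.4) -> int:
    """Simulate one model sample: right with probability p_correct,
    otherwise one of several plausible wrong answers (illustrative only)."""
    if random.random() < p_correct:
        return correct
    return correct + random.choice([-2, -1, 1, 2])

def majority_vote(correct: int, n_samples: int) -> int:
    """Spend more inference compute by drawing n samples and taking the
    most common answer (a simple self-consistency-style aggregation)."""
    votes = Counter(sample_answer(correct) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def accuracy(n_samples: int, trials: int = 2000) -> float:
    return sum(majority_vote(42, n_samples) == 42 for _ in range(trials)) / trials

if __name__ == "__main__":
    random.seed(0)
    for n in (1, 5, 25, 125):
        # More samples per question = more inference compute = higher accuracy,
        # with no change to the underlying model's weights.
        print(f"{n:>3} samples/question -> accuracy ~{accuracy(n):.2f}")
```

The printed accuracies climb steadily with the sample count even though the "model" never changes, which is the sense in which inference compute converts directly into capability.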

Tentative implications:

Deepseek

Deepseek is highly compute-efficient, and they’ve managed to replicate the o1 paradigm at far lower cost (though not as low as it initially seemed). It seems possible that merely scaling up their current approach could yield enormous further returns (though this is unclear).

Deepseek’s methods are, for the most part, open source. That means anyone with a solid base model can now build an impressive reasoner on top of it at barely any additional cost.

Tentative implications:

Stargate/AI data center spending

OpenAI and partners intend to invest $100 billion in 2025 and $500 billion over the coming 4 years.[6] Microsoft intends to spend $80 billion on building data centers this year, and other companies seem similarly keen to dump money into compute.

The US government has gotten increasingly involved in AI, and Sam Altman had a prominent place at Trump’s inauguration. So far, actual government involvement has mostly taken the form of helping companies get through the permitting process quickly (more detail here).

Tentative implications:

Increased internal deployment

This is more speculative, but I expect we’ll see less and less of what labs are producing and may have less access to the best models. I expect this due to a number of factors including:

Tentative implications:

Absence of AI x-risk/safety considerations in mainstream AI discourse

For a while after ChatGPT, it looked like AI risk would be a permanent part of the discourse going forward, largely thanks to efforts like the CAIS AI Extinction Letter getting high-profile signatories and news coverage. For the past year, though, AI x-risk concerns have not had much airtime in the major media cycles around AI. There haven't been big safety-oriented stories in mainstream outlets about recent AI events with strong implications for AGI timelines and existential risk (e.g. Deepseek, Stargate). A notable example of the AI safety community's inability to affect the media was the decisive loss of the media game during the OpenAI board drama.

That said, we do have more people writing directly about AI safety and governance issues across a variety of Substacks and on Twitter/X now. We’ve also got plenty of prominent people who could get into the news if we made a concerted effort (e.g. Yoshua Bengio, Geoff Hinton).

Tentative implications:

Implications for strategic priorities

Broader implications for US-China competition

Recent developments call into question any strategy built on the idea that the US will have a significant lead over China which it could use to, e.g., gain a decisive advantage or slow down and figure out safety. This is because:

Overall, the idea that the US could unilaterally win an AI race and impose constraints on other actors appears less likely now. I suspect this means an international agreement is far more important than we’d thought, though I'm not sure whether I think recent developments make that more or less likely. 

Note: The below takes are far more speculative and I have yet to dive into them in any depth. It still seems useful to give some rough thoughts on what I think looks better and worse given recent developments, but in the interest of getting this out quickly I’ll defer going into more detail until a later post.

What seems less likely to work?

What should people concerned about AI safety do now?

Acknowledgements

Many people commented on an earlier version of this post and were incredibly helpful for refining my views! Thanks especially to Trevor Levin and John Croxton, as well as several others who would rather not be named. Thanks also to everyone who came to a workshop I hosted on this topic!

  1. ^

     That market predicted roughly 2040 timelines until early 2023, then dropped down significantly to around 2033 average and is now down to 2030.

  2. ^

     I have an old write-up on this reasoning here which also talks about how to think about tradeoffs between short and long timelines.

  3. ^

     That said, given that things could shift dramatically in 2028 (and 2026 to some extent), it could be worth having part of the community still focus on the Left.

  4. ^

     E.g. Perhaps we get human-level research AIs in 2027 but don’t see anything truly transformative until 2029.

  5. ^

     See OpenAI’s Pro pricing plan of $200 per month. To the extent frontier models like o3 can be leveraged for alignment or governance work, it’s possible funders should subsidize their use. Another interesting implication is that, to the extent companies and individuals can pay more money to get smarter models/better answers, we could see increased stratification of capabilities which could increase rich-get-richer dynamics.

  6. ^

     Note that ‘intend’ is important here! They do not have the money lined up yet.

18 comments

Comments sorted by top scores.

comment by Vladimir_Nesov · 2025-01-29T02:30:47.081Z · LW(p) · GW(p)

Taken in isolation, DeepSeek-V3 looks like a 15x compute multiplier. But if a lot of it is data, the multiplier won't scale (when you need much more data, it necessarily becomes worse, or alternatively you need a teacher model that's already better). In any case, this raises the ceiling for what 5 GW training systems can do (at which point there's either almost-AGI or scaling slows down a lot). And there the 15x multiplier of DeepSeek-V3 (or what remains of it after scaling) needs to be compared with the algorithmic advancements of 2025-2028, which would've included most of the things in DeepSeek-V3 anyway, so the counterfactual impact is small.

Replies from: ryan_greenblatt
comment by ryan_greenblatt · 2025-01-29T17:15:19.012Z · LW(p) · GW(p)

15x compute multiplier relative to what? See also here.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2025-01-29T17:46:12.632Z · LW(p) · GW(p)

Relative to GPT-4o, which was trained at a time when 30K H100s clusters were around, and so in BF16 could be expected to be around 8e25 FLOPs, possibly overtrained to a degree that's not too different from DeepSeek-V3 itself.
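As a rough sanity check on that 8e25 figure, under illustrative assumptions of my own (not numbers from the comment): roughly $10^{15}$ dense BF16 FLOP/s per H100, ~40% utilization, and a roughly three-month run give

$$30{,}000 \times 10^{15}\ \tfrac{\text{FLOP}}{\text{s}} \times 0.4 \times \left(90 \times 86{,}400\ \text{s}\right) \approx 9 \times 10^{25}\ \text{FLOP},$$

which is in the same ballpark; a somewhat shorter run or lower utilization brings it closer to 8e25.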

Amodei's post you linked says a few tens of millions of dollars for Claude 3.5 Sonnet, which is maybe 4e25 FLOPs in BF16, but I think Claude 3.5 Sonnet is better than DeepSeek-V3, which is not as clearly the case for GPT-4o and DeepSeek-V3, making them easier to compare. Being better than GPT-4o at 2x fewer FLOPs, Claude 3.5 Sonnet has at least a 4x compute multiplier over it (under all the assumptions), but not necessarily more, while with DeepSeek-V3 there's evidence for more. As DeepSeek-V3 was trained more than half a year later than Claude 3.5 Sonnet, it's somewhat "on trend" in getting a compute multiplier of 16x instead of 4x, if we anchor to Amodei's claim of 4x per year.

comment by yo-cuddles · 2025-01-30T00:49:33.017Z · LW(p) · GW(p)

I'm kinda opposite on the timelines thing? This is probably a timeline delayer even if I thought LLMs scaled to AGI, which I don't, but let's play along.

If a pharma company could look at another company's product, copy it, and release it for free with no consequences, but the copied product could only be marginally improved without massive investment, what would that do to the landscape?

It kills the entire industry. This HURTS anyone trying to fundraise: reckless billions will have a harder time finding their way into the hands of developers, because many investors will not be happy with the possibility (already demonstrated at least twice) that someone could just read the outputs of your API-available model, eat your lunch, and release the result so that people have a less restricted, more customizable, and cheaper alternative they can run on their own hardware. Expanding that view, how many services will host this model for cheaper than OpenAI will host GPT?

Want proof? OpenAI has problems turning a profit on its services, yet has effectively cut prices (or otherwise given away more for less money) twice since DeepSeek came out. Is OpenAI so grateful for the free research DeepSeek produced that it's worth (probably) billions of dollars in lost revenue, added cost, and thinner investment?

Being more speculative, the way models have converged on being basically interchangeable should be a red flag that real growth has plateaued. Goods competing mostly on price is a sign of uniformity in quality, that they're mostly interchangeable.

Real model growth seems to have been discarded in favor of finding ways to make a model stare at a problem for hours at a time and come up with answers that are... maybe human-passable if the problem is easy and the model is accustomed to it? All it'll cost is a million dollars per run. If that sounds like brute-forcing the problem, it's because it is.

Where is the real advancement? The only real advancement is inference-time scaling, and it doesn't look like this last reach has gotten us close to AGI. The reasoning models are less flexible, not more, which is the opposite of what you would expect if they were actually reasoning; the best case is that the reasoning is an excuse to summon a magic token or remove a toxic one.

Am I crazy? Why would this accelerate your timeline?

comment by TsviBT · 2025-01-29T08:37:05.643Z · LW(p) · GW(p)

nearly everyone I know or have heard of who was expecting longer timelines has updated significantly toward short timelines (<5 years).


You're in an echo chamber. They don't have very good reasons for thinking this. https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce [LW · GW]

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2025-01-29T11:54:57.824Z · LW(p) · GW(p)

Your definition of AGI is "that which completely ends the game", source in your link. By that definition I agree with you. By others' definition (which is similar but doesn't rely on the game over clause) I do not.

My timelines have gotten slightly longer since 2020: I was expecting TAI when we got GPT-4, and I have recently gone back and discovered I have chat logs showing I'd been expecting that for years and had specific reasons. I would propose Daniel K. as a particularly good reference.

Replies from: TsviBT, lahwran
comment by TsviBT · 2025-01-29T12:09:50.049Z · LW(p) · GW(p)

I also dispute that genuine HLMI refers to something meaningfully different from my definition. I think people are replacing HLMI with "thing that can do all stereotyped, clear-feedback, short-feedback tasks", and then also claiming that this thing can replace many human workers (probably true of 5 or 10 million, false of 500 million) or cause a bunch of unemployment by making many people 5x effective (maybe, IDK), and at that point IDK why we're talking about this, when X-risk is the important thing.

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2025-01-29T12:21:00.697Z · LW(p) · GW(p)

When predicting timelines, it matters which benchmark in the compounding returns curve you pick. Your definition minus doom happens earlier, even if the minus-doom version is too late to avert in literally all worlds (I doubt that; it's more likely that the most powerful humans' Elo against AIs falls and falls but takes a while to become indistinguishable from zero).

Replies from: TsviBT
comment by TsviBT · 2025-01-29T22:40:59.631Z · LW(p) · GW(p)

You referred to "others' definition (which is similar but doesn't rely on the game over clause)", and I'm saying no, it's not relevantly similar, and it's not just my definition minus doom.

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2025-01-30T01:31:38.122Z · LW(p) · GW(p)

By "AGI" I mean the thing that has very large effects on the world (e.g., it kills everyone) via the same sort of route that humanity has large effects on the world. The route is where you figure out how to figure stuff out, and you figure a lot of stuff out using your figure-outers, and then the stuff you figured out says how to make powerful artifacts that move many atoms into very specific arrangements.

delete "it kills everyone", that's a reasonable definition. "it kills everyone" is indeed a likely consequence a ways downstream, but I don't think it's a likely major action of an early AGI, with the current trajectory of levels of alignment (ie, very weak alignment, very not robust, not goal aligned, certainly not likely to be recursively aligned such that it keeps pointing qualitatively towards good things for humans for more than a few minutes after AIs in charge, but not inclined to accumulate power hard like an instant wipeout. but hey, also, maybe an AI will see this, and go, like, hey actually we really value humans being around, so let's plan trajectories that let them keep up with AIs rather than disempowering them. then it'd depend on how our word meanings are structured relative to each other).

we already have AI that does every qualitative kind of thing you say AIs qualitatively can't do; you're just somehow immune to realizing that for each thing, yes, that'll scale too, modulo some tweaks to get the things to not break when you scale them. requiring the benchmarks to be when the hardest things are solved indicates that you're not generalizing from small to large in a way that allows forecasting from research progress. I don't understand why you don't find this obvious by, e.g., simply reading the paper lists of major labs and skimming a few papers to see what their details are. I tried to explain it in DM and you dismissed the evidence, yet again, same as MIRI folks always have. This was all obvious literally 10 years ago; nothing significant has changed, everything is on the obvious trajectory you get if intelligence is simple, easy, and compute-bound. https://www.lesswrong.com/posts/9Yc7Pp7szcjPgPsjf/the-brain-as-a-universal-learning-machine [LW · GW]

Replies from: TsviBT, TsviBT, TsviBT
comment by TsviBT · 2025-01-30T02:22:14.153Z · LW(p) · GW(p)

we already have AI that does every qualitative kind of thing you say AIs qualitatively can't do

As I mentioned, my response is here https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#_We_just_need_X__intuitions: [LW · GW]

just because an idea is, at a high level, some kind of X, doesn't mean the idea is anything like the fully-fledged, generally applicable version of X that one imagines when describing X

I haven't heard a response / counterargument to this yet, and many people keep making this logic mistake, including AFAICT you.

comment by TsviBT · 2025-01-30T02:19:52.585Z · LW(p) · GW(p)

requiring the benchmarks to be when the hardest things are solved

My definition is better than yours, and you're too triggered or something to think about it for 2 minutes and understand what I'm saying. I'm not saying "it's not AGI until it kills us", I'm saying "the simplest way to tell that something is an AGI is that it kills us; now, AGI is whatever that thing is, and could exist some time before it kills us".

comment by TsviBT · 2025-01-30T02:16:20.099Z · LW(p) · GW(p)

I tried to explain it in DM and you dismissed the evidence,

What do you mean? According to me we barely started the conversation, you didn't present evidence, I tried to explain that to you, we made a bit of progress on that, and then you ended the conversation.

comment by the gears to ascension (lahwran) · 2025-01-30T01:27:35.443Z · LW(p) · GW(p)

@daniel k I just can never remember your last name's spelling, sorry, heh. My point in saying this is that my prediction approach up to 2020 was similar to, though not as refined as, yours, and that instead of trying to argue my views (which differ from yours in a few trivial ways that are mostly not relevant) I'd rather just point people to arguments of yours.

comment by Lukas Finnveden (Lanrian) · 2025-01-30T00:17:42.586Z · LW(p) · GW(p)

if the trend toward long periods of internal-only deployment continues

Have we seen such a trend so far? I would have thought the trend to date was neutral or towards shorter periods of internal-only deployment.

Tbc, not really objecting to your list of reasons [LW · GW] why this might change in the future. One thing I'd add to it is that even if calendar-time deployment delays don't change, the gap in capabilities inside vs. outside AI companies could increase a lot if AI speeds up the pace of AI progress.

ETA: Dario Amodei says "Sonnet's training was conducted 9-12 months ago". He doesn't really clarify whether he's talking about the "old" or "new" 3.5. Old and new Sonnet were released in mid-June and mid-October, so 7 and 3 months ago respectively. Combining the 3 vs. 7 month options with the 9-12 month range implies 2, 5, 6, or 9 months of keeping it internal. I think for GPT-4, pretraining ended in August and it was released in March, so that's 7 months from pre-training to release. So that's probably on the slower side of the Claude possibilities if Dario was talking about pre-training ending 9-12 months ago, but probably faster than Claude if Dario was talking about post-training finishing that early.

comment by RussellThor · 2025-01-29T02:16:48.877Z · LW(p) · GW(p)

In terms of specific actions that don't require government, I would be positive about an agreement between all the leading labs that when one of them made an AI (AGI+) capable of automated self-improvement, they would all commit to sharing it between them and allowing 1 year where they did not hit the self-improve button, but instead put that time towards alignment. 12 months may not sound like a lot, but if research is sped up 2-10x because of such AI then it would matter. In terms of single potentially achievable actions that would help, that seems the best to me.

Replies from: rhollerith_dot_com
comment by RHollerith (rhollerith_dot_com) · 2025-01-30T00:39:31.059Z · LW(p) · GW(p)

Your specific action places most of its hope for human survival on the entities that have done the most to increase extinction risk.

Replies from: RussellThor
comment by RussellThor · 2025-01-30T01:58:06.422Z · LW(p) · GW(p)

That's not a valid criticism if we are simply choosing one action to reduce X-risk. Consider, for example, the Cold War: the guys with nukes did the most to endanger humanity, yet it was most important that they cooperated to reduce the risk.