Reflections on the state of the race to superintelligence, February 2025
post by Mitchell_Porter · 2025-02-23T13:58:07.663Z · 6 comments
My model of the situation is that some time last year, the frontier paradigm moved from "scaling up large language models" to "scaling up chain-of-thought models". People are still inventing new architectures, e.g. Google's Titans, or LeCun's energy-based models. But it's conceivable that inference scaling really is the final paradigm prior to superintelligence. If it can produce von Neumann-level intelligence, then that should be the end, right? The end in the sense that control passes out of human hands, unless a few humans are embedded in the self-transformations of these genius-level AIs, and those embedded humans become something more than human, or other than human, quickly enough.
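To make "inference scaling" a little more concrete, here is a minimal sketch of one simple form of it, best-of-N sampling over chains of thought. This is an illustration only, not what the frontier labs' reasoning models actually do internally; the generate_cot and score functions are placeholders I am assuming.

```python
import random

def generate_cot(prompt: str, temperature: float = 0.8) -> str:
    """Placeholder: sample one chain-of-thought answer from a base model.
    In a real system this would call a language model."""
    return f"reasoning trace for {prompt!r} (seed={random.random():.3f})"

def score(prompt: str, answer: str) -> float:
    """Placeholder: a verifier or reward model scoring a candidate answer."""
    return random.random()

def best_of_n(prompt: str, n: int) -> str:
    """Spend more inference compute by sampling n chains of thought
    and keeping the highest-scoring one."""
    candidates = [generate_cot(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score(prompt, ans))

# Larger n means more inference compute spent per query, which (with a real
# model and verifier) tends to buy better answers - the basic trade that
# the inference-scaling paradigm exploits.
print(best_of_n("Prove that sqrt(2) is irrational.", n=8))
```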
This leads to a concrete question: where are the new frontier models, the powerful chain-of-thought AIs, being produced? Whoever is producing them is also a contender to produce the first superintelligence. My list of known or suspected organizations consists of five in the USA, one in China, and one in Israel. Of course there may be more.
At the head of my list is Elon Musk's xAI. It's at the head of the list, not because I believe that Grok 3 is "the world's smartest AI", but because of its political advantages in Trump 2.0 America. xAI is part of the Musk family of companies, and these are now more deeply integrated into US government activities than ever before. There is a serious possibility that superintelligence will be "born" commanding all the resources of Musk's unprecedented empire, including social media, rocketry, robotics, brain-computer interfaces, and US government information systems.
Intellectually, Trump 2.0 means the ascendancy of a large number of ideas from outside the institutional consensus of the liberal establishment (which includes, at a minimum, the universities, the mainstream media, and all parts of the government): peace with Putin's Russia, RFK Jr.'s ideas about health, downsizing the government, reversal of DEI policies, and probably more to come. In the relatively esoteric area of frontier AI policy, it is as if e/acc replaced effective altruism as the implicit zeitgeist. Concretely, that seems to mean switching from regulation of AI and international treaties to deregulation and a competitive race with China.
Again, xAI has a bit of an advantage here, because Musk made X-Twitter friendly to these outsider ideas in advance of Trump 2.0's right-wing populist revolution seizing institutional power. On the other hand, the other American AI companies were comfortable with Biden-era liberal progressivism, and have had to reorient themselves to the new order. I'll get to that in a moment, but first I'll address the situation of the second company on my list, OpenAI, which has an extra problem in addition to the change in political paradigm.
Sam Altman is doing what he can to keep up - he wasn't at the inauguration, but literally the next day he was co-hosting the launch of the Stargate project with Trump. Altman's real problem is his beef with Elon Musk, who evidently wants to stop or suborn OpenAI, and has a lot of resources with which to do that. A year ago I would have said that OpenAI can draw on the resources of Microsoft to defend itself (since the alliance with Microsoft is what allowed Altman to hang on as CEO in November 2023), but I'm not up to date on OpenAI's partnerships; I think I heard of some partnership with Amazon too, for example.
Politics aside, I think OpenAI may be the technical leader, with GPT-5 perhaps coming later this year and incorporating the best of o1 and o3.
Third and fourth on my list are Anthropic (with Claude) and Google (with Gemini). Politically, I see them as pragmatically going with the flow - I think both Dario Amodei and Demis Hassabis have paid lip service to the idea that democracy must get to superintelligence before authoritarianism does, while also reminding us that superintelligence in itself is dangerous if unaligned.
Technically, Anthropic might be the leader in AI safety, since they took on Jan Leike and the rest of OpenAI's superalignment team, while Google is a ubiquitous behemoth with vast resources, the IBM of the Internet era, and has both the advantages and disadvantages that come with this.
Fifth on my list is a hypothetical organization: The Project, Leopold Aschenbrenner's name for an American Manhattan Project aiming to create superintelligence. We don't know that it exists; we only have speculation about AI researchers who quit in order to pursue unknown exciting opportunities. If it does exist, it is also possible that it exists under the umbrella of one or more of the companies already in the list.
That's my list for America. Now, on to my final two contenders: DeepSeek in China, and Ilya Sutskever's opaque Safe Superintelligence Inc., which is divided between Palo Alto and Tel Aviv. I won't speculate about their political context except to note that they exist outside or half-outside the American scene.
Given this situation, in which there are at least six separate research centers with a chance of creating superintelligence, and dozens, possibly hundreds, more worldwide that are either attempting it or want to attempt it; and given that I am not part of any of those research centers; my strategy for trying to increase the chance of a human-friendly outcome is to contribute to the public discussion of how to make autonomous superintelligence "ethical" or "human-friendly" or "superaligned". That public discussion can in principle be noticed by any of the participants in the AI race, and if it contains good, valid ideas, they just might take note and implement them. (For example, at the moment I'm interested in what happens if you combine Joshua Clymer's new thoughts on safely outsourcing AI alignment tasks to AI [LW · GW] with June Ku's old CEV-like proposal at MetaEthical.AI - does it take us far beyond Eliezer's own thoughts on "Interim Friendliness" from the early 2000s?)
It's true that an unknown number of unsolved questions remain in the theory and practice of safe superintelligence. The situation we've arrived at, in which the risks inherent in the creation of superintelligence are barely publicly acknowledged by the protagonists of the race, is far from desirable. It would be best to cross that threshold only once you really know what you're doing; but that is not the attitude that has prevailed.
However, I don't consider it impossible that the theory will actually be figured out. Knowledge is distributed highly unequally in the world. Individuals and groups with extreme expertise do exist; and humanity plausibly has the foundations needed to figure out the theory of safe superintelligence. Our best experts do have an advanced understanding of quite a lot of physics, mathematics, and computation; and the people making the frontier models do at least know what recipes they are using to create and manage their AIs. Topics like consciousness and (meta)philosophy are a bit more problematic, but we do have a lot of ideas and data to work with.
And finally, the source of our peril - the rapid climb in AI capabilities towards superintelligence - also means that all kinds of hard problems may be solved at unprecedented speed, even before we cross the threshold. So I choose to stay engaged, add my thoughts to the greater flow, and hope that somewhere, the problems that need to be solved will actually be solved.
6 comments
Comments sorted by top scores.
comment by Vladimir_Nesov · 2025-02-23T17:27:33.838Z
AGI is a technical milestone, so I don't see what you are gesturing at by using vibes as arguments about AI company advantages. I think a 100x advantage in compute remains crucial, a 10x advantage matters as much as technical brilliance, and a 3x advantage doesn't matter. Better chips are also not strictly needed for larger-scale training, because critical batch size scales fast enough for LLMs; they are merely cheaper (though by multiple times, in both cost and power). Good chips do help enormously with inference.
So the issue for SSI/DeepSeek/Mistral is that they plausibly remain 10x behind in compute by 2026, while Google retains a compute advantage even though it has not so far managed to make the most capable model (rather than the cheapest for its capabilities).
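To illustrate why multipliers like 3x, 10x, and 100x might matter so differently, here is a rough back-of-the-envelope sketch, assuming a Chinchilla-style power law in which reducible loss falls as compute to the power of roughly 0.15. The functional form and exponent are my own illustrative assumptions, not figures from the comment.

```python
# Rough illustration: what a compute multiplier buys under a Chinchilla-style
# power law, L(C) = L_irreducible + k * C**(-ALPHA).
# ALPHA ~ 0.15 is an assumed ballpark for compute-optimal training, not a
# measured value for any particular lab's models.
ALPHA = 0.15

def reducible_loss_ratio(compute_multiplier: float) -> float:
    """Fraction of the baseline reducible loss that remains after
    scaling training compute by `compute_multiplier`."""
    return compute_multiplier ** (-ALPHA)

for mult in (3, 10, 100):
    remaining = reducible_loss_ratio(mult)
    print(f"{mult:>3}x compute -> reducible loss shrinks to {remaining:.2f} of baseline")

# Under these assumptions, 3x compute only shaves ~15% off the reducible loss,
# while 100x roughly halves it - one way to see why a small compute lead can be
# offset by engineering skill while a very large one cannot.
```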
↑ comment by anaguma · 2025-02-23T17:38:24.802Z
How large of an advantage do you think OA gets relative to its competitors from Stargate?
↑ comment by Vladimir_Nesov · 2025-02-23T18:01:22.396Z
With Stargate, there is so far only the Abilene site and a relatively concrete prospect of maybe $40bn, enough to build a 1 GW Blackwell training system (4e27 FLOPs models) in 2025-2026, the same scale as was announced by Musk [LW(p) · GW(p)] this week. Anthropic's compute for 2026 remains opaque ("a million of some kind of chip" [LW(p) · GW(p)]). Google probably has the most in principle, but with unclear willingness to spend. Meta didn't say anything to indicate that its Richland Parish site will see 1 GW of Blackwells in 2025-2026; it remains a vague 2 GW by 2030 thing.
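As a sanity check on the "1 GW of Blackwells, roughly 4e27 FLOPs" figure, here is a rough arithmetic sketch. The per-chip power, per-chip throughput, utilization, and run length below are my own illustrative assumptions, not numbers from the comment or from any vendor.

```python
# Back-of-the-envelope: training FLOPs available from a 1 GW accelerator cluster.
# All constants below are illustrative assumptions.
site_power_w     = 1e9      # 1 GW of total site power
power_per_chip_w = 2_000    # rough all-in draw per accelerator, incl. cooling/networking
flops_per_chip   = 2.5e15   # rough dense throughput per chip, FLOP/s
utilization      = 0.4      # fraction of peak actually achieved during training
run_seconds      = 1e7      # roughly four months of training

chips = site_power_w / power_per_chip_w
total_flops = chips * flops_per_chip * utilization * run_seconds
print(f"{chips:.0f} chips, ~{total_flops:.1e} FLOPs")  # ~5e27, the same ballpark as 4e27
```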
comment by teradimich · 2025-02-23T14:37:07.453Z
There doesn't seem to be a consensus that ASI will be created in the next 5-10 years. This means that current technology leaders and their promises may be forgotten.
Does anyone else remember Ben Goertzel and Novamente? Or Hugo de Garis?
↑ comment by AnthonyC · 2025-02-23T16:09:57.237Z
True, that can definitely happen, but consider:
1) The median and average timeline estimates have been getting shorter, not longer, by most measures.
2) No previous iteration of such claims was credible enough to attract hundreds of billions of dollars in funding, or meaningfully impact politics and geopolitics, or shift the global near-consensus that has held back nuclear power for generations. This suggests a difference in the strength of evidence for the claims in question.
3) When adopted as a general principle of thought, this approach to reasoning about highly impactful emerging technologies is right in almost every case, except the ones that matter. There were many light bulbs before Edison, and many steels before Bessemer, but those things happened anyway, and each previous failure made the next attempt more likely to succeed, not less.
↑ comment by Gordon Seidoh Worley (gworley) · 2025-02-23T16:00:50.349Z
While history suggests we should be skeptical, current AI models produce real results of economic value, not just interesting demos. This suggests that we should take more seriously the possibility that they will produce TAI, since they are more clearly on that path and are already having significant transformative effects on the world.