AI Strategy Updates that You Should Make
post by Alice Blair (Diatom) · 2025-01-27T21:10:41.838Z · LW · GW
or: "Things That I Keep Telling AI People Around Me, So I'm Just Writing A Big Post About Them." I think it's really valuable for people to think about AI strategy, even if that's not their main job, since it is useful for informing what their main job should be.
There will be a later post titled something like "Emotional Updates on AI that You Should Make". This is not that. Things are changing fast enough that I want to get one or the other out quickly, and besides, emotion should follow belief, so this one is coming first.
[Epistemic status: I expect most of the things I predict here to turn out correctly since most of the reasoning feels pretty straightforward, and I mark especially speculative things as such.]
A good thing to keep in mind: if you can predict that future-you will make a certain update, just make it now instead.
Models
o3
If you've made it to this post, you've probably heard about o3. Nonetheless, here's the summary. Besides being particularly good at math and [whatever-it-is that ARC-AGI measures], o3's coding benchmark results (including a higher Codeforces Elo than all but one OpenAI employee) indicate that we're solidly approaching the beginning of Recursive Self-Improvement. It's very expensive to run at this scale, but it is clearly going to get cheaper (see Updates from Models).
Deepseek-v3
Reportedly, it took $5.5 million to train an LLM that performs on par with the ~May 2024 frontier. Anecdotally, it seems worse than Claude 3.6 Sonnet, but not by a huge amount: it feels like it's in a similar tier without quite reaching the same level, and the benchmarks come out pretty similar to 3.6's. It's also very cheap to run compared to other models at this level, in large part due to its unusually small size for a frontier model; the rest is due to hardware optimizations that Deepseek performed. Notably, it's open source.
(EDIT: a reader pointed out that the total costs were likely much higher than this, accounting for wages, R&D, etc. This $5.5 million number is somewhat misleading in some respects. Also, a previous version of the above paragraph evaluated Deepseek-v3 and Claude 3.6 Sonnet as being at similar levels of power, which I realized was an oversimplification.)
Deepseek-R1
Just as GPT-4o became o1 with RL on chains of thought, Deepseek-v3 became Deepseek-R1, a reasoning model which is, by Zvi's estimate, 10x cheaper than o1 for similar-quality outputs. Further, you can see the chain of thought, which is both very weird and often useful: being able to see the reasoning behind a claim makes it much more digestible, and makes it much easier to detect and recover from the model's mistakes.
Distillations
Further, Deepseek was able to perform knowledge distillation from R1 to various smaller models, getting very impressive results (page 14 of the paper).
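For readers unfamiliar with the term: the R1 paper reportedly distills by fine-tuning smaller models on R1-generated outputs rather than by matching logits, but the classic logit-matching formulation (in the style of Hinton et al.) conveys the core idea of a small student imitating a large teacher. Below is a minimal pure-Python sketch with hypothetical toy logits, purely for illustration:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among "wrong" answers ("dark knowledge").
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over the softened distributions. In practice
    # this term is mixed with ordinary cross-entropy on ground-truth labels,
    # and the gradient only flows into the student.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student already matches the teacher and grows as their predictive distributions diverge; training the student to minimize it transfers much of the teacher's behavior at a fraction of the inference cost.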
Updates from Models
As per usual, go out and get your utility from these models. They're powerful and only getting more so. People often under-update on this fact, fail to think of creative ways to offload more and more cognitive tasks to LLMs, and get less utility because of it.
These results show us a few different things about strategy, as well:
First, however fast you think OpenAI is scaling based off of the o3 news, that's an underestimate. Just make the update now so you don't have to do it later. Why? Most of the US AGI companies have a massive compute overhang now, because of Deepseek. Even though OpenAI has o3, which outperforms anything Deepseek is doing, 4o (the likely base model under o3) is nowhere near as efficient as Deepseek-v3 or the R1 distillations, so there's a lot of room to take advantage of the new efficiency and get higher performance without needing much more compute or money. If you put OpenAI's budget into Deepseek's methods (their particular style of MoE, RL on CoT, then knowledge distillation), you're going to get improvement out the other end, and probably a sizable improvement without much additional effort beyond what Deepseek already did.
Second is that open-weight models are getting a lot more serious. Anything that's out in the open is out forever, and so is the jailbroken version of it, since our current "alignment techniques" are so easy to undo. The open weight frontier will never be any worse than R1, and it can only build on that. If Chinese AI companies are trying to catch up to the US frontier, then a plausibly really good way to do it is to get people really excited about some open frontier models and then take advantage of the open source community to accelerate them further. Epoch shows that closed models have been losing their lead over open models for some time now, and Deepseek-v3 and R1 only further this trend.
Third is that, if you are still working under a strategy that assumes we don't enter a recursive self-improvement+AI R&D automation loop, it's a bit too late. Either have a good reason why that loop will stop despite all the massive monetary incentives for it to keep going, or ditch that plan.
Fourth, if your strategy relies on not giving China or any US adversaries enough compute to make frontier models, ditch that too.
Geopolitics
The EO
Trump repealed the Biden AI EO. As I understand it, this action doesn't dismantle the US AISI, but it does remove the mandate for the AISI to do much useful stuff. I don't find it maximally clear where the repeal is coming from, but most of my probability mass is on it being an anti-safety move for the purpose of further accelerating US AI progress.
Stargate
People keep calling this the AGI Manhattan Project. It is not the AGI Manhattan Project. It is an announcement of tech people investing a lot in AI, and Donald Trump happened to be there. The current plan is to invest $100 billion into AI infrastructure immediately, and $500 billion over the next four years. At present, there is no (announced) USG funding going into it. Manifold thinks that the likely monetary output is very bimodal, either doing much less than promised or doing much more than promised.
While this market is currently pretty low-volume, I generally agree with the distribution at the time of writing.
Chinese Stargate?
Two days after the Stargate announcement, the Bank of China released this. Translation[1] (by Claude 3.6 Sonnet). The gist is that Bank of China is saying they'll invest $140B in AI infrastructure over the next five years.
Updates from News
Although Stargate is not a Manhattan Project, it very well might become one; the US has a lot of reasons not to want random companies making superintelligence. Leopold put it well: "Imagine if we had developed atomic bombs by letting Uber just improvise." If ever there were a nice convenient organization that the US government would like to absorb or partially absorb, this will probably be one of them, given that they are poised to Do The Thing and Make The AGI Happen. This seems especially likely because the announcement being made alongside Trump implies some level of talk and coordination between tech and governance. But Trump's words and outward signals are not historically well correlated with his actions, so it's hard to use them as more than weak evidence about his strategy.
Still, it seems pretty clear that the government is treating the world like we're in for an arms race that we must win in order to bring about the so-called "golden age".
In both the Stargate and Bank of China announcements, it's not exactly clear what concretely, definitely happens next, other than "lots of money goes into [AI Infrastructure]". I predict that this is only the beginning of things ramping up between the US and China. I also predict that we might see other frontier AI labs popping up in China, since right now it's basically just Deepseek at the frontier, and there's about to be a lot more funding to go around (if the Bank of China memo is to be believed).
If you're working under a strategy that relies on the US and China not entering a full-on AI arms race, you'd better have a reason why this currently-building arms race is just going to stop, otherwise ditch the strategy. If you're working under a strategy that relies on the default position of the US government being cautious about ASI by default, ditch that strategy.
Macro-Scale Strategy Updates
China
If China wakes up to the possibility of an AGI race (which, judging from the Bank of China source, it seems to be doing), then at some point their frontier models must be closed, especially if they gain a capabilities lead. It's just not strategically sound to hand AGI weights to the party you're in an AGI arms race with (and to everyone else in the world). Therefore, conditional on an arms race between China and the US, it makes sense to predict that Chinese models will stop being open source around when the Chinese frontier meets the US frontier, breaking the strong open-sourcing pattern we're seeing now. If you're working under a strategy that assumes the US maintains a lead over China in the AGI race and never loses it, seriously question that assumption, given recent Chinese acceleration.
I'm really interested in writing a post sometime soon about what can be gained from predicting this pattern violation in advance.
US
Before the Trump presidency began, I was hearing a lot of ambiguity about what it meant for AI strategy. People don't quite know how to predict his actions given his words, nor even how to predict his words. Now, we know that we're probably in the arms race world (and have been able to see it coming for a little while). We know that we're probably in a world where some sort of AGI company nationalization happens, in both the US and China.
Conclusion
Do not become attached to strategies you may not want. Don't let your plans be stuck in the past.
[1] 1 Trillion Yuan! Providing Special Comprehensive Financial Support to Aid AI Industry Chain Development
According to the "Bank of China's Action Plan to Support AI Industry Chain Development," Bank of China plans to provide specialized comprehensive financial support totaling no less than 1 trillion yuan to various entities across the AI industry chain over the next five years. This includes no less than 300 billion yuan in combined equity and debt financing, along with establishing specialized institutional safeguards aligned with AI technological innovation to serve the financial needs across all links in the industry chain.
Empowering National Scientific and Technological Self-Reliance As the first bank to establish a support mechanism for major scientific and technical projects, Bank of China has launched "1+1+N" full-cycle services. With support from relevant ministries, Bank of China has established direct cooperation with major AI technology projects, providing "one-stop" customized financial services covering "basic research-achievement transformation-industrial application" for "N" companies involved in innovative technology, continuously improving the scientific and technological financial system mechanism that matches technological innovation.
Serving AI Factor Supply Bank of China fully leverages its comprehensive features to strengthen the computing power and data supply foundation for the AI industry. Through diversified financial tools including equity, loans, bonds, insurance, and leasing, it empowers the development of intelligent computing infrastructure. Focusing on national computing hub node planning, it supports the construction of intelligent computing centers and supporting facilities and park infrastructure. It provides financial guarantees such as property insurance and comprehensive insurance for first (set) major technical equipment to enhance enterprise risk control capabilities.
Boosting AI Technical Innovation Bank of China provides differentiated financial services throughout the lifecycle for technical innovation enterprises in models and algorithms. It has created the "BOC M&A+" one-stop service system to promote the integration and upgrade of AI technology with industrial resources. By integrating BOC Group's AIC equity investment fund and the domestic and international investment banking advantages of BOC International and BOC Securities, it establishes an integrated "equity+commercial banking+investment banking" service system to help key core technology enterprises access capital market financing channels and cultivate industry chain "unicorns" and listed companies.
Promoting AI Scenario Applications Bank of China increases support for AI technology demonstration applications. It opens up and creates intelligent marketing, intelligent operations, and intelligent risk control application scenarios. It builds upstream and downstream industry chain connection platforms, creating differentiated supply chain financial service solutions for different scenarios. It supports the growth of new tracks such as "AI+robotics," "AI+low-altitude economy," "AI+biomanufacturing," and "AI+new materials" to cultivate new development momentum. Leveraging Bank of China's global operating advantages, it provides professional cross-border financial support for AI enterprises' "going out" and "bringing in" through the "single point access, global response" platform.
Bank of China will use the AI industry chain service as a pilot to construct a comprehensive, multi-level financial service system, continuously creating new paradigms in technology finance, fully supporting key core technologies, serving the development of the entire AI industry chain, and promoting high-level circulation of "technology-industry-finance" to contribute sustained financial momentum for building a modern industrial system and promoting high-quality development.
2 comments
comment by momom2 (amaury-lorin) · 2025-01-27T22:11:35.996Z · LW(p) · GW(p)
Thanks for writing this! I was unaware of the Chinese investment, which explains another recent piece of news that you did not include but I think is significant: Nvidia's stock plummeted 18% today.
↑ comment by Alice Blair (Diatom) · 2025-01-27T23:01:27.480Z · LW(p) · GW(p)
I saw that news as I was polishing up a final draft of this post. I don't think it's terribly relevant to AI safety strategy, I think it's just an instance of the market making a series of mistakes in understanding how AI capabilities work. I won't get into why I think this is such a layered mistake here, but it's another reminder that the world generally has no idea what's coming in AI. If you think that there's something interesting to be gleaned from this mistake, write a post about it! Very plausibly, nobody else will.