Dario Amodei: On DeepSeek and Export Controls
post by Zach Stein-Perlman · 2025-01-29T17:15:18.986Z
This is a link post for https://darioamodei.com/on-deepseek-and-export-controls
Dario corrects misconceptions and endorses export controls.
A few weeks ago I made the case for stronger US export controls on chips to China. Since then DeepSeek, a Chinese AI company, has managed to — at least in some respects — come close to the performance of US frontier AI models at lower cost.
Here, I won't focus on whether DeepSeek is or isn't a threat to US AI companies like Anthropic (although I do believe many of the claims about their threat to US AI leadership are greatly overstated). Instead, I'll focus on whether DeepSeek's releases undermine the case for those export control policies on chips. I don't think they do. In fact, I think they make export control policies even more existentially important than they were a week ago.
Export controls serve a vital purpose: keeping democratic nations at the forefront of AI development. To be clear, they’re not a way to duck the competition between the US and China. In the end, AI companies in the US and other democracies must have better models than those in China if we want to prevail. But we shouldn't hand the Chinese Communist Party technological advantages when we don't have to.
Also:
DeepSeek does not "do for $6M what cost US AI companies billions". I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train (I won't give an exact number). Also, 3.5 Sonnet was not trained in any way that involved a larger or more expensive model (contrary to some rumors).
Also:
Making AI that is smarter than almost all humans at almost all things will require millions of chips, tens of billions of dollars (at least), and is most likely to happen in 2026-2027.
One thing seems wrong:
If China can't get millions of chips, we'll (at least temporarily) live in a unipolar world, where only the US and its allies have these models.
If "smarter than almost all humans at almost all things" models appear in 2026-2027, China and several others will be able to ~immediately steal the first such models, by default.
2 comments
comment by Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2025-01-29T23:48:09.103Z
If "smarter than almost all humans at almost all things" models appear in 2026-2027, China and several others will be able to ~immediately steal the first such models, by default.
Interpreted very charitably: even in that case, they probably wouldn't have enough inference compute to compete.
comment by Julian Bradshaw · 2025-01-30T03:17:30.969Z
It's strange that he doesn't mention DeepSeek-R1-Zero anywhere in that blog post, which is arguably the most important development DeepSeek announced (self-play RL on reasoning models). R1-Zero is what stuck out to me in DeepSeek's papers, and e.g. the ARC Prize team behind the ARC-AGI benchmark says:
R1-Zero is significantly more important than R1.
Was R1-Zero already obvious to the big labs, or is Amodei deliberately underemphasizing that part?