DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

post by garrison · 2025-02-19T21:02:42.879Z · LW · GW · 1 comments

This is a link post for https://garrisonlovely.substack.com/p/deepseek-made-it-even-harder-for


Comments sorted by top scores.

comment by Vladimir_Nesov · 2025-02-19T21:52:50.946Z · LW(p) · GW(p)

Additional training scale lets you make a model with the same capabilities but cheaper and faster inference, or a more capable model. Any open weights release with a sufficiently permissive license cuts the margins for that particular combination of inference cost and capabilities, but doesn't affect what happens at other capability levels, and can't sustain its inference cost edge against labs with more training compute (any open weights release necessarily gives out major model architecture secrets). Also, a 3x training compute difference isn't qualitatively very important, but it translates to roughly a 1.7x difference in inference cost (compute-optimal model size, and hence per-token cost at matched capability, scales roughly with the square root of training compute), so there is already a ~40% margin available through sufficiently artisanal post-training, even for a model without pretraining algorithmic advantages.
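
A minimal sketch of the arithmetic behind the 3x / 1.7x / 40% figures, assuming Chinchilla-style compute-optimal scaling (model size, and hence per-token inference cost at matched capability, grows roughly with the square root of training compute); the scaling assumption and variable names are mine, not stated in the comment:

```python
# Sketch, assuming compute-optimal (Chinchilla-style) scaling:
# at matched capability, inference cost ~ sqrt(training compute).
training_compute_ratio = 3.0  # one lab has 3x the training compute of the other

# Relative inference cost advantage for the compute-rich lab.
inference_cost_ratio = training_compute_ratio ** 0.5
print(f"inference cost ratio: {inference_cost_ratio:.2f}x")  # ~1.73x

# Margin headroom if both sell tokens at the compute-poor lab's cost.
margin = 1 - 1 / inference_cost_ratio
print(f"margin: {margin:.0%}")  # ~42%, i.e. the comment's "40% margin"
```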

DeepSeek is not a serious competitor if they can't get the kind of frontier training compute that the US labs will build in 2025-2026. Meta is more credibly threatening the margins of other labs, though that hasn't yet been concretely demonstrated with a release of particularly competitive models. (Llama-3-405B was important in other ways, though it's largely obsolete after DeepSeek-V3.)