Kailuo Wang's Shortform

post by Kailuo Wang (kailuo-wang) · 2025-02-07T03:57:53.727Z · LW · GW · 1 comments


comment by Kailuo Wang (kailuo-wang) · 2025-02-07T03:57:53.725Z · LW(p) · GW(p)

Could a Billion-Dollar Test Run Unlock Superhuman AI?

- edited from a draft by Gemini Flash Thinking 2.0 Experimental 

We hear about billion-dollar investments in massive datasets and colossal neural networks, all in pursuit of imbuing AI models with broad knowledge and general capabilities. Imagine a new paradigm: use massive test-time compute to improve the test-time algorithm itself.

Initial explorations suggest that while such test-time scaling can be computationally expensive, e.g., o3-high reportedly requiring thousands of GPU-hours to solve a single ARC-AGI problem, it appears to be remarkably free from data constraints. This is a critical distinction. Traditional pre-training is fundamentally limited by the availability and quality of data. Test-time scaling, however, seemingly allows for continuous improvement simply by throwing more compute at the problem.

This unlocks a new possibility: instead of spending billions on a single, massive pre-training run to build a generally intelligent model, organizations like OpenAI could potentially invest billions, or even tens of billions, in test-time computation. This wouldn't be for serving customers or general knowledge acquisition, but rather for discovering AI algorithm breakthroughs that even the most brilliant human engineers might miss. Imagine dedicating immense computational resources, perhaps a million GPUs or more, to solving a single, critical algorithmic challenge that could drastically improve the efficiency of the whole system. First invest astronomically to obtain an extremely inefficient ASI that runs at a prohibitively high cost for general applications, then use it to self-improve its efficiency until the cost becomes practical for general applications.

This approach hinges on the idea that while current "o3-high" configurations may be inefficient in terms of compute per unit of intelligence gained, the sheer scalability of computation can compensate. Even if solving an AI development problem at a superintelligence level with current methods is slow and requires immense resources, the potential payoff could be transformative. Suppose o3-high can be scaled up by another 4 OOM of compute to reach ASI-level intelligence; then OpenAI would just need 4 OOM more GPUs to use it for solving algorithm problems, which is around 100 million GPUs. However, if they manage to improve algorithmic efficiency by 2 OOM, which is reasonably achievable within 2 years, then they would need only 1 million GPUs, roughly $25 to $50 billion.
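The back-of-the-envelope arithmetic above can be sketched as follows. The baseline fleet size (10,000 GPUs, which is what makes "+4 OOM" come out to 100 million) and the per-GPU cost range are assumptions chosen to reproduce the post's figures, not numbers from any official source:

```python
# Back-of-the-envelope OOM arithmetic from the post.
# All inputs are illustrative assumptions, not reported figures.
baseline_gpus = 10_000        # assumed current o3-high-scale fleet
compute_gap_oom = 4           # OOM more compute assumed needed for ASI-level runs
efficiency_gain_oom = 2       # algorithmic efficiency gain assumed achievable in ~2 years

# GPUs needed with no efficiency improvement: scale the fleet by 10^4.
gpus_naive = baseline_gpus * 10 ** compute_gap_oom

# GPUs needed after a 2 OOM efficiency gain: only 10^2 more than baseline.
gpus_after_gain = baseline_gpus * 10 ** (compute_gap_oom - efficiency_gain_oom)

# Assumed all-in cost per GPU (low and high estimates, in USD).
cost_per_gpu_usd = (25_000, 50_000)
cost_range = tuple(c * gpus_after_gain for c in cost_per_gpu_usd)

print(gpus_naive)       # 100_000_000 GPUs without efficiency gains
print(gpus_after_gain)  # 1_000_000 GPUs after the efficiency gain
print(cost_range)       # ($25B, $50B) total hardware cost
```

Note that the conclusion is sensitive to the assumed baseline: if the true o3-high fleet is larger or smaller than 10,000 GPUs, the "+4 OOM" endpoint shifts by the same factor.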

This perspective lends credence to recent pronouncements from figures like Dario Amodei, CEO of Anthropic, who has spoken about the necessity of labs needing millions of GPUs in the coming years. While the immediate interpretation might be for ever-larger pre-training runs, it's equally plausible that this massive compute infrastructure is envisioned for this very purpose: billion-dollar test-time runs dedicated to solving the most pressing AI development challenges at a level of intelligence exceeding human capacity.

The implications of this approach are profound. It suggests a potential acceleration in AI progress, driven not by data accumulation, but by the strategic deployment of massive computational power. It opens up the possibility of using AI itself to solve the most complex problems in AI development, leading to more efficient algorithms, novel architectures, and ultimately, faster progress toward efficient AGI. While the efficiency of "o3-high" test-time scaling remains a concern, the tantalizing prospect of data-unconstrained intelligence amplification through massive compute offers a compelling and potentially game-changing direction for the future of AI.