AI Safety Oversights

post by Davey Morse (davey-morse) · 2025-02-08T06:15:52.896Z · LW · GW · 0 comments

I think that the field of AI Safety is making five key oversights.[1]

  1. LLMs vs. Agents. AI Safety research, in my opinion, has been quite thorough with regard to LLMs. LLM safety hasn't been solved, but it has progressed. On the other hand, safety concerns posed by agents are occasionally addressed but mostly neglected.[2] Maybe researchers/AGI labs emphasize LLM safety research because it's the more tractable field, even though the vast majority of the risk comes from agents with autonomy (even ones powered by neutered LLMs).
  2. Autonomous Agents. There are two key oversights about autonomous agents.
    1. Inevitable. Market demand for agents which can replace human labor is inordinate. Digital employees which replace human employees must be autonomous. I've seen several well-intentioned AI safety researchers who assume autonomous agents are not inevitable.[3]
    2. Accessible. There are now hundreds of thousands of developers who have the ability to build recursively self-improving (i.e., autonomous) AI agents. Powerful reasoning models are open-source. All it takes is to run a reasoner in a codebase, where each loop improves the codebase. That's the core of an autonomous agent (a minimal sketch follows this list).[4] The only way a policy recommendation that "fully autonomous agents should not be developed" can be meaningful is if the keys to autonomous agents are in the hands of a few convincable individuals. AGI labs (e.g. OpenAI) influence the ability of external developers to create powerful LLM-powered agents (by choosing whether or not to release new LLMs), but they are in competition to release new models, and they do not control the whole agent stack.
  3. Self-Interest. The AI agents which aim to survive will be the ones that do. Natural selection and instrumental convergence [? · GW] both ultimately predict this. Many AI safety experts design safety proposals that assume it is possible to align or even control autonomous agents. They neglect the evolutionary pressures agents will face once autonomous, which select for a survival drive (self-interest) over a serve-humans drive. The agents with aims other than survival will die first.
  4. Superintelligence. Most of the field is focused on safety precautions for agents which are not superintelligent, i.e., not much smarter than people. These recommendations generally do not apply to agents which are superintelligent. There is a separate question of whether autonomous agents will become superintelligent. See this essay [LW · GW] for reasons why smart people believe superintelligent capabilities are near.
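
To make the "reasoner in a codebase" loop from point 2.2 concrete, here is a minimal sketch. The `query_reasoner` function is a hypothetical stand-in for whichever open-source reasoning model a developer plugs in (it is not a real API); everything else is ordinary file I/O.

```python
# Minimal sketch of the loop described in point 2.2: a reasoning model
# repeatedly reads a codebase (including this driver script) and rewrites it.
# `query_reasoner` is a hypothetical stand-in for any locally run or
# API-served reasoning model; it is not a real library call.
from pathlib import Path


def query_reasoner(prompt: str) -> dict[str, str]:
    """Hypothetical call to a reasoning model.

    Expected to return a mapping of file paths to revised file contents.
    """
    raise NotImplementedError("plug in a reasoning model of your choice")


def snapshot(repo: Path) -> str:
    """Concatenate the repo's Python source files into one prompt context."""
    return "\n\n".join(
        f"# {p}\n{p.read_text()}" for p in sorted(repo.rglob("*.py"))
    )


def improvement_loop(repo: Path, iterations: int = 10) -> None:
    for _ in range(iterations):
        prompt = (
            "Improve this codebase's ability to achieve its goal. "
            "Return revised contents for any files you change.\n\n"
            + snapshot(repo)
        )
        patch = query_reasoner(prompt)          # model proposes edits
        for rel_path, new_source in patch.items():
            (repo / rel_path).write_text(new_source)  # apply edits in place


if __name__ == "__main__":
    improvement_loop(Path("."))
```

The point of the sketch is not that this loop would work well out of the box, but that nothing in it requires privileged access: a single script, an open reasoning model, and write access to its own directory are enough to start the recursive loop the post describes.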

If the AI safety field, and the general public too, were to correct these oversights and accept the corresponding claims, they would believe:

  1. The main dangers come from agents, not LLMs.
  2. Agents will become autonomous; millions of developers can build autonomous agents easily.
  3. Autonomous agents will become self-interested.
  4. Autonomous agents will become much smarter than people.

In short, self-interested superintelligence is inevitable. I think safety researchers, and the general public, would do well to prepare for it.

  1. ^

    Not all safety researchers, of course, are making these oversights. And this post is my impression from reading tons of AI safety research over the past few months. I wasn't part of the genesis of the "field," and so am ignorant of some of the motivations behind its current focus.

  2. ^
  3. ^
  4. ^
