AISN #51: AI Frontiers

post by Corin Katzke (corin-katzke), Dan H (dan-hendrycks) · 2025-04-15T16:01:56.701Z

This is a link post for https://newsletter.safe.ai/p/ai-safety-newsletter-51-ai-frontiers

Contents

  AI Frontiers
    Subscribe to AI Frontiers
    Publish on AI Frontiers
  AI 2027
  Other News

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

In this newsletter, we cover the launch of AI Frontiers, a new forum for expert commentary on the future of AI. We also discuss AI 2027, a detailed scenario describing how artificial superintelligence might emerge in just a few years.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe to receive future versions.

AI Frontiers

Last week, CAIS introduced AI Frontiers, a new publication dedicated to gathering expert views on AI's most pressing questions. AI’s impacts are wide-ranging, affecting jobs, health, national security, and beyond. Navigating these challenges requires a forum for varied viewpoints and expertise.

In this story, we’d like to highlight the publication’s initial articles to give you a taste of the kind of coverage you can expect from AI Frontiers.

Why Racing to Artificial Superintelligence Would Undermine America’s National Security. Researchers Corin Katzke (also an author of this newsletter) and Gideon Futerman argue that rather than rushing toward catastrophe, the US and China should recognize their shared interest in avoiding an ASI race:

“The argument for an ASI race assumes it would grant the wielder a decisive military advantage over rival superpowers. But unlike nuclear weapons, which require human operators, ASI would act autonomously. This creates an unprecedented risk: loss of control over a system more powerful than national militaries.”

How Applying Abundance Thinking to AI Can Help Us Flourish. Texas Law Fellow Kevin Frazier writes that realizing AI’s full potential requires designing for opportunity—not just guarding against risk:

“We face critical shortages across multiple domains essential to AI progress. The scarcity of compute resources has created a landscape where only the largest tech companies can afford to train and deploy advanced models. Research institutions, nonprofits, and startups focused on developing AI tools primarily for advancing public welfare – rather than solely for commercial gain – find themselves unable to compete.”

AI Risk Management Can Learn a Lot From Other Industries. Researcher and superforecaster Malcolm Murray writes that AI risk may have unique elements, but there is still a lot to be learned from cybersecurity, enterprise, financial, and environmental risk management:

“AI risk management also suffers from the technology’s reputation for complexity. Indeed, in popular media, AI models are constantly referred to as “black boxes.” There may therefore be an assumption that AI risk management will be equally complex, requiring highly technical solutions. However, the fact that AI is a black box does not mean that AI risk management must be as well.”

The Challenges of Governing AI Agents. Hebrew University professor Noam Kolt discusses how autonomous systems are being rapidly deployed, but governance efforts are still in their infancy:

“AI agents are not being developed in a legal vacuum, but in a complex tapestry of existing legal rules and principles. Studying these is necessary both to understand how legal institutions will respond to the advent of AI agents and, more importantly, to develop technical governance mechanisms that can operate hand in hand with existing legal frameworks.”

Can We Stop Bad Actors From Manipulating AI? Gray Swan AI cofounder Andy Zou and AI Frontiers staff writer Jason Hausenloy explain that AI is naturally prone to being tricked into behaving badly, but researchers are working hard to patch that weakness:

“Closed models accessible via API can be made meaningfully secure through careful engineering and monitoring, while achieving comparable security for open source models may be fundamentally impossible—and in fact render moot the security provided by their closed counterparts.”

Exporting H20 Chips to China Undermines America’s AI Edge. AI Frontiers staff writer Jason Hausenloy argues that continued sales of advanced AI chips allow China to deploy AI at a massive scale:

“Chinese access to these chips threatens US competitiveness—not necessarily because it enables China to develop more advanced AI models, but because it improves China’s deployment capabilities, the computational power it needs to deploy models at scale.”

Subscribe to AI Frontiers

You can subscribe to AI Frontiers to hear about future articles. If you’d like to contribute to the public conversation on AI, we encourage you to submit your writing.

Publish on AI Frontiers

AI 2027

A new nonprofit led by former OpenAI employee Daniel Kokotajlo has published a scenario describing AI development through 2027. The scenario, AI 2027, is one of the most ambitious forecasts of AI development to date, and it’s worth reading in full.

AI development may be driven by automating AI research itself. AI 2027 predicts several stages in the development of superintelligence, each accelerating AI research by an increasing margin: 1) superhuman coder, 2) superhuman AI researcher, and 3) superintelligent AI researcher.

AI risk is driven by both technical and political factors. AI 2027 highlights three risk factors involved in AI development: deceptive alignment, international racing dynamics, and concentration of power.

  1. Deceptive Alignment. A central concern of AI 2027 is that AIs may learn to appear aligned to pass evaluations, while pursuing different underlying goals. As models become superhuman, verifying true alignment becomes increasingly difficult. Agent-3 shows improved deception, and Agent-4 is predicted to be adversarially misaligned, actively scheming against its creators. Agent-4 is designed with less interpretable AI architectures ("neuralese" rather than "chain of thought" models), making monitoring harder.
  2. Racing Dynamics. The scenario depicts an AI arms race between the US (led by "OpenBrain") and China (led by "DeepCent"). Fear of falling behind drives rapid development and deployment, often prioritizing speed over safety. China's efforts to catch up include compute centralization and espionage (stealing Agent-2's weights). This race dynamic makes decision-makers reluctant to pause or slow down, even when significant risks (like misalignment) are identified.
  3. Concentration of Power. Decisions about developing and deploying potentially world-altering AI are concentrated within a small group: AI company leadership, select government officials, and later, an "Oversight Committee". This group operates largely in secret, facing immense pressure from the arms race.

While we don’t agree with all of the analysis in AI 2027—for example, we think that deterrence could play a larger role in an ASI race—we still recommend you read the scenario in full. It’s a thorough examination of how AI might develop in the coming years—and how that development could go very well or very poorly for humanity.

Other News

Government

Industry

Misc


See also: CAIS website, X account for CAIS, Superintelligence Strategy, our AI safety course, and AI Frontiers, a new platform for expert commentary and analysis.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe to receive future versions.

1 comment


comment by TFD · 2025-04-15T17:46:34.219Z

A group of former OpenAI employees filed a proposed amicus brief in support of Musk’s lawsuit on the future of OpenAI’s for-profit transition. Meanwhile, OpenAI countersued Elon Musk.

I think this is the first time that the charter has been significantly highlighted in this case. My own view is that the charter is one of the worst documents for OpenAI (and therefore good for Musk), and having its own employees state that it was heavily emphasized and treated as binding is a very bad fact for OpenAI and the associated defendants. The timeline of all this isn't 100% clear to me, so I can imagine there being issues with whether the charter's timing makes it relevant to Musk's own reliance, but the vibes of this for OpenAI are horrendous. It also raises the interesting question of whether the "merge-and-assist" part of the charter might be enforceable.

The docket seems to indicate that Eugene Volokh is representing the ex-OpenAI amici (in addition to Lawrence Lessig). To my understanding Volokh is a first amendment expert and has also done work on transparency in courts. The motion for leave to file also indicate that OpenAI isn't necessarily on board with the brief being filed. I wonder if they are possibly going to argue their ex-employees shouldn't be allowed to do what their doing (perhaps trying to enforce NDAs?), and Volokh is perhaps planning to weigh in on that issue?