AISN #46: The Transition

post by Corin Katzke (corin-katzke), Dan H (dan-hendrycks) · 2025-01-23T18:09:36.858Z

This is a link post for https://newsletter.safe.ai/p/ai-safety-newsletter-46-the-transition

Contents

  The Transition
  CAIS and Scale AI Introduce Humanity's Last Exam
  AI Safety, Ethics, and Society Course
  Links
    Industry
    Government
    Research and Opinion

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.


The Transition

The transition from the Biden to Trump administrations saw a flurry of executive activity on AI policy, with Biden signing several last-minute executive orders and Trump revoking Biden’s 2023 executive order on AI risk. In this story, we review the state of play.

Trump signing first-day executive orders. Source.

The AI Diffusion Framework. The final weeks of the Biden Administration saw three major actions related to AI policy. First, the Bureau of Industry and Security released its Framework for Artificial Intelligence Diffusion, which updates the US's AI-related export controls. The rule establishes three tiers of countries: 1) US allies, 2) most other countries, and 3) arms-embargoed countries.

The three tiers described by the framework. Source.

The US itself is not subject to export controls, meaning that companies can import AI chips and develop and deploy controlled models without restriction within the US. (For more discussion of the framework, see this report from RAND.)

An AI Infrastructure EO. Second, Biden signed the executive order Advancing United States Leadership in Artificial Intelligence Infrastructure.

The executive order follows an aggressive timeline, with a goal of having new data centers operational by the end of 2027.

A Cybersecurity EO. Finally, Biden signed the executive order Strengthening and Promoting Innovation in the Nation's Cybersecurity, which, among other provisions, tightens security requirements for federal agencies and the software vendors that supply them.

Trump’s First Days in Office. The Trump Administration’s most significant official action on AI policy so far has been to revoke Biden’s 2023 executive order, Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.

However, Trump also announced Stargate, a joint venture by OpenAI, SoftBank, and Oracle, which would invest $500 billion in AI infrastructure over the next few years. According to an announcement by OpenAI, the project will deploy $100 billion immediately. But Elon Musk undercut the project on X, claiming it doesn’t “actually have the money.”

CAIS and Scale AI Introduce Humanity's Last Exam

The Center for AI Safety (CAIS) and Scale AI have introduced Humanity's Last Exam (HLE), which is designed to be the final comprehensive benchmark for testing AI capabilities on closed-ended academic questions. HLE is intended to inform research and policymaking with a better understanding of frontier model capabilities, as discussed in this New York Times article.

HLE features unprecedented scope and difficulty. As state-of-the-art language models begin to achieve high accuracy on existing benchmarks like MMLU, those benchmarks fail to provide an informative measure of model capabilities. The public HLE dataset introduces over 3,000 extremely challenging questions to provide a better measure of AI capabilities at the frontier of human knowledge.

Drawing on expertise from nearly 1,000 subject matter experts across 500 institutions in 50 countries, the dataset spans dozens of academic fields including mathematics, physics, computer science, and the humanities. Questions require expert-level skills or highly specific knowledge and are designed to be impossible to answer through simple internet search. The benchmark includes both multiple-choice and exact-match questions, with about 10% featuring multimodal elements requiring image comprehension. Mathematics problems make up the largest portion of the dataset at 1,102 questions.

A few representative questions from the benchmark.
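
For readers who want to inspect the questions directly, here is a minimal sketch of loading and summarizing the public dataset. It assumes the dataset is hosted on Hugging Face under the ID cais/hle with a test split, and that records carry category and image fields; check the project page for the canonical location and exact schema.

```python
from collections import Counter

from datasets import load_dataset  # pip install datasets

# Load the public HLE questions. The dataset ID "cais/hle" and the
# split name are assumptions; see the project page for specifics.
hle = load_dataset("cais/hle", split="test")

# Tally questions by subject area to see the composition described above.
by_category = Counter(example["category"] for example in hle)
for category, count in by_category.most_common():
    print(f"{category}: {count}")

# Multimodal questions carry an image alongside the question text.
n_multimodal = sum(1 for example in hle if example.get("image"))
print(f"{n_multimodal} of {len(hle)} questions include an image")
```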

Current AI models perform poorly on HLE. State-of-the-art language models achieve low accuracy on HLE despite their strong performance on other benchmarks. DeepSeek-R1 leads at 9.4%. Models are also systematically overconfident, with calibration errors ranging from 80% to over 90%, indicating they fail to recognize when questions exceed their capabilities.

Accuracy on HLE across frontier models.
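
To make the calibration numbers concrete, here is a minimal sketch of a binned RMS calibration error, a standard way to measure overconfidence. This is an illustrative reimplementation, not the HLE evaluation code, and the paper's exact binning may differ: a well-calibrated model that reports 90% confidence should be correct about 90% of the time.

```python
import numpy as np

def rms_calibration_error(confidences, correct, n_bins=10):
    """Binned RMS calibration error: group answers by stated confidence,
    then compare each bin's mean confidence to its empirical accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Assign each answer to a confidence bin (0.0-0.1, ..., 0.9-1.0).
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    sq_gaps, weights = [], []
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        # Gap between average stated confidence and observed accuracy.
        gap = confidences[mask].mean() - correct[mask].mean()
        sq_gaps.append(gap ** 2)
        weights.append(mask.sum())
    return np.sqrt(np.average(sq_gaps, weights=weights))

# Toy example: a model that is highly confident but usually wrong
# shows a large calibration error, as the HLE models do.
conf = [0.95, 0.90, 0.92, 0.88, 0.97]
right = [0, 0, 1, 0, 0]
print(f"RMS calibration error: {rms_calibration_error(conf, right):.0%}")
```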

HLE questions are rigorously validated. The benchmark was developed with a multi-stage validation process to ensure question quality. First, question submissions must prove too difficult for current AI models to solve. Then, questions undergo two rounds of expert peer review, and are finally divided into public and private datasets.
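
The first stage can be pictured as a simple filter, sketched below. The helpers ask_model and is_correct are hypothetical stand-ins for real API calls and the benchmark's grading logic, and the model names are placeholders, not the pool HLE actually used; the real pipeline is more involved.

```python
# Sketch of the first validation stage: a submission advances to expert
# review only if every model in a frontier test pool fails to answer it.

FRONTIER_MODELS = ["model-a", "model-b", "model-c"]  # placeholder names

def ask_model(model: str, question: str) -> str:
    # Hypothetical stand-in; replace with a real API call.
    return "i don't know"

def is_correct(predicted: str, reference: str) -> bool:
    # Exact-match grading; HLE's multiple-choice questions would instead
    # be graded by comparing the selected option.
    return predicted.strip().lower() == reference.strip().lower()

def passes_difficulty_filter(question: str, reference_answer: str) -> bool:
    """Keep the question only if no frontier model answers it correctly."""
    return not any(
        is_correct(ask_model(m, question), reference_answer)
        for m in FRONTIER_MODELS
    )

print(passes_difficulty_filter("What is ...?", "some expert answer"))  # True
```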

HLE doesn’t represent the end of AI development. While current models perform poorly on HLE, the authors say it is plausible that, given the rate of AI development, models could exceed 50% accuracy by the end of 2025. However, they emphasize that high performance on HLE would demonstrate expert-level knowledge on closed-ended academic questions at the frontier of human knowledge, not agential skills.

AI Safety, Ethics, and Society Course

The Center for AI Safety is excited to announce the spring session of our AI Safety, Ethics, and Society course, running from February 19th to May 9th, 2025. It follows our fall session last year, which included 240 participants.

This free, online course brings together exceptional participants from diverse disciplines and countries, equipping them with the knowledge and practical tools necessary to address challenges arising from AI, such as the malicious use of AI by non-state actors and the erosion of safety standards driven by international competition. The course is designed to accommodate full-time work or study, lasting 12 weeks with an expected time commitment of 5 hours per week.

The course is based on the recently published textbook, Introduction to AI Safety, Ethics, and Society, authored by CAIS Director Dan Hendrycks. It is freely available in text and audio formats.

Applications for the Spring 2025 course are now open. The final application deadline is February 5, 2025, with a priority deadline of January 31. Visit the course website to learn more and apply.

Links

Industry

Government

Research and Opinion

See also: CAIS website, X account for CAIS, our $250K Safety benchmark competition, our new AI safety course, and our feedback form. The Center for AI Safety is also hiring for several positions, including Chief Operating Officer.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

Subscribe here to receive future versions.
