Posts

Shivam's Shortform 2025-03-09T15:29:10.826Z
The Road to Evil Is Paved with Good Objectives: Framework to Classify and Fix Misalignments. 2025-01-30T02:44:47.907Z
Limits of safe and aligned AI 2024-10-08T21:30:49.661Z

Comments

Comment by Shivam on Shivam's Shortform · 2025-03-09T15:29:10.826Z

An important line of work in AI safety should be to prove the equivalence of various capability benchmarks to risk benchmarks, so that when an AI lab shows its model crossing a capability benchmark, the model is automatically treated as crossing the corresponding AI safety level.
That way, we would not get two separate reports from a lab: one saying that the model is a PhD-level scientist, and another saying that studies show the model's CBRN risk is no greater than that of an internet search.