Posts
Shivam's Shortform
2025-03-09T15:29:10.826Z
The Road to Evil Is Paved with Good Objectives: Framework to Classify and Fix Misalignments.
2025-01-30T02:44:47.907Z
Limits of safe and aligned AI
2024-10-08T21:30:49.661Z
Comments
Comment by Shivam on Shivam's Shortform · 2025-03-09T15:29:10.826Z
An important piece of work in AI safety should be to prove the equivalence of various capability benchmarks to risk benchmarks, so that when AI labs show their model crossing a capability benchmark, it is automatically counted as crossing the corresponding AI safety level.
"So we don't get two separate reports from them: one saying that the model is a PhD-level scientist, and the other saying that studies show the model's CBRN risk is no greater than that of an internet search."