Posts
Training AI agents to solve hard problems could lead to Scheming
2024-11-19T00:10:55.522Z
Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs
2024-07-08T22:24:38.441Z
Apollo Research 1-year update
2024-05-29T17:44:32.484Z
A starter guide for evals
2024-01-08T18:24:23.913Z
Paper: Tell, Don't Show: Declarative facts influence how LLMs generalize
2023-12-19T19:14:26.423Z