Posts

Towards shutdownable agents via stochastic choice 2024-07-08T10:14:24.452Z
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models 2023-11-08T11:37:43.997Z

Comments