Posts

Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback (2024-11-07T15:39:06.854Z)

Comments