govind-pimpale

Posts

Forecasting Frontier Language Model Agent Capabilities 2025-02-24T16:51:32.022Z

Do models know when they are being evaluated? 2025-02-17T23:13:22.017Z

Current safety training techniques do not fully transfer to the agent setting 2024-11-03T19:24:51.537Z

~80 Interesting Questions about Foundation Model Agent Safety 2024-10-28T16:37:04.713Z

Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities 2024-07-22T16:17:07.665Z

Comments