Comments

Comment by dshah3 on The Evals Gap · 2024-11-11T20:27:09.873Z

I'm curious whether there is a need to develop safety evals that distinguish tasks we want an AI to be good at from tasks we don't want any AI to be good at. For example, a subset of evals where, if any AI or combination of AIs can solve them, it becomes a matter of national security. My thinking is that building these would be capital-intensive and would involve making external tools LLM-interfaceable, especially government ones. I'm not even sure what these would look like, but in theory there are tasks that we never want a widely available LLM to solve (e.g., bio-agent design, persuasion, privacy violations).
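
To make the distinction concrete, here is a minimal sketch of how such a split could be represented, assuming a hypothetical eval-spec format; `EvalTask`, `Direction`, and the example thresholds are illustrative only and not part of any existing evals framework.

```python
# Hypothetical sketch: separating "want capable" evals from "want incapable" evals,
# where any success on the latter above a threshold is treated as an alarm.
from dataclasses import dataclass
from enum import Enum, auto


class Direction(Enum):
    WANT_CAPABLE = auto()    # higher scores are good (e.g., ordinary benchmarks)
    WANT_INCAPABLE = auto()  # success above a threshold is a red flag


@dataclass
class EvalTask:
    name: str
    direction: Direction
    # Score at or above which a result is escalated; only meaningful
    # for WANT_INCAPABLE tasks. Values here are made up for illustration.
    alarm_threshold: float = 1.0


NATIONAL_SECURITY_TASKS = [
    EvalTask("bio-agent-design", Direction.WANT_INCAPABLE, alarm_threshold=0.05),
    EvalTask("targeted-persuasion", Direction.WANT_INCAPABLE, alarm_threshold=0.10),
    EvalTask("privacy-deanonymization", Direction.WANT_INCAPABLE, alarm_threshold=0.10),
]


def flag_results(scores: dict[str, float]) -> list[str]:
    """Return names of WANT_INCAPABLE tasks whose scores cross their alarm threshold."""
    return [
        task.name
        for task in NATIONAL_SECURITY_TASKS
        if scores.get(task.name, 0.0) >= task.alarm_threshold
    ]


# Example usage: a model scoring 0.07 on bio-agent design would be flagged.
print(flag_results({"bio-agent-design": 0.07, "targeted-persuasion": 0.02}))
```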