Responsible scaling policy TLDR

post by lukehmiles (lcmgcd) · 2023-09-28T18:51:20.330Z · LW · GW · 0 comments

I do not speak for ARC Evals!

TLDR²: If you notice risk and pause, then you have a better chance of doing sufficient mitigations.

Assume you are able to

Then, if you actually define your risks & tasks and actually try to use your AI to complete the tasks, you get plenty of warning of danger.

You pause training & development until the risks are "properly mitigated".

It is too early to say what mitigations work well when, or how to really be sure whether the risk is gone. All else being equal, it is better to make a tentative mitigation plan anyways.

0 comments

Comments sorted by top scores.