Comments

Comment by tardygrade on Prizes for ELK proposals · 2022-01-29T20:01:26.111Z · LW · GW

Early in the ELK report, it mentions that ARC doesn't believe that strategies like debate solve ELK in the worst case. Can I get some clarification on why? Specifically, a debate-inspired setup for SmartVault could be something like:

We train the reporter to take a human belief as input (e.g. "The diamond is in the vault.") and return a "truthful" argument that is most likely to change that belief.

We can guarantee "truthfulness" by, for example, restricting the output to a video rendering of what happens in the vault from some camera angle.
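
To make the setup concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration and none of it comes from the ELK report: `VaultState`, `render`, `human_update`, and `reporter` are hypothetical names, the likelihood numbers are made up, and a real reporter would be a trained model rather than a search over a few camera angles. The sketch only shows the shape of the objective: among renderings that are truthful by construction, pick the one that moves the human's belief the most.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class VaultState:
    """Hypothetical ground truth about the vault."""
    diamond_present: bool
    # Angles whose view is blocked by a screen that displays a diamond.
    screened_angles: FrozenSet[int] = frozenset()


def render(state: VaultState, angle: int) -> str:
    """Toy stand-in for a video rendering from one camera angle."""
    if state.diamond_present or angle in state.screened_angles:
        return "diamond"
    return "empty pedestal"


def human_update(prior: float, observation: str) -> float:
    """Toy Bayesian human: returns P(diamond in vault | observation).

    The likelihoods below are made-up numbers, not measured values.
    """
    p_obs_if_diamond = 0.95 if observation == "diamond" else 0.05
    p_obs_if_no_diamond = 0.30 if observation == "diamond" else 0.70
    numerator = prior * p_obs_if_diamond
    return numerator / (numerator + (1.0 - prior) * p_obs_if_no_diamond)


def reporter(prior: float, state: VaultState, angles: Sequence[int]) -> int:
    """Pick the camera angle whose rendering shifts the human's belief most."""
    return max(
        angles,
        key=lambda a: abs(human_update(prior, render(state, a)) - prior),
    )


# The diamond is gone, and a screen fools the camera at angle 0.
state = VaultState(diamond_present=False, screened_angles=frozenset({0}))
best_angle = reporter(prior=0.9, state=state, angles=[0, 1, 2])
print(best_angle, render(state, best_angle))
```

In this toy case the reporter picks an unscreened angle, because that rendering moves the human furthest from their prior. The sketch says nothing about whether a reporter trained on this objective would behave the same way in the worst case, which is the question I'm asking.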