Comment by tardygrade on Prizes for ELK proposals · 2022-01-29T20:01:26.111Z
Early in the ELK report, it mentions that ARC doesn't believe that strategies like debate solve ELK in the worst case. Could I get some clarification on why? Specifically, a debate-inspired setup for SmartVault could be something like:
We train the reporter to take a human belief as input (e.g., "The diamond is in the vault.") and return a "truthful" argument that is most likely to change the human's belief.
We can guarantee "truthfulness" by, for example, restricting the output to be a video rendering of what happens in the vault from some camera angle.
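
To make the objective concrete, here is a rough toy sketch of the selection step only, not a training procedure. Everything in it is a hypothetical stand-in: the function `most_persuasive_evidence`, the `renderings` list, and the crude `toy_human_posterior` model are assumptions for illustration. The "truthfulness" restriction is modeled by only letting the reporter choose among renderings of the vault, never free-form text.

```python
from typing import Callable, Sequence, TypeVar

Evidence = TypeVar("Evidence")


def most_persuasive_evidence(
    prior: float,
    candidates: Sequence[Evidence],
    human_posterior: Callable[[float, Evidence], float],
) -> Evidence:
    """Return the candidate that moves the human's credence furthest from the prior."""
    return max(candidates, key=lambda e: abs(human_posterior(prior, e) - prior))


# Hypothetical stand-ins: each "rendering" is a camera name plus whether the
# diamond is visible from that angle. A real setup would use actual video.
renderings = [
    {"camera": "front", "diamond_visible": True},
    {"camera": "side", "diamond_visible": True},
    {"camera": "overhead", "diamond_visible": False},
]


def toy_human_posterior(prior: float, rendering: dict) -> float:
    # Assumed belief-update model: seeing the diamond nudges credence up,
    # failing to see it drops credence sharply.
    return min(1.0, prior + 0.05) if rendering["diamond_visible"] else 0.1


# The human starts out fairly sure that "The diamond is in the vault."
chosen = most_persuasive_evidence(0.9, renderings, toy_human_posterior)
print(chosen["camera"])
```

In this toy case the reporter picks the overhead angle, since that is the only rendering that contradicts the human's belief and so shifts it the most.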