Paul Christiano on Dwarkesh Podcast
post by ESRogs · 2023-11-03T22:13:21.464Z · LW · GW · 0 comments
This is a link post for https://www.dwarkeshpatel.com/p/paul-christiano
Dwarkesh's summary:
Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!
We discuss:
- Does he regret inventing RLHF, and is alignment necessarily dual-use?
- Why he has relatively modest timelines (40% by 2040, 15% by 2030),
- What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?
- Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon,
- His current research into a new proof system, and how this could solve alignment by explaining a model's behavior,
- and much more.