[Link] Why I’m excited about AI-assisted human feedback

post by janleike · 2022-04-06T15:37:58.322Z · LW · GW · 0 comments

This is a link post for https://aligned.substack.com/p/ai-assisted-human-feedback

I'm writing a sequence of posts on the approach to alignment I'm currently most excited about. This first post makes the case for recursive reward modeling and explains the problem it's meant to address: scaling RLHF to tasks that are hard for humans to evaluate.
