Posts

Self-Other Overlap: A Neglected Approach to AI Alignment 2024-07-30T16:22:29.561Z
Video Intro to Guaranteed Safe AI 2024-07-11T17:53:47.630Z
DIY RLHF: A simple implementation for hands on experience 2024-07-10T12:07:03.047Z

Comments

Comment by Mike Vaiana (mike-vaiana) on jacquesthibs's Shortform · 2024-07-09T20:41:36.865Z

Hey, we've been brainstorming ideas about better training strategies for base models and what types of experiments we can run at a small scale (e.g. training GPT-2) to get initial information. I think this idea is really promising and would love to chat about it.