[Linkpost] Play with SAEs on Llama 3

post by Tom McGrath, Eric Ho (eh42), Dan Balsam (dan-balsam) · 2024-09-25T22:35:44.824Z · LW · GW · 1 comments



We (Goodfire) just put our research preview live: you can play with Llama 3 and use sparse autoencoders (SAEs) to read from and write to its internal activations. This is a linkpost for:

Taking research and turning it into something you can actually use and play with has been great. It's surprising how different it feels to iterate on something when you expect it to actually be used; I think it has definitely pushed the quality of what you can do with SAEs up a notch.
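
For anyone unsure what "reading" and "writing" activations with an SAE means, here is a minimal toy sketch. It is not Goodfire's implementation or API: the weights, dimensions, and function names (`encode`, `decode`, `steer`) are all made up for illustration, and a real SAE is trained on the model's activations rather than hand-written. Reading encodes an activation vector into sparse feature activations; writing changes one feature and adds the resulting change in the reconstruction back onto the original activation.

```python
# Toy sparse autoencoder (SAE) with hand-picked weights; purely illustrative.
# Nothing here reflects Goodfire's actual models, weights, or API.

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def encode(activation, w_enc, b_enc):
    """Read: project an activation onto features, then apply ReLU."""
    pre = matvec(w_enc, activation)
    return [max(0.0, p + b) for p, b in zip(pre, b_enc)]

def decode(features, w_dec, b_dec):
    """Map feature activations back into the model's activation space."""
    recon = matvec(w_dec, features)
    return [r + b for r, b in zip(recon, b_dec)]

def steer(activation, w_enc, b_enc, w_dec, b_dec, index, value):
    """Write: set one feature to `value` and apply the resulting change.

    Only the difference between the edited and original reconstructions is
    added, so whatever the SAE fails to reconstruct is left untouched.
    """
    features = encode(activation, w_enc, b_enc)
    edited = list(features)
    edited[index] = value
    before = decode(features, w_dec, b_dec)
    after = decode(edited, w_dec, b_dec)
    return [a + (new - old) for a, new, old in zip(activation, after, before)]

# Hypothetical 2-dimensional activation and 3-feature dictionary.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # 3 features x 2 dims
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]    # 2 dims x 3 features
B_DEC = [0.0, 0.0]

activation = [2.0, -1.0]
features = encode(activation, W_ENC, B_ENC)      # read
steered = steer(activation, W_ENC, B_ENC, W_DEC, B_DEC, index=1, value=3.0)  # write
```

In this toy setup only the first feature fires on the original activation (`features` is `[2.0, 0.0, 0.0]`), and forcing the second feature to 3.0 moves the activation from `[2.0, -1.0]` to `[2.0, 2.0]`. In a real model the steered vector would be patched back into the forward pass at the layer the SAE was trained on.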


Comments sorted by top scores.

comment by Lao Mein (derpherpize) · 2024-09-26T08:01:17.488Z · LW(p) · GW(p)

Extremely impressive! I've been wanting something like this for a while.