Posts

We Inspected Every Head In GPT-2 Small using SAEs So You Don’t Have To 2024-03-06T05:03:09.639Z
Attention SAEs Scale to GPT-2 Small 2024-02-03T06:50:22.583Z
Sparse Autoencoders Work on Attention Layer Outputs 2024-01-16T00:26:14.767Z

Comments

Comment by Connor Kissane (ckkissane) on SAE-VIS: Announcement Post · 2024-03-31T16:35:06.004Z

Amazing! We found your original library super useful for our Attention SAEs research, so thanks for making this!

Comment by Connor Kissane (ckkissane) on Mech Interp Puzzle 1: Suspiciously Similar Embeddings in GPT-Neo · 2023-08-14T14:20:07.795Z

These puzzles are great, thanks for making them!

Comment by Connor Kissane (ckkissane) on Causal scrubbing: results on induction heads · 2023-07-19T19:57:54.463Z

"Code for this token filtering can be found in the appendix and the exact token list is linked."

Maybe I just missed it, but I'm not seeing this. Is the code still available?