Adam Karvonen's Shortform
post by Adam Karvonen (karvonenadam) · 2025-01-18
If you're looking for a hackable SAE training repo for experiments, I'd recommend our dictionary_learning repo. It's been around for a few months, but we've recently spent some time cleaning it up and adding additional trainer types.
It's designed to be simple and hackable - you can add a new SAE type in a single file (~350 lines). We have 8 tested implementations, including JumpReLU, TopK, BatchTopK, Matryoshka, Gated, and others, with BatchTopK recommended as a good default. Training is quick and cheap: training six 16K-width SAEs on Gemma-2-2B for 200M tokens takes ~6 RTX 3090 GPU-hours, or ~$1.20.
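For orientation, here is a rough sketch of the BatchTopK idea in PyTorch. This is not the repo's actual implementation or API - the class, parameter, and tensor names are illustrative assumptions. The key difference from plain TopK is that the top-k selection is taken over the whole batch rather than per example, so the average number of active latents per example is k while individual examples can use more or fewer.

```python
# Minimal sketch of a BatchTopK SAE forward pass (illustrative only;
# not the dictionary_learning repo's actual classes or signatures).
import torch
import torch.nn as nn

class BatchTopKSAE(nn.Module):
    def __init__(self, d_model: int, d_sae: int, k: int):
        super().__init__()
        self.k = k
        self.W_enc = nn.Parameter(torch.randn(d_model, d_sae) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(d_sae, d_model) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.b_dec = nn.Parameter(torch.zeros(d_model))

    def forward(self, x: torch.Tensor):
        # Encode: pre-activations for every latent.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)  # [batch, d_sae]
        # BatchTopK: keep the top (k * batch_size) activations across the
        # whole batch, so sparsity is enforced on average rather than per example.
        batch_size = acts.shape[0]
        flat = acts.flatten()
        topk = torch.topk(flat, self.k * batch_size)
        mask = torch.zeros_like(flat)
        mask[topk.indices] = 1.0
        sparse_acts = (flat * mask).reshape(acts.shape)
        # Decode back to the residual stream dimension.
        recon = sparse_acts @ self.W_dec + self.b_dec
        return recon, sparse_acts
```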
The repo integrates with SAE Bench and includes reproducible baselines trained on Pythia-160M and Gemma-2-2B. While it isn't optimized for large models the way EleutherAI's library is (no CUDA kernels or multi-GPU support) and has fewer features than SAE Lens, it's great for experiments and trying new architectures.
Here is a link to the repo: https://github.com/saprmarks/dictionary_learning
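As a usage illustration, training boils down to streaming activation batches from the base model and minimizing reconstruction error. The sketch below is generic and hedged - it is not the repo's trainer interface, and the function and variable names are assumptions; TopK-style SAEs enforce sparsity through the activation function, so no explicit L1 penalty appears here.

```python
# Generic SAE training loop sketch (not dictionary_learning's actual trainer API).
import torch

def train_sae(sae, activation_batches, n_steps=10_000, lr=3e-4):
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for step, x in zip(range(n_steps), activation_batches):
        recon, sparse_acts = sae(x)
        # Mean squared reconstruction error over the batch.
        loss = (recon - x).pow(2).sum(dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sae
```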