Posts
AISC project: TinyEvals
2023-11-22T20:47:32.376Z
Polysemantic Attention Head in a 4-Layer Transformer
2023-11-09T16:16:35.132Z
An adversarial example for Direct Logit Attribution: memory management in gelu-4l
2023-08-30T17:36:59.034Z
A circuit for Python docstrings in a 4-layer attention-only transformer
2023-02-20T19:35:14.027Z
Comments
Comment by Jett (jett) on A Comprehensive Mechanistic Interpretability Explainer & Glossary · 2023-10-09T08:41:10.458Z
The definitions of activation patching, causal tracing, and resample ablation seem to be out of date compared to how you define them in your post on attribution patching.