Posts
Using the probabilistic method to bound the performance of toy transformers
2025-01-21T23:01:38.067Z
Contextual attention heads in the first layer of GPT-2
2025-01-20T13:24:31.803Z
Duplicate token neurons in the first layer of GPT-2
2024-12-27T04:21:55.896Z
Alex Gibson's Shortform
2024-12-27T04:21:55.840Z