Posts

Using the probabilistic method to bound the performance of toy transformers 2025-01-21T23:01:38.067Z
Contextual attention heads in the first layer of GPT-2 2025-01-20T13:24:31.803Z
Duplicate token neurons in the first layer of GPT-2 2024-12-27T04:21:55.896Z
Alex Gibson's Shortform 2024-12-27T04:21:55.840Z

Comments