Posts
SAEs Discover Meaningful Features in the IOI Task
2024-06-05T23:48:04.808Z
An Interpretability Illusion for Activation Patching of Arbitrary Subspaces
2023-08-29T01:04:18.688Z
Comments
Comment by Georg Lange (GeorgLange) on Some costs of superposition · 2024-03-19T17:53:43.201Z · LW · GW
Calculating l, the maximal number of simultaneously active features, yields strange results. For example, with 100 features and 100 neurons, the bound requires l < 100/(8 * ln(100)) ≈ 2.7. But I would expect that all 100 features can be active simultaneously: with 100 dimensions, the features can be orthogonal and independent. Am I misunderstanding something?
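For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation. It assumes the bound has the form l < d / (8 * ln(m)), with d the number of neurons and m the number of features, which is how the expression above reads. The function and parameter names are made up for illustration and do not come from the post.

```python
import math


def max_active_features_bound(num_neurons: int, num_features: int) -> float:
    """Evaluate d / (8 * ln(m)), the assumed form of the upper bound on l.

    num_neurons (d) and num_features (m) are illustrative names, not the
    post's notation.
    """
    return num_neurons / (8 * math.log(num_features))


# The example from the comment: 100 features and 100 neurons.
bound = max_active_features_bound(num_neurons=100, num_features=100)
print(f"l < {bound:.2f}")
```

Under this reading the bound comes out a little above 2.7, far below the 100 simultaneously active features that an orthogonal basis would seem to allow, which is the discrepancy the question is about.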