Posts

The Buckling World Hypothesis - Visualising Vulnerable Worlds 2024-04-04T15:51:59.151Z
Can AI Transform the Electorate into a Citizen’s Assembly? 2024-04-04T15:45:31.075Z

Comments

Comment by Rosco-Hunter on Towards Monosemanticity: Decomposing Language Models With Dictionary Learning · 2024-04-25T12:28:06.871Z · LW · GW

This was a really interesting paper; however, I was left with one question: why exactly is the model motivated to learn a much more complex function than the identity map? An auto-encoder whose latent space is much smaller than its input is forced to learn an interesting map, but I can't see why a highly over-parameterised auto-encoder wouldn't simply learn something close to the identity. Is it the regularisation, or the bias terms? I'd love to hear an argument for why the auto-encoder is likely to learn these monosemantic features rather than an identity map.
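One candidate answer is the L1 sparsity penalty on the latent activations: even when an over-complete auto-encoder could reconstruct perfectly via a dense, identity-like code, the penalty makes dense codes strictly more costly. A minimal numpy sketch of that intuition (the function name, dimensions, and coefficient here are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_hidden = 8, 32  # over-complete: many more latents than inputs
x = rng.normal(size=d_in)

def sae_loss(x, z, x_hat, l1_coeff=0.1):
    """Reconstruction error plus an L1 penalty on the latent code z.

    The L1 term is (on this reading) what disfavours an identity-like
    solution: a dense code is penalised even when reconstruction is perfect.
    """
    return np.sum((x - x_hat) ** 2) + l1_coeff * np.sum(np.abs(z))

# Two hypothetical codes that both reconstruct x perfectly:
dense_z = rng.normal(size=d_hidden)   # many active latents
sparse_z = np.zeros(d_hidden)
sparse_z[:2] = 1.0                    # only two active latents

# With identical reconstruction, the sparse code achieves lower loss,
# so training pressure pushes away from dense, identity-like codes.
print(sae_loss(x, dense_z, x), ">", sae_loss(x, sparse_z, x))
```

Under this framing, the bias terms matter less than the regularisation: remove the L1 term and the loss really is indifferent between the two codes above.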

Comment by Rosco-Hunter on The Buckling World Hypothesis - Visualising Vulnerable Worlds · 2024-04-04T23:09:25.791Z · LW · GW

Thank you for the reply. I am using the ruler as an informal way of introducing a pitchfork bifurcation - see [3]. Although the specific analogy to a ruler may appear tenuous, the article merely attempts to draw links to the underlying dynamics, in which a stable point (with symmetry) splits into two stable branches protruding from a critical point. This setup is used to study a wide variety of physical and biological phenomena - see [Catastrophe Theory and its Applications, Poston and Stewart, 1978].
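For readers who haven't met the normal form: the supercritical pitchfork is usually written dx/dt = rx - x^3, where the symmetric fixed point at x = 0 is stable for r <= 0 and, past the critical point r = 0, splits into two stable branches at x = ±√r while x = 0 goes unstable. A minimal sketch of that splitting (function names are mine, the equation is the standard normal form):

```python
import numpy as np

def fixed_points(r):
    """Real fixed points of the pitchfork normal form dx/dt = r*x - x**3.

    For r <= 0 the only real fixed point is x = 0; for r > 0 two new
    branches appear at x = +/- sqrt(r).
    """
    if r <= 0:
        return [0.0]
    return [-float(np.sqrt(r)), 0.0, float(np.sqrt(r))]

def is_stable(x, r):
    """Linear stability: f'(x) = r - 3*x**2 < 0 means the point is stable."""
    return r - 3 * x**2 < 0

print(fixed_points(-1.0))  # [0.0]: the single symmetric state
print(fixed_points(1.0))   # [-1.0, 0.0, 1.0]: two branches around a now-unstable origin
print([is_stable(x, 1.0) for x in fixed_points(1.0)])  # [True, False, True]
```

The ruler analogy maps onto this directly: compression force plays the role of r, and the two buckled shapes are the two stable branches.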