Comments

Comment by mgfcatherall on Effects of Non-Uniform Sparsity on Superposition in Toy Models · 2024-11-15T23:04:36.881Z · LW · GW

Great stuff, and perfect timing, as I just read the Toy Models paper yesterday! My immediate intuition for why the least sparse features are dropped in the equal-importance case is that they are the most expensive to represent. Being the least sparse, they each need a larger share of a dimension than the others, so the tradeoff between number of features and quality of representation works against them. In other words, the choice for your last few slots might be between representing, say, 2 relatively dense features and squeezing all the others up quite a bit, or representing, say, 5 sparser features, which requires less squishing of the others because of their sparsity. So the loss is minimized by preferentially representing the sparser ones. I guess the fact that the denser ones occur more often, and so cost more when dropped, must be more than offset by the number of extra features that fit in their place, in order for the overall loss to be lower. Does that make any sense? I don’t feel like I’ve done a great job of explaining what I mean.
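
One way to put rough numbers on this intuition is the minimal sketch below. It is not the setup from the post, just a two-feature toy under stated assumptions: the features share a single hidden dimension as an antipodal pair (W = [1, -1]), the output is ReLU(W^T W x) with zero bias, both features have equal importance, and each feature is uniform on [0, 1] when active, as in the Toy Models paper. The names `antipodal_pair_loss` and `drop_one_loss` are made up for illustration. It compares the interference cost of squeezing both features into the dimension against the cost of giving the dimension to one feature and always outputting zero for the other.

```python
import numpy as np


def antipodal_pair_loss(p_a, p_b, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the summed squared reconstruction error for
    two features sharing one hidden dimension as an antipodal pair.

    Assumptions (illustrative, not from the post): W = [1, -1], zero bias,
    equal importance, each feature active with probability p_a / p_b and
    uniform on [0, 1] when active.
    """
    rng = np.random.default_rng(seed)
    # Sample which features are active, then their magnitudes.
    active = rng.random((n_samples, 2)) < np.array([p_a, p_b])
    x = active * rng.random((n_samples, 2))
    # Hidden activation h = W x, with W = [1, -1].
    h = x[:, 0] - x[:, 1]
    # Reconstruction ReLU(W^T h): feature a reads +h, feature b reads -h.
    x_hat = np.stack([np.maximum(h, 0.0), np.maximum(-h, 0.0)], axis=1)
    return float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))


def drop_one_loss(p_dropped):
    """Expected loss if one feature gets the whole dimension (perfect
    reconstruction) and the other is always reconstructed as zero:
    p * E[x^2] = p / 3 for x uniform on [0, 1]."""
    return p_dropped / 3.0


if __name__ == "__main__":
    for p in (0.05, 0.2, 0.5, 0.9):
        pair = antipodal_pair_loss(p, p)
        drop = drop_one_loss(p)
        print(f"density={p:.2f}  superposed pair={pair:.4f}  drop one={drop:.4f}")
```

Under these assumptions the pair only interferes when both features are active at once, so its expected loss works out to p_a * p_b / 3, while dropping a feature costs p / 3. The interference cost therefore grows faster with density than the cost of dropping, which fits the idea that dense features are the expensive ones to keep in superposition. It says nothing by itself about the many-feature, non-uniform case studied in the post.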