thomas-dullien

Comment by Thomas Dullien (thomas-dullien) on A Mechanistic Interpretability Analysis of Grokking · 2023-06-06T20:36:58.906Z · LW · GW

Good stuff. A few thoughts:

1. Assuming a model has memorized the training data and still has enough "spare capacity" to play the lottery ticket game of searching for generalizing solutions to subsets of the memorized data, you'll eventually end up with a number of partial solutions, each of which generalizes to a subset of that data (obviously assuming some form of regularization towards simplicity). So this may be where the "underparametrized" regime of past ML went wrong: that approach tried to force the model into generalization without memorization, but being stingy with parameters forced the model to first and foremost memorize; there was no spare capacity left to "play / experiment with possibly generalizing solutions". This then led to memorization-only models, to which researchers reacted by restricting parameters more ...

2. Occam's razor favors simpler models (for some definition of simplicity) over more complex models, given equal predictive power. The best definition of "model simplicity" that we have may in fact be the Kolmogorov complexity of the weight matrices. This would mean that if we want a model to apply Occam's razor, we should see if we can use a measure of the Kolmogorov complexity of the weights as regularization. The "best" approximation we currently have for Kolmogorov complexity is ... compression, which is itself a prediction problem. So perhaps the way to encourage good generalization in models is to measure how well the weights can be predicted by another model (?). Apologies if this sounds like a crackpot idea.

3. It might be worth experimenting with shifting the regularization term during training, initially encouraging wide connectivity, and then shifting to either sparsity or low Kolmogorov complexity. There's an intriguing parallel to synaptic pruning in childhood.
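
To make points 2 and 3 a bit more concrete, here is a rough sketch, not a tested training recipe. It uses the zlib-compressed size of coarsely quantized weights as a crude upper-bound proxy for Kolmogorov complexity, plus a hypothetical schedule that only switches the complexity penalty on in the second half of training. The names `quantize`, `compressed_size_bytes`, and `regularization_mix`, the quantization step of 0.05, and the switch point of 0.5 are all assumptions made for illustration. Compressed size is not differentiable, so this could only serve as a diagnostic or a black-box score, not directly as a gradient-based regularizer.

```python
import random
import zlib


def quantize(weights, step=0.05):
    """Round each weight to the nearest multiple of `step`; return integer codes."""
    return [round(w / step) for w in weights]


def compressed_size_bytes(weights, step=0.05):
    """Crude proxy for Kolmogorov complexity: zlib size of the quantized weights.

    This is only an upper bound up to constants; the true quantity is uncomputable.
    """
    codes = quantize(weights, step)
    # Clamp codes to the signed 8-bit range so each weight becomes one byte.
    raw = bytes(max(-128, min(127, c)) + 128 for c in codes)
    return len(zlib.compress(raw, 9))


def regularization_mix(step_idx, total_steps, switch_fraction=0.5):
    """Hypothetical schedule: weight on the complexity penalty.

    Stays at 0 until the switch point, then ramps linearly to 1.
    """
    switch = switch_fraction * total_steps
    if step_idx <= switch:
        return 0.0
    return min(1.0, (step_idx - switch) / (total_steps - switch))


rng = random.Random(0)
# Dense Gaussian weights: little structure for the compressor to exploit.
dense = [rng.gauss(0.0, 1.0) for _ in range(4096)]
# The same weights with roughly 90% zeroed out, as after pruning.
sparse = [w if rng.random() < 0.1 else 0.0 for w in dense]
# A highly regular pattern: a very short "program" describes it.
structured = [0.5 if i % 2 == 0 else -0.5 for i in range(4096)]

for name, w in [("dense", dense), ("sparse", sparse), ("structured", structured)]:
    print(name, compressed_size_bytes(w))
```

On this toy data the dense weights should compress worst and the alternating pattern best, which is the ordering one would want from a simplicity measure. A learned predictor of the weights, as suggested in point 2, would replace zlib here.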
