
Research Report: Alternative sparsity methods for sparse autoencoders with OthelloGPT. 2024-06-14T00:57:29.407Z


Comment by Andrew Quaisley on Research Report: Alternative sparsity methods for sparse autoencoders with OthelloGPT. · 2024-06-15T18:38:43.223Z · LW · GW

Oh, yeah, looks like with  this is equivalent to Hoyer-Square.  Thanks for pointing that out; I didn't know this had been studied previously.
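
For anyone reading along, here is a minimal sketch of Hoyer-Square as it is usually defined: the squared L1 norm of a vector divided by its squared L2 norm. The function name `hoyer_square` and the `eps` stabilizer are my own assumptions for illustration and are not taken from the post's code.

```python
import numpy as np


def hoyer_square(x, eps=1e-12):
    """Hoyer-Square sparsity measure: (sum |x_i|)^2 / sum x_i^2.

    `eps` is an assumed stabilizer that avoids division by zero
    when x is the all-zero vector.
    """
    x = np.asarray(x, dtype=float)
    l1 = np.abs(x).sum()        # L1 norm
    l2_sq = np.square(x).sum()  # squared L2 norm
    return l1 ** 2 / (l2_sq + eps)


# Illustrative inputs: a one-hot vector and a uniform vector.
sparse_value = hoyer_square([0.0, 0.0, 3.0, 0.0])
dense_value = hoyer_square([1.0, 1.0, 1.0, 1.0])
```

The measure is scale-invariant (up to `eps`), is about 1 for a vector with a single nonzero entry, and is about n for a length-n vector whose entries all have equal magnitude, so minimizing it pushes activations toward sparsity.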

And you're right, that was a typo, and I've fixed it now.  Thank you for mentioning that!

Comment by Andrew Quaisley on Research Report: Alternative sparsity methods for sparse autoencoders with OthelloGPT. · 2024-06-15T18:27:32.406Z · LW · GW

I did not use your initialization scheme, since I was unaware of your paper at the time I was running those experiments.  I will definitely try that soon!

Yeah, I can see how leaky topk and multi-topk are doing similar things.  I wonder if leaky topk also gives a progressive code past the value of k used in training.  That definitely seems worth looking into.  Thanks for the suggestions!