Comments

Comment by J L (jefferson-lee) on Understanding “Deep Double Descent” · 2022-03-16T07:28:02.243Z · LW · GW

Apologies if this is obvious, but why the focus on SGD? I'm assuming "SGD" isn't meant as shorthand for optimization algorithms in general, given the emphasis on SGD's specific inductive bias, but the Deep Double Descent paper mentions that the phenomena hold across most natural choices of optimizer.