nonveumann's Shortform

post by nonveumann · 2023-08-16T03:04:24.167Z · LW · GW · 1 comments

comment by nonveumann · 2023-08-16T03:04:24.269Z · LW(p) · GW(p)

A kernel of an idea that might inspire someone who knows more than I do.

Assuming that weights which have “grokked” a task are more interpretable, is there any use in modifying the loss function to make grokking more likely? Perhaps by making the loss path-dependent, i.e. dependent on the history of the weight updates themselves?
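
To make "path-dependent" concrete, here is a minimal sketch of one possible reading, not a claim that it does anything for grokking. It assumes a toy 1-D linear regression and adds a hypothetical penalty `lam * ||w - ema_w||^2`, where `ema_w` is an exponential moving average of the weights visited so far. The names `train`, `lam`, `beta` and `ema_w` are all made up for illustration. Because `ema_w` depends on the whole trajectory, the same weights can get a different loss depending on how training arrived there.

```python
def task_loss_grad(w, data):
    """MSE of the toy model y = w[0] * x + w[1], plus its gradient."""
    n = len(data)
    loss, grad = 0.0, [0.0, 0.0]
    for x, y in data:
        err = w[0] * x + w[1] - y
        loss += err * err / n
        grad[0] += 2 * err * x / n
        grad[1] += 2 * err / n
    return loss, grad


def train(data, steps=2000, lr=0.05, lam=0.1, beta=0.9):
    """Gradient descent on task loss + a (hypothetical) path-dependent penalty."""
    w = [0.0, 0.0]
    ema_w = list(w)  # summary of the path taken so far
    history = []
    for _ in range(steps):
        loss, grad = task_loss_grad(w, data)
        # Path-dependent term: distance from the running average of past weights.
        # ema_w is treated as a constant when differentiating.
        penalty = lam * sum((wi - ei) ** 2 for wi, ei in zip(w, ema_w))
        history.append(loss + penalty)
        for i in range(len(w)):
            w[i] -= lr * (grad[i] + 2 * lam * (w[i] - ema_w[i]))
        # Fold the new weights into the trajectory summary.
        ema_w = [beta * e + (1 - beta) * wi for e, wi in zip(ema_w, w)]
    return w, history


# Toy data generated from y = 2x + 1.
data = [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, history = train(data)
```

In this toy setup the penalty vanishes once the weights stop moving, so it changes the route taken rather than the final minimum. Whether any term of this kind would make grokking more likely in a real network is exactly the open question.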