Posts
Incidental polysemanticity
2023-11-15T04:00:00.000Z
Comments
Comment by Kushal Thaman (Kushal_Thaman) on Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping · 2024-01-03T06:28:25.554Z
Thanks for the post! Do you think there is an amount of pretraining you can do such that no fine-tuning (on a completely non-complementary task, away from the pretrained distribution, say) will push the model out of that loss basin? In other words, a 'point of no return' such that even for very large learning rates and amounts of fine-tuning you will get a network that is still linearly mode connected (LMC) to the pretrained model?
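For concreteness, here is a minimal sketch (in PyTorch) of the linear-interpolation check that "still LMC" refers to: evaluate loss along the straight line between the pretrained and fine-tuned weights and see whether a barrier appears. The names `model_a`, `model_b`, and `eval_loss` are hypothetical placeholders, not anything from the original post.

```python
import copy
import torch


def interpolate_state_dicts(state_a, state_b, alpha):
    """Parameter-wise (1 - alpha) * state_a + alpha * state_b."""
    out = {}
    for k in state_a:
        if torch.is_floating_point(state_a[k]):
            out[k] = (1.0 - alpha) * state_a[k] + alpha * state_b[k]
        else:
            # Leave integer buffers (e.g. step counters) untouched.
            out[k] = state_a[k]
    return out


def linear_mode_connectivity_curve(model_a, model_b, eval_loss, n_points=11):
    """Evaluate loss along the straight line between two models' weights.

    A roughly flat, low-loss curve suggests the two models are still linearly
    mode connected; a pronounced loss barrier suggests fine-tuning has left
    the pretraining basin.
    """
    state_a = copy.deepcopy(model_a.state_dict())
    state_b = copy.deepcopy(model_b.state_dict())
    probe = copy.deepcopy(model_a)  # reusable container for interpolated weights

    losses = []
    for i in range(n_points):
        alpha = i / (n_points - 1)
        probe.load_state_dict(interpolate_state_dicts(state_a, state_b, alpha))
        with torch.no_grad():
            # eval_loss is a user-supplied evaluation on held-out data.
            losses.append(eval_loss(probe))
    return losses
```

The question above is then whether, past some amount of pretraining, this curve stays barrier-free no matter how aggressively you fine-tune.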