Comments

Comment by zookini on Redundant Attention Heads in Large Language Models For In Context Learning · 2024-11-18T01:33:47.382Z · LW · GW

each previous example confirming the pattern attends to the th example, and adds a constant update term (equivalent to ) to the th example.

Nit: should be \log(c) - norm

Comment by zookini on The Best Tacit Knowledge Videos on Every Subject · 2024-04-14T09:58:22.209Z · LW · GW

Tacit knowledge videos for CAD modelling:
https://www.youtube.com/playlist?list=PLzMIhOgu1Y5fwotlIEKNnuIXcEbVIZ7Qm