Comment by shen yue (shen-yue) on "Induction heads - illustrated" · 2023-08-26T08:11:21.175Z
Thanks for your hard work. I wonder why, in the layer 0 attention head, the positions of the query and value are 1?