Comments

Comment by shen yue (shen-yue) on Induction heads - illustrated · 2023-08-26T08:11:21.175Z · LW · GW

Thanks for your hard work. I wonder why, in the layer 0 attention head, the positions of the query and the value are 1?