Posts
A Sober Look at Steering Vectors for LLMs
2024-11-23T17:30:00.745Z
Dima's Shortform
2024-08-22T14:49:00.960Z
Comments
Comment by Dmitrii Krasheninnikov (dmitrii-krasheninnikov) on Meta learning to gradient hack · 2022-07-06T16:31:10.770Z · LW · GW
Could you please share the results if you ended up finishing those experiments?