Posts

Auto-matching hidden layers in Pytorch LLMs 2024-02-19T12:40:20.442Z

Comments

Comment by chanind on Implementing activation steering · 2024-02-11T15:49:46.267Z · LW · GW

I'd also like to humbly submit the Steering Vectors Python library to the list. We built this library on PyTorch hooks, similar to Baukit, but with the goal that it should work out of the box on any LLM on Hugging Face. It differs from some of the other libraries in that it doesn't need a special wrapper class and instead works directly with a Hugging Face model and tokenizer. It's also more narrowly focused on steering vectors than some of the other libraries.
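For anyone unfamiliar with the hook-based approach, here is a minimal sketch of the general idea using a plain PyTorch forward hook on a toy model. This is not the Steering Vectors library's API: the `apply_steering` helper, the toy model, and the all-ones vector are hypothetical names and values made up for illustration. The only real PyTorch call it relies on is `register_forward_hook`.

```python
from contextlib import contextmanager

import torch
from torch import nn


@contextmanager
def apply_steering(layer: nn.Module, vector: torch.Tensor, scale: float = 1.0):
    """Hypothetical helper: temporarily add `scale * vector` to a layer's output."""

    def hook(module, inputs, output):
        # Some layers return a tuple whose first element is the hidden state,
        # so handle both the tuple and the plain-tensor case.
        if isinstance(output, tuple):
            return (output[0] + scale * vector,) + output[1:]
        # Returning a value from a forward hook replaces the layer's output.
        return output + scale * vector

    handle = layer.register_forward_hook(hook)
    try:
        yield
    finally:
        # Always detach the hook so the model is left unmodified afterwards.
        handle.remove()


# Toy stand-in for an LLM: the first Linear layer plays the "hidden layer".
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
x = torch.randn(1, 4)
steering_vector = torch.ones(4)  # illustrative only, not a learned vector

baseline = toy_model(x)
with apply_steering(toy_model[0], steering_vector, scale=2.0):
    steered = toy_model(x)
restored = toy_model(x)  # hook removed, so this matches `baseline`
```

Because the hook attaches to an ordinary `nn.Module`, nothing about the model itself has to be wrapped or subclassed, which is the property that lets this style of approach work directly on an existing model object.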