Posts

Intricacies of Feature Geometry in Large Language Models 2024-12-07T18:10:51.375Z
The Geometry of Feelings and Nonsense in Large Language Models 2024-09-27T17:49:27.420Z

Comments

Comment by 7vik (satvik-golechha) on The Geometry of Feelings and Nonsense in Large Language Models · 2024-10-13T10:05:37.653Z · LW · GW

Thanks a lot! We had an email exchange with the authors and they shared some updated results with much better random shuffling controls on the WordNet hierarchy.

They also argue that some contexts should promote the likelihood of both "sad" and "joy", since the two concepts are causally separable, so the concepts should not be expected to be anti-correlated per se under their causal inner product. We’re still concerned about what this means for semantic steering.

Comment by 7vik (satvik-golechha) on The Geometry of Feelings and Nonsense in Large Language Models · 2024-09-29T14:08:15.088Z · LW · GW

I agree. Yes - would be happy to chat and discuss more. Sending you a DM.

Comment by 7vik (satvik-golechha) on The Geometry of Feelings and Nonsense in Large Language Models · 2024-09-29T13:24:29.678Z · LW · GW

They use a WordNet hierarchy to verify their orthogonality results at scale, but it doesn't look like they run any other shuffle controls.
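
For concreteness, here is a rough sketch of what a shuffle control for an orthogonality claim could look like. It is not the paper's procedure: it uses plain Euclidean cosine rather than their causal inner product, it treats the mean embedding of a word group as that concept's direction, and the function names (`mean_abs_cosine`, `shuffle_control`, `make_toy_data`) and the synthetic data are all made up for illustration.

```python
import numpy as np


def mean_abs_cosine(emb, labels):
    """Mean |cosine| between the mean embeddings of every pair of groups."""
    groups = np.unique(labels)
    # One direction per group: the (normalized) mean of its members.
    means = np.stack([emb[labels == g].mean(axis=0) for g in groups])
    means = means / np.linalg.norm(means, axis=1, keepdims=True)
    cos = means @ means.T
    upper = np.triu_indices(len(groups), k=1)  # each pair once, no diagonal
    return float(np.abs(cos[upper]).mean())


def shuffle_control(emb, labels, n_shuffles=200, seed=0):
    """Compare the observed statistic with the same statistic under shuffled labels."""
    rng = np.random.default_rng(seed)
    observed = mean_abs_cosine(emb, labels)
    # Permuting labels keeps group sizes but breaks the word-to-concept mapping.
    null = np.array(
        [mean_abs_cosine(emb, rng.permutation(labels)) for _ in range(n_shuffles)]
    )
    return observed, null


def make_toy_data(n_groups=3, per_group=50, dim=16, seed=1):
    """Synthetic stand-in for embeddings: each group is offset along its own axis."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_groups), per_group)
    emb = rng.normal(size=(n_groups * per_group, dim))
    emb[np.arange(len(labels)), labels] += 5.0
    return emb, labels


if __name__ == "__main__":
    emb, labels = make_toy_data()
    observed, null = shuffle_control(emb, labels)
    print("observed:", observed)
    print("shuffled mean:", null.mean())
```

The point of the comparison is that an orthogonality result only says something about the concept grouping if the shuffled groupings look clearly different from the real one. If random groupings come out just as orthogonal, the result may mostly reflect the geometry of the space.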

Comment by 7vik (satvik-golechha) on The Geometry of Feelings and Nonsense in Large Language Models · 2024-09-29T13:18:25.945Z · LW · GW

Thanks @TomasD, that's interesting! I agree - most words in my random list seem like random "objects/things/organisms" so there might be some conditioning going on there. Going over your code to see if there's something else that's different.