Posts

Toward Safety Case Inspired Basic Research 2024-10-31T23:06:32.854Z
A Universal Emergent Decomposition of Retrieval Tasks in Language Models 2023-12-19T11:52:27.354Z
Basic Facts about Language Model Internals 2023-01-04T13:01:35.223Z
Re-Examining LayerNorm 2022-12-01T22:20:23.542Z
Interpreting Neural Networks through the Polytope Lens 2022-09-23T17:58:30.639Z

Comments

Comment by Eric Winsor (EricWinsor) on SAE feature geometry is outside the superposition hypothesis · 2024-06-24T18:57:15.170Z · LW · GW

This reminded me of how GPT-2-small uses a cosine/sine spiral for its learned positional embeddings. I don't think I've seen a mechanistic/dynamical explanation for why this arises (just the post-hoc explanation that attention can use cosine similarity to encode distance in R^n, which shows that such a solution works, not that training should find it).
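
To make the post-hoc explanation concrete, here is a minimal sketch of why a cosine/sine spiral lets cosine similarity encode distance. It uses a synthetic embedding, not GPT-2-small's actual learned weights; the frequencies in `FREQS` and the helper names are assumptions chosen purely for illustration.

```python
import math

# Hypothetical frequencies, chosen only for illustration (not taken from GPT-2-small).
FREQS = [0.05, 0.11, 0.23]

def spiral_embedding(pos, freqs=FREQS):
    """Map a position to concatenated (cos, sin) pairs, one pair per frequency."""
    vec = []
    for w in freqs:
        vec.extend((math.cos(w * pos), math.sin(w * pos)))
    return vec

def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_at_offset(start, offset):
    """Cosine similarity between positions `start` and `start + offset`."""
    return cosine_similarity(spiral_embedding(start),
                             spiral_embedding(start + offset))

# Because cos(wp)cos(wq) + sin(wp)sin(wq) = cos(w(p - q)), the similarity
# depends only on the offset between positions, not on where they sit.
near_start = similarity_at_offset(0, 7)
far_along = similarity_at_offset(100, 7)
```

In this toy setup `near_start` and `far_along` agree up to floating-point error, and both equal the mean of cos(w * 7) over the frequencies. That only shows the spiral is a workable encoding of relative distance; it says nothing about why training dynamics would arrive at it.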

Comment by Eric Winsor (EricWinsor) on Formalization as suspension of intuition · 2022-12-12T16:00:05.356Z · LW · GW

I like this perspective! The idea of formalization as suspension of intuition reminds me of the story of the "Gruppenpest" in the development of quantum mechanics. The abstraction of groups (as well as representations and matrices) was seen by many as non-physical and unintuitive. But it turned out the resulting abstractions of gauge theories and symmetries were more fundamental objects than their predecessors.[1][2][3][4]

It also reminds me of a view I've been told many times that mathematical formalization/modeling is the process of forgetting details about a problem until its essential character is laid bare. I think it's important to emphasize that formalization is only a partial suspension or redirection of intuition (which seems to be what Bachelard is actually implying), since the goal of formalization typically isn't to turn the process into something doable by a mechanical proof checker. Formalization removes all the "distractions" that block you from seeing underlying regularities, but you still need to leverage your intuition to get a grasp on those regularities. As you say in the post:

What this suspension gives us is a place to explore the underlying relationships and properties without the tyranny of immediate experience. Thus delivered from the “obvious”, we can unearth new patterns and structures that in turn alter our intuitions themselves!

 

  1. ^ https://www.researchgate.net/publication/234207946_From_the_Rise_of_the_Group_Concept_to_the_Stormy_Onset_of_Group_Theory_in_the_New_Quantum_Mechanics_A_saga_of_the_invariant_characterization_of_physical_objects_events_and_theories

  2. ^ https://ncatlab.org/nlab/show/Gruppenpest

  3. ^ https://hsm.stackexchange.com/questions/170/how-did-group-theory-enter-quantum-mechanics

  4. ^ https://www.math.columbia.edu/~woit/wordpress/?p=191

Comment by Eric Winsor (EricWinsor) on Re-Examining LayerNorm · 2022-12-02T00:47:27.591Z · LW · GW

Thanks for the catch!