A day in the life of a mechanistic interpretability researcher

post by Bill Benzon (bill-benzon) · 2023-11-28T14:45:17.967Z · LW · GW · 3 comments

comment by Ben Pace (Benito) · 2023-11-28T18:40:33.915Z · LW(p) · GW(p)

That was fun to watch. But I would appreciate someone spelling out the implied connection to mechanistic interpretability.

Replies from: joyee-chen
comment by Joyee Chen (joyee-chen) · 2023-11-28T18:53:10.007Z · LW(p) · GW(p)

Hint: does Charlie Chaplin have a gears-level understanding of the system?

Replies from: bill-benzon
comment by Bill Benzon (bill-benzon) · 2023-11-28T19:20:16.805Z · LW(p) · GW(p)

LOL! Plus he's clearly lost in a vast system he can't comprehend. How do you comprehend a complex network of billions upon billions of weights? Is there any way you can get on top of the system to observe its operations, to map them out?