Comments

Comment by mbissell (mbiss) on The Most Forbidden Technique · 2025-03-12T21:58:32.439Z · LW · GW

Training on CoT traces seems like a particular instance of a general class of "self-defeating strategies." Other examples include antibiotics/bacterial resistance (treating bacterial infections creates selective pressure that promotes resistant bacterial populations, gradually rendering the antibiotics ineffective for future use) and the dilemma in The Imitation Game after Turing and his team have cracked Enigma (acting upon the deciphered messages would tip off the Nazis and remove the Allies' informational advantage).

Comment by mbissell (mbiss) on Attribution-based parameter decomposition · 2025-03-09T17:06:26.451Z · LW · GW

Really cool work!

Would it be accurate to say that MoE models are an extremely coarse form of parameter decomposition? They check the box for faithfulness, and they're an extreme example of optimizing minimality (each input x only uses one component of the model if you define each expert as a component) while completely disregarding simplicity.
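
To make the "one component per input" framing concrete, here is a minimal sketch of top-1 routing. It assumes a toy linear gate and two hand-written experts; the names `top1_moe`, `gate_weights`, and `experts` are hypothetical and not taken from the paper, and real MoE layers often route to more than one expert per input.

```python
# Toy top-1 mixture-of-experts layer (illustrative only; all names are hypothetical).
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top1_moe(
    x: Vector,
    gate_weights: List[Vector],
    experts: List[Callable[[Vector], List[float]]],
) -> Tuple[int, List[float]]:
    """Route x to the single expert with the highest gate score.

    Only the chosen expert runs, so each input uses exactly one
    "component" if every expert is counted as a component.
    """
    scores = [dot(w, x) for w in gate_weights]
    chosen = max(range(len(experts)), key=lambda i: scores[i])
    return chosen, experts[chosen](x)


# Two toy experts: one doubles the input, the other negates it.
experts = [
    lambda x: [2 * v for v in x],
    lambda x: [-v for v in x],
]
# Assumed gate: expert 0 scores the first coordinate, expert 1 the second.
gate_weights = [[1.0, 0.0], [0.0, 1.0]]

idx, out = top1_moe([3.0, 1.0], gate_weights, experts)
```

Under this framing minimality is maximal (one active component per input), but nothing constrains how complicated each expert is internally, which is the sense in which simplicity is ignored.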

Comment by mbissell (mbiss) on What Goes Without Saying · 2024-12-31T18:02:22.987Z · LW · GW
Comment by mbissell (mbiss) on Understanding Shapley Values with Venn Diagrams · 2024-12-30T18:53:38.617Z · LW · GW
Comment by mbissell (mbiss) on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-12-08T17:53:48.559Z · LW · GW

“Living willpower” is willpower that is consciously understood as a bet on unknowns: “I don’t know whether this project will pay off, but I am betting my finite credibility on it anyhow.”

This feels related to the idea of Slack that Scott Alexander writes about on SSC.

He gives this example towards the end:

7. Ideas. These are in constant evolutionary competition – this is the insight behind memetics. The memetic equivalent of slack is inferential range, aka “willingness to entertain and explore ideas before deciding that they are wrong”.

Willpower and money are both ways to create slack for yourself and others so that you can explore ideas/projects with an uncertain payoff.