Posts
How Logic "Really" Works: An Engineering Perspective
2025-04-16T05:34:09.443Z
FlexChunk: Enabling 100M×100M Out-of-Core SpMV (~1.8 min, ~1.7 GB RAM) with Near-Linear Scaling
2025-04-06T05:27:06.271Z
Formal Proof: O(n) Is a Cognitive Illusion
2025-03-28T18:26:05.111Z
Comments
Comment by Daniil Strizhov (mila-dolontaeva) on Tracing the Thoughts of a Large Language Model · 2025-03-28T05:17:48.565Z
The poetry case really stuck with me. Claude’s clearly planning rhymes ahead, which already cracks the “just next-token” intuition about autoregressive models. But maybe it’s more than a neat trick. What if this spatial planning is a core capability—like the model’s not just unrolling a string, but navigating a conceptual space toward a target? One could test this by checking how often similar planning circuits pop up in multi-step reasoning tasks. If it’s building a rough “mental map” of where it wants to land, that might explain why bigger context windows boost reasoning so much. Not just more data—more room to plan. Has anyone tried prompting or tracing for this directly?
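One cheap way to probe the "planning ahead" hypothesis without full circuit tracing: after the first line of a couplet, score candidate line-ending words at the line break, before the second line is generated, and see whether rhyming targets already stand out. The sketch below is hypothetical — `rhyme_commitment` and `stub_score` are names I made up, and `score_fn` stands in for a real LM query (e.g. next-token log-probs from a model API); the stub just shows the shape of the experiment.

```python
# Hypothetical probe: does the model "commit" to a rhyme target early?
# We score candidate second-line ending words at the position right after
# the first line's newline. In a real run, score_fn would query an LM for
# the log-probability it assigns each candidate; here a stub stands in.

def rhyme_commitment(prompt, candidates, score_fn):
    """Rank candidate line-ending words by the model's score at the break."""
    scores = {word: score_fn(prompt, word) for word in candidates}
    return sorted(scores, key=scores.get, reverse=True)

# Stub "model" that prefers words rhyming with the first line's ending
# ("grab it" -> "-abbit"), mimicking the planned-rhyme behavior.
def stub_score(prompt, word):
    return 1.0 if word.endswith("abbit") else 0.0

ranking = rhyme_commitment(
    "He saw a carrot and had to grab it,\n",
    ["light", "table", "rabbit", "garden"],
    stub_score,
)
# With a real model, a rhyme word ranking first *at the line break*
# would be evidence of a target chosen before the line is written.
```

If the rhyme target reliably tops the ranking at the break across many couplets, that is consistent with the "navigating toward a target" picture rather than pure left-to-right unrolling.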