Speculations on Sleep in an Age of Machines
post by cleanwhiteroom · 2024-08-29T13:42:54.548Z · LW · GW
One-liner: Must androids dream of electric sheep?
Three-liner: In a thinking network, a vast web of microstates underpins the system's macrostates. These microstates are expensive to maintain. Processes that locally reduce or collapse informational entropy may be important for thinking things to learn and function.
Entropy in Language and Thought
Entropy has a foot in the physical and informational worlds. Because the human brain is a physical information processor, multiple subtypes of entropy apply to it.
1) Physical entropy is a measure of the disorder or randomness in a physical system; it quantifies the number of microscopic configurations compatible with a system's macroscopic state, and it famously increases over time.
2) Informational entropy is the average amount of information in a dataset and measures the unpredictability or randomness of information content.
3) Neural entropy measures the information-processing capacity of neural systems and is used to quantify the repertoire of neural states a given brain can access.
4) Semantic entropy refers to the ambiguities in meaning (microstates) that a given word (macrostate) conveys.
Within their respective domains, all forms of entropy deal with uncertainty, randomness, disorder and the number of microstates compatible with a given macrostate.
For a more rigorous treatment, see Introduction to abstract entropy [LW · GW]. For a more informal metaphor, consider all the ways you’ve ever answered (or been tempted to answer) the high-ambiguity query: “How are you?” You assess recent events and internal self-states, and you weigh your social position relative to your questioner, before replying, “I’m fine.” This scenario contains all four types of entropy discussed above. Physical entropy is present in the heat generated by neuronal firing. Informational entropy varies with the depth of the thought content that sums into your reply. Neural entropy scales with the repertoire of firing configurations of the circuits holding the events of your day. Semantic entropy is a property of the language you use. (And what does “fine” really mean, anyway?)
The Entropy of Thinking
Biological systems locally reverse entropy to maintain their functional patterns against the metaphorical (and literal) friction of living. They're open systems that take in matter and energy, metabolize it into usable biologic currency, perform work, then return that matter and energy (in an altered state) to the environment. This is how biological organisms maintain their order.
This is how we live. It's also how we think.
Neuroscience deals with physical and informational entropy. Informational entropy has been applied to neuroscience in multiple contexts, including as a measure of information processing and the repertoire of brain states. This paper reviews the topic in detail.
Qualitatively, we can grasp the concept of macrostates versus microstates within our own mind. Consider a simple morning greeting:
Me: Good morning, how are you?
You (state 1): I slept terribly.
You (state 2): I’m drinking my favorite coffee.
You (state 3): My commute was terrible.
You (state 4): I got a raise yesterday!
You (state 5): I’m worried about my mom’s upcoming health test.
You (state 6): I have a lot on my schedule today.
You (state 7): Rings of Power comes out tonight; I love that show!
You (summing all above states simultaneously): I’m fine, thanks.
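As a toy way to put numbers on that collapse, here's a minimal Python sketch with invented probabilities over which of those seven microstates you'd actually report, versus the near-certain “I’m fine” macrostate. The probabilities are made up for illustration.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented distribution over which of the seven microstates you'd actually report.
candid_replies = [0.15, 0.10, 0.10, 0.05, 0.20, 0.20, 0.20]   # states 1-7
# The macrostate: you almost certainly just say "I'm fine, thanks."
polite_macrostate = [0.99, 0.01]

print(round(shannon_entropy(candid_replies), 2))     # ~2.7 bits of uncertainty
print(round(shannon_entropy(polite_macrostate), 2))  # ~0.08 bits: the collapse to "fine"
```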
This also works when we get more granular. Let’s look at state 1 from the list above (you, sleeping terribly). That state itself generalizes a host of data inputs. Maybe you use a sleep tracker, which gives you some quantitative measures, but even if you don’t, the amount of deep sleep you get, the ratio of NREM to REM, the total sleep quantity, the qualitative feel of your dreams, the alacrity of your mind upon waking…all of these are variables that will sum into the idea that your sleep was “terrible.”
Your brain processes and summarizes vast quantities of information, and it does so with neurons that fire (all or none) in response to differently weighted thresholds. As you move through your day, circuit networks and synaptic architectures fire actively to model what's happening to you and to all the things you’re attending to. At the end of the day, you'll finally watch the series premiere of Rings of Power. All your knowledge of J.R.R. Tolkien's extensive writings will open up and fire to the best of its ability as you model how Sauron the Deceiver is likely to execute his plans.
As you prepare for bed, your neural network contains the running vastness of your day. All that's happened within its span is accessible. Recalling what you had for breakfast may be difficult, but it’s possible. High salience circuits fire as you contemplate the fate of Middle Earth in glorious detail. Or, perhaps, over and over, your brain presents you with negative information you’d rather not think about.
Your brain has a huge amount of entropy to offload.
And so, you turn your consciousness off and sleep.
Sleep and Memory as Dimensionality Reduction
Sleep scientists love to talk about how mysterious sleep is. But we do know a few things about it: 1) without sleep, animals die, 2) body temperature drops during sleep, especially the deep kind, 3) it’s important for memory consolidation, and 4) it’s critical for the maintenance and execution of homeostatic mechanisms that restore the body. I went looking for an explicit proposal that the purpose of sleep is entropy reduction and I found one. It's from 2024, published in Medical Hypotheses, and it's called: The fundamental role of sleep is the reduction of thermodynamic entropy of the central nervous system.
The four known points about sleep I list above are entangled with the claim that sleep reduces entropy, but they’re not direct evidence. Animals can die from many things and body temperature regulation is far from straightforward (a preliminary dive on this revealed a chasm).
Memory, though. That one’s interesting.
Memory is a dimensionality reducer. Memory is firing up images of Sauron’s campaign (dastardly, conniving, involving jewelry). Experience is watching it play out onscreen, your synaptic patterns blazing, your whole visual field riveted, your auditory cortex parsing music and speech and sound, your brain wondering when exactly the treacherous turn will come.
Memory is less expensive for the brain. The number of synapses firing, the neuronal microstates—they’ve been curated. The phase-space dimensionality of Sauron is still accessible, but taking up less volume. The entropy of the Dark Lord has been reduced.
And for this, sleep is critical.
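To be concrete about what I mean by "dimensionality reduction" (as an illustration, not a model of the brain), here's a toy numpy sketch: a high-dimensional record of an evening, secretly driven by a handful of themes, compresses into a few principal components that keep most of the structure. All the dimensions and values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Experience": 1000 moments of an evening, each a 512-dimensional state
# that secretly varies along only a few underlying themes.
themes = rng.normal(size=(1000, 5))           # a handful of latent storylines
mixing = rng.normal(size=(5, 512))            # how the themes show up in the raw state
experience = themes @ mixing + 0.1 * rng.normal(size=(1000, 512))

# "Memory": keep only the top principal components of the night.
centered = experience - experience.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

k = 5
memory = centered @ components[:k].T          # 1000 x 5 instead of 1000 x 512

variance = singular_values ** 2
print(f"variance kept by {k} components: {variance[:k].sum() / variance.sum():.1%}")
# Nearly all of the structure, at about 1% of the dimensionality.
```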
Performance Degradation in LLMs over Long Context Windows
LLM performance degrades over long context windows. I’ve noticed this myself, but it’s also been described formally. Companies that provide access to LLMs limit context windows; as I understand it, the formal reason is that self-attention in transformer-based models incurs a computational cost that grows quadratically with sequence length. It costs more to run a long context window. And when you really push those limits, performance degrades.
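For concreteness, here's a minimal numpy sketch of naive single-head self-attention (illustrative only, with invented dimensions, not any particular model's implementation). The n × n score matrix is where the quadratic cost lives.

```python
import numpy as np

def naive_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of shape (n, d).
    The (n, n) score matrix is the quadratic term in memory and compute."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])          # shape (n, n): n^2 entries
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                               # shape (n, d)

rng = np.random.default_rng(0)
d = 64
w_q, w_k, w_v = [rng.normal(size=(d, d)) for _ in range(3)]
out = naive_self_attention(rng.normal(size=(8, d)), w_q, w_k, w_v)  # fine at n = 8

for n in (1_000, 10_000, 100_000):
    print(f"n = {n:>7,}  score-matrix entries = {n * n:,}")
# 10x the context length means 100x the attention scores to compute and hold.
```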
I speculate this degradation is fundamentally related to informational entropy. To the extent that neural networks and deep learning are based on what we know of our own neuronal organization, sleep (a fundamental requirement of our nervous system) may have a computational equivalent, especially if sleep is a mechanism for entropy reduction.
In terms of actual experimental data, I’m aware of one thing that directly informs this argument: a recent Nature paper finds high semantic entropy associated with confabulatory behavior in LLMs. LLM confabulations are a subset of hallucinations in which claims are fluent, wrong, and arbitrary. We've all heard stories of highly plausible ChatGPT outputs with no basis in reality being submitted in high-profile legal cases, for example. Interestingly, this phenomenon has a human correlate, and a rather famous one: the medical condition of Korsakoff's syndrome. Korsakoff's syndrome features fluent confabulatory explanations for contexts the patient can't recall entering. In humans, this quality emerges from a memory error. And memory, as I’ve argued above, is a dimensionality reducer.
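A rough sketch of the semantic-entropy idea: sample several answers, cluster them by meaning, and take the entropy over meaning-clusters rather than over exact strings. The same_meaning function below is a crude placeholder; as I understand it, the paper's actual pipeline uses language-model entailment checks to decide whether two answers mean the same thing.

```python
import math

def semantic_entropy(samples, same_meaning):
    """Cluster sampled answers by meaning, then take Shannon entropy over clusters."""
    clusters = []                          # each cluster is a list of samples
    for s in samples:
        for c in clusters:
            if same_meaning(s, c[0]):
                c.append(s)
                break
        else:
            clusters.append([s])
    n = len(samples)
    return -sum((len(c) / n) * math.log2(len(c) / n) for c in clusters)

# Placeholder for a real entailment / paraphrase check.
def same_meaning(a, b):
    return a.lower().strip(".! ") == b.lower().strip(".! ")

confident = ["Paris"] * 9 + ["Lyon"]
confabulating = ["Paris", "Lyon", "Marseille", "Nice", "Toulouse",
                 "Bordeaux", "Lille", "Nantes", "Strasbourg", "Rennes"]
print(round(semantic_entropy(confident, same_meaning), 2))       # low: answers agree
print(round(semantic_entropy(confabulating, same_meaning), 2))   # high: answers scatter
```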
Wading further into the speculative pool, there's also the phenomenon of catastrophic forgetting to consider. Catastrophic forgetting (also known as catastrophic interference) is the tendency of an artificial network to forget prior information upon learning new information. In a computational sense, this is thought to occur when weights within the hidden layers of a network change in the presence of new information. Put another way, catastrophic forgetting occurs when we re-use the network without a way to consolidate knowledge gains. Put a third way, the network subject to catastrophic forgetting had no way to consolidate the most important aspects of what it had learned and pattern them down for permanent storage. Put a fourth way, it couldn't sleep and its "thoughts" dissipated like the heat they became.
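A deliberately trivial stand-in for the fourth way of putting it (a toy, nothing like a production system): fit a small model by gradient descent on task A, then keep training the same weights on task B with no replay or consolidation of A, and watch the fit to A disappear.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(true_w):
    X = rng.normal(size=(200, 4))
    return X, X @ true_w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def train(w, X, y, steps=500, lr=0.05):
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

task_a = make_task(np.array([1.0, -2.0, 0.5, 3.0]))
task_b = make_task(np.array([-3.0, 1.0, 2.0, -0.5]))

w = np.zeros(4)
w = train(w, *task_a)
print("after A:  loss on A =", round(mse(w, *task_a), 4))

# Keep training the same weights on task B, with no replay of task A.
w = train(w, *task_b)
print("after B:  loss on A =", round(mse(w, *task_a), 4),
      " loss on B =", round(mse(w, *task_b), 4))
# The fit to task A is gone: the new gradients overwrote the old solution.
```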
The Hypothesis
A method for reducing local informational entropy is important for any thinking system. In humans, this method is sleeping. LLMs (or other thinking systems) require a computational equivalent in order to operate over long context windows.
Things This Hypothesis Predicts
- Hallucination problems in AI become worse as models become larger without explicit mechanisms for entropic reduction
- AGI/superintelligence will need mechanisms to reduce entropy within its cognitive architecture or suffer collapse and decoherence as it continues to run
- Vulnerabilities of intelligent systems to increasing entropy may have applications to superalignment
Entropic Attacks on Intelligent Systems
Humans, of course, already deliver entropic attacks without explicitly naming them. The clearest example is the use of sleep deprivation during interrogations as a form of torture. We do it to ourselves when we get too little sleep. Medicine, as a profession, habitually attacks its own practitioners in this way. And across the board, from academia to the military to corporate culture, there's real intellectual currency to be gained from having the neuronal bandwidth to be able to stand up well to sleep deprivation.
Similar attacks aimed at compromising processes that reduce entropy should be possible. Science fiction already has a metaphor for this: it’s Captain Kirk’s go-to strategy when he runs into a machine intelligence. Kirk spikes the gears of rigidly logical systems by forcing contemplation of a paradox. Such a tactic wouldn’t work against an intelligence with a sub-symbolic connectionist architecture, but, as a strategy, it’s a stone’s throw from deliberately targeting entropic reduction mechanisms.
Final Thoughts
Biology, despite every incentive, hasn't escaped the need for sleep. I've tried to lay out why I think sleep (or a functional equivalent) will be important for any system that attends to self, to others, to the world. The brain is a connectionist network, and sleep is critical to learning, to memory, to offloading the entropy that would degrade a running network over the short term.
I find it likely that not only can androids dream of electric sheep, they'll very much need to.