Eli's shortform feed
post by elityre
score: 31 (6 votes) ·
I'm mostly going to use this to crosspost links to my blog for less polished thoughts, Musings and Rough Drafts.
Comments sorted by top scores.
comment by elityre
· score: 31 (14 votes) · LW
New post: What is mental energy?
[Note: I’ve started a research side project on this question, and it is already obvious to me that this ontology is importantly wrong.]
There’s a common phenomenology of “mental energy”. For instance, if I spend a couple of hours thinking hard (maybe doing math), I find it harder to do more mental work afterwards. My thinking may be slower and less productive. And I feel tired, or drained (mentally, instead of physically).
Mental energy is one of the primary resources that one has to allocate, in doing productive work. In almost all cases, humans have less mental energy than they have time, and therefore effective productivity is a matter of energy management, more than time management. If we want to maximize personal effectiveness, mental energy seems like an extremely important domain to understand. So what is it?
The naive story is that mental energy is an actual energy resource that one expends and then needs to recoup. That is, when one is doing cognitive work, they are burning calories, depleting their body’s energy stores. As they use energy, they have less fuel to burn.
My current understanding is that this story is not physiologically realistic. Thinking hard does consume more of the body’s energy than baseline, but not that much more. And we experience mental fatigue long before we even get close to depleting our calorie stores. It isn’t literal energy that is being consumed. [The Psychology of Fatigue pg.27]
So if not that, what is going on here?
A few hypotheses:
(The first few are all of a cluster, so I labeled them 1a, 1b, 1c, etc.)
Hypothesis 1a: Mental fatigue is a natural control system that redirects our attention to our other goals.
The explanation that I’ve heard most frequently in recent years (since it became obvious that much of the literature on ego-depletion was off the mark), is the following:
A human mind is composed of a bunch of subsystems that are all pushing for different goals. For a period of time, one of these goal threads might be dominant. For instance, if I spend a few hours doing math, this means that my other goals are temporarily suppressed or on hold: I’m not spending that time seeking a mate, or practicing the piano, or hanging out with friends.
In order to prevent those goals from being neglected entirely, your mind has a natural control system that prevents you from focusing your attention on any one thing for too long: the longer you put your attention on something, the greater the build up of mental fatigue, causing you to do anything else.
Comments and model-predictions: This hypothesis, as stated, seems implausible to me. For one thing, it seems to suggest that all activities would be equally mentally taxing, which is empirically false: spending several hours doing math is mentally fatiguing, but spending the same amount of time watching TV is not.
This might still be salvaged if we offer some currency other than energy that is being preserved: something like “forceful computations”. But again, it doesn’t seem obvious why the computations of doing math would be more costly than those for watching TV.
Similarly, this model suggests that “a change is as good as a break”: if you switch to a new task, you should be back to full mental energy, until you become fatigued for that task as well.
Hypothesis 1b: Mental fatigue is the phenomenological representation of the loss of support for the winning coalition.
A variation on this hypothesis would be to model the mind as a collection of subsystems. At any given time, there is only one action sequence active, but that action sequence is determined by continuous “voting” by various subsystems.
Over time, these subsystems get fed up with their goals not being met, and “withdraw support” for the current activity. This manifests as increasing mental fatigue. (Perhaps your thoughts get progressively less effective, because they are interrupted, on the scale of micro-seconds, by bids to think something else).
Comments and model-predictions: This seems like it might suggest that if all of the subsystems have high trust that their goals will be met, that math (or any other cognitively demanding task) would cease to be mentally taxing. Is that the case? (Does doing math mentally exhaust Critch?)
This does have the nice virtue of explaining burnout: when some subset of needs are not satisfied for a long period, the relevant subsystems pull their support for all actions, until those needs are met.
[Is burnout a good paradigm case for studying mental energy in general?]
Hypothesis 1c: The same as 1a or 1b, but some mental operations are painful for some reason.
To answer my question above, one reason why math might be more mentally taxing than watching TV, is that doing math is painful.
If the process of doing math is painful on the micro-level, then even if all of the other needs are met, there is still a fundamental conflict between the subsystem that is aiming to acquire math knowledge, and the subsystem that is trying to avoid micro-pain on the micro-level.
As you keep doing math, the micro pain part votes more and more strongly against doing math, or the overall system biases away from the current activity, and you run out of mental energy.
Comments and model-predictions: This seems plausible for the activity of doing math, which involves many moments of frustration, which might be meaningfully micro-painful. But it seems less consistent with activities like writing, which phenomenologically feel non-painful. This leads to hypothesis 1d…
Hypothesis 1d: The same as 1c, but the key micro-pain is that of processing ambiguity second to second
Maybe the pain comes from many moments of processing ambiguity, which is definitely a thing that is happening in the context of writing. (I’ll sometimes notice myself try to flinch to something easier when I’m not sure which sentence to write.) It seems plausible that mentally taxing activities are taxing to the extent that they involve processing ambiguity, and doing a search for the best template to apply.
Hypothesis 1e: Mental fatigue is the penalty incurred for top down direction of attention.
Maybe consciously deciding to do things is importantly different from the “natural” allocation of cognitive resources. That is, your mind is set up such that the conscious, System 2, long term planning, metacognitive system, doesn’t have free rein. It has a limited budget of “mental energy”, which measures how long it is allowed to call the shots before the visceral, system 1, immediate gratification systems take over again.
Maybe this is an evolutionary adaptation? Perhaps the “really good” plans that monkeys made for how to achieve their goals never panned out for them, and the monkeys that were impulsive some of the time actually did better at the reproduction game?
(If this is the case, can the rest of the mind learn to trust S2 more, and thereby offer it a bigger mental energy budget?)
This hypothesis does seem consistent with my observation that rest days are rejuvenating, even when I spend my rest day working on cognitively demanding side projects.
Hypothesis 2: Mental fatigue is the result of the brain temporarily reaching knowledge saturation.
When learning a motor task, there are several phases in which skill improvement occurs. The first, unsurprisingly, is during practice sessions. However, one also sees automatic improvements in skill in the hours after practice [actually this part is disputed] and following a sleep period (academic link1, 2, 3). That is, there is a period of consolidation following a practice session. This period of consolidation probably involves the literal strengthening of neural connections, and encoding other brain patterns that take more than a few seconds to set.
I speculate, that your brain may reach a saturation point: more practice, more information input, becomes increasingly less effective, because you need to dedicate cognitive resources to consolidation. [Note that this is supposing that there is some tradeoff between consolidation activity and input activity, as opposed to a setup where both can occur simultaneously (does anyone have evidence for such a tradeoff?)].
If so, maybe cognitive fatigue is the phenomenology of needing to extract one’s self from a practice / execution regime, so that your brain can do post-processing and consolidation on what you’ve already done and learned.
Comments and model-predictions: This seems to suggest that all cognitively taxing tasks are learning tasks, or at least tasks in which one is encoding new neural patterns. This seems plausible, at least.
It also seems to naively imply that an activity will become less mentally taxing as you gain expertise with it, and progress along the learning curve. There is (presumably) much more information to process and consolidate in your first hour of doing math than in your 500th.
Hypothesis 3: Mental fatigue is a control system that prevents some kind of damage to the mind or body.
One reason why physical fatigue is useful is that it prevents damage to your body. Getting tired after running for a bit stops you from running all out for 30 hours at a time, and eroding your fascia.
By simple analogy to physical fatigue, we might guess that mental fatigue is a response to vigorous mental activity that is adaptive in that it prevents us from hurting ourselves.
I have no idea what kind of damage might be caused by thinking too hard.
I note that mania and hypomania involve apparently limitless mental energy reserves, and I think that these states are bad for your brain.
Hypothesis 4: Mental fatigue is a buffer overflow of peripheral awareness.
Another speculative hypothesis: Human minds have a working memory: a limit of ~4 concepts, or chunks, that can be “activated”, or operated upon in focal attention, at one time. But meditators, at least, also talk about a peripheral awareness: a sort of halo of concepts and sense impressions that are “loaded up”, or “near by”, or cognitively available, or “on the fringes of awareness”. These are all the ideas that are “at hand” to your thinking. [Note: is peripheral awareness, as the meditators talk about, the same thing as “short term memory”?]
Perhaps if there is a functional limit to the amount of content that can be held in working memory, there is a similar, if larger, limit to how much content can be held in peripheral awareness. As you engage with a task, more and more mental content is loaded up, or added to peripheral awareness, where it both influences your focal thought process, and/or is available to be operated on directly in working memory. As you continue the task, and more and more content gets added to peripheral awareness, you begin to overflow its capacity. It gets harder and harder to think, because peripheral awareness is overflowing. Your mind needs space to re-ontologize: to chunk pieces together, so that it can all fit in the same mental space. Perhaps this is what mental fatigue is.
Comments and model-predictions: This does give a nice clear account of why sleep replenishes mental energy (it both causes re-ontologizing, and clears the cache), though perhaps this does not provide evidence over most of the other hypotheses listed here.
Other notes about mental energy:
- In this post, I’m mostly talking about mental energy on the scale of hours. But there is also a similar phenomenon on the scale of days (the rejuvenation one feels after rest days) and on the scale of months (burnout and such). Are these the same basic phenomenon on different timescales?
- On the scale of days, I find that my subjective rest-o-meter is charged up if I take a rest day, even if I spend that rest day working on fairly cognitively intensive side projects.
- This might be because there’s a kind of new project energy, or new project optimism?
- Mania and hypomania entail limitless mental energy.
- People seem to be able to play video games for hours and hours without depleting mental energy. Does this include problem solving games, or puzzle games?
- Also, just because they can play indefinitely does not mean that their performance doesn’t drop. Does performance drop, across hours of playing, say, Snakebird?
- For that matter, does performance decline on a task correlate with the phenomenological “running out of energy”? Maybe those are separate systems.
comment by G Gordon Worley III (gworley)
· score: 5 (2 votes) · LW
I also think it's reasonable to think that multiple things may be going on that result in a theory of mental energy. For example, hypotheses 1 and 2 could both be true and result in different causes of similar behavior. I bring this up because I think of those as two different things in my experience: being "full up" and needing to allow time for memory consolidation, where I can still force my attention but it just doesn't take in new information, vs. being unable to force the direction of attention generally.
comment by Viliam
· score: 2 (1 votes) · LW
Seems to me that mental energy is lost by frustration. If what you are doing is fun, you can do it for a long time; if it frustrates you at every moment, you will get "tired" soon.
The exact mechanism... I guess is that some part of the brain takes frustration as evidence that this is not the right thing to do, and suggests doing something else. (Would correspond to "1b" in your model?)
comment by AprilSR
· score: 1 (1 votes) · LW
I’ve definitely experienced mental exhaustion from video games before - particularly when trying to do an especially difficult task.
comment by elityre
· score: 30 (8 votes) · LW
Old post: RAND needed the "say oops" skill
[Epistemic status: a middling argument]
A few months ago, I wrote about how RAND, and the “Defense Intellectuals” of the Cold War, represent another precious datapoint of “very smart people, trying to prevent the destruction of the world, in a civilization that they acknowledge to be inadequate to dealing sanely with x-risk.”
Since then I spent some time doing additional research into what cognitive errors and mistakes those consultants, military officials, and politicians made that endangered the world. The idea being that if we could diagnose which specific irrationalities they were subject to, that this would suggest errors that might also be relevant to contemporary x-risk mitigators, and might point out some specific areas where development of rationality training is needed.
However, this proved somewhat less fruitful than I was hoping, and I’ve put it aside for the time being. I might come back to it in the coming months.
It does seem worth sharing at least one relevant anecdote, from Daniel Ellsberg’s excellent book, the Doomsday Machine, and analysis, given that I’ve already written it up.
The missile gap
In the late nineteen-fifties it was widely understood that there was a “missile gap”: that the Soviets had many more ICBMs (“intercontinental ballistic missiles” armed with nuclear warheads) than the US.
Estimates varied widely on how many missiles the Soviets had. The Army and the Navy gave estimates of about 40 missiles, which was about at parity with the US’s strategic nuclear force. The Air Force and the Strategic Air Command, in contrast, gave estimates of as many as 1000 Soviet missiles, more than 20 times the US’s count.
(The Air Force and SAC were incentivized to inflate their estimates of the Russian nuclear arsenal, because a large missile gap strongly necessitated the creation of more nuclear weapons, which would be under SAC control and entail increases in the Air Force budget. Similarly, the Army and Navy were incentivized to lowball their estimates, because a comparatively weaker Soviet nuclear force made conventional military forces more relevant and implied allocating budget-resources to the Army and Navy.)
So there was some dispute about the size of the missile gap, including an unlikely possibility of nuclear parity with the Soviet Union. Nevertheless, the Soviets’ nuclear superiority was the basis for all planning and diplomacy at the time.
Kennedy campaigned on the basis of correcting the missile gap. Perhaps more critically, all of RAND’s planning and analysis was concerned with the possibility of the Russians launching a nearly-or-actually debilitating first or second strike.
In 1961 it came to light, on the basis of new satellite photos, that all of these estimates were dead wrong. It turned out that the Soviets had only 4 nuclear ICBMs, one tenth as many as the US controlled.
The importance of this development should be emphasized. It meant that several of the fundamental assumptions of US nuclear planners were in error.
First of all, it meant that the Soviets were not bent on world domination (as had been assumed). Ellsberg says…
Since it seemed clear that the Soviets could have produced and deployed many, many more missiles in the three years since their first ICBM test, it put in question—it virtually demolished—the fundamental premise that the Soviets were pursuing a program of world conquest like Hitler’s.
That pursuit of world domination would have given them an enormous incentive to acquire at the earliest possible moment the capability to disarm their chief obstacle to this aim, the United States and its SAC. [That] assumption of Soviet aims was shared, as far as I knew, by all my RAND colleagues and with everyone I’d encountered in the Pentagon:
The Assistant Chief of Staff, Intelligence, USAF, believes that Soviet determination to achieve world domination has fostered recognition of the fact that the ultimate elimination of the US, as the chief obstacle to the achievement of their objective, cannot be accomplished without a clear preponderance of military capability.
If that was their intention, they really would have had to seek this capability before 1963. The 1959–62 period was their only opportunity to have such a disarming capability with missiles, either for blackmail purposes or an actual attack. After that, we were programmed to have increasing numbers of Atlas and Minuteman missiles in hard silos and Polaris sub-launched missiles. Even moderate confidence of disarming us so thoroughly as to escape catastrophic damage from our response would elude them indefinitely.
Four missiles in 1960–61 was strategically equivalent to zero, in terms of such an aim.
This revelation about Soviet goals was not only of obvious strategic importance, it also took the wind out of the ideological motivation for this sort of nuclear planning. As Ellsberg relays early in his book, many, if not most, RAND employees were explicitly attempting to defend the US and the world from what was presumed to be an aggressive communist state, bent on conquest. This just wasn’t true.
But it had even more practical consequences: this revelation meant that the Russians had no first strike (or for that matter, second strike) capability. They could launch their ICBMs at American cities or military bases, but such an attack had no chance of debilitating US second strike capacity. It would unquestionably trigger a nuclear counterattack from the US who, with their 40 missiles, would be able to utterly annihilate the Soviet Union. The only effect of a Russian nuclear attack would be to doom their own country.
[Eli’s research note: What about all the Russian planes and bombs? ICBMs aren’t the only way of attacking the US, right?]
This means that the primary consideration in US nuclear war planning, at RAND and elsewhere, was fallacious. The Soviets could not meaningfully destroy the US.
…the estimate contradicted and essentially invalidated the key RAND studies on SAC vulnerability since 1956. Those studies had explicitly assumed a range of uncertainty about the size of the Soviet ICBM force that might play a crucial role in combination with bomber attacks. Ever since the term “missile gap” had come into widespread use after 1957, Albert Wohlstetter had deprecated that description of his key findings. He emphasized that those were premised on the possibility of clever Soviet bomber and sub-launched attacks in combination with missiles or, earlier, even without them. He preferred the term “deterrent gap.” But there was no deterrent gap either. Never had been, never would be.
To recognize that was to face the conclusion that RAND had, in all good faith, been working obsessively and with a sense of frantic urgency on a wrong set of problems, an irrelevant pursuit in respect to national security.
This realization invalidated virtually all of RAND’s work to date. Virtually every analysis, study, and strategy had been useless, at best.
The reaction to the revelation
How did RAND employees respond to this reveal, that their work had been completely off base?
That is not a recognition that most humans in an institution are quick to accept. It was to take months, if not years, for RAND to accept it, if it ever did in those terms. To some degree, it’s my impression that it never recovered its former prestige or sense of mission, though both its building and its budget eventually became much larger. For some time most of my former colleagues continued their focus on the vulnerability of SAC, much the same as before, while questioning the reliability of the new estimate and its relevance to the years ahead. [Emphasis mine]
For years the specter of a “missile gap” had been haunting my colleagues at RAND and in the Defense Department. The revelation that this had been illusory cast a new perspective on everything. It might have occasioned a complete reassessment of our own plans for a massive buildup of strategic weapons, thus averting an otherwise inevitable and disastrous arms race. It did not; no one known to me considered that for a moment. [Emphasis mine]
According to Ellsberg, many at RAND were unable to adapt to the new reality and continued (fruitlessly) with what they were doing, as if by inertia, when the thing that they needed to do (to use Eliezer’s turn of phrase) was “halt, melt, and catch fire.”
This suggests that one failure of this ecosystem, that was working in the domain of existential risk, was a failure to “say oops”: to notice a mistaken belief, concretely acknowledge that it was mistaken, and to reconstruct one’s plans and world views.
Relevance to people working on AI safety
This seems to be at least some evidence (though, only weak evidence, I think), that we should be cautious of this particular cognitive failure ourselves.
It may be worth rehearsing the motion in advance: how will you respond, when you discover that a foundational crux of your planning is actually a mirage, and the world is actually different than it seems?
What if you discovered that your overall approach to making the world better was badly mistaken?
What if you received a strong argument against the orthogonality thesis?
What about a strong argument for negative utilitarianism?
I think that many of the people around me have effectively absorbed the impact of a major update at least once in their life, on a variety of issues (religion, x-risk, average vs. total utilitarianism, etc), so I’m not that worried about us. But it seems worth pointing out the importance of this error mode.
A note: Ellsberg relays later in the book that, during the Cuban missile crisis, he perceived Kennedy as offering baffling terms to the Soviets: terms that didn’t make sense in light of the actual strategic situation, but might have been sensible under the premise of a Soviet missile gap. Ellsberg wondered, at the time, if Kennedy had also failed to propagate the update regarding the actual strategic situation.
I believed it very unlikely that the Soviets would risk hitting our missiles in Turkey even if we attacked theirs in Cuba. We couldn’t understand why Kennedy thought otherwise. Why did he seem sure that the Soviets would respond to an attack on their missiles in Cuba by armed moves against Turkey or Berlin? We wondered if—after his campaigning in 1960 against a supposed “missile gap”—Kennedy had never really absorbed what the strategic balance actually was, or its implications.
I mention this because additional research suggests that this is implausible: that Kennedy and his staff were aware of the true strategic situation, and that their planning was based on that premise.
comment by elityre
· score: 16 (6 votes) · LW
Old post: A mechanistic description of status
[This is an essay that I’ve had bopping around in my head for a long time. I’m not sure if this says anything usefully new, but it might click with some folks. If you haven’t read Social Status: Down the Rabbit Hole on Kevin Simler’s excellent blog, Melting Asphalt, read that first. I think this is pretty bad and needs to be rewritten and maybe expanded substantially, but this blog is called “musings and rough drafts.”]
In this post, I’m going to outline how I think about status. In particular, I want to give a mechanistic account of how status necessarily arises, given some set of axioms, in much the same way one can show that evolution by natural selection must necessarily occur given the axioms of 1) inheritance of traits 2) variance in reproductive success based on variance in traits and 3) mutation.
(I am not claiming any particular skill at navigating status relationships, any more than a student of sports-biology is necessarily a skilled basketball player.)
By “status” I mean prestige-status.
Axiom 1: People have goals.
That is, for any given human, there are some things that they want. This can include just about anything. You might want more money, more sex, a ninja-turtles lunchbox, a new car, to have interesting conversations, to become an expert tennis player, to move to New York etc.
Axiom 2: There are people who control resources relevant to other people achieving their goals.
The kinds of resources are as varied as the goals one can have.
Thinking about status dynamics and the like, people often focus on the particularly convergent resources, like money. But resources that are only relevant to a specific goal are just as much a part of the dynamics I’m about to describe.
Knowing a bunch about late 16th century Swedish architecture is controlling a goal-relevant resource, if someone has the goal of learning more about 16th century Swedish architecture.
Just being a fun person to spend time with (due to being particularly attractive, or funny, or interesting to talk to, or whatever) is a resource relevant to other people’s goals.
Axiom 3: People are more willing to help (offer favors to) a person who can help them achieve their goals.
Simply stated, you’re apt to offer to help a person with their goals if it seems like they can help you with yours, because you hope they’ll reciprocate. You’re willing to make a trade with, or ally with such people, because it seems likely to be beneficial to you. At minimum, you don’t want to get on their bad side.
(Notably, there are two factors that go into one’s assessment of another person’s usefulness: if they control a resource relevant to one of your goals, and if you expect them to reciprocate.
This produces a dynamic whereby A’s willingness to ally with B is determined by something like the product of
- A’s assessment of B’s power (as relevant to A’s goals), and
- A’s assessment of B’s probability of helping (which might translate into integrity, niceness, etc.)
If a person is a jerk, they need to be very powerful-relative-to-your-goals to make allying with them worthwhile.)
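The product rule above can be sketched in a few lines. This is only a toy illustration: the function name `willingness_to_ally`, the 0-to-1 scale for power, and the example numbers are all assumptions made for the sketch, not anything measured.

```python
def willingness_to_ally(power: float, p_help: float) -> float:
    """A's willingness to ally with B, modeled as a simple product.

    power:  A's assessment of B's power as relevant to A's goals
            (hypothetical scale, here 0..1).
    p_help: A's assessment of the probability that B actually helps (0..1).
    """
    if not 0.0 <= p_help <= 1.0:
        raise ValueError("p_help must be a probability in [0, 1]")
    return power * p_help


# A pleasant person with modest resources...
nice_but_weak = willingness_to_ally(power=0.3, p_help=0.9)
# ...is worth about as much as a jerk with three times the relevant power.
powerful_jerk = willingness_to_ally(power=0.9, p_help=0.3)
# A jerk with only modest power is not worth much at all.
weak_jerk = willingness_to_ally(power=0.3, p_help=0.3)
```

Under these made-up numbers, the jerk needs roughly three times the goal-relevant power to be as attractive an ally as the nice person, which is the trade-off the parenthetical describes.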
All of this seems good so far, but notice that we have up to this point only described individual pair-wise transactions and pair-wise relationships. People speak about “status” as an attribute that someone can possess or lack. How does the dynamic of a person being “high status” arise from the flux of individual transactions?
Lemma 1: One of the resources that a person can control is other people’s willingness to offer them favors
With this lemma, the system folds in on itself, and the individual transactions cohere into a mostly-stable status hierarchy.
Given lemma 1, a person doesn’t need to personally control resources relevant to your goals, they just need to be in a position such that someone who is relevant to your goals will privilege them.
As an example, suppose that you’re introduced to someone who is very well respected in your local social group: person-W. Your assessment might be that W, directly, doesn’t have anything that you need. But because person-W is well-respected, others in your social group are likely to offer favors to him/her. Therefore, it’s useful for person-W to like you, because then they are more apt to call on other people’s favors on your behalf.
(All the usual caveats apply about how this is subconscious: humans are adaptation-executors and don’t do explicit, verbal assessments of how useful a person will be to them, but rely on emotional heuristics that approximate explicit assessment.)
This causes the mess of status transactions to reinforce and stabilize into a mostly-static hierarchy. The mass of individual A-privileges-B-on-the-basis-of-A’s-goals flattens out, into each person having a single “score” which determines to what degree each other person privileges them.
(It’s a little more complicated than that because people who have access to their own resources have less need of help from others. So a person’s effective status (the status-level at which you treat them) is closer to their status minus your status. But this is complicated again because people are motivated not to be dicks (that’s bad for business), and respecting other people’s status is important to not being a dick.)
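One way to see how Lemma 1 makes the system “fold in on itself” is a small fixed-point iteration, loosely in the spirit of link-ranking algorithms. This is a sketch under stated assumptions, not a claim about how status is actually computed: the names `base`, `favor`, and `damping`, the update rule, and all the numbers are hypothetical. Here `base[b]` stands for the resources b personally controls, and `favor[c][b]` for the share of c’s favors that c is willing to extend to b (each person’s shares sum to at most 1, which together with `damping < 1` keeps the iteration stable).

```python
def status_scores(base, favor, damping=0.5, iterations=100):
    """Iterate score[b] = base[b] + damping * sum_c favor[c][b] * score[c].

    Being favored by people who themselves have high scores raises your
    score, so other people's willingness to do you favors acts as a resource.
    """
    people = list(base)
    score = dict(base)
    for _ in range(iterations):
        # The right-hand side reads the previous round's scores.
        score = {
            b: base[b]
            + damping * sum(favor.get(c, {}).get(b, 0.0) * score[c] for c in people)
            for b in people
        }
    return score


# Person W controls nothing directly, but X and Y (who do control
# resources) direct their favors toward W.
base = {"W": 0.0, "X": 1.0, "Y": 1.0, "you": 0.2}
favor = {"X": {"W": 1.0}, "Y": {"W": 1.0}}
scores = status_scores(base, favor)
```

In this toy setup W ends up with a higher score than “you” despite controlling no resources directly, which matches the person-W example: W is worth befriending because of the favors W can call on. The sketch deliberately ignores the complication in the parenthetical above about effective status depending on the difference between two people’s scores.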
[more stuff here.]
comment by Kaj_Sotala
· score: 5 (2 votes) · LW
Related: The red paperclip theory of status describes status as a form of optimization power, specifically one that can be used to influence a group.
The name of the game is to convert the temporary power gained from (say) a dominance behaviour into something further, bringing you closer to something you desire: reproduction, money, a particular social position...
comment by Raemon
· score: 5 (2 votes) · LW
(it says "more stuff here" but links to your overall blog, not sure if that was meant to be a link to a specific post)