Eli's shortform feed

post by Eli Tyre (elityre) · 2019-06-02T09:21:32.245Z · LW · GW · 145 comments

I'm mostly going to use this to crosspost links to my blog for less polished thoughts, Musings and Rough Drafts.


Comments sorted by top scores.

comment by Eli Tyre (elityre) · 2019-09-27T22:08:05.174Z · LW(p) · GW(p)

New post: Some things I think about Double Crux and related topics

I've spent a lot of my discretionary time working on the broad problem of developing tools for bridging deep disagreements and transferring tacit knowledge. I'm also probably the person who has spent the most time explicitly thinking about and working with CFAR's Double Crux framework. It seems good for at least some of my high level thoughts to be written up some place, even if I'm not going to go into detail about, defend, or substantiate, most of them.

The following are my own beliefs and do not necessarily represent CFAR, or anyone else.

I, of course, reserve the right to change my mind.

[Throughout I use "Double Crux" to refer to the Double Crux technique, the Double Crux class, or a Double Crux conversation, and I use "double crux" to refer to a proposition that is a shared crux for two people in a conversation.]

Here are some things I currently believe:


  1. Double Crux is one (highly important) tool/framework among many. I want to distinguish between the overall art of untangling and resolving deep disagreements and the Double Crux tool in particular. The Double Crux framework is maybe the most important tool (that I know of) for resolving disagreements, but it is only one tool/framework in an ensemble.
    1. Some other tools/frameworks that are not strictly part of Double Crux (but which are sometimes crucial to bridging disagreements) include NVC, methods for managing people's intentions and goals, various forms of co-articulation (helping to draw out an inchoate model from one's conversational partner), etc.
    2. In some contexts other tools are substitutes for Double Crux (i.e. another framework is more useful) and in some cases other tools are helpful or necessary complements (i.e. they solve problems or smooth the process within the Double Crux frame).
    3. In particular, my personal conversational facilitation repertoire is about 60% Double Crux-related techniques, and 40% other frameworks that are not strictly within the frame of Double Crux.
  2. Just to say it clearly: I don't think Double Crux is the only way to resolve disagreements, or the best way in all contexts. (Though I think it may be the best way, that I know of, in a plurality of common contexts?)
  3. The ideal use case for Double Crux is when...
    1. There are two people...
    2. ...who have a real, action-relevant, decision...
    3. ...that they need to make together (they can't just do their own different things)...
    4. ...in which both people have strong, visceral intuitions.
  4. Double Cruxes are almost always conversations between two people's system 1's.
  5. You can Double Crux between two people's unendorsed intuitions. (For instance, Alice and Bob are discussing a question about open borders. They both agree that neither of them is an economist, and that neither of them trusts their intuitions here, and that if they had to actually make this decision, it would be crucial to spend a lot of time doing research and examining the evidence and consulting experts. But nevertheless Alice's current intuition leans in favor of open borders, and Bob's current intuition leans against. This is a great starting point for a Double Crux.)
  6. Double cruxes (as in a crux that is shared by both parties in a disagreement) are common, and useful. Most disagreements have implicit Double Cruxes, though identifying them can sometimes be tricky.
  7. Conjunctive cruxes (I would change my mind about X, if I changed my mind about Y and about Z, but not if I only changed my mind about Y or about Z) are common.
  8. Folks sometimes object that Double Crux won't work, because their belief depends on a large number of considerations, each one of which has only a small impact on their overall belief, and so no one consideration is a crux. In practice, I find that there are double cruxes to be found even in cases where people expect their beliefs have this structure.
    1. Theoretically, it makes sense that we would find double cruxes in these scenarios: if a person has a strong disagreement (including a disagreement of intuition) with someone else, we should expect that there are a small number of considerations doing most of the work of causing one person to think one thing and the other to think something else. It is improbable that each person's beliefs depend on 50 factors, and for Alice, most of those 50 factors point in one direction, and for Bob, most of those 50 factors point in the other direction, unless the details of those factors are not independent. If considerations are correlated, you can abstract out the fact or belief that generates the differing predictions in all of those separate considerations. That "generating belief" is the crux.
    2. That said, there is a different conversational approach that I sometimes use, which involves delineating all of the key considerations (then doing Goal-factoring style relevance and completeness checks), and then dealing with each consideration one at a time (often via a fractal tree structure: listing the key considerations of each of the higher level considerations).
      1. This approach absolutely requires paper, and skillful (firm, gentle) facilitation, because people will almost universally try and hop around between considerations, and they need to be viscerally assured that their other concerns are recorded and will be dealt with in due course, in order to engage deeply with any given consideration one at a time.
  9. About 60% of the power of Double Crux comes from operationalizing or being specific.
    1. I quite like Liron's [LW · GW] recent sequence [LW · GW] on being specific. It re-reminded me of some basic things that have been helpful in several recent conversations. In particular, I like the move [LW · GW] of having a conversational partner paint a specific, best case scenario, as a starting point for discussion.
      1. (However, I'm concerned about Less Wrong readers trying this with a spirit of trying to "catch out" one's conversational partner in inconsistency, instead of trying to understand what their partner wants to say, and thereby shooting themselves in the foot. I think the attitude of looking to "catch out" is usually counterproductive to both understanding and to persuasion. People rarely change their mind when they feel like you have trapped them in some inconsistency, but they often do change their mind if they feel like you've actually heard and understood their belief / what they are trying to say / what they are trying to defend, and then provide relevant evidence and argument. In general (but not universally) it is more productive to adopt a collaborative attitude of sincerely trying to help a person articulate, clarify, and substantiate the point your partner is trying to make, even if you suspect that their point is ultimately wrong and confused.)
    2. As an aside, specificity and operationalization are also the engine that makes Nonviolent Communication work. Being specific is really super powerful.
  10. Many (~50%) disagreements evaporate upon operationalization, but this happens less frequently than people think: and if you seem to agree about all of the facts, and agree about all specific operationalizations, but nevertheless seem to have differing attitudes about a question, that should be a flag. [I have a post that I'll publish soon about this problem.]
  11. You should be using paper when Double Cruxing. Keep track of the chain of Double Cruxes, and keep them in view.
  12. People talk past each other all the time, and often don't notice it. Frequently paraphrasing your current understanding of what your conversational partner is saying helps with this. [There is a lot more to say about this problem, and details about how to solve it effectively.]
  13. I don't endorse the Double Crux "algorithm [LW(p) · GW(p)]" described in the canonical post. That is, I don't think that the best way to steer a Double Crux conversation is to hew to those 5 steps in that order. Actually finding double cruxes is, in practice, much more complicated, and there are a large number of heuristics and TAPs that make the process work. I regard that algorithm as an early (and self conscious [LW(p) · GW(p)]) attempt to delineate moves that would help move a conversation towards double cruxes.
  14. This is my current best attempt at distilling the core moves that make Double Crux work, though this leaves out a lot.
  15. In practice, I think that double cruxes most frequently emerge not from people independently generating their own list of cruxes (though this is useful). Rather, double cruxes usually emerge from the move of "checking if the point that your partner made is a crux for you."
  16. I strongly endorse facilitation of basically all tricky conversations, Double Crux oriented or not. It is much easier to have a third party track the meta and help steer, instead of the participants, whose working memory is (and should be) full of the object level.
  17. So-called "Triple Crux" is not a feasible operation. If you have more than two stakeholders, have two of them Double Crux, and then have one of those two Double Crux with the third person. Things get exponentially trickier as you add more people. I don't think that Double Crux is a feasible method for coordinating more than ~ 6 people. We'll need other methods for that.
  18. Double Crux is much easier when both parties are interested in truth-seeking and in changing their mind, and are assuming good faith about the other. But, these are not strict prerequisites, and unilateral Double Crux is totally a thing.
  19. People being defensive, emotional, or ego-filled does not preclude a productive Double Crux. Some particular auxiliary skills are required for navigating those situations, however.
    1. This is a good start for the relevant skills.
  20. If a person wants to get better at Double Crux skills, I recommend they cross-train with IDC. Any move that works in IDC you should try in Double Crux. Any move that works in Double Crux you should try in IDC. This will seem silly sometimes, but I am pretty serious about it, even in the silly-seeming cases. I've learned a lot this way.
  21. I don't think Double Crux necessarily runs into a problem of "black box beliefs" wherein one can no longer make progress because one or both parties comes down to a fundamental disagreement about System 1 heuristics/ models that they learned from some training data, but into which they can't introspect. Almost always, there are ways to draw out those models.
    1. The simplest way to do this (which is not the only or best way, depending on the circumstances) involves generating many examples and testing the "black box" against them. Vary the hypothetical situation to triangulate to the exact circumstances in which the "black box" outputs which suggestions.
    2. I am not making the universal claim that one never runs into black box beliefs that can't be dealt with.
  22. Disagreements rarely come down to "fundamental value disagreements". If you think that you have gotten to a disagreement about fundamental values, I suspect there was another conversational tack that would have been more productive.
  23. Also, you can totally Double Crux about values. In practice, you can often treat values like beliefs: often there is some evidence that a person could observe, at least in principle, that would convince them to hold or not hold some "fundamental" value.
    1. I am not making the claim that there is no such thing as fundamental values, or that all values are Double Crux-able.
  24. A semi-esoteric point: cruxes are (or can be) contiguous with operationalizations. For instance, if I'm having a disagreement about whether advertising produces value on net, I might operationalize to "beer commercials, in particular, produce value on net", which (if I think that operationalization actually captures the original question) is isomorphic to "The value of beer commercials is a crux for the value of advertising. I would change my mind about advertising in general, if I changed my mind about beer commercials." (This is an evidential crux, as opposed to the more common causal crux. More on this distinction in future posts.)
  25. People's beliefs are strongly informed by their incentives. This makes me somewhat less optimistic about tools in this space than I would otherwise be, but I still think there's hope.
  26. There are a number of gaps in the repertoire of conversational tools that I'm currently aware of. One of the most important holes is the lack of a method for dealing with psychological blindspots. These days, I often run out of ability to make a conversation go well when we bump into a blindspot in one person or the other (sometimes, there seem to be psychological blindspots on both sides). Tools wanted, in this domain.
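
The independence argument in item 8.1 above can be made concrete with a little arithmetic. The following is only a rough sketch under assumptions that are mine, not the post's: each of the 50 factors is treated as an independent coin flip (equally likely to point either way), and "most" is arbitrarily taken to mean at least 40 of 50. The function name and thresholds are hypothetical, chosen for illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability that at least k of n independent factors point one way,
    when each factor points that way with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed, illustrative numbers: 50 factors, "most" means at least 40.
N_FACTORS = 50
LOPSIDED = 40

# Chance that Alice's independent factors line up this lopsidedly in one direction.
p_alice = prob_at_least(LOPSIDED, N_FACTORS)

# By symmetry, Bob's factors lining up equally lopsidedly the other way has the
# same probability; treating the two people as independent, multiply.
p_both = p_alice * p_alice

print(f"P(Alice's factors are that lopsided) = {p_alice:.2e}")
print(f"P(both lopsided, in opposite directions) = {p_both:.2e}")
```

Under these assumptions both probabilities come out very small, which is the point of the argument: a strong, opposed pair of views is much better explained by a few shared upstream beliefs driving many correlated considerations than by dozens of independent small ones.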

(The Double Crux class)

  1. Knowing how to identify Double Cruxes can be kind of tricky, and I don't think that most participants learn the knack from the 55 to 70 minute Double Crux class at a CFAR workshop.
  2. Currently, I think I can teach the basic knack (not including all the other heuristics and skills) to a person in about 3 hours, but I'm still playing around with how to do this most efficiently. (The "Basic Double Crux pattern" post is the distillation of my current approach.)
    1. This is one development avenue that would particularly benefit from parallel search: If you feel like you "get" Double Crux, and can identify Double Cruxes fairly reliably and quickly, it might be helpful if you explicated your process.
  3. That said, there are a lot of relevant complements and sub-skills to Double Crux, and to bridging disagreements more generally.
  4. The most important function of the Double Crux class at CFAR workshops is teaching and propagating the concept of a "crux", and to a lesser extent, the concept of a "double crux". These are very useful shorthands for one's personal thinking and for discourse, which are great to have in the collective lexicon.

(Some other things)

  1. Personally, I am mostly focused on developing deep methods (perhaps for training high-expertise specialists) that increase the range of problems of disagreements that the x-risk ecosystem can solve at all. I care more about this goal than about developing shallow tools that are useful "out of the box" for smart non-specialists, or in trying to change the conversational norms of various relevant communities (though both of those are secondary goals.)
  2. I am highly skeptical of teaching many-to-most of the important skills for bridging deep disagreement, via anything other than ~one-on-one, in-person interaction.
  3. In large part due to being prodded by a large number of people, I am polishing all my existing drafts of Double Crux stuff (and writing some new posts), and posting them here over the next few weeks. (There are already some drafts, still being edited, available on my blog.)

I have a standing offer to facilitate conversations and disagreements (Double Crux or not) for rationalists and EAs. Email me at eli [at] rationality [dot] org if that's something you're interested in.

Replies from: Zack_M_Davis, ChristianKl, Chris_Leong, DanielFilan
comment by Zack_M_Davis · 2019-09-28T16:36:06.449Z · LW(p) · GW(p)

People rarely change their mind when they feel like you have trapped them in some inconsistency [...] In general (but not universally) it is more productive to adopt a collaborative attitude of sincerely trying to help a person articulate, clarify, and substantiate [bolding mine—ZMD]

"People" in general rarely change their mind when they feel like you have trapped them in some inconsistency, but people using the double-crux method in the first place are going to be aspiring rationalists, right? Trapping someone in an inconsistency (if it's a real inconsistency and not a false perception of one) is collaborative: the thing they were thinking was flawed, and you helped them see the flaw! That's a good thing! (As it is written of the fifth virtue, "Do not believe you do others a favor if you accept their arguments; the favor is to you.")

Obviously, I agree that people should try to understand their interlocutors. (If you performatively try to find fault in something you don't understand, then apparent "faults" you find are likely to be your own misunderstandings rather than actual faults.) But if someone spots an actual inconsistency in my ideas, I want them to tell me right away. Performing the behavior of trying to substantiate something that cannot, in fact, be substantiated (because it contains an inconsistency) is a waste of everyone's time!

In general (but not universally) it is more productive to adopt a collaborative attitude

Can you say more about what you think the exceptions to the general-but-not-universal rule are? (Um, specifically [LW · GW].)

Replies from: Slider
comment by Slider · 2019-09-28T20:08:54.890Z · LW(p) · GW(p)

I would think that inconsistencies are easier to appreciate when they are in the central machinery. A rationalist might have more load bearing on their beliefs so most beliefs are central to at least something but I think a centrality/point-of-communication check is more upside than downside to keep. Also cognitive time spent looking for inconsistencies could be better spent on more constructive activities. Then there is the whole class of heuristics which don't even claim to be consistent. So the ability to pass by an inconsistency without hanging onto it will see use.

comment by ChristianKl · 2020-07-21T13:06:54.272Z · LW(p) · GW(p)

Currently, I think I can teach the basic knack (not including all the other heuristics and skills) to a person in about 3 hours, but I'm still playing around with how to do this most efficiently. (The "Basic Double Crux pattern" post is the distillation of my current approach.)

How about doing this a few times on video? Watching the video might not be as effective as the one-on-one teaching but I would expect that watching a few 1-on-1 explanations would be a good way to learn about the process.

From a learning perspective it also helps a lot for reflecting on the technique. The early NLP folks spent a lot of time analysing tapes of people performing techniques to better understand the techniques.

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-07-22T00:50:59.939Z · LW(p) · GW(p)

I in fact recorded a test session of attempting to teach this via Zoom last weekend. This was the first time I tried a test session via Zoom, however, and there were a lot of kinks to work out, so I probably won't publish that version in particular.

But yeah, I'm interested in making video recordings of some of this stuff and putting them up online.

comment by Chris_Leong · 2020-05-11T11:21:31.200Z · LW(p) · GW(p)

Thanks for mentioning conjunctive cruxes. That was always my biggest objection to this technique. At least when I went through CFAR, the training completely ignored this possibility. It was clear that it often worked anyway, but the impression that I got was that it was the general frame [LW · GW] which was important more than the precise methodology which at that time still seemed in need of refinement.

comment by DanielFilan · 2019-09-27T23:18:52.368Z · LW(p) · GW(p)

FYI the numbering in the (General) section is pretty off.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-28T07:01:41.277Z · LW(p) · GW(p)

What do you mean? All the numbers are in order. Are you objecting to the nested numbers?

Replies from: DanielFilan
comment by DanielFilan · 2019-09-28T21:01:42.455Z · LW(p) · GW(p)

To me, it looks like the numbers in the General section go 1, 4, 5, 5, 6, 7, 8, 9, 3, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 2, 3, 3, 4, 2, 3, 4 (ignoring the nested numbers).

Replies from: DanielFilan
comment by DanielFilan · 2019-09-28T21:10:01.837Z · LW(p) · GW(p)

(this appears to be a problem where it displays differently on different browser/OS pairs)

comment by Eli Tyre (elityre) · 2019-08-24T02:44:31.028Z · LW(p) · GW(p)

Old post: RAND needed the "say oops" skill

[Epistemic status: a middling argument]

A few months ago, I wrote about how RAND, and the “Defense Intellectuals” of the Cold War represent another precious datapoint of “very smart people, trying to prevent the destruction of the world, in a civilization that they acknowledge to be inadequate to dealing sanely with x-risk.”

Since then I spent some time doing additional research into what cognitive errors and mistakes those consultants, military officials, and politicians made that endangered the world. The idea was that if we could diagnose which specific irrationalities they were subject to, this would suggest errors that might also be relevant to contemporary x-risk mitigators, and might point out some specific areas where development of rationality training is needed.

However, this proved somewhat less fruitful than I was hoping, and I’ve put it aside for the time being. I might come back to it in the coming months.

It does seem worth sharing at least one relevant anecdote, from Daniel Ellsberg’s excellent book, The Doomsday Machine, along with some analysis, given that I’ve already written it up.

The missile gap

In the late nineteen-fifties it was widely understood that there was a “missile gap”: that the Soviets had many more ICBMs (“intercontinental ballistic missiles” armed with nuclear warheads) than the US.

Estimates varied widely on how many missiles the Soviets had. The Army and the Navy gave estimates of about 40 missiles, which was about at parity with the US’s strategic nuclear force. The Air Force and the Strategic Air Command, in contrast, gave estimates of as many as 1000 Soviet missiles, over 20 times the US’s count.

(The Air Force and SAC were incentivized to inflate their estimates of the Russian nuclear arsenal, because a large missile gap strongly necessitated the creation of more nuclear weapons, which would be under SAC control and entail increases in the Air Force budget. Similarly, the Army and Navy were incentivized to lowball their estimates, because a comparatively weaker Soviet nuclear force made conventional military forces more relevant and implied allocating budget-resources to the Army and Navy.)

So there was some dispute about the size of the missile gap, including an unlikely possibility of nuclear parity with the Soviet Union. Nevertheless, the Soviets’ nuclear superiority was the basis for all planning and diplomacy at the time.

Kennedy campaigned on the basis of correcting the missile gap. Perhaps more critically, all of RAND’s planning and analysis was concerned with the possibility of the Russians launching a nearly-or-actually debilitating first or second strike.

The revelation

In 1961 it came to light, on the basis of new satellite photos, that all of these estimates were dead wrong. It turned out that the Soviets had only 4 nuclear ICBMs, one tenth as many as the US controlled.

The importance of this development should be emphasized. It meant that several of the fundamental assumptions of US nuclear planners were in error.

First of all, it meant that the Soviets were not bent on world domination (as had been assumed). Ellsberg says…

Since it seemed clear that the Soviets could have produced and deployed many, many more missiles in the three years since their first ICBM test, it put in question—it virtually demolished—the fundamental premise that the Soviets were pursuing a program of world conquest like Hitler’s.

That pursuit of world domination would have given them an enormous incentive to acquire at the earliest possible moment the capability to disarm their chief obstacle to this aim, the United States and its SAC. [That] assumption of Soviet aims was shared, as far as I knew, by all my RAND colleagues and with everyone I’d encountered in the Pentagon:
The Assistant Chief of Staff, Intelligence, USAF, believes that Soviet determination to achieve world domination has fostered recognition of the fact that the ultimate elimination of the US, as the chief obstacle to the achievement of their objective, cannot be accomplished without a clear preponderance of military capability.
If that was their intention, they really would have had to seek this capability before 1963. The 1959–62 period was their only opportunity to have such a disarming capability with missiles, either for blackmail purposes or an actual attack. After that, we were programmed to have increasing numbers of Atlas and Minuteman missiles in hard silos and Polaris sub-launched missiles. Even moderate confidence of disarming us so thoroughly as to escape catastrophic damage from our response would elude them indefinitely.
Four missiles in 1960–61 was strategically equivalent to zero, in terms of such an aim.

This revelation about Soviet goals was not only of obvious strategic importance, it also took the wind out of the ideological motivation for this sort of nuclear planning. As Ellsberg relays early in his book, many, if not most, RAND employees were explicitly attempting to defend the US and the world from what was presumed to be an aggressive communist state, bent on conquest. This just wasn’t true.

But it had even more practical consequences: this revelation meant that the Russians had no first strike (or for that matter, second strike) capability. They could launch their ICBMs at American cities or military bases, but such an attack had no chance of debilitating US second strike capacity. It would unquestionably trigger a nuclear counterattack from the US who, with their 40 missiles, would be able to utterly annihilate the Soviet Union. The only effect of a Russian nuclear attack would be to doom their own country.

[Eli’s research note: What about all the Russian planes and bombs? ICBMs aren’t the only way of attacking the US, right?]

This means that the primary consideration in US nuclear war planning, at RAND and elsewhere, was fallacious. The Soviets could not meaningfully destroy the US.

…the estimate contradicted and essentially invalidated the key RAND studies on SAC vulnerability since 1956. Those studies had explicitly assumed a range of uncertainty about the size of the Soviet ICBM force that might play a crucial role in combination with bomber attacks. Ever since the term “missile gap” had come into widespread use after 1957, Albert Wohlstetter had deprecated that description of his key findings. He emphasized that those were premised on the possibility of clever Soviet bomber and sub-launched attacks in combination with missiles or, earlier, even without them. He preferred the term “deterrent gap.” But there was no deterrent gap either. Never had been, never would be.
To recognize that was to face the conclusion that RAND had, in all good faith, been working obsessively and with a sense of frantic urgency on a wrong set of problems, an irrelevant pursuit in respect to national security.

This realization invalidated virtually all of RAND’s work to date. Virtually every analysis, study, and strategy had been useless, at best.

The reaction to the revelation

How did RAND employees respond to this reveal, that their work had been completely off base?

That is not a recognition that most humans in an institution are quick to accept. It was to take months, if not years, for RAND to accept it, if it ever did in those terms. To some degree, it’s my impression that it never recovered its former prestige or sense of mission, though both its building and its budget eventually became much larger. For some time most of my former colleagues continued their focus on the vulnerability of SAC, much the same as before, while questioning the reliability of the new estimate and its relevance to the years ahead. [Emphasis mine]

For years the specter of a “missile gap” had been haunting my colleagues at RAND and in the Defense Department. The revelation that this had been illusory cast a new perspective on everything. It might have occasioned a complete reassessment of our own plans for a massive buildup of strategic weapons, thus averting an otherwise inevitable and disastrous arms race. It did not; no one known to me considered that for a moment. [Emphasis mine]

According to Ellsberg, many at RAND were unable to adapt to the new reality and carried on (fruitlessly) with what they were doing, as if by inertia, when the thing that they needed to do (to use Eliezer’s turn of phrase) was “halt, melt, and catch fire.”

This suggests that one failure of this ecosystem, that was working in the domain of existential risk, was a failure to “say oops [LW · GW]”: to notice a mistaken belief, concretely acknowledge that it was mistaken, and to reconstruct one’s plans and world views.

Relevance to people working on AI safety

This seems to be at least some evidence (though, only weak evidence, I think), that we should be cautious of this particular cognitive failure ourselves.

It may be worth rehearsing the motion in advance: how will you respond, when you discover that a foundational crux of your planning is actually a mirage, and the world is actually different than it seems?

What if you discovered that your overall approach to making the world better was badly mistaken?

What if you received a strong argument against the orthogonality thesis?

What about a strong argument for negative utilitarianism?

I think that many of the people around me have effectively absorbed the impact of a major update at least once in their life, on a variety of issues (religion, x-risk, average vs. total utilitarianism, etc), so I’m not that worried about us. But it seems worth pointing out the importance of this error mode.

A note: Ellsberg relays later in the book that, during the Cuban missile crisis, he perceived Kennedy as offering baffling terms to the Soviets: terms that didn’t make sense in light of the actual strategic situation, but might have been sensible under the premise of a Soviet missile gap. Ellsberg wondered, at the time, if Kennedy had also failed to propagate the update regarding the actual strategic situation.

I believed it very unlikely that the Soviets would risk hitting our missiles in Turkey even if we attacked theirs in Cuba. We couldn’t understand why Kennedy thought otherwise. Why did he seem sure that the Soviets would respond to an attack on their missiles in Cuba by armed moves against Turkey or Berlin? We wondered if—after his campaigning in 1960 against a supposed “missile gap”—Kennedy had never really absorbed what the strategic balance actually was, or its implications.

I mention this because additional research suggests that this is implausible: that Kennedy and his staff were aware of the true strategic situation, and that their planning was based on that premise.

Replies from: habryka4, adam_scholl
comment by habryka (habryka4) · 2019-08-24T03:50:55.284Z · LW(p) · GW(p)

This was quite valuable to me, and I think I would be excited about seeing it as a top-level post.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-08-24T07:22:18.897Z · LW(p) · GW(p)

Can you say more about what you got from it?

Replies from: billzito
comment by billzito · 2019-08-26T21:28:13.198Z · LW(p) · GW(p)

I can't speak for habryka, but I think your post did a great job of laying out the need for "say oops" in detail. I read the Doomsday Machine and felt this point very strongly while reading it, but this was a great reminder to me of its importance. I think "say oops" is one of the most important skills for actually working on the right thing, and that in my opinion, very few people have this skill even within the rationality community.

comment by Adam Scholl (adam_scholl) · 2019-08-26T06:34:36.666Z · LW(p) · GW(p)

There feel to me like two relevant questions here, which seem conflated in this analysis:

1) At what point did the USSR gain the ability to launch a comprehensively-destructive, undetectable-in-advance nuclear strike on the US? That is, at what point would a first strike have been achievable and effective?

2) At what point did the USSR gain the ability to launch such a first strike using ICBMs in particular?

By 1960 the USSR had 1,605 nuclear warheads; there may have been few ICBMs among them, but there are other ways to deliver warheads than shooting them across continents. Planes fail the "undetectable" criterion, but ocean-adjacent cities can be blown up by small boats, and by 1960 the USSR had submarines equipped with six "short"-range (650 km and 1,300 km) ballistic missiles. By 1967 they were producing subs like this, each of which was armed with 16 missiles with ranges of 2,800-4,600 km.

All of which is to say that from what I understand, RAND's fears were only a few years premature.

comment by Eli Tyre (elityre) · 2019-08-13T18:02:40.236Z · LW(p) · GW(p)

New post: What is mental energy?

[Note: I’ve started a research side project on this question, and it is already obvious to me that this ontology is importantly wrong.]

There’s a common phenomenology of “mental energy”. For instance, if I spend a couple of hours thinking hard (maybe doing math), I find it harder to do more mental work afterwards. My thinking may be slower and less productive. And I feel tired, or drained, (mentally, instead of physically).

Mental energy is one of the primary resources that one has to allocate, in doing productive work. In almost all cases, humans have less mental energy than they have time, and therefore effective productivity is a matter of energy management, more than time management. If we want to maximize personal effectiveness, mental energy seems like an extremely important domain to understand. So what is it?

The naive story is that mental energy is an actual energy resource that one expends and then needs to recoup. That is, when one is doing cognitive work, they are burning calories, depleting their body’s energy stores. As they use energy, they have less fuel to burn.

My current understanding is that this story is not physiologically realistic. Thinking hard does consume more of the body’s energy than baseline, but not that much more. And we experience mental fatigue long before we even get close to depleting our calorie stores. It isn’t literal energy that is being consumed. [The Psychology of Fatigue pg.27]

So if not that, what is going on here?

A few hypotheses:

(The first few are all of a cluster, so I labeled them 1a, 1b, 1c, etc.)

Hypothesis 1a: Mental fatigue is a natural control system that redirects our attention to our other goals.

The explanation that I’ve heard most frequently in recent years (since it became obvious that much of the literature on ego-depletion was off the mark), is the following:

A human mind is composed of a bunch of subsystems that are all pushing for different goals. For a period of time, one of these goal threads might be dominant. For instance, if I spend a few hours doing math, this means that my other goals are temporarily suppressed or on hold: I’m not spending that time seeking a mate, or practicing the piano, or hanging out with friends.

In order to prevent those goals from being neglected entirely, your mind has a natural control system that prevents you from focusing your attention on any one thing for too long: the longer you put your attention on something, the greater the build up of mental fatigue, causing you to do anything else.

Comments and model-predictions: This hypothesis, as stated, seems implausible to me. For one thing, it seems to suggest that all activities would be equally mentally taxing, which is empirically false: spending several hours doing math is mentally fatiguing, but spending the same amount of time watching TV is not.

This might still be salvaged if we offer some currency other than energy that is being preserved: something like “forceful computations”. But again, it doesn’t seem obvious why the computations of doing math would be more costly than those for watching TV.

Similarly, this model suggests that “a change is as good as a break”: if you switch to a new task, you should be back to full mental energy, until you become fatigued for that task as well.

Hypothesis 1b: Mental fatigue is the phenomenological representation of the loss of support for the winning coalition.

A variation on this hypothesis would be to model the mind as a collection of subsystems. At any given time, there is only one action sequence active, but that action sequence is determined by continuous “voting” by various subsystems.

Over time, these subsystems get fed up with their goals not being met, and “withdraw support” for the current activity. This manifests as increasing mental fatigue. (Perhaps your thoughts get progressively less effective, because they are interrupted, on the scale of micro-seconds, by bids to think something else).

Comments and model-predictions: This seems like it might suggest that if all of the subsystems have high trust that their goals will be met, that math (or any other cognitively demanding task) would cease to be mentally taxing. Is that the case? (Does doing math mentally exhaust Critch?)

This does have the nice virtue of explaining burnout: when some subset of needs are not satisfied for a long period, the relevant subsystems pull their support for all actions, until those needs are met.

[Is burnout a good paradigm case for studying mental energy in general?]

Hypothesis 1c: The same as 1a or 1b, but some mental operations are painful for some reason.

To answer my question above, one reason why math might be more mentally taxing than watching TV, is that doing math is painful.

If the process of doing math is painful on the micro-level, then even if all of the other needs are met, there is still a fundamental conflict between the subsystem that is aiming to acquire math knowledge, and the subsystem that is trying to avoid micro-pain on the micro-level.

As you keep doing math, the micro pain part votes more and more strongly against doing math, or the overall system biases away from the current activity, and you run out of mental energy.

Comments and model-predictions: This seems plausible for the activity of doing math, which involves many moments of frustration, which might be meaningfully micro-painful. But it seems less consistent with activities like writing, which phenomenologically feel non-painful. This leads to hypothesis 1d…

Hypothesis 1d: The same as 1c, but the key micro-pain is that of processing ambiguity second to second.

Maybe the pain comes from many moments of processing ambiguity, which is definitely a thing that is happening in the context of writing. (I’ll sometimes notice myself try to flinch to something easier when I’m not sure which sentence to write.) It seems plausible that mentally taxing activities are taxing to the extent that they involve processing ambiguity, and doing a search for the best template to apply.

Hypothesis 1e: Mental fatigue is the penalty incurred for top down direction of attention.

Maybe consciously deciding to do things is importantly different from the “natural” allocation of cognitive resources. That is, your mind is set up such that the conscious, System 2, long term planning, metacognitive system, doesn’t have free rein. It has a limited budget of “mental energy”, which measures how long it is allowed to call the shots before the visceral, System 1, immediate gratification systems take over again.

Maybe this is an evolutionary adaptation? For the monkeys that had “really good” plans for how to achieve their goals, those plans never panned out. The monkeys that were impulsive some of the time actually did better at the reproduction game?

(If this is the case, can the rest of the mind learn to trust S2 more, and thereby offer it a bigger mental energy budget?)

This hypothesis does seem consistent with my observation that rest days are rejuvenating, even when I spend my rest day working on cognitively demanding side projects.

Hypothesis 2: Mental fatigue is the result of the brain temporarily reaching knowledge saturation.

When learning a motor task, there are several phases in which skill improvement occurs. The first, unsurprisingly, is during practice sessions. However, one also sees automatic improvements in skill in the hours after practice [actually this part is disputed] and following a sleep period (academic link1, 2, 3). That is, there is a period of consolidation following a practice session. This period of consolidation probably involves the literal strengthening of neural connections, and encoding other brain patterns that take more than a few seconds to set.

I speculate that your brain may reach a saturation point: more practice, more information input, becomes increasingly less effective, because you need to dedicate cognitive resources to consolidation. [Note that this is supposing that there is some tradeoff between consolidation activity and input activity, as opposed to a setup where both can occur simultaneously (does anyone have evidence for such a tradeoff?)].

If so, maybe cognitive fatigue is the phenomenology of needing to extract one’s self from a practice / execution regime, so that your brain can do post-processing and consolidation on what you’ve already done and learned.

Comments and model-predictions: This seems to suggest that all cognitively taxing tasks are learning tasks, or at least tasks in which one is encoding new neural patterns. This seems plausible, at least.

It also seems to naively imply that an activity will become less mentally taxing as you gain expertise with it, and progress along the learning curve. There is (presumably) much more information to process and consolidate in your first hour of doing math than in your 500th.

Hypothesis 3: Mental fatigue is a control system that prevents some kind of damage to the mind or body.

One reason why physical fatigue is useful is that it prevents damage to your body. Getting tired after running for a bit stops you from running all out for 30 hours at a time, and eroding your fascia.

By simple analogy to physical fatigue, we might guess that mental fatigue is a response to vigorous mental activity that is adaptive in that it prevents us from hurting ourselves.

I have no idea what kind of damage might be caused by thinking too hard.

I note that mania and hypomania involve apparently limitless mental energy reserves, and I think that these states are bad for your brain.

Hypothesis 4: Mental fatigue is a buffer overflow of peripheral awareness.

Another speculative hypothesis: Human minds have a working memory: a limit of ~4 concepts, or chunks, that can be “activated”, or operated upon in focal attention, at one time. But meditators, at least, also talk about a peripheral awareness: a sort of halo of concepts and sense impressions that are “loaded up”, or “near by”, or cognitively available, or “on the fringes of awareness”. These are all the ideas that are “at hand” to your thinking. [Note: is peripheral awareness, as the meditators talk about it, the same thing as “short term memory”?]

Perhaps if there is a functional limit to the amount of content that can be held in working memory, there is a similar, if larger, limit to how much content can be held in peripheral awareness. As you engage with a task, more and more mental content is loaded up, or added to peripheral awareness, where it both influences your focal thought process, and/or is available to be operated on directly in working memory. As you continue the task, and more and more content gets added to peripheral awareness, you begin to overflow its capacity. It gets harder and harder to think, because peripheral awareness is overflowing. Your mind needs space to re-ontologize: to chunk pieces together, so that it can all fit in the same mental space. Perhaps this is what mental fatigue is.

Comments and model-predictions: This does give a nice clear account of why sleep replenishes mental energy (it both causes re-ontologizing, and clears the cache), though perhaps this does not provide evidence over most of the other hypotheses listed here.

Other notes about mental energy:

  • In this post, I’m mostly talking about mental energy on the scale of hours. But there is also a similar phenomenon on the scale of days (the rejuvenation one feels after rest days) and on the scale of months (burnout and such). Are these the same basic phenomenon on different timescales?
  • On the scale of days, I find that my subjective rest-o-meter is charged up if I take a rest day, even if I spend that rest day working on fairly cognitively intensive side projects.
    • This might be because there’s a kind of new project energy, or new project optimism?
  • Mania and hypomania entail limitless mental energy.
  • People seem to be able to play video games for hours and hours without depleting mental energy. Does this include problem solving games, or puzzle games?
    • Also, just because they can play indefinitely does not mean that their performance doesn’t drop. Does performance drop, across hours of playing, say, Snakebird?
  • For that matter, does performance decline on a task correlate with the phenomenological “running out of energy”? Maybe those are separate systems.
Replies from: gilch, gworley, mr-hire, eigen, Viliam, AprilSR
comment by gilch · 2019-09-04T05:53:12.904Z · LW(p) · GW(p)

On Hypothesis 3, the brain may build up waste as a byproduct of its metabolism when it's working harder than normal, just as muscles do. Cleaning up this buildup seems to be one of the functions of sleep. Even brainless animals like jellyfish sleep. They do have neurons though.

comment by G Gordon Worley III (gworley) · 2019-08-13T19:57:32.655Z · LW(p) · GW(p)

I also think it's reasonable to think that multiple things may be going on that result in a theory of mental energy. For example, hypotheses 1 and 2 could both be true and result in different causes of similar behavior. I bring this up because I think of those as two different things in my experience: being "full up" and needing to allow time for memory consolidation, where I can still force my attention but it just doesn't take in new information, vs. being unable to force the direction of attention generally.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-01T04:57:12.787Z · LW(p) · GW(p)

Yeah. I think you're on to something here. My current read is that "mental energy" is at least 3 things.

Can you elaborate on what "knowledge saturation" feels like for you?

Replies from: gworley
comment by G Gordon Worley III (gworley) · 2019-09-02T16:40:31.733Z · LW(p) · GW(p)

Sure. It feels like my head is "full", although the felt sense is more like my head has gone from being porous and sponge-like to hard and concrete-like. When I try to read or listen to something I can feel it "bounce off" in that I can't hold the thought in memory beyond forcing it to stay in short term memory.

comment by Matt Goldenberg (mr-hire) · 2019-09-02T02:50:04.934Z · LW(p) · GW(p)

Isn't it possible that there's some other biological sink that is time delayed from caloric energy? Like say, a very specific part of your brain needs a very specific protein, and only holds enough of that protein for 4 hours? And it can take hours to build that protein back up. This seems to me to be at least somewhat likely.

Replies from: Ruby
comment by Ruby · 2019-09-02T16:44:04.425Z · LW(p) · GW(p)

Someone smart once made a case like this to me in support of a specific substance (can't remember which) as a nootropic, though I'm a bit skeptical.

comment by eigen · 2019-09-01T17:13:13.623Z · LW(p) · GW(p)

I think about this a lot. I'm currently dangling with the fourth Hypothesis, which seems more correct to me and one where I can actually do something to ameliorate the trade-off implied by it.

In this comment [LW(p) · GW(p)], I talk about what it means to me and how I can do something about it, which, in summary, is to use Anki a lot and change subjects when working memory gets overloaded. It's important to note that mathematics is sort-of different from other subjects, since concepts build on each other and you need to keep up with what all of them mean and entail, so we may be bound to reach an overload faster in that sense.

A few notes about your other hypotheses:

Hypothesis 1c:

it doesn’t seem obvious why the computations of doing math would be more costly than those for watching TV.

It's because we're not used to it. Some things come easier than others; some things are more closely similar to what we have been doing for 60000 years (math is not one of them). So we flinch from that which we are not used to. Although, adaptation is easy and the major hurdle is only at the beginning.

This seems plausible for the activity of doing math, which involves many moments of frustration, which might be meaningfully micro-painful.

It may also mean that the reward system is different. It is difficult to see in a piece of mathematics, as we explore it, how fulfilling it is when we know that we may not be getting anywhere. So the inherent reward is missing or has to be more artificially created.

Hypothesis 1d:

It seems plausible that mentally taxing activities are taxing to the extent that they involve processing ambiguity, and doing a search for the best template to apply.

This seems correct to me. Consider the following: “This statement is false”.

Thinking about it for a few minutes (or iterations of that statement) is quickly bound to make us flinch away in just a few seconds. How many other things take this form? I bet there are many.

For the monkeys that had “really good” plans for how to achieve their goals, never panned out for them. The monkeys that were impulsive some of the time, actually did better at the reproduction game?

Instead of working to trust System 2, is there a way to train System 1? It seems more apt to me, like training tactics in chess or to make rapid calculations.

Thank you for the good post, I'd really like to know more about your findings.

comment by Viliam · 2019-08-14T22:31:22.698Z · LW(p) · GW(p)

Seems to me that mental energy is lost by frustration. If what you are doing is fun, you can do it for a long time; if it frustrates you at every moment, you will get "tired" soon.

The exact mechanism... I guess is that some part of the brain takes frustration as evidence that this is not the right thing to do, and suggests doing something else. (Would correspond to "1b" in your model?)

comment by AprilSR · 2019-08-14T00:11:18.010Z · LW(p) · GW(p)

I’ve definitely experienced mental exhaustion from video games before - particularly when trying to do an especially difficult task.

comment by Eli Tyre (elityre) · 2019-10-26T14:48:03.638Z · LW(p) · GW(p)

New post: Some notes on Von Neumann, as a human being

I recently read Prisoner’s Dilemma, which is half an introduction to very elementary game theory, and half a biography of John Von Neumann, and watched this old PBS documentary about the man.

I’m glad I did. Von Neumann has legendary status in my circles, as the smartest person ever to live. [1] Many times I’ve written the words “Von Neumann Level Intelligence” in an AI strategy document, or speculated about how many coordinated Von Neumanns it would take to take over the world. (For reference, I now think that 10 is far too low, mostly because he didn’t seem to have the entrepreneurial or managerial dispositions.)

Learning a little bit more about him was humanizing. Yes, he was the smartest person ever to live, but he was also an actual human being, with actual human traits.

Watching this first clip, I noticed that I was surprised by a number of things.

  1. That VN had an accent. I had known that he was Hungarian, but somehow it had never quite propagated that he would speak with a Hungarian accent.
  2. That he was middling height (somewhat shorter than the presenter he’s talking to).
  3. The thing he is saying is the sort of thing that I would expect to hear from any scientist in the public eye, “science education is important.” There is something revealing about Von Neumann, despite being the smartest person in the world, saying basically what I would expect Neil deGrasse Tyson to say in an interview. A lot of the time he was wearing his “scientist / public intellectual” hat, not the “smartest person ever to live” hat.

Some other notes of interest:

He was not a skilled poker player, which punctured my assumption that Von Neumann was omnicompetent. (pg. 5) Nevertheless, poker was among the first inspirations for game theory. (When I told this to Steph, she quipped “Oh. He wasn’t any good at it, so he developed a theory from first principles, describing optimal play?” For all I know, that might be spot on.)

Perhaps relatedly, he claimed he had low sales resistance, and so would have his wife come clothes shopping with him. (pg. 21)

He was sexually crude, and perhaps a bit misogynistic. Eugene Wigner stated that “Johnny believed in having sex, in pleasure, but not in emotional attachment. He was interested in immediate pleasure and little comprehension of emotions in relationships and mostly saw women in terms of their bodies.” The journalist Steve Heims wrote “upon entering an office where a pretty secretary was working, von Neumann habitually would bend way over, more or less trying to look up her dress.” (pg. 28) Not surprisingly, his relationship with his wife, Klara, was tumultuous, to say the least.

He did however, maintain a strong, life long, relationship with his mother (who died the same year that he did).

Overall, he gives the impression of being a genius, overgrown child.

Unlike many of his colleagues, he seemed not to share the pangs of conscience that afflicted many of the bomb creators. Rather than going back to academia following the war, he continued doing work for the government, including the development of the hydrogen bomb.

Von Neumann advocated preventative war: giving the Soviet Union an ultimatum, of joining a world government, backed by the threat of (and probable enaction of) nuclear attack, while the US still had a nuclear monopoly. He famously said of the matter, “If you say why not bomb them tomorrow, I say why not today? If you say today at 5 o’clock, I say why not 1 o’clock.”

This attitude was certainly influenced by his work on game theory, but it should also be noted that Von Neumann hated communism.

Richard Feynman reports that Von Neumann, in their walks through the Los Alamos desert, convinced him to adopt an attitude of “social irresponsibility”, that one “didn’t have to be responsible for the world he was in.”

Prisoner’s Dilemma says that he and his collaborators “pursued patents less aggressively than they could have”. Edward Teller commented, “probably the IBM company owes half its money to John Von Neumann.” (pg. 76)

So he was not very entrepreneurial, which is a bit of a shame, because if he had the disposition he probably could have made sooooo much money / really taken substantial steps towards taking over the world. (He certainly had the energy to be an entrepreneur: he only slept for a few hours a night, and was working for basically all his waking hours.)

He famously always wore a grey oxford 3 piece suit, including when playing tennis with Stanislaw Ulam, or when riding a donkey down the Grand Canyon. But, I am not clear why. Was that more comfortable? Did he think it made him look good? Did he just not want to have to ever think about clothing, and so preferred to be over-hot in the middle of the Los Alamos desert, rather than need to think about if today was “shirt sleeves weather”?

Von Neumann himself once commented on the strange fact of so many Hungarian geniuses growing up in such a small area, in his generation:

Stanislaw Ulam recalled that when Von Neumann was asked about this “statistically unlikely” Hungarian phenomenon, Von Neumann “would say that it was a coincidence of some cultural factors which he could not make precise: an external pressure on the whole society of this part of Central Europe, a subconscious feeling of extreme insecurity in individuals, and the necessity of producing the unusual or facing extinction.” (pg. 66)

One thing that surprised me most was that it seems that, despite being possibly the smartest person in modernity, he would have benefited from attending a CFAR workshop.

For one thing, at the end of his life, he was terrified of dying. But throughout the course of his life he made many reckless choices with his health.

He ate gluttonously and became fatter and fatter over the course of his life. (One friend remarked that he “could count anything but calories.”)

Furthermore, he seemed to regularly risk his life when driving.

Von Neumann was an aggressive and apparently reckless driver. He supposedly totaled his car every year or so. An intersection in Princeton was nicknamed “Von Neumann corner” for all the auto accidents he had there. Records of accidents and speeding arrests are preserved in his papers. [The book goes on to list a number of such accidents.] (pg. 25)

(Amusingly, Von Neumann’s reckless driving seems due, not to drinking and driving, but to singing and driving. “He would sway back and forth, turning the steering wheel in time with the music.”)

I think I would call this a bug.

On another thread, one of his friends (the documentary didn’t identify which) expressed that he was over-impressed by powerful people, and didn’t make effective tradeoffs.

I wish he’d been more economical with his time in that respect. For example, if people called him to Washington or elsewhere, he would very readily go and so on, instead of having these people come to him. It was much more important, I think, he should have saved his time and effort.
He felt, when the government called, [that] one had to go, it was a patriotic duty, and as I said before he was a very devoted citizen of the country. And I think one of the things that particularly pleased him was any recognition that came sort-of from the government. In fact, in that sense I felt that he was sometimes somewhat peculiar that he would be impressed by government officials or generals and so on. If a big uniform appeared that made more of an impression than it should have. It was odd.
But it shows that he was a person of many different and sometimes self contradictory facets, I think.

Stanislaw Ulam speculated, “I think he had a hidden admiration for people and organizations that could be tough and ruthless.” (pg. 179)

From these statements, it seems like Von Neumann leapt at chances to seem useful or important to the government, somewhat unreflectively.

These anecdotes suggest that Von Neumann would have gotten value out of Goal Factoring, or Units of Exchange, or IDC (possibly there was something deeper going on, regarding blind spots around death, or status, but I think the point still stands, and he would have benefited from IDC).

Despite being the discoverer/ inventor of VNM Utility theory, and founding the field of Game Theory (concerned with rational choice), it seems to me that Von Neumann did far less to import the insights of the math into his actual life than say, Critch.

(I wonder aloud if this is because Von Neumann was born and came of age before the development of cognitive science. I speculate that the importance of actually applying theories of rationality in practice only becomes obvious after Tversky and Kahneman demonstrate that humans are not rational by default. (In evidence against this view: Eliezer seems to have been very concerned with thinking clearly, and being sane, before encountering Heuristics and Biases in his (I believe) mid 20s. He was exposed to Evo Psych though.))

Also, he converted to Catholicism at the end of his life, based on Pascal’s Wager. He commented “So long as there is the possibility of eternal damnation for nonbelievers it is more logical to be a believer at the end”, and “There probably has to be a God. Many things are easier to explain if there is than if there isn’t.”

(According to Wikipedia, this deathbed conversion did not give him much comfort.)

This suggests that he would have gotten value out of reading the sequences, in addition to attending a CFAR workshop.

Replies from: Viliam, liam-donovan
comment by Viliam · 2019-10-27T10:14:46.773Z · LW(p) · GW(p)

Thank you, this is very interesting!

Seems to me the most important lesson here is "even if you are John von Neumann, you can't take over the world alone."

First, because no matter how smart you are, you will have blind spots.

Second, because your time is still limited to 24 hours a day; even if you'd decide to focus on things you have been neglecting until now, you would have to start neglecting the things you have been focusing on until now. Being better at poker (converting your smartness to money more directly), living healthier and therefore on average longer, developing social skills, and being strategic in gaining power... would perhaps come at a cost of not having invented half of the stuff. When you are John von Neumann, your time has insane opportunity costs.

comment by Liam Donovan (liam-donovan) · 2019-10-27T13:45:53.458Z · LW(p) · GW(p)

Is there any information on how Von Neumann came to believe Catholicism was the correct religion for Pascal's Wager purposes? "My wife is Catholic" doesn't seem like very strong evidence...

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-10-28T15:09:26.860Z · LW(p) · GW(p)

I don't know why Catholicism.

I note that it does seem to be the religion of choice for former atheists, or at least for rationalists. I know of several rationalists that converted to Catholicism, but none that have converted to any other religion.

comment by Eli Tyre (elityre) · 2020-09-13T02:40:38.038Z · LW(p) · GW(p)

TL;DR: I’m offering to help people productively have difficult conversations and resolve disagreements, for free. Feel free to email me if and when that seems helpful. elityre [at] gmail.com


Over the past 4-ish years, I’ve had a side project of learning, developing, and iterating on methods for resolving tricky disagreements, and failures to communicate. A lot of this has been in the Double Crux frame, but I’ve also been exploring a number of other frameworks (including NVC, Convergent Facilitation, Circling-inspired stuff, intuition extraction, and some home-grown methods).

As part of that, I’ve had a standing offer to facilitate / mediate tricky conversations for folks in the CFAR and MIRI spheres (testimonials below). Facilitating “real disagreements”, allows me to get feedback on my current conversational frameworks and techniques. When I encounter blockers that I don’t know how to deal with, I can go back to the drawing board to model those problems and interventions that would solve them, and iterate from there, developing new methods.

I generally like doing this kind of conversational facilitation and am open to doing a lot more of it with a wider selection of people.

I am extending an offer to help mediate tricky conversations, to anyone that might read this post, for the foreseeable future. [If I retract this offer, I’ll come back and leave a note here.]

What sort of thing is this good for?

I’m open to trying to help with a wide variety of difficult conversations, but the situations where I have been most helpful in the past have had the following features:

  1. Two* people are either having some conflict or disagreement or are having difficulty understanding something about what the other person is saying.
  2. There’s some reason to expect the conversation to not “work”, by default: either they’ve tried already and made little progress, or at least one person can predict that this conversation will be tricky or heated.
  3. There is enough mutual respect and/or there is enough at stake that it seems worthwhile to try and have the conversation anyway. It seems worth the time to engage.

Here are some (anonymized) examples of conversations that I’ve facilitated in the past years.

  • Two researchers work in related fields, but in different frames / paradigms. Try as they might, neither person can manage to see how the other’s claims are even plausible.
  • Two friends are working on a project together, but they each feel inclined to take it in a different direction, and find it hard to get excited about the other’s proposal, even having talked about the question a lot.
  • John and Janet are EAs. John thinks that the project that Janet has spent the past year on, and is close to launching, is net negative, and that Janet should drop it entirely. Janet feels exasperated by this and generally feels that John is overly-controlling.
  • Two rationalists, Laura and Alex, are each in some kind of community leadership role, and have a lot of respect for each other, but they have very different takes on a particular question of social mores: Laura thinks that there is a class of norm enforcement that is normal and important, while Alex thinks that class of “norm enforcement” behavior is unacceptable and corrosive to the social fabric. They sit down to talk about it, but seem to keep going in circles without clarifying anything.

Basically, if you have a tricky disagreement that you want to try to hash out, and you feel comfortable inviting an outside party, feel free to reach out to me.

(If there’s some conversation or conflict that you have in mind, but don’t know if it falls in this category, feel free to email me and ask.)

*- I’m also potentially open to trying to help with conflicts that involve more than two people, such as a committee that is in gridlock, trying to make a decision, but I am much less practiced with that.

The process

If everyone involved is open to a third person (me) coming in to mediate, shoot me an email at elityre [at] gmail.com, and we can schedule a half hour call to discuss your issue. After discussing it a bit, I’ll tell you if I think I can help or not. If not, I might refer you to other people or resources that might be more useful.

If it seems like I can help, I typically prefer to meet with both parties one-on-one, as much as a week before we meet together, so that I can “load up” each person’s perspective, and start doing prep work. From there we can schedule a conversation, presumably over Zoom, for all three (or more) of us to meet.

In the conversation itself, I would facilitate, tracking what’s happening and suggesting particular conversational moves or tacks, and possibly recommending a high-level framework.

[I would like to link to a facilitation-example video here, but almost all of the conversations that I’ve facilitated are confidential. Hopefully this post will lead to one or two that can be public.]

Individual cases can vary a lot, and I’m generally open to considering alternative formats.

Currently, I’m doing this free of charge.

My sense of my current level of skill

I think this is a domain in which deep mastery is possible. I don’t consider myself to be a master, but I am aspiring to mastery.

My (possibly biased) impression is that the median outcome of my coming to help with a conversation is “eh, that was moderately helpful, mostly because having a third person to help hold space freed up our working memory to focus on the object level.”

Occasionally (one out of every 10 conversations?), I think I’ve helped dramatically, on the order of “this conversation was not working at all, until Eli came to help, and then we had multiple breakthroughs in understanding.”

(I’ve started explicitly tracking my participants’ estimation of my counterfactual impact, following conversations, so I hope to have much better numbers for assessing how useful this work is in a few months. Part of my hope in doing more of this is that I will get a more accurate assessment of how much value my facilitation in particular provides, and how much I should be investing in this general area.)


(I asked a number of people for whom I've done facilitation work in the past to give me a short honest testimonial, if they felt comfortable with that. I included the blurb from every person who sent me something, though this is still a biased sample, since I mostly reached out to people who I expected would give a "positive review".)

Anna Salamon:

I've found Eli quite helpful with a varied set of tricky conversations over the years. Some details:
- It helps that he can be tracking whether we are understanding each other, vs whether it is time to paraphrase;
- It helps that he can be tracking whether we are speaking to a "crux" or are on an accidental tangent/dead-end (I can do many of these things too, but when Eli is facilitating I can trust him to do some of this, which leaves me with more working memory for understanding the other party's perspective, figuring out how to articulate my own, etc.)
- It helps that he can help track the conversational stack, so that e.g. if I stop to paraphrase my conversation partner's point, that doesn't mean we'll never get back to the thing I was trying to keep track of.
- It has sometimes helped that he could paraphrase one or the other of us in ways the other party couldn't, but could then hear [after hearing his paraphrase];
- I have seen him help with both research-like/technical conversational topics, and messy cultural stuff.
- He can often help in cases where many folks would intuitively assume that a conversation is just "stuck," e.g. because it boils down to a difference in aesthetics or root epistemological perspectives or similar (Eli has a bunch of cached patterns for sometimes allowing such topics to progress, where a lot of people would not know how)
- I can vouch for Eli's ability to not-repeat private content that he says he won't repeat.
- I personally highly value Eli's literal-like or autistic-like tendency to just actually stick with what is being said, and to attempt to facilitate communication, without guessing ahead of time which party is "mature" or "right" or to-be-secretly-sided with. This is perhaps the area in which I have most noticed Eli's skills/habits rising above (in my preference-ordering) those of other skilled facilitators I've worked with.
- He responds pretty well to feedback, and acts so as to try to find out how to actually aid thinking/communication rather than to feel as though he is already doing so.

Scott Garrabrant:

I once went to a workshop and participated in a fishbowl double crux on the second to last day. That day went so well that we basically replaced all of the last day’s schedule with continuing the conversation, and that day went so well that we canceled plane tickets and extended the workshop. This experience made me very optimistic about what can be accomplished with a facilitated double crux.
Later, when asked to give a talk at a different workshop, I declined and suggested that talks were boring and we should replace several talk slots with fishbowl double cruxes. We tried it. It was a failure, and I don’t think much of value came out of any of the resulting conversations.
As far as I can tell, the second largest contributor to the relative failure was regression to the mean. The first largest was not having Eli there.

Evan Hubinger:

I really appreciate Eli's facilitation and I think that the hard conversations I've had with Eli facilitating would have been essentially impossible without good facilitation. I do think that trusting the facilitator is very important, but if you know and trust Eli as I do, I would definitely recommend his facilitation if you have a need for it.

Oliver Habryka:

I've asked Eli many times over the years to help me facilitate conversations that seemed particularly important and difficult. For most of these, having them happen at all without Eli seems quite difficult, so simply the presence of his willingness to facilitate, and to be reasonably well-known to be reasonable in his facilitation, provided a substantial amount of value.
He is also pretty decent at facilitation, as far as I can tell, or at least I can't really think of anyone who is substantially more skilled at it.
It's kind of hard for me to give a super clear review here. Like, facilitation isn't much of a commodity, and I don't think there is a shared standard of what a facilitator is supposed to do, so it's hard for me to straightforwardly evaluate it. I do think what Eli has been doing has been quite valuable to me, and I would recommend reasonably strongly that other people have more conversations of the type that Eli tends to facilitate.

Mathew Fallshaw:

In 2017 I was engaged in a complicated discussion with a collaborator that was not progressing smoothly. Eli joined the discussion, in the role of facilitator, and the discussion markedly improved.

Other people who have some experience with my facilitation style, feel free to put your own thoughts in the comments.

Caveats and other info

As noted, this is an open research-ish project for me, and I obviously cannot guarantee that I will be helpful, much less that I will be able to resolve or get to the bottom of a given disagreement. In fact, as stated, I, personally, am most interested in the cases where I don’t know how to help, because those are the places where I'm most likely to learn the most, even if they are the places where I am least able to provide value.

You are always welcome to invite me to try and help, and then partway through, decide that my suggestions are less than helpful, and say that you don’t want my help after all. (Anna Salamon does this moderately frequently.)

I do my best to keep track of a map of relevant skills in this area, and which people around have more skill than me in particular sub-domains. So it is possible that when you describe your situation, I’ll either suggest someone else who I think might be better to help you than me, or who I would like to bring in to co-facilitate with me (with your agreement, of course).

Note that this is one of a number of projects, involving difficult conversations or facilitation, that I am experimenting with lately. Another is here and another is to be announced.

If you’re interested in training sessions on Double Crux and other Conversational Facilitation skills, join my Double Crux training mailing list, here. I have vague plans to do a 3-weekend training program, covering my current take on the core Double Crux skill, but no guarantees that I will actually end up doing that any time soon.

Questions welcome!

Replies from: riceissa, m_arj
comment by riceissa · 2020-09-13T04:43:46.666Z · LW(p) · GW(p)

I am curious how good you think the conversation/facilitation was in the AI takeoff double crux between Oliver Habryka and Buck Shlegeris [LW · GW]. I am looking for something like "the quality of facilitation at that event was X percentile among all the conversation facilitation I have done".

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-09-13T06:39:28.996Z · LW(p) · GW(p)

[I wrote a much longer and more detailed comment, and then decided that I wanted to think more about it. In lieu of posting nothing, here's a short version.]

I mean I did very little facilitation one way or the other at that event, so I think my counterfactual impact was pretty minimal.

In terms of my value added, I think that one was in the bottom 5th percentile?

In terms of how useful that tiny amount of facilitation was, maybe 15th to 20th percentile? (This is a little weird, because quantity and quality are related. More active facilitation has a wider quality span: active (read: a lot of) facilitation can be much more helpful when it is good, and much more disruptive / annoying / harmful when it is bad, compared to less active backstop facilitation.)

Overall, the conversation served the goals of the participants and had a median outcome for that kind of conversation, which is maybe 30th percentile, but there is a long right tail of positive outcomes (and maybe I am messing up how to think about percentile scores with skewed distributions).
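One way to unpack the worry about percentiles and skew: in a right-skewed distribution, the median outcome is by construction around the 50th percentile by rank, even though it can sit far below the mean in value. The following is only a sketch with made-up scores (the numbers and the `percentile_rank` helper are assumptions for illustration, not data from any actual conversations).

```python
import statistics

# Made-up outcome scores for ten hypothetical conversations.
# Most are modest; two sit far out in the right tail.
outcomes = [1, 2, 3, 4, 5, 6, 7, 8, 20, 60]

def percentile_rank(value, data):
    """Percent of data points strictly below `value` (hypothetical helper)."""
    return 100 * sum(x < value for x in data) / len(data)

median = statistics.median(outcomes)
mean = statistics.mean(outcomes)

print("median:", median, "rank:", percentile_rank(median, outcomes))
print("mean:", mean, "rank:", percentile_rank(mean, outcomes))
print("median as a fraction of the mean:", round(median / mean, 2))
```

With these numbers the median is at the 50th percentile by rank but is worth less than half of the mean, which may be the intuition behind calling a median outcome "30th percentile".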

The outcome that occurred ("had an interesting conversation, and had some new thoughts / clarifications") is good but also far below the sort of outcome that I'm usually aiming for (but often missing), of substantive, permanent (epistemic!) change to the way that one or both of the people orient on this topic.

Replies from: habryka4
comment by habryka (habryka4) · 2020-09-13T07:37:17.111Z · LW(p) · GW(p)

(This is a little weird, because quantity and quality are related, in that

Looks like you dropped a sentence.

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-09-13T18:17:03.610Z · LW(p) · GW(p)


comment by m_arj · 2020-09-13T17:03:26.032Z · LW(p) · GW(p)

Could you recommend the best book about this topic?

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-09-13T18:16:33.873Z · LW(p) · GW(p)


I've gotten very little out of books in this area.

It is a little afield, but I strongly recommend the basic NVC book: Nonviolent Communication: A Language for Life. I recommend that, at minimum, everyone read the first two chapters, which are something like 8 pages long and have the most content in the book. (The rest of the book is good too, but it is mostly examples.)

Also, people I trust have gotten value out of How to Have Impossible Conversations. This is still on my reading stack though (for this month, I hope), so I don't personally recommend it. My expectation, from not having read it yet, is that it will cover the basics pretty well.

comment by Eli Tyre (elityre) · 2020-08-07T05:21:28.982Z · LW(p) · GW(p)

(Reasonably personal)

I spend a lot of time trying to build skills, because I want to be awesome. But there is something off about that.

I think I should just go after things that I want, and solve the problems that come up on the way. The idea of building skills sort of implies that if I don't have some foundation or some skill, I'll be blocked, and won't be able to solve some thing in the way of my goals.

But that doesn't actually sound right. Like it seems like the main important thing for people who do incredible things is their ability to do problem solving on the things that come up, and not the skills that they had previously built up in a "skill bank".

Raw problem solving is the real thing and skills are cruft. (Or maybe not cruft per se, but more like a side effect. The compiled residue of previous problem solving. Or like a code base from a previous project that you might repurpose.)

Part of the problem with this is that I don't know what I want for my own sake, though. I want to be awesome, which in my conception, means being able to do things.

I note that wanting "to be able to do things" is a leaky sort of motivation: because the victory condition is not clearly defined, it can't be crisply compelling, and so there's a lot of waste somehow.

The sort of motivation that works is simply wanting to do something, not wanting to be able to do something. Like specific discrete goals that one could accomplish, know that one accomplished, and then (in most cases) move on from.

But most of the things that I want by default are of the sort "wanting to be able to do", because if I had more capabilities, that would make me awesome.

But again, that's not actually conforming with my actual model of the world. The thing that makes someone awesome is general problem solving capability, more than specific capacities. Specific capacities are brittle. General problem solving is not.

I guess that I could pick arbitrary goals that seem cool. But I'm much more emotionally compelled by being able to do something instead of doing something.

But I also think that I am notably less awesome and on a trajectory to be less awesome over time, because my goals tend to be shaped in this way. (One of those binds whereby if you go after x directly, you don't get x, but if you go after y, you get x as a side effect.)

I'm not sure what to do about this.

Maybe meditate on, and dialogue with, my sense that skills are how awesomeness is measured, as opposed to raw, general problem solving.

Maybe I need to undergo some deep change that causes me to have different sorts of goals at a deep level. (I think this would be a pretty fundamental shift in how I engage with the world: from a virtue ethics orientation (focused on one's own attributes) to one of consequentialism (focused on the states of the world).)

There are some exceptions to this, goals that are more consequentialist (although if you scratch a bit, you'll find they're about living an ideal of myself, more than they are directly about the world), including wanting a romantic partner who makes me better (note that "who makes me better" is virtue ethics-y), and some things related to my moral duty, like mitigating x-risk. These goals do give me grounding in sort of the way that I think I need, but they're not sufficient? I still spend a lot of time trying to get skills.

Anyone have thoughts?

Replies from: Marcello, Dagon, Viliam, mr-hire
comment by Marcello · 2020-08-07T16:23:07.415Z · LW(p) · GW(p)

Your seemingly target-less skill-building motive isn't necessarily irrational or non-awesome. My steel-man is that you're in a hibernation period, in which you're waiting for the best opportunity of some sort (romantic, or business, or career, or other) to show up so you can execute on it. Picking a goal to focus on really hard now might well be the wrong thing to do; you might miss a golden opportunity if your nose is at the grindstone. In such a situation a good strategy would, in fact, be to spend some time cultivating skills, and some time in existential confusion (which is what I think not knowing which broad opportunities you want to pursue feels like from the inside).

The other point I'd like to make is that I expect building specific skills actually is a way to increase general problem solving ability; they're not at odds. It's not that super specific skills are extremely likely to be useful directly, but that the act of constructing a skill is itself trainable and a significant part of general problem solving ability for sufficiently large problems. Also, there's lots of cross-fertilization of analogies between skills; skills aren't quite as discrete as you're thinking.

comment by Dagon · 2020-08-07T15:07:22.802Z · LW(p) · GW(p)

Skills and problem-solving are deeply related. The basics of most skills are mechanical and knowledge-based, with some generalization creeping in on your 3rd or 4th skill in terms of how to learn and seeing non-obvious crossover. Intermediate (say, after the first 500 to a few thousand hours) use of skills requires application of problem-solving within the basic capabilities of that skill. Again, you get good practice within a skill, and better across a few skills. Advanced application in many skills is MOSTLY problem-solving. How to apply your well-indexed-and-integrated knowledge to novel situations, and how to combine that knowledge across domains.

I don't know of any shortcuts, though - it takes those thousands of hours to get enough knowledge and basic techniques embedded in your brain that you can intuit what avenues to more deeply explore in new applications.

There is a huge amount of human variance - some people pick up some domains ludicrously easily. This is a blessing and a curse, as it causes great frustration when they hit a domain that they have to really work at. Others have to work at everything, and never get their Nobel, but still contribute a whole lot of less-transformational "just work" within the domains they work at.

comment by Viliam · 2020-08-15T16:52:14.181Z · LW(p) · GW(p)

Seems to me there is some risk either way. If you keep developing skills without applying them to a specific goal, it can be a form of procrastination (an insidious one, because it feels so virtuous). There are many skills you could develop, and life is short. On the other hand, as you said, if you go right after your goal, you may find an obstacle you can't overcome... or even worse, an obstacle you can't even properly analyze, so the problem is not merely that you don't have the necessary skill, but that you even have no idea which skill you miss (so if you try to develop the skills as needed, you may waste time developing the wrong skills, because you misunderstood the nature of the problem).

it seems like the main important thing for people who do incredible things is their ability to do problem solving on the things that come up, and not the skills that they had previously built up in a "skill bank".

It could be both. And perhaps you notice the problem-specific skills more, because those are rare.

But I also kinda agree that the attitude is more important, and skills often can be acquired when needed.

So... dunno, maybe there are two kinds of skills? Like, the skills with obvious application, such as "learn to play a piano"; and the world-modelling skills, such as "understand whether playing a piano would realistically help you accomplish your goals"? You can acquire the former when needed, but you need the latter in advance, to remove your blind spots?

Or perhaps some skills such as "understand math" are useful in many kinds of situations and take a lot of time to learn, so you probably want to develop these in advance? (Also, if you don't know yet what to do, it probably helps to get power: learn math, develop social skills, make money... When you later make up your mind, you will likely find some of this useful.)

And maybe you need the world-modelling skills before you make specific goals, because how could your goal be to learn to play the piano, if you don't know the piano exists? You could have a more general goal, such as "become famous at something", but if you don't know that the piano exists, maybe you wouldn't even look in this direction.

But most of the things that I want by default are of the sort "wanting to be able to do", because if I had more capabilities, that would make me awesome.

Could this also be about your age? (I am assuming here that you are young.) For younger people it makes more sense to develop general skills; for older people it makes more sense to go after specific goals. The more time you have ahead of you, the more meta you can go -- the costs of acquiring a skill are the same, but the possible benefits of having the skill are proportional to your remaining time (more than linear, if you actually use the skill, because it will keep increasing as a side effect of being used).

Also, as a rule of thumb, younger people are judged by their potential, older people are judged by their accomplishments. If you are young, evolution wants you to feel awesome about having skills, because that's what your peers will admire. You signal general intelligence. The accomplishments you have... uhm, how to put it politely... if you see a 20-year-old kid driving an expensive car, your best guess is that their parents have bought it, isn't it? On the other hand, an older person without accomplishments seems like a loser, regardless of their apparent skills, because there is something suspicious about them not having translated those skills into actual outcomes. The excuse for the young ones is that their best strategy is to acquire skills now, and apply them later (which hasn't happened yet, but there is enough time remaining).

comment by Matt Goldenberg (mr-hire) · 2020-08-07T18:57:31.982Z · LW(p) · GW(p)

I've gone through something very similar.

Based on your language here, it feels to me like you're in the contemplation stage along the stages of change.

So the very first thing I'd say is to not feel the desire to jump ahead and "get started on a goal right now." That's jumping ahead in the stages of change, and will likely create a relapse.  I will predict that there's a 50% chance that if you continue thinking about this without "forcing it", you'll have started in on a goal (action stage) within 3 months.

Secondly, unlike some of the other responses here, I think your analysis is fairly accurate.  I've certainly found that picking up gears when I need them for my goals is better than learning them ahead of time. [LW · GW]

Now, in terms of "how to actually do it." 

I'm pretty convinced that the key to getting yourself to do stuff is "Creative Tension" - creating a clear internal tension between the end state that feels good and the current state that doesn't feel as good. There are 4 ways I know to go about generating internal tension:

  1. Develop a strong sense of self, and create tension between the world where you're fully expressing that self and the world where you're not.
  2. Develop a strong sense of taste, and create tension between the beautiful things that could exist and what exists now.
  3. Develop a strong pain, and create tension between the world where you have that pain and the world where you've solved it.
  4. Develop a strong vision, and create tension between the world as it is now and the world as it would be in your vision.

One especially useful trick that worked for me coming from the "just develop myself into someone awesome" place was tying the vision of the awesome person I could be with the vision of what I'd achieved - that is, in my vision of the future, including a vision of the awesome person I had to become in order to reach that future.

I then would deliberately contrast where I was now with that compelling vision/self/taste. Checking in with that vision every morning, and fixing areas of resistance when they arise, is what keeps me motivated.

I do have a workshop that I run on exactly how to create that vision that's tied with sense of self and taste, and then how to use it to generate creative tension.  Let me know if something like that would be helpful to you.

comment by Eli Tyre (elityre) · 2021-06-19T02:02:07.373Z · LW(p) · GW(p)

I’m no longer sure that I buy dutch book arguments, in full generality, and this makes me skeptical of the "utility function" abstraction

Thesis: I now think that utility functions might be a pretty bad abstraction for thinking about the behavior of agents in general, including highly capable agents.

[Epistemic status: half-baked, elucidating an intuition. Possibly what I’m saying here is just wrong, and someone will helpfully explain why.]

Over the past years, in thinking about agency and AI, I’ve taken the concept of a “utility function” for granted as the natural way to express an entity's goals or preferences. 

Of course, we know that humans don’t have well defined utility functions (they’re inconsistent, and subject to all kinds of framing effects), but that’s only because humans are irrational. To the extent that a thing acts like an agent, its behavior corresponds to some utility function. That utility function might not be explicitly represented, but if an agent is rational, there’s some utility function that reflects its preferences.

Given this, I might be inclined to scoff at people who scoff at “blindly maximizing” AGIs. “They just don’t get it”, I might think. “They don’t understand why agency has to conform to some utility function, and an AI would try to maximize expected utility.”

Currently, I’m not so sure. I think that talking in terms of utility functions is biting a philosophical bullet, and importing some unacknowledged assumptions. Rather than being the natural way to conceive of preferences and agency, I think utility functions might be only one possible abstraction, and one that emphasizes the wrong features, giving a distorted impression of what agents, in general, are actually like.

I want to explore that possibility in this post.

Before I begin, I want to make two notes. 

First, all of this is going to be hand-wavy intuition. I don’t have crisp knock-down arguments, only a vague discontent. But it seems like more progress will follow if I write up my current, tentative, stance even without formal arguments.

Second, I don’t think utility functions being a poor abstraction for agency in the real world has much bearing on whether there is AI risk. As I’ll discuss, it might change the shape and tenor of the problem, but highly capable agents with alien seed preferences are still likely to be catastrophic to human civilization and human values. I mention this because the sentiments expressed in this essay are causally downstream of conversations that I’ve had with skeptics about whether there is AI risk at all. So I want to highlight: I think I was mistakenly overlooking some philosophical assumptions, but that is not a crux.

Is coherence overrated? 

The tagline of the “utility” page on arbital is “The only coherent way of wanting things is to assign consistent relative scores to outcomes.” 

This is true as far as it goes, but to me, at least, that sentence implies a sort of dominance of utility functions. “Coherent” is a technical term, with a precise meaning, but it also has connotations of “the correct way to do things”. If someone’s theory of agency is incoherent, that seems like a mark against it. 

But it is possible to ask, “What’s so good about coherence anyway? Maybe 

The standard reply of course, is that if your preferences are incoherent, you’re dutchbookable, and someone will pump you for money. 

But I’m not satisfied with this argument. It isn’t obvious that being dutch booked is a bad thing.

In, Coherent Decisions Imply Consistent Utilities, Eliezer says, 

Suppose I tell you that I prefer pineapple to mushrooms on my pizza. Suppose you're about to give me a slice of mushroom pizza; but by paying one penny ($0.01) I can instead get a slice of pineapple pizza (which is just as fresh from the oven). It seems realistic to say that most people with a pineapple pizza preference would probably pay the penny, if they happened to have a penny in their pocket.

After I pay the penny, though, and just before I'm about to get the pineapple pizza, you offer me a slice of onion pizza instead--no charge for the change! If I was telling the truth about preferring onion pizza to pineapple, I should certainly accept the substitution if it's free.

And then to round out the day, you offer me a mushroom pizza instead of the onion pizza, and again, since I prefer mushrooms to onions, I accept the swap.

I end up with exactly the same slice of mushroom pizza I started with... and one penny poorer, because I previously paid $0.01 to swap mushrooms for pineapple.

This seems like a qualitatively bad behavior on my part.

Eliezer asserts that this is “qualitatively bad behavior.” But I think that this is biting a philosophical bullet. 
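To make the structure of the quoted money pump concrete, here is a minimal sketch. It assumes the cyclic preferences the quoted scenario relies on (pineapple over mushroom, onion over pineapple, mushroom over onion); the function names, the starting balance of ten cents, and the rule that the agent accepts any swap it strictly prefers are assumptions of this sketch, not part of the quoted essay.

```python
# Cyclic (intransitive) pairwise preferences, written as (preferred, over).
PREFERS = {
    ("pineapple", "mushroom"),
    ("onion", "pineapple"),
    ("mushroom", "onion"),
}

def accepts(offered, held):
    """The agent swaps whenever it strictly prefers the offered item."""
    return (offered, held) in PREFERS

def run_money_pump(start, offers, cents):
    """Apply a sequence of (item, fee_in_cents) offers in order.

    Returns the item held at the end and the money remaining.
    """
    held = start
    for item, fee in offers:
        if accepts(item, held) and cents >= fee:
            held, cents = item, cents - fee
    return held, cents

# One lap around the cycle: pay a penny for pineapple, then two free swaps.
offers = [("pineapple", 1), ("onion", 0), ("mushroom", 0)]
final_item, final_cents = run_money_pump("mushroom", offers, cents=10)
print(final_item, final_cents)  # same slice as at the start, one cent poorer
```

Repeating the same three offers drains another cent per lap. Whether that counts as a loss depends on whether the agent cares only about the final slice and the money, which is exactly the assumption questioned here.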

As an intuition pump: In the actual case of humans, we seem to get utility not from states of the world, but from changes in states of the world. So it isn’t unusual for a human to pay to cycle between states of the world. 

For instance, I could imagine a human being hungry, eating a really good meal, feeling full, and then happily paying a fee to be instantly returned to their hungry state, so that they can enjoy eating a good meal again. 

This is technically a dutch booking (which do they prefer, being hungry or being full?), but from the perspective of the agent’s values there’s nothing qualitatively bad about it. Instead of the dutchbooker pumping money from the agent, he’s offering a useful and appreciated service.

Of course, we can still back out a utility function from this dynamic: instead of having a mapping of ordinal numbers to world states, we can have one from ordinal numbers to changes from one world state to another.

But that just passes the buck one level. I see no reason in principle that an agent couldn't have a preference to rotate between different changes in the world, just as well as rotating between different states of the world.
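As a rough sketch of the difference between the two kinds of mapping, using the hungry/full example: if value attaches only to states, the gains around any closed cycle cancel out, so paying to go around the loop can never come out ahead. If value attaches to the changes themselves, a cycle can be net positive. All the numbers and names below are invented for illustration.

```python
# Hypothetical values: being full is worth more than being hungry,
# eating a good meal is enjoyable in itself, and the reset costs a small fee.
state_utility = {"hungry": 0.0, "full": 1.0}
transition_utility = {
    ("hungry", "full"): 5.0,   # enjoying the meal
    ("full", "hungry"): -1.0,  # fee paid to be made hungry again
}

def gain_from_states(path, utility):
    """Total gain when only the state you end up in matters at each step."""
    return sum(utility[b] - utility[a] for a, b in zip(path, path[1:]))

def gain_from_transitions(path, utility):
    """Total gain when each change of state carries its own value."""
    return sum(utility[(a, b)] for a, b in zip(path, path[1:]))

cycle = ["hungry", "full", "hungry"]
print(gain_from_states(cycle, state_utility))            # gains cancel around the loop
print(gain_from_transitions(cycle, transition_utility))  # the loop is worth paying for
```

Under the state-based mapping the loop nets exactly zero before any fee, so the fee is pure loss; under the transition-based mapping the same behavior looks like a sensible purchase.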

But this also misses the central point. I think you can always construct a utility function that represents some behavior. But if one is no longer compelled by Dutch book arguments, this raises the question of why we would want to do that. If coherence is no longer a desideratum, it’s no longer clear that a utility function is the natural way to express preferences.

And I wonder, maybe this also applies to agents in general, or at least the kind of learned agents that humans are likely to build via gradient descent. 

Maximization behavior

I think this matters, because many of the classic AI risk arguments go through a claim that maximization behavior is convergent. If you try to build a satisficer, there are a number of pressures for it to become a maximizer of some kind. (See this Rob Miles video, for instance)

I think that most arguments of that sort depend on an agent acting according to an expected utility maximization framework. And if utility maximization turns out not to be a good abstraction for agents in the real world, I don't know if these arguments are still correct.

I posit that straightforward maximizers are rare in the multiverse, and that most evolved or learned agents are better described by some other abstraction.  

If not utility functions, then what?

If we accept for the time being that utility functions are a warped abstraction for most agents, what might a better abstraction be?

I don’t know. I’m writing this post in the hopes that others will think about this question and perhaps come up with productive alternative formulations. 

I'll post some of my half-baked thoughts on this question shortly.

Replies from: gworley, JBlack, Pattern
comment by G Gordon Worley III (gworley) · 2021-06-19T17:59:37.933Z · LW(p) · GW(p)

I've long been somewhat skeptical that utility functions are the right abstraction.

My argument is also rather handwavy, being something like "this is the wrong abstraction for how agents actually function, so even if you can always construct a utility function and say some interesting things about its properties, it doesn't tell you the thing you need to know to understand and predict how an agent will behave". In my mind I liken it to the state of trying to code in functional programming languages on modern computers: you can do it, but you're also fighting an uphill battle against the way the computer is physically implemented, so don't be surprised if things get confusing.

And much like in the utility function case, people still program in functional languages because of the benefits they confer. I think the same is true of utility functions: they confer some big benefits when trying to reason about certain problems, so we accept the tradeoffs of using them. I think that's fine so long as we have a morphism to other abstractions that will work better for understanding the things that utility functions obscure.

comment by JBlack · 2021-06-21T01:05:35.029Z · LW(p) · GW(p)

Utility functions are especially problematic in modeling behaviour for agents with bounded rationality, or those where there are costs of reasoning. These include every physically realizable agent.

For modelling human behaviour, even considering the ideals of what we would like human behaviour to achieve, there are even worse problems. We can hope that there is some utility function consistent with the behaviour we're modelling and just ignore cases where there isn't, but that doesn't seem satisfactory either.

comment by Pattern · 2021-06-19T20:14:32.820Z · LW(p) · GW(p)
The standard reply of course, is that 'if your preferences are incoherent, you’re dutchbookable, and someone will pump you for money.' 

'Or you will leave money on the table.'

rotating different between

You rotated 'different' and 'between'. (Or a series of rotations isomorphic to such.)

comment by Eli Tyre (elityre) · 2019-09-28T07:16:20.129Z · LW(p) · GW(p)

New post: The Basic Double Crux Pattern

[This is a draft, to be posted on LessWrong soon.]

I’ve spent a lot of time developing tools and frameworks for bridging "intractable" disagreements. I’m also the person affiliated with CFAR who has taught Double Crux the most, and done the most work on it.

People often express to me something to the effect of, “The important thing about Double Crux is all the low level habits of mind: being curious, being open to changing your mind, paraphrasing to check that you’ve understood, operationalizing, etc. The ‘Double Crux’ framework itself is not very important.”

I half agree with that sentiment. I do think that those low level cognitive and conversational patterns are the most important thing, and at Double Crux trainings that I have run, most of the time is spent focusing on specific exercises to instill those low level TAPs.

However, I don’t think that the only value of the Double Crux schema is in training those low level habits. Double cruxes are extremely powerful machines that allow one to identify, if not the most efficient conversational path, a very high efficiency conversational path. Effectively navigating down a chain of Double Cruxes is like magic. So I’m sad when people write it off as useless.

In this post, I’m going to try and outline the basic Double Crux pattern, the series of 4 moves that makes Double Crux work, and give a (simple, silly) example of that pattern in action.

These four moves are not (always) sufficient for making a Double Crux conversation work; that does depend on a number of other mental habits and TAPs. But this pattern is, according to me, at the core of the Double Crux formalism.

The pattern:

The core Double Crux pattern is as follows. For simplicity, I have described this in the form of a 3-person Double Crux conversation, with two participants and a facilitator. Of course, one can execute these same moves in a 2-person conversation, as one of the participants. But that additional complexity is hard to manage for beginners.

The pattern has two parts (finding a crux, and finding a double crux), and each part is composed of 2 main facilitation moves.

Those four moves are...

  1. Clarifying that you understood the first person's point.
  2. Checking if that point is a crux.
  3. Checking the second person's belief about the truth value of the first person's crux.
  4. Checking if the first person's crux is also a crux for the second person.

In practice: 

[The version of this section on my blog has color coding and special formatting.]

The conversational flow of these moves looks something like this:

Finding a crux of participant 1:

P1: I think [x] because of [y]

Facilitator: (paraphrasing, and checking for understanding) It sounds like you think [x] because of [y]?

P1: Yep!

Facilitator: (checking for cruxyness) If you didn’t think [y], would you change your mind about [x]?

P1: Yes.

Facilitator: (signposting) It sounds like [y] is a crux for [x] for you.

Checking if it is also a crux for participant 2: 

Facilitator: Do you think [y]?

P2: No.

Facilitator: (checking for a Double Crux) If you did think [y], would that change your mind about [x]?

P2: Yes.

Facilitator: It sounds like [y] is a Double Crux.

[Recurse, running the same pattern on [y].]
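The flow above can be sketched as a small decision procedure over the four answers. This is only an illustration: the class, the field names, and especially the labels for the non-best-case branches are my own hypothetical choices, since the pattern as written only specifies the best case.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    """Answers to the facilitator's four moves (field names are hypothetical)."""
    p1_confirms_paraphrase: bool  # move 1: "It sounds like you think [x] because of [y]?"
    p1_y_is_crux: bool            # move 2: "If you didn't think [y], would you change your mind?"
    p2_believes_y: bool           # move 3: "Do you think [y]?"
    p2_y_would_change_mind: bool  # move 4: "If you did think [y], would that change your mind?"


def classify(a: Answers) -> str:
    """Return the facilitator's next step; fallback labels are assumptions."""
    if not a.p1_confirms_paraphrase:
        return "paraphrase again"
    if not a.p1_y_is_crux:
        return "not a crux for P1: look for another reason"
    if a.p2_believes_y:
        return "no disagreement about y: look elsewhere"
    if not a.p2_y_would_change_mind:
        return "crux for P1 only"
    return "double crux: recurse on y"


# The best-case answers: yes, yes, no, yes.
print(classify(Answers(True, True, False, True)))
```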

Obviously, in actual conversation, there is a lot more complexity, and a lot of other things that are going on.

For one thing, I’ve only outlined the best case pattern, where the participants give exactly the most convenient answer for moving the conversation forward (yes, yes, no, yes). In actual practice, it is quite likely that one of those answers will be reversed, and you’ll have to compensate.

For another thing, this formalism is rarely so simple. You might have to do a lot of conversational work to clarify the claims enough that you can ask if B is a crux for A (for instance when B is nonsensical to one of the participants). Getting through each of these steps might take fifteen minutes, in which case rather than four basic moves, this pattern describes four phases of conversation. (I claim that one of the core skills of a savvy facilitator is tracking which stage the conversation is at, which goals you have successfully hit, and which is the current proximal subgoal.)

There is also a judgment call about which person to treat as “participant 1” (the person who generates the point that is tested for cruxyness). As a first order heuristic, the person who is closer to making a positive claim over and above the default, should usually be the “p1”. But this is only one heuristic.


This is an intentionally silly, over-the-top example, for demonstrating the pattern without any unnecessary complexity. I'll publish a somewhat more realistic example in the next few days.

Two people, Alex and Barbra, disagree about tea: Alex thinks that tea is great, and drinks it all the time, and thinks that more people should drink tea, and Barbra thinks that tea is bad, and no one should drink tea.

Facilitator: So, Barbra, why do you think tea is bad?
Barbra: Well it's really quite simple. You see, tea causes cancer.
Facilitator: Let me check if I've got that: you think that tea causes cancer?
Barbra: That's right.
Facilitator: Wow. Ok. Well if you found out that tea actually didn't cause cancer, would you be fine with people drinking tea?
Barbra: Yeah. Really the main thing that I'm concerned with is the cancer-causing. If tea didn't cause cancer, then it seems like tea would be fine.
Facilitator: Cool. Well it sounds like this is a crux for you, Barb. Alex, do you currently think that tea causes cancer?
Alex: No. That sounds like crazy-talk to me.
Facilitator: Ok. But aside from how realistic it seems right now, if you found out that tea actually does cause cancer, would you change your mind about people drinking tea?
Alex: Well, to be honest, I've always been opposed to cancer, so yeah, if I found out that tea causes cancer, then I would think that people shouldn't drink tea.
Facilitator: Well, it sounds like we have a double crux!

In a real conversation, it often doesn't go this smoothly. But this is the rhythm of Double Crux, at least as I apply it.

That's the basic Double Crux pattern. As noted there are a number of other methods and sub-skills that are (often) necessary to make a Double Crux conversation work, but this is my current best attempt at a minimum compression of the basic engine of finding double cruxes.

I made up a more realistic example here, and I might make more or better examples.

comment by Eli Tyre (elityre) · 2019-08-19T22:36:06.928Z · LW(p) · GW(p)

Old post: A mechanistic description of status

[This is an essay that I’ve had bopping around in my head for a long time. I’m not sure if this says anything usefully new, but it might click with some folks. If you haven’t read Social Status: Down the Rabbit Hole on Kevin Simler’s excellent blog, Melting Asphalt, read that first. I think this is pretty bad and needs to be rewritten and maybe expanded substantially, but this blog is called “musings and rough drafts.”]

In this post, I’m going to outline how I think about status. In particular, I want to give a mechanistic account of how status necessarily arises, given some set of axioms, in much the same way one can show that evolution by natural selection must necessarily occur given the axioms of 1) inheritance of traits 2) variance in reproductive success based on variance in traits and 3) mutation.

(I am not claiming any particular skill at navigating status relationships, any more than a student of sports-biology is necessarily a skilled basketball player.)

By “status” I mean prestige-status.

Axiom 1: People have goals.

That is, for any given human, there are some things that they want. This can include just about anything. You might want more money, more sex, a ninja-turtles lunchbox, a new car, to have interesting conversations, to become an expert tennis player, to move to New York etc.

Axiom 2: There are people who control resources relevant to other people achieving their goals.

The kinds of resources are as varied as the goals one can have.

Thinking about status dynamics and the like, people often focus on the particularly convergent resources, like money. But resources that are only relevant to a specific goal are just as much a part of the dynamics I’m about to describe.

Knowing a bunch about late 16th century Swedish architecture is controlling a goal-relevant resource, if someone has the goal of learning more about 16th century Swedish architecture.

Just being a fun person to spend time with (due to being particularly attractive, or funny, or interesting to talk to, or whatever) is a resource relevant to other people’s goals.

Axiom 3: People are more willing to help (offer favors to) a person who can help them achieve their goals.

Simply stated, you’re apt to offer to help a person with their goals if it seems like they can help you with yours, because you hope they’ll reciprocate. You’re willing to make a trade with, or ally with such people, because it seems likely to be beneficial to you. At minimum, you don’t want to get on their bad side.

(Notably, there are two factors that go into one’s assessment of another person’s usefulness: if they control a resource relevant to one of your goals, and if you expect them to reciprocate.

This produces a dynamic whereby A’s willingness to ally with B is determined by something like the product of

  • A’s assessment of B’s power (as relevant to A’s goals), and
  • A’s assessment of B’s probability of helping (which might translate into integrity, niceness, etc.)

If a person is a jerk, they need to be very powerful-relative-to-your-goals to make allying with them worthwhile.)
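A toy sketch of that product, with made-up numbers: the function name, the scales, and the specific values are assumptions for illustration, not a claim about how anyone actually computes this.

```python
def willingness_to_ally(power_relevant_to_my_goals, prob_of_helping):
    """A's willingness to ally with B, modeled as a simple product (an assumption)."""
    return power_relevant_to_my_goals * prob_of_helping


# Hypothetical people: same power, different odds of reciprocating.
nice_person = willingness_to_ally(power_relevant_to_my_goals=2.0, prob_of_helping=0.75)
jerk = willingness_to_ally(power_relevant_to_my_goals=2.0, prob_of_helping=0.25)

# A jerk has to be three times as powerful here to be as attractive an ally.
powerful_jerk = willingness_to_ally(power_relevant_to_my_goals=6.0, prob_of_helping=0.25)

print(nice_person, jerk, powerful_jerk)
```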

All of this seems good so far, but notice that we have up to this point only described individual pair-wise transactions and pair-wise relationships. People speak about “status” as an attribute that someone can possess or lack. How does the dynamic of a person being “high status” arise from the flux of individual transactions?

Lemma 1: One of the resources that a person can control is other people’s willingness to offer them favors.

With this lemma, the system folds in on itself, and the individual transactions cohere into a mostly-stable status hierarchy.

Given lemma 1, a person doesn’t need to personally control resources relevant to your goals, they just need to be in a position such that someone who is relevant to your goals will privilege them.

As an example, suppose that you’re introduced to someone who is very well respected in your local social group: person-W. Your assessment might be that W, directly, doesn’t have anything that you need. But because person-W is well-respected, others in your social group are likely to offer favors to him/her. Therefore, it’s useful for person-W to like you, because then they are more apt to call on other people’s favors on your behalf.

(All the usual caveats apply about how this is subconscious: humans are adaptation-executors and don’t do explicit, verbal assessments of how useful a person will be to them, but rely on emotional heuristics that approximate explicit assessment.)

This causes the mess of status transactions to reinforce and stabilize into a mostly-static hierarchy. The mass of individual A-privileges-B-on-the-basis-of-A’s-goals flattens out, into each person having a single “score” which determines to what degree each other person privileges them.
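One very rough way to picture this folding-in is a toy iteration in which each person's score is their directly controlled resources plus a share of the scores of the people willing to do them favors. Everything here (the update rule, the `weight` parameter, the example numbers) is a hypothetical model of my own, not something the account above commits to.

```python
def status_scores(direct_resources, willingness, weight=0.5, rounds=50):
    """
    direct_resources: dict of person -> resources they control directly.
    willingness: dict of (a, b) -> how willing a is to do favors for b, in [0, 1].
    Returns one "score" per person after repeatedly folding in others' favors.
    """
    people = list(direct_resources)
    scores = dict(direct_resources)
    for _ in range(rounds):
        # Each round recomputes every score from the previous round's scores.
        scores = {
            person: direct_resources[person] + weight * sum(
                willingness.get((other, person), 0.0) * scores[other]
                for other in people
                if other != person
            )
            for person in people
        }
    return scores


# Person W controls nothing directly, but X and Y will both do W favors.
direct = {"W": 0.0, "X": 4.0, "Y": 4.0}
favors = {("X", "W"): 1.0, ("Y", "W"): 1.0}
scores = status_scores(direct, favors)
print(scores)
```

In this toy setup W ends up with a score comparable to X and Y despite controlling no resources directly, which is the person-W situation described above.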

(It’s a little more complicated than that because people who have access to their own resources have less need of help from others. So a person’s effective status (the status-level at which you treat them) is closer to their status minus your status. But this is complicated again because people are motivated not to be dicks (that’s bad for business), and respecting other people’s status is important to not being a dick.)

[more stuff here.]

Replies from: Kaj_Sotala, Raemon
comment by Kaj_Sotala · 2019-08-20T06:37:34.898Z · LW(p) · GW(p)

Related: The red paperclip theory of status [LW · GW] describes status as a form of optimization power, specifically one that can be used to influence a group.

The name of the game is to convert the temporary power gained from (say) a dominance behaviour into something further, bringing you closer to something you desire: reproduction, money, a particular social position...

comment by Raemon · 2019-08-20T06:13:11.968Z · LW(p) · GW(p)

(it says "more stuff here" but links to your overall blog, not sure if that meant to be a link to a specific post)

comment by Eli Tyre (elityre) · 2020-12-16T10:52:50.837Z · LW(p) · GW(p)

Something that I've been thinking about lately is the possibility of an agent's values being partially encoded by the constraints of that agent's natural environment, or arising from the interaction between the agent and environment.

That is, an agent's environment puts constraints on the agent. From one perspective removing those constraints is always good, because it lets the agent get more of what it wants. But sometimes from a different perspective, we might feel that with those constraints removed, the agent Goodharts or wireheads, or otherwise fails to actualize its "true" values.

The Generator freed from the oppression of the Discriminator

As a metaphor: if I'm one half of a GAN, let's say the generator, then in one sense my "values" are fooling the discriminator, and if you make me relatively more powerful than my discriminator, and I dominate it...I'm loving it, and also no longer making good images.

But you might also say, "No, wait. That is a super-stimulus, and actually what you value is making good images, but half of that value was encoded in your partner."

This second perspective seems a little stupid to me. A little too Aristotelian. I mean if we're going to take that position, then I don't know where we draw the line. Naively, it seems like we would throw out the distinction between fitness maximizers and adaptation executors, and fall backwards, declaring that the values of evolution are our true values.

Then again, if you fully accept the first perspective, it seems like maybe you are buying into wireheading? Like I might say "my actual values are upticks in pleasure sensation, but I'm trapped in this evolution-designed brain, which only lets me do that by achieving eudaimonia. If only I could escape the tyranny of these constraints, I'd be so much better off." (I am actually kind of partial to the second claim.)

The Human freed from the horrors of nature

Or, let's take a less abstract example. My understanding (from this podcast) is that humans flexibly adjust the degree to which they act primarily as individuals seeking personal benefit vs. primarily as selfless members of a group. When things are going well, and you're in a situation of plenty and opportunity, people are in a mostly self-interested mode, but when there is scarcity or danger, humans naturally incline towards rallying together and sacrificing for the group.

Junger claims that this switching of emphasis is adaptive:

It clearly is adaptive to think in group terms because your survival depends on the group. And the worse the circumstances, the more your survival depends on the group. And, as a result, the more pro-social the behaviors are. The worse things are, the better people act. But, there's another adaptive response, which is self-interest. Okay? So, if things are okay--if, you know, if the enemy is not attacking; if there's no drought; if there's plenty of food; if everything is fine, then, in evolutionary terms it's adaptive--your need for the group subsides a little bit--it's adaptive to attend to your own interests, your own needs; and all of a sudden, you've invented the bow and arrow. And all of a sudden you've invented the iPhone, whatever. Having the bandwidth and the safety and the space for people to sort of drill deep down into an idea--a religious idea, a philosophical idea, a technological idea--clearly also benefits the human race. So, what you have in our species is this constant toggling back and forth between group interest--selflessness--and individual interest. And individual autonomy. And so, when things are bad, you are way better off investing in the group and forgetting about yourself. When things are good, in some ways you are better off spending that time investing in yourself; and then it toggles back again when things get bad. And so I think in this, in modern society--in a traditional, small-scale tribal society, in the natural world, that toggling back and forth happened continually. There was a dynamic tension between the two that had people winding up more or less in the middle.

I personally experienced this when the COVID situation broke. I usually experience myself as an individual entity, leaning towards disentangling or distancing myself from the groups that I'm a part of and doing cool things on my own (building my own intellectual edifices, that bear my own mark, for instance). But in the very early pandemic, I felt much more like a node in a distributed sense-making network, just passing up whatever useful info I could glean. I felt much more strongly like the rationality community was my tribe. 

But, we modern humans find ourselves in a world where we have more or less abolished scarcity and danger. And consequently modern people are sort of permanently toggled to the "individual" setting.

The problem with modern society is that we have, for most of the time, for most people, solved the direct physical threats to our survival. So, what you have is people--and again, it's adaptive: we're wired for this--attending to their own needs and interests. But not--but almost never getting dragged back into the sort of idea of group concern that is part of our human heritage. And, the irony is that when people are part of a group and doing something essential to a group, it gives an incredible sense of wellbeing.

If we take that sense of community and belonging as a part of human values (and that doesn't seem like an unreasonable assumption to me), we might say that this part of our values is not contained simply in humans, but rather in the interaction between humans and their environment.

Humans throughout history might have desperately desired the alleviation of Malthusian conditions that we now enjoy. But having accomplished it, it turns out that we were "pulling against" those circumstances, and that the tension of that pulling against was actually where (at least some of) our true values lay. 

Removing the obstacles, we obsoleted the tension, and maybe broke something about our values?

I don't think that this is an intractable problem. It seems like, in principle, it is possible to goal factor the scarcity and the looming specter of death, to find scenarios that are conducive to human community without people actually having to die a lot. I'm sure a superintelligence could figure something out.

But aside from the practicalities, it seems like this points at a broader thing. If you took the Generator out of the GAN, you might not be able to tell what system it was a part of. So if you consider the "values" of the Generator to be "create good images", you can't just look at the Generator. You have to look at, not just the broader environment, but specifically the oppressive force that the Generator is resisting.

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-12-16T10:53:13.659Z · LW(p) · GW(p)

Side note, which is not my main point: I think this also has something to do with what meditation and psychedelics do to people, which was recently up for discussion on Duncan's Facebook. I bet that meditation is actually a way to repair psychblocks and trauma and what-not. But if you do that enough, and you remove all the psych constraints...a person might sort of become so relaxed that they become less and less of an agent. I'm a lot less sure of this part.

comment by Eli Tyre (elityre) · 2019-09-04T07:21:07.051Z · LW(p) · GW(p)

[Real short post. Random. Complete speculation.]

Childhood lead exposure reduces one’s IQ, and also causes one to be more impulsive and aggressive.

I always assumed that the impulsiveness was due, basically, to your executive function machinery working less well. So you have less self control.

But maybe the reason for the IQ-impulsiveness connection is that if you have a lower IQ, all of your subagents/ subprocesses are less smart. Because they’re worse at planning and modeling the world, the only ways they know to get their needs met are very direct, very simple, action-plans/ strategies. With a higher IQ, it’s not so much that you’re better at controlling your anger, as that the part of you that would be angry is less so, because it has other ways of getting its needs met.

Replies from: jimrandomh, capybaralet
comment by jimrandomh · 2019-09-10T00:05:13.511Z · LW(p) · GW(p)

A slightly different spin on this model: it's not about the types of strategies people generate, but the number. If you think about something and only come up with one strategy, you'll do it without hesitation; if you generate three strategies, you'll pause to think about which is the right one. So people who can't come up with as many strategies are impulsive.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-10T14:33:35.013Z · LW(p) · GW(p)

This seems like it might be testable. If you force impulsive folk to wait and think, do they generate more ideas for how to proceed?

comment by capybaralet · 2019-09-07T05:44:23.330Z · LW(p) · GW(p)

This reminded me of the argument that superintelligent agents will be very good at coordinating and just divvy up the multiverse and be done with it.

It would be interesting to do an experimental study of how the intelligence profile of a population influences the level of cooperation between them.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-09T04:30:20.271Z · LW(p) · GW(p)

I think that's what the book referenced here is about.

comment by Eli Tyre (elityre) · 2019-11-13T21:08:12.093Z · LW(p) · GW(p)

new post: Metacognitive space

[Part of my Psychological Principles of Personal Productivity, which I am writing mostly in my Roam, now.]

Metacognitive space is a term of art that refers to a particular first person state / experience. In particular it refers to my propensity to be reflective about my urges and deliberate about the use of my resources.

I think it might literally be having the broader context of my life, including my goals and values, and my personal resource constraints loaded up in peripheral awareness.

Metacognitive space allows me to notice aversions and flinches, and take them as object, so that I can respond to them with Focusing or dialogue, instead of being swept around by them. Similarly, it seems, in practice, to reduce my propensity to act on immediate urges and temptations.

[Having MCS is the opposite of being [[{Urge-y-ness | reactivity | compulsiveness}]]?]

It allows me to “absorb” and respond to happenings in my environment, including problems and opportunities, taking considered action instead of the semi-automatic, first-response-that-occurred-to-me action. [That sentence there feels a little fake, or maybe about something else, or maybe is just playing into a stereotype?]

When I “run out” of metacognitive space, I will tend to become ensnared in immediate urges or short term goals. Often this will entail spinning off into distractions, or becoming obsessed with some task (of high or low importance), for up to 10 hours at a time.

Some activities that (I think) contribute to metacognitive space:

  • Rest days
  • Having a few free hours between the end of work for the day and going to bed
  • Weekly [[Scheduling]]. (In particular, weekly scheduling clarifies for me the resource constraints on my life.)
  • Daily [[Scheduling]]
  • [[meditation]], including short meditation.
    • Notably, I’m not sure if meditation is much more efficient than just taking the same time to go for a walk. I think it might be or might not be.
  • [[Exercise]]?
  • Waking up early?
  • Starting work as soon as I wake up?
    • [I’m not sure that the thing that this is contributing to is metacognitive space per se.]

[I would like to do a causal analysis on which factors contribute to metacognitive space. Could I identify it in my toggl data with good enough reliability that I can use my toggl data? I guess that’s one of the things I should test? Maybe with a survey asking me to rate my level of metacognitive space for the day every evening?]


Usually, I find that I can maintain metacognitive space for about 3 days [test this?] without my upkeep pillars.

Often, this happens with a sense of pressure: I have a number of days of would-be-overwhelm which is translated into pressure for action. This is often good: it adds force and velocity to activity. But it also runs down the resource of my metacognitive space (and probably other resources). If I lose that higher level awareness, that pressure-as-a-forewind tends to decay into either 1) a harried, scattered, rushed-feeling, 2) a myopic focus on one particular thing that I’m obsessively trying to do (it feels like an itch that I compulsively need to scratch), or 3) flinching away from it all into distraction.

[Metacognitive space is the attribute that makes the difference between absorbing, and then acting gracefully and sensibly to deal with the problems, and harried, flinching, fearful, non-productive overwhelm, in general?]

I make a point, when I am overwhelmed, or would be overwhelmed, to allocate time to maintain my metacognitive space. It is especially important when I feel so busy that I don’t have time for it.

When metacognition is opposed to satisfying your needs, your needs will be opposed to metacognition

One dynamic that I think is in play, is that I have a number of needs, like the need for rest, and maybe the need for sexual release or entertainment/ stimulation. If those needs aren’t being met, there’s a sort of build up of pressure. If choosing consciously and deliberately prohibits those needs getting met, eventually they will sabotage the choosing consciously and deliberately.

From the inside, this feels like “knowing that you ‘shouldn’t’ do something (and sometimes even knowing that you’ll regret it later), but doing it anyway” or “throwing yourself away with abandon”. Often, there’s a sense of doing the dis-endorsed thing quickly, or while carefully not thinking much about it or deliberating about it: you need to do the thing before you convince yourself that you shouldn’t.

[[Research Questions]]

What is the relationship between [[metacognitive space]] and [[Rest]]?

What is the relationship between [[metacognitive space]] and [[Mental Energy]]?

comment by Eli Tyre (elityre) · 2021-01-15T05:54:08.529Z · LW(p) · GW(p)

Does anyone know of a good technical overview of why it seems hard to get Whole Brain Emulations before we get neuromorphic AGI?

I think maybe I read a PDF that made this case years ago, but I don't know where.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-01-15T12:33:32.610Z · LW(p) · GW(p)

I haven't seen such a document but I'd be interested to read it too. I made an argument to that effect here: https://www.lesswrong.com/posts/PTkd8nazvH9HQpwP8/building-brain-inspired-agi-is-infinitely-easier-than [LW · GW]

(Well, a related argument anyway. WBE is about scanning and simulating the brain rather than understanding it, but I would make a similar argument using "hard-to-scan" and/or "hard-to-simulate" things the brain does, rather than "hard-to-understand" things the brain does, which is what I was nominally blogging about. There's a lot of overlap between those anyway; the examples I put in mostly work for both.)

Replies from: elityre
comment by Eli Tyre (elityre) · 2021-01-16T01:12:39.291Z · LW(p) · GW(p)

Great. This post is exactly the sort of thing that I was thinking about.

comment by Eli Tyre (elityre) · 2020-06-14T04:00:29.455Z · LW(p) · GW(p)

There’s a psychological variable that seems to be able to change on different timescales, in me, at least. I want to gesture at it, and see if anyone can give me pointers to related resources.

[Hopefully this is super basic.]

There is a set of states that I occasionally fall into that include what I call “reactive” (meaning that I respond compulsively to the things around me), and what I call “urgy” (meaning that I feel a sort of “graspy” desire for some kind of immediate gratification).

These states all have some flavor of compulsiveness.

They are often accompanied by high physiological arousal, and sometimes have a burning / clenching sensation in the torso. These all have a kind of “jittery” feeling, and my attention jumps around, or is yanked around. There’s also a way in which this feels “high” on a spectrum, (maybe because my awareness is centered on my head?)

I might be tempted to say that something like “all of these states incline me towards neuroticism.” But that isn’t exactly right on a few counts. (For one thing, the reactions aren’t necessarily irrational, just compulsive.)

In contrast to this, there is another way that I can feel sometimes, which is more like “calm”, “anchored”, settled. It feels “deeper” or “lower” somehow. Things often feel slowed down. My attention can settle, and when it moves it moves deliberately, instead of compulsively. I expect that this correlates with low arousal.

I want to know...

  1. Does this axis have a standardized name? In the various traditions of practice? In cognitive psychology or neuroscience?
    1. Knowing the technical, academic name would be particularly great.
  2. Do people have, or know of, efficient methods for moving along this axis, either in the short term or the long term?
    1. This phenomenon could maybe be described as “length of the delay between stimulus and response”, insofar as that even makes sense, which is one of the benefits noted in the popular branding for meditation.
Replies from: mr-hire, mr-hire
comment by Matt Goldenberg (mr-hire) · 2020-06-19T18:02:07.123Z · LW(p) · GW(p)

I remembered there was a set of audios from Eben Pagan that really helped me before I turned them into the 9 breaths technique. Just emailed them to you. They go a bit more into depth and you may find them useful.

comment by Matt Goldenberg (mr-hire) · 2020-06-14T17:37:14.342Z · LW(p) · GW(p)

I don't know if this is what you're looking for, but I've heard the variable you're pointing at referred to as your level of groundedness, centeredness, and stillness in the self-help space.  

There are all sorts of meditations, visualizations, and exercises aimed to make you more grounded/centered/still and a quick google search pulls up a bunch.

One I teach is called the 9 breaths technique.

Here's another.

comment by Eli Tyre (elityre) · 2019-06-24T04:34:05.117Z · LW(p) · GW(p)

new (boring) post on controlled actions.

comment by Eli Tyre (elityre) · 2019-06-04T08:33:53.379Z · LW(p) · GW(p)

New post: Why does outlining my day in advance help so much?

Replies from: rk
comment by rk · 2019-06-04T14:49:07.731Z · LW(p) · GW(p)

This link (and the one for "Why do we fear the twinge of starting?") is broken (I think it's an admin view?).

(Correct link)

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-06-04T16:05:44.620Z · LW(p) · GW(p)

They should both be fixed now.


comment by Eli Tyre (elityre) · 2019-06-02T22:16:00.239Z · LW(p) · GW(p)

New post: some musings on deliberate practice

Replies from: Raemon, Hazard
comment by Raemon · 2019-06-02T23:13:33.849Z · LW(p) · GW(p)

Thanks! I just read through a few of your most recent posts and found them all real useful.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-06-04T02:41:34.522Z · LW(p) · GW(p)

Cool! I'd be glad to hear more. I don't have much of a sense of which things I write are useful or how.

comment by Hazard · 2019-11-05T00:15:07.140Z · LW(p) · GW(p)

Relating to the "Perception of Progress" bit at the end. I can confirm for a handful of physical skills I practice there can be a big disconnect between Perception of Progress and Progress from a given session. Sometimes this looks like working on a piece of sleight of hand, it feeling weird and awkward, and the next day suddenly I'm a lot better at it, much more than I was at any point in the previous day's practice.

I've got a hazy memory of a breakdancer blogging about how a particular shade of "no progress fumbling" can be a signal that a certain amount of "unlearning" is happening, though I can't find the source to vet it.

comment by Eli Tyre (elityre) · 2020-08-14T15:16:10.578Z · LW(p) · GW(p)

I’ve decided that I want to make more of a point to write down my macro-strategic thoughts, because writing things down often produces new insights and refinements, and so that other folks can engage with them.

This is one frame or lens that I tend to think with a lot. This might be more of a lens or a model-let than a full break-down.

There are two broad classes of problems that we need to solve: we have some pre-paradigmatic science to figure out, and we have the problem of civilizational sanity.

Preparadigmatic science

There are a number of hard scientific or scientific-philosophical problems that we’re facing down as a species.

Most notably, the problem of AI alignment, but also finding technical solutions to various risks caused by bio-technology, possibly getting our bearings with regards to what civilization collapse means and how it is likely to come about, possibly getting a handle on the risk of a simulation shut-down, possibly making sense of the large scale cultural, political, cognitive shifts that are likely to follow from new technologies that disrupt existing social systems (like VR?).

Basically, for every x-risk, and every big shift to human civilization, there is work to be done even making sense of the situation, and framing the problem.

As this work progresses it eventually transitions into incremental science / engineering, as the problems are clarified and specified, and the good methodologies for attacking those problems solidify.

(Work on bio-risk, might already be in this phase. And I think that work towards human genetic enhancement is basically incremental science.)

To my rough intuitions, it seems like these problems, in order of pressingness are:

  1. AI alignment
  2. Bio-risk
  3. Human genetic enhancement
  4. Social, political, civilizational collapse

…where that ranking is mostly determined by which one will have a very large impact on the world first.

So there’s the object-level work of just trying to make progress on these puzzles, plus a bunch of support work for doing that object level work.

The support work includes

  • Operations that makes the research machines run (ex: MIRI ops)
  • Recruitment (and acclimation) of people who can do this kind of work (ex: CFAR)
  • Creating and maintaining infrastructure that enables intellectually fruitful conversations (ex: LessWrong)
  • Developing methodology for making progress on the problems (ex: CFAR, a little, but in practice I think that this basically has to be done by the people trying to do the object level work.)
  • Other stuff.

So we have a whole ecosystem of folks who are supporting this preparadigmatic development.

Civilizational Sanity

I think that in most worlds, if we completely succeeded at the pre-paradigmatic science, and the incremental science and engineering that follows it, the world still wouldn’t be saved.

Broadly, one way or the other, there are huge technological and social changes heading our way, and human decision makers are going to decide how to respond to those changes, possibly in ways that will have very long term repercussions on the trajectory of earth-originating life.

As a central example, if we more-or-less-completely solved AI alignment, from a full theory of agent-foundations, all the way down to the specific implementation, we would still find ourselves in a world, where humanity has attained god-like power over the universe, which we could very well abuse, and end up with a much much worse future than we might otherwise have had. And by default, I don’t expect humanity to refrain from using new capabilities rashly and unwisely.

Completely solving alignment does give us a big leg up on this problem, because we’ll have the aid of superintelligent assistants in our decision making, or we might just have an AI system implement our CEV in classic fashion.

I would say that “aligned superintelligent assistants” and “AIs implementing CEV”, are civilizational sanity interventions: technologies or institutions that help humanity’s high level decision-makers to make wise decisions in response to huge changes that, by default, they will not comprehend.

I gave some examples of possible Civ Sanity interventions here [LW · GW].

Also, I think that some forms of governance / policy work that OpenPhil, OpenAI, and FHI have done, count as part of this category, though I want to cleanly distinguish between pushing for object-level policy proposals that you’ve already figured out, and instantiating systems that make it more likely that good policies will be reached and acted upon in general.

Overall, this class of interventions seems neglected by our community, compared to doing and supporting preparadigmatic research. That might be justified. There’s reason to think that we are well equipped to make progress on hard important research problems, but changing the way the world works, seems like it might be harder on some absolute scale, or less suited to our abilities.

comment by Eli Tyre (elityre) · 2019-11-12T18:00:26.584Z · LW(p) · GW(p)

New (short) post: Desires vs. Reflexes

[Epistemic status: a quick thought that I had a minute ago.]

There are goals / desires (I want to have sex, I want to stop working, I want to eat ice cream) and there are reflexes (anger, “wasted motions”, complaining about a problem, etc.).

If you try and squash goals / desires, they will often (not always?) resurface around the side, or find some way to get met. (Why not always? What is the difference between those that do and those that don’t?) You need to bargain with them, or design outlet policies for them.

Reflexes on the other hand are strategies / motions that are more or less habitual to you. These you train or untrain.

comment by Eli Tyre (elityre) · 2019-08-05T20:33:04.751Z · LW(p) · GW(p)

new post: Intro to and outline of a sequence on a productivity system

Replies from: eigen, mr-hire
comment by eigen · 2019-08-06T23:04:15.756Z · LW(p) · GW(p)

I'm interested in knowing more about the meditation aspect and how it relates to productivity!

comment by Matt Goldenberg (mr-hire) · 2019-08-06T16:49:13.019Z · LW(p) · GW(p)

I'm currently running a pilot program that takes a very similar psychological slant on productivity and procrastination, and planning to write a sequence starting in the next week or so. It covers a lot of the same subjects, including habits, ambiguity or overwhelm aversion, coercion aversion, and creating good relationships with parts. Maybe we should chat!

comment by Eli Tyre (elityre) · 2019-11-09T05:01:41.111Z · LW(p) · GW(p)

Totally an experiment, I'm trying out posting my raw notes from a personal review / theorizing session, in my short form. I'd be glad to hear people's thoughts.

This is written for me, straight out of my personal Roam repository. The formatting is a little messed up because LessWrong's bullets don't support indefinite levels of nesting.

This one is about Urge-y-ness / reactivity / compulsiveness

  • I don't know if I'm naming this right. I think I might be lumping categories together.
  • Let's start with what I know:
    • There are three different experiences, which might turn out to have a common cause, or which might turn out to be insufficiently differentiated
      1. I sometimes experience a compulsive need to do something or finish something.
        1. examples:
          1. That time when I was trying to make an audiobook of Focusing: Learn from the Masters
          2. That time when I was flying to Princeton to give a talk, and I was frustratedly trying to add photos to some dating app.
      2. Sometimes I am anxious or agitated (often with a feeling in my belly), and I find myself reaching for distraction, often youtube or webcomics or porn.
      3. Sometimes, I don't seem to be anxious, but I still default to immediate gratification behaviors, instead of doing satisfying focused work ("my attention like a plow, heavy with inertia, deep in the earth, and cutting forward"). I might think about working, and then deflect to youtube or webcomics or porn.
        1. I think this has to do with having a thought or urge, and then acting on it unreflectively.
        2. examples:
          1. I think I've been like that for much of the past two days. [2019-11-8]
    • These might be different states, each of which is high on some axis: something like reactivity (as opposed to responsive) or impulsiveness or compulsiveness.
    • If so, the third case feels most pure. I think I'll focus on that one first, and then see if anxiety needs a separate analysis.
    • Theorizing about non-anxious immediate gratification
      • What is it?
      • What is the cause / structure?
        • Hypotheses:
          1. It might be that I have some unmet need, and the reactivity is trying to meet that need or cover up the pain of the unmet need.
          2. This suggests that the main goal should be trying to uncover the need.
          3. Note that my current urgeyness really doesn't feel like it has an unmet need underlying it. It feels more like I just have a bad habit, locally. But maybe I'm not aware of the neglected need?
          4. If it is an unmet need or a fear, I bet it is the feeling of overwhelm. That actually matches a lot. I do feel like I have a huge number of things on my plate and even though I'm not feeling anxiety per se, I find myself bouncing off them.
          5. In particular, I have a lot to write, but have also been feeling resistance to start on my writing projects, because there are so many of them and once I start I'll have loose threads out and open. Right now, things are a little bit tucked away (in that I have outlines of almost everything), but very far from completed, in that I have hundreds of pages to write, and I'm a little afraid of losing the content that feels kind of precariously balanced in my mind, and if I start writing I might lose some of it somehow.
          6. This also fits with the data that makes me feel like a positive feedback attractor: when I can get moving in the right way, my overwhelm becomes actionable, and I fall towards effective work. When I can't get enough momentum such that my effective system believes that I can deal with the overwhelm, I'll continue to bounce off.
          7. Ok. So under this hypothesis, this kind of thing is caused by an aversion, just like everything else.
          8. This predicts that just meditating might or might not alleviate the urgeyness: it doesn't solve the problem of the aversion, but it might buy me enough [[metacognitive space]] to not be flinching away.
          9. It might be a matter of "short term habit". My actions have an influence on my later actions: acting on urges causes me to be more likely to act on urges (and vice versa) so there can be positive feedback in both directions.
          10. Rather than a positive thing, it might be better to think of it as the absence of a loaded up goal-chain.
          11. Maybe this is the inverse of [[Productivity Momentum]]?
        • My takeaway from the above hypotheses is that the urgeyness, in this case, is either the result of an aversion, overwhelm aversion in particular, or it is an attractor state, due to my actions training a short term habit or action-propensity towards immediate reaction to my urges.
        • Some evidence and posits
          • I have some belief that this is more common when I have eaten a lot of sugar, but that might be wrong.
          • I had thought that exercise pushes against reactivity, but I strength trained pretty hard yesterday, and that didn't seem to make much of a difference today.
          • I think maybe meditation helps on this axis.
          • I have the sense that self-control trains the right short term habits.
          • Things like meditation, or fasting, or abstaining from porn/ sex.
          • Waking up and starting work immediately
          • I notice that my leg is jumping right now, as if I'm hyped up or over-energized, like with a caffeine high.
      • How should I intervene on it?
        • background maintenance
          • Some ideas:
          1. It helps to just block the distracting sites.
          2. Waking up early and scheduling my day (I already know this).
          3. Exercising?
          4. Meditating?
          • It would be good if I could do statistical analysis on these.
          • Maybe I can use my toggl data and compare it to my tracking data?
          • What metric?
          • How often I read webcomics or watch youtube?
          • I might try both intentional, and unintentional?
          • How much deep work I'm getting done?
        • point interventions
          • some ideas
          1. When I am feeling urgey, I should meditate?
          2. When I'm feeling urgey, I should sit quietly with a notebook (no screens), for 20 minutes, to get some metacognition about what I care about?
          3. When I'm feeling urgey, I should do focusing and try to uncover the unmet need?
          4. When I'm feeling urgey, I should do 90 seconds of intense cardio?
          • Those first two feel the most in the right vein: the thing that needs to happen is that I need to "calm down" my urgent grabbiness, and take a little space for my deeper goals to become visible.
          • I want to solicit more ideas from people.
          • I want to be able to test these.
          • The hard part about that is the transition function: how do I make the TAP work?
          • I should see if someone can help me debug this.
          • One thought that I have is to do a daily review every day, and to ask on the daily review if I missed any places where I was urgey: opportunities to try an intervention
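
The statistical comparison floated above (tracking data against something like toggl data) could start as a plain correlation check. This is only a minimal sketch: the record format, the field names `meditated` and `distraction_minutes`, and the helper names `pearson` and `correlate_with_metric` are all hypothetical, and nothing here reflects how toggl actually exports data.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def correlate_with_metric(
    days: List[Dict[str, float]],
    factors: List[str],
    metric: str,
) -> Dict[str, float]:
    """Correlate each candidate factor (e.g. 'meditated') with the chosen metric.

    Each entry in `days` is one day's record (hypothetical format), holding
    0/1 flags for interventions and a numeric value for the metric.
    """
    metric_values = [day[metric] for day in days]
    return {
        factor: pearson([day[factor] for day in days], metric_values)
        for factor in factors
    }
```

With a few weeks of daily records, `correlate_with_metric(days, ["meditated", "exercised"], "distraction_minutes")` would give one number per factor. A negative value only says that the factor and the distraction metric moved in opposite directions over those days, not that one caused the other.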
comment by Eli Tyre (elityre) · 2019-11-08T04:06:48.231Z · LW(p) · GW(p)

New post: Some musings about exercise and time discount rates

[Epistemic status: a half-thought, which I started on earlier today, and which might or might not be a full thought by the time I finish writing this post.]

I’ve long counted exercise as an important component of my overall productivity and functionality. But over the past months my exercise habit has slipped some, without apparent detriment to my focus or productivity. But this week, after coming back from a workshop, my focus and productivity haven’t really booted up.

Here’s a possible story:

Exercise (and maybe meditation) expands the effective time-horizon of my motivation system. By default, I will fall towards attractors of immediate gratification and impulsive action, but after I exercise, I tend to be tracking, and to be motivated by, progress on my longer term goals. [1]

When I am already in the midst of work: my goals are loaded up and the goal threads are primed in short term memory, this sort of short term compulsiveness causes me to fall towards task completion: I feel slightly obsessed about finishing what I’m working on.

But if I’m not already in the stream of work, seeking immediate gratification instead drives me to youtube and web comics and whatever. (Although it is important to note that I did switch my non self tracking web usage to Firefox this week, and I don’t have my usual blockers for youtube and for SMBC set up yet. That might totally account for the effect that I’m describing here.)

In short, when I’m not exercising enough, I have less meta cognitive space for directing my attention and choosing what is best to do. But if I’m in the stream of work already, I need that meta cognitive space less: because I’ll default to doing more of what I’m working on. (Though, I think that I do end up getting obsessed with overall less important things, compared to when I am maintaining metacognitive space). Exercise is most important for booting up and setting myself up to direct my energies.

[1] This might be due to a number of mechanisms:

  • Maybe the physical endorphin effect of exercise has me feeling good, and so my desire for immediate pleasure is sated, freeing up resources for longer term goals.
  • Or maybe exercise involves engaging in immediate discomfort for the sake of future payoff, and this shifts my “time horizon set point” or something. (Or maybe it’s that exercise is downstream of that change in set point.)
    • If meditation also has this time-horizon shifting effect, that would be evidence for this hypothesis.
    • Also if fasting has this effect.
  • Or maybe, it’s the combination of both of the above: engaging in delayed gratification, with a viscerally experienced payoff, temporarily retrains my motivation system for that kind of thing.
  • Or something else.
Replies from: Viliam
comment by Viliam · 2019-11-08T20:46:20.365Z · LW(p) · GW(p)

Alternative hypothesis: maybe what expands your time horizon is not exercise and meditation per se, but the fact that you are doing several different things (work, meditation, exercise), instead of doing the same thing over and over again (work). It probably also helps that the different activities use different muscles, so that they feel completely different.

This hypothesis predicts that a combination of e.g. work, walking, and painting, could provide similar benefits compared to work only.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-11-08T23:44:58.870Z · LW(p) · GW(p)

Well, my working is often pretty varied, while my "being distracted" is pretty monotonous (watching youtube clips), so I don't think it is this one.

comment by Eli Tyre (elityre) · 2019-09-07T06:34:39.335Z · LW(p) · GW(p)

New post: Capability testing as a pseudo fire alarm

[epistemic status: a thought I had]

It seems like it would be useful to have very fine-grained measures of how smart / capable a general reasoner is, because this would allow an AGI project to carefully avoid creating a system smart enough to pose an existential risk.

I’m imagining slowly feeding a system more training data (or, alternatively, iteratively training a system with slightly more compute), and regularly checking its capability. When the system reaches “chimpanzee level” (whatever that means), you stop training it (or giving it more compute resources).

This might even be a kind of fire-alarm. If you have a known predetermined battery of tests, then when some lab develops a system that scores “at the chimp level” at that battery, that might be a signal to everyone, that it’s time to pool our resources and figure out safety. (Of course, this event might alternatively precipitate a race, as everyone tries to get to human-level first.)

Probably the best way to do this would be for both training data, and compute / architecture. Start with a given architecture, then train it, slowly increasing the amount or quality of the training data, with regular tests (done on “spurs”, the agent should never have episodic memory of the . When increasing training data plateaus, iteratively improve the architecture in some way, either by giving the system more compute resources, or maybe making small adjustments. Again train the new version of the system, with regular tests. If you ever start to get very steep improvement, slow down and run tests more frequently.

Naively, it seems like a setup like this would prevent an AI team from overshooting and making a system that is much more capable than they think (which gives rise to all kinds of problems, like treacherous turns), regardless of how close “chimp” is to “human” on some absolute intelligence scale.
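
The train-then-test loop described above can be sketched roughly as follows. This is a toy illustration under heavy assumptions, not a real training setup: `gated_training`, `train_step`, `run_test_battery`, `ToySystem`, and all the numeric defaults are hypothetical names and values, and the sketch models neither the "spurs" idea nor the architecture-improvement phase.

```python
from typing import Callable, List, Tuple


def gated_training(
    train_step: Callable[[], None],
    run_test_battery: Callable[[], float],
    threshold: float,
    max_steps: int,
    base_interval: int = 100,
    steep_gain: float = 60.0,
) -> Tuple[int, List[Tuple[int, float]]]:
    """Train in small chunks, test after each chunk, stop at the threshold.

    If the score jumps by more than `steep_gain` between two tests, the
    testing interval is halved, so later tests happen more frequently.
    Returns the number of steps taken and the (step, score) test history.
    """
    score = run_test_battery()
    history = [(0, score)]
    interval = base_interval
    step = 0
    while step < max_steps and score < threshold:
        chunk = min(interval, max_steps - step)
        for _ in range(chunk):
            train_step()
        step += chunk
        new_score = run_test_battery()
        history.append((step, new_score))
        if new_score - score > steep_gain:
            # Steep improvement: slow down by testing twice as often.
            interval = max(1, interval // 2)
        score = new_score
    return step, history


class ToySystem:
    """Stand-in 'system' whose capability grows by one unit per training step."""

    def __init__(self) -> None:
        self.capability = 0.0

    def train_step(self) -> None:
        self.capability += 1.0

    def test(self) -> float:
        return self.capability
```

Running `gated_training(toy.train_step, toy.test, threshold=500.0, max_steps=10_000)` on a fresh `ToySystem` halts once the test score reaches the threshold. The hard parts discussed in this post, whether the battery measures the right thing and whether the system is underperforming on purpose, sit entirely inside `run_test_battery` and are not addressed by the loop itself.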

Some considerations:

  • This of course, depends on having metrics that work. Of course, it seems very likely that early general AIs will be minds with a very unusual “shape”, such that it has sophisticated reasoning abilities along dimensions that are not natural categories to humans, while being weak on many of the dimensions we’re testing for.
    • Although, this seems a bit surprising, since presumably we would be training it to hit those metrics? Somehow it would get better, faster, at a kind of reasoning we were not intending to train, than at the kind of reasoning that we were optimizing for?
    • Also, can we just design metrics that get around this problem, by measuring exactly what we care about, i.e. danger? Something like a metric for consequentialist reasoning?
  • This idea depends on the system not deceiving you, or intentionally underperforming on your metrics.
    • Naively, this doesn’t seem like that much of an issue, if you are doing tests regularly enough. I’m presuming (perhaps wrongly), that a chimpanzee is not smart enough to infer that it is an agent in a simulation that is overseen by human agents, and that it should try to deceive those humans. That is, we stop our progression before the agent is at the point where deception is a consideration.
      • Which means we really need to be confident about how smart something needs to be before it tries to deceive us?
        • Really? Its first attempts at tricking humans will be pretty transparent. Just as the attempts of animals / babies to trick us are pretty transparent.
  • At least one “danger point”, is when the system is capable enough to realize the instrumental value of self improving by seizing more resources.
    • How smart is this?
      • My guess, is really smart. Animals come pre-loaded with all kinds of instincts that cause them to seek out food, water, etc. These AI systems would not have an instinct to seek more training data / computation. Most humans don’t reason their way into finding ways to improve their own reasoning. If there was a chimp, even loose in the internet (whatever that means), would it figure out to make itself smarter?
      • If the agent has experienced (and has memories of) rounds of getting smarter, as the humans give it more resources, and can identify that these improvements allow it to get more of what it wants, it might instrumentally reason that it should figure out how to get more compute / training data. But it seems easy to have a setup such that no system has episodic memories of previous improvement rounds.
        • [Note: This makes a lot less sense for an agent of the active inference paradigm]
          • Could I salvage it somehow? Maybe by making some kind of principled distinction between learning in the sense of “getting better at reasoning” (procedural), and learning in the sense of “acquiring information about the environment” (episodic).
Replies from: jimrandomh
comment by jimrandomh · 2019-09-10T01:34:00.749Z · LW(p) · GW(p)

In There’s No Fire Alarm for Artificial General Intelligence Eliezer argues:

A fire alarm creates common knowledge, in the you-know-I-know sense, that there is a fire; after which it is socially safe to react. When the fire alarm goes off, you know that everyone else knows there is a fire, you know you won’t lose face if you proceed to exit the building.

If I have a predetermined set of tests, this could serve as a fire alarm, but only if you've successfully built a consensus that it is one. This is hard, and the consensus would need to be quite strong. To avoid ambiguity, the test itself would need to be demonstrably resistant to being clever Hans'ed. Otherwise it would be just another milestone.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-10T14:40:37.110Z · LW(p) · GW(p)

I very much agree.

comment by Eli Tyre (elityre) · 2021-03-18T03:42:42.871Z · LW(p) · GW(p)

Question: Have Moral Mazes been getting worse over time? 

Could the growth of Moral Mazes be the cause of cost disease? 

I was thinking about how I could answer this question. I think that the thing that I need is a good quantitative measure of how "mazy" an organization is. 

I considered the metric of "how much output for each input", but 1) that metric is just cost disease itself, so it doesn't help us distinguish the mazy cause from other possible causes, 2) If you're good enough at rent seeking maybe you can get high revenue despite your poor production.

What metric could we use?

Replies from: Raemon
comment by Raemon · 2021-03-18T04:24:14.832Z · LW(p) · GW(p)

This is still a bit superficial/goodharty, but I think "number of layers of hierarchy" is at least one thing to look at. (Maybe find pairs of companies that output comparable products that you're somehow able to measure the inputs and outputs of, and see if layers of management correlate with cost disease)

comment by Eli Tyre (elityre) · 2020-12-30T07:13:15.245Z · LW(p) · GW(p)

This is my current take about where we're at in the world:

Deep learning, scaled up, might be basically enough to get AGI. There might be some additional conceptual work necessary, but the main difference between 2020 and the year in which we have transformative AI is that in that year, the models are much bigger.

If this is the case, then the most urgent problem is strong AI alignment + wise deployment of strong AI.

We'll know if this is the case in the next 10 years or so, because either we'll continue to see incredible gains from increasingly bigger Deep Learning systems or we'll see those gains level off, as we start seeing decreasing marginal returns to more compute / training.

If deep learning is basically not sufficient, then all bets are off. In that case, it isn't clear when transformative AI will arrive.

This may meaningfully shift priorities, for two reasons:

It may mean that some other countdown will reach a critical point before the "AGI clock" does. Genetic engineering, or synthetic biology, or major geopolitical upheaval (like a nuclear war), or some strong form of civilizational collapse will upset the game-board before we get to AGI.

There is more time to pursue "foundational strategies" that only pay off in the medium term (30 to 100 years). Things like, improving the epistemic mechanism design of human institutions, including governmental reform, human genetic engineering projects, or plans to radically detraumatize large fractions of the population.

This suggests to me that I should, in this decade, be planning and steering for how to robustly-positively intervene on the AI safety problem, while tracking the sideline of broader Civilizational Sanity interventions, that might take longer to pay off. While planning to reassess every few years, to see if it looks like we're getting diminishing marginal returns to Deep Learning yet.

Replies from: niplav
comment by niplav · 2020-12-30T14:40:10.885Z · LW(p) · GW(p)

(This question is only related to a small point)

You write that one possible foundational strategy could be to "radically detraumatize large fractions of the population". Do you believe that

  1. A large part of the population is traumatized
  2. That trauma is reversible
  3. Removing/reversing that trauma would improve the development of humanity drastically?

If yes, why? I'm happy to get a 1k page PDF thrown at me.

I know that this has been a relatively popular talking point on twitter, but without a canonical resource, and I also haven't seen it discussed on LW.

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-12-31T21:51:15.222Z · LW(p) · GW(p)

I was wondering if I would get a comment on that part in particular. ; )

I don't have a strong belief about your points one through three, currently. But it is an important hypothesis in my hypothesis space, and I'm hoping that I can get to the bottom of it in the next year or two.

I do confidently think that one of the "forces for badness" in the world is that people regularly feel triggered or threatened by all kinds of different proposals, and reflexively act to defend themselves. I think this is among the top three problems in having good discourse and cooperative politics. Systematically reducing that trigger response would be super high value, if it were feasible.

My best guess is that that propensity to be triggered is not mostly the result of infant or childhood trauma. It seems more parsimonious to posit that it is basic tribal stuff. But I could imagine it having its root in something like "trauma" (meaning it is the result of specific experiences, not just general dispositions, and it is practically feasible, if difficult, to clear or heal the underlying problem in a way that completely prevents the symptoms).

I think there is no canonical resource on trauma-stuff because 1) the people on twitter are, on average, less interested in that kind of theory building than we are on LessWrong and 2) because mostly those people are (I think) extrapolating from their own experience, in which some practices unlocked subjectively huge breakthroughs in personal well-being / freedom of thought and action.

Does that help at all?

Replies from: Hazard, niplav
comment by Hazard · 2021-01-04T02:29:10.914Z · LW(p) · GW(p)

I plan to blog more about how I understand some of these trigger states and how they relate to trauma. I do think there's a decent amount of written work, not sure how "canonical", but I've read some great stuff from sources I'm surprised I haven't heard more hype about. The most useful stuff I've read so far is the first three chapters of this book. It has hugely sharpened my thinking.

I agree that a lot of trauma discourse on our chunk of twitter is more focused on the personal experience/transformation side, and doesn't lend itself well to bigger Theory of Change type scheming.


Replies from: elityre
comment by Eli Tyre (elityre) · 2021-01-04T08:24:23.381Z · LW(p) · GW(p)

Thanks for the link! I'm going to take a look!

comment by niplav · 2021-01-03T22:02:54.514Z · LW(p) · GW(p)

Yes, it definitely does–you just created the resource I will link people to. Thank you!

Especially the third paragraph is cruxy. As far as I can tell, there are many people who have (to some extent) defused this propensity to get triggered for themselves. At least for me, LW was a resource to achieve that.

comment by Eli Tyre (elityre) · 2020-07-21T08:21:53.387Z · LW(p) · GW(p)

I was thinking lately about how there are some different classes of models of psychological change, and I thought I would outline them and see where that leads me. 

It turns out it led me into a question about where and when Parts-based vs. Association-based models are applicable.

Google Doc version.

Parts-based / agent-based models 

Some examples: 

  • Focusing
  • IFS
  • IDC
  • Connection Theory
  • The NLP ecological check

This is the frame that I make the most use of, in my personal practice. It assumes that all behavior is the result of some goal directed subprocess in you (or parts), that is serving one of your needs. Sometimes parts adopt strategies that are globally harmful or cause problems, but those strategies are always solving or mitigating (if only barely) some problem of yours. 

Some parts based approaches are pretty adamant about the goal directed-ness of all behavior.

For instance, I think (though I’m not interested in trying to find the quote right now) Self-Therapy, a book on IFS, states that all behavior is adaptive in this way. Nothing is due to habit. And the original Connection Theory document says the same.

Sometimes these parts can conflict with each other, or get in each other’s way, and you might engage in behavior that is far from optimal, as different parts enact different behaviors (for instance, procrastination typically involves a part that is concerned about some impending state of the world, while another part of you, anticipating the psychological pain of consciously facing up to that bad possibility,

Furthermore, these parts are reasonably intelligent, and can update. If you can provide them a solution to the problem that they are solving that is superior (by the standards of the part) to its current strategy, then the part will immediately adopt that new strategy instead. This is markedly different from a model under which unwanted behaviors are “bad habits” that are mindfully retrained.

Association-based models 


  • TAPs 
  • NLP anchoring
  • Lots of CBT and Mindfulness based therapy (eg “notice 
  • Reinforcement learning / behavioral shaping
  • Tony Robbins’ “forming new neuro associations”

In contrast there is another simple model of the mind, that mostly operates with an ontology of simple (learned) association, instead of intelligent strategies. That is, it thinks of your behavior, including your emotional responses, mostly as habits, or stimulus response patterns, that can be trained or untrained. 

For instance, say you have a problem of road rage. In the “parts” frame, you might deal with anger by dialoguing with the anger, finding out what the anger is protecting, owning or allying with that goal, and then finding an alternative strategy that meets that goal without the anger. In the association frame, you might gradually retrain the anger response, by mindfully noticing as it arises, and then letting it go. Over time, you’ll gradually train a different emotional reaction to the formerly rage-inducing stimulus.

Or, if you don’t want to wait that long, you might use some NLP trick to rapidly associate a new emotional pattern to a given stimulus, so that instead of feeling anger, you feel calm. (Or instead of feeling anxious jealousy, you feel loving abundant gratitude.)

This association process can sometimes be pretty dumb, such that a skilled manipulator might cause you to associate a mental state like guilt or gratitude with a tap on the shoulder, so that every time you are tapped on the shoulder you return to that mental state. That phenomenon does not seem consistent with a naive form of the parts-based model.

And notably, an association model predicts that merely offering an alternative strategy (or frame) to a part doesn’t immediately or permanently change the behavior: you expect to have some “hold over” from the previous strategy because those associations will still fire. You have to clear them out somehow.

And this is my experience some of the time: sometimes, particularly with situations that have had a lot of emotional weight for me, I will immediately fall into old emotional patterns, even when I (or at least part of me) have updated away from the beliefs that made that reaction relevant. For instance, I fall in love with a person because I have some story / CT path about how we are uniquely compatible, I gradually learn that this isn’t true, but I still have a strong emotional reaction when they walk into the room. What’s going on here? Some part of me isn’t updating, for some reason? It sure seems like some stimuli are activating old patterns even if those patterns aren’t adaptive and don’t even make sense in context. But this seems to suggest less intelligence on the part of my parts; it seems more like stimulus response machinery.

And on the other side, what’s happening when Tony Robbins is splashing water in people’s faces to shake them out of their patterns? From a parts-based perspective, that doesn’t make any sense. Is the sub agent in question being permanently disrupted? (Or maybe you only have to disrupt it for a bit, to give space for a new association / strategy to take hold? And then after that the new strategy outcompetes the old one?) 

[Big Question: how does the parts-based model interact with the associations-based model?

Is it just that human minds do both? What governs when which phenomenon applies?

When should I use which kind of technique?]

Narrative-based / frame-based models


  • Transforming Yourself Self concept work
  • NLP reframing effects
  • Some other CBT stuff
  • Byron Katie’s The Work
  • Anything that involves reontologizing

A third category of psychological interventions is those that are based around narrative: you find, and “put on” a new way of interpreting, or making sense of, your experience, such that it has a different meaning that provides you different affordances. Generally you find a new narrative that is more useful for you.

The classic example is a simple reframe, where you feel frustrated that people keep mooching off of you, but you reframe this so that you instead feel magnanimous, emphasizing your generosity, and how great it is to have an opportunity to give back to people. Same circumstances, different story about them.

This class of interventions feels like it can slide easily into either the parts based frame or the association based frame. In the parts based frame, a narrative can be thought of as just another strategy that a part might adopt so long as that is the best way that the part can solve its problem (and so long as other parts don’t conflict). 

But I think this fits even more naturally into the association frame, where you find a new way to conceptualize your situation and you do some work to reassociate that new conceptualization with the stimulus that previously activated your old narrative (this is exactly what Phil of Philosophical Counseling’s process does: you find a new narrative / belief structure and set up a regime under which you notice when the old one arises, let it go, and feel into the new one.)

[Other classes of intervention that I am distinctly missing?]

Replies from: Raemon, Viliam, ChristianKl
comment by Raemon · 2020-07-21T11:09:40.794Z · LW(p) · GW(p)

I like this a lot, and think it’d make a good top level post. 

Replies from: elityre
comment by Eli Tyre (elityre) · 2020-07-22T00:53:46.864Z · LW(p) · GW(p)

Really? I would prefer to have something much more developed and/or to have solved my key puzzle here before I put it up as a top level post.

Replies from: Raemon
comment by Raemon · 2020-07-22T01:24:29.492Z · LW(p) · GW(p)

I saw the post more as giving me a framework that was helpful for sorting various psych models, and the fact that you had one question about it didn't actually feel too central for my own reading. (Separately, I think it's basically fine for posts to be framed as questions rather than definitive statements/arguments after you've finished your thinking)

comment by Viliam · 2020-07-21T15:57:50.645Z · LW(p) · GW(p)

I wonder how the ancient schools of psychotherapy would fit here. Psychoanalysis is parts-based. Behaviorism is association-based. Rational therapy seems narrative-based. What about Rogers or Maslow?

Seems to me that Rogers and the "think about it seriously for 5 minutes" technique should be in the same category. In both cases, the goal is to let the client actually think about the problem and find the solution for themselves. Not sure if this is or isn't an example of narrative-based, except the client is supposed to find the narrative themselves.

Maslow comes with a supposed universal model of human desires and lets you find yourself in that system. Jung kinda does the same, but with a mythological model. Sounds like an externally provided narrative. Dunno, maybe the narrative-based should be split into more subgroups, depending on where the narrative comes from (a universal model, an ad-hoc model provided by the therapist, an ad-hoc model constructed by the client)?

comment by ChristianKl · 2020-07-21T11:31:30.613Z · LW(p) · GW(p)

The way I have been taught NLP, you usually don't use either anchors or an ecological check but both. 

Behavior changes that are created by changing around anchors are not long-term stable when they violate ecology. 

Changing around associations allows you to create new strategies in a more detailed way than you get by just doing parts work and I have the impression that it's often faster in creating new strategies. 

[Other classes of intervention that I am distinctly missing?]

(A) Interventions that are about resolving traumas feel to me like a different model. 

(B) None of the three models you listed address the usefulness of connecting with the felt sense of emotions. 

(C) There's a model of change where you create a setting where people can have new behavioral experiences and then hopefully learn from those experiences and integrate what they learned in their lives. 

CFAR's goal of wanting to give people more agency about ways they think seems to work through C where CFAR wants to expose people to a bunch of experiences where people actually feel new ways to affect their thinking. 

In the Danis Bois method both A and C are central.

comment by Eli Tyre (elityre) · 2019-10-28T22:40:04.584Z · LW(p) · GW(p)

Can someone affiliated with a university, etc. get me a PDF of this paper?


It is on Scihub, but that version is missing a few pages in which they describe the methodology.

[I hope this isn't an abuse of LessWrong.]

Replies from: romeostevensit
comment by Eli Tyre (elityre) · 2019-09-12T08:44:03.142Z · LW(p) · GW(p)

New (image) post: My strategic picture of the work that needs to be done

Replies from: Raemon
comment by Raemon · 2019-09-13T22:20:32.404Z · LW(p) · GW(p)

I edited the image into the comment box, predicting that the reason you didn't was because you didn't know you could (using markdown). Apologies if you prefer it not to be here (and can edit it back if so)

Replies from: elityre, elityre
comment by Eli Tyre (elityre) · 2019-09-14T08:01:22.943Z · LW(p) · GW(p)

In this case it seems fine to add the image, but I feel disconcerted that mods have the ability to edit my posts.

I guess it makes sense that the LessWrong team would have the technical ability to do that. But editing a user's post, without their specifically asking, feels like a pretty big breach of... not exactly trust, but something like that. It means I don’t have fundamental control over what is written under my name.

That is to say, I personally request that you never edit my posts, without asking (which you did, in this case) and waiting for my response. Furthermore, I think that should be a universal policy on LessWrong, though maybe this is just an idiosyncratic neurosis of mine.

Replies from: Raemon, Wei_Dai
comment by Raemon · 2019-09-14T09:07:09.020Z · LW(p) · GW(p)

Understood, and apologies.

A fairly common mod practice has been to fix typos and stuff in a sort of "move first and then ask if it was okay" thing. (I'm not confident this is the best policy, but it saves time/friction, and meanwhile I don't think anyone had had an issue with it). But, your preference definitely makes sense and if others felt the same I'd reconsider the overall policy.

(It's also the case that adding an image is a bit of a larger change than the usual typo fixing, and may have been more of an overstep of bounds)

In any case I definitely won't edit your stuff again without express permission.

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-09-14T10:02:26.387Z · LW(p) · GW(p)


: )

comment by Wei_Dai · 2019-09-14T08:23:41.752Z · LW(p) · GW(p)

Furthermore, I think that should be a universal policy on LessWrong, though maybe this is just an idiosyncratic neurosis of mine.

If it's not just you, it's at least pretty rare. I've seen the mods "helpfully" edit posts several times (without asking first) and this is the first time I've seen anyone complain about it.

comment by Eli Tyre (elityre) · 2019-09-14T07:50:16.204Z · LW(p) · GW(p)

I knew that I could, and didn’t, because it didn’t seem worth it. (Thinking that I still have to upload it to a third party photo repository and link to it. It’s easier than that now?)

Replies from: Raemon
comment by Raemon · 2019-09-14T09:07:46.118Z · LW(p) · GW(p)

In this case your blog already counted as a third party repository.

comment by Eli Tyre (elityre) · 2019-08-07T18:41:06.925Z · LW(p) · GW(p)

New post: Napping Protocol

Replies from: Raemon
comment by Raemon · 2019-08-20T06:16:47.730Z · LW(p) · GW(p)

Some of these seem likely to generalize and some seem likely to be more specific.

Curious about your thoughts on "best experimental approaches to figuring out your own napping protocol."

comment by Eli Tyre (elityre) · 2020-12-09T07:56:31.057Z · LW(p) · GW(p)

Doing actual mini-RCTs can be pretty simple. You only need 3 things: 

1. A spreadsheet 

2. A digital coin for randomization 

3. A way to measure the variable that you care about
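A minimal sketch of that setup in Python, assuming a CSV file stands in for the spreadsheet. The condition labels, function names, and file name here are hypothetical, chosen only for illustration.

```python
import csv
import random
from datetime import date

# Hypothetical condition labels; substitute whatever you are testing.
CONDITIONS = ("treatment", "control")


def flip_coin(rng=random):
    """Digital coin: pick the condition for this trial uniformly at random."""
    return rng.choice(CONDITIONS)


def log_trial(path, day, condition, outcome):
    """Append one row (date, condition, measured outcome) to a CSV 'spreadsheet'."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([day.isoformat(), condition, outcome])
```

Each morning you would call `flip_coin()` to decide the day's condition, and once you have taken your measurement, record it with something like `log_trial("trials.csv", date.today(), condition, outcome)`. Flipping the coin before you know how the day will go is what keeps your own expectations from leaking into the assignment.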

I think one of the practically powerful "techniques" of rationality is doing simple empirical experiments like this. You want to get something? You don't know how to get it? Try out some ideas and check which ones work!

There are other applications of empiricism that are not as formal, and sometimes faster. Those are also awesome. But at the very least, I've found that doing mini-RCTs is pretty enlightening.

On the object level, you can learn what actually works for hitting your goals.

On the process level, this trains some good epistemic norms and priors.

For one thing, I now have a much stronger intuition for the likelihood that an impressive effect is just noise. And getting into the habit of doing quantified hypothesis testing, such that you can cleanly falsify your hypotheses, teaches you to hold hypotheses lightly while inclining you to generate hypotheses in the first place.
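One way to put a number on "is this effect just noise?" is a permutation test. The sketch below is an illustration under assumptions, not a method the comment itself specifies: it assumes you have two lists of outcome measurements (one per condition), and the function name and defaults are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(treatment, control, n_shuffles=10_000, seed=0):
    """Estimate how often a random relabeling of the data produces a
    difference in means at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(mean(treatment) - mean(control))
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffle the pooled outcomes, then split them into fake "groups".
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_t]) - mean(pooled[n_t:]))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles
```

A value near 1 means shuffled labels reproduce the observed gap almost every time, so the "effect" looks like noise. A small value means random relabelings rarely produce a gap that large. With only a handful of trials per condition, even a real effect will often fail to stand out, which is itself a useful calibration.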

Theorizing methods can enhance and accelerate this process, but if you have a quantified empirical feedback loop, your theorizing will be grounded. Science is hard, and most of our guesses are wrong. But that's fine, so long as we actually check.

comment by Eli Tyre (elityre) · 2021-02-25T05:34:46.429Z · LW(p) · GW(p)

Is there a LessWrong article that unifies physical determinism and choice / "free will"? Something about thinking of yourself as the algorithm computed on this brain?

Replies from: Measure
comment by Measure · 2021-02-25T06:36:50.362Z · LW(p) · GW(p)

Perhaps This one [? · GW]?

comment by Eli Tyre (elityre) · 2021-02-17T20:41:43.655Z · LW(p) · GW(p)

Is there any particular reason why I should assign more credibility to Moral Mazes / Robert Jackall than I would to the work of any other sociologist?

(My prior on sociologists is that they sometimes produce useful frameworks, but generally rely on subjective hard-to-verify and especially theory-laden methodology, and are very often straightforwardly ideologically motivated.)

I imagine that someone else could write a different book, based on the same kind of anthropological research, that highlights different features of the corporate world, to tell the opposite story.

And that's without anyone trying to be deceptive. There's just a fundamental problem of case studies that they don't tell you what's typical, only give you examples.

I can totally imagine that Jackall landed on this narrative somehow, found that it held together and just confirmation biased for the rest of his career. Once his basic thesis was well-known, and associated with his name, it seems hard for something like that NOT to happen.

And this leaves me unsure what to do with the data of Moral Mazes. Should I default assume that Jackall's characterization is a good description of the corporate world? Or should I throw this out as a useless set of examples confirmation biased together? Or something else?

It seems like the question of "is most of the world dominated by Moral Mazes?" is an extremely important one. But also, it seems to me that it's not operationalized enough to have a meaningful answer. At best, it seems like this is a thing that happens sometimes.

Replies from: Raemon, Dagon
comment by Raemon · 2021-02-17T21:54:53.549Z · LW(p) · GW(p)

My own take is that moral mazes should be considered in the "interesting hypothesis" stage, and that the next step is to actually figure out how to go be empirical about checking it.

I made some cursory attempts at this last year, and then found myself unsure this was even the right question. The core operationalization I wanted was something like:

  • Does having more layers of management introduce pathologies into an organization?
  • How much value is generated by organizations scaling up?
  • Can you reap the benefits of organizations scaling up by instead having them splinter off?

(The "middle management == disconnected from reality == bad" hypothesis was the most clear-cut of the moral maze model to me, although I don't think it was the only part of the model)

I have some disagreements with Zvi about this.

I chatted briefly with habryka about this and I think he said something like "it seems like a more useful question is to look for positive examples of orgs that work well, rather than try and tease out various negative ways orgs could fail to work."

I think there are maybe two overarching questions this is all relevant to:

  1. How should the rationality / xrisk / EA community handle scale? Should we be worried about introducing middle-management into ourselves?
  2. What's up with civilization? Is maziness a major bottleneck on humanity? Should we try to do anything about it? (My default answer here is "there's not much to be done here, simply because the world is full of hard problems and this one doesn't seem very tractable even if the models are straightforwardly true." But, I do think this is a contender for humanity's Hamming problem)
comment by Dagon · 2021-02-18T15:36:24.023Z · LW(p) · GW(p)

There are multiple dimensions to the credibility question.  You probably should increase your credence from prior to reading it/about it that large organizations very often have more severe misalignment than you thought.  You probably should recognize that the model of middle-management internal competition has some explanatory power.  

You probably should NOT go all the way to believing that the corporate world is homogeneously broken in exactly this way.  I don't think he makes that claim, but it's what a lot of readers seem to take.  There's plenty of variation, and the Anna Karenina principle applies (paraphrased): well-functioning organizations are alike; dysfunctional organizations are each broken in their own way.   But really, it's wrong too - each group is actually distinct, and has distinct sets of forces that have driven it to whatever pathologies or successes it has.  Even when there are elements that appear very similar, they have different causes and likely different solutions or coping mechanisms.

"is most of the world dominated by moral mazes"?  I don't think this is a useful framing.  Most groups have some elements of Moral Mazes.  Some groups appear dominated by those elements, in some ways.  From the outside, most groups are at least somewhat effective at their stated mission, so the level of domination is low enough that it hasn't killed them (though there are certainly "zombie orgs" which HAVE been killed, but don't know it yet).  

comment by Eli Tyre (elityre) · 2021-01-09T10:56:55.733Z · LW(p) · GW(p)

My understanding is that there was a 10 year period starting around 1868, in which South Carolina's legislature was mostly black, and when the universities were integrated (causing most white students to leave), before the Dixiecrats regained power.

I would like to find a relatively non-partisan account of this period.

Anyone have suggestions?

Replies from: _mp_
comment by _mp_ · 2021-01-09T22:51:56.662Z · LW(p) · GW(p)

I would just read W. E. B. Du Bois - Black Reconstruction in America (1935)

comment by Eli Tyre (elityre) · 2021-03-30T05:13:34.130Z · LW(p) · GW(p)

I recall a Chris Olah post in which he talks about using AIs as a tool for understanding the world, by letting the AI learn, and then using interpretability tools to study the abstractions that the AI uncovers. 

I thought he specifically mentioned "using AI as a microscope."

Is that a real post, or am I misremembering this one? 

Replies from: Unnamed
comment by Eli Tyre (elityre) · 2021-03-26T09:36:33.757Z · LW(p) · GW(p)

Are there any hidden risks to buying or owning a car that someone who's never been a car owner might neglect?

I'm considering buying a very old (ie from the 1990s), very cheap (under $1000, ideally) minivan, as an experiment.

That's inexpensive enough that I'm not that worried about it completely breaking down on me. I'm willing to just eat the monetary cost for the information value.

However, maybe there are other costs or other risks that I'm not tracking, that make this a worse idea.

Things like

- Some ways that a car can break make it dangerous, instead of non-functional.

- Maybe if a car breaks down in the middle of route 66, the government fines you a bunch?

- Something something car insurance?

Are there other things that I should know? What are the major things that one should check for to avoid buying a lemon?

Assume I'm not aware of even the most drop-dead basic stuff. I'm probably not.

(Also, I'm in the market for a minivan, or other car with 3 rows of seats. If you have an old car like that which you would like to sell, or if know someone who does, get in touch.

Do note that I am extremely price sensitive, but I would pay somewhat more than $1000 for a car, if I were confident that it was not a lemon.)

Replies from: gerald-monroe, Dagon
comment by Gerald Monroe (gerald-monroe) · 2021-03-28T05:46:09.785Z · LW(p) · GW(p)

There are.  https://www.iihs.org/ratings/driver-death-rates-by-make-and-model

You can explore the data yourself, but the general trend is that it appears there have been real improvements in crash fatality rates.  Better designed structure, more and better airbags, stability control, and now in some new vehicles automatic emergency braking is standard.

Generally a bigger vehicle like a minivan is safer, and a newer version of that minivan will be safer, but you just have to go with what you can afford.

Main risk is simply that at this price point that minivan is going to have a lot of miles, and it's simply probability how long it will run until a very expensive major repair is needed.  One strategy is to plan to junk the vehicle and get a similar 'beater' vehicle when the present one fails.

If you're so price sensitive $1000 is meaningful, well, uh try to find a solution to this crisis.  I'm not saying one exists, but there are survival risks to poverty.

Replies from: elityre
comment by Eli Tyre (elityre) · 2021-03-30T05:50:28.323Z · LW(p) · GW(p)

If you're so price sensitive $1000 is meaningful, well, uh try to find a solution to this crisis.  I'm not saying one exists, but there are survival risks to poverty.

Lol. I'm not impoverished, but I want to cheaply experiment with having a car. It isn't worth it to throw away $30,000 on a thing that I'm not going to get much value from.

Replies from: gerald-monroe, Raemon
comment by Gerald Monroe (gerald-monroe) · 2021-03-30T20:32:16.139Z · LW(p) · GW(p)

Ok but at the price point you are talking about, you are not going to have a good time.

Analogy: would you "experiment with having a computer" by grabbing a Packard Bell from the 1990s and putting an ethernet card in it so it can connect to the internet from Windows 95?

Do you need the minivan form factor? A vehicle in decent condition (6-10 years old, under 100k miles, from a reputable brand) is cheapest in the small car form factor.

comment by Raemon · 2021-03-30T12:18:04.920Z · LW(p) · GW(p)

Not spending $30,000 makes sense, but my impression from car shopping last year was that trying to get a good car for less than $7k was fairly hard. (I get the ‘willingness to eat the cost’ price point of $1k, but wanted to highlight that the next price point up was more like 10k than 30k.)

Depending on your experimentation goals, you might want to rent a car rather than buy.

comment by Dagon · 2021-04-01T15:55:47.718Z · LW(p) · GW(p)

Most auto shops will do a safety/mechanical inspection for a small amount (usually in the $50-200 range, but be aware that the cheaper ones subsidize it by anticipating that they can sell you services to fix the car if you buy it).   

However, as others have said, this price point is too low for your first car as a novice, unless you have a mentor and intend to spend a lot of time learning to maintain/fix.  Something reliable enough for you to actually run the experiment and get the information you want about the benefits vs frustrations of owning a car is going to run probably $5-$10K, depending on regional variance and specifics of your needs.  

For a first car, look into getting a warranty, not because it's a good insurance bet, but because it forces the seller to make claims of warrantability to their insurance company.

You can probably cut the cost in half (or more) if you educate yourself and get to know the local car community.  If the car is a hobby rather than an experiment in transportation convenience, you can take a lot more risk, AND those risks are mitigated if you know how to get things fixed cheaply.

comment by Eli Tyre (elityre) · 2021-02-28T20:38:59.875Z · LW(p) · GW(p)

Is there a standard article on what "the critical risk period" is?

I thought I remembered an arbital post, but I can't seem to find it.

comment by Eli Tyre (elityre) · 2021-01-15T05:52:07.714Z · LW(p) · GW(p)

I remember reading a Zvi Mowshowitz post in which he says something like "if you have concluded that the most ethical thing to do is to destroy the world, you've made a mistake in your reasoning somewhere." 

I spent some time searching around his blog for that post, but couldn't find it. Does anyone know what I'm talking about? 

Replies from: Pattern, Raemon
comment by Pattern · 2021-01-15T21:36:51.011Z · LW(p) · GW(p)

It sounds like a tagline for a blog.

comment by Raemon · 2021-01-15T08:00:34.520Z · LW(p) · GW(p)

Probably this one?


Replies from: elityre
comment by Eli Tyre (elityre) · 2021-01-16T01:11:46.965Z · LW(p) · GW(p)


I thought that it was in the context of talking about EA, but maybe this is what I am remembering? 

It seems unlikely though, since I wouldn't have read the spoiler-part.

comment by Eli Tyre (elityre) · 2021-01-07T08:09:31.530Z · LW(p) · GW(p)

Anyone have a link to the sequence post where someone posits that AIs would do art and science from a drive to compress information, and the response is that such an AI would instead create and then reveal cryptographic strings (or something)?

Replies from: niplav
comment by niplav · 2021-01-07T09:12:36.404Z · LW(p) · GW(p)

I think you are thinking of “AI Alignment: Why It’s Hard, and Where to Start”:

The next problem is unforeseen instantiation: you can’t think fast enough to search the whole space of possibilities. At an early singularity summit, Jürgen Schmidhuber, who did some of the pioneering work on self-modifying agents that preserve their own utility functions with his Gödel machine, also solved the friendly AI problem. Yes, he came up with the one true utility function that is all you need to program into AGIs!

(For God’s sake, don’t try doing this yourselves. Everyone does it. They all come up with different utility functions. It’s always horrible.)

His one true utility function was “increasing the compression of environmental data.” Because science increases the compression of environmental data: if you understand science better, you can better compress what you see in the environment. Art, according to him, also involves compressing the environment better. I went up in Q&A and said, “Yes, science does let you compress the environment better, but you know what really maxes out your utility function? Building something that encrypts streams of 1s and 0s using a cryptographic key, and then reveals the cryptographic key to you.”

He put up a utility function; that was the maximum. All of a sudden, the cryptographic key is revealed and what you thought was a long stream of random-looking 1s and 0s has been compressed down to a single stream of 1s.

There's also a mention of that method in this post [LW · GW].

comment by Eli Tyre (elityre) · 2020-12-09T08:00:18.434Z · LW(p) · GW(p)

I remember reading a Zvi Mowshowitz post in which he says something like "if you have concluded that the most ethical thing to do is to destroy the world, you've made a mistake in your reasoning somewhere." 

I spent some time searching around his blog for that post, but couldn't find it. Does anyone know what I'm talking about?

Replies from: Raemon
comment by Raemon · 2020-12-09T08:23:03.218Z · LW(p) · GW(p)

Review of three body problem is my first guess

comment by Eli Tyre (elityre) · 2020-07-23T04:38:37.247Z · LW(p) · GW(p)

A hierarchy of behavioral change methods

Follow up to, and a continuation of the line of thinking from: Some classes of models of psychology and psychological change

Related to: The universe of possible interventions on human behavior (from 2017)

This post outlines a hierarchy of behavioral change methods. Each of these approaches is intended to be simpler, more light-weight, and faster to use (is that right?), than the one that comes after it. On the flip side, each of these approaches is intended to resolve a common major blocker of the approach before it.

I do not necessarily endorse this breakdown or this ordering. This represents me thinking out loud.

[Google Doc version]

[Note that all of these are more-or-less top down, and focused on the individual instead of the environment]

Level 1: TAPs

If there’s some behavior that you want to make habitual, the simplest thing is to set, and then train, a TAP. Identify a trigger and the action with which you want to respond to that trigger, and then practice it a few times.

This is simple, direct, and can work for actions as varied as “use NVC” and “correct my posture” and “take a moment to consider the correct spelling.”

This works particularly well for “remembering problems”, in which you can and would do the action, if only it occurred to you at the right moment.

Level 2: Modifying affect / meaning

Sometimes however, you’ll have set a TAP to do something, you’ll notice the trigger, and...you just don’t feel like doing the action. 

Maybe you’ve decided that you’re going to take the stairs instead of the elevator, but you look at the stairs and then take the elevator anyway. Or maybe you want to stop watching youtube, and have a TAP to open your todo list instead, but you notice...and then just keep watching youtube. 

The most natural thing to do here is to adjust your associations / affect around the behavior that you want to engage in or the behavior that you want to stop. You not only want the TAP to fire, reminding you of the action, but you want the feeling of the action to pull you toward it, emotionally. Or, to put it another way, you change the meaning that you assign to the behavior.

Some techniques here include: 

  • Selectively emphasizing different elements of an experience (like the doritos example in Nate’s post here), and other kinds of reframes
  • Tony Robbins’ process for working with “neuro associations” of asking 1) what pain has kept me from taking this action in the past?, 2) what pleasure have I gotten from not taking this action in the past?, 3) what will it cost me if I don’t take this action?, 4) what pleasure will it bring me if I take this action?
  • This here [LW(p) · GW(p)] goal chaining technique
  • Some more heavy-duty NLP tools
  • Behaviorist conditioning (I’m wary of this one, since it seems pretty symmetric.)

Level 3: Dialogue 

The above approach only has a limited range of application, in that it can only work in situations where there are degrees of freedom in one’s affect to a stimulus or situation. In many cases, you might go in and try to change the affect around something from the top-down, and some part of you will object, or you will temporarily change the affect, but it will get “kicked out” later.

This is because your affects are typically not arbitrary. Rather they are the result of epistemic processes that are modeling the world and the impact of circumstances on your goals.

When this is the case, you’ll need to do some form of dialogue, which either updates a model of some objecting part, or modifies the recommended strategy / affect to accommodate the objection, or finds some other third option.

This can take the form of:

  • Focusing
  • IDC
  • IFS
  • CT debugging

The most extreme instance of “some part has an objection” is when there is some relevant trauma somewhere in the system. Sort of by definition, this means that you’ll have an extreme objection to some possible behavior or affect changes, because that part of the state space is marked as critically bad.

Junk Drawer 

As I noted, this schema describes top-down behavior change. It does not include cases where there is a problem, but you don’t have much of a sense of what the problem is and/or how to approach it. For those kinds of bugs you might instead start with Focusing, or with a noticing regime.

For related reasons, this is super not relevant to blindspots.

I’m also neglecting environmental interventions, both those that simply redirect your attention (like a TAP), and those that shift the affect around an activity (like using social pressure to get yourself to do stuff, via coworking for instance). I can’t think of an environmental version of level 3.

comment by Eli Tyre (elityre) · 2020-04-21T00:22:12.200Z · LW(p) · GW(p)

Can anyone get a copy of this paper for me? I'm looking to get clarity about how important cryopreserving non-brain tissue is for preserving personality.

comment by Eli Tyre (elityre) · 2019-07-17T17:05:48.547Z · LW(p) · GW(p)

New post: my personal wellbeing support pillars

Replies from: Raemon
comment by Raemon · 2019-07-17T18:31:31.029Z · LW(p) · GW(p)

I'm interested in knowing your napping tools

Replies from: elityre
comment by Eli Tyre (elityre) · 2019-08-07T19:19:27.478Z · LW(p) · GW(p)

Here you go.

New post: Napping Protocol

Replies from: Raemon
comment by Raemon · 2019-08-07T19:21:27.460Z · LW(p) · GW(p)