Contra double crux

post by Thrasymachus · 2017-10-08T16:29:35.193Z · LW · GW · 67 comments

Contents

  What is a crux?
  How common are cruxes (and double cruxes)?
  Auxiliary challenges to double crux
    Crux-asymmetry
    'Changing one's mind' around p=0.5 isn't (that) important
  Intermezzo
  Good philosophers already disagree better than double cruxing
  Coda: Wherefore double crux?

Summary: CFAR proposes double crux as a method to resolve disagreement: instead of arguing over some belief B, one should look for a crux (C) which underlies it, such that if either party changed their mind over C, they would change their mind about B.

I don't think double crux is that helpful, principally because 'double cruxes' are rare in topics where reasonable people differ (and they can be asymmetric, concern a consideration's strength rather than its direction, and so on). I suggest this may diagnose the difficulty others have noted in getting double crux to 'work'. Good philosophers seem to do much better than double cruxing using different approaches.

I aver the strengths of double crux are primarily other epistemic virtues, pre-requisite for double crux, which are conflated with double cruxing itself (e.g. it is good to have a collaborative rather than combative mindset when disagreeing). Conditional on having this pre-requisite set of epistemic virtues, double cruxing does not add further benefit, and is probably inferior to other means of discussion exemplified by good philosophers. I recommend we look elsewhere.

What is a crux?

From Sabien's exposition, a crux for some belief B is another belief C such that, if one changed one's mind about C, one would change one's mind about B. The original example was the impact of school uniforms in concealing unhelpful class distinctions, offered as a crux for whether one supports or opposes school uniforms.

A double crux is a particular case where two people disagree over B and have the same crux, albeit going in opposite directions. Say if Xenia believes B (because she believes C) and Yevgeny disbelieves B (because he does not believe C), then if Xenia stopped believing C, she would stop believing B (and thus agree with Yevgeny) and vice-versa.

How common are cruxes (and double cruxes)?

I suggest the main problem facing the 'double crux technique' is that disagreements like Xenia's and Yevgeny's, which can be eventually traced to a single underlying consideration, are the exception rather than the rule. Across most reasonable people on most recondite topics, 'cruxes' are rare, and 'double cruxes' (roughly) exponentially rarer.

For many recondite topics I think about, my credence in it arises from the balance of a variety of considerations pointing in either direction. Thus whether or not I believe 'MIRI is doing good work', 'God exists', or 'The top marginal tax rate in the UK should be higher than its current value' does not rely on a single consideration or argument, but rather its support is distributed over a plethora of issues. Although in some cases undercutting what I take as the most important consideration would push my degree of belief over or under 0.5, in other cases it would not.

Thus if I meet someone else who disagrees with me on (say) whether God exists, it would be remarkable if our disagreement hinges on (for example) the evidential argument from evil, such that if I could persuade them of its soundness they would renounce their faith, and vice versa. Were I persuaded the evidential argument from evil 'didn't work', I expect I would remain fairly sceptical of God's existence; were I to persuade them it 'does work', I would not be surprised if they maintained other evidence nonetheless makes God's existence likely on the total balance of evidence. And so on and so forth for other issues where reasonable people disagree. I suspect a common example would be reasonably close agreement on common information, yet beliefs diverging based on 'priors', comprised of a melange of experiences, gestalts, intuitions, and other pieces of more 'private' evidence.

Auxiliary challenges to double crux

I believe there are other difficulties with double crux, somewhat related to the above:

Crux-asymmetry

As implied above, even in cases where there is a crux C for person X believing B, C may not be a crux for B for person Y - their crux might be something else (A, say). (Or, more generally, X's and Y's sets of cruxes may be disjoint.) A worked example:

Carl Shulman and I disagree about whether MIRI is doing good research (we have money riding on it). I expect if I lose the bet, I'd change my mind substantially about the quality of MIRI's work (i.e. my view would be favourable rather than unfavourable). I don't see why this should be symmetrical between Shulman and me. If he lost the bet, he may still have a generally favourable view of MIRI, and a 'crux' for him may be some other evidence or collection of evidence.

X or Y may simply differ in the resilience of their credence in B, such that one or the other's belief shifts more on being persuaded of a particular consideration. One common scenario (intra-EA): if one is trying to chase 'hits', one is probably more resilient to subsequent adverse information than to the initial steers that suggested a given thing could be a hit.

A related issue is when one person believes they are in receipt of a decisive consideration for or against B their interlocutor is unaware of.

'Changing one's mind' around p=0.5 isn't (that) important

In most practical cases, a difference between 1 and 0.51 or 0 and 0.49 is much more important than between 0.49 and 0.51. Thus disagreements over confidence of dis/belief can be more important, even if they may not count as 'changing one's mind': I probably differ more with a 'convinced Atheist' than with a 'doubter who leans slightly towards Theism'.

Many arguments and considerations are abductive, and so lend strength to a particular belief. Thus a similar challenge applies to proposed cruxes - they may regard the strength, rather than direction, of a given consideration. One could imagine the 'crux' between me and the hypothetical convinced Atheist is that they think the evidential problem of evil provides overwhelming disconfirmation for Theism, whilst I think it's persuasive, perhaps decisive, but not so much that it drives reasonable credence in Theism down to near-zero.

Sabien's exposition recognises this, and so suggests one can 'double crux' over varying credences. So in this sample disagreement, the belief is 'Atheism is almost certain', and the crux is 'the evidential argument from evil is overwhelming'. Yet our language for credences is fuzzy, and so what would be a crux for the difference between (say) 'somewhat confident' versus 'almost certain' is hard to nail down in a satisfactory inter-subjective way. An alternative where a change of raw credence counts as 'changing one's mind' entails that all considerations we take to support our credence in a given belief are cruxes.

Intermezzo

I suggest these difficulties may make a good diagnosis for why double cruxing has not always worked well. Anecdata seems to vary from those who have found it helpful to those who haven't seen any benefit (perhaps leaning towards the latter), alongside remarks along the lines of wanting to see a public example.

Raemon's subsequent exegesis helpfully distinguishes between the actual double crux technique and "the overall pattern of behaviour surrounding this Official Double Crux technique". They also offer a long list of considerations around the latter which may be pre-requisite for double cruxing working well (e.g. Social Skills, Actually Changing your Mind, and so on).

I wonder what value double crux really adds, if Raemon's argument is on the right track. If double cruxing requires many (or most) of the pre-requisites suggested, disagreements that meet these pre-requisites will go about as well whether one uses double crux or some other intuitive means of subsequent discussion.

A related concern of mine is a 'castle-and-keep' esque defence of double crux which arises from equivocating between double crux per se and a host of admirable epistemic norms it may rely upon. Thus when defended, double crux may transmogrify from "look for some C which if you changed your mind about you'd change your mind about B too" to a large set of incontrovertibly good epistemic practices: "It is better to be collaborative rather than combative in discussion, and be willing to change one's mind, (etc.)" Yet even if double cruxing is associated with (or requires) these good practices, it is not a necessary condition for them.

Good philosophers already disagree better than double cruxing

To find fault, easy; to do better, difficult

- Plutarch (paraphrased)

Per Plutarch's remark, any shortcomings in double crux may count for little if it is the 'best we've got'. However, I believe I can not only offer a better approach, but that this approach already exists 'in the wild'. I have the fortune of knowing many extraordinarily able philosophers, and not only observe their discussions but (as they also have extraordinary reserves of generosity and forbearance) participate in them as well. Their approach seems to do much better than reports of what double cruxing accomplishes.

What roughly happens is something like this:

  1. X and Y realize their credences on some belief B vary considerably.

  2. X and Y both offer what appear (to their lights) the strongest considerations that push them to a higher/lower credence on B.

  3. X and Y attempt to prioritize these considerations by how sensitive their credence in B is to each of them, via some mix of resilience, degree of disagreement over the considerations, and so forth (a rough sketch of this prioritization follows the list).

  4. They then discuss these in order of priority, moving on when the likely yield of the current topic drops below that of the next candidate, under some overall constraint on time.
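
To make step 3 concrete, here is a minimal sketch with made-up numbers (the considerations, weights, and scoring rule are illustrative assumptions, not anything the philosophers I have in mind actually compute):

```python
# Toy sketch: order considerations by how much discussing each is likely to
# move credence in B. All names and numbers are illustrative assumptions.

considerations = [
    # (name, |shift in P(B)| if one's view on it flipped,
    #  rough chance that discussion actually shifts someone's view on it)
    ("evidential argument from evil",  0.30, 0.10),
    ("fine-tuning argument",           0.15, 0.25),
    ("reliability of religious texts", 0.05, 0.40),
]

def expected_yield(item):
    _, shift_if_changed, chance_of_change = item
    return shift_if_changed * chance_of_change

# Step 4: discuss in descending order of expected yield, stopping when the
# marginal yield no longer justifies the remaining time.
for name, shift, chance in sorted(considerations, key=expected_yield, reverse=True):
    print(f"{name}: expected movement in P(B) ~ {shift * chance:.3f}")
```

The point is only that resilience and degree of disagreement trade off against the size of the potential update; the real process is informal and iterative.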

This approach seems to avoid the 'in theory' objections I raise against double crux above. It seems to avoid some of the 'in practice' problems people observe:

  • These discussions often occur (in fencing terms) at double time, and thus one tends not to flounder trying to find 'double-cruxy' issues. Atheist may engage Theist on attempting to undermine the free will defence to the argument from evil, whilst Theist may engage Atheist on the deficiencies of moral anti-realism to prepare ground for a moral argument for the existence of God. These may be crux-y but they may be highly asymmetrical. Atheist may be a compatibilist but grant libertarian free will for the sake of argument, for example: thus Atheist's credence in God will change little even if persuaded the free will defence broadly 'checks out' if one grants libertarian free will, and vice versa.

  • These discussions seldom get bogged down in fundamental disagreements. Although Deontologist and Utilitarian recognise their view on normative ethics is often a 'double crux' for many applied ethics questions (e.g. Euthanasia), they mutually recognise their overall views on normative ethics will likely be sufficiently resilient that the likelihood of either of them 'changing their mind' based on a conversation is low. Instead they turn their focus to other matters which are less resilient, and where they thus anticipate a greater likelihood of someone or other changing their mind.

  • There appear to be more realistic expectations about the result. If Utilitarian and Deontologist do discuss the merits of utilitarianism versus (say) Kantianism, there's little expectation of 'resolving their disagreement' or that they will find mutually crucial considerations. Rather they pick at a particular leading consideration on either side and see whether it may change their confidence (this is broadly reflected in the philosophical literature: papers tend to concern particular arguments or considerations, rather than offering all-things-considered determinations of broad recondite philosophical positions).

  • There appear to be better stopping rules. On the numerous occasions where I'm not aware of a particularly important consideration, it often seems a better use of everyone's time for me to read about this in the relevant literature rather than continuing to discuss (I'd guess 'reading a book' beats 'discussing with someone you disagree with' on getting more accurate beliefs about a topic per unit time surprisingly often).

Coda: Wherefore double crux?

It is perhaps possible for double crux to be expanded or altered to capture the form of discussion I point to above, and perhaps one can recast all the beneficial characteristics I suggest in double crux verbiage. Yet such a program appears a fool's errand: the core idea of double crux introduced at the top of the post is distinct not only from generally laudable epistemic norms (c.f. Intermezzo, supra), but also from the practices of the elite cognisers I point towards in the section above. A concept of double crux so altered to incorporate these things is epiphenomenal - the engine driving the better disagreement is simply those other principles and practices double crux has now appropriated, and its chief result is to add terminological overhead and, perhaps, inapt approximation.

I generally think the rationalist community already labours under too much bloated jargon: words and phrases which are hard for outsiders to understand, and yet do not encode particularly hard or deep concepts. I'd advise against further additions to the lexicon. 'Look for key considerations' captures the key motivation for double crux better than 'double crux' itself, and its meaning is clear.

The practices of exceptional philosophers set a high bar: these are people selected for, and who practice heavily, argument and disagreement. It is almost conceivable that they are better at this than even the rationalist community, notwithstanding the vast and irrefragable evidence of this group's excellence across so many domains. Double crux could still have pedagogical value: it might be a technique which cultivates better epistemic practices, even if those who enjoy excellent epistemic practices have a better alternative. Yet this does not seem the original intent, nor does there appear much evidence of this benefit.

In the introduction to double crux, Sabien wrote that the core concept was 'fairly settled'. In conclusion he writes:

We think double crux is super sweet. To the extent that you see flaws in it, we want to find them and repair them, and we're currently betting that repairing and refining double crux is going to pay off better than try something totally different. [emphasis in original]

I respectfully disagree. I see considerable flaws in double crux, which I don't think have much prospect of adequate repair. That time and effort would be better spent looking elsewhere.

67 comments

Comments sorted by top scores.

comment by Eli Tyre (elityre) · 2017-10-10T23:33:37.709Z · LW(p) · GW(p)

I’m the person affiliated with CFAR who has done the most work on Double Crux in the past year. I both teach the unit (and its new accompaniment class “Finding Cruxes") at workshops, and semi-frequently run full or half day Double Crux development-and-test sessions on weekends. (However, I am technically a contractor, not an employee of CFAR.)

In the process of running test sessions, I’ve developed several CFAR units’ worth of support or prerequisite material for doing Double Crux well. We haven’t yet solved all of the blockers, but attendees of those full-day workshops are much more skilled at applying the technique successfully (according to my subjective impression, and by the count of “successfully resolved" conversations).

This new content is currently unpublished, but I expect that I’ll put it up on LessWrong in some form (see the last few bullet points below), sometime in the next year.

I broadly agree with this post. Some of my current thoughts:

  • I’m fully aware that Double Crux is hard to use successfully, which is what motivated me to work on improving the usability of the technique in the first place.

  • Despite those usability issues, I have seen it work effectively to the point of completely resolving a disagreement. (Notably, most of the instances I can recall were Double Cruxes between CFAR staff, who have a very high level of familiarity with Double Crux as a concept.)

  • The specific algorithm that we teach at workshops has undergone iteration. The steps we teach now are quite different than those of a year ago.

  • Most of the value of Double Crux, it seems to me, comes not from formal application of the framework, but rather from using conversational moves from Double Crux in “regular” conversations. TAPs to “operationalize”, or “ask what would change your own mind” are very useful. (Indeed, about half of the Double Crux support content is explicitly for training those TAPs, individually.) This is, I think, what you're pointing to with the difference between "the actual double crux technique" and "the overall pattern of behaviour surrounding this Official Double Crux technique".

  • In particular, I think that the greatest value of having the Double Crux class at workshops is the propagation of the jargon "crux". It is useful for the CFAR alumni community to have a distinct concept for "a thing that would cause you to change your mind", because that concept can then be invoked in conversation.

  • I think the full stack of habits, TAPs, concepts, and mindsets that lead to resolution of apparently intractable disagreement, is the interesting thing, and what we should be pursuing, regardless of if that stack "is Double Crux." (This is in fact what I'm working on.)

  • Currently, I am unconvinced that Double Crux is the best or “correct” framework for resolving disagreements. Personally, I am more interested in other (nearby) conversational frameworks.

  • In particular, I expect non-symmetrical methods for grokking another person’s intuitions, as Thrasymachus suggests, to be fruitful. I, personally, currently use an asymmetrical framework much more frequently than I use a symmetric Double Crux framework. (In part because this doesn't require my interlocutor to do anything in particular or have knowledge of any particular conversational frame.)

  • I broadly agree with the section on asymmetry of cruxes (and it is an open curriculum development consideration). One frequently does not find a Double Crux, and furthermore doesn't need to find a Double Crux to make progress: single cruxes are very useful. (The current CFAR unit says as much.)

  • There are some non-obvious advantages to finding a Double Crux though, namely that (if successful), you don't just agree about the top-level proposition, but also share the same underlying model. (Double Crux is not, however, the only framework that produces this result.)

I have a few points of disagreement, however. Most notably, how common cruxes are.

I suggest the main problem facing the 'double crux technique' is that disagreements like Xenia's and Yevgeny's, which can be eventually traced to a single underlying consideration, are the exception rather than the rule.

My empirical experience is that disputes can be traced down to a single underlying consideration more frequently than one might naively think, particularly in on-the-fly disagreements about "what we should do" between two people with similar goals (which, I believe, is Double Crux's ideal use case.)

For many recondite topics I think about, my credence in it arises from the balance of a variety of considerations pointing in either direction.

While this is usually true (at least for sophisticated reasoners), it sometimes doesn't bear on the possibility of finding a (single) crux.

For instance, as a very toy example, I have lots of reasons to believe that acceleration due to gravity is about 9.806 m/s^2: the textbooks I've read, the experiments I did in high school, my credence in the edifice of science, etc.

But, if I were to find out that I were currently on the moon, this would render all of those factors irrelevant. It isn't that some huge event changed my credence about all of the above factors. It's that all of those factors flow into a single higher-level node and if you break the connection between that node and the top level proposition, your view can change drastically, because those factors are no longer important. In one sense it's a massive update, but in another sense, it's only a single bit flipped.
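
As a toy rendering of that structure (the numbers and the crude aggregation below are purely illustrative assumptions): all the evidence bears on the top-level claim only through the single location node, so flipping that one node swings the conclusion without touching any of the evidence.

```python
# Toy model: many pieces of evidence support "g here is ~9.8 m/s^2" only via
# the single node "I am on Earth". All numbers are illustrative placeholders.

evidence_for_standard_physics = {"textbooks": 0.99, "school experiments": 0.95}

def p_g_is_9_8(p_on_earth):
    # The evidence bears on the conclusion only through the location node.
    p_given_earth = min(evidence_for_standard_physics.values())  # crude aggregation
    p_given_moon = 0.01
    return p_on_earth * p_given_earth + (1 - p_on_earth) * p_given_moon

print(p_g_is_9_8(p_on_earth=0.999))  # ~0.95: confident that g is ~9.8 here
print(p_g_is_9_8(p_on_earth=0.001))  # ~0.01: one flipped node, drastic change
```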

I think that many real-to-life, seemingly intractable disagreements, particularly when each party has a strong and contrary-to-the-other's intuition, have this characteristic. It's not that you disagree about the evidence in question; you disagree about which evidence matters.

Because I think we're on the moon, and you think we're on earth.

But this is often hard to notice, because that sort of background content is something we both take for granted.

Next I'll try to give some realer-to-life examples. (Full real life examples will be hard to convey because they will be more subtle or require more context. Very simplified anecdotes will have to do for now.)

You can notice something like this happening when...

1) You are surprised or taken aback at some piece of information that the other person thinks is true:

"Wait. You think if we had open borders almost the same number of people would immigrate as under the current US immigration policy?!" [This was a full Double Crux from a real conversation, resolved with recourse to available stats and a fermi estimate.]

Whether or not more people will immigrate could very well change your mind about open borders.

2) You have (according to you) knock-down arguments against their case that they seem to concede quickly, that they don't seem very interested in, or that don't change their overall view much.

You're talking with a person who doesn't think that decreasing carbon emissions is important. You give them a bunch of evidence about the havoc that global warming will wreak, and they agree with it all. It turns out (though they didn't quite realize it themselves) that they're expecting that things are so bad that geo-engineering will be necessary, and it's not worth doing anything short of geoengineering. [Fictionalized real example.]

The possibility and/or necessity of geoengineering could easily be a crux for someone in favor of carbon-minimizing interventions.

3) They keep talking about considerations that, to you, seem minor (this also happens between people who agree):

A colleague tells you that something you did was "rude" and seems very upset. It seems to you that it was a little abrasive, but that it is important that actions of that sort be allowed in the social space. Your colleague declares that it is unacceptable to be "rude." It becomes clear that she is operating from a model whereby being "rude" is so chilling to the discourse that it effectively makes discussion impossible. [Real, heavily simplified, example.]

If this were true it might very well cause you to reassess your sense of what is socially acceptable.

Additionally, here are some more examples of Cruxes that, on the face of it, seem too shallow to be useful, but can actually move the conversation forward:

If there were complete nuclear disarmament, more people would die violently. [Real example from a CFAR workshop, though clouded memory.]

If everyone were bisexual, people would have more sex. [I'm not sure if I've actually seen this one, but it seems like a plausible disagreement from watching people Double Crux on a nearby topic.]

CFAR wants to reach as many people as possible. [Real example]

For each of these, we might tend to take the proposition (or its opposite!) as given, but rather frequently, two people disagree about the truth value.

I claim that there is crux-structure hiding in each of these instances, and that instances like these are surprisingly common (acknowledging that they could seem frequent only because I'm looking for them, and the key feature of some other conversational paradigm is at least as common.)

More specifically, I claim that on hard questions and in situations that call for intuitive judgement, it is frequently the case that the two parties are paying attention to different considerations, and some of the time, the consideration that the other person is tracking, if borne out, is sufficient to change your view substantially.

. . .

I was hoping to respond to more points here, but this is already long, and, I fear, a bit rambly. As I said, I'll write up my full thoughts at some point.

I'm curious if I could operationalize a bet with Thrasymachus about how similar the next (or final, or 5 years out, or whatever) iteration of disagreement resolution social-technology will be to Double Crux. I don't think I would take 1-to-1 odds, but I might take something like 3-to-1, depending on the operationalization.

Completely aside from the content, I'm glad to have posts like this one, critiquing CFAR's content.

Replies from: Kenny, elityre
comment by Kenny · 2018-03-06T14:59:57.182Z · LW(p) · GW(p)

I didn't think your comment was too long, nor would it have been even if it were twice as long. Nor did I find it rambly. Please consider writing up portions of your thoughts whenever you can if doing so is much easier than writing up your full thoughts.

Replies from: elityre
comment by Eli Tyre (elityre) · 2018-05-09T19:18:36.223Z · LW(p) · GW(p)

Thanks. : ) I'll take this into consideration.

comment by Eli Tyre (elityre) · 2019-12-13T18:22:23.851Z · LW(p) · GW(p)

Some updates on what I think about Double Crux these days are here [LW(p) · GW(p)].

comment by Chris_Leong · 2017-10-08T23:15:32.718Z · LW(p) · GW(p)

I've attended a CFAR workshop. I agree with you that Double Crux has all of these theoretical flaws, but it actually seems to work reasonably well in practice, even if these flaws make it kind of confusing. In practice you just kind of stumble through. I strongly agree that if the technique were rewritten so that it didn't have these flaws, it would be much easier to learn, as the stumbling bit isn't the most confidence-inspiring (this is when the in-person assistance becomes important).

One of the key elements that I haven't seen mentioned here is the separation between trying to persuade the other person and trying to find out where your points of view differ. When you are trying to convince the other person it is much easier to miss, for example, when there's a difference in a core assumption. Double Crux lets you understand the broad structure of their beliefs so that you can at least figure out the right kinds of things to say later to persuade them that won't be immediately dismissed.

Replies from: None
comment by [deleted] · 2017-10-09T00:57:24.360Z · LW(p) · GW(p)

I think the second paragraph is a good point, and it is a big part of what I think makes Double Crux better, perhaps, than the standard disagreement resolution model.

comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T20:16:01.573Z · LW(p) · GW(p)

A specific sub-point that I don't want to be lost in the sea of my previous comment:

A related concern of mine is a 'castle-and-keep' esque defence of double crux which arises from equivocating between double crux per se and a host of admirable epistemic norms it may rely upon. Thus when defended, double crux may transmogrify from "look for some C which if you changed your mind about you'd change your mind about B too" to a large set of incontrovertibly good epistemic practices: "It is better to be collaborative rather than combative in discussion, and be willing to change one's mind, (etc.)" Yet even if double cruxing is associated with (or requires) these good practices, it is not a necessary condition for them.

I think there's a third path here, which is something like "double crux may be an instrumentally useful tool in causing these admirable epistemic norms to take root, or to move from nominally-good to actually-practiced."

I attempted in the original LW post, and attempt each time I teach double crux, to underscore that double crux has as its casus belli specific failure modes in normal discourse, and that the point is not, actually, to adhere rigidly to the specific algorithm, but rather that the algorithm highlights a certain productive way of thinking and being, and that while often my conversations don't resemble pure double crux, I've always found that a given marginal step toward pure double crux produces value for me.

Which seems to fit with your understanding of the situation, except that you object to a claim that I and CFAR didn't intend to make. You interpreted us (probably reasonably and fairly) as doing a sort of motte-and-bailey bait-and-switch. But what I, at least, meant to convey was something like "so, there are all these really good epistemic norms that are hard to lodge in your S1, and hard to operationalize in the moment. If you do this other thing, where you talk about cruxes and search for overlap, somehow magically that causes you to cleave closer to those epistemic norms, in practice."

It's like the sort of thing where, if I tell you that it's an experiment about breathing, your breathing starts doing weird and unhelpful things. But if I tell you that it's an experiment about calculation, I can get good data on your breathing while your attention is otherwise occupied.

Hopefully, we're not being that deceptive. But I claim that we're basically saying "Do X" because of a borne-out-in-practice prediction that it will result in people doing Y, where Y are the good norms you've identified as seemingly unrelated to the double crux framework. I've found that directly saying "Do Y" doesn't produce the desired results, and so I say "Do X" and then feel victorious when Y results, but at the cost of being vulnerable to criticism along the lines of "Well, yeah, sure, but your intervention was pointed in the wrong direction."

comment by Thrasymachus · 2017-10-14T21:45:08.879Z · LW(p) · GW(p)

I hope readers will forgive a 'top level' reply from me, its length, and that I plan to 'tap out' after making it (save for betting). As pleasant as this discussion is, other demands pull me elsewhere. I offer a summary of my thoughts below - a mix of dredging up points I made better 3-4 replies deep than I managed in the OP, and of replies to various folks at CFAR. I'd also like to bet (I regret to decline Eli's offer for reasons that will become apparent, but I hope to make some agreeable counter-offers).

I persist in three main worries: 1) That double crux (or 'cruxes' simpliciter) are confused concepts; 2) It doesn't offer anything above 'strong consideration', and insofar as it is not redundant, framing in 'cruxes' harms epistemic practice; 3) The evidence CFAR tends to fall back upon to nonetheless justify the practice of double crux is so undermined that it is not only inadequate public evidence, but it is inadequate private evidence for CFAR itself.

The colloid, not crystal, of double crux

A common theme in replies (and subsequent discussions) between folks at CFAR and me is one of a gap in understanding. I suspect 'from their end' (with perhaps the exception of Eli) the impression is I don't quite 'get it' (or, as Duncan graciously offers, maybe it's just the sort of thing that's hard to 'get' from the written-up forms): I produce sort-of-but-not-quite-there simulacra of double crux, object to them, but fail to appreciate the real core of double crux to which these objections don't apply. From mine, I keep trying to uncover what double crux is, yet can't find any 'hard edges': it seems amorphous, retreating into other concepts when I push on what I take to be distinctive about it, yet flopping out again when I turn to something else. So I wonder if there's anything there at all.

Of course this seeming 'from my end' doesn't distinguish between the two cases. Perhaps I am right that double crux is no more than some colloid of conflated and confused concepts; but perhaps instead there is a crystallized sense of what double crux is 'out there' that I haven't grasped. Yet what does distinguish these cases in my favour is that CFAR personnel disagree with one another about double crux.

For a typical belief on which one might use double crux (or just 'single cruxing'), should one expect to find one crux, or multiple cruxes?

Duncan writes (among other things on this point):

The claim that I derive from "there's surprisingly often one crux" is something like the following: that, for most people, most of the time, there is not in fact a careful, conscious, reasoned weighing and synthesis of a variety of pieces of evidence. [My emphasis]

By contrast, Dan asserts in his explanation:

A typical belief has many cruxes. For example, if Ron is in favor of a proposal to increase the top marginal tax rate in the UK by 5 percentage points, his cruxes might include "There is too much inequality in the UK", "Increasing the top marginal rate by a few percentage points would not have much negative effect on the economy", and "Spending by the UK government, at the margin, produces value". [my emphasis]

This doesn't seem like a minor disagreement, as it flows through to important practical considerations. If there's often one crux (but seldom more), once I find it I should likely stop looking; if there's often many cruxes, I should keep looking after I find the first.

What would this matter, beyond some 'gotcha' or cheap point-scoring? This: I used to work in public health, and one key area is evaluation of complex interventions. Key to this in turn is to try and understand both that the intervention works and how it works. The former without the latter introduces a troublesome black box: maybe your elaborate, high-overhead model of the intervention works through some much simpler causal path (c.f. that many schools of therapy with mutually incompatible models are in clinical equipoise, but appear also in equipoise with 'someone sympathetic listening to you'); maybe you mistake the key ingredient as intrinsic to the intervention when it is instead contingent on the setting, so it doesn't work when the setting is changed (c.f. the external validity concerns that plague global health interventions).

In CFAR's case there doesn't seem to be a shared understanding of the epistemic landscape (or, at least, where cruxes lie within it) between 'practitioners'. It also looks to me like there's not a shared understanding on the 'how it works' question - different accounts point in different directions: Vanvier seems to talk more about 'getting out of trying to win the argument mode to getting to the truth mode', Duncan emphasizes more the potential rationalisations one may have for a belief, Eli suggests it may help locate cases where we differ in framing or fundamental reasons despite holding the more proximal reasons in common (i.e. the 'earth versus moon' hypothetical). Of course, it could do all of these, but I don't think CFAR has a way to tell. Finding the mediators would also help buttress claims of causal impact.

Cruxes contra considerations

I take it as 'bad news' for an idea, whatever its role, if one can show a) it is a proposed elaboration of another idea, and b) this elaboration makes the idea worse. I offer an in-theory reason to think 'cruxes' are inapt elaborations of 'considerations', a couple of considerations as to why 'double crux' might degrade epistemic practice, and a bet that people who are 'double cruxing' (or just 'finding cruxes') are often not in fact using cruxes.

Call a 'consideration' something like this:

A consideration for some belief B is another belief X such that believing X leads one to assign a higher credence to B.

This is (unsurprisingly) broad, including stuff like 'reasons', 'data' and the usual fodder for bayesian updating we know and love. Although definitions of a 'crux' slightly vary, it seems to be something like this:

A crux for some belief B is another belief C such that if one did not believe C, one would not believe B.

Or:

A crux for some belief B is another belief C such that if one did not believe C, one would change one's mind about B.

'Changing one's mind' about B is not ultra-exact, but nothing subsequent turns on this point (one could just encode B in the first formulation as 'I do not change my mind about another belief (A)', etc.)

The crux rule

I said in a reply to Dan that, given this idea of a crux, a belief should be held no more strongly than its (weakest) crux (call this the 'crux rule'). He expressed uncertainty about whether this was true. I hope this derivation is persuasive:

¬C -> ¬B (i.e. if I don't believe C, I don't believe B - or, if you prefer, if I don't believe the crux, I should not 'not change my mind about' B)

So:

B -> C (i.e. if I believe B, I must therefore believe C).

If B -> C, P(C) >= P(B): there is no possibility C is false yet B is true, yet there is a possibility where C is true and B is false (compare modus tollens to affirming the consequent).

So if C is a crux for B, one has inconsistent credences if one offers a higher credence for B than for C. An example: suppose I take "Increasing tax would cause a recession" as a crux for "Increasing taxes is bad" - if I thought increasing taxes would not cause a recession, I would not think increasing taxes is bad. Suppose my credence for raising taxes being bad is 0.9, and my credence for raising taxes causing a recession is 0.6. I'm inconsistent: if I assign a 40% chance raising taxes would not cause a recession, I should think there's at least a 40% chance raising taxes would not be bad, not 10%.

(In a multi-crux case with cruxes C1-n for B, the above argument applies to each of C1-n, so B's credence must not be higher than the credence of any of them, and thus must be equal to or lower than the lowest. Although this is a bound, one may anticipate B's credence to be substantially lower, as the probability of the conjunction of (mostly) independent cruxes approximates P(C1)*P(C2) etc.)

Note this does not apply to considerations, as there's no neat conditional parsing of 'consideration' in the same way as 'crux'. This also agrees with common sense: imagine some consideration X one is uncertain of which nonetheless favours B over ¬B: one can be less confident of X than of B.

Why belabour this logic and probability? Because it offers a test of intervention fidelity: whether people who are 'cruxing' are really using cruxes. Gather a set of people one takes as epistemically virtuous who 'know how to crux', and have them find cruxes for some of their beliefs. Then ask them to offer their credences for both the belief and the crux(es) for the belief. If they're always finding cruxes, there will be no cases where they offer a higher credence for the belief than for its associated crux(es).

I aver the actual proportion of violations of this 'crux rule' will be at least 25%. What (epistemically virtuous) people are really doing when 'finding cruxes' is identifying strong considerations which they think gave them large updates toward B over ¬B. However, despite this they will often find their credence in the belief is higher than in the supposed crux. I might think the argument from evil is the best consideration for atheism, but I may also hold that a large number of considerations point in favour of atheism, such that they work together to make me more confident of atheism than of the soundness of the argument from evil. Readers (CFAR alums or not) can 'try this at home'. For a few beliefs, 'find your cruxes'. Now offer credences for these - how often do you need to adjust these credences to obey the 'crux rule'? Do you feel closer to reflective equilibrium when you do so?
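
To make that check concrete, here is a minimal sketch with invented credences standing in for real elicitation (the 25% threshold would be estimated from data gathered this way, not from anything below):

```python
# Sketch of the 'crux rule' test: flag a violation whenever credence in a
# belief exceeds credence in its weakest stated crux. Entries are invented
# placeholders for elicited credences.

elicited = [
    # (belief, credence in belief, credences in its stated crux(es))
    ("raising taxes is bad",        0.90, [0.60]),        # violation: 0.90 > 0.60
    ("atheism",                     0.95, [0.80, 0.90]),  # violation: 0.95 > 0.80
    ("MIRI is doing good research", 0.70, [0.85, 0.75]),  # consistent
]

violations = [belief for belief, p_b, crux_ps in elicited if p_b > min(crux_ps)]
print(f"crux-rule violations: {violations} ({len(violations) / len(elicited):.0%})")
```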

Even if CFAR can't bus in some superforecasters or superstar philosophers to try this on, they can presumably do this with their participants. I offer the following bet (and happy to haggle over the precise numbers):

(5-1 odds [i.e. favouring you].) From any n cases of beliefs and associated cruxes for CFAR alums/participants/any other epistemically virtuous group who you deem 'know cruxing', greater than n/4 cases will violate the crux rule.

But so what? Aren't CFAR folks already willing to accept that 'crux' often (in Vanvier's words) 'degrades gracefully' into something like what I call a 'strong consideration'? Rather than castle-and-keep, isn't this more like constructing a shoddier castle somewhat nearby and knocking its walls down? To-may-to/To-mar-to?

Yet we already have words for 'things which push us towards a belief'. I used consideration, but we can also use 'reasons', or 'evidence' or whatever. 'Strong consideration' has 16 more characters than crux, but it has the benefits of its meaning being common knowledge, being naturally consonant with bayesianism, and accurately captures how epistemically virtuous people think and how they should be thinking. To introduce another term which is not common knowledge and forms either a degenerate or redundant version of this common knowledge term looks, respectfully, like bloated jargon by my lights.

If you think there's a crux, don't double crux, think again

Things may be worse than 'we've already got a better concept'. It looks plausible to me that teaching cruxes (or double crux) teaches bad epistemic practice. A contention I made in the OP is that crux incidence is anti-correlated with epistemic virtue: epistemically virtuous people usually find, on topics of controversy, that the support for their beliefs is distributed over a number of considerations without a clear 'crux', rather than that they would change their mind on some matter based on a single not-that-resilient consideration. Folks at CFAR seem to (mostly) agree, e.g. Duncan's remarks:

I note that, if correct, this theory would indicate that e.g. your average LessWronger would find less value in double crux than your average CFAR participant (who shares a lot in common with a LessWronger but in expectation is less rigorous and careful about their epistemics). This being because LessWrongers try very deliberately to form belief webs like the first image [many-one, small edge weights - T], and when they have a belief web like the third image [not-so-many-one, one much bigger edge - T] they try to make that belief feel to themselves as unbalanced and vulnerable as it actually is.

This suggests one's reaction on finding you have a crux should be alarm: "My web of beliefs doesn't look like what I'd expect to see from a person with good epistemics", and one's attitude towards 'this should be the crux for my belief' should be scepticism: "It's not usually the case that some controversial matter depends upon a single issue like this". It seems the best next step in such a situation is something like this: "I'm surprised there is a crux here. I should check with experts/the field/peers to see whether they agree with me that this is the crux of the matter. If they don't, I should investigate the other considerations suggested to bear upon this matter/reasons they may offer to assign lower weight to what I take to be the crux".

The meta-cognitive point is that it is important not only to get the right credences on the considerations, but also to weigh these considerations rightly to form a good 'all things considered' credence on the topic. Webs of belief that greatly overweigh a particular consideration track truth poorly even if they are accurate on what they (mis)take as the key issue. In my experience among elite cognisers, there's seldom disagreement that a consideration bears upon a given issue. Disagreement seldom occurs about the direction of that consideration either: parties tend to agree a given consideration favours one view or another. Most of the action occurs at the aggregation: "I agree with you this piece of evidence favours your view, but I weigh it less than this other piece of evidence that favours mine."

Cruxing/double crux seems to give entirely wrong recommendations. It pushes one to try to find single considerations that would change their mind, despite this usually being pathological; it focuses subsequent thinking on those considerations identified as cruxes, instead of the more important issue of whether one is weighing these considerations too heavily; it celebrates when you and your interlocutor agree on the crux of your disagreement, instead of cautioning such cases often indicate you've both gotten things wrong.

The plural of plausibly biased anecdote is effectively no evidence

Ultimately, the crux (forgive me) is whether double crux actually works. Suppose 'meditation' is to 'relaxing' as I allege 'crux/double crux' is to 'consideration'. Pretend all the stuff you hear about 'meditation' is mumbo-jumbo which obscures the fact that the only good 'meditation' does is prompt people to relax. This would be regrettable, but meditation would still be a good thing even if its value is only parasitic on the good of relaxing. One might wonder if you could do something better than 'meditation' by focusing on the actually valuable relaxing bit, but maybe this is one of those cases where the stuff around 'meditation' is a better route to get people to relax than targeting 'relaxing' directly. C.f. Duncan:

I think there's a third path here, which is something like "double crux may be an instrumentally useful tool in causing these admirable epistemic norms to take root, or to move from nominally-good to actually-practiced."

The evidence base for double crux (and I guess CFAR generally) seems to be something like this:

  • Lots of intelligent and reasonable people report cruxing/double crux was helpful for them. (I can somewhat allay Duncan's worry that the cases he observes might be explained by social pressure he generates - people have reported the same in conversations in which he is an ocean away).

  • Folks at CFAR observe many cases where double crux works; although it might work particularly well between folks at CFAR (see Eli's comment), they still observe it to be handy with non-CFAR staff.

  • Duncan notes favourable results in a sham control test (i.e. double crux versus 'discussing the benefits of epistemic virtues').

  • Dan provides some participant data: about half 'find a double crux', and it looks like finding a disagreement, finding a double crux (or both) was associated with a more valuable conversation.

Despite general equanimity, Duncan noted distress at the 'lack of epistemic hygiene' around looking at double crux, principally (as I read him) that of excessive scepticism from some outside CFAR. With apologies to him (and the writer of Matthew 7), I think the concern is more plausible in reverse: whatever motes blemish outsider eyes do not stop them seeing the beams blocking CFAR's insight. It's not only that outsiders aren't being overly sceptical in doubting this evidence; CFAR is being overly credulous in taking it as seriously as it does. Consider this:

  1. In cases where those who are evaluating the program are those involved in delivering the intervention, and they expectedly benefit the better the results, there's a high risk of bias. (c.f. blinding, conflict of interest)

  2. In cases where individuals enjoy some intervention (and often spent a quite a lot of money to participate) there's a high risk of bias for their self-report. (c.f. choice-supportive bias, halo effect, among others).

  3. Neither good faith nor knowledge of a potential bias risk, by itself, does much to help one avoid this bias.

  4. Prefer hard metrics with tight feedback loops when trying to perform well at something.

  5. Try and perform some reference class forecasting to avoid getting tricked by erroneous insider views (but I repeat myself).

What measure of credulity should a rationalist mete out to an outsider group with a CFAR-like corpus of evidence? I suggest it would be meagre indeed. One can recite almost without end interventions with promising evidence fatally undercut by minor oversights or bias (e.g. inadequate allocation concealment in an RCT). Within the class of interventions whose available evidence carries multiple, large, obvious bias risks, the central and modal member is an intervention with no impact.

We should mete out this meagre measure of credulity to ourselves: we should not on the one hand remain unmoved by the asseveration of a chiropractor that they 'really see it works', yet take evidence of similar quality and quantity to vindicate rationality training. In the same way we take the chiropractor to be irrational if they don't almost entirely discount their first-person experience of chiropractic successes once we inform them of the various cognitive biases that undercut the evidentiary value of this experience, we should expect a CFAR instructor or alum, given what they already know about rationality, to almost entirely discount these sources of testimonial evidence when judging whether double crux works.

Yet this doesn't happen. Folks at CFAR tend to lead with this anecdata when arguing that double crux works. This also mirrors 'in person' conversations I have, where otherwise epistemically laudable people cite their personal experience as what convinces them of the veracity of a particular CFAR technique. What has a better chance of putting one in touch with reality about whether double crux (or CFAR generally) works is the usual scientific suspects: focusing on 'hard outcomes', attempting formal trials, randomisation, making results public, and so forth. That this generally hasn't happened across the time of CFAR's operation I take to be a red flag.

For this reason I respectfully decline Eli's suggestion to make bets on whether CFAR will 'stick with' double crux (or something close to it) in the future. I don't believe CFAR's perception of what is working will track the truth, and so whether or not it remains 'behind double crux' is uninformative for the question of whether double crux works. I'm willing to offer bets on whether CFAR will gain 'objective' evidence of efficacy, and to bet in favour of the null hypothesis for these:

(More an error bounty than a bet - first person to claim gets £100.) CFAR's upcoming "EA impact metrics report" will contain no 'objective measures' (defined somewhat loosely - an objective measure is something like "My income went up/BMI went down/independent third party assessor rated the conversation as better", not things along the lines of "Participants rate the workshop as highly valuable/instructor rates conversations as more rational/etc.").

(3-to-1): CFAR will not generate in the next 24 months any peer reviewed literature in psychology or related fields (stipulated along the lines of "either published in the 'original reports' section of a journal with impact factor >1 or presented at an academic conference").

(4 to 1): Conditional on a CFAR study getting past peer review, it will not show significantly positive effects on any objective, pre-specified outcome measure.

I'm also happy to offer bets on objective measures of any internal evaluations re. double crux or CFAR activity more broadly.

Replies from: habryka4, Conor Moreton
comment by habryka (habryka4) · 2017-10-14T22:29:42.842Z · LW(p) · GW(p)

I agree with the gist of the critique of double crux as presented here, and have had similar worries. I don't endorse everything in this comment, but think taking it seriously will positively contribute to developing an art of productive disagreement.

I think the bet at the end feels a bit fake to me, since I think it is currently reasonable to assume that publishing a study in a prestigious psychology journal is associated with something around 300 person hours of completely useless bureaucratic labor, and I don't think it is currently worth it for CFAR to go through that effort (and neither I think is it for almost anyone else). However, if we relax the constraints to only reaching the data quality necessary to publish in a journal (verified by Carl Shulman or Paul Christiano or Holden Karnofsky, or whoever we can find who we would both trust to assess this), I am happy to take you up on your 4-to-1 bet (as long as we are measuring the effect of the current set of CFAR instructors teaching, not some external party trying to teach the same techniques, which I expect to fail).

I sadly currently don't have the time to write a larger response about the parts of your comment I disagree with, but think this is an important enough topic that I might end up writing a synthesis on things in this general vicinity, drawing from both yours and other people's writing. For now, I will leave a quick bullet list of things I think this response/argument is getting wrong:

  • While your critique is pointing out true faults in the protocol of double crux, I think it has not yet really engaged with some of the core benefits I think it brings. You somewhat responded to this by saying that you think other people don't agree on what double crux is about, which is indeed evidence of the lack of a coherent benefit; however I claim that if you dug deeper into those people's opinions, you would find that the core benefits they claim might sound superficially very different, but are actually at the core quite similar and highly related. I personally expect that we two would have a more productive disagreement than you and Duncan, and so I am happy to chat in person, here on LW or via any chat service of your preference if you want to dig deeper into this. Though obviously feel completely free to decline this.

  • I particularly think that the alternative communication protocols you proposed are significantly worse and, in as much as they are codified, do not actually result in more productive disagreement.

  • I have a sense that part of your argument still boils down to "CFAR's arguments are not affiliated enough with institutions that are allowed to make a claim about something like this (whereas academic philosophers and psychology journals are)." This is a very tentative impression, and I do not want to give you the sense that you have to defend yourself for this. I have high priors on people's epistemics being tightly entangled with their sense of status, and usually require fairly extraordinary evidence until I am convinced that this is not the case for any specific individual. However, since this kind of psychologizing almost never results in a productive conversation, this is not a valid argument in a public debate. And other people should be very hesitant to see my position as additional evidence of anything. But I want to be transparent in my epistemic state, and state the true reasons for my assessment as much as possible.

  • While I agree that the flaws you point out are indeed holding back the effectiveness of double crux, I disagree that they have any significant negative effects on your long-term epistemics. I don't think CFAR is training people to adopt worse belief structures after double cruxing, partially because I think the default incentives on people's belief structures as a result of normal social conversation are already very bad and doing worse by accident is unlikely, and because, on the occasions I've seen double crux in practice, I did not see mental motions that would correspond to the loss of the web-like structure of their beliefs, and if anything noticed a small effect in the opposite direction (i.e. people's social stance towards something was very monolithic and non-web-like, but as soon as they started double cruxing their stated beliefs were much more web-like).

Overall, I am very happy about this comment, and would give it a karma reward had I not spent time writing this comment, and implemented the moderator-karma-reward functionality instead. While I would have not phrased the issues I have with double crux in the same language, the issues it points out overlap to a reasonable degree with the ones that I have, and so I think it also represents a good chunk of my worries. Thank you for writing it.

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-14T23:20:10.371Z · LW(p) · GW(p)

Thanks for your reply. Given my own time constraints I'll decline your kind offer to discuss this further (I would be interested in reading some future synthesis). As consolation, I'd happily take you up on the modified bet. Something like:

Within the next 24 months CFAR will not produce results of sufficient quality for academic publication (as judged by someone like Christiano or Karnofsky) that demonstrate benefit on a pre-specified objective outcome measure

I guess 'demonstrate benefit' could be stipulated as 'p<0.05 on some appropriate statistical test' (the pre-specification should get rid of the p-hacking worries). 'Objective' may remain a bit fuzzy: the rider is meant to rule out self-report stuff like "Participants really enjoyed the session/thought it helped them". I'd be happy to take things like "Participants got richer than controls", "CFAR alums did better on these previously used metrics of decision making", or whatever else.
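
For concreteness, one way the stipulation might cash out (the outcome measure, group labels, and figures below are placeholder assumptions, and a two-sample t-test is only one candidate 'appropriate statistical test'):

```python
# Illustrative operationalisation of "p < 0.05 on a pre-specified objective
# outcome measure": compare, say, change in income between CFAR alums and
# controls. All figures below are invented placeholders.
from scipy import stats

alum_income_change    = [1200, -300, 4500, 800, 2500, 0, 3100]     # GBP
control_income_change = [900, -1500, 2000, 400, 1800, -200, 600]   # GBP

t_stat, p_value = stats.ttest_ind(alum_income_change, control_income_change)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# The bet would resolve on a properly run, pre-registered version of something
# like this, judged by the agreed third party.
```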

Happy to discuss further to arrive at agreeable stipulations - or, if you prefer, we can just leave them to the judge's discretion.

Replies from: habryka4
comment by habryka (habryka4) · 2017-10-14T23:47:30.830Z · LW(p) · GW(p)

Ah, the 4 to 1 bet was a conditional one:

(4 to 1): Conditional on a CFAR study getting past peer review, it will not show significantly positive effects on any objective, pre-specified outcome measure.

I don't know CFAR's current plans well enough to judge whether they will synthesize the relevant evidence. I am only betting that if they do, the result will be positive. I am still on the fence about taking a 4 to 1 bet on this, but the vast majority of my uncertainty here comes from what CFAR is planning to do, not from what the result would be. I would probably take a 5 to 1 bet on the statement as you proposed it.
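(For reference, reading the odds in the usual way, the party laying a to 1 needs a break-even probability of at least

$$p = \frac{a}{a+1}$$

on the side they are backing, i.e. 0.80 at 4 to 1 and roughly 0.83 at 5 to 1.)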

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-15T15:07:31.744Z · LW(p) · GW(p)

Sorry for misreading your original remark. Happy to offer the bet in conditional form, i.e.:

Conditional on CFAR producing results of sufficient quality for academic publication (as judged by someone like Christiano or Karnofsky) these will fail to demonstrate benefit on a pre-specified objective outcome measure

comment by Conor Moreton · 2017-10-15T01:04:00.729Z · LW(p) · GW(p)

This comment combines into one bucket several different major threads that probably each deserve their own bucket (e.g. the last part seems like strong bets about CFAR's competence that are unrelated to "is double crux good"). Personally I don't like that, though it doesn't seem objectively objectionable.

Replies from: habryka4
comment by habryka (habryka4) · 2017-10-15T01:11:30.728Z · LW(p) · GW(p)

I agree with this, and also prefer to keep discussion about the competence of specific institutions or individuals to a minimum on the frontpage (this is what I want to have a community tag for).

comment by ahartell · 2017-10-09T11:32:47.582Z · LW(p) · GW(p)

[These don't seem like cruxes to me, but are places where our models differ.]

[...]

a crux for some belief B is another belief C which if one changed one's mind about C, one would change one's mind about B.

[...]

A double crux is a particular case where two people disagree over B and have the same crux, albeit going in opposite directions. Say if Xenia believes B (because she believes C) and Yevgeny disbelieves B (because he does not believe C), then if Xenia stopped believing C, she would stop believing B (and thus agree with Yevgeny) and vice-versa.

[...]

Across most reasonable people on most recondite topics, 'cruxes' are rare, and 'double cruxes' (roughly) exponentially rarer.

It seems like your model might be missing a class of double cruxes:

It doesn't have to be the case that, if my interlocutor and I drew up belief maps, we would both find a load-bearing belief C about which we disagree. Rather, it's often the case that my interlocutor has some 'crucial' argument or belief which isn't on my radar at all, but would indeed change my mind about B if I were convinced it were true. In another framing, I have an implicit crux for most beliefs that there is no extremely strong argument/evidence to the contrary, which can match up against any load-bearing belief the other person has. In this light, it seems to me that one should not be very surprised to find double cruxes pretty regularly.

Further, even when you have a belief map where the main belief rests on many small pieces of evidence, it is usually possible to move up a level of abstraction and summarize all of that evidence in a higher-level claim, which can serve as a crux. This does not address your point about relatively unimportant shifts around 49%/51%, but in practice it seems like a meaningful point.

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-09T18:16:08.720Z · LW(p) · GW(p)

I guess my overall impression is that folding the cases you specify into a double-cruxy frame looks, by my lights, more like adding epicycles than like a helpful augmentation to the concept of double crux.

Non-common knowledge cruxes

I had a sentence in the OP on crux asymmetry along the lines of 'another case may be where X believes they have a crux for B which Y is unaware of'. One may frame this along the lines of an implicit crux of 'there's no decisive consideration that changes my mind about B', with which a proposed 'silver bullet' argument would constitute disagreement.

One of the pastimes of my misspent youth was arguing about god on the internet. A common occurrence was that Theist and Atheist would meet, and both would offer their 'pet argument', which they took to be decisive for A/Theism. I'm not sure these were high-quality discussions, so I'd not count it as a huge merit if they satisfy double crux.

I guess this ties back to my claim that on topics on which reasonable people differ, decisive considerations of this type should be very rare. One motivation for this runs along social epistemological lines: a claim about a decisive consideration seems to require some explanation of why others nonetheless hold the belief the decisive consideration speaks against. Explaining why your interlocutor is not persuaded is easy - they may simply have not come across it. Yet on many of the sort of recondite topics of disagreement one finds that experts are similarly divided to the laity, and usually the experts who disagree with you are aware of the proposed decisive consideration you have in mind (e.g. the non-trivial proportion of economists who are aware of the Laffer curve yet nonetheless support higher marginal tax rates, etc.). Although it is possible there's some systemic cause/bias/whatever which could account for why this section of experts is getting this wrong, it seems the more common explanation (also favoured by an outside view) is that you overrate the importance of the consideration due to inadequate knowledge of rebutting/undercutting defeaters.

I probably have a non-central sample of discussions I observe, so it may be the case that there is a large family of cases of 'putative decisive considerations unknown by one party'. Yet in those cases I don't think double cruxing is the right 'next step'. In cases where the consideration one has in mind is widely deemed to settle the matter by the relevant body of experts, it seems to approximate a 'let's agree to check it on Wikipedia' case (from a semi-recent conversation: "I'm generally quite sympathetic to this particular formulation of the equal weight view on disagreement." "Oh, the field has generally turned away from that due to work by Bloggs showing this view can be Dutch-booked." "If that is so, that's a pretty decisive show-stopper - can you point me to it?" /end conversation). In cases where the decisive consideration X has in mind is not held as decisive by the expert body, that should be a red flag to X, and X and Y's time, instead of being spent inexpertly hashing out this consideration, is better spent looking to the wider field of knowledge for a likely more sophisticated treatment of the same.

Conjunctive cruxes

I agree that the case where many small pieces of evidence provide support for the belief could be summarized into some wider conjunction (e.g. "I'd believe god exists if my credence in the argument from evil goes down by this much, and my credence in the argument from design goes up by this much"), which could be a crux for discussion.

Yet in such cases there's a large disjunction of conjunctions that would lead to a similar shift in credence, as each consideration likely weighs at least somewhat independently on the scales of reason (e.g. I'd believe god exists if my credence in AfE goes down by X and credence in the argument from design goes up by Y, or credence in AfE goes down by X-e and credence in AfD goes up by Y+f, or credence in AfE goes down by X-e, credence in AfD goes up by Y, and credence in the argument from religious disagreement goes down by Z, etc. etc.). Although these are not continua in practice, due to the granularity with which we store credences (I don't back myself to be more precise than the first significant digit in most cases), the size of this disjunctive set grows very rapidly with the number of relevant considerations. In consequence, there isn't a single neat crux to focus subsequent discussion upon.
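As a toy illustration of how quickly this disjunctive set grows (my own sketch, with arbitrary step sizes and threshold rather than anything from the exposition above):

    # Toy sketch only: with credences stored at coarse granularity, count how
    # many distinct combinations of per-consideration shifts would suffice to
    # change one's mind overall. The point is just that the count grows
    # rapidly with the number of considerations, so there is no single neat
    # crux to focus on.
    from itertools import product

    def mind_changing_combinations(n_considerations, threshold=1.0):
        possible_shifts = [0.0, 0.25, 0.5]   # coarse, first-significant-digit-style steps
        return sum(
            1
            for shifts in product(possible_shifts, repeat=n_considerations)
            if sum(shifts) >= threshold      # combined shift large enough to change one's mind
        )

    for n in range(2, 8):
        print(n, mind_changing_combinations(n))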

I don't think discussion is hopeless in these cases. If X and Y find (as I think they should in most relevant cases) their disagreement arises from varying weights they place on a number of considerations that bear upon B and ¬B, they can prioritize considerations to discuss which they differ the most on, and for which it appears their credences are the least resilient (I guess in essence to optimise expected d(credence)/dt). This is what I observe elite cognisers doing, but this doesn't seem to be double crux to me.
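A rough sketch of the sort of prioritisation I have in mind (a toy operationalisation of my own, not anything from CFAR's materials; all fields and numbers are hypothetical):

    # Toy sketch: rank the considerations two parties should discuss first by
    # (size of disagreement) x (how easily credence could move), i.e. roughly
    # maximising expected credence change per unit discussion time.
    def discussion_priority(considerations):
        """considerations: list of dicts with hypothetical fields 'name',
        'credence_x', 'credence_y', 'resilience' (0 = easily moved, 1 = fixed)."""
        def score(c):
            disagreement = abs(c['credence_x'] - c['credence_y'])
            return disagreement * (1.0 - c['resilience'])
        return sorted(considerations, key=score, reverse=True)

    example = [
        {'name': 'argument from evil', 'credence_x': 0.8, 'credence_y': 0.3, 'resilience': 0.9},
        {'name': 'argument from design', 'credence_x': 0.4, 'credence_y': 0.6, 'resilience': 0.2},
    ]
    for c in discussion_priority(example):
        print(c['name'])   # the less resilient consideration ranks first despite the smaller gap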

comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T18:29:21.465Z · LW(p) · GW(p)

I like most of this; it seems like the sort of post that's going to lead to significant improvements in people's overall ability to do collaborative truth-seeking, because it makes concrete and specific recommendations that overall seem useful and sane.

However,

principally because 'double cruxes' are rare in topics where reasonable people differ

disagreements like Xenia's and Yevgeny's, which can be eventually traced to a single underlying consideration, are the exception rather than the rule

and similar make me wish that posts like this would start by getting curious about CFAR's "n" of several hundred, rather than implicitly treating it as irrelevant. We've been teaching double crux at workshops for a couple of years now, and haven't stopped the way we've stopped with other classes and concepts that weren't pulling their weight.

My sense is that the combined number of all of the double-crux-doubters and double-crux-strugglers still does not approach, in magnitude, the number of people who have found double crux moderately-to-very useful and workable and helpful (and, in specific, the number of people who have been surprised to discover that double cruxes do in fact exist and are findable an order of magnitude more often than one would have naively guessed).

It does not distress me to consider that double crux might be imperfect, or even sufficiently broken that it should be thrown out.

It does distress me when claims about its imperfectness or brokenness fail to address the existing large corpus of data about its usefulness. Posts like these seem to me to be biased toward the perspective of people who have tried to pick it up online and piece it together, and to discount the several hundred people who've been taught it at workshops (well over a thousand if you count crash courses at e.g. EA conferences). At the very least, claim that CFAR is lying or that its participants are socially pressured into reporting a falsehood—don't just ignore the data outright.

(Edit: I consider these to be actually reasonable things to posit; I would want them to be posited politely and falsifiably if possible, but I think it's perfectly okay for people to have, as hypotheses, that CFAR's deceiving itself or others, or that people's self-report data is unduly swayed by conformity pressures. The hypotheses themselves seem inoffensive to me, as long as they're investigated soberly instead of tribally.)

=( Sorry for the strength of language, here (I acknowledge and own that I am a bit triggered), but it feels really important to me that we have better epistemic hygiene surrounding this question. The contra double crux argument almost always comes most strongly from people who haven't actually talked to anyone skilled in double crux or haven't actually practiced it in an environment where facilitators and instructors could help them iterate in the moment toward an actually useful set of mental motions.

(And no, those aren't the only people arguing contra double crux. Plenty of actual CFAR grads are represented by posts like the above; plenty of actual CFAR grads struggled and asked for help and still walked away confused, without a useful tool. But as far as I can tell, they are in a plurality, not a majority.)

It seems critical to me that we be the type of community that can distinguish between "none of CFAR's published material is sufficient to teach this technique on its own" and "this technique isn't sufficiently good to keep iterating on." The former is almost trivially true at this point, but that's weak evidence at best for the latter, especially if you take seriously (repeated for emphasis) the fact that CFAR has a higher n on this question than literally anybody else, as far as I know.

Scattered other thoughts:

  • As far as I can tell, none of the most serious doubters in the previous thread actually stepped up to try out double crux with any of the people who claimed to be skilled and capable in it. I know there was an additional "post disagreements and we'll practice" thread, but that seemed to me to contain low density of doubters giving it a serious try with proficient partners.

  • Also, the numbered description in the "what mostly happens" section is, according to me, something like an 85% overlap with double crux. It is a set of motions that the double crux class and tips point people toward and endorse. So that leaves me in confuséd disagreement with the claim that it is "perhaps possible" for double crux to be twisted around to include them—from my point of view, it already does. Which again points to a problem with the explanation and dissemination of the technique, but not with the core technique itself, as transmitted to real humans over the course of a couple of hours of instruction with facilitators on hand*. This causes me to wonder whether the real crux is the jargon problem that you alluded to near the end, with people being upset because they keep crystallizing specific and inaccurate impressions of what double crux is, doing things that seem better, and getting angry at double crux for not being those better things (when in at least many of the cases, it is, and the problem is one of semantics).

* Speaking of "a couple of hours of instruction with facilitators on hand," I'm also a little sad about what I read as an implicit claim that "well, double crux doesn't work as well as being the sort of philosopher or debater who's spent thousands upon thousands of hours practicing and looking at the nature of truth and argument and has been educated within a solid culture with a long tradition." It seems to me that double crux can take a 1-4 hour investment of time and attention and jumpstart people to being something like a quarter as good as those lifelong philosophers.

And I think that's amazing, and super promising, and that it's disingenuous to point at literal world-class professionals and say "what they do is better." It strikes me as true, but irrelevant? It's like if I offered an afternoon martial arts seminar that had 50% of its participants walking away with blue-belt level skill (and the other 50% admittedly lost or confused or unimpressed) and someone criticized it for not being as useful or correct as what black belts with ten years of experience do. It's still an afternoon seminar to get excited about, and for outsiders looking in to wonder "what's the secret sauce?"

(I realize now that I'm making a claim that may not have been published before, about double crux being intended to be a scrappy small-scale bootstrapping technique, and not necessarily the final step in one's productive-disagreement evolution. That's a novel claim, and one that other CFAR staff might disagree with me on, so I retract the part of my grrrr that was based on you not-taking-into-account-this-thing-you-couldn't-possibly-have-known-about).

Here's my sense of the bar that a putative double crux replacement needs to meet, because I claim double crux is already reliably meeting it:

  • Be explainable within 30-60 minutes in a deliberately scaffolded educational context

  • Be meaningfully practicable within another 30-60 minutes

  • Be sticky enough that people in fact desire to reference it and spread it and use it in the future

  • Provide a significant boost in productivity-in-debate and/or a corresponding significant reduction in wasted time/antagonism à la giving people 5-25% of the skill that you see in those philosophers

  • Do all of the above for greater than 33% of people who try it

Here's my sense of what is unfairly and inaccurately being expected of double crux:

  • Be grokkable by people based entirely on text and hearsay/be immune to problems that arise from [games of telephone] or [imperfect capture and reconstruction through writing].

  • Be "attractive" in the sense that people fumbling around on their own will necessarily find themselves moving closer toward the right motions rather than sideways or away, without help or guidance.

  • Be "complete" in the sense that it contains all of of what we know about how to improve disagreement and/or debate.

  • Provide an efficacy boost of greater than 50% for greater than 75% of the people who try it.

Here are my "cruxes" on this question:

  • I would drastically reduce my support for double crux as a technique if it turned out that what was needed was something that could be asynchronously transmitted (e.g. because in-person instruction insufficiently scales).

  • I would drastically reduce my support for double crux if it turned out that 50+% of the people who reported valuable experiences in my presence later on discovered that the knowledge didn't stick or that the feeling had been ephemeral (e.g. possibly social-conformity based rather than real).

  • I would drastically reduce my support for double crux if I attempted to convey my True Scotsman version to four high-quality skeptics/doubters (such as Thrasymachus) and all four afterward told me that all of the nuance I thought I was adding was non-useful, or had already been a part of their model and was non-novel.

  • I would drastically reduce my support for double crux if a concrete alternative that was chunkable, grokkable, teachable, and had thing-nature (as opposed to being a set of principles that are hard to operationalize) was proposed and was promising after its first iterations with 30+ people.

tl;dr I claim that most of the good things you point at are, in fact, reasonably captured by double crux and have been all along; I claim that most of the bad things you point at are either due to insufficient pedagogy (our fault) or are likely to be universal and bad for any attempt to create a technique in this space. I recognize that I'm in danger of sliding into No True Scotsman territory and I nevertheless stick to my claim based on having done this with hundreds of people, which is something most others have not had the opportunity to do.

Replies from: Kaj_Sotala, Thrasymachus, habryka4, lahwran
comment by Kaj_Sotala · 2017-10-08T19:15:41.643Z · LW(p) · GW(p)

It does distress me when people who argue that it's imperfect or broken do not even bother to address a very large corpus of data about its usefulness, including our claim that, in our experiences, the overlapping common crux is actually there.

As a datapoint, this is the first time that I remember hearing that there would exist "a very large corpus of data about its usefulness". The impression I got from the original DC post was that this was popular among CFAR's instructors, but that you'd been having difficulties effectively teaching this to others.

I think that if such a corpus of evidence exists, then the main reason why people are ignoring it is because the existence of this corpus hasn't been adequately communicated, making the implicit accusation of "your argument isn't taking into account all the data that it should" unfair.

Replies from: Duncan_Sabien
comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T19:17:05.094Z · LW(p) · GW(p)

That's sensible. I would have thought it was implied by "CFAR's taught this at every workshop and event that it's run for the past two years," but I now realize that's very typical-mind-fallacy of me.

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2017-10-08T19:24:30.710Z · LW(p) · GW(p)

Where's that fact from? It wasn't in the original DC post, which only said that the technique "is one of CFAR's newer concepts".

Replies from: Duncan_Sabien
comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T19:29:53.170Z · LW(p) · GW(p)

That's what I mean by typical mind fallacy. I live in the universe where it's obvious that double crux is being taught and tinkered with constantly, because I work at CFAR, and so I just stupidly forgot that others don't have access to that same knowledge. i.e. I was elaborating on my agreement with you, above.

Also, by "newer concept" we just mean relative to CFAR's existence since 2012. It's younger than e.g. inner sim, or TAPs, but it's been a part of the curriculum since before my hiring in October 2015.

Also also in consolidating comments I have discovered that I lack the ability to delete empty comments.

Replies from: Kaj_Sotala, habryka4
comment by Kaj_Sotala · 2017-10-08T19:33:51.911Z · LW(p) · GW(p)

Ah, gotcha.

Replies from: dxu
comment by dxu · 2017-10-08T19:42:13.855Z · LW(p) · GW(p)

I upvoted this entire chain of comments for the clear and prosocial communication displayed throughout.

comment by habryka (habryka4) · 2017-10-08T19:48:00.861Z · LW(p) · GW(p)

I've just been going around and deleting the empty comments. Right now we don't allow users to delete comments, since that would also delete all children comments by other authors (or at least make them inaccessible). Probably makes sense for people to be able to delete their own comments if they don't have any children.

comment by Thrasymachus · 2017-10-08T22:36:27.371Z · LW(p) · GW(p)

Hello Duncan,

My thanks for your reply. I apologise if my wording in the OP was inflammatory or unnecessarily 'triggering' (another commenter noted an 'undertone of aggression', which I am sorry for, although I promise it wasn't intended - you are quoted repeatedly because you wrote the canonical exposition of what I target in the OP, not out of some misguided desire to pick a fight with you on the internet). I hope I capture the relevant issues below, but apologies in advance if I neglect or mistake any along the way.

CFAR's several hundred and the challenge of insider evidence

I was not aware of the several hundred successes CFAR reports of double crux being used 'in the wild'. I'm not entirely sure whether the successes are a) people who find double crux helpful or b) particular instances of double crux resolving disagreement, but I think you would endorse plenty of examples of both. My pretty sceptical take on double crux had 'priced in' the expectation that CFAR instructors and at least some/many alums thought it was pretty nifty.

You correctly anticipate the sort of worries I would have about this sort of evidence. Self-reported approbation from self-selected participants is far from robust. Branches of complementary medicine can probably tout thousands to millions of 'positive results' and happy customers, yet we know they are in principle intellectually bankrupt, and in practice perform no better than placebo in properly conducted trials. (I regret to add that replies along the lines of "if you had received the proper education in the technique you'd - probably - see it works well", or "I'm a practitioner with much more experience than any doubter in terms of using this, and it works in my experience", also have analogies here.)

I don't think one need presume mendacity on the part of CFAR, nor gullibility on the part of workshop attendees, to nonetheless believe this testimonial evidence isn't strongly truth-tracking: one may anticipate similarly positive reports in worlds where (perhaps) double crux doesn't really work, but other stuff CFAR practices does work, and participants enjoy mingling with similarly rationally minded participants, may have had to invest 4 figure sums to get on the workshop, and so on and so forth. (I recall CFAR's previous evaluation had stupendous scores on self-reported measures, but more modest performance on objective metrics).

Of course, unlike complementary medicine, double crux does not have such powerful disconfirmation as 'violates known physics' or 'always fails RCTs'. Around the time double crux was proposed I challenged it on theoretical grounds (i.e. double cruxes should be very rare); this post was prompted by some of the dissonance on previous threads, but also by the lack of public examples of double crux working. Your experience of the success of double crux in workshops is essentially private evidence (at least for now): in the same way it is hard for you to persuade me of its validity, it is next to impossible for me to rebut it. I nonetheless hope other lines of inquiry are fruitful.

How sparse is a typical web of belief?

The first is the theoretical point. I read in your reply disagreement with the point that 'double cruxes' should be rare ("... the number of people who have been surprised to discover that double cruxes do in fact exist and are findable an order of magnitude more often than one would have naively guessed"). Although you don't include it as a crux in your reply, it looks pretty crucial to me. If cruxes are as rare as I claim, double cruxing shouldn't work, and so the participant reports are more likely to have been innocently mistaken.

In essence, I take the issue to be what the web of belief surrounding a typical subject of disagreement looks like. It seems double crux is predicated on this being pretty sparse (at least in terms of important considerations): although lots of beliefs might have some trivial impact on your credence in B, B is mainly set by small-n cruxes (C), which are generally sufficient to change one's mind if one's attitude towards them changes.

By contrast, I propose the relevant web tends to be much denser (or, alternatively, the 'power' of the population of reasons that may alter one's credence in B is fairly evenly distributed). Credence in B arises from a large number of considerations that weigh upon it, each of middling magnitude. Thus even if I am persuaded one is mistaken, my credence in B does not change dramatically. It follows that 'cruxes' are rare, and rarer still is the case where two people discover that their beliefs on some recondite topic B are each principally determined by some other issue (C), and that it is the same C for both of them.
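A toy simulation of this structural claim (entirely my own construction, with arbitrary weights; it is only meant to make the dense-versus-sparse contrast vivid):

    # Toy model: treat belief B as the sign of a sum of signed log-odds
    # contributions from individual considerations. Call a consideration a
    # 'crux' if reversing it alone would flip the overall belief. With one
    # dominant consideration a crux almost always exists; with many middling
    # considerations of similar size it rarely does, even at similar overall
    # confidence in B.
    import random

    def has_crux(contributions):
        total = sum(contributions)
        if total <= 0:
            return None                      # person doesn't actually believe B; skip this trial
        return any(c > total / 2 for c in contributions)

    def crux_rate(make_web, trials=5000):
        results = [has_crux(make_web()) for _ in range(trials)]
        results = [r for r in results if r is not None]
        return sum(results) / len(results)

    dense = lambda: [random.gauss(0.3, 0.3) for _ in range(10)]    # many middling considerations
    sparse = lambda: [random.gauss(2.5, 0.5)] + [random.gauss(0.05, 0.1) for _ in range(9)]  # one dominant consideration

    print('dense web  crux rate:', crux_rate(dense))    # typically low
    print('sparse web crux rate:', crux_rate(sparse))   # typically close to 1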

This is hard to make very crisp, as (among other things) the 'space of all topics on which reasonable people disagree' is hard to pin down. Beyond appeals to my own experience and introspection ("do you really find your belief in, let's say, some political view like gay marriage or abortion depends on a single consideration to such a degree that, if it were refuted, you would change your view?"), I'd want to marshal a couple of other considerations.

  1. When one looks at a topic in philosophy, or science, or many other fields of enquiry, one usually sees a very one-to-many relationship of the topic to germane considerations. A large number of independent lines of evidence support the theory of evolution; a large number of arguments regarding god's existence in philosophy receive scrutiny (and in turn they spawn a one-to-many relationship of argument to objections, objection to counter-objections). I suggest this offers analogical evidence in support of my thesis.

  2. Constantin's report of double cruxing (which has been used a couple of times as an exemplar in other threads) seems to follow the pattern I expect. I struggle to identify a double crux in the discussion Constantin summarizes: most of the discussion seems to involve whether Salvatier's intellectual project is making much progress, along with a host of subsidiary considerations (e.g. how much to weigh 'formal accomplishments', the relative value of more speculative efforts on far future considerations, etc.), but it is unclear to me whether, if Constantin were persuaded Salvatier's project was making good progress, this would change her mind about the value of the rationalist intellectual community (after all, one good project may not be adequate 'output'), or vice versa (even if Salvatier recognises his own project was not making good progress, the rationality community might still be a fertile ground to cultivate his next attempt, etc.).

What comprises double-crux?

I took the numbered list of my counter-proposal to have a 25% overlap with double crux (i.e. the part about realising where your credences vary considerably), not 85%. Allow me to be explicit about how I see points 2-4 in my list as standing in contradistinction to the 'double crux algorithm':

  • There's no assumption of an underlying single 'crux of the matter' between participants, or for either individually.

  • There's no necessity for a given consideration (even the strongest identified) to be individually sufficient to change one's mind about B.

  • There's also no necessity for the strongest considerations proposed by X and Y to have common elements.

  • There's explicit consideration of credence resilience. Foundational issues may be 'double cruxes' in that (e.g.) my views on most applied ethics questions would change dramatically if I were persuaded of the virtue ethics my interlocutor holds, but one often makes more progress discussing a less resilient non-foundational claim even if the 'payoff', in terms of the subsequent credence change in the belief of interest, is lower.

This may partly be explained by a broader versus narrower conception of double crux. I take the core idea of double crux to be 'find some C on which your disagreement over B relies, then discuss C' (this did, in my defense, comprise the whole of the 'how to play' section in the initial write-up). I take you to hold a broader view, where double crux incorporates other related epistemic practices, and has value in toto.

My objection is expressly this. Double crux is not essential for these incorporated practices. So one can compare discussion with the set of these other practices to this set with the addition of double crux. I aver the set sans double crux will lead to better discussions.

Pedagogy versus performance

I took double crux to be mainly proposed as a leading strategy to resolve disagreement. Hence the comparison to elite philosophers was to suggest it wasn't a leading strategy, by pointing to something better. I see from this comment (and the one you split off into its own thread) that you see it more in a pedagogical role - even if elite performers do something different, it does valuable work in improving skills. Although I included a paragraph about its possible pedagogical value (admittedly one you may have missed, as I started it with a self-indulgent swipe at the rationalist community), I would have focused more on this area had I realised it was CFAR's main contention.

I regret not to surprise you with doubts about the pedagogical value as well. These mostly arise from the above concerns: if double cruxes are as rare as I propose, it is unclear how searching for them is that helpful an exercise. A related worry (related to the discussion at the top of this reply) is that this seems to entail increasing reliance on private evidence regarding whether the technique works: in-principle objections to the 'face value' of the technique apply less (as it is there to improve skills rather than being a proposal for what the 'finished article' should look like); adverse reports from non-CFAR alums don't really matter (you didn't teach them, so it is no surprise they don't get it right). What one is left with is the collective impressions of instructors, and the reports of the students.

I guess I have higher hopes for transparency and communicability of 'good techniques'. I understand CFAR is currently working on further efforts to evaluate itself. I hope to be refuted by the forthcoming data.

Replies from: ozymandias, Duncan_Sabien
comment by ozymandias · 2017-10-08T22:50:32.781Z · LW(p) · GW(p)

I want to bring up sequence thinking and cluster thinking, which I think are useful in understanding the disagreement here. As I understand it, Duncan argues that sequence thinking is more common than cluster thinking, and you're arguing the converse.

I think most beliefs can be put in either a cluster-thinking or a sequence-thinking framework. However, I think that (while both are important and useful) cluster thinking is generally more useful for coming up with final conclusions. For that reason, I'm suspicious of double crux, because I'm worried that it will cause people to frame their beliefs in a sequence-thinking way and feel like they should change their beliefs if some important part of their sequence was proven wrong, even though (I think) using cluster thinking will generally get you more accurate answers.

Replies from: dxu
comment by dxu · 2017-10-08T23:51:12.237Z · LW(p) · GW(p)

As I understand it, Duncan argues that sequence thinking is more common than cluster thinking, and you're arguing the converse.

This looks remarkably like an attempt to identify a crux in the discussion. Assuming that you're correct about double-cruxing being problematic due to encouraging sequence-like thinking: isn't the quoted sentence precisely the kind of simplification that propagates such thinking? Conversely, if it's not a simplification, doesn't that provide (weak) evidence in favor of double-cruxing being a useful tool in addressing disagreements?

Replies from: ozymandias
comment by ozymandias · 2017-10-09T00:03:01.517Z · LW(p) · GW(p)

I think that sequence thinking is important and valuable (and probably undersupplied in the world in general, even while cluster thinking is undersupplied in the rationalist community in specific). However, I think both Thrasymachus and Duncan are doing cluster thinking here-- like, if Duncan were convinced that cluster thinking is actually generally a better way of coming to final decisions, I expect he'd go "that's weird, why is CFAR getting such good results from teaching double crux anyway?" not "obviously I was wrong about how good double crux is." Identifying a single important point of disagreement isn't a claim that it's the only important point of disagreement.

Replies from: Duncan_Sabien, dxu
comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-09T02:24:08.172Z · LW(p) · GW(p)

I like this point a lot, and your model of me is accurate, at least insofar as I'm capable of simming this without actually experiencing it. For instance, I have similar thoughts about some of my cutting/oversimplifying black-or-white heuristics, which seem less good than the shades-of-gray epistemics of people around me, and yet often produce more solid results. I don't conclude from this that those heuristics are better, but rather that I should be confused about my model of what's going on.

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2017-10-09T02:30:01.234Z · LW(p) · GW(p)

that makes a ton of sense for theoretically justified reasons I don't know how to explain yet. anyone want to collab with me on a sequence? I'm a bit blocked on 1. exactly what my goal is and 2. what I should be practicing in order to be able to write a sequence (given that I'm averse to writing post-style content right now)

comment by dxu · 2017-10-09T00:18:28.346Z · LW(p) · GW(p)

Naturally, and I wasn't claiming it was. That being said, I think that when you single out a specific point of disagreement (without mentioning any others), there is an implication that the mentioned point is, if not the only point of disagreement, then at the very least the most salient point of disagreement. Moreover, I'd argue that if Duncan's only recourse after being swayed regarding sequence versus cluster thinking is "huh, then I'm not sure why we're getting such good results", then there is a sense in which sequence versus cluster thinking is the only point of disagreement, i.e. once that point is settled, Duncan has no more arguments.

(Of course, I'm speaking purely in the hypothetical here; I'm not trying to make any claims about Duncan's actual epistemic state. This should be fairly obvious given the context of our discussion, but I just thought I'd throw that disclaimer in there.)

Replies from: Duncan_Sabien
comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-09T02:24:47.586Z · LW(p) · GW(p)

Oh, hmm, this is Good Point Also.

comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T23:35:45.266Z · LW(p) · GW(p)

First off, a symmetric apology for any inflammatory or triggering nature in my own response, and an unqualified acceptance of your own, and reiterated thanks for writing the post in the first place, and thanks for engaging further. I did not at any point feel personally attacked or slighted; to the degree that I was and am defensive, it was over a fear that real value would be thrown out or socially disfavored for insufficient reason.

(I note the symmetrical concern on your part: that real input value will be thrown out or lost by being poured into a socially-favored-for-insufficient-reason framework, when other frameworks would do better. You are clearly motivated by the Good.)

You're absolutely right that the relative lack of double cruxes ought to be on my list of cruxes. It is in fact, and I simply didn't think to write it down. I highly value double crux as a technique if double cruxes are actually findable in 40-70% of disagreements; I significantly-but-not-highly value double crux if double cruxes are actually findable in 25-40% of disagreements; I lean toward ceasing to investigate double crux if they're only findable in 10-25%, and I am confused if they're rarer than 10%.

By contrast, I propose the relevant web tends to be much denser (or, alternatively, the 'power' of the population of reasons that may alter one's credence in B is fairly evenly distributed). Credence in B arises from a large number of considerations that weigh upon it, each of middling magnitude. Thus even if I am persuaded one is mistaken, my credence in B does not change dramatically. It follows that 'cruxes' are rare, and rarer still is the case where two people discover that their beliefs on some recondite topic B are each principally determined by some other issue (C), and that it is the same C for both of them.

I agree that this is a relevant place to investigate, and at the risk of proving you right at the start, I add it to my list of things which would cause me to shift my belief somewhat.

The claim that I derive from "there's surprisingly often one crux" is something like the following: that, for most people, most of the time, there is not in fact a careful, conscious, reasoned weighing and synthesis of a variety of pieces of evidence. That, fompmott, the switch from "I don't believe this" to "I now believe this" is sudden rather than gradual, and, post-switch, involves a lot of recasting of prior evidence and conclusions, and a lot of further confirmation-biased integration of new evidence. That, fompmott, there are a lot of accumulated post-hoc justifications whose functional irrelevance may not even be consciously acknowledged, or even safe to acknowledge, but whose accumulation is strongly incentivized given a culture wherein a list of twenty reasons is accorded more than 20x the weight of a list of one reason, even if nineteen of those twenty reasons are demonstrated to be fake (e.g. someone accused of sexual assault, acquitted due to their ironclad alibi that they were elsewhere, and yet the accusation still lingers because of all the sticky circumstantial bits that are utterly irrelevant).

In short, the idealized claim of double crux is that people's belief webs look like this:

[diagram: a sparse belief web - B resting on a single load-bearing crux]

Whereas I read you claiming that people's belief webs look like this:

[diagram: a dense belief web - B resting on many considerations of middling weight]

And on reflection and in my experience, the missing case that tilts toward "double crux is surprisingly useful" is that a lot of belief webs look like this:

[diagram: a belief web with many considerations, one of which far outweighs the rest]

... where they are not, in fact, simplistic and absolutely straightforward, but there often is a crux which far outweighs all of the other accumulated evidence.

I note that, if correct, this theory would indicate that e.g. your average LessWronger would find less value in double crux than your average CFAR participant (who shares a lot in common with a LessWronger but in expectation is less rigorous and careful about their epistemics). This being because LessWrongers try very deliberately to form belief webs like the first image, and when they have a belief web like the third image they try to make that belief feel to themselves as unbalanced and vulnerable as it actually is. Ergo, LessWrongers would find the "Surprise! You had an unjustified belief!" thing happening less often and less unexpectedly.

If I'm reading you right, this takes care of your first bullet point above entirely and brings us closer to a mutual understanding on your second bullet point. Your third bullet point remains entirely unaddressed in double crux except by the fact that we often have common cultural pressures causing us to have aligned-or-opposite opinions on many matters, and thus in practice there's often overlap. Your fourth bullet point seems both true and a meaningful hole or flaw in double crux in its idealized, Platonic form, but also is an objection that in practice is rather gracefully integrated by advice to "keep ideals in mind, but do what seems sane and useful in the moment."

To the extent that those sections of your arguments which miss were based on my bad explanation, that's entirely on me, and I apologize for the confusion and the correspondingly wasted time (on stuff that proved to be non-crucial!). I should further clarify that the double crux writeup was conceived in the first place as "well, we have a thing that works pretty well when transmitted in person, but people keep wanting it not transmitted in person, partly because workshops are hard to get to even though we give the average EA or rationalist who can't afford it pretty significant discounts, so let's publish something even though it's Not Likely To Be Good, and let's do our best to signal within the document that it's incomplete and that they should be counting it as 'better than nothing' rather than judging it as 'this is the technique, and if I'm smart and good and can't do it from reading, then that's strong evidence that the technique doesn't work for me.'" I obviously did not do enough of that signaling, since we're here.

Re: the claim "Double crux is not essential for these incorporated practices." I agree wholeheartedly on the surface—certainly people were doing good debate and collaborative truthseeking for millennia before the double crux technique was dreamed up.

I would be interested in seeing a side-by-side test of double crux versus direct instruction in a set of epistemic debate principles, or double crux versus some other technique that purports to install the same virtues. We've done some informal testing of this within CFAR—in one workshop, Eli Tyre and Lauren Lee taught half the group double crux as it had always previously been taught, while I discussed with the other half all of the ways that truthseeking conversations go awry, and all of the general desiderata for a positive, forward-moving experience. As it turned out, the formal double crux group did noticeably better when later trying to actually resolve intellectual disagreement, but the strongest takeaway we got from it was that the latter group didn't have an imperative to operationalize their disagreement into concrete observations or specific predictions, which seems like a non-central confound to the original question.

As for "I guess I have higher hopes for transparency and communicability of 'good techniques'," all I can do is fall back yet again on the fact that, every time skepticism of double crux has reared its head, multiple CFAR instructors and mentors and comparably skilled alumni have expressed willingness to engage with skeptics, and produce publicly accessible records and so forth. Perhaps, since CFAR's the one claiming it's a solid technique, 100% of the burden of creating such referenceable content falls on us, but one would hope that the relationship between enthusiasts and doubters is not completely antagonistic, and that we could find some Robin Hansons to our Yudkowskys, who are willing to step up and put their skepticism on the line as we are with our confidence.

As of yet, not a single person has sent me a request of the form "Okay, Duncan, I want to double crux with you about X such that we can write it down or video it for others to reference," nor has anyone sent me a request of the form "Okay, Duncan, I suspect I can either prove double crux unworth it or prove [replacement Y] a more promising target. Let's do this in public?"

I really really do want all of us to have the best tool. My enthusiasm for double crux has nothing to do with an implication that it's perfect, and everything to do with a lack of visibly better options. If that's just because I haven't noticed something obvious, I'd genuinely appreciate having the obvious pointed out, in this case.

Thanks again, Thrasymachus.

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-09T22:21:17.934Z · LW(p) · GW(p)

Thank you for your gracious reply. I read in it a couple of overarching themes, in which I would like to frame my own: the first is the 'performance issue' (i.e. 'how good is double crux at resolving disagreement/getting closer to the truth?'); the second is the 'pedagogical issue' (i.e. 'how good is double crux at the second-order task of making people better at resolving disagreement/getting closer to the truth?'). I now better understand that you take the main support for double crux to draw upon the latter issue, but I'd also like to press on some topics about the former on which I believe we disagree.

How well does double crux perform?

Your first two diagrams precisely capture the distinction I have in mind (I regret not having thought to draw my own earlier). If I read the surrounding text right (I'm afraid I don't know what 'fompmott' means, and google didn't help me), you suggest that even if better cognisers find their considerations form a denser web like the second diagram, double-crux-amenable 'sparser' webs are still common in practice, perhaps due to various non-rational considerations. You also add:

I note that, if correct, this theory would indicate that e.g. your average LessWronger would find less value in double crux than your average CFAR participant (who shares a lot in common with a LessWronger but in expectation is less rigorous and careful about their epistemics). This being because LessWrongers try very deliberately to form belief webs like the first [I think second? - T] image, and when they have a belief web like the third image they try to make that belief feel to themselves as unbalanced and vulnerable as it actually is. Ergo, LessWrongers would find the "Surprise! You had an unjustified belief!" thing happening less often and less unexpectedly.

This note mirrors a further thought I had (cf. Ozymandias's helpful remark in a child comment about sequence versus cluster thinking). Yet I fear this poses a further worry for the 'performance issue' of double crux, as it implies that the existence of cruxes (or double cruxes) may be indicative of pathological epistemic practices. A crux implies something like the following:

  1. You hold some belief B you find important (at least, important enough you think it is worth your time to discuss).

  2. Your credence in B depends closely on some consideration C.

  3. Your credence in C is non-resilient (at least sufficiently non-resilient you would not be surprised to change your mind on it after some not-unduly-long discussion with a reasonable interlocutor).*

* What about cases where one has a resilient credence in C? Then the subsequent worries do not apply. However, I suspect these cases often correspond to "we tried to double crux and we found we couldn't make progress on resolving our disagreement about theories of truth/normative ethics/some other foundational issue".

It roughly follows from this that you should have low resilience in your credence in B. As you note, this is a vulnerable position, and having non-resilient credences in important Bs is generally to be avoided.
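One way to make that step explicit, in a deliberately crude two-node model (my own gloss, not part of the original exposition): writing

$$P(B) = P(B \mid C)\,P(C) + P(B \mid \neg C)\,\bigl(1 - P(C)\bigr),$$

a shift of $\delta$ in one's credence in C moves one's credence in B by roughly $\bigl(P(B \mid C) - P(B \mid \neg C)\bigr)\,\delta$. So if the conditional gap is large (which is what makes C a crux) and one's credence in C is non-resilient, one's credence in B inherits that non-resilience.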

As a tool of diagnosis, double crux might be handy (i.e. "This seems to be a crux for me, yet cruxes aren't common among elite cognisers - I should probably go check whether they agree this is the crux of this particular matter, and if not maybe see what else they think bears upon B besides C"). Yet (at least per the original exposition) it seems to be more a tool for subsequent 'treatment', and using it that way could make things worse, not better.

If X and Y find they differ on some crux, but also understand that superior cognisers tend not to have this crux, and instead distribute support across a variety of considerations, it seems a better idea for them to explore other candidate considerations rather than trying to resolve their disagreement re. C. If they instead do the double-cruxy thing and try to converge on C, they may be led up the epistemic garden path. They may come to agree with one another on C (and thus B), and thereby increase the resilience of their credences in C (and thus B), yet they also confirm a mistaken web of belief around B which wrongly accords too much weight to C. If (as I suggest) at least half the battle in having good 'all things considered' attitudes to recondite matters comprises getting the right weights for the relevant considerations, double crux may celebrate them converging further away from the truth. (I take this idea to be expressed in kernel form in Ozymandias's worry about double crux displacing more-expectedly-accurate cluster thinking with less-expectedly-accurate sequence thinking.)

How good is double crux at 'levelling people up at rationality'?

The substantial independence of the 'performance issue' from the 'pedagogical issue'

In the same way that practising scales may not be the best music but makes one better at playing music, double crux may not be the best discussion technique but may make one better at discussions. This seems fairly independent of its 'object-level performance' (although I guess if the worry above is on the right track, we would be very surprised if a technique that on the object level leads beliefs to track truth more poorly nonetheless had a salutary second-order effect).

Thus comparisons to the practices of elite philosophers (even if they differ) are inapposite - especially as, I understand from one of them, the sort of superior pattern I observe occurs only at the far right tail even among philosophers (i.e. 'world-class', as you write, rather than 'good', as I write in the OP). It would obviously be a great boon if one could become some fraction more like someone like Askell or Shulman without either their profound ability or the time they have invested in these practices.

On demurring the 'double crux challenge'

I regret I don't think it would be hugely valuable to 'try double crux' with an instructor in terms of resolving this disagreement. One consideration (on which more later) is that, conditional on my not being persuaded by a large group of people who self-report that double crux is great, I shouldn't change my mind (for symmetry reasons) if this number increases by one other person, or if it comes to include me. Another is that the expected yield may not be great, at least in one direction: although I hope I am not 'hostile' to double crux, it seems one wouldn't be surprised if it didn't work with me, even if it's generally laudable.

Yet I hope I am not quite as recalcitrant as 'I would not believe until I felt the stigmata with my own hands'. Apart from a more publicly legible case (infra), I'm a bit surprised at the lack of 'public successes' of double cruxing (although this may confuse performance versus pedagogy). In addition to Constantin, Raemon points to their own example with gjm. Maybe I'm only seeing what I want to, but I get a similar impression. They exhibit a variety of laudable epistemic practices, but I don't see a crux or double crux (what they call 'cruxes' seem to be more considerations they take to be important).

The methods of rational self-evaluation

You note a head-to-head comparison between double crux and an approximate sham-control seemed to favour double crux. This looks like interesting data, and it seems a pity it emerges in the depths of a comment thread (ditto the 'large n of successes') rather than being written up and presented - it seems unfortunate that the last 'public evaluation report' is about 2 years old. I would generally urge trying to produce more 'public evidence' rather than the more private "we've generally seen this work great (and a large fraction of our alums agree!)"

I recognise that "Provide more evidence to satisfy outside sceptics" should not be high on CFAR's priority list. Yet I think it is instrumental to other important goals instead. Chiefly: "Does what we are doing actually work?"

You noted in your initial reply undercutting considerations to the 'we have a large n of successes' claim, yet you framed these in a way that suggests they would often need to amount to a claim of epistemic malice (i.e. 'either CFAR is lying or participants are being socially pressured into reporting a falsehood'). I don't work at a rationality institute or specialise in rationality, but on reflection I find this somewhat astonishing. My impression of cognitive biases was that they are much more insidious, that falling prey to them is the rule rather than the exception, and that sincere good faith is not adequate protection (is this not, in some sense, what CFAR's casus belli is predicated upon?).

Although covered en passant, let me explicitly (although non-exhaustively) list things which might bias more private evidence of the type CFAR often cites:

  1. CFAR staff (collectively) are often responsible for developing the interventions they hope will improve rationality. One may expect them to be invested in them, and more eager to see that they work than see they don't (c.f. why we prefer double-blinding over single-blinding).

  2. Other goods CFAR enjoys (i.e. revenue/funding, social capital) seem to go up the better the results of their training. Thus CFAR staff have a variety of incentives pushing them to over-report how good their 'product' is (c.f. why conflicts of interest are bad, the general worries about pharma-funded drug trials).

  3. Many CFAR participants have to spend quite a lot of money (i.e. fees and travel) to attend a workshop. They may fear looking silly if it turns out that after all this it didn't do anything, and so are incentivised to assert it was much more helpful than it actually was (cf. choice-supportive bias).

  4. There are other aspects of CFAR workshops that participants may enjoy independent of the hoped-for improvement of their rationality (e.g. hanging around interesting people like them, personable and entertaining instructors, romantic entanglements). These extraneous benefits may nonetheless bias upwards their estimate of how effective CFAR workshops are at improving their rationality (cf. the halo effect).

I am sure there are quite a few more. One need not look that hard to find lots of promising studies supporting a given intervention undermined by any one of these.

The reference class of interventions with "a large corpus of (mainly self-reported) evidence of benefit, but susceptible to these limitations" is dismal. It includes many branches of complementary medicine. It includes social programs (e.g. 'scared straight') that we now know to be extremely harmful. It includes a large number of ineffective global poverty interventions. Beyond cautionary tales, I aver these approximate the modal member of the class: when the data is so subjective, and the limitations this severe, one should expect the thing in question doesn't actually work after all.

I don't think this expectation changes when we condition on the further rider "And the practitioners really only care about the truth re. whether the intervention works or not." What I worry about going on under the hood is a stronger (and by my lights poorly substantiated) claim of rationalist exceptionalism: "Sure, although cognitive biases plague entire fields of science and can upend decades of results, and we're appropriately quick to point out the risk of bias in work done by outsiders, we can be confident that as we call ourselves rationalists/we teach rationality/we read the sequences/etc. we are akin to Penelope refusing her army of suitors - essentially incorruptible. So when we do similarly bias-susceptible sorts of things, we should give one another a pass."

I accept 'gold standard RCTs' are infeasible (very pricey, and how well can one really do 'sham CFAR'?), yet I aver there is quite a large gap between this ideal of evidence and the actuality (i.e. evidence kept in house, which emerges via reference in response to criticism), and that this gap could be bridged by doing more write-ups, looking for harder metrics that put one more reliably in touch with reality, and so on. I find it surprisingly incongruent that the sort of common cautions about cognitive biases - indeed, common cautions that seem predicates for CFAR's value proposition (e.g. "Good faith is not enough", "Knowing about the existence of biases does not make one immune to them", Feynman's dictum that 'you are the easiest person to fool') - are not reflected in its approach to self-evaluation.

If nothing else, opening up more of CFAR's rationale, evidence, etc. to outside review may allow it to reap more of the benefits of outside critique. Insofar as you found this exchange valuable, one may anticipate greater benefit from further interaction with higher-quality sceptics.

comment by habryka (habryka4) · 2017-10-08T19:13:28.045Z · LW(p) · GW(p)

I think I feel similarly to lahwran. You made a lot of good points, but the comment feels aggressive in a way that makes me doubt the discussion following it will be good. Not downvoting or upvoting either way because of this.

Replies from: Duncan_Sabien, dxu
comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T19:23:19.164Z · LW(p) · GW(p)

Sensible; I have been going through and making edits to reduce aggressiveness (e.g. removing italics, correcting typical-mind fallacies, etc.). I like having these comments here as a record of what was there before the edits occurred.

Replies from: habryka4
comment by habryka (habryka4) · 2017-10-08T19:33:19.585Z · LW(p) · GW(p)

Upvoted the top level comment after the edit.

comment by dxu · 2017-10-08T19:21:32.981Z · LW(p) · GW(p)

I would argue that Thrasymachus' initial post also carried an undertone of aggression (that Duncan may have picked up on, either consciously or subconsciously), but that this was possibly obscured and/or distracted from by its very formal tone.

(Whether you prefer veiled or explicit aggression is a "pick your poison" kind of choice.)

Replies from: habryka4
comment by habryka (habryka4) · 2017-10-08T19:32:27.561Z · LW(p) · GW(p)

This seems correct to me. And I originally didn't upvote the top-level post either.

comment by the gears to ascension (lahwran) · 2017-10-08T18:32:18.331Z · LW(p) · GW(p)

I upvoted you, then changed my mind about doing so because of intense emotional content. From both sides, this feels like a fight. I have also retracted my vote on the main post.

I agree that you have good points, but I don't feel able to engage with them without it feeling like fighting/like tribal something or other.

Replies from: Duncan_Sabien
comment by Duncan Sabien (Deactivated) (Duncan_Sabien) · 2017-10-08T18:35:17.500Z · LW(p) · GW(p)

Thanks for both your policy and your honesty. The bind I feel like I'm in is that, in this case, the way I'd back away from a fight and move myself toward productive collaboration is to offer to double crux, and it seems like in this case that would be inappropriate/might be received as itself a sort of sneaky status move or an attempt to "win."

If Thrasymachus or anyone else has specific thoughts on how best to engage, I commit to conforming to those thoughts, as a worthwhile experiment. I am interested in the actual truth of the matter, and most of my defensiveness centers around not wanting to throw away the accumulated value we have so far (as opposed to something something status something something ownership).

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2017-10-08T18:40:15.741Z · LW(p) · GW(p)

I think, based on my reading of Thrasymachus's post, that they think there's a reasonable generalization of double crux that has succeeded in the real world; that it's too hard to get to that generalization from double crux; but that there is a reasonable way for disagreeing people to engage.

I am censoring further things I want to say, to avoid pushing on the resonance of tribalism-fighting.

Replies from: dxu
comment by dxu · 2017-10-08T19:00:02.212Z · LW(p) · GW(p)

I am censoring further things I want to say, to avoid pushing on the resonance of tribalism-fighting.

Out of curiosity, do you think that inserting an explicit disclaimer like this helps to reduce feelings of tribal offense? If so, having now written such a disclaimer, do you think it would be worth it to share more of your thoughts on the matter?

(I'll be honest; my main motivator for asking this is because I'm curious and want to read the stuff you didn't say. But even taking that into consideration, it seems to me that the questions I asked have merit.)

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2017-10-08T20:11:08.438Z · LW(p) · GW(p)

no, I think it creates a small fraction of what it would if I'd said the thing.

comment by sarahconstantin · 2017-10-10T03:17:36.598Z · LW(p) · GW(p)

I think the key contribution of Double Crux is "diagram what the argument even is, before making it." If you try to make your argument at the same time as you clarify what the precise topic is, you risk getting confused more easily. "Get on the same page first, then debate" also is a practical way to motivate collaborative rather than competitive discussion, in a way that still retains a driving force towards clarity (whereas many other "be nice to each other" priming techniques point you away from being clear.)

comment by Vaniver · 2017-10-09T19:28:24.704Z · LW(p) · GW(p)

X and Y both offer what appear (to their lights) the strongest considerations that push them to a higher/lower credence on B.

I think this is a good example of something where the text, interpreted literally, leads to bad technique, and doing it right is relying on a skill that's perhaps invisible (but is part of the long experience of the philosophical tradition).

A core distinction between "non-Bayesian reasoning" and "Bayesian reasoning," as I see it, is whether hypotheses are judged against themselves or against each other. The first involves desirable properties like consistency and exceeding probability thresholds; the second involves comparison of likelihoods.

Expressed mathematically, this is the difference between the probability of observations given an explanation, P(o|E), and the probability of an explanation given the observations, P(E|o).
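
For concreteness, the standard relationship between the two quantities (this is just Bayes' theorem, nothing specific to double crux) is P(E|o) = P(o|E) P(E) / P(o), where P(o) = Σ_E' P(o|E') P(E'). The sum over rival explanations E' in the denominator is where the comparison between hypotheses enters: an explanation can make the observations quite likely and still end up with low posterior credence if some alternative makes them likelier still.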

So when a good philosopher considers a belief, like 'the group should order pizza,' they are reflexively considering consequences that would be lethal to that belief and comparing it against other hypotheses, attempting to establish not just that there are no lethal consequences and that the hypothesis is consistent with the available evidence, but that it is more easily consistent with it than other hypotheses are.

But when a good lawyer considers a belief, that isn't what's happening. They're doing something like "how do I make the desired explanation seem to have very high (or low) credence?" There would be lots of statements like "if my client is innocent, then 1 is equal to 1", without noting that the mirroring statement "if my client is guilty, then 1 is equal to 1" is equally true. (Against savvy opponents, you might not try to slip that past them.)

The point of looking for cruxes is that it gets people out of lawyer-mode, where they're looking for observations that their theory predicts, and into 'look into the dark' mode, where they're looking for observations that their theory anti-predicts. If my belief that the group should order pizza strongly anti-predicts that the pizza place is closed, and you believe that the pizza place is closed, then that's an obviously productive disagreement for us to settle. And trying to find one where the opponent disagrees keeps it from being empty--"well, my belief that we should order pizza anti-predicts that 1=0."

For many recondite topics I think about, my credence it in arises from the balance of a variety of considerations pointing in either direction. Thus whether or not I believe 'MIRI is doing good work', 'God exists', or 'The top marginal tax rate in the UK should be higher than its current value' does not rely on a single consideration or argument, but rather its support is distributed over a plethora of issues. Although in some cases undercutting what I take as the most important consideration would push my degree of belief over or under 0.5, in other cases it would not.

One of the things that I think becomes clear in practice is that the ideal form (where a crux would completely change my mind) degrades gracefully. If I think that the group should get pizza because of twelve different factors, none of which could be individually decisive, leading to overwhelming odds in favor of pizza, then I can likely still identify the factor which would most change the odds if it flipped. (And, since those factors themselves likely have quantitative strengths as opposed to 'true' or 'false' values, this becomes a slope rather than a discrete value.)

This seems to be driving in the right direction--when sorting my beliefs by relevance, the criterion I'm using is 'ability to change my mind,' and then checking to see if you actually believe differently. This is somewhat more cooperative and useful than sorting my beliefs by 'ability to convince my opponent,' since I don't have as good access to the counterfactuals of my opponent's models.
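
A minimal sketch of one way this could be operationalised (my own toy formalisation, with made-up factors and weights, not anything from CFAR's materials): treat each consideration as contributing a log-odds weight towards the belief, then rank considerations by how much the overall credence would move if that consideration were dropped.

    import math

    def credence(weights, prior_log_odds=0.0):
        """Combine independent log-odds contributions into a probability."""
        total = prior_log_odds + sum(weights.values())
        return 1.0 / (1.0 + math.exp(-total))

    # Hypothetical factors bearing on "the group should order pizza",
    # each with a log-odds weight (positive = favours pizza).
    factors = {
        "most people said they like pizza": 1.5,
        "the pizza place is probably open": 1.0,
        "pizza is cheap relative to the alternatives": 0.7,
        "one person is tired of pizza": -0.5,
    }

    baseline = credence(factors)

    # 'Slope': how much the overall credence moves if one factor is dropped,
    # a crude stand-in for changing one's mind about that factor.
    slopes = {
        name: abs(baseline - credence({k: v for k, v in factors.items() if k != name}))
        for name in factors
    }

    for name, slope in sorted(slopes.items(), key=lambda kv: -kv[1]):
        print(f"{slope:.3f}  {name}")

Sorting by this quantity picks out the consideration whose reversal would most move the credence even when no single factor is decisive, which seems like one reading of the 'graceful degradation' described above.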

I also notice that I can't predict whether you'll look at the "prioritize discussion based on the slope of your possible update combined with the other party's belief" version that I give here and say "okay, but that's not double crux" or "okay, but the motion of double crux doesn't point there as efficiently as something else" or "that doesn't seem like the right step in the dance, tho."

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-10T00:04:29.637Z · LW(p) · GW(p)

I also notice that I can't predict whether you'll look at the "prioritize discussion based on the slope of your possible update combined with the other party's belief" version that I give here and say "okay, but that's not double crux" or "okay, but the motion of double crux doesn't point there as efficiently as something else" or "that doesn't seem like the right step in the dance, tho."

I regret it is unclear what I would say given what I have written, but it is the former ("okay, but that's not double crux"). I say this for the following reasons.

  1. The consideration with the greatest slope need not be a crux. (Your colleague Dan seems to agree with my interpretation that a crux should be some C necessary for one's attitude over B, so that if you changed your mind about C you'd change your mind about B).

  2. There doesn't seem to be a 'double' either: identifying the slopiest consideration regarding one's own credence doesn't seem to demand comparing this to the beliefs of any particular interlocutor to look for shared elements.

I guess (forgive me if I'm wrong) what you might say is that although what you describe may not satisfy what was exactly specified in the original introduction to double crux, this was a simplification and the two are essentially the same thing. Yet I take it that what distinguishes double crux from related and anodyne epistemic virtues (e.g. 'focus on important, less-resilient considerations', 'don't act like a lawyer') is the 'some C for which if ¬C then ¬B' characteristic. As I fear may be abundantly obvious, I find eliding this distinction confusing rather than enlightening: if (as I suggest) the distinguishing characteristic of double crux works neither as a good epistemic tool nor as good epistemic training, that there may be some nearby epistemic norm which does one or both of these is little consolation.

comment by Raemon · 2017-10-09T04:16:17.020Z · LW(p) · GW(p)

gjm and I recently completed an Actual Example of Double Crux.

I posted it to Meta instead of the front page because it was more of a "talk about the community itself" post (which we're trying not to have on the front page). But linking it here seemed like a reasonable compromise.

Replies from: Conor Moreton
comment by Conor Moreton · 2017-10-09T09:22:52.955Z · LW(p) · GW(p)

I have looked for this mysterious "post to Meta" option, and I am confused—is it a thing that is available only to mods? I have neither been able to post to Meta, nor peruse things in Meta, and am afraid it is because I am uniquely stupid.

Replies from: Benito
comment by Ben Pace (Benito) · 2017-10-09T09:31:56.556Z · LW(p) · GW(p)

<3 In the menu on the top left, there's a 'meta' button, and the page it leads you to looks a lot like the front page but with all different posts. When you're in meta, there's an extra 'new post' button under 'recent meta posts', and this will post things to meta.

comment by Unnamed · 2017-10-09T20:49:21.797Z · LW(p) · GW(p)

(This is Dan from CFAR.)

It seems like it might be useful to set aside discussion of the practice of "double crux" and just focus on the concept of a "crux" for a while. My impression is that "crux" is a very useful concept to have (even without double crux), that it is a simpler/easier concept to get than double crux, and that lots of LWers who have heard of it still don't have a very clear idea of what a "crux" is. (Relatedly, at CFAR workshops these days, the concept of a crux gets introduced before the class on double crux.)

Something is a crux for a belief for a person if changing their mind about the crux will change their mind about that belief. "Cruxiness" is actually a matter of degree, e.g. going from "90% sure of X" to "60% sure of X" is changing your mind less than a shift from 90% to 10%, but more than a shift from 90% to 85%.

A typical belief has many cruxes. For example, if Ron is in favor of a proposal to increase the top marginal tax rate in the UK by 5 percentage points, his cruxes might include "There is too much inequality in the UK", "Increasing the top marginal rate by a few percentage points would not have much negative effect on the economy", and "Spending by the UK government, at the margin, produces value". If he thought that more inequality would be good for society then he would no longer favor increasing the top marginal rate. If he thought that increasing the top marginal rate would be disastrous for the UK economy then he would no longer favor increasing it (even if he didn't change his mind about there being too much inequality). If he thought that marginal government spending was worthless or harmful then he would no longer favor increasing taxes.

There are technically a whole bunch more cruxes, such as most radically skeptical scenarios. If Ron became convinced that he was dreaming and that "the UK" was just a figment of his imagination, then he would no longer favor increasing UK tax rates. So other cruxes of Ron's include things like "I am not dreaming", "I am not a Boltzmann brain," "The UK exists", etc. But cruxes like these are uninteresting and typically don't come to mind. The "interestingness" of a crux is also a matter of degree, and of context, in a way that is more complicated than just probabilities and hard to make precise.

To take another example: suppose that Melissa thinks she has a brilliant startup idea of making gizmos, and believes that it's the best career path for her to take right now. One crux is whether gizmos can be manufactured for less than a certain price. Another is whether many people are willing to buy gizmos for a certain price. If Melissa learned that many entrepreneurs had tried making gizmos and all of them had failed, and some of them were highly capable people who had the skills that she thinks of as her biggest strengths and had the ideas that she thinks of as her biggest insights about gizmos, that would also change her mind about this startup idea. If Melissa explained her startup idea to 5 specific friends of hers (who she sees as having good judgment, and expertise relevant to startups or to gizmos) and all 5 advised her against the startup, then that would also change her mind. So those also point to cruxes, of an "outside view" flavor rather than a "business model" flavor. And there are various other potentially interesting/relevant cruxes (e.g., she wouldn't start the company if she discovered that she had a $10 million per year job offer from Google, or that she had cancer, or that gizmos are terrible for people even if they're willing to pay a bunch of money for them).

Cruxes are really important. Lots of useful thinking involves trying to figure out what your cruxes are, or trying to gain information about a particular crux, or checking if the topic that you've been thinking about is a crux. These aren't the only useful kinds of thinking, of course (it's often also useful to try to get the lay of the land, to follow up on something you feel uneasy about, etc., etc.). But they're useful, and underutilized, and having a crystallized concept with a one-syllable label makes it easier to do more of them.

Replies from: Unnamed, Thrasymachus
comment by Unnamed · 2017-10-09T21:23:42.691Z · LW(p) · GW(p)

How are cruxes relevant in disagreements?

One issue is that people often spend a lot of time arguing about things that aren't cruxes for either of them. Two people who disagree about whether to increase the top marginal tax rate might get into a back-and-forth about the extent to which the higher rate will lead to rich people hiding their money in offshore banks, when the answer to that question wouldn't shift either of their views. Maybe they're talking about that topic because it seems like it should be an important consideration (even though it isn't crucial for either of them). Maybe one of them mentioned the topic briefly as part of a longer argument, and said something about it that the other person disagreed with and therefore responded to. Maybe one of them guessed that it was a crux for the other person and therefore chose to bring it up. Whatever the reason, this subtopic is mostly a waste of time and a distraction from a potentially more interesting conversation that they could be having. Focusing on cruxes helps to avoid these sidetracks because each person is frequently checking "is this a crux for me?" and occasionally asking "is that a crux for you?" (or some reduced-jargon alternative, like "So if you imagine one world where raising the top rate would mostly just lead to rich people hiding their money in offshore banks, and another world where that didn't happen, would your view on raising the rate be the same in both of those worlds?").

Focusing on your cruxes also flips around the typical dynamic of a disagreement. Normally, if Alice and Bob disagree about something, then for the most part Alice is trying to change Bob's mind and Bob is trying to change Alice's mind. If Alice is used to thinking about her cruxes, Alice can instead mostly be trying to change Alice's mind. "Raising the top marginal tax rate seems to me like a good idea, but Bob thinks otherwise - maybe there's something I'm missing?" Alice understands Alice's mind a lot better than Bob does, so she has the advantage in looking for what sorts of information might shift her views. Bob is helping her do it, collaboratively, noticing the places where her thinking seems funny, or where she seems to be missing information, or where her model of the world is different from his, and so on. But this looks very different from Bob taking the lead on changing Alice's views by taking his best guesses, often going mainly on his priors about what things tax-rate-increasers tend to be wrong about.

Obviously this is not the best way to approach every disagreement. In cases like negotiation you have other goals besides "improve my model of this aspect of the world"; if you don't have much respect for Bob's thinking then it may not be worth the trouble; and in online discussions with many people and lots of time lag this approach may be impractical. But in cases where you really care about getting the right answer (e.g., because your career success depends on it), and where the other person's head seems like one of the better sources of information available to you, focusing on your cruxes in a conversation about disagreements can be a valuable approach to take.

(I still haven't gotten into "double crux", and am not planning to.)

comment by Thrasymachus · 2017-10-09T23:37:35.462Z · LW(p) · GW(p)

Hello Dan,

I'm not sure whether these remarks are addressed 'as a reply' to me in particular. That you use the 'marginal tax rate in the UK' example, as I do, suggests this might be meant as a response. On the other hand, I struggle to locate the particular loci of disagreement - or rather, I see in your remarks an explanation of double crux which includes various elements I believe I both understand and object to, but not reasons that tell against my understanding or my objections (e.g. "you think double crux involves X, but actually it is X*, and thus your objection vanishes when this misunderstanding is resolved", "your objection to X is mistaken because Y", etc.). If this is a reply, I apologise for not getting it; if it is not, I apologise for my mistake.

In any case, I take the opportunity to concretely identify one aspect of my disagreement:

A typical belief has many cruxes. For example, if Ron is in favor of a proposal to increase the top marginal tax rate in the UK by 5 percentage points, his cruxes might include "There is too much inequality in the UK", "Increasing the top marginal rate by a few percentage points would not have much negative effect on the economy", and "Spending by the UK government, at the margin, produces value". If he thought that more inequality would be good for society then he would no longer favor increasing the top marginal rate. If he thought that increasing the top marginal rate would be disastrous for the UK economy then he would no longer favor increasing it (even if he didn't change his mind about there being too much inequality). If he thought that marginal government spending was worthless or harmful then he would no longer favor increasing taxes.

This seems to imply agreement with my take that cruxes (per how CFAR sees them) have the 'if you change your mind about this, you should change your mind about that' character, and so this example has the sequence-think-esque feature that these cruxes are jointly necessary for Ron's belief (i.e. if Ron thinks ¬A, ¬B, or ¬C, he should change his mind about the marginal tax rate). Yet by my lights it seems more typical that considerations like these exert weight upon the balance of reason, but not of such strength that their negation provides a decisive consideration against increasing taxes (e.g. it doesn't seem crazy for Ron to think "Well, I don't think inequality is a big deal, but other reasons nonetheless favour raising taxes", or "Even though I think marginal spending by the UK government is harmful, this negative externality could be outweighed by other considerations").

I think some harder data can provide better information than litigating hypothetical cases. If the claim that a typical belief has many cruxes is right, one should see that if one asks elite cognisers to state their credence in a belief, and then state their credences in the most crucial few considerations regarding it, the credence in the belief should only very rarely be higher than the lowest credence among the considerations. This is because, if most beliefs have many (jointly necessary) cruxes, these should usually comprise at least the top few considerations; the conjunction of the cruxes is then necessary (but not sufficient) for believing B, and P(any one crux) >= P(conjunction of cruxes) >= P(B). In essence, one's credence in a belief should be no greater than one's credence in the weakest crux (I guess the credence in the conclusion of a sequence-thinking argument should generally approximate the lower value given by P(crux1)*P(crux2) etc., as these are usually fairly independent).

In contrast, if I am closer to the mark, one should fairly commonly see the credence in the belief be higher than the lowest credence among the set of important considerations. If each consideration offers a Bayesian update favouring B, a set of important considerations that support B may act together (along with other less important considerations) to increase its credence such that one is more confident of B than of some (or all) of the important considerations that support it.
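
To make the contrast concrete, here is a toy numerical sketch of the two regimes (the numbers are purely illustrative, independence is assumed throughout, and the second regime takes the disjunctive extreme where any one line of support would suffice for B):

    # Sequence-think regime: B requires all of C1, C2 and C3 (jointly necessary
    # cruxes), so credence in B is capped by the weakest crux.
    p_cruxes = [0.9, 0.8, 0.7]
    p_b_conjunctive = 1.0
    for p in p_cruxes:
        p_b_conjunctive *= p
    print(round(p_b_conjunctive, 3))   # 0.504, below min(p_cruxes) = 0.7

    # Balance-of-considerations regime (disjunctive extreme): B holds if any one
    # of several independent lines of support goes through, so credence in B can
    # exceed one's credence in every individual consideration.
    p_none_hold = 1.0
    for p in p_cruxes:
        p_none_hold *= (1.0 - p)
    p_b_disjunctive = 1.0 - p_none_hold
    print(round(p_b_disjunctive, 3))   # 0.994, above max(p_cruxes) = 0.9

If elite cognisers' stated credences look more like the second pattern than the first, that would tell against the 'many jointly necessary cruxes' picture.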

I aver relevant elite cognisers (e.g. superforecasters, the philosophers I point to) will exhibit the property I suggest. I would also venture that when reasonable cognisers attempt to double crux, their credences will also behave in the way I predict.

Replies from: Unnamed
comment by Unnamed · 2017-10-10T08:33:12.384Z · LW(p) · GW(p)

I agree that it would be good to look at some real examples of beliefs rather than continuing with hypothetical examples and abstract arguments.

Your suggestion for what hard data to get isn't something that we can do right now (and I'm also not sure if I disagree with your prediction). We do have some real examples of beliefs and (first attempt at stating) cruxes near at hand, in this comment from Duncan and in this post from gjm (under the heading "So, what would change my mind? ") and Raemon (under the heading "So: My Actual Cruxes"). And I'd recommend that anyone who cares about cruxes or double crux enough to be reading this three-layers-deep comment, and who has never had a double crux conversation, pick a belief of yours, set a 5 minute timer, and spend that time looking for cruxes. (I recommend picking a belief that is near the level of actions, not something on the level of a philosophical doctrine.)

In response to your question about whether my comments were aimed at you:

They were partly aimed at you, partly aimed at other LWers (taking you as one data point of how LWers are thinking about cruxes). My impression is that your model of cruxes and double crux is different from the thing that folks around CFAR actually do, and I was trying to close that gap for you and for other folks who don't have direct experience with double crux at CFAR.

For my first comment: the OP had several phrases like "traced to a single underlying consideration" which I would not use when talking about cruxes. Melissa's current belief that she should start a gizmo company isn't based on a single consideration, it's a result of the fact that several different factors line up in a way that makes that specific plan look like an especially good idea. So of course she has several different cruxes. Similarly with views on marginal tax rates.

For my second comment: 'Primarily look for things that would change your own views, not for things that would change the other person's views' is one of the core advantages of focusing on cruxes, in my opinion, and it didn't seem to be a focus of the OP. It's something that's missing from your suggested substitute ("Look for key considerations") and from your discussion of the example of how expert philosophers handle disagreements. E.g., if Theist is the one pressing the moral argument for the existence of God, because Theist guesses that it might shift Atheist's views, then that is not a conversation based on cruxes. Whereas if Atheist is choosing to focus the discussion on that argument because Atheist thinks it might shift their own views, then it sounds like it is very similar to a conversation based on cruxes.

On the question of whether cruxes are all-or-nothing or a matter of degree: I think of "crux" as a term similar to "belief". It suggests sharp category boundaries when in fact things are a matter of degree, but it's often a good enough approximation and it's easier for a person to think about, learn, and use the rest of the framework if they can fall back on the categorical concept. Replacing "look for cruxes" with "look for considerations to which your beliefs have relatively high credence sensitivity" also seems like a decent approximation. Doing a Value of Information calculation also seems like a decent approximation, at least for the subset of considerations that are within the model. I could say more to try to elaborate on all of this, but it feels like it really needs some concrete examples to point at. If a discussion like this was happening at a workshop, I'd elaborate by looking at the person's attempts to come up with cruxes and giving them feedback.

(I'll repeat here: this comment is about cruxes, not about double crux in particular.)

comment by [deleted] · 2017-10-08T16:50:43.470Z · LW(p) · GW(p)

A thing that I would like us (i.e. humans) to have is a framework for people who believe different things to come to some sort of consensus, such that people are able to change their minds without fear of losing things like status.

One reason I like Double Crux is because the goal of changing your mind is made explicit. There's a neat thing within the process where you're hopefully also shifting your gut feelings and not just trading well-maintained sophisticated arguments. The end result feels from the inside like you "actually" believe the thing you've shifted to.

I don't have much experience in philosophy, but I'm curious if there are any salient examples you could point to where two philosophers were on different ends of a spectrum and, after some time and discussion, one of them shifted in a major direction (EX: swapped from being a heavy proponent of Utilitarianism to Deontology because their concerns were allayed)?

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-08T22:58:28.495Z · LW(p) · GW(p)

I'm not sure I can provide exactly what you're after: although philosophers do change their minds, they usually explain it in terms of 'I was persuaded by these considerations' rather than 'I talked with X for a while about it' - and often they hold many-many discussions in the literature. I could offer my own anecdata about how I changed my mind on various philosophical topics after discussing the matter with an elite philosopher, but I'm hardly one myself.

Perhaps the closest example that springs to mind is Wittgenstein. His initial masterwork was the Tractatus, one of the touchstones of logical positivism, a leading approach to analytic philosophy in the early twentieth century. His second masterwork, composed in later life (the Philosophical Investigations), is generally held to repudiate many of the claims (as well as the overall direction) of the Tractatus. He credits Piero Sraffa (admittedly an economist) for 'most of the consequential ideas' in its foreword.

Replies from: None
comment by [deleted] · 2017-10-09T00:55:42.749Z · LW(p) · GW(p)

<Nods.>

Thanks for the additional information. I guess the thing I had in mind in the original comment was something like:

"As a community with shared goals, it seems good to have a way for people to quickly converge when they're on differing sides of a spectrum."

I don't have any actual experience with philosophers, but my mental stereotype is that it might take a lot of back and forth (e.g. months or years) before one of them changes their mind. (Is this even accurate?)

If so, maybe it would be worth still investigating this area of conflict resolution if only to find something that works faster than what people are currently using.

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-09T17:12:32.863Z · LW(p) · GW(p)

I guess there might be a selection effect: 'mature philosophers' might have spent a lot of time hashing out their views at earlier stages (e.g. undergrad, graduate school). So it may not be that surprising to find that, in the subject of their expertise, their credences on the issues are highly resilient, such that they change their mind rarely, and only after considerable amounts of evidence gathered over a long time.

Good data would be whether, outside of this, these people are good at hashing out cases where they have less resilient credences, but these cases will seldom come up publicly (Putnam and Russell are famed for changing their views often, but it is unclear how much 'effort' that took or whether it depended on interlocutors). I can offer my private experience that some of these exceptional philosophers are exceptional at this too, but I anticipate reasonable hesitance about this type of private evidence.

comment by Ben Pace (Benito) · 2017-10-11T01:35:51.643Z · LW(p) · GW(p)

Promoted to featured, for a thoughtful criticism of a substantial idea in the community, and excellent follow-up discussion.

comment by the gears to ascension (lahwran) · 2017-10-08T16:51:20.527Z · LW(p) · GW(p)

This post changed my opinion about double crux as a whole. However, I'm uncomfortable with the degree of "call to action" - e.g., "Would that time and effort be spent better looking elsewhere." I upvoted the post anyway. [edit: have retracted my vote on the post, because this post seems too much like fighting words after seeing how it made Duncan feel]

Replies from: scarcegreengrass
comment by scarcegreengrass · 2017-10-08T21:43:05.068Z · LW(p) · GW(p)

This is a good phrasing of my opinion also. I don't think this is an issue of resource scarcity.

comment by Unnamed · 2017-10-09T21:56:40.621Z · LW(p) · GW(p)

(This is Dan from CFAR.)

Most of the "data" that we have on double crux is of the informal type, from lots of experience doing double crux, trying to teach people double crux, facilitating double cruxes, watching people try to double crux, etc. But some of the data consists of numbers in a spreadsheet. Here are those numbers.

At workshops, when we teach double crux we have people pair off, find a topic which they and their partner disagree about, and try to double crux. We typically give them about 20 minutes, and have a few staff members floating around available to help. At the end of the class, participants get a handout with three questions:

  • How easy was it for you and your partner to find an interesting disagreement to apply the technique to? (0 = very hard, 10 = very easy)

  • Was your conversation valuable / productive / something that you learned from? (0 = not at all, 10 = very much)

  • Did you and your partner find a double crux? (No, Sort Of, Almost, Yes)

With a sample of 124 people, the averages on these were:

  • 6.96 - Easy to find an interesting disagreement

  • 7.82 - Conversation was valuable

  • 49% - Found a double crux ("Yes")

The value of the conversation rating was 8.08 among those who found a double crux ("Yes", n=61), 8.14 among those who easily found a disagreement (rating of 7 or higher, n=86), and 8.35 among those who both easily found a disagreement and found a double crux (n=43). (In contrast with 7.56 among those who didn't find a double crux (n=63) and 7.08 among those who had difficulty finding a disagreement (n=38).)

Replies from: Thrasymachus
comment by Thrasymachus · 2017-10-09T22:47:32.672Z · LW(p) · GW(p)

Thanks for presenting this helpful data. If you'll forgive the (somewhat off-topic) question: I understand both that you are responsible for evaluation of CFAR, and that you are working on a new evaluation. I'd be eager to know what this is likely to comprise, and especially (see various comments) what evidence (if any) is expected to be released 'for public consumption'.

comment by bfinn · 2019-12-09T17:04:33.596Z · LW(p) · GW(p)

This is an interesting topic and post. My thoughts following from the God exists / priors bit (and apologies if this is an obvious point, or dealt with elsewhere - e.g. too many long comments below to read more than cursorily!):

Many deeply-held beliefs - particularly broadly ideological ones (e.g. theological, ethical, or political) - are held emotionally rather than rationally, and not really debated in a search for the truth, but to proclaim one's own beliefs, and perhaps in the vain hope of converting others.

So any apparently strong counter-evidence or counter-arguments are met with fall-back arguments, or so-called 'saving hypotheses' (where special reasons are invoked for why God didn't answer your entirely justified prayer). Savvy arguers will have an endless supply of these, including perhaps some so general that they can escape all attack (e.g. that God deliberately evades all attempts at testing). Unsavvy arguers will run out of responses, but still won't be convinced, and will think there is some valid response that they just happen not to know. (I've even heard this used by one church as an official ultimate response to the problem of evil: 'we don't know why God allows evil, but he does (so there must be a good reason we just don't know about)'.)

That is, the double-crux model that evidence (e.g. the universe) comes first and beliefs follow from it is reversed in these cases. The beliefs come first, and any supporting evidence and reasoning are merely used to justify the beliefs to others. (Counter-evidence and counter-arguments are ignored.) Gut feel is all that counts. So there aren't really cruxes to be had.

I don't think these are very special cases; probably quite a wide variety of topics are treated like this by many people. E.g. a lot of 'debates' I see on Facebook are of this kind; they lead nowhere, no-one ever changes their mind, and they usually turn unpleasant quickly. The problem isn't the debating technique, but the nature of the beliefs.

comment by dxu · 2017-10-08T19:02:09.627Z · LW(p) · GW(p)

I was going to suggest that you delete this comment, but then I realized that I have no idea if that's actually possible on this site. Would someone with more familiarity with LW 2.0 than I currently have care to comment?

Replies from: habryka4
comment by habryka (habryka4) · 2017-10-08T19:03:34.728Z · LW(p) · GW(p)

Yep, you can delete, and I just did so.