Posts

In defence of epistemic modesty 2017-10-29T20:00:31.366Z
Contra double crux 2017-10-08T16:29:35.193Z
Beware surprising and suspicious convergence 2016-01-24T19:13:17.813Z
Log-normal Lamentations 2015-05-19T21:12:52.019Z
Against the internal locus of control 2015-04-03T17:48:56.400Z
Funding cannibalism motivates concern for overheads 2014-08-30T00:42:13.492Z
Why the tails come apart 2014-08-01T22:41:00.044Z
UFAI cannot be the Great Filter 2012-12-22T11:26:41.010Z

Comments

Comment by Thrasymachus on Non-Disparagement Canaries for OpenAI · 2024-05-31T08:57:31.075Z · LW · GW

I see the concerns as these:

  1. The four corners of the agreement seem to define 'disparagement' broadly, so one might reasonably fear (e.g.) "First author on an eval especially critical of OpenAI versus its competitors", or "Policy document highly critical of OpenAI leadership decisions" might 'count'.
  2. Given Altman's/OpenAI's vindictiveness and duplicity, and the previous 'safeguards' (from their perspective) which give them all the cards in terms of folks being able to realise the value of their equity, "They will screw me out of a lot of money if I do something they really don't like (regardless of whether it 'counts' per the non-disparagement agreement)" seems a credible fear. 
    1. It appears Altman tried to get Toner kicked off the board for being critical of OpenAI in a policy piece, after all.
  3. This is indeed moot for roles which require equity to be surrendered anyway. I'd guess most roles outside government (and maybe some within it) do not have such requirements. A conflict of interest roughly along the lines of the first two points makes impartial performance difficult, and credible impartial performance impossible (i.e. even if indeed Alice can truthfully swear "My being subject to such an agreement has never influenced my work in AI policy", reasonable third parties would be unwise to believe her).
  4. The 'non-disclosure of non-disparagement' makes this worse, as it interferes with this conflict of interest being fully disclosed. "Alice has a bunch of OpenAI equity" is one thing, "Alice has a bunch of OpenAI equity, and has agreed to be beholden to them in various ways to keep it" is another. We would want to know the latter to critically appraise Alice's work whenever it is relevant to OpenAI's interests (and I would guess a lot of policy/eval/reg/etc. would be sufficiently relevant that we'd like to contemplate whether Alice's commitments colour her position). Yet Alice has also promised to keep these extra relevant details secret.
Comment by Thrasymachus on A Quick Guide to Confronting Doom · 2022-04-14T09:41:52.676Z · LW · GW

I can't help with the object level determination, but I think you may be overrating both the balance and import of the second-order evidence.

As far as I can tell, Yudkowsky is a (?dramatically) pessimistic outlier among the class of "rationalist/rationalist-adjacent" SMEs in AI safety, and probably even more so relative to aggregate opinion without an LW-y filter applied (cf.). My impression of the epistemic track record is that Yudkowsky has a tendency to stake out positions (both within and without AI) with striking levels of confidence but not commensurately striking levels of accuracy.

In essence, I doubt there's much epistemic reason to defer to Yudkowsky more (or much more) than folks like Carl Shulman or Paul Christiano, nor perhaps much more than "a random AI alignment researcher" or "a superforecaster making a guess after watching a few Rob Miles videos" (although these have a few implied premises around difficulty curves/subject matter expertise being relatively uncorrelated to judgemental accuracy).

I suggest ~all reasonable attempts at an idealised aggregate wouldn't take a hand-brake turn to extreme pessimism on finding Yudkowsky is this pessimistic. My impression is the plurality LW view has shifted more from "pretty worried" to "pessimistic" (e.g. p(screwed) > 0.4) rather than agreement with Yudkowsky, but in any case I'd attribute large shifts in this aggregate mostly to Yudkowsky's cultural influence on the LW community plus some degree of internet cabin fever (and selection) distorting collective judgement.

None of this is cause for complacency: even if p(screwed) isn't ~1, > 0.1 (or 0.001) is ample cause for concern, and resolution on values between (say) [0.1, 0.9] is informative for many things (like personal career choice). I'm not sure whether you get more yield for marginal effort on object or second-order uncertainty (e.g. my impression is the 'LW cluster' trends towards pessimism, so adjudicating whether this cluster should be over/under weighted could be more informative than trying to get up to speed on ELK). I would guess, though, that whatever distils out of LW discourse in 1-2 months will be much more useful than what you'd get right now.

Comment by Thrasymachus on My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage) · 2021-10-20T11:27:26.967Z · LW · GW

Looking back, my sense remains that we basically succeeded—i.e., that we described the situation about as accurately and neutrally as we could have. If I'm wrong about this... well, all I can say is that it wasn't for lack of trying.

I think CFAR ultimately succeeded in providing a candid and good faith account of what went wrong, but the time it took to get there (i.e. 6 months between this and the initial update/apology) invites adverse inferences like those in the grandparent. 

A lot of the information ultimately disclosed in March would definitely have been known to CFAR in September, such as Brent's prior involvement as a volunteer/contractor for CFAR, his relationships/friendships with current staff, and the events at ESPR. The initial responses remained coy on these points, and seemed apt to give the misleading impression that CFAR's mistakes were (relatively) much milder than they in fact were. I (among many) contacted CFAR leadership to urge them to provide a more candid and complete account when I discovered some of this further information independently.

I also think, similar to how it would be reasonable to doubt 'utmost corporate candour' back then given initial partial disclosure, it's reasonable to doubt CFAR has addressed the shortcomings revealed given the lack of concrete follow-up. I also approached CFAR leadership when CFAR's 2019 Progress Report and Future Plans initially made no mention of what happened with Brent, nor what CFAR intended to improve in response to it. What was added in is not greatly reassuring:

And after spending significant time investigating our mistakes with regard to Brent, we reformed our hiring, admissions and conduct policies, to reduce the likelihood such mistakes reoccur.

A cynic would note this is 'marking your own homework', but cynicism is unnecessary to recommend more self-scepticism. I don't doubt the Brent situation indeed inspired a lot of soul searching and substantial, sincere efforts to improve. What is more doubtful (especially given the rest of the morass of comments) is whether these efforts actually worked. Although there is little prospect of satisfying me, more transparency over what exactly has changed - and perhaps third party oversight and review - may better reassure others.   

Comment by Thrasymachus on REVISED: A drowning child is hard to find · 2020-02-01T08:32:59.868Z · LW · GW

The malaria story has fair face validity if one observes the wider time series (e.g.). Further, the typical EA 'picks' for net distribution are generally seen as filling around the edges of the mega-distributors.

FWIW: I think this discussion would be clearer if framed in last-dollar terms.

If Gates et al. are doing something like last dollar optimisation, trying to save as many lives as they can by allocating across opportunities both now and in the future, leaving the best marginal interventions available right now on the table would imply they expect to exhaust their last dollar on more cost-effective interventions in the future.

This implies the marginal cost per life saved right now should be higher than the (expected) cost per life saved of their last dollar (if not, they should be reallocating some of those 'last dollars' to interventions right now). Yet this in turn does not imply we should see 50Bn-worth of lifesaving available at the current marginal price right now. So it seems we can explain Gates et al. not availing themselves of the (non-existent) opportunity to (say) halve communicable diseases for 2Bn a year worldwide (extrapolating from current marginal prices) without the current marginal price being lied about or manipulated. (Obviously, even if we forecast the Gates et al. last dollar EV to be higher than the current marginal price, we might venture alternative explanations of this discrepancy besides them screwing us.)
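To make the comparison explicit (my notation, not from the original exchange): write $c_{\text{now}}$ for the marginal cost per life saved of the best opportunity available right now, and $c_{\text{last}}$ for the cost per life saved Gates et al. expect of their last dollar. Leaving the best current opportunities unfunded then implies

$$c_{\text{now}} \;\geq\; \mathbb{E}\left[c_{\text{last}}\right],$$

while $c_{\text{now}} < \mathbb{E}\left[c_{\text{last}}\right]$ would instead recommend reallocating some 'last dollars' to interventions right now.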

Comment by Thrasymachus on Please Critique Things for the Review! · 2020-01-12T18:45:47.738Z · LW · GW

I also buy the econ story here (and, per Ruby, I'm somewhat pleasantly surprised by the amount of reviewing activity given this).

General observation suggests that people won't find writing reviews that intrinsically motivating (compare to just writing posts, which all the authors are doing 'for free' with scant chance of reward; also compare to academia - I don't think many academics find peer review/refereeing one of the highlights of their job). With apologies for the classic classical econ joke: if reviewing were so valuable, how come people weren't doing it already? [It also looks like ~25%? of reviews, especially the most extensive, are done by the author on their own work].

If we assume there's little intrinsic motivation (I'm comfortably in the 'you'd have to pay me' camp), the money doesn't offer that much incentive. Given Ruby's numbers, suppose each of the 82 reviews takes an average of 45 minutes or so (factoring in (re)reading time and similar). If the nomination money is roughly allocated by person-time spent, the marginal expected return of me taking an hour to review is something like $40 (see the rough sketch after the list below). Facially, this isn't too bad an hourly rate, but the real value is significantly lower:

  • The 'person-time lottery' model should not be denominated by observed person-time so far, but by one's expectation of how much will be spent in total once reviewing finishes, which will be higher (especially conditioned on posts like this).
  • It's very unlikely the reward is going to be allocated proportionately to time spent (/some crude proxy thereof like word count). Thus the EV would be discounted by whatever degree of risk aversion one has (I expect the modal 'payout' for a review to be $0).
  • Opaque allocation also incurs further EV-reducing uncertainty, but best guesses suggest there will be Pareto-principle/tournament-style dynamics, so those with (e.g.) reasons to believe their 'pruning' is less likely to impress the mod team's evaluation have strong reasons to select themselves out.
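To make the arithmetic concrete, here is a minimal sketch; the prize pool, expected total review count, and risk-aversion discount are hypothetical assumptions of mine, not figures from the thread:

```python
# Rough EV sketch for writing one more hour of reviews.
# All numbers below are illustrative assumptions, not figures from the thread.

prize_pool = 2000.0            # hypothetical total prize money ($)
reviews_so_far = 82            # reviews observed so far (Ruby's count)
minutes_per_review = 45        # assumed average time per review
expected_total_reviews = 120   # guess at the total once reviewing finishes

# Naive 'person-time lottery': money allocated in proportion to time spent.
hours_so_far = reviews_so_far * minutes_per_review / 60
expected_total_hours = expected_total_reviews * minutes_per_review / 60
rate_vs_observed = prize_pool / hours_so_far          # denominated by time so far
rate_vs_expected = prize_pool / expected_total_hours  # denominated by expected total

# Payouts are unlikely to be proportional (modal payout plausibly $0),
# so a risk-averse reviewer discounts the expected value further.
risk_aversion_discount = 0.5
adjusted_rate = rate_vs_expected * risk_aversion_discount

print(f"Rate vs observed time:  ${rate_vs_observed:.0f}/hour")
print(f"Rate vs expected time:  ${rate_vs_expected:.0f}/hour")
print(f"Risk-adjusted value:    ${adjusted_rate:.0f}/hour")
```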
Comment by Thrasymachus on Polio and the controversy over randomized clinical trials · 2019-12-21T19:56:42.428Z · LW · GW

Sure - there's a fair bit of literature on 'optimal stopping' rules for interim results in clinical trials to try and strike the right balance.

It probably wouldn't have helped much for Salk's dilemma: Polio is seasonal and the outcome of interest is substantially lagged from the intervention - which has to precede the exposure, and so the 'window of opportunity' is quickly lost; I doubt the statistical methods for conducting this were well-developed in the 50s; and the polio studies were already some of the largest trials ever conducted, so even if available these methods may have imposed even more formidable logistical challenges. So there probably wasn't a neat pareto-improvement of "Let's run an RCT with optimal statistical control governing whether we switch to universal administration" Salk and his interlocutors could have agreed to pursue.

Comment by Thrasymachus on Polio and the controversy over randomized clinical trials · 2019-12-20T22:23:04.787Z · LW · GW
Mostly I just find it fascinating that as late as the 1950s, the need for proper randomized blind placebo controls in clinical trials was not universally accepted, even among scientific researchers. Cultural norms matter, especially epistemic norms.

This seems to misunderstand the dispute. Salk may have had an overly optimistic view of the efficacy of his vaccine (among other foibles your source demonstrates), but I don't recall him being a general disbeliever in the value of RCTs.

Rather, his objection is consonant with consensus guidelines for medical research, e.g. the Declaration of Helsinki (article 8): [See also the Nuremberg Code (art 10), relevant bits of the Hippocratic Oath, etc.]

While the primary purpose of medical research is to generate new knowledge, this goal can never take precedence over the rights and interests of individual research subjects.

This cashes out in a variety of ways. The main one is a principle of clinical equipoise - one should only conduct a trial if there is genuine uncertainty about which option is clinically superior. A consequence of this is that clinical trials are often stopped early if a panel supervising the trial finds clear evidence of (e.g.) the treatment outperforming the control (or vice versa), as continuing the trial keeps placing those in the 'wrong' arm in harm's way - even though this comes at an epistemic cost, as the resulting data is poorer than what could have been gathered had the trial continued to completion.

I imagine the typical reader of this page is going to tend to be unsympathetic to the virtue-ethicsy/deontic motivations here, but there is also a straightforward utilitarian trade-off: better information may benefit future patients, at the cost of harming (in expectation) those enrolled in the trial. Although RCTs are the ideal, one can make progress with less (although I agree it is even more treacherous), and the question of the right threshold for these is fraught. (There are also natural 'slippery slope' style worries about taking a robust 'longtermist' position in holding that the value of the evidence for all future patients is worth much more than the welfare of the much smaller number of individuals enrolled in a given trial - the genesis of the Nuremberg Code need not be elaborated upon.)

A lot of this ethical infrastructure post-dates Salk, but this suggests his concerns were forward-looking rather than retrograde (even if he was overconfident in the empirical premise that 'the vaccine works' which drove these commitments). I couldn't in good conscience support a placebo-controlled trial for a treatment I knew worked for a paralytic disease either. Similarly, it seems very murky to me what the right call was given knowledge-at-the-time - but if Bell and Francis were right, it likely owed more to them having a more reasonable (if ultimately mistaken) scepticism of the vaccine's efficacy than Salk, rather than him just 'not getting it' about why RCTs are valuable.


Comment by Thrasymachus on Neural Annealing: Toward a Neural Theory of Everything (crosspost) · 2019-11-30T18:41:21.836Z · LW · GW

I'm afraid I couldn't follow most of this, but do you actually mean 'high energy' brain states in terms of aggregate neural activity (i.e. the parentheticals which equate energy to 'firing rates' or 'neural activity')? If so, this seems relatively easy to assess for proposed 'annealing prompts' - whether psychedelics/meditation/music/etc. tend to provoke greater aggregate activity than not seems open to direct calorimetry, let alone proxy indicators.

Yet the steers on this tend to be very equivocal (e.g. the evidence on psychedelics looks facially 'right', things look a lot more uncertain for meditation and music, and identifying sleep as a possible 'natural annealing process' looks discordant with a 'high energy state' account, as brains seem to consume less energy when asleep than awake). Moreover, natural 'positive controls' don't seem supportive: cognitively demanding tasks (e.g. learning an instrument, playing chess) seem to increase brain energy consumption, yet presumably aren't promising candidates for this hypothesised neural annealing.

My guess from the rest of the document is the proviso about semantically-neutral energy would rule out a lot of these supposed positive controls: the elevation needs to be general rather than well-localized. Yet this is a lot harder to use as an instrument with predictive power: meditation/music/etc. also have foci in the neural activity they provoke.

Comment by Thrasymachus on The unexpected difficulty of comparing AlphaStar to humans · 2019-09-19T04:51:32.742Z · LW · GW

Thanks for this excellent write-up!

I don't have relevant expertise in either AI or SC2, but I was wondering whether precision might still be a bigger mechanical advantage than the write-up notes. Even if humans can (say) max out at 150 'combat' actions per minute, they might misclick, not be able to pick out the right unit in a busy and fast battle to focus fire/trigger abilities/etc., and so on. The AI presumably won't have this problem. So even with similar EAPM (and subdividing out 'non-combat' EAPM which need not be so accurate), AlphaStar may still have a considerable mechanical advantage.

I'd also be interested in how important, beyond some (high) baseline, 'decision making' is at the highest levels of SC2 play. One worry I have is that although decision-making is important (build orders, scouting, etc.), what decides many (?most) pro games is who can more effectively micro in the key battles, or who can best juggle all the macro/econ tasks (I'd guess some considerations in favour would be that APM is very important, and that a lot of the units in SC2 are implicitly balanced by 'human' unit control limitations). If so, unlike Chess and Go, there may not be some deep strategic insights AlphaStar can uncover to give it the edge, and 'beating humans fairly' is essentially an exercise in getting the AI to fall within the band of 'reasonably human' while still subtly exploiting enough of the 'microable' advantages to prevail.


Comment by Thrasymachus on [deleted post] 2019-08-30T05:53:36.997Z

Combining the two doesn't solve the 'biggest problems of utilitarianism':

1) We know from Arrhenius's impossibility theorems you cannot get an axiology which avoids the repugnant conclusion without incurring other large costs (e.g. violations of transitivity or of the independence of irrelevant alternatives). Although you don't spell out 'balance utilitarianism' enough to tell what it violates, we know it - like any other population axiology - will have very large drawbacks.

2) 'Balance utilitarianism' seems a long way from the frontier of ethical theories in terms of its persuasiveness as a population ethic.

a) The write-up claims that only actions that increase both sum and median wellbeing are good, those that increase one or the other are sub-optimal, and those that decrease both are bad. Yet what if we face choices where we don't have an option that increases both sum and median welfare (such as Parfit's 'mere addition'), and we have to choose between them? How do we balance one against the other? The devil is in these details, and a theory being silent on these cases shouldn't be counted in its favour.

b) Yet even as it stands we can construct nasty counter-examples to the rule, based on very benign versions of mere addition. Suppose Alice is in her own universe at 10 welfare (benchmark this as a very happy life). She can press button A or button B. Button A boosts her up to 11 welfare. Button B boosts her to 10^100 welfare, and brings into existence 10^100 people at (10-10^-100) welfare (say a life as happy as Alice but with a pinprick). Balance utilitarianism recommends button A (as it increases total and median) as good, but pressing button B as suboptimal. Yet pressing button B is much better for Alice, and also instantiates vast numbers of happy people.

c) The 'median criterion' is going to be generally costly, as it is insensitive to changing cardinal levels outside the median person/pair so long as ordering is unchanged (and vice-versa).

d) Median views (like average ones) also incur costs due to their violation of separability. It seems intuitive that the choiceworthiness of our actions shouldn't depend on whether there is an alien population on Alpha Centauri who are happier/sadder than we are (e.g. if there's lots of them and they're happier, any act that brings more humans into existence is 'suboptimal' by the lights of balance util).



Comment by Thrasymachus on Asymmetric Weapons Aren't Always on Your Side · 2019-06-09T20:49:16.174Z · LW · GW

(Very minor inexpert points on military history, I agree with the overall point there can be various asymmetries, not all of which are good - although, in fairness, I don't think Scott had intended to make this generalisation.)

1) I think you're right the German army was considered one of the most effective fighting forces on a 'man for man' basis (I recall pretty contemporaneous criticism from allied commanders on facing them in combat, and I think the consensus of military historians is they tended to outfight American, British, and Russian forces until the latest stages of WW2).

2) But it's not clear how much the German army owed this performance to fascism:

  • Other fascist states (e.g. Italy) had much less effective fighting forces.
  • I understand a lot of the accounts to explain how the German army performed so well sound very unstereotypically fascist - delegating initiative to junior officers/NCOs rather than unquestioning obedience to authority (IIRC some historical comment was that the American army was more stiflingly authoritarian than the German one for most of the war), better 'human resource' management of soldiers, combined arms, etc. This might be owed more to Prussian heritage than Hitler's rise to power.

3) Per others, it is unclear whether 'punching above one's weight' is the right standard for being 'better at violence'. Even if the US had worse infantry, they leveraged their industrial base to give their forces massive material advantages. If the metric for being better at violence is winning in violent contests, the fact the Germans were better at one aspect of this seems to matter little if they lost overall.

Comment by Thrasymachus on The Schelling Choice is "Rabbit", not "Stag" · 2019-06-08T16:12:44.028Z · LW · GW

It's perhaps worth noting that if you add in some chance of failure (e.g. even if everyone goes stag, there's a 5% chance of ending up -5, so Elliott might be risk-averse enough to decline even if they knew everyone else was going for sure), or some unevenness in allocation (e.g. maybe you can keep rabbits to yourself, or the stag-hunt-proposer gets more of the spoils), this further strengthens the suggested takeaways. People often aren't defecting/being insufficiently public spirited/heroic/cooperative if they aren't 'going to hunt stags with you', but are sceptical of the upside and/or more sensitive to the downsides.

One option (as you say) is to try and persuade them the value prop is better than they think. Another worth highlighting is whether there are mutually beneficial deals one can offer them to join in. If we adapt Duncan's stag hunt to have a 5% chance of failure even if everyone goes, there's some efficient risk-balancing option A-E can take (e.g. A-C pool together to offer some insurance to D-E if they go on a failed hunt with them).

[Minor: one of the downsides of 'choosing rabbit/stag' talk is it implies the people not 'joining in' agree with the proposer that they are turning down a (better-EV) 'stag' option.]


Comment by Thrasymachus on Drowning children are rare · 2019-06-04T17:38:26.419Z · LW · GW
A marginalist analysis that assumes that the person making the decision doesn’t know their own intentions & is just another random draw of a ball from an urn totally misses this factor.

Happily, this factor has not been missed by either my profile or 80k's work here more generally. Among other things, we looked at:

  • Variance in impact between specialties and (intranational) location (1) (as well as variance in earnings for E2G reasons) (2, also, cf.)
  • Areas within medicine which look particularly promising (3)
  • Why 'direct' clinical impact (either between or within clinical specialties) probably has limited variance versus (e.g.) research (4), also

I also cover this in talks I have given on medical careers, as well as when offering advice to people contemplating a medical career or how to have a greater impact staying within medicine.

I still think trying to get a handle on the average case is a useful benchmark.

Comment by Thrasymachus on Drowning children are rare · 2019-06-04T14:58:07.511Z · LW · GW

[I wrote the 80k medical careers page]

I don't see there as being a 'fundamental confusion' here, and not even that much of a fundamental disagreement.

When I crunched the numbers on 'how much good do doctors do' it was meant to provide a rough handle on a plausible upper bound: even if we beg the question against critics of medicine (of which there are many), and even if we presume any observational marginal response is purely causal (and purely mediated by doctors), the numbers aren't (in EA terms) that exciting in terms of direct impact.

In talks, I generally use the upper 95% confidence bound or central estimate of the doctor coefficient as a rough steer (it isn't a significant predictor, and there's reasonable probability mass on the impact being negative): although I suspect there will be generally unaccounted confounders attenuating the 'true' effect rather than colliders masking it, this sort of ecological study is sufficiently insensitive to either to be no more than an indication - alongside the qualitative factors - that the 'best (naive) case' for direct impact as a doctor isn't promising.
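As a purely illustrative sketch of the kind of figure I mean (simulated data and assumed variable names - not the actual 80k analysis):

```python
# Illustrative only: an ecological cross-country regression of a health outcome
# on doctor density, reading off the central estimate and the upper 95% bound
# of the 'doctor coefficient'. All data here is simulated.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 150  # hypothetical number of countries

df = pd.DataFrame({
    "doctors_per_1k": rng.gamma(2.0, 1.5, n),
    "log_gdp_pc": rng.normal(9.0, 1.0, n),
    "sanitation": rng.uniform(0.3, 1.0, n),
})
# Simulated outcome driven mostly by the confounders, with a small 'doctor' effect.
df["health_outcome"] = (
    0.5 * df["doctors_per_1k"] + 20 * df["log_gdp_pc"] + 50 * df["sanitation"]
    + rng.normal(0, 25, n)
)

X = sm.add_constant(df[["doctors_per_1k", "log_gdp_pc", "sanitation"]])
fit = sm.OLS(df["health_outcome"], X).fit()

central = fit.params["doctors_per_1k"]
upper95 = fit.conf_int().loc["doctors_per_1k", 1]
print(f"doctor coefficient: {central:.2f} (upper 95% bound: {upper95:.2f})")
```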

There's little that turns on which side of zero our best guess falls, so long as we can be confident it is a long way down from the best candidates: on the scale of intervention effectiveness, there's not that much absolute distance between the estimates (I suspect) Hanson or I would offer. There might not be much disagreement even in coarse qualitative terms: Hanson's work here - I think - focuses on the US, and US health outcomes are a sufficiently pathological outlier in the world that I'm also unsure whether marginal US medical effort is beneficial; I'm not sure Hanson has staked out a view on whether he's similarly uncertain about positive marginal impact in non-US countries, so he might agree with my view it is (modestly) net-positive, despite its dysfunction (neither I nor what I wrote assumes the system 'basically knows what it's doing' in the common-sense meaning).

If Hanson has staked out this broader view, then I do disagree with it, but I don't think this disagreement would indicate at least one of us has to be 'deeply confused' (this looks like a pretty crisp disagreement to me) nor 'badly misinformed' (I don't think there are key considerations one-or-other of us is ignorant of which explains why one of us errs to sceptical or cautiously optimistic). My impressions are also less sympathetic to 'signalling accounts' of healthcare than his (cf.) - but again, my view isn't 'This is total garbage', and I doubt he's monomaniacally hedgehog-y about the signalling account. (Both of us have also argued for attenuating our individual impressions in deference to a wider consensus/outside view for all things considered judgements).

Although I think the balance of expertise leans against archly sceptical takes on medicine, I don't foresee convincing adjudication on this point coming any time soon, nor that EA can reasonably expect to be the ones to provide this breakthrough - still less for all the potential sign-inverting crucial considerations out there. Stumbling on as best we can with our best guess seems a better approach than being paralyzed until we're sure we've figured it all out.

Comment by Thrasymachus on What are the advantages and disadvantages of knowing your own IQ? · 2019-04-07T15:19:12.949Z · LW · GW

It looks redundant in most cases to me: given how pervasive IQ correlations are, I think most people can get a reasonable estimate of their IQ by observing their life history so far. E.g.

  • Educational achievement
  • Performance on other standardised tests
  • Job type and professional success
  • Peer esteem/reputation

Obviously, none of these are perfect signals, but I think taking them together usually gives a reasonable steer to a credible range not dramatically wider than that implied by the test-retest reliability of an IQ test. An IQ test would still provide additional information, but I'm not sure there are many instances where (say) knowing the answer in a 5 point band versus a 10 point band is that important.

The case where I think it could be worthwhile is for those whose life history hasn't generated the usual signals to review: maybe one was initially homeschooled and became seriously ill before starting employment/university, etc.

Comment by Thrasymachus on How good is a human's gut judgement at guessing someone's IQ? · 2019-02-26T01:55:18.592Z · LW · GW

Googling around, phrases like 'perception of intelligence' seem to be keywords for a relevant literature. On a very cursory skim (i.e. no more than what you see here) it seems to suggest "people can estimate intelligence of strangers better than chance (but with plenty of room for error and bias), even with limited exposure". E.g.:

Perceived Intelligence Is Associated with Measured Intelligence in Men but Not Women (Note in this study the assessment was done purely on looking at a photograph of someone's face)

Accurate Intelligence Assessments in Social Interactions: Mediators and Gender Effects (Abstract starts with: "Research indicates that people can assess a stranger's measured intelligence more accurately than expected by chance, based on minimal information involving appearance and behavior.")

Thin Slices of Behavior as Cues of Personality and Intelligence. (Short 1-2min slices of behaviour in a variety of contexts leads to assessments by strangers that positively correlate with administered test scores for IQ and big 5)

Comment by Thrasymachus on Epistemic Tenure · 2019-02-19T10:39:18.465Z · LW · GW

As you say, Bob's good epistemic reputation should count when he says something that appears wild, especially if he has a track record that endorses him in these cases ("We've thought he was crazy before, but he proved us wrong"). Maybe one should think of Bob as an epistemic 'venture capitalist', making (seemingly) wild epistemic bets which are right more often than chance (and often illuminating even if wrong), even if they aren't right more often than not, and this might be enough to warrant further attention ("well, he's probably wrong about this, but maybe he's onto something").

I'm not sure your suggestion pushes in the right direction in the case where - pricing all of that in - we still think Bob's belief is unreasonable and he is unreasonable for holding it. The right responses in this case by my lights are two-fold.

First, you should dismiss (rather than engage with) Bob's wild belief - as (ex hypothesi) all things considered it should be dismissed.

Second, it should (usually) count against Bob's overall epistemic reputation. After all, whatever it was that meant despite Bob's merits you think he's saying something stupid is likely an indicator of epistemic vice.

This doesn't mean it should be a global black mark to taking Bob seriously ever again. Even the best can err badly, so one should weigh up the whole record. Furthermore, epistemic virtue has a few dimensions, and Bob's weaknesses in something need not mean his strengths in others are sufficient for attention/esteem going forward: an archetype I have in mind with 'epistemic venture capitalist' is someone clever, creative, yet cocky and epistemically immodest - has lots of novel ideas, some true, more interesting, but many 'duds' arising from not doing their homework, being hedgehogs with their preferred 'big idea', etc.

I accept, notwithstanding those caveats, this still disincentivizes epistemic venture capitalists like Bob to some degree. Although I only have anecdata, this leans in favour of some sort of trade-off: brilliant thinkers often appear poorly calibrated and indulge in all sorts of foolish beliefs; interviews with superforecasters (e.g.) tend to emphasise things like "don't trust your intuition, be very self sceptical, canvass lots of views, do lots of careful research on a topic before staking out a view". Yet good epistemic progress relies on both - and if they lie on a convex frontier, one wants to have a division of labour.

Although the right balance to strike re. second order norms depends on tricky questions on which sort of work is currently under-supplied, which has higher value on the margin, and the current norms of communal practice (all of which may differ by community), my hunch is 'epistemic tenure' (going beyond what I sketch above) tends to be disadvantageous.

One is noting there are plausible costs in both directions. 'Tenure'-esque practice could spur on crackpots, have too lax a filter for noise-esque ideas, discourage broadly praiseworthy epistemic norms (cf. virtue of scholarship), and maybe not give Bob-like figures enough guidance, so they range too far and unproductively (e.g. I recall one Nobel Laureate mentioning the idea of, "Once you win your Nobel Prize, you should go and try and figure out the hard problem of consciousness" - which seems a terrible idea).

The other is that even if there is a trade-off, one still wants to reach one's frontier on 'calibration/accuracy/whatever'. Scott Sumner seems to be able to combine researching on the inside view with judging on the outside view (see). This seems better for Sumner, and the wider intellectual community, than a Sumner* who could not do the latter.

Comment by Thrasymachus on What are the open problems in Human Rationality? · 2019-01-14T17:53:39.533Z · LW · GW

FWIW: I'm not sure I've spent >100 hours on a 'serious study of rationality'. Although I have been around a while, I am at best sporadically active. If I understand the karma mechanics, the great majority of my ~1400 karma comes from a single highly upvoted top level post I wrote a few years ago. I have pretty sceptical reflexes re. rationality, the rationality community, etc., and this is reflected in that (I think) the modal post/comment I make is critical.

On the topic 'under the hood' here:

I sympathise with the desire to ask conditional questions which don't inevitably widen into broader foundational issues. "Is moral nihilism true?" doesn't seem the right sort of 'open question' for "What are the open questions in Utilitarianism?". It seems better for these topics to be segregated, whatever the plausibility of the foundational 'presumption' ("Is homeopathy/climate change even real?" also seems inapposite for 'open questions in homeopathy/anthropogenic climate change'). (cf. 'This isn't a 101-space').

That being said, I think superforecasting/GJP and RQ/CART etc. are at least highly relevant to the 'Project' (even if this is taken very broadly to cover normative issues in general - if Wei_Dai's list of topics is considered part of the wider Project, then I definitely have spent more than 100 hours in the area). For a question cluster around "How can one best make decisions on unknown domains with scant data", the superforecasting literature seems some of the lowest-hanging fruit to pluck.

Yet community competence in these areas has apparently declined. If you google 'lesswrong GJP' (or similar terms) you find posts on them, but these posts are many years old. There has been interesting work done in the interim: here's something on whether the skills generalise, and something else on a training technique that not only demonstrably improves forecasting performance, but also has a handy mnemonic one could 'try at home'. (The same applies to RQ: Sotala wrote a cool sequence on Stanovich's 'What Intelligence Tests Miss', but this is 9 years old. Stanovich has written three books since expressly on rationality, none of which have been discussed here as best as I can tell.)

I don't understand, if there are multiple people who have spent >100 hours on the Project (broadly construed), why I don't see there being a 'lessons from the superforecasting literature' write-up here (I am slowly working on one myself).

Maybe I just missed the memo and many people have kept abreast of this work (ditto other 'relevant-looking work in academia'), and it is essentially tacit knowledge for people working on the Project, but they are focusing their efforts on developing other areas. If so, it is a shame this is not being put into common knowledge, and I remain mystified by the apparent neglect of these topics versus others: it is a lot easier to be sceptical of 'is there anything there?' for (say) circling, introspection/meditation/enlightenment, Kegan levels, or Focusing than for the GJP, and doubt in the foundation should substantially discount the value of further elaborations on a potentially unedifying edifice.

[Minor] I think the first para is meant to be block-quoted?

Comment by Thrasymachus on What are the open problems in Human Rationality? · 2019-01-13T15:44:36.207Z · LW · GW

There seem to be some foundational questions for the 'Rationality project' which (reprising my role as querulous critic) are oddly neglected in the 5-10 year history of the rationalist community: conspicuously, I find the best insight into these questions comes from psychology academia.

Is rationality best thought of as a single construct?

It roughly makes sense to talk of 'intelligence' or 'physical fitness' because performance in sub-components positively correlate: although it is hard to say which of an elite ultramarathoner, Judoka, or shotputter is fittest, I can confidently say all of them are fitter than I, and I am fitter than someone who is bedbound.

Is the same true of rationality? If it were the case that performance on tests of (say) calibration, sunk cost fallacy, and anchoring were all independent, then this would suggest 'rationality' is a circle our natural language draws around a grab-bag of skills or practices. The term could therefore mislead us into thinking it is a unified skill which we can 'generally' improve, and our efforts are better addressed at a finer level of granularity.

I think this is plausibly the case (or at least closer to the truth). The main evidence I have in mind is Stanovich's CART, whereby tests on individual sub-components we'd mark as fairly 'pure rationality' (e.g. base-rate neglect, framing, overconfidence - other parts of the CART look very IQ-testy like syllogistic reasoning, on which more later) have only weak correlations with one another (e.g. 0.2 ish).

Is rationality a skill, or a trait?

Perhaps key is that rationality (general sense) is something you can get stronger at or 'level up' in. Yet there is a facially plausible story that rationality (especially so-called 'epistemic' rationality) is something more like IQ: essentially a trait where training can at best enhance performance on sub-components yet not transfer back to the broader construct. Briefly:

  • Overall measures of rationality (principally Stanovich's CART) correlate about 0.7 with IQ - not much worse than IQ test subtests correlate with one another or g.
  • Infamous challenges in transfer. People whose job relies on a particular 'rationality skill' (e.g. gamblers and calibration) show greater performance in this area but not, as I recall, transfer improvements to others. This improved performance is often not only isolated but also context dependent: people may learn to avoid a particular cognitive bias in their professional lives, but remain generally susceptible to it otherwise.
  • The general dearth of well-evidenced successes from training. (cf. the old TAM panel on this topic, where most were autumnal).
  • For superforecasters, the GJP finds it can get some boost from training, but (as I understand it) the majority of their performance is attributed to selection, grouping, and aggregation.

It wouldn't necessarily be 'game over' for the 'Rationality project' even if this turns out to be the true story. Even if it is the case that 'drilling vocab' doesn't really improve my g, I might value a larger vocabulary for its own sake. In a similar way, even if there's no transfer, some rationality skills might prove generally useful (and 'improvable') such that drilling them to be useful on their own terms.

The superforecasting point can be argued the other way: that training can still get modest increases in performance on a composite test of epistemic rationality from people already exhibiting elite performance. But it does seem crucial to get a general sense of how well (and how broadly) training can be expected to work: else embarking on a program to 'improve rationality' may end up as ill-starred as the 'brain-training' games/apps fad a few years ago.

Comment by Thrasymachus on Open Thread January 2019 · 2019-01-12T19:39:41.171Z · LW · GW

On Functional Decision Theory (Wolfgang Schwarz)

I recently refereed Eliezer Yudkowsky and Nate Soares's "Functional Decision Theory" for a philosophy journal. My recommendation was to accept resubmission with major revisions, but since the article had already undergone a previous round of revisions and still had serious problems, the editors (understandably) decided to reject it. I normally don't publish my referee reports, but this time I'll make an exception because the authors are well-known figures from outside academia, and I want to explain why their account has a hard time gaining traction in academic philosophy. I also want to explain why I think their account is wrong, which is a separate point.
Comment by Thrasymachus on Why is so much discussion happening in private Google Docs? · 2019-01-12T19:26:20.318Z · LW · GW

I'm someone who both prefers and practises the 'status quo'.

My impression is the key feature of this is limited (and author controlled) sharing. (There are other nifty features for things like gdocs - e.g. commenting 'on a line' - but this practice predates gdocs). The key benefits for 'me as author' are these:

1. I can target the best critics: I usually have a good idea of who is likely to help make my work better. If I broadcast, the mean quality of feedback almost certainly goes down.

2. I can leverage existing relationships: The implicit promise if I send out a draft to someone for feedback is I will engage with their criticism seriously (in contrast, there's no obligation that I 'should' respond to every critical comment on a post I write). This both encourages them to do so, and may help further foster a collegial relationship going forward.

3. I can mess up privately: If what I write makes a critical (or embarrassing) mistake, or could be construed to say something objectionable, I'd prefer this be caught in private rather than my failing being on the public record for as long as there's an internet archive (or someone inclined to take screen shots). (This community is no stranger to people - insiders or outsiders - publishing mordant criticisms of remarks made 'off the cuff' to infer serious faults in the speaker).

I also think the current status quo is a pretty good one from an ecosystem wide perspective too: I think there's a useful division of labour between 'early stage' writings to be refined by a smaller network with lower stakes, and 'final publications' which the author implicitly offers an assurance (backed by their reputation) that the work is a valuable contribution to the epistemic commons.

For most work there is a 'refining' stage, which is better done by smaller pre-selected networks than by authors and critics mutually 'shouting into the void' (from the author's side, there will likely be a fair amount of annoying/irrelevant/rubbish criticism; from a critic's side, a fair risk your careful remarks will be ignored or brushed off).

Publication seems to be better for polished or refined work, as at this stage a) it hopefully has fewer mistakes and so is generally more valuable to the non-critical reader, and b) if there is a key mistake/objection neglected (e.g. because the pre-selected network resulted in an echo chamber), disagreement between ('steel-manned') positions registered publicly and hashed out seems a useful exercise. (I'm generally a fan of more 'adversarial' - or at least 'adversarial-tolerant' - norms for public discussion for this reason.)

This isn't perfect, although I don't see the 'comments going to waste' issue as the greatest challenge (one can adapt one's private comments into a public one to post, although I appreciate this is a costlier route than initially writing the public comment - ultimately, if one finds one's private feedback is repeatedly neglected, one can decline to provide it in the first place).

The biggest one I see is the risk of people who can benefit from a 'selective high-quality feedback network' (either contributing useful early stage criticism, having good early stage posts, or both) not being able to enter one. Yet so long as members of existing ones still 'keep an eye out' for posts and comments from 'outsiders', this does provide a means for such people to build up a reputation to be included in future (i.e. if Alice sees Bob make good remarks etc., she's more interested in 'running a draft by him' next time, or to respond positively if Bob asks her to look something over).

Comment by Thrasymachus on Genetically Modified Humans Born (Allegedly) · 2018-12-02T05:10:44.786Z · LW · GW

Once again I plead that when you see that an expert community looks like they don't know what they're doing, it is usually more accurate to 'reduce confidence' in your understanding rather than their competence. The questions were patently not 'about forms', and covered pretty well the things I would have in mind (I'm a doctor, and I have fairly extensive knowledge of medical ethics).

To explain:

  • Although 'institutional oversight' in medicine is often derided (IRB creep, regulatory burden, and so on and so forth), one of its main purposes is to act as a check on researchers (whatever their intent) causing harm to their patients, reflecting the idea that the researcher (who might be biased) and the patient (who might be less well informed) should not be the only ones making these decisions. That typical oversight was bypassed here is telling, but perhaps unsurprising, as no one would green-light violating a moratorium to subject healthy embryos to poorly tested medical procedures for at best marginal clinical benefit.
  • A lot of questions targeted how informed the consent was, because this was often relied upon in the presentation (e.g. "Well, we didn't get the right mutation, but it was pretty close, and the parents were happy for us to go ahead, so we did").
  • The 'read and understand' question (I'm using the transcript, so maybe there were dumber questions which were edited out) wasn't a question about whether the patients were literate, but whether they had adequate understanding of (e.g.) the technical caveats to which they were giving consent to proceed (e.g. one mutation was a 15 del rather than a 32 del: the natural mutation induces a frameshift and a non-functional protein, whereas this gives a novel protein with a five amino acid removal, which may still generate an HIV-susceptible protein and carries some remote chance of other biological effects).
  • The 'training' question is because establishing whether consent is 'informed', or providing the necessary information to make it so, isn't always straightforward (have you ever had a conversation where you thought someone understood you, but later you found out they didn't?) I did a fair amount of this in medschool, and I don't think many people think this should be an amateur sport.
  • (As hopefully goes without saying, having two rounds of consent where in each the consent taker is a researcher with a vested interest in the work going ahead has obvious problems, and hence why we're so keen on third party oversight).
  • I also see in the transcript fairly extensive discussion about risks (off-target worries would have been tacit knowledge to the audience, so some of this was pre-empted in the presentation then later picked at), and plans for followup etc.
Comment by Thrasymachus on No Really, Why Aren't Rationalists Winning? · 2018-11-05T20:35:32.921Z · LW · GW

I don't see the 'why aren't you winning?' critique as that powerful, and I'm someone who tends to be critical of rationality writ large.

High-IQ societies and superforecasters select for demonstrable performance at being smart/epistemically rational. Yet on surveying these groups you see things like, "People generally do better-than-average by commonsense metrics, some are doing great, but it isn't like everyone is a millionaire". Given the barrier to entry to the rationalist community is more "sincere interest" than "top X-percentile of the population", it would be remarkable if they exhibited even better outcomes as a cohort.

There's also going to be messy causal inference worries that cut either way. If there is in some sense 'adverse selection' (perhaps as in IQ societies) for rationalists tending to have less aptitude at social communication, greater prevalence of mental illness (or whatever else), then these people enjoying modest to good success in their lives reflects extremely well on the rationalist community. Contrariwise, there's plausible confounding where smart creative people will naturally gravitate to rationality-esque discussion, even if this discussion doesn't improve their effectiveness (I think a lot of non-rationalists were around OB/LW in the early days): the cohort of people who 'teach themselves general relativity for fun' may also enjoy much better than average success, but it probably wasn't the relativity which did it.

A deeper worry wrt rationality is there may not be anything to be taught. The elements of (say) RQ don't show much of a common factor (unlike IQ), correlate more strongly with IQ than with one another, and improvements in rational thinking have limited domain transfer. So there might not be much of a general sense of (epistemic) rationality, and limited hope for someone to substantially improve themselves in this area.

Comment by Thrasymachus on No standard metric for CFAR workshops? · 2018-09-08T12:51:24.971Z · LW · GW

Another thing I'd be particularly interested in is longer term follow-up. It would be impressive if the changes to conscientiousness etc. observed in the 2015 study persist now.

Comment by Thrasymachus on [deleted post] 2018-09-06T15:08:16.242Z

I'd be hesitant to defend Great Man theory (and so would apply similar caution) but I think it can go some way, especially for defending a fragility of history hypothesis.

In precis (more here):

1. Conception of any given person seems very fragile. If parents decide to conceive an hour earlier or later (or have done different things earlier in the day, etc. etc.), it seems likely another one of the 100 million available sperm fuses than the one which did. The counterpart seems naturally modelled by a sibling, and siblings are considerably different from one another.

2. Although sometimes (/often) supposed Great Men are mere errands of providence, it's hard to say this is always the case. It seems the 20th century would have been pretty different if Hitler was not around to rise to power, the character of world religions would be different with siblings of Jesus, Muhammad etc., and Tolstoy's brother probably wouldn't have written War and Peace anyway. (Although maybe in some areas ramifications are less pronounced - Great Scientists may alter the timing of discoveries a bit, but it looks plausible that we'd have Relativity by now even without Einstein.)

3. 1 and 2 suggest you could get a lot of scrambling of who is around. Even if it was inevitable there was a Mongol expansion, the precise nature of this seems sensitive to who is in charge, and so whether Genghis Khan or his sibling was born. The precise details of this expansion (where gets encroached on first, which battles are fought, etc.) do horizontally perturb whether, when (and with whom) other people conceive children. These different children go on to alter vertically and horizontally who else is conceived, and so the conceptive chaos propagates. I'd semi-seriously defend the thesis that none of us would be here if Genghis Khan's parents had decided to wait an hour before having sex.

4. This wouldn't mean the world is merely putty to be sculpted by great men. But even if the stage (and dramatis personae) of history is set by broader factors, which actors take on the role might still have considerable effects on the performance.

Comment by Thrasymachus on Historical mathematicians exhibit a birth order effect too · 2018-08-21T11:22:49.112Z · LW · GW

I'm not sure t-tests are the best approach to take compared to something non-parametric, given the smallish sample, considerable skew, etc. (this paper's statistical methods section is pretty handy). Nonetheless I'm confident the considerable effect size (in relative terms, almost a doubling) is not an artefact of statistical technique: when I plugged the numbers into a chi-squared calculator I got P < 0.001, and I'm confident a permutation technique or similar would find much the same.
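A rough sketch of both approaches, on hypothetical counts rather than the post's actual data:

```python
# Hypothetical illustration (not the post's data): test whether the observed
# number of firstborn mathematicians exceeds what uniform birth order within
# each family would predict, via chi-squared and via simulation.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(0)

family_sizes = rng.integers(2, 7, size=150)   # hypothetical sibship sizes
n = len(family_sizes)
p_firstborn_null = 1.0 / family_sizes         # P(firstborn) per person under the null
observed_firstborn = 85                       # hypothetical observed count

# Chi-squared goodness-of-fit on firstborn vs not-firstborn counts
expected = [p_firstborn_null.sum(), n - p_firstborn_null.sum()]
observed = [observed_firstborn, n - observed_firstborn]
chi2_stat, p_chi2 = chisquare(observed, expected)

# Simulation analogue of a permutation test: draw firstborn indicators under the null
sims = rng.binomial(1, p_firstborn_null, size=(100_000, n)).sum(axis=1)
p_sim = np.mean(sims >= observed_firstborn)

print(f"chi-squared p = {p_chi2:.3g}, simulated p = {p_sim:.3g}")
```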

Comment by Thrasymachus on Informational hazards and the cost-effectiveness of open discussion of catastrophic risks · 2018-07-03T22:49:02.542Z · LW · GW

0: We agree potentially hazardous information should only be disclosed (or potentially discovered) when the benefits of disclosure (or discovery) outweigh the downsides. Heuristics can make principles concrete, and a rule of thumb I try to follow is to have a clear objective in mind for gathering or disclosing such information (and being wary of vague justifications like ‘improving background knowledge’ or ‘better epistemic commons’) and incur the least possible information hazard in achieving this.

A further heuristic which seems right to me is one should disclose information in the way that maximally disadvantages bad actors versus good ones. There are a wide spectrum of approaches that could be taken that lie between ‘try to forget about it’, and ‘broadcast publicly’, and I think one of the intermediate options is often best.

1: I disagree with many of the considerations which push towards more open disclosure and discussion.

1.1: I don’t think we should be confident there is little downside in disclosing dangers a sophisticated bad actor would likely rediscover themselves. Not all plausible bad actors are sophisticated: a typical criminal or terrorist is no mastermind, and so may not make (to us) relatively straightforward insights, but could still ‘pick them up’ from elsewhere.

1.2: Although a big fan of epistemic modesty (and generally a detractor of ‘EA exceptionalism’), EAs do have an impressive track record in coming up with novel and important ideas. So there is some chance of coming up with something novel and dangerous even without exceptional effort.

1.3: I emphatically disagree we are at 'infohazard saturation' where the situation re. infohazards 'can't get any worse'. I also find it unfathomable ever being confident enough in this claim to base strategy upon its assumption (cf. eukaryote's comment).

1.4: There are some benefits to getting out 'in front' of more reckless disclosure by someone else. Yet in cases where one wouldn't want to disclose it oneself, delaying the downsides of wide disclosure as long as possible seems usually more important, and so rules against bringing this to an end by disclosing yourself, save in (rare) cases where one knows disclosure is imminent rather than merely possible.

2: I don’t think there’s a neat distinction between ‘technical dangerous information’ and ‘broader ideas about possible risks’, with the latter being generally safe to publicise and discuss.

2.1: It seems easy to imagine cases where the general idea comprises most of the danger. The conceptual step to a ‘key insight’ of how something could be dangerously misused ‘in principle’ might be much harder to make than subsequent steps from this insight to realising this danger ‘in practice’. In such cases the insight is the key bottleneck for bad actors traversing the risk pipeline, and so comprises a major information hazard.

2.2: For similar reasons, highlighting a neglected-by-public-discussion part of the risk landscape where one suspects information hazards lie has a considerable downside, as increased attention could prompt investigation which brings these currently dormant hazards to light.

3: Even if I take the downside risks as weightier than you do, one still needs to weigh these against the benefits. I take ‘general (or public) disclosure’ to have little marginal benefit over more limited disclosure targeted at key stakeholders. As the latter approach greatly reduces the downside risks, it is usually the better strategy by the lights of cost/benefit. At least trying targeted disclosure first seems a robustly better strategy than skipping straight to public discussion (cf.).

3.1: In bio (and I think elsewhere) the set of people who are relevant to setting strategy and otherwise contributing to reducing a given risk is usually small and known (e.g. particular academics, parts of the government, civil society, and so on). A particular scientist unwittingly performing research with misuse potential might need to know the risks of their work (likewise some relevant policy and security stakeholders), but the added upside of illustrating these risks in the scientific literature is limited (and the added downsides much greater). The upside of discussing them in the popular/generalist literature (including EA literature not narrowly targeted at those working on biorisk) is limited still further.

3.2: Information also informs decisions around how to weigh causes relative to one another. Yet less-hazardous information (e.g. the basic motivation given here or here, and you could throw in social epistemic steers from the prevailing views of EA ‘cognoscenti’) is sufficient for most decisions and decision-makers. The cases where this nonetheless might be ‘worth it’ (e.g. you are a decision maker allocating a large pool of human or monetary capital between cause areas) are few and so targeted disclosure (similar to 3.1 above) looks better.

3.3: Beyond the direct cost of potentially giving bad actors good ideas, the benefits of more public discussion may not be very high. There are many ways public discussion could be counter-productive (e.g. alarmism, ill-advised remarks poisoning our relationship with scientific groups, etc.). I’d suggest the examples of cryonics, AI safety, GMOs and other lowlights of public communication of policy and science are relevant cautionary examples.

4: I also want to supply other more general considerations which point towards a very high degree of caution:

4.1: In addition to the considerations around the unilateralist’s curse offered by Brian Wang (I have written a bit about this in the context of biotechnology here) there is also an asymmetry in the sense that it is much easier to disclose previously-secret information than make previously-disclosed information secret. The irreversibility of disclosure warrants further caution in cases of uncertainty like this.

4.2: I take the examples of analogous fields to also support great caution. As you note, there is a norm in computer security of ‘don’t publicise a vulnerability until there’s a fix in place’, and of initially informing a responsible party to give them the opportunity to do this pre-publication. Applied to bio, this suggests targeted disclosure to those best placed to mitigate the information hazard, rather than public discussion in the hopes of prompting a fix to be produced. (Not to mention a ‘fix’ in this area might prove much more challenging than pushing a software update.)

4.3: More distantly, adversarial work (e.g. red-teaming exercises) is usually done by professionals, with a concrete decision-relevant objective in mind, with exceptional care paid to operational security, and their results are seldom made publicly available. This is for exercises which generate information hazards for a particular group or organisation - similar or greater caution should apply to exercises that one anticipates could generate information hazardous for everyone.

4.4: Even more distantly, norms of intellectual openness are used more in some areas, and much less in others (compare the research performed in academia to security services). In areas like bio, the fact that a significant proportion of the risk arises from deliberate misuse by malicious actors means security services seem to provide the closer analogy, and ‘public/open discussion’ is seldom found desirable in these contexts.

5: In my work, I try to approach potentially hazardous areas as obliquely as possible, more along the lines of general considerations of the risk landscape or from the perspective of safety-enhancing technologies and countermeasures. I do basically no ‘red-teamy’ types of research (e.g. brainstorm the nastiest things I can think of, figure out the ‘best’ ways of defeating existing protections, etc.)

(Concretely, this would comprise asking questions like, “How are disease surveillance systems forecast to improve over the medium term, and are there any robustly beneficial characteristics for preventing high-consequence events that can be pushed for?” or “Are there relevant limits which give insight to whether surveillance will be a key plank of the ‘next-gen biosecurity’ portfolio?”, and not things like, “What are the most effective approaches to make pathogen X maximally damaging yet minimally detectable?”)

I expect a non-professional doing more red-teamy work would generate less upside (e.g. less well networked to people who may be in a position to mitigate vulnerabilities they discover, less likely to unwittingly duplicate work) and more downside (e.g. less experience with trying to manage info-hazards well) than I. Given I think this work is usually a bad idea for me to do, I think it’s definitely a bad idea for non-professionals to try.

I therefore hope people working independently on this topic approach ‘object level’ work here with similar aversion to more ‘red-teamy’ stuff, or instead focus on improving their capital by gaining credentials/experience/etc. (this has other benefits: a lot of the best levers in biorisk are working with/alongside existing stakeholders rather than striking out on one’s own, and it’s hard to get a role without (e.g.) graduate training in a relevant field). I hope to produce a list of self-contained projects to help direct laudable ‘EA energy’ to the best ends.

Comment by Thrasymachus on Informational hazards and the cost-effectiveness of open discussion of catastrophic risks · 2018-07-03T22:48:43.530Z · LW · GW

Thanks for writing this. How best to manage hazardous information is fraught, and although I have some work in draft and under review, much remains unclear - as you say, almost anything could have some downside risk, and never discussing anything seems a poor approach.

Yet I strongly disagree with the conclusion that the default should be to discuss potentially hazardous (but non-technical) information publicly, and I think your proposals for how to manage these dangers (e.g. talk to one scientist first) err too far on the lax side. I provide the substance of this disagreement in a child comment.

I’d strongly endorse a heuristic along the lines of, “Try to avoid coming up with (and don’t publish) things which are novel and potentially dangerous”, with the standard of novelty being a relatively uninformed bad actor rather than an expert (e.g. highlighting/elaborating something dangerous which can be found buried in the scientific literature should be avoided).

This expressly includes more general information as well as particular technical points (e.g. “No one seems to be talking about technology X, but here’s why it has really dangerous misuse potential” would ‘count’, even if a particular ‘worked example’ wasn’t included).

I agree it would be good to have direct channels of communication for people considering things like this to get advice on whether projects they have in mind are wise to pursue, and to communicate concerns they have without feeling they need to resort to internet broadcast (cf. Jan Kulveit’s remark).

To these ends, people with concerns/questions of this nature are warmly welcomed and encouraged to contact me to arrange further discussion.

Comment by Thrasymachus on Societal Growth Requires Rehabilitation · 2018-05-26T13:54:57.548Z · LW · GW

This seems right to me, and at least the 'motte' version of growth mindset accepts that innate ability may set pretty hard envelopes on what you can accomplish regardless of how energetic/agently you pursue self improvement (and this can apply across a range of ability - although it seems cruel and ludicrous to suggest someone with severe cognitive impairment can master calculus, it also seems misguided to suggest someone in middle age can become a sports star if they really go for it). As you say, taking growth mindset 'too far' has a dark side in that we might start thinking that people fail or struggle because they aren't trying hard enough (generally a fault which we morally criticise) rather than lacking the ability (generally 'blameless').

But I'd venture a broader criticism of growth mindset which applies both to the 'motte' form sketched above and to its sincere use in the rationalist community - that we shouldn't only 'not take it too far', but not take it anywhere at all:

1) Growth mindset as expounded by Dweck and colleagues has not weathered replication well. The most recent systematic reviews give extremely minor effects on achievement (r=0.1) and even smaller intervention effects (d=0.08). The authors of the meta-analysis are about as sceptical as I am about whether these residual effects are real, but even if real they are extremely minor compared to more traity things (you can get much better prediction of academic achievement by genotyping than by assessing growth mindset).

2) There's a natural story of reverse causation, which also applies to the closely related 'internal versus external locus of control'. If you're smart and living in propitious circumstances, you may be right in thinking "I can get good at this if I really try" for many different things. If you lack this good fortune, your belief "Even if I try really hard at something, for most somethings I probably won't develop mastery (or even competence)" might be a case of accurate and laudable insight.

3a) I can think of more than a few occasions in my life where the latter was better for me. One was when I was contemplating what subjects to keep before I went to university, and I had discussions with various teachers along the lines of, "You're good but not exceptional at this, maybe think about something else?" Or (in medical school) a conversation along the lines of, "You have dysgraphia, which probably makes you somewhat weaker at fine manual dexterity. Ophthalmology requires really good fine manual dexterity, so maybe this isn't the specialty for you."

3b) It also seemed to serve me better when I couldn't circumvent my limitations by picking a different line of work. I focused especially hard on training myself to perform practical procedures because I realised I was working from a disadvantage, and so had to try harder to be satisfactory (I maintained no illusions of becoming great at it).

3c) My impression is that conversations (or thinking) like this tend to be more emotionally difficult than more aspirational, "Don't worry, you can do it!" exhortations. So I'd guess they are probably under-supplied from their optimum.

In essence, there's an underlying empirical topic which 'growth mindset' relies upon: that a lot of whether one accomplishes something or not depends on mindset or attitude. The answer to that, as best as I can tell, is that this isn't really true: we live in a world with the uncomfortable feature that which tickets one draws from the genetic lottery, birthplace lottery, and early environment lottery (etc.) determine the broad strokes of one's life far more than particular efforts of will and mindset (and growth mindset in particular, which seems to have slim-to-no effect). Many things which are possible for someone are not possible for us, no matter what we do, and no matter how hard we try.

Then there's a prudential question of whether (even if it isn't true) it would be better to act and believe as growth mindset would suggest. Again, it doesn't seem so: the evidence for mindset interventions working is slim to none, and insofar as one can survey anecdata, my impression is that 'anti-growth mindset' advice is undersupplied relative to its importance.

It is inarguable one should often persevere in work to improve oneself, not give up 'too soon', and encourage others trying to do the same. Yet there are times when it is better to recognise the limits of one's abilities, to cut one's losses, and to shoulder the burden of (if one believes it to be the case) telling someone they should quit something because they 'don't have what it takes'. The right judgement in these cases is a matter for practical wisdom. Insofar as growth mindset as it is preached (but also as it is practised) biases us more to the former sort of behaviour, it should be resisted.

Comment by Thrasymachus on The Berkeley Community & The Rest Of Us: A Response to Zvi & Benquo · 2018-05-22T09:23:19.781Z · LW · GW
A healthy topology of the field should have approximately power-law distribution of hub sizes. This should be true also for related research fields we are trying to advance, like AI alignment or x-risk. If the structure is very far from that (e.g. one or two very big hubs, than nothing, than a lot of two orders of magnitude smaller groups fighting for mere existence), the movement should try to re-balance, supporting growth of medium-tier hubs.

Although my understanding of network science is abecedarian, I'm unsure both whether this feature is diagnostic (i.e. whether divergence from power-law distributions should be a warning sign) and whether we in fact observe overdispersion even relative to a power law. The latter first.

1) 'One or two big hubs, then lots of very small groups' is close to what a power law distribution should look like. If anything, it's plausible the current topology doesn't look power-lawy enough. The EA community overlaps with the rationalist community and has somewhat better data on topology: if anything, its hub sizes are pretty even. This also agrees with my impression: although the Bay Area can be identified as the biggest EA hub, there are similar or at least middle-sized hubs elsewhere (Oxford, Cambridge (UK), London, Seattle, Berlin, Geneva, etc. etc.) If we really thought a power law topology was desirable, there's a plausible case to push for centralisation.

The closest I could find to a 'rationalist survey' was the SSC survey, which again has a pretty 'full middle', and not one or two groups ascendant. That said, I'd probably defer to others impressions here as I'm not really a rationalist and most of the rationalist online activity I see does originate from the bay. But even if so, this impression wouldn't worry us if we wanted to see a power law here.

2) My understanding is there are a few generators of power law distributions. One is increasing returns to scale (e.g. cities being more attractive to live in the larger they are, ceteris paribus), another is imperfect substitution (why listen to an okay pianist when I can have a recording of the world's best?), a third could be positive feedback loops or Matthew effects (maybe 'getting lucky' with a breakout single increases my chance of getting noticed again, even when controlling for musical ability versus the hitless).

There are others, but many of these generators are neutral, and some should be welcomed. If there's increasing marginal returns to rationalist density, inward migration to central hubs seems desirable. Certain 'jobs' seem to have this property: a technical AI researcher in (say) Japan probably can have greater EV working in an existing group (most of which are in the bay) rather than trying to seed a new AI safety group in Japan. Ditto if the best people in a smaller hub migrate to contribute to a larger one (although emotions run high, I don't think calling this 'raiding' is helpful - the people who migrate have agency).

[3) My hunch is that what might be going on is that the 'returns' are sigmoid, and so are diminishing with new entrants to the Bay Area. 'Jobs'-wise, it is not clear the Bay Area is the best place to go if you aren't going to work on AI research (and even if so, this is a skill set that is rare in absolute terms amongst rationalists). Social-wise, there's limited interaction bandwidth, especially among higher status folks, and so the typical rationalist who goes to the bay won't get the upside of the most desirable bits of Bay Area social interaction - when weighed against the transaction costs, staying put and fostering another hub might look better.]

(I echo Chris's exhortation)

Comment by Thrasymachus on [deleted post] 2018-03-21T21:55:31.062Z

+1

It also risks a backfire effect. If one is in essence a troll happy to sneer at what rationalists do regardless of merit (e.g. "LOL, look at those losers trying to LARP enders game!"), seeing things like Duncan's snarky parenthetical remarks would just spur me on, as it implies I'm successfully 'getting a rise' out of the target of my abuse.

It seems responses to criticism that is unpleasant or uncharitable are best addressed specifically to the offending remarks (if they're on LW2, this seems like pointing out the fallacies/downvoting as appropriate), or just ignored. More broadcast admonishment ("I know this doesn't apply to everyone, but there's this minority who said stupid things about this") seems unlikely to marshal a corps of people who will act together to defend conversational norms, and more likely to produce bickering and uncertainty about whether or not one is included in this 'bad fraction'.

(For similar reasons, I think amplifying rebuttals along the lines of, "You're misinterpreting me, and people not interpreting others correctly is one of the key problems with the LW community" seems apt to go poorly - few want to be painted as barbarians at the gates, and it prompts those otherwise inclined to admit their mistake to instead double down or argue the case further.)

Comment by Thrasymachus on [deleted post] 2018-03-20T21:34:15.429Z
I also think I got things about right, but I think anyone else taking an outside view would've expected roughly the same thing.

I think you might be doing yourself a disservice. I took the majority of contemporary criticism to be directed more towards (in caricature) 'this is going to turn into a nasty cult!' than towards (what I took your key insight to be) 'it will peter out because the commander won't actually have the required authority'.

So perhaps the typical 'anyone else' would have alighted on the wrong outside view, or at least the wrong question to apply it to ('How likely would a group of rationalists end up sustaining a 'dictator' structure?', rather than 'Conditional on having this structure, what would happen next?')

Comment by Thrasymachus on [deleted post] 2018-03-20T21:16:01.417Z

Bravo - I didn't look at the initial discussion, or I would have linked your pretty accurate looking analysis (on re-skimming, Deluks also had points along similar lines). My ex ante scepticism was more a general sense than a precise pre-mortem I had in mind.

Comment by Thrasymachus on [deleted post] 2018-03-19T14:59:56.430Z

Although I was sufficiently sceptical of this idea to doubt it was 'worth a shot' ex ante,(1) I was looking forward to being pleasantly surprised ex post. I'm sorry to hear it didn't turn out as well as hoped. This careful and candid write-up should definitely be included on the 'plus' side of the ledger for this project.

With the twin benefits of no skin in the game and hindsight, I'd like to float another account which may synthesize a large part of 'why it didn't work'.

Although I understand DAB wasn't meant to simply emulate 'military style' living, it did borrow quite a lot of that framing (hence the 'army' and 'barracks' bit of DAB). Yet these arrangements require a considerable difference in power and authority between the commander and their subordinates. I don't think DAB had this, and I suggest this proved its downfall.

[R]ight, as the world goes, is only in question between equals in power, while the strong do what they can and the weak suffer what they must
Thucydides - The Melian Dialogue

First, power. Having a compelling 'or else' matters for maintaining discipline - even if people come in planning to obey, they may vacillate. In military contexts, accommodation at a barracks (or a boot camp) is a feature that can be unilaterally rescinded - if I argue back against the drill instructor, even if I'm in the right, they can make the credible threat "Either do as I say, or I kick you out". This threat has great 'downside-asymmetry': it's little skin off the instructor's nose if he boots me, but it means the end of my military career if they follow through. In consequence, the instructor has a lot more de facto bargaining power to make me do things I don't want to do. In negotiation theory, the drill sergeant's BATNA is way better than mine - so much better that the only 'negotiation' they use is threats rapidly escalating to expulsion for any disobedience to their dictation.

I assume the legal 'fact on the ground' is that the participants of DAB were co-signatories on a lease, making significant financial contributions, with no mechanism for the designated 'commander' to kick people out unilaterally. Given this realpolitik, whatever 'command' the commander has in this situation is essentially roleplay. If a 'Red Knight' character decides not to play ball with the features of DAB that go beyond typical legal arrangements of house-sharing (e.g. "Nah, I'm not going to do this regular communal activity, I'm not going to do press-ups because you're telling me to, and I'm going to do my own thing instead."), the commander can't 'expel' them, but at most ask them to leave.

In fact the commander can only credibly threaten to expel subordinates from the social game of Dragon Army. If I'm a participant this might be a shame, but it might not be a great cost if I'm not feeling it anymore: I can probably (modulo some pretty dodgy social ostracising which other participants may not abide by) still interact with the remaining dragons in a normal housemate-y way.

Such a situation has pretty symmetrical downsides for both parties: if I'm the Red Knight, I probably would end up leaving (it wouldn't be fun being the 'odd man out', although the difficulty is shared by the dragons who have people in the house who aren't 'buying in', and the commander dealing with the ever-present example of the pretty mild consequence of what happens if others likewise disobey), but not until I'd lined up better accommodation, and I certainly wouldn't feel any obligation - given you're kicking me out rather than I'm choosing to leave - to pay rent after I go or help find others to 'fill my spot' to defray the proportionally increased cost of the lease (itself an incentive for other subordinates to pressure the commander to relent).

Instead of the boot camp case, where the much inferior BATNA of recruits gives them no negotiating position with their drill sergeant, in the DA case the roughly equal BATNAs mean the commander and a dragon are at near-parity (whatever affectations both maintain to the contrary). The stage is set for negotiation over discipline (even if coded or implicit), rather than dictation from the commander to the subordinate.

Authority (in the informal or moral sense, c.f. Weberian 'charismatic authority') can partly - though imperfectly - substitute for power (e.g. doctors and clergy tend to have more 'influence' than 'ordering people around'). What is notable is that authority like this tends to be linked to some particular concrete achievement or track record, and is mainly constrained to a particular domain, with only mild 'bleed through' into general social status.

Given the nature of DAB, there is no convincing concrete thing a putative-commander could point to which legitimizes broad-ranging authority over others, beyond assent by the putative subordinates that this is an experiment they want to try. I suspect this results in pseudo-authority and obedience highly conditional on results: "I'm going to act as-if I am much lower status than this 'commander', and so kowtow towards them. But in terms of real status they're a peer to me, and I spot them this loan of social game status because I may benefit. If this doesn't happen, I'm less inclined to play along." The limited capital this supplies doesn't give the commander much to spend on pushing things through despite dissent: as the matters are between equals, 'right' becomes the question again.

I suggest this may explain the need for a better off-ramp (I note military or religious organisations which emphasize discipline and obedience generally provide accommodation etc. gratis, perhaps in part to preserve the power that comes from initiates being there by invitation - I wonder whether reciprocal arrangements with another house substitute adequately); the reluctance to 'give enough orders' (better to maintain a smaller stake of authority than gamble a wider bid that could lose lots of face if flouted); the lack of a 'big enough stick' ("I'm happy to undergo token punishment - to lose pretend status - as a ritual for the sake of the game, but if the price of you 'disciplining' me costs real status, I'm out"); the need to avoid transgressing what less bought-in subordinates wanted (perhaps the result of a coded and implicit negotiation suggested earlier); and the shortfall between the ostensible standards asserted (set by aspiration), and those observed (set by the balance of power). One could imagine recognition of these features may have been key in transforming 'White Knights' into 'Black Knights'.

I'd question, if this is the (forgive me) crux of the problem, whether it can really be fixed. Perhaps even more assiduous selection could do the trick, yet the base rate in the Bay Area (which isn't renowned for selecting people easygoing with authority) is not propitious. An alternative - but an exorbitantly expensive one - which would supply the commander with unquestioned power/authority would be if rent were free for subordinates: "You're here for free, so I can kick you out if you don't play ball". I suspect a more modest 'Phase II' (maybe just social norms to do communal activity, with no pretended 'boss') might be the best realistic iteration.

-

1: It also seemed a bit risky from an 'abuse' (very broadly construed) perspective to me, although I stress this worry arose solely from the set-up rather than any adverse judgement of Duncan personally. Although it is understandable to be snarky to critics (e.g. one despicable pseudonymous commenter on the original 'pitch' recommended Duncan kill himself), I think the triumphalism of, "Although it didn't work, the 'not working' consisted of not achieving all the good targets, rather than something horrible happening to someone - guess this wasn't a hotbed of abuse after all, and I'm not some abusive monster like you said! Eat humble pie! (but you won't, because you're irrational!)" is a bit misplaced.

The steelman of this is something like, "Although I don't think it blowing up really badly is more likely than not, the disjunction of bad outcomes along these lines is substantially higher than the base rate for a typical group-house, and this isn't outweighed across the scales by the expected benefit". On this view the observation that the proposed risk wasn't realised is only mild disconfirmation of the hazard, although it rules strongly against outlandish 'disaster by default' views.

[Edit: Slight rewording, concision, and I remembered the theory I was gesturing at imprecisely]

Comment by Thrasymachus on Comments on Power Law Distribution of Individual Impact · 2018-03-07T15:44:28.499Z · LW · GW

This new paper may be of relevance (H/T Steve Hsu). The abstract:

The largely dominant meritocratic paradigm of highly competitive Western cultures is rooted on the belief that success is due mainly, if not exclusively, to personal qualities such as talent, intelligence, skills, efforts or risk taking. Sometimes, we are willing to admit that a certain degree of luck could also play a role in achieving significant material success. But, as a matter of fact, it is rather common to underestimate the importance of external forces in individual successful stories. It is very well known that intelligence or talent exhibit a Gaussian distribution among the population, whereas the distribution of wealth - considered a proxy of success - follows typically a power law (Pareto law). Such a discrepancy between a Normal distribution of inputs, with a typical scale, and the scale invariant distribution of outputs, suggests that some hidden ingredient is at work behind the scenes. In this paper, with the help of a very simple agent-based model, we suggest that such an ingredient is just randomness. In particular, we show that, if it is true that some degree of talent is necessary to be successful in life, almost never the most talented people reach the highest peaks of success, being overtaken by mediocre but sensibly luckier individuals. As to our knowledge, this counterintuitive result - although implicitly suggested between the lines in a vast literature - is quantified here for the first time. It sheds new light on the effectiveness of assessing merit on the basis of the reached level of success and underlines the risks of distributing excessive honors or resources to people who, at the end of the day, could have been simply luckier than others. With the help of this model, several policy hypotheses are also addressed and compared to show the most efficient strategies for public funding of research in order to improve meritocracy, diversity and innovation.
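
(The following is not the paper's actual model - just a toy sketch, with parameters I made up, of the general mechanism the abstract describes: roughly Gaussian 'talent' plus repeated multiplicative luck shocks yields a heavy-tailed 'success' distribution in which the most successful agents are rarely the most talented:)

```python
# Toy illustration only: Gaussian talent + multiplicative random luck events.
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_rounds = 10_000, 80

talent = np.clip(rng.normal(0.6, 0.1, n_agents), 0.0, 1.0)
capital = np.full(n_agents, 10.0)

for _ in range(n_rounds):
    roll = rng.random(n_agents)
    exploits = rng.random(n_agents) < talent      # talent converts lucky breaks
    capital[(roll < 0.05) & exploits] *= 2.0      # lucky event
    capital[roll > 0.95] /= 2.0                   # unlucky event

richest = np.argsort(capital)[::-1][:10]
print("mean/median/max capital:", capital.mean(), np.median(capital), capital.max())
print("talent of 10 richest:", np.round(talent[richest], 2))
print("max talent overall:  ", round(float(talent.max()), 2))
```

Typically the capital distribution comes out heavily right-skewed while the top earners have merely above-average talent, which is the abstract's point; none of this vouches for the paper's own implementation or conclusions.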

Comment by Thrasymachus on Meta-tations on Moderation: Towards Public Archipelago · 2018-02-25T19:57:22.909Z · LW · GW

I endorse Said's view, and I've written a couple of frontpage posts.

I also add that I think Said is a particularly able and shrewd critic, and I think LW2 would be much poorer if there was a chilling effect on his contributions.

Comment by Thrasymachus on [Meta] New moderation tools and moderation guidelines · 2018-02-18T23:45:03.991Z · LW · GW

Let's focus on the substance, please.

Comment by Thrasymachus on [Meta] New moderation tools and moderation guidelines · 2018-02-18T16:25:36.952Z · LW · GW

I'm also mystified as to why traceless deletion/banning are desirable properties to have on a forum like this. But (with apologies to the moderators) I think consulting the realpolitik will spare us the futile task of litigating these issues on the merits. Consider it instead a fait accompli with the objective of attracting a particular writer LW2 wants by catering to his whims.

For whatever reason, Eliezer Yudkowsky wants the ability to block commenters and to tracelessly delete comments on his own work, and he's been quite clear this is a condition for his participation. Lo and behold, precisely these features have been introduced, with suspiciously convenient karma thresholds which allow EY (at his current karma level) to tracelessly delete/ban on his own promoted posts, yet exclude (as far as I can tell) the great majority of other writers with curated/front page posts from being able to do the same.

Given the popularity of EY's writing (and that LW2 wants to include future work of his), the LW2 team are obliged to weigh the (likely detrimental) addition of these features against the likely positives of his future posts. Going for the latter is probably the right judgement call to make, but let's not pretend it is a principled one: we are, as the old saw goes, just haggling over the price.

Comment by Thrasymachus on How the LW2.0 front page could be better at incentivizing good content · 2018-01-21T23:26:48.735Z · LW · GW

FWIW, I struggle to navigate the front page to look at good posts (I struggle to explain why - I think I found 'frontpage etc.' easier for earlier versions). What I do instead is look at the comments feed and click through to articles that way, which seems suboptimal, as lots of comments may not be a very precise indicator of quality.

Comment by Thrasymachus on Kenshō · 2018-01-21T02:37:46.828Z · LW · GW

FWIW, this aptly describes my own adverse reaction to the OP. "Not only do I have this great insight I can't explain to you, but I'm going to spend the balance of my time explaining why you couldn't understand it if I tried" sounds awfully close to bulveristic stories like, "If only you weren't blinded by sin, you too would see the glory of the coming of the lord".

That the object level benefits offered seem to be idiographic self-exaltations augurs still poorer (i.e. I cut through confusion so much more easily now (no examples provided); I have much greater reserves to do stuff; I can form much deeper pacts with others who, like I, can See the Truth.) I recall the 'case' for Anders' Connection Theory being of a similar type, but connection theory at least sketched something like a theory to consider on its merits.

There needs to be either some object-level description (i.e. "This is what Looking is"), or - if that really isn't possible - demonstration of good results (i.e. "Here's a great post on a CFAR-adjacent topic, and this was thanks to Looking.") Otherwise, the recondite and the obscurantist look very much alike.

Comment by Thrasymachus on Comments on Power Law Distribution of Individual Impact · 2017-12-29T08:43:17.238Z · LW · GW

I was unaware of the range restriction, which could well compress SD. That said, if you take the '9' scorers as '9 or more', then you get something like this (using 20-25)

Mean value is around 7 (6.8), and 7% get 9 or more, suggesting 9 is at or around +1.5SD assuming normality. So when you get a sample size in the thousands, you should start seeing scores at 11 or so (+3SD) - I wouldn't be startled to find Ben has this level of ability. But scores at (say) 15 or higher (+6SD) should only be seen extraordinarily rarely.

If you use log-normal assumptions, you should expect something like: if +1.5SD is 2 above the mean, +3SD is around 6 above (i.e. a score of ~13), and +4.5SD would give scores of 21 or so.

An unfortunate challenge at picking at the tails here is one can train digit span - memory athletes drill this and I understand the record lies in the three figures.

Perhaps a natural test would be getting very smart but training-naive people (IMOers?) to try this. If they're consistently scoring 15+, this is hard to reconcile with normalish assumptions (digit span wouldn't correlate perfectly with mathematical ability, so lots of 6-sigma+ results would look weird), and vice versa.

Comment by Thrasymachus on Comments on Power Law Distribution of Individual Impact · 2017-12-29T07:36:25.022Z · LW · GW

I'm aware of normalisation, hence I chose things which have some sort of 'natural cardinal scale' (i.e. 'how many Raven's do you get right' doesn't really work, but 'how many things can you keep in mind at once' is better, albeit imperfect).

Not all skew entails a log-normal (or some similar - assumedly heavy tailed) distribution. This applies to your graph for digit span you cite here. The mean of the data is around 5, and the SD is around 2. Having ~11% at +1SD (7) and about 3% at +2SD (9) is a lot closer to normal distribution land (or, given this is count data, a pretty well-behaved poisson/slightly overdispersed binomial) than a hypothetical log normal. Given log normality, one should expect a dramatically higher maximum score when you increase the sample size from 78 in the cited study to 2400 or so. Yet in the standardization sample of the WAIS III of this size no individual had greater than 9 in forward digit span (and no one higher than 8 in reverse). (This is, I assume, the foundation for the famous '7 plus or minus 2' claim.)

http://www.sciencedirect.com/science/article/pii/S0887617701001767#TBL2
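
(A rough check of the argument, using the numbers above: match a normal and a log-normal to a mean of 5 and SD of 2 - the moment-matching is my own assumption - and ask what top score each would predict in a standardisation sample of ~2,400:)

```python
# Compare the top forward digit span a normal vs a moment-matched log-normal
# would predict in a sample of ~2400, given mean 5 and SD 2.
import numpy as np
from scipy import stats

mean, sd, n = 5.0, 2.0, 2400
q = 1 - 1 / n  # roughly the quantile of the sample maximum

normal_top = stats.norm(mean, sd).ppf(q)

sigma2 = np.log(1 + (sd / mean) ** 2)          # log-normal matched on mean/SD
mu = np.log(mean) - sigma2 / 2
lognormal_top = stats.lognorm(s=np.sqrt(sigma2), scale=np.exp(mu)).ppf(q)

print(f"normal predicts a top score near     {normal_top:.1f}")    # ~12
print(f"log-normal predicts a top score near {lognormal_top:.1f}")  # ~17
```

Neither fits discrete count data exactly, but the log-normal's predicted ceiling sits much further from the observed maximum of 9, which is the point above.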

A lot turns on 'vary dramatically', but I don't think this would qualify on most commonsense uses of the phrase. I'd take reaction time data to be similar - although there is a 'long tail', this is a long tail of worse performance, and the tail isn't that long. So I don't buy claims I occasionally see made along the lines of 'Einstein was just miles smarter than a merely average physicist'.

Comment by Thrasymachus on In defence of epistemic modesty · 2017-11-03T00:44:29.582Z · LW · GW

Sorry you disliked the post so much. But you might have liked it more had you looked at the bit titled 'community benefits to immodesty', where I talk about the benefits of people striking out beyond expert consensus (even if, while they should act 'as if' their contra-expert take were correct, they should nonetheless defer to the consensus for their 'all things considered' views).

Comment by Thrasymachus on In defence of epistemic modesty · 2017-10-30T19:17:42.002Z · LW · GW

No. I chose him as a mark of self-effacement. When I was younger I went around discussion forums about philosophy, and people commonly named themselves after ancient greats like Socrates, Hume, etc. Given Thrasymachus's claim to fame is being rude, making some not-great arguments, and getting spanked by Socrates before the real discussion started (although I think most experts consider Socrates's techne-based reply pretty weak), I thought he would be a more accurate pseudonym.

Comment by Thrasymachus on Contra double crux · 2017-10-15T15:07:31.744Z · LW · GW

Sorry for misreading your original remark. Happy to offer the bet in conditional form, i.e.:

Conditional on CFAR producing results of sufficient quality for academic publication (as judged by someone like Christiano or Karnofsky) these will fail to demonstrate benefit on a pre-specified objective outcome measure

Comment by Thrasymachus on Contra double crux · 2017-10-14T23:20:10.371Z · LW · GW

Thanks for your reply. Given my own time constraints I'll decline your kind offer to discuss this further (I would be interested in reading some future synthesis). As consolation, I'd happily take you up on the modified bet. Something like:

Within the next 24 months CFAR will not produce results of sufficient quality for academic publication (as judged by someone like Christiano or Karnofsky) that demonstrate benefit on a pre-specified objective outcome measure

I guess 'demonstrate benefit' could be stipulated as 'p<0.05 on some appropriate statistical test' (the pre-specification should get rid of the p-hacking worries). 'Objective' may remain a bit fuzzy: the rider is meant to rule out self-report stuff like "Participants really enjoyed the session/thought it helped them". I'd be happy to take things like "Participants got richer than controls", "CFAR alums did better on these previously used metrics of decision making", or whatever else.

Happy to discuss further to arrive at agreeable stipulations - or, if you prefer, we can just leave them to the judge's discretion.

Comment by Thrasymachus on Contra double crux · 2017-10-14T21:45:08.879Z · LW · GW

I hope readers will forgive a 'top level' reply from me, its length, and that I plan to 'tap out' after making it (save for betting). As pleasant as this discussion is, other demands pull me elsewhere. I offer a summary of my thoughts below - a mix of dredging up points I made better 3-4 replies deep than I managed in the OP, and replies to various folks at CFAR. I'd also like to bet (I regret declining Eli's offer, for reasons that will become apparent, but I hope to make some agreeable counter-offers).

I persist in three main worries: 1) that double crux (or 'cruxes' simpliciter) is a confused concept; 2) that it doesn't offer anything above 'strong consideration', and insofar as it is not redundant, framing things in 'cruxes' harms epistemic practice; 3) that the evidence CFAR tends to fall back upon to nonetheless justify the practice of double crux is so undermined that it is not only inadequate public evidence, but inadequate private evidence for CFAR itself.

The colloid, not crystal, of double crux

A common theme in replies (and subsequent discussions) between folks at CFAR and I is one of a gap in understanding. I suspect 'from their end' (with perhaps the exception of Eli) the impression is I don't quite 'get it' (or, as Duncan graciously offers, maybe it's just the sort of thing that's hard to 'get' from the written-up forms): I produce sort-of-but-not-quite-there simulacra of double crux, object to them, but fail to appreciate the real core of double crux to which these objections don't apply. From mine, I keep trying to uncover what double crux is, yet can't find any 'hard edges': it seems amorphous, retreating back into other concepts when I push on where I think it is distinct, yet flopping out again when I turn to something else. So I wonder if there's anything there at all.

Of course this seeming 'from my end' doesn't distinguish between the two cases. Perhaps I am right that double crux is no more than some colloid of conflated and confused concepts; but perhaps instead there is a crystallized sense of what double crux is 'out there' that I haven't grasped. Yet what does distinguish these cases in my favour is that CFAR personnel disagree with one another about double crux.

For a typical belief which one might use double crux (or just 'single cruxing') should one expect to find one crux, or find multiple cruxes?

Duncan writes (among other things on this point):

The claim that I derive from "there's surprisingly often one crux" is something like the following: that, for most people, most of the time, there is not in fact a careful, conscious, reasoned weighing and synthesis of a variety of pieces of evidence. [My emphasis]

By contrast, Dan asserts in his explanation:

A typical belief has many cruxes. For example, if Ron is in favor of a proposal to increase the top marginal tax rate in the UK by 5 percentage points, his cruxes might include "There is too much inequality in the UK", "Increasing the top marginal rate by a few percentage points would not have much negative effect on the economy", and "Spending by the UK government, at the margin, produces value". [my emphasis]

This doesn't seem like a minor disagreement, as it flows through to important practical considerations. If there's often one crux (but seldom more), once I find it I should likely stop looking; if there are often many cruxes, I should keep looking after I find the first.

What would this matter, beyond some 'gotcha' or cheap point-scoring? This: I used to work in public health, and one key area is evaluation of complex interventions. Key to this in turn is trying to understand both that the intervention works and how it works. The former without the latter introduces a troublesome black box: maybe the elaborate, high-overhead model for your intervention works through some much simpler causal path (c.f. that many schools of therapy with mutually incompatible models are in clinical equipoise, but appear also in equipoise with 'someone sympathetic listening to you'); maybe you mistake the key ingredient as intrinsic to the intervention where it is instead contingent on the setting, so it doesn't work when this is changed (c.f. the external validity concerns that plague global health interventions).

In CFAR's case there doesn't seem to be a shared understanding of the epistemic landscape (or, at least, where cruxes lie within it) between 'practitioners'. It also looks to me like there's not a shared understanding on the 'how it works' question - different accounts point in different directions: Vaniver seems to talk more about 'getting out of trying-to-win-the-argument mode into getting-to-the-truth mode', Duncan emphasizes more the potential rationalisations one may have for a belief, Eli suggests it may help locate cases where we differ in framing/fundamental reasons rather than in more proximal ones (i.e. the 'earth versus moon' hypothetical). Of course, it could do all of these, but I don't think CFAR has a way to tell. Finding the mediators would also help buttress claims of causal impact.

Cruxes contra considerations

I take it as 'bad news' for an idea, whatever its role, if one can show a) it is a proposed elaboration of another idea, and b) this elaboration makes the idea worse. I offer an in-theory reason to think 'cruxes' are inapt elaborations of 'considerations', a couple of considerations as to why 'double crux' might degrade epistemic practice, and a bet that people who are 'double cruxing' (or just 'finding cruxes') are often not in fact using cruxes.

Call a 'consideration' something like this:

A consideration for some belief B is another belief X such that believing X leads one to assign a higher credence to B.

This is (unsurprisingly) broad, including stuff like 'reasons', 'data' and the usual fodder for bayesian updating we know and love. Although definitions of a 'crux' slightly vary, it seems to be something like this:

A crux for some belief B is another belief C such that if one did not believe C, one would not believe B.

Or:

A crux for some belief B is another belief C such that if one did not believe C, one would change one's mind about B.

'Changing one's mind' about B is not ultra-exact, but nothing subsequent turns on this point (one could just encode B in the first formulation as 'I do not change my mind about another belief (A)', etc.).

The crux rule

I said in a reply to Dan that, given this idea of a crux, a belief should be held no more strongly than its (weakest) crux (call this the 'crux rule'). He expressed uncertainty about whether this was true. I hope this derivation is persuasive:

¬C -> ¬B (i.e. if I don't believe C, I don't believe B - or, if you prefer, if I don't believe the crux, I should not 'not change my mind about' B)

So:

B -> C (i.e. if I believe B, I must therefore believe C).

If B -> C, P(C) >= P(B): there is no possibility C is false yet B is true, yet there is a possibility where C is true and B is false (compare modus tollens to affirming the consequent).

So if C is a crux for B, one has inconsistent credences if one offers a higher credence for B than for C. An example: suppose I take "Increasing tax would cause a recession" as a crux for "Increasing taxes is bad" - if I thought increasing taxes would not cause a recession, I would not think increasing taxes is bad. Suppose my credence for raising taxes being bad is 0.9, and my credence for raising taxes causing a recession is 0.6. I'm inconsistent: if I assign a 40% chance raising taxes would not cause a recession, I should think there's at least a 40% chance raising taxes would not be bad, not 10%.

(In a multi-crux case with cruxes C1-n for B, the above argument applies to each of C1-n, so B's credence must not be higher than any of them, and thus must be equal to or lower than the lowest. Although this is a bound, one may anticipate B's credence to be substantially lower, as the probability of the conjunction of (mostly) independent cruxes approximates P(C1)*P(C2) and so on.)
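
(The same derivation in one place, in symbols - nothing here beyond the steps above:)

```latex
\begin{align*}
\neg C \rightarrow \neg B \;&\iff\; B \rightarrow C
  && \text{(contraposition)}\\
B \rightarrow C \;&\implies\; P(B) \le P(C)
  && \text{(no world where $B$ holds but $C$ fails)}\\
\text{with cruxes } C_1,\dots,C_n:\quad
  P(B) \;&\le\; P(C_1 \wedge \dots \wedge C_n) \;\le\; \min_i P(C_i),
  && \text{and } P(C_1 \wedge \dots \wedge C_n) \approx \textstyle\prod_i P(C_i) \text{ if (mostly) independent.}
\end{align*}
```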

Note this does not apply to considerations, as there's no neat conditional parsing of 'consideration' in the same way as 'crux'. This also agrees with common sense: imagine some consideration X one is uncertain of which nonetheless favours B over ¬B: one can be less confident of X than of B.

Why belabour this logic and probability? Because it offers a test of intervention fidelity: whether people who are 'cruxing' are really using cruxes. Gather a set of people one takes as epistemically virtuous who 'know how to crux', and have them find cruxes for some of their beliefs. Then ask them to offer their credences for both the belief and the crux(es) for the belief. If they're always finding cruxes, there will be no cases where they offer higher credence for the belief than for its associated crux(es).

I aver the actual proportion of violations of this 'crux rule' will be at least 25%. What (epistemically virtuous) people are really doing when finding 'cruxes' is identifying strong considerations which they think gave them large updates toward B over ¬B. Despite this, they will often find their credence in the belief is higher than in the supposed crux. I might think the argument from evil is the best consideration for atheism, but I may also hold that a large number of other considerations point in favour of atheism, such that together they make me more confident of atheism than of the soundness of the argument from evil. Readers (CFAR alums or not) can 'try this at home'. For a few beliefs, 'find your cruxes'. Now offer credences for these - how often do you need to adjust these credences to obey the 'crux rule'? Do you feel closer to reflective equilibrium when you do so?

Even if CFAR can't bus in some superforecasters or superstar philosophers to try this on, they can presumably do this with their participants. I offer the following bet (and happy to haggle over the precise numbers):

(5-1 odds [i.e. favouring you].) From any n cases of beliefs and associated cruxes for CFAR alums/participants/any other epistemically virtuous group whom you deem to 'know cruxing', more than n/4 cases will violate the crux rule.
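
(A sketch of how the tally could be scored - the data format and names are mine; the rule being checked is just the one above:)

```python
# Count how often an elicited belief credence exceeds its weakest stated crux.
def violates_crux_rule(belief: float, cruxes: list[float]) -> bool:
    """True if the belief is held more strongly than its weakest crux."""
    return belief > min(cruxes)

# Hypothetical elicited credences: (P(belief), [P(crux_1), P(crux_2), ...])
elicited = [
    (0.90, [0.60]),         # violation: belief outruns its single crux
    (0.70, [0.80, 0.75]),   # consistent
    (0.80, [0.90, 0.80]),   # consistent (equal to the weakest crux is allowed)
]

violations = sum(violates_crux_rule(b, cs) for b, cs in elicited)
print(f"{violations}/{len(elicited)} cases violate the crux rule")
```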

But so what? Aren't CFAR folks already willing to accept that 'crux' often (in Vaniver's words) 'degrades gracefully' into something like what I call a 'strong consideration'? Rather than castle-and-keep, isn't this more like constructing a shoddier castle somewhat nearby and knocking its walls down? To-may-to/To-mar-to?

Yet we already have words for 'things which push us towards a belief'. I used consideration, but we can also use 'reasons', or 'evidence' or whatever. 'Strong consideration' has 16 more characters than crux, but it has the benefits of its meaning being common knowledge, being naturally consonant with bayesianism, and accurately captures how epistemically virtuous people think and how they should be thinking. To introduce another term which is not common knowledge and forms either a degenerate or redundant version of this common knowledge term looks, respectfully, like bloated jargon by my lights.

If you think there's a crux, don't double crux, think again

Things may be worse than 'we've already got a better concept'. It looks plausible to me that teaching cruxes (or double crux) teaches bad epistemic practice. A contention I made in the OP is that crux incidence is anti-correlated with epistemic virtue: in topics of controversy, epistemically virtuous people usually find that the support for their beliefs is distributed over a number of considerations, without a clear 'crux', rather than finding that they would change their mind on the matter based on a single not-that-resilient consideration. Folks at CFAR seem to (mostly) agree, e.g. Duncan's remarks:

I note that, if correct, this theory would indicate that e.g. your average LessWronger would find less value in double crux than your average CFAR participant (who shares a lot in common with a LessWronger but in expectation is less rigorous and careful about their epistemics). This being because LessWrongers try very deliberately to form belief webs like the first image [many-one, small edge weights - T], and when they have a belief web like the third image [not-so-many-one, one much bigger edge - T] they try to make that belief feel to themselves as unbalanced and vulnerable as it actually is.

This suggests one's reaction on finding one has a crux should be alarm: "My web of beliefs doesn't look like what I'd expect to see from a person with good epistemics", and one's attitude towards 'this should be the crux for my belief' should be scepticism: "It's not usually the case that some controversial matter depends upon a single issue like this". It seems the best next step in such a situation is something like this: "I'm surprised there is a crux here. I should check with experts/the field/peers as to whether they agree with me that this is the crux of the matter. If they don't, I should investigate the other considerations suggested to bear upon this matter/reasons they may offer to assign lower weight to what I take to be the crux".

The meta-cognitive point is that it is important not only to get the right credences on the considerations, but also to weigh these considerations rightly to form a good 'all things considered' credence on the topic. Webs of belief that greatly overweigh a particular consideration track truth poorly even if they are accurate on what they (mis)take as the key issue. In my experience among elite cognisers, there's seldom disagreement that a consideration bears upon a given issue. Disagreement seldom occurs about the direction of that consideration either: parties tend to agree a given consideration favours one view or another. Most of the action occurs at the aggregation: "I agree with you this piece of evidence favours your view, but I weigh it less than this other piece of evidence that favours mine."

Cruxing/double crux seems to give entirely wrong recommendations. It pushes one to try to find single considerations that would change their mind, despite this usually being pathological; it focuses subsequent thinking on those considerations identified as cruxes, instead of the more important issue of whether one is weighing these considerations too heavily; it celebrates when you and your interlocutor agree on the crux of your disagreement, instead of cautioning such cases often indicate you've both gotten things wrong.

The plural of plausibly biased anecdote is effectively no evidence

Ultimately, the crux (forgive me) is whether double crux actually works. Suppose 'meditation' is to 'relaxing' as I allege 'crux/double crux' is to 'consideration'. Pretend all the stuff you hear about 'meditation' is mumbo-jumbo which obscures the fact that the only good 'meditation' does is prompt people to relax. This would be regrettable, but meditation would still be a good thing, even if its goodness is only parasitic on the good of relaxing. One might wonder if you could do something better than 'meditation' by focusing on the actually valuable relaxing bit, but maybe this is one of those cases where the stuff around 'meditation' is a better route to getting people to relax than targeting 'relaxing' directly. C.f. Duncan:

I think there's a third path here, which is something like "double crux may be an instrumentally useful tool in causing these admirable epistemic norms to take root, or to move from nominally-good to actually-practiced.

The evidence base for double crux (and I guess CFAR generally) seems to be something like this:

  • Lots of intelligent and reasonable people report cruxing/double crux was helpful for them. (I can somewhat allay Duncan's worry that the cases he observes might be explained by social pressure he generates - people have reported the same in conversations in which he is an ocean away).

  • Folks at CFAR observe many cases where double crux works, and although it might work particularly well between folks at CFAR (see Eli's comment), they still observe it to be handy with non-CFAR staff.

  • Duncan notes a sham-control comparison (i.e. double crux versus 'discussing the benefits of epistemic virtues').

  • Dan provides some participant data: about half 'find a double crux', and it looks like finding a disagreement, a double crux, or both was associated with a more valuable conversation.

Despite general equanimity, Duncan noted distress at the 'lack of epistemic hygiene' around looking at double crux, principally (as I read him) that of excessive scepticism from some outside CFAR. With apologies to him (and the writer of Matthew 7), I think the concern is more plausible in reverse: whatever motes blemish outsider eyes do not stop them seeing the beams blocking CFAR's insight. It's not only that outsiders aren't being overly sceptical in doubting this evidence, but that CFAR is being overly credulous in taking it as seriously as it does. Consider this:

  1. In cases where those who are evaluating the program are those involved in delivering the intervention, and they stand to benefit the better the results look, there's a high risk of bias (c.f. blinding, conflict of interest).

  2. In cases where individuals enjoy some intervention (and have often spent quite a lot of money to participate), there's a high risk of bias in their self-report (c.f. choice-supportive bias, halo effect, among others).

  3. Neither good faith nor knowledge of a potential bias risk, by themselves, helps one much to avoid that bias.

  4. Prefer hard metrics with tight feedback loops when trying to perform well at something.

  5. Try and perform some reference class forecasting to avoid getting tricked by erroneous insider views (but I repeat myself).

What measure of credulity should a rationalist mete out to an outsider group with a CFAR-like corpus of evidence? I suggest it would be meagre indeed. One can recite almost without end interventions whose promising evidence was fatally undercut by minor oversights or biases (e.g. inadequate allocation concealment in an RCT). In the class of interventions whose available evidence carries multiple, large, obvious bias risks, the central and modal member is an intervention with no impact.

We should mete out this meagre measure of credulity to ourselves: we should not, on the one hand, remain unmoved by the asseveration of a chiropractor that they 'really see it works', yet on the other take evidence of similar quality and quantity to vindicate rationality training. CFAR's case is unpersuasive public evidence. I go further: it's unpersuasive private evidence too. In the same way we take the chiropractor to be irrational if they don't almost entirely discount their first-person experience of chiropractic successes once we inform them of the various cognitive biases that undercut the evidentiary value of this experience, we should expect a CFAR instructor or alum, given what they already know about rationality, to almost entirely discount these sources of testimonial evidence when judging whether double crux works.

Yet this doesn't happen. Folks at CFAR tend to lead with this anecdata when arguing that double crux works. This also mirrors in-person conversations I have, where otherwise epistemically laudable people cite their personal experience as what convinces them of the veracity of a particular CFAR technique. What has a better chance of putting one in touch with reality about whether double crux (or CFAR generally) works is the usual set of scientific suspects: focusing on 'hard outcomes', attempting formal trials, randomisation, making results public, and so forth. That this generally hasn't happened across the time of CFAR's operation I take to be a red flag.

For this reason I respectfully decline Eli's suggestion to make bets on whether CFAR will 'stick with' double crux (or something close to it) in the future. I don't believe CFAR's perception of what is working will track the truth, and so whether or not it remains 'behind double crux' is uninformative for the question of whether double crux works. I am, however, willing to bet against CFAR gaining 'objective' evidence of efficacy, and in favour of the null hypothesis, as follows:

(More an error bounty than a bet - first person to claim gets £100): CFAR's upcoming "EA impact metrics report" will contain no 'objective measures' (defined somewhat loosely - an objective measure is something like "my income went up/my BMI went down/an independent third-party assessor rated the conversation as better", not things along the lines of "participants rate the workshop as highly valuable/the instructor rates conversations as more rational", etc.).

(3-to-1): CFAR will not generate in the next 24 months any peer-reviewed literature in psychology or related fields (stipulated along the lines of "either published in the 'original reports' section of a journal with impact factor >1, or presented at an academic conference").

(4-to-1): Conditional on a CFAR study getting past peer review, it will not show significantly positive effects on any objective, pre-specified outcome measure.

I'm also happy to offer bets on objective measures of any internal evaluations re. double crux or CFAR activity more broadly.

Comment by Thrasymachus on Contra double crux · 2017-10-10T00:04:29.637Z · LW · GW

I also notice that I can't predict whether you'll look at the "prioritize discussion based on the slope of your possible update combined with the other party's belief" version that I give here and say "okay, but that's not double crux" or "okay, but the motion of double crux doesn't point there as efficiently as something else" or "that doesn't seem like the right step in the dance, tho."

I regret that what I have written leaves this unclear; my answer is the former ("okay, but that's not double crux"). I say this for the following reasons.

  1. The consideration with the greatest slope need not be a crux. (Your colleague Dan seems to agree with my interpretation that a crux should be some C necessary for one's attitude over B, such that if you changed your mind about C you'd change your mind about B.)

  2. There doesn't seem to be a 'double' either: identifying the slopiest consideration regarding one's own credence doesn't seem to demand comparing this to the beliefs of any particular interlocutor to look for shared elements.

I guess (forgive me if I'm wrong) what you might say is that although what you describe may not satisfy exactly what was specified in the original introduction to double crux, that was a simplification and these are essentially the same thing. Yet I take what distinguishes double crux from related and anodyne epistemic virtues (e.g. 'focus on important, less-resilient considerations', 'don't act like a lawyer') to be the 'some C for which if ¬C then ¬B' characteristic. As I fear may be abundantly obvious, I find eliding this distinction confusing rather than enlightening: if (as I suggest) the distinguishing characteristic of double crux works neither as a good epistemic tool nor as good epistemic training, that there may be some nearby epistemic norm which does one or both of these is little consolation.

Comment by Thrasymachus on Contra double crux · 2017-10-09T23:37:35.462Z · LW · GW

Hello Dan,

I'm not sure whether these remarks are addressed 'as a reply' to me in particular. That you use the 'marginal tax rate in the UK' example I used suggests this might be meant as a response. On the other hand, I struggle to locate the particular loci of disagreement - or rather, I see in your remarks an explanation of double crux which includes various elements I believe I both understand and object to, but not reasons that argue against this belief (e.g. "you think double crux involves X, but actually it is X*, and thus your objection vanishes when this misunderstanding is resolved", "your objection to X is mistaken because Y", etc.). If this is a reply, I apologise for not getting it; if it is not, I apologise for my mistake.

In any case, I take the opportunity to concretely identify one aspect of my disagreement:

A typical belief has many cruxes. For example, if Ron is in favor of a proposal to increase the top marginal tax rate in the UK by 5 percentage points, his cruxes might include "There is too much inequality in the UK", "Increasing the top marginal rate by a few percentage points would not have much negative effect on the economy", and "Spending by the UK government, at the margin, produces value". If he thought that more inequality would be good for society then he would no longer favor increasing the top marginal rate. If he thought that increasing the top marginal rate would be disastrous for the UK economy then he would no longer favor increasing it (even if he didn't change his mind about there being too much equality). If he thought that marginal government spending was worthless or harmful then he would no longer favor increasing taxes.

This seems to imply agreement with my take that cruxes (per how CFAR sees them) have the 'if you change your mind about this, you should change your mind about that' character, and so this example has the sequence-think-esque feature that these cruxes are jointly necessary for Ron's belief (i.e. if Ron comes to reject any one of them, he should change his mind about the marginal tax rate). Yet by my lights it seems more typical that considerations like these exert weight upon the balance of reason, but not with such strength that their negation is decisive against increasing taxes (e.g. it doesn't seem crazy for Ron to think "Well, I don't think inequality is a big deal, but other reasons nonetheless favour raising taxes", or "Even though I think marginal spending by the UK government is harmful, this negative externality could be outweighed by other considerations").

I think some harder data can provide better information than litigating hypothetical cases. If the claim that a typical belief has many cruxes is right, then if one asks elite cognisers to state their credence in a belief, and then state their credences in the most crucial few considerations regarding it, the credence in the belief should only very rarely be higher than the lowest credence among those considerations. This is because if most beliefs have many (jointly necessary) cruxes, and these usually comprise at least the top few considerations, then this conjunction is necessary (but not sufficient) for believing B, and P(any one crux) >= P(conjunction of cruxes) >= P(B). In essence, one's credence in a belief should be no greater than one's credence in the weakest crux (and I would guess the credence in the conclusion of a sequence-thinking argument should generally approximate the lower value given by P(crux1)*P(crux2)*..., as these are usually fairly independent).
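
To make the arithmetic concrete, here is a minimal Python sketch with invented credences (the numbers are purely illustrative, not drawn from any survey): if the cruxes are jointly necessary and roughly independent, the credence in B is capped by the weakest crux and should sit near their product.

```python
# Sequence-think picture (toy numbers): B is believed only if every crux holds.
cruxes = {"crux_1": 0.9, "crux_2": 0.8, "crux_3": 0.7}

weakest = min(cruxes.values())   # upper bound on P(B) if the cruxes are jointly necessary
product = 1.0
for p in cruxes.values():
    product *= p                 # approximate P(B) if the cruxes are independent

print(f"P(B) <= {weakest}")      # 0.7: no more confident in B than in the weakest crux
print(f"P(B) ~= {product:.3f}")  # 0.504: roughly the product of the cruxes
```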

In contrast, if I am closer to the mark, one should fairly commonly see the credence in the belief be higher than the lowest credence among the set of important considerations. If each consideration offers a Bayesian update favouring B, a set of important considerations that support B may act together (along with other less important considerations) to increase its credence such that one is more confident of B than of some (or all) of the important considerations that support it.
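
For contrast, a minimal sketch of the weighing picture, again with invented numbers: because the consideration C favours B without being necessary for it, the law of total probability allows the credence in B to end up above the credence in C itself.

```python
# Weighing picture (toy numbers): C favours B but is not necessary for it.
p_C = 0.7               # credence in the important consideration C
p_B_given_C = 0.95      # if C holds, B looks very likely
p_B_given_not_C = 0.60  # even if C fails, other considerations still favour B

p_B = p_B_given_C * p_C + p_B_given_not_C * (1 - p_C)
print(f"P(B) = {p_B:.3f}")  # 0.845 > 0.7: more confident of B than of C
```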

I aver relevant elite cognisers (e.g. superforecasters, the philosophers I point to) will exhibit the property I suggest. I would also venture that when reasonable cognisers attempt to double crux, their credences will behave in the way I predict.

Comment by Thrasymachus on Contra double crux · 2017-10-09T22:47:32.672Z · LW · GW

Thanks for presenting this helpful data. If you'll forgive the (somewhat off-topic) question, I understand both that you are responsible for evaluating CFAR, and that you are working on a new evaluation. I'd be eager to know what this is likely to comprise, especially (see various comments) what evidence (if any) is expected to be released 'for public consumption'?