A Hill of Validity in Defense of Meaning

post by Zack_M_Davis · 2023-07-15T17:57:14.385Z · LW · GW · 118 comments

This is a link post for http://unremediatedgender.space/2023/Jul/a-hill-of-validity-in-defense-of-meaning/

If you are silent about your pain, they'll kill you and say you enjoyed it.

—Zora Neale Hurston

Recapping my Whole Dumb Story so far—in a previous post, "Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems", I told the part about how I've "always" (since puberty) had this obsessive sexual fantasy about being magically transformed into a woman and also thought it was immoral to believe in psychological sex differences, until I got set straight by these really great Sequences of blog posts by Eliezer Yudkowsky, which taught me (incidentally, among many other things) how absurdly unrealistic my obsessive sexual fantasy was given merely human-level technology [LW · GW], and that it's actually immoral not to believe in psychological sex differences given that [? · GW] psychological sex differences are actually real. In a subsequent post, "Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer", I told the part about how, in 2016, everyone in my systematically-correct-reasoning community up to and including Eliezer Yudkowsky suddenly started claiming that guys like me might actually be women in some unspecified metaphysical sense and insisted on playing dumb when confronted with alternative explanations of the relevant phenomena, until I eventually had a sleep-deprivation- and stress-induced delusional nervous breakdown.

That's not the egregious part of the story. Psychology is a complicated empirical science: no matter how obvious I might think something is, I have to admit that I could be wrong—not just as an obligatory profession of humility, but actually wrong in the real world [LW · GW]. If my fellow rationalists merely weren't sold on the thesis about autogynephilia as a cause of transsexuality, I would be disappointed, but it wouldn't be grounds to denounce the entire community as a failure or a fraud. And indeed, I did end up moderating my views compared to the extent to which my thinking in 2016–7 took the views of Ray Blanchard, J. Michael Bailey, and Anne Lawrence as received truth. (At the same time, I don't particularly regret saying what I said in 2016–7, because Blanchard–Bailey–Lawrence is still obviously directionally correct compared to the nonsense everyone else was telling me.)

But a striking pattern in my attempts to argue with people about the two-type taxonomy in late 2016 and early 2017 was the tendency for the conversation to get derailed on some variation of, "Well, the word woman doesn't necessarily mean that," often with a link to "The Categories Were Made for Man, Not Man for the Categories", a November 2014 post by Scott Alexander arguing that because categories exist in our model of the world rather than the world itself, there's nothing wrong with simply defining trans people as their preferred gender to alleviate their dysphoria.

After Yudkowsky had stepped away from full-time writing, Alexander had emerged as our subculture's preeminent writer. Most people in an intellectual scene "are writers" in some sense, but Alexander was the one "everyone" reads: you could often reference a Slate Star Codex post in conversation and expect people to be familiar with the idea, either from having read it, or by osmosis. The frequency with which "... Not Man for the Categories" was cited at me seemed to suggest it had become our subculture's party line on trans issues.

But the post is wrong in obvious ways. To be clear, it's true that categories exist in our model of the world, rather than the world itself—categories are "map", not "territory"—and it's possible that trans women might be women with respect to some genuinely useful definition of the word "woman." However, Alexander goes much further, claiming that we can redefine gender categories to make trans people feel better:

I ought to accept an unexpected man or two deep inside the conceptual boundaries of what would normally be considered female if it'll save someone's life. There's no rule of rationality saying that I shouldn't, and there are plenty of rules of human decency saying that I should.

This is wrong because categories exist in our model of the world in order to capture empirical regularities in the world itself: the map is supposed to reflect the territory, and there are "rules of rationality" governing what kinds of word and category usages correspond to correct probabilistic inferences. Yudkowsky had written a whole Sequence about this, "A Human's Guide to Words" [? · GW]. Alexander cites a post [LW · GW] from that Sequence in support of the (true) point about how categories are "in the map" ... but if you actually read the Sequence, another point that Yudkowsky pounds home over and over is that word and category definitions are nevertheless not arbitrary: you can't define a word any way you want, because there are at least 37 ways that words can be wrong [LW · GW]—principles that make some definitions perform better than others as "cognitive technology."

In the case of Alexander's bogus argument about gender categories, the relevant principle (#30 [LW · GW] on the list of 37 [LW · GW]) is that if you group things together in your map that aren't actually similar in the territory, you're going to make bad inferences.

Crucially, this is a general point about how language itself works that has nothing to do with gender. No matter what you believe about controversial empirical questions, intellectually honest people should be able to agree that "I ought to accept an unexpected [X] or two deep inside the conceptual boundaries of what would normally be considered [Y] if [positive consequence]" is not the correct philosophy of language, independently of the particular values of X and Y.

This wasn't even what I was trying to talk to people about. I thought I was trying to talk about autogynephilia as an empirical theory of psychology of late-onset gender dysphoria in males, the truth or falsity of which cannot be altered by changing the meanings of words. But at this point, I still trusted people in my robot cult to be basically intellectually honest, rather than slaves to their political incentives, so I endeavored to respond to the category-boundary argument under the assumption that it was an intellectually serious argument that someone could honestly be confused about.

When I took a year off from dayjobbing from March 2017 to March 2018 to have more time to study and work on this blog, the capstone of my sabbatical was an exhaustive response to Alexander, "The Categories Were Made for Man to Make Predictions" (which Alexander graciously included in his next links post). A few months later, I followed it with "Reply to The Unit of Caring on Adult Human Females", responding to a similar argument from soon-to-be Vox journalist Kelsey Piper, then writing as The Unit of Caring on Tumblr.

I'm proud of those posts. I think Alexander's and Piper's arguments were incredibly dumb, and that with a lot of effort, I did a pretty good job of explaining why to anyone who was interested and didn't, at some level, prefer not to understand.

Of course, a pretty good job of explaining by one niche blogger wasn't going to put much of a dent in the culture, which is the sum of everyone's blogposts; despite the mild boost from the Slate Star Codex links post, my megaphone just wasn't very big. I was disappointed with the limited impact of my work, but not to the point of bearing much hostility to "the community." People had made their arguments, and I had made mine; I didn't think I was entitled to anything more than that.

Really, that should have been the end of the story. Not much of a story at all. If I hadn't been further provoked, I would have still kept up this blog, and I still would have ended up arguing about gender with people sometimes, but this personal obsession wouldn't have been the occasion of a robot-cult religious civil war involving other people whom you'd expect to have much more important things to do with their time.

The casus belli for the religious civil war happened on 28 November 2018. I was at my new dayjob's company offsite event in Austin, Texas. Coincidentally, I had already spent much of the previous two days (since just before the plane to Austin took off) arguing trans issues with other "rationalists" on Discord.

Just that month, I had started a Twitter account using my real name, inspired in an odd way by the suffocating wokeness of the Rust open-source software scene, where I occasionally contributed diagnostics patches to the compiler. My secret plan/fantasy was to get more famous and established in the Rust world (compiler team membership, or a conference talk accepted, preferably both), get some corresponding Twitter followers, and then bust out the @BlanchardPhd retweets and links to this blog. In the median case, absolutely nothing would happen (probably because I failed at being famous), but I saw an interesting tail of scenarios in which I'd get to be a test case in the Code of Conduct wars.

So, now having a Twitter account, I was browsing Twitter in the bedroom at the rental house for the dayjob retreat when I happened to come across this thread by @ESYudkowsky:

Some people I usually respect for their willingness to publicly die on a hill of facts, now seem to be talking as if pronouns are facts, or as if who uses what bathroom is necessarily a factual statement about chromosomes. Come on, you know the distinction better than that!

Even if somebody went around saying, "I demand you call me 'she' and furthermore I claim to have two X chromosomes!", which none of my trans colleagues have ever said to me by the way, it still isn't a question-of-empirical-fact whether she should be called "she". It's an act.

In saying this, I am not taking a stand for or against any Twitter policies. I am making a stand on a hill of meaning in defense of validity, about the distinction between what is and isn't a stand on a hill of facts in defense of truth.

I will never stand against those who stand against lies. But changing your name, asking people to address you by a different pronoun, and getting sex reassignment surgery, Is. Not. Lying. You are ontologically confused if you think those acts are false assertions.

Some of the replies tried to explain the obvious problem—and Yudkowsky kept refusing to understand:

Using language in a way you dislike, openly and explicitly and with public focus on the language and its meaning, is not lying. The proposition you claim false (chromosomes?) is not what the speech is meant to convey—and this is known to everyone involved, it is not a secret.

Now, maybe as a matter of policy, you want to make a case for language being used a certain way. Well, that's a separate debate then. But you're not making a stand for Truth in doing so, and your opponents aren't tricking anyone or trying to.

repeatedly:

You're mistaken about what the word means to you, I demonstrate thus: https://en.wikipedia.org/wiki/XX_male_syndrome

But even ignoring that, you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning.

Dear reader, this is the moment where I flipped out. Let me explain.

This "hill of meaning in defense of validity" proclamation was such a striking contrast to the Eliezer Yudkowsky I remembered—the Eliezer Yudkowsky I had variously described as having "taught me everything I know" and "rewritten my personality over the internet"—who didn't hesitate to criticize uses of language that he thought were failing to "carve reality at the joints", even going so far as to call them "wrong" [LW · GW]:

[S]aying "There's no way my choice of X can be 'wrong'" is nearly always an error in practice, whatever the theory. You can always be wrong. Even when it's theoretically impossible to be wrong, you can still be wrong. There is never a Get-Out-Of-Jail-Free card for anything you do. That's life.

Similarly [LW · GW]:

Once upon a time it was thought that the word "fish" included dolphins. Now you could play the oh-so-clever arguer, and say, "The list: {Salmon, guppies, sharks, dolphins, trout} is just a list—you can't say that a list is wrong. I can prove in set theory that this list exists. So my definition of fish, which is simply this extensional list, cannot possibly be 'wrong' as you claim."

Or you could stop playing nitwit games and admit that dolphins don't belong on the fish list.

You come up with a list of things that feel similar, and take a guess at why this is so. But when you finally discover what they really have in common, it may turn out that your guess was wrong. It may even turn out that your list was wrong.

You cannot hide behind a comforting shield of correct-by-definition. Both extensional definitions and intensional definitions can be wrong, can fail to carve reality at the joints.

One could argue that this "Words can be wrong when your definition draws a boundary around things that don't really belong together" moral didn't apply to Yudkowsky's new Tweets, which only mentioned pronouns and bathroom policies, not the extensions [LW · GW] of common nouns.

But this seems pretty unsatisfying in the context of Yudkowsky's claim to "not [be] taking a stand for or against any Twitter policies". One of the Tweets that had recently led to radical feminist Meghan Murphy getting kicked off the platform read simply, "Men aren't women tho." This doesn't seem like a policy claim; rather, Murphy was using common language to express the fact-claim that members of the natural category of adult human males, are not, in fact, members of the natural category of adult human females.

Thus, if the extension of common words like "woman" and "man" is an issue of epistemic importance that rationalists should care about, then presumably so was Twitter's anti-misgendering policy—and if it isn't (because you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning) then I wasn't sure what was left of the "Human's Guide to Words" Sequence if the 37-part grand moral [LW · GW] needed to be retracted.

I think I am standing in defense of truth when I have an argument for why my preferred word usage does a better job at carving reality at the joints, and the one bringing my usage explicitly into question does not. As such, I didn't see the practical difference between "you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning," and "I can define a word any way I want." About which, again, an earlier Eliezer Yudkowsky had written:

"It is a common misconception that you can define a word any way you like. [...] If you believe that you can 'define a word any way you like', without realizing that your brain goes on categorizing without your conscious oversight, then you won't take the effort to choose your definitions wisely." [LW · GW]

"So that's another reason you can't 'define a word any way you like': You can't directly program concepts into someone else's brain." [LW · GW]

"When you take into account the way the human mind actually, pragmatically works, the notion 'I can define a word any way I like' soon becomes 'I can believe anything I want about a fixed set of objects' or 'I can move any object I want in or out of a fixed membership test'." [LW · GW]

"There's an idea, which you may have noticed I hate, that 'you can define a word any way you like'." [LW · GW]

"And of course you cannot solve a scientific challenge by appealing to dictionaries, nor master a complex skill of inquiry by saying 'I can define a word any way I like'." [LW · GW]

"Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind. One more reason not to believe you can define a word any way you like." [LW · GW]

"And people are lazy. They'd rather argue 'by definition', especially since they think 'you can define a word any way you like'." [LW · GW]

"And this suggests another—yes, yet another—reason to be suspicious of the claim that 'you can define a word any way you like'. When you consider the superexponential size of Conceptspace, it becomes clear that singling out one particular concept for consideration is an act of no small audacity—not just for us, but for any mind of bounded computing power." [LW · GW]

"I say all this, because the idea that 'You can X any way you like' is a huge obstacle to learning how to X wisely. 'It's a free country; I have a right to my own opinion' obstructs the art of finding truth. 'I can define a word any way I like' obstructs the art of carving reality at its joints. And even the sensible-sounding 'The labels we attach to words are arbitrary' obstructs awareness of compactness." [LW · GW]

"One may even consider the act of defining a word as a promise to [the] effect [...] [that the definition] will somehow help you make inferences / shorten your messages." [LW · GW]

One could argue that I was unfairly interpreting Yudkowsky's Tweets as having a broader scope than was intended—that Yudkowsky only meant to slap down the false claim that using he for someone with a Y chromosome is "lying", without intending any broader implications about trans issues or the philosophy of language. It wouldn't be realistic or fair to expect every public figure to host an exhaustive debate on all related issues every time they encounter a fallacy they want to Tweet about.

However, I don't think this "narrow" reading is the most natural one. Yudkowsky had previously written of what he called the fourth virtue of evenness: "If you are selective about which arguments you inspect for flaws, or how hard you inspect for flaws, then every flaw you learn how to detect makes you that much stupider." He had likewise written on reversed stupidity [LW · GW] (bolding mine):

To argue against an idea honestly, you should argue against the best arguments of the strongest advocates. Arguing against weaker advocates proves nothing, because even the strongest idea will attract weak advocates.

Relatedly, Scott Alexander had written about how "weak men are superweapons": speakers often selectively draw attention to the worst arguments in favor of a position in an attempt to socially discredit people who have better arguments (which the speaker ignores). In the same way, by just slapping down a weak man from the "anti-trans" political coalition without saying anything else in a similarly prominent location, Yudkowsky was liable to mislead his faithful students into thinking that there were no better arguments from the "anti-trans" side.

To be sure, it imposes a cost on speakers to not be able to Tweet about one specific annoying fallacy and then move on with their lives without the need for endless disclaimers about related but stronger arguments that they're not addressing. But the fact that Yudkowsky disclaimed that he wasn't taking a stand for or against Twitter's anti-misgendering policy demonstrates that he didn't have an aversion to spending a few extra words to prevent the most common misunderstandings.

Given that, it's hard to read the Tweets Yudkowsky published as anything other than an attempt to intimidate and delegitimize people who want to use language to reason about sex rather than gender identity. It's just not plausible that Yudkowsky was simultaneously savvy enough to choose to make these particular points while also being naïve enough to not understand the political context. Deeper in the thread, he wrote:

The more technology advances, the further we can move people towards where they say they want to be in sexspace. Having said this we've said all the facts. Who competes in sports segregated around an Aristotelian binary is a policy question (that I personally find very humorous).

Sure, in the limit of arbitrarily advanced technology, everyone could be exactly where they wanted to be in sexspace. Having said this, we have not said all the facts relevant to decisionmaking in our world, where we do not have arbitrarily advanced technology (as Yudkowsky well knew, having written a post about how technically infeasible an actual sex change would be [LW · GW]). As Yudkowsky acknowledged in the previous Tweet, "Hormone therapy changes some things and leaves others constant." The existence of hormone replacement therapy does not itself take us into the glorious transhumanist future where everyone is the sex they say they are.

The reason for sex-segregated sports leagues is that sport-relevant multivariate trait distributions of female bodies and male bodies are different: men are taller, stronger, and faster. If you just had one integrated league, females wouldn't be competitive (in the vast majority of sports, with a few exceptions like ultra-distance swimming that happen to sample an unusually female-favorable corner of sportspace).

Given the empirical reality of the different trait distributions, "Who are the best athletes among females?" is a natural question for people to be interested in and want separate sports leagues to determine. Including male people in female sports leagues undermines the point of having a separate female league, and hormone replacement therapy after puberty doesn't substantially change the picture here.[1]

Yudkowsky's suggestion that an ignorant commitment to an "Aristotelian binary" is the main reason someone might care about the integrity of women's sports is an absurd strawman. This just isn't something any scientifically literate person would write if they had actually thought about the issue at all, as opposed to having first decided (consciously or not) to bolster their reputation among progressives by dunking on transphobes on Twitter, and then wielding their philosophy knowledge in the service of that political goal. The relevant facts are not subtle, even if most people don't have the fancy vocabulary to talk about them in terms of "multivariate trait distributions."

I'm picking on the "sports segregated around an Aristotelian binary" remark because sports is a case where the relevant effect sizes are so large as to make the point hard for all but the most ardent gender-identity partisans to deny. (For example, what the Cohen's d ≈ 2.6 effect size difference in muscle mass means is that a woman as strong as the average man is at the 99.5th percentile for women.) But the point is general: biological sex exists and is sometimes decision-relevant. People who want to be able to talk about sex and make policy decisions on the basis of sex are not making an ontology error, because the ontology in which sex "actually" "exists" continues to make very good predictions in our current tech regime (if not the glorious transhumanist future). It would be a ridiculous isolated demand for rigor to expect someone to pass a graduate exam about the philosophy and cognitive science of categorization before they can talk about sex.
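(To sanity-check that percentile arithmetic, here is a minimal sketch, assuming strength is approximately normally distributed with equal variance in both sexes, which is an idealization:)

```python
from scipy.stats import norm

# Cohen's d expresses the difference between group means in units of the
# (pooled) standard deviation. Under the equal-variance normal idealization,
# d = 2.6 puts the average man 2.6 female standard deviations above the
# female mean, so the fraction of women weaker than the average man is:
d = 2.6
print(norm.cdf(d))  # ≈ 0.9953, i.e., roughly the 99.5th percentile
```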

Thus, Yudkowsky's claim to merely have been standing up for the distinction between facts and policy questions doesn't seem credible. It is, of course, true that pronoun and bathroom conventions are policy decisions rather than matters of fact, but it's bizarre to condescendingly point this out as if it were the crux of contemporary trans-rights debates. Conservatives and gender-critical feminists know that trans-rights advocates aren't falsely claiming that trans women have XX chromosomes! If you just wanted to point out that the rules of sports leagues are a policy question rather than a fact (as if anyone had doubted this), why would you throw in the "Aristotelian binary" weak man and belittle the matter as "humorous"? There are a lot of issues I don't care much about, but I don't see anything funny about the fact that other people do care.[2]

If any concrete negative consequence of gender self-identity categories is going to be waved away with, "Oh, but that's a mere policy decision that can be dealt with on some basis other than gender, and therefore doesn't count as an objection to the new definition of gender words", then it's not clear what the new definition is for.

Like many gender-dysphoric males, I cosplay female characters at fandom conventions sometimes. And, unfortunately, like many gender-dysphoric males, I'm not very good at it. I think someone looking at some of my cosplay photos and trying to describe their content in clear language—not trying to be nice to anyone or make a point, but just trying to use language as a map that reflects the territory—would say something like, "This is a photo of a man and he's wearing a dress." The word man in that sentence is expressing cognitive work: it's a summary of the lawful cause-and-effect evidential entanglement [LW · GW] whereby the photons reflecting off the photograph are correlated with photons reflecting off my body at the time the photo was taken, which are correlated with my externally observable secondary sex characteristics (facial structure, beard shadow, &c.). From this evidence, an agent using an efficient naïve-Bayes-like model [LW · GW] can assign me to its "man" (adult human male) category and thereby make probabilistic predictions about traits that aren't directly observable from the photo. The agent would achieve a better score on those predictions than if it had assigned me to its "woman" (adult human female) category.

By "traits" I mean not just sex chromosomes (as Yudkowsky suggested on Twitter), but the conjunction of dozens or hundreds of measurements that are causally downstream of sex chromosomes: reproductive organs and muscle mass (again, sex difference effect size of Cohen's d ≈ 2.6) and Big Five Agreeableness (d ≈ 0.5) and Big Five Neuroticism (d ≈ 0.4) and short-term memory (d ≈ 0.2, favoring women) and white-gray-matter ratios in the brain and probable socialization history and any number of other things—including differences we might not know about, but have prior reasons to suspect exist. No one knew about sex chromosomes before 1905, but given the systematic differences between women and men, it would have been reasonable to suspect the existence of some sort of molecular mechanism of sex determination.

Forcing a speaker to say "trans woman" instead of "man" in a sentence about my cosplay photos depending on my verbally self-reported self-identity may not be forcing them to lie, exactly. It's understood, "openly and explicitly and with public focus on the language and its meaning," what trans women are; no one is making a false-to-fact claim about them having ovaries, for example. But it is forcing the speaker to obfuscate the probabilistic inference they were trying to communicate with the original sentence (about modeling the person in the photograph as being sampled from the "man" cluster in configuration space [LW · GW]), and instead use language that suggests a different cluster-structure. ("Trans women", two words, are presumably a subcluster within the "women" cluster.) Crowing in the public square about how people who object to being forced to "lie" must be ontologically confused is ignoring the interesting part of the problem. Gender identity's claim to be non-disprovable [LW · GW] functions as a way to avoid the belief's real weak points [LW · GW].

To this, one might reply that I'm giving too much credit to the "anti-trans" faction for how stupid they're not being: that my careful dissection of the hidden probabilistic inferences implied by words (including pronoun choices) is all well and good, but calling pronouns "lies" is not something you do when you know how to use words.

But I'm not giving them credit for understanding the lessons of "A Human's Guide to Words"; I just think there's a useful sense of "know how to use words" that embodies a lower standard of philosophical rigor. If a person-in-the-street says of my cosplay photos, "That's a man! I have eyes, and I can see that that's a man! Men aren't women!"—well, I probably wouldn't want to invite them to a Less Wrong meetup. But I do think the person-in-the-street is performing useful cognitive work. Because I have the hidden-Bayesian-structure-of-language-and-cognition-sight (thanks to Yudkowsky's writings back in the 'aughts), I know how to sketch out the reduction of "Men aren't women" to something more like "This cognitive algorithm [LW · GW] detects secondary sex characteristics and uses it as a classifier for a binary female/male 'sex' category, which it uses to make predictions about not-yet-observed features ..."

But having done the reduction-to-cognitive-algorithms, it still looks like the person-in-the-street has a point that I shouldn't be allowed to ignore just because I have 30 more IQ points and better philosophy-of-language skills?

I bring up my bad cosplay photos as an edge case that helps illustrate the problem I'm trying to point out, much like how people love to bring up complete androgen insensitivity syndrome to illustrate why "But chromosomes!" isn't the correct reduction of sex classification. To differentiate what I'm saying from blind transphobia, let me note that I predict that most people-in-the-street would be comfortable using feminine pronouns for someone like Blaire White. That's evidence about the kind of cognitive work people's brains are doing when they use English pronouns! Certainly, English is not the only language, and ours is not the only culture; maybe there is a way to do gender categories that would be more accurate and better for everyone. But to find what that better way is, we need to be able to talk about these kinds of details in public, and the attitude evinced in Yudkowsky's Tweets seemed to function as a semantic stopsign [LW · GW] to get people to stop talking about the details.

If you were interested in having a real discussion (instead of a fake discussion that makes you look good to progressives), why would you slap down the "But, but, chromosomes" fallacy and then not engage with the obvious steelman of "But, but, clusters in high-dimensional [LW · GW] configuration space [LW · GW] that aren't actually changeable with contemporary technology [LW · GW]", which was, in fact, brought up in the replies?

Satire is a weak form of argument: the one who wishes to doubt will always be able to find some aspect in which an obviously absurd satirical situation differs from the real-world situation being satirized and claim that that difference destroys the relevance of the joke. But on the off chance that it might help illustrate the objection, imagine you lived in a so-called "rationalist" subculture where conversations like this happened—


Bob: Look at this adorable cat picture!
Alice: Um, that looks like a dog to me, actually.
Bob: You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning. Now, maybe as a matter of policy, you want to make a case for language being used a certain way. Well, that's a separate debate then.


If you were Alice, and a solid supermajority of your incredibly smart, incredibly philosophically sophisticated friend group including Eliezer Yudkowsky (!!!) seemed to behave like Bob, that would be a worrying sign about your friends' ability to accomplish intellectually hard things like AI alignment, right? Even if there isn't any pressing practical need to discriminate between dogs and cats, the problem is that Bob is selectively using his sophisticated philosophy-of-language knowledge to try to undermine Alice's ability to use language to make sense of the world, even though Bob obviously knows very well what Alice was trying to say. It's incredibly obfuscatory in a way that people—the same people—would not tolerate in almost any other context.

Imagine an Islamic theocracy in which one Megan Murfi (ميغان ميرفي) had recently gotten kicked off the dominant microblogging platform for speaking disrespectfully about the prophet Muhammad. Suppose that Yudkowsky's analogue in that world then posted that those objecting on free inquiry grounds were ontologically confused: saying "peace be upon him" after the name of the prophet Muhammad is a speech act, not a statement of fact. In banning Murfi for repeatedly speaking about the prophet Muhammad (peace be upon him) as if he were just some guy, the platform was merely "enforcing a courtesy standard" (in the words of our world's Yudkowsky). Murfi wasn't being forced to lie.

I think the atheists of our world, including Yudkowsky, would not have trouble seeing the problem with this scenario, nor hesitate to agree that it is a problem for that Society's rationality. Saying "peace be unto him" is indeed a speech act rather than a statement of fact, but it would be bizarre to condescendingly point this out as if it were the crux of debates about religious speech codes. The function of the speech act is to signal the speaker's affirmation of Muhammad's divinity. That's why the Islamic theocrats want to mandate that everyone say it: it's a lot harder for atheism to get any traction if no one is allowed to talk like an atheist.

And that's why trans advocates want to mandate against misgendering people on social media: it's harder for trans-exclusionary ideologies to get any traction if no one is allowed to talk like someone who believes that sex (sometimes) matters and gender identity does not.

Of course, such speech restrictions aren't necessarily "irrational", depending on your goals. If you just don't think "free speech" should go that far—if you want to suppress atheism or gender-critical feminism with an iron fist—speech codes are a perfectly fine way to do it! And to their credit, I think most theocrats and trans advocates are intellectually honest about what they're doing: atheists or transphobes are bad people (the argument goes) and we want to make it harder for them to spread their lies or their hate.

In contrast, by claiming to be "not taking a stand for or against any Twitter policies" while accusing people who opposed the policy of being ontologically confused, Yudkowsky was being less honest than the theocrat or the activist: of course the point of speech codes is to suppress ideas! Given that the distinction between facts and policies is so obviously not anyone's crux—the smarter people in the "anti-trans" faction already know that, and the dumber people in the faction wouldn't change their alignment if they were taught—it's hard to see what the point of harping on the fact/policy distinction would be, except to be seen as implicitly taking a stand for the "pro-trans" faction while putting on a show of being politically "neutral." [LW · GW]

It makes sense that Yudkowsky might perceive political constraints on what he might want to say in public—especially when you look at what happened to the other Harry Potter author.[3] But if Yudkowsky didn't want to get into a distracting fight about a politically-charged topic, then maybe the responsible thing to do would have been to just not say anything about the topic, rather than engaging with the stupid version of the opposition and stonewalling [LW · GW] with "That's a policy question" when people tried to point out the problem?!


I didn't have all of that criticism collected and carefully written up on 28 November 2018. But that, basically, is why I flipped out when I saw that Twitter thread. If the "rationalists" didn't click [LW · GW] on the autogynephilia thing, that was disappointing, but forgivable. If the "rationalists", on Scott Alexander's authority, were furthermore going to get our own philosophy of language wrong over this, that was—I don't want to say forgivable exactly, but it was tolerable. I had learned from my misadventures the previous year that I had been wrong to trust "the community" as a reified collective. That had never been a reasonable mental stance in the first place.

But trusting Eliezer Yudkowsky—whose writings, more than any other single influence, had made me who I am—did seem reasonable. If I put him on a pedestal, it was because he had earned the pedestal, for supplying me with my criteria for how to think—including, as a trivial special case, how to think about what things to put on pedestals [LW · GW].

So if the rationalists were going to get our own philosophy of language wrong over this and Eliezer Yudkowsky was in on it (!!!), that was intolerable, inexplicable, incomprehensible—like there wasn't a real world anymore.

At the dayjob retreat, I remember going downstairs to impulsively confide in a senior engineer, an older bald guy who exuded masculinity, who you could tell by his entire manner and being was not infected by the Berkeley mind-virus, no matter how loyally he voted Democrat. I briefly explained the situation to him—not just the immediate impetus of this Twitter thread, but this whole thing of the past couple years where my entire social circle just suddenly decided that guys like me could be women by means of saying so. He was noncommittally sympathetic; he told me an anecdote about him accepting a trans person's correction of his pronoun usage, with the thought that different people have their own beliefs, and that's OK.

If Yudkowsky was already stonewalling his Twitter followers, entering the thread myself didn't seem likely to help. (Also, less importantly, I hadn't intended to talk about gender on that account yet.)

It seemed better to try to clear this up in private. I still had Yudkowsky's email address, last used when I had offered to pay to talk about his theory of MtF two years before. I felt bad bidding for his attention over my gender thing again—but I had to do something. Hands trembling, I sent him an email asking him to read my "The Categories Were Made for Man to Make Predictions", suggesting that it might qualify as an answer to his question about "a page [he] could read to find a non-confused exclamation of how there's scientific truth at stake". I said that because I cared very much about correcting confusions in my rationalist subculture, I would be happy to pay up to $1000 for his time—and that, if he liked the post, he might consider Tweeting a link—and that I was cc'ing my friends Anna Salamon and Michael Vassar as character references (Subject: "another offer, $1000 to read a ~6500 word blog post about (was: Re: Happy Price offer for a 2 hour conversation)"). Then I texted Anna and Michael, begging them to vouch for my credibility.

The monetary offer, admittedly, was awkward: I included another paragraph clarifying that any payment was only to get his attention, not quid pro quo advertising, and that if he didn't trust his brain circuitry [LW · GW] not to be corrupted by money, then he might want to reject the offer on those grounds and only read the post if he expected it to be genuinely interesting.

Again, I realize this must seem weird and cultish to any normal people reading this. (Paying some blogger you follow one grand just to read one of your posts? What? Why? Who does that?) To this, I again refer to the reasons justifying my 2016 cheerful price offer—and that, along with tagging in Anna and Michael, whom I thought Yudkowsky respected, it was a way to signal that I really didn't want to be ignored, which I assumed was the default outcome. An ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky. I wouldn't have had the audacity to contact him at all, about anything, if I didn't have Something to Protect [LW · GW].

Anna didn't reply, but I apparently did interest Michael, who chimed in on the email thread to Yudkowsky. We had a long phone conversation the next day lamenting how the "rationalists" were dead as an intellectual community.

As for the attempt to intervene on Yudkowsky—here I need to make a digression about the constraints I'm facing in telling this Whole Dumb Story. I would prefer to just tell this Whole Dumb Story as I would to my long-neglected Diary—trying my best at the difficult task of explaining what actually happened during an important part of my life, without thought of concealing anything.

(If you are silent about your pain, they'll kill you and say you enjoyed it.)

Unfortunately, a lot of other people seem to have strong intuitions about "privacy", which bizarrely impose constraints on what I'm allowed to say about my own life: in particular, it's considered unacceptable to publicly quote or summarize someone's emails from a conversation that they had reason to expect to be private. I feel obligated to comply with these widely-held privacy norms, even if I think they're paranoid and anti-social. (This secrecy-hating trait probably correlates with the autogynephilia blogging; someone otherwise like me who believed in privacy wouldn't be telling you this Whole Dumb Story.)

So I would think that while telling this Whole Dumb Story, I obviously have an inalienable right to blog about my own actions, but I'm not allowed to directly refer to private conversations with named individuals in cases where I don't think I'd be able to get the consent of the other party. (I don't think I'm required to go through the ritual of asking for consent in cases where the revealed information couldn't reasonably be considered "sensitive", or if I know the person doesn't have hangups about this weird "privacy" thing.) In this case, I'm allowed to talk about emailing Yudkowsky (because that was my action), but I'm not allowed to talk about anything he might have said in reply, or whether he did.

Unfortunately, there's a potentially serious loophole in the commonsense rule: what if some of my actions (which I would have hoped to have an inalienable right to blog about) depend on content from private conversations? You can't, in general, only reveal one side of a conversation.

Suppose Carol messages Dave at 5 p.m., "Can you come to the party?", and then, separately, messages him at 6 p.m., "Gout isn't contagious." Should Carol be allowed to blog about the messages she sent at 5 p.m. and 6 p.m., because she's only describing her own messages and not confirming or denying whether Dave replied at all, let alone quoting him?

I think commonsense privacy-norm-adherence intuitions actually say No here: the text of Carol's messages makes it too easy to guess that sometime between 5 and 6, Dave probably said that he couldn't come to the party because he has gout. It would seem that Carol's right to talk about her own actions in her own life does need to take into account some commonsense judgement of whether that leaks "sensitive" information about Dave.

In the substory (of my Whole Dumb Story) that follows, I'm going to describe several times that I and others emailed Yudkowsky to argue with what he said in public, without saying anything about whether Yudkowsky replied or what he might have said if he did reply. I maintain that I'm within my rights here, because I think commonsense judgment will agree that me talking about the arguments I made does not leak any sensitive information about the other side of a conversation that may or may not have happened. I think the story comes off relevantly the same whether Yudkowsky didn't reply at all (e.g., because he was too busy with more existentially important things to check his email), or whether he replied in a way that I found sufficiently unsatisfying as to occasion the further emails with followup arguments that I describe. (Talking about later emails does rule out the possible world where Yudkowsky had said, "Please stop emailing me," because I would have respected that, but the fact that he didn't say that isn't "sensitive".)

It seems particularly important to lay out these judgments about privacy norms in connection to my attempts to contact Yudkowsky, because part of what I'm trying to accomplish in telling this Whole Dumb Story is to deal reputational damage to Yudkowsky, which I claim is deserved. (We want reputations to track reality. If you see Erin exhibiting a pattern of intellectual dishonesty, and she keeps doing it even after you talk to her about it privately, you might want to write a blog post describing the pattern in detail—not to hurt Erin, particularly, but so that everyone else can make higher-quality decisions about whether they should believe the things that Erin says.) Given that motivation of mine, it seems important that I only try to hang Yudkowsky with the rope of what he said in public, where you can click the links and read the context for yourself: I'm attacking him, but not betraying him. In the substory that follows, I also describe correspondence with Scott Alexander, but that doesn't seem sensitive in the same way, because I'm not particularly trying to deal reputational damage to Alexander. (Not because Scott performed well, but because one wouldn't really have expected him to in this situation; Alexander's reputation isn't so direly in need of correction.)

Thus, I don't think I should say whether Yudkowsky replied to Michael's and my emails, nor (again) whether he accepted the cheerful-price money, because any conversation that may or may not have occurred would have been private. But what I can say, because it was public, is that we saw this addition to the Twitter thread:

I was sent this (by a third party) as a possible example of the sort of argument I was looking to read: http://unremediatedgender.space/2018/Feb/the-categories-were-made-for-man-to-make-predictions/. Without yet judging its empirical content, I agree that it is not ontologically confused. It's not going "But this is a MAN so using 'she' is LYING."

Look at that! The great Eliezer Yudkowsky said that my position is "not ontologically confused." That's probably high praise, coming from him!

You might think that that should have been the end of the story. Yudkowsky denounced a particular philosophical confusion, I already had a related objection written up, and he publicly acknowledged my objection as not being the confusion he was trying to police. I should be satisfied, right?

I wasn't, in fact, satisfied. This little "not ontologically confused" clarification buried deep in the replies was much less visible than the bombastic, arrogant top-level pronouncement insinuating that resistance to gender-identity claims was confused. (1 Like on this reply, vs. 140 Likes/18 Retweets at the start of the thread.) This little follow-up did not seem likely to disabuse the typical reader of the impression that Yudkowsky thought gender-identity skeptics didn't have a leg to stand on. Was it greedy of me to want something louder?

Greedy or not, I wasn't done flipping out. On 1 December 2018, I wrote to Scott Alexander (cc'ing a few other people) to ask if there was any chance of an explicit and loud clarification or partial retraction of "... Not Man for the Categories" (Subject: "super-presumptuous mail about categorization and the influence graph"). Forget my boring whining about the autogynephilia/two-types thing, I said—that's a complicated empirical claim, and not the key issue.

The issue was that category boundaries are not arbitrary (if you care about intelligence being useful). You want to draw your category boundaries such that [LW · GW] things in the same category are similar in the respects that you care about predicting/controlling, and you want to spend your information-theoretically limited budget [LW · GW] of short words on the simplest and most widely useful categories.
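Echoing Yudkowsky's dolphin example above, here's a toy illustration of the "not arbitrary" point (the objects and the single predicted feature are invented for this sketch): category membership under a joint-carving grouping carries more mutual information about the feature you care about than membership under a gerrymandered grouping.

```python
import math
from collections import Counter

# Toy data: (name, warm_blooded, lives_in_water). Invented for illustration.
objects = [
    ("dolphin", True,  True),
    ("salmon",  False, True),
    ("trout",   False, True),
    ("shark",   False, True),
    ("cow",     True,  False),
    ("dog",     True,  False),
]

def mutual_information(assignment):
    """I(category; warm_blooded) in bits, for categories given as {name: category}."""
    n = len(objects)
    joint = Counter((assignment[name], warm) for name, warm, _ in objects)
    cat_marg = Counter(assignment[name] for name, _, _ in objects)
    warm_marg = Counter(warm for _, warm, _ in objects)
    mi = 0.0
    for (cat, warm), count in joint.items():
        p_joint = count / n
        p_indep = (cat_marg[cat] / n) * (warm_marg[warm] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

# "Carve at the joints": mammals vs. fish.
by_phylogeny = {"dolphin": "mammal", "cow": "mammal", "dog": "mammal",
                "salmon": "fish", "trout": "fish", "shark": "fish"}
# Gerrymandered: group by habitat and call the water-dwellers "fish".
by_habitat = {"dolphin": "fish", "salmon": "fish", "trout": "fish",
              "shark": "fish", "cow": "mammal", "dog": "mammal"}

print(f"I(category; warm-blooded), phylogeny grouping: {mutual_information(by_phylogeny):.3f} bits")
print(f"I(category; warm-blooded), habitat grouping:   {mutual_information(by_habitat):.3f} bits")
```

In this toy setup, knowing the phylogenetic category tells you everything about warm-bloodedness (1 bit), while the habitat-based grouping tells you less (about 0.46 bits); that gap is the quantitative face of making bad inferences by grouping dissimilar things together in your map.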

It was true that the reason I was continuing to freak out about this to the extent of sending him this obnoxious email telling him what to write (seriously, who does that?!) was because of transgender stuff, but that wasn't why Scott should care.

The other year, Alexander had written a post, "Kolmogorov Complicity and the Parable of Lightning", explaining the consequences of political censorship with an allegory about a Society with the dogma that thunder occurs before lightning.[4] Alexander had explained that the problem with complying with the dictates of a false orthodoxy wasn't the sacred dogma itself (it's not often that you need to directly make use of the fact that lightning comes first), but that the need to defend the sacred dogma [LW · GW] destroys everyone's ability to think [LW · GW].

It was the same thing here. It wasn't that I had any practical need to misgender anyone in particular. It still wasn't okay that talking about the reality of biological sex to so-called "rationalists" got you an endless deluge of—polite! charitable! non-ostracism-threatening!—bullshit nitpicking. (What about complete androgen insensitivity syndrome? Why doesn't this ludicrous misinterpretation of what you said imply that lesbians aren't women? &c. ad infinitum.) With enough time, I thought the nitpicks could and should be satisfactorily answered; any remaining would presumably be fatal criticisms rather than bullshit nitpicks. But while I was in the process of continuing to write all that up, I hoped Alexander could see why I felt somewhat gaslighted.

(I had been told by others that I wasn't using the word "gaslighting" correctly. No one seemed to think I had the right to define that category boundary for my convenience.)

If our vaunted rationality techniques resulted in me having to spend dozens of hours patiently explaining why I didn't think that I was a woman (where "not a woman" is a convenient rhetorical shorthand for a much longer statement about naïve Bayes models [LW · GW] and high-dimensional configuration spaces [LW · GW] and defensible Schelling points for social norms [LW · GW]), then our techniques were worse than useless.

If Galileo ever muttered "And yet it moves", there's a long and nuanced conversation you could have about the consequences of using the word "moves" in Galileo's preferred sense, as opposed to some other sense that happens to result in the theory needing more epicycles. It may not have been obvious in November 2014 when "... Not Man for the Categories" was published, but in retrospect, maybe it was a bad idea to build a memetic superweapon that says that the number of epicycles doesn't matter.

The reason to write this as a desperate email plea to Scott Alexander instead of working on my own blog was that I was afraid that marketing is a more powerful force than argument. Rather than good arguments propagating through the population of so-called "rationalists" no matter where they arose, what actually happened was that people like Alexander and Yudkowsky rose to power on the strength of good arguments and entertaining writing (but mostly the latter), and then everyone else absorbed some of their worldview (plus noise and conformity with the local environment). So for people who didn't win the talent lottery but thought they saw a flaw in the zeitgeist, the winning move was "persuade Scott Alexander."

Back in 2010, the rationalist community had a shared understanding that the function of language is to describe reality. Now, we didn't. If Scott didn't want to cite my creepy blog about my creepy fetish, that was fine; I liked getting credit, but the important thing was that this "No, the Emperor isn't naked—oh, well, we're not claiming that he's wearing any garments—it would be pretty weird if we were claiming that!—it's just that utilitarianism implies that the social property of clothedness should be defined this way because to do otherwise would be really mean to people who don't have anything to wear" maneuver needed to die, and he alone could kill it.

Scott didn't get it. We agreed that gender categories based on self-identity, natal sex, and passing each had their own pros and cons, and that it's uninteresting to focus on whether something "really" belongs to a category rather than on communicating what you mean. Scott took this to mean that what convention to use is a pragmatic choice we can make on utilitarian grounds, and that being nice to trans people was worth a little bit of clunkiness—that the mental health benefits to trans people were obviously enough to tip the first-order utilitarian calculus.

I didn't think anything about "mental health benefits to trans people" was obvious. More importantly, I considered myself to be prosecuting not the object-level question of which gender categories to use but the meta-level question of what normative principles govern the use of categories. For this, "whatever, it's a pragmatic choice, just be nice" wasn't an answer, because the normative principles exclude "just be nice" from being a relevant consideration.

"... Not Man for the Categories" had concluded with a section on Emperor Norton, a 19th-century San Francisco resident who declared himself Emperor of the United States. Certainly, it's not difficult or costly for the citizens of San Francisco to address Norton as "Your Majesty". But there's more to being Emperor of the United States than what people call you. Unless we abolish Congress and have the military enforce Norton's decrees, he's not actually emperor—at least not according to the currently generally understood meaning of the word.

What are you going to do if Norton takes you literally? Suppose he says, "I ordered the Imperial Army to invade Canada last week; where are the troop reports? And why do the newspapers keep talking about this so-called 'President' Rutherford B. Hayes? Have this pretender Hayes executed at once and bring his head to me!"

You're not really going to bring him Rutherford B. Hayes's head. So what are you going to tell him? "Oh, well, you're not a cis emperor who can command executions. But don't worry! Trans emperors are emperors"?

To be sure, words can be used in many ways depending on context, but insofar as Norton is interpreting "emperor" in the traditional sense, and you keep calling him your emperor without caveats or disclaimers, you are lying to him.

Scott still didn't get it. But I did soon end up in more conversation with Michael Vassar, Ben Hoffman, and Sarah Constantin, who were game to help me reach out to Yudkowsky again to explain the problem in more detail—and to appeal to the conscience of someone who built their career on higher standards [LW · GW].

Yudkowsky probably didn't think much of Atlas Shrugged (judging by an offhand remark by our protagonist in Harry Potter and the Methods), but I kept thinking of the scene[5] where our heroine, Dagny Taggart, entreats the great Dr. Robert Stadler to denounce an egregiously deceptive but technically-not-lying statement [LW · GW] by the State Science Institute, whose legitimacy derives from its association with his name. Stadler has become cynical in his old age and demurs: "I can't help what people think—if they think at all!" ... "How can one deal in truth when one deals with the public?"

At this point, I still trusted Yudkowsky to do better than an Ayn Rand villain; I had faith that Eliezer Yudkowsky [LW · GW] could deal in truth when he deals with the public.

(I was wrong.)

Now that we had this entire posse, I felt bad and guilty and ashamed about focusing too much on my special interest, except insofar as it was genuinely a proxy for "Has Eliezer and/or everyone else lost the plot, and if so, how do we get it back?" But the group seemed to agree that my philosophy-of-language grievance was a useful test case.

At times, it felt like my mind shut down with only the thought, "What am I doing? This is absurd. Why am I running around picking fights about the philosophy of language—and worse, with me arguing for the Bad Guys' position? Maybe I'm wrong and should stop making a fool of myself. After all, using Aumann-like [? · GW] reasoning, in a dispute of 'me and Michael Vassar vs. everyone else', wouldn't I want to bet on 'everyone else'?"

Except ... I had been raised back in the 'aughts to believe that you're supposed to concede arguments on the basis of encountering a superior counterargument, and I couldn't actually point to one. "Maybe I'm making a fool out of myself by picking fights with all these high-status people" is not a counterargument.

Anna continued to be disinclined to take a side in the brewing Category War, and it was beginning to put a strain on our friendship, to the extent that I kept ending up crying during our occasional meetings. She said that my "You have to pass my philosophy-of-language litmus test or I lose all respect for you as a rationalist" attitude was psychologically coercive. I agreed—I was even willing to go up to "violent", in the sense that I'd cop to trying to apply social incentives toward an outcome rather than merely exchanging information. But sometimes you need to use violence in defense of self or property. If we thought of the "rationalist" brand name as intellectual property, maybe it was property worth defending, and if so, then the moment people started claiming "I can define a word any way I want" wasn't an obviously terrible time to start shooting at the bandits.

My hope was that it was possible to apply just enough "What kind of rationalist are you?!" social pressure to cancel out the "You don't want to be a Bad (Red) person, do you??" social pressure and thereby let people look at the arguments—though I wasn't sure if that even works, and I was growing exhausted from all the social aggression I was doing. (If someone tries to take your property and you shoot at them, you could be said to be the "aggressor" in the sense that you fired the first shot, even if you hope that the courts will uphold your property claim later.)

After some more discussion within the me/Michael/Ben/Sarah posse, on 4 January 2019, I wrote to Yudkowsky again (a second time), to explain the specific problems with his "hill of meaning in defense of validity" Twitter performance, since that apparently hadn't been obvious from the earlier link to "... To Make Predictions". I cc'ed the posse, who chimed in afterwards.

Ben explained what kind of actions we were hoping for from Yudkowsky: that he would (1) notice that he'd accidentally been participating in an epistemic war, (2) generalize the insight (if he hadn't noticed, what were the odds that MIRI had adequate defenses?), and (3) join the conversation about how to actually have a rationality community, while noticing this particular way in which the problem seemed harder than it used to. For my case in particular, something that would help would be either (A) a clear ex cathedra statement that gender categories are not an exception to the general rule that categories are nonarbitrary, or (B) a clear ex cathedra statement that he's been silenced on this matter. If even (B) was too politically expensive, that seemed like important evidence about (1).

Without revealing the other side of any private conversation that may or may not have occurred, I can say that we did not get either of those ex cathedra statements at this time.

It was also around this time that our posse picked up a new member, whom I'll call "Riley".


On 5 January 2019, I met with Michael and his associate Aurora Quinn-Elmore in San Francisco to attempt mediated discourse with Ziz and Gwen, who were considering suing the Center for Applied Rationality (CfAR)[6] for discriminating against trans women. Michael hoped to dissuade them from a lawsuit—not because he approved of CfAR's behavior, but because lawyers make everything worse.

Despite our personality and worldview differences, I had had a number of cooperative interactions with Ziz a couple years before. We had argued about the etiology of transsexualism in late 2016. When I sent her some delusional PMs during my February 2017 psychotic break, she came over to my apartment with chocolate ("allegedly good against dementors"), although I wasn't there. I had awarded her $1200 as part of a credit-assignment ritual to compensate the twenty-one people who were most responsible for me successfully navigating my psychological crises of February and April 2017. (The fact that she had been up to argue about trans etiology meant a lot to me.) I had accepted some packages for her at my apartment in mid-2017 when she was preparing to live on a boat and didn't have a mailing address.

At this meeting, Ziz recounted her story of how Anna Salamon (in her capacity as President of CfAR and community leader) allegedly engaged in conceptual warfare to falsely portray Ziz as a predatory male. I was unimpressed: in my worldview, I didn't think Ziz had the right to say "I'm not a man," and expect people to just believe that. (I remember that at one point, Ziz answered a question with, "Because I don't run off masochistic self-doubt like you." I replied, "That's fair.") But I did respect that Ziz actually believed in an intersex brain theory: in Ziz and Gwen's worldview, people's genders were a fact of the matter, not a manipulation of consensus categories to make people happy.

Probably the most ultimately consequential part of this meeting was Michael verbally confirming to Ziz that MIRI had settled with a disgruntled former employee, Louie Helm, who had put up a website slandering them. (I don't know the details of the alleged settlement. I'm working off of Ziz's notes rather than remembering that part of the conversation clearly myself; I don't know what Michael knew.) What was significant was that if MIRI had paid Helm as part of an agreement to get the slanderous website taken down, then (whatever the nonprofit best-practice books might have said about whether this was a wise thing to do when facing a dispute from a former employee) that would decision-theoretically amount to a blackmail payout, which seemed to contradict MIRI's advocacy of timeless decision theories (according to which you shouldn't be the kind of agent that yields to extortion).


Something else Ben had said while chiming in on the second attempt to reach out to Yudkowsky hadn't sat quite right with me.

I am pretty worried that if I actually point out the physical injuries sustained by some of the smartest, clearest-thinking, and kindest people I know in the Rationalist community as a result of this sort of thing, I'll be dismissed as a mean person who wants to make other people feel bad.

I didn't know what he was talking about. My friend "Rebecca"'s 2015 psychiatric imprisonment ("hospitalization") had probably been partially related to her partner's transition and had involved rough handling by the cops. I had been through some Bad Stuff during my psychotic episodes of February and April 2017, but none of it was "physical injuries." What were the other cases, if he could share without telling me Very Secret Secrets With Names?

Ben said that, probabilistically, he expected that some fraction of the trans women he knew who had "voluntarily" had bottom surgery had done so in response to social pressure, even if some of them might well have sought it out in a less weaponized culture.

I said that saying, "I am worried that if I actually point out the physical injuries ..." when the actual example turned out to be sex reassignment surgery seemed dishonest: I had thought he might have more examples of situations like mine or "Rebecca"'s, where gaslighting escalated into more tangible harm in a way that people wouldn't know about by default. In contrast, people already know that bottom surgery is a thing; Ben just had reasons to think it's Actually Bad—reasons that his friends couldn't engage with if we didn't know what he was talking about. It was bad enough that Yudkowsky was being so cagey; if everyone did it, then we were really doomed.

Ben said he was more worried that saying politically loaded things in the wrong order would reduce our chances of getting engagement from Yudkowsky than that someone would share his words out of context in a way that caused him distinct harm. And maybe more than both of those, that saying the wrong keywords would cause his correspondents to talk about him using the wrong keywords, in ways that caused illegible, hard-to-trace damage.


There's a view that assumes that as long as everyone is being cordial, our truthseeking public discussion must be basically on track; the discussion is only being warped by the fear of heresy if someone is overtly calling to burn the heretics.

I do not hold this view. I think there's a subtler failure mode where people know what the politically favored bottom line [LW · GW] is, and collude to ignore, nitpick, or just be uninterested in any fact or line of argument that doesn't fit. I want to distinguish between direct ideological conformity enforcement attempts, and people not living up to their usual epistemic standards in response to ideological conformity enforcement.

Especially compared to normal Berkeley, I had to give the Berkeley "rationalists" credit for being very good at free speech norms. (I'm not sure I would be saying this in the possible world where Scott Alexander didn't have a traumatizing experience with social justice in college, causing him to dump a ton of anti-social-justice, pro-argumentative-charity antibodies into the "rationalist" water supply after he became our subculture's premier writer. But it was true in our world.) I didn't want to fall into the bravery-debate trap of, "Look at me, I'm so heroically persecuted, therefore I'm right (therefore you should have sex with me)". I wasn't angry at the "rationalists" for silencing me (which they didn't); I was angry at them for making bad arguments and systematically refusing to engage with the obvious counterarguments.

As an illustrative example, in an argument on Discord in January 2019, I said, "I need the phrase 'actual women' in my expressive vocabulary to talk about the phenomenon where, if transition technology were to improve, then the people we call 'trans women' would want to make use of that technology; I need language that asymmetrically distinguishes between the original thing that already exists without having to try, and the artificial thing that's trying to imitate it to the limits of available technology".

Kelsey Piper replied, "the people getting surgery to have bodies that do 'women' more the way they want are mostly cis women [...] I don't think 'people who'd get surgery to have the ideal female body' cuts anything at the joints."

Another woman said, "'the original thing that already exists without having to try' sounds fake to me" (to the acclaim of four "+1" emoji reactions).

The problem with this kind of exchange is not that anyone is being shouted down, nor that anyone is lying. The problem is that people are motivatedly, "algorithmically" [LW · GW] "playing dumb." I wish we had more standard terminology for this phenomenon, which is ubiquitous in human life. By "playing dumb", I don't mean that Kelsey was consciously thinking, "I'm playing dumb in order to gain an advantage in this argument." I don't doubt that, subjectively, mentioning that cis women also get cosmetic surgery felt like a relevant reply. It's just that, in context, I was obviously trying to talk about the natural category of "biological sex", and Kelsey could have figured that out if she had wanted to.

It's not that anyone explicitly said, "Biological sex isn't real" in those words. (The elephant in the brain knew it wouldn't be able to get away with that.) But if everyone correlatedly plays dumb whenever someone tries to talk about sex in clear language in a context where that could conceivably hurt some trans person's feelings, I think what you have is a culture of de facto biological sex denialism. ("'The original thing that already exists without having to try' sounds fake to me"!!) It's not that hard to get people to admit that trans women are different from cis women, but somehow they can't (in public, using words) follow the implication that trans women are different from cis women because trans women are male.

Ben thought I was wrong to see this behavior as non-ostracizing. The deluge of motivated nitpicking is an implied marginalization threat, he explained: the game people were playing when they did that was to force me to choose between doing arbitrarily large amounts of interpretive labor or being cast as never having answered these construed-as-reasonable objections, and therefore over time losing standing to make the claim, being thought of as unreasonable, not getting invited to events, &c.

I saw the dynamic he was pointing at, but as a matter of personality, I was more inclined to respond, "Welp, I guess I need to write faster and more clearly", rather than, "You're dishonestly demanding arbitrarily large amounts of interpretive labor from me." I thought Ben was far too quick to give up on people whom he modeled as trying not to understand, whereas I continued to have faith in the possibility of making them understand if I just didn't give up. Not to play chess with a pigeon (which craps on the board and then struts around like it's won), or wrestle with a pig (which gets you both dirty, and the pig likes it), or dispute what the Tortoise said to Achilles—but to hold out hope that people in "the community" could only be boundedly motivatedly dense, and anyway that giving up wouldn't make me a stronger writer.

(Picture me playing Hermione Granger in a post-Singularity holonovel adaptation of Harry Potter and the Methods of Rationality, Emma Watson having charged me the standard licensing fee to use a copy of her body for the occasion: "We can do anything if we exert arbitrarily large amounts of interpretive labor!")

Ben thought that making them understand was hopeless and that becoming a stronger writer was a boring goal; it would be a better use of my talents to jump up a meta level and explain how people were failing to engage. That is, insofar as I expected arguing to work, I had a model of "the rationalists" that kept making bad predictions. What was going on there? Something interesting might happen if I tried to explain that.

(I guess I'm only now, after spending an additional four years exhausting every possible line of argument, taking Ben's advice on this by finishing and publishing this memoir. Sorry, Ben—and thanks.)


One thing I regret about my behavior during this period was the extent to which I was emotionally dependent on my posse, and in some ways particularly Michael, for validation. I remembered Michael as a high-status community elder back in the Overcoming Bias era (to the extent that there was a "community" in those early days).[7] I had been skeptical of him: the guy makes a lot of stridently "out there" assertions, in a way that makes you assume he must be speaking metaphorically. (He always insists he's being completely literal.) But he had social proof as the President of the Singularity Institute—the "people person" of our world-saving effort, to complement Yudkowsky's antisocial mad scientist personality—which inclined me to take his assertions more charitably than I otherwise would have.

Now, the memory of that social proof was a lifeline. Dear reader, if you've never been in the position of disagreeing with the entire weight of Society's educated opinion, including your idiosyncratic subculture that tells itself a story about being smarter and more open-minded than the surrounding Society—well, it's stressful. There was a comment on the /r/slatestarcodex subreddit around this time that cited Yudkowsky, Alexander, Piper, Ozy Brennan, and Rob Bensinger as leaders of the "rationalist" community. Just an arbitrary Reddit comment of no significance whatsoever—but it was a salient indicator of the zeitgeist to me, because every single one of those people had tried to get away with some variant on the "word usage is subjective, therefore you have no grounds to object to the claim that trans women are women" mind game.

In the face of that juggernaut of received opinion, I was already feeling pretty gaslighted. ("We ... we had a whole Sequence about this. And you were there, and you were there ... It—really happened, right? The hyperlinks [LW · GW] still [LW · GW] work [LW · GW] ...") I don't know how I would have held up intact if I were facing it alone. I definitely wouldn't have had the impudence to pester Alexander and Yudkowsky—especially Yudkowsky—if it was just me against everyone else.

But Michael thought I was in the right—not just intellectually, but morally in the right to be prosecuting the philosophy issue with our leaders. That social proof gave me a lot of bravery that I otherwise wouldn't have been able to muster up—even though it would have been better if I could have internalized that my dependence on him was self-undermining, insofar as Michael himself said that what made me valuable was my ability to think independently.

The social proof was probably more effective in my head than with anyone we were arguing with. I remembered Michael as a high-status community elder back in the Overcoming Bias era, but that had been a long time ago. (Luke Muehlhauser had taken over leadership of the Singularity Institute in 2011, and apparently, some sort of rift between Michael and Eliezer had widened in recent years.) Michael's status in "the community" of 2019 was much more mixed. He was intensely critical of the rise of the Effective Altruism movement, which he saw as using bogus claims about how to do the most good to prey on the smartest and most scrupulous people around. (I remember being at a party in 2015 and asking Michael what else I should spend my San Francisco software engineer money on, if not the EA charities I was considering. I was surprised when his answer was, "You.")

Another blow to Michael's reputation was dealt on 27 February 2019, when Anna published a comment badmouthing Michael and suggesting that talking to him was harmful [LW(p) · GW(p)], which I found disappointing—more so as I began to realize the implications.

I agreed with her point about how "ridicule of obviously-fallacious reasoning plays an important role in discerning which thinkers can (or can't) help" fill the role of vetting and common knowledge [LW · GW] creation. That's why I was so heartbroken about the "categories are arbitrary, therefore trans women are women" thing, which deserved to be laughed out of the room. Why was she trying to ostracize the guy who was one of the very few to back me up on this incredibly obvious thing!? The reasons given to discredit Michael seemed weak. (He ... flatters people? He ... didn't tell people to abandon their careers? What?) And the evidence against Michael she offered in private didn't seem much more compelling (e.g., at a CfAR event, he had been insistent on continuing to talk to someone who Anna thought looked near psychosis and needed a break).

It made sense for Anna to not like Michael anymore because of his personal conduct, or because of his opposition to EA. (Expecting all of my friends to be friends with each other would be Geek Social Fallacy #4.) If she didn't want to invite him to CfAR stuff, fine. But what did she gain from publicly denouncing him as someone whose "lies/manipulations can sometimes disrupt [people's] thinking for long and costly periods of time"?! She said she was trying to undo the effects of her previous endorsements of him, and that the comment seemed like it ought to be okay by Michael's standards (which didn't include an expectation that people should collude to protect each other's reputations).


I wasn't the only one whose life was being disrupted by political drama in early 2019. On 22 February, Scott Alexander posted that the /r/slatestarcodex Culture War Thread was being moved to a new non–Slate Star Codex–branded subreddit in the hopes that would curb some of the harassment he had been receiving. Alexander claimed that according to poll data and his own impressions, the Culture War Thread featured a variety of ideologically diverse voices but had nevertheless acquired a reputation as being a hive of right-wing scum and villainy.

Yudkowsky Tweeted:

Your annual reminder that Slate Star Codex is not and never was alt-right, every real stat shows as much, and the primary promoters of this lie are sociopaths who get off on torturing incredibly nice targets like Scott A.

I found Yudkowsky's use of the word "lie" here interesting given his earlier eagerness to police the use of the word "lie" by gender-identity skeptics. With the support of my posse, I wrote to him again, a third time (Subject: "on defending against 'alt-right' categorization").

I said, imagine if one of Alexander's critics were to reply: "Using language in a way you dislike, openly and explicitly and with public focus on the language and its meaning, is not lying. The proposition you claim false (explicit advocacy of a white ethnostate?) is not what the speech is meant to convey—and this is known to everyone involved, it is not a secret. You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning. Now, maybe as a matter of policy, you want to make a case for language like 'alt-right' being used a certain way. Well, that's a separate debate then. But you're not making a stand for Truth in doing so, and your opponents aren't tricking anyone or trying to."

How would Yudkowsky react if someone said that? My model of the Sequences-era Yudkowsky of 2009 would say, "This is an intellectually dishonest attempt to sneak in connotations [LW · GW] by performing a categorization and using an appeal-to-arbitrariness conversation-halter [LW · GW] to avoid having to justify it; go read 'A Human's Guide to Words.' [? · GW]"

But I had no idea what the real Yudkowsky of 2019 would say. If the moral of the "hill of meaning in defense of validity" thread had been that the word "lie" should be reserved for per se direct falsehoods, well, what direct falsehood was being asserted by Scott's detractors? I didn't think anyone was claiming that, say, Scott identified as alt-right, any more than anyone was claiming that trans women have two X chromosomes. Commenters on /r/SneerClub had been pretty explicit in their criticism that the Culture War thread harbored racists (&c.) and possibly that Scott himself was a secret racist, with respect to a definition of racism that included the belief that there exist genetically mediated population differences in the distribution of socially relevant traits and that this probably had decision-relevant consequences that should be discussable somewhere.

And this was correct. For example, Alexander's "The Atomic Bomb Considered As Hungarian High School Science Fair Project" favorably cites Cochran et al.'s genetic theory of Ashkenazi achievement as "really compelling." Scott was almost certainly "guilty" of the category membership that the speech was meant to convey—it's just that Sneer Club got to choose the category. If a machine-learning classifier returns positive on both Scott Alexander and Richard Spencer, the correct response is not that the classifier is "lying" (what would that even mean?) but that the classifier is not very useful for understanding Scott Alexander's effects on the world.

Of course, Scott is great, and it was right that we should defend him from the bastards trying to ruin his reputation, and it was plausible that the most politically convenient way to do that was to pound the table and call them lying sociopaths rather than engaging with the substance of their claims—much as how someone being tried under an unjust law might plead "Not guilty" to save their own skin rather than tell the whole truth and hope for jury nullification.

But, I argued, political convenience came at a dire cost to our common interest [LW · GW]. There was a proverb Yudkowsky had once failed to Google [LW · GW], that ran something like, "Once someone is known to be a liar, you might as well listen to the whistling of the wind."

Similarly, once someone is known to vary the epistemic standards of their public statements for political convenience—if they say categorizations can be lies when that happens to help their friends, but seemingly deny the possibility when that happens to make them look good politically ...

Well, you're still better off listening to them than the whistling of the wind, because the wind in various possible worlds is presumably uncorrelated with most of the things you want to know about, whereas clever arguers [LW · GW] who don't tell explicit lies [LW · GW] are constrained in how much they can mislead you. But it seems plausible that you might as well listen to any other arbitrary smart person with a blue check and 20K Twitter followers. It might be a useful exercise for Yudkowsky to think of what he would actually say if someone with social power did this to him when he was trying to use language to reason about Something he had to Protect?

(Note, my claim here is not that "Pronouns aren't lies" and "Scott Alexander is not a racist" are similarly misinformative. Rather, I'm saying that whether "You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning" makes sense as a response to "X isn't a Y" shouldn't depend on the specific values of X and Y. Yudkowsky's behavior the other month had made it look like he thought that "You're not standing in defense of truth if ..." was a valid response when, say, X = "Caitlyn Jenner" and Y = "woman." I was saying that whether or not it's a valid response, we should, as a matter of local validity [LW · GW], apply the same standard when X = "Scott Alexander" and Y = "racist.")

Without disclosing any specific content from private conversations that may or may not have happened, I can say that our posse did not get the kind of engagement from Yudkowsky that we were hoping for.

Michael said that it seemed important that, if we thought Yudkowsky wasn't interested, we should have common knowledge among ourselves that we considered him to be choosing to be a cult leader.

I settled on Sara Bareilles's "Gonna Get Over You" as my breakup song with Yudkowsky and the rationalists, often listening to a cover of it on loop to numb the pain. I found the lyrics were readily interpretable as being about my problems, even if Sara Bareilles had a different kind of breakup in mind. ("I tell myself to let the story end"—the story of the rationalists as a world-changing intellectual movement. "And my heart will rest in someone else's hand"—Michael Vassar's. "And I'm not the girl that I intend to be"—self-explanatory.)[8]

Meanwhile, my email thread with Scott started up again. I expressed regret that all the times I had emailed him over the past couple years had been when I was upset about something (like psych hospitals, or—something else) and wanted something from him, treating him as a means rather than an end—and then, despite that regret, I continued prosecuting the argument.

One of Alexander's most popular Less Wrong posts ever had been about the noncentral fallacy, which Alexander called "the worst argument in the world" [LW · GW]: those who (for example) crow that abortion is murder (because murder is the killing of a human being), or that Martin Luther King, Jr. was a criminal (because he defied the segregation laws of the South), are engaging in a dishonest rhetorical maneuver in which they're trying to trick their audience into assigning attributes of the typical "murder" or "criminal" to what are very noncentral members of those categories.

Even if you're opposed to abortion, or have negative views about the historical legacy of Dr. King, this isn't the right way to argue. If you call Fiona a murderer, that causes me to form a whole bunch of implicit probabilistic expectations on the basis of what the typical "murder" is like—expectations about Fiona's moral character, about the suffering of a victim whose hopes and dreams were cut short, about Fiona's relationship with the law, &c.—most of which get violated when you reveal that the murder victim was an embryo.

In the form of a series of short parables, I tried to point out that Alexander's own "The Worst Argument in the World" is complaining about the same category-gerrymandering move that his "... Not Man for the Categories" comes out in favor of. We would not let someone get away with declaring, "I ought to accept an unexpected abortion or two deep inside the conceptual boundaries of what would normally not be considered murder if it'll save someone's life." Maybe abortion is wrong and relevantly similar to the central sense of "murder", but you need to make that case on the empirical merits, not by linguistic fiat (Subject: "twelve short stories about language").

Scott still didn't get it. He didn't see why he shouldn't accept one unit of categorizational awkwardness in exchange for sufficiently large utilitarian benefits. He made an analogy to some lore from the Glowfic collaborative fiction writing community, a story about orcs who had unwisely sworn an oath to serve the evil god Melkor. Though the orcs intend no harm of their own will, they're magically bound to obey Melkor's commands and serve as his terrible army or else suffer unbearable pain. Our heroine comes up with a solution: she founds a new religion featuring a deist God who also happens to be named "Melkor". She convinces the orcs that since the oath didn't specify which Melkor, they're free to follow her new God instead of evil Melkor, and the magic binding the oath apparently accepts this casuistry if the orcs themselves do.

Scott's attitude toward the new interpretation of the oath in the story was analogous to his thinking about transgenderedness: sure, the new definition may be a little awkward and unnatural, but it's not objectively false, and it made life better for so many orcs. If rationalists should win [LW · GW], then the true rationalist in this story was the one who thought up this clever hack to save an entire species.

I started drafting a long reply—but then I remembered that in recent discussion with my posse, the idea had come up that in-person meetings are better for resolving disagreements. Would Scott be up for meeting in person some weekend? Non-urgent. Ben would be willing to moderate, unless Scott wanted to suggest someone else, or no moderator.

Scott didn't want to meet. I considered resorting to the tool of cheerful prices, which I hadn't yet used against Scott—to say, "That's totally understandable! Would a financial incentive change your decision? For a two-hour meeting, I'd be happy to pay up to $4000 to you or your preferred charity. If you don't want the money, then let's table this. I hope you're having a good day." But that seemed sufficiently psychologically coercive and socially weird that I wasn't sure I wanted to go there. On 18 March, I emailed my posse asking what they thought—and then added that maybe they shouldn't reply until Friday, because it was Monday, and I really needed to focus on my dayjob that week.

This is the part where I began to ... overheat. I tried ("tried") to focus on my dayjob, but I was just so angry. Did Scott really not understand the rationality-relevant distinction between "value-dependent categories as a result of caring about predicting different variables" (as explained by the dagim/water-dwellers vs. fish example in "... Not Man for the Categories") and "value-dependent categories in order to not make my friends sad"? Was he that dumb? Or was it that he was only verbal-smart, and this is the sort of thing that only makes sense if you've ever been good at linear algebra? (Such that the language of "only running your clustering algorithm on the subspace of the configuration space spanned by the variables that are relevant to your decisions" would come naturally.) Did I need to write a post explaining just that one point in mathematical detail, with executable code and a worked example with entropy calculations?
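(For the record, something like the following minimal sketch is the kind of executable example I had in mind. The three-feature dataset is a toy of my own invention, not anything from Scott's post: the point is just that which clusters your algorithm finds depends on which subspace of variables you feed it, which is what "value-dependent categories as a result of caring about predicting different variables" cashes out to.)

```python
# Toy illustration: clustering the same entities on different variable-subspaces
# yields different, equally "real" category systems.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# 200 entities described by three features. Features 0-1 separate the entities
# one way; feature 2 separates them a different, statistically independent way.
group = rng.integers(0, 2, size=200)
other = rng.integers(0, 2, size=200)
X = np.column_stack([
    group + 0.1 * rng.standard_normal(200),
    group + 0.1 * rng.standard_normal(200),
    other + 0.1 * rng.standard_normal(200),
])

# Cluster in the subspace of the variables you care about (features 0-1) ...
labels_a = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[:, :2])
# ... versus the subspace spanned by a different variable (feature 2).
labels_b = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[:, 2:])

# The two clusterings disagree at roughly chance level: same territory,
# different decision-relevant maps.
agreement = max(np.mean(labels_a == labels_b), np.mean(labels_a != labels_b))
print(f"agreement between the two clusterings: {agreement:.2f}")  # near 0.5
```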

My dayjob boss made it clear that he was expecting me to have code for my current Jira tickets by noon the next day, so I deceived myself into thinking I could accomplish that by staying at the office late. Maybe I could have caught up, if it were just a matter of the task being slightly harder than anticipated and I weren't psychologically impaired from being hyper-focused on the religious war. The problem was that focus is worth 30 IQ points, and an IQ 100 person can't do my job.

I was in so much (psychological) pain. Or at least, in one of a series of emails to my posse that night, I felt motivated to type the sentence, "I'm in so much (psychological) pain." I'm never sure how to interpret my own self-reports, because even when I'm really emotionally trashed (crying, shaking, randomly yelling, &c.), I think I'm still noticeably incentivizable: if someone were to present a credible threat (like slapping me and telling me to snap out of it), then I would be able to calm down. There's some sort of game-theory algorithm in the brain that feels subjectively genuine distress (like crying or sending people too many hysterical emails) but only when it can predict that it will be rewarded with sympathy or at least tolerated: tears are a discount on friendship.

I tweeted a Sequences quote (the mention of @ESYudkowsky being to attribute credit, I told myself; I figured Yudkowsky had enough followers that he probably wouldn't see a notification):

"—and if you still have something to protect, so that you MUST keep going, and CANNOT resign and wisely acknowledge the limitations of rationality— [1/3]

"—then you will be ready to start your journey[.] To take sole responsibility, to live without any trustworthy defenses, and to forge a higher Art than the one you were once taught. [2/3]

"No one begins to truly search for the Way until their parents have failed them, their gods are dead, and their tools have shattered in their hand." —@ESYudkowsky (https://www.lesswrong.com/posts/wustx45CPL5rZenuo/no-safe-defense-not-even-science [LW · GW]) [end/3]

Only it wasn't quite appropriate. The quote is about failure resulting in the need to invent new methods of rationality, better than the ones you were taught. But the methods I had been taught were great! I didn't have a pressing need to improve on them! I just couldn't cope with everyone else having forgotten!

I did eventually get some dayjob work done that night, but I didn't finish the whole thing my manager wanted done by the next day, and at 4 a.m., I concluded that I needed sleep, the lack of which had historically been very dangerous for me (being the trigger for my 2013 and 2017 psychotic breaks and subsequent psych imprisonments). We really didn't want another outcome like that. There was a couch in the office, and probably another four hours until my coworkers started to arrive. The thing I needed to do was just lie down on the couch in the dark and have faith that sleep would come. Meeting my manager's deadline wasn't that important. When people came in to the office, I might ask for help getting an Uber home? Or help buying melatonin? The important thing was to be calm.

I sent an email explaining this to Scott and my posse and two other friends (Subject: "predictably bad ideas").

Lying down didn't work. So at 5:26 a.m., I sent an email to Scott cc'ing my posse plus Anna about why I was so mad (both senses). I had a better draft sitting on my desktop at home, but since I was here and couldn't sleep, I might as well type this version (Subject: "five impulsive points, hastily written because I just can't even (was: Re: predictably bad ideas)"). Scott had been continuing to insist it's okay to gerrymander category boundaries for trans people's mental health, but there were a few things I didn't understand. If creatively reinterpreting the meanings of words because the natural interpretation would make people sad is okay, why didn't that generalize to an argument in favor of outright lying when the truth would make people sad? The mind games seemed crueler to me than a simple lie. Also, if "mental health benefits for trans people" matter so much, then why didn't my mental health matter? Wasn't I trans, sort of? Getting shut down by appeal-to-utilitarianism when I was trying to use reason to make sense of the world was observably really bad for my sanity!

Also, Scott had asked me if it wouldn't be embarrassing if the community solved Friendly AI and went down in history as the people who created Utopia forever, and I had rejected it because of gender stuff. But the original reason it had ever seemed remotely plausible that we would create Utopia forever wasn't "because we're us, the world-saving good guys," but because we were going to perfect an art of systematically correct reasoning. If we weren't going to do systematically correct reasoning because that would make people sad, then that undermined the reason that it was plausible that we would create Utopia forever.

Also-also, Scott had proposed a super–Outside View [? · GW] of the culture war as an evolutionary process that produces memes optimized to trigger PTSD syndromes and suggested that I think of that as what was happening to me. But, depending on how much credence Scott put in social proof, mightn't the fact that I managed to round up this whole posse to help me repeatedly argue with (or harass) Yudkowsky shift his estimate over whether my concerns had some objective merit that other people could see, too? It could simultaneously be the case that I had culture-war PTSD and my concerns had merit.

Michael replied at 5:58 a.m., saying that everyone's first priority should be making sure that I could sleep—that given that I was failing to adhere to my commitments to sleep almost immediately after making them, I should be interpreted as urgently needing help, and that Scott had comparative advantage in helping, given that my distress was most centrally over Scott gaslighting me, asking me to consider the possibility that I was wrong while visibly not considering the same possibility regarding himself.

That seemed a little harsh on Scott to me. At 6:14 a.m. and 6:21 a.m., I wrote a couple emails to everyone that my plan was to get a train back to my own apartment to sleep, that I was sorry for making such a fuss despite being incentivizable while emotionally distressed, that I should be punished in accordance with the moral law for sending too many hysterical emails because I thought I could get away with it, that I didn't need Scott's help, and that I thought Michael was being a little aggressive about that, but that I guessed that's also kind of Michael's style.

Michael was furious with me. ("What the FUCK Zack!?! Calling now," he emailed me at 6:18 a.m.) I texted and talked with him on my train ride home. He seemed to have a theory that people who are behaving badly, as Scott was, will only change when they see a victim who is being harmed. Me escalating and then immediately deescalating just after Michael came to help was undermining the attempt to force an honest confrontation, such that we could get to the point of having a Society with morality or punishment.

Anyway, I did get to my apartment and sleep for a few hours. One of the other friends I had cc'd on some of the emails, whom I'll call "Meredith", came to visit me later that morning with her 2½-year-old son—I mean, her son at the time.

(Incidentally, the code that I had written intermittently between 11 p.m. and 4 a.m. was a horrible bug-prone mess, and the company has been paying for it ever since.)

At some level, I wanted Scott to know how frustrated I was about his use of "mental health for trans people" as an Absolute Denial Macro. But when Michael started advocating on my behalf, I started to minimize my claims because I had a generalized attitude of not wanting to sell myself as a victim. Ben pointed out that making oneself mentally ill in order to extract political concessions only works if you have a lot of people doing it in a visibly coordinated way—and even if it did work, getting into a dysphoria contest with trans people didn't seem like it led anywhere good.

I supposed that in Michael's worldview, aggression is more honest than passive-aggression. That seemed true, but I was psychologically limited in how much overt aggression I was willing to deploy against my friends. (And particularly Yudkowsky, whom I still hero-worshiped.) But clearly, the tension between "I don't want to do too much social aggression" and "Losing the Category War within the rationalist community is absolutely unacceptable" was causing me to make wildly inconsistent decisions. (Emailing Scott at 4 a.m. and then calling Michael "aggressive" when he came to defend me was just crazy: either one of those things could make sense, but not both.)

Did I just need to accept that there was no such thing as a "rationalist community"? (Sarah had told me as much two years ago while tripsitting me during my psychosis relapse, but I hadn't made the corresponding mental adjustments.)

On the other hand, a possible reason to be attached to the "rationalist" brand name and social identity that wasn't just me being stupid was that the way I talk had been trained really hard on this subculture for ten years. Most of my emails during this whole campaign had contained multiple Sequences or Slate Star Codex links that I could expect the recipients to have read. I could use the phrase "Absolute Denial Macro" [LW · GW] in conversation and expect to be understood. If I gave up on the "rationalists" being a thing, and went out into the world to make friends with Quillette readers or arbitrary University of Chicago graduates, then I would lose all that accumulated capital. Here, I had a massive home territory advantage because I could appeal to Yudkowsky's writings about the philosophy of language from ten years ago and people couldn't say, "Eliezer who? He's probably a Bad Man."

The language I spoke was mostly educated American English, but I relied on subculture dialect for a lot. My sister has a chemistry doctorate from MIT (and so speaks the language of STEM intellectuals generally), and when I showed her "... To Make Predictions", she reported finding it somewhat hard to read, likely because I casually use phrases like "thus, an excellent motte" and expect to be understood without the reader taking 10 minutes to read the link. That essay, which was me writing from the heart in the words that came most naturally to me, could not be published in Quillette. The links and phraseology were just too context bound.

Maybe that's why I felt like I had to stand my ground and fight for the world I was made in, even though the contradiction between the war effort and my general submissiveness had me making crazy decisions.

Michael said that a reason to make a stand here in "the community" was because if we didn't, the beacon of "rationalism" would continue to lure and mislead others—but that more importantly, we needed to figure out how to win this kind of argument decisively, as a group. We couldn't afford to accept a status quo of accepting defeat when faced with bad faith arguments in general. Ben reported writing to Scott to ask him to alter the beacon so that people like me wouldn't think "the community" was the place to go for the rationality thing anymore.

As it happened, the next day, we saw these Tweets from @ESYudkowsky, linking to a Quillette article interviewing Lisa Littman about her work positing a socially contagious "rapid onset" type of gender dysphoria among young females:

Everything more complicated than protons tends to come in varieties. Hydrogen, for example, has isotopes. Gender dysphoria involves more than one proton and will probably have varieties. https://quillette.com/2019/03/19/an-interview-with-lisa-littman-who-coined-the-term-rapid-onset-gender-dysphoria/

To be clear, I don't know much about gender dysphoria. There's an allegation that people are reluctant to speciate more than one kind of gender dysphoria. To the extent that's not a strawman, I would say only in a generic way that GD seems liable to have more than one species.

(Why now? Maybe he saw the tag in my "tools have shattered" Tweet on Monday, or maybe the Quillette article was just timely?)

The most obvious reading of these Tweets was as a political concession to me. The two-type taxonomy of MtF was the thing I was originally trying to talk about, back in 2016–2017, before getting derailed onto the present philosophy-of-language war, and here Yudkowsky was backing up my side on that.

At this point, some readers might think that this should have been the end of the matter, that I should have been satisfied. I had started the recent drama flare-up because Yudkowsky had Tweeted something unfavorable to my agenda. But now, Yudkowsky was Tweeting something favorable to my agenda! Wouldn't it be greedy and ungrateful for me to keep criticizing him about the pronouns and language thing, given that he'd thrown me a bone here? Shouldn't I call it even?

That's not how it works. The entire concept of "sides" to which one can make "concessions" is an artifact of human coalitional instincts. It's not something that makes sense as a process for constructing a map that reflects the territory. My posse and I were trying to get a clarification about a philosophy-of-language claim Yudkowsky had made a few months prior ("you're not standing in defense of truth if [...]"). Why would we stop prosecuting that because of this unrelated Tweet about the etiology of gender dysphoria? That wasn't the thing we were trying to clarify!

Moreover—and I'm embarrassed that it took me another day to realize this—this new argument from Yudkowsky about the etiology of gender dysphoria was wrong. As I would later get around to explaining in "On the Argumentative Form 'Super-Proton Things Tend to Come in Varieties'", when people claim that some psychological or medical condition "comes in varieties", they're making a substantive empirical claim that the causal or statistical structure of the condition is usefully modeled as distinct clusters, not merely making the trivial observation that instances of the condition are not identical down to the subatomic level.

So we shouldn't think that there are probably multiple kinds of gender dysphoria because things are made of protons. If anything, a priori reasoning about the cognitive function of categorization should actually cut in the other direction, (mildly) against rather than in favor of multi-type theories: you only want to add more categories to your theory if they can pay for their additional complexity with better predictions [LW · GW]. If you believe in Blanchard–Bailey–Lawrence's two-type taxonomy of MtF, or Littman's proposed rapid-onset type, it should be on the empirical merits, not because multi-type theories are a priori more likely to be true (which they aren't).
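(To gesture at what "pay for their additional complexity with better predictions" means quantitatively, here is a minimal sketch with invented data, using the standard Bayesian information criterion as a stand-in for the relevant complexity penalty: a two-cluster model should only beat a one-cluster model when the data actually support the extra structure.)

```python
# Toy illustration: extra categories must earn their keep via better fit,
# net of a complexity penalty (here, BIC).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def best_k_by_bic(X, max_k=3):
    """Return the number of mixture components with the lowest BIC."""
    scores = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
              for k in range(1, max_k + 1)}
    return min(scores, key=scores.get)

# Data with one real cluster: a second category can't pay its complexity cost.
one_cluster = rng.standard_normal((500, 2))
# Data with two well-separated clusters: the second category earns its keep.
two_clusters = np.vstack([rng.standard_normal((250, 2)) - 3,
                          rng.standard_normal((250, 2)) + 3])

print(best_k_by_bic(one_cluster))   # expect 1
print(best_k_by_bic(two_clusters))  # expect 2
```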

Had Yudkowsky been thinking that maybe if he Tweeted something favorable to my agenda, then I and the rest of Michael's gang would be satisfied and leave him alone?

But if there's some other reason you suspect there might be multiple species of dysphoria, but you tell people your suspicion is because "everything more complicated than protons tends to come in varieties", you're still misinforming people for political reasons, which was the general problem we were trying to alert Yudkowsky to. Inventing fake rationality lessons in response to political pressure is not okay, and the fact that in this case the political pressure happened to be coming from me didn't make it okay.

I asked the posse if this analysis was worth sending to Yudkowsky. Michael said it wasn't worth the digression. He asked if I was comfortable generalizing from Scott's behavior, and what others had said about fear of speaking openly, to assuming that something similar was going on with Eliezer? If so, then now that we had common knowledge, we needed to confront the actual crisis, "that dread is tearing apart old friendships and causing fanatics to betray everything that they ever stood for while its existence is still being denied."


That week, former MIRI researcher Jessica Taylor joined our posse (being at an in-person meeting with Ben and Sarah and another friend on the seventeenth, and getting tagged in subsequent emails). I had met Jessica for the first time in March 2017, shortly after my psychotic break, and I had been part of the group trying to take care of her when she had her own break in late 2017 [LW · GW], but other than that, we hadn't been particularly close.

Significantly for political purposes, Jessica is trans. We didn't have to agree up front on all gender issues for her to see the epistemology problem with "... Not Man for the Categories", and to say that maintaining a narcissistic fantasy by controlling category boundaries wasn't what she wanted, as a trans person. (On the seventeenth, when I lamented the state of a world that incentivized us to be political enemies, her response was, "Well, we could talk about it first.") Michael said that me and Jessica together had more moral authority than either of us alone.

As it happened, I ran into Scott on the BART train that Friday, the twenty-second. He said he wasn't sure why the oft-repeated moral of "A Human's Guide to Words" had been "You can't define a word any way you want" rather than "You can define a word any way you want, but then you have to deal with the consequences."

Ultimately, I thought this was a pedagogy decision that Yudkowsky had gotten right back in 2008. If you write your summary slogan in relativist language, people predictably take that as license to believe whatever they want without having to defend it. Whereas if you write your summary slogan in objectivist language—so that people know they don't have social permission to say, "It's subjective, so I can't be wrong"—then you have some hope of sparking useful thought about the exact, precise ways that specific, definite things are relative to other specific, definite things.

I told Scott I would send him one more email with a piece of evidence about how other "rationalists" were thinking about the categories issue and give my commentary on the parable about orcs, and then the present thread would probably drop there.

Concerning what others were thinking: on Discord in January, Kelsey Piper had told me that everyone else experienced their disagreement with me as being about where the joints are and which joints are important, where usability for humans was a legitimate criterion of importance, and it was annoying that I thought they didn't believe in carving reality at the joints at all and that categories should be whatever makes people happy.

I didn't want to bring it up at the time because I was so overjoyed that the discussion was actually making progress on the core philosophy-of-language issue, but Scott did seem to be pretty explicit that his position was about happiness rather than usability? If Kelsey thought she agreed with Scott, but actually didn't, that sham consensus was a bad sign for our collective sanity, wasn't it?

As for the parable about orcs, I thought it was significant that Scott chose to tell the story from the standpoint of non-orcs deciding what verbal behaviors [LW · GW] to perform while orcs are around, rather than the standpoint of the orcs themselves. For one thing, how do you know that serving evil-Melkor is a life of constant torture? Is it at all possible that someone has given you misleading information about that?

Moreover, you can't just give an orc a clever misinterpretation of an oath and have them believe it. First you have to cripple their general ability [LW · GW] to correctly interpret oaths, for the same reason that you can't get someone to believe that 2+2=5 without crippling their general ability to do arithmetic. We weren't talking about a little "white lie" that the listener will never get to see falsified (like telling someone their dead dog is in heaven); the orcs already know the text of the oath, and you have to break their ability to understand it. Are you willing to permanently damage an orc's ability to reason in order to save them pain? For some sufficiently large amount of pain, surely. But this isn't a choice to make lightly—and the choices people make to satisfy their own consciences don't always line up with the volition of their alleged beneficiaries. We think we can lie to save others from pain, without wanting to be lied to ourselves. But behind the veil of ignorance, it's the same choice!

I also had more to say about philosophy of categories: I thought I could be more rigorous about the difference between "caring about predicting different variables" and "caring about consequences", in a way that Eliezer would have to understand even if Scott didn't. (Scott had claimed that he could use gerrymandered categories and still be just as good at making predictions—but that's just not true if we're talking about the internal use of categories as a cognitive algorithm [LW · GW], rather than mere verbal behavior. It's easy to say "X is a Y" for arbitrary X and Y if the stakes demand it, but that's not the same thing as using that concept of Y internally as part of your world-model.)

But after consultation with the posse, I concluded that further email prosecution was not useful at this time; the philosophy argument would work better as a public Less Wrong post. So my revised Category War to-do list was:

Ben didn't think the mathematically precise categories argument was the most important thing for Less Wrong readers to know about: a similarly careful explanation of why I'd written off Scott, Eliezer, and the "rationalists" would be way more valuable.

I could see the value he was pointing at, but something in me balked at the idea of attacking my friends in public (Subject: "treachery, faith, and the great river (was: Re: DRAFTS: 'wrapping up; or, Orc-ham's razor' and 'on the power and efficacy of categories')").

Ben had previously written (in the context of the effective altruism movement) about how holding criticism to a higher standard than praise distorts our collective map. He was obviously correct that this was a distortionary force relative to what ideal Bayesian agents would do, but I was worried that when we're talking about criticism of people rather than ideas, the removal of the distortionary force would just result in social conflict (and not more truth). Criticism of institutions and social systems should be filed under "ideas" rather than "people", but the smaller-scale you get, the harder this distinction is to maintain: criticizing, say, "the Center for Effective Altruism", somehow feels more like criticizing Will MacAskill personally than criticizing "the United States" does, even though neither CEA nor the U.S. is a person.

That was why I couldn't give up faith that honest discourse eventually wins. Under my current strategy and consensus social norms, I could criticize Scott or Kelsey or Ozy's ideas without my social life dissolving into a war of all against all, whereas if I were to give in to the temptation to flip a table and say, "Okay, now I know you guys are just messing with me," then I didn't see how that led anywhere good, even if they really were.

Jessica explained what she saw as the problem with this. What Ben was proposing was creating clarity about behavioral patterns. I was saying that I was afraid that creating such clarity is an attack on someone. But if so, then my blog was an attack on trans people. What was going on here?

Socially, creating clarity about behavioral patterns is construed as an attack and can make things worse for someone. For example, if your livelihood is based on telling a story about you and your flunkies being the only sane truthseeking people in the world, then me demonstrating that you don't care about the truth when it's politically inconvenient is a threat to your marketing story and therefore to your livelihood. As a result, it's easier to create clarity down power gradients than up them: it was easy for me to blow the whistle on trans people's narcissistic delusions, but hard to blow the whistle on Yudkowsky's.[9]

But selectively creating clarity down but not up power gradients just reinforces existing power relations—in the same way that selectively criticizing arguments with politically unfavorable conclusions only reinforces your current political beliefs. I shouldn't be able to get away with claiming that calling non-exclusively-androphilic trans women delusional perverts is okay on the grounds that that which can be destroyed by the truth should be, but that calling out Alexander and Yudkowsky would be unjustified on the grounds of starting a war or whatever. Jessica was on board with a project to tear down narcissistic fantasies in general, but not a project that starts by tearing down trans people's narcissistic fantasies, then emits spurious excuses for not following that effort where it leads.

Somewhat apologetically, I replied that the distinction between truthfully, publicly criticizing group identities and named individuals still seemed important to me?—as did avoiding leaking info from private conversations. I would be more comfortable writing a scathing blog post about the behavior of "rationalists", than about a specific person not adhering to good discourse norms in an email conversation that they had good reason to expect to be private. I thought I was consistent about this; contrast my writing with the way that some anti-trans writers name and shame particular individuals. (The closest I had come was mentioning Danielle Muscato as someone who doesn't pass—and even there, I admitted it was "unclassy" and done out of desperation.) I had to acknowledge that criticism of non-exclusively-androphilic trans women in general implied criticism of Jessica, and criticism of "rationalists" in general implied criticism of Yudkowsky and Alexander and me, but the extra inferential step and "fog of probability" seemed to make the speech act less of an attack. Was I wrong?

Michael said this was importantly backwards: less precise targeting is more violent. If someone said, "Michael Vassar is a terrible person," he would try to be curious, but if they didn't have an argument, he would tend to worry more "for" them and less "about" them, whereas if someone said, "The Jews are terrible people," he saw that as a more serious threat to his safety. (And rationalists and trans women are exactly the sort of people who get targeted by the same people who target Jews.)


Polishing the advanced categories argument from earlier email drafts into a solid Less Wrong post didn't take that long: by 6 April 2019, I had an almost complete draft of the new post, "Where to Draw the Boundaries?" [LW · GW], that I was pretty happy with.

The title (note: "boundaries", plural) was a play off of "Where to Draw the Boundary?" [LW · GW] (note: "boundary", singular), a post from Yudkowsky's original Sequence [? · GW] on the 37 ways in which words can be wrong [LW · GW]. In "... Boundary?", Yudkowsky asserts (without argument, as something that all educated people already know) that dolphins don't form a natural category with fish ("Once upon a time it was thought that the word 'fish' included dolphins [...] Or you could stop playing nitwit games and admit that dolphins don't belong on the fish list"). But Alexander's "... Not Man for the Categories" directly contradicts this, asserting that there's nothing wrong with the biblical Hebrew word dagim encompassing both fish and cetaceans (dolphins and whales). So who's right—Yudkowsky (2008) or Alexander (2014)? Is there a problem with dolphins being "fish", or not?

In "... Boundaries?", I unify the two positions and explain how both Yudkowsky and Alexander have a point: in high-dimensional configuration space, there's a cluster of finned water-dwelling animals in the subspace of the dimensions along which finned water-dwelling animals are similar to each other, and a cluster of mammals in the subspace of the dimensions along which mammals are similar to each other, and dolphins belong to both of them. Which subspace you pay attention to depends on your values: if you don't care about predicting or controlling some particular variable, you have no reason to look for similarity clusters [LW · GW] along that dimension.

But given a subspace of interest, the technical criterion of drawing category boundaries around regions of high density in configuration space [LW · GW] still applies. There is Law governing which uses of communication signals transmit which information, and the Law can't be brushed off with, "whatever, it's a pragmatic choice, just be nice." I demonstrate the Law with a couple of simple mathematical examples: if you redefine a codeword that originally pointed to one cluster in ℝ³, to also include another, that changes the quantitative predictions you make about an unobserved coordinate given the codeword; if an employer starts giving the title "Vice President" to line workers, that decreases the mutual information between the job title and properties of the job.
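(To make the job-title example concrete, here is a minimal sketch with made-up numbers rather than the post's actual figures: diluting who gets called "Vice President" shrinks the mutual information between the title and what the job actually involves.)

```python
# A toy illustration (made-up numbers, not the post's actual calculation) of the
# "Vice President" example: mutual information between a job title and what the
# job involves, before and after the employer starts handing out the title.
from collections import Counter
from math import log2

def mutual_information(pairs):
    """I(X;Y) in bits, estimated from a list of (x, y) samples."""
    n = len(pairs)
    p_xy = Counter(pairs)
    p_x = Counter(x for x, _ in pairs)
    p_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((p_x[x] / n) * (p_y[y] / n)))
        for (x, y), c in p_xy.items()
    )

# Before: only the 5 executives (out of 100 employees) are called "Vice President".
before = [("VP", "sets strategy")] * 5 + [("line worker", "assembles widgets")] * 95

# After: the same 100 employees, but 40 line workers have been retitled "VP".
after = ([("VP", "sets strategy")] * 5
         + [("VP", "assembles widgets")] * 40
         + [("line worker", "assembles widgets")] * 55)

print(round(mutual_information(before), 2))  # 0.29 bits
print(round(mutual_information(after), 2))   # 0.06 bits: the title now tells you less
```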

(Jessica and Ben's discussion of the job title example in relation to the Wikipedia summary of Jean Baudrillard's Simulacra and Simulation got published separately and ended up taking on a life of its own in future posts, including [LW · GW] a [LW · GW] number [LW · GW] of [? · GW] posts by other authors.)

Sarah asked if the math wasn't a bit overkill: were the calculations really necessary to make the basic point that good definitions should be about classifying the world, rather than about what's pleasant or politically expedient to say?

I thought the math was important as an appeal to principle—and as intimidation. (As it was written, the tenth virtue is precision! Even if you cannot do the math, knowing that the math exists tells you that the dance step is precise and has no room in it for your whims.)

"... Boundaries?" explains all this in the form of discourse with a hypothetical interlocutor arguing for the I-can-define-a-word-any-way-I-want position. In the hypothetical interlocutor's parts, I wove in verbatim quotes (without attribution) from Alexander ("an alternative categorization system is not an error, and borders are not objectively true or false") and Yudkowsky ("You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning"; "Using language in a way you dislike is not lying. The propositions you claim false [...] is not what the [...] is meant to convey, and this is known to everyone involved; it is not a secret") and Bensinger ("doesn't unambiguously refer to the thing you're trying to point at").

My thinking here was that the posse's previous email campaigns had been doomed to failure by being too closely linked to the politically contentious object-level topic, which reputable people had strong incentives not to touch with a ten-meter pole. So if I wrote this post just explaining what was wrong with the claims Yudkowsky and Alexander had made about the philosophy of language, with perfectly innocent examples about dolphins and job titles, that would remove the political barrier to Yudkowsky correcting the philosophy of language error. If someone with a threatening social-justicey aura were to say, "Wait, doesn't this contradict what you said about trans people earlier?", the reputable people could stonewall them. (Stonewall them and not me!)

Another reason someone might be reluctant to correct mistakes when pointed out is the fear that such a policy could be abused by motivated nitpickers. It would be pretty annoying to be obligated to churn out an endless stream of trivial corrections by someone motivated to comb through your entire portfolio and point out every little thing you did imperfectly, ever.

I wondered if maybe, in Scott or Eliezer's mental universe, I was a blameworthy (or pitiably mentally ill) nitpicker for flipping out over a blog post from 2014 (!) and some Tweets (!!) from November. I, too, had probably said things that were wrong five years ago.

But I thought I had made a pretty convincing case that a lot of people were making a correctable and important rationality mistake, such that the cost of a correction (about the philosophy of language specifically, not any possible implications for gender politics) would be justified here. As Ben pointed out, if someone had put this much effort into pointing out an error I had made four months or five years ago and making careful arguments for why it was important to get the right answer, I probably would put some serious thought into it.

I could see a case that it was unfair of me to include political subtext and then only expect people to engage with the politically clean text, but if we weren't going to get into full-on gender politics on Less Wrong (which seemed like a bad idea), and gender politics was motivating an epistemology error, I wasn't sure what else I was supposed to do. I was pretty constrained here!

(I did regret having accidentally poisoned the well the previous month by impulsively sharing "Blegg Mode" as a Less Wrong linkpost [LW · GW]. "Blegg Mode" had originally been drafted as part of "... To Make Predictions" before getting spun off as a separate post. Frustrated in March at our failing email campaign, I thought it was politically "clean" enough to belatedly share, but it proved to be insufficiently deniably allegorical, as evidenced by the 60-plus-entry trainwreck of a comments section. It's plausible that some portion of the Less Wrong audience would have been more receptive to "... Boundaries?" if they hadn't been alerted to the political context by the comments on the "Blegg Mode" linkpost.)

On 13 April 2019, I pulled the trigger on publishing "... Boundaries?", and wrote to Yudkowsky again, a fourth time (!), asking if he could either publicly endorse the post, or publicly comment on what he thought the post got right and what he thought it got wrong—and that if engaging on this level was too expensive for him in terms of spoons, if there was any action I could take to somehow make it less expensive. The reason I thought this was important, I explained, was that if rationalists in good standing find themselves in a persistent disagreement about rationality itself, that seemed like a major concern for our common interest [LW · GW], something we should be eager to definitively settle in public (or at least clarify the current state of the disagreement). In the absence of a rationality court of last resort, I feared the closest thing we had was an appeal to Eliezer Yudkowsky's personal judgment. Despite the context in which the dispute arose, this wasn't a political issue. The post I was asking for his comment on was just about the mathematical laws [LW · GW] governing how to talk about, e.g., dolphins. We had nothing to be afraid of here. (Subject: "movement to clarity; or, rationality court filing").

I got some pushback from Ben and Jessica about claiming that this wasn't "political". What I meant by that was to emphasize (again) that I didn't expect Yudkowsky or "the community" to take a public stance on gender politics. Rather, I was trying to get "us" to take a stance in favor of the kind of epistemology that we were doing in 2008. It turns out that epistemology has implications for gender politics that are unsafe, but those implications are more inferential steps away. And I guess I didn't expect the sort of people who would punish good epistemology to follow the inferential steps?

Anyway, again without revealing any content from the other side of any private conversations that may or may not have occurred, we did not get any public engagement from Yudkowsky.

It seemed that the Category War was over, and we lost.

We lost?! How could we lose?! The philosophy here was clear-cut. This shouldn't be hard or expensive or difficult to clear up. I could believe that Alexander was "honestly" confused, but Yudkowsky?

I could see how, under ordinary circumstances, asking Yudkowsky to weigh in on my post would be inappropriately demanding of a Very Important Person's time, given that an ordinary programmer such as me was surely as a mere worm in the presence of the great Eliezer Yudkowsky. (I would have humbly given up much sooner if I hadn't gotten social proof from Michael and Ben and Sarah and "Riley" and Jessica.)

But the only reason for my post to exist was because it would be even more inappropriately demanding to ask for a clarification in the original gender-political context. The economist Thomas Schelling (of "Schelling point" fame) once wrote about the use of clever excuses to help one's negotiating counterparty release themself from a prior commitment: "One must seek [...] a rationalization by which to deny oneself too great a reward from the opponent's concession, otherwise the concession will not be made."[10] This is what I was trying to do when soliciting—begging for—engagement or endorsement of "... Boundaries?" By making the post be about dolphins, I was trying to deny myself too great of a reward on the gender-politics front. I don't think it was inappropriately demanding to expect "us" (him) to be correct about the cognitive function of categorization. I was trying to be as accommodating as I could, short of just letting him (us?) be wrong.

I would have expected him to see why we had to make a stand here, where the principles of reasoning that made it possible for words to be assigned interpretations at all were under threat.

A hill of validity in defense of meaning.

Maybe that's not how politics works? Could it be that, somehow, the mob-punishment mechanisms that weren't smart enough to understand the concept of "bad argument (categories are arbitrary) for a true conclusion (trans people are OK)", were smart enough to connect the dots between my broader agenda and my abstract philosophy argument, such that VIPs didn't think they could endorse my philosophy argument, without it being construed as an endorsement of me and my detailed heresies?

Jessica mentioned talking with someone about me writing to Yudkowsky and Alexander about the category boundary issue. This person described having a sense that I should have known it wouldn't work—because of the politics involved, not because I wasn't right. I thought Jessica's takeaway was poignant:

Those who are savvy in high-corruption equilibria maintain the delusion that high corruption is common knowledge, to justify expropriating those who naively don't play along, by narratizing them as already knowing and therefore intentionally attacking people, rather than being lied to and confused.

Should I have known that it wouldn't work? Didn't I "already know", at some level?

I guess in retrospect, the outcome does seem kind of obvious—that it should have been possible to predict in advance, and to make the corresponding update without so much fuss and wasting so many people's time.

But it's only "obvious" if you take as a given that Yudkowsky is playing a savvy Kolmogorov complicity strategy like any other public intellectual in the current year.

Maybe this seems banal if you haven't spent your entire adult life in his robot cult. From anyone else in the world, I wouldn't have had a problem with the "hill of validity in defense of meaning" thread—I would have respected it as a solidly above-average philosophy performance before setting the bozo bit on the author and getting on with my day. But since I did spend my entire adult life in Yudkowsky's robot cult, trusting him the way a Catholic trusts the Pope, I had to assume that it was an "honest mistake" in his rationality lessons, and that honest mistakes could be honestly corrected if someone put in the effort to explain the problem. The idea that Eliezer Yudkowsky was going to behave just as badly as any other public intellectual in the current year was not really in my hypothesis space.

Ben shared the account of our posse's email campaign with someone who commented that I had "sacrificed all hope of success in favor of maintaining his own sanity by CC'ing you guys." That is, if I had been brave enough to confront Yudkowsky by myself, maybe there was some hope of him seeing that the game he was playing was wrong. But because I was so cowardly as to need social proof (because I believed that an ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky), it probably just looked to him like an illegible social plot originating from Michael.

One might wonder why this was such a big deal to us. Okay, so Yudkowsky had prevaricated about his own philosophy of language for political reasons, and he couldn't be moved to clarify even after we spent an enormous amount of effort trying to explain the problem. So what? Aren't people wrong on the internet all the time?

This wasn't just anyone being wrong on the internet. In an essay on the development of cultural traditions, Scott Alexander had written that rationalism is the belief that Eliezer Yudkowsky is the rightful caliph. To no small extent, I and many other people had built our lives around a story that portrayed Yudkowsky as almost uniquely sane—a story that put MIRI, CfAR, and the "rationalist community" at the center of the universe, the ultimate fate of the cosmos resting on our individual and collective mastery of the hidden Bayesian structure of cognition.

But my posse and I had just falsified to our satisfaction the claim that Yudkowsky was currently sane in the relevant way. Maybe he didn't think he had done anything wrong (because he hadn't strictly lied [LW · GW]), and probably a normal person would think we were making a fuss about nothing, but as far as we were concerned, the formerly rightful caliph had relinquished his legitimacy. A so-called "rationalist" community that couldn't clarify this matter of the cognitive function of categories was a sham. Something had to change if we wanted a place in the world for the spirit of "naïve" (rather than politically savvy) inquiry to survive.

(To be continued. Yudkowsky would eventually clarify his position on the philosophy of categorization in September 2020—but the story leading up to that will have to wait for another day.)


  1. Similarly, in automobile races, you want rules to enforce that all competitors have the same type of car, for some commonsense operationalization of "the same type", because a race between a sports car and a moped would be mostly measuring who has the sports car, rather than who's the better racer. ↩︎

  2. And in the case of sports, the facts are so lopsided that if we must find humor in the matter, it really goes the other way. A few years later, Lia Thomas would dominate an NCAA women's swim meet by finishing 4.2 standard deviations (!!) earlier than the median competitor, and Eliezer Yudkowsky feels obligated to pretend not to see the problem? You've got to admit, that's a little bit funny. ↩︎

  3. Despite my misgivings, this blog was still published under a pseudonym at the time; it would have been hypocritical of me to accuse someone of cowardice about what they're willing to attach their real name to. ↩︎

  4. The title was a pun referencing computer scientist Scott Aaronson's post advocating "The Kolmogorov Option", serving the cause of Truth by cultivating a bubble that focuses on specific truths that won't get you in trouble with the local political authorities. Named after the Soviet mathematician Andrey Kolmogorov, who knew better than to pick fights he couldn't win. ↩︎

  5. In Part One, Chapter VII, "The Exploiters and the Exploited". ↩︎

  6. CfAR had been spun off from MIRI in 2012 as a dedicated organization for teaching rationality. ↩︎

  7. Yudkowsky's Sequences (except the last [? · GW]) had originally been published on Overcoming Bias before the creation of Less Wrong in early 2009. ↩︎

  8. In general, I'm proud of my careful choices of breakup songs. For another example, my breakup song with institutionalized schooling was Taylor Swift's "We Are Never Ever Getting Back Together", a bitter renunciation of an on-again-off-again relationship ("I remember when we broke up / The first time") with an ex who was distant and condescending ("And you, would hide away and find your peace of mind / With some indie record that's much cooler than mine"), thematically reminiscent of my ultimately degree-less string of bad relationships with UC Santa Cruz (2006–2007), Heald College (2008), Diablo Valley College (2010–2012), and San Francisco State University (2012–2013).

    The fact that I've invested so much symbolic significance in carefully-chosen songs by female vocalists to mourn relationships with abstract perceived institutional authorities, and conspicuously not for any relationships with actual women, maybe tells you something about how my life has gone. ↩︎

  9. Probably a lot of other people who lived in Berkeley would find it harder to criticize trans people than to criticize some privileged white guy named Yudkowski or whatever. But those weren't the relevant power gradients in my world. ↩︎

  10. The Strategy of Conflict, Ch. 2, "An Essay on Bargaining" ↩︎

118 comments

Comments sorted by top scores.

comment by lc · 2023-07-16T07:34:56.799Z · LW(p) · GW(p)

So, Zack, I agree with most of your takes on the object level issues here. At the same time, the amount of motivated reasoning and dishonesty you attribute to this community and Yudkowsky in particular as a result of these passing comments seems comically exaggerated. Personally, I cannot recall any discussion of gender or transgenderism on LessWrong, or from major LessWrong contributors outside of LessWrong, except for yours. A few tweets from Eliezer asking that trans people be addressed as they wish do not substantiate to me this sky-is-falling level of panic about community epistemics you seem to have.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-07-16T20:26:10.839Z · LW(p) · GW(p)

Whether the sky is falling depends on how high the sky was in the past, and whether that's worth panicking over depends on your utility function over sky height? (That's the short version. The long version is another 70,000 words over four posts.)

Replies from: lc
comment by lc · 2023-07-17T13:25:11.984Z · LW(p) · GW(p)

I do not think this is a matter of merely having held Eliezer in less esteem. There is something to be said about how LessWrong developed a cult of personality around Eliezer, but rather than an objection to the cult of personality per se, what your posts amount to is a criticism of Eliezer for not living up to the standards of his personality cult, with small notes in passing about how unhealthy your reverence for him was.

The long version is another 70,000 words over four posts.

Criticisms of particular people or groups that long tend to be nebulous and pathological rather than based in some reasonable concern. I hope you will understand if I am too skeptical to read the whole thing, if it cannot be summarized into something concrete.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-07-18T02:20:52.769Z · LW(p) · GW(p)

Thanks. It sounds like you should regard my sky-is-falling level of panic as unsubstantiated until I come back to you with a summary at the end, at which time you can reëvaluate the question of whether I was correct to panic.

comment by Raemon · 2023-07-17T18:28:49.619Z · LW(p) · GW(p)

Responding to Zack's comment here [LW(p) · GW(p)] in a new thread since the other thread went in a different direction.

The thing me and my allies were hoping for a "court ruling" on was not about who should or shouldn't be held in high regard, but about the philosophical claim that one "ought to accept an unexpected [X] or two deep inside the conceptual boundaries of what would normally be considered [Y] if [positive consequence]". (I think this is false.) [LW · GW] That's what really matters, not who we should or shouldn't hold in high regard.

I found this a helpful crisp summary of the actual thing you want. (I realize you're not done with this blogseries yet, and probably the blogseries is serving multiple purposes, but insofar as what-you-wanted hasn't happened, I think writing a post at the end that succinctly spells out the things you want that you don't feel like you've gotten yet would probably be worthwhile)

A thing I'm still somewhat fuzzy on is whether you think this court ruling is about "on LessWrong / in truth-focused contexts/communities", "worldwide in all contexts" (or something in-between), and insofar as you think it's "worldwide in all contexts", if you think this has the same degree of crisp "there's a mathematical right answer here" or "this is the correct tradeoff to make given a messy world."

I think one of your central points of contention here is that the bolded sentence is false:

If I’m willing to accept an unexpected chunk of Turkey deep inside Syrian territory to honor some random dead guy – and I better, or else a platoon of Turkish special forces will want to have a word with me – then I ought to accept an unexpected man or two deep inside the conceptual boundaries of what would normally be considered female if it’ll save someone’s life. There’s no rule of rationality saying that I shouldn’t, and there are plenty of rules of human decency saying that I should.

...and Scott should retract it and we should all agree he should retract it (and that refusal to do this makes for a sort of epistemic black hole that ripples outward in a contagious lies [LW · GW] sort of way, which has pretty bad consequences on our collective mapmaking as well as just being factually-wrong in isolation)

(I agree with the above claim. I think both rationalists and broader society should get called out on claims like this)

One more claim I'm pretty sure you're making that I agree with is "it's important to preserve a category for 'actual women' that you can still talk about." 

I'm not sure if you also think:

  • self-described-rationalists shouldn't be willing to accept the social category of women including not-very-successfully-transitioned transwomen because of reasoning like "there are rules of rationality suggesting this has epistemic costs, but the social benefits outweigh those costs."
  • broader society shouldn't have that social category.

I'm separating out "rationalist society" and "broader society" because they seem different to me. Rationalists have spec'd into prioritizing truth/mapmaking. I'm confidently glad someone is doing that. I'd confidently argue broader society should be more into truth/mapmaking but I'm not sure what steps I'd take in what order to achieve that, and what tradeoffs to make along the way given various messy realities.

(Or: since "rationalist" is a made up label and there's nothing intrinsically wrong with being "more truthseeking-focused without making it your Primary Deal", a better phrasing is "if you are trying to be truthseeking focused and you haven't noticed or cared that you subverted a rationality principle for political expedience, you should be pretty concerned that you are doing this in other domains, or that it has more contagious-lie effects than you're acknowledging. And other people should be suspicious of you and not grant you as much credibility as a 'rationalist'. You should at the very least notice the edges of your truthseeking competence.")

There's some other specific claims I'm not sure if you're making but I'll leave it there for now.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-08-13T23:07:02.823Z · LW(p) · GW(p)

Thanks for your patience. For the most part, I try to be reluctant to issue proclamations about what other people "should" do, as I explain in "I Don't Do Policy". ("Should" claims can be decomposed into conditional predictions about consequences given actions, and preferences over consequences. The conditional predictions can be evaluated on their merits, and I don't control other people's preferences.) In particular, I don't think there's One True social gender convention.

The "court ruling" thing was an unusual case where, you know, I had been under the impression that this subculture with Eliezer Yudkowsky as its acknowledged leader was serious about being "spec'd into prioritizing truth/mapmaking". It really seemed like the kind of thing that shouldn't be hard to clear up—that he would perceive an interest in clearing up, after the problem had been explained.

My position at the time was, "Scott should retract it and we should all agree he should retract it". Since that didn't happen despite absurd efforts, my conclusion is more that there is no "we". It would be nice if there were a subculture that had spec'd into prioritizing truth/mapmaking, but our little cult/patronage-network doesn't deserve credit for that: on the margin, I'm now more interested in efforts to break people of the delusion that the ingroup has a monopoly on Truth and Goodness, than I am in efforts to get the ingroup to live up to its marketing material about being the place in the world for people who are interested in Truth and Goodness.

Obviously, this doesn't entail giving up on Truth and Goodness! ("Thinking" and "writing" and "helping people" were not invented here.) It doesn't even entail cutting ties with "the community." (I am, actually, still using this website, at least for now.) It does involve propagating the implication from "'rationalist' is a made up label" (as you say) to applying the same standards of moral reasoning to the ingroup's cult/patronage-network and broader Society (and not confusing the former thing for "rationalist Society").

Because you know, maybe the "community" model was just never a good idea to begin with? I'm worried that if I were to accept your support in enforcing norms about the category-boundary thing, then the elephant in my brain might think I therefore owed you a favor and should stop giving you such a hard time about the free-speech-for-disagreeable-people thing. As the post notes, that's not how it works. Or maybe it is how human communities work, but it's not how the ideal justice of systematically correct reasoning works.

Replies from: Raemon, Richard_Kennaway
comment by Raemon · 2023-08-15T22:19:09.851Z · LW(p) · GW(p)

Nod, thanks.

Okay, rewriting this to check my understanding, you're saying:

  • In a rationalist community that was actively pretty successful at being good at mapmaking, more people would have proactively noticed that that particular line of Scott's was false. The fact that this didn't happen is evidence about the state of how much one should trust (and have trusted) the rationality community to live up to its marketing material.
  • But, rather than being primarily interested in actually prosecuting that case, at this point you think it's more important to drive home the point "we're already living in the world where the community failed to notice that this was a rationality test and they failed", and that this has implications for how people should think of "the rationality community" (or lack thereof)

I didn't quite understand the second clause until you spelled it out just now, thanks.

Overall I am still more focused on "actual live up to the marketing hype" because, well, I actually think we... just need good enough epistemics to handle high stakes decisions with unclear technical underpinnings and political motivations [LW(p) · GW(p)]. I'd want to get the Real Thing whether or not I previously believed that we had it.

(I'm not sure whether this is particularly different from your model here? I guess the major diff in-this-domain is that I'm still more optimistic about solutions mediated through "having a community with shared norms", and you think that's sufficiently likely to be net-negative by default that one should be pretty skeptical about that by default?)

...

Mostly separate point:

I agree somewhat directionally with the second point, but don't think it's as big a deal as you think. I think you're basically right that you were gaslit to some degree by people with politically-motivated-cognition. But, some other considerations at play are:

  • It's in fact non-obvious what the right answer is (it seems like Eliezer or Scott shouldn't get this wrong or take much convincing, but, see below for additional problems, and I think this was at least a nontrivial part of my own confusion earlier on. Your blog posts that focused on the math helped me)
  • It sounded like it was in some cases unclear to people what question you were trying to argue (based on your description of some arguments with Scott and Kelsey), i.e. "self-identity isn't a good way to define male/female" vs. "it matters in the first place that words mean things, and that people doing 'shared mapmaking' should at least notice and care about tradeoffs re: mapmakability".
    • (this includes multi-level frame-mismatches, i.e. the thing where you "don't do policy" is actually surprising and non-obvious to people)
    • I think you in fact have multiple goals or claims, and even if the "Scott you should retract this one sentence" was the most important one originally, I think people were correct to not believe that was your only goal, and to be suspicious of your framing?
  • People have different experiences of how "thought policed" they perceive themselves as being, which changes their initial guesses of how bad the problem you're trying to point at is. (i.e. I recently had an argument with someone about why they should care about your deal, and was like 'Zack doesn't like people telling him what to think, especially when they are telling him how to think wrong', and Bob responded 'but nobody is telling him how to think!' and I said 'oh holy hell they are absolutely telling him how to think. I can think of at least one concrete FB argument where one rationalist-you-know was specifically upset at another rationalist-you-know for not thinking of them as male, not merely what words they used.' Bob was surprised, and updated).
  • And then a last point (which'd probably sound rude of me to bring up to most people, but seems important here), is, well, at least since 2018, you've had a pretty strong vibe of "seeming to relate to this in a zealous and unhealthy way". It's hard to tell the difference between "people are avoiding the real conversation due to leftist political reasons" and "people are avoiding the conversation because you just seem... a little crazy". It's especially hard to tell the difference when both are true at the same time.

All five points (i.e. those four bullets + the sort of political mindkilledness you seem to be primarily hypothesizing) have different mechanisms. But they overlap and blur together and it's hard to tell which are most significant. I think you'd probably agree all 4 are relevant, but maybe attribute > 60% of the causality to the "politically mindkilled and/or political expedience" aspect, where I think that's... maybe like 30-45% of it? Which is still a lot, but being less than 50% of the causality changes my relationship with it.

...

(I didn't quite find a place to link to it inline but this whole thing is one of the reasons I wrote Norm Innovation and Theory of Mind [LW · GW])

comment by Richard_Kennaway · 2023-08-14T08:44:26.550Z · LW(p) · GW(p)

my conclusion is more that there is no "we".

There never was.

comment by tailcalled · 2023-07-15T21:47:31.157Z · LW(p) · GW(p)

and that it's actually immoral not to believe in psychological sex differences given that psychological sex differences are actually real

Perhaps the archetypal psychological sex difference that people have argued about is "women are emotional, men are rational".

After reading your previous post where you quoted Deirdre McCloskey's memoir, I started reading the memoir a bit too, and it actually provides a neat example of this psychological sex difference.

There was a period where McCloskey's wife got all emotional and started complaining about McCloskey spending too much money on the phone bill. McCloskey very rationally pointed out that it was very cheap compared to e.g. therapy or hobbies. Clean example of female irrationality, right?

Of course, if one looks at the extended context, the picture is very different; Deirdre had initially assured the wife that it was just crossdressing and nothing else, and they had agreed with each other to put the crossdressing into the background, but now Deirdre was seriously considering transitioning, yet insisting that things like beard shaving were just crossdressing and nothing more. Essentially, from the beginning McCloskey took rational conversation off the table due to expecting that the wife would leave if they talked with each other about McCloskey's desire to be a woman.

In retrospect, it seems McCloskey's wife was right to worry. But also, it illustrates how psychological sex differences discourse tends to abstract over the environmental constraints people exist within. It seems like in these sorts of cases, "psychological sex differences" can function as a tool of oppression or abuse, rather than a genuine attempt at describing the world.

I think "psychological sex differences" ideology is energetically unable to acknowledge these sorts of things because its main purpose/motivation is to function as a counterstory against feminist ideology, and so the point is not to accurately describe the world but instead to deny cultural factors. (Let's not forget James Damore's memo, who cited research on greater female neuroticism as a justification for ignoring women's issues with their workplace.)

One of the Tweets that had recently led to radical feminist Meghan Murphy getting kicked off the platform read simply, "Men aren't women tho." This doesn't seem like a policy claim; rather, Murphy was using common language to express the fact-claim that members of the natural category of adult human males, are not, in fact, members of the natural category of adult human females.

I think her tweet was intended to make a claim about trans women specifically, and that it was this sub-claim that she was banned for. For example, if she had asked for a list of the richest women, and someone had linked Elon Musk, I don't think she would have gotten banned for responding "Men aren't women tho.".

So I guess one could say Murphy was using common language to express the fact-claim that even if members of the natural category of adult human males transition or become intent on transitioning, they are not, in fact, members of the natural category of adult human females.

By "traits" I mean not just sex chromosomes (as Yudkowsky suggested on Twitter), but the conjunction of dozens or hundreds of measurements that are causally downstream of sex chromosomes: reproductive organs and muscle mass (again, sex difference effect size of Cohen's d ≈ 2.6) and Big Five Agreeableness (d ≈ 0.5) and Big Five Neuroticism (d ≈ 0.4) and short-term memory (d ≈ 0.2, favoring women) and white-gray-matter ratios in the brain and probable socialization history and any number of other things—including differences we might not know about, but have prior reasons to suspect exist.

I think trans women are female-typical with respect to Agreeableness and Neuroticism, though I don't know why. I think HRT changes large-scale brain proportions to be either intermediate between male and female or all the way to the female end, though I don't think this has anything to do with Agreeableness/Neuroticism.

Of course since you didn't transition and went off HRT, this might not apply to you.

But having done the reduction-to-cognitive-algorithms, it still looks like the person-in-the-street has a point that I shouldn't be allowed to ignore just because I have 30 more IQ points and better philosophy-of-language skills?

Yes but the coalition that supports the person-in-the-street opposes me in my dispute with Phil because my dispute with Phil looks superficially similar, so I oppose the coalition that supports the person-in-the-street.

Of course, such speech restrictions aren't necessarily "irrational", depending on your goals. If you just don't think "free speech" should go that far—if you want to suppress atheism or gender-critical feminism with an iron fist—speech codes are a perfectly fine way to do it! And to their credit, I think most theocrats and trans advocates are intellectually honest about what they're doing: atheists or transphobes are bad people (the argument goes) and we want to make it harder for them to spread their lies or their hate.

Yes.

After all, using Aumann-like [? · GW] reasoning, in a dispute of 'me and Michael Vassar vs. everyone else', wouldn't I want to bet on 'everyone else'?"

The intuition here is that Aumann-like reasoning implies something like averaging everyone's opinions, and therefore if there are lots of people on one side, they would dominate in the averaging.

But actually, I think it is better to think of Aumann-like reasoning as adding together everyone's opinions. More formally, if you imagine that everyone has observed different pieces of independent evidence, leading to different people having different updates of their opinions relative to the prior, then to get the Aumannian update you have to add up all the changes in log-odds.
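(A minimal sketch with made-up numbers, assuming each person's evidence is independent given the hypothesis: three people who each independently end up at 60% jointly imply noticeably more than 60%, because their changes in log-odds add rather than average.)

```python
# A minimal sketch (made-up numbers): if three people's pieces of evidence are
# independent given the hypothesis, you combine them by adding their changes in
# log-odds, not by averaging their probabilities.
from math import log, exp

def logit(p):
    return log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + exp(-x))

prior = 0.5                    # shared prior
posteriors = [0.6, 0.6, 0.6]   # each person independently updated to 60%

averaged = sum(posteriors) / len(posteriors)
combined = sigmoid(logit(prior) + sum(logit(p) - logit(prior) for p in posteriors))

print(round(averaged, 2))  # 0.6
print(round(combined, 2))  # 0.77: three weak independent updates add up
```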

Or alternatively, you are thinking of it as, "what is the probability that all of them have somehow become irrational/dishonest with respect to this subject?". I think... it's rare for irrationality/dishonesty to be one-sided? In the example disputes I can think of off the top of my head, it's usually both or neither.

In theory one might think that there should be a negative correlation between the sides being dishonest due to collider bias; you only need one side to be dishonest in order to get a dispute, so it does seem kind of weird if dishonesty on both sides is correlated. But I think what happens is that disputes lead to conflicts, and conflicts lead to defensiveness, overreach and aggressiveness, and so a single dispute can spin out into an entire system of rationality falling apart. These disputes all arguably started generations ago.
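(To spell out the "in theory" collider-bias step, here is a toy simulation with made-up probabilities: dishonesty on the two sides is independent in the population, but among observed disputes it comes out negatively correlated.)

```python
# A toy simulation (made-up probabilities) of the collider-bias point: each
# side's dishonesty is independent in the population, but conditioning on
# "a dispute happened" induces a negative correlation between them.
import random

random.seed(0)
observed_disputes = []
for _ in range(100_000):
    a_dishonest = random.random() < 0.3
    b_dishonest = random.random() < 0.3
    # A dispute is likely if at least one side is dishonest, rare otherwise.
    p_dispute = 0.8 if (a_dishonest or b_dishonest) else 0.05
    if random.random() < p_dispute:
        observed_disputes.append((a_dishonest, b_dishonest))

def correlation(pairs):
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    return cov / ((mean_a * (1 - mean_a)) * (mean_b * (1 - mean_b))) ** 0.5

print(round(correlation(observed_disputes), 2))  # around -0.57 with these numbers
```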

So basically, my Aumann-like reasoning for persistent conflicts would go: The rationalist community is being irrational/dishonest about trans topics. There must be a reason for this; and indeed, it seems like a defense mechanism against the conservative/HBD coalition's conflicts against trans people. But the conservative/HBD coalition is also often bad, irrational and dishonest! And you've been regularly trusting them and appealing to their findings in your posts, even though I think upon closer inspection there are lots of issues that you'd recognize. Ooops.

I said that saying, "I am worried that if I actually point out the physical injuries ..." when the actual example turned out to be sex reassignment surgery seemed dishonest: I had thought he might have more examples of situations like mine or "Rebecca"'s, where gaslighting escalated into more tangible harm in a way that people wouldn't know about by default. In contrast, people already know that bottom surgery is a thing; Ben just had reasons to think it's Actually Bad—reasons that his friends couldn't engage with if we didn't know what he was talking about. It was bad enough that Yudkowsky was being so cagey; if everyone did it, then we were really doomed.

I guess this makes for a neat example of what I just mentioned.

Michael replied at 5:58 a.m., saying that everyone's first priority should be making sure that I could sleep—that given that I was failing to adhere to my commitments to sleep almost immediately after making them, I should be interpreted as urgently needing help, and that Scott had comparative advantage in helping, given that my distress was most centrally over Scott gaslighting me, asking me to consider the possibility that I was wrong while visibly not considering the same possibility regarding himself.

That seemed a little harsh on Scott to me. At 6:14 a.m. and 6:21 a.m., I wrote a couple emails to everyone that my plan was to get a train back to my own apartment to sleep, that I was sorry for making such a fuss despite being incentivizable while emotionally distressed, that I should be punished in accordance with the moral law for sending too many hysterical emails because I thought I could get away with it, that I didn't need Scott's help, and that I thought Michael was being a little aggressive about that, but that I guessed that's also kind of Michael's style.

Michael was furious with me. ("What the FUCK Zack!?! Calling now," he emailed me at 6:18 a.m.) I texted and talked with him on my train ride home. He seemed to have a theory that people who are behaving badly, as Scott was, will only change when they see a victim who is being harmed. Me escalating and then immediately deescalating just after Michael came to help was undermining the attempt to force an honest confrontation, such that we could get to the point of having a Society with morality or punishment.

Another strong example.

I mean, I can feel how Michael and Ben are hurting here. They are in open conflict with the rationalist leaders and are spending a lot of reputation on this.

But they are also opposing locally honest behavior in order to force the rationalist community to improve.

I wondered if maybe, in Scott or Eliezer's mental universe, I was a blameworthy (or pitiably mentally ill) nitpicker for flipping out over a blog post from 2014 (!) and some Tweets (!!) from November. I, too, had probably said things that were wrong five years ago.

But I thought I had made a pretty convincing case that a lot of people were making a correctable and important rationality mistake, such that the cost of a correction (about the philosophy of language specifically, not any possible implications for gender politics) would be justified here. As Ben pointed out, if someone had put this much effort into pointing out an error I had made four months or five years ago and making careful arguments for why it was important to get the right answer, I probably would put some serious thought into it.

Yep. And I mean, once you've written up the explanation of what the error is like, it's pretty cheap to correct.

Scott Alexander could cross out his "Categories" post and put a link to your response at the top, and write a brief public announcement that he had changed his mind in a publicly accessible place, such as a new top-level post on ACX. Eliezer could quote-tweet his old tweets with a link to your rebuttal, and thank you for pointing out his bias.

Replies from: localdeity, Zack_M_Davis, f____
comment by localdeity · 2023-07-15T22:46:10.985Z · LW(p) · GW(p)

(Let's not forget James Damore's memo, who cited research on greater female neuroticism as a justification for ignoring women's issues with their workplace.)

I don't think that's true, and if anything it looks to be the opposite.  Original document; the relevant quotes about neuroticism and what to do about it seem to be:

Personality differences

  • Neuroticism (higher anxiety, lower stress tolerance).
    ○ This may contribute to the higher levels of anxiety women report on Googlegeist and to the lower number of women in high stress jobs.

[...]

Non-discriminatory ways to reduce the gender gap

[...] 

  • Women on average are more prone to anxiety
    • Make tech and leadership less stressful.  Google already partly does this with its many stress reduction courses and benefits.
Replies from: tailcalled
comment by tailcalled · 2023-07-15T22:58:40.720Z · LW(p) · GW(p)

Maybe I'm misunderstanding how the Googlegeist works.

At my workplace, we regularly have surveys where we get asked about how we feel about various things. But if we report negative feelings, we get asked for suggestions about what is wrong.

The way I had imagined the situation is, someone working with the Googlegeist had noticed that a lot of women reported anxiety or whatever, and had decided they need to work with women to figure out what's going on here, to solve it. And then James Damore felt that this was one instance of people looking at a disparity and claiming injustice, and that since he finds it biologically inevitable that women would be anxious, this shouldn't be treated as indicative of an external problem, but instead should be medicalized and treated psychologically (or psychiatrically?).

But I admit I haven't looked much into it so maybe the above model is wrong; originally when the Damore memo came out, I supported him, and it's only later I've been thinking that maybe I shouldn't have supported him. But I haven't had much chance to talk with people about it.

Gotta go sleep.

Replies from: localdeity
comment by localdeity · 2023-07-16T03:55:08.952Z · LW(p) · GW(p)

The way I had imagined the situation is, someone working with the Googlegeist had noticed that a lot of women reported anxiety or whatever, and had decided they need to work with women to figure out what's going on here, to solve it. And then James Damore felt that this was one instance of people looking at a disparity and claiming injustice, and that since he finds it biologically inevitable that women would be anxious, this shouldn't be treated as indicative of an external problem, but instead should be medicalized and treated psychologically (or psychiatrically?). [italics added]

As a side note, I consider the italicized part a rather weighty accusation.  I think one should therefore be careful about making such an accusation.  I guess, in this case, you were just honestly reporting the contents of your brain on the matter, not necessarily making an accusation.

Still, I think this to some extent illustrates an epistemic environment where it's normal to throw around damaging accusations whose truth value is somewhere between "extremely uncharitable interpretation" and "objectively false".  Precisely the type that got Damore fired, in other words.  Do we have such an environment even among rationalists?  That is at the heart of Zack's adventure.

(Incidentally, imagine if Damore had claimed the opposite—"Women are less prone to anxiety and can handle stress more easily."  Wouldn't that also lead to accusations that Damore was saying we can ignore women's problems?)

Anyway, on to object level.  I think Damore's point, in bringing it up, was that the stress in (some portion of) tech jobs may be a reason there are fewer women than men in tech.  Reasons to think this:

  • The title of the super-section containing the "neuroticism" quote is "Possible non-bias causes of the gender gap in tech".
  • The super-section is preceded by "For the rest of this document, I’ll concentrate on the extreme stance that all differences in outcome are due to differential treatment [italics added] and the authoritarian element that’s required to actually discriminate to create equal representation."
  • The last sentence in the section ("Personality differences") is "We need to stop assuming that gender gaps imply sexism."
  • As already quoted, he says that the anxiety thing implies that "Mak[ing] tech and leadership less stressful" would be a "non-discriminatory way to reduce the gender gap".

If Damore had said "Here are some issues women reported; and we should discount these reports because women are extra-anxious", then your model would be well-founded.  I don't see him saying anything like that in the document, though.  In the whole document, Damore doesn't mention anything reported by women on Googlegeist, other than the anxiety thing.  (I would be surprised if he, being an engineer and not in HR or leadership, had access to the arbitrary text field submissions from the other employees; I would guess he saw aggregated results on numerical questions, plus any items leadership chose to share with everyone.)  Googlegeist itself is mentioned only two other times in the document; both times it's him suggesting something be done with future Googlegeist surveys.

He does mention another item as a (primarily) women's issue, although the source is a 2006 paper rather than Googlegeist.  Again, he does advocate doing something about it (with caveats):

Non-discriminatory ways to reduce the gender gap

[...]

  • Women on average look for more work-life balance while men have a higher drive for status on average
    ○ Unfortunately, as long as tech and leadership remain high status, lucrative careers, men may disproportionately want to be in them. Allowing and truly endorsing (as part of our culture) part time work though can keep more women in tech.

Now, at the end, he says this:

Philosophically, I don't think we should do arbitrary social engineering of tech just to make it appealing to equal portions of both men and women. For each of these changes, we need principled reasons for why it helps Google; that is, we should be optimizing for Google—with Google's diversity being a component of that. For example, currently those willing to work extra hours or take extra stress will inevitably get ahead and if we try to change that too much, it may have disastrous consequences. Also, when considering the costs and benefits, we should keep in mind that Google's funding is finite so its allocation is more zero-sum than is generally acknowledged.

The most uncharitable reader could say "Aha, so he's laid the groundwork to not follow through with anything that actually helps women, keeping the status quo, and everything he's said before is just a trick."  If the reader comes in with that kind of implicit assumption about Damore's character, then they'll probably stick with it; all I can say is, evidence for such a belief does not come from the document.  (Incidentally, I've met Damore at a party; I read him as a well-meaning nerd, who thought that if he made a sufficiently comprehensive, careful, well-cited, and constructively oriented writeup, he could cut through the hostility and they'd work out some solutions that would make everyone happier.  The result is really tragic in that light.)

I think, to come up with your conclusion, you have to do a lot of reading into the text, and a lot of not reading the actual text.  Which, I think, was par for the course for most negative takes on Damore.  I am surprised and somewhat perturbed by your report that you originally supported Damore, and wonder what happened since then.  Perhaps memory faded and "osmosis" brought in others' takes?

Replies from: tailcalled
comment by tailcalled · 2023-07-16T14:03:06.403Z · LW(p) · GW(p)

In more detail, my background is I used to subscribe to research into psychological differences between the sexes and the races, with a major influence in my views being Scott Alexander (though there's also a whole backstory to how I got into this).

I eventually started doing my own empirical research into transgender topics, and found Blanchardianism/autogynephilia theory to give the strongest effect sizes.

And as I was doing this, I was learning more about how to perform this sort of research: psychometrics, causal inference, psychology, etc. Over time, I got a feeling for what sorts of research questions are fruitful, what sort of methods and critiques are valid, and what sorts of dynamics and distinctions should be paid attention to.

But I also started getting a feeling for how the researchers into differential psychology operate. Here's a classical example; an IQ researcher who is so focused on providing a counternarrative to motivational theories that he uses methods which are heavily downwards biased to "prove" that IQ test scores don't depend on effort. Or Simon Baron-Cohen playing Motte-Bailey with the "extreme male brain" theory of autism.

More abstractly, what I've generally noticed is:

  • These sorts of people are not very interested in actually developing substantive theory or testing their claims in strong ways which might disprove them.
  • Instead they are mainly interested in providing a counternarrative to progressive theories.
  • They often use superficial or invalid psychometric methods.
  • They often make insinuations that they have some deep theory or deep studies, but really actually don't.

So yes, I am bringing priors from outside of this. I've been at the heart of the supposed science into these things, and I have become horrified at what I once trusted.

Onto your points:

I think Damore's point, in bringing it up, was that the stress in (some portion of) tech jobs may be a reason there are fewer women than men in tech.

You may or may not be right that this is what he meant.

(I think it's a completely wrong position, because the sex difference in neuroticism is much smaller (by something like 2x) than the sex difference in tech interests and tech abilities, and presumably the selection effect for neuroticism on career field is also much smaller than that of interests. So I'm not sure your reading on it is particularly more charitable, only uncharitable in a different direction; assuming a mistake rather than a conflict.)

... I don't think this changes the point that it assumes the measured sex difference in Neuroticism is a causative agent in promoting sex differences in stress, rather than addressing the possibility that the increased Neuroticism may reflect additional problems women are facing?

(Incidentally, imagine if Damore had claimed the opposite—"Women are less prone to anxiety and can handle stress more easily."  Wouldn't that also lead to accusations that Damore was saying we can ignore women's problems?)

The correct thing to claim is "We should investigate what people are anxious/stressed about". Jumping to conclusions that people's states are simply a reflection of their innate traits is the problem.

As a side note, I consider the italicized part a rather weighty accusation.  I think one should therefore be careful about making such an accusation.  I guess, in this case, you were just honestly reporting the contents of your brain on the matter, not necessarily making an accusation.

Still, I think this to some extent illustrates an epistemic environment where it's normal to throw around damaging accusations whose truth value is somewhere between "extremely uncharitable interpretation" and "objectively false".  Precisely the type that got Damore fired, in other words.  Do we have such an environment even among rationalists?  That is at the heart of Zack's adventure.

I don't think this is at the heart of Zack's adventure? Zack's issues were mainly about leading rationalists jumping in to rationalize things in the name of avoiding conflicts.

Anyway, making weighty claims about people is core to what differential psychology is about. It's possible that some of my claims about Damore are false, in which case we should discuss that and fix the mistakes. However, the position that one should just keep quiet about claims about people simply because they are weighty would also seem to imply that we should keep quiet about claims about trans people and masculinity/femininity, or race and IQ, or, to make the Damore letter more relevant, men/women and various traits related to performance in tech.

Incidentally, I've met Damore at a party; I read him as a well-meaning nerd, who thought that if he made a sufficiently comprehensive, careful, well-cited, and constructively oriented writeup, he could cut through the hostility and they'd work out some solutions that would make everyone happier.  The result is really tragic in that light.

Somewhat possible this is true. I think nerdy communities like LessWrong should do a better job at communicating the problems with various differential psychology findings and communicating how they are often made by conservatives to promote an agenda. If they did this, perhaps Damore would not have been in this situation.

But also, if I take my personal story as a template, then it's probably more complicated than that. Yes, I had lots of time where I was a well-meaning nerd and I got tricked by conservative differential psychology. But a big part of the reason I got into this differential psychology in the first place was a distrust of feminism. If I had done my due diligence, or if nerdy groups had been better at communicating the problems with these areas, it might not have become as much of a problem.

Replies from: localdeity
comment by localdeity · 2023-07-17T07:21:37.824Z · LW(p) · GW(p)

I'll address this first:

More abstractly, what I've generally noticed is:

  • These sorts of people are not very interested in actually developing substantive theory or testing their claims in strong ways which might disprove them.
  • Instead they are mainly interested in providing a counternarrative to progressive theories.
  • They often use superficial or invalid psychometric methods.
  • They often make insinuations that they have some deep theory or deep studies, but really actually don't.

These things are bad, but, apart from point 2, I would ask: how do they compare to the average quality of social science research?  Do you have high standards, or do you just have high standards for one group?  I think most of us spend at least some time in environments where the incentive gradients point towards the latter.  Beware isolated demands for rigor.

Research quality being what it is, I would recommend against giving absolute trust to anyone, even if they appear to have earned it.  If there's a result you really care about, it's good to pick at least one study and dig into exactly what they did, and to see if there are other replications; and the prior probability of "fraud" probably shouldn't go below 1%.

As for point 2—if you were a researcher with heretical opinions, determined to publish research on at least some of them, what would you do?  It seems like a reasonable strategy is to pick something heretical that you're confident you can defend, and do a rock-solid study on it, and brace for impact.  Is it still the case that disproving the blank-slate hypothesis would constitute progress in some academic subfields?  If so, then expect people to continue trying it.

Now, digging into the examples:

Here's a classical example; an IQ researcher who is so focused on providing a counternarrative to motivational theories that he uses methods which are heavily downwards biased to "prove" that IQ test scores don't depend on effort.

The study says there was "a meta-analysis concluding that small monetary incentives could improve test scores by 0.64 SDs" (roughly 10 IQ points); looks to be Duckworth et al. 2011.  The guy says it seemed sketchy—the studies had small N, weird conditions, and/or fraudulent researchers.  Looking at Table S1 from Duckworth, indeed, N is <100 on most of the studies; "Bruening and Zella (1978)" sticks out as having a large effect size and a large N, and, when I google for more info about that, I find that Bruening was convicted by an NIMH panel of scientific fraud.  Checks out so far.

The guy ran a series of studies, the last of which offered incentives of nil, £2, and £5-£10 for test performance, with the smallest subgroup being N=150, taken from the adult population via "prolific academic".  He found that £2 and £5-£10 had similar effects, those being apparently 0.2 SD and 0.15 SD respectively, which would be 3 IQ points or a little less.  (Were the "small monetary incentives" from Duckworth of that size?  The Duckworth table shows most of the studies as being in the $1-$9 or <$1 range; looks like yes.)  So, at least as a "We suspected these results were bogus, tried to reproduce them, and got a much smaller effect size", this seems all in order.
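For reference, the SD-to-IQ-point conversions here assume the conventional IQ scale with SD = 15:

```latex
\Delta\,\mathrm{IQ} = d \times 15:\qquad 0.64 \times 15 = 9.6,\qquad 0.20 \times 15 = 3.0,\qquad 0.15 \times 15 = 2.25
```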

Now, you say:

IQ test effort correlates with IQ scores, and they investigate whether it is causal using incentives. However, as far as I can tell, their data analysis is flawed, and when performed correctly the conclusion reverses.

[...] Incentives increase effort, but they only have marginal effects on performance. Does this show that effort doesn't matter?  No, because incentives also turn out to only have marginal effects on effort! Surely if you only improve effort a bit, you wouldn't expect to have much influence on scores. We can solve this by a technique called instrumental variables. Basically, we divide the effect of incentives on scores by the effect of incentives on effort.

Your analysis essentially proposes that, if there were some method of increasing effort by 3-4x as much as he managed to increase it, then maybe you could in fact increase IQ scores by 10 points.  This assumes that the effort-to-performance causation would stay constant as you step outside the tested range.  That's possible, but... I'm quite confident there's a limit to how much "effort" can increase your results on a timed multiple-choice test, that you'll hit diminishing marginal returns at some point (probably even negative marginal returns, if the incentive is strong enough to make many test-takers nervous), and extrapolating 3-4x outside the achieved effect seems dubious.  (I also note that the 1x effect here means increasing your self-evaluated effort from 4.13 to 4.28 on a scale that goes up to 5, so a 4x effect would mean going to 4.73, approaching the limits of the scale itself.)
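To make the disputed arithmetic concrete, here is a minimal sketch of the Wald-style instrumental-variables estimate under discussion. The two effect sizes are placeholders, not the study's data; only the 4.13 and 4.28 mean effort ratings are taken from the figures above.

```python
# Wald / instrumental-variables calculation of the kind described:
# the incentive is the instrument, self-rated effort the instrumented variable.
# The two effect sizes below are illustrative placeholders, not the study's data.

effect_of_incentive_on_score = 0.15   # in SD units of test score
effect_of_incentive_on_effort = 0.30  # in SD units of self-rated effort

# Implied causal effect of effort on score, per SD of measured effort.
iv_estimate = effect_of_incentive_on_score / effect_of_incentive_on_effort
print(f"IV estimate: {iv_estimate:.2f} SD of score per SD of effort")

# The extrapolation at issue: the incentive only moved mean self-rated effort
# from about 4.13 to 4.28 on a 1-5 scale, so projecting a 4x larger effort
# shift runs up against the ceiling of the scale itself.
baseline_effort, incentivized_effort = 4.13, 4.28
projected = baseline_effort + 4 * (incentivized_effort - baseline_effort)
print(f"4x the observed effort shift would imply a mean of {projected:.2f} out of 5")
```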

You say, doing your analysis:

For study 2, I get an effect of 0.54. For study 3, I get an effect of 0.37. For study 4, I get an effect of 0.39. The numbers are noisy for various reasons, but this all seems to be of a similar order of magnitude to the correlation in the general population, so this suggests the correlation between IQ and test effort is due to a causal effect of test effort increasing IQ scores.

That is interesting... Though the correlation between test effort and test performance in the studies is given as 0.27 and 0.29 in different samples, so, noise notwithstanding, your effects are consistently larger by a decent margin.  That would suggest that there's something else going on than the simple causation.

The authors say:

6.1. Correlation and direction of causality

Across all three samples and cognitive ability tests (sentence verification, vocabulary, visual-spatial reasoning), the magnitude of the association between effort and test performance was approximately 0.30, suggesting that higher levels of motivation are associated [with] better levels of test performance. Our results are in close accord with existing literature [...]

As is well-known, the observation of a correlation is a necessary but not sufficient condition for causality. The failure to observe concomitant increases in test effort and test performance, when test effort is manipulated, suggests the absence of a causal effect between test motivation and test performance.

That last sentence is odd, since there was in fact an increase in both test effort and test performance.  Perhaps they're equivocating between "low effect" and "no effect"?  (Which is partly defensible in that the effect was not statistically significant in most of the studies they ran.  I'd still count it as a mark against them.)  The authors continue:

Consequently, the positive linear association between effort and performance may be considered either spurious or the direction of causation reversed – flowing from ability to motivation. Several investigations have shown that the correlation between test-taking anxiety and test performance likely flows from ability to test-anxiety, not the other way around (Sommer & Arendasy, 2015; Sommer, Arendasy, Punter, Feldhammer-Kahr, & Rieder, 2019). Thus, if the direction of causation flows from ability to test motivation, it would help explain why effort is so difficult to shift via incentive manipulation.

6.2. Limitations & future research

We acknowledge that the evidence for the causal direction between effort and ability remains equivocal, as our evidence rests upon the absence of evidence (absence of experimental incentive effect). Ideally, positive evidence would be provided. Indirect positive evidence may be obtained by conducting an experiment, whereby half the subjects are given a relatively easy version of the paper folding task (10 easiest items) and the other half are given a relatively more difficult version (10 most difficult items). It is hypothesized that those given the relatively easier version of the paper folding task would then, on average, self-report greater levels of test-taking effort. Partial support for such a hypothesis is apparent in Table 1 of this investigation. Specifically, it can be seen that there is a perfect correspondence between the difficulty of the test (synonyms mean 73.4% correct; sentence verification mean 53.8% correct; paper folding mean 43.3%) and the mean level of reported effort (synonyms mean effort 4.42; sentence verification mean 4.11; paper folding mean 3.83).

That is a pretty interesting piece of evidence for the "ability leads to self-reported effort" theory.

Overall... The study seems to be a good one: doing a large replication study on prior claims.  The presentation of it... The author on Twitter said "testing over N= 4,000 people", which is maybe what you get if you add up the N from all the different studies, but each study is considerably smaller; I found that somewhat misleading, but suspect that's a common thing when authors report multiple studies at once.  On Twitter he says "We conclude that effort has unequivocally small effects", which omits caveats like "our results are accurate to the degree that alternative incentives do not yield appreciably larger effects" which are in the paper; this also seems like par for the course for science journalism (not to mention Twitter discourse).  And they seem to have equivocated in places between "low effect" and "no effect".  (Which I suspect is also not rare, unfortunately.)

Now.  You presented this as:

Here's a classical example; an IQ researcher who is so focused on providing a counternarrative to motivational theories that he uses methods which are heavily downwards biased to "prove" that IQ test scores don't depend on effort.

The "focused on providing a counternarrative" part is plausibly correct.  However, the "uses methods which are heavily downwards biased to "prove" [...]" is not.  The "downwards biased methods" are "offering a monetary incentive of £2-£10, which turned out to be insufficient to change effort much".  The authors were doing a replication of Duckworth, in which most of the cited studies had a monetary incentive of <$10—so that part is correctly matched—and they used high enough N that Duckworth's claimed effect size should have shown up easily.  They also preregistered the first of their incentive-based studies (with the £2 incentive), and the later ones were the same but with increased sample size, then increased incentive.  In other words, they did exactly what they should have done in a replication.  To claim that they chose downwards-biased methods for the purpose of proving their point seems quite unfair; those methods were chosen by Duckworth.

This seems to be a data point of the form "your priors led you to assume bad faith (without having looked deeply enough to discover this was unjustified), which then led you to take this as a case to justify those priors for future cases".  (We will see more of these later.)  Clearly this could be a self-reinforcing loop that, over time, could lead one's priors very far astray.  I would hope anyone who posts here would recognize the danger of such a trap.

Second example.  "Simon Baron-Cohen playing Motte-Bailey with the "extreme male brain" theory of autism."  Let's see... It seems uncontroversial (among the participants in this discussion) that there are dimensions on which male and female brains differ (on average), and on which autists are (on average) skewed towards the male side, and that this includes the empathizing and systematizing dimensions.

You quote Baron-Cohen as saying "According to the ‘extreme male brain’ theory of autism, people with autism or AS should always fall in the [extreme systematizing range]", and say that this is obviously false, since there exist autists who are not extreme systematizers—citing a later study coauthored by Baron-Cohen himself, which puts only ~10% of autists into the "Extreme Type S" category.  You say he's engaging in a motte-and-bailey.

After some reading, this looks to me like a case of "All models are wrong, but some are useful."  The same study says "Finally, we demonstrate that D-scores (difference between EQ and SQ) account for 19 times more of the variance in autistic traits (43%) than do other demographic variables including sex.  Our results provide robust evidence in support of both the E-S and EMB theories."  So, clearly he's aware that 57% of the variance is not explained by empathizing-systematizing.  I think it would be reasonable to cast him as saying "We know this theory is not exactly correct, but it makes some correct predictions."  Indeed, he counts the predictions made by these theories:

An extension of the E-S theory is the Extreme Male Brain (EMB) theory (11). This proposes that, with regard to empathy and systemizing, autistic individuals are on average shifted toward a more “masculine” brain type (difficulties in empathy and at least average aptitude in systemizing) (11). This may explain why between two to three times more males than females are diagnosed as autistic (12, 13). The EMB makes four further predictions: (vii) that more autistic than typical people will have an Extreme Type S brain; (viii) that autistic traits are better predicted by D-score than by sex; (ix) that males on average will have a higher number of autistic traits than will females; and (x) that those working in science, technology, engineering, and math (STEM) will have a higher number of autistic traits than those working in non-STEM occupations.

Note also that he states the definition of EMB theory as saying "autistic individuals are on average shifted toward a more “masculine” brain type".  You say "Sometimes EMB proponents say that this isn’t really what the EMB theory says. Instead, they make up some weaker predictions, that the theory merely asserts differences “on average”."  This is Baron-Cohen himself defining it that way.

Would it be better if he used a word other than "theory"?  "Model"?  You somewhat facetiously propose "If the EMB theory had instead been named the “sometimes autistic people are kinda nerdy” theory, then it would be a lot more justified by the evidence".  How about, say, the theory that "There are processes that masculinize the brain in males; and some of those processes going into overdrive is a thing that causes autism"?  (Which was part of the original paper: "What causes this shift remains unclear, but candidate factors include both genetic differences and prenatal testosterone.")  That is, in fact, approximately what I found when I googled for people talking about the EMB theory—and note that the article is critical of the theory:

This hypothesis, called the ‘extreme male brain’ theory, postulates that males are at higher risk for autism as a result of in-utero exposure to steroid hormones called androgens. This exposure, the theory goes, accentuates the male-like tendency to recognize patterns in the world (systemizing behavior) and diminishes the female-like capacity to perceive social cues (socializing behavior). Put simply, boys are already part way along the spectrum, and if they are exposed to excessive androgens in the womb, these hormones can push them into the diagnostic range.

That is the sense in which an autistic brain is, hypothetically, an "extreme male brain".  I guess "extremely masculinized brain" would be a bit more descriptive to someone who doesn't know the context.

The problem with a motte-and-bailey is that someone gets to go around advancing an extreme position, and then, when challenged by someone who would disprove it, he avoids the consequences by claiming he never said that, he only meant the mundane position.  According to you, the bailey is "they want to talk big about how empathizing-systematizing is the explanation for autism".  According to the paper, it was 43% of the explanation for autism, and the biggest individual factor?  Seems pretty good.

Has Baron-Cohen gone around convincing people that empathizing-systematizing is the only factor involved in autism?  I suspect that he doesn't believe it, he didn't mean to claim it, almost no one (except you) understood him as claiming it, and pretty much no one believes it.  Maybe he picked a suboptimal name, which lent itself to misinterpretation.  Do you have examples of Baron-Cohen making claims of that kind, which aren't explainable as him taking the "This theory is not exactly correct, but it makes useful predictions" approach?

The context here is explaining why you've "become horrified at what [you] once trusted", which you now call "supposed science".  I'm... underwhelmed by what I've seen.

Back to Damore...

I think Damore's point, in bringing it up, was that the stress in (some portion of) tech jobs may be a reason there are fewer women than men in tech.

You may or may not be right that this is what he meant.

...I thought it was overkill to cite four quotes on that issue, but apparently not.  Such priors!

(I think it's a completely wrong position, because the sex difference in neuroticism is much smaller (by something like 2x) than the sex difference in tech interests and tech abilities, and presumably the selection effect for neuroticism on career field is also much smaller than that of interests. So I'm not sure your reading on it is particularly more charitable, only uncharitable in a different direction; assuming a mistake rather than a conflict.)

It seems you're saying Damore mentions A but not B, and B is bigger, therefore Damore's "comprehensive" writeup is not so, and this omission is possibly ill-motivated.  But, erm, Damore does mention B, twice:

  • [Women, on average have more] Openness directed towards feelings and aesthetics rather than ideas. Women generally also have a stronger interest in people rather than things, relative to men (also interpreted as empathizing vs. systemizing).
    ○ These two differences in part explain why women relatively prefer jobs in social or artistic areas. More men may like coding because it requires systemizing and even within SWEs, comparatively more women work on front end, which deals with both people and aesthetics.

[...]

  • Women on average show a higher interest in people and men in things
    ○ We can make software engineering more people-oriented with pair programming and more collaboration. Unfortunately, there may be limits to how people-oriented certain roles at Google can be and we shouldn't deceive ourselves or students into thinking otherwise (some of our programs to get female students into coding might be doing this).

This suggests that casting aspersions on Damore's motives is not gated by "Maybe I should double-check what he said to see if this is unfair".

I think the anxiety/stress thing is more relevant for top executive roles than for engineer roles; a population-level difference is more important at the extremes.  Damore does talk about leadership specifically:

We always ask why we don't see women in top leadership positions, but we never ask why we see so many men in these jobs. These positions often require long, stressful hours that may not be worth it if you want a balanced and fulfilling life.

Next:

(Incidentally, imagine if Damore had claimed the opposite—"Women are less prone to anxiety and can handle stress more easily."  Wouldn't that also lead to accusations that Damore was saying we can ignore women's problems?)

The correct thing to claim is "We should investigate what people are anxious/stressed about". Jumping to conclusions that people's states are simply a reflection of their innate traits is the problem.

Well, he lists one source of stress above, and he does recommend to "Make tech and leadership less stressful".

I don't think this is at the heart of Zack's adventure? Zack's issues were mainly about leading rationalists jumping in to rationalize things in the name of avoiding conflicts.

And why would these rationalists care so much about avoiding these conflicts, to the point of compromising the intellectual integrity that seems so dear to them?  Fear that they'd face the kind of hostility and career-ruining accusations directed at Damore, and things downstream of fears like that, seems like a top candidate explanation.

Anyway, making weighty claims about people is core to what differential psychology is about.

Um.  Accusations are things you make about individuals, occasionally organizations.  I hope that the majority of differential psychology papers don't consist of "Bob Jones has done XYZ bad thing".

It's possible that some of my claims about Damore are false, in which case we should discuss that and fix the mistakes. However, the position that one should just keep quiet about claims about people simply because they are weighty would also seem to imply that we should keep quiet about claims about trans people and masculinity/femininity, or race and IQ, or, to make the Damore letter more relevant, men/women and various traits related to performance in tech.

You are equivocating between reckless claims of misconduct / malice by an individual, and heavily cited claims about population-level averages that are meant to inform company policy.  Are you seriously stating an ethical principle that anyone who makes the latter should expect to face the former and it's justified?

Somewhat possible this is true. I think nerdy communities like LessWrong should do a better job at communicating the problems with various differential psychology findings and communicating how they are often made by conservatives to promote an agenda. If they did this, perhaps Damore would not have been in this situation.

I think Damore was aware that there are people who use population-level differences to justify discriminating against individuals, and that's why he took pains to disavow that.  As for "the problems with various differential psychology findings"—do you think that some substantial fraction, say at least 20%, of the findings he cited were false?

Replies from: tailcalled, tailcalled, tailcalled
comment by tailcalled · 2023-07-17T17:44:23.763Z · LW(p) · GW(p)

Second example.  "Simon Baron-Cohen playing Motte-Bailey with the "extreme male brain" theory of autism."  Let's see... It seems uncontroversial (among the participants in this discussion) that there are dimensions on which male and female brains differ (on average), and on which autists are (on average) skewed towards the male side, and that this includes the empathizing and systematizing dimensions.

Quick update!

I found that OpenPsychometrics has a dataset for the EQ/SQ tests. Unfortunately, there seems to be a problem with the data for the EQ items, but I just ran a factor analysis for the SQ items to take a closer look at your claims here.

There appeared to be 3 or 4 factors underlying the correlations on the SQ test, which I'd roughly call "Technical interests", "Nature interests", "Social difficulties" and "Jockyness". I grabbed the top loading items for each of the factors, and got this correlation matrix:

The correlation between the technical interests and the nature interests plausibly reflects the notion that Systematizing is a thing, though I suspect that it could also be found to correlate with all sorts of other things that would not be considered Systematizing? Like non-Systematizing ways of interacting with nature. Idk though.

The sex differences in the items were limited to the technical interests, rather than also covering the nature interests. This does not fit a simple model of a sex difference in general Systematizing, but it does fit a model where the items are biased towards men but there is not much sex difference in general Systematizing.

I would be inclined to think that the Social difficulties items correlate negatively with Empathizing Quotient or positively with Autism Spectrum Quotient. If we are interested in the correlations between general Systematizing and these other factors, then this could bias the comparisons. On the other hand, the Social difficulties items were not very strongly correlated with the overall SQ score, so maybe not.

I can't immediately think of any comments for the Jockyness items.

Overall, I strongly respect the fact that he made many of the items very concrete, but I now also feel like I have proven that the gender differences on Systematizing are driven by psychometric shenanigans, and I strongly expect to find that many of the other associations are also driven by psychometric shenanigans.
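For what it's worth, the kind of analysis described above can be sketched roughly like this; the file name, column names, sex coding, oblimin rotation, and the choice of four factors are illustrative assumptions, not the actual OpenPsychometrics export or the analysis code used here.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer  # pip install factor_analyzer

# Hypothetical item-level export: SQ item columns S1..S60 plus a "sex" column.
df = pd.read_csv("sq_items.csv")
items = [c for c in df.columns if c.startswith("S")]

# Exploratory factor analysis with an oblique rotation, extracting 4 factors
# (one possible labeling of such a solution being "technical interests",
# "nature interests", "social difficulties", and "jockyness").
fa = FactorAnalyzer(n_factors=4, rotation="oblimin")
fa.fit(df[items])
loadings = pd.DataFrame(fa.loadings_, index=items)
print(loadings.round(2))

# Where does the sex difference live? Rough per-item standardized mean
# difference, using the overall SD as a stand-in for the pooled SD.
means_by_sex = df.groupby("sex")[items].mean()
d_per_item = (means_by_sex.loc["male"] - means_by_sex.loc["female"]) / df[items].std()
print(d_per_item.sort_values(ascending=False).round(2))
```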

I've sent an email asking OpenPsychometrics to export the Empathizing Quotient items too. If he does so, I hope to write a top-level post explaining my issues with the psychometrics here.

Replies from: tailcalled, tailcalled, tailcalled
comment by tailcalled · 2023-07-17T19:27:38.545Z · LW(p) · GW(p)

Hm, actually I semi-retract this; the OpenPsychometrics data seems to be based on the original Systematizing Quotient, whereas there seems to be a newer one called Systematizing Quotient-Revised, which is supposedly more gender-neutral. Not sure where I can get data on this, though. Will go looking.

Edit: Like I am still pretty suspicious about the SQ-R. I just don't have explicit proof that it is flawed.

Replies from: tailcalled
comment by tailcalled · 2023-07-17T21:47:30.490Z · LW(p) · GW(p)

Am I gonna have to collect the data myself? I might have to collect the data myself...

comment by tailcalled · 2023-07-17T18:38:11.279Z · LW(p) · GW(p)

Oops, upon reading more about the SQ, I should correct myself:

Some of the items, such as S16, are "filler items" which are not counted as part of the score; these are disproportionately part of the "Social difficulties" and "Jockyness" factors, so that probably reduces the amount of bias that can be introduced by those items, and it also explains why they don't correlate very much with the overall SQ scores.

But some of the items for these factors, such as S31, are not filler items, and instead get counted for the test, presumably because they have cross-loadings on the Systematizing factor. So the induced bias is probably not zero.

If I get the data from OpenPsychometrics, I will investigate in more detail.

comment by tailcalled · 2023-07-17T19:03:59.470Z · LW(p) · GW(p)

Since I don't have data on the EQ, here's a study where someone else worked with it. They found that the EQ had three factors, which they named "Cognitive Empathy", "Emotional Empathy" and "Social Skills". The male-female difference was driven by "Emotional Empathy" (d=1), whereas the autistic-allistic difference was driven by "Social Skills" (d=1.3). The converse differences were much smaller, 0.24 and 0.66. As such, it seems likely that the EQ lumps together two different kinds of "empathizing", one of which is feminine and one of which is allistic.

comment by tailcalled · 2023-07-17T11:32:21.043Z · LW(p) · GW(p)

As for point 2—if you were a researcher with heretical opinions, determined to publish research on at least some of them, what would you do? It seems like a reasonable strategy is to pick something heretical that you're confident you can defend, and do a rock-solid study on it, and brace for impact. Is it still the case that disproving the blank-slate hypothesis would constitute progress in some academic subfields? If so, then expect people to continue trying it.

I should also say, in the context of IQ and effort, some of the true dispute is about whether effort differences can explain race differences in scores. And for that purpose, what I would do is to go more directly into that.

In fact, I have done so. Quoting some discussion I had on Discord:

Me: Oh look at this thing I just saw

(correlation matrix with 0 correlation between race and test effort highlighted)

Other person: That is a really good find. Where's it from?

Me: from the supplementary info to one of the infamous test motivation studies:

https://www.pnas.org/doi/epdf/10.1073/pnas.1018601108

https://www.pnas.org/action/downloadSupplement?doi=10.1073%2Fpnas.1018601108&file=pnas.201018601SI.pdf

Other person: hohohoho

Me: Despite implying that test motivation explains racial gaps in the study text:

On the other hand, test motivation may be a serious confound in studies including participants who are below-average in IQ and who lack external incentives to perform at their maximal potential. Consider, for example, the National Longitudinal Survey of Youth (NLSY), a nationally representative sample of more than 12,000 adolescents who completed an intelligence test called the Armed Forces Qualifying Test (AFQT). As is typical in social science research, NLSY participants were not rewarded in any way for higher scores. The NLSY data were analyzed in The Bell Curve, in which Herrnstein and Murray (44) summarily dismissed test motivation as a potential confound in their analysis of black–white IQ disparities.

(This was way after I became critical of differential psychology btw. Around 2 months ago.)

comment by tailcalled · 2023-07-17T09:10:39.380Z · LW(p) · GW(p)

These things are bad, but, apart from point 2, I would ask: how do they compare to the average quality of social science research? Do you have high standards, or do you just have high standards for one group? I think most of us spend at least some time in environments where the incentive gradients point towards the latter. Beware isolated demands for rigor.

I don't know for sure as I am only familiar with certain subsets of social science, but a lot of it is in fact bad. I also often criticize normal social science, but in this context it was this specific area of social science that came up.

As for point 2—if you were a researcher with heretical opinions, determined to publish research on at least some of them, what would you do? It seems like a reasonable strategy is to pick something heretical that you're confident you can defend, and do a rock-solid study on it, and brace for impact. Is it still the case that disproving the blank-slate hypothesis would constitute progress in some academic subfields? If so, then expect people to continue trying it.

I would try to perform studies that yield much more detailed information [LW · GW]. For instance, mixed qualitative and quantitative studies where one qualitatively inspects the data points that fall well above or below the regression predictions, to see whether there are identifiable missing factors.
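A rough sketch of what that could look like in practice; the file, column names, and single-predictor regression are hypothetical.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical dataset: an outcome, a predictor, and free-text notes per participant.
df = pd.read_csv("study_data.csv")

# Fit the usual regression...
model = sm.OLS(df["outcome"], sm.add_constant(df["predictor"])).fit()
df["residual"] = model.resid

# ...then pull out the cases sitting furthest above or below the regression line
# and read their qualitative material, looking for identifiable missing factors.
extreme_idx = df["residual"].abs().sort_values(ascending=False).index[:20]
print(df.loc[extreme_idx, ["predictor", "outcome", "residual", "notes"]])
```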

So, at least as a "We suspected these results were bogus, tried to reproduce them, and got a much smaller effect size", this seems all in order.

If he had phrased his results purely as disproving the importance of incentives, rather than effort, I think it would have been fine.

Your analysis essentially proposes that, if there were some method of increasing effort by 3-4x as much as he managed to increase it, then maybe you could in fact increase IQ scores by 10 points. This assumes that the effort-to-performance causation would stay constant as you step outside the tested range. That's possible, but... I'm quite confident there's a limit to how much "effort" can increase your results on a timed multiple-choice test, that you'll hit diminishing marginal returns at some point (probably even negative marginal returns, if the incentive is strong enough to make many test-takers nervous), and extrapolating 3-4x outside the achieved effect seems dubious. (I also note that the 1x effect here means increasing your self-evaluated effort from 4.13 to 4.28 on a scale that goes up to 5, so a 4x effect would mean going to 4.73, approaching the limits of the scale itself.)

I prefer to think of it as "if you increase your effort from being one of the lowest-effort people to being one of the highest-effort people, you can increase your IQ score by 17 IQ points". This doesn't seem too implausible to me, though admittedly I'm not 100% sure what the lowest-effort people are doing.

It's valid to say that extrapolating outside of the tested range is dubious, but IMO this means that the study design is bad.

I think it's likely that the limited returns to effort would be reflected in the limited bounds of the scale. So I don't think my position is in tension with the intuition that there's limits on what effort can do for you. Under this model, it is also worth noting that the effort scores were negatively skewed, so this implies that lack of effort is a bigger cause of low scores than extraordinary effort is of high scores.

That is interesting... Though the correlation between test effort and test performance in the studies is given as 0.27 and 0.29 in different samples, so, noise notwithstanding, your effects are consistently larger by a decent margin. That would suggest that there's something else going on than the simple causation.

I don't think my results are statistically significantly different from 0.3ish; in the ensuing discussion, people pointed out that the IV results had huge error bounds (because the original study was only barely significant).

But also, if there is measurement error in the effort measure (the variable being instrumented), then that would bias the standardized IV estimate upwards relative to the observed correlation. So that might also contribute.
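A toy version of that argument, under classical measurement-error assumptions introduced here purely for illustration: let F be true effort (variance 1), X = F + e the measured effort with reliability ρ, Y = βF + u the standardized score, and Z the incentive serving as the instrument. Then:

```latex
\operatorname{corr}(X, Y) = \beta\sqrt{\rho},
\qquad
\hat\beta_{\mathrm{IV}}
  = \frac{\operatorname{Cov}(Z, Y)/\operatorname{SD}(Y)}{\operatorname{Cov}(Z, X)/\operatorname{SD}(X)}
  = \frac{\beta}{\sqrt{\rho}},
\qquad
\frac{\hat\beta_{\mathrm{IV}}}{\operatorname{corr}(X, Y)} = \frac{1}{\rho}
```

So under these assumptions, noisier effort measurement simultaneously attenuates the raw correlation and inflates the standardized IV estimate, which would widen exactly the gap being discussed.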

However, the "uses methods which are heavily downwards biased to "prove" [...]" is not. The "downwards biased methods" are "offering a monetary incentive of £2-£10, which turned out to be insufficient to change effort much". The authors were doing a replication of Duckworth, in which most of the cited studies had a monetary incentive of <$10—so that part is correctly matched—and they used high enough N that Duckworth's claimed effect size should have shown up easily. They also preregistered the first of their incentive-based studies (with the £2 incentive), and the later ones were the same but with increased sample size, then increased incentive. In other words, they did exactly what they should have done in a replication. To claim that they chose downwards-biased methods for the purpose of proving their point seems quite unfair; those methods were chosen by Duckworth.

Shitty replications of shitty environmentalist research are still shitty.

Like this sort of thing makes sense to do as a personal dispute between the researchers, but for all of us who'd hope to actually use or build on the research for substantial purposes, it's no good if the researchers use shitty methods because they are trying to build a counternarrative against other researchers using shitty methods.

Let's see... It seems uncontroversial (among the participants in this discussion) that there are dimensions on which male and female brains differ (on average), and on which autists are (on average) skewed towards the male side, and that this includes the empathizing and systematizing dimensions.

I wouldn't confidently disagree with this, but I do have some philosophical nitpicks/uncertainties.

("Brain" connotes neurology to me, yet I am not sure if empathizing and especially systematizing are meaningful variables on a neurological level. I would also need to double-check whether EQ/SQ are MI for sex and autism because I don't remember whether they are. I suspect in particular the EQ is not, and it is the biggest drive of the EQ/SQ-autism connection, so it is pretty important to consider. But for the purposes of the Motte-Bailey situation, we can ignore that. Just tagging it as a potential area of disagreement.)

Would it be better if he used a word other than "theory"? "Model"? You somewhat facetiously propose "If the EMB theory had instead been named the “sometimes autistic people are kinda nerdy” theory, then it would be a lot more justified by the evidence". How about, say, the theory that "There are processes that masculinize the brain in males; and some of those processes going into overdrive is a thing that causes autism"? (Which was part of the original paper: "What causes this shift remains unclear, but candidate factors include both genetic differences and prenatal testosterone.")

I think what would be better would be if he clarified his models and reasoning. (Not positions, as that opens up the whole Motte-Bailey thing and also is kind of hard to engage with.) What is up with the original claim about autists always being extreme type S? Was this just a mistake that he would like to retract? If he only considers it to be a contributor that leads to half the variance, does he have any opinion on the nature of the other contributors to autism? Does he have any position on the relationship between autistic traits as measured by the AQ, and autism diagnosis? What should we make of the genetic contributors to autism being basically unrelated to the EQ/SQ? (And if the EQ/SQ are not MI for sex/autism, what does he make of that?)

Do you have examples of Baron-Cohen making claims of that kind, which aren't explainable as him taking the "This theory is not exactly correct, but it makes useful predictions" approach?

This is part of the trouble: these areas do not have proper discussions.

It seems you're saying Damore mentions A but not B, and B is bigger, therefore Damore's "comprehensive" writeup is not so, and this omission is possibly ill-motivated.

...

This suggests that casting aspersions on Damore's motives is not gated by "Maybe I should double-check what he said to see if this is unfair".

No, I meant that under your interpretation, Damore mentions A when A is of negligible effect, and so that indicates a mistake. I didn't mean to imply that he didn't mention B, and I read this part of his memo multiple times prior to sending my original comment, so I was fully aware that he mentioned B.

Well, he lists one source of stress above, and he does recommend to "Make tech and leadership less stressful".

But again the "Make tech and leadership less stressful" point boiled down to medicalizing it.

And why would these rationalists care so much about avoiding these conflicts, to the point of compromising the intellectual integrity that seems so dear to them? Fear that they'd face the kind of hostility and career-ruining accusations directed at Damore, and things downstream of fears like that, seems like a top candidate explanation.

Valid point.

Um. Accusations are things you make about individuals, occasionally organizations. I hope that the majority of differential psychology papers don't consist of "Bob Jones has done XYZ bad thing".

Differential psychology papers tend to propose ways to measure traits that they consider important, to extend previously created measures with new claims of importance, and to rank demographics by importance.

You are equivocating between reckless claims of misconduct / malice by an individual, and heavily cited claims about population-level averages that are meant to inform company policy. Are you seriously stating an ethical principle that anyone who makes the latter should expect to face the former and it's justified?

I think in an ideal world, the research and the discourse would be more rational. For people who are willing to discuss and think about these matters rationally, it seems inappropriate to accuse them of misconduct/malice simply for endorsing such claims. However, if people have spent a long time trying to bring up rational discussion and failed, then it is reasonable for these people to assume misconduct/malice.

I think Damore was aware that there are people who use population-level differences to justify discriminating against individuals, and that's why he took pains to disavow that.

Using population-level differences to justify discriminating against individuals can be fine and is not what I have been objecting to.

As for "the problems with various differential psychology findings"—do you think that some substantial fraction, say at least 20%, of the findings he cited were false?

I don't know. My problem with this sort of research typically isn't that it is wrong (though it sometimes may be) but instead that it is of limited informative value.

I should probably do a top-level review post where I dig through all his cites to look at which parts of his memo are unjustified and which parts are wrong. I'll tag you if I do that.

comment by Zack_M_Davis · 2023-07-16T18:20:05.720Z · LW(p) · GW(p)

I think "psychological sex differences" ideology is energetically unable to acknowledge these sorts of things because its main purpose/motivation is to function as a counterstory against feminist ideology

I mean, I agree that this is obviously a thing, but I continue to maintain hope in the possibility of actually reasoning about sex differences in the physical universe, rather than being resigned to living in the world of warring narratives. I think the named effect sizes help? (I try to be clear about saying "d ≈ 0.6", not "Men are from mars.")

it's rare for irrationality/dishonesty to be one-sided? In the example disputes I can think of off the top of my head, it's usually both or neither.

I absolutely agree that it's critical to recognize that it's often both. (I'm less sure about how often it's "neither". What stops people from converging?)

Replies from: tailcalled
comment by tailcalled · 2023-07-16T18:52:24.293Z · LW(p) · GW(p)

I mean, I agree that this is obviously a thing, but I continue to maintain hope in the possibility of actually reasoning about sex differences in the physical universe, rather than being resigned to living in the world of warring narratives. I think the named effect sizes help? (I try to be clear about saying "d ≈ 0.6", not "Men are from mars.")

I agree.

I think also a lot of it is just down to doing better research, including better psychometrics and more qualitative investigations.

But what I mean is that the research programs I have seen so far, such as the people-things interest research and similar, are not good enough, and are not trying to become good enough in a way that would let you just fund them and wait for results.

Not because it is fundamentally impossible, but because the challenges are not taken seriously enough.

I absolutely agree that it's critical to recognize that it's often both. (I'm less sure about how often it's "neither". What stops people from converging?)

I think that if, due to priors, the sides have reasons to distrust each other, but the distrust leads to ignoring each other instead of leading to conflict, then both sides can remain honest and rational (rather than degenerating into dishonesty due to conflict) while not realizing that the other side is honest and rational, and so they end up not converging?

comment by orellanin (f____) · 2023-07-16T17:29:09.599Z · LW(p) · GW(p)

Yes.


I don't get what point you're making here.

Replies from: tailcalled
comment by tailcalled · 2023-07-16T18:17:04.152Z · LW(p) · GW(p)

It's an example of a trans activist who, when asked whether people who want to coordinate sex-based discrimination should be purged from the discourse and authoritative sources, was like "yeah they just sound like mean busybodies to me". Admittedly I didn't really go into detail (partly because I don't really have any strong examples that I support and want to argue about), so we don't really know whether Pervocracy supports it in all relevant cases.

comment by Sinclair Chen (sinclair-chen) · 2023-07-17T11:44:17.518Z · LW(p) · GW(p)

Seems like a lot of the asserted "failure to cut reality at its joints" is about trans people before they have started transitioning or before they pass.

I nominate the term "aspiring woman" for people who are biologically male, perceived as men, but who desire to be perceived as women. (That's what I used to call myself, but people were confused why I didn't just call myself a woman, so instead I resorted to various long winded explanations, and then various inaccurate nonbinary labels that required a tumblr account or university education to understand, and people were still confused, and then I gave up and just identified as a woman.)

Much like "aspiring rationalist" the term is technically correct while still vibes-implying that you are sorta the thing you aspire to but sorta not, or not yet at least.

Replies from: BrienneYudkowsky
comment by LoganStrohl (BrienneYudkowsky) · 2023-07-20T23:14:24.262Z · LW(p) · GW(p)

>and people were still confused, and then I gave up and just identified as a woman

D: i know those feels. that's kinda where i am lately. (except man instead of woman)

comment by Raemon · 2023-07-15T18:47:28.529Z · LW(p) · GW(p)

Minor feedback – I got ~2/3 through, took a break, came back to the post and had trouble finding my place. If there were section headers that formed a table of contents it'd be somewhat easier to get back into it. (though I'm not sure how intentional it was to not have headers, or whether you consider that doing some kinda important stylistic work)

comment by Wei Dai (Wei_Dai) · 2023-07-17T11:03:40.071Z · LW(p) · GW(p)

Yudkowsky couldn’t be bothered to either live up to his own stated standards

"his own stated standards" could use a link/citation.

regardless of the initial intent, scrupulous rationalists were paying rent to something claiming moral authority, which had no concrete specific plan to do anything other than run out the clock, maintaining a facsimile of dialogue in ways well-calibrated to continue to generate revenue.

The original Kolmogorov complicity was an instance of lying to protect one's intellectual endeavors. But here you/Ben seem to be accusing Eliezer of doing something much worse, and which seems like a big leap from what came before it in the post. How did you/Ben rule out the Kolmogorov complicity hypothesis (i.e., that Eliezer still had genuine intellectual or altruistic aims that he wanted to protect)?

Of what you wrote specifically, "no concrete specific plan" is in my view actually a point in Eliezer's favor, as it's a natural consequence of high alignment difficulty and intellectual honesty. "Run out the clock" hardly seems fair, and by "maintaining a facsimile of dialogue" what are you referring to? Are you including things like the 2021 MIRI Conversations [? · GW] and if so are you suggesting that all the other (non-MIRI) participants are being fooled or in on the scam?

But since I did spend my entire adult life in Yudkowsky’s robot cult, trusting him the way a Catholic trusts the Pope

I would be interested to read an account of how this happened, and what might have prevented the error.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-07-28T02:45:00.836Z · LW(p) · GW(p)

which seems like a big leap from what came before it in the post

Sorry, the fifth- to second-to-last paragraphs of the originally published version of this post were egregiously terrible writing on my part. (I was summarizing some things Ben said at the time that felt like a relevant part of the story, but what I actually needed to do was explain in my own words the points that I want to endorsedly convey to my readers.)

I've rewritten that passage (now the third- and second-to-last paragraphs). I hope this version is clearer.

I'm not conjecturing anything worse than Kolmogorov complicity. (And the 2021 MIRI conversations were great.) I do think political censorship is significantly more damaging to epistemic conditions than many others seem to. People playing a Kolmogorov complicity strategy typically seem to think that it's cheap to just avoid a few sensitive topics. But the disturbing thing about the events described in this post was that the distortion didn't stay confined [LW · GW] to sensitive topics: the reversal (in emphasis and practice, if not outright logical contradiction) from "words can be wrong" [LW · GW] to "you're not standing in defense of truth [...]" is about the cognitive function of categorization, a "dry" philosophy topic which you wouldn't expect to be politically sensitive!

I would be interested to read an account of how this happened

He encourages it, doesn't he? (Much more to say in a future post, or via PM.)

comment by oregonsun · 2023-07-15T21:02:33.173Z · LW(p) · GW(p)

Scott took this to mean that what convention to use is a pragmatic choice we can make on utilitarian grounds, and that being nice to trans people was worth a little bit of clunkiness—that the mental health benefits to trans people were obviously enough to tip the first-order utilitarian calculus.

I didn't think anything about "mental health benefits to trans people" was obvious.

There's a scottpost that seems relevant called Be Nice, At Least Until You Can Coordinate Meanness. It's not a perfect fit because the thing we are coordinating here isn't meanness. But maybe there are attractor states of mental health, and a mental-health local maximum for a trans person is for their pronouns to be respected, but there's also a global maximum where we decide to define "man" and "woman" on biological terms like we define "male" and "female" and treat gender dysphoria in some way other than gender affirmation. Perhaps it could be true that if we could coordinate this new paradigm, the mental health of trans people would in the long term be better, because they could live with treated gender dysphoria without transitioning; but in the current paradigm, where [let's say] a woman is someone who identifies as a woman, it's incredibly distressing for someone on the Internet to try--and fail--to unilaterally move you to the new paradigm.

In the current paradigm, perhaps the best local maximum is to find a way to affirm AGP individuals whose gender identity corresponds to their biological sex, while also affirming the gender identity of individuals who have a gender identity not corresponding to their biological sex.

comment by cata · 2023-07-15T21:54:23.550Z · LW(p) · GW(p)

Sorry, but I just wasn't able to read the whole thing carefully, so I might be missing your relevant writing; I apologize if this comment retreads old ground.

It seems to me like the reasonable thing to do in this situation is:

  • Make whatever categories in your map you would be inclined to make in order to make good predictions. For example, personally I have a sort of "trans women" category based on the handful of trans women I have known reasonably well, which is closer to the "man" category than to the "woman" category, but has some somewhat distinct traits. Obviously you have a much more detailed map than me about this.
  • Use maximally clear, straightforward, and honest language representing maximally useful maps in situations where you are mostly trying to seek truth in the relevant territory. For example, "trying to figure out good bathroom policy" would be a really bad time to use obfuscatory language in order to spare people's feelings. (Likewise this comment.)
  • Be amenable to utilitarian arguments for non-straightforward or dishonest language in situations where you are doing something else. For example, if I have a trans coworker who I am trying to cooperate with on some totally unrelated work, and they have strong personal preferences about my use of language that aren't very costly for me, I am basically just happy to go along with those. Or if I have a trans friend who I like and we are just talking about whatever for warm fuzzies reasons, I am happy to go along with their preferences. (If it's a kind of collective political thing, then that brings other considerations to bear; I don't care to play political games.)

Introspectively, I don't think that my use of language in the third point is messing up my ability to think -- it's extremely not confusing to think to myself in my map-language, "OK, this person is a trans woman, which means I should predict they are mostly like a man except with trans-woman-cluster traits X, Y, and Z, and a personal preference to be treated 'like' a woman in many social circumstances, and it's polite to call them 'she' and 'her'." I also don't get confused if other people do the things that I learned are polite; I don't start thinking "oh, everyone is treating this trans woman like a woman, so now I should expect them to have woman-typical traits like P and Q."

The third point is the majority of interactions I have, because I mostly don't care or think very much about gender-related stuff. Is there a reason I should be more of a stickler for maximizing honesty and straightforwardness in these cases?

Replies from: vedrfolnir
comment by vedrfolnir · 2023-07-17T02:52:26.140Z · LW(p) · GW(p)

I don't think "non-straightforward or dishonest language" enters into it very much, but I don't have the clusters you have. I know cis women with "male-pattern" personalities and interests and trans women with "female-pattern" personalities and interests. (Not really any cis men with "female-pattern" personalities and interests, but society does its best to ensure that doesn't happen.) In some online spaces where I don't share demographic information, people sometimes take me for a member of the opposite sex. "Male-pattern" and "female-pattern" are culture- and class-bound anyway - there are many different types of guy. I don't get much use out of categorizing people by biological sex.

In repeated interpersonal interactions, of course, you just construct a model of the person, and then you don't need the categories so much. You still have to figure out who uses which bathroom, but the "you" here unpacks to "the state", which sees in its own way - a low-resolution way that can't be said to track truth.

Unless you're prepared to reject the entire analytic tradition, categories aren't even real - they're abstractions over entities. Maybe some are more useful than others, but if you recognize "trans woman" as a third gender (surely a more useful categorization than "trans women are men"[1]), how many genders are there? Are "nerd" and "jock" genders? "Butch" and "femme"?

[1] If this seems surprising to you, remember that LW and the social strata it recruits from contain highly atypical men! For example: what percentage of the male LW userbase knows the basic rules of a major spectator sport?

comment by Sinclair Chen (sinclair-chen) · 2023-07-17T11:07:10.776Z · LW(p) · GW(p)

I said, "I need the phrase 'actual women' in my expressive vocabulary to talk about the phenomenon where, if transition technology were to improve, then the people we call 'trans women' would want to make use of that technology; I need language that asymmetrically distinguishes between the original thing that already exists without having to try, and the artificial thing that's trying to imitate it to the limits of available technology".

Kelsey Piper replied, "the people getting surgery to have bodies that do 'women' more the way they want are mostly cis women [...] I don't think 'people who'd get surgery to have the ideal female body' cuts anything at the joints."

Another woman said, "'the original thing that already exists without having to try' sounds fake to me" (to the acclaim of four "+1" emoji reactions).

 

I would also give that second comment a +1 (though not Kelsey's). For cis women to achieve the social target of femininity requires active self-modification - shaving body hair periodically, for instance. Okay, to nitpick, one could get it all lasered/electrolyzed (in which case it's just one-time self-modification) or be fortunately born with buttery smooth legs. But the modal woman actively performs womanhood.  The level of effort and performance success is a matter of degree, and I think this follows a bimodal distribution, with a peak of high effort and lower results (trans women) and a taller peak of lower effort and higher results (cis women).

Perhaps we should have a different term for the target, for [that which women on average try to become or fantasize about becoming]. Trans people sorta call it "transition goals" but I think a more typical term would be the "ideal woman"? I think it's hard to pin down what the "ideal woman" is like, because everyone's personal aesthetic vision is different, and people probably bias towards everyone else having the same vision as they do. But if we look at virtual reality as a glimpse of what people choose to be given the choice, then the ideal woman is a pretty girl who is overall human with the exception of cat ears and a tail, and sometimes paws, but that is a little bit too experimental for me personally.

Replies from: FeepingCreature
comment by FeepingCreature · 2023-07-17T12:40:20.601Z · LW(p) · GW(p)

I mean, men also have to put in effort to perform masculinity, or be seen as inadequate men; I don't think this is a gendered thing. But even a man who isn't "performing masculinity adequately", an inadequate man, like an inadequate woman, still belongs to a distinct category, and though transwomen, like born women, aim to perform femininity, transwomen have a greater distance to cross, and in doing so they traverse between clusters along several dimensions. I think we can meaningfully separate "perform effort to transition in adequacy" from "perform effort to transition in cluster", even if the goal is the same.

(From what I gather of VRChat, the ideal man is also a pretty girl that is overall human with the exception of cat ears and a tail...)

comment by Raemon · 2023-07-15T20:30:05.604Z · LW(p) · GW(p)

Sarah asked if the math wasn't a bit overkill: were the calculations really necessary to make the basic point that good definitions should be about classifying the world, rather than about what's pleasant or politically expedient to say?

I thought the math was important as an appeal to principle—and as intimidation. (As it was written, the tenth virtue is precision! Even if you cannot do the math, knowing that the math exists tells you that the dance step is precise and has no room in it for your whims.)

FYI I found the math pretty valuable (even though I didn't sit through and check it). Specifically for the reason you list here that I bolded.

comment by Eli Tyre (elityre) · 2024-02-17T23:33:19.218Z · LW(p) · GW(p)

Some people I usually respect for their willingness to publicly die on a hill of facts, now seem to be talking as if pronouns are facts, or as if who uses what bathroom is necessarily a factual statement about chromosomes. Come on, you know the distinction better than that!

Even if somebody went around saying, "I demand you call me 'she' and furthermore I claim to have two X chromosomes!", which none of my trans colleagues have ever said to me by the way, it still isn't a question-of-empirical-fact whether she should be called "she". It's an act.

In saying this, I am not taking a stand for or against any Twitter policies. I am making a stand on a hill of meaning in defense of validity, about the distinction between what is and isn't a stand on a hill of facts in defense of truth.

I will never stand against those who stand against lies. But changing your name, asking people to address you by a different pronoun, and getting sex reassignment surgery, Is. Not. Lying. You are ontologically confused if you think those acts are false assertions.

My reading of this tweet thread, including some additional extrapolation that isn't in the text: 

There are obviously differences between cis men, cis women, trans men, and trans women. Anyone who tries to obliterate the distinction between cis women and trans women, in full generality, is a fool and/or pushing an ideological agenda.

So too are there important similarities between any pair of cis/trans men/women.

However, the social norm of identifying people by pronouns, in particular, is arbitrary. We could just as well have pronouns that categorize people by hair color or by height. 

If someone is fighting for the social ritual of referring to someone as "she" meaning "this person is a cis woman", instead of "this person presents as female", that's fine, but it's a policy proposal, not the defense of a fact. 

That we perform a social ritual that divides people up according to chromosomes, or according to neurotype, or according to social presentation, or according to preference, or whatever, is NOT implied by the true fact that there is a difference between cis women and trans women.

Similarly if someone wants to advocate for a particular policy about how we allocate bathrooms.

There are many more degrees of freedom in our choice of policies than in our true beliefs, and you're making an error if you're pushing for your preferred policy as if it is necessarily implied by the facts.


[edit 2024-02-19]: In light of Yudkowsky's clarification, I think my interpretation was not quite right. He's saying instead that it is a mistake to call someone a liar when they are explicitly making a bid or an argument to redefine how a word is used in some context.

Replies from: elityre, elityre
comment by Eli Tyre (elityre) · 2024-02-18T03:49:12.135Z · LW(p) · GW(p)

Thus, Yudkowsky's claim to merely have been standing up for the distinction between facts and policy questions doesn't seem credible. It is, of course, true that pronoun and bathroom conventions are policy decisions rather than matters of fact, but it's bizarre to condescendingly point this out as if it were the crux of contemporary trans-rights debates. Conservatives and gender-critical feminists know that trans-rights advocates aren't falsely claiming that trans women have XX chromosomes! If you just wanted to point out that the rules of sports leagues are a policy question rather than a fact (as if anyone had doubted this), why would you throw in the "Aristotelian binary" weak man and belittle the matter as "humorous"? There are a lot of issues I don't care much about, but I don't see anything funny about the fact that other people do care.

But, he's not claiming that this is the crux of contemporary trans-rights debates? He's pointing out the distinction between facts and policy mainly because he has a particular interest in epistemology, not because he has a particular interest in the trans-rights debates.

There's an active debate, which he's mostly not very interested in. But one sub-thread of that debate is some folks making what he considers to be an ontological error, which he points out, because he cares about that class of error, separately from the rest of the context.

comment by Eli Tyre (elityre) · 2024-02-18T02:34:31.257Z · LW(p) · GW(p)

One could argue that this "Words can be wrong when your definition draws a boundary around things that don't really belong together" moral didn't apply to Yudkowsky's new Tweets, which only mentioned pronouns and bathroom policies, not the extensions [LW · GW] of common nouns.

But this seems pretty unsatisfying in the context of Yudkowsky's claim to "not [be] taking a stand for or against any Twitter policies". One of the Tweets that had recently led to radical feminist Meghan Murphy getting kicked off the platform read simply, "Men aren't women tho." This doesn't seem like a policy claim; rather, Murphy was using common language to express the fact-claim that members of the natural category of adult human males, are not, in fact, members of the natural category of adult human females.

I don't get it. He's explicitly disclaiming commenting on that situation? But that means that we should take his thread here as implicitly commenting on that situation? 

I think I must be missing the point, because my summary here seems too uncharitable to be right.

comment by Max H (Maxc) · 2023-07-16T14:35:33.601Z · LW(p) · GW(p)

I did not make any negative updates about Scott, Kelsey, Anna or Eliezer, based on the accusations of gaslighting, motivated reasoning, insanity, corruption, etc. in this post. I don't know any of them personally, but I hold them all in high regard based on their public work, and nothing Zack has written has changed my view.

(This is just my own opinion and reaction after reading / skimming >40k words, stated without argument or explanation or intent to engage further, to get it out there as something people can react to or agree / disagree vote on. Agreement votes on LW are a far cry from a "court of rationality", but they might help others weigh whether responding further is worthwhile.)

Replies from: Zack_M_Davis, martin-randall
comment by Zack_M_Davis · 2023-07-16T23:32:03.535Z · LW(p) · GW(p)

Life is not graded on a curve and you can update yourself incrementally [LW · GW]? I also hold all of those people in high regard—relative to the rest of the world. (And Anna is a personal friend.) I think all of those people hold me in high regard, relative to the rest of the world. Nevertheless, it seems like there ought to be a time and a place to talk about people having been culpably wrong about some things, even while the same people have also done a lot of things right?

(I think I apply this symmetrically; if someone wants to write a 22,000 word blog post about the ways in which my intellectual conduct failed to live up to standards, that's fine with me.)

The thing me and my allies were hoping for a "court ruling" on was not about who should or shouldn't be held in high regard, but about the philosophical claim that one "ought to accept an unexpected [X] or two deep inside the conceptual boundaries of what would normally be considered [Y] if [positive consequence]". (I think this is false.) [LW · GW] That's what really matters, not who we should or shouldn't hold in high regard.

People's reputations only come into it because of considerations like, if you think Scott is right about the philosophical claim, that should be to his credit and to my detriment, and vice versa. The reason I'm telling this Whole Dumb Story about people instead of just making object-level arguments about ideas, is that I tried making object-level arguments about ideas first, for seven years, and it wasn't very effective. At some point, I think it makes sense to jump up a meta level and try to reason about people to figure out why the ideas aren't landing.

Replies from: Maxc
comment by Max H (Maxc) · 2023-07-17T01:49:51.440Z · LW(p) · GW(p)

I understand what you're trying to do. "Hold in high regard" is maybe the wrong choice of phrase since it connotes something more status-y than I intended; what I'm really saying is more general: this post failed to convince me of anything in particular, at any level of meta. I'm not inclined to wade into the actual arguments or explain why I feel that way, but I commented anyway since this post does impugn various people's internal motivations and reputations pretty directly, and because if my sentiment is widely shared, that might inform others' decision about whether to respond or engage themselves.

I realize this might be frustrating and come across as intellectually rude or lazy to you or anyone who has a different assessment. Blunt / harsh / vague feedback still seemed better than no feedback, and I wanted to gauge sentiment and offer others a low-effort way to give slightly more nuanced feedback than a vote on the top-level post.

(I think I apply this symmetrically; if someone wants to write a 22,000 word blog post about the ways in which my intellectual conduct failed to live up to standards, that's fine with me.)

Sure, I'm generally fine with you or others doing this (about anyone), but I can't imagine many people wanting to, and I'm hypothesizing above that the ROI would be low.

The reason I'm telling this Whole Dumb Story about people instead of just making object-level arguments about ideas, is that I tried making object-level arguments about ideas first, for seven years, and it wasn't very effective. At some point, I think it makes sense to jump up a meta level and try to reason about people to figure out why the ideas aren't landing.

I think that's a very good reason to go meta! And I sympathize deeply with the feeling of having your object-level ideas ignored or misunderstood no matter how many times you explain. But I also think reasoning about the internal motivations / sanity failures / rationality of others is very hard to do responsibly, and even harder to do correctly.

So, more blunt feedback: I think you have mostly succeeded in responsibly presenting your case. This post is a bit rambly, but otherwise clearly written and extremely detailed and... I wouldn't exactly call it "objective", but I mostly trust that it is an honest and accurate depiction of the key facts, and that you are not close to impinging on any rights or expectations of privacy. And I have no problem with going meta and reasoning about others' internal motivations and thoughts when done responsibly. It's just that the reasoning also has to be correct, and I claim (again, rudely without explanation) that isn't the case here.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-07-17T06:07:14.918Z · LW(p) · GW(p)

I'm not inclined to wade into the actual arguments or explain why I feel that way

Would you do it for $40? I can do PayPal or mail you a check.

Sorry, I know that's not very much money. I'm low-balling this one partially because I'm trying to be more conservative with money since I'm not dayjobbing right now, and partially because I don't really expect you to have any good arguments (and I feel OK about throwing away $40 to call your bluff, but I would feel bad spending more than that on lousy arguments).

Usually when people have good arguments and are motivated to comment at all, they use the comment to explain the arguments. So far, you've spent 475 words telling me nothing of substance except that you disagree, somehow, about something. This is not a promising sign.

Blunt / harsh / vague feedback

I don't know why you're grouping these together. Blunt is great. Harsh is great. Vague is useless.

The reason that blunt is great and harsh is great is because detailed, intellectually substantive feedback phrased in a blunt or harsh manner can be evaluated on its intellectual merits. The merits can't be faked—or at least, I think I'm pretty good at distinguishing between good and bad criticisms.

The reason vague is useless is because vague feedback is trivially faked. Anyone can just say "this post failed to convince me of anything in particular, at any level of meta" or "the reasoning also has to be correct, and I claim [...] that isn't the case here." Why should I trust you? How do I know you're not bluffing?

If I got something wrong, I want to know about it! (I probably got lots of things wrong; humans aren't good at writing 22,000 word documents without making any mistakes, and the Iceman/Said interaction [LW(p) · GW(p)] already makes me think the last five paragraphs need a rewrite.) But I'm not a mind-reader: in order for me to become less wrong, someone does, actually, have to tell me what I got wrong, specifically. Seems like an easy way to earn forty bucks if you ask me!

Replies from: Maxc
comment by Max H (Maxc) · 2023-07-17T16:02:22.286Z · LW(p) · GW(p)

Accepted, but I want to register that I am responding because you have successfully exerted social pressure, not because of any monetary incentive. I don't mind the offer / ask (or the emotional appeals / explanations), but in the future I would prefer that you (or anyone) make such offers via PM.

Semi-relatedly, on cheerful prices, you wrote:

But that seemed sufficiently psychologically coercive and socially weird that I wasn't sure I wanted to go there.

I don't think there's anything weird or coercive about offering to pay someone to respond or engage. But there are often preconditions that must be met between the buyer and seller for a particular cheerful price transaction to clear. For the seller to feel cheerful about a transaction at any price, the seller might need to be reasonably confident that (a) the buyer understands what they are buying, (b) the buyer will not later regret having transacted, and (c) the seller will not later regret having transacted.

This requires a fair amount of trust between the buyer and seller, requires the buyer to have enough self-knowledge and stability, and requires the seller to know enough about the buyer (or perform some due diligence) to be reasonably confident that all of these conditions obtain.

In the post, you recognize that someone might decline a cheerful price offer for reasons related to status, politics, or time / energy constraints. I think you've failed to consider a bunch of other reasons people might not want to engage, in public or in private.

Anyway, on to some specific objections:

Basically everywhere you speculate about someone's internal motivations or thoughts, or claim that someone is doing something for political reasons or status reasons or energy reasons, I think you're failing to consider alternate hypotheses adequately, starting from false or misleading premises, or drawing conclusions unjustifiably.  Also, I disagree with your framing about just about everything, everywhere. I'll go through a few examples to make my point:

But because I was so cowardly as to need social proof (because I believed that an ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky), it must have just looked to him like an illegible social plot originating from Michael.

Or... engaging with you and your allies is likely to be unpleasant. And empirically you don't seem capable of gracefully handling a refusal to engage, without making potentially incorrect and public updates about the refuser's internal state of mind based on that refusal. I guess that's your right, but it increases the transaction costs of engaging with you, and others might choose to glomarize as a blanket policy in response.

Okay, so Yudkowsky had prevaricated about his own philosophy of language

[shaky premise]

for transparently political reasons

[unjustified assertion about someone else's internal mental state that fails to consider alternate hypotheses]

Yudkowsky had set in motion a marketing machine (the "rationalist community")

Loaded framing.

that was continuing to raise funds and demand work from people for below-market rates

"demand"?

based on the claim that while nearly everyone else was criminally insane (causing huge amounts of damage due to disconnect from reality, in a way that would be criminal if done knowingly), he, almost uniquely, was not.

I don't think Eliezer has ever made this claim in the sense that you connote, and is certainly not "making demands" based on it.

"Work for me or the world ends badly," basically.

Again, I'm pretty sure Eliezer has never claimed anything like this, except in the literal sense that he has both claimed that AGI is likely to lead to human extinction, and presumably at some point asked people to work for or with him to prevent that. Eliezer is somewhat more certain of this claim than others, but it's not exactly an abnormal or extreme view within this community on its own, and asking people to work for you for pay is a pretty fundamental and generally prosocial economic activity.

As written, this connotes both "work for me else the world ends badly" and "the world continues to exist iff everyone works for me", both of which are untrue and I don't think Eliezer has ever claimed.

(Also, you offered to pay me to respond, and I didn't call it a "demand" or treat it that way, even though you exerted a bunch of social pressure. OTOH, Eliezer has never asked me or demanded anything of me personally, and I've never really felt pressured to do anything in particular based on his writing. For the record, I am pretty happy with the equilibrium / norm where people are allowed to ask for things, and sometimes even exert social pressure about them when it's important, and I think it's counterproductive and misleading to label such asks as "demands" even when the stakes are high. Perhaps this point is mostly a dispute about language and terminology, heh.)

If, after we had tried to talk to him privately, Yudkowsky couldn't be bothered to either live up to his own stated standards or withdraw his validation from the machine he built,

"couldn't be bothered" is again making what look to me like unjustified / unsupported claims about another person's mental state.

"and live up to his own stated standards or withdraw his validation..." is asking Eliezer (and your readers) to share your frame about all your object level claims.

The last several paragraphs of the post are where these mistakes seem most egregious, but I think they're also present in the rest of the piece:

But the post is wrong in obvious ways.

"Obvious", here and elsewhere seems trivially falsified by the amount of controversy and discussion that these posts have generated.

Given that, it's hard to read the Tweets Yudkowsky published as anything other than an attempt to intimidate and delegitimize people who want to use language to reason about sex rather than gender identity.

Really? I had no problem not reading it that way, and still don't.

It's just not plausible that Yudkowsky was simultaneously savvy enough to choose to make these particular points while also being naïve enough to not understand the political context

Again, this seems like ruling out a bunch of other hypotheses. Also, I actually do find the one considered-and-rejected hypothesis at least somewhat plausible!

After Yudkowsky had stepped away from full-time writing, Alexander had emerged as our subculture's preeminent writer. Most people in an intellectual scene "are writers" in some sense, but Alexander was the one "everyone" reads: you could often reference a Slate Star Codex post in conversation and expect people to be familiar with the idea, either from having read it, or by osmosis. The frequency with which "... Not Man for the Categories" was cited at me seemed to suggest it had become our subculture's party line on trans issues.

This entire paragraph describes a model of the rationalist community as having vastly more centralized leadership and more consensus than in my own model. "party line"? I'm not sure there is any issue or claim for which there is widespread consensus, even very basic ones. I'm not disputing your account of your personal experience, but I want to flag that many of the claims and assumptions in this post seem to rest on an assumption that that experience and model is shared, when it is in fact not shared by at least one other person.

Replies from: Zack_M_Davis, martin-randall, SaidAchmiz, Zack_M_Davis
comment by Zack_M_Davis · 2023-07-18T19:50:22.268Z · LW(p) · GW(p)

Accepted

Thanks!! Receipt details (to your selected charity) in PM yesterday.

Or ...

(Another part of the problem here might be that I think the privacy norms let me report on my psychological speculations about named individuals, but not all of the evidence that supports them, which might have come from private conversations.)

Okay, so Yudkowsky had prevaricated about his own philosophy of language

[shaky premise]

But it's not a premise being introduced suddenly out of nowhere; it's a conclusion argued for at length earlier in the piece. Prevaricate, meaning, "To shift or turn from direct speech or behaviour [...] to waffle or be (intentionally) ambiguous." "His own philosophy of language", meaning, that he wrote a 30,000 word Sequence elaborating on 37 ways in which words can be wrong [LW · GW], including #30, "Your definition draws a boundary around things that don't really belong together."

When an author who wrote 30,000 words in 2008 pounding home over and over and over again that "words can be wrong", then turns around in 2018 and says that "maybe as a matter of policy, you want to make a case for language being used a certain way [...] [b]ut you're not making a stand for Truth in doing so" and that "you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning", about an issue where the opposing side's view is precisely that the people bringing the word into question are using a definition that draws a boundary around things that don't really belong together (because men aren't women, and this continues to be the case even if you redefine some of the men as "trans women"), and the author doesn't seem eager to clarify when someone calls him on the apparent reversal, I think it makes sense to describe that as the author being ambiguous, having shifted or turned from direct speech—in a word, prevaricating (although not "lying" [LW · GW]).

I think I've made my case here. If you disagree with some part of the case, I'm eager to hear it! But I don't think it's fair for you to dismiss me for making a "shaky premise" when the premise is a conclusion that I argued for.

[unjustified assertion about someone else's internal mental state that fails to consider alternate hypotheses]

On reflection, the word "transparently" definitely isn't right here (thanks!), but I'm comfortable standing by "political reasons". I think later events in the Whole Dumb Story bear me out.

you offered to pay me to respond, and I didn't call it a "demand" or treat it that way, even though you exerted a bunch of social pressure

I think it would have been fine if you did call it a "demand"! Wiktionary's first definition is "To request forcefully." The grandparent is making a request, and the insults ("I don't really expect you to have any good arguments", &c.) make it "forceful". Seems fine. Why did you expect me to object?

The last several paragraphs of the post are where these mistakes seem most egregious

Yeah, I definitely want to rewrite those to be clearer now (as alluded to in the last paragraph of the grandparent). Sorry.

"Obvious", here and elsewhere seems trivially falsified by the amount of controversy and discussion that these posts have generated.

Sorry, let me clarify. "Obvious" is a 2-place word [LW · GW]; there has to be an implicit "to whom" even if not stated. I claim that the error in "... Not Man for the Categories" is obvious to someone who understood the lessons in the "Human's Guide to Words" [? · GW] Sequence, including the math. I agree that it's not obvious to people in general, or self-identified "rationalists" in general.

I'm not disputing your account of your personal experience

Shouldn't you, though? If my perception of "the community" was biased and crazy, it seems like you could totally embarrass me right now by pointing to evidence that my perceptions were biased and crazy.

For example, in part of the post (the paragraph starting with "Now, the memory of that social proof"), I link to comments from Yudkowsky, Alexander, Kelsey Piper, Ozy Brennan, and Rob Bensinger as evidence about the state of the "rationalist" zeitgeist. It seems like you could argue against this by saying something like, "Hey, what about authors X, Y, and Z, who are just as prominent in 'the community' as the people you named and are on the record saying that biological sex is immutable and supporting the integrity of female-only spaces?" Admittedly, this could get a little more complicated insofar as your claim is that I was overestimating the degree of consensus and centralization, because the less consensus and centralization there is, the less "What about this-and-such prominent 'figure'" is a relevant consideration. I still feel like it should be possible to do better than my word against yours.

many of the claims and assumptions in this post seem to rest on an assumption that that experience and model is shared

What gave you that impression?! Would it help if I added the words "to me" or "I think" in key sentences? (An earlier draft said "I think" slightly more often, but my editor cut five instances of it.)

In general, I think a lot of the value proposition of my political writing is that I'm reckless enough to write clearly about things that most people prefer not to be clear about—models that should be "obvious" but are not shared. I'm definitely not assuming my model is shared. If it were shared, I wouldn't need to write so many words explaining it!

comment by Martin Randall (martin-randall) · 2023-07-19T02:31:12.678Z · LW(p) · GW(p)

based on the claim that while nearly everyone else was criminally insane (causing huge amounts of damage due to disconnect from reality, in a way that would be criminal if done knowingly), he, almost uniquely, was not.

I don't think Eliezer has ever made this claim in the sense that you connote [...]

I read Yudkowsky as asserting:

  1. A very high estimate of his own intelligence, eg comparable to Feynman and Hofstadter.
  2. A very high estimate of the value of intelligence in general, eg sufficient to take over the world and tile the lightcone using only an internet connection.
  3. A very high estimate of the damage caused on Earth by insufficient intelligence, eg human extinction.

In the fictional world of Dath Ilan where Yudkowsky is the median inhabitant, Yudkowsky says they are on track to solve alignment (~95% confidence). Whereas in the actual world he says we are on track to go extinct (~100% confidence). Causing human extinction would be criminal if done knowingly, so this satisfies Zack's claim as written.

I'm leaving this light on links, because I'm not sure what of the above you might object to. I realize that you had many other objections to Zack's framing, but I thought this could be something to drill into.

Edit: I'm not offering any money to respond, and independently of that, it's 100% fine if you don't want to respond.

Replies from: Maxc
comment by Max H (Maxc) · 2023-07-19T15:26:33.737Z · LW(p) · GW(p)

I mostly take issue with the phrases "criminally insane", "causing enormous damage", and "he, almost uniquely, was not" connoting a more unusual and more actionable view than Eliezer (or almost anyone else) actually holds or would agree with.

Lots of people in the world are doing things that are straightforwardly non-optimal, often not even in their own narrow self-interest. This is mostly just the mistake side of conflict vs. mistake theory though, which seems relatively uncontroversial, at least on LW.

Eliezer has pointed out some of those mistakes in the context of AGI and other areas, but so have many others (Scott Alexander, Zack himself, etc.), in a variety of areas (housing policy, education policy, economics, etc.). Such explanations often come (implicitly or explicitly) with a call for others to change their behavior if they accept such arguments, but Eliezer doesn't seem particularly strident in making such calls, compared to e.g. ordinary politicians, public policy advocates, or other rationalists.

Note, I'm not claiming that Eliezer does not hold some object-level views considered weird or extreme by most, e.g. that sufficiently intelligent AGI could take over the world, or that establishing multinational agreements for control and use of GPUs would be good policy. 

But I agree with most or all those views because they seem correct on the merits, not because Eliezer (or anyone else) happened to say them. Eliezer may have been the one to point them out, for which he deserves credit, but he's always explained his reasoning (sometimes at extreme length) and never to my knowledge asked anyone to just trust him about something like that and start making drastic behavioral changes as a result.
 

In the fictional world of Dath Ilan where Yudkowsky is the median inhabitant, Yudkowsky says they are on track to solve alignment (~95% confidence). Whereas in the actual world he says we are on track to go extinct (~100% confidence). Causing human extinction would be criminal if done knowingly, so this satisfies Zack's claim as written.
 


Again, leaving aside whether Eliezer himself has actually claimed this or would agree with the sentiment, it seems straightforwardly true to me that a world where the median IQ was 140 would look a lot different (and be a lot better off) than the current Earth. Whether or not it would look exactly like Eliezer's speculative fictional world, it seems strange and uncharitable to me to characterize that view (that the world would be much better off with more intelligence) as extreme, or to interpret it as a demand or call to action for anything in particular.

Replies from: martin-randall
comment by Martin Randall (martin-randall) · 2023-07-21T13:34:44.516Z · LW(p) · GW(p)

I would summarize this as saying:

  1. Zack (et al.) are exaggerating how unusual/extreme/weird Yudkowsky's positions are.
  2. Zack (et al.) are exaggerating how much Yudkowsky's writings are an explicit call to action.
  3. To the extent that Yudkowsky has unusual positions and calls for actions, you think he's mostly correct on the merits.

Of these, I'd like to push on (1) a bit. However, I think this would probably work better as a new top-level post (working title "Yudkowsky on Yudkowsky"). To give a flavor, though, and because I'm quite likely to fail to write the top-level post, here's an example. Shah and Yudkowsky on alignment failures [LW · GW].

This may, perhaps, be confounded by the phenomenon where I am one of the last living descendants of the lineage that ever knew how to say anything concrete at all. Richard Feynman - or so I would now say in retrospect - is noticing concreteness dying out of the world, and being worried about that, at the point where he goes to a college and hears a professor talking about "essential objects" in class, and Feynman asks "Is a brick an essential object?" - meaning to work up to the notion of the inside of a brick, which can't be observed because breaking a brick in half just gives you two new exterior surfaces - and everybody in the classroom has a different notion of what it would mean for a brick to be an essential object.

I encourage you to follow the link to the rest of the conversation, which relates this to alignment work. So we have this phenomenon where one factor in humanity going extinct is that people don't listen enough to Yudkowsky and his almost unique ability to speak concretely. This also supports (2) above - this isn't an explicit call to action, he's just observing a phenomenon.

A (1) take here is that the quote is cherry-picked, a joke, or an outlier, and his overall work implies a more modest self-assessment. A (3) take is that he really is almost uniquely able to speak concretely. My take (4) is that his self-assessment is positively biased. I interpret Zack's "break-up" with Yudkowsky in the opening post as moving from a (3) model to a (4) model, and encouraging others to do the same.

comment by Said Achmiz (SaidAchmiz) · 2023-07-17T16:44:01.857Z · LW(p) · GW(p)

Accepted, but I want to register that I am responding because you have successfully exerted social pressure, not because of any monetary incentive. I don’t mind the offer / ask (or the emotional appeals / explanations), but in the future I would prefer that you (or anyone) make such offers via PM.

It would hardly have been effective for Zack to make the offer via PM! In essence, you’re asking for Zack (or anyone) to act ineffectively, in order that you may avoid the inconvenience of having to publicly defend your claims against public disapprobation!

Replies from: Maxc
comment by Max H (Maxc) · 2023-07-17T17:30:26.990Z · LW(p) · GW(p)

Financial incentives are ineffective if offered privately? That's perhaps true for me personally at the level Zack is offering, but seems obviously false in general.

Offering money in private is maybe less effective than exerting social pressure in public (via publicly offering financial incentives, or other means). I merely pointed out that the two are entangled here, and that the pressure aspect is the one that actually motivates me in this case. I request that future such incentives be applied in a more disentangled way, but I'm not asking Zack to refrain from applying social pressure OR from offering financial incentives, just asking that those methods be explicitly disentangled. Zack is of course not obliged to comply with this request, but if he does not do so, I will continue flagging my actual motivations explicitly.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-17T17:40:01.522Z · LW(p) · GW(p)

Financial incentives are ineffective if offered privately? That’s perhaps true for me personally at the level Zack is offering, but seems obviously false in general.

The financial incentive was clearly ineffective in this case, when offered publicly, so this is a red herring. (Really, who would’ve expected otherwise? $40, for the average Less Wrong reader? That’s a nominal amount, no more.)

No, what was effective was the social pressure—as you say!

I request that future such incentives be applied in a more disentangled way, but I’m not asking Zack to refrain from applying social pressure OR from offering financial incentives, just asking that those methods be explicitly disentangled. Zack is of course not obliged to comply with this request, but if he does not do so, I will continue flagging my actual motivations explicitly.

Disentangling these things as you describe would reduce the force of the social pressure, however.

Replies from: Maxc
comment by Max H (Maxc) · 2023-07-17T19:00:55.463Z · LW(p) · GW(p)

I probably would have also responded if Zack had sent his comment verbatim as a PM. Maybe not as quickly or in exactly the same way, e.g. I wouldn't have included the digression about incentives.

But anyway, I did in fact respond, so I don't think it's valid to conclude much about what would have been "clearly ineffective" in a counterfactual.

One other point that you seem to be missing is that it's possible to exert social pressure via private channels, with or without financial incentives (and I'm also fine with Zack or others trying this, in general). Private might even be more effective at eliciting a response, in some cases.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-07-18T22:20:02.261Z · LW(p) · GW(p)

In retrospect, I feel guilty about impulsively mixing the "cheerful price" mechanism and the "social pressure" mechanism. I suspect Said is right that the gimmick of the former added to the "punch" of the latter, but at the terrible cost of undermining the integrity of the former (it's supposed to be cheerful!). I apologize for that.

comment by Zack_M_Davis · 2023-07-28T02:51:17.618Z · LW(p) · GW(p)

The last several paragraphs of the post are where these mistakes seem most egregious

I have now rewritten that passage (now the third- and second-to-last paragraphs). Better?

comment by Martin Randall (martin-randall) · 2023-07-19T01:52:24.062Z · LW(p) · GW(p)

To highlight something Zack said (three times!):

An ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky.

I didn't make any negative updates, but I don't consider myself a mere worm. If any reader does consider themselves a mere worm, then maybe they should update appropriately. And possibly play "Gonna Get Over You" on loop.

comment by Ben Pace (Benito) · 2023-08-06T02:33:55.940Z · LW(p) · GW(p)

Also, Scott had asked me if it wouldn't be embarrassing if the community solved Friendly AI and went down in history as the people who created Utopia forever, and I had rejected it because of gender stuff.

This is a confusing line; it raises a flag for me. Insofar as there was a local political lie, it seems quite high-integrity to not lie and say you believe it, even in exchange for political power. This is the precise question that one should not say yes to — why care about the truth if you can instead have power?

This also makes me wonder somewhat if the description of events here is inaccurate and Scott would not characterize what he said that way.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-08-09T04:52:41.485Z · LW(p) · GW(p)

Scott's position was that there wasn't a political lie, because using a different category definition isn't lying. My position is that "using a different category definition isn't lying" is itself a political distortion (setting aside as uninteresting whether it's technically "lying") [LW · GW], a sufficiently egregious one as to invalidate the legitimacy of "the community". Scott was urging me to not be so quick to give up on the community just because I was (in his view) triggered by culture war material.

makes me wonder somewhat if the description of events here is inaccurate and Scott would not characterize what he said that way.

Specifically, in a 17 March 2019 1:37 a.m. email, I had written:

Anyway, as of today I'm thinking that my best option really is to just write off the community qua community as a loss. This probably doesn't mean very much in practice (I still love my friends; I'm still going to read your blog; I still have a lot of writing to do that I'm going to share on /r/TheMotte); it's more of a private mental adjustment I need to make for my own sanity.

In his reply of 17 March 2019 at 3:54 a.m., Scott quoted that passage and said:

I am tempted to make fun of you for this - wouldn't it be embarrassing if the community solved Friendly AI and went down in history as the people who created Utopia forever, and you had rejected it because of gender stuff? - but instead I'll be honest. I get this feeling too, all the time. At some point I want to write a blog post about it, but I don't know exactly how to put it into words that fully explain my model of it or capture what I want to say.

(There's more, but I'm not sure it's appropriate to dump the whole email in this comment.)

Replies from: Benito
comment by Ben Pace (Benito) · 2023-08-09T05:18:31.804Z · LW(p) · GW(p)

That is an endearing response, and does change my understanding of what he meant — rather, Scott is saying that just because you are losing one political issue you care about doesn't mean you should quit on a community with lots of other great things going for it. 

(I am personally confused about the exact lines for sharing 1-1 text exchanges publicly and I wouldn't personally have shared it without permission in your shoes. I don't mean by this to say I think you necessarily shouldn't have.)

comment by AnthonyC · 2023-07-16T22:59:12.882Z · LW(p) · GW(p)

I think one key point that's missing in this otherwise very thorough post is that there is a larger context and historical progression of word definitions and categories about sex, sexuality, and gender. Those categories worked in many ways but were imperfect in others. Creating new words makes sense in principle, like fish vs dag, but it's even harder to get people to adopt new words consistently than it is to change definitions. In any case, this makes any attempt to alter those definitions and categories a matter of public discourse, in which you get about five words https://www.lesswrong.com/posts/4ZvJab25tDebB8FGE/you-get-about-five-words [LW · GW].

If that isn't enough (and it isn’t), then sometimes the best you can do is spend however many weeks to decades it takes for the implications of those five words to diffuse through society and then move on to the next five words. Move too slow, and you lose momentum. Move too fast, and some fraction of society can't keep up. That fraction gets too large, and you get backlash. Lacking perfect agreement and coordination, some forums and subpopulations move faster than others. Repeat forever to asymptotically approach better definitions, we hope.

comment by FireStormOOO · 2023-07-16T03:58:35.303Z · LW(p) · GW(p)

Admittedly I skimmed large portions of that, but I'd like to take a crack at bridging some of that inferential distance with a short description of the model I've been using, whereby I keep all the concerns you brought up straight but also don't have to choke on pronouns.

Categories of Men and Women are useful in a wide variety of areas and point at a real thing. There's a region in the middle where these categories overlap and lack clean boundaries - while both genetics and birth sex are undeniable and straightforward facts in almost all cases (~98% IIRC), they don't make the wide-ranging good predictions you'd otherwise expect in this region. I've mentally been calling this the "gender/sex/identity is complicated" region. Within this region, carefully consider which category is more relevant and go with that; at other times a weighted average may be more appropriate.

By way of example, if I want to infer likely skill-sets, hobbies, or interests for someone trans, I'm probably looking at either their pre-transition category, or a weighted average based on years before vs after transition.
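
A minimal sketch of what that weighted-average heuristic could look like, with invented base rates and a hypothetical helper name, just to make the arithmetic concrete:

    # Hypothetical illustration of the "weighted average by years before vs.
    # after transition" heuristic described above; the rates are invented.
    def blended_estimate(rate_pre: float, rate_post: float,
                         years_pre: float, years_post: float) -> float:
        """Weight pre- and post-transition base rates by time spent in each."""
        total = years_pre + years_post
        w_post = years_post / total if total > 0 else 0.5
        return (1 - w_post) * rate_pre + w_post * rate_post

    # e.g. made-up base rates for some interest: 30% pre-, 10% post-transition
    print(blended_estimate(0.30, 0.10, years_pre=25, years_post=5))  # ~0.27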

On the other hand if I'm considering how a friend or conversation partner might prefer to be treated, I'd almost certainly be correct to infer based on claimed/stated gender until I know more.

On the one hand, I can definitely see why those threads got under your skin (and I'm shocked The Thoughts You Cannot Think didn't get a link); not the finest showing in clear thinking. Ultimately, though, I'm skeptical that we should treat pronouns as making some deep claim about the structure of person-space along the axis of sex. If anything, that there's conflict at all should serve to highlight that there's a large region (as much as 20% of the population maybe???) where this isn't cut and dried and simple rules aren't making good predictions. Looking at that structure, there's a decent if not airtight case for treating pronouns as you would any other nicknames or abbreviations - namely, acceptable insofar as the referent finds the name acceptable. There are places where a "no pseudonyms allowed, no exceptions" rule should and does trump "preferred moniker"/"no name-calling", but Twitter clearly isn't one.

comment by Chris_Leong · 2023-07-16T13:33:17.541Z · LW(p) · GW(p)

Sounds like a rough experience. Hope you're feeling better these days.

I can see why you might have an allergic reaction; however, your criticisms feel a bit overstated and as though they're not quite hitting the nail on the head, at least from my perspective.

I agree with Scott et al. regarding the validity of the language game where we include transwomen as women. I recommend reading at least a bit of Wittgenstein, if you've never done this.

This said, you are correct to suspect that there is something funky going on, where people are insisting on the use of this language game to the exclusion of the more traditional use. And it seems that in some cases, this insistence goes beyond merely a request for politeness/respect to an attempt to gain an advantage in public policy discourse.

Here, I'm trying to differentiate between the validity of a language game and its weaponisation. In particular, that one might defend the former without also defending the latter.

Perhaps if people were less aggressive about their attempts to control the public conversation, then you might feel differently about the validity of the language game in and of itself?

But in the current discourse, perhaps you're worried that defending the language game unavoidably supports its weaponised form?
 

comment by Yoav Ravid · 2023-07-16T08:33:45.413Z · LW(p) · GW(p)

I would love to read your "twelve short stories about language", if you feel like publishing them.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-07-19T00:45:54.608Z · LW(p) · GW(p)

Sure; it's not worth adapting into a post because I've already made most of these points elsewhere, but I can put up the email as an ancillary page.

Replies from: Yoav Ravid
comment by Yoav Ravid · 2023-07-19T05:43:47.393Z · LW(p) · GW(p)

Thanks!

comment by iceman · 2023-07-15T23:48:56.658Z · LW(p) · GW(p)

It's not exactly the point of your story, but...

Probably the most ultimately consequential part of this meeting was Michael verbally confirming to Ziz that MIRI had settled with a disgruntled former employee, Louie Helm, who had put up a website slandering them.

Wait, that actually happened? Louie Helm really was behind MIRICult? The accusations weren't just...Ziz being Ziz? And presumably Louie got paid out since why would you pay for silence if the accusations weren't at least partially true...or if someone were to go digging, they'd find things even more damning?

Those who are savvy in high-corruption equilibria maintain the delusion that high corruption is common knowledge, to justify expropriating those who naively don't play along, by narratizing them as already knowing and therefore intentionally attacking people, rather than being lied to and confused.

Ouch.

[..]Regardless of the initial intent, scrupulous rationalists were paying rent to something claiming moral authority, which had no concrete specific plan to do anything other than run out the clock, maintaining a facsimile of dialogue in ways well-calibrated to continue to generate revenue.

Really ouch.

So Yudkowsky doesn't have a workable alignment plan, so he decided to just live off our donations, running out the clock. I donated a six-figure amount to MIRI over the years, working my ass off to earn to give...and that's it?

Fuck.

I remember being at a party in 2015 and asking Michael what else I should spend my San Francisco software engineer money on, if not the EA charities I was considering. I was surprised when his answer was, "You."

That sounds like wise advice.

Replies from: habryka4, SaidAchmiz, elityre
comment by habryka (habryka4) · 2023-07-16T06:19:04.274Z · LW(p) · GW(p)

Wait, that actually happened? Louie Helm really was behind MIRICult? The accusations weren't just...Ziz being Ziz? And presumably Louie got paid out since why would you pay for silence if the accusations weren't at least partially true...or if someone were to go digging, they'd find things even more damning?

Louie Helm was behind MIRICult (I think as a result of some dispute where he asked for his job back after he had left MIRI and MIRI didn't want to give him his job back). As far as I can piece together from talking to people, he did not get paid out, but there was a threat of a lawsuit which probably cost him a bunch of money in lawyers, and it was settled by both parties signing an NDA (which IMO was a dumb choice on MIRI's part since the NDA has made it much harder to clear things up here). 

Overall I am quite confident that he didn't end up with more money than he started with after the whole miricult thing. Also, I don't think the accusations are "at least partially true". Like it's not the case that literally every sentence of the miricult page is false, but basically all the salacious claims are completely made up.

Replies from: iceman, lc, David Hornbein
comment by iceman · 2023-07-16T14:23:19.412Z · LW(p) · GW(p)

So, I started off with the idea that Ziz's claims about MIRI were frankly crazy...because Ziz was pretty clearly crazy (see their entire theory of hemispheres, "collapse the timeline," etc.) so I marked most of their claims as delusions or manipulations and moved on, especially since their recounting of other events on the page where they talked about miricult (which is linked in OP) comes off as completely unhinged.

But Zack confirming this meeting happened and vaguely confirming its contents completely changes all the probabilities. I now need to go back and recalculate a ton of likelihoods here starting from "this node with Vassar saying this event happened."

From Ziz's page:

LessWrong dev Oliver Habryka said it would be inappropriate for me to post about this on LessWrong, the community’s central hub website that mostly made it. Suggested me saying this was defamation.

It's obviously not defamation since Ziz believes it's true.

<insert list of rationality community platforms I’ve been banned from for revealing the statutory rape coverup by blackmail payout with misappropriated donor funds and whistleblower silencing, and Gwen as well for protesting that fact.>

Inasmuch as this is true, this is weak Bayesian evidence that Ziz's accusations are more true than false because otherwise you would just post something like your above response to me in response to them. "No, actually official people can't talk about this because there's an NDA, but I've heard second hand there's an NDA" clears a lot up, and would have been advantageous to post earlier, so why wasn't it?

Replies from: lc, green_leaf
comment by lc · 2023-07-16T17:15:02.050Z · LW(p) · GW(p)

It's obviously not defamation since Ziz believes it's true.

We're veering dangerously close to dramaposting here, but just FYI, habryka [LW(p) · GW(p)] has already disputed that they ever said this. I would like to know if the ban accusations are true, though.

Replies from: habryka4
comment by habryka (habryka4) · 2023-07-16T17:30:27.485Z · LW(p) · GW(p)

Can confirm that I don't believe I said anything about defamation, and in general continue to think that libel suits are really quite bad and do not think they are an appropriate tool in almost any circumstance.

We banned some of them for three months when they kept spamming the CFAR AMA a while ago: https://www.lesswrong.com/posts/96N8BT9tJvybLbn5z/we-run-the-center-for-applied-rationality-ama?commentId=5W86zzFy48WiLcSg6 [LW(p) · GW(p)]

I don't think we ever took any other moderation action, though I would likely ban them again, since like, I really don't want them around on LessWrong and they have far surpassed thresholds for acceptable behavior.

I would not ban anyone writing up details of the miricult stuff (including false accusations, and relatively strong emotions). Indeed somewhat recently I wrote like 3-5 pages of content about this on a private Facebook thread with a lot of rationality community people on it. I would be up for someone extracting the parts that seem shareable more broadly. Seems good to finally have something more central and public.

comment by green_leaf · 2023-07-16T14:46:21.800Z · LW(p) · GW(p)

Two points of order, without going into any specific accusations or their absence:

  1. The post is transphobic, which anticorrelates with being correct/truthful/objective.
  2. It seems optimized for smoothness/persuasion, which, based on my experience, also anticorrelates with both truth and objectivity.
Replies from: f____
comment by orellanin (f____) · 2023-07-16T17:20:48.509Z · LW(p) · GW(p)

What seems optimized for smoothness/persuasion?

Replies from: green_leaf
comment by green_leaf · 2023-09-17T15:25:09.116Z · LW(p) · GW(p)

The author shares how terrible it feels that X is true, without bringing arguments for X being true in the first place (based on me skimming the post). That can bypass the reader's fact-check (because why would he write about how bad it made him feel that X is true if it wasn't?).

It feels to me like he's trying to combine an emotional exposition (no facts, talking about his feelings) with an expository blogpost (explaining a topic), while trying to grab the best of both worlds (the persuasiveness and emotions of the former and the social status of the latter) without the substance to back it up.

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2023-09-17T23:15:35.304Z · LW(p) · GW(p)

Sorry, you're going to need to be more specific. What particular claim X have I asserted is true without bringing arguments for it? Reply!

I agree that I'm combining emotional autobiography with topic exposition, but the reason I'm talking about my autobiography at all is because I tried object-level topic exposition for years—in such posts as "The Categories Were Made for Man to Make Predictions" (2018), "Where to Draw the Boundaries?" [LW · GW] (2019), "Unnatural Categories Are Optimized for Deception" [LW · GW] (2021), and "Challenges to Yudkowsky's Pronoun Reform Proposal" (2022)—and it wasn't working. From my perspective, the only thing left to do was jump up a meta level and talk about why it wasn't working. If your contention is that I don't have the substance to back up my claims, I think you should be able to explain what I got wrong in those posts. Reply!

Reply!

comment by lc · 2023-07-16T08:00:38.469Z · LW(p) · GW(p)

Which IMO was a dumb choice on MIRI's part since the NDA has made it much harder to clear things up here

The lack of comment from Eliezer and other MIRI personnel had actually convinced me in particular that the claims were true. This is the first I heard that there's any kind of NDA preventing them from talking about it.

Replies from: jimrandomh
comment by jimrandomh · 2023-07-17T07:36:52.753Z · LW(p) · GW(p)

The lack of comment from Eliezer and other MIRI personnel had actually convinced me in particular that the claims were true. This is the first I heard that there's any kind of NDA preventing them from talking about it.

I think this means you had incorrect priors (about how often legal cases conclude with settlements containing nondisparagement agreements).

Replies from: keith_wynroe
comment by keith_wynroe · 2023-07-17T21:15:17.099Z · LW(p) · GW(p)

They can presumably confirm whether or not there is a nondisparagement agreement, and whether that is preventing them from commenting, though, right?

Replies from: jimrandomh
comment by jimrandomh · 2023-07-17T22:53:07.297Z · LW(p) · GW(p)

You can confirm this if you're aware that it's a possibility, and interpret carefully-phrased refusals to comment in a way that's informed by reasonable priors. You should not assume that anyone is able to directly tell you that an agreement exists.

Replies from: keith_wynroe
comment by keith_wynroe · 2023-07-18T00:13:32.474Z · LW(p) · GW(p)

Why not? Is it common for NDAs/non-disparagement agreements to also have a clause stating the parties aren't allowed to tell anyone about it? I've never heard of this outside of super-injunctions, which seem like a pretty separate thing.

Replies from: Dagon, pktechgirl
comment by Dagon · 2023-07-18T02:10:51.701Z · LW(p) · GW(p)

Absolutely common.  Most non-disparagement agreements are paired with non-disclosure agreements (or clauses in the non-disparagement wording) that prohibit talking about the agreement, as much as talking about the forbidden topics.  

It's pretty obvious to lawyers that "I would like to say this, but I have a legal agreement that I won't" is equivalent, in many cases, to saying it outright.

comment by Elizabeth (pktechgirl) · 2023-07-19T05:38:25.506Z · LW(p) · GW(p)

My boilerplate severance agreement at a job included an NDA that couldn't be acknowledged (I negotiated to change this).

comment by David Hornbein · 2023-07-16T19:30:55.537Z · LW(p) · GW(p)

"he didn't end up with more money than he started with after the whole miricult thing" is such a weirdly specific way to phrase things.

My speculation from this is that MIRI paid Helm or his lawyers some money, but less money than Helm had spent on the harassment campaign, and among people who know the facts there is a semantic disagreement about whether this constitutes a "payout". Some people say something like "it's a financial loss for Helm, so game-theoretically it doesn't provide an incentive to blackmail, therefore it's fine" and others say something like "if you pay out money in response to blackmail, that's a blackmail payout, you don't get to move the bar like that".

I would appreciate it if someone who knows what happened can confirm or deny this.

(AFAICT the only other possibility is that somewhere along the line, at least one of the various sources of contradictory-sounding rumors was just lying-or-so-careless-as-to-be-effectively-lying. Which is very possible, of course, that happens with rumors a lot.)

Replies from: habryka4
comment by habryka (habryka4) · 2023-07-17T01:48:00.649Z · LW(p) · GW(p)

I sadly don't know the answer to this. To open up the set of possibilities further, I have heard rumors that maybe Louie was demanding back some donations he had given MIRI previously, and if that happened, that might also complicate the definition of a "payout".

and others say something like "if you pay out money in response to blackmail, that's a blackmail payout, you don't get to move the bar like that".

I don't understand the logic of this. Does seem like game-theoretically the net-payout is really what matters. What would be the argument for something else mattering?

Replies from: David Hornbein, SaidAchmiz
comment by David Hornbein · 2023-07-17T19:32:14.381Z · LW(p) · GW(p)

I don't understand the logic of this. Does seem like game-theoretically the net-payout is really what matters. What would be the argument for something else mattering?

 

BEORNWULF: A messenger from the besiegers!

WIGMUND: Send him away. We have nothing to discuss with the norsemen while we are at war.

AELFRED: We might as well hear them out. This siege is deadly dull. Norseman, deliver your message, and then leave so that we may discuss our reply.

MESSENGER: Sigurd bids me say that if you give us two thirds of the gold in your treasury, our army will depart. He reminds you that if this siege goes on, you will lose the harvest, and this will cost you more dearly than the gold he demands.

The messenger exits.

AELFRED: Ah. Well, I can’t blame him for trying. But no, certainly not.

BEORNWULF: Hold on, I know what you’re thinking, but this actually makes sense. When Sigurd’s army first showed up, I was the first to argue against paying him off. After all, if we’d paid right at the start, then he would’ve made a profit on the attack, and it would only encourage more. But the siege has been long and hard for us both. If we accept this deal *now*, he’ll take a net loss. We’ve spent most of the treasury resisting the siege—

WIGMUND: As we should! Millions for defense, but not one cent for tribute!

BEORNWULF: Certainly. But the gold we have left won’t even cover what they’ve already spent on their attack. Their net payout will still be negative, so game-theoretically, it doesn’t make sense to think of it as “tribute”. As long as we’re extremely sure they’re in the red, we should minimize our own costs, and missing the harvest would be a *huge* cost. People will starve. The deal is a good one.

WIGMUND: Never! If once you have paid him the danegeld, you never get rid of the Dane!

BEORNWULF: Not quite. The mechanism matters. The Dane has an incentive to return *only if the danegeld exceeds his costs*.

WIGMUND: Look, you can mess with the categories however you like, and find some clever math that justifies doing whatever you’ve already decided you want to do. None of that constrains your behavior and so none of that matters. What matters is, take away all the fancy definitions and you’re still just paying danegeld.

BEORNWULF: How can I put this in language you’ll understand—it doesn’t matter whether the definitions support what *I* want to do, it matters whether the definitions reflect the *norsemen’s* decision algorithm. *They* care about the net payout, not the gross payout.

AELFRED: Hold on. Are you modeling the norsemen as profit-maximizers?

BEORNWULF: More or less? I mean, no one is perfectly rational, but yeah, everyone *approximates* a rational profit-maximizer.

WIGMUND: They are savage, irrational heathens! They never even study game theory!

BEORNWULF: Come on. I’ll grant that they don’t use the same jargon we do, but they attack because they expect to make a profit off it. If they don’t expect to profit, they’ll stop. Surely they do *that* much even without explicit game theoretic proofs.

AELFRED: That affects their decision, yes, but it’s far from the whole story. The norsemen care about more than just gold and monetary profit. They care about pride. Dominance. Social rank and standing. Their average warrior is a young man in his teens or early twenties. When he decides whether to join the chief’s attack, he’s not sitting down with spreadsheets and a green visor to compute the expected value, he’s remembering that time cousin Guthrum showed off the silver chalice he looted from Lindisfarne. Remember, Sigurd brought the army here in the first place to avenge his brother’s death—

BEORNWULF: That’s a transparent pretext! He can’t possibly blame us for that, we killed Agnarr in self-defense during the raid on the abbey.

WIGMUND: You can tell that to Sigurd. If it had been my brother, I’d avenge him too.

AELFRED: Among their people, when a man is murdered, it’s not a *tragedy* to his family, it’s an *insult*. It can only be wiped away with either a weregeld payment from the murderer or a blood feud. Yes, Sigurd cares about gold, but he also cares tremendously about *personally knowing he defeated us*, in order to remove the shame we dealt him by killing Agnarr. Modeling his decisions as profit-maximizing will miss a bunch of his actual decision criteria and constraints, and therefore fail to predict the norsemen’s future actions.

WIGMUND: You’re overcomplicating this. If we pay, the norsemen will learn that we pay, and more will come. If we do not pay, they will learn that we do not pay, and fewer will come.

BEORNWULF: They don’t care if we *pay*, they care if it’s *profitable*. This is basic accounting.

AELFRED: They *do* care if we pay. Most of them won’t know or care what the net-payout is. If we pay tribute, this will raise Sigurd’s prestige in their eyes no matter how much he spent on the expedition, and he needs his warriors’ support more than he needs our gold. Taking a net loss won’t change his view on whether he’s avenged the insult to his family, and we do *not* want the Norsemen to think they can get away with coming here to avenge “insults” like killing their raiders in self-defense. On the other hand, if Sigurd goes home doubly shamed by failing to make us submit, they’ll think twice about trying that next time.

BEORNWULF: I don’t care about insults. I don’t care what Sigurd’s warriors think of him. I don’t care who can spin a story of glorious victory or who ends up feeling like they took a shameful defeat. I care about how many of our people will die on norse spears, and how many of our people will die of famine if we don’t get the harvest in. All that other stuff is trivial bullshit in comparison.

AELFRED: That all makes sense. You still ought to track those things instrumentally. The norsemen care about all that, and it affects their behavior. If you want a model of how to deter them, you have to model the trivial bullshit that they care about. If you abstract away what they *actually do* care about with a model of what you think they *ought* to care about, then your model *won’t work*, and you might find yourself surprised when they attack again because they correctly predict that you’ll cave on “trivial bullshit”. Henry IV could swallow his pride and say “Paris is well worth a mass”, but that was because he was *correctly modeling* the Parisians’ pride.

WIGMUND: Wait. That is *wildly* anachronistic. Henry converted to Catholicism in 1593. This dialogue is taking place in, what, probably the 9th century?

AELFRED: Hey, I didn’t make a fuss when you quoted Kipling.

Replies from: elityre
comment by Eli Tyre (elityre) · 2024-02-18T07:41:44.216Z · LW(p) · GW(p)

This was fantastic, and you should post it as a top level post.

comment by Said Achmiz (SaidAchmiz) · 2023-07-17T04:56:32.078Z · LW(p) · GW(p)

I don’t understand the logic of this. Does seem like game-theoretically the net-payout is really what matters. What would be the argument for something else mattering?

Suppose that Alice blackmails me and I pay her $1,000,000. Alice has spent $1,500,000 on lawyers in the process of extracting this payout from me. The result of this interaction is that I have lost $1,000,000, while Alice has lost $500,000. (Alice’s lawyers have made a lot of money, of course.)

Bob hears about this. He correctly realizes that I am blackmailable. He talks to his lawyer, and they sign a contract whereby the lawyer gets half of any payout that they’re able to extract from me. Bob blackmails me and I pay him $1,000,000. Bob keeps $500,000, and his lawyer gets the other $500,000. Now I have again lost $1,000,000, while Bob has gained $500,000.

(How might this happen? Well, Bob’s lawyer is better than Alice’s lawyers were. Bob’s also more savvy, and knows how to find a good lawyer, how to negotiate a good contract, etc.)

That is: once the fact that you’re blackmailable is known, the net payout (taking into account expenditures needed to extract it from you) is not relevant, because those expenditures cannot be expected to hold constant—because they can be optimized. And the fact that (as is now a known fact) money can be extracted from you by blackmail, is the incentive to optimize them.
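
Here is the same arithmetic as a minimal sketch, using just the hypothetical numbers from the example above:

```python
# The target's loss is the gross payout; the blackmailer's net depends on how
# cheaply the payout was extracted, and that extraction cost is precisely what
# the next blackmailer gets to optimize.

def outcome(payout, extraction_cost):
    target_loss = payout
    blackmailer_net = payout - extraction_cost
    return target_loss, blackmailer_net

alice = outcome(payout=1_000_000, extraction_cost=1_500_000)  # expensive lawyers
bob = outcome(payout=1_000_000, extraction_cost=500_000)      # contingency-fee lawyer

print("Alice:", alice)  # (1000000, -500000): same loss to the target, Alice in the red
print("Bob:  ", bob)    # (1000000, 500000): same loss to the target, Bob in the black
```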

Replies from: jimrandomh, habryka4
comment by jimrandomh · 2023-07-17T07:31:18.062Z · LW(p) · GW(p)

Note that a lawyer who participated in that would be committing a crime. In the case of LH, there was (by my unreliable secondhand understanding) an employment-contract dispute and a blackmail scheme happening concurrently. The lawyers would have been involved only in the employment-contract dispute, not in the blackmail, and any settlement reached would have nominally been only for dropping the employment-contract-related claims. An ordinary employment dispute is a common-enough thing that each side's lawyers would have experience estimating the other side's costs at each stage of litigation, and using those estimates as part of a settlement negotiation.

(Filing lawsuits without merit is sometimes analogized to blackmail, but US law defines blackmail much more narrowly, in such a way that asking for payment to not allege statutory rape on a website is blackmail, but asking for payment to not allege unfair dismissal in a civil court is not.)

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-17T16:55:50.085Z · LW(p) · GW(p)

Then it sounds like the blackmailer in question spent $0 on perpetrating the blackmail, which is even more of an incentive for others to blackmail MIRI in the future.

Replies from: jimrandomh
comment by jimrandomh · 2023-07-17T19:18:37.617Z · LW(p) · GW(p)

Then it sounds like the blackmailer in question spent $0 on perpetrating the blackmail

No, that's not what I said (and is false). To estimate the cost you have to compare the outcome of the legal case to the counterfactual baseline in which there was no blackmail happening on the side (that baseline is not zero), and you have to include other costs besides lawyers.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-17T22:53:36.250Z · LW(p) · GW(p)

To estimate the cost you have to compare the outcome of the legal case to the counterfactual baseline in which there was no blackmail happening on the side (that baseline is not zero)

Seems wrong to me. Opportunity cost is not the same as expenditures.

and you have to include other costs besides lawyers.

Alright, and what costs were there?

Replies from: jimrandomh
comment by jimrandomh · 2023-07-17T23:00:16.884Z · LW(p) · GW(p)

Sorry, this isn't a topic I want to discuss with someone who's being thick in the way that you're being thick right now. Tapping out.

comment by habryka (habryka4) · 2023-07-17T05:05:30.612Z · LW(p) · GW(p)

Sure, I agree that this is true. But as long as you run a policy that is sensitive to your counterparty optimizing expenditures, I think this no longer holds? 

Like, I think in-general a policy I have for stuff like this is something like "ensure the costs to my counterparty were higher than their gains", and then take actions appropriate to the circumstances. This seems like it wouldn't allow for the kind of thing you describe above (and also seems like the most natural strategy for me in blackmail cases like this).

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-17T05:42:15.570Z · LW(p) · GW(p)

But as long as you run a policy that is sensitive to your counterparty optimizing expenditures, I think this no longer holds?

What would this look like…? It doesn’t seem to me to be the sort of thing which it’s at all feasible to do in practice. Indeed it’s hard to see what this would even mean; if the end result is that you pay out sometimes and refuse other times, all that happens is that external observers conclude “he pays out sometimes”, and keep blackmailing you.

in-general a policy I have for stuff like this is something like “ensure the costs to my counterparty were higher than their gains”, and then take actions appropriate to the circumstances

Actions like what?

Like, let’s say that you’re MIRI and you’re being blackmailed. You don’t know how much your blackmailer is paying his lawyers (why would you, after all?). What do you do?

And for all you know, the contract your blackmailer’s got with his lawyers might be as I described—lawyers get some percent of payout, and nothing if there’s no payout. What costs do you impose on the blackmailer?

In short, I think the policy you describe is usually impossible to implement in practice.


But note that this is all tangential. It’s only relevant to the original question (about MIRI) if you claim that MIRI were attempting to implement a policy such as you describe. Do you claim this? If so, have you any evidence?

Replies from: habryka4
comment by habryka (habryka4) · 2023-07-17T17:19:29.297Z · LW(p) · GW(p)

I mean, the policy here really doesn't seem very hard. If you do know how much your opposing party is paying their lawyers, you optimize that hard. If you don't know, you make some conservative estimate. I've run policies like this in lots of different circumstances, and it's also pretty close to common sense as a response to blackmail and threats.

Do you claim this? If so, have you any evidence?

I've asked some MIRI people this exact question and they gave me this answer, with pretty strong confidence and relatively large error margins.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-17T17:42:46.844Z · LW(p) · GW(p)

I have to admit that I still haven’t the faintest clue what concrete behavior you’re actually suggesting. I repeat my questions: “What would this look like…?” and “Actions like what?” (Indeed, since—as I understand it—you say you’ve done this sort of thing, can you give concrete examples from those experiences?)

I’ve asked some MIRI people this exact question and they gave me this answer, with pretty strong confidence and relatively large error margins.

Alright, and what has this looked like in practice for MIRI…?

Replies from: habryka4
comment by habryka (habryka4) · 2023-07-17T18:18:17.161Z · LW(p) · GW(p)

It means you sit down, you make some fermi estimates of how much benefit the counterparty could be deriving from this threat/blackmail, then you figure out what you would need to do to roughly net out to zero, then you do those things. If someone asks you what your policy is, you give this summary. 

In every specific instance this looks different. Sometimes this means you reach out to people they know and let them know about the blackmailing in a way that would damage their reputation. Sometimes it means you threaten to escalate to a legal battle where you are willing to burn resources to make the counterparty come out in the red. 
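
For concreteness, a minimal sketch of the back-of-the-envelope version of that policy might look like the following; every number is a hypothetical placeholder, not an estimate of any real case.

```python
# "Net out to roughly zero": estimate what the counterparty stands to gain and
# what they have already spent, then impose enough additional cost that their
# expected net is not positive.

estimated_gain = 100_000          # Fermi estimate of what the threat could earn them
estimated_costs_so_far = 30_000   # lawyers, time, reputation they have already spent

current_net = estimated_gain - estimated_costs_so_far
additional_cost_to_impose = max(0, current_net)  # via reputational or legal escalation

print(f"Impose at least ~${additional_cost_to_impose:,} in further costs.")
```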

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-17T22:58:47.004Z · LW(p) · GW(p)

In every specific instance this looks different. Sometimes this means you reach out to people they know and let them know about the blackmailing in a way that would damage their reputation. Sometimes it means you threaten to escalate to a legal battle where you are willing to burn resources to make the counterparty come out in the red.

Why would you condition any of this on how much they’re spending?

And how exactly would you calibrate it to impose a specific amount of cost on the blackmailer? (How do you even map some of these things to monetary cost…?)

comment by Said Achmiz (SaidAchmiz) · 2023-07-16T00:39:25.415Z · LW(p) · GW(p)

So Yudkowsky doesn’t have a workable alignment plan, so he decided to just live off our donations, running out the clock.

Er… is anyone actually claiming this? This is quite the accusation, and if it were being made, I’d want to see some serious evidence, but… is it, in fact, being made?

(It does seem like OP is saying this, but… in a weird way that doesn’t seem to acknowledge the magnitude of the accusation, and treats it as a reasonable characterization of other claims made earlier in the post. But that doesn’t actually seem to make sense. Am I misreading, or what?)

Replies from: iceman, Raemon, martin-randall
comment by iceman · 2023-07-16T02:06:49.441Z · LW(p) · GW(p)

The second half (just live off donations?) is also my interpretation of OP. The first half (workable alignment plan?) is my own intuition based on MIRI mostly not accomplishing anything of note over the last decade, and...

MIRI & company spent a decade working on decision theory, which seems irrelevant if deep learning is the path (aside: how would you face Omega if you were the sort of agent that pays out blackmail?). Yudkowsky offers to bet Demis Hassabis that Go won't be solved in the short term. They predict that AI will only come from GOFAI AIXI-likes with utility functions that will bootstrap recursively. They predict fast takeoff and FOOM.

Ooops.

The answer was actually deep learning and not systems with utility functions. Go gets solved. Deep Learning systems don't look like they FOOM. Stochastic Gradient Descent doesn't look like it will treacherous turn. Yudkowsky's dream of building the singleton Sysop is gone and was probably never achievable in the first place.

People double down with the "mesaoptimizer" frame instead of admitting that it looks like SGD does what it says on the tin. Yudkowsky goes on a doom media spree. They advocate for a regulatory regime that would make it very easy to empower private interests over public interests. Enragingly to me, there's a pattern of engagement where it seems like AI Doomers will only interact with weak arguments instead of strong ones: Yud mostly argues with low-quality e/accs on Twitter, where it's easy to score Ws; it was mildly surprising when he even responded with "This is kinda long." [LW(p) · GW(p)] to Quintin Pope's objection thread [LW · GW].

What should MIRI have done, had they taken the good sliver of The Sequences to heart? They should have said oops [LW · GW]. They should have halted, melted and caught fire [LW · GW]. They should have acknowledged that the sky was blue. They should have radically changed their minds when the facts changed. But that would have cut off their funding. If the world isn't going to end from a FOOMing AI, why should MIRI get paid?

So what am I supposed to extract from this pattern of behaviour?

Replies from: jimrandomh, SaidAchmiz, FireStormOOO, jimrandomh, SaidAchmiz
comment by jimrandomh · 2023-07-16T21:52:18.403Z · LW(p) · GW(p)

Deep Learning systems don't look like they FOOM. Stochastic Gradient Descent doesn't look like it will treacherous turn.

I think you've updated incorrectly, by failing to keep track of what the advance predictions were (or would have been) about when a FOOM or a treacherous turn will happen.

If foom happens, it happens no earlier than the point where AI systems can do software-development on their own codebases, without relying on close collaboration with a skilled human programmer. This point has not yet been reached; they're idiot-savants with skill gaps that prevent them from working independently, and no AI system has passed the litmus test I use for identifying good (human) programmers. They're advancing in that direction pretty rapidly, but they're unambiguously not there yet.

Similarly, if a treacherous turn happens, it happens no earlier than the point where AI systems can do strategic reasoning with long chains of inference; this again has an idiot-savant dynamic going on, which can create the false impression that this landmark has been reached, when in fact it hasn't.

comment by Said Achmiz (SaidAchmiz) · 2023-07-16T04:41:02.254Z · LW(p) · GW(p)

They predict that AI will only come from GOFAI AIXI-likes with utility functions that will bootstrap recursively.

Do you have a link for this prediction? (Or are you just referring to, e.g., Eliezer’s dismissive attitude toward neural networks, as expressed in the Sequences?)

They predict fast takeoff and FOOM. … Deep Learning systems don’t look like they FOOM.

It’s not clear that deep learning systems get us to AGI, either. There doesn’t seem to be any good reason to be sure, at this time, that we won’t get “fast takeoff and FOOM”, does there? (Indeed it’s my understanding that Eliezer still predicts this. Or is that false?)

Stochastic Gradient Descent doesn’t look like it will treacherous turn.

It… doesn’t? What do you mean by this? I’ve seen no reason to be optimistic on this point—quite the opposite!

So what am I supposed to extract from this pattern of behaviour?

I think that at least some of the things you take to be obvious conclusions that Eliezer/MIRI should’ve drawn, are in fact not obvious, and some are even plausibly false.

You also make some good points. But there isn’t nearly so clear a pattern as you suggest.

Replies from: Vaniver
comment by Vaniver · 2023-07-17T20:02:54.750Z · LW(p) · GW(p)

It… doesn’t? What do you mean by this? I’ve seen no reason to be optimistic on this point—quite the opposite!

As I understand the argument, it goes like the following:

  1. For evolutionary methods, you can't predict the outcome of changes before they're made, and so you end up with 'throw the spaghetti at the wall and see what sticks'. At some point, those changes accumulate to a mind that's capable of figuring out what environment it's in and then performing well at that task, so you get what looks like an aligned agent while you haven't actually exerted any influence on its internal goals (i.e. what it'll do once it's out in the world).
  2. For gradient-descent based methods, you can predict the outcome of changes before they're made; that's the gradient part. It's overall less plausible that the system you're building figures out generic reasoning and then applies that generic reasoning to a specific task, compared to figuring out the specific reasoning for the task that you'd like solved. Jumps in the loss look more like "a new cognitive capacity has emerged in the network" and less like "the system is now reasoning about its training environment".

Of course, that "overall less plausible" is making a handwavy argument about what simplicity metric we should be using and which design is simpler according to that metric. Related, earlier research: Are minimal circuits deceptive? [LW · GW]

IMO this should be somewhat persuasive but not conclusive. I'm much happier with a transformer shaped by a giant English text corpus than I am with whatever is spit out by a neural-architecture-search program pointed at itself! But for cognitive megaprojects, I think you probably have to have something-like-a-mind in there, even if you got to it by SGD.
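
As a concrete illustration of the "you can predict the outcome of changes before they're made; that's the gradient part" claim in point 2, here is a minimal toy sketch with a quadratic loss (a made-up example, not anything from a real training run):

```python
# For gradient descent, the gradient gives a first-order prediction of how the
# loss will change *before* the update is applied; evolutionary search has no
# analogous pre-update prediction.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)

def loss(w):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
lr = 0.01
grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of the mean-squared error at w

predicted_drop = lr * (grad @ grad)      # first-order prediction of the improvement
actual_drop = loss(w) - loss(w - lr * grad)

print(f"predicted drop: {predicted_drop:.4f}  actual drop: {actual_drop:.4f}")
```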

comment by FireStormOOO · 2023-07-16T04:30:06.920Z · LW(p) · GW(p)

It's pretty easy to find reasons why everything will hopefully be fine, or AI hopefully won't FOOM, or we otherwise needn't do anything inconvenient to get good outcomes.  It's proving considerably harder (from my outside the field view) to prove alignment, or prove upper bounds on rate of improvement, or prove much of anything else that would be cause to stop ringing the alarm.

FWIW I'm considerably less worried than I was when the Sequences were originally written.  The paradigms that have taken off since do seem a lot more compatible with straightforward training solutions that look much less alien than expected.  There are plausible scenarios where we fail at solving alignment and still get something tolerably human shaped, and none of those scenarios previously seemed plausible.  That optimism just doesn't take it under the stop worrying threshold.

comment by jimrandomh · 2023-07-16T23:45:31.397Z · LW(p) · GW(p)

This doesn't seem consistent to me with MIRI having run a research program with a machine learning focus. IIRC (I don't have links handy, but I'm pretty sure there were announcements made), they wound up declaring failure on that research program, and it was only after that happened that they started talking about the world being doomed and there not being anything that seemed like it would work for aligning AGI in time.

comment by Said Achmiz (SaidAchmiz) · 2023-07-16T04:45:40.292Z · LW(p) · GW(p)

Yudkowsky offers to bet Demis Hassabis that Go won’t be solved in the short term.

Incidentally, I don’t think I’m willing to trust a hearsay report on this without confirmation.

Do you happen to have any links to Eliezer making such a claim in public? Or, at least, any confirmation that the cited comment was made as described?

Replies from: DanielFilan
comment by DanielFilan · 2023-07-16T06:32:08.058Z · LW(p) · GW(p)

Closest thing I'm aware of is that at the time of the AlphaGo matches he bet people at like 3:2 odds, favourable to him, that Lee Sedol would win. Link here

comment by Raemon · 2023-07-16T03:02:43.494Z · LW(p) · GW(p)

My interpretation of various things Michael and co. have said is "Effective altruism in general (and MIRI / AI-safety in particular) is a memeplex optimizing to extract resources from people in a fraudulent way", which does include some degree of "straightforward fraud the way most people would interpret it", but also, their worldview includes generally seeing a lot of things as fraudulent in ways/degrees that common parlance wouldn't generally mean.

I predict they wouldn't phrase things the specific way iceman phrased it (but, not confidently). 

I think Jessicata's The AI Timelines Scam [LW · GW] is a pointer to the class of thing they might tend to mean. Some other relevant posts include Can crimes be discussed literally? [LW · GW] and Approval Extraction Advertised as Production [LW · GW].

Replies from: SaidAchmiz, Raemon
comment by Said Achmiz (SaidAchmiz) · 2023-07-16T03:51:06.106Z · LW(p) · GW(p)

Yes, this is all reasonable, but as a description of Eliezer’s behavior as understood by him, and also as understood by, like, an ordinary person, “doesn’t have a workable alignment plan, so he decided to just live off our donations, running out the clock” is just… totally wrong… isn’t it?

That is, that characterization doesn’t match what Eliezer sees himself as doing, nor does it match how an ordinary person (and one who had no particular antipathy toward Eliezer, and thus was not inclined to describe his behavior uncharitably, only impartially), speaking in ordinary English, would describe Eliezer as doing—correct?

Replies from: Raemon
comment by Raemon · 2023-07-16T04:02:20.982Z · LW(p) · GW(p)

Yes, that is my belief. (Sorry, should have said that concretely). I'm not sure what an 'ordinary person' should think because 'AI is dangerous' has a lot of moving pieces and I think most people are (kinda reasonably?) epistemically helpless about the situation. But I do think iceman's summary is basically obviously false, yes. 

My own current belief is "Eliezer/MIRI probably had something-like-a-plan around 2017, probably didn't have much of a plan by 2019 that Eliezer himself believed in, but, 'take a break, and then come back to the problem after thinking about it' feels like a totally reasonable thing to me to do". (and meanwhile there were still people at MIRI working on various concrete projects that at least the people involved thought were worthwhile). 

i.e. I don't think MIRI "gave up" [LW · GW].

I do think, if you don't share Eliezer's worldview, it's a reasonable position to be suspicious and hypothesize that MIRI's current activities are some sort of motivated-cognition-y cope, but confidently asserting that seems wrong to me. (I also think there's a variety of worldviews that aren't Eliezer's exact worldview that make his actions still pretty coherent, and I think it's a pretty sketchy position to assert all those nearby-worldviews are so obviously wrong as to make 'motivated cope/fraud' your primary frame.)

comment by Raemon · 2023-07-16T03:08:39.827Z · LW(p) · GW(p)

(fwiw my overall take is that I think there is something to this line of thinking. My general experience is that when Michael/Benquo/Jessica say "something is fishy here", there often turns out to be something I agree is fishy in some sense, but I find their claims overstated and running with some other assumptions I don't believe that make the thing seem worse to them than it does to me)

comment by Martin Randall (martin-randall) · 2023-07-16T03:20:30.502Z · LW(p) · GW(p)

For the first part, Yudkowsky has said that he doesn't have a workable alignment plan, and nobody does, and we are all going to die. This is not blameworthy, I also do not have a workable alignment plan.

For the second part, he was recently on a sabbatical, presumably funded by prior income that was funded by charity, so one might say he was living off donations. Not blameworthy, I also take vacations.

For the third part, everyone who thinks that we are all going to die is in some sense running out the clock, be they disillusioned transhumanists or medieval serfs. Hopefully we make some meaning while we are alive. Not blameworthy, just the human condition.

Whether MIRI is a good place to donate is a very complicated question, but certainly "no" is a valid answer for many donors.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-16T03:55:23.197Z · LW(p) · GW(p)

These are good points. But it does seem like what @iceman meant by the bit that I quoted at least has connotations that go beyond your interpretation, yes?

Whether MIRI is a good place to donate is a very complicated question, but certainly “no” is a valid answer for many donors.

Sure. I haven’t donated to MIRI in many years, so I certainly wouldn’t tell anyone else to do so. (It’s not my understanding that MIRI is funding constrained at this time. Can anyone confirm or disconfirm this?)

Replies from: martin-randall
comment by Martin Randall (martin-randall) · 2023-07-19T01:11:36.063Z · LW(p) · GW(p)

What accusation do you see in the connotations of that quote? Genuine question, I could guess but I'd prefer to know. Mostly the subtext I see from iceman is disappointment and grief and anger and regret. Which are all valid emotions for them to feel.

I think a lot of what might have been serious accusations in 2019 are now common knowledge, eg after Bankless [LW · GW], Death with Dignity [LW · GW], etc.

(It’s not my understanding that MIRI is funding constrained at this time. Can anyone confirm or disconfirm this?)

From the Bankless interview:

How do I put it... The saner outfits do have uses for money. They don't really have scalable uses for money, but they do burn any money literally at all. Like, if you gave MIRI a billion dollars, I would not know how to...

Well, at a billion dollars, I might try to bribe people to move out of AI development, that gets broadcast to the whole world, and move to the equivalent of an island somewhere—not even to make any kind of critical discovery, but just to remove them from the system. If I had a billion dollars.

If I just have another $50 million, I'm not quite sure what to do with that, but if you donate that to MIRI, then you at least have the assurance that we will not randomly spray money on looking like we're doing stuff and we'll reserve it, as we are doing with the last giant crypto donation somebody gave us until we can figure out something to do with it that is actually helpful. And MIRI has that property. I would say probably Redwood Research has that property.

(Edited to fix misquote)

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-19T04:40:01.799Z · LW(p) · GW(p)

So, just to clarify, “serious accusation” is not a phrase that I have written in this discussion prior to this comment, which is what the use of quotes in your comment suggests. I did write something which has more or less the same meaning! So you’re not mis-ascribing beliefs to me. But quotes mean that you’re… quoting… and that’s not the case here.

Anyway, on to the substance:

What “serious accusation” do you see in the connotations of that quote?

And the quote in question, again, is [LW(p) · GW(p)]:

So Yudkowsky doesn’t have a workable alignment plan, so he decided to just live off our donations, running out the clock.

The connotations are that Eliezer has consciously chosen to stop working on alignment, while pretending to work on alignment, and receiving money to allegedly work on alignment but instead just not doing so, knowing that there won’t be any consequences for perpetrating this clear and obvious scam in the classic sense of the word, because the world’s going to end and he’ll never be held to account.

Needless to say, it just does not seem to me like Eliezer or MIRI are doing anything remotely like that. Indeed I don’t think anyone (serious) has even suggested that they’re doing anything like that. (The usual horde of haters on Twitter / Reddit / etc. notwithstanding.)

Mostly the subtext I see from iceman is disappointment and grief and anger and regret. Which are all valid emotions for them to feel.

But of course this is largely nonsensical in the absence of any “serious accusations”. Grief over what, anger about what? Why should these things be “valid emotions … to feel”? (And it can’t just be “we’re all going to die”, because that’s not new; we didn’t just find that out from the OP—while iceman’s comment clearly implies that whatever is the cause of his reaction, it’s something that he just learned from Zack’s post.)

I think a lot of what might have been “serious accusations” in 2019 are now common knowledge, eg after Bankless [LW · GW], Death with Dignity [LW · GW], etc.

Which is precisely why iceman’s comment does not make sense as a reply to this post, now; nor is the characterization which I quoted an accurate one.

(It’s not my understanding that MIRI is funding constrained at this time. Can anyone confirm or disconfirm this?)

From the Bankless interview:

Yep, I would describe that state of affairs as “not funding constrained”.

Replies from: martin-randall
comment by Martin Randall (martin-randall) · 2023-07-19T18:34:51.193Z · LW(p) · GW(p)

I edited out my misquote, my apologies.

I think emotions are not blame assignment tools, and have other (evolutionary) purposes. A classic example is a relationship break-up, where two people can have strong emotions even though nobody did anything wrong. So I do not interpret emotions as accusations in general. It sounds like you have a different approach, and I don't object to that.

Grief over what, anger about what?

For example, grief over the loss of the $100k+ donation. Donated with the hope that it would reduce extinction risk, but with the benefit of hindsight the donor now thinks that the marginal donation had no counterfactual impact. It's not blameworthy because no researcher can possibly promise that a marginal donation will have a large counterfactual impact, and MIRI did not so promise. But a donor can still grieve the loss without someone being to blame.

For example, anger that Yudkowsky realized he had no workable alignment plan, in his estimation, in 2015 (Bankless [LW · GW]), and didn't share that until 2022 (Death with Dignity [LW · GW]). This is not blameworthy because people are not morally obliged to share their extinction risk predictions, and MIRI has a clear policy against sharing information by default. But a donor can still be angry that they were disadvantaged by known unknowns.

I hope these examples illustrate that a non-accusatory interpretation is sensical, even if you don't think it plausible.

There's a later comment from iceman [LW(p) · GW(p)], which is probably the place to discuss what iceman is alleging:

What should MIRI have done, had they taken the good sliver of The Sequences to heart? They should have said oops. They should have halted, melted and caught fire. They should have acknowledged that the sky was blue. They should have radically changed their minds when the facts changed. But that would have cut off their funding. If the world isn't going to end from a FOOMing AI, why should MIRI get paid?

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-07-19T19:39:22.191Z · LW(p) · GW(p)

I think emotions are not blame assignment tools, and have other (evolutionary) purposes. A classic example is a relationship break-up, where two people can have strong emotions even though nobody did anything wrong. So I do not interpret emotions as accusations in general. It sounds like you have a different approach, and I don’t object to that.

You misunderstand. I’m not “interpret[ing] emotions as accusations”; I’m simply saying that emotions don’t generally arise for no reason at all (if they do, we consider that to be a pathology!).

So, in your break-up example, the two people involved of course have strong emotions—because of the break-up! On the other hand, it would be very strange indeed to wake up one day and have those same emotions, but without having broken up with anyone, or anything going wrong in your relationships at all.

And likewise, in this case:

Grief over what, anger about what?

For example, grief over the loss of the $100k+ donation. Donated with the hope that it would reduce extinction risk, but with the benefit of hindsight the donor now thinks that the marginal donation had no counterfactual impact. It’s not blameworthy because no researcher can possibly promise that a marginal donation will have a large counterfactual impact, and MIRI did not so promise. But a donor can still grieve the loss without someone being to blame.

Well, it’s a bit dramatic to talk of “grief” over the loss of money, but let’s let that pass. More to the point: why is it a “loss”, suddenly? What’s happened just now that would cause iceman to view it as a “loss”? It’s got to be something in Zack’s post, or else the comment is weirdly non-apropos, right? In other words, the implication here is that something in the OP has caused iceman to re-examine the facts, and gain a new “benefit of hindsight”. But that’s just what I’m questioning.

For example, anger that Yudkowsky realized he had no workable alignment plan, in his estimation, in 2015 (Bankless [LW · GW]), and didn’t share that until 2022 (Death with Dignity [LW · GW]). This is not blameworthy because people are not morally obliged to share their extinction risk predictions, and MIRI has a clear policy against sharing information by default. But a donor can still be angry that they were disadvantaged by known unknowns.

I do not read Eliezer’s statements in the Bankless interview as saying that he “realized he had no workable alignment plan” in 2015. As far as I know, at no time since starting to write the Sequences has Eliezer ever claimed to have, or thought that he had, a workable alignment plan. This has never been a secret, nor is it news, either to Eliezer in 2015 or to the rest of us in 2022.

I hope these examples illustrate that a non-accusatory interpretation is sensical, even if you don’t think it plausible.

They do not.

There’s a later comment from iceman [LW(p) · GW(p)], which is probably the place to discuss what iceman is alleging:

Well, you can see my response to that comment.

comment by Eli Tyre (elityre) · 2024-02-18T07:32:06.168Z · LW(p) · GW(p)

And presumably Louie got paid out since why would you pay for silence if the accusations weren't at least partially true

FWIW, my current understanding is that this inference isn't correct. I think it's common practice to pay settlements to people, even if their claims are fallacious, since having an extended court battle is sometimes way worse.

comment by tailcalled · 2023-07-15T22:36:01.533Z · LW(p) · GW(p)

Jessica explained what she saw as the problem with this. What Ben was proposing was creating clarity about behavioral patterns. I was saying that I was afraid that creating such clarity is an attack on someone. But if so, then my blog was an attack on trans people. What was going on here?

Socially, creating clarity about behavioral patterns is construed as an attack and can make things worse for someone. For example, if your livelihood is based on telling a story about you and your flunkies being the only sane truthseeking people in the world, then me demonstrating that you don't care about the truth when it's politically inconvenient is a threat to your marketing story and therefore to your livelihood. As a result, it's easier to create clarity down power gradients than up them: it was easy for me to blow the whistle on trans people's narcissistic delusions, but hard to blow the whistle on Yudkowsky's.

The phrase "construed" makes me wonder... Is it not an attack? I guess I have a relatively shallow/mostly-aesthetic model of what is an "attack" or not in debates, because I can't really think of any crisp arguments in favor or against.

comment by Signer · 2023-07-21T17:52:03.556Z · LW(p) · GW(p)

It’s easy to say “X is a Y” for arbitrary X and Y if the stakes demand it, but that’s not the same thing as using that concept of Y internally as part of your world-model.

So don't use it internally and just say it?

After reading it I still don't get what the connection between the object-level question and abstract epistemology is. Yes, some concepts are more useful for figuring out reality. So is self-modifying to not care about joy. It is a question of utility, so what actual bad thing is supposed to happen in reality if we started to use words that are four symbols longer?