Posts

Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems 2021-05-03T04:31:23.547Z
Communication Requires Common Interests or Differential Signal Costs 2021-03-26T06:41:25.043Z
Less Wrong Poetry Corner: Coventry Patmore's "Magna Est Veritas" 2021-01-30T05:16:26.486Z
Unnatural Categories Are Optimized for Deception 2021-01-08T20:54:57.979Z
And You Take Me the Way I Am 2020-12-31T05:45:24.952Z
Containment Thread on the Motivation and Political Context for My Philosophy of Language Agenda 2020-12-10T08:30:19.126Z
Scoring 2020 U.S. Presidential Election Predictions 2020-11-08T02:28:29.234Z
Message Length 2020-10-20T05:52:56.277Z
Msg Len 2020-10-12T03:35:05.353Z
Artificial Intelligence: A Modern Approach (4th edition) on the Alignment Problem 2020-09-17T02:23:58.869Z
Maybe Lying Can't Exist?! 2020-08-23T00:36:43.740Z
Algorithmic Intent: A Hansonian Generalized Anti-Zombie Principle 2020-07-14T06:03:17.761Z
Optimized Propaganda with Bayesian Networks: Comment on "Articulating Lay Theories Through Graphical Models" 2020-06-29T02:45:08.145Z
Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning 2020-06-07T07:52:09.143Z
Comment on "Endogenous Epistemic Factionalization" 2020-05-20T18:04:53.857Z
"Starwink" by Alicorn 2020-05-18T08:17:53.193Z
Zoom Technologies, Inc. vs. the Efficient Markets Hypothesis 2020-05-11T06:00:24.836Z
A Book Review 2020-04-28T17:43:07.729Z
Brief Response to Suspended Reason on Parallels Between Skyrms on Signaling and Yudkowsky on Language and Evidence 2020-04-16T03:44:06.940Z
Why Telling People They Don't Need Masks Backfired 2020-03-18T04:34:09.644Z
The Heckler's Veto Is Also Subject to the Unilateralist's Curse 2020-03-09T08:11:58.886Z
Relationship Outcomes Are Not Particularly Sensitive to Small Variations in Verbal Ability 2020-02-09T00:34:39.680Z
Book Review—The Origins of Unfairness: Social Categories and Cultural Evolution 2020-01-21T06:28:33.854Z
Less Wrong Poetry Corner: Walter Raleigh's "The Lie" 2020-01-04T22:22:56.820Z
Don't Double-Crux With Suicide Rock 2020-01-01T19:02:55.707Z
Speaking Truth to Power Is a Schelling Point 2019-12-30T06:12:38.637Z
Stupidity and Dishonesty Explain Each Other Away 2019-12-28T19:21:52.198Z
Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think 2019-12-27T05:09:22.546Z
Funk-tunul's Legacy; Or, The Legend of the Extortion War 2019-12-24T09:29:51.536Z
Free Speech and Triskaidekaphobic Calculators: A Reply to Hubinger on the Relevance of Public Online Discussion to Existential Risk 2019-12-21T00:49:02.862Z
A Theory of Pervasive Error 2019-11-26T07:27:12.328Z
Relevance Norms; Or, Gricean Implicature Queers the Decoupling/Contextualizing Binary 2019-11-22T06:18:59.497Z
Algorithms of Deception! 2019-10-19T18:04:17.975Z
Maybe Lying Doesn't Exist 2019-10-14T07:04:10.032Z
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists 2019-09-24T04:12:07.560Z
Schelling Categories, and Simple Membership Tests 2019-08-26T02:43:53.347Z
Diagnosis: Russell Aphasia 2019-08-06T04:43:30.359Z
Being Wrong Doesn't Mean You're Stupid and Bad (Probably) 2019-06-29T23:58:09.105Z
What does the word "collaborative" mean in the phrase "collaborative truthseeking"? 2019-06-26T05:26:42.295Z
The Univariate Fallacy 2019-06-15T21:43:14.315Z
No, it's not The Incentives—it's you 2019-06-11T07:09:16.405Z
"But It Doesn't Matter" 2019-06-01T02:06:30.624Z
Minimax Search and the Structure of Cognition! 2019-05-20T05:25:35.699Z
Where to Draw the Boundaries? 2019-04-13T21:34:30.129Z
Blegg Mode 2019-03-11T15:04:20.136Z
Change 2017-05-06T21:17:45.731Z
An Intuition on the Bayes-Structural Justification for Free Speech Norms 2017-03-09T03:15:30.674Z
Dreaming of Political Bayescraft 2017-03-06T20:41:16.658Z
Rationality Quotes January 2010 2010-01-07T09:36:05.162Z
News: Improbable Coincidence Slows LHC Repairs 2009-11-06T07:24:31.000Z

Comments

Comment by Zack_M_Davis on There’s no such thing as a tree (phylogenetically) · 2021-05-04T14:51:25.581Z · LW · GW

On the specific example of trees, John Wentworth recently pointed out that neural networks tend to learn a "tree" concept: a small, local change to the network can add or remove trees from generated images. That kind of correspondence between human and unsupervised (!) machine-learning model concepts is the kind of thing I'd expect to happen if trees "actually exist", rather than trees being weird and a little arbitrary. (Where things are closer to "actually existing" rather than being arbitrary when different humans and other AI architectures end up converging on the same concept in order to compress their predictions.)

(Now I'm wondering if there's some sort of fruitful analogy to be made between convergence of tree concepts in different maps, and convergent evolution in the territory; in some sense, the fact that evolution keeps rediscovering the tree strategy makes them less "arbitrary" than if trees had only been "invented once" and all descended from the same ur-tree ...)

Comment by Zack_M_Davis on Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems · 2021-05-03T16:48:07.063Z · LW · GW

I expected you to realize how wrong everything you said was

What parts, specifically, are wrong? What is the evidence that shows that those parts are wrong? Please tell me! If I'm wrong about everything, I want to know!

Comment by Zack_M_Davis on There’s no such thing as a tree (phylogenetically) · 2021-05-03T04:54:10.840Z · LW · GW

Acknowledge that all of our categories are weird and a little arbitrary

That is not the moral! The moral is that the cluster-structure of similarities induced by phylogenetic relatedness exists in a different subspace from the cluster-structure of similarities induced by convergent evolution! (Where the math jargon "subspace" serves as a precise formalization of the idea that things can be similar in some aspects ("dimensions") while simultaneously being different in other aspects.) This shouldn't actually be surprising if you think about what the phrase "convergent evolution" means!

For more on the relevant AI/philosophy-of-language issues, see "Where to Draw the Boundaries?" and "Unnatural Categories Are Optimized for Deception".

Comment by Zack_M_Davis on The consequentialist case for social conservatism, or “Against Cultural Superstimuli” · 2021-04-15T20:06:36.735Z · LW · GW

actual trans people, or perverts willing to pretend to be trans if it allows them to sneak into female toilets

It gets worse: if the dominant root cause of late-onset gender dysphoria in males is actually a paraphilic sexual orientation, this is a false dichotomy! (It's not "pretending" if you sincerely believe it.)

Comment by Zack_M_Davis on The consequentialist case for social conservatism, or “Against Cultural Superstimuli” · 2021-04-15T17:51:14.516Z · LW · GW

So, I started writing an impassioned reply to this (draft got to 850 words), but I've been trying to keep my culture war efforts off this website (except for the Bayesian philosophy-of-language sub-campaign that's genuinely on-topic), so I probably shouldn't take the bait. (If nothing else, it's not a good use of my time when I have lots of other things to write for my topic-specific blog.)

If I can briefly say one thing without getting dragged into a larger fight, I would like to note that aggressively encouraging people to consider whether they might be trans is potentially harmful if the popular theory of what "trans" is, is actually false; even if you're a liberal who wants people to have the freedom to decide how to live their lives unencumbered by oppressive traditions, people might make worse decisions in an environment full of ideologically-fueled misinformation. (I consider trans activism to have been extremely harmful to me and people like me on this account.)

Comment by Zack_M_Davis on A Brief Review of Current and Near-Future Methods of Genetic Engineering · 2021-04-13T18:15:00.346Z · LW · GW

The effective altruist case for regime change??

Comment by Zack_M_Davis on Why We Launched LessWrong.SubStack · 2021-04-01T16:39:29.790Z · LW · GW

Has anyone tried buying a paid subscription? I would assume the payment attempt just fails unless your credit card has a limit over $60,000, but I'm scared to try it.

Comment by Zack_M_Davis on On future people, looking back at 21st century longtermism · 2021-03-23T01:21:21.336Z · LW · GW

I imagine them going: "Whoa. Basically all of history, the whole thing, all of everything, almost didn't happen."

But this kind of thinking is already obsolete in a many-worlds picture. It won't be that it "almost" didn't happen; it's that it mostly didn't happen. (The future will have the knowledge and compute to say what the distribution of outcomes was for a specified equivalence class of Earth-analogues across the multiverse.)

Comment by Zack_M_Davis on Unnatural Categories Are Optimized for Deception · 2021-03-20T01:03:19.862Z · LW · GW

This gave me a blog story idea!

Comment by Zack_M_Davis on Viliam's Shortform · 2021-03-19T23:39:37.853Z · LW · GW

YouTube lets me watch the video (even while logged out). Is it a region thing?? (I'm in California, USA). Anyway, the video depicts

dirt, branches, animals, &c. getting in Rapunzel's hair as it drags along the ground in the scene when she's frolicking after having left the tower for the first time, while Flynn Rider offers disparaging commentary for a minute, before declaring, "Okay, this is getting weird; I'm just gonna go."

If you want to know how it really ends, check out the sequel series!

Comment by Zack_M_Davis on Unnatural Categories Are Optimized for Deception · 2021-03-19T21:22:08.227Z · LW · GW

So, I like this, but I'm still not sure I understand where features come from.

Say I'm an AI, and I've observed a bunch of sensor data that I'm representing internally as the points (6.94, 3.96), (1.44, -2.83), (5.04, 1.1), (0.07, -1.42), (-2.61, -0.21), (-2.33, 3.36), (-2.91, 2.43), (0.11, 0.76), (3.2, 1.32), (-0.43, -2.67).

The part where I look at this data and say, "Hey, these datapoints become approximately conditionally independent if I assume they were generated by a multivariate normal with mean (2, -1), and covariance matrix [[16, 0], [0, 9]][1]; let me allocate a new concept for that!" makes sense. (In the real world, I don't know how to write a program to do this offhand, but I know how to find what textbook chapters to read to tell me how.)
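(A minimal sketch of what I have in mind for the "allocate a new concept" step, assuming nothing fancier than the maximum-likelihood estimators and scipy.stats.multivariate_normal; the fitted-vs.-hypothesized comparison is just my own illustration, not anything from a textbook:)

    import numpy as np
    from scipy import stats

    # The ten observed 2-tuples from above.
    data = np.array([
        (6.94, 3.96), (1.44, -2.83), (5.04, 1.1), (0.07, -1.42), (-2.61, -0.21),
        (-2.33, 3.36), (-2.91, 2.43), (0.11, 0.76), (3.2, 1.32), (-0.43, -2.67),
    ])

    # "Allocate a new concept": fit a multivariate normal by maximum likelihood
    # (sample mean and sample covariance) ...
    fitted = stats.multivariate_normal(
        mean=data.mean(axis=0), cov=np.cov(data, rowvar=False)
    )

    # ... and compare it to the hypothesized generator with mean (2, -1) and
    # covariance [[16, 0], [0, 9]].  Higher total log-density means a better fit.
    hypothesized = stats.multivariate_normal(mean=[2, -1], cov=[[16, 0], [0, 9]])
    print(fitted.logpdf(data).sum(), hypothesized.logpdf(data).sum())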

But what about the part where my sensor data came to me already pre-processed into the list of 2-tuples?—how do I learn that? Is it just, like, whatever transformations of a big buffer of camera pixels let me find conditional independence patterns probably correspond to regularities in the real world? Is it "that easy"??


  1. In the real world, I got those numbers from the Python expression ', '.join(str(d) for d in [(round(normal(2, 4), 2), round(normal(-1, 3), 2)) for _ in range(10)]) (using scipy.random.normal). ↩︎

Comment by Zack_M_Davis on Unnatural Categories Are Optimized for Deception · 2021-03-19T18:18:35.800Z · LW · GW

(Thinking out loud about how my categorization thing will end up relating to your abstraction thing ...)

200-word recap of my thing: I've been relying on our standard configuration space metaphor, talking about running some "neutral" clustering algorithm on some choice of subspace (which is "value-laden" in the sense that what features you care about predicting depends on your values). This lets me explain how to think about dolphins: they simultaneously cluster with fish in one subspace, but also cluster with other mammals in a different subspace, no contradiction there. It also lets me explain what's wrong with a fake promotion to "Vice President of Sorting": the "what business cards say" dimension is a very "thin" subspace; if it doesn't cluster with anything else, then there's no reason to care. As my measurement of what makes a cluster "good", I'm using the squared error, which is pretty "standard"—that's basically what, say, k-means clustering is doing—but also pretty ad hoc: I don't have a proof of why squared error and only squared error is the right calculation to be doing given some simple desiderata, and it probably isn't. (In contrast, we can prove that if you want a monotonic, nonnegative, additive measure of information, you end up with entropy: the only free choice is the base of the logarithm.)

What I'm hearing from the parent and your reply to my comment on "... Ad Hoc Mathematical Definitions?": talking about looking for clusters in some pre-chosen subspace of features is getting the actual AI challenge backwards. There are no pre-existing features in the territory; rather, conditional-independence structure in the territory is what lets us construct features such that there are clusters. Saying that we want categories that cluster in a "thick" subspace that covers many dimensions is like saying we want to measure information with "a bunch of functions like sin(Y), &c., and require that those also be uncorrelated": it probably works, but there has to be some deeper principle that explains why most of the dimensions and ad hoc information measures agree, why we can construct a "thick" subspace.

To explain why "squiggly", "gerrymandered" categories are bad, I said that if you needed to make a decision that depended on how big an integer is, categorizing by parity would be bad: the squared-error score quantifies the fact that 2 is more similar to 3 than to 12342. But notice that the choice of feature (the decision quality depending on magnitude, not parity) is doing all the work: 2 is more similar to 12342 than to 3 in the mod-2 quotient space!
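(To make the "doing all the work" point concrete with toy numbers—the two distance functions here are my own stand-ins, not anything canonical:)

    # Under a magnitude metric, 2 is close to 3 and far from 12342; under a
    # parity ("mod-2 quotient") metric, it's the other way around.
    def magnitude_distance(a, b):
        return abs(a - b)

    def parity_distance(a, b):
        return 0 if a % 2 == b % 2 else 1

    print(magnitude_distance(2, 3), magnitude_distance(2, 12342))  # 1 12340
    print(parity_distance(2, 3), parity_distance(2, 12342))        # 1 0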

So maybe the exact measure of "closeness" in the space (squared error, or whatever) is a red herring, an uninteresting part of the problem?—like the choice of logarithm in the definition of entropy. We know that there isn't any principled reason why base 2 or base e is better than any others. It's just that we're talking about how uncertainty relates to information, so if we use our standard representation of uncertainty as probabilities from 0 to 1 under which independent events multiply, then we have a homomorphism from multiplication (of probability) to addition (of information), which means you have to pick a base for the logarithm if you want to work with concrete numbers instead of abstract nonsense.
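(Spelled out—writing I_b(p) for the surprisal of probability p in base b, which is just my notation here: the multiplication-to-addition homomorphism holds for every base, and switching bases is only an overall rescaling.)

    I_b(p) := -\log_b p
    \quad\Longrightarrow\quad
    I_b(p_1 p_2) = I_b(p_1) + I_b(p_2),
    \qquad
    I_{b'}(p) = \frac{I_b(p)}{\log_b b'}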

If this is a good analogy, then we're looking for some sort of deeper theorem about "closeness" and conditional independence "and stuff" that explains why the configuration space metaphor works—after which we'll be able to show that the choice of metric on the "space" will be knowably arbitrary??

Comment by Zack_M_Davis on What's So Bad About Ad-Hoc Mathematical Definitions? · 2021-03-16T06:58:06.088Z · LW · GW

This is related to something I never quite figured out in my cognitive-function-of-categorization quest. How do we quantify how good a category is at "carving reality at the joints"?

Your first guess would be "mutual information between the category-label and the features you care about" (as suggested in the Job parable in April 2019's "Where to Draw the Boundaries?"), but that actually turns out to be wrong, because information theory has no way to give you "partial credit" for getting close to the right answer, which we want. Learning whether a number between 1 and 10 inclusive is even or odd gives you the same amount of information (1 bit) as learning whether it's over or under 5½, but if you need to make a decision whose goodness depends continuously on the magnitude of the number, then the high/low category system is useful and the even/odd system is not: we care about putting probability-mass "close" to the right answer, not just assigning more probability to the exact answer.
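(A minimal check of the partial-credit point, assuming a uniform prior on 1–10 and that each category predicts via the mean of its members; the numbers are mine:)

    import numpy as np

    numbers = np.arange(1, 11)  # uniform over 1..10

    def expected_squared_error(categories):
        # Predict each number by the mean of its (equally-sized) category.
        return np.mean(
            [np.mean((numbers[c] - numbers[c].mean()) ** 2) for c in categories]
        )

    even_odd = [numbers % 2 == 0, numbers % 2 == 1]
    high_low = [numbers > 5.5, numbers < 5.5]

    # Both category systems are 5/5 splits, so both convey exactly 1 bit about
    # the number—but high/low puts probability-mass much closer to the answer.
    print(expected_squared_error(even_odd))  # 8.0
    print(expected_squared_error(high_low))  # 2.0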

In January 2021's "Unnatural Categories Are Optimized for Deception", I ended up going with "minimize expected squared error (given some metric on the space of features you care about)", which seems to work, but I didn't have a principled justification for that choice, other than that it solves my partial-credit problem and is traditional. (Why not the absolute error? Why not exponentiate this feature and then, &c.?)

Another possibility might have been to do something with the Wasserstein metric, which reportedly fixes the problem of information theory not being able to award "partial credit". (The logarithmic score is the special case of the Kullback–Leibler divergence when the first distribution assigns Probability One to the actual answer, so if there's some sense in which Wasserstein generalizes Kullback–Leibler for partial credit, then maybe that's what I want.)
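(An illustrative toy comparison with scipy.stats.wasserstein_distance, which handles the one-dimensional case; the point-mass setup is mine:)

    from scipy.stats import wasserstein_distance

    # The actual answer is 3.  Predictor A puts all its mass on 4; predictor B, on 10.
    # The logarithmic score can't tell them apart (both assign ~zero probability to
    # the exact answer), but the earth-mover distance gives A credit for being close.
    print(wasserstein_distance([3], [4]))   # 1.0
    print(wasserstein_distance([3], [10]))  # 7.0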

My intuition doesn't seem adequate to determine which formalization (or some other one entirely) captures the true nature of category-goodness, to which other ideas are a mere proxy.

Comment by Zack_M_Davis on Trapped Priors As A Basic Problem Of Rationality · 2021-03-13T04:43:20.538Z · LW · GW

Maybe it's unfortunate that the same word is overloaded to cover "prior probability" (e.g., probability 0.2 that dogs are bad), and "prior information" in the sense of "a mathematical object that represents all of your starting information plus the way you learn from experience."

Comment by Zack_M_Davis on Where does the phrase "central example" come from? · 2021-03-12T06:21:20.422Z · LW · GW

jinx

Comment by Zack_M_Davis on Where does the phrase "central example" come from? · 2021-03-12T06:20:29.139Z · LW · GW

Implied by "the noncentral fallacy"? (I'm surprised at the search engine results (Google, DuckDuckGo); I didn't realize this was a Less Wrong-ism.)

Comment by Zack_M_Davis on Defending the non-central fallacy · 2021-03-10T06:11:14.083Z · LW · GW

And a more natural clustering would reflect that.

What subspace are you doing your clustering in, though? Both the pro-capital-punishment and anti-capital-punishment side should be able to agree that capital punishment and "central" murder are similar in the "intentional killing of a human" aspects, but differ in the "motives and decision mechanism of the killer" aspects (where the "central" murderer is an individual, rather than a judicial institution). Each side has an incentive to try to bind the murder codeword in their shared language to a subspace that makes their own side's preferred policy look natural.

Comment by Zack_M_Davis on Unconvenient consequences of the logic behind the second law of thermodynamics · 2021-03-07T19:34:01.059Z · LW · GW

if entropy is decreasing maybe your memory is just working "backwards"

I think the key to the puzzle is likely to be here: there's likely to be some principled reason why agents embedded in physics will perceive the low-entropy time direction as "the past", such that it's not meaningful to ask which way is "really" "backwards".

Comment by Zack_M_Davis on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2021-03-03T04:12:07.346Z · LW · GW

Cade Metz hadn't had this much trouble with a story in years. Professional journalists don't get writer's block! Ms. Tam had rejected his original draft focused on the subject's early warnings of the pandemic. Her feedback hadn't been very specific ... but then, it didn't need to be.

For contingent reasons, the reporting for this piece had stretched out over months. He had tons of notes. It shouldn't be hard to come up with a story that would meet Ms. Tam's approval.

The deadline loomed. Alright, well, one sentence at a time. He wrote:

In one post, he aligned himself with Charles Murray, who proposed a link between race and I.Q. in "The Bell Curve."

Metz asked himself: Is this statement actually and literally true?

Yes! The subject had aligned himself with Charles Murray in one post: "The only public figure I can think of in the southeast quadrant with me is Charles Murray."

In another, he pointed out that Mr. Murray believes Black people "are genetically less intelligent than white people."

Metz asked himself: Is this statement actually and literally true?

Yes! The subject had pointed that out in another post: "Consider Charles Murray saying that he believes black people are genetically less intelligent than white people."

Having gotten started, the rest of the story came out easily. Why had he been so reluctant to write the new draft, as if in fear of some state of sin? This was his profession—to seek out all the news that's fit to print, and bring it to the light of the world!

For that was his mastery.

Comment by Zack_M_Davis on Anna and Oliver discuss Children and X-Risk · 2021-02-27T20:35:24.596Z · LW · GW

being the-sort-of-person-who-chooses-to-have-kids

What years were most of these biographies about? Sexual marketplace and family dynamics have changed a lot since, say, 1970ish. (Such that a lot of people today who don't think of themselves as the-sort-of-person-who-chooses-to-have-kids would absolutely be married with children had someone with their genotype grown up in an earlier generation.)

Comment by Zack_M_Davis on Anna and Oliver discuss Children and X-Risk · 2021-02-27T19:32:31.906Z · LW · GW

Two complementary pro-natalist considerations I'd like to see discussed:

  • Eugenics! It doesn't seem like there are any technical barriers to embryo selection for IQ today. If longtermist parents disproportionately become early adopters of this tech in the 2020s, could that help their children be a disproportionate share of up-and-coming AI researchers in the 2040s?

  • Escaping our Society's memetic collapse. We are the children of a memetic brood-parasite strategy. It's a lot easier to recruit new longtermists out of universal culture than it is from Mormonism, but universal culture triumphed not because its adherents had more children than everyone else, but by capturing the school and media institutions that socialize everyone else's children: horizontal meme transmission rather than vertical. If social-media-era universal culture is no longer as conducive to Reason as its 20th-century strain, maybe we need to switch to a more Mormon-like strategy (homeschooling, &c.) if we want there to be top reasoners in the 2040s.

Comment by Zack_M_Davis on Above the Narrative · 2021-02-26T08:26:22.270Z · LW · GW

Consider adapting this into a top-level post? I anticipate wanting to link to it (specifically for the "smaller audiences offer more slack" moral).

Comment by Zack_M_Davis on Google’s Ethical AI team and AI Safety · 2021-02-22T02:10:29.649Z · LW · GW

people are afraid to engage in speech that will be interpreted as political [...] nobody is actually making statements about my model of alignment deployment [...] try to present the model at a further disconnect from the specific events and actors involved

This seems pretty unfortunate insofar as some genuinely relevant real-world details might not survive the obfuscation of premature abstraction.

Example of such an empirical consideration (relevant to the "have some members that keep up with AI Safety research" point in your hopeful plan): how much overlap and cultural compatibility is there between AI-ethics-researchers-as-exemplified-by-Timnit-Gebru and AI-safety-researchers-as-exemplified-by-Paul-Christiano? (By all rights, there should be overlap and compatibility, because the skills you need to prevent your credit-score AI from being racist (with respect to whatever the correct technical reduction of racism turns out to be) should be a strict subset of the skills you need to prevent your AGI from destroying all value in the universe (with respect to whatever the correct technical reduction of value turns out to be).)

Have you tried asking people to comment privately?

Comment by Zack_M_Davis on “PR” is corrosive; “reputation” is not. · 2021-02-17T08:05:47.579Z · LW · GW

Thanks for the detailed reply! I changed my mind; this is kind of interesting.

This is not about "tone policing." This is about the fundamental thrust of the engagement. "You're wrong, and I'mm'a prove it!" vs. "I don't think that's right, can we talk about why?"

Can you say more about why this distinction seems fundamental to you? In my culture, these seem pretty similar except for, well, tone?

"You're wrong" and "I don't think that's right" are expressing the same information (the thing you said is not true), but the former names the speaker rather than what was spoken ("you" vs. "that"), and the latter uses the idiom of talking about the map rather than the territory ("I think X" rather than "X") to indicate uncertainty. The semantics of "I'mm'a prove it!" and "Can we talk about why?" differ more, but both indicate that a criticism is about to be presented.

In my culture, "You're wrong, and I'mm'a prove it!" indicates that the critic is both confident in the criticism and passionate about pursuing it, whereas "I don't think that's right, can we talk about why?" indicates less confidence and less interest.

In my culture, the difference may influence whether the first speaker chooses to counterreply, because a speaker who ignores a confident, passionate, correct criticism may lose a small amount of status. However, the confident and passionate register is a high variance strategy that tends to be used infrequently, because a confident, passionate critic whose criticism is wrong loses a lot of status.

the exact same information cooperatively/collaboratively

Can you say more about what the word collaborative means to you in this context? I asked a question about this once!

implied claim that your strategy is motivated by a sober weighing of its costs and benefits, and you're being adversarial because you genuinely believe that's the best way forward [...] you tell yourself that it's virtuous so that you don't have to compare-contrast the successfulness of your strategy with the successfulness of the Erics and the Julias and the Benyas

Oh, it's definitely not a sober weighing of costs and benefits! Probably more like a reinforcement-learned strategy?—something that's been working well for me in my ecological context, that might not generalize to someone with a different personality in a different social environment. Basically, I'm positing that Eric and Julia and Benya are playing a different game with a harsher penalty for alienating people. If someone isn't interested in trying to change a trait in themselves, are they therefore claiming it a "virtue"? Ambiguous!

I defy you to say, with a straight face, "a supermajority of rationalists

Hold on. I categorically reject the epistemic authority of a supermajority of so-called "rationalists". I care about what's actually true, not what so-called "rationalists" think.

To be sure, there's lots of specific people in the "rationalist"-branded cluster of the social graph whose sanity or specific domain knowledge I trust a lot. But they each have to earn that individually; the signal of self-identification or social-graph-affiliation with the "rationalist" brand name is worth—maybe not nothing, but certainly less than, I don't know, graduating from the University of Chicago.

the hypothesis which best explains my first response

Well, my theory is that the illegible pattern-matching faculties in my brain returned a strong match between your comment, and what I claim is a very common and very pernicious instance of dark side epistemology where people evince a haughty, nearly ideological insistence that all precise generalizations about humans are false, which looks optimized for protecting people's false stories about themselves, and that I in particular am extremely sensitive to noticing this pattern and attacking it at every opportunity as part of the particular political project I've been focused on for the last four years.

You can't rely on people just magically knowing that of course you object to EpicNamer, and that your relative expenditure of words is unrepresentative of your true objections.

EpicNamer's comment seems bad (the -7 karma is unsurprising), but I don't feel strongly about it, because, like Oli, I don't understand it. ("[A]t the expense of A"? What is A?) In contrast, I object really strongly to the (perceived) all-precise-generalizations-about-humans-are-false pattern. So, I think my word expenditure is representative of my concerns.

it's disingenuous and sneaky to act like what's being requested here is that you "obfuscate your thoughts through a gentleness filter."

In retrospect, I actually think the (algorithmically) disingenuous and sneaky part was "actually helps anyone", which assumes more altruism or shared interests than may actually be present. (I want to make positive contributions to the forum, but the specific hopefully-positive-with-respect-to-the-forum-norms contributions I make are realistically going to be optimized to achieve my objectives, which may not coincide with minimizing exhaustingness to others.) Sorry!

Comment by Zack_M_Davis on “PR” is corrosive; “reputation” is not. · 2021-02-15T23:41:30.999Z · LW · GW

I also object to "would be very bad" in the subjunctive ... I assert that you ARE introducing this burden, with many of your comments, the above seeming not at all atypical for a Zack Davis clapback. Smacks of "I apologize IF I offended anybody," when one clearly did offend.

So, I think it's important to notice that the bargaining problem here really is two-sided: maybe the one giving offense should be nicer, but maybe the one taking offense shouldn't have taken it personally?

I guess I just don't believe that thoughts end up growing better than they would otherwise by being nurtured and midwifed? Thoughts grow better by being intelligently attacked. Criticism that persistently "plays dumb" with lame "gotcha"s in order to appear to land attacks in front of an undiscriminating audience is bad, but I think it's not hard to distinguish between persistently playing dumb, and "clapback that pointedly takes issue with the words that were actually typed, in a context that leaves open the opportunity for the speaker to use more words/effort to write something more precise, but without the critic being obligated to proactively do that work for them"?

We might actually have an intellectually substantive disagreement about priors on human variation! Exploring that line of discussion is potentially interesting! In contrast, tone-policing replies about not being sufficiently nurturing is ... boring? I like you, Duncan! You know I like you! I just ... don't see how obfuscating my thoughts through a gentleness filter actually helps anyone?

more willing to believe that your nitpicking was principled if you'd spared any of it for the top commenter

Well, I suppose it's not "principled" in the sense that my probability of doing it varies with things other than the severity of the "infraction". If it's not realistic for me to not engage in some form of "selective enforcement" (I'm a talking monkey that types blog comments when I feel motivated, not an AI neutrally applying fixed rules over all comments), I can at least try to be transparent about what selection algorithm I'm using?

I'm more motivated to reply to Duncan Sabien (former CfAR instructor, current MIRI employee) than I am to EpicNamer27098 (1 post, 17 comments, 20 karma, joined December 2020). (That's a compliment! I'm saying you matter!)

I'm more motivated to reply to appeals to assumed-to-exist individual variation, than the baseline average of comments that don't do that, because that's a specific pet peeve of mine lately for psychological reasons beyond the scope of this thread.

I'm more motivated to reply to comments that seem to be defending "even the wonderful cream-of-the-crop rationalists" than the baseline average of comments that don't do that, for psychological reasons beyond the scope of this thread.

Comment by Zack_M_Davis on “PR” is corrosive; “reputation” is not. · 2021-02-15T22:52:41.004Z · LW · GW

there are humans who do not laugh [...] humans who do not shiver when cold

Are there? I don't know! Part of where my comment was coming from is that I've grown wary of appeals to individual variation that are assumed to exist without specific evidence. I could easily believe, with specific evidence, that there's some specific, documented medical abnormality such that some people never develop the species-typical shiver, laugh, cry, &c. responses. (Granted, I am relying on the unstated precondition that, say, 2-week-old embryos don't count.) If you show me the Wikipedia page about such a specific, documented condition, I'll believe it. But if I haven't seen the specific Wikipedia page, should I have a prior that every variation that's easy to imagine, actually gets realized? I'm skeptical! The word human (referring to a specific biological lineage with a specific design specified in ~3·10⁹ bases of the specific molecule DNA) is already pointing to a very narrow and specific set of configurations (relative to the space of all possible ways to arrange 10²⁷ atoms); by all rights, there should be lots of actually-literally universal generalizations to be made.

Comment by Zack_M_Davis on “PR” is corrosive; “reputation” is not. · 2021-02-15T20:46:10.731Z · LW · GW

Oh. I agree that introducing a burden on saying anything at all would be very bad. I thought I was trying to introduce a burden on the fake precision of using the phrase "many orders of magnitude" without being able to supply numbers that are more than 100 times larger than other numbers. I don't think I would have bothered to comment if the great-grandparent had said "a sign that you're wrong" rather than "a sign that you are many orders of magnitude more likely to be wrong than right".

The first paragraph was written from an adversarial perspective, but, in my culture, the parenthetical and "I can empathize with ..." closing paragraph were enough to display overall prosocial and cooperative intent on my part? An opposing lawyer's nitpicking in the courtroom is "adversarial", but the existence of adversarial courts (where opposing lawyers have a duty to nitpick) is "prosocial"; I expect good lawyers to be able to go out for friendly beers after the trial, secure in the knowledge that uncharity while court is in session is "part of the game", and I expect the same layered structure to be comprehensible within a single Less Wrong comment?

Comment by Zack_M_Davis on “PR” is corrosive; “reputation” is not. · 2021-02-15T19:47:22.389Z · LW · GW

if you find yourself typing a sentence about some behavioral trait being universal among humans with that degree of absolute confidence, you can take this as a sign that you are many orders of magnitude more likely to be wrong than right.

"Many orders of magnitude"? (I assume that means we're working in odds rather than probabilities; you can't get more than two orders of magnitude more probability than 0.01.) So if I start listing off candidate behavioral universals like "All humans shiver when cold", "All humans laugh sometimes", "All humans tell stories", "All humans sacrifice honor for PR when the stakes are sufficiently high", you're more than 1000-to-1 against on all of them? Can we bet on this??

(Yes, you were writing casually and hyperbolically rather than precisely, but you can't expect to do that on lesswrong.com and not be called on it, any more than I could expect to do so on your Facebook wall.)

I empathize with the intuition that "Everyone without fail, even [...]" sounds like an extreme claim, but when you think about it, our world is actually sufficiently small that it's not hard to come up with conditions that no one matches: a pool of 7.6·10⁹ humans gets exhausted by less than 33 bits of weirdness.
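(The arithmetic behind the "less than 33 bits":)

    import math

    # Each independent 50/50 trait halves the pool; 33 of them more than cover Earth.
    print(math.log2(7.6e9))  # ≈ 32.8 bits
    print(2 ** 33)           # 8,589,934,592 > 7.6 billion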

Comment by Zack_M_Davis on Making Vaccine · 2021-02-12T04:09:16.335Z · LW · GW

I think another John Wentworth post is applicable here. It's not hard to invent reasons why any given post might increase existential risk by some amount. (What if your comment encourages pro-censorship attitudes that hamper the collective intellectual competence we need to reduce existential risk?) In order to not function as trolling, you need to present a case for the risk being plausible, not just possible.

Comment by Zack_M_Davis on Open & Welcome Thread – February 2021 · 2021-02-09T21:20:02.389Z · LW · GW

Archived. (My guess is that no one bothered to preserve all content/links from the old Singularity Institute website when moving to the new post-MIRI-rebranding website; your intelligence.org link was presumably the product of a search-and-replace operation and probably never worked.)

Comment by Zack_M_Davis on Preface · 2021-02-07T21:48:55.484Z · LW · GW

Does this really have 534 legitimate votes (almost 400 more than the next-highest karma post dated in 2015), or was there a bug in the voting system? I could see "Preface" getting the most exposure (and therefore upvote "surface area") from people following links from an AI to Zombies ebook starting from the beginning, but I'd be surprised if that alone could account for the massive karma here.

Comment by Zack_M_Davis on 2019 Review: Voting Results! · 2021-02-01T22:43:07.800Z · LW · GW

If it's easy, any chance we could get a variance (or standard deviation) column on the spreadsheet? (Quadratic voting makes it expensive to create outliers anyway, so throwing away the 50 percent most passionate of voters (as the interquartile range does) is discarding a lot of the actual dispersion signal.)

Comment by Zack_M_Davis on Open & Welcome Thread - January 2021 · 2021-01-19T17:42:09.440Z · LW · GW

You may be thinking of Crystal Society? Best wishes, Less Wrong Reference Desk

Comment by Zack_M_Davis on Richard Ngo's Shortform · 2021-01-14T19:17:32.315Z · LW · GW

If we can quantify how good a theory is at making accurate predictions (or rather, quantify a combination of accuracy and simplicity), that gives us a sense in which some theories are "better" (less wrong) than others, without needing theories to be "true".

Comment by Zack_M_Davis on Richard Ngo's Shortform · 2021-01-14T01:25:21.325Z · LW · GW

See the section about scoring rules in the Technical Explanation.

Comment by Zack_M_Davis on Where to Draw the Boundaries? · 2021-01-12T01:11:42.912Z · LW · GW

Have you (Zack) previously noted something somewhere about "that's coordination"... and... somehow wrapping that around to "but words are just for prediction anyway?".

Yes! You commented on it!

Comment by Zack_M_Davis on Where to Draw the Boundaries? · 2021-01-08T21:02:18.602Z · LW · GW

(Self-review.)

Argument for significance: earlier comment

Sequel: "Unnatural Categories Are Optimized for Deception"

Comment by Zack_M_Davis on Unnatural Categories Are Optimized for Deception · 2021-01-08T20:46:27.538Z · LW · GW

Mods: I'm confused that this isn't showing up in "Latest" even when I set "Personal Blog" to "Required" in a Private-mode window?! Could the algorithm be using "created by" date rather than "published on" date (an earlier version was sitting as a draft for a few months so I could preview some LaTeX), or ...?!

Comment by Zack_M_Davis on Unnatural Categories Are Optimized for Deception · 2021-01-08T20:37:22.539Z · LW · GW

Author's Meta Note

(I continue to maintain that this is fun and basic hidden-Bayesian-structure-of-language-and-cognition stuff that shouldn't be "political", but—if we need it—the "Containment Thread on the Motivation and Political Context for My Philosophy of Language Agenda" is now available for talking about the elephant in the room.)

(I intend to eventually reply to all substantive critical comments on this post and the containment thread, but I might be very slow to respond due to life events and priorities outside of this website. Your patience is deeply appreciated.)

Comment by Zack_M_Davis on Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists · 2021-01-02T22:19:33.526Z · LW · GW

(Self-review.) I've edited the post to include the calculation as footnote 10.

The post doesn't emphasize this angle, but this is also more-or-less my abstract story for the classic puzzle of why disagreement is so prevalent, which, from a Bayesian-wannabe rather than a human perspective, should be shocking: there's only one reality, so honest people should get the same answers. How can it simultaneously be the case that disagreement is ubiquitous, but people usually aren't outright lying? Explanation: the "dishonesty" is mostly in the form of motivatedly asking different questions.

Possible future work: varying the model assumptions might yield some more detailed morals. I never got around to trying the diminishing-marginal-relevance variation suggested in footnote 8. Another variation I didn't get around to trying would be for the importance of a fact to each coalition's narrative to vary: maybe there are a few "sacred cows" for which the social cost of challenging is huge (as opposed to just having to keep one's ratio of off-narrative reports in line).

Prior work: So, I happened to learn about the filtered-evidence problem from the Sequences, but of course, there's a big statistics literature about learning from missing data that I learned a little bit about in 2020 while perusing Ch. 19 of Probabilistic Graphical Models: Principles and Techniques by Daphne Koller and the other guy.

Comment by Zack_M_Davis on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2020-12-30T05:39:56.354Z · LW · GW

(Self-review.) I oppose including this post in a Best-of-2019 collection. I stand by what I wrote, but, as with "Relevance Norms", this was a "defensive" post; it exists as a reaction to "Meta-Honesty"'s candidacy in the 2018 Review, rather than trying to advance new material on its own terms.

The analogy between patch-resistance in AI alignment and humans finding ways to dodge the spirit of deontological rules is very important, but not enough to carry the entire post.

A standalone canon-potential explanation of why I think we need a broader conception of honesty than avoiding individually false statements would look more like "Algorithms of Deception" (although that post didn't do so great karma-wise; I'm not sure whether that's because people don't want to read code, because it was slow to get Frontpaged (as I recall), or because it's bad for some other reason).

I intend to reply to Fiddler's review, but likely not in a timely manner.

Comment by Zack_M_Davis on Relevance Norms; Or, Gricean Implicature Queers the Decoupling/Contextualizing Binary · 2020-12-30T05:14:33.276Z · LW · GW

(Self-review.) I oppose including this post in a Best-of-2019 collection. I stand by what I wrote, but it's not potential "canon" material, because this was a "defensive" post for the 2018 Review: if the "contextualizing vs. decoupling" idea hadn't been as popular and well-received as it was, there would be no reason for this post to exist.

A standalone Less Wrong "house brand" explanation of Gricean implicature (in terms of Bayesian signaling games, probably?) could be a useful reference post, but that's not what this is.

Comment by Zack_M_Davis on Where to Draw the Boundaries? · 2020-12-22T23:17:45.141Z · LW · GW

What specific other thing are you doing besides prediction? If you can give me a specific example, I think I should be able to reply with either (a) "that's a prediction", (b) "that's coordination", (c) "here's an explanation of why that's deception/wireheading in the technical sense I've described", (d) "that's a self-fulfilling prophecy", or (e) "whoops, looks like my philosophical thesis isn't quite right and I need to do some more thinking; thanks TAG!!".

(I should be able to reply eventually; no promises on turnaround time because I'm coping with the aftermath of a crisis that I'm no longer involved in, but for which I have both a moral responsibility and selfish interest to reflect and repent on my role in.)

Comment by Zack_M_Davis on Raemon's Shortform · 2020-12-14T02:03:40.434Z · LW · GW

The direct approach: "I'm curious [if/why ...]" → "Tell me [if/why ...]"

Comment by Zack_M_Davis on Scoring 2020 U.S. Presidential Election Predictions · 2020-12-14T01:20:32.969Z · LW · GW

Thanks; you are right and the thing I originally wrote was wrong. For posterity, I added a disclaimer (sixth paragraph) that credits you and the other person who pointed this out.

Comment by Zack_M_Davis on Where to Draw the Boundaries? · 2020-12-13T19:35:54.853Z · LW · GW

But that isn't relevant to what you are saying, because you are making a normative point: you are saying some concepts are wrong.

You know, I think I agree that the reliance on normativity intuitions is a weakness of the original post as written in April 2019. I've thought a lot more in the intervening 20 months, and have been working on a sequel that I hope to finish very soon (working title "Unnatural Categories Are Optimized for Deception", current draft sitting at 8,650 words) that I think does a much better job at reducing that black box. (That is, I think the original normative claim is basically "right", but I now have a deeper understanding of what that's even supposed to mean.)

In summary: when I say that some concepts are wrong, or more wrong than others, I just mean that some concepts are worse than others at making probabilistic predictions. We can formalize this with specific calculations in simple examples (like the Foos clustered at [1, 2, 3] in ℝ³ in the original post) and be confident that the underlying mathematical principles apply to the real world, even if the real world is usually too complicated for us to do explicit calculations for.

This is most straightforward in cases where the causal interaction between "the map" and "the territory" goes only in the one direction "territory → map", and where we only have to consider one agent's map. As we relax those simplifying assumptions, the theory has to get more complicated.

First complication: if there are multiple agents with aligned preferences but limited ability to communicate, then they potentially face coordination problems: that's what "Schelling Categories" is about.

Second complication: if there are multiple agents whose preferences aren't aligned, then they might have an incentive to deceive each other, making the other agent have a worse map in a way that will trick it into making decisions that benefit the first agent. (Or, a poorly-designed agent might have an incentive to deceive itself, "wireheading" on making the map look good, instead of using a map that reflects the territory to formulate plans that make the territory better.) This is what my forthcoming sequel post is about.

Third complication: if the map can affect the territory, you can have self-fulfilling (or partially-self-fulfilling, or self-negating) prophecies. I'm not sure I understand the theory of this yet.

The sense in which I deny that scientifically inaccurate maps can have compensatory kinds of usefulness, is that I think they have to fall into the second case: the apparent usefulness has to derive from deception (or wireheading). Why else would you want a model/map that makes worse predictions rather than better predictions? (Note: self-fulfilling prophecies aren't inaccurate!)

You're one of them.

Well, yes. I mean, I think I'm fighting for more accurate maps, but that's (trivially) still fighting! I don't doubt that the feeling is mutual.

I'm reminded of discussions where one person argues that a shared interest group (for concreteness, let's say, a chess club) should remain politically neutral (as opposed to, say, issuing a collective condemnation of puppy-kicking), to which someone responds that everything is political and that therefore neutrality is just supporting the status quo (in which some number of puppies per day will continue to be kicked). There's a sense in which it's true that everything is political! (As it is written, refusing to act is like refusing to allow time to pass.)

I think a better counter-counter reply is not to repeat that Chess Club should be "neutral" (because I don't know what that means, either), but rather to contend that it's not Chess Club's job to save the puppies of the world: we can save more puppies with a division of labor in which Chess Club focuses on Society's chess needs, and an Anti-Puppy-Kicking League focuses on Society's interest in saving puppies. (And if you think Society should care more about puppies and less about chess, you should want to defund Chess Club rather than having it issue collective statements.)

Similarly, but even more fundamentally, it's not the map's job to provide compensatory usefulness; the map's job is to reflect the territory. In a world where agents are using maps to make decisions, you probably can affect the territory by distorting the map for purposes that aren't about maximizing predictive accuracy! It's just really bad AI design, because by the very nature of the operation, you're sabotaging your ability to tell whether your intervention is actually making things better.

Comment by Zack_M_Davis on Rafael Harth's Shortform · 2020-12-13T04:42:57.989Z · LW · GW

(Datapoint on initial perception: at the time, I had glanced at the post, but didn't vote or comment, because I thought Steven was in the right in the precipitating discussion and the "a prediction can assign less probability-mass to the actual outcome than another but still be better" position seemed either confused or confusingly phrased to me; I would say that a good model can make a bad prediction about a particular event, but the model still has to take a hit.)

Comment by Zack_M_Davis on What determines the balance between intelligence signaling and virtue signaling? · 2020-12-13T03:57:31.224Z · LW · GW

(In September 2018, frustrated by this exact issue (albeit not in these exact terms), I impulse-bought a custom T-shirt with the slogan "BEING SMART IS MORE IMPORTANT THAN BEING GOOD". I have never worn it in public. Not sure what the moral here is.)

Comment by Zack_M_Davis on human psycholinguists: a critical appraisal · 2020-12-13T03:21:16.926Z · LW · GW

What does Nostalgebraist do? They fucking write. (Engaging and educational.)

Comment by Zack_M_Davis on Book Review: The Secret Of Our Success · 2020-12-13T01:45:43.667Z · LW · GW

This passage from Sarah Blaffer Hrdy's Mothers and Others: The Evolutionary Origins of Mutual Understanding, on unexpected functionalist explanations of childrearing practices among the Beng, presents an evocative example of the kind of cultural adaptation Henrich writes about—

In her richly textured account of "the culture of infancy" in a West African Beng village, the cultural anthropologist Alma Gottlieb describes infant care practices that initially seem puzzling. To Gottlieb, the way the Beng treat their babies seemed so nonsensical that she became convinced that their mode of childcare could be understood only within their peculiar symbolic system. Like many cultural anthropologists, she saw little point in considering evolutionary contexts or adaptive functions.

At first glance, her prejudice against adaptive explanations in this instance seems well-founded. For Beng mothers engage in some remarkably counterintuitive, maladaptive-seeming behaviors. They force babies to drink water before allowing them access to the breast. They also administer herbal enemas several times a day, and decorate their babies with protective painted symbols thought to promote health and growth, as well as to advertise tribal status or identity. Such practices, Gottlieb argues, flow from belief systems specific to the Beng, having to do with the sacredness of water and the origin of babies who enter the world reincarnated from ancestors, and can only be understood within the context of a specifically Beng worldview.

At first glance, such practices seem to defy common sense and functional explanation. How could enemas and body paint have anything to do with keeping babies healthier or enhancing their survival? [...] Symbolic decorations are not going to encourage babies to grow faster or make them healthier, and parasite- and bacteria-laden water forced down a baby's throat is likely to do the reverse, causing diarrhea. And what use are excretion-promoting enemas when the big problem in this society is malnutrition? [...]

And yet stand back and consider Beng maternal practice in terms of the universal dilemma confronting primate mothers who find themselves torn between heavy subsistence loads and the need to care for infants in the face of high rates of mortality. These are mothers who cannot possibly rear their infants without assistance from others. Next, consider the Beng in the context of a species that must have evolved as a cooperative breeder. Again and again, Gottlieb mentions the "enormous labor demands" on Beng mothers who farm full-time, chop and haul firewood, provide water, do the laundry, and prepare food using labor-intensive methods. A woman, especially an undernourished woman with several children, could not possibly manage these tasks without enlisting kin and other villagers to help her care for her infant. As it turns out, each of the seemingly useless cultural practices mentioned above also just happens to make babies more attractive to allomothers.

"Every Beng mother," Gottlieb writes, "makes great efforts to toilet-train her baby from birth so as to attract a possible [caretaker] who can be recruited to hte job without fear of being soiled. The goal is for the infant to defecate only once or twice a day, during bath time, so as never to dirty anyone between baths, especially while being carried." It is to make a baby more easily comforted by a nonlactating allomother that they are taught early—and forcibly—to be satisfied with a drink of water if no one is available to breastfeed. It is specifically to make her infant more attractive to caretakers that a mother beautifies her baby with painted symbols, for "if a baby is irresistably beautiful, someone will be eager to carry the little one for a few hours, and the mother can get her work done."