## Posts

Artificial Intelligence: A Modern Approach (4th edition) on the Alignment Problem 2020-09-17T02:23:58.869Z · score: 67 (33 votes)
Maybe Lying Can't Exist?! 2020-08-23T00:36:43.740Z · score: 46 (26 votes)
Algorithmic Intent: A Hansonian Generalized Anti-Zombie Principle 2020-07-14T06:03:17.761Z · score: 46 (15 votes)
Optimized Propaganda with Bayesian Networks: Comment on "Articulating Lay Theories Through Graphical Models" 2020-06-29T02:45:08.145Z · score: 70 (30 votes)
Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning 2020-06-07T07:52:09.143Z · score: 83 (46 votes)
Comment on "Endogenous Epistemic Factionalization" 2020-05-20T18:04:53.857Z · score: 125 (49 votes)
"Starwink" by Alicorn 2020-05-18T08:17:53.193Z · score: 40 (14 votes)
Zoom Technologies, Inc. vs. the Efficient Markets Hypothesis 2020-05-11T06:00:24.836Z · score: 69 (27 votes)
A Book Review 2020-04-28T17:43:07.729Z · score: 16 (13 votes)
Brief Response to Suspended Reason on Parallels Between Skyrms on Signaling and Yudkowsky on Language and Evidence 2020-04-16T03:44:06.940Z · score: 13 (6 votes)
Why Telling People They Don't Need Masks Backfired 2020-03-18T04:34:09.644Z · score: 29 (14 votes)
The Heckler's Veto Is Also Subject to the Unilateralist's Curse 2020-03-09T08:11:58.886Z · score: 52 (21 votes)
Relationship Outcomes Are Not Particularly Sensitive to Small Variations in Verbal Ability 2020-02-09T00:34:39.680Z · score: 17 (10 votes)
Book Review—The Origins of Unfairness: Social Categories and Cultural Evolution 2020-01-21T06:28:33.854Z · score: 30 (8 votes)
Less Wrong Poetry Corner: Walter Raleigh's "The Lie" 2020-01-04T22:22:56.820Z · score: 21 (13 votes)
Don't Double-Crux With Suicide Rock 2020-01-01T19:02:55.707Z · score: 65 (23 votes)
Speaking Truth to Power Is a Schelling Point 2019-12-30T06:12:38.637Z · score: 49 (18 votes)
Stupidity and Dishonesty Explain Each Other Away 2019-12-28T19:21:52.198Z · score: 36 (16 votes)
Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think 2019-12-27T05:09:22.546Z · score: 95 (34 votes)
Funk-tunul's Legacy; Or, The Legend of the Extortion War 2019-12-24T09:29:51.536Z · score: 13 (20 votes)
Free Speech and Triskaidekaphobic Calculators: A Reply to Hubinger on the Relevance of Public Online Discussion to Existential Risk 2019-12-21T00:49:02.862Z · score: 73 (28 votes)
A Theory of Pervasive Error 2019-11-26T07:27:12.328Z · score: 21 (7 votes)
Relevance Norms; Or, Gricean Implicature Queers the Decoupling/Contextualizing Binary 2019-11-22T06:18:59.497Z · score: 82 (24 votes)
Algorithms of Deception! 2019-10-19T18:04:17.975Z · score: 18 (7 votes)
Maybe Lying Doesn't Exist 2019-10-14T07:04:10.032Z · score: 59 (29 votes)
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists 2019-09-24T04:12:07.560Z · score: 223 (79 votes)
Schelling Categories, and Simple Membership Tests 2019-08-26T02:43:53.347Z · score: 53 (20 votes)
Diagnosis: Russell Aphasia 2019-08-06T04:43:30.359Z · score: 47 (13 votes)
Being Wrong Doesn't Mean You're Stupid and Bad (Probably) 2019-06-29T23:58:09.105Z · score: 17 (12 votes)
What does the word "collaborative" mean in the phrase "collaborative truthseeking"? 2019-06-26T05:26:42.295Z · score: 27 (7 votes)
The Univariate Fallacy 2019-06-15T21:43:14.315Z · score: 27 (11 votes)
No, it's not The Incentives—it's you 2019-06-11T07:09:16.405Z · score: 91 (32 votes)
"But It Doesn't Matter" 2019-06-01T02:06:30.624Z · score: 47 (31 votes)
Minimax Search and the Structure of Cognition! 2019-05-20T05:25:35.699Z · score: 15 (6 votes)
Where to Draw the Boundaries? 2019-04-13T21:34:30.129Z · score: 91 (40 votes)
Blegg Mode 2019-03-11T15:04:20.136Z · score: 18 (13 votes)
Change 2017-05-06T21:17:45.731Z · score: 1 (1 votes)
An Intuition on the Bayes-Structural Justification for Free Speech Norms 2017-03-09T03:15:30.674Z · score: 4 (8 votes)
Dreaming of Political Bayescraft 2017-03-06T20:41:16.658Z · score: 9 (3 votes)
Rationality Quotes January 2010 2010-01-07T09:36:05.162Z · score: 3 (6 votes)
News: Improbable Coincidence Slows LHC Repairs 2009-11-06T07:24:31.000Z · score: 7 (8 votes)

Comment by zack_m_davis on What are examples of Rationalist fable-like stories? · 2020-09-28T17:59:30.824Z · score: 3 (2 votes) · LW · GW
Comment by zack_m_davis on Blog posts as epistemic trust builders · 2020-09-28T05:13:11.517Z · score: 11 (5 votes) · LW · GW

But fortunately, I have a very high level of epistemic trust for the rationalist community.

No! Not fortunately! Speaking from personal experience, succumbing to the delusion that there is any such thing as "the rationalist community" worthy of anyone's trust has caused me an enormous amount of psychological damage, such that I'm still (still?!) not done recovering from the betrayal trauma more than three years later—and I'm probably not the only one.

(I thought I was done recovering as of (specifically) 13 September, but the fact that I still felt motivated to write a "boo 'rationalists'" comment on Friday and then went into an anxiety spiral for the next 36 hours—and the fact that I'm drafting this comment in a paper notebook when I should be spending a relaxing network-free Sunday studying math—suggest that I'm still (somehow—still?!) not done grieving. I think I'm really close, though!)

There is no authority regulating who's allowed to use the "rationalist" brand name. Trusting "the rationalist community" leaves you open to getting fucked over[1] by any bad idea that can successfully market itself to high-Openness compsci nerds living in Berkeley, California in the current year. The craft is not the community. The ideology is not the movement. Don't revere the bearer of good info. Every cause—every cause—wants to be a cult. At this point, as a guard against my earlier mistakes, I've made a habit of using the pejorative "robot cult" to refer to the social cluster, reserving "rationalist" to describe the methodology set forth in the Sequences—and really, I should probably phase out "rationalist", too. Plain rationality is already a fine word for cognitive algorithms that create and exploit map–territory correspondences—maybe it doesn't have to be an -ism.

Real trust—trust that won't predictably blow up in your face and take three and a half years to recover from—needs to be to something finer-grained than some amorphous self-recommending "community." You need to model the specific competencies of specific people and institutions, and model their incentives to tell you the truth—or to fuck with you.

(Note: people don't have to be consciously fucking with you in order for modeling them as fucking with you to be useful for compressing the length of the message needed to describe your observations. I can't speak to what the algorithms of deception feel from the inside—just that the systematic production of maps that don't reflect the territory for any reason, even mere "bias", should be enough for you to mark them as hostile.)

COVID-19 is an unusually easy case, where people's interests are, for the most part, aligned. People may use the virus as a prop in their existing political games, but at least no one is actually pro-virus. Under those circumstances, sure, trusting an amorphous cluster of smart people who read each other's blogs can legitimately be a better bet than alternative aggregators of information. As soon as you step away from the unusually easy cases—watch your step!

If you learned a lot from the Sequences, I think that's a good reason to trust what Eliezer Yudkowsky in particular says about AI in particular, even if you can't immediately follow the argument. (There's a prior that any given nonprofit claiming you should give it money in order to prevent the destruction of all value in the universe is going to just be a scam—but you see, the Sequences are very good.) That trust does not bleed over (except at a very heavy quantitative discount) to an alleged "community" of people who also trust Yudkowsky—and I don't think it bleeds over to Yudkowsky's off-the-cuff opinions on (let's say, picking an arbitrary example) the relative merits of polyamory, unless you have some more specific reason to trust that he actually thought it through and actually made sure to get that specific question right, rather than happening to pick up that answer through random cultural diffusion from his robot cult. (Most people get most of their beliefs from random cultural diffusion; we can't think fast enough to do otherwise.) Constant vigilance!

1. I (again) feel bad about cussing in a Less Wrong comment, but I want to be very emphatic here! ↩︎

Comment by zack_m_davis on Artificial Intelligence: A Modern Approach (4th edition) on the Alignment Problem · 2020-09-28T04:31:34.548Z · score: 7 (3 votes) · LW · GW

(This extended runaround on appeals to consequences is at least a neat microcosm of the reasons we expect unaligned AIs to be deceptive by default! Having the intent to inform other agents of what you know without trying to take responsibility for controlling their decisions is an unusually anti-natural shape for cognition; for generic consequentialists, influence-seeking behavior is the default.)

Comment by zack_m_davis on The rationalist community's location problem · 2020-09-25T23:40:04.727Z · score: 7 (7 votes) · LW · GW

But if the rationalist project is supposed to be about spreading our ideas and achieving things [emphasis mine]

Thanks for phrasing this as a conditional! To fill in another branch of the if/else-if/else-if ... conditional statement: if the rationalist project is supposed to be about systematically correct reasoning—having the right ideas because they're right, rather than spreading our ideas because they're ours—then things that are advantageous to the movement could be disadvantageous to the ideology, if the needs of growing the coalition's resources conflict with the needs of constructing shared maps that reflect the territory.

if we're trying to get the marginal person who isn't quite a community member yet but occasionally reads Less Wrong to integrate more

I don't know who "we" are, but my personal hope for the marginal person who isn't quite a community member but occasionally reads this website isn't that they necessarily integrate with the community, but that they benefit from understanding the ideas that we talk about on this website—the stuff about science and Bayesian reasoning, which, being universals, bear no distinguishing evidence of their origin. I wouldn't want to privilege the hypothesis that integrating with the community is the right thing to do if you understand the material, given the size of the space of competing alternatives. (The rest of the world is a much larger place than "the community"; you need more evidence to justify the plan of reorganizing your life around a community qua community than you do to justify the plan of reading an interesting blog.)

Comment by zack_m_davis on Has anyone written stories happening in Hanson's em world? · 2020-09-21T17:57:17.966Z · score: 13 (8 votes) · LW · GW

"Blame Me for Trying"

Comment by zack_m_davis on Open & Welcome Thread - September 2020 · 2020-09-19T22:15:21.678Z · score: 22 (7 votes) · LW · GW

I've reliably used the word "threat" to simply mean signaling some kind of intention of inflicting some kind of punishment in response to some condition by the other person. Curi and other people from FI have done this repeatedly, and the "list of people who have evaded/lied/etc." is exactly one of such threats, whether explicitly labeled as such or not.

This game-theoretic concept of "threat" is fine, but underdetermined: what counts as a threat in this sense depends on where the "zero point" is; what counts as aggression versus self-defense depends on what the relevant "property rights" are. (Scare quotes on "property rights" because I'm not talking about legal claims, but "property rights" is an apt choice of words, because I'm claiming that the way people negotiate disputes that don't rise to the level of dragging in the (slow, expensive) formal legal system has a similar structure.)

If people have a "right" to not be publicly described as lying, evading, &c., then someone who puts up a "these people lied, evaded, &c." page on their own website is engaging in a kind of aggression. The page functions as a threat: "If you don't keep engaging in a way that satisfies my standards of discourse, I'll publicly call you a liar, evader, &c.."

If people don't have a "right" to not be publicly described as lying, evading, &c., then a website administrator who cites a user's "these people lied, evaded, &c." page on their own website as part of a rationale for banning that user, is engaging in a kind of aggression. The ban functions as a threat: "If you don't cede your claim on being able to describe other people as lying, evading, &c., I won't let you participate in this forum."

The size of the website administrator's threat depends on the website's "market power." Less Wrong is probably small enough and niche enough such that the threat doesn't end up controlling anyone's off-site behavior: anyone who perceives not being able to post on Less Wrong as a serious threat is probably already so deeply socially-embedded into our little robot cult, that they either have similar property-rights intuitions as the administrators, or are too loyal to the group to publicly accuse other group members as lying, evading, &c., even if they privately think they are lying, evading, &c.. (Nobody likes self-styled whistleblowers!) But getting kicked off a service with the market power of a Google, Facebook, Twitter, &c. is a sufficiently big deal to sufficiently many people such that those websites' terms-of-service do exert some controlling pressure on the rest of Society.

What are the consequences of each of these "property rights" regimes?

In a world where people have a right to not be publicly described as lying, evading, &c., then people don't have to be afraid of losing reputation on that account. But we also lose out on the possibility of having a public accounting of who has actually in fact lied, evaded, &c.. We give up on maintaining the coordination equilibrium such that words like "lie" have a literal meaning that can actually be true or false, rather than the word itself simply constituting an attack.

Which regime better fulfills our charter of advancing the art of human rationality? I don't think I've written this skillfully enough for you to not be able to guess what answer I lean towards, but you shouldn't trust my answer if it seems like something I might lie or evade about! You need to think it through for yourself.

Comment by zack_m_davis on Causal Reality vs Social Reality · 2020-09-18T16:50:30.131Z · score: 2 (3 votes) · LW · GW

No problem. Hope your research is going well!

(Um, as long as you're initiating an interaction, maybe I should mention that I have been planning to very belatedly address your concern about premature abstraction potentially functioning as a covert meta-attack by putting up a non-Frontpagable "Motivation and Political Context for My Philosophy of Language Agenda" post in conjunction with my next philosophy-of-language post? I'm hoping that will make things better rather than worse from your perspective? But if not, um, sorry.)

Comment by zack_m_davis on Artificial Intelligence: A Modern Approach (4th edition) on the Alignment Problem · 2020-09-18T03:51:44.214Z · score: 7 (9 votes) · LW · GW

Can I also point to this as (some amount of) evidence against concerns that "we" (members of this stupid robot cult that I continue to feel contempt for but don't know how to quit) shouldn't try to have systematically truthseeking discussions about potentially sensitive or low-status subjects because guilt-by-association splash damage from those conversations will hurt AI alignment efforts, which are the most important thing in the world? (Previously: 1 2 3.)

Like, I agree that some nonzero amount of splash damage exists. But look! The most popular AI textbook, used in almost fifteen hundred colleges and universities, clearly explains the paperclip-maximizer problem, in the authorial voice, in the first chapter. "These behaviors are not 'unintelligent' or 'insane'; they are a logical consequence of defining winning as the sole objective for the machine." Italics in original! I couldn't transcribe it, but there's even one of those pay-attention-to-this triangles (◀) in the margin, in teal ink.

Everyone who gets a CS degree from this year onwards is going to know from the teal ink that there's a problem. If there was a marketing war to legitimize AI risk, we won! Now can "we" please stop using the marketing war as an excuse for lying?!

Comment by zack_m_davis on Artificial Intelligence: A Modern Approach (4th edition) on the Alignment Problem · 2020-09-18T03:39:17.627Z · score: 6 (3 votes) · LW · GW

It is an "iff" in §16.7.2 "Deference to Humans", but the toy setting in which this is shown is pretty impoverished. It's a story problem about a robot Robbie deciding whether to book an expensive hotel room for busy human Harriet, or whether to ask Harriet first.

Formally, let $P(u)$ be Robbie's prior probability density over Harriet's utility $u$ for the proposed action $a$. Then the value of going ahead with $a$ is

$$EU(a) = \int_{-\infty}^{\infty} P(u)\,u\,du = \int_{-\infty}^{0} P(u)\,u\,du + \int_{0}^{\infty} P(u)\,u\,du$$

(We will see shortly why the integral is split up this way.) On the other hand, the value of action $d$, deferring to Harriet, is composed of two parts: if $u > 0$ then Harriet lets Robbie go ahead, so the value is $u$, but if $u < 0$ then Harriet switches Robbie off, so the value is 0:

$$EU(d) = \int_{-\infty}^{0} P(u) \cdot 0\,du + \int_{0}^{\infty} P(u)\,u\,du = \int_{0}^{\infty} P(u)\,u\,du$$

Comparing the expressions for $EU(a)$ and $EU(d)$, we see immediately that

$$EU(d) \geq EU(a)$$

because the expression for $EU(d)$ has the negative-utility region zeroed out. The two choices have equal value only when the negative region has zero probability—that is, when Robbie is already certain that Harriet likes the proposed action.
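To make the comparison between $EU(a)$ and $EU(d)$ concrete, here's a quick numeric sketch (my own illustration, not from the textbook): discretize a prior over Harriet's utility and compute both expected utilities directly. The Gaussian prior with mean 1 and standard deviation 2 is an arbitrary choice for the example.

```python
import numpy as np

# Discretized prior P(u) over Harriet's utility u for the proposed action:
# a normal distribution with mean 1 and standard deviation 2 (an arbitrary
# illustrative choice).
u = np.linspace(-12.0, 12.0, 200001)
du = u[1] - u[0]
p = np.exp(-((u - 1.0) ** 2) / (2 * 2.0**2))
p /= p.sum() * du  # normalize the discretized density

# EU(a): Robbie goes ahead unilaterally, collecting utility u everywhere.
eu_a = np.sum(p * u) * du

# EU(d): Robbie defers; Harriet switches him off when u < 0, so that
# region contributes zero instead of negative utility.
eu_d = np.sum(np.where(u > 0.0, p * u, 0.0)) * du

print(f"EU(a) = {eu_a:.3f}, EU(d) = {eu_d:.3f}")
assert eu_d >= eu_a  # deferring is never worse
```

With this prior, going ahead is worth the prior mean (1.0), while deferring is worth strictly more, since the negative-utility tail gets zeroed out rather than collected.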

(I think this is fine as a topic-introducing story problem, but agree that the sentence in Chapter 1 referencing it shouldn't have been phrased to make it sound like it applies to machines-in-general.)

Comment by zack_m_davis on Decoherence is Falsifiable and Testable · 2020-09-12T00:36:21.481Z · score: 6 (3 votes) · LW · GW

It's mentioned in passing in the "Technical Explanation" (but yes, not a full independently-linkable post):

Humans are very fond of making their predictions afterward, so the social process of science requires an advance prediction before we say that a result confirms a theory. But how humans may move in harmony with the way of Bayes, and so wield the power, is a separate issue from whether the math works. When we’re doing the math, we just take for granted that likelihood density functions are fixed properties of a hypothesis and the probability mass sums to 1 and you’d never dream of doing it any other way.

Comment by zack_m_davis on How easily can we separate a friendly AI in design space from one which would bring about a hyperexistential catastrophe? · 2020-09-11T15:22:05.633Z · score: 9 (4 votes) · LW · GW

Sleep is very important! Get regular sleep every night! Speaking from personal experience, you don't want to have a sleep-deprivation-induced mental breakdown while thinking about Singularity stuff!

Comment by zack_m_davis on Tofly's Shortform · 2020-09-06T01:42:36.720Z · score: 10 (5 votes) · LW · GW

Yudkowsky addresses some of these objections in more detail in "Intelligence Explosion Microeconomics".

Comment by zack_m_davis on Sherrinford's Shortform · 2020-08-02T20:20:33.091Z · score: 14 (9 votes) · LW · GW

The wikipedia article, as far as I can see, explains in that paragraph where the neoreactionary movement originated.

It's not true, though! The article claims: "The neoreactionary movement first grew on LessWrong, attracted by discussions on the site of eugenics and evolutionary psychology".

I mean, okay, it's true that we've had discussions on eugenics and evolutionary psychology, and it's true that a few of the contrarian nerds who enthusiastically read Overcoming Bias back in the late 'aughts were also a few of the contrarian nerds who enthusiastically read Unqualified Reservations. But "first grew" (Wikipedia) and "originated" (your comment) really don't seem like a fair summary of that kind of minor overlap in readership. No one was doing neoreactionary political theorizing on this website. Okay, I don't have an exact formalization of what I mean by "no one" in the previous sentence because I haven't personally read and remembered every post in our archives; maybe there are nonzero posts with nonnegative karma that could be construed to match this description. Still, in essence, you can only make the claim "true" by gerrymandering the construal of those words.

And yet the characterization will remain in Wikipedia's view of us—glancing at the talk page, I don't expect to win an edit war with David Gerard.

Comment by zack_m_davis on Generalized Efficient Markets in Political Power · 2020-08-01T19:07:10.765Z · score: 6 (6 votes) · LW · GW

It gets worse. We also face coordination problems on the concepts we use to think with. In order for language to work, we need shared word definitions, so that the probabilistic model in my head when I say a word matches up with the model in your head when you hear the word. A leader isn't just in a position to coordinate what the group does, but also which aspects of reality the group is able to think about.

Comment by zack_m_davis on Open & Welcome Thread - July 2020 · 2020-07-25T19:46:21.868Z · score: 2 (1 votes) · LW · GW

I'm disappointed that the LaTeX processor doesn't seem to accept \nicefrac ("TeX parse error: Undefined control sequence \nicefrac"), but I suppose \frac will suffice.

Comment by zack_m_davis on Open & Welcome Thread - July 2020 · 2020-07-25T04:55:04.807Z · score: 7 (5 votes) · LW · GW

I was pretty freaked out about similar ideas in 2013, but I'm over it now. (Mostly. I'm not signed up for cryonics even though a lot of my friends are.)

If you can stop doing philosophy and futurism, I recommend that. But if you can't ... um, how deep into personal-identity reductionism are you? You say you're "selfishly" worried about bad things "happening to you". As is everyone (and for sound evolutionary reasons), but it doesn't really make sense if you think sub specie æternitatis. If an atom-for-atom identical copy of you, is you, and an almost identical copy is almost you, then in a sufficiently large universe where all possible configurations of matter are realized, it makes more sense to think about the relative measure of different configurations rather than what happens to "you". And from that perspective ...

Well, there's still an unimaginably large amount of suffering in the universe, which is unimaginably bad. However, there's also an unimaginably large amount of unimaginably great things, which are likely to vastly outnumber the bad things for very general reasons: lots of agents want to wirehead, almost no one wants to anti-wirehead. Some agents are altruists, almost no one is a general-purpose anti-altruist, as opposed to feeling spite towards some particular enemy. The only reason you would want to hurt other agents (rather than being indifferent to them except insofar as they are made out of atoms that can be used for other things), would be as part of a war—but superintelligences don't have to fight wars, because it's a Pareto improvement to compute what would have happened in a war, and divide resources accordingly. And there are evolutionary reasons for a creature like you to be more unable to imagine the scope of the great things.

So, those are some reasons to guess that the universe isn't as Bad as you fear. But more importantly—you're not really in a position to know, let alone do anything about it. Even if the future is Bad, this-you locally being upset about it, doesn't make it any better. (If you're freaked out thinking about this stuff, you're not alignment researcher material anyway.) All you can do is work to optimize the world you see around you—the only world you can actually touch.

Comment by zack_m_davis on Open & Welcome Thread - July 2020 · 2020-07-22T22:08:40.498Z · score: 8 (6 votes) · LW · GW

Interested. Be sure to check out Gwern's page on embryo selection if you haven't already.

Comment by zack_m_davis on My Dating Plan ala Geoffrey Miller · 2020-07-21T17:47:20.881Z · score: 6 (5 votes) · LW · GW

That's fair. Maybe our crux is about to what extent "don't draw fake stuff on the map" is actually a serious constraint? When standing trial for a crime you didn't commit, it's not exactly comforting to be told that the prosecutor never lies, but "merely" reveals Shared Maps That Reflect The Territory But With a Few Blank Spots Where Depicting the Territory Would Have Caused the Defendant to Be Acquitted. It's good that the prosecutor never lies! But it's important that the prosecutor is known as the prosecutor, rather than claiming to be the judge. Same thing with a so-called "rationalist" community.

Comment by zack_m_davis on My Dating Plan ala Geoffrey Miller · 2020-07-21T06:42:42.131Z · score: 3 (6 votes) · LW · GW

Rather than debating the case for or against caution, I think the most interesting question is how to arrange a peaceful schism. Team Shared Maps That Reflect The Territory and Team Seek Power For The Greater Good obviously do not belong in the same "movement" or "community." It's understandable that Team Power doesn't want to be associated with Team Shared Maps because they're afraid we'll say things that will get them in trouble. (We totally will.) But for their part of the bargain, Team Power needs to not fraudulently market their beacon as "the rationality community" and thereby confuse innocents who came looking for shared maps.

Comment by zack_m_davis on Algorithmic Intent: A Hansonian Generalized Anti-Zombie Principle · 2020-07-18T08:25:52.738Z · score: 4 (2 votes) · LW · GW

So, I'm still behind on chores (some laundry on the floor, some more in the dryer, that pile of boxes in the living room, &c.), but allow me to quickly clarify one thing before I get to sleep. (I might have a separate reply for the great-grandparent later.)

the meaning of a word is determined by how it is actually used

Right, and the same word can have different meanings depending on context if it gets used differently in different contexts. Specifically, I perceive the word rationalist as commonly being used with two different meanings:

(1) anyone who seeks methods of systematically correct reasoning, general ways of thinking that produce maps that reflect the territory and use those maps to formulate plans that achieve goals

(2) a member of the particular social grouping of people who read Less Wrong. (And are also likely to read Slate Star Codex, be worried about existential risk from artificial intelligence, be polyamorous and live in Berkeley, CA, &c.)

I intended my "as an aspiring epistemic rationalist" to refer to the meaning rationalist(1), whereas I read your "other rationalists disagreeing with you" to refer to rationalist(2).

An analogy: someone who says, "As a Christian, I cannot condone homosexual 'marriage'" is unlikely to be moved by the reply, "But lots of Christians at my liberal church disagree with you". The first person is trying to be a Christian(1) as they understand it—one who accepts Jesus Christ as their lord and savior and adheres to the teachings of the Bible. The consensus of people who are merely Christian(2)—those who belong to a church—is irrelevant if that church is corrupt and has departed from God's will.

Hope this helps.

one function of "as an aspiring epistemic rationalist" in what you wrote is to encourage readers who also think of themselves that way to feel bad about disagreeing

I mean, they should feel bad if and only if feeling bad helps them be less wrong.

Comment by zack_m_davis on My Dating Plan ala Geoffrey Miller · 2020-07-18T07:11:19.259Z · score: 4 (7 votes) · LW · GW

Happy to discuss it. (I feel a little guilty for cussing in a Less Wrong comment, but I am at war with the forces of blandness and it felt appropriate to be forceful.)

My understanding of the Vision was that we were going to develop methods of systematically correct reasoning the likes of which the world had never seen, which, among other things, would be useful for preventing unaligned superintelligence from destroying all value in the universe.

Lately, however, I seem to see a lot of people eager to embrace censorship for P.R. reasons, seemingly without noticing or caring that this is a distortionary force on shared maps, as if the Vision was to run whatever marketing algorithm can win the most grant money and lure warm bodies for our robot cult—which I could get behind if I thought money and warm bodies were really the limiting resource for saving the world. But the problem with "systematically correct reasoning except leaving out all the parts of the discussion that might offend someone with a degree from Oxford or Berkeley" as opposed to "systematically correct reasoning" is that the former doesn't let you get anything right that Oxford or Berkeley gets wrong.

Optimized dating advice isn't important in itself, but the discourse algorithm that's too cowardly to even think about dating advice is thereby too constrained to do serious thinking about the things that are important.

Comment by zack_m_davis on My Dating Plan ala Geoffrey Miller · 2020-07-17T17:47:27.957Z · score: -2 (14 votes) · LW · GW

We already have the Frontpage/Personal distinction to reduce visibility of posts that might scare off cognitive children!

Posts on Less Wrong should focus on getting the goddamned right answer for the right reasons. If the "Less Wrong" and "rationalist" brand names mean anything, they mean that. If something about Snog's post is wrong—if it proposes beliefs that are false or plans that won't work, then it should be vigorously critiqued and downvoted.

If the terminology used in the post makes someone, somewhere have negative feelings about the "Less Wrong" brand name? Don't care; don't fucking care; can't afford to care. What does that have to do with maximizing the probability assigned to my observations?

Comment by zack_m_davis on Algorithmic Intent: A Hansonian Generalized Anti-Zombie Principle · 2020-07-15T16:29:12.939Z · score: 10 (7 votes) · LW · GW

Why write a lengthy and potentially controversial piece if you know you haven't time to engage with responses?

Look, I have a dayjob and I'm way behind on chores! I had a blog post idea that I'd been struggling with on-and-off for months, that I finally managed to finish a passable draft of and email to prereaders Sunday night. My prereaders thought it sucked (excerpts: "should ideally be a lot tighter. 2/3 the word count?", "Don't know why [this post] exists. It seems like you have something to say, so say it"), but on Monday night I was able to edit it into something that I wasn't embarrassed to shove out the door—even if it wasn't my best work. I have a dayjob videoconference meeting starting in two minutes. Can I maybe get back to you later???

Comment by zack_m_davis on Algorithmic Intent: A Hansonian Generalized Anti-Zombie Principle · 2020-07-15T06:08:46.491Z · score: 4 (2 votes) · LW · GW

I mean, I agree that pretending Levels 2+ don't exist may not be a good strategy for getting Level 1 discourse insofar as it's hard to coordinate on ... but maybe not as hard as you think?

Comment by zack_m_davis on Algorithmic Intent: A Hansonian Generalized Anti-Zombie Principle · 2020-07-15T06:01:16.219Z · score: -1 (4 votes) · LW · GW

(Regretfully, I'm too busy at the moment to engage with most of this, but one thing leapt out at me—)

I note with some interest that the link you offer in support of it is something you yourself wrote, and whose comment section consists mostly of other rationalists disagreeing with you

I don't think that should be interesting. I don't care what so-called "rationalists" think; I care about what's true.

Comment by zack_m_davis on The New Frontpage Design & Opening Tag Creation! · 2020-07-14T06:15:19.721Z · score: 13 (3 votes) · LW · GW

I strongly disapprove of graphically rendering "Frontpage" and "Personal Blog" as if they were tags on post pages. They're not tags! Conflating "Personal" with topic tags on the home page "Latest" controls wasn't jarring, but this really is.

Comment by zack_m_davis on Was a terminal degree ~necessary for inventing Boyle's desiderata? · 2020-07-13T05:16:58.006Z · score: 2 (1 votes) · LW · GW
Comment by zack_m_davis on Every Cause Wants To Be A Cult · 2020-07-11T22:27:53.809Z · score: 13 (6 votes) · LW · GW

Cade Metz at The Register

It's interesting to stumble across old references to authors whose names you only recognize now, but didn't at the time. Cade Metz, huh? I wonder what he's been up to lately!

Comment by zack_m_davis on The New Frontpage Design & Opening Tag Creation! · 2020-07-09T05:41:46.657Z · score: 2 (1 votes) · LW · GW

you can go to https://www.lesswrong.com/tags/all,

The link currently points to lessestwrong.com, presumably your staging server.

Comment by zack_m_davis on Classifying games like the Prisoner's Dilemma · 2020-07-09T04:01:25.695Z · score: 22 (8 votes) · LW · GW

if someone does know of an existing article

Herbert Gintis's Game Theory Evolving (2nd edition) offers the following exercise. (Bolding and hyperlinks mine.)

#### 6.15 Characterizing 2 x 2 Normal Form Games I

We say a normal form game is *generic* if no two payoffs for the same player are equal. Suppose $A = (a_{ij})$ and $B = (b_{ij})$ are the payoff matrices for Alice and Bob, so the payoff to Alice's strategy $i$ against Bob's strategy $j$ is $a_{ij}$ for Alice and $b_{ij}$ for Bob. We say two generic 2 x 2 games with payoff matrices $(A, B)$ and $(C, D)$ are equivalent if, for all $i, j, k, l = 1, 2$:

$$a_{ij} > a_{kj} \iff c_{ij} > c_{kj}$$

and

$$b_{ij} > b_{il} \iff d_{ij} > d_{il}$$

In particular, if a constant is added to the payoffs to all the pure strategies of one player when played against a given pure strategy of the other player, the resulting game is equivalent to the original.

Show that equivalent 2 x 2 generic games have the same number of pure Nash equilibria and the same number of strictly mixed Nash equilibria. Show also that every generic 2 x 2 game is equivalent to either the prisoner's dilemma (§3.11), the battle of the sexes (§3.9), or the hawk-dove (§3.10). Note that this list does not include throwing fingers (§3.8), which is not generic.
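(Not part of Gintis's text, but the pure-equilibrium part of the exercise is easy to check mechanically. A minimal Python sketch, with function and variable names of my own invention:)

```python
import itertools

def pure_nash_equilibria(A, B):
    """Return the pure-strategy Nash equilibria of a 2 x 2 bimatrix game.

    A[i][j] is Alice's payoff and B[i][j] is Bob's payoff when Alice
    plays strategy i against Bob's strategy j.
    """
    equilibria = []
    for i, j in itertools.product(range(2), range(2)):
        alice_best = A[i][j] >= A[1 - i][j]  # Alice can't gain by deviating
        bob_best = B[i][j] >= B[i][1 - j]    # Bob can't gain by deviating
        if alice_best and bob_best:
            equilibria.append((i, j))
    return equilibria

# Prisoner's dilemma (row/column 0 = Cooperate, 1 = Defect):
# mutual defection is the unique pure equilibrium.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
print(pure_nash_equilibria(A, B))  # [(1, 1)]
```

(Counting the strictly mixed equilibria, and verifying invariance under the equivalence above, is left as Gintis intends: as an exercise.)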

Comment by zack_m_davis on Optimized Propaganda with Bayesian Networks: Comment on "Articulating Lay Theories Through Graphical Models" · 2020-06-30T04:28:41.440Z · score: 3 (2 votes) · LW · GW

Thanks, you are right and the thing I actually typed was wrong. (For the graph A → C ← B, the collider C blocks the path between A and B, but conditioning on the collider un-blocks it.) Fixed.
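(To illustrate with a toy simulation of my own, not anything from the paper: two independent causes become dependent once you condition on their common effect.)

```python
import random

random.seed(0)
# A and B are independent fair coins; C = A XOR B is a collider (A -> C <- B).
samples = [(random.random() < 0.5, random.random() < 0.5) for _ in range(100_000)]

# Marginally, knowing A tells you almost nothing about B:
with_a = [b for a, b in samples if a]
b_given_a = sum(with_a) / len(with_a)

# Conditioning on the collider C = True un-blocks the path:
conditioned = [(a, b) for a, b in samples if a != b]  # i.e., A XOR B is True
with_a_and_c = [b for a, b in conditioned if a]
b_given_a_and_c = sum(with_a_and_c) / len(with_a_and_c)

print(round(b_given_a, 2))        # ≈ 0.5: still independent
print(round(b_given_a_and_c, 2))  # 0.0: given C, learning A determines B
```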

Comment by zack_m_davis on A reply to Agnes Callard · 2020-06-28T04:46:45.773Z · score: 7 (4 votes) · LW · GW

(Done.)

Comment by zack_m_davis on Atemporal Ethical Obligations · 2020-06-27T03:43:50.618Z · score: 7 (8 votes) · LW · GW

the future will agree Rowling's current position is immoral

This is vague. An exercise: can you quote specific sentences from Rowling's recent essay that you think the future will agree are immoral?

Maybe don't answer that, because we don't care about the object level on this website? (Or, maybe you should answer it if you think avoiding the object-level is potentially a sneaky political move on my part.) But if you try the exercise and it turns out to be harder than you expected, one possible moral is that a lot of what passes for discourse in our Society doesn't even rise to the level of disagreement about well-specified beliefs or policy proposals, but is mostly about coalition-membership and influencing cultural sentiments. Those who scorn Rowling do so not because she has a specific proposal for revising the 2004 Gender Recognition Act that people disagree with, but because she talks in a way that pushes culture in the wrong direction. Everything is a motte-and-bailey: most people, most of the time don't really have "positions" as such!

Comment by zack_m_davis on Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning · 2020-06-22T01:14:18.492Z · score: 0 (2 votes) · LW · GW

So, I actually don't think Less Wrong needs to be nicer! (But I agree that elaborating more was warranted.)

Comment by zack_m_davis on Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning · 2020-06-22T01:11:39.788Z · score: 17 (6 votes) · LW · GW

Thanks for the comment!—and for your patience.

So, the general answer to "Is there anyone who doesn't know this?" is, in fact, "Yes." But I can try to say a little bit more about why I thought this was worth writing.

I do think Less Wrong and /r/rational readers know that words don't have intrinsic definitions. If someone wrote a story that just made the point, "Hey, words don't have intrinsic definitions!", I would probably downvote it.

But I think this piece is actually doing more work and exposing more details than that—I'm actually providing executable source code (!) that sketches how a simple sender–receiver game with a reinforcement-learning rule correlates a not-intrinsically-meaningful signal with the environment such that it can be construed as a meaningful word that could have a definition.
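(For readers who want the flavor without clicking through to the post's actual code, here is an independent minimal sketch—all names mine—of a two-state Lewis sender–receiver game trained by urn-style reinforcement:)

```python
import random

random.seed(1)

# Two equiprobable world states, two signals, two receiver acts.
# Urn weights start at 1; a successful round reinforces the choices made.
sender_urns = {state: {0: 1.0, 1: 1.0} for state in (0, 1)}      # state -> signal
receiver_urns = {signal: {0: 1.0, 1: 1.0} for signal in (0, 1)}  # signal -> act

def draw(urn):
    options, weights = zip(*urn.items())
    return random.choices(options, weights)[0]

for _ in range(5000):
    state = random.randrange(2)
    signal = draw(sender_urns[state])
    act = draw(receiver_urns[signal])
    if act == state:  # communication succeeded: reinforce both choices
        sender_urns[state][signal] += 1
        receiver_urns[signal][act] += 1

# After training, each state should (with high probability) map to a signal
# that the receiver maps back to that state—an initially meaningless signal
# has become correlated with the environment.
for state in (0, 1):
    signal = max(sender_urns[state], key=sender_urns[state].get)
    act = max(receiver_urns[signal], key=receiver_urns[signal].get)
    print(state, "->", signal, "->", act)
```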

In analogy, explaining how the subjective sensation of "free will" might arise from a deterministic system that computes plans (without being able to predict what it will choose in advance of having computed it) is doing more work than the mere observation "Naïve free will can't exist because physics is deterministic".

So, I don't think all this was already obvious to Less Wrong readers. If it was already obvious to you, then you should be commended. However, even if some form of these ideas was already well-known, I'm also a proponent of "writing a thousand roads to Rome": part of how you get and maintain a community where "everybody knows" certain basic material, is by many authors grappling with the ideas and putting their own ever-so-slightly-different pedagogical spin on them. It's fundamentally okay for Yudkowsky's account of free will, and Gary Drescher's account (in Chapter 5 of Good and Real), and my story about writing a chess engine to all exist, even if they're all basically "pointing at the same thing."

Another possible motivation for writing a new presentation of an already well-known idea, is because the new presentation might be better-suited as a prerequisite or "building block" towards more novel work in the future. In this case, some recent Less Wrong discussions have used a "four simulacrum levels" framework (loosely inspired by the work of Jean Baudrillard) to try to model how political forces alter the meaning of language, but I'm pretty unhappy with the "four levels" formulation: the fact that I could never remember the difference between "level 3" and "level 4" even after it was explained several times (Zvi's latest post helped a little), and the contrast between the "linear progression" and "2x2" formulations, make me feel like we're talking about a hodgepodge of different things and haphazardly shoving them into this "four levels" framework, rather than having a clean deconfused concept to do serious thinking with. I'm optimistic about a formal analysis of sender–receiver games (following the work of Skyrms and others) being able to provide this. Now, I haven't done that work yet, and maybe I won't find anything interesting, but laying out the foundations for that potential future work was part of my motivation for this piece.

Comment by zack_m_davis on When is it Wrong to Click on a Cow? · 2020-06-21T17:00:08.373Z · score: 5 (3 votes) · LW · GW

It is always wrong to click on a cow. Clicking on cows is contrary to the moral law.

Comment by zack_m_davis on What is meant by Simulcra Levels? · 2020-06-19T03:03:24.404Z · score: 18 (6 votes) · LW · GW

Because the local discussion of this framework grew out of Jessica Taylor's reading of Wikipedia's reading of continental philosopher Jean Baudrillard's Simulacra and Simulation, about how modern Society has ceased dealing with reality itself, and instead deals with our representations of it—maps that precede the territory, copies with no original. (The irony that no one in this discussion has actually read Baudrillard should not be forgotten!)

Comment by zack_m_davis on Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning · 2020-06-18T04:36:43.221Z · score: 4 (2 votes) · LW · GW

(Thanks for your patience.) If you liked the technical part of this post, then yes! But supplement or substitute Ch. 6, "Deception", with Don Fallis and Peter J. Lewis's "Towards a Formal Analysis of Deceptive Signaling", which explains what Skyrms gets wrong.

Comment by zack_m_davis on That Alien Message · 2020-06-16T06:28:00.189Z · score: 4 (2 votes) · LW · GW

(This was adapted into a longer story by Alicorn.)

Comment by zack_m_davis on Failed Utopia #4-2 · 2020-06-11T05:28:55.802Z · score: 6 (3 votes) · LW · GW

Thanks for commenting! (Strong-upvoted.) It's nice to get new discussion on old posts and comments.

probably applies to some people, somewhere

Hi!

I don't think I'm doing the idea a disservice

How much have you read about the idea from its proponents? ("From its proponents" because, tragically, opponents of an idea can't always be trusted to paraphrase it accurately, rather than attacking a strawman.) If I might recommend just one paper, may I suggest Anne Lawrence's "Autogynephilia and the Typology of Male-to-Female Transsexualism: Concepts and Controversies"?

by dismissing it with a couple of silly comics

Usually, when I dismiss an idea with links, I try to make sure that the links are directly about the idea in question, rather than having a higher inferential distance.

For example, when debating a creationist, I think it would be more productive to link to a page about the evidence for evolution, rather than to link to a comic about the application of Occam's razor to some other issue. To be sure, Occam's razor is relevant to the creation/evolution debate!—but in order to communicate to someone who doesn't already believe that, you (or your link) need to explain the relevance in detail. The creationist probably thinks intelligent design is "the simplest explanation." In order to rebut them, you can't just say "Occam's razor!", you need to show how they're confused about how evolution works or the right concept of "simplicity".

In the present case, linking to Existential Comics on falsifiability and penis envy doesn't help me understand your point of view, because while I agree that scientific theories need to be falsifiable, I don't agree that the autogynephilia theory is unfalsifiable. An example of a more relevant link might be to Julia Serano's rebuttal? (However, I do not find Serano's rebuttal convincing.)

I don't see how "wanting to see a world without strict gender roles" has anything to do with sexuality

That part is admittedly a bit speculative; as it happens, I'm planning to explain more in a forthcoming post (working title: "Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems") on my secret ("secret") blog, but it's not done yet.

Comment by zack_m_davis on What past highly-upvoted posts are overrated today? · 2020-06-09T22:05:46.569Z · score: 13 (6 votes) · LW · GW

You would downvote them in order to make the sorted-by-karma archives more useful! (See the tragically underrated "Why Artificial Optimism?")

Comment by zack_m_davis on Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning · 2020-06-09T05:55:04.955Z · score: 3 (2 votes) · LW · GW

(More comments on /r/rational and /r/SneerClub.)

Comment by zack_m_davis on Open & Welcome Thread - June 2020 · 2020-06-07T05:05:13.050Z · score: 8 (4 votes) · LW · GW

Comment and post text fields default to "LessWrong Docs [beta]" for me, I assume because I have "Opt into experimental features" checked in my user settings. I wonder if the "Activate Markdown Editor" setting should take precedence?—no one who prefers Markdown over the Draft.js WYSIWYG editor is going to switch because our WYSIWYG editor is just that much better, right? (Why are you guys writing an editor, anyway? Like, it looks fun, but I don't understand why you'd do it other than, "It looks fun!")

Comment by Zack_M_Davis on [deleted post] 2020-06-06T18:10:51.331Z

I'll grant that there's a sense in which instrumental and epistemic rationality could be said to not coincide for humans, but I think they conflict much less often than you seem to be implying, and I think overemphasizing the epistemic/instrumental distinction was a pedagogical mistake in the earlier days of the site.

Forget about humans and think about how to build an idealized agent out of mechanical parts. How do you expect your AI to choose actions that achieve its goals, except by modeling the world, and using the model to compute which actions will have what effects?
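(The point can be made concrete in a few lines—a toy sketch with names entirely my own, not a claim about any real system: an idealized agent chooses by running its world-model forward and scoring the predicted outcomes.)

```python
def choose_action(actions, world_model, utility):
    """Pick the action whose model-predicted outcome has the highest utility."""
    return max(actions, key=lambda action: utility(world_model(action)))

# Toy example: the model maps actions to predicted room temperatures,
# and the agent's goal is a temperature near 20 degrees.
model = {"heat": 26, "cool": 14, "wait": 21}.get
goal = lambda temp: -abs(temp - 20)

print(choose_action(["heat", "cool", "wait"], model, goal))  # wait
```

Feed the model false predictions, and the same decision procedure starts selecting worse actions—which is the sense in which epistemic and instrumental rationality coincide for an idealized agent.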

From this perspective, the purported counterexamples to the coincidence of instrumental and epistemic rationality seem like pathological edge cases that depend on weird defects in human psychology. Learning how to build an unaligned superintelligence or an atomic bomb isn't dangerous if you just ... choose not to build the dangerous thing, even if you know how. Maybe there are some cases where believing false things helps achieve your goals (particularly in domains where we were designed by evolution to have false beliefs for the function of deceiving others), but trusting false information doesn't increase your chances of using information to make decisions that achieve your goals.

Comment by zack_m_davis on Open & Welcome Thread - June 2020 · 2020-06-04T21:50:36.366Z · score: 6 (4 votes) · LW · GW

This seems kind of terrible? I expect authors and readers care more about new posts being published than about the tags being pristine.

Comment by zack_m_davis on Open & Welcome Thread - June 2020 · 2020-06-04T21:24:43.708Z · score: 2 (1 votes) · LW · GW

I was wondering about this, too. (If the implicit Frontpaging queue is "stuck", that gives me an incentive to delay publishing my new post, so that it doesn't have to compete with a big burst of backlogged posts being Frontpaged at the same time.)

Comment by zack_m_davis on Conjuring An Evolution To Serve You · 2020-06-01T06:32:28.628Z · score: 4 (2 votes) · LW · GW

(This post could be read as a predecessor to the Immoral Mazes sequence.)

Comment by zack_m_davis on A Problem With Patternism · 2020-05-20T21:07:07.539Z · score: 5 (3 votes) · LW · GW

the same question Yudkowsky uses in his post on cryonics in the sequences, although I can't find a link at the moment

You may be thinking of "Timeless Identity". Best wishes, the Less Wrong Reference Desk

Comment by zack_m_davis on [Site Meta] Feature Update: More Tags! (Experimental) · 2020-05-19T04:20:41.429Z · score: 6 (3 votes) · LW · GW

I want a language or philosophy of language tag (examples: "37 Ways Words Can Be Wrong", "Fuzzy Boundaries, Real Concepts", some of my published and forthcoming work).

I want a disagreement tag (examples: "The Modesty Argument", "The Rhythm of Disagreement", a forthcoming post).

Comment by zack_m_davis on Raemon's Shortform · 2020-04-17T05:19:23.670Z · score: 10 (5 votes) · LW · GW

Looks like the weak 3-votes are gone now!