Posts "How can I help?" FAQ 2023-06-05T22:09:57.630Z
Announcing AISafety.info's Write-a-thon (June 16-18) and Second Distillation Fellowship (July 3-October 2) 2023-06-03T02:03:01.364Z
All AGI Safety questions welcome (especially basic ones) [May 2023] 2023-05-08T22:30:50.267Z
All AGI Safety questions welcome (especially basic ones) [April 2023] 2023-04-08T04:21:36.258Z
steven0461's Shortform Feed 2019-06-30T02:42:13.858Z
Agents That Learn From Human Behavior Can't Learn Human Values That Humans Haven't Learned Yet 2018-07-11T02:59:12.278Z
Meetup : San Jose Meetup: Park Day (X) 2016-11-28T02:46:20.651Z
Meetup : San Jose Meetup: Park Day (IX), 3pm 2016-11-01T15:40:19.623Z
Meetup : San Jose Meetup: Park Day (VIII) 2016-09-06T00:47:23.680Z
Meetup : San Jose Meetup: Park Day (VII) 2016-08-15T01:05:00.237Z
Meetup : San Jose Meetup: Park Day (VI) 2016-07-25T02:11:44.237Z
Meetup : San Jose Meetup: Park Day (V) 2016-07-04T18:38:01.992Z
Meetup : San Jose Meetup: Park Day (IV) 2016-06-15T20:29:04.853Z
Meetup : San Jose Meetup: Park Day (III) 2016-05-09T20:10:55.447Z
Meetup : San Jose Meetup: Park Day (II) 2016-04-20T06:23:28.685Z
Meetup : San Jose Meetup: Park Day 2016-03-30T04:39:09.532Z
Meetup : Amsterdam 2013-11-12T09:12:31.710Z
Bayesian Adjustment Does Not Defeat Existential Risk Charity 2013-03-17T08:50:02.096Z
Meetup : Chicago Meetup 2011-09-28T04:29:35.777Z
Meetup : Chicago Meetup 2011-07-07T15:28:57.969Z
PhilPapers survey results now include correlations 2010-11-09T19:15:47.251Z
Chicago Meetup 11/14 2010-11-08T23:30:49.015Z
A Fundamental Question of Group Rationality 2010-10-13T20:32:08.085Z
Chicago/Madison Meetup 2010-07-15T23:30:15.576Z
Swimming in Reasons 2010-04-10T01:24:27.787Z
Disambiguating Doom 2010-03-29T18:14:12.075Z
Taking Occam Seriously 2009-05-29T17:31:52.268Z
Open Thread: May 2009 2009-05-01T16:16:35.156Z
Eliezer Yudkowsky Facts 2009-03-22T20:17:21.220Z
The Wrath of Kahneman 2009-03-09T12:52:41.695Z
Lies and Secrets 2009-03-08T14:43:22.152Z


Comment by steven0461 on Join AISafety.info's Writing & Editing Hackathon (Aug 25-28) (Prizes to be won!) · 2023-08-22T05:34:57.443Z · LW · GW

if there's interest in finding a place for a few people to cowork on this in Berkeley, please let me know

Comment by steven0461 on Stampy's AI Safety Info - New Distillations #4 [July 2023] · 2023-08-17T00:01:02.673Z · LW · GW

Thanks, I made a note on the doc for that entry and we'll update it.

Comment by steven0461 on Stampy's AI Safety Info - New Distillations #4 [July 2023] · 2023-08-16T23:55:20.826Z · LW · GW

Traffic is pretty low currently, but we've been improving the site during the distillation fellowships and we're hoping to make more of a real launch soon. And yes, people are working on a Stampy chatbot. (The current early prototype isn't fine-tuned on Stampy's Q&A but searches the alignment literature and passes things to a GPT context window.)

Comment by steven0461 on Join AISafety.info's Writing & Editing Hackathon (Aug 25-28) (Prizes to be won!) · 2023-08-06T06:31:24.623Z · LW · GW

Yes, but we decided to reschedule it before making the announcement. Apologies to anyone who found the event in some other way and was planning on it being around the 11th; if Aug 25-27 doesn't work for you, note that there's still the option to participate early.

Comment by steven0461 on Announcing AISafety.info's Write-a-thon (June 16-18) and Second Distillation Fellowship (July 3-October 2) · 2023-06-17T19:15:40.015Z · LW · GW

Since somebody was wondering if it's still possible to participate without having signed up through

Yes, people are definitely still welcome to participate today and tomorrow, and are invited to head over to Discord to get up to speed.

Comment by steven0461 on Vaniver's Shortform · 2023-05-08T22:02:56.756Z · LW · GW

Stampy's AI Safety Info is a little like that in that it has 1) pre-written answers, 2) a chatbot under very active development, and 3) a link to a Discord with people who are often willing to explain things. But it could probably be more like that in some ways, e.g. if more people who were willing to explain things were habitually in the Discord.

Also, I plan to post the new monthly basic AI safety questions open thread today (edit: here), which is also a little like that.

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-12T20:56:25.752Z · LW · GW

I tried to answer this here

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-12T01:11:10.102Z · LW · GW

Anonymous #7 asks:

I am familiar with the concept of a utility function, which assigns numbers to possible world states and considers larger numbers to be better. However, I am unsure how to apply this function in order to make decisions that take time into account. For example, we may be able to achieve a world with higher utility over a longer period of time, or a world with lower utility but in a shorter amount of time.

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-11T02:56:20.928Z · LW · GW

Anonymous #6 asks:

Why hasn't an alien superintelligence within our light cone already killed us?

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-09T22:29:54.893Z · LW · GW

Anonymous #5 asks:

How can programmers build something without understanding its inner workings? Are they closer to biologists cross-breeding organisms than to car designers?

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-09T22:29:22.426Z · LW · GW

Anonymous #4 asks:

How large is the space of possible minds? How was its size calculated? Why does EY think that human-like minds don't fill most of this space? What is the evidence for it? What would be possible evidence against "a giant Mind Design Space in which human-like minds are a tiny dot"?

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-09T22:27:58.493Z · LW · GW

Anonymous #3 asks:

Can AIs be anything but utility maximisers? Most existing programs are something like finite-steps-executors (like Witcher 3 or a calculator). So what's the difference?

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-09T22:12:02.075Z · LW · GW

I don't know why they think so, but here are some people speculating.

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-09T21:29:56.387Z · LW · GW

Anonymous #2 asks:

A footnote in 'Planning for AGI and beyond' says "Many of us think the safest quadrant in this two-by-two matrix is short timelines and slow takeoff speeds; shorter timelines seem more amenable to coordination" - why do shorter timelines seem more amenable to coordination?

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-08T23:44:46.683Z · LW · GW

Anonymous #1 asks:

This one is not technical: now that we live in a world where people have access to systems like ChatGPT, how should I think about my career choices, primarily in the context of working as a computer technician? I'm not a hard worker, and I consider my intelligence just a little above average, so I'm not going to pretend I'll become a systems analyst or software engineer; but programming and content creation are starting to be automated more and more, so how should I update my decisions based on that?

Sure, this question is something that most people can ask about their intellectual jobs, but I would like to see answers from people in this community, and particularly about a field in which, more than most, employers are going to expect any technician to stay up-to-date with these tools.

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-08T22:54:18.717Z · LW · GW

Here's a form you can use to send questions anonymously. I'll check for responses and post them as comments.

Comment by steven0461 on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-08T21:23:01.826Z · LW · GW

From 38:58 of the podcast:

So I do think that over time I have come to expect a bit more that things will hang around in a near human place and weird shit will happen as a result. And my failure review where I look back and ask — was that a predictable sort of mistake? I feel like it was to some extent maybe a case of — you’re always going to get capabilities in some order and it was much easier to visualize the endpoint where you have all the capabilities than where you have some of the capabilities. And therefore my visualizations were not dwelling enough on a space we’d predictably in retrospect have entered into later where things have some capabilities but not others and it’s weird. I do think that, in 2012, I would not have called that large language models were the way and the large language models are in some way more uncannily semi-human than what I would justly have predicted in 2012 knowing only what I knew then. But broadly speaking, yeah, I do feel like GPT-4 is already kind of hanging out for longer in a weird, near-human space than I was really visualizing. In part, that's because it's so incredibly hard to visualize or predict correctly in advance when it will happen, which is, in retrospect, a bias.

Comment by steven0461 on Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent · 2023-03-10T19:46:43.548Z · LW · GW

trevor has already mentioned the Stampy project, which is trying to do something very similar to what's described here and wishes to join forces.

Right now, Stampy just uses language models for semantic search, but the medium-term plan is to use them for text generation as well: people will be able to go to the site, type in questions, and have a conversational agent respond. This would probably use a language model fine-tuned by the authors of Cyborgism (probably starting with a weak model as a trial, then increasingly strong ones as they become available), with primary fine-tuning on the alignment literature and hopefully secondary fine-tuning on Stampy content. A question asked in chat would be used to do an extractive search on the literature, then the results would be put into the LM's context window and it would generate a response.
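As a rough illustration of that pipeline (question → extractive search → context window → generated answer), here's a toy sketch; the embedding scheme, the corpus, and all the function names are stand-ins for illustration, not actual Stampy code:

```python
# Minimal retrieval-augmented answering sketch. The toy bag-of-words
# "embedding" stands in for whatever real embedding model the search uses.

def embed(text):
    # Toy embedding: word counts over a tiny fixed vocabulary.
    vocab = ["alignment", "agent", "reward", "risk", "model"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def similarity(a, b):
    # Dot product as a stand-in for cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def retrieve(question, corpus, k=2):
    # Extractive search: rank literature passages against the question.
    q = embed(question)
    ranked = sorted(corpus, key=lambda p: similarity(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    # The retrieved passages go into the LM's context window;
    # a real system would then generate a response from this prompt.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

corpus = [
    "Reward hacking is a risk when an agent optimizes a proxy.",
    "Park meetups are held monthly in San Jose.",
    "Alignment research studies how to make a model pursue intended goals.",
]
question = "what is alignment risk for an agent"
passages = retrieve(question, corpus)
prompt = build_prompt(question, passages)
```

The key design point is that the language model never has to memorize the literature; the search step selects the relevant passages fresh for each question, and only those go into the context window.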

Stampy welcomes volunteer developers to help with building the conversational agent and a front end for it, as well as volunteers to help write content.

Comment by steven0461 on Taboo P(doom) · 2023-02-07T16:16:50.554Z · LW · GW

There's another issue where "P(doom)" can be read either as the probability that a bad outcome will happen, or the probability that a bad outcome is inevitable. I think the former is usually what's meant, but if "P(doom)" means "the probability that we're doomed", then that suggests the latter as a distracting alternative interpretation.

Comment by steven0461 on Do anthropic considerations undercut the evolution anchor from the Bio Anchors report? · 2022-10-01T20:11:52.954Z · LW · GW

How Hard is Artificial Intelligence? Evolutionary Arguments and Selection Effects

Comment by steven0461 on What should you change in response to an "emergency"? And AI risk · 2022-07-20T23:14:26.046Z · LW · GW

In terms of "and those people who care will be broad and varied and trying their hands at making movies and doing varied kinds of science and engineering research and learning all about the world while keeping their eyes open for clues about the AI risk conundrum, and being ready to act when a hopeful possibility comes up" we're doing less well compared to my 2008 hopes. I want to know why and how to unblock it.

I think to the extent that people are failing to be interesting in all the ways you'd hoped they would be, it's because being interesting in those ways seems to them to have greater costs than benefits. If you want people to see the benefits of being interesting as outweighing the costs, you should make arguments to help them improve their causal models of the costs, and to improve their causal models of the benefits, and to compare the latter to the former. (E.g., what's the causal pathway by which an hour of thinking about Egyptology or repairing motorcycles or writing fanfic ends up having, not just positive expected usefulness, but higher expected usefulness at the margin than an hour of thinking about AI risk?) But you haven't seemed very interested in explicitly building out this kind of argument, and I don't understand why that isn't at the top of your list of strategies to try.

Comment by steven0461 on How could the universe be infinitely large? · 2022-07-13T16:35:28.620Z · LW · GW

As far as I know, this is the standard position. See also this FAQ entry. A lot of people sloppily say "the universe" when they mean the observable part of the universe, and that's what's causing the confusion.

Comment by steven0461 on Slowing down AI progress is an underexplored alignment strategy · 2022-07-12T21:34:04.463Z · LW · GW

I have also talked with folks who’ve thought a lot about safety and who honestly think that existential risk is lower if we have AI soon (before humanity can harm itself in other ways), for example.

It seems hard to make the numbers come out that way. E.g. suppose human-level AGI in 2030 would cause a 60% chance of existential disaster and a 40% chance of existential disaster becoming impossible, and human-level AGI in 2050 would cause a 50% chance of existential disaster and a 50% chance of existential disaster becoming impossible. Then to be indifferent about AI timelines, conditional on human-level AGI in 2050, you'd have to expect a 1/5 probability of existential disaster from other causes in the 2030-2050 period. (That way, with human-level AGI in 2050, you'd have a 1/2 * 4/5 = 40% chance of surviving, just like with human-level AGI in 2030.) I don't really know of non-AI risks in the ballpark of 10% per decade.

(My guess at MIRI people's model is more like 99% chance of existential disaster from human-level AGI in 2030 and 90% in 2050, in which case indifference would require a 90% chance of some other existential disaster in 2030-2050, to cut 10% chance of survival down to 1%.)
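The indifference arithmetic in the last two paragraphs can be redone in a few lines (the probabilities are the ones assumed above, not empirical estimates):

```python
def survival(p_ai_disaster, p_other_disaster=0.0):
    """P(survival) = P(no interim non-AI disaster) * P(AGI goes well)."""
    return (1 - p_other_disaster) * (1 - p_ai_disaster)

# Scenario 1: 60% disaster with AGI in 2030 vs 50% with AGI in 2050.
# Indifference requires a 1/5 chance of a non-AI disaster during 2030-2050:
# both timelines then give a 40% chance of surviving.
assert abs(survival(0.6) - survival(0.5, p_other_disaster=0.2)) < 1e-9

# Scenario 2 (guessed MIRI-like numbers): 99% vs 90% disaster.
# Indifference requires a 90% chance of some other disaster in the interim,
# cutting the 10% survival chance down to 1%.
assert abs(survival(0.99) - survival(0.9, p_other_disaster=0.9)) < 1e-9
```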

Comment by steven0461 on Safetywashing · 2022-07-01T16:56:47.450Z · LW · GW

"Safewashing" would be more directly parallel to "greenwashing" and sounds less awkward to my ears than "safetywashing", but on the other hand the relevant ideas are more often called "AI safety" than "safe AI", so I'm not sure if it's a better or worse term.

Comment by steven0461 on Intergenerational trauma impeding cooperative existential safety efforts · 2022-06-03T19:57:46.677Z · LW · GW

Yes, my experience of "nobody listened 20 years ago when the case for caring about AI risk was already overwhelmingly strong and urgent" doesn't put strong bounds on how much I should anticipate that people will care about AI risk in the future, and this is important; but it puts stronger bounds on how much I should anticipate that people will care about counterintuitive aspects of AI risk that haven't yet undergone a slow process of climbing in mainstream respectability, even if the case for caring about those aspects is overwhelmingly strong and urgent (except insofar as LessWrong culture has instilled a general appreciation for things that have overwhelmingly strong and urgent cases for caring about them), and this is also important.

Comment by steven0461 on "Tech company singularities", and steering them to reduce x-risk · 2022-05-13T19:30:21.483Z · LW · GW
  1. after a tech company singularity,

I think this was meant to read "2. after AGI,"

Comment by steven0461 on What are your recommendations for technical AI alignment podcasts? · 2022-05-11T22:14:58.689Z · LW · GW

Note that the full 2021 MIRI conversations are also available (in robot voice) in the Nonlinear Library archive.

Comment by steven0461 on What are your recommendations for technical AI alignment podcasts? · 2022-05-11T22:06:15.046Z · LW · GW

edit: also FLI's AI alignment podcast

Comment by steven0461 on [Linkpost] New multi-modal Deepmind model fusing Chinchilla with images and videos · 2022-05-03T21:37:14.952Z · LW · GW

Some relevant Altman tweets: 1, 2, 3

Comment by steven0461 on Salvage Epistemology · 2022-04-30T22:12:02.001Z · LW · GW

As I see it, "rationalist" already refers to a person who thinks rationality is particularly important, not necessarily a person who is rational, like how "libertarian" refers to a person who thinks freedom is particularly important, not necessarily a person who is free. Then literally speaking "aspiring rationalist" refers to a person who aspires to think rationality is particularly important, not to a person who aspires to be rational. Using "aspiring rationalist" to refer to people who aspire to attain rationality encourages people to misinterpret self-identified rationalists as claiming to have attained rationality. Saying something like "person who aspires to rationality" instead of "aspiring rationalist" is a little more awkward, but it respects the literal meaning of words, and I think that's important.

Comment by steven0461 on Replicating and extending the grabby aliens model · 2022-04-24T20:34:58.668Z · LW · GW

Great report. I found the high decision-worthiness vignette especially interesting.

I haven't read it closely yet, so people should feel free to be like "just read the report more closely and the answers are in there", but here are some confusions and questions that have been on my mind when trying to understand these things:

Has anyone thought about this in terms of a "consequence indication assumption" that's like the self-indication assumption but normalizes by the probability of producing paths from selves to cared-about consequences instead of the probability of producing selves? Maybe this is discussed in the anthropic decision theory sequence and I should just catch up on that?

I wonder how uncertainty about the cosmological future would affect grabby aliens conclusions. In particular, I think not very long ago it was thought plausible that the affectable universe is unbounded, in which case there could be worlds where aliens were almost arbitrarily rare that still had high decision-worthiness. (Faster than light travel seems like it would have similar implications.)

SIA and SSA mean something different now than when Bostrom originally defined them, right? Modern SIA is Bostrom's SIA+SSA and modern SSA is Bostrom's (not SIA)+SSA? Joe Carlsmith talked about this, but it would be good if there were a short comment somewhere that just explained the change of definition, so people can link it whenever it comes up in the future. (edit: ah, just noticed footnote 13)

SIA doomsday is a very different thing than the regular doomsday argument, despite the name, right? The former is about being unlikely to colonize the universe, the latter is about being unlikely to have a high number of observers? A strong great filter that lies in our future seems like it would require enough revisions to our world model to make SIA doom basically a variant of the simulation argument, i.e. the best explanation of our ability to colonize the stars not being real would be the stars themselves not being real. Many other weird hypotheses seem like they'd become more likely than the naive world view under SIA doom reasoning. E.g., maybe there are 10^50 human civilizations on Earth, but they're all out of phase and can't affect each other, but they can still see the same sun and stars. Anyway, I guess this problem doesn't turn up in the "high decision-worthiness" or "consequence indication assumption" formulation.

Comment by steven0461 on [RETRACTED] It's time for EA leadership to pull the short-timelines fire alarm. · 2022-04-08T22:00:32.192Z · LW · GW

My impression (based on using Metaculus a lot) is that, while questions like this may give you a reasonable ballpark estimate and it's great that they exist, they're nowhere close to being efficient enough for it to mean much when they fail to move. As a proxy for the amount of mental effort that goes into it, there have been only three comments on the linked question in the last month. I've been complaining about people calling Metaculus a "prediction market" because if people think it's a prediction market then they'll assume there's a point to be made like "if you can tell that the prediction is inefficient, then why aren't you rich, at least in play money?" But the estimate you're seeing is just a recency-weighted median of the predictions of everyone who presses the button, not weighted by past predictive record, and not weighted by willingness-to-bet, because there's no buying or selling and everyone makes only one prediction. It's basically a poll of people who are trying to get good results (in terms of Brier/log score and Metaculus points) on their answers.
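A recency-weighted median of the kind described can be sketched as follows; the exponential-decay weighting here is an illustrative assumption, not Metaculus's actual formula:

```python
def weighted_median(values, weights):
    # Sort by value, then return the first value at which the
    # cumulative weight reaches half the total weight.
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value
    return pairs[-1][0]

def community_estimate(predictions, decay=0.9):
    # One prediction per user, oldest first; newer predictions get
    # exponentially larger weights (an assumed decay scheme).
    n = len(predictions)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return weighted_median(predictions, weights)
```

Note what's absent: no weighting by past predictive record and no willingness-to-bet, which is exactly why the aggregate behaves like a poll rather than a market price.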

Comment by steven0461 on What Twitter fixes should we advocate, now that Elon is on the board? · 2022-04-06T22:16:23.800Z · LW · GW

Metaculus (unlike Manifold) is not a market and does not use play money except in the same sense that Tetris score is play money.

Comment by steven0461 on Ukraine Post #7: Prediction Market Update · 2022-03-30T02:08:34.158Z · LW · GW

I don't understand why people are calling Metaculus a prediction market. There's no buying or selling going on, even in play money. There's a score, but score doesn't affect the community estimate, which is just a median of all user predictions weighted by recency. I think it ends up doing pretty well, but calling it a market (which it doesn't call itself) will give readers a mistaken impression of how it works.

Comment by steven0461 on Open Thread - Jan 2022 [Vote Experiment!] · 2022-01-03T22:23:39.484Z · LW · GW

It took a minute to "click" for me that the green up marks and red down marks corresponded to each other in four opposed pairs, and that the Truth/Aim/Clarity numbers also corresponded to these axes. Possibly this is because I went straight to the thread after quickly skimming the OP, but most threads won't have the OP to explain things anyway. So my impression is it should be less opaque somehow. I do like having votes convey a lot more information than up/down. I wonder if it would be best to hide the new features under some sort of "advanced options" interface.

"Seeks truth" and "seeks conflict" aren't always opposites. For example, it's common for comments to seek harmony instead of either truth or conflict.

If there's going to be a small number of emojis, they should probably be very different colors, like red/yellow/blue/green.

Comment by steven0461 on steven0461's Shortform Feed · 2021-12-12T21:56:21.622Z · LW · GW

Are there online spaces that talk about the same stuff LW talks about (AI futurism, technical rationality, and so on), with reasonably high quality standards, but more conversational-oriented and less soapbox-oriented, and maybe with less respectability signaling? I often find myself wanting to talk about things discussed here but feeling overconstrained by things like knowing that comments are permanent and having to anticipate objections instead of taking them as they come.

Comment by steven0461 on Daniel Kokotajlo's Shortform · 2021-12-12T21:44:19.444Z · LW · GW


Comment by steven0461 on Considerations on interaction between AI and expected value of the future · 2021-12-08T23:32:01.913Z · LW · GW

I tend to want to split "value drift" into "change in the mapping from (possible beliefs about logical and empirical questions) to (implied values)" and "change in beliefs about logical and empirical questions", instead of lumping both into "change in values".

Comment by steven0461 on Considerations on interaction between AI and expected value of the future · 2021-12-07T03:45:45.280Z · LW · GW

This seems to be missing what I see as the strongest argument for "utopia": most of what we think of as "bad values" in humans comes from objective mistakes in reasoning about the world and about moral philosophy, rather than from a part of us that is orthogonal to such reasoning in a paperclip-maximizer-like way, and future reflection can be expected to correct those mistakes.

Comment by steven0461 on Transcript for Geoff Anders and Anna Salamon's Oct. 23 conversation · 2021-11-12T22:57:31.103Z · LW · GW

"Problematic dynamics happened at Leverage" and "Leverage influenced EA Summit/Global" don't imply "Problematic dynamics at Leverage influenced EA Summit/Global" if EA Summit/Global had their own filters against problematic influences. (If such filters failed, it should be possible to point out where.)

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-06T23:52:14.866Z · LW · GW

Your posts seem to be about what happens if you filter out considerations that don't go your way. Obviously, yes, that way you can get distortion without saying anything false. But the proposal here is to avoid certain topics and be fully honest about which topics are being avoided. This doesn't create even a single bit of distortion. A blank canvas is not a distorted map. People can get their maps elsewhere, as they already do on many subjects, and as they will keep having to do regardless, simply because some filtering is inevitable beneath the eye of Sauron. (Distortions caused by misestimation of filtering are going to exist whether the filter has 40% strength or 30% strength. The way to minimize them is to focus on estimating correctly. A 100% strength filter is actually relatively easy to correctly estimate. And having the appearance of a forthright debate creates perverse incentives for people to distort their beliefs so they can have something inoffensive to be forthright about.)

The people going after Steve Hsu almost entirely don't care whether LW hosts Bell Curve reviews. If adjusting allowable topic space gets us 1 util and causes 2 utils of damage distributed evenly across 99 Sargons and one Steve Hsu, that's only 0.02 Hsu utils lost, which seems like a good trade.

I don't have a lot of verbal energy and find the "competing grandstanding walls of text" style of discussion draining, and I don't think the arguments I'm making are actually landing for some reason, and I'm on the verge of tapping out. Generating and posting an IM chat log could be a lot more productive. But people all seem pretty set in their opinions, so it could just be a waste of energy.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-06T20:16:36.821Z · LW · GW

due to the mechanisms described in "Entangled Truths, Contagious Lies" and "Dark Side Epistemology"

I'm not advocating lying. I'm advocating locally preferring to avoid subjects that force people to either lie or alienate people into preferring lies, or both. In the possible world where The Bell Curve is mostly true, not talking about it on LessWrong will not create a trail of false claims that have to be rationalized. It will create a trail of no claims. LessWrongers might fill their opinion vacuum with false claims from elsewhere, or with true claims, but either way, this is no different from what they already do about lots of subjects, and does not compromise anyone's epistemic integrity.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T23:45:08.849Z · LW · GW

"Offensive things" isn't a category determined primarily by the interaction of LessWrong and people of the sneer. These groups exist in a wider society that they're signaling to. It sounds like your reasoning is "if we don't post about the Bell Curve, they'll just start taking offense to technological forecasting, and we'll be back where we started but with a more restricted topic space". But doing so would make the sneerers look stupid, because society, for better or worse, considers The Bell Curve to be offensive and does not consider technological forecasting to be offensive.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T23:06:13.343Z · LW · GW

You'd have to use a broad sense of "political" to make this true (maybe amounting to "controversial"). Nobody is advocating blanket avoidance of controversial opinions, only blanket avoidance of narrow-sense politics, and even then with a strong exception of "if you can make a case that it's genuinely important to the fate of humanity in the way that AI alignment is important to the fate of humanity, go ahead". At no point could anyone have used the proposed norms to prevent discussion of AI alignment.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T22:54:17.790Z · LW · GW

Another way this matters: Offense takers largely get their intuitions about "will taking offense achieve my goals" from experience in a wide variety of settings and not from LessWrong specifically. Yes, theoretically, the optimal strategy is for them to estimate "will taking offense specifically against LessWrong achieve my goals", but most actors simply aren't paying enough attention to form a target-by-target estimate. Viewing this as a simple game theory textbook problem might lead you to think that adjusting our behavior to avoid punishment would lead to an equal number of future threats of punishment against us and is therefore pointless, when actually it would instead lead to future threats of punishment against some other entity that we shouldn't care much about, like, I don't know, fricking Sargon of Akkad.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T22:44:40.693Z · LW · GW

I think simplifying all this to a game with one setting and two players with human psychologies obscures a lot of what's actually going on. If you look at people of the sneer, it's not at all clear that saying offensive things thwarts their goals. They're pretty happy to see offensive things being said, because it gives them opportunities to define themselves against the offensive things and look like vigilant guardians against evil. Being less offensive, while paying other costs to avoid having beliefs be distorted by political pressure (e.g. taking it elsewhere, taking pains to remember that politically pressured inferences aren't reliable), arguably de-energizes such people more than it emboldens them.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T22:22:43.017Z · LW · GW

My claim was:

if this model is partially true, then something more nuanced than an absolutist "don't give them an inch" approach is warranted

It's obvious to everyone in the discussion that the model is partially false and there's also a strategic component to people's emotions, so repeating this is not responsive.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T05:50:06.518Z · LW · GW

I think an important cause of our disagreement is you model the relevant actors as rational strategic consequentialists trying to prevent certain kinds of speech, whereas I think they're at least as much like a Godzilla that reflexively rages in pain and flattens some buildings whenever he's presented with an idea that's noxious to him. You can keep irritating Godzilla until he learns that flattening buildings doesn't help him achieve his goals, but he'll flatten buildings anyway because that's just the kind of monster he is, and in this way, you and Godzilla can create arbitrary amounts of destruction together. And (to some extent) it's not like someone constructed a reflexively-acting Godzilla so they could control your behavior, either, which would make it possible to deter that person from making future Godzillas. Godzillas seem (to some extent) to arise spontaneously out of the social dynamics of large numbers of people with imperfect procedures for deciding what they believe and care about. So it's not clear to me that there's an alternative to just accepting the existence of Godzilla and learning as best as you can to work around him in those cases where working around him is cheap, especially if you have a building that's unusually important to keep intact. All this is aside from considerations of mercy to Godzilla or respect for Godzilla's opinions.

If I make some substitutions in your comment to illustrate this view of censorious forces as reflexive instead of strategic, it goes like this:

The implied game is:

Step 1: The bull decides what is offensively red

Step 2: LW people decide what cloths to wave given this

Steven is proposing a policy for step 2 that doesn't wave anything that the bull has decided is offensively red. This gives the bull the ability to prevent arbitrary cloth-waving.

If the bull is offended by negotiating for more than $1 in the ultimatum game, Steven's proposed policy would avoid doing that, thereby yielding. (The money here is metaphorical, representing benefits LW people could get by waving cloths without being gored by the bull.)

I think "wave your cloths at home or in another field even if it's not as good" ends up looking clearly correct here, and if this model is partially true, then something more nuanced than an absolutist "don't give them an inch" approach is warranted.

edit: I should clarify that when I say Godzilla flattens buildings, I'm mostly not referring to personal harm to people with unpopular opinions, but to epistemic closure to whatever is associated with those people, which you can see in action every day on e.g. Twitter.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T04:48:56.218Z · LW · GW

standing up to all kinds of political entryism seems to me obviously desirable for its own sake

I agree it's desirable for its own sake, but meant to give an additional argument why even those people who don't agree it's desirable for its own sake should be on board with it.

if for some reason left-wing political entryism is fundamentally worse than right-wing political entryism then surely that makes it not necessarily hypocritical to take a stronger stand against the former than against the latter

Not necessarily objectively hypocritical, but hypocritical in the eyes of a lot of relevant "neutral" observers.

Comment by steven0461 on [Book Review] "The Bell Curve" by Charles Murray · 2021-11-05T04:42:13.719Z · LW · GW

"Stand up to X by not doing anything X would be offended by" is not what I proposed. I was temporarily defining "right wing" as "the political side that the left wing is offended by" so I could refer to posts like the OP as "right wing" without setting off a debate about how actually the OP thinks of it more as centrist that's irrelevant to the point I was making, which is that "don't make LessWrong either about left wing politics or about right wing politics" is a pretty easy to understand criterion and that invoking this criterion to keep LW from being about left wing politics requires also keeping LessWrong from being about right wing politics. Using such a criterion on a society-wide basis might cause people to try to redefine "1+1=2" as right wing politics or something, but I'm advocating using it locally, in a place where we can take our notion of what is political and what is not political as given from outside by common sense and by dynamics in wider society (and use it as a Schelling point boundary for practical purposes without imagining that it consistently tracks what is good and bad to talk about). By advocating keeping certain content off one particular website, I am not advocating being "maximally yielding in an ultimatum game", because the relevant game also takes place in a whole universe outside this website (containing your mind, your conversations with other people, and lots of other websites) that you're free to use to adjust your degree of yielding. Nor does "standing up to political entryism" even imply standing up to offensive conclusions reached naturally in the course of thinking about ideas sought out for their importance rather than their offensiveness or their symbolic value in culture war.