Structured Transparency: a framework for addressing use/mis-use trade-offs when sharing information 2024-04-11T18:35:44.824Z
LessWrong's (first) album: I Have Been A Good Bing 2024-04-01T07:33:45.242Z
How useful is "AI Control" as a framing on AI X-Risk? 2024-03-14T18:06:30.459Z
Open Thread Spring 2024 2024-03-11T19:17:23.833Z
Is a random box of gas predictable after 20 seconds? 2024-01-24T23:00:53.184Z
Will quantum randomness affect the 2028 election? 2024-01-24T22:54:30.800Z
Vote in the LessWrong review! (LW 2022 Review voting phase) 2024-01-17T07:22:17.921Z
AI Impacts 2023 Expert Survey on Progress in AI 2024-01-05T19:42:17.226Z
Originality vs. Correctness 2023-12-06T18:51:49.531Z
The LessWrong 2022 Review 2023-12-05T04:00:00.000Z
Open Thread – Winter 2023/2024 2023-12-04T22:59:49.957Z
Complex systems research as a field (and its relevance to AI Alignment) 2023-12-01T22:10:25.801Z
How useful is mechanistic interpretability? 2023-12-01T02:54:53.488Z
My techno-optimism [By Vitalik Buterin] 2023-11-27T23:53:35.859Z
"Epistemic range of motion" and LessWrong moderation 2023-11-27T21:58:40.834Z
Debate helps supervise human experts [Paper] 2023-11-17T05:25:17.030Z
How much to update on recent AI governance moves? 2023-11-16T23:46:01.601Z
AI Timelines 2023-11-10T05:28:24.841Z
How to (hopefully ethically) make money off of AGI 2023-11-06T23:35:16.476Z
Integrity in AI Governance and Advocacy 2023-11-03T19:52:33.180Z
What's up with "Responsible Scaling Policies"? 2023-10-29T04:17:07.839Z
Trying to understand John Wentworth's research agenda 2023-10-20T00:05:40.929Z
Trying to deconfuse some core AI x-risk problems 2023-10-17T18:36:56.189Z
How should TurnTrout handle his DeepMind equity situation? 2023-10-16T18:25:38.895Z
The Lighthaven Campus is open for bookings 2023-09-30T01:08:12.664Z
Navigating an ecosystem that might or might not be bad for the world 2023-09-15T23:58:00.389Z
Long-Term Future Fund Ask Us Anything (September 2023) 2023-08-31T00:28:13.953Z
Open Thread - August 2023 2023-08-09T03:52:55.729Z
Long-Term Future Fund: April 2023 grant recommendations 2023-08-02T07:54:49.083Z
Final Lightspeed Grants coworking/office hours before the application deadline 2023-07-05T06:03:37.649Z
Correctly Calibrated Trust 2023-06-24T19:48:05.702Z
My tentative best guess on how EAs and Rationalists sometimes turn crazy 2023-06-21T04:11:28.518Z
Lightcone Infrastructure/LessWrong is looking for funding 2023-06-14T04:45:53.425Z
Launching Lightspeed Grants (Apply by July 6th) 2023-06-07T02:53:29.227Z
Yoshua Bengio argues for tool-AI and to ban "executive-AI" 2023-05-09T00:13:08.719Z
Open & Welcome Thread – April 2023 2023-04-10T06:36:03.545Z
Shutting Down the Lightcone Offices 2023-03-14T22:47:51.539Z
Review AI Alignment posts to help figure out how to make a proper AI Alignment review 2023-01-10T00:19:23.503Z
Kurzgesagt – The Last Human (Youtube) 2022-06-29T03:28:44.213Z
Replacing Karma with Good Heart Tokens (Worth $1!) 2022-04-01T09:31:34.332Z
Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] 2021-11-03T18:22:58.879Z
The LessWrong Team is now Lightcone Infrastructure, come work with us! 2021-10-01T01:20:33.411Z
Welcome & FAQ! 2021-08-24T20:14:21.161Z
Berkeley, CA – ACX Meetups Everywhere 2021 2021-08-23T08:50:51.898Z
The Death of Behavioral Economics 2021-08-22T22:39:12.697Z
Open and Welcome Thread – August 2021 2021-08-15T05:59:05.270Z
Open and Welcome Thread – July 2021 2021-07-03T19:53:07.048Z
Open and Welcome Thread – June 2021 2021-06-06T02:20:22.421Z
Attributions, Karma and better discoverability for wiki/tag features 2021-06-02T23:47:03.604Z
Open and Welcome Thread - May 2021 2021-05-03T07:58:03.130Z


Comment by habryka (habryka4) on nikola's Shortform · 2024-04-16T16:03:00.899Z · LW · GW

Wouldn't the equivalent be more like burning a body of a dead person?

It's not like the AI would have a continuous stream of consciousness, and it's more that you are destroying the information necessary to run them. It seems to me that shutting off an AI is more similar to killing them.

Seems like the death analogy here is a bit spotty. I could see it going either way as a best fit.

Comment by habryka (habryka4) on Anthropic AI made the right call · 2024-04-15T20:57:05.339Z · LW · GW

I would take pretty strong bets that that isn't what happened based on having talked to more people about this. Happy to operationalize and then try to resolve it.

Comment by habryka (habryka4) on Anthropic AI made the right call · 2024-04-15T20:01:54.714Z · LW · GW

That seems concerning! Did you follow up with the leadership of your organization to understand to what degree they seem to have been making different (and plausibly contradictory) commitments to different interest groups? 

It seems like it's quite important to know what promises your organization has made to whom, if you are trying to assess whether you working there will positively or negatively affect how AI will go.

(Note, I talked with Evan about this in private some other times, so the above comment is more me bringing a private conversation into the public realm than me starting a whole conversation about this. I've already poked Evan privately asking him to please try to get better confirmation of the nature of the commitments made here, but he wasn't interested at the time, so I am making the same bid publicly.)

Comment by habryka (habryka4) on Anthropic AI made the right call · 2024-04-15T19:58:19.428Z · LW · GW

I also strongly expected them to violate this commitment, though my understanding is that various investors and early collaborators did believe they would keep this commitment. 

I think it's important to understand that Anthropic was founded before the recent post-ChatGPT hype/AI-interest-explosion. Similarly to how OpenAI's charter seemed plausible as something that OpenAI could adhere to for people early on, so did it seem possible that commercial pressures would not cause a full-throated arms race between all the top companies, with billions to trillions of dollars for the taking for whoever got to AGI first. I do agree that such an arms race made violating this commitment a relatively likely conclusion.

Comment by habryka (habryka4) on nikola's Shortform · 2024-04-15T18:19:03.829Z · LW · GW

Not just "some robots or nanomachines" but "enough robots or nanomachines to maintain existing chip fabs, and also the supply chains (e.g. for ultra-pure water and silicon) which feed into those chip fabs, or make its own high-performance computing hardware".

My guess is software performance will be enough to not really have to make many more chips until you are at a quite advanced tech level where making better chips is easy. But it's something one should actually think carefully about, and there is a bit of hope in that it would become a blocker, but it doesn't seem that likely to me.

Comment by habryka (habryka4) on Anthropic AI made the right call · 2024-04-15T05:17:09.146Z · LW · GW

I guess I'm more willing to treat Anthropic's marketing as not-representing-Anthropic. Shrug.

I feel sympathetic to this, but when I think of the mess of trying to hold an organization accountable when I literally can't take the public statements of the organization itself as evidence, then that feels kind of doomed to me. It feels like it would allow Anthropic to weasel itself out of almost any commitment.

Comment by habryka (habryka4) on Anthropic AI made the right call · 2024-04-15T04:54:23.094Z · LW · GW

Claude 3 Opus meaningfully advanced the frontier? Or slightly advanced it but Anthropic markets it like it was a substantial advance so they're being similarly low-integrity?

I updated somewhat over the following weeks that Opus had meaningfully advanced the frontier, but I don't know how much that is true for other people. 

It seems like Anthropic's marketing is in direct contradiction with the explicit commitment they made to many people, including Dustin, which seems to have quite consistently been the "meaningfully advance the frontier" line. I think it's less clear whether their actual capabilities contradict it, as opposed to their marketing statements. I think if you want to have any chance of enforcing commitments like this, the enforcement needs to happen at the latest when the organization publicly claims to have done something in direct contradiction to it, so I think the marketing statements matter a bunch here.

Anthropic has also continued to publish ads claiming that Claude 3 has meaningfully pushed the state of the art and is the smartest model on the market since the discussion around this happened, so it's not just a one-time oversight by their marketing department.

Separately, multiple Anthropic staffers seem to think themselves no longer bound by their previous commitment and expect that Anthropic will likely unambiguously advance the frontier if they get the chance.

Comment by habryka (habryka4) on The Best Tacit Knowledge Videos on Every Subject · 2024-04-15T03:45:44.680Z · LW · GW

Promoted to curated: The original "The Best Textbooks on Every Subject" post was among the most valuable that LessWrong has ever featured. I really like this extension of it into the realm of tacit knowledge videos, which does feel like a very valuable set of content that I haven't seen curated anywhere else on the internet.

Thank you very much for doing this! And I hope this post will see contributions for many months and years to come.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-04-15T03:43:05.587Z · LW · GW

Had a very aggressive crawler basically DDoS-ing us from a few dozen IPs for the last hour. Sorry for the slower server response times. Things should be fixed now.

Comment by habryka (habryka4) on Anthropic AI made the right call · 2024-04-15T03:34:39.571Z · LW · GW

Most of us agree with you that deploying Claude 3 was reasonable,

I at least didn't interpret this poll to mean that deploying it was reasonable. I think given past Anthropic commitments it was pretty unreasonable (violating your deployment commitments seems really quite bad, and is IMO one of the most central things that Anthropic should be judged on). It's just not really clear whether it directly increased risk. I would be quite sad if that poll result would be seen as something like "approval of whether Anthropic made the right call".

Comment by habryka (habryka4) on nikola's Shortform · 2024-04-15T01:56:19.092Z · LW · GW

Before then, if the AI wishes to actually survive, it needs to construct and control a robot/nanomachine population advanced enough to maintain its infrastructure.

As Gwern said, you don't really need to maintain all the infrastructure for that long, and doing it for a while seems quite doable without advanced robots or nanomachines. 

If one wanted to do a very prosaic estimate, you could do something like "how fast is AI software development progress accelerating when the AI can kill all the humans" and then see how many calendar months you need to actually maintain the compute infrastructure before the AI can obviously just build some robots or nanomachines. 

My best guess is that the AI will have some robots from which it could bootstrap substantially before it can kill all the humans. But even if it didn't, with algorithmic progress rates likely being at their very highest when the AI gets smart enough to kill everyone, it seems like you would at most need a few more doublings of compute-efficiency to get that capacity, which would be only a few weeks to months away then, where I think you won't really run into compute-infrastructure issues even if everyone is dead.

Of course, forecasting this kind of stuff is hard, but I do think "the AI needs to maintain infrastructure" tends to be pretty overstated. My guess is at any point where the AI could kill everyone, it would probably also not really have a problem of bootstrapping afterwards. 
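
The prosaic estimate described above can be sketched as a one-line calculation: the number of compute-efficiency doublings still needed, times the time each doubling takes. This is only an illustration under stated assumptions. The function name and both parameters are hypothetical, and the comment gives no concrete values for either input beyond "a few more doublings" and "a few weeks to months".

```python
def months_of_maintenance(doublings_needed: float, weeks_per_doubling: float) -> float:
    """Calendar months the existing compute infrastructure would need to keep running.

    Both inputs are hypothetical placeholders to be supplied by the forecaster;
    the comment above only says "a few more doublings" and "a few weeks to months".
    """
    if doublings_needed < 0 or weeks_per_doubling < 0:
        raise ValueError("inputs must be non-negative")
    total_weeks = doublings_needed * weeks_per_doubling
    weeks_per_month = 52 / 12  # average number of weeks in a calendar month
    return total_weeks / weeks_per_month
```

Plugging in a range of values for both inputs, rather than a single point estimate, is probably the more honest way to use this, since the comment itself stresses that forecasting here is hard.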

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-14T01:57:09.637Z · LW · GW

What's a service that works everywhere? I would have expected YouTube to do pretty well here. Happy to upload it wherever convenient.

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-12T03:47:57.374Z · LW · GW

By the way, the rate-limiting algorithm as I've understood it seems poor. It only takes one downvoted comment to get limited, so it doesn't matter if a user leaves one good comment and one poor comment, or if they write 99 good comments and one poor comment.

Automatic rate-limiting only uses the last 20 posts and comments, which can still be relatively harsh, but 99 good comments will definitely outweigh one poor comment.
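
A minimal sketch of what a windowed rule like this could look like, assuming the decision is based on net karma over the most recent items. Only the window size of 20 comes from the comment above. The threshold, the use of a plain sum, and all names are hypothetical and are not LessWrong's actual implementation.

```python
from typing import Sequence

WINDOW = 20          # from the comment: only the last 20 posts and comments count
NET_KARMA_FLOOR = 0  # hypothetical threshold, not LessWrong's real value


def is_rate_limited(karma_history: Sequence[int]) -> bool:
    """Return True if net karma over the most recent WINDOW items is below the floor.

    karma_history lists the karma of each post or comment, oldest first.
    Summing net karma is an assumption made for illustration only.
    """
    recent = karma_history[-WINDOW:]
    return sum(recent) < NET_KARMA_FLOOR


# 99 well-received comments followed by one poorly received one:
# the window holds 19 good items and 1 bad one, so the good ones outweigh it.
mostly_good = [3] * 99 + [-5]
print(is_rate_limited(mostly_good))
```

Under these assumptions a single downvoted comment only triggers the limit when there is little other recent activity to offset it, which matches the claim that many good comments outweigh one poor one.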

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-11T20:25:47.972Z · LW · GW

I played around yesterday with Udio for like half an hour, but couldn't get even the start of any usable song out of it. 

It seems to me like the sample rate and artifacts are much less bad in Udio than Suno, but it seems to mess up the lyrics much more, and seems a lot less clever about how to fit the music around the lyrics. But I also might have just gotten some bad samples, not sure. I was hoping to play around a bit more.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-11T15:12:47.428Z · LW · GW

Yeah, it's pretty decent. I don't know whether there is organic adoption, though one piece of evidence is that we are still getting a good number of listens a day (70% of the peak), though my guess is that's more evidence that people are continuing to listen and have added it to their favorites than that we are getting organic recommendations on the music platforms.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-11T15:10:39.574Z · LW · GW

Comment here and I'll consider them for the next album! Though be warned, there are a lot of considerations beyond just quality.

Comment by habryka (habryka4) on Thinking harder doesn’t work · 2024-04-10T20:02:10.550Z · LW · GW

(Which post was it that you liked?)

Comment by habryka (habryka4) on Toward a Broader Conception of Adverse Selection · 2024-04-09T20:56:02.698Z · LW · GW

Promoted to curated: Adverse selection seems like a really useful lens to throw at lots of different things in the world, and I can't currently think of another post on the internet that gets the concept across as well as this one.

I generally really like starting off a sequence like this using lots of concrete examples, instead of abstract definitions.

I do think there is something tricky about adverse selection in that it is the kind of thing that does often invite a kind of magical thinking or serve as a semantic stopsign for people trying to analyze a situation. People modeling you, or people modeling groups that you are part of, results in tricky and loopy situations, and I've often seen people arrive at confident wrong conclusions based on analysis in this space (though this post, mostly as a list of examples, doesn't fall into that error mode, and I find myself curious how future posts in the sequence might handle those cases).

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-07T02:32:43.097Z · LW · GW

De-facto I think people are pretty good about not downvoting contrarian takes (splitting up/downvotes from agree/disagree votes helped a lot in improving this).

But also, we do have a manual review step to catch the cases where people get downvoted because of tribal dynamics and object-level disagreements (that's where at least a chunk of the 40% where we didn't apply the rule above came from).

Comment by habryka (habryka4) on Vanessa Kosoy's Shortform · 2024-04-06T18:45:17.042Z · LW · GW

"build superintelligence and use it to take unilateral world-scale actions in a manner inconsistent with existing law and order"

The whole point of the pivotal act framing is that you are looking for something to do that you can do with the least advanced AI system. This means it's definitely not a superintelligence. If you have an aligned superintelligence, I think that framing doesn't really make sense. The problem the framing is trying to grapple with is that we want to somehow use AI to solve AI risk, and for that we want to use the very dumbest AI that we can use for a successful plan.

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-06T18:38:31.928Z · LW · GW

FWIW, my sense is that the rate-limit system triggering was a mistake on your account, and we tweaked the numbers to make that no longer happen. Still sucks that you got rate-limited for a while, but the numbers are quite different now, and you almost certainly would not have been caught in the manual review that is part of these rate limits.

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-06T18:29:23.115Z · LW · GW

We haven't written up concrete takeaways. My sense is the effect was relatively minor, mostly because we set quite high rate limits, but it's quite hard to disentangle from lots of other stuff going on. 

This was an experiment in setting stronger rate-limits using more admin-supervision. 

I do feel pretty solid in using rate-limiting as the default tool instead of temporary bans as I think most other forums use. I've definitely felt things escalate much less unhealthily and have observed a large effect size in how OK it is to reverse a rate-limit (whereas if I ban someone it tends to escalate quite quickly into a very sharp disagreement). It does also seem to reduce chilling effects a lot (as I think posts like this demonstrate).

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-06T08:04:45.058Z · LW · GW

I think by far the biggest piece of advice I can give is "just press the generate button 3 times every time you finish a prompt". The second biggest is "when you listen to the beginning of a song and it isn't good, just skip it. You can continue generating from any point in a song, but you cannot take the middle or the end of any song, so if the beginning doesn't work, you won't be able to change it". 

Annotations in lyrics are very helpful. Most of our songs have things like "[instrument solo]" and various instructions like that written into the lyrics. They don't get reliably observed, but they work well enough to steer the song.

Beyond that, it really depends on the song. I have a lot of detailed taste about what genres work well and which ones don't, but that's harder to quickly summarize.

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-06T07:59:16.713Z · LW · GW

I am confused. The quotes I sent are quotes from DMs we sent to Gerald. Here they are again just for posterity: 

You've been commenting fairly frequently, and my subjective impression as well as voting patterns suggest most people aren't finding your comments sufficiently helpful.


To conclude, the rate limit is your warning. Currently I feel your typical comments (even the not-downvoted ones) aren't amazing, and now that we're prioritizing raising standards due to the dramatic rise in new users, we're also getting tougher on contributions from established users that don't feel like they're meeting the bar either.

I think we have more, but they are in DMs with just Raemon in them. The above IMO clearly communicates "your current contributions are not breaking even".

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-06T02:20:42.889Z · LW · GW

We have really given you a lot of feedback and have communicated that we don't think you are breaking even. Here are some messages we sent to you: 

April 7th 2023

You've been commenting fairly frequently, and my subjective impression as well as voting patterns suggest most people aren't finding your comments sufficiently helpful.

And from Ruby: 

In the "wrong" category, some of your criticisms of the Time piece post seemed to be failing to operate probabilistically, which is a fundamental basic I expect from LW users. "May not" is not sufficient argument. You need to talk about probabilities and why yours are different from others. "It's irrational to worry about X because it might not happen" does not cut it. That's just something that stuck out to me.

In my mind, the 1 contribution/day is better than a ban because it gives you a chance to improve your contributions and become unrestricted.

Regarding your near-1000 karma, this is not a great sign given you have nearly 900 comments, meaning your average comment is not getting much positive engagement. Unfortunately karma is an imperfect measure and captures the combination of "is good" and "engages a lot" and engaging a lot alone isn't something we reward.

To conclude, the rate limit is your warning. Currently I feel your typical comments (even the not-downvoted ones) aren't amazing, and now that we're prioritizing raising standards due to the dramatic rise in new users, we're also getting tougher on contributions from established users that don't feel like they're meeting the bar either.

Separately, here are some quotes from our about page and new user guide: 

This is a hard section to write. The new users who need to read it least are more likely to spend time worrying about the below, and those who need it most are likely to ignore it. Don't stress too hard. If you submit it and we don't like it, we'll give you some feedback.

A lot of the below is written for the people who aren't putting in much effort at all, so we can at least say "hey, we did give you a heads up in multiple places".

There are a number of dimensions upon which content submissions may be strong or weak. Strength in one place can compensate for weakness in another, but overall the moderators assess each first post/comment from new users for the following. If the first submission is lacking, it might be rejected and you'll get feedback on why.

Your first post or comment is more likely to be approved by moderators (and upvoted by general site users) if you:

Demonstrate understanding of LessWrong rationality fundamentals. Or at least don't do anything contravened by them. These are the kinds of things covered in The Sequences such as probabilistic reasoning, proper use of beliefs, being curious about where you might be wrong, avoiding arguing over definitions, etc. See the Foundational Reading section above.

Write a clear introduction. If your first submission is lengthy, i.e. a long post, it's more likely to get quickly approved if the site moderators can quickly understand what you're trying to say rather than having to delve deep into your post to figure it out. Once you're established on the site and people know that you have good things to say, you can pull off having a "literary" opening that doesn't start with the main point.

Address existing arguments on the topic (if applicable). Many topics have been discussed at length already on LessWrong, or have an answer strongly implied by core content on the site, e.g. from the Sequences (which has rather large relevance to AI questions). Your submission is more likely to be accepted if it's clear you're aware of prior relevant discussion and are building upon it. It's not a big deal if you weren't aware, there's just a chance the moderator team will reject your submission and point you to relevant material.

This doesn't mean that you can't question positions commonly held on LessWrong, just that it's a lot more productive for everyone involved if you're able to respond to or build upon the existing arguments, e.g. showing why they're wrong.

Address the LessWrong audience. A recent trend is more and more people crossposting from their personal blogs, e.g. their Substack or Medium, to LessWrong. There's nothing inherently wrong with that (we welcome good content!) but many of these posts neither strike us as particularly interesting or insightful, nor demonstrate an interest in LessWrong's culture/norms or audience (as revealed by a very different style and not really responding to anyone on site).

It's good (though not absolutely necessary) when a post is written for the LessWrong audience and shows that by referencing other discussions on LessWrong (links to other posts are good). 

Aim for a high standard if you're contributing on the topic of AI. As AI becomes higher and higher profile in the world, many more people are flowing to LessWrong because we have discussion of it. In order to not lose what makes our site uniquely capable of making good intellectual progress, we have particularly high standards for new users showing up to talk about AI. If we don't think your AI-related contribution is particularly valuable and it's not clear you've tried to understand the site's culture or values, then it's possible we'll reject it.

And on the topic of positive goals for LessWrong and what we are trying to do here: 

On LessWrong we attempt (though don't always succeed) to apply the rationality lessons we've accumulated to any topic that interests us, and especially topics that seem important, like how to make the world a better place. We don't just care about truth in the abstract, but care about having true beliefs about things we care about so that we can make better and more successful decisions.

Right now, AI seems like one of the most (or the most) important topics for humanity. It involves many tricky questions, high stakes, and uncertainty in an unprecedented situation. On LessWrong, many users are attempting to apply their best thinking to ensure that the advent of increasingly powerful AI goes well for humanity.[5]

It's not amazingly concrete, but I do think it's clear we are trying to do something specific here. We are here to develop an art of rationality and cause good outcomes on issues like AI and other world-scale outcomes, and we'll moderate to achieve that.

Comment by habryka (habryka4) on What's with all the bans recently? · 2024-04-05T23:30:34.645Z · LW · GW

Thanks for making this post! 

One of the reasons why I like rate-limits instead of bans is that it allows people to complain about the rate-limiting and to participate in discussion on their own posts (so seeing a harsh rate-limit of something like "1 comment per 3 days" is not equivalent to a general ban from LessWrong, but should be more interpreted as "please comment primarily on your own posts", though of course it shares many important properties of a ban).

Things that seem most important to bring up in terms of moderation philosophy: 

Moderation on LessWrong does not depend on effort

Another thing I've noticed is that almost all the users are trying. They are trying to use rationality, trying to understand what's been written here, trying to apply Bayes' rule or understand AI. Even some of the users with negative karma are trying, just having more difficulty.

Just because someone is genuinely trying to contribute to LessWrong, does not mean LessWrong is a good place for them. LessWrong has a particular culture, with particular standards and particular interests, and I think many people, even if they are genuinely trying, don't fit well within that culture and those standards. 

In making rate-limiting decisions like this I don't pay much attention to whether the user in question is "genuinely trying" to contribute to LW. I am mostly just evaluating the effects I see their actions having on the quality of the discussions happening on the site, and the quality of the ideas they are contributing.

Motivation and goals are of course a relevant component to model, but that mostly pushes in the opposite direction, in that if I have someone who seems to be making great contributions, and I learn they aren't even trying, then that makes me more excited, since there is upside if they do become more motivated in the future.

Signal to Noise ratio is important

Thomas and Elizabeth pointed this out already, but just because someone's comments don't seem actively bad, doesn't mean I don't want to limit their ability to contribute. We do a lot of things on LW to improve the signal to noise ratio of content on the site, and one of those things is to reduce the amount of noise, even if the mean of what we remove looks not actively harmful. 

We of course also do other things than to remove some of the lower signal content to improve the signal to noise ratio. Voting does a lot, how we sort the frontpage does a lot, subscriptions and notification systems do a lot. But rate-limiting is also a tool I use for the same purpose.

Old users are owed explanations, new users are (mostly) not

I think if you've been around for a while on LessWrong, and I decide to rate-limit you, then I think it makes sense for me to make some time to argue with you about that, and give you the opportunity to convince me that I am wrong. But if you are new, and haven't invested a lot in the site, then I think I owe you relatively little. 

I think in doing the above rate-limits, we did not do enough to give established users the affordance to push back and argue with us about them. I do think most of these users are relatively recent or are users we've been very straightforward with since shortly after they started commenting that we don't think they are breaking even on their contributions to the site (like the OP Gerald Monroe, with whom we had 3 separate conversations over the past few months), and for those I don't think we owe them much of an explanation. LessWrong is a walled garden. 

You do not by default have the right to be here, and I don't want to, and cannot, accept the burden of explaining to everyone who wants to be here but who I don't want here, why I am making my decisions. As such a moderation principle that we've been aspiring to for quite a while is to let new users know as early as possible if we think them being on the site is unlikely to work out, so that if you have been around for a while you can feel stable, and also so that you don't invest in something that will end up being taken away from you.

Feedback helps a bit, especially if you are young, but usually doesn't

Maybe there are other people who are much better at giving feedback and helping people grow as commenters, but my personal experience is that giving users feedback, especially the second or third time, rarely tends to substantially improve things. 

I think this sucks. I would much rather be in a world where the usual reasons why I think someone isn't positively contributing to LessWrong were of the type that a short conversation could clear up and fix, but it alas does not appear so, and after having spent many hundreds of hours over the years giving people individualized feedback, I don't really think "give people specific and detailed feedback" is a viable moderation strategy, at least more than once or twice per user. I recognize that this can feel unfair on the receiving end, and I also feel sad about it.

I do think the one exception here is if people are young or are non-native English speakers. Do let me know if you are in your teens or you are a non-native English speaker who is still learning the language. People do really get a lot better at communication between the ages of 14-22 and people's English does get substantially better over time, and this helps with all kinds of communication issues.

We consider legibility, but it's only a relatively small input into our moderation decisions

It is valuable and a precious public good to make it easy to know which actions you take will cause you to end up being removed from a space. However, that legibility also comes at great cost, especially in social contexts. Every clear and bright-line rule you outline will have people butting right up against it, and de-facto, in my experience, moderation of social spaces like LessWrong is not the kind of thing you can do while being legible in the way that for example modern courts aim to be legible.

As such, we don't have laws. If anything, we have something like case law, which gets established as individual moderation disputes arise and which we then use as guidelines for future decisions. But a huge fraction of our moderation decisions are also downstream of complicated models we formed about what kind of conversations and interactions work on LessWrong, and what role we want LessWrong to play in the broader world, and those shift and change as new evidence comes in and the world changes.

I do ultimately still try pretty hard to give people guidelines and to draw lines that help people feel secure in their relationship to LessWrong, and I care a lot about this, but at the end of the day I will still make many from-the-outside-arbitrary-seeming decisions in order to keep LessWrong the precious walled garden that it is.

I try really hard to not build an ideological echo chamber

When making moderation decisions, it's always at the top of my mind whether I am tempted to make a decision one way or another because the person disagrees with me on some object-level issue. I try pretty hard to not have that affect my decisions, and as a result have what feels to me a subjectively substantially higher standard for rate-limiting or banning people who disagree with me than for people who agree with me. I think this is reflected in the decisions above.

I do feel comfortable judging people on the methodologies and abstract principles that they seem to use to arrive at their conclusions. LessWrong has a specific epistemology, and I care about protecting that. If you...

  • primarily try to argue from authority, 
  • don't like speaking in probabilistic terms, 
  • aren't comfortable holding multiple conflicting models in your head at the same time, 
  • or are averse to breaking things down into mechanistic and reductionist terms, 

then LW is probably not for you, and I feel fine with that. I feel comfortable reducing the visibility or volume of content on the site that is in conflict with these epistemological principles (of course this list isn't exhaustive; in general the LW sequences are the best pointer towards the epistemological foundations of the site).

If you see me or other LW moderators fail to judge people on epistemological principles, and instead see us directly rate-limiting or banning users on the basis of object-level opinions that, even if they seem wrong, seem to have been arrived at via relatively sane principles, then I do really think you should complain and push back at us. I see my mandate as head of LW as extending only to enforcing what seems to me the shared epistemological foundation of LW, and not to enforcing my own object-level beliefs on the participants of this site.

Now some more comments on the object-level: 

I overall feel good about rate-limiting everyone on the above list. I think it will probably make the conversations on the site go better and make more people contribute to the site. 

Us doing more extensive rate-limiting is an experiment, and we will see how it goes. As kave said in the other response to this post, the rule that suggested these specific rate-limits does not seem like it has an amazing track record, though I currently endorse it as something that calls things to my attention (among many other heuristics).

Also, if anyone reading this is worried about being rate-limited or banned in the future, feel free to reach out to me or other moderators on Intercom. I am generally happy to give people direct and frank feedback about their contributions to the site, as well as how likely I am to take future moderator actions. Uncertainty is costly, and I think it's worth a lot of my time to help people understand to what degree investing in LessWrong makes sense for them. 

Comment by habryka (habryka4) on Partial value takeover without world takeover · 2024-04-05T20:11:23.715Z · LW · GW

The environment in which digital minds thrive seems very different from the environment in which humans thrive. I don't see a way to convert the mass of the earth into computronium without killing all the humans, without doing a lot more economic work than the humans are likely capable of producing.

Comment by habryka (habryka4) on Open Thread Spring 2024 · 2024-04-05T18:50:22.897Z · LW · GW

I was unsure whether people would prefer that, and decided yesterday to instead cut it, but IDK, I do like it. I might clean up the code and find some way to re-activate it on the site.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-04T17:06:56.958Z · LW · GW

Yeah, should be fixed within the next few days. 

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-03T23:37:11.723Z · LW · GW

Here is the Suno playlist which I think has all the styles and lyrics and prompts: 

Beware though, in total I think we made around 3000 - 4000 song-generations to get the 15 that we felt happy about here. My guess is total effort per song was still somewhere in the 5-10 hours range or so, if you include all the dead ends and things that never worked out.

Comment by habryka (habryka4) on Richard Ngo's Shortform · 2024-04-03T20:11:02.881Z · LW · GW

Strong agree, also I spoiler-texted it, hope you don't mind.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-04-03T20:00:43.414Z · LW · GW


Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-03T17:35:46.634Z · LW · GW

We don't manage the Youtube videos, they are managed by Distrokid. I am thinking about uploading our own videos today, but that might be bad for algorithm-reasons. 

I will see whether I can add lyrics via Distrokid today. I think they support adding lyrics, but I am not sure how they will show up on Youtube.

Comment by habryka (habryka4) on Open Thread Spring 2024 · 2024-04-03T16:28:07.162Z · LW · GW

Where else would it go? We need a minimum level of saliency to get accurate markets, and I care about the signal from the markets a good amount.

Comment by habryka (habryka4) on How Often Does ¬Correlation ⇏ ¬Causation? · 2024-04-03T07:21:24.063Z · LW · GW

Cool, seems good. Just wasn't fully clear to me from the framing.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-03T06:37:51.540Z · LW · GW

It's probably not intended, but I always imagine that in "We do not wish to advance", first the singer whispers sweet nothings to the alignment community, then the shareholder meeting starts and so: glorious-vibed music: "OPUS!!!" haha

That was indeed the intended effect!

Comment by habryka (habryka4) on NickH's Shortform · 2024-04-03T05:02:44.946Z · LW · GW

This seems like a relatively standard argument, but I also struggle a bit to understand why this is a problem. If the AI is aligned it will indeed try to spread through the universe as quickly as possible, eliminating all competition, but if it shares our values, that would be good, not bad (and if we value aliens, which I think I do, then we would presumably still somehow trade with them afterwards from a position of security and stability).

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-04-03T05:00:39.180Z · LW · GW

And finally, I am freed from this curse.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-03T00:17:26.195Z · LW · GW

Now live on Apple Music! 

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-03T00:16:49.717Z · LW · GW

It is on Youtube Music! 

Comment by habryka (habryka4) on How Often Does ¬Correlation ⇏ ¬Causation? · 2024-04-02T20:39:13.997Z · LW · GW

Did you want this to be a question? It seems more like a post than a question. Happy to convert it just to a normal post if you want.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-02T16:45:17.868Z · LW · GW

Click on the "Listen Now" button on the frontpage banner and the audio player should re-appear.

Comment by habryka (habryka4) on metachirality's Shortform · 2024-04-02T05:21:36.534Z · LW · GW

I don't know whether metachirality was thinking of a setting for authors or for commenters. What makes you confident he was talking about the author version?

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-02T05:06:57.127Z · LW · GW

And we're live!

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-02T03:14:18.067Z · LW · GW

Should go up later tonight! We'll see how it goes. 

You can also pre-save the album for Spotify here, which I think will cause you to be notified as soon as the album goes live: 

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-02T00:28:32.649Z · LW · GW

No huge licensing issues, I think. My guess is these should go live on Spotify within the next few days or so, we are currently waiting on their review to complete.

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-04-01T20:44:59.925Z · LW · GW

You can just get people's userIds via the API, so it's nothing private. 

Comment by habryka (habryka4) on Habryka's Shortform Feed · 2024-04-01T20:06:23.949Z · LW · GW

Welp, I guess my life is comic sans today. The EA Forum snuck some code into our deployment bundle for my account in particular, lol:

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-01T20:04:22.097Z · LW · GW

We've submitted them to Spotify! We are currently waiting on them getting through review.

Feel free to download them and upload them yourself for now.

Comment by habryka (habryka4) on LessWrong's (first) album: I Have Been A Good Bing · 2024-04-01T14:31:48.614Z · LW · GW

Well, as I said, this is all thanks to Agendra and their band. I'll ask her about how she did it and maybe she'll give me more details, though I would be surprised if she responds before tomorrow.