Posts

OODA your OODA Loop 2024-10-11T00:50:48.119Z
Scaffolding for "Noticing Metacognition" 2024-10-09T17:54:13.657Z
"Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" 2024-09-28T23:38:25.512Z
2024 Petrov Day Retrospective 2024-09-28T21:30:14.952Z
[Completed] The 2024 Petrov Day Scenario 2024-09-26T08:08:32.495Z
What are the best arguments for/against AIs being "slightly 'nice'"? 2024-09-24T02:00:19.605Z
Struggling like a Shadowmoth 2024-09-24T00:47:05.030Z
Interested in Cognitive Bootcamp? 2024-09-19T22:12:13.348Z
Skills from a year of Purposeful Rationality Practice 2024-09-18T02:05:58.726Z
What is SB 1047 *for*? 2024-09-05T17:39:39.871Z
Forecasting One-Shot Games 2024-08-31T23:10:05.475Z
LessWrong email subscriptions? 2024-08-27T21:59:56.855Z
Please stop using mediocre AI art in your posts 2024-08-25T00:13:52.890Z
Would you benefit from, or object to, a page with LW users' reacts? 2024-08-20T16:35:47.568Z
Optimistic Assumptions, Longterm Planning, and "Cope" 2024-07-17T22:14:24.090Z
Fluent, Cruxy Predictions 2024-07-10T18:00:06.424Z
80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly) 2024-07-03T20:34:50.741Z
What percent of the sun would a Dyson Sphere cover? 2024-07-03T17:27:50.826Z
What distinguishes "early", "mid" and "end" games? 2024-06-21T17:41:30.816Z
"Metastrategic Brainstorming", a core building-block skill 2024-06-11T04:27:52.488Z
Can we build a better Public Doublecrux? 2024-05-11T19:21:53.326Z
some thoughts on LessOnline 2024-05-08T23:17:41.372Z
Prompts for Big-Picture Planning 2024-04-13T03:04:24.523Z
"Fractal Strategy" workshop report 2024-04-06T21:26:53.263Z
One-shot strategy games? 2024-03-11T00:19:20.480Z
Rationality Research Report: Towards 10x OODA Looping? 2024-02-24T21:06:38.703Z
Exercise: Planmaking, Surprise Anticipation, and "Baba is You" 2024-02-24T20:33:49.574Z
Things I've Grieved 2024-02-18T19:32:47.169Z
CFAR Takeaways: Andrew Critch 2024-02-14T01:37:03.931Z
Skills I'd like my collaborators to have 2024-02-09T08:20:37.686Z
"Does your paradigm beget new, good, paradigms?" 2024-01-25T18:23:15.497Z
Universal Love Integration Test: Hitler 2024-01-10T23:55:35.526Z
2022 (and All Time) Posts by Pingback Count 2023-12-16T21:17:00.572Z
Raemon's Deliberate (“Purposeful?”) Practice Club 2023-11-14T18:24:19.335Z
Hiring: Lighthaven Events & Venue Lead 2023-10-13T21:02:33.212Z
"The Heart of Gaming is the Power Fantasy", and Cohabitive Games 2023-10-08T21:02:33.526Z
Related Discussion from Thomas Kwa's MIRI Research Experience 2023-10-07T06:25:00.994Z
Thomas Kwa's MIRI research experience 2023-10-02T16:42:37.886Z
Feedback-loops, Deliberate Practice, and Transfer Learning 2023-09-07T01:57:33.066Z
Open Thread – Autumn 2023 2023-09-03T22:54:42.259Z
The God of Humanity, and the God of the Robot Utilitarians 2023-08-24T08:27:57.396Z
Book Launch: "The Carving of Reality," Best of LessWrong vol. III 2023-08-16T23:52:12.518Z
Feedbackloop-first Rationality 2023-08-07T17:58:56.349Z
Private notes on LW? 2023-08-04T17:35:37.917Z
Exercise: Solve "Thinking Physics" 2023-08-01T00:44:48.975Z
Rationality !== Winning 2023-07-24T02:53:59.764Z
Announcement: AI Narrations Available for All New LessWrong Posts 2023-07-20T22:17:33.454Z
What are the best non-LW places to read on alignment progress? 2023-07-07T00:57:21.417Z
My "2.9 trauma limit" 2023-07-01T19:32:14.805Z
Automatic Rate Limiting on LessWrong 2023-06-23T20:19:41.049Z

Comments

Comment by Raemon on Mark Xu's Shortform · 2024-10-14T17:29:51.423Z · LW · GW

It might be "fine" to do research at GDM (depending on how free you are to actually pursue good research directions, or how good a mentor you have). But, part of the schema in Mark's post is "where should one go for actively good second-order effects?".

Comment by Raemon on Struggling like a Shadowmoth · 2024-10-14T16:48:28.902Z · LW · GW

A thing I am interested in but can't tell from this comment is whether, as that kid, reading this post would have been helpful or harmful (I'd guess harmful, but not overwhelmingly)

Comment by Raemon on Open Thread Fall 2024 · 2024-10-14T04:02:43.669Z · LW · GW

The oldest posts were from before we had nested comments, so the comments there need to be in chronological order to make sense of the conversation.

Comment by Raemon on Overview of strong human intelligence amplification methods · 2024-10-13T22:51:13.980Z · LW · GW

FYI I do think the downside of "people may anchor off the numbers" is reasonable to weigh in the calculus of epistemic-community-norm-setting.

I would frame the question: "is the downside of people anchoring off potentially-very-off-base numbers worse than the upside of having intuitions somewhat more quantified, with more gears exposed?". I can imagine that question resolving as "actually yeah it's net negative", but, if you're treating the upside as "zero" I think you're missing some important stuff.

Comment by Raemon on sarahconstantin's Shortform · 2024-10-13T22:23:38.795Z · LW · GW

Ah yeah that’s a much more specific takeaway than I’d been imagining.

Comment by Raemon on Overview of strong human intelligence amplification methods · 2024-10-13T18:44:40.665Z · LW · GW

First, reiterating, the most important bit here is the schema, and drawing attention to this as an important area of further work.

Second, I think calling it "baseless speculation" is just wrong. Given that you're jumping to a kinda pejorative framing, it looks like your mind is kinda made up and I don't feel like arguing with you more. I don't think you actually read the Scott article in a way that was really listening to it and considering the implications.

But, since I think the underlying question of "what is LessWrong curated for" is nuanced and not clearly spelled out, I'll go spell that out for the benefit of everyone just tuning in.

Model 1: LessWrong as "full intellectual pipeline, from 'research office watercooler' to 'published'"

The purpose of LW curated is not to be a peer reviewed journal, and the purpose of LW is not to have quite the same standards as published academic work. Instead, I think of LW as tackling "the problem that academia is solving" through a somewhat different lens, which includes many of the same pieces but organizes them differently.

What you see in a finished, published journal article is the very end of a process, and it's not where most of the generativity happens. Most progress is happening in conversations around watercoolers at work, slack channels, conference chit-chat, etc.

LW curation is not "published peer review." The LessWrong Review aspires more to be that (I also think The Review fails at achieving all my goals with "the good parts of peer review," although it achieves other goals, and I have thoughts on how to improve it on that axis).

But the bar for curated is something like "we've been talking about this around the watercooler for weeks, the people involved in the overall conversation have found this a useful concept, and they are probably going to continue further research that builds on this and eventually you will see some more concrete output."

In this case, the conversation has already been ongoing awhile, with posts like Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible (another curated post, which I think is more "rigorous" in the classical sense). 

I don't know if there's a good existing reference post for "here in detail is the motivation for why we want to do human intelligence enhancement and make it a major priority." Tsvi sort of briefly discusses that here but mostly focuses on "where might we want to focus, given this goal."

Model 2: "Review" is an ongoing process.

One way you can do science is to do all of the important work in private, and then publish at the end. That is basically just not how LW is arranged. The whole idea here is to move the watercooler to the public area, and handle the part where "ideas we talk about at the watercooler are imprecise and maybe wrong" with continuous comment-driven review and improving our conversational epistemics.

I do think the bar for curated is "it's been at least a few days, the arguments in the post make sense to me (the curator), and nobody has raised major red flags about the overall thrust of the post." (I think this post meets that bar.)

I want people to argue about both the fiddly details of the post and the overall frame of the post. The way you argue that is by making specific claims about why the post's details are wrong, or incomplete, or why the post's framing is pointed in the wrong direction.

The fact that this post's framing seems important is more reason to curate it, given that we haven't found major first-pass flaws, and I want more opportunity for people to discuss major flaws.

Saying "this post is vague and it's made up numbers aren't very precise" isn't adding anything to the conversation (except for providing some scaffold for a meta-discussion on LW site philosophy, which is maybe useful to do periodically since it's not obvious at a glance)

Revisiting the guesswork / "baseless speculation" bit

If a group of researchers have a vein of research they have been discussing at the watercooler, and it has survived a few rounds of discussion and internal criticism, and it'll be awhile before a major legible rigorous output is published:

I absolutely want those researchers' intuitions and best guesses about which bits are important. Those researchers have some expertise and worldmodels. They could spend another 10-100 hours articulating those intuitions with more precision and backing them up with more evidence. Sometimes it's correct to do that. But if I want other researchers to be able to pick up the work and run with it, I don't want them bottlenecked on the first researchers privately iterating for another 10-100 hours before sharing it.

I don't want us to overanchor on those initial intuitions and best guesses. And if you don't trust those researchers' intuitions, I want you to have an easy time throwing them out and thinking about them from scratch.

Comment by Raemon on My theory of change for working in AI healthtech · 2024-10-13T18:04:34.895Z · LW · GW

I think Critch isn’t imagining ‘AI run’ as a distinction, per se, but that there are industries whose outputs mostly benefit humans, and industries that are ‘dual use’. Human healthcare might be run by AIs eventually but is still pointed at human-centric goals.

Comment by Raemon on Overview of strong human intelligence amplification methods · 2024-10-13T17:38:37.392Z · LW · GW

Why do you think the table is the most important thing in the article?

A different thing Tsvi could have done was say “here’s my best guess of which of these are most important, and my reasoning why”, but this would have been essentially the same thing as the table + surrounding essay, just with somewhat less fidelity about what his guesses were for the ranking.

Meanwhile I think the most important thing was laying out all the different potential areas of investigation, which I can now reason about on my own.

Comment by Raemon on Overview of strong human intelligence amplification methods · 2024-10-13T03:14:47.968Z · LW · GW

The point of made up numbers is that they are a helpful tool for teasing out some implicit information from your intuitions, which is often better than not doing that at all. But they are useful in a pretty different way from numbers-you-empirically-got-from-somewhere, and thus it's important that they be clearly labeled as made up numbers that Tsvi made up.

See: If it's worth doing, it's worth doing with Made Up Statistics

During this particular tutorial, Julia tried to explain Bayes’ Theorem to some, er, rationality virgins. I record a heavily-edited-to-avoid-recognizable-details memory of the conversation below:

Julia: So let’s try an example. Suppose there’s a five percent chance per month your computer breaks down. In that case…
Student: Whoa. Hold on here. That’s not the chance my computer will break down.
Julia: No? Well, what do you think the chance is?
Student: Who knows? It might happen, or it might not.
Julia: Right, but can you turn that into a number?
Student: No. I have no idea whether my computer will break. I’d be making the number up.
Julia: Well, in a sense, yes. But you’d be communicating some information. A 1% chance your computer will break down is very different from a 99% chance.
Student: I don’t know the future. Why do you want me to pretend I do?
Julia: (who is heroically nice and patient) Okay, let’s back up. Suppose you buy a sandwich. Is the sandwich probably poisoned, or probably not poisoned?
Student: Exactly which sandwich are we talking about here?

In the context of a lesson on probability, this is a problem I think most people would be able to avoid. But the student’s attitude, the one that rejects hokey quantification of things we don’t actually know how to quantify, is a pretty common one. And it informs a lot of the objections to utilitarianism – the problem of quantifying exactly how bad North Korea is shares some of the pitfalls of quantifying exactly how likely your computer is to break (for example, “we are kind of making this number up” is a pitfall).

The explanation that Julia and I tried to give the other student was that imperfect information still beats zero information. Even if the number “five percent” was made up (suppose that this is a new kind of computer being used in a new way that cannot be easily compared to longevity data for previous computers) it encodes our knowledge that computers are unlikely to break in any given month. Even if we are wrong by a very large amount (let’s say we’re off by a factor of four and the real number is 20%), if the insight we encoded into the number is sane we’re still doing better than giving no information at all (maybe model this as a random number generator which chooses anything from 0 – 100?)
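
A minimal sketch of the quoted "random number generator" comparison, using only the numbers from Scott's example (a made-up 5% guess, an actual rate of 20%, and a uniform 0–100 guess standing in for zero information):

```python
import random

# Illustrative sketch (not from either post): compare the made-up guess
# against the "who knows?" baseline from the quote, where zero information
# is modeled as a uniform random guess between 0 and 100.
true_rate = 20       # the real monthly breakdown rate, per the example
made_up_guess = 5    # the made-up number, off by a factor of four

random.seed(0)
uniform_guesses = [random.uniform(0, 100) for _ in range(100_000)]
avg_uniform_error = sum(abs(g - true_rate) for g in uniform_guesses) / len(uniform_guesses)

print(f"error of the made-up 5% guess: {abs(made_up_guess - true_rate)}")  # 15 points
print(f"average error of a uniform 0-100 guess: {avg_uniform_error:.1f}")  # ~34 points
```

Even when the made-up number is wrong by a factor of four, it's roughly twice as close to the truth, on average, as the zero-information guess.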

Comment by Raemon on Overview of strong human intelligence amplification methods · 2024-10-12T23:41:22.522Z · LW · GW

I think the terror reaction is honestly pretty reasonable. ([edit: Not, like, necessarily meaning one shouldn't pursue this sort of direction on balance. I think the risks of doing this badly are real and I think the risks of not doing anything are also quite real and probably great for a variety of reasons])

One reason I nonetheless think this is very important to pursue is that we're probably going to end up with superintelligent AI this century, and it's going to be dramatically more alien and scary than the tail-risk outcomes here.

I do think the piece would be improved if it acknowledged and grappled with that more.

Comment by Raemon on Overview of strong human intelligence amplification methods · 2024-10-11T19:55:19.334Z · LW · GW

Curated. Augmenting human intelligence seems like one of the most important things-to-think-about this century. I appreciated this post's taxonomy.

I appreciate the graph of made up numbers that Tsvi made up being clearly labeled as such.

I have a feeling that this post could be somewhat more thorough, maybe with more links to the places where someone could follow up on the technical bits of each thread.

Comment by Raemon on Open Thread Fall 2024 · 2024-10-11T19:52:04.750Z · LW · GW

I'm interested in knowing which AI forum you came from.

Comment by Raemon on Who created the Less Wrong Gather Town? · 2024-10-11T18:28:41.163Z · LW · GW

It's me! 

It's been generally inactive for awhile and I wouldn't mind you using some art. It is a bit annoying to dig up, but, let me know what you want and I'll see what I can do.

Comment by Raemon on OODA your OODA Loop · 2024-10-11T00:53:02.036Z · LW · GW

In addition to wanting to know if this ends up helping anyone, I'd also like to know...

  1. does anyone know any existing resources about intentionally improving your OODA loop (esp. if they seem better than this)?
  2. if you bounce off this, do you have any sense of why, or what nearby thing might have been better for you?

Comment by Raemon on Values Are Real Like Harry Potter · 2024-10-11T00:37:39.680Z · LW · GW

I like this frame, but I'd like a much better grasp on the "how do we distinguish changes in beliefs vs values" and "how do we distinguish reliable from unreliable data."

The more central problem is: at least some of the time, it's possible for someone to say things to me that nudge me in some values-seeming direction, in ways that, when I look back, I'm not sure whether or not I endorse.

Some case studies:

  • I am born into a situation where, as a very young child, I happen to have peers that have a puritan work ethic. I end up orienting myself such that I get feel-good reward signals for doing dutiful work. (vs, in an alternate world, I have some STEM-y parents who encourage me to solve puzzles and get an 'aha! insight!' signal that trains me to value intellectual challenge, and who maybe meanwhile actively discourage me from puritan-style work in favor of unschool-y self exploration.)
     
  • I happen to move into a community where people have different political opinions, and mysteriously I find myself adopting those political opinions in a few years.
     
  • Someone makes subtly-wrong arguments at me (maybe intentionally, maybe not), and those lead me to start rehearsing statements about my goals or beliefs that lead me to get some reward signals that are based on falsehood. (This one at least seems sort of obvious – you handle this with ordinary epistemics and be like "well, your epistemics weren't good enough, but that's more of a problem with epistemics than a flaw in this model of values.")

Comment by Raemon on Raemon's Shortform · 2024-10-11T00:24:31.429Z · LW · GW

Using "cruxiness" instead of operationalization for predictions.

One problem with making predictions is "operationalization." A simple-seeming prediction can have endless edge cases.

For personal predictions, I often think it's basically not worth worrying about it. Write something rough down, and then say "I know what I meant." But, sometimes this is actually unclear, and you may be tempted to interpret a prediction in a favorable light. And at the very least it's a bit unsatisfying for people who just aren't actually sure what they meant.

One advantage of cruxy predictions (aside from "they're actually particularly useful in the first place") is that if you know what decision a prediction was a crux for, you can judge ambiguous resolutions based on "would this actually have changed my mind about the decision?"

("Cruxiness instead of operationalization" is a bit overly click-baity. Realistically, you need at least some operationalization, to clarify for yourself what a prediction even means in the first place. But, I think maybe you can get away with more marginal fuzziness if you're clear on how the prediction was supposed to inform your decisionmaking)

⚖ A year from now, in the three months prior, will I have used "cruxiness-as-operationalization" on a prediction, and found it helpful. (Raymond Arnold: 50%)

Comment by Raemon on Scaffolding for "Noticing Metacognition" · 2024-10-10T20:42:55.181Z · LW · GW

Yeah I do concretely think one needs to guard against being an obsessive problem solver… but, also, there are some big problems that gotta get solved and while there are downsides and risks I mostly think "yep, I’m basically here to ~obsessive problem solve." (even if I'll try to be reasonable about it and encourage others to as well)

(To be clear, psychologically unhealthy or counterproductive obsessions with problem solving are bad. But if I have to choose between accidentally veering towards that too much or too little, I'm choosing too much)

Comment by Raemon on sarahconstantin's Shortform · 2024-10-10T20:19:25.932Z · LW · GW

People who love solarpunk don't obviously love computronium dyson spheres tho

Comment by Raemon on TurnTrout's shortform feed · 2024-10-10T20:08:24.721Z · LW · GW

Fuck yeah.

Comment by Raemon on Mark Xu's Shortform · 2024-10-10T19:19:35.681Z · LW · GW

I think two major cruxes for me here are:

  • is it actually tractable to affect Deepmind's culture and organizational decisionmaking
  • how close to the threshold is Anthropic for having a good enough safety culture?

My current best guess is that Anthropic is still under the threshold for good enough safety culture (despite seeming better than I expected in a number of ways), and meanwhile that Deepmind is just too intractably far gone. 

I think people should be hesitant to work at any scaling lab, but, I think Anthropic might be possible to make "the one actually good scaling lab", and I don't currently expect that to be tractable at Deepmind and I think "having at least one" seems good for the world (although it's a bit hard for me to articulate why at the moment)

I am interested in hearing details about Deepmind that anyone thinks should change my mind about this.

This viewpoint is based on having spent at least 10s of hours trying to learn about and influence both orgs' cultures, at various times.

In both cases, I don't get the sense that people at the orgs really have a visceral sense that "decisionmaking processes can be fake". I think those processes will be fake by default, the org is better modeled as following general incentives, and DeepMind has too many people and moving parts at a low enough density that it doesn't seem possible to fix. For me to change my mind about this, I would need someone there to look me in the eye and explain that they do have a visceral sense of how organizational decisionmaking processes can be fake, and why they nonetheless think DeepMind is tractable to fix. I assume @Rohin Shah and @Neel Nanda can't really say anything publicly that's capable of changing my mind, for various confidentiality and political reasons, but, like, that's my crux.

(convincing me in more general terms "Ray, you're too pessimistic about org culture" would hypothetically somehow work, but, you have a lot of work to do given how thoroughly those pessimistic predictions came true about OpenAI)

I think Anthropic also has this problem, but it's close enough to the threshold of almost-aligned leadership and actually-pretty-aligned people that it feels at least possible to me for them to fix it. The main things that would persuade me that they are over the critical threshold are if they publicly spent social capital on clearly spelling out why the x-risk problem is hard, and made explicit plans to not merely pause for a bit when they hit an RSP threshold, but (at least in some circumstances) advocate strongly for a global government shutdown for like 20+ years.

Comment by Raemon on sarahconstantin's Shortform · 2024-10-10T19:01:18.371Z · LW · GW

I think this prompts some kind of directional update in me. My paraphrase of this is:

  • it’s actually pretty ridiculous to think you can steer the future
  • It’s also pretty ridiculous to choose to identify with what the future is likely to be.

Therefore…. Well, you don’t spell out your answer. My answer is "I should have a personal meaning-making resolution to 'what would I do if those two things are both true,' even if one of them turns out to be false, so that I can think clearly about whether they are true."

I’ve done a fair amount of similar meaningmaking work through the lens of Solstice 2022 and 2023. But that was more through the lens of ‘nearterm extinction’ than ‘inevitability of value loss’, which does feel like a notably different thing.

So it seems worth doing some thinking and pre-grieving about that.

I of course have some answers to ‘why value loss might not be inevitable’, but it’s not something I’ve yet thought about through an unclouded lens.

Comment by Raemon on Scaffolding for "Noticing Metacognition" · 2024-10-10T18:15:11.803Z · LW · GW

First, I totally think it's worth learning to notice things without having any particular response. 

I think some people find that intuitively or intrinsically valuable. For people who don't find "judgmentless/reactionless noticing" valuable, I would say: 

"The reason to do that is to develop a rich understanding of your mind. A problem you would run into if you have reactions/judgments is that doing so changes your mind while you're looking at it, you can only get sort of distorted data if you immediately jump into changing things. You may want this raw data from your mind a) because it helps you diagnose confusing problems in your psychology, b) because you might just intrinsically value getting to know your own mind with as close contact as possible – it's where you live, and in some sense, it's all the reality you have to interact with." 

I think all of that is pretty important for becoming a poweruser-rationalist. Now that you've drawn my attention to it, I probably will update the essay to include it somehow.

But, I think all of that takes quite awhile to pay off, and if it's not intuitively appealing, I don't think it's really worth trying until you've gotten some fluency with Noticing in the first place.

...

And, that all said: I think the buddhists-and-such are ultimately trying to achieve a different goal than I'm trying to achieve, so even though the methods are pretty similar in many places, they are just optimized pretty differently.

The goal I'm trying to achieve is "solve confusing problems at the edge of my ability that feel impossible, but are nonetheless incredibly important." This post is exploring Noticing in that particular context, and furthermore, in the context of "what skills can you train that will quickly pay off, such that you'll get some indication they are valuable at all", either in a dedicated workshop I'm designing, or, on your own without any personalized guidance.

...

It does seem like there will be other types of workshops (even ones that are focused on solving confusing problems), in which it makes sense to notice-without-reaction, perhaps because the workshop is oriented around diagnosing psychological hangups or confusions. I think such a workshop would need very different mentorship and support structure than the format I'm currently optimizing, but it does also seem like something that'll ultimately be part of the artform I'm trying to pursue in some fashion.

Comment by Raemon on [deleted post] 2024-10-09T22:07:52.264Z

I haven't read this post yet and am not sure what the downvotes are about yet (maybe they are justified), but, I feel like I gained something important just from reading the title, so have a strong upvote.

Comment by Raemon on Video and transcript of presentation on Otherness and control in the age of AGI · 2024-10-09T04:41:32.472Z · LW · GW

The chart is great.

I'm curious what you'd think about updating the transcript here to have fewer awkward verbal repetitions – in several places you start a sentence and then kind of start over halfway through (in a way that I imagine sounded natural in person, but just makes it harder to read here). I think an LLM could probably be instructed to remove them without otherwise messing up the post.

Comment by Raemon on What constitutes an infohazard? · 2024-10-08T21:43:17.422Z · LW · GW

Mod note: I often don't let new users with this sort of question through because these sorts of questions tend to be kinda cursed. But, honestly, I don't think we have a good canonical answer post for this and I wanted to take the opportunity to spur someone to write one.

I personally think people should mostly worry less about acausal extortion, but this question isn't quite about that. 

I think my actual answer is "realistically, you probably haven't found something dangerous enough to justify the time cost of running it by someone, but I feel dissatisfied with that state of affairs."

Maybe someone should write an LLM-bot that tells you if your maybe-infohazardous idea is making one of the standard philosophical errors.

Comment by Raemon on 2025 Color Trends · 2024-10-08T21:38:32.700Z · LW · GW

mod note: I found myself confused about whether to frontpage this, because like on one hand it's... sort of explicitly not timeless. But, it does feel like a neat historical document for people in the future to read.

(Frontpage status now determines whether it'll show up in the enriched recommended section years later, so the "is it timeless?" question is a bit better operationalized as "will someone 4 years from now wanna read it?")

Comment by Raemon on Struggling like a Shadowmoth · 2024-10-08T05:21:48.528Z · LW · GW

Certainly seems a reasonable worry.

My angle here is "this obviously doesn't tell you anything about what humans can or should do when they are being maximally tortured. But it is inspirational the way stories can often be in a way that is more about making something feel like a visceral possibility, which didn't previously feel like a visceral possibility."

And then, the concrete details that follow are true (well, the metastrategy one is "true" in that "this is why I'm doing it this way"; it doesn't really go into "but how well does it work actually?").

The thing I would encourage you to do is at least consider, in various difficult circumstances, whether you can actually just shut up and do the impossible, and imagine what it'd look like to succeed. And then concretely visualize the impossible-seeming plan and whatever your best alternative is, and decide between them as best you can. 

Comment by Raemon on 2025 Color Trends · 2024-10-08T05:11:08.104Z · LW · GW

I just fixed it with a manual screenshot copy-paste, since it was hard to maintain the exact arrangement otherwise.

Comment by Raemon on Three Subtle Examples of Data Leakage · 2024-10-03T18:14:33.531Z · LW · GW

Curated. I liked that this post both illustrated an important idea through a few different lenses, and in particular that it showcased how easy it would be to nod along with an incomplete/wrong explanation.

Comment by Raemon on We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap · 2024-10-03T17:34:27.175Z · LW · GW

I find myself wanting to curate this, because it illustrated a useful new frame and/or gear for thinking about values.

I feel like it's juuuust under the threshold for me feeling good about curating, because the example given feels sort of... simple. It's not exactly trivial, but I think I'd be more confident about what you meant if you contrasted it with an example that was more stereotypically "values-y" (i.e. I don't really care whether I like escamoles or not, although maybe some people do).

Comment by Raemon on MichaelDickens's Shortform · 2024-10-02T17:45:49.909Z · LW · GW

I kinda don't want to litigate it right now, but, I was thinking "I can think of one particular Anthropic prediction Habryka made that seemed false and overly pessimistic to me", which doesn't mean I think you're overall uncalibrated about Anthropic, and/or not pessimistic enough.

And (I think Habryka got this, but for the benefit of others), a major point of my original comment was not just "you might be overly paranoid/pessimistic in some cases", but that ambiguity about how paranoid/pessimistic it is appropriate to be results in some kind of confusing, miasmic social-epistemic process (where, like, maybe you are exactly calibrated on how pessimistic to be, but it comes across as too aggro to other people, who push back). This can be bad whether you're somewhat-too-pessimistic, somewhat-too-optimistic, or exactly calibrated.

Comment by Raemon on MichaelDickens's Shortform · 2024-10-02T06:00:59.783Z · LW · GW

The predictions that seemed (somewhat) overly paranoid of yours were more about Anthropic than OpenPhil, and the dynamic seemed similar and I didn't check that hard while writing the comment. (maybe some predictions about how/why the OpenAI board drama went down, which was at the intersection of all three orgs, which I don't think have been explicitly revealed to have been "too paranoid" but I'd still probably take bets against)

(I think I agree that overall you were more like "not paranoid enough" than "too paranoid", although I'm not very confident)

Comment by Raemon on MichaelDickens's Shortform · 2024-10-02T05:07:50.053Z · LW · GW

I want to add the gear of "even if it actually turns out that OpenPhil was making the right judgment calls the whole time in hindsight, the fact that it's hard from the outside to know that has some kind of weird Epistemic Murkiness effects that are confusing to navigate, at the very least kinda suck, and maybe are Quite Bad." 

I've been trying to articulate the costs of this sort of thing lately and having trouble putting it into words, and maybe it'll turn out this problem was less of a big deal than it currently feels like to me. But, something like the combo of

a) the default being for many people to trust OpenPhil

b) many people who are paying attention think that they should at least be uncertain about it, and somewhere on a "slightly wary" to "paranoid" scale. and...

c) this at least causes a lot of wasted cognitive cycles

d) it's... hard to figure out how big a deal to make of it. A few people (e.g. habryka, or previously Benquo or Jessicata) make it their thing to bring up concerns frequently. Some of those concerns are, indeed, overly paranoid, but, like, it wasn't actually reasonable to calibrate the wariness/conflict-theory-detector to zero, you have to make guesses. This is often exhausting and demoralizing for the people doing it. People typically only select into this sort of role if they're a bit more prone to conflict about it, which means a lot of the work is kinda thankless because people are pushing back on you for being too conflicty. Something about this compounds over time.

e) the part that feels hardest to articulate and maybe is fake is that there's something of a "group epistemic process" going on in the surrounding community, and everyone is either not tracking this sort of thing, or tracking it but not sure how to take it or what to do about it... I'm not sure how to describe it better than "I dunno, something about the group orienting process is subtly epistemically fucked" and/or "people just actually take sanity-damage from it."

("subtly epistemically fucked" might operationalize as "it takes an extra 1-3 years for things to become consensus knowledge/beliefs than it'd otherwise take")

Anyway, thanks for bringing it up.

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-10-02T03:38:46.395Z · LW · GW

I actually think we should have made "all clears" worth something like 50 or -50 karma (if you get it right or wrong), and "NUKES INCOMING!" worth 300 or -300, partly for the reasons you mention, partly because you'd generally expect LWers to avoid nuking most of the time, and partly because it doesn't really feel worth 1000 karma to correctly guess "all clear" 5 times, but does feel worth a few hundred karma to correctly guess the one incoming nuke.

Comment by Raemon on Conventional footnotes considered harmful · 2024-10-01T16:41:45.771Z · LW · GW

Curious what you think of hoverable-footnotes on web pages, or the style of side-notes that LW recently implemented.

Comment by Raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-10-01T06:52:23.857Z · LW · GW

Smooth/Lumpy Takeoff

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-10-01T05:19:00.573Z · LW · GW

Stanley Peterson is just a (somewhat silly but surprisingly reasonable) Americanized version of Stanislav Petrov. (Peter and Petrov apparently both mean "rock", not sure about Stanislav offhand).

Sorry about screenshot-itis. In this case I think Ben wanted to convey the visuals of what the participants saw, and in general it's just easier to copy-paste without screwing up formatting (I think it's also incentivized due to various social media algorithms rewarding pictures)

Comment by Raemon on A basic systems architecture for AI agents that do autonomous research · 2024-10-01T02:47:59.566Z · LW · GW

Curated. It seems like in the current regime of frontier models, it's worth making more explicit models of what architectures we can expect, if dangerous capabilities develop in the near future. 

This post feels like it spells out a model that matches my general understanding of the state-of-the-art, but draws several inferences about it I hadn't previously thought about. 

I'd be interested in other people who've thought about current generation deployment setups chiming in with their takes, if they disagree or think there's important elements missing.

Comment by Raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T18:31:16.320Z · LW · GW

serious answer that is agnostic as to how you are responding:

only if you know the takeoff is happening

Comment by Raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T17:43:39.915Z · LW · GW

The relevant question is ‘Will a policy wonk inform Joe Biden (or any other major decisionmaker) who either read a report with ‘slow takeoff’ and got confused,

or read a report by someone who read a report by someone who was confused?’ (This is the one that seems very likely to me.)

Comment by Raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T16:59:59.911Z · LW · GW

I meant ‘do you think it’s good, bad, or neutral that people use the phrase ‘slow’/‘fast’ takeoff? And, if bad, what do you wish people did instead in those sentences?’

Comment by Raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T07:34:21.189Z · LW · GW

What does this cache out to in terms of what terms you think make sense?

Comment by Raemon on Raemon's Shortform · 2024-09-29T21:29:20.893Z · LW · GW

Some people have reported bugs wherein "you post a top level comment, and then the comment box doesn't clear (still displaying the text of your comment)." It doesn't happen super reliably. I'm curious if anyone else has seen this recently.

Comment by Raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-29T20:19:55.748Z · LW · GW

I think in most cases it might make sense to give the unit you expect to measure it in. “Days-long takeoff”. “Months-long takeoff.” “Years-long-takeoff”. “Decades-long takeoff”.

Comment by Raemon on Chapter 7: Reciprocation · 2024-09-29T19:02:33.888Z · LW · GW

This one.

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-09-29T17:00:55.101Z · LW · GW

The one who voted "not destroy the world" was the one I had design the game. (I'd intended it as a sort of double-edged reward/punishment of "well, as the first unilateralist, you do get to design the game, but, you need to do it for a virtue you didn't believe in.")

The resulting thing didn't quite work in the version they handed to us but was close enough that I feel pretty happy with the outcome.

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-09-29T07:16:39.914Z · LW · GW

I definitely erred explicitly in the direction of the Opt In button looking scary (Ben specifically argued against this but it felt right to me). I have heard from a few people that they didn't even consider pressing it because "c'mon, it's Petrov Day, you don't go clicking big red buttons." I'm not sure if it was the right call. In any case, if we do a similar thing in the future, my guess is we'll make the opt-in less scary looking.

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-09-29T06:56:41.166Z · LW · GW

Yes. (That wasn’t meant to be a secret, sorry!)

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-09-29T06:55:51.536Z · LW · GW

I feel satisfied with Ben’s articulation of ‘taking responsibility’ as the primary Petrov virtue. It feels more like a real virtue than the overly consequentialist ‘don’t take actions that would destroy the world’, but it naturally lends itself to the other virtues on our poll from last year, when appropriate.

Comment by Raemon on 2024 Petrov Day Retrospective · 2024-09-29T06:46:47.757Z · LW · GW

Yeah, I find the ‘you want to keep the message consistent for Science’ argument convincing (but think it's good to still stick with the most reasonable interpretation of what our word was that we can, unless we have a specific reason not to that a reasonable number of non-teammates agree makes sense).