Posts

2025 Prediction Thread 2024-12-30T01:50:14.216Z
Open Thread Winter 2024/2025 2024-12-25T21:02:41.760Z
The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228) 2024-12-24T22:45:50.065Z
Announcing the Q1 2025 Long-Term Future Fund grant round 2024-12-20T02:20:22.448Z
Sorry for the downtime, looks like we got DDosd 2024-12-02T04:14:30.209Z
(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser 2024-11-30T02:55:16.077Z
OpenAI Email Archives (from Musk v. Altman and OpenAI blog) 2024-11-16T06:38:03.937Z
Using Dangerous AI, But Safely? 2024-11-16T04:29:20.914Z
Open Thread Fall 2024 2024-10-05T22:28:50.398Z
If-Then Commitments for AI Risk Reduction [by Holden Karnofsky] 2024-09-13T19:38:53.194Z
Open Thread Summer 2024 2024-06-11T20:57:18.805Z
"AI Safety for Fleshy Humans" an AI Safety explainer by Nicky Case 2024-05-03T18:10:12.478Z
Goal oriented cognition in "a single forward pass" 2024-04-22T05:03:18.649Z
Express interest in an "FHI of the West" 2024-04-18T03:32:58.592Z
Structured Transparency: a framework for addressing use/mis-use trade-offs when sharing information 2024-04-11T18:35:44.824Z
LessWrong's (first) album: I Have Been A Good Bing 2024-04-01T07:33:45.242Z
How useful is "AI Control" as a framing on AI X-Risk? 2024-03-14T18:06:30.459Z
Open Thread Spring 2024 2024-03-11T19:17:23.833Z
Is a random box of gas predictable after 20 seconds? 2024-01-24T23:00:53.184Z
Will quantum randomness affect the 2028 election? 2024-01-24T22:54:30.800Z
Vote in the LessWrong review! (LW 2022 Review voting phase) 2024-01-17T07:22:17.921Z
AI Impacts 2023 Expert Survey on Progress in AI 2024-01-05T19:42:17.226Z
Originality vs. Correctness 2023-12-06T18:51:49.531Z
The LessWrong 2022 Review 2023-12-05T04:00:00.000Z
Open Thread – Winter 2023/2024 2023-12-04T22:59:49.957Z
Complex systems research as a field (and its relevance to AI Alignment) 2023-12-01T22:10:25.801Z
How useful is mechanistic interpretability? 2023-12-01T02:54:53.488Z
My techno-optimism [By Vitalik Buterin] 2023-11-27T23:53:35.859Z
"Epistemic range of motion" and LessWrong moderation 2023-11-27T21:58:40.834Z
Debate helps supervise human experts [Paper] 2023-11-17T05:25:17.030Z
How much to update on recent AI governance moves? 2023-11-16T23:46:01.601Z
AI Timelines 2023-11-10T05:28:24.841Z
How to (hopefully ethically) make money off of AGI 2023-11-06T23:35:16.476Z
Integrity in AI Governance and Advocacy 2023-11-03T19:52:33.180Z
What's up with "Responsible Scaling Policies"? 2023-10-29T04:17:07.839Z
Trying to understand John Wentworth's research agenda 2023-10-20T00:05:40.929Z
Trying to deconfuse some core AI x-risk problems 2023-10-17T18:36:56.189Z
How should TurnTrout handle his DeepMind equity situation? 2023-10-16T18:25:38.895Z
The Lighthaven Campus is open for bookings 2023-09-30T01:08:12.664Z
Navigating an ecosystem that might or might not be bad for the world 2023-09-15T23:58:00.389Z
Long-Term Future Fund Ask Us Anything (September 2023) 2023-08-31T00:28:13.953Z
Open Thread - August 2023 2023-08-09T03:52:55.729Z
Long-Term Future Fund: April 2023 grant recommendations 2023-08-02T07:54:49.083Z
Final Lightspeed Grants coworking/office hours before the application deadline 2023-07-05T06:03:37.649Z
Correctly Calibrated Trust 2023-06-24T19:48:05.702Z
My tentative best guess on how EAs and Rationalists sometimes turn crazy 2023-06-21T04:11:28.518Z
Lightcone Infrastructure/LessWrong is looking for funding 2023-06-14T04:45:53.425Z
Launching Lightspeed Grants (Apply by July 6th) 2023-06-07T02:53:29.227Z
Yoshua Bengio argues for tool-AI and to ban "executive-AI" 2023-05-09T00:13:08.719Z
Open & Welcome Thread – April 2023 2023-04-10T06:36:03.545Z

Comments

Comment by habryka (habryka4) on Reframing AI Safety as a Neverending Institutional Challenge · 2025-03-24T02:41:02.892Z · LW · GW

No worries!

You did say it would be premised on either "inevitable or desirable for normal institutions to eventually lose control". In some sense I do think this is "inevitable", but only in the same sense in which past "normal human institutions" lost control.

We now have the internet and widespread democracy, so almost all governmental institutions needed to change how they operate. Future technological change will force similar changes. But I don't put any value on the literal existence of our existing institutions; what I care about is whether our institutions are going to make good governance decisions. I am saying that the development of systems much smarter than current humans will change those institutions, very likely within the next few decades, making most concerns about present institutional challenges obsolete.

Of course something that one might call "institutional challenges" will remain, but I do think there really will be a lot of buck-passing from the perspective of present-day humans. We do really have a crunch time of a few decades on our hands, after which we will no longer have much influence over the outcome.

Comment by habryka (habryka4) on Reframing AI Safety as a Neverending Institutional Challenge · 2025-03-23T22:55:01.621Z · LW · GW

I don't think I understand. It's not about human institutions losing control "to a small regime". It's just about most coordination problems being things you can solve by being smarter. You can do that in high-integrity ways, probably much higher integrity and with fewer harmful effects than how we've historically overcome coordination problems. I de-facto don't expect things to go this way, but my opinions here are not at all premised on it being desirable for humanity to lose control?

Comment by habryka (habryka4) on Reframing AI Safety as a Neverending Institutional Challenge · 2025-03-23T20:39:49.010Z · LW · GW

This IMO doesn't really make any sense. If we get powerful AI, and we can either control it, or ideally align it, then the gameboard for both global coordination and building institutions completely changes (and of course if we fail to control or align it, the gameboard is also flipped, but in a way that removes us completely from the picture).

Does anyone really think that by the time you have systems vastly more competent than humans, we will still face the same coordination problems and institutional difficulties as we have right now?

It does really look like there will be a highly pivotal period of at most a few decades. There is a small chance humanity decides to very drastically slow down AI development for centuries, but that seems pretty unlikely, and also not clearly beneficial. That means it's not a neverending institutional challenge; it's a challenge that lasts a few decades at most, during which humanity will be handing off control to some kind of cognitive successor which is very unlikely to face the same kinds of institutional challenges as we are facing today.

That handoff is not purely a technical problem, but a lot of it will be. At the end of the day, whether your successor AI systems/AI-augmented-civilization/uplifted-humanity/intelligence-enhanced-population will be aligned with our preferences over the future has a lot of highly technical components.

Yes, there will be a lot of social problems, but the size and complexity of the problems are finite, at least from our perspective. It does appear that humanity is at the cusp of unlocking vast intelligence, and after you do that, you really don't care very much about the weird institutional challenges that humanity is currently facing, most of which can clearly be overcome by being smarter and more competent.

Comment by habryka (habryka4) on On the Rationality of Deterring ASI · 2025-03-22T22:20:10.814Z · LW · GW

I mean, you saw people make fun of it when Eliezer said it, and then my guess is people conservatively assumed that this would generalize to the future. I've had conversations with people where they tried to convince me that Eliezer mentioning kinetic escalation was one of the worst things that anyone has ever done for AI policy, and they kept pointing to Twitter threads and conversations where opponents made fun of it as evidence. I think there clearly was something real here, but I also think people really fail to understand the communication dynamics involved.

Comment by habryka (habryka4) on On the Rationality of Deterring ASI · 2025-03-22T21:48:22.823Z · LW · GW

My sense is a lot of the x-risk-oriented AI policy community is very focused on avoiding "gaffes" and has a very short-term and opportunistic relationship with reputation and public relations and all that kind of stuff. My sense is that people in the space don't believe being principled or consistently honest basically ever gets rewarded or recognized, so the right strategy is to try to identify what the Overton window is, only push very conservatively on expanding it, and focus on staying in the good graces of whatever process determines social standing, which is generally assumed to be pretty random and arbitrary.

I think many people in the space, if pushed, would of course acknowledge that kinetic responses are appropriate in many AI scenarios, but they would judge bringing it up as an unnecessarily risky gaffe, and that perception of a gaffe creates a pretty effective enforcement regime under which people basically never bring it up, lest they be judged as politically irresponsible.

Comment by habryka (habryka4) on On the Rationality of Deterring ASI · 2025-03-22T03:04:31.687Z · LW · GW

Promoted to curated: I have various pretty substantial critiques of this work, but I do overall think this is a pretty great effort at crossing the inferential distance from people who think AGI will be a huge deal and potentially dangerous, to the US government and national security apparatus. 

The thing I feel most unhappy about is that the document follows a pattern that Situational Awareness also had, where it kept framing various things that it wanted to happen as "inevitable", while also arguing that they are a good idea, in a way that felt to me like it was trying too hard to create some kind of self-fulfilling prophecy.

But overall, I feel like this document speaks with surprising candor and clarity about many things that have been left unsaid in many circumstances. I particularly appreciated its explicit coverage of conventional ballistic escalation as part of a sabotage strategy for datacenters. Relevant quotes: 

Should these measures falter, some leaders may contemplate kinetic attacks on datacenters, arguing that allowing one actor to risk dominating or destroying the world are graver dangers, though kinetic attacks are likely unnecessary. Finally, under dire circumstances, states may resort to broader hostilities by climbing up existing escalation ladders or threatening non-AI assets. We refer to attacks against rival AI projects as "maiming attacks."

I also particularly appreciated this proposed policy for how to handle AIs capable of recursive self-improvement: 

In the near term, geopolitical events may prevent attempts at an intelligence recursion. Looking further ahead, if humanity chooses to attempt an intelligence recursion, it should happen in a controlled environment with extensive preparation and oversight—not under extreme competitive pressure that induces a high risk tolerance.

Comment by habryka (habryka4) on METR: Measuring AI Ability to Complete Long Tasks · 2025-03-21T00:15:16.414Z · LW · GW

Research engineers I talk to already report >3x speedups from AI assistants

Huh, I would be extremely surprised by this number. I program most days, in domains where AI assistance is particularly useful (frontend programming with relatively high churn), and I am definitely not anywhere near a 3x total speedup. Maybe a 1.5x, maybe a 2x on good weeks, but definitely not a 3x. A >3x in any domain would be surprising, and my guess is the speedup generalizes less well to research-engineer code (as opposed to churn-heavy frontend development).

Comment by habryka (habryka4) on Elizabeth's Shortform · 2025-03-17T01:25:36.146Z · LW · GW

I tried it and it looks bad for some reason, I think because the current order of the symbols reflects their position on the number line, and if you invert them it looks worse. I don't feel confident, but I think I prefer the current situation.

Comment by habryka (habryka4) on Joseph Miller's Shortform · 2025-03-17T01:10:06.610Z · LW · GW

Interesting. I am concerned about this effect, but I do really like a lot of quick takes. I wonder whether this suggests a problem with how we present posts.

Comment by habryka (habryka4) on I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats? · 2025-03-16T19:51:50.509Z · LW · GW

Here are some initial thoughts: 

I do think there are a bunch of good donation opportunities these days, especially in domains where Open Philanthropy withdrew funding recently. Some more thoughts and details here.

At the highest level, I think what the world can use most right now is a mixture of: 

  1. Clear explanations for the core arguments around AI x-risk, both so that people can poke holes in them, and because they will enable many more people who are in positions to do something about AI to do good things
  2. People willing to publicly, with their real identity, argue that governments and society more broadly should do pretty drastic things to handle the rise of AGI

I think good writing and media production are probably at the core of a lot of this. I particularly think that writing and arguments directed at smart, educated people who do not necessarily have any AI or ML background are more valuable than things directed more at AI and ML people. That's mostly because there has been a lot of the latter, because the incentives on engaging in discourse with them are less bad, and because I think there is often a collective temptation to create priesthoods around various kinds of knowledge and then insist on deferring to those priesthoods, which I think usually causes worse collective decision-making; writing in a more accessible way helps push against that.

I think both of these things can benefit a decent amount from funding. I do think the current funding distribution landscape is pretty hard to navigate. I am on the Long-Term Future Fund, which in some sense is trying to address this, but IMO we aren't really doing an amazing job at identifying and vetting opportunities here, so I am not sure whether I would recommend donations to us; but also, nobody else is doing a great job, so I am genuinely unsure.

My tentative guess is that the best choice is to spend a few hours trying to identify one or two organizations that seem particularly impactful and at least somewhat funding-constrained, then make a public comment or post asking other people for critical thoughts on those organizations, and then iterate that a few times until you find something good. This is a decent amount of work, but I don't think there currently exist good and robust deference chains in this space that would cause you to have a reliably positive impact on things by just trusting them.

I tentatively think that writing a single essay or reasonably popular tweet under your real identity, as a pretty successful business person, in which you express concern about AI x-risk, is also quite valuable. I don't think it has to be anything huge, but I do think it's good if it's more than just a paragraph or a retweet: something that people could refer to if they try to list non-crazy people who think these kinds of concerns are real, and that can meaningfully be weighed as part of the public discussion on these kinds of topics. 

I do also think visiting one of the hubs where people who work a lot on this stuff tend to be based is pretty valuable. You could attend LessOnline or EA Global or something in that space, and talk to people about these topics. I do think there is a risk of ending up unduly influenced by social factors and various herd-mentality dynamics, but there are a lot of smart people around who spend all day thinking about what things are most helpful, and there is lots of useful knowledge to extract.

Comment by habryka (habryka4) on AI Can't Write Good Fiction · 2025-03-12T17:56:31.962Z · LW · GW

I whipped up a very quick example in GPT-4.5, which unfortunately 'moderation' somehow forbids me from sharing, but my initial prompt went like this:

(If this is referring to LW moderation, that's inaccurate. In general I am in favor of people sharing LLM snippets to discuss their content, as well as for the purpose of background sources in collapsible sections.)

Comment by habryka (habryka4) on Preparing for the Intelligence Explosion · 2025-03-11T20:03:31.346Z · LW · GW

My guess is you know this, but the sidenote implementation appears to be broken. Clicking on the footnote labeled "1" opens up a footnote labeled "2", and the footnotes also overlap on the right in very broken-looking ways.

Comment by habryka (habryka4) on Neil Warren's Shortform · 2025-03-11T05:41:32.656Z · LW · GW

Yeah, our policy is to reject anything from new users that looks like it was written or heavily edited with LLMs, and I tend to downvote LLM-written content from approved users, but it is getting harder and harder to detect the difference on a quick skim, so content moderation has been getting more difficult. 

Comment by habryka (habryka4) on Childhood and Education #9: School is Hell · 2025-03-10T00:01:56.529Z · LW · GW

Ah, oops, now I get it. Yes, what I wrote sure didn't make any sense. In my first paragraph I meant to write something like "if no home schoolers are allowed to be as bad as bad or average public schools, the costs of homeschooling increase a lot, constituting effectively a tax on homeschooling", and then in my second paragraph I meant to strengthen it into "the very worst public school". I sure did write the same clarifier in each paragraph, which was very confusing.

Comment by habryka (habryka4) on Childhood and Education #9: School is Hell · 2025-03-09T04:51:47.561Z · LW · GW

Huh, it grammatically reads fine to me. I am assuming the first paragraph reads fine, so I'll clarify just the second. 

In my first paragraph I said that enforcing most reasonable interpretations of "a right to an education at least as good as voluntary public school education" would put undue costs on homeschooling. In my second paragraph I then suggested one reading that does not plausibly incur that cost, which is a right to an education at least better than the worst voluntary public school education. However, it appears to me that students already have such a right, as I am sure the worst public school education violates many straightforward human rights and would be prosecutable under current law (just nobody is bothering to do that), suggesting that adding an additional right with such a low threshold wouldn't really make any difference. 

Hope that helps!

Comment by habryka (habryka4) on Childhood and Education #9: School is Hell · 2025-03-09T01:25:20.435Z · LW · GW

Something else I'm unsure about, but not necessarily a hill I want to die on given that government resources aren't unlimited, is the question of whether kids should have a right to "something at least similarly good as voluntary public school education."

This seems like it would punish variance a lot, and de-facto therefore be a huge tax on homeschooling. Some public schools are extremely bad; if no home schoolers are allowed to be as bad as the worst public schools, the costs of homeschooling increase a lot, constituting effectively a tax on homeschooling. 

Maybe you mean "a right to an education at least as good as the worst public school education", but my guess is the worst public school education is so bad that these would already be covered by almost any reasonable approach to human rights (like, my guess is it already involves continuous ongoing threats of violence, being lied to, frequent physical violence, etc.).

Comment by habryka (habryka4) on So how well is Claude playing Pokémon? · 2025-03-09T01:21:41.561Z · LW · GW

IMO this would be a great top-level post (as would many of the other posts on your Substack, which I just discovered!)

Comment by habryka (habryka4) on Childhood and Education #9: School is Hell · 2025-03-08T19:46:07.457Z · LW · GW

I strong-upvoted and strong-disagree voted, since I also agree the current voting distribution didn't make much sense. 

I do think you are doing something in your comment that feels pretty off. For example, you link to aphyer's comment as a "fully general counterargument that clearly prove[s] way too much", but I don't buy it; I think it's a pretty reasonable argument. The prior should be towards liberty, and if the higher-liberty option is also safer, then I don't see any reason to mess with it for now. 

Like, it seems fine to improve things, but I do think state involvement in education has been really very terrifying, and I sense a continuous missing mood throughout your comments of not understanding how costly marginal regulation can be.

To be clear, I think your comment is fine and doesn't deserve downvoting, and disagree-voting feels like the appropriate dimension.

Comment by habryka (habryka4) on Arbital has been imported to LessWrong · 2025-03-03T16:47:49.041Z · LW · GW

Yeah, my guess is we should change that UI a bit. IMO it makes sense to make comments a bit less prevalent on wiki and tag pages (because many comments will be more outdated), but the current text is too much about just proposing changes. 

Comment by habryka (habryka4) on Statistical Challenges with Making Super IQ babies · 2025-03-03T07:14:22.675Z · LW · GW

Thank you! I'll see whether I can do some of my own thinking on this, as I care a lot about the issue, but do feel like I would have to really dig into it. I appreciate your high-level gloss on the size of the overestimate.

Comment by habryka (habryka4) on Statistical Challenges with Making Super IQ babies · 2025-03-03T06:10:16.710Z · LW · GW

I greatly appreciate this kind of critique, thank you!

My guess is this is too big of an ask, and I am already grateful for your post, but do you have a prediction about how much of the variance would turn out to be causal in the relevant way? 

My current best guess is we are going to see some of these technologies used in the animal breeding space relatively soon (within a few years), so early predictions seem helpful for validating models, and might also just help people understand how much you currently think the post overestimates the impact of edits.

Comment by habryka (habryka4) on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better · 2025-03-02T04:44:59.462Z · LW · GW

Assuming the second refers to "Stuttgart 21"?

Yep! 

but I don't think these examples seem well-described as not having precedents / lots of societal and cultural preconditions

I totally think there are lots of cultural preconditions and precedents, I just think they mostly don't look like "small protests for many years that gradually or even suddenly grew into larger ones". My best guess is if you see a protest movement not have substantial growth for many months, it's unlikely to start growing, and it's not that valuable to have started it earlier (and somewhat likely to have a bit of an inoculation effect, though I also don't think that effect is that big).

Comment by habryka (habryka4) on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better · 2025-02-23T22:31:30.560Z · LW · GW

I don't understand; I don't think there was any ambiguity in what you said. Even not taking things literally, you implied that having big protests without having small protests is at least highly unusual. That also doesn't match my model. I think it's pretty normal. The thing that I think happens before big protests is big media coverage and social media discussion, not many months and years of small protests. I am not sure of this, but that's my current model. 

Comment by habryka (habryka4) on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better · 2025-02-23T22:29:22.742Z · LW · GW

The specific ones I was involved in? Pretty sure they didn't. They were related to SOPA and to what people thought was a corrupt construction of a train station in my hometown. I don't think there was much organizing for either of these before they took off. I knew some of the core organizers; they did not create many small protests before this.

Comment by habryka (habryka4) on Willa's Shortform · 2025-02-23T20:49:35.895Z · LW · GW

Aw man, sad to hear that, and I am glad you seem to be doing better.

Comment by habryka (habryka4) on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better · 2025-02-23T20:48:47.287Z · LW · GW

What leapt out to me about your model was that is was very focused how an observer of the protests would react with a rationalist worldview. You didn’t seem to have given much thought to the breadth of social movements and how a diverse public would have experienced them. Like, most people aren’t gonna think PauseAI is anti-tech in general and therefore similar to the unabomber. Rationalists think that way, and few others.

I am confused: did you somehow accidentally forget a negation here? You can argue that Thane is confused, but clearly Thane was arguing from what the public believes, and of course Thane himself doesn't think that PauseAI is similar to the Unabomber based on vague associations, and certainly almost nobody else on this site believes that (some might believe that non-rationalists believe that, but isn't that exactly the kind of thinking you are asking for?).

Comment by habryka (habryka4) on The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better · 2025-02-23T20:45:21.597Z · LW · GW

When I was involved with various forms of internet freedom activism, as well as various protests around government misspending in Germany, I do not remember a run-up of many months of small protests before the big ones. It seemed that people basically directly organized some quite big ones, and then they grew a bit bigger over the course of a month, and then became smaller again. I do not remember anything like the small PauseAI protests on those issues. 

(This isn't to say it isn't a good thing in the case of AGI, I am just disputing that "small protests are the only way to get big protests")

Comment by habryka (habryka4) on Gary Marcus now saying AI can't do things it can already do · 2025-02-22T18:00:21.339Z · LW · GW

I've engaged with Gary 3-4 times in good faith. He responded in very frustrating and IMO bad faith ways every time. I've also seen this 10+ times in other threads. 

Comment by habryka (habryka4) on How AI Takeover Might Happen in 2 Years · 2025-02-21T01:50:30.294Z · LW · GW

Promoted to curated: I think concrete, specific scenarios for how things might go with AI are among the most helpful tools for helping people start forming their own models of how this whole AI thing might go. Being specific is good, grounding things in concrete observable consequences is good. Somewhat sticking your neck out and making public predictions is good.

This is among the best entries I've seen in this genre, and I hope there will be more. Thank you for writing it!

Comment by habryka (habryka4) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2025-02-21T00:51:21.756Z · LW · GW

Sorry about that! I am adding a donate link back to the frontpage sometime this week. Here is the link for now: https://www.lesswrong.com/donate 

Comment by habryka (habryka4) on Arbital has been imported to LessWrong · 2025-02-20T19:04:26.336Z · LW · GW

It's true

Comment by habryka (habryka4) on How might we safely pass the buck to AI? · 2025-02-20T03:29:45.099Z · LW · GW

Seems good! 

FWIW, at least in my mind this is in some sense approximately the only and central core of the alignment problem, and so having it left unaddressed feels confusing. It feels a bit like making a post about how to make a nuclear reactor where you happen not to say anything about how to prevent the uranium from going critical, but you did spend a lot of words on how to make the cooling towers, the color of the bikeshed next door, and how to translate the hot steam into energy. 

Like, it's fine, and I think it's not crazy to think there are other hard parts, but it felt quite confusing to me.

Comment by habryka (habryka4) on Noah Birnbaum's Shortform · 2025-02-20T02:24:27.394Z · LW · GW

Someone I trust on this says: 

AFAICT what's going on here is just that AISI and CHIPS are getting hit especially hard by the decision to fire probationary staff across USG, since they're new and therefore have lots of probationary staff - it's not an indication (yet) that either office is being targeted to be killed

Comment by habryka (habryka4) on Arbital has been imported to LessWrong · 2025-02-20T01:42:00.661Z · LW · GW

The central problem of any wiki system[1] is "what edits do you accept to a wiki page?". The lenses system is trying to provide a better answer to that question.

My default experience on e.g. Wikipedia when I am on pages where I am highly familiar with the domain is "man, I could write a much better page". But writing a whole better page is a lot of effort, and the default consequence of rewriting the page is that the editor who wrote the previous page advocates for your edits to be reverted, because they are attached to their version of the page. 

With lenses, if you want to suggest large changes to a wiki page, your default action is now "write a new lens". This leaves the work of the previous authors intact, while still giving your new page the potential for substantial readership. Lenses are sorted in order of how many people like them. If you think you can write a better lens, you can make a new one, and if it's better, it can replace the original lens once it gets traction.

More broadly, wikis suffer a lot from everything feeling like it is written by a committee. Lenses enable more individual authorship, while still trying to have some collective iteration on canonicity and structure of the wiki.

  1. ^

    Well, after you have solved the problem of "does anyone care about this wiki?"
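
To make the sorting mechanism concrete, here is a minimal sketch of the kind of data model this implies, in TypeScript. The names (`Lens`, `likeCount`, `orderedLenses`) are hypothetical illustrations, not LessWrong's actual implementation:

```typescript
// Hypothetical model: each wiki page holds several independently authored lenses.
interface Lens {
  id: string;
  authorId: string;
  title: string;
  body: string;
  likeCount: number; // readers "like" the lenses they find most useful
}

interface WikiPage {
  slug: string;
  lenses: Lens[];
}

// Lenses are ranked by how many people like them, so a newer, better lens can
// displace the original as the default view once it gets traction, without
// anyone having to revert or overwrite the previous author's work.
function orderedLenses(page: WikiPage): Lens[] {
  return [...page.lenses].sort((a, b) => b.likeCount - a.likeCount);
}

function defaultLens(page: WikiPage): Lens | undefined {
  return orderedLenses(page)[0];
}
```

The point of the sketch is just that ranking happens per lens rather than per page, so competing versions coexist instead of fighting over a single canonical text.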

Comment by habryka (habryka4) on How might we safely pass the buck to AI? · 2025-02-20T01:21:57.537Z · LW · GW

To the extent the tool just gets gamed, you can iterate until you find detection tools that are more robust (or find ways of training against detection tools that don't game them so hard).

How do you iterate? You mostly won't know whether you just trained away your signal, or actually made progress. The inability to iterate is kind of the whole central difficulty of this problem.

(To be clear, I do think there are some methods of iteration, but it's a very tricky kind of iteration where you need constant paranoia about whether you are fooling yourself, and that makes it very different from other kinds of scientific iteration)

Comment by habryka (habryka4) on Arbital has been imported to LessWrong · 2025-02-20T01:13:55.127Z · LW · GW

Yeah, I've been very glad to have that up. It does lack quite a large fraction of Arbital features (such as UI for picking between multiple lenses, probabilistic claims, and tons of other small UI things which were a lot of work to import), but it's still been a really good resource for linking to.

Comment by habryka (habryka4) on How might we safely pass the buck to AI? · 2025-02-20T00:09:54.990Z · LW · GW

Ajeya gave 15% to AGI before 2036, with little of that in the first few years after her report; maybe she'd have said 10% between 2025 and 2036.

Just because I was curious, here is the most relevant chart from the report: 

This is not a direct probability estimate (since it's about probability of affordability), but it's probably within a factor of 2. Looks like the estimate by 2030 was 7.72% and the estimate by 2036 is 17.36%.

Comment by habryka (habryka4) on How might we safely pass the buck to AI? · 2025-02-20T00:02:20.974Z · LW · GW

This thought might be detectable. Now the problem of scaling safety becomes a problem of detecting [...] this kind of conditional, deceptive reasoning.

What do you do when you detect this reasoning? This feels like the part where all plans I ever encounter fail. 

Yes, you will probably see early instrumentally convergent thinking. We have already observed a bunch of that. Do you train against it? I think that's unlikely to get rid of it. I think at this point the natural answer is "yes, your systems are scheming against you, so you gotta stop, because when you train against it, you are probably primarily making it a better schemer". 

I would be very surprised if the first 3-month Eliezer you get is not scheming, and training your signals away is much easier than actually training away the scheming.

Comment by habryka (habryka4) on EniScien's Shortform · 2025-02-18T20:58:12.627Z · LW · GW

What does that mean? It doesn't affect any recent content, and it's one of the most prominent options if you are looking through all historical posts.

Comment by habryka (habryka4) on The Unearned Privilege We Rarely Discuss: Cognitive Capability · 2025-02-18T20:54:14.773Z · LW · GW

I reviewed it. It didn't trigger my "LLM generated content" vibes, though I also don't think it's an amazing essay.

Comment by habryka (habryka4) on Announcing the Q1 2025 Long-Term Future Fund grant round · 2025-02-15T17:30:12.469Z · LW · GW

EOD AOE on February 15th, and honestly, I am not going to throw out your application if it comes in on the 16th either.

Comment by habryka (habryka4) on 6 (Potential) Misconceptions about AI Intellectuals · 2025-02-15T03:46:51.748Z · LW · GW

While artificial intelligence has made impressive strides in specialized domains like coding, art, and medicine, I think its potential to automate high-level strategic thinking has been surprisingly underrated. I argue that developing "AI Intellectuals" - software systems capable of sophisticated strategic analysis and judgment - represents a significant opportunity that's currently being overlooked, both by the EA/rationality communities and by the public.

FWIW, this paragraph reads LLM generated to me (then I stopped reading because I have a huge prior that content that reads that LLM-edited is almost universally low-quality).

Comment by habryka (habryka4) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T01:05:30.286Z · LW · GW

Keep in mind that I'm talking about agent scaffolds here.

Yeah, I have failed to get any value out of agent scaffolds, and I don't think I know anyone else who has so far. If anyone has gotten more value out of them than just the Cursor chat, I would love to see how they do it! 

Things like Cursor composer and codebuff and other scaffolds have been worse than useless for me (though I haven't tried them again after o3-mini, which maybe made a difference; it's been on my to-do list to give them another try).

Comment by habryka (habryka4) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T00:51:14.401Z · LW · GW

And 1 hour on software engineering.

FWIW, this seems like an overestimate to me. Maybe o3 is better than other things, but I definitely can't get equivalents of 1-hour chunks out of language models, unless it happens to be an extremely boilerplate-heavy step. My guess is more like 15 minutes, and for debugging (which in my experience is close to most software-engineering time), more like 5-10 minutes.

Comment by habryka (habryka4) on Elephant seal 2 · 2025-02-12T21:00:29.373Z · LW · GW

Presumably "Elephant Seal 3"

Comment by habryka (habryka4) on Writer's Shortform · 2025-02-12T20:58:14.278Z · LW · GW

FWIW, my sense is that it's a bad paper. I expect other people will come out with critiques in the next few days that will expand on that, but I will write something if no one has done it in a week or two. I think the paper notices some interesting weak correlations, but man, it really doesn't feel like the way you would go about answering the central question it is trying to answer, and I keep having the feeling that it was very much written to produce the thing that on the most shallow read will look the most surface-level similar, in order to persuade and be socially viral, and not to inform.

Comment by habryka (habryka4) on Logical Correlation · 2025-02-11T05:03:04.063Z · LW · GW

Alas, it also looks like our font is lacking some relevant character sets.

Comment by habryka (habryka4) on Thread for Sense-Making on Recent Murders and How to Sanely Respond · 2025-02-10T20:28:04.119Z · LW · GW

What should have been the trigger? When she started wearing black robes? When she started calling herself Ziz? When she started writing up her own homegrown theories of psychology? Weird clothes, weird names, and weird beliefs are part and parcel of the rationalist milieu.

FWIW, I think I had triggers around them being weird/sketchy that would now cause me to exclude them from many community things, so I do think there were concrete triggers, and I did update on that.

Comment by habryka (habryka4) on [Job ad] LISA CEO · 2025-02-09T18:54:20.977Z · LW · GW

I mean, someone must have been running it somehow. If it has been run by some group of people, I feel like saying why they now want a boss would also answer my question. 

Comment by habryka (habryka4) on [Job ad] LISA CEO · 2025-02-09T02:10:40.200Z · LW · GW

What happened to the previous CEO?