LessWrong 2.0 Reader

[link] OpenAI: Detecting misbehavior in frontier reasoning models
Daniel Kokotajlo (daniel-kokotajlo) · 2025-03-11T02:17:21.026Z · comments (18)

[link] Trojan Sky
Richard_Ngo (ricraz) · 2025-03-11T03:14:00.681Z · comments (4)

[link] Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases
Fabien Roger (Fabien) · 2025-03-11T11:52:38.994Z · comments (11)

AI Control May Increase Existential Risk
Jan_Kulveit · 2025-03-11T14:30:05.972Z · comments (2)

[link] Preparing for the Intelligence Explosion
fin · 2025-03-11T15:38:29.524Z · comments (8)

Elon Musk May Be Transitioning to Bipolar Type I
Cyborg25 · 2025-03-11T17:45:06.599Z · comments (7)

HPMOR Anniversary Parties: Coordination, Resources, and Discussion
Screwtape · 2025-03-11T01:30:41.177Z · comments (0)

Response to Scott Alexander on Imprisonment
Zvi · 2025-03-11T20:40:06.250Z · comments (2)

[link] Paths and waystations in AI safety
Joe Carlsmith (joekc) · 2025-03-11T18:52:57.772Z · comments (0)

Don't over-update on FrontierMath results
David Matolcsi (matolcsid) · 2025-03-11T20:44:04.459Z · comments (0)

Existing UDTs test the limits of Bayesianism (and consistency)
Cole Wyeth (Amyr) · 2025-03-12T04:09:11.615Z · comments (1)

Meridian Cambridge Visiting Researcher Programme: Turn AI safety ideas into funded projects in one week!
Meridian Cambridge · 2025-03-11T17:46:29.656Z · comments (0)

Cognitive Reframing—How to Overcome Negative Thought Patterns and Behaviors
Declan Molony (declan-molony) · 2025-03-11T04:56:03.696Z · comments (0)

Forethought: a new AI macrostrategy group
Max Dalton (max-dalton) · 2025-03-11T15:39:25.086Z · comments (0)

[link] (Anti)Aging 101
George3d6 · 2025-03-12T03:59:21.859Z · comments (0)

[link] A different take on the Musk v OpenAI preliminary injunction order
TFD · 2025-03-11T12:46:23.497Z · comments (0)

[link] The Grapes of Hardness
adamShimi · 2025-03-11T21:01:14.963Z · comments (0)

[link] AI Can't Write Good Fiction
JustisMills · 2025-03-12T06:11:57.786Z · comments (0)

A Hogwarts Guide to Citizenship
WillPetillo · 2025-03-11T05:50:02.768Z · comments (1)

The Social Economy
kylefurlong · 2025-03-11T22:51:14.857Z · comments (4)

stop solving problems that have already been solved
dhruvmethi · 2025-03-11T15:30:41.896Z · comments (2)

[link] How Language Models Understand Nullability
Anish Tondwalkar (anish-tondwalkar) · 2025-03-11T15:57:28.686Z · comments (0)

When is it Better to Train on the Alignment Proxy?
dil-leik-og (samuel-buteau) · 2025-03-11T13:35:51.152Z · comments (0)

[link] You don't actually need a physical multiverse to explain anthropic fine-tuning.
Fraser · 2025-03-12T07:33:43.278Z · comments (1)

Scaling AI Regulation: Realistically, what Can (and Can’t) Be Regulated?
Katalina Hernandez (katalina-hernandez) · 2025-03-11T16:51:41.651Z · comments (0)


Recent comments

cubefox on xpostah's Shortform

I see you fixed the https issue. I think the resulting text snippets are reasonably related to the input question, though not overly so. Google search often answers questions more directly with quotes (from websites, not from books), though that may be too ambitious to match for a small project. Other than that, the first column could be improved with relevant metadata such as the source title. Perhaps the snippets in the second column could be trimmed to whole sentences if it doesn't impact the snippet length too much. In general, I believe snippets currently do not show line breaks present in the source.

wdmacaskill on Preparing for the Intelligence Explosion

Ah, by the "software feedback loop" I mean: "At the point of time at which AI has automated AI R&D, does a doubling of cognitive effort result in more than a doubling of output? If yes, there's a software feedback loop - you get (for a time, at least) accelerating rates of algorithmic efficiency progress, rather than just a one-off gain from automation."

I see now why you could understand "RSI" to mean "AI improves itself at all over time". But even so, the claim would still hold - even if (implausibly) AI gets no smarter than human-level, you'd still get accelerated tech development, because the quantity of AI research effort would increase at a growth rate much faster than the quantity of human research effort.

wdmacaskill on Preparing for the Intelligence Explosion

There's definitely a new trend towards custom-website essays. Forethought is a website for lots of research content, though (like Epoch), not just PrepIE.

And I don't think it's because people are getting more productive thanks to reasoning models - AI was helpful for PrepIE, but more like a 10-20% productivity boost than a 100% boost, and I don't think AI was used much for SA, either.

tsvibt on johnswentworth's Shortform

Some things even withdraw. https://tsvibt.blogspot.com/2023/05/the-possible-shared-craft-of-deliberate.html#aside-on-withdrawal-and-the-leap https://tsvibt.blogspot.com/2023/09/a-hermeneutic-net-for-agency.html#withdrawal

wdmacaskill on Preparing for the Intelligence Explosion

Thanks - appreciate that! It comes up a little differently for me, but still an issue - we've asked the devs to fix.

samuelshadrach on xpostah's Shortform

Update: HTTPS issue fixed. Should work now.

booksearch.samuelshadrach.com

Books Search for Researchers

knight-lee on Daniel Kokotajlo's Shortform

I think it's very hard for a single step of evolution to create a balloon large enough to counteract the organism's weight, fill it with a lighter-than-air gas (e.g. hydrogen), and then adapt the organism for survival in the air.

Each one of these adaptations is not very useful without the others, so they must all evolve at once.

samuelshadrach on xpostah's Shortform

Thanks for your patience. I'd be happy to receive any feedback. Negative feedback especially.

samuelshadrach on xpostah's Shortform

Update: HTTPS should work now

max-dalton on Preparing for the Intelligence Explosion

Thanks Oli! I think the clustering issue is fixed now, looking into what's going on with the numbers.