LessWrong 2.0 Reader

[link] OpenAI: Detecting misbehavior in frontier reasoning models
Daniel Kokotajlo (daniel-kokotajlo) · 2025-03-11T02:17:21.026Z · comments (18)
[link] Trojan Sky
Richard_Ngo (ricraz) · 2025-03-11T03:14:00.681Z · comments (4)
[link] Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases
Fabien Roger (Fabien) · 2025-03-11T11:52:38.994Z · comments (11)
AI Control May Increase Existential Risk
Jan_Kulveit · 2025-03-11T14:30:05.972Z · comments (2)
[link] Preparing for the Intelligence Explosion
fin · 2025-03-11T15:38:29.524Z · comments (8)
Elon Musk May Be Transitioning to Bipolar Type I
Cyborg25 · 2025-03-11T17:45:06.599Z · comments (7)
HPMOR Anniversary Parties: Coordination, Resources, and Discussion
Screwtape · 2025-03-11T01:30:41.177Z · comments (0)
Response to Scott Alexander on Imprisonment
Zvi · 2025-03-11T20:40:06.250Z · comments (2)
[link] Paths and waystations in AI safety
Joe Carlsmith (joekc) · 2025-03-11T18:52:57.772Z · comments (0)
Don't over-update on FrontierMath results
David Matolcsi (matolcsid) · 2025-03-11T20:44:04.459Z · comments (0)
Existing UDTs test the limits of Bayesianism (and consistency)
Cole Wyeth (Amyr) · 2025-03-12T04:09:11.615Z · comments (1)
Meridian Cambridge Visiting Researcher Programme: Turn AI safety ideas into funded projects in one week!
Meridian Cambridge · 2025-03-11T17:46:29.656Z · comments (0)
Cognitive Reframing—How to Overcome Negative Thought Patterns and Behaviors
Declan Molony (declan-molony) · 2025-03-11T04:56:03.696Z · comments (0)
Forethought: a new AI macrostrategy group
Max Dalton (max-dalton) · 2025-03-11T15:39:25.086Z · comments (0)
[link] (Anti)Aging 101
George3d6 · 2025-03-12T03:59:21.859Z · comments (0)
[link] A different take on the Musk v OpenAI preliminary injunction order
TFD · 2025-03-11T12:46:23.497Z · comments (0)
[link] The Grapes of Hardness
adamShimi · 2025-03-11T21:01:14.963Z · comments (0)
[link] AI Can't Write Good Fiction
JustisMills · 2025-03-12T06:11:57.786Z · comments (0)
A Hogwarts Guide to Citizenship
WillPetillo · 2025-03-11T05:50:02.768Z · comments (1)
The Social Economy
kylefurlong · 2025-03-11T22:51:14.857Z · comments (4)
stop solving problems that have already been solved
dhruvmethi · 2025-03-11T15:30:41.896Z · comments (2)
[link] How Language Models Understand Nullability
Anish Tondwalkar (anish-tondwalkar) · 2025-03-11T15:57:28.686Z · comments (0)
When is it Better to Train on the Alignment Proxy?
dil-leik-og (samuel-buteau) · 2025-03-11T13:35:51.152Z · comments (0)
[link] You don't actually need a physical multiverse to explain anthropic fine-tuning.
Fraser · 2025-03-12T07:33:43.278Z · comments (1)
Scaling AI Regulation: Realistically, what Can (and Can’t) Be Regulated?
Katalina Hernandez (katalina-hernandez) · 2025-03-11T16:51:41.651Z · comments (0)