LessWrong 2.0 Reader

Alignment Faking Revisited: Improved Classifiers and Open Source Extensions
John Hughes (john-hughes) · 2025-04-08T17:32:55.315Z · comments (5)
Short Timelines don't Devalue Long Horizon Research
Vladimir_Nesov · 2025-04-09T00:42:07.324Z · comments (8)
Among Us: A Sandbox for Agentic Deception
7vik (satvik-golechha) · 2025-04-05T06:24:49.000Z · comments (4)
The Lizardman and the Black Hat Bobcat
Screwtape · 2025-04-06T19:02:01.238Z · comments (13)
A Slow Guide to Confronting Doom
Ruby · 2025-04-06T02:10:56.483Z · comments (20)
AI 2027: Responses
Zvi · 2025-04-08T12:50:02.197Z · comments (2)
AI CoT Reasoning Is Often Unfaithful
Zvi · 2025-04-04T14:50:05.538Z · comments (4)
Will compute bottlenecks prevent a software intelligence explosion?
Tom Davidson (tom-davidson-1) · 2025-04-04T17:41:37.088Z · comments (2)
AI 2027: Dwarkesh’s Podcast with Daniel Kokotajlo and Scott Alexander
Zvi · 2025-04-07T13:40:05.944Z · comments (2)
[link] Google DeepMind: An Approach to Technical AGI Safety and Security
Rohin Shah (rohinmshah) · 2025-04-05T22:00:14.803Z · comments (10)
[link] How Gay is the Vatican?
rba · 2025-04-06T21:27:50.530Z · comments (31)
[link] birds and mammals independently evolved intelligence
bhauth · 2025-04-08T20:00:05.100Z · comments (11)
LLM AGI will have memory, and memory changes alignment
Seth Herd · 2025-04-04T14:59:13.070Z · comments (6)
Alignment faking CTFs: Apply to my MATS stream
joshc (joshua-clymer) · 2025-04-04T16:29:02.070Z · comments (0)
Learned pain as a leading cause of chronic pain
SoerenMind · 2025-04-09T11:57:58.523Z · comments (0)
A collection of approaches to confronting doom, and my thoughts on them
Ruby · 2025-04-06T02:11:31.271Z · comments (15)
[link] American College Admissions Doesn't Need to Be So Competitive
Arjun Panickssery (arjun-panickssery) · 2025-04-07T17:35:26.791Z · comments (18)
The first AI war will be in your computer
Viliam · 2025-04-08T09:28:53.191Z · comments (8)
Meditation and Reduced Sleep Need
niplav · 2025-04-04T14:42:54.792Z · comments (7)
[link] Thoughts on AI 2027
Max Harms (max-harms) · 2025-04-09T21:26:23.926Z · comments (3)
Austin Chen on Winning, Risk-Taking, and FTX
Elizabeth (pktechgirl) · 2025-04-07T19:00:08.039Z · comments (3)
Most Questionable Details in 'AI 2027'
scarcegreengrass · 2025-04-05T00:32:54.896Z · comments (4)
How much progress actually happens in theoretical physics?
ChristianKl · 2025-04-04T23:08:00.633Z · comments (32)
Who wants to bet me $25k at 1:7 odds that there won't be an AI market crash in the next year?
Remmelt (remmelt-ellen) · 2025-04-08T08:31:59.900Z · comments (10)
[Linkpost] Visual roadmap to strong human germline engineering
TsviBT · 2025-04-05T22:22:57.744Z · comments (0)
Changing my mind about Christiano's malign prior argument
Cole Wyeth (Amyr) · 2025-04-04T00:54:44.199Z · comments (34)
Llama Does Not Look Good 4 Anything
Zvi · 2025-04-09T13:20:01.799Z · comments (1)
Explaining the Joke: Pausing is The Way
WillPetillo · 2025-04-04T09:04:38.847Z · comments (2)
[link] Well-foundedness as an organizing principle of healthy minds and societies
Richard_Ngo (ricraz) · 2025-04-07T00:31:34.098Z · comments (6)
Navigation by Moonlight
Jacob Falkovich (Jacobian) · 2025-04-07T15:32:17.353Z · comments (17)
Introduction to Representing Sentences as Logical Statements
Towards_Keeperhood (Simon Skade) · 2025-04-05T20:35:31.422Z · comments (9)
Against podcasts
Adam Zerner (adamzerner) · 2025-04-05T19:20:00.716Z · comments (18)
[link] Ferrer, Pilar, and Me
Askwho · 2025-04-06T11:22:57.758Z · comments (1)
Coupling for Decouplers
Jacob Falkovich (Jacobian) · 2025-04-07T15:40:30.743Z · comments (3)
Love is Love, Science is Fake
Jacob Falkovich (Jacobian) · 2025-04-07T15:19:17.047Z · comments (2)
Sleep peacefully: no hidden reasoning detected in LLMs. Well, at least in small ones.
Ilia Shirokov (ilia-shirokov) · 2025-04-04T20:49:59.031Z · comments (2)
[link] Arusha Perpetual Chicken—an unlikely iterated game
James Stephen Brown (james-brown) · 2025-04-06T22:56:09.673Z · comments (1)
[question] Are there any (semi-)detailed future scenarios where we win?
Jan Betley (jan-betley) · 2025-04-07T19:13:09.299Z · answers+comments (2)
Meta releases Llama-4 herd of models
winstonBosan · 2025-04-05T19:51:06.688Z · comments (5)
A Bunch of Matryoshka SAEs
chanind · 2025-04-04T14:53:56.805Z · comments (0)
Log-linear Scaling is Worth the Cost due to Gains in Long-Horizon Tasks
shash42 · 2025-04-07T21:50:37.693Z · comments (2)
[link] AI companies’ unmonitored internal AI use poses serious risks
sjadler · 2025-04-04T18:17:46.924Z · comments (2)
Quarter Inch Cables are Devious
jefftk (jkaufman) · 2025-04-05T02:40:05.054Z · comments (4)
[question] What faithfulness metrics should general claims about CoT faithfulness be based upon?
Rauno Arike (rauno-arike) · 2025-04-08T15:27:20.346Z · answers+comments (0)
What alignment-relevant abilities might Terence Tao lack?
Towards_Keeperhood (Simon Skade) · 2025-04-07T19:44:18.620Z · comments (2)
Moonlight Reflected
Jacob Falkovich (Jacobian) · 2025-04-07T15:35:11.708Z · comments (0)
The world according to ChatGPT
Richard_Kennaway · 2025-04-07T13:44:43.781Z · comments (0)
[link] The case for AGI by 2030
Benjamin_Todd · 2025-04-09T20:35:55.167Z · comments (0)
Cheesecake Frosting
jefftk (jkaufman) · 2025-04-04T02:10:07.755Z · comments (9)
Misinformation is the default, and information is the government telling you your tap water is safe to drink
danielechlin · 2025-04-07T22:28:18.158Z · comments (1)