LessWrong 2.0 Reader

What’s the short timeline plan?
Marius Hobbhahn (marius-hobbhahn) · 2025-01-02T14:59:20.026Z · comments (49)
The Case Against AI Control Research
johnswentworth · 2025-01-21T16:03:10.143Z · comments (80)
[link] The Gentle Romance
Richard_Ngo (ricraz) · 2025-01-19T18:29:18.469Z · comments (46)
“Sharp Left Turn” discourse: An opinionated review
Steven Byrnes (steve2152) · 2025-01-28T18:47:04.395Z · comments (26)
Mechanisms too simple for humans to design
Malmesbury (Elmer of Malmesbury) · 2025-01-22T16:54:37.601Z · comments (45)
What Is The Alignment Problem?
johnswentworth · 2025-01-16T01:20:16.826Z · comments (50)
Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals
johnswentworth · 2025-01-24T20:20:28.881Z · comments (61)
How will we update about scheming?
ryan_greenblatt · 2025-01-06T20:21:52.281Z · comments (20)
[link] Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
Jan_Kulveit · 2025-01-30T17:03:45.545Z · comments (52)
Maximizing Communication, not Traffic
jefftk (jkaufman) · 2025-01-05T13:00:02.280Z · comments (10)
OpenAI #10: Reflections
Zvi · 2025-01-07T17:00:07.348Z · comments (7)
[link] Quotes from the Stargate press conference
Nikola Jurkovic (nikolaisalreadytaken) · 2025-01-22T00:50:14.793Z · comments (7)
Don’t ignore bad vibes you get from people
Kaj_Sotala · 2025-01-18T09:20:17.397Z · comments (50)
Capital Ownership Will Not Prevent Human Disempowerment
beren · 2025-01-05T06:00:23.095Z · comments (18)
AI companies are unlikely to make high-assurance safety cases if timelines are short
ryan_greenblatt · 2025-01-23T18:41:40.546Z · comments (5)
Activation space interpretability may be doomed
bilalchughtai (beelal) · 2025-01-08T12:49:38.421Z · comments (32)
Applying traditional economic thinking to AGI: a trilemma
Steven Byrnes (steve2152) · 2025-01-13T01:23:00.397Z · comments (32)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (4)
What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (57)
[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty
tandem · 2025-01-07T19:11:21.238Z · comments (5)
Ten people on the inside
Buck · 2025-01-28T16:41:22.990Z · comments (28)
Anomalous Tokens in DeepSeek-V3 and r1
henry (henry-bass) · 2025-01-25T22:55:41.232Z · comments (2)
[link] Training on Documents About Reward Hacking Induces Reward Hacking
evhub · 2025-01-21T21:32:24.691Z · comments (14)
Human takeover might be worse than AI takeover
Tom Davidson (tom-davidson-1) · 2025-01-10T16:53:27.043Z · comments (54)
Tell me about yourself: LLMs are aware of their learned behaviors
Martín Soto (martinsq) · 2025-01-22T00:47:15.023Z · comments (5)
Building AI Research Fleets
Ben Goldhaber (bgold) · 2025-01-12T18:23:09.682Z · comments (11)
[link] Parkinson's Law and the Ideology of Statistics
Benquo · 2025-01-04T15:49:21.247Z · comments (7)
[link] The Intelligence Curse
lukedrago · 2025-01-03T19:07:43.493Z · comments (26)
2024 in AI predictions
jessicata (jessica.liu.taylor) · 2025-01-01T20:29:49.132Z · comments (3)
The Game Board has been Flipped: Now is a good time to rethink what you’re doing
LintzA (alex-lintz) · 2025-01-28T23:36:18.106Z · comments (30)
[link] Aristocracy and Hostage Capital
Arjun Panickssery (arjun-panickssery) · 2025-01-08T19:38:47.104Z · comments (7)
[link] Attribution-based parameter decomposition
Lucius Bushnaq (Lblack) · 2025-01-25T13:12:11.031Z · comments (21)
My supervillain origin story
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-27T12:20:46.101Z · comments (1)
Fake thinking and real thinking
Joe Carlsmith (joekc) · 2025-01-28T20:05:06.735Z · comments (11)
How do you deal w/ Super Stimuli?
Logan Riggs (elriggs) · 2025-01-14T15:14:51.552Z · comments (25)
Comment on "Death and the Gorgon"
Zack_M_Davis · 2025-01-01T05:47:30.730Z · comments (33)
The purposeful drunkard
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-12T12:27:51.952Z · comments (13)
Reasons for and against working on technical AI safety at a frontier AI lab
bilalchughtai (beelal) · 2025-01-05T14:49:53.529Z · comments (12)
We probably won't just play status games with each other after AGI
Matthew Barnett (matthew-barnett) · 2025-01-15T04:56:38.330Z · comments (21)
[link] On Eating the Sun
jessicata (jessica.liu.taylor) · 2025-01-08T04:57:20.457Z · comments (96)
The subset parity learning problem: much more than you wanted to know
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-03T09:13:59.245Z · comments (18)
The Rising Sea
Jesse Hoogland (jhoogland) · 2025-01-25T20:48:52.971Z · comments (2)
Introducing Squiggle AI
ozziegooen · 2025-01-03T17:53:42.915Z · comments (15)
Six Thoughts on AI Safety
boazbarak · 2025-01-24T22:20:50.768Z · comments (55)
Thoughts on the conservative assumptions in AI control
Buck · 2025-01-17T19:23:38.575Z · comments (5)
Tips On Empirical Research Slides
James Chua (james-chua) · 2025-01-08T05:06:44.942Z · comments (4)
Implications of the inference scaling paradigm for AI safety
Ryan Kidd (ryankidd44) · 2025-01-14T02:14:53.562Z · comments (70)
Tips and Code for Empirical Research Workflows
John Hughes (john-hughes) · 2025-01-20T22:31:51.498Z · comments (14)
[link] Five Recent AI Tutoring Studies
Arjun Panickssery (arjun-panickssery) · 2025-01-19T03:53:47.714Z · comments (0)
Agent Foundations 2025 at CMU
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2025-01-19T23:48:22.569Z · comments (10)