LessWrong 2.0 Reader

Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals
johnswentworth · 2025-01-24T20:20:28.881Z · comments (51)
“Sharp Left Turn” discourse: An opinionated review
Steven Byrnes (steve2152) · 2025-01-28T18:47:04.395Z · comments (8)
Anomalous Tokens in DeepSeek-V3 and r1
henry (henry-bass) · 2025-01-25T22:55:41.232Z · comments (2)
Ten people on the inside
Buck · 2025-01-28T16:41:22.990Z · comments (24)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (3)
My supervillain origin story
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-27T12:20:46.101Z · comments (0)
[link] Attribution-based parameter decomposition
Lucius Bushnaq (Lblack) · 2025-01-25T13:12:11.031Z · comments (12)
The Game Board has been Flipped: Now is a good time to rethink what you’re doing
Alex Lintz (alex-lintz) · 2025-01-28T23:36:18.106Z · comments (18)
The Rising Sea
Jesse Hoogland (jhoogland) · 2025-01-25T20:48:52.971Z · comments (2)
Stargate AI-1
Zvi · 2025-01-24T15:20:18.752Z · comments (1)
Six Thoughts on AI Safety
boazbarak · 2025-01-24T22:20:50.768Z · comments (51)
[link] Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
Jan_Kulveit · 2025-01-30T17:03:45.545Z · comments (11)
[link] Yudkowsky on The Trajectory podcast
Seth Herd · 2025-01-24T19:52:15.104Z · comments (36)
Should you go with your best guess?: Against precise Bayesianism and related views
Anthony DiGiovanni (antimonyanthony) · 2025-01-27T20:25:26.809Z · comments (8)
[link] Paper: Open Problems in Mechanistic Interpretability
Lee Sharkey (Lee_Sharkey) · 2025-01-29T10:25:54.727Z · comments (0)
Fake thinking and real thinking
Joe Carlsmith (joekc) · 2025-01-28T20:05:06.735Z · comments (3)
Kessler's Second Syndrome
Jesse Hoogland (jhoogland) · 2025-01-26T07:04:17.852Z · comments (2)
On polytopes
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-25T13:56:35.681Z · comments (5)
DeepSeek Panic at the App Store
Zvi · 2025-01-28T19:30:07.555Z · comments (14)
A sketch of an AI control safety case
Tomek Korbak (tomek-korbak) · 2025-01-30T17:28:47.992Z · comments (0)
[link] Dario Amodei: On DeepSeek and Export Controls
Zach Stein-Perlman · 2025-01-29T17:15:18.986Z · comments (2)
Brainrot
Jesse Hoogland (jhoogland) · 2025-01-26T05:35:35.396Z · comments (0)
Why care about AI personhood?
Francis Rhys Ward (francis-rhys-ward) · 2025-01-26T11:24:45.596Z · comments (6)
[link] You should read Hobbes, Locke, Hume, and Mill via EarlyModernTexts.com
Arjun Panickssery (arjun-panickssery) · 2025-01-30T12:35:03.564Z · comments (1)
Operator
Zvi · 2025-01-28T20:00:08.374Z · comments (1)
Eliciting bad contexts
Geoffrey Irving · 2025-01-24T10:39:39.358Z · comments (2)
Agents don't have to be aligned to help us achieve an indefinite pause.
Hastings (hastings-greer) · 2025-01-25T18:51:03.523Z · comments (0)
AI #101: The Shallow End
Zvi · 2025-01-30T14:50:08.269Z · comments (1)
[link] Anthropic CEO calls for RSI
Andrea_Miotti (AndreaM) · 2025-01-29T16:54:24.943Z · comments (10)
DeepSeek: Lemon, It’s Wednesday
Zvi · 2025-01-29T15:00:07.914Z · comments (0)
[link] Insights from "The Manga Guide to Physiology"
TurnTrout · 2025-01-24T05:18:57.772Z · comments (3)
What's Behind the SynBio Bust?
sarahconstantin · 2025-01-30T22:30:06.916Z · comments (1)
Deference and Decision-Making
ben_levinstein (benlev) · 2025-01-27T22:02:17.578Z · comments (0)
[link] Steering Gemini with BiDPO
TurnTrout · 2025-01-31T02:37:55.839Z · comments (1)
The generalization phase diagram
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-26T20:30:15.212Z · comments (2)
[question] Is the output of the softmax in a single transformer attention head usually winner-takes-all?
Linda Linsefors · 2025-01-27T15:33:28.992Z · answers+comments (1)
[link] Reinforcement Learning by AI Punishment
Abhishaike Mahajan (abhishaike-mahajan) · 2025-01-28T00:57:51.715Z · comments (0)
The Upcoming PEPFAR Cut Will Kill Millions, Many of Them Children
omnizoid · 2025-01-27T16:03:51.214Z · comments (2)
[link] Counterintuitive effects of minimum prices
dynomight · 2025-01-24T23:05:26.099Z · comments (0)
ARENA 5.0 - Call for Applicants
JamesH (AtlasOfCharts) · 2025-01-30T13:18:27.052Z · comments (0)
so you have a chronic health issue
agencypilled · 2025-01-26T19:00:29.972Z · comments (9)
[link] Are we trying to figure out if AI is conscious?
Kristaps Zilgalvis (kristaps-zilgalvis-1) · 2025-01-27T01:05:07.001Z · comments (6)
SAE regularization produces more interpretable models
Peter Lai (peter-lai) · 2025-01-28T20:02:56.662Z · comments (3)
AI Strategy Updates that You Should Make
Alice Blair (Diatom) · 2025-01-27T21:10:41.838Z · comments (2)
QFT and neural nets: the basic idea
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-24T13:54:45.099Z · comments (0)
Efficiency spectra and “bucket of circuits” cartoons
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-29T15:06:50.768Z · comments (0)
[link] Notes on Argentina
Annapurna (jorge-velez) · 2025-01-26T03:51:15.393Z · comments (5)
Monet: Mixture of Monosemantic Experts for Transformers Explained
CalebMaresca (caleb-maresca) · 2025-01-25T19:37:09.078Z · comments (2)
The present perfect tense is ruining your life
PatrickDFarley · 2025-01-27T16:14:48.843Z · comments (8)
The memorization-generalization spectrum and learning coefficients
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-28T16:53:24.628Z · comments (0)