LessWrong 2.0 Reader

Patient Observation
LoganStrohl (BrienneYudkowsky) · 2022-02-23T19:31:45.062Z · comments (4)
High Status Eschews Quantification of Performance
niplav · 2023-03-19T22:14:16.523Z · comments (36)
Long covid: probably worth avoiding—some considerations
KatjaGrace · 2022-01-16T11:46:52.087Z · comments (88)
Limerence Messes Up Your Rationality Real Bad, Yo
Raemon · 2022-07-01T16:53:10.914Z · comments (42)
Clarifying AI X-risk
zac_kenton (zkenton) · 2022-11-01T11:03:01.144Z · comments (24)
On the Diplomacy AI
Zvi · 2022-11-28T13:20:00.884Z · comments (29)
I left Russia on March 8
avturchin · 2022-03-10T20:05:59.650Z · comments (16)
"Pivotal Acts" means something specific
Raemon · 2022-06-07T21:56:00.574Z · comments (23)
Community Notes by X
NicholasKees (nick_kees) · 2024-03-18T17:13:33.195Z · comments (15)
Selection Theorems: A Program For Understanding Agents
johnswentworth · 2021-09-28T05:03:19.316Z · comments (28)
[link] Parkinson's Law and the Ideology of Statistics
Benquo · 2025-01-04T15:49:21.247Z · comments (6)
Re-Examining LayerNorm
Eric Winsor (EricWinsor) · 2022-12-01T22:20:23.542Z · comments (12)
Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think
Zack_M_Davis · 2019-12-27T05:09:22.546Z · comments (43)
My Overview of the AI Alignment Landscape: A Bird's Eye View
Neel Nanda (neel-nanda-1) · 2021-12-15T23:44:31.873Z · comments (9)
Building AI Research Fleets
Ben Goldhaber (bgold) · 2025-01-12T18:23:09.682Z · comments (11)
Tell me about yourself: LLMs are aware of their learned behaviors
Martín Soto (martinsq) · 2025-01-22T00:47:15.023Z · comments (5)
One-layer transformers aren’t equivalent to a set of skip-trigrams
Buck · 2023-02-17T17:26:13.819Z · comments (11)
Goodhart's Law in Reinforcement Learning
jacek (jacek-karwowski) · 2023-10-16T00:54:11.669Z · comments (22)
[link] FLI open letter: Pause giant AI experiments
Zach Stein-Perlman · 2023-03-29T04:04:23.333Z · comments (123)
Warning Shots Probably Wouldn't Change The Picture Much
So8res · 2022-10-06T05:15:39.391Z · comments (42)
Shared reality: a key driver of human behavior
kdbscott · 2022-12-24T19:35:51.126Z · comments (25)
Pantheon Interface
NicholasKees (nick_kees) · 2024-07-08T19:03:51.681Z · comments (22)
[link] The Hubinger lectures on AGI safety: an introductory lecture series
evhub · 2023-06-22T00:59:27.820Z · comments (0)
ARC is hiring theoretical researchers
paulfchristiano · 2023-06-12T18:50:08.232Z · comments (12)
AI Alignment 2018-19 Review
Rohin Shah (rohinmshah) · 2020-01-28T02:19:52.782Z · comments (6)
Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems)
NunoSempere (Radamantis) · 2020-12-03T22:00:26.889Z · comments (28)
A Longlist of Theories of Impact for Interpretability
Neel Nanda (neel-nanda-1) · 2022-03-11T14:55:35.356Z · comments (41)
The case for becoming a black-box investigator of language models
Buck · 2022-05-06T14:35:24.630Z · comments (20)
Some background for reasoning about dual-use alignment research
Charlie Steiner · 2023-05-18T14:50:54.401Z · comments (22)
Insights from Euclid's 'Elements'
TurnTrout · 2020-05-04T15:45:30.711Z · comments (17)
A Shutdown Problem Proposal
johnswentworth · 2024-01-21T18:12:48.664Z · comments (61)
Baking is Not a Ritual
Sisi Cheng (sisi-cheng) · 2020-05-25T18:08:24.836Z · comments (28)
Recommendation: Bug Bounties and Responsible Disclosure for Advanced ML Systems
Vaniver · 2023-02-17T20:11:39.255Z · comments (12)
From fear to excitement
Richard_Ngo (ricraz) · 2023-05-15T06:23:18.656Z · comments (9)
An even deeper atheism
Joe Carlsmith (joekc) · 2024-01-11T17:28:31.843Z · comments (47)
AI Safety "Success Stories"
Wei Dai (Wei_Dai) · 2019-09-07T02:54:15.003Z · comments (27)
One Minute Every Moment
abramdemski · 2023-09-01T20:23:56.391Z · comments (23)
[link] Gene drives: why the wait?
Metacelsus · 2022-09-19T23:37:17.595Z · comments (50)
There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs
Taran · 2023-02-19T12:25:52.212Z · comments (34)
Induction heads - illustrated
CallumMcDougall (TheMcDouglas) · 2023-01-02T15:35:20.550Z · comments (10)
Transcript: "You Should Read HPMOR"
TurnTrout · 2021-11-02T18:20:53.161Z · comments (12)
[link] Fiber arts, mysterious dodecahedrons, and waiting on “Eureka!”
eukaryote · 2022-08-04T20:37:59.388Z · comments (15)
Deconfusing Direct vs Amortised Optimization
beren · 2022-12-02T11:30:46.754Z · comments (19)
Explaining the Twitter Postrat Scene
Jacob Falkovich (Jacobian) · 2022-04-05T22:23:27.125Z · comments (28)
"The Solomonoff Prior is Malign" is a special case of a simpler argument
David Matolcsi (matolcsid) · 2024-11-17T21:32:34.711Z · comments (44)
The Wicked Problem Experience
HoldenKarnofsky · 2022-03-02T17:50:18.621Z · comments (6)
[link] Bayesian Injustice
Kevin Dorst · 2023-12-14T15:44:08.664Z · comments (10)
Things I've Grieved
Raemon · 2024-02-18T19:32:47.169Z · comments (6)
[link] When discussing AI risks, talk about capabilities, not intelligence
Vika · 2023-08-11T13:38:48.844Z · comments (7)
[link] OpenAI's CBRN tests seem unclear
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:28:30.290Z · comments (6)