LessWrong 2.0 Reader

Discussion with Eliezer Yudkowsky on AGI interventions
Rob Bensinger (RobbBB) · 2021-11-11T03:01:11.208Z · comments (251)
What should you change in response to an "emergency"? And AI risk
AnnaSalamon · 2022-07-18T01:11:14.667Z · comments (60)
Self-Integrity and the Drowning Child
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-10-24T20:57:01.742Z · comments (85)
Why I think strong general AI is coming soon
porby · 2022-09-28T05:40:38.395Z · comments (139)
Inside Views, Impostor Syndrome, and the Great LARP
johnswentworth · 2023-09-25T16:08:17.040Z · comments (53)
Sharing Information About Nonlinear
Ben Pace (Benito) · 2023-09-07T06:51:11.846Z · comments (323)
[link] Childhoods of exceptional people
Henrik Karlsson (henrik-karlsson) · 2023-02-06T17:27:09.596Z · comments (62)
A non-magical explanation of Jeffrey Epstein
lc · 2021-12-28T21:15:41.953Z · comments (59)
Staring into the abyss as a core life skill
benkuhn · 2022-12-22T15:30:05.093Z · comments (21)
Looking back on my alignment PhD
TurnTrout · 2022-07-01T03:19:59.497Z · comments (63)
[link] EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem
Elizabeth (pktechgirl) · 2023-09-28T23:30:03.390Z · comments (246)
Feature Selection
Zack_M_Davis · 2021-11-01T00:22:29.993Z · comments (24)
Frame Control
Aella · 2021-11-27T22:59:29.436Z · comments (282)
Against Almost Every Theory of Impact of Interpretability
Charbel-Raphaël (charbel-raphael-segerie) · 2023-08-17T18:44:41.099Z · comments (82)
Understanding and controlling a maze-solving policy network
TurnTrout · 2023-03-11T18:59:56.223Z · comments (22)
Alignment Grantmaking is Funding-Limited Right Now
johnswentworth · 2023-07-19T16:49:08.811Z · comments (67)
Models Don't "Get Reward"
Sam Ringer · 2022-12-30T10:37:11.798Z · comments (61)
Epistemic Legibility
Elizabeth (pktechgirl) · 2022-02-09T18:10:06.591Z · comments (30)
Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research
evhub · 2023-08-08T01:30:10.847Z · comments (26)
Shallow review of live agendas in alignment & safety
technicalities · 2023-11-27T11:10:27.464Z · comments (69)
On not getting contaminated by the wrong obesity ideas
Natália (Natália Mendonça) · 2023-01-28T20:18:21.322Z · comments (67)
Optimality is the tiger, and agents are its teeth
Veedrac · 2022-04-02T00:46:27.138Z · comments (42)
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger (RobbBB) · 2022-12-01T23:11:44.279Z · comments (33)
Six Dimensions of Operational Adequacy in AGI Projects
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-05-30T17:00:30.833Z · comments (66)
On how various plans miss the hard bits of the alignment challenge
So8res · 2022-07-12T02:49:50.454Z · comments (88)
An Unexpected Victory: Container Stacking at the Port of Long Beach
Zvi · 2021-10-28T14:40:00.497Z · comments (41)
[link] Pausing AI Developments Isn't Enough. We Need to Shut it All Down by Eliezer Yudkowsky
jacquesthibs (jacques-thibodeau) · 2023-03-29T23:16:19.431Z · comments (296)
Fucking Goddamn Basics of Rationalist Discourse
LoganStrohl (BrienneYudkowsky) · 2023-02-04T01:47:32.578Z · comments (97)
LessWrong is providing feedback and proofreading on drafts as a service
Ruby · 2021-09-07T01:33:10.666Z · comments (53)
Why Agent Foundations? An Overly Abstract Explanation
johnswentworth · 2022-03-25T23:17:10.324Z · comments (56)
[link] When do "brains beat brawn" in Chess? An experiment
titotal (lombertini) · 2023-06-28T13:33:23.854Z · comments (79)
EfficientZero: How It Works
1a3orn · 2021-11-26T15:17:08.321Z · comments (50)
LW Team is adjusting moderation policy
Raemon · 2023-04-04T20:41:07.603Z · comments (181)
Book Review: How Minds Change
bc4026bd4aaa5b7fe (bc4026bd4aaa5b7fe0bdcd47da7a22b453953f990d35286b9d315a619b23667a) · 2023-05-25T17:55:32.218Z · comments (51)
Two-year update on my personal AI timelines
Ajeya Cotra (ajeya-cotra) · 2022-08-02T23:07:48.698Z · comments (60)
Predictable updating about AI risk
Joe Carlsmith (joekc) · 2023-05-08T21:53:34.730Z · comments (23)
[link] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
evhub · 2024-01-12T19:51:01.021Z · comments (94)
Lies, Damn Lies, and Fabricated Options
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2021-10-17T02:47:24.909Z · comments (131)
Speaking to Congressional staffers about AI risk
Akash (akash-wasil) · 2023-12-04T23:08:52.055Z · comments (23)
The Parable of the King and the Random Process
moridinamael · 2023-03-01T22:18:59.734Z · comments (22)
Hooray for stepping out of the limelight
So8res · 2023-04-01T02:45:31.397Z · comments (24)
[link] Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2023-10-05T21:01:39.767Z · comments (18)
Mysteries of mode collapse
janus · 2022-11-08T10:37:57.760Z · comments (56)
Social Dark Matter
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-11-16T20:00:00.000Z · comments (112)
OpenAI: The Battle of the Board
Zvi · 2023-11-22T17:30:04.574Z · comments (82)
We Choose To Align AI
johnswentworth · 2022-01-01T20:06:23.307Z · comments (16)
Is AI Progress Impossible To Predict?
alyssavance · 2022-05-15T18:30:12.103Z · comments (39)
Study Guide
johnswentworth · 2021-11-06T01:23:09.552Z · comments (48)
Sazen
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2022-12-21T07:54:51.415Z · comments (83)
A central AI alignment problem: capabilities generalization, and the sharp left turn
So8res · 2022-06-15T13:10:18.658Z · comments (52)
← previous page (newer posts) · next page (older posts) →