LessWrong 2.0 Reader

2023 Unofficial LessWrong Census/Survey
Screwtape · 2023-12-02T04:41:51.418Z · comments (81)
[April Fools'] Definitive confirmation of shard theory
TurnTrout · 2023-04-01T07:27:23.096Z · comments (8)
Will alignment-faking Claude accept a deal to reveal its misalignment?
ryan_greenblatt · 2025-01-31T16:49:47.316Z · comments (23)
Thoughts on the AI Safety Summit company policy requests and responses
So8res · 2023-10-31T23:54:09.566Z · comments (14)
Conflict vs. mistake in non-zero-sum games
Nisan · 2020-04-05T22:22:41.374Z · comments (40)
Testing The Natural Abstraction Hypothesis: Project Intro
johnswentworth · 2021-04-06T21:24:43.135Z · comments (41)
Davidad's Bold Plan for Alignment: An In-Depth Explanation
Charbel-Raphaël (charbel-raphael-segerie) · 2023-04-19T16:09:01.455Z · comments (40)
The 'Neglected Approaches' Approach: AE Studio's Alignment Agenda
Cameron Berg (cameron-berg) · 2023-12-18T20:35:01.569Z · comments (22)
2021 AI Alignment Literature Review and Charity Comparison
Larks · 2021-12-23T14:06:50.721Z · comments (28)
You are probably underestimating how good self-love can be
Charlie Rogers-Smith (charlie.rs) · 2021-11-14T00:41:35.011Z · comments (19)
[link] o1: A Technical Primer
Jesse Hoogland (jhoogland) · 2024-12-09T19:09:12.413Z · comments (17)
The Brain is Not Close to Thermodynamic Limits on Computation
DaemonicSigil · 2023-04-24T08:21:44.727Z · comments (58)
Book Review: Working With Contracts
johnswentworth · 2020-09-14T23:22:11.215Z · comments (27)
Make more land
jefftk (jkaufman) · 2019-10-16T11:20:03.381Z · comments (36)
Impossibility results for unbounded utilities
paulfchristiano · 2022-02-02T03:52:18.780Z · comments (109)
Shard Theory: An Overview
David Udell · 2022-08-11T05:44:52.852Z · comments (34)
My understanding of Anthropic strategy
Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2023-02-15T01:56:40.961Z · comments (31)
Worst-case thinking in AI alignment
Buck · 2021-12-23T01:29:47.954Z · comments (18)
What Discovering Latent Knowledge Did and Did Not Find
Fabien Roger (Fabien) · 2023-03-13T19:29:45.601Z · comments (17)
[link] Things that can kill you quickly: What everyone should know about first aid
jasoncrawford · 2022-12-27T16:23:24.831Z · comments (21)
Planes are still decades away from displacing most bird jobs
guzey · 2022-11-25T16:49:32.344Z · comments (13)
How useful is mechanistic interpretability?
ryan_greenblatt · 2023-12-01T02:54:53.488Z · comments (54)
Playing with DALL·E 2
Dave Orr (dave-orr) · 2022-04-07T18:49:16.301Z · comments (118)
When can we trust model evaluations?
evhub · 2023-07-28T19:42:21.799Z · comments (10)
How will we update about scheming?
ryan_greenblatt · 2025-01-06T20:21:52.281Z · comments (20)
[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (14)
A list of core AI safety problems and how I hope to solve them
davidad · 2023-08-26T15:12:18.484Z · comments (29)
Most People Start With The Same Few Bad Ideas
johnswentworth · 2022-09-09T00:29:12.740Z · comments (30)
[link] Tuning your Cognitive Strategies
Raemon · 2023-04-27T20:32:06.337Z · comments (58)
Everything I Need To Know About Takeoff Speeds I Learned From Air Conditioner Ratings On Amazon
johnswentworth · 2022-04-15T19:05:46.442Z · comments (128)
[link] Six (and a half) intuitions for KL divergence
CallumMcDougall (TheMcDouglas) · 2022-10-12T21:07:07.796Z · comments (27)
[link] The Social Recession: By the Numbers
antonomon · 2022-10-29T18:45:09.001Z · comments (29)
$20 Million in NSF Grants for Safety Research
Dan H (dan-hendrycks) · 2023-02-28T04:44:38.417Z · comments (12)
Everyday Lessons from High-Dimensional Optimization
johnswentworth · 2020-06-06T20:57:05.155Z · comments (44)
[Beta Feature] Google-Docs-like editing for LessWrong posts
Ruby · 2022-02-23T01:52:22.141Z · comments (26)
You can just spontaneously call people you haven't met in years
lc · 2023-11-13T05:21:05.726Z · comments (21)
On A List of Lethalities
Zvi · 2022-06-13T12:30:01.624Z · comments (50)
Studies On Slack
Scott Alexander (Yvain) · 2020-05-13T05:00:02.772Z · comments (34)
Deepmind's Gato: Generalist Agent
Daniel Kokotajlo (daniel-kokotajlo) · 2022-05-12T16:01:21.803Z · comments (62)
Why I think there's a one-in-six chance of an imminent global nuclear war
Max Tegmark (MaxTegmark) · 2022-10-08T06:26:40.235Z · comments (169)
Towards understanding-based safety evaluations
evhub · 2023-03-15T18:18:01.259Z · comments (16)
Prizes for matrix completion problems
paulfchristiano · 2023-05-03T23:30:08.069Z · comments (52)
Paper-Reading for Gears
johnswentworth · 2019-12-04T21:02:56.316Z · comments (6)
RSPs are pauses done right
evhub · 2023-10-14T04:06:02.709Z · comments (73)
The Coordination Frontier: Sequence Intro
Raemon · 2021-09-04T22:11:00.122Z · comments (22)
[link] Masterpiece
Richard_Ngo (ricraz) · 2024-02-13T23:10:35.376Z · comments (21)
[link] Boycott OpenAI
PeterMcCluskey · 2024-06-18T19:52:42.854Z · comments (26)
"Can you keep this confidential? How do you know?"
Raemon · 2020-07-21T00:33:27.974Z · comments (43)
Slack matters more than any outcome
Valentine · 2022-12-31T20:11:02.287Z · comments (56)
Announcing ILIAD — Theoretical AI Alignment Conference
Nora_Ammann · 2024-06-05T09:37:39.546Z · comments (18)