LessWrong 2.0 Reader

Refine: An Incubator for Conceptual Alignment Research Bets
adamShimi · 2022-04-15T08:57:35.502Z · comments (13)
Interpreting Neural Networks through the Polytope Lens
Sid Black (sid-black) · 2022-09-23T17:58:30.639Z · comments (29)
We're already in AI takeoff
Valentine · 2022-03-08T23:09:06.733Z · comments (119)
[link] Nursing doubts
dynomight · 2024-08-30T02:25:36.826Z · comments (23)
[link] That Alien Message - The Animation
Writer · 2024-09-07T14:53:30.604Z · comments (9)
My side of an argument with Jacob Cannell about chip interconnect losses
Steven Byrnes (steve2152) · 2023-06-21T13:33:49.543Z · comments (11)
Stop posting prompt injections on Twitter and calling it "misalignment"
lc · 2023-02-19T02:21:44.061Z · comments (9)
[link] Fields that I reference when thinking about AI takeover prevention
Buck · 2024-08-13T23:08:54.950Z · comments (16)
[link] Transformer Circuits
evhub · 2021-12-22T21:09:22.676Z · comments (4)
The Bayesian Tyrant
abramdemski · 2020-08-20T00:08:55.738Z · comments (21)
Updating my AI timelines
Matthew Barnett (matthew-barnett) · 2022-12-05T20:46:28.161Z · comments (50)
Momentum of Light in Glass
Ben (ben-lang) · 2024-10-09T20:19:42.088Z · comments (44)
Conversational Cultures: Combat vs Nurture (V2)
Ruby · 2019-12-31T20:23:53.772Z · comments (92)
Takeaways from our robust injury classifier project [Redwood Research]
dmz (DMZ) · 2022-09-17T03:55:25.868Z · comments (12)
[link] We Found An Neuron in GPT-2
Joseph Miller (Josephm) · 2023-02-11T18:27:29.410Z · comments (23)
Sentience matters
So8res · 2023-05-29T21:25:30.638Z · comments (96)
A brief collection of Hinton's recent comments on AGI risk
Kaj_Sotala · 2023-05-04T23:31:06.157Z · comments (9)
Shard Theory in Nine Theses: a Distillation and Critical Appraisal
LawrenceC (LawChan) · 2022-12-19T22:52:20.031Z · comments (30)
Value Claims (In Particular) Are Usually Bullshit
johnswentworth · 2024-05-30T06:26:21.151Z · comments (18)
Irrational Modesty
Tomás B. (Bjartur Tómas) · 2021-06-20T19:38:25.320Z · comments (6)
The Case for Extreme Vaccine Effectiveness
Ruby · 2021-04-13T21:08:39.470Z · comments (37)
Clarifying and predicting AGI
Richard_Ngo (ricraz) · 2023-05-04T15:55:26.283Z · comments (44)
Responses to apparent rationalist confusions about game / decision theory
Anthony DiGiovanni (antimonyanthony) · 2023-08-30T22:02:12.218Z · comments (20)
The Translucent Thoughts Hypotheses and Their Implications
Fabien Roger (Fabien) · 2023-03-09T16:30:02.355Z · comments (7)
[link] The Goddess of Everything Else - The Animation
Writer · 2023-07-13T16:26:25.552Z · comments (4)
Steam
abramdemski · 2022-06-20T17:38:58.548Z · comments (13)
App and book recommendations for people who want to be happier and more productive
KatWoods (ea247) · 2021-11-06T17:40:40.592Z · comments (43)
AI Views Snapshots
Rob Bensinger (RobbBB) · 2023-12-13T00:45:50.016Z · comments (61)
Limits to Legibility
Jan_Kulveit · 2022-06-29T17:42:19.338Z · comments (11)
Twitter thread on postrationalists
Eli Tyre (elityre) · 2022-02-17T09:02:54.806Z · comments (32)
Activation space interpretability may be doomed
bilalchughtai (beelal) · 2025-01-08T12:49:38.421Z · comments (28)
High-stakes alignment via adversarial training [Redwood Research report]
dmz (DMZ) · 2022-05-05T00:59:18.848Z · comments (29)
Consider The Hand Axe
ymeskhout · 2023-04-08T01:31:44.614Z · comments (16)
Why Not Just Outsource Alignment Research To An AI?
johnswentworth · 2023-03-09T21:49:19.774Z · comments (50)
A Semitechnical Introductory Dialogue on Solomonoff Induction
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-03-04T17:27:35.591Z · comments (33)
[link] The Checklist: What Succeeding at AI Safety Will Involve
Sam Bowman (sbowman) · 2024-09-03T18:18:34.230Z · comments (49)
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models"
Rob Bensinger (RobbBB) · 2021-03-05T23:43:54.186Z · comments (13)
Hashing out long-standing disagreements seems low-value to me
So8res · 2023-02-16T06:20:00.899Z · comments (34)
Age changes what you care about
Dentin · 2022-10-16T15:36:36.148Z · comments (37)
How long does it take to become Gaussian?
Maxwell Peterson (maxwell-peterson) · 2020-12-08T07:23:41.725Z · comments (40)
Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers
[deleted] · 2021-04-09T19:19:42.826Z · comments (17)
When is a mind me?
Rob Bensinger (RobbBB) · 2024-04-17T05:56:38.482Z · comments (126)
Maximizing Communication, not Traffic
jefftk (jkaufman) · 2025-01-05T13:00:02.280Z · comments (7)
“Alignment Faking” frame is somewhat fake
Jan_Kulveit · 2024-12-20T09:51:04.664Z · comments (13)
Survey: How Do Elite Chinese Students Feel About the Risks of AI?
Nick Corvino (nick-corvino) · 2024-09-02T18:11:11.867Z · comments (13)
Exercises in Comprehensive Information Gathering
johnswentworth · 2020-02-15T17:27:19.753Z · comments (18)
Request to AGI organizations: Share your views on pausing AI progress
Akash (akash-wasil) · 2023-04-11T17:30:46.707Z · comments (11)
MIRI location optimization (and related topics) discussion
Rob Bensinger (RobbBB) · 2021-05-08T23:12:02.476Z · comments (163)
Understanding Infra-Bayesianism: A Beginner-Friendly Video Series
Jack Parker · 2022-09-22T13:25:04.254Z · comments (6)
The Learning-Theoretic Agenda: Status 2023
Vanessa Kosoy (vanessa-kosoy) · 2023-04-19T05:21:29.177Z · comments (17)