LessWrong 2.0 Reader


← previous page (newer posts) · next page (older posts) →

Transformers Represent Belief State Geometry in their Residual Stream
Adam Shai (adam-shai) · 2024-04-16T21:16:11.377Z · comments (100)
(My understanding of) What Everyone in Technical Alignment is Doing and Why
Thomas Larsen (thomas-larsen) · 2022-08-29T01:23:58.073Z · comments (90)
Ugh fields
Roko · 2010-04-12T17:06:18.510Z · comments (81)
[link] It Looks Like You're Trying To Take Over The World
gwern · 2022-03-09T16:35:35.326Z · comments (120)
You Are Not Measuring What You Think You Are Measuring
johnswentworth · 2022-09-20T20:04:22.899Z · comments (44)
Bing Chat is blatantly, aggressively misaligned
evhub · 2023-02-15T05:29:45.262Z · comments (181)
That Alien Message
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-05-22T05:55:13.000Z · comments (176)
Dying Outside
HalFinney · 2009-10-05T02:45:02.960Z · comments (91)
[link] How AI Takeover Might Happen in 2 Years
joshc (joshua-clymer) · 2025-02-07T17:10:10.530Z · comments (132)
DeepMind alignment team opinions on AGI ruin arguments
Vika · 2022-08-12T21:06:40.582Z · comments (37)
Reliable Sources: The Story of David Gerard
TracingWoodgrains (tracingwoodgrains) · 2024-07-10T19:50:21.191Z · comments (54)
What Do We Mean By "Rationality"?
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-16T22:33:55.765Z · comments (19)
[link] Reflections on six months of fatherhood
jasoncrawford · 2022-01-31T05:28:09.154Z · comments (24)
How I got 4.2M YouTube views without making a single video
Closed Limelike Curves · 2024-09-03T03:52:33.025Z · comments (36)
The hostile telepaths problem
Valentine · 2024-10-27T15:26:53.610Z · comments (89)
[link] Statement on AI Extinction - Signed by AGI Labs, Top Academics, and Many Other Notable Figures
Dan H (dan-hendrycks) · 2023-05-30T09:05:25.986Z · comments (78)
Lies Told To Children
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-04-14T11:25:10.282Z · comments (94)
Expecting Short Inferential Distances
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2007-10-22T23:42:01.000Z · comments (106)
Intellectual Hipsters and Meta-Contrarianism
Scott Alexander (Yvain) · 2010-09-13T21:36:33.236Z · comments (367)
How to Ignore Your Emotions (while also thinking you're awesome at emotions)
Hazard · 2019-07-31T13:34:16.506Z · comments (79)
There is way too much serendipity
Malmesbury (Elmer of Malmesbury) · 2024-01-19T19:37:57.068Z · comments (56)
Anti-Aging: State of the Art
JackH · 2020-12-31T19:07:03.430Z · comments (176)
Reward is not the optimization target
TurnTrout · 2022-07-25T00:03:18.307Z · comments (123)
Applause Lights
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2007-09-11T18:31:48.000Z · comments (99)
[link] A Mechanistic Interpretability Analysis of Grokking
Neel Nanda (neel-nanda-1) · 2022-08-15T02:41:36.245Z · comments (48)
Counterarguments to the basic AI x-risk case
KatjaGrace · 2022-10-14T13:00:05.903Z · comments (124)
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra (ajeya-cotra) · 2022-07-18T19:06:14.670Z · comments (95)
The Lens That Sees Its Flaws
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2007-09-23T00:10:41.000Z · comments (47)
Twelve Virtues of Rationality
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2006-01-01T08:00:05.370Z · comments (13)
[link] My hour of memoryless lucidity
Eric Neyman (UnexpectedValues) · 2024-05-04T01:40:56.717Z · comments (35)
Please don't throw your mind away
TsviBT · 2023-02-15T21:41:05.988Z · comments (48)
Accounting For College Costs
johnswentworth · 2022-04-01T17:28:19.409Z · comments (41)
Noting an error in Inadequate Equilibria
Matthew Barnett (matthew-barnett) · 2023-02-08T01:33:33.715Z · comments (60)
To listen well, get curious
benkuhn · 2020-12-13T00:20:09.608Z · comments (37)
How to have Polygenically Screened Children
GeneSmith · 2023-05-07T16:01:07.096Z · comments (128)
Working hurts less than procrastinating, we fear the twinge of starting
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2011-01-02T00:15:08.923Z · comments (162)
How it feels to have your mind hacked by an AI
blaked · 2023-01-12T00:33:18.866Z · comments (222)
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
elspood · 2022-06-21T23:55:39.918Z · comments (42)
Lessons I've Learned from Self-Teaching
TurnTrout · 2021-01-23T19:00:55.559Z · comments (76)
My Objections to "We’re All Gonna Die with Eliezer Yudkowsky"
Quintin Pope (quintin-pope) · 2023-03-21T00:06:07.889Z · comments (233)
[link] Survival without dignity
L Rudolf L (LRudL) · 2024-11-04T02:29:38.758Z · comments (29)
Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)
Andrew_Critch · 2024-06-14T00:16:47.850Z · comments (38)
[link] Review: Planecrash
L Rudolf L (LRudL) · 2024-12-27T14:18:33.611Z · comments (45)
Notifications Received in 30 Minutes of Class
tanagrabeast · 2024-05-26T17:02:20.989Z · comments (16)
How To Get Into Independent Research On Alignment/Agency
johnswentworth · 2021-11-19T00:00:21.600Z · comments (38)
Staring into the abyss as a core life skill
benkuhn · 2022-12-22T15:30:05.093Z · comments (22)
The Parable of Predict-O-Matic
abramdemski · 2019-10-15T00:49:20.167Z · comments (43)
MIRI announces new "Death With Dignity" strategy
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-04-02T00:43:19.814Z · comments (545)
What DALL-E 2 can and cannot do
Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2022-05-01T23:51:22.310Z · comments (303)
[link] Thoughts on seed oil
dynomight · 2024-04-20T12:29:14.212Z · comments (129)