LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

[link] Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2023-10-05T21:01:39.767Z · comments (21)
Politics is way too meta
Rob Bensinger (RobbBB) · 2021-03-17T07:04:42.187Z · comments (46)
Social Dark Matter
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-11-16T20:00:00.000Z · comments (112)
Mysteries of mode collapse
janus · 2022-11-08T10:37:57.760Z · comments (57)
Study Guide
johnswentworth · 2021-11-06T01:23:09.552Z · comments (48)
Hooray for stepping out of the limelight
So8res · 2023-04-01T02:45:31.397Z · comments (24)
[link] My hour of memoryless lucidity
Eric Neyman (UnexpectedValues) · 2024-05-04T01:40:56.717Z · comments (19)
[link] Intentionally Making Close Friends
Neel Nanda (neel-nanda-1) · 2021-06-27T23:06:49.269Z · comments (35)
A central AI alignment problem: capabilities generalization, and the sharp left turn
So8res · 2022-06-15T13:10:18.658Z · comments (53)
We Choose To Align AI
johnswentworth · 2022-01-01T20:06:23.307Z · comments (16)
Is AI Progress Impossible To Predict?
alyssavance · 2022-05-15T18:30:12.103Z · comments (39)
OpenAI: The Battle of the Board
Zvi · 2023-11-22T17:30:04.574Z · comments (82)
What Are You Tracking In Your Head?
johnswentworth · 2022-06-28T19:30:06.164Z · comments (81)
Sazen
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2022-12-21T07:54:51.415Z · comments (83)
My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI
Andrew_Critch · 2023-05-24T00:02:08.836Z · comments (39)
Guide to rationalist interior decorating
mingyuan · 2023-06-19T06:47:13.704Z · comments (45)
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)
Andrew_Critch · 2021-03-31T23:50:31.620Z · comments (64)
Notes on Teaching in Prison
jsd · 2023-04-19T01:53:00.427Z · comments (12)
Don't die with dignity; instead play to your outs
Jeffrey Ladish (jeff-ladish) · 2022-04-06T07:53:05.172Z · comments (59)
The Base Rate Times, news through prediction markets
vandemonian · 2023-06-06T17:42:56.718Z · comments (40)
Toni Kurz and the Insanity of Climbing Mountains
GeneSmith · 2022-07-03T20:51:58.429Z · comments (67)
Gentleness and the artificial Other
Joe Carlsmith (joekc) · 2024-01-02T18:21:34.746Z · comments (33)
Humans are very reliable agents
alyssavance · 2022-06-16T22:02:10.892Z · comments (35)
We don’t trade with ants
KatjaGrace · 2023-01-10T23:50:11.476Z · comments (109)
Seven Years of Spaced Repetition Software in the Classroom
tanagrabeast · 2021-03-04T02:42:01.475Z · comments (38)
OpenAI: Facts from a Weekend
Zvi · 2023-11-20T15:30:06.732Z · comments (158)
Accidentally Load Bearing
jefftk (jkaufman) · 2023-07-13T16:10:00.806Z · comments (14)
Core Pathways of Aging
johnswentworth · 2021-03-28T00:31:49.698Z · comments (123)
[link] Scale Was All We Needed, At First
Gabe M (gabe-mukobi) · 2024-02-14T01:49:16.184Z · comments (31)
12 interesting things I learned studying the discovery of nature's laws
Ben Pace (Benito) · 2022-02-19T23:39:47.841Z · comments (40)
Your Cheerful Price
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-02-13T05:41:53.511Z · comments (82)
The 6D effect: When companies take risks, one email can be very powerful.
scasper · 2023-11-04T20:08:39.775Z · comments (40)
A Brief Introduction to Container Logistics
Vitor · 2021-11-11T15:58:11.510Z · comments (22)
[link] Where do your eyes go?
alkjash · 2021-09-19T22:43:47.491Z · comments (22)
Basics of Rationalist Discourse
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-01-27T02:40:52.739Z · comments (180)
Omicron Variant Post #1: We’re F***ed, It’s Never Over
Zvi · 2021-11-26T19:00:00.988Z · comments (95)
Express interest in an "FHI of the West"
habryka (habryka4) · 2024-04-18T03:32:58.592Z · comments (41)
Constellations are Younger than Continents
Jeffrey Heninger (jeffrey-heninger) · 2023-12-19T06:12:40.667Z · comments (22)
"Carefully Bootstrapped Alignment" is organizationally hard
Raemon · 2023-03-17T18:00:09.943Z · comments (22)
On green
Joe Carlsmith (joekc) · 2024-03-21T17:38:56.295Z · comments (34)
Changing the world through slack & hobbies
Steven Byrnes (steve2152) · 2022-07-21T18:11:05.636Z · comments (13)
AI Timelines
habryka (habryka4) · 2023-11-10T05:28:24.841Z · comments (74)
Your Dog is Even Smarter Than You Think
StyleOfDog · 2021-05-01T05:16:09.821Z · comments (108)
So, geez there's a lot of AI content these days
Raemon · 2022-10-06T21:32:20.833Z · comments (140)
Safetywashing
Adam Scholl (adam_scholl) · 2022-07-01T11:56:33.495Z · comments (20)
[link] [SEE NEW EDITS] No, *You* Need to Write Clearer
NicholasKross · 2023-04-29T05:04:01.559Z · comments (64)
Sexual Abuse attitudes might be infohazardous
Pseudonymous Otter · 2022-07-19T18:06:43.956Z · comments (71)
The Plan
johnswentworth · 2021-12-10T23:41:39.417Z · comments (78)
larger language models may disappoint you [or, an eternally unfinished draft]
nostalgebraist · 2021-11-26T23:08:56.221Z · comments (31)
[link] Paul Christiano named as US AI Safety Institute Head of AI Safety
Joel Burget (joel-burget) · 2024-04-16T16:22:06.937Z · comments (59)