LessWrong 2.0 Reader

[link] How much do you believe your results?
Eric Neyman (UnexpectedValues) · 2023-05-06T20:31:31.277Z · comments (14)
Steering GPT-2-XL by adding an activation vector
TurnTrout · 2023-05-13T18:42:41.321Z · comments (97)
[link] Statement on AI Extinction - Signed by AGI Labs, Top Academics, and Many Other Notable Figures
Dan H (dan-hendrycks) · 2023-05-30T09:05:25.986Z · comments (77)
How to have Polygenically Screened Children
GeneSmith · 2023-05-07T16:01:07.096Z · comments (108)
Book Review: How Minds Change
bc4026bd4aaa5b7fe (bc4026bd4aaa5b7fe0bdcd47da7a22b453953f990d35286b9d315a619b23667a) · 2023-05-25T17:55:32.218Z · comments (52)
Predictable updating about AI risk
Joe Carlsmith (joekc) · 2023-05-08T21:53:34.730Z · comments (23)
My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI
Andrew_Critch · 2023-05-24T00:02:08.836Z · comments (39)
Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)
Chris Scammell (chris-scammell) · 2023-05-10T19:04:21.138Z · comments (53)
Announcing Apollo Research
Marius Hobbhahn (marius-hobbhahn) · 2023-05-30T16:17:19.767Z · comments (11)
Twiblings, four-parent babies and other reproductive technology
GeneSmith · 2023-05-20T17:11:23.726Z · comments (32)
Decision Theory with the Magic Parts Highlighted
moridinamael · 2023-05-16T17:39:55.038Z · comments (24)
When is Goodhart catastrophic?
Drake Thomas (RavenclawPrefect) · 2023-05-09T03:59:16.043Z · comments (23)
Prizes for matrix completion problems
paulfchristiano · 2023-05-03T23:30:08.069Z · comments (51)
Request: stop advancing AI capabilities
So8res · 2023-05-26T17:42:07.182Z · comments (23)
[link] Conjecture internal survey: AGI timelines and probability of human extinction from advanced AI
Maris Sala (maris-sala) · 2023-05-22T14:31:59.139Z · comments (5)
A brief collection of Hinton's recent comments on AGI risk
Kaj_Sotala · 2023-05-04T23:31:06.157Z · comments (9)
Sentience matters
So8res · 2023-05-29T21:25:30.638Z · comments (96)
Dark Forest Theories
Raemon · 2023-05-12T20:21:49.052Z · comments (48)
LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem
Steven Byrnes (steve2152) · 2023-05-08T19:35:19.180Z · comments (37)
AGI safety career advice
Richard_Ngo (ricraz) · 2023-05-02T07:36:09.044Z · comments (24)
Clarifying and predicting AGI
Richard_Ngo (ricraz) · 2023-05-04T15:55:26.283Z · comments (42)
Trust develops gradually via making bids and setting boundaries
Richard_Ngo (ricraz) · 2023-05-19T22:16:38.483Z · comments (12)
Advice for newly busy people
Severin T. Seehrich (sts) · 2023-05-11T16:46:15.313Z · comments (2)
[link] Who regulates the regulators? We need to go beyond the review-and-approval paradigm
jasoncrawford · 2023-05-04T22:11:17.465Z · comments (29)
Some background for reasoning about dual-use alignment research
Charlie Steiner · 2023-05-18T14:50:54.401Z · comments (20)
Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1
StefanHex (Stefan42) · 2023-05-09T19:41:10.528Z · comments (1)
Investigating Fabrication
LoganStrohl (BrienneYudkowsky) · 2023-05-18T17:46:52.783Z · comments (14)
From fear to excitement
Richard_Ngo (ricraz) · 2023-05-15T06:23:18.656Z · comments (8)
Retrospective: Lessons from the Failed Alignment Startup AISafety.com
Søren Elverlin (soren-elverlin-1) · 2023-05-12T18:07:20.857Z · comments (9)
Open Thread With Experimental Feature: Reactions
jimrandomh · 2023-05-24T16:46:39.367Z · comments (189)
A Case for the Least Forgiving Take On Alignment
Thane Ruthenis · 2023-05-02T21:34:49.832Z · comments (82)
Geoff Hinton Quits Google
Adam Shai (adam-shai) · 2023-05-01T21:03:47.806Z · comments (14)
Shah (DeepMind) and Leahy (Conjecture) Discuss Alignment Cruxes
OliviaJ (olivia-jimenez-1) · 2023-05-01T16:47:41.655Z · comments (10)
AI Safety in China: Part 2
Lao Mein (derpherpize) · 2023-05-22T14:50:54.482Z · comments (28)
Most people should probably feel safe most of the time
Kaj_Sotala · 2023-05-09T09:35:11.911Z · comments (28)
[link] DeepMind: Model evaluation for extreme risks
Zach Stein-Perlman · 2023-05-25T03:00:00.915Z · comments (11)
[link] What if they gave an Industrial Revolution and nobody came?
jasoncrawford · 2023-05-17T19:41:32.198Z · comments (10)
[link] Yoshua Bengio: How Rogue AIs may Arise
harfe · 2023-05-23T18:28:27.489Z · comments (12)
An artificially structured argument for expecting AGI ruin
Rob Bensinger (RobbBB) · 2023-05-07T21:52:54.421Z · comments (26)
Input Swap Graphs: Discovering the role of neural network components at scale
Alexandre Variengien (alexandre-variengien) · 2023-05-12T09:41:08.800Z · comments (0)
LessWrong Community Weekend 2023 [Applications now closed]
Henry Prowbell · 2023-05-01T09:08:14.502Z · comments (0)
The bullseye framework: My case against AI doom
titotal (lombertini) · 2023-05-30T11:52:31.194Z · comments (35)
Conditional Prediction with Zero-Sum Training Solves Self-Fulfilling Prophecies
Rubi J. Hudson (Rubi) · 2023-05-26T17:44:35.575Z · comments (13)
Bayesian Networks Aren't Necessarily Causal
Zack_M_Davis · 2023-05-14T01:42:24.319Z · comments (36)
Reacts now enabled on 100% of posts, though still just experimenting
Ruby · 2023-05-28T05:36:40.953Z · comments (73)
Coercion is an adaptation to scarcity; trust is an adaptation to abundance
Richard_Ngo (ricraz) · 2023-05-23T18:14:19.117Z · comments (11)
Lessons learned from offering in-office nutritional testing
Elizabeth (pktechgirl) · 2023-05-15T23:20:10.582Z · comments (11)
Judgments often smuggle in implicit standards
Richard_Ngo (ricraz) · 2023-05-15T18:50:07.781Z · comments (4)
[link] Wikipedia as an introduction to the alignment problem
[deleted] · 2023-05-29T18:43:47.247Z · comments (10)