LessWrong 2.0 Reader

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra (ajeya-cotra) · 2022-07-18T19:06:14.670Z · comments (94)
Reward is not the optimization target
TurnTrout · 2022-07-25T00:03:18.307Z · comments (123)
What should you change in response to an "emergency"? And AI risk
AnnaSalamon · 2022-07-18T01:11:14.667Z · comments (60)
Looking back on my alignment PhD
TurnTrout · 2022-07-01T03:19:59.497Z · comments (63)
On how various plans miss the hard bits of the alignment challenge
So8res · 2022-07-12T02:49:50.454Z · comments (88)
Toni Kurz and the Insanity of Climbing Mountains
GeneSmith · 2022-07-03T20:51:58.429Z · comments (67)
Changing the world through slack & hobbies
Steven Byrnes (steve2152) · 2022-07-21T18:11:05.636Z · comments (13)
Safetywashing
Adam Scholl (adam_scholl) · 2022-07-01T11:56:33.495Z · comments (20)
Sexual Abuse attitudes might be infohazardous
Pseudonymous Otter · 2022-07-19T18:06:43.956Z · comments (71)
Unifying Bargaining Notions (1/2)
Diffractor · 2022-07-25T00:28:27.572Z · comments (41)
Humans provide an untapped wealth of evidence about alignment
TurnTrout · 2022-07-14T02:31:48.575Z · comments (94)
[link] Connor Leahy on Dying with Dignity, EleutherAI and Conjecture
Michaël Trazzi (mtrazzi) · 2022-07-22T18:44:19.749Z · comments (29)
A note about differential technological development
So8res · 2022-07-15T04:46:53.166Z · comments (32)
AGI ruin scenarios are likely (and disjunctive)
So8res · 2022-07-27T03:21:57.615Z · comments (38)
ITT-passing and civility are good; "charity" is bad; steelmanning is niche
Rob Bensinger (RobbBB) · 2022-07-05T00:15:36.308Z · comments (36)
«Boundaries», Part 1: a key missing concept from utility theory
Andrew_Critch · 2022-07-26T23:03:55.941Z · comments (32)
Resolve Cycles
CFAR!Duncan (CFAR 2017) · 2022-07-16T23:17:13.037Z · comments (8)
Brainstorm of things that could force an AI team to burn their lead
So8res · 2022-07-24T23:58:16.988Z · comments (8)
Carrying the Torch: A Response to Anna Salamon by the Guild of the Rose
moridinamael · 2022-07-06T14:20:14.847Z · comments (16)
AI Forecasting: One Year In
jsteinhardt · 2022-07-04T05:10:18.470Z · comments (12)
Conjecture: Internal Infohazard Policy
Connor Leahy (NPCollapse) · 2022-07-29T19:07:08.491Z · comments (6)
Limerence Messes Up Your Rationality Real Bad, Yo
Raemon · 2022-07-01T16:53:10.914Z · comments (41)
Principles for Alignment/Agency Projects
johnswentworth · 2022-07-07T02:07:36.156Z · comments (20)
Unifying Bargaining Notions (2/2)
Diffractor · 2022-07-27T03:40:30.524Z · comments (19)
Circumventing interpretability: How to defeat mind-readers
Lee Sharkey (Lee_Sharkey) · 2022-07-14T16:59:22.201Z · comments (12)
Moral strategies at different capability levels
Richard_Ngo (ricraz) · 2022-07-27T18:50:05.366Z · comments (14)
Criticism of EA Criticism Contest
Zvi · 2022-07-14T14:30:00.782Z · comments (17)
Focusing
CFAR!Duncan (CFAR 2017) · 2022-07-29T19:15:35.377Z · comments (23)
Examples of AI Increasing AI Progress
ThomasW (ThomasWoodside) · 2022-07-17T20:06:41.213Z · comments (14)
Safety Implications of LeCun's path to machine intelligence
Ivan Vendrov (ivan-vendrov) · 2022-07-15T21:47:44.411Z · comments (18)
Comment on "Propositions Concerning Digital Minds and Society"
Zack_M_Davis · 2022-07-10T05:48:51.013Z · comments (12)
Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments
Jeffrey Ladish (jeff-ladish) · 2022-07-11T19:38:42.468Z · comments (27)
Naive Hypotheses on AI Alignment
Shoshannah Tekofsky (DarkSym) · 2022-07-02T19:03:49.458Z · comments (29)
Help ARC evaluate capabilities of current language models (still need people)
Beth Barnes (beth-barnes) · 2022-07-19T04:55:18.189Z · comments (6)
A summary of every "Highlights from the Sequences" post
Akash (akash-wasil) · 2022-07-15T23:01:04.392Z · comments (7)
Human values & biases are inaccessible to the genome
TurnTrout · 2022-07-07T17:29:56.190Z · comments (54)
Internal Double Crux
CFAR!Duncan (CFAR 2017) · 2022-07-22T04:34:54.719Z · comments (15)
Immanuel Kant and the Decision Theory App Store
Daniel Kokotajlo (daniel-kokotajlo) · 2022-07-10T16:04:04.248Z · comments (12)
How to Diversify Conceptual Alignment: the Model Behind Refine
adamShimi · 2022-07-20T10:44:02.637Z · comments (11)
MATS Models
johnswentworth · 2022-07-09T00:14:24.812Z · comments (5)
[link] Trends in GPU price-performance
Marius Hobbhahn (marius-hobbhahn) · 2022-07-01T15:51:10.850Z · comments (12)
All AGI safety questions welcome (especially basic ones) [July 2022]
plex (ete) · 2022-07-16T12:57:44.157Z · comments (132)
[link] Don't use 'infohazard' for collectively destructive info
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-07-15T05:13:18.642Z · comments (33)
Benchmark for successful concept extrapolation/avoiding goal misgeneralization
Stuart_Armstrong · 2022-07-04T20:48:14.703Z · comments (12)
Opening Session Tips & Advice
CFAR!Duncan (CFAR 2017) · 2022-07-25T03:57:49.731Z · comments (3)
Trigger-Action Planning
CFAR!Duncan (CFAR 2017) · 2022-07-03T01:42:22.083Z · comments (14)
Goal Factoring
CFAR!Duncan (CFAR 2017) · 2022-07-05T07:10:04.930Z · comments (2)
Addendum: A non-magical explanation of Jeffrey Epstein
lc · 2022-07-18T17:40:37.099Z · comments (21)
[question] How do AI timelines affect how you live your life?
Quadratic Reciprocity · 2022-07-11T13:54:12.961Z · answers+comments (50)
Aversion Factoring
CFAR!Duncan (CFAR 2017) · 2022-07-07T16:09:11.392Z · comments (1)