LessWrong 2.0 Reader

Manifold Predicted the AI Extinction Statement and CAIS Wanted it Deleted
David Chee (david-chee) · 2023-06-12T15:54:51.699Z · comments (14)
Cultivate an obsession with the object level
Richard_Ngo (ricraz) · 2023-06-07T01:39:54.778Z · comments (4)
MetaAI: less is less for alignment.
Cleo Nardo (strawberry calm) · 2023-06-13T14:08:45.209Z · comments (17)
[link] LEAst-squares Concept Erasure (LEACE)
tricky_labyrinth · 2023-06-07T21:51:04.494Z · comments (10)
[Fiction] A Disneyland Without Children
L Rudolf L (LRudL) · 2023-06-04T13:06:46.323Z · comments (3)
Adventist Health Study-2 supports pescetarianism more than veganism
Elizabeth (pktechgirl) · 2023-06-17T20:10:06.161Z · comments (11)
Introduction to Towards Causal Foundations of Safe AGI
tom4everitt · 2023-06-12T17:55:24.406Z · comments (6)
[link] "textbooks are all you need"
bhauth · 2023-06-21T17:06:46.148Z · comments (18)
A Friendly Face (Another Failure Story)
Karl von Wendt · 2023-06-20T10:31:24.655Z · comments (21)
Man in the Arena
Richard_Ngo (ricraz) · 2023-06-26T21:57:45.353Z · comments (6)
Short timelines and slow, continuous takeoff as the safest path to AGI
rosehadshar · 2023-06-21T08:56:30.675Z · comments (15)
[link] UK Foundation Model Task Force - Expression of Interest
ojorgensen · 2023-06-18T09:43:27.734Z · comments (2)
TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI
Andrew_Critch · 2023-06-13T05:04:46.756Z · comments (1)
Uncertainty about the future does not imply that AGI will go well
Lauro Langosco · 2023-06-01T17:38:09.619Z · comments (11)
Which personality traits are real? Stress-testing the lexical hypothesis
tailcalled · 2023-06-21T19:46:03.164Z · comments (4)
The ones who endure
Richard_Ngo (ricraz) · 2023-06-16T14:40:09.623Z · comments (15)
A Double-Feature on The Extropians
Maxwell Tabarrok (maxwell-tabarrok) · 2023-06-03T18:27:47.429Z · comments (4)
AISafety.info "How can I help?" FAQ
steven0461 · 2023-06-05T22:09:57.630Z · comments (0)
Ages Survey: Results
jefftk (jkaufman) · 2023-06-05T02:10:06.986Z · comments (10)
A "weak" AGI may attempt an unlikely-to-succeed takeover
RobertM (T3t) · 2023-06-28T20:31:46.356Z · comments (17)
Improvement on MIRI's Corrigibility
WCargo (Wcargo) · 2023-06-09T16:10:46.903Z · comments (8)
[link] formalizing the QACI alignment formal-goal
Tamsin Leake (carado-1) · 2023-06-10T03:28:29.541Z · comments (6)
An Exercise to Build Intuitions on AGI Risk
Lauro Langosco · 2023-06-07T18:35:47.779Z · comments (3)
[Replication] Conjecture's Sparse Coding in Small Transformers
Hoagy · 2023-06-16T18:02:34.874Z · comments (0)
[link] Are Bayesian methods guaranteed to overfit?
Ege Erdil (ege-erdil) · 2023-06-17T12:52:43.987Z · comments (5)
AXRP Episode 22 - Shard Theory with Quintin Pope
DanielFilan · 2023-06-15T19:00:01.340Z · comments (11)
InternLM - China's Best (Unverified)
Lao Mein (derpherpize) · 2023-06-09T07:39:15.179Z · comments (4)
[link] Contingency: A Conceptual Tool from Evolutionary Biology for Alignment
clem_acs · 2023-06-12T20:54:04.315Z · comments (2)
[link] How to Think About Activation Patching
Neel Nanda (neel-nanda-1) · 2023-06-04T14:17:42.264Z · comments (5)
[link] The Case for Overconfidence is Overstated
Kevin Dorst · 2023-06-28T17:21:06.160Z · comments (13)
Crystal Healing — or the Origins of Expected Utility Maximizers
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2023-06-25T03:18:25.033Z · comments (11)
The Control Problem: Unsolved or Unsolvable?
Remmelt (remmelt-ellen) · 2023-06-02T15:42:37.269Z · comments (46)
A moral backlash against AI will probably slow down AGI development
geoffreymiller · 2023-06-07T20:39:42.951Z · comments (10)
Mode collapse in RL may be fueled by the update equation
TurnTrout · 2023-06-19T21:51:04.129Z · comments (10)
Causality: A Brief Introduction
tom4everitt · 2023-06-20T15:01:39.377Z · comments (18)
[link] "Safety Culture for AI" is important, but isn't going to be easy
Davidmanheim · 2023-06-26T12:52:47.368Z · comments (2)
DSLT 1. The RLCT Measures the Effective Dimension of Neural Networks
Liam Carroll (liam-carroll) · 2023-06-16T09:50:10.113Z · comments (8)
AI #18: The Great Debate Debate
Zvi · 2023-06-29T16:20:05.569Z · comments (9)
Instrumental Convergence? [Draft]
J. Dmitri Gallow (j-dmitri-gallow) · 2023-06-14T20:21:41.485Z · comments (20)
[link] Elon talked with senior Chinese leadership about AI X-risk
ChristianKl · 2023-06-07T15:02:49.606Z · comments (2)
AI #16: AI in the UK
Zvi · 2023-06-15T13:20:03.939Z · comments (20)
Updating Drexler's CAIS model
Matthew Barnett (matthew-barnett) · 2023-06-16T22:53:58.140Z · comments (32)
Ban development of unpredictable powerful models?
TurnTrout · 2023-06-20T01:43:11.574Z · comments (25)
I can see how I am Dumb
Johannes C. Mayer (johannes-c-mayer) · 2023-06-10T19:18:59.659Z · comments (11)
[link] an Evangelion dialogue explaining the QACI alignment plan
Tamsin Leake (carado-1) · 2023-06-10T03:28:47.096Z · comments (15)
My impression of singular learning theory
Ege Erdil (ege-erdil) · 2023-06-18T15:34:27.249Z · comments (30)
We Are Less Wrong than E. T. Jaynes on Loss Functions in Human Society
Zack_M_Davis · 2023-06-05T05:34:59.440Z · comments (14)
Leveling Up Or Leveling Off? Understanding The Science Behind Skill Plateaus
lynettebye · 2023-06-16T00:18:04.378Z · comments (9)
How tall is the Shard, really?
philh · 2023-06-23T08:10:02.124Z · comments (10)
Agentic Mess (A Failure Story)
Karl von Wendt · 2023-06-06T13:09:19.125Z · comments (5)