LessWrong 2.0 Reader

14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource
Bryce Robertson (bryceerobertson) · 2025-01-21T17:34:02.170Z · comments (0)
Navigation by Moonlight
Jacob Falkovich (Jacobian) · 2025-04-07T15:32:17.353Z · comments (39)
Come join Dovetail's agent foundations fellowship talks & discussion
Alex_Altair · 2025-02-15T22:10:02.166Z · comments (0)
The non-tribal tribes
PatrickDFarley · 2025-02-26T17:22:59.949Z · comments (4)
Whether governments will control AGI is important and neglected
Seth Herd · 2025-03-14T09:48:34.062Z · comments (2)
Bike Lights are Cheap Enough to Give Away
jefftk (jkaufman) · 2025-03-14T02:10:02.482Z · comments (0)
MATS Spring 2024 Extension Retrospective
HenningB (HenningBlue) · 2025-02-12T22:43:58.193Z · comments (1)
[link] Nucleic Acid Observatory Updates, April 2025
jefftk (jkaufman) · 2025-04-15T18:58:29.839Z · comments (0)
Logical Correlation
niplav · 2025-02-10T23:29:10.518Z · comments (6)
Saving Zest
jefftk (jkaufman) · 2025-03-02T12:00:41.732Z · comments (1)
Against podcasts
Adam Zerner (adamzerner) · 2025-04-05T19:20:00.716Z · comments (19)
[question] What faithfulness metrics should general claims about CoT faithfulness be based upon?
Rauno Arike (rauno-arike) · 2025-04-08T15:27:20.346Z · answers+comments (0)
Explaining the Joke: Pausing is The Way
WillPetillo · 2025-04-04T09:04:38.847Z · comments (2)
I grade every NBA basketball game I watch based on enjoyability
proshowersinger · 2025-03-12T21:46:26.791Z · comments (2)
[link] New Report: Multi-Agent Risks from Advanced AI
Lewis Hammond (lewis-hammond-1) · 2025-02-23T00:32:29.534Z · comments (0)
Export Surplusses
lsusr · 2025-02-24T05:53:23.422Z · comments (21)
The present perfect tense is ruining your life
PatrickDFarley · 2025-01-27T16:14:48.843Z · comments (14)
Interesting ACX 2024 Book Review Entries
jenn (pixx) · 2025-04-20T18:10:04.973Z · comments (1)
How to mitigate sandbagging
Teun van der Weij (teun-van-der-weij) · 2025-03-23T17:19:07.452Z · comments (0)
Monthly Roundup #29: April 2025
Zvi · 2025-04-14T11:50:02.324Z · comments (6)
[link] Forging A New AGI Social Contract
Deric Cheng (deric-cheng) · 2025-04-10T13:41:11.817Z · comments (3)
What is a circuit? [in interpretability]
Yudhister Kumar (randomwalks) · 2025-02-14T04:40:42.978Z · comments (1)
[link] Currency Collapse
prue (prue0) · 2025-04-11T03:48:01.469Z · comments (3)
[link] Notes on the Presidential Election of 1836
Arjun Panickssery (arjun-panickssery) · 2025-02-13T23:40:23.224Z · comments (0)
Review: The Lathe of Heaven
dr_s · 2025-01-31T08:10:58.673Z · comments (0)
Two flaws in the Machiavelli Benchmark
TheManxLoiner · 2025-02-12T19:34:35.241Z · comments (0)
A model of the final phase: the current frontier AIs as de facto CEOs of their own companies
Mitchell_Porter · 2025-03-08T22:15:35.260Z · comments (2)
[question] LessWrong merch?
Brendan Long (korin43) · 2025-04-03T21:51:47.190Z · answers+comments (2)
AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability
DanielFilan · 2025-03-28T18:40:01.856Z · comments (0)
Prodromes and Biomarkers in Chronic Disease
sarahconstantin · 2025-04-16T21:30:02.978Z · comments (2)
The Last Light
Bridgett Kay (bridgett-kay) · 2025-04-14T15:41:02.745Z · comments (2)
A Bunch of Matryoshka SAEs
chanind · 2025-04-04T14:53:56.805Z · comments (0)
The Leapfrogging Terminus and the Fuzzy Cut
Jim Pivarski (jim-pivarski) · 2025-03-31T04:08:24.023Z · comments (6)
[link] AI Tools for Existential Security
Lizka · 2025-03-14T18:38:06.110Z · comments (4)
[link] Why People Commit White Collar Fraud (Ozy linkpost)
sapphire (deluks917) · 2025-03-03T19:33:15.609Z · comments (1)
[link] The Peeperi (unfinished) - By Katja Grace
Nathan Young · 2025-02-17T19:33:29.894Z · comments (0)
so you have a chronic health issue
agencypilled · 2025-01-26T19:00:29.972Z · comments (9)
Notes on handling non-concentrated failures with AI control: high level methods and different regimes
ryan_greenblatt · 2025-03-24T01:00:38.222Z · comments (3)
Doing principle-of-charity better
Sniffnoy · 2025-03-27T05:19:52.195Z · comments (1)
[question] Does the AI control agenda broadly rely on no FOOM being possible?
Noosphere89 (sharmake-farah) · 2025-03-29T19:38:23.971Z · answers+comments (3)
[question] Examples of self-fulfilling prophecies in AI alignment?
Chipmonk · 2025-03-03T02:45:51.619Z · answers+comments (6)
[question] Is weak-to-strong generalization an alignment technique?
cloud · 2025-01-31T07:13:03.332Z · answers+comments (1)
The Uses of Complacency
sarahconstantin · 2025-04-21T18:50:02.725Z · comments (1)
Seven sources of goals in LLM agents
Seth Herd · 2025-02-08T21:54:20.186Z · comments (3)
Opportunity Space: Renormalization for AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T20:55:52.155Z · comments (0)
Grok3 On Kant On AI Slavery
JenniferRM · 2025-04-01T04:10:48.093Z · comments (3)
Ruling Out Lookup Tables
Alfred Harwood · 2025-02-04T10:39:34.899Z · comments (11)
Understanding Trust: Overview Presentations
abramdemski · 2025-04-16T18:08:31.064Z · comments (0)
Introduction to Representing Sentences as Logical Statements
Towards_Keeperhood (Simon Skade) · 2025-04-05T20:35:31.422Z · comments (9)
[link] Published report: Pathways to short TAI timelines
Zershaaneh Qureshi (zershaaneh-qureshi) · 2025-02-20T22:10:12.276Z · comments (0)