LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Monthly Roundup #28: March 2025
Zvi · 2025-03-17T12:50:03.097Z · comments (8)
Meetups Notes (Q1 2025)
jenn (pixx) · 2025-03-31T01:12:11.774Z · comments (2)
Prospects for Alignment Automation: Interpretability Case Study
Jacob Pfau (jacob-pfau) · 2025-03-21T14:05:51.528Z · comments (4)
How much progress actually happens in theoretical physics?
ChristianKl · 2025-04-04T23:08:00.633Z · comments (32)
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
Alex Mallen (alex-mallen) · 2025-03-24T17:55:59.358Z · comments (0)
Most Questionable Details in 'AI 2027'
scarcegreengrass · 2025-04-05T00:32:54.896Z · comments (4)
Selection Pressures on LM Personas
Raymond D · 2025-03-28T20:33:09.918Z · comments (0)
[Linkpost] Visual roadmap to strong human germline engineering
TsviBT · 2025-04-05T22:22:57.744Z · comments (0)
EIS XV: A New Proof of Concept for Useful Interpretability
scasper · 2025-03-17T20:05:30.580Z · comments (2)
Call for Collaboration: Renormalization for AI safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T21:01:56.500Z · comments (0)
How much does it cost to back up solar with batteries?
jasoncrawford · 2025-03-25T16:35:52.834Z · comments (6)
[link] Fundraising for Mox: coworking & events in SF
Austin Chen (austin-chen) · 2025-03-31T18:25:03.571Z · comments (0)
[link] Smelling Nice is Good, Actually
Gordon Seidoh Worley (gworley) · 2025-03-18T16:54:43.324Z · comments (8)
Non-Consensual Consent: The Performance of Choice in a Coercive World
Alex_Steiner · 2025-03-20T17:12:16.302Z · comments (4)
Reflections on Neuralese
Alice Blair (Diatom) · 2025-03-12T16:29:31.230Z · comments (0)
Introducing WAIT to Save Humanity
carterallen · 2025-04-01T21:47:17.857Z · comments (1)
Proof-of-Concept Debugger for a Small LLM
Peter Lai (peter-lai) · 2025-03-17T22:27:52.386Z · comments (0)
[link] Your Communication Preferences Aren’t Law
Jonathan Moregård (JonathanMoregard) · 2025-03-12T17:20:11.117Z · comments (4)
Why Were We Wrong About China and AI? A Case Study in Failed Rationality
thedudeabides · 2025-03-22T05:13:52.181Z · comments (35)
Existing UDTs test the limits of Bayesianism (and consistency)
Cole Wyeth (Amyr) · 2025-03-12T04:09:11.615Z · comments (18)
Austin Chen on Winning, Risk-Taking, and FTX
Elizabeth (pktechgirl) · 2025-04-07T19:00:08.039Z · comments (0)
[link] Sentinel minutes #10/2025: Trump tariffs, US/China tensions, Claude code reward hacking.
NunoSempere (Radamantis) · 2025-03-10T19:00:25.808Z · comments (0)
What Uniparental Disomy Tells Us About Improper Imprinting in Humans
Morpheus · 2025-03-28T11:24:47.133Z · comments (1)
[link] OpenAI lost $5 billion in 2024 (and its losses are increasing)
Remmelt (remmelt-ellen) · 2025-03-31T04:17:27.242Z · comments (15)
Changing my mind about Christiano's malign prior argument
Cole Wyeth (Amyr) · 2025-04-04T00:54:44.199Z · comments (34)
Report & retrospective on the Dovetail fellowship
Alex_Altair · 2025-03-14T23:20:17.940Z · comments (3)
[link] How prediction markets can create harmful outcomes: a case study
B Jacobs (Bob Jacobs) · 2025-04-02T15:37:09.285Z · comments (2)
Whether governments will control AGI is important and neglected
Seth Herd · 2025-03-14T09:48:34.062Z · comments (2)
Bike Lights are Cheap Enough to Give Away
jefftk (jkaufman) · 2025-03-14T02:10:02.482Z · comments (0)
Explaining the Joke: Pausing is The Way
WillPetillo · 2025-04-04T09:04:38.847Z · comments (2)
I grade every NBA basketball game I watch based on enjoyability
proshowersinger · 2025-03-12T21:46:26.791Z · comments (2)
How to mitigate sandbagging
Teun van der Weij (teun-van-der-weij) · 2025-03-23T17:19:07.452Z · comments (0)
A model of the final phase: the current frontier AIs as de facto CEOs of their own companies
Mitchell_Porter · 2025-03-08T22:15:35.260Z · comments (2)
[link] Well-foundedness as an organizing principle of healthy minds and societies
Richard_Ngo (ricraz) · 2025-04-07T00:31:34.098Z · comments (5)
Grok3 On Kant On AI Slavery
JenniferRM · 2025-04-01T04:10:48.093Z · comments (3)
Against podcasts
Adam Zerner (adamzerner) · 2025-04-05T19:20:00.716Z · comments (8)
[question] Does the AI control agenda broadly rely on no FOOM being possible?
Noosphere89 (sharmake-farah) · 2025-03-29T19:38:23.971Z · answers+comments (3)
Notes on handling non-concentrated failures with AI control: high level methods and different regimes
ryan_greenblatt · 2025-03-24T01:00:38.222Z · comments (3)
Doing principle-of-charity better
Sniffnoy · 2025-03-27T05:19:52.195Z · comments (1)
AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability
DanielFilan · 2025-03-28T18:40:01.856Z · comments (0)
Opportunity Space: Renormalization for AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T20:55:52.155Z · comments (0)
The Leapfrogging Terminus and the Fuzzy Cut
Jim Pivarski (jim-pivarski) · 2025-03-31T04:08:24.023Z · comments (6)
Read More News
utilistrutil · 2025-03-16T21:31:28.817Z · comments (2)
[question] Can we ever ensure AI alignment if we can only test AI personas?
Karl von Wendt · 2025-03-16T08:06:42.345Z · answers+comments (8)
[link] AI Tools for Existential Security
Lizka · 2025-03-14T18:38:06.110Z · comments (4)
Defense Against The Super-Worms
viemccoy · 2025-03-20T07:24:56.975Z · comments (1)
Consequentialism is for making decisions
Sniffnoy · 2025-03-27T04:00:07.020Z · comments (9)
[link] "Long" timelines to advanced AI have gotten crazy short
Matrice Jacobine · 2025-04-03T22:46:39.416Z · comments (0)
Towards an understanding of the Chinese AI scene
Mitchell_Porter · 2025-03-24T09:10:19.498Z · comments (0)
[question] LessWrong merch?
Brendan Long (korin43) · 2025-04-03T21:51:47.190Z · answers+comments (1)