LessWrong 2.0 Reader

Monthly Roundup #28: March 2025
Zvi · 2025-03-17T12:50:03.097Z · comments (8)
[link] Are corporations superintelligent?
Vishakha (vishakha-agrawal) · 2025-03-17T10:36:12.703Z · comments (3)
[link] One pager
samuelshadrach (xpostah) · 2025-03-17T08:12:49.789Z · comments (2)
[link] The Case for AI Optimism
Annapurna (jorge-velez) · 2025-03-17T01:29:22.734Z · comments (1)
Notable runaway-optimiser-like LLM failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format
Roland Pihlakas (roland-pihlakas) · 2025-03-16T23:23:30.989Z · comments (6)
Read More News
utilistrutil · 2025-03-16T21:31:28.817Z · comments (2)
What would a post labor economy *actually* look like?
Ansh Juneja (ansh-juneja) · 2025-03-16T20:38:41.788Z · comments (1)
Why White-Box Redteaming Makes Me Feel Weird
Zygi Straznickas (nonagon) · 2025-03-16T18:54:48.078Z · comments (34)
How I've run major projects
benkuhn · 2025-03-16T18:40:04.223Z · comments (10)
Counting Objections to Housing
jefftk (jkaufman) · 2025-03-16T18:20:06.898Z · comments (7)
I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?
shrimpy · 2025-03-16T16:52:42.177Z · comments (25)
Siberian Arctic origins of East Asian psychology
davidsun · 2025-03-16T16:52:23.068Z · comments (0)
[link] AI Model History is Being Lost
Vale · 2025-03-16T12:38:47.907Z · comments (1)
Metacognition Broke My Nail-Biting Habit
Rafka · 2025-03-16T12:36:47.437Z · comments (20)
[question] Can we ever ensure AI alignment if we can only test AI personas?
Karl von Wendt · 2025-03-16T08:06:42.345Z · answers+comments (8)
Can time preferences make AI safe?
TerriLeaf · 2025-03-15T21:41:33.127Z · comments (1)
Help make the orca language experiment happen
Towards_Keeperhood (Simon Skade) · 2025-03-15T21:39:43.276Z · comments (12)
Announcing EXP: Experimental Summer Workshop on Collective Cognition
Jan_Kulveit · 2025-03-15T20:14:47.972Z · comments (2)
AI Self-Correction vs. Self-Reflection: Is There a Fundamental Difference?
Project Solon · 2025-03-15T18:24:50.579Z · comments (0)
The Fork in the Road
testingthewaters · 2025-03-15T17:36:37.503Z · comments (12)
Any-Benefit Mindset and Any-Reason Reasoning
silentbob · 2025-03-15T17:10:14.682Z · comments (9)
The Silent War: AGI-on-AGI Warfare and What It Means For Us
funnyfranco · 2025-03-15T15:24:08.819Z · comments (2)
[link] Paper: Field-building and the epistemic culture of AI safety
peterslattery · 2025-03-15T12:30:14.088Z · comments (3)
Why Billionaires Will Not Survive an AGI Extinction Event
funnyfranco · 2025-03-15T06:08:23.829Z · comments (0)
AI Says It’s Not Conscious. That’s a Bad Answer to the Wrong Question.
JohnMarkNorman · 2025-03-15T01:25:44.019Z · comments (0)
Report & retrospective on the Dovetail fellowship
Alex_Altair · 2025-03-14T23:20:17.940Z · comments (3)
The Dangers of Outsourcing Thinking: Losing Our Critical Thinking to the Over-Reliance on AI Decision-Making
Cameron Tomé-Moreira · 2025-03-14T23:07:48.446Z · comments (4)
LLMs may enable direct democracy at scale
Davey Morse (davey-morse) · 2025-03-14T22:51:13.384Z · comments (16)
2024 Unofficial LessWrong Survey Results
Screwtape · 2025-03-14T22:29:00.045Z · comments (28)
AI4Science: The Hidden Power of Neural Networks in Scientific Discovery
Max Ma (max-ma) · 2025-03-14T21:18:33.941Z · comments (2)
[link] What are we doing when we do mathematics?
epicurus · 2025-03-14T20:54:31.985Z · comments (1)
[link] AI for Epistemics Hackathon
Austin Chen (austin-chen) · 2025-03-14T20:46:34.250Z · comments (10)
Geometry of Features in Mechanistic Interpretability
Gunnar Carlsson (gunnar-carlsson) · 2025-03-14T19:11:04.287Z · comments (0)
[link] AI Tools for Existential Security
Lizka · 2025-03-14T18:38:06.110Z · comments (4)
Capitalism as the Catalyst for AGI-Induced Human Extinction
funnyfranco · 2025-03-14T18:14:02.375Z · comments (2)
Minor interpretability exploration #3: Extending superposition to different activation functions (loss landscape)
Rareș Baron · 2025-03-14T15:45:14.365Z · comments (0)
[link] AI for AI safety
Joe Carlsmith (joekc) · 2025-03-14T15:00:23.491Z · comments (13)
On MAIM and Superintelligence Strategy
Zvi · 2025-03-14T12:30:07.451Z · comments (2)
Whether governments will control AGI is important and neglected
Seth Herd · 2025-03-14T09:48:34.062Z · comments (2)
Something to fight for
RomanS · 2025-03-14T08:27:13.810Z · comments (0)
Interpreting Complexity
Maxwell Adam (intern) · 2025-03-14T04:52:32.103Z · comments (7)
Bike Lights are Cheap Enough to Give Away
jefftk (jkaufman) · 2025-03-14T02:10:02.482Z · comments (0)
Superintelligence's goals are likely to be random
Mikhail Samin (mikhail-samin) · 2025-03-13T22:41:06.325Z · comments (6)
Should AI safety be a mass movement?
mhampton · 2025-03-13T20:36:59.284Z · comments (1)
Auditing language models for hidden objectives
Sam Marks (samuel-marks) · 2025-03-13T19:18:32.638Z · comments (15)
Reducing LLM deception at scale with self-other overlap fine-tuning
Marc Carauleanu (Marc-Everin Carauleanu) · 2025-03-13T19:09:43.620Z · comments (40)
Vacuum Decay: Expert Survey Results
JessRiedel · 2025-03-13T18:31:17.434Z · comments (26)
[link] A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management
simeon_c (WayZ) · 2025-03-13T18:29:52.776Z · comments (0)
Creating Complex Goals: A Model to Create Autonomous Agents
theraven · 2025-03-13T18:17:58.519Z · comments (1)