LessWrong 2.0 Reader

How Logic "Really" Works: An Engineering Perspective
Daniil Strizhov (mila-dolontaeva) · 2025-04-16T05:34:09.443Z · comments (0)
Opportunity to learn more about AI Innovation & Security Policy
PolicyTakes · 2025-04-16T01:35:27.203Z · comments (0)
D&D.Sci Tax Day: Adventurers and Assessments
aphyer · 2025-04-15T23:43:14.733Z · comments (2)
[link] Should AIs be Encouraged to Cooperate?
PeterMcCluskey · 2025-04-15T21:57:06.096Z · comments (0)
OpenAI rewrote its Preparedness Framework
Zach Stein-Perlman · 2025-04-15T20:00:50.614Z · comments (1)
[link] ASI existential risk: Reconsidering Alignment as a Goal
habryka (habryka4) · 2025-04-15T19:57:42.547Z · comments (4)
[link] Nucleic Acid Observatory Updates, April 2025
jefftk (jkaufman) · 2025-04-15T18:58:29.839Z · comments (0)
Some OthelloGPT Circuits
Alfred Wong (alfred-wong) · 2025-04-15T18:41:36.216Z · comments (0)
The Mirror Problem in AI: Why Language Models Say Whatever You Want
RobT · 2025-04-15T18:40:02.793Z · comments (1)
What happens when LLMs learn new things? & Continual learning forever.
sunchipsster · 2025-04-15T18:38:35.166Z · comments (0)
To be legible, evidence of misalignment probably has to be behavioral
ryan_greenblatt · 2025-04-15T18:14:53.022Z · comments (5)
[link] AISN #51: AI Frontiers
Corin Katzke (corin-katzke) · 2025-04-15T16:01:56.701Z · comments (1)
Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI
Kaj_Sotala · 2025-04-15T15:56:19.466Z · comments (15)
OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing
Zvi · 2025-04-15T15:30:02.518Z · comments (3)
[link] The real reason AI benchmarks haven’t reflected economic impacts
Noosphere89 (sharmake-farah) · 2025-04-15T13:44:06.225Z · comments (0)
Map of AI Safety v2
Bryce Robertson (bryceerobertson) · 2025-04-15T13:04:40.993Z · comments (1)
[link] 3M Subscriber YouTube Account 'Channel 5' Reporting On Rationalism
sakraf · 2025-04-15T13:02:33.736Z · comments (0)
Can SAE steering reveal sandbagging?
jordine · 2025-04-15T12:33:41.264Z · comments (2)
Risers for Foot Percussion
jefftk (jkaufman) · 2025-04-15T11:10:08.577Z · comments (0)
What empirical research directions has Eliezer commented positively on?
Chris_Leong · 2025-04-15T08:53:41.677Z · comments (1)
Debunking the Hard Problem: Consciousness as Integrated Prediction
gmax (maxim-gurevich) · 2025-04-15T08:38:50.637Z · comments (8)
How to Defend the Indefensible
Alex Beyman (alexbeyman) · 2025-04-15T07:45:15.971Z · comments (0)
A Talmudic Rationalist Cautionary Tale
Noah Birnbaum (daniel-birnbaum) · 2025-04-15T04:11:16.972Z · comments (1)
Creating 'Making God': a Feature Documentary on risks from AGI
Connor Axiotes (connor-axiotes-1) · 2025-04-15T02:56:09.206Z · comments (0)
A Dissent on Honesty
eva_ · 2025-04-15T02:43:44.163Z · comments (20)
$500 bounty for best short-form fiction about our near future world; $100 for recommending winning piece: new “Art of Near Future World” quarterly art project
Ramon Gonzalez (ramon-gonzalez) · 2025-04-15T00:46:10.637Z · comments (0)
What if there was a nuke in Manhattan and why that could be a good thing
Ratburn · 2025-04-15T00:19:41.844Z · comments (10)
[link] Nihilism Is Not Enough By Peter Thiel
shawkisukkar · 2025-04-15T00:13:01.375Z · comments (0)
Correcting Deceptive Alignment using a Deontological Approach
JeaniceK · 2025-04-14T22:07:57.860Z · comments (0)
Religious Persistence: A Missing Primitive for Robust Alignment
lauriewired · 2025-04-14T22:03:45.868Z · comments (3)
[link] The 4-Minute Mile Effect
Parker Conley (parker-conley) · 2025-04-14T21:41:27.726Z · comments (3)
Lightning Talks!
nathandunkerley · 2025-04-14T20:39:17.593Z · comments (0)
The Bell Curve of Bad Behavior
Screwtape · 2025-04-14T19:58:10.293Z · comments (5)
[link] Sentinel's Global Risks Weekly Roundup #15/2025: Tariff yoyo, OpenAI slashing safety testing, Iran nuclear programme negotiations, 1K H5N1 confirmed herd infections.
NunoSempere (Radamantis) · 2025-04-14T19:11:20.977Z · comments (0)
Sam Altman's sister claims Sam sexually abused her -- Part 7: Timeline, continued
pythagoras5015 (pl5015) · 2025-04-14T17:43:28.897Z · comments (0)
Sam Altman's sister claims Sam sexually abused her -- Part 8: Timeline, continued
pythagoras5015 (pl5015) · 2025-04-14T17:42:53.705Z · comments (0)
[link] Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study
Adam Karvonen (karvonenadam) · 2025-04-14T17:38:02.918Z · comments (23)
How to evaluate control measures for LLM agents? A trajectory from today to superintelligence
Tomek Korbak (tomek-korbak) · 2025-04-14T16:45:46.584Z · comments (0)
Applications Open for Impact Accelerator Program for Experienced Professionals
Clark Wisenbaker (accounts-hip) · 2025-04-14T16:27:32.340Z · comments (0)
The Last Light
Bridgett Kay (bridgett-kay) · 2025-04-14T15:41:02.745Z · comments (0)
Offer: Team Conflict Counseling for AI Safety Orgs
Severin T. Seehrich (sts) · 2025-04-14T15:17:00.835Z · comments (1)
[link] Slopworld 2035: The dangers of mediocre AI
titotal (lombertini) · 2025-04-14T13:14:08.390Z · comments (6)
Try training token-level probes
StefanHex (Stefan42) · 2025-04-14T11:56:23.191Z · comments (2)
Monthly Roundup #29: April 2025
Zvi · 2025-04-14T11:50:02.324Z · comments (6)
A Solution to Sandbagging and other Self-Provable Misalignment: Constitutional AI Detectives
Knight Lee (Max Lee) · 2025-04-14T10:27:24.903Z · comments (2)
One-shot steering vectors cause emergent misalignment, too
Jacob Dunefsky (jacob-dunefsky) · 2025-04-14T06:40:41.503Z · comments (6)
[link] Unbendable Arm as Test Case for Religious Belief
Ivan Vendrov (ivan-vendrov) · 2025-04-14T01:57:12.013Z · comments (29)
Sam Altman's sister claims Sam sexually abused her -- Part 5: Timeline, continued
pythagoras5015 (pl5015) · 2025-04-14T01:00:07.084Z · comments (0)
Luna Lovegood and the Chamber of Secrets, Part 5
Kongo Landwalker (kongo-landwalker) · 2025-04-14T00:10:36.028Z · comments (0)
Sam Altman's sister claims Sam sexually abused her -- Part 4: Timeline, continued
pythagoras5015 (pl5015) · 2025-04-13T23:41:55.411Z · comments (0)