LessWrong 2.0 Reader

A collection of approaches to confronting doom, and my thoughts on them
Ruby · 2025-04-06T02:11:31.271Z · comments (18)
Prioritizing threats for AI control
ryan_greenblatt · 2025-03-19T17:09:45.044Z · comments (2)
Vestigial reasoning in RL
Caleb Biddulph (caleb-biddulph) · 2025-04-13T15:40:11.954Z · comments (7)
Reactions to METR task length paper are insane
Cole Wyeth (Amyr) · 2025-04-10T17:13:36.428Z · comments (41)
23andMe potentially for sale for <$50M
lemonhope (lcmgcd) · 2025-03-25T04:34:28.388Z · comments (2)
Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle
Czynski (JacobKopczynski) · 2025-03-29T02:51:29.786Z · comments (36)
[link] College Advice For People Like Me
henryj · 2025-04-12T14:36:46.643Z · comments (5)
Youth Lockout
Xavi CF (xavi-cf) · 2025-04-11T15:05:54.441Z · comments (6)
Try training token-level probes
StefanHex (Stefan42) · 2025-04-14T11:56:23.191Z · comments (4)
OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing
Zvi · 2025-04-15T15:30:02.518Z · comments (3)
I changed my mind about orca intelligence
Towards_Keeperhood (Simon Skade) · 2025-03-18T10:15:29.860Z · comments (24)
[link] The Russell Conjugation Illuminator
TimmyM (timmym) · 2025-04-17T19:33:06.924Z · comments (14)
Equations Mean Things
abstractapplic · 2025-03-19T08:16:35.312Z · comments (10)
On (Not) Feeling the AGI
Zvi · 2025-03-25T14:30:02.215Z · comments (25)
[link] American College Admissions Doesn't Need to Be So Competitive
Arjun Panickssery (arjun-panickssery) · 2025-04-07T17:35:26.791Z · comments (18)
Silly Time
jefftk (jkaufman) · 2025-03-21T12:30:08.560Z · comments (2)
[question] Why do many people who care about AI Safety not clearly endorse PauseAI?
humnrdble · 2025-03-30T18:06:32.426Z · answers+comments (41)
Tabula Bio: towards a future free of disease (& looking for collaborators)
mpoon (michael-poon) · 2025-03-23T16:30:15.523Z · comments (15)
The first AI war will be in your computer
Viliam · 2025-04-08T09:28:53.191Z · comments (10)
ALLFED emergency appeal: Help us raise $800,000 to avoid cutting half of programs
denkenberger · 2025-04-16T21:47:40.687Z · comments (8)
AI #108: Straight Line on a Graph
Zvi · 2025-03-20T13:50:00.983Z · comments (5)
An Advent of Thought
Kaarel (kh) · 2025-03-17T14:21:08.765Z · comments (8)
Paper
dynomight · 2025-04-11T12:20:04.200Z · comments (12)
[link] Sentinel's Global Risks Weekly Roundup #15/2025: Tariff yoyo, OpenAI slashing safety testing, Iran nuclear programme negotiations, 1K H5N1 confirmed herd infections.
NunoSempere (Radamantis) · 2025-04-14T19:11:20.977Z · comments (0)
AI #109: Google Fails Marketing Forever
Zvi · 2025-03-27T14:50:01.825Z · comments (12)
[link] Automated Researchers Can Subtly Sandbag
gasteigerjo · 2025-03-26T19:13:26.879Z · comments (0)
Follow me on TikTok
lsusr · 2025-04-01T08:22:29.521Z · comments (8)
A Dissent on Honesty
eva_ · 2025-04-15T02:43:44.163Z · comments (42)
Handling schemers if shutdown is not an option
Buck · 2025-04-18T14:39:18.609Z · comments (0)
An overview of control measures
ryan_greenblatt · 2025-03-24T23:16:49.400Z · comments (0)
Analyzing long agent transcripts (Docent)
jsteinhardt · 2025-03-24T20:49:54.472Z · comments (2)
[link] The case for AGI by 2030
Benjamin_Todd · 2025-04-09T20:35:55.167Z · comments (6)
SHIFT relies on token-level features to de-bias Bias in Bios probes
Tim Hua · 2025-03-19T21:29:15.974Z · comments (2)
D&D.Sci Tax Day: Adventurers and Assessments
aphyer · 2025-04-15T23:43:14.733Z · comments (8)
[link] Map of all 40 copyright suits v. AI in U.S.
Remmelt (remmelt-ellen) · 2025-03-26T07:57:58.976Z · comments (3)
We need (a lot) more rogue agent honeypots
Ozyrus · 2025-03-23T22:24:52.785Z · comments (12)
They Took MY Job?
Zvi · 2025-03-21T13:30:38.507Z · comments (4)
LessOnline 2025: Early Bird Tickets On Sale
Ben Pace (Benito) · 2025-03-18T00:22:02.653Z · comments (4)
Meditation and Reduced Sleep Need
niplav · 2025-04-04T14:42:54.792Z · comments (8)
Scaffolding Skills
Screwtape · 2025-04-18T17:39:25.634Z · comments (1)
[link] Existing Safety Frameworks Imply Unreasonable Confidence
Joe Rogero · 2025-04-10T16:31:50.240Z · comments (1)
[link] Three Types of Intelligence Explosion
rosehadshar · 2025-03-17T14:47:46.696Z · comments (8)
[link] Forecasting time to automated superhuman coders [AI 2027 Timelines Forecast]
elifland · 2025-04-10T23:10:23.063Z · comments (0)
The Rise of Hyperpalatability
Jack (jack-3) · 2025-04-02T20:18:04.407Z · comments (10)
Can SAE steering reveal sandbagging?
jordine · 2025-04-15T12:33:41.264Z · comments (3)
Boots theory and Sybil Ramkin
philh · 2025-03-18T22:10:08.855Z · comments (17)
Call for Collaboration: Renormalization for AI safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T21:01:56.500Z · comments (0)
Why Are The Human Sciences Hard? Two New Hypotheses
Aydin Mohseni (aydin-mohseni) · 2025-03-18T15:45:52.239Z · comments (14)
Avoid the Counterargument Collapse
marknm · 2025-03-26T03:19:58.655Z · comments (3)
More Fun With GPT-4o Image Generation
Zvi · 2025-04-03T02:10:02.317Z · comments (3)