LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[question] Why do many people who care about AI Safety not clearly endorse PauseAI?
humnrdble · 2025-03-30T18:06:32.426Z · answers+comments (41)
Tabula Bio: towards a future free of disease (& looking for collaborators)
mpoon (michael-poon) · 2025-03-23T16:30:15.523Z · comments (15)
ALLFED emergency appeal: Help us raise $800,000 to avoid cutting half of programs
denkenberger · 2025-04-16T21:47:40.687Z · comments (9)
Paper
dynomight · 2025-04-11T12:20:04.200Z · comments (12)
The first AI war will be in your computer
Viliam · 2025-04-08T09:28:53.191Z · comments (10)
AI #109: Google Fails Marketing Forever
Zvi · 2025-03-27T14:50:01.825Z · comments (12)
A Dissent on Honesty
eva_ · 2025-04-15T02:43:44.163Z · comments (52)
D&D.Sci Tax Day: Adventurers and Assessments
aphyer · 2025-04-15T23:43:14.733Z · comments (10)
Handling schemers if shutdown is not an option
Buck · 2025-04-18T14:39:18.609Z · comments (0)
[link] Sentinel's Global Risks Weekly Roundup #15/2025: Tariff yoyo, OpenAI slashing safety testing, Iran nuclear programme negotiations, 1K H5N1 confirmed herd infections.
NunoSempere (Radamantis) · 2025-04-14T19:11:20.977Z · comments (0)
Follow me on TikTok
lsusr · 2025-04-01T08:22:29.521Z · comments (8)
[link] Automated Researchers Can Subtly Sandbag
gasteigerjo · 2025-03-26T19:13:26.879Z · comments (0)
An overview of control measures
ryan_greenblatt · 2025-03-24T23:16:49.400Z · comments (0)
Analyzing long agent transcripts (Docent)
jsteinhardt · 2025-03-24T20:49:54.472Z · comments (2)
[link] The case for AGI by 2030
Benjamin_Todd · 2025-04-09T20:35:55.167Z · comments (6)
[link] Map of all 40 copyright suits v. AI in U.S.
Remmelt (remmelt-ellen) · 2025-03-26T07:57:58.976Z · comments (3)
[link] The US Executive vs Supreme Court Deportations Clash
NunoSempere (Radamantis) · 2025-04-21T19:56:03.711Z · comments (0)
Crime and Punishment #1
Zvi · 2025-04-21T15:30:06.420Z · comments (4)
We need (a lot) more rogue agent honeypots
Ozyrus · 2025-03-23T22:24:52.785Z · comments (12)
[link] Existing Safety Frameworks Imply Unreasonable Confidence
Joe Rogero · 2025-04-10T16:31:50.240Z · comments (2)
Scaffolding Skills
Screwtape · 2025-04-18T17:39:25.634Z · comments (7)
Meditation and Reduced Sleep Need
niplav · 2025-04-04T14:42:54.792Z · comments (8)
o3 Will Use Its Tools For You
Zvi · 2025-04-18T21:20:02.566Z · comments (3)
The Rise of Hyperpalatability
Jack (jack-3) · 2025-04-02T20:18:04.407Z · comments (10)
[link] Forecasting time to automated superhuman coders [AI 2027 Timelines Forecast]
elifland · 2025-04-10T23:10:23.063Z · comments (0)
Call for Collaboration: Renormalization for AI safety 
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T21:01:56.500Z · comments (0)
Can SAE steering reveal sandbagging?
jordine · 2025-04-15T12:33:41.264Z · comments (3)
[link] Well-foundedness as an organizing principle of healthy minds and societies
Richard_Ngo (ricraz) · 2025-04-07T00:31:34.098Z · comments (6)
Austin Chen on Winning, Risk-Taking, and FTX
Elizabeth (pktechgirl) · 2025-04-07T19:00:08.039Z · comments (3)
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
Alex Mallen (alex-mallen) · 2025-03-24T17:55:59.358Z · comments (0)
More Fun With GPT-4o Image Generation
Zvi · 2025-04-03T02:10:02.317Z · comments (3)
Avoid the Counterargument Collapse
marknm · 2025-03-26T03:19:58.655Z · comments (3)
OpenAI rewrote its Preparedness Framework
Zach Stein-Perlman · 2025-04-15T20:00:50.614Z · comments (1)
Is instrumental convergence a thing for virtue-driven agents?
mattmacdermott · 2025-04-02T03:59:20.064Z · comments (37)
Why do misalignment risks increase as AIs get more capable?
ryan_greenblatt · 2025-04-11T03:06:50.928Z · comments (6)
[link] Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now
Tristan Cook · 2025-03-26T17:29:14.797Z · comments (0)
FLAKE-Bench: Outsourcing Awkwardness in the Age of AI
annas (annasoli) · 2025-04-01T17:08:25.092Z · comments (0)
How much progress actually happens in theoretical physics?
ChristianKl · 2025-04-04T23:08:00.633Z · comments (32)
Goodhart Typology via Structure, Function, and Randomness Distributions
JustinShovelain · 2025-03-25T16:01:08.327Z · comments (0)
More on Various AI Action Plans
Zvi · 2025-03-24T13:10:05.637Z · comments (0)
What Uniparental Disomy Tells Us About Improper Imprinting in Humans
Morpheus · 2025-03-28T11:24:47.133Z · comments (1)
When the Wannabe Rambo Comedian Cried
P. João (gabriel-brito) · 2025-03-31T14:47:50.660Z · comments (0)
On the Implications of Recent Results on Latent Reasoning in LLMs
Rauno Arike (rauno-arike) · 2025-03-31T11:06:23.939Z · comments (6)
An overview of areas of control work
ryan_greenblatt · 2025-03-25T22:02:16.178Z · comments (0)
Most Questionable Details in 'AI 2027'
scarcegreengrass · 2025-04-05T00:32:54.896Z · comments (4)
[link] How prediction markets can create harmful outcomes: a case study
B Jacobs (Bob Jacobs) · 2025-04-02T15:37:09.285Z · comments (2)
Non-Monotonic Infra-Bayesian Physicalism
Marcus Ogren · 2025-04-02T12:14:19.783Z · comments (0)
Llama Does Not Look Good 4 Anything
Zvi · 2025-04-09T13:20:01.799Z · comments (1)
[link] The 4-Minute Mile Effect
Parker Conley (parker-conley) · 2025-04-14T21:41:27.726Z · comments (6)
Meetups Notes (Q1 2025)
jenn (pixx) · 2025-03-31T01:12:11.774Z · comments (2)
← previous page (newer posts) · next page (older posts) →