LessWrong 2.0 Reader

AI #110: Of Course You Know…
Zvi · 2025-04-03T13:10:05.674Z · comments (9)
DeepSeek Panic at the App Store
Zvi · 2025-01-28T19:30:07.555Z · comments (14)
Against Yudkowsky's evolution analogy for AI x-risk [unfinished]
Fiora Sunshine (Fiora from Rosebloom) · 2025-03-18T01:41:06.453Z · comments (18)
AI #100: Meet the New Boss
Zvi · 2025-01-23T15:40:07.473Z · comments (4)
AI "Deep Research" Tools Reviewed
sarahconstantin · 2025-03-24T18:40:03.864Z · comments (5)
Dream, Truth, & Good
abramdemski · 2025-02-24T16:59:05.045Z · comments (11)
The Bell Curve of Bad Behavior
Screwtape · 2025-04-14T19:58:10.293Z · comments (6)
The vision of Bill Thurston
TsviBT · 2025-03-28T11:45:14.297Z · comments (34)
Four Types of Disagreement
silentbob · 2025-04-13T11:22:38.466Z · comments (2)
Vestigial reasoning in RL
Caleb Biddulph (caleb-biddulph) · 2025-04-13T15:40:11.954Z · comments (7)
We’re not prepared for an AI market crash
Remmelt (remmelt-ellen) · 2025-04-01T04:33:55.040Z · comments (12)
Introducing BenchBench: An Industry Standard Benchmark for AI Strength
Jozdien · 2025-04-02T02:11:41.555Z · comments (0)
Against blanket arguments against interpretability
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-22T09:46:23.486Z · comments (4)
[link] Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
Matrice Jacobine · 2025-02-12T09:15:07.793Z · comments (49)
Time to Welcome Claude 3.7
Zvi · 2025-02-26T13:00:06.489Z · comments (2)
Reactions to METR task length paper are insane
Cole Wyeth (Amyr) · 2025-04-10T17:13:36.428Z · comments (41)
Prioritizing threats for AI control
ryan_greenblatt · 2025-03-19T17:09:45.044Z · comments (2)
[link] The Russell Conjugation Illuminator
TimmyM (timmym) · 2025-04-17T19:33:06.924Z · comments (14)
Racing Towards Fusion and AI
Jeffrey Heninger (jeffrey-heninger) · 2025-02-07T20:40:56.798Z · comments (11)
Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle
Czynski (JacobKopczynski) · 2025-03-29T02:51:29.786Z · comments (36)
OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing
Zvi · 2025-04-15T15:30:02.518Z · comments (3)
Proselytizing
lsusr · 2025-02-22T11:54:12.740Z · comments (3)
23andMe potentially for sale for <$50M
lemonhope (lcmgcd) · 2025-03-25T04:34:28.388Z · comments (2)
A collection of approaches to confronting doom, and my thoughts on them
Ruby · 2025-04-06T02:11:31.271Z · comments (18)
The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research
Arthur Conmy (arthur-conmy) · 2025-02-24T02:17:12.991Z · comments (1)
[link] College Advice For People Like Me
henryj · 2025-04-12T14:36:46.643Z · comments (5)
[link] Habermas Machine
NicholasKees (nick_kees) · 2025-03-13T18:16:50.453Z · comments (7)
AI #107: The Misplaced Hype Machine
Zvi · 2025-03-13T14:40:05.318Z · comments (10)
Celtic Knots on Einstein Lattice
Ben (ben-lang) · 2025-02-16T15:56:06.888Z · comments (11)
Youth Lockout
Xavi CF (xavi-cf) · 2025-04-11T15:05:54.441Z · comments (6)
For scheming, we should first focus on detection and then on prevention
Marius Hobbhahn (marius-hobbhahn) · 2025-03-04T15:22:06.105Z · comments (7)
Lots of brief thoughts on Software Engineering
Yair Halberstadt (yair-halberstadt) · 2025-03-06T19:50:34.438Z · comments (17)
Equations Mean Things
abstractapplic · 2025-03-19T08:16:35.312Z · comments (10)
Skepticism towards claims about the views of powerful institutions
tlevin (trevor) · 2025-02-13T07:40:52.257Z · comments (2)
I changed my mind about orca intelligence
Towards_Keeperhood (Simon Skade) · 2025-03-18T10:15:29.860Z · comments (24)
Try training token-level probes
StefanHex (Stefan42) · 2025-04-14T11:56:23.191Z · comments (4)
Interpreting Complexity
Maxwell Adam (intern) · 2025-03-14T04:52:32.103Z · comments (7)
On (Not) Feeling the AGI
Zvi · 2025-03-25T14:30:02.215Z · comments (25)
o3-mini Early Days
Zvi · 2025-02-03T14:20:06.443Z · comments (0)
[link] American College Admissions Doesn't Need to Be So Competitive
Arjun Panickssery (arjun-panickssery) · 2025-04-07T17:35:26.791Z · comments (18)
Training AI to do alignment research we don’t already know how to do
joshc (joshua-clymer) · 2025-02-24T19:19:43.067Z · comments (23)
Metacognition Broke My Nail-Biting Habit
Rafka · 2025-03-16T12:36:47.437Z · comments (20)
[question] Why do many people who care about AI Safety not clearly endorse PauseAI?
humnrdble · 2025-03-30T18:06:32.426Z · answers+comments (41)
[link] Intelsat as a Model for International AGI Governance
rosehadshar · 2025-03-13T12:58:11.692Z · comments (0)
DeepSeek: Don’t Panic
Zvi · 2025-01-31T14:20:08.264Z · comments (6)
On the Meta and DeepMind Safety Frameworks
Zvi · 2025-02-07T13:10:08.449Z · comments (1)
Subjective Naturalism in Decision Theory: Savage vs. Jeffrey–Bolker
Daniel Herrmann (Whispermute) · 2025-02-04T20:34:22.625Z · comments (22)
We’re in Deep Research
Zvi · 2025-02-04T17:20:06.540Z · comments (2)
Tear Down the Burren
jefftk (jkaufman) · 2025-02-04T03:40:02.767Z · comments (2)
Silly Time
jefftk (jkaufman) · 2025-03-21T12:30:08.560Z · comments (2)