LessWrong 2.0 Reader


The Field of AI Alignment: A Postmortem, and What To Do About It
johnswentworth · 2024-12-26T18:48:07.614Z · comments (81)

Orienting to 3 year AGI timelines
Nikola Jurkovic (nikolaisalreadytaken) · 2024-12-22T01:15:11.401Z · comments (35)

[link] Review: Planecrash
L Rudolf L (LRudL) · 2024-12-27T14:18:33.611Z · comments (10)

What o3 Becomes by 2028
Vladimir_Nesov · 2024-12-22T12:37:20.929Z · comments (13)

Hire (or become) a Thinking Assistant / Body Double
Raemon · 2024-12-23T03:58:42.061Z · comments (31)

[question] What are the strongest arguments for very short timelines?
Kaj_Sotala · 2024-12-23T09:38:56.905Z · answers+comments (69)

A breakdown of AI capability levels focused on AI R&D labor acceleration
ryan_greenblatt · 2024-12-22T20:56:00.298Z · comments (2)

AIs Will Increasingly Fake Alignment
Zvi · 2024-12-24T13:00:07.770Z · comments (0)

A Three-Layer Model of LLM Psychology
Jan_Kulveit · 2024-12-26T16:49:41.738Z · comments (3)

When AI 10x's AI R&D, What Do We Do?
Logan Riggs (elriggs) · 2024-12-21T23:56:11.069Z · comments (14)

Checking in on Scott's composition image bet with imagen 3
Dave Orr (dave-orr) · 2024-12-22T19:04:17.495Z · comments (0)

AI #96: o3 But Not Yet For Thee
Zvi · 2024-12-26T20:30:06.722Z · comments (8)

ReSolsticed vol I: "We're Not Going Quietly"
Raemon · 2024-12-26T17:52:33.727Z · comments (3)

Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility
Johannes C. Mayer (johannes-c-mayer) · 2024-12-22T22:08:31.971Z · comments (28)

[question] What Have Been Your Most Valuable Casual Conversations At Conferences?
johnswentworth · 2024-12-25T05:49:36.711Z · answers+comments (19)

[link] The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228)
Eneasz · 2024-12-24T22:45:50.065Z · comments (4)

[question] What are the most interesting / challenging evals (for humans) available?
Raemon · 2024-12-27T03:05:26.831Z · answers+comments (10)

[link] Review: Good Strategy, Bad Strategy
L Rudolf L (LRudL) · 2024-12-21T17:17:04.342Z · comments (0)

Greedy-Advantage-Aware RLHF
sej2020 · 2024-12-27T19:47:25.562Z · comments (4)

[question] What is your personal totalizing and self-consistent worldview/philosophy?
lsusr · 2024-12-27T23:59:30.641Z · answers+comments (7)

People aren't properly calibrated on FrontierMath
cakubilo · 2024-12-23T19:35:44.467Z · comments (4)

Acknowledging Background Information with P(Q|I)
JenniferRM · 2024-12-24T18:50:25.323Z · comments (8)

Corrigibility's Desirability is Timing-Sensitive
RobertM (T3t) · 2024-12-26T22:24:17.435Z · comments (4)

[link] AI as systems, not just models
Andy Arditi (andy-arditi) · 2024-12-21T23:19:05.507Z · comments (0)

[link] The Alignment Simulator
Yair Halberstadt (yair-halberstadt) · 2024-12-22T11:45:55.220Z · comments (3)

Living with Rats in College
lsusr · 2024-12-25T10:44:13.085Z · comments (0)

Good Reasons for Alts
jefftk (jkaufman) · 2024-12-21T01:30:03.113Z · comments (2)

[link] Funding Case: AI Safety Camp 11
Remmelt (remmelt-ellen) · 2024-12-23T08:51:55.255Z · comments (0)

Elon Musk and Solar Futurism
transhumanist_atom_understander · 2024-12-21T02:55:28.554Z · comments (27)

[link] Letter from an Alien Mind
Shoshannah Tekofsky (DarkSym) · 2024-12-27T13:20:49.277Z · comments (4)

[link] PCR retrospective
bhauth · 2024-12-26T21:20:56.484Z · comments (0)

[link] Human-AI Complementarity: A Goal for Amplified Oversight
rishubjain · 2024-12-24T09:57:55.111Z · comments (1)

Monthly Roundup #25: December 2024
Zvi · 2024-12-23T14:20:04.682Z · comments (3)

Non-Obvious Benefits of Insurance
jefftk (jkaufman) · 2024-12-23T03:40:02.184Z · comments (5)

[link] It looks like there are some good funding opportunities in AI safety right now
Benjamin_Todd · 2024-12-22T12:41:02.151Z · comments (0)

Whistleblowing Twitter Bot
Mckiev · 2024-12-26T04:09:45.493Z · comments (5)

subfunctional overlaps in attentional selection history implies momentum for decision-trajectories
Emrik (Emrik North) · 2024-12-22T14:12:49.027Z · comments (1)

AGI with RL is Bad News for Safety
Nadav Brandes (nadav-brandes) · 2024-12-21T19:36:03.970Z · comments (22)

Theoretical Alignment's Second Chance
lunatic_at_large · 2024-12-22T05:03:51.653Z · comments (0)

Proof Explained for "Robust Agents Learn Causal World Model"
Dalcy (Darcy) · 2024-12-22T15:06:16.880Z · comments (0)

Open Thread Winter 2024/2025
habryka (habryka4) · 2024-12-25T21:02:41.760Z · comments (1)

[link] We are in a New Paradigm of AI Progress - OpenAI's o3 model makes huge gains on the toughest AI benchmarks in the world
garrison · 2024-12-22T21:45:52.026Z · comments (3)

[link] A primer on machine learning in cryo-electron microscopy (cryo-EM)
Abhishaike Mahajan (abhishaike-mahajan) · 2024-12-22T15:11:58.860Z · comments (0)

Coin Flip
XelaP (scroogemcduck1) · 2024-12-27T11:53:01.781Z · comments (0)

Apply to the 2025 PIBBSS Summer Research Fellowship
DusanDNesic · 2024-12-24T10:25:12.882Z · comments (0)

[link] Why OpenAI’s Structure Must Evolve To Advance Our Mission
stuhlmueller · 2024-12-28T04:24:19.937Z · comments (0)

Stop Making Sense
JenniferRM · 2024-12-23T05:16:12.428Z · comments (0)

How I saved 1 human life (in expectation) without overthinking it
Christopher King (christopher-king) · 2024-12-22T20:53:13.492Z · comments (0)

If all trade is voluntary, then what is "exploitation?"
Darmani · 2024-12-27T11:21:30.036Z · comments (33)

Robbin's Farm Sledding Route
jefftk (jkaufman) · 2024-12-21T22:10:01.175Z · comments (1)


My understanding is that a significant aim of your recent research is to test models' alignment so that people will take AI risk more seriously when things start to heat up. This seems good, but I expect the net effect of Anthropic is still to make people take alignment less seriously, because of the company's public communications.
