LessWrong 2.0 Reader

Is AI Alignment Enough?
Aram Panasenco (panasenco) · 2025-01-10T18:57:48.409Z · comments (6)
Corrigibility's Desirability is Timing-Sensitive
RobertM (T3t) · 2024-12-26T22:24:17.435Z · comments (4)
Book Summary: Zero to One
bilalchughtai (beelal) · 2024-12-29T16:13:52.922Z · comments (2)
Will bird flu be the next Covid? "Little chance" says my dashboard.
Nathan Young · 2025-01-07T20:10:50.080Z · comments (0)
[link] AI as systems, not just models
Andy Arditi (andy-arditi) · 2024-12-21T23:19:05.507Z · comments (0)
[link] Impact in AI Safety Now Requires Specific Strategic Insight
MiloSal (milosal) · 2024-12-29T00:40:53.780Z · comments (1)
[link] The Roots of Progress 2024 in review
jasoncrawford · 2025-01-01T00:02:06.441Z · comments (0)
Learning Multi-Level Features with Matryoshka SAEs
Bart Bussmann (Stuckwork) · 2024-12-19T15:59:00.036Z · comments (4)
On the OpenAI Economic Blueprint
Zvi · 2025-01-15T14:30:06.773Z · comments (0)
Living with Rats in College
lsusr · 2024-12-25T10:44:13.085Z · comments (0)
Voluntary Salary Reduction
jefftk (jkaufman) · 2025-01-15T03:40:02.909Z · comments (0)
Intranasal mRNA Vaccines?
J Bostock (Jemist) · 2025-01-01T23:46:40.524Z · comments (2)
Preface
Allison Duettmann (allison-duettmann) · 2025-01-02T18:59:46.290Z · comments (1)
[link] debating buying NVDA in 2019
bhauth · 2025-01-04T05:06:54.047Z · comments (0)
Elevating Air Purifiers
jefftk (jkaufman) · 2024-12-17T01:40:05.401Z · comments (0)
[link] The Alignment Simulator
Yair Halberstadt (yair-halberstadt) · 2024-12-22T11:45:55.220Z · comments (3)
[link] Genetically edited mosquitoes haven't scaled yet. Why?
alexey · 2024-12-30T21:37:32.942Z · comments (0)
Good Reasons for Alts
jefftk (jkaufman) · 2024-12-21T01:30:03.113Z · comments (2)
[link] Being Present is Not a Skill
Chipmonk · 2024-12-18T01:11:04.715Z · comments (8)
[link] NAO Updates, January 2025
jefftk (jkaufman) · 2025-01-10T03:37:36.698Z · comments (0)
The Second Gemini
Zvi · 2024-12-17T15:50:06.373Z · comments (0)
[link] Letter from an Alien Mind
Shoshannah Tekofsky (DarkSym) · 2024-12-27T13:20:49.277Z · comments (7)
[link] Human-AI Complementarity: A Goal for Amplified Oversight
rishubjain · 2024-12-24T09:57:55.111Z · comments (3)
[link] Job Opening: SWE to help improve grant-making software
Ethan Ashkie (ethan-ashkie-1) · 2025-01-08T00:54:22.820Z · comments (1)
The average rationalist IQ is about 122
Rockenots (Ekefa) · 2024-12-28T15:42:07.067Z · comments (23)
Elon Musk and Solar Futurism
transhumanist_atom_understander · 2024-12-21T02:55:28.554Z · comments (27)
[link] PCR retrospective
bhauth · 2024-12-26T21:20:56.484Z · comments (0)
Non-Obvious Benefits of Insurance
jefftk (jkaufman) · 2024-12-23T03:40:02.184Z · comments (5)
The absolute basics of representation theory of finite groups
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-08T09:47:13.136Z · comments (0)
[link] LLMs for language learning
Benquo · 2025-01-15T14:08:54.620Z · comments (0)
The Alignment Mapping Program: Forging Independent Thinkers in AI Safety - A Pilot Retrospective
Alvin Ånestrand (alvin-anestrand) · 2025-01-10T16:22:16.905Z · comments (0)
[link] Our new video about goal misgeneralization, plus an apology
Writer · 2025-01-14T14:07:21.648Z · comments (0)
Broken Latents: Studying SAEs and Feature Co-occurrence in Toy Models
chanind · 2024-12-30T22:50:54.964Z · comments (3)
[question] Meal Replacements in 2025?
alkjash · 2025-01-06T15:37:25.041Z · answers+comments (9)
Grading my 2024 AI predictions
Nikola Jurkovic (nikolaisalreadytaken) · 2025-01-02T05:01:46.587Z · comments (1)
[link] I read every major AI lab’s safety plan so you don’t have to
sarahhw · 2024-12-16T18:51:38.499Z · comments (0)
[link] It looks like there are some good funding opportunities in AI safety right now
Benjamin_Todd · 2024-12-22T12:41:02.151Z · comments (0)
Is AI Physical?
Lauren Greenspan (LaurenGreenspan) · 2025-01-14T21:21:39.999Z · comments (2)
NYC Congestion Pricing: Early Days
Zvi · 2025-01-14T14:00:07.445Z · comments (0)
Latent Adversarial Training (LAT) Improves the Representation of Refusal
alexandraabbas · 2025-01-06T10:24:53.419Z · comments (6)
A Generalization of the Good Regulator Theorem
Alfred Harwood · 2025-01-04T09:55:25.432Z · comments (6)
Proof Explained for "Robust Agents Learn Causal World Model"
Dalcy (Darcy) · 2024-12-22T15:06:16.880Z · comments (0)
subfunctional overlaps in attentional selection history implies momentum for decision-trajectories
Emrik (Emrik North) · 2024-12-22T14:12:49.027Z · comments (1)
Theoretical Alignment's Second Chance
lunatic_at_large · 2024-12-22T05:03:51.653Z · comments (0)
Can we rescue Effective Altruism?
Elizabeth (pktechgirl) · 2025-01-09T16:40:02.405Z · comments (0)
Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]
Jason Gross (jason-gross) · 2025-01-06T04:22:12.633Z · comments (0)
AGI with RL is Bad News for Safety
Nadav Brandes (nadav-brandes) · 2024-12-21T19:36:03.970Z · comments (22)
[link] Why OpenAI’s Structure Must Evolve To Advance Our Mission
stuhlmueller · 2024-12-28T04:24:19.937Z · comments (1)
Turning up the Heat on Deceptively-Misaligned AI
J Bostock (Jemist) · 2025-01-07T00:13:28.191Z · comments (16)
Don’t Legalize Drugs
Declan Molony (declan-molony) · 2025-01-14T06:51:14.005Z · comments (7)