LessWrong 2.0 Reader

Paper in Science: Managing extreme AI risks amid rapid progress
JanB (JanBrauner) · 2024-05-23T08:40:40.678Z · comments (0)
[link] Power Law Policy
Ben Turtel (ben-turtel) · 2024-05-23T05:28:46.022Z · comments (1)
Why entropy means you might not have to worry as much about superintelligent AI
Ron J (ron-j) · 2024-05-23T03:52:40.874Z · comments (0)
Quick Thoughts on Our First Sampling Run
jefftk (jkaufman) · 2024-05-23T00:20:02.050Z · comments (2)
AI Safety proposal - Influencing the superintelligence explosion
Morgan · 2024-05-22T23:31:16.487Z · comments (1)
The Button (Short Comic)
milanrosko · 2024-05-22T23:28:57.919Z · comments (0)
Implementing Asimov's Laws of Robotics - How I imagine alignment working.
Joshua Clancy (joshua-clancy) · 2024-05-22T23:15:56.187Z · comments (0)
Higher-Order Forecasts
ozziegooen · 2024-05-22T21:49:42.802Z · comments (0)
A Positive Double Standard—Self-Help Principles Work For Individuals Not Populations
James Stephen Brown (james-brown) · 2024-05-22T21:37:16.578Z · comments (2)
A Bi-Modal Brain Model
Johannes C. Mayer (johannes-c-mayer) · 2024-05-22T20:10:08.919Z · comments (1)
Offering service as a sensayer for simulationist-adjacent spiritualities.
mako yass (MakoYass) · 2024-05-22T18:52:05.576Z · comments (0)
Do Not Mess With Scarlett Johansson
Zvi · 2024-05-22T15:10:03.215Z · comments (7)
How Multiverse Theory dissolves Quantum inexplicability
mrdlm (mridul.mohan.m@gmail.com) · 2024-05-22T14:55:28.592Z · comments (0)
[question] Should we be concerned about eating too much soy?
ChristianKl · 2024-05-22T12:53:16.388Z · answers+comments (3)
Procedural Executive Function, Part 3
DaystarEld · 2024-05-22T11:58:53.031Z · comments (2)
Cicadas, Anthropic, and the bilateral alignment problem
kromem · 2024-05-22T11:09:56.469Z · comments (0)
[link] Announcing Human-aligned AI Summer School
Jan_Kulveit · 2024-05-22T08:55:10.839Z · comments (0)
"Which chains-of-thought was that faster than?"
Emrik (Emrik North) · 2024-05-22T08:21:00.269Z · comments (1)
Each Llama3-8b text uses a different "random" subspace of the activation space
tailcalled · 2024-05-22T07:31:32.764Z · comments (2)
[link] ARIA's Safeguarded AI grant program is accepting applications for Technical Area 1.1 until May 28th
Brendon_Wong · 2024-05-22T06:54:55.206Z · comments (0)
[link] Anthropic announces interpretability advances. How much does this advance alignment?
Seth Herd · 2024-05-21T22:30:52.638Z · comments (4)
[question] What would stop you from paying for an LLM?
yanni kyriacos (yanni) · 2024-05-21T22:25:52.949Z · answers+comments (14)
EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
scasper · 2024-05-21T20:15:36.502Z · comments (10)
Mitigating extreme AI risks amid rapid progress [Linkpost]
Akash (akash-wasil) · 2024-05-21T19:59:21.343Z · comments (5)
The problem with rationality
David Loomis (david-loomis) · 2024-05-21T18:49:44.863Z · comments (1)
[link] Helping loved ones with their finances: the why and how of an unusually impactful opportunity
Sam Anschell · 2024-05-21T18:48:47.566Z · comments (1)
rough draft on what happens in the brain when you have an insight
Emrik (Emrik North) · 2024-05-21T18:02:47.060Z · comments (2)
On Dwarkesh’s Podcast with OpenAI’s John Schulman
Zvi · 2024-05-21T17:30:04.332Z · comments (3)
[question] Is deleting capabilities still a relevant research question?
tailcalled · 2024-05-21T13:24:44.946Z · answers+comments (1)
[question] What are some infohazards?
Justus · 2024-05-21T12:48:32.736Z · answers+comments (1)
[link] New voluntary commitments (AI Seoul Summit)
Zach Stein-Perlman · 2024-05-21T11:00:41.794Z · comments (16)
ACX/LW/EA/* Meetup Bremen
RasmusHB (JohannWolfgang) · 2024-05-21T05:42:43.010Z · comments (0)
My Dating Heuristic
Declan Molony (declan-molony) · 2024-05-21T05:28:40.197Z · comments (4)
Scorable Functions: A Format for Algorithmic Forecasting
ozziegooen · 2024-05-21T04:14:11.749Z · comments (0)
The Problem With the Word ‘Alignment’
peligrietzer · 2024-05-21T03:48:26.983Z · comments (4)
What's Going on With OpenAI's Messaging?
ozziegooen · 2024-05-21T02:22:04.171Z · comments (11)
[link] Harmony Intelligence is Hiring!
James Dao (james-dao) · 2024-05-21T02:11:44.675Z · comments (0)
[link] [Linkpost] Statement from Scarlett Johansson on OpenAI's use of the "Sky" voice, that was shockingly similar to her own voice.
Linch · 2024-05-20T23:50:28.138Z · comments (8)
[link] Some perspectives on the discipline of Physics
Tahp · 2024-05-20T18:19:22.429Z · comments (3)
[question] Are there any groupchats for people working on Representation reading/control, activation steering type experiments?
Joe Kwon · 2024-05-20T18:03:15.481Z · answers+comments (0)
Interpretability: Integrated Gradients is a decent attribution method
Lucius Bushnaq (Lblack) · 2024-05-20T17:55:22.893Z · comments (7)
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq (Lblack) · 2024-05-20T17:53:25.985Z · comments (2)
[link] NAO Updates, Spring 2024
jefftk (jkaufman) · 2024-05-20T16:51:03.693Z · comments (0)
OpenAI: Exodus
Zvi · 2024-05-20T13:10:03.543Z · comments (23)
Infra-Bayesian haggling
hannagabor (hanna-gabor) · 2024-05-20T12:23:30.165Z · comments (0)
[link] Jaan Tallinn's 2023 Philanthropy Overview
jaan · 2024-05-20T12:11:39.416Z · comments (4)
D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset]
abstractapplic · 2024-05-20T09:38:55.228Z · comments (1)
Why I find Davidad's plan interesting
Paul W · 2024-05-20T08:13:15.950Z · comments (0)
[link] Anthropic: Reflections on our Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-05-20T04:14:44.435Z · comments (21)
[link] The consistent guessing problem is easier than the halting problem
jessicata (jessica.liu.taylor) · 2024-05-20T04:02:03.865Z · comments (5)
next page (older posts) →