LessWrong 2.0 Reader

[link] From the Archives: a story
Richard_Ngo (ricraz) · 2024-12-27T16:36:50.735Z · comments (1)
Monet: Mixture of Monosemantic Experts for Transformers Explained
CalebMaresca (caleb-maresca) · 2025-01-25T19:37:09.078Z · comments (2)
[link] Mechanistic Interpretability of Llama 3.2 with Sparse Autoencoders
PaulPauls · 2024-11-24T05:45:20.124Z · comments (3)
[link] How sci-fi can have drama without dystopia or doomerism
jasoncrawford · 2025-01-17T15:22:00.414Z · comments (3)
The present perfect tense is ruining your life
PatrickDFarley · 2025-01-27T16:14:48.843Z · comments (8)
Almost all growth is exponential growth
lemonhope (lcmgcd) · 2025-01-21T07:16:24.686Z · comments (7)
[link] Why OpenAI’s Structure Must Evolve To Advance Our Mission
stuhlmueller · 2024-12-28T04:24:19.937Z · comments (1)
Historical Net Worth
jefftk (jkaufman) · 2024-12-07T23:10:01.519Z · comments (1)
[link] Poetic Methods I: Meter as Communication Protocol
adamShimi · 2025-02-01T18:22:39.676Z · comments (0)
minifest
Austin Chen (austin-chen) · 2024-12-07T03:50:38.573Z · comments (1)
Definition of alignment science I like
quetzal_rainbow · 2025-01-06T20:40:38.187Z · comments (0)
Higher and lower pleasures
Chris_Leong · 2024-12-05T13:13:46.526Z · comments (3)
Really radical empathy
MichaelStJules · 2025-01-06T17:46:31.269Z · comments (0)
Turning up the Heat on Deceptively-Misaligned AI
J Bostock (Jemist) · 2025-01-07T00:13:28.191Z · comments (16)
PIBBSS Fellowship 2025: Bounties and Cooperative AI Track Announcement
DusanDNesic · 2025-01-09T14:23:47.027Z · comments (0)
How I saved 1 human life (in expectation) without overthinking it
Christopher King (christopher-king) · 2024-12-22T20:53:13.492Z · comments (0)
subfunctional overlaps in attentional selection history implies momentum for decision-trajectories
Emrik (Emrik North) · 2024-12-22T14:12:49.027Z · comments (1)
Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]
Jason Gross (jason-gross) · 2025-01-06T04:22:12.633Z · comments (0)
Proof Explained for "Robust Agents Learn Causal World Model"
Dalcy (Darcy) · 2024-12-22T15:06:16.880Z · comments (0)
Whistleblowing Twitter Bot
Mckiev · 2024-12-26T04:09:45.493Z · comments (5)
AGI with RL is Bad News for Safety
Nadav Brandes (nadav-brandes) · 2024-12-21T19:36:03.970Z · comments (22)
[link] Forecast 2025 With Vox's Future Perfect Team — $2,500 Prize Pool
ChristianWilliams · 2024-12-20T23:00:35.334Z · comments (0)
[link] AI safety content you could create
Adam Jones (domdomegg) · 2025-01-06T15:35:56.167Z · comments (0)
Rebuttals for ~all criticisms of AIXI
Cole Wyeth (Amyr) · 2025-01-07T17:41:10.557Z · comments (15)
[link] Chess As The Model Game
criticalpoints · 2024-11-17T19:45:26.499Z · comments (0)
A Collection of Empirical Frames about Language Models
Daniel Tan (dtch1997) · 2025-01-02T02:49:05.965Z · comments (0)
[link] Can o1-preview find major mistakes amongst 59 NeurIPS '24 MLSB papers?
Abhishaike Mahajan (abhishaike-mahajan) · 2024-12-18T14:21:03.661Z · comments (0)
QFT and neural nets: the basic idea
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-24T13:54:45.099Z · comments (0)
Reality is Fractal-Shaped
silentbob · 2024-12-17T13:52:16.946Z · comments (1)
[link] AI & Liability Ideathon
Kabir Kumar (kabir-kumar) · 2024-11-26T13:54:01.820Z · comments (2)
Advisors for Smaller Major Donors?
jefftk (jkaufman) · 2024-11-06T14:30:06.187Z · comments (2)
Feature request: comment bookmarks
dirk (abandon) · 2025-01-15T06:45:23.862Z · comments (2)
Monthly Roundup #25: December 2024
Zvi · 2024-12-23T14:20:04.682Z · comments (3)
[link] Genesis
PeterMcCluskey · 2024-12-31T22:01:17.277Z · comments (0)
Evolutionary prompt optimization for SAE feature visualization
neverix · 2024-11-14T13:06:49.728Z · comments (0)
Efficiency spectra and “bucket of circuits” cartoons
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-29T15:06:50.768Z · comments (0)
Announcing the CLR Foundations Course and CLR S-Risk Seminars
JamesFaville (elephantiskon) · 2024-11-19T01:18:10.085Z · comments (0)
2024 NYC Secular Solstice & Megameetup
Joe Rogero · 2024-11-12T17:46:18.674Z · comments (0)
[question] What is the most impressive game LLMs can play well?
Cole Wyeth (Amyr) · 2025-01-08T19:38:18.530Z · answers+comments (20)
[link] A primer on machine learning in cryo-electron microscopy (cryo-EM)
Abhishaike Mahajan (abhishaike-mahajan) · 2024-12-22T15:11:58.860Z · comments (0)
In the Name of All That Needs Saving
pleiotroth · 2024-11-07T15:26:12.252Z · comments (2)
Beliefs and state of mind into 2025
RussellThor · 2025-01-10T22:07:01.060Z · comments (9)
We need a universal definition of 'agency' and related words
CstineSublime · 2025-01-11T03:22:56.623Z · comments (1)
Everything you care about is in the map
Tahp · 2024-12-17T14:05:36.824Z · comments (27)
[question] How useful would alien alignment research be?
Donald Hobson (donald-hobson) · 2025-01-23T10:59:22.330Z · answers+comments (5)
Incredibow
jefftk (jkaufman) · 2025-01-07T03:30:02.197Z · comments (3)
Using Dangerous AI, But Safely?
habryka (habryka4) · 2024-11-16T04:29:20.914Z · comments (2)
[link] The Legacy of Computer Science
Johannes C. Mayer (johannes-c-mayer) · 2024-12-29T13:15:28.606Z · comments (0)
Computational functionalism probably can't explain phenomenal consciousness
EuanMcLean (euanmclean) · 2024-12-10T17:11:28.044Z · comments (35)
Most Minds are Irrational
Davidmanheim · 2024-12-10T09:36:33.144Z · comments (4)