LessWrong 2.0 Reader

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Jan Betley (jan-betley) · 2025-02-25T17:39:31.059Z · comments (22)
[link] what an efficient market feels from inside
DMMF · 2025-02-25T02:38:40.129Z · comments (9)
Economics Roundup #5
Zvi · 2025-02-25T13:40:07.086Z · comments (6)
Osaka
lsusr · 2025-02-26T13:50:24.102Z · comments (0)
Time to Welcome Claude 3.7
Zvi · 2025-02-26T13:00:06.489Z · comments (0)
[link] Upcoming Protest for AI Safety
Matt Vincent (matthew-milone) · 2025-02-25T03:04:03.153Z · comments (0)
Revisiting Conway's Law
annebrandes (annebrandes1@gmail.com) · 2025-02-25T08:33:52.421Z · comments (0)
[question] Intellectual lifehacks repo
Antoine de Scorraille (Etoile de Scauchy) · 2025-02-25T16:32:09.814Z · answers+comments (9)
[link] We Can Build Compassionate AI
Gordon Seidoh Worley (gworley) · 2025-02-25T16:37:06.160Z · comments (2)
Three Levels for Large Language Model Cognition
Eleni Angelou (ea-1) · 2025-02-25T23:14:00.306Z · comments (0)
Levels of analysis for thinking about agency
Cole Wyeth (Amyr) · 2025-02-26T04:24:24.583Z · comments (0)
Technical comparison of Deepseek, Novasky, S1, Helix, P0
Juliezhanggg · 2025-02-25T04:20:40.413Z · comments (0)
[PAPER] Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik (lucy.fa) · 2025-02-26T12:50:04.204Z · comments (0)
[question] Name for Standard AI Caveat?
yrimon (yehuda-rimon) · 2025-02-26T07:07:16.523Z · answers+comments (5)
Optimizing Feedback to Learn Faster
Towards_Keeperhood (Simon Skade) · 2025-02-26T14:24:26.835Z · comments (0)
[link] The Stag Hunt—cultivating cooperation to reap rewards
James Stephen Brown (james-brown) · 2025-02-25T23:45:07.472Z · comments (0)
Minor interpretability exploration #1: Grokking of modular addition, subtraction, multiplication, for different activation functions
Rareș Baron · 2025-02-26T11:35:56.610Z · comments (2)
[link] [Crosspost] Strategic wealth accumulation under transformative AI expectations
arden446 · 2025-02-25T21:50:11.458Z · comments (0)
outlining is a historically recent underutilized gift to family
daijin · 2025-02-26T13:58:17.623Z · comments (0)
Making alignment a law of the universe
juggins · 2025-02-25T10:44:11.632Z · comments (2)
Demystifying the Pinocchio Paradox
Novak Zukowski (Zantarus) · 2025-02-25T06:16:57.219Z · comments (0)