LessWrong 2.0 Reader

[link] Linkpost: A Post Mortem on the Gino Case
Linch · 2023-10-24T06:50:42.896Z · comments (7)
South Bay SSC Meetup, San Jose, November 5th
David Friedman (david-friedman) · 2023-10-24T04:50:50.974Z · comments (1)
AI Pause Will Likely Backfire (Guest Post)
jsteinhardt · 2023-10-24T04:30:02.113Z · comments (6)
Human wanting
TsviBT · 2023-10-24T01:05:39.374Z · comments (1)
[link] Towards Understanding Sycophancy in Language Models
Ethan Perez (ethan-perez) · 2023-10-24T00:30:48.923Z · comments (0)
Manifold Halloween Hackathon
Austin Chen (austin-chen) · 2023-10-23T22:47:18.462Z · comments (0)
Open Source Replication & Commentary on Anthropic's Dictionary Learning Paper
Neel Nanda (neel-nanda-1) · 2023-10-23T22:38:33.951Z · comments (12)
[link] The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists
EJT (ElliottThornley) · 2023-10-23T21:00:48.398Z · comments (22)
[link] AI Alignment [Incremental Progress Units] this Week (10/22/23)
Logan Zoellner (logan-zoellner) · 2023-10-23T20:32:37.998Z · comments (0)
z is not the cause of x
hrbigelow · 2023-10-23T17:43:59.563Z · comments (2)
Some of my predictable updates on AI
Aaron_Scher · 2023-10-23T17:24:34.720Z · comments (8)
Programmatic backdoors: DNNs can use SGD to run arbitrary stateful computation
Fabien Roger (Fabien) · 2023-10-23T16:37:45.611Z · comments (3)
Machine Unlearning Evaluations as Interpretability Benchmarks
NickyP (Nicky) · 2023-10-23T16:33:04.878Z · comments (2)
[link] VLM-RM: Specifying Rewards with Natural Language
ChengCheng (ccstan99) · 2023-10-23T14:11:34.493Z · comments (2)
Contra Dance Dialect Survey
jefftk (jkaufman) · 2023-10-23T13:40:08.294Z · comments (0)
[question] Which LessWrongers are (aspiring) YouTubers?
Mati_Roy (MathieuRoy) · 2023-10-23T13:21:49.004Z · answers+comments (13)
[question] What is an "anti-Occamian prior"?
Zane · 2023-10-23T02:26:10.851Z · answers+comments (22)
AI Safety is Dropping the Ball on Clown Attacks
trevor (TrevorWiesinger) · 2023-10-22T20:09:31.810Z · comments (72)
The Drowning Child
Tomás B. (Bjartur Tómas) · 2023-10-22T16:39:53.016Z · comments (8)
Announcing Timaeus
Jesse Hoogland (jhoogland) · 2023-10-22T11:59:03.938Z · comments (15)
[link] Into AI Safety - Episode 0
jacobhaimes · 2023-10-22T03:30:57.865Z · comments (1)
Thoughts On (Solving) Deep Deception
Jozdien · 2023-10-21T22:40:10.060Z · comments (2)
Best effort beliefs
Adam Zerner (adamzerner) · 2023-10-21T22:05:59.382Z · comments (9)
How toy models of ontology changes can be misleading
Stuart_Armstrong · 2023-10-21T21:13:56.384Z · comments (0)
Soups as Spreads
jefftk (jkaufman) · 2023-10-21T20:30:08.320Z · comments (0)
Which COVID booster to get?
Sameerishere · 2023-10-21T19:43:04.273Z · comments (0)
Alignment Implications of LLM Successes: a Debate in One Act
Zack_M_Davis · 2023-10-21T15:22:23.053Z · comments (50)
How to find a good moving service
Ziyue Wang (VincentWang25) · 2023-10-21T04:59:07.814Z · comments (0)
Apply for MATS Winter 2023-24!
Rocket (utilistrutil) · 2023-10-21T02:27:34.350Z · comments (6)
[question] Can we isolate neurons that recognize features vs. those which have some other role?
Joshua Clancy (joshua-clancy) · 2023-10-21T00:30:11.758Z · answers+comments (2)
Muddling Along Is More Likely Than Dystopia
Jeffrey Heninger (jeffrey-heninger) · 2023-10-20T21:25:15.459Z · comments (10)
What's Hard About The Shutdown Problem
johnswentworth · 2023-10-20T21:13:27.624Z · comments (31)
Holly Elmore and Rob Miles dialogue on AI Safety Advocacy
jacobjacob · 2023-10-20T21:04:32.645Z · comments (30)
TOMORROW: the largest AI Safety protest ever!
Holly_Elmore · 2023-10-20T18:15:18.276Z · comments (25)
The Overkill Conspiracy Hypothesis
ymeskhout · 2023-10-20T16:51:20.308Z · comments (8)
I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines
307th · 2023-10-20T16:37:46.541Z · comments (32)
Internal Target Information for AI Oversight
Paul Colognese (paul-colognese) · 2023-10-20T14:53:00.284Z · comments (0)
On the proper date for solstice celebrations
jchan · 2023-10-20T13:55:02.999Z · comments (0)
Are (at least some) Large Language Models Holographic Memory Stores?
Bill Benzon (bill-benzon) · 2023-10-20T13:07:02.041Z · comments (4)
[link] Mechanistic interpretability of LLM analogy-making
Sergii (sergey-kharagorgiev) · 2023-10-20T12:53:26.550Z · comments (0)
[link] How To Socialize With Psycho(logist)s
Sable · 2023-10-20T11:33:46.066Z · comments (11)
Revealing Intentionality In Language Models Through AdaVAE Guided Sampling
jdp · 2023-10-20T07:32:28.749Z · comments (15)
Features and Adversaries in MemoryDT
Joseph Bloom (Jbloom) · 2023-10-20T07:32:21.091Z · comments (6)
[link] AI Safety Hub Serbia Soft Launch
DusanDNesic · 2023-10-20T07:11:48.389Z · comments (1)
Announcing new round of "Key Phenomena in AI Risk" Reading Group
DusanDNesic · 2023-10-20T07:11:09.360Z · comments (2)
Unpacking the dynamics of AGI conflict that suggest the necessity of a preemptive pivotal act
Eli Tyre (elityre) · 2023-10-20T06:48:06.765Z · comments (2)
[link] Genocide isn't Decolonization
robotelvis · 2023-10-20T04:14:07.716Z · comments (19)
Trying to understand John Wentworth's research agenda
johnswentworth · 2023-10-20T00:05:40.929Z · comments (11)
Boost your productivity, happiness and health with this one weird trick
ajc586 (Adrian Cable) · 2023-10-19T23:30:54.734Z · comments (9)
[link] A Good Explanation of Differential Gears
Johannes C. Mayer (johannes-c-mayer) · 2023-10-19T23:07:46.354Z · comments (4)