LessWrong 2.0 Reader


Outlaw Code
scarcegreengrass · 2025-01-30T23:41:57.239Z · comments (1)
Can someone, anyone, make superintelligence a more concrete concept?
Ori Nagel (ori-nagel) · 2025-01-30T23:25:36.135Z · comments (4)
Upcoming Neuroscience Workshop - Functionalizing Brain Data, Ground-Truthing, and the Role of Artificial Data in Advancing Neuroscience
Devin Ward (Carboncopies Foundation) · 2025-01-30T23:02:00.681Z · comments (0)
What's Behind the SynBio Bust?
sarahconstantin · 2025-01-30T22:30:06.916Z · comments (7)
[link] The future of humanity is in management
jasoncrawford · 2025-01-30T22:14:46.765Z · comments (4)
[Translation] AI Generated Fake News is Taking Over my Family Group Chat
mushroomsoup · 2025-01-30T20:24:22.175Z · comments (0)
A sketch of an AI control safety case
Tomek Korbak (tomek-korbak) · 2025-01-30T17:28:47.992Z · comments (0)
[link] Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
Jan_Kulveit · 2025-01-30T17:03:45.545Z · comments (46)
[question] Implication of Uncomputable Problems
Nathan1123 · 2025-01-30T16:48:38.222Z · answers+comments (1)
[link] Hello World
Charlie Sanders (charlie-sanders) · 2025-01-30T15:33:57.427Z · comments (0)
Introducing the Coalition for a Baruch Plan for AI: A Call for a Radical Treaty-Making process for the Global Governance of AI
rguerreschi · 2025-01-30T15:26:09.482Z · comments (0)
AI #101: The Shallow End
Zvi · 2025-01-30T14:50:08.269Z · comments (1)
Memorization-generalization in practice
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-30T14:10:48.239Z · comments (0)
ARENA 5.0 - Call for Applicants
JamesH (AtlasOfCharts) · 2025-01-30T13:18:27.052Z · comments (0)
[link] You should read Hobbes, Locke, Hume, and Mill via EarlyModernTexts.com
Arjun Panickssery (arjun-panickssery) · 2025-01-30T12:35:03.564Z · comments (3)
[question] Should you publish solutions to corrigibility?
rvnnt · 2025-01-30T11:52:05.983Z · answers+comments (13)
[link] Tetherware #1: The case for humanlike AI with free will
Jáchym Fibír · 2025-01-30T10:58:11.717Z · comments (6)
[link] A High Level Closed-Door Session Discussing DeepSeek: Vision Trumps Technology
Cosmia_Nebula · 2025-01-30T09:53:16.152Z · comments (1)
Are we the Wolves now? Human Eugenics under AI Control
Brit (james-spencer) · 2025-01-30T08:31:34.423Z · comments (1)
[question] Why not train reasoning models with RLHF?
CBiddulph (caleb-biddulph) · 2025-01-30T07:58:35.742Z · answers+comments (4)
The Road to Evil Is Paved with Good Objectives: Framework to Classify and Fix Misalignments.
Shivam · 2025-01-30T02:44:47.907Z · comments (0)
How *exactly* can AI take your job in the next few years?
Ansh Juneja (ansh-juneja) · 2025-01-30T02:33:13.475Z · comments (0)
Absorbing Your Friends' Powers
Alice Blair (Diatom) · 2025-01-30T02:32:27.091Z · comments (1)
Detailed Ideal World Benchmark
Knight Lee (Max Lee) · 2025-01-30T02:31:39.852Z · comments (0)
[link] Fertility Will Never Recover
Eneasz · 2025-01-30T01:16:41.332Z · comments (29)
[link] Predation as Payment for Criticism
Benquo · 2025-01-30T01:06:27.591Z · comments (6)
Learn to Develop Your Advantage
ReverendBayes (vedernikov-andrei) · 2025-01-29T22:06:00.641Z · comments (1)
Revealing alignment faking with a single prompt
Florian_Dietz · 2025-01-29T21:01:15.000Z · comments (5)
Allegory of the Tsunami
Evan Hu (evan-hu) · 2025-01-29T19:09:33.761Z · comments (1)
My Mental Model of AI Optimist Opinions
tailcalled · 2025-01-29T18:44:36.485Z · comments (2)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (3)
[question] Does the ChatGPT (web)app sometimes show actual o1 CoTs now?
Sohaib Imran (sohaib-imran) · 2025-01-29T17:27:08.067Z · answers+comments (6)
[link] Dario Amodei: On DeepSeek and Export Controls
Zach Stein-Perlman · 2025-01-29T17:15:18.986Z · comments (3)
[link] Anthropic CEO calls for RSI
Andrea_Miotti (AndreaM) · 2025-01-29T16:54:24.943Z · comments (10)
Efficiency spectra and “bucket of circuits” cartoons
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-29T15:06:50.768Z · comments (0)
DeepSeek: Lemon, It’s Wednesday
Zvi · 2025-01-29T15:00:07.914Z · comments (0)
How not to build a dystopia
ank · 2025-01-29T14:16:09.862Z · comments (4)
[link] Whereby: The Zoom alternative you probably haven't heard of
Itay Dreyfus (itay-dreyfus) · 2025-01-29T13:01:08.564Z · comments (0)
[question] Whose track record of AI predictions would you like to see evaluated?
Jonny Spicer (jonnyspicer) · 2025-01-29T12:05:30.311Z · answers+comments (1)
[link] Paper: Open Problems in Mechanistic Interpretability
Lee Sharkey (Lee_Sharkey) · 2025-01-29T10:25:54.727Z · comments (0)
Positive jailbreaks in LLMs
dereshev · 2025-01-29T08:41:44.680Z · comments (0)
Untrusted monitoring insights from watching ChatGPT play coordination games
jwfiredragon · 2025-01-29T04:53:33.125Z · comments (1)
The Game Board has been Flipped: Now is a good time to rethink what you’re doing
LintzA (alex-lintz) · 2025-01-28T23:36:18.106Z · comments (26)
Reconceptualizing the Nothingness and Existence
Htarlov (htarlov) · 2025-01-28T20:29:44.390Z · comments (1)
Fake thinking and real thinking
Joe Carlsmith (joekc) · 2025-01-28T20:05:06.735Z · comments (7)
SAE regularization produces more interpretable models
Peter Lai (peter-lai) · 2025-01-28T20:02:56.662Z · comments (6)
Operator
Zvi · 2025-01-28T20:00:08.374Z · comments (1)
DeepSeek Panic at the App Store
Zvi · 2025-01-28T19:30:07.555Z · comments (14)
“Sharp Left Turn” discourse: An opinionated review
Steven Byrnes (steve2152) · 2025-01-28T18:47:04.395Z · comments (20)
Detecting out of distribution text with surprisal and entropy
Sandy Fraser (alex-fraser) · 2025-01-28T18:46:46.977Z · comments (4)