LessWrong 2.0 Reader

Companies' safety plans neglect risks from scheming AI
Zach Stein-Perlman · 2024-06-03T15:00:20.236Z · comments (4)

[link] Moderately More Than You Wanted To Know: Depressive Realism
JustisMills · 2025-01-13T02:57:32.022Z · comments (4)

Do sparse autoencoders find "true features"?
Demian Till · 2024-02-22T18:06:59.630Z · comments (33)

The World in 2029
Nathan Young · 2024-03-02T18:03:29.368Z · comments (37)

[link] Nick Bostrom’s new book, “Deep Utopia”, is out today
PeterH · 2024-03-27T11:24:01.401Z · comments (5)

On Dwarkesh’s Podcast with OpenAI’s John Schulman
Zvi · 2024-05-21T17:30:04.332Z · comments (4)

[link] A Narrow Path: a plan to deal with AI extinction risk
Andrea_Miotti (AndreaM) · 2024-10-07T13:02:15.229Z · comments (12)

D&D.Sci Scenario Index
aphyer · 2024-07-23T02:00:43.483Z · comments (0)

AI for Bio: State Of The Field
sarahconstantin · 2024-08-30T18:00:02.187Z · comments (2)

The One and a Half Gemini
Zvi · 2024-02-22T13:10:04.725Z · comments (4)

Stream Entry
lsusr · 2025-01-07T23:56:13.530Z · comments (7)

Joshua Achiam Public Statement Analysis
Zvi · 2024-10-10T12:50:06.285Z · comments (14)

A Gentle Introduction to Risk Frameworks Beyond Forecasting
pendingsurvival · 2024-04-11T18:03:25.605Z · comments (10)

A gentle introduction to mechanistic anomaly detection
Erik Jenner (ejenner) · 2024-04-03T23:06:16.778Z · comments (2)

Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (9)

When AI 10x's AI R&D, What Do We Do?
Logan Riggs (elriggs) · 2024-12-21T23:56:11.069Z · comments (16)

Transcoders enable fine-grained interpretable circuit analysis for language models
Jacob Dunefsky (jacob-dunefsky) · 2024-04-30T17:58:09.982Z · comments (14)

Beards and Masks?
jefftk (jkaufman) · 2025-01-18T16:00:04.049Z · comments (5)

The Simplest Good
Jesse Hoogland (jhoogland) · 2025-02-02T19:51:14.155Z · comments (5)

New, improved multiple-choice TruthfulQA
Owain_Evans · 2025-01-15T23:32:09.202Z · comments (0)

Announcing Suffering For Good
Garrett Baker (D0TheMath) · 2024-04-01T17:08:12.322Z · comments (5)

Prompts for Big-Picture Planning
Raemon · 2024-04-13T03:04:24.523Z · comments (1)

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet
DanielFilan · 2024-05-07T03:50:05.001Z · comments (4)

[link] Excerpts from "A Reader's Manifesto"
Arjun Panickssery (arjun-panickssery) · 2024-09-06T22:37:40.254Z · comments (1)

[Summary] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda (neel-nanda-1) · 2024-04-19T19:06:17.755Z · comments (0)

[link] LK-99 in retrospect
bhauth · 2024-07-07T02:06:27.660Z · comments (21)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

What is "True Love"?
johnswentworth · 2024-08-18T16:05:47.358Z · comments (11)

Heritability: Five Battles
Steven Byrnes (steve2152) · 2025-01-14T18:21:17.756Z · comments (18)

[link] Policymakers don't have access to paywalled articles
Adam Jones (domdomegg) · 2025-01-05T10:56:11.495Z · comments (11)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (16)

Guide to SB 1047
Zvi · 2024-08-20T13:10:07.408Z · comments (18)

LW Frontpage Experiments! (aka "Take the wheel, Shoggoth!")
Ruby · 2024-04-23T03:58:43.443Z · comments (27)

Shard Theory - is it true for humans?
Rishika (rishika-bose) · 2024-06-14T19:21:47.997Z · comments (7)

Multiplex Gene Editing: Where Are We Now?
sarahconstantin · 2024-07-16T20:50:04.590Z · comments (6)

MATS Applications + Research Directions I'm Currently Excited About
Neel Nanda (neel-nanda-1) · 2025-02-06T11:03:40.093Z · comments (4)

FarmKind's Illusory Offer
jefftk (jkaufman) · 2024-08-09T11:30:07.082Z · comments (5)

[Intuitive self-models] 8. Rooting Out Free Will Intuitions
Steven Byrnes (steve2152) · 2024-11-04T18:16:26.736Z · comments (16)

[link] Investigating an insurance-for-AI startup
L Rudolf L (LRudL) · 2024-09-21T15:29:10.083Z · comments (0)

[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (3)

Best in Class Life Improvement
sapphire (deluks917) · 2024-04-04T01:51:02.556Z · comments (20)

[link] Yudkowsky on The Trajectory podcast
Seth Herd · 2025-01-24T19:52:15.104Z · comments (39)

Language Models Use Trigonometry to Do Addition
Subhash Kantamneni (subhashk) · 2025-02-05T13:50:08.243Z · comments (1)

Numberwang: LLMs Doing Autonomous Research, and a Call for Input
eggsyntax · 2025-01-16T17:20:37.552Z · comments (30)

[link] Thermodynamic entropy = Kolmogorov complexity
Aram Ebtekar (EbTech) · 2025-02-17T05:56:06.960Z · comments (12)

Dumbing down
Martin Sustrik (sustrik) · 2024-06-09T06:50:47.469Z · comments (0)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples (diego-caples) · 2024-09-06T17:55:34.265Z · comments (7)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)

[link] Yoshua Bengio: Reasoning through arguments against taking AI safety seriously
Judd Rosenblatt (judd) · 2024-07-11T23:53:17.187Z · comments (3)

Recent comments

cousin_it on Export Surplusses

I don't understand Eliezer's explanation. Imagine Alice is hard-working and Bob is lazy. Then Alice can make goods and sell them to Bob. Half the money she'll spend on having fun, the other half she'll save. In this situation she's rich and has a trade surplus, but the other parts of the explanation - different productivity between different parts of Alice (?) and inability to judge her own work fairly (?) - don't seem to be present.

annapurna on Annapurna's Shortform

Update:

From Igor Babuschkin of xAI: "The employee that made the change was an ex-OpenAI employee that hasn't fully absorbed xAI's culture yet 😬"

https://x.com/ibab/status/1893774017376485466?t=vqJvcSPltsMI5sdGYZJnjg&s=19

saulius on How to Make Superbabies

Thanks for clarifying. If you ever pitch your ideas to potential investors or something, I recommend avoiding talk of hundreds of embryos, at least not without acknowledging that this is unrealistic with current technologies. When reading, I was a bit worried that you might be divorced from reality, thinking in sci-fi terms, not knowing the basic realities about IVF. This made it difficult for me to trust other things you were saying about domains I know nothing about. Just letting you know in case it's helpful :)

weightt-an on leogao's Shortform

The alternative is to pit people against each other in some competitive games, 1 on 1 or in teams. I don't think the feeling you get from such games is consistent with "being competent doesn't feel like being competent, it feels like the thing just being really easy", probably mainly because there is skill level matching, there are always opponents who pose you a real challenge.

Hmm maybe such games need some more long tail probabilistic matching, to sometimes feel the difference. Or maybe variable team sizes, with many incompetent people versus few competent, to get a more "doomguy" feeling.

purplehermann on Power Lies Trembling: a three-book review

I found the graph confusing: it's not clear why one set of points is unstable/stable.

purplehermann on The case for the death penalty

Just notice that systems are not stable: even if you got to decide all policy at a given point in time, policy will naturally warp and people will abuse it.

If killing people quickly, etc. were normal, I assume regimes would use this to stop people from unseating them. (Trump might have been killed; see the attempts to paint him as a rapist.)

yair-halberstadt on The case for the death penalty

I would only use the death penalty where we're close to certain X actually committed the crime. That's fairly common for shoplifting and murder, but unfortunately far less common in rape (unless it's e.g. on a street with CCTV cameras). I guess for probable rape I probably wouldn't impose the death penalty, and would hope that they'll get caught for a violent crime later.

purplehermann on Announcement: Learning Theory Online Course

Please record this course and release the problems

purplehermann on The case for the death penalty

Hitler

Trump - if killing people on short timelines was accepted...

Girl owes guy money, gets him killed for rape. (Her friends join in)

People who were canceled?

I could see nasty business issues, mafias using this etc - but that sounds like a novel so we'll leave it aside

zero-contradictions on Biology, Ideology and Violence

Thanks for commenting. However, he also wrote in the same paragraph:

There are no written records of it, but I'm pretty sure that's what happened, or something like that.

He wrote "or something like that", so I think that allows some variation of two (main) groups fighting each other in a war. He gave his reasoning for why individuals would team into larger groups in the previous paragraph, but I will agree that it's mostly speculative how many warring groups there were. Regardless, I'm convinced that the island's environmental degradation and population collapse were both most likely caused by overpopulation.