LessWrong 2.0 Reader

“Sharp Left Turn” discourse: An opinionated review
Steven Byrnes (steve2152) · 2025-01-28T18:47:04.395Z · comments (8)
Ten people on the inside
Buck · 2025-01-28T16:41:22.990Z · comments (21)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (3)
My supervillain origin story
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-27T12:20:46.101Z · comments (0)
The Game Board has been Flipped: Now is a good time to rethink what you’re doing
Alex Lintz (alex-lintz) · 2025-01-28T23:36:18.106Z · comments (18)
Should you go with your best guess?: Against precise Bayesianism and related views
Anthony DiGiovanni (antimonyanthony) · 2025-01-27T20:25:26.809Z · comments (8)
[link] Paper: Open Problems in Mechanistic Interpretability
Lee Sharkey (Lee_Sharkey) · 2025-01-29T10:25:54.727Z · comments (0)
Fake thinking and real thinking
Joe Carlsmith (joekc) · 2025-01-28T20:05:06.735Z · comments (3)
[link] Dario Amodei: On DeepSeek and Export Controls
Zach Stein-Perlman · 2025-01-29T17:15:18.986Z · comments (2)
DeepSeek Panic at the App Store
Zvi · 2025-01-28T19:30:07.555Z · comments (14)
A sketch of an AI control safety case
Tomek Korbak (tomek-korbak) · 2025-01-30T17:28:47.992Z · comments (0)
Operator
Zvi · 2025-01-28T20:00:08.374Z · comments (1)
[link] Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
Jan_Kulveit · 2025-01-30T17:03:45.545Z · comments (2)
AI #101: The Shallow End
Zvi · 2025-01-30T14:50:08.269Z · comments (1)
[link] Anthropic CEO calls for RSI
Andrea_Miotti (AndreaM) · 2025-01-29T16:54:24.943Z · comments (10)
DeepSeek: Lemon, It’s Wednesday
Zvi · 2025-01-29T15:00:07.914Z · comments (0)
Deference and Decision-Making
ben_levinstein (benlev) · 2025-01-27T22:02:17.578Z · comments (0)
[question] Is the output of the softmax in a single transformer attention head usually winner-takes-all?
Linda Linsefors · 2025-01-27T15:33:28.992Z · answers+comments (1)
The Upcoming PEPFAR Cut Will Kill Millions, Many of Them Children
omnizoid · 2025-01-27T16:03:51.214Z · comments (2)
[link] Reinforcement Learning by AI Punishment
Abhishaike Mahajan (abhishaike-mahajan) · 2025-01-28T00:57:51.715Z · comments (0)
[link] You should read Hobbes, Locke, Hume, and Mill via EarlyModernTexts.com
Arjun Panickssery (arjun-panickssery) · 2025-01-30T12:35:03.564Z · comments (0)
ARENA 5.0 - Call for Applicants
JamesH (AtlasOfCharts) · 2025-01-30T13:18:27.052Z · comments (0)
[link] Are we trying to figure out if AI is conscious?
Kristaps Zilgalvis (kristaps-zilgalvis-1) · 2025-01-27T01:05:07.001Z · comments (6)
SAE regularization produces more interpretable models
Peter Lai (peter-lai) · 2025-01-28T20:02:56.662Z · comments (2)
AI Strategy Updates that You Should Make
Alice Blair (Diatom) · 2025-01-27T21:10:41.838Z · comments (2)
Efficiency spectra and “bucket of circuits” cartoons
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-29T15:06:50.768Z · comments (0)
The memorization-generalization spectrum and learning coefficients
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-28T16:53:24.628Z · comments (0)
The present perfect tense is ruining your life
PatrickDFarley · 2025-01-27T16:14:48.843Z · comments (8)
How different LLMs answered PhilPapers 2020 survey
Satron · 2025-01-27T21:41:12.334Z · comments (1)
Nvidia doesn’t just sell shovels
winstonBosan · 2025-01-28T04:56:38.720Z · comments (4)
[question] Those of you with lots of meditation experience: How did it influence your understanding of philosophy of mind and topics such as qualia?
SpectrumDT · 2025-01-28T14:29:47.034Z · answers+comments (14)
[question] Should you publish solutions to corrigibility?
rvnnt · 2025-01-30T11:52:05.983Z · answers+comments (8)
Learn to Develop Your Advantage
ReverendBayes (vedernikov-andrei) · 2025-01-29T22:06:00.641Z · comments (0)
My Mental Model of AI Optimist Opinions
tailcalled · 2025-01-29T18:44:36.485Z · comments (2)
Untrusted monitoring insights from watching ChatGPT play coordination games
jwfiredragon · 2025-01-29T04:53:33.125Z · comments (0)
Detecting out of distribution text with surprisal and entropy
Sandy Fraser (alex-fraser) · 2025-01-28T18:46:46.977Z · comments (3)
[link] A High Level Closed-Door Session Discussing DeepSeek: Vision Trumps Technology
Cosmia_Nebula · 2025-01-30T09:53:16.152Z · comments (0)
Revealing alignment faking with a single prompt
Florian_Dietz · 2025-01-29T21:01:15.000Z · comments (4)
Reconceptualizing the Nothingness and Existence
Htarlov (htarlov) · 2025-01-28T20:29:44.390Z · comments (1)
[link] The future of humanity is in management
jasoncrawford · 2025-01-30T22:14:46.765Z · comments (0)
Superintelligent AI will make mistakes
juggins · 2025-01-30T15:12:50.561Z · comments (2)
The Clueless Sniper and the Principle of Indifference
Jim Buhler (jim-buhler) · 2025-01-27T11:52:57.978Z · comments (19)
Introducing the Coalition for a Baruch Plan for AI: A Call for a Radical Treaty-Making process for the Global Governance of AI
rguerreschi · 2025-01-30T15:26:09.482Z · comments (0)
Memorization-generalization in practice
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-30T14:10:48.239Z · comments (0)
Positive jailbreaks in LLMs
dereshev · 2025-01-29T08:41:44.680Z · comments (0)
[question] Does the ChatGPT (web)app sometimes show actual o1 CoTs now?
Sohaib Imran (sohaib-imran) · 2025-01-29T17:27:08.067Z · answers+comments (6)
[link] Fertility Will Never Recover
Eneasz · 2025-01-30T01:16:41.332Z · comments (14)
How *exactly* can AI take your job in the next few years?
Ansh Juneja (ansh-juneja) · 2025-01-30T02:33:13.475Z · comments (0)
Jevon's paradox and economic intuitions
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2025-01-27T23:04:23.854Z · comments (0)
[link] Understanding AI World Models w/ Chris Canal
jacobhaimes · 2025-01-27T16:32:47.724Z · comments (0)