LessWrong 2.0 Reader

AI for Bio: State Of The Field
sarahconstantin · 2024-08-30T18:00:02.187Z · comments (2)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

Some Rules for an Algebra of Bayes Nets
johnswentworth · 2023-11-16T23:53:11.650Z · comments (31)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples (diego-caples) · 2024-09-06T17:55:34.265Z · comments (7)

Epistemic Hell
rogersbacon · 2024-01-27T17:13:09.578Z · comments (20)

“Artificial General Intelligence”: an extremely brief FAQ
Steven Byrnes (steve2152) · 2024-03-11T17:49:02.496Z · comments (6)

[link] OpenAI: Preparedness framework
Zach Stein-Perlman · 2023-12-18T18:30:10.153Z · comments (23)

[link] Yoshua Bengio: Reasoning through arguments against taking AI safety seriously
Judd Rosenblatt (judd) · 2024-07-11T23:53:17.187Z · comments (3)

[link] [Repost] The Copenhagen Interpretation of Ethics
mesaoptimizer · 2024-01-25T15:20:08.162Z · comments (4)

[link] The True Story of How GPT-2 Became Maximally Lewd
Writer · 2024-01-18T21:03:08.167Z · comments (7)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (65)

Update on Chinese IQ-related gene panels
Lao Mein (derpherpize) · 2023-12-14T10:12:21.212Z · comments (7)

Instruction-following AGI is easier and more likely than value aligned AGI
Seth Herd · 2024-05-15T19:38:03.185Z · comments (25)

Dumbing down
Martin Sustrik (sustrik) · 2024-06-09T06:50:47.469Z · comments (0)

[link] Former OpenAI Superalignment Researcher: Superintelligence by 2030
Julian Bradshaw · 2024-06-05T03:35:19.251Z · comments (30)

[link] Davidad's Provably Safe AI Architecture - ARIA's Programme Thesis
simeon_c (WayZ) · 2024-02-01T21:30:44.090Z · comments (17)

Game Theory without Argmax [Part 1]
Cleo Nardo (strawberry calm) · 2023-11-11T15:59:47.486Z · comments (18)

Multiplex Gene Editing: Where Are We Now?
sarahconstantin · 2024-07-16T20:50:04.590Z · comments (6)

[link] Motivation gaps: Why so much EA criticism is hostile and lazy
titotal (lombertini) · 2024-04-22T11:49:59.389Z · comments (5)

How We Picture Bayesian Agents
johnswentworth · 2024-04-08T18:12:48.595Z · comments (14)

Transcoders enable fine-grained interpretable circuit analysis for language models
Jacob Dunefsky (jacob-dunefsky) · 2024-04-30T17:58:09.982Z · comments (14)

Flagging Potentially Unfair Parenting
jefftk (jkaufman) · 2023-12-26T12:40:05.099Z · comments (1)

Finding Sparse Linear Connections between Features in LLMs
Logan Riggs (elriggs) · 2023-12-09T02:27:42.456Z · comments (5)

[link] [Link Post] "Foundational Challenges in Assuring Alignment and Safety of Large Language Models"
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-06-06T18:55:09.151Z · comments (2)

[link] The Inner Ring by C. S. Lewis
Saul Munn (saul-munn) · 2024-04-24T22:48:09.228Z · comments (6)

[link] We're all in this together
Tamsin Leake (carado-1) · 2023-12-05T13:57:46.270Z · comments (65)

[link] InterLab – a toolkit for experiments with multi-agent interactions
Tomáš Gavenčiak (tomas-gavenciak) · 2024-01-22T18:23:35.661Z · comments (0)

AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt
DanielFilan · 2024-04-11T21:30:04.244Z · comments (10)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (15)

How useful is "AI Control" as a framing on AI X-Risk?
habryka (habryka4) · 2024-03-14T18:06:30.459Z · comments (4)

Best in Class Life Improvement
sapphire (deluks917) · 2024-04-04T01:51:02.556Z · comments (20)

MATS AI Safety Strategy Curriculum
Ronny Fernandez (ronny-fernandez) · 2024-03-07T19:59:37.434Z · comments (2)

[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)

The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)

When Are Circular Definitions A Problem?
johnswentworth · 2024-05-28T20:00:23.408Z · comments (15)

Duct Tape security
Isaac King (KingSupernova) · 2024-04-26T18:57:05.659Z · comments (11)

Shard Theory - is it true for humans?
Rishika (rishika-bose) · 2024-06-14T19:21:47.997Z · comments (7)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

Generalized Stat Mech: The Boltzmann Approach
David Lorell · 2024-04-12T17:47:31.880Z · comments (7)

Brief notes on the Wikipedia game
Olli Järviniemi (jarviniemi) · 2024-07-14T02:28:22.473Z · comments (9)

Text Posts from the Kids Group: 2020
jefftk (jkaufman) · 2024-04-13T22:30:05.326Z · comments (3)

Alignment can improve generalisation through more robustly doing what a human wants - CoinRun example
Stuart_Armstrong · 2023-11-21T11:41:34.798Z · comments (9)

[link] The 2nd Demographic Transition
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-06T14:10:13.095Z · comments (17)

[Summary] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda (neel-nanda-1) · 2024-04-19T19:06:17.755Z · comments (0)

AI #79: Ready for Some Football
Zvi · 2024-08-29T13:30:10.902Z · comments (16)

Meetup Tip: Heartbeat Messages
Screwtape · 2023-12-07T17:18:33.582Z · comments (4)

[New Feature] Your Subscribed Feed
Ruby · 2024-06-11T22:45:00.000Z · comments (8)

Different senses in which two AIs can be “the same”
Vivek Hebbar (Vivek) · 2024-06-24T03:16:43.400Z · comments (0)

Ophiology (or, how the Mamba architecture works)
Danielle Ensign (phylliida-dev) · 2024-04-09T19:31:09.975Z · comments (8)

My big reason for going to RSS was to mitigate the content prioritization system. I want to skim every headline, or at least every headline over some minimum threshold of "good". On the other hand, I don't want to have to look at any old headlines twice to see the new ones. I'm really minimally interested in either the software's or the other users' opinions of which material I should want to see. RSS makes it easier to get a simple chronological view; the built-in chronological view is weird and hard to navigate to. I really feel like I'm having to fight the site to see what I want to see.
