LessWrong 2.0 Reader

[link] Why People Commit White Collar Fraud (Ozy linkpost)
sapphire (deluks917) · 2025-03-03T19:33:15.609Z · comments (1)
[link] Published report: Pathways to short TAI timelines
Zershaaneh Qureshi (zershaaneh-qureshi) · 2025-02-20T22:10:12.276Z · comments (0)
Three Levels for Large Language Model Cognition
Eleni Angelou (ea-1) · 2025-02-25T23:14:00.306Z · comments (0)
[Replication] Crosscoder-based Stage-Wise Model Diffing
annas (annasoli) · 2025-03-22T18:35:19.003Z · comments (0)
A hierarchy of disagreement
Adam Zerner (adamzerner) · 2025-01-23T03:17:59.051Z · comments (4)
[link] Inside OpenAI's Controversial Plan to Abandon its Nonprofit Roots
garrison · 2025-04-18T18:46:57.310Z · comments (0)
[link] Are we trying to figure out if AI is conscious?
Kristaps Zilgalvis (kristaps-zilgalvis-1) · 2025-01-27T01:05:07.001Z · comments (6)
Feature Hedging: Another way correlated features break SAEs
chanind · 2025-03-25T14:33:08.694Z · comments (0)
Why Were We Wrong About China and AI? A Case Study in Failed Rationality
thedudeabides · 2025-03-22T05:13:52.181Z · comments (38)
Read More News
utilistrutil · 2025-03-16T21:31:28.817Z · comments (2)
[link] "Long" timelines to advanced AI have gotten crazy short
Matrice Jacobine · 2025-04-03T22:46:39.416Z · comments (0)
[question] What are the surviving worlds like?
KvmanThinking (avery-liu) · 2025-02-17T00:41:49.810Z · answers+comments (2)
Consequentialism is for making decisions
Sniffnoy · 2025-03-27T04:00:07.020Z · comments (9)
Energy Markets Temporal Arbitrage with Batteries
NickyP (Nicky) · 2025-03-04T17:37:56.804Z · comments (3)
[link] Ferrer, Pilar, and Me
Askwho · 2025-04-06T11:22:57.758Z · comments (1)
[question] Can we ever ensure AI alignment if we can only test AI personas?
Karl von Wendt · 2025-03-16T08:06:42.345Z · answers+comments (8)
Distilling the Internal Model Principle
JoseFaustino · 2025-02-08T14:59:29.730Z · comments (0)
Spending on Ourselves
jefftk (jkaufman) · 2025-04-20T18:40:07.988Z · comments (0)
SAE regularization produces more interpretable models
Peter Lai (peter-lai) · 2025-01-28T20:02:56.662Z · comments (7)
Defense Against The Super-Worms
viemccoy · 2025-03-20T07:24:56.975Z · comments (1)
[link] The State of Metaculus
ChristianWilliams · 2025-02-05T19:17:44.862Z · comments (0)
Local Trust
ben_levinstein (benlev) · 2025-02-24T19:53:26.953Z · comments (4)
Reflections on the state of the race to superintelligence, February 2025
Mitchell_Porter · 2025-02-23T13:58:07.663Z · comments (7)
List of most interesting ideas I encountered in my life, ranked
Lucien (lucien) · 2025-02-23T12:36:48.158Z · comments (6)
Towards an understanding of the Chinese AI scene
Mitchell_Porter · 2025-03-24T09:10:19.498Z · comments (0)
Longtermist implications of aliens Space-Faring Civilizations - Introduction
Maxime Riché (maxime-riche) · 2025-02-21T12:08:42.403Z · comments (0)
[link] When should we worry about AI power-seeking?
Joe Carlsmith (joekc) · 2025-02-19T19:44:25.062Z · comments (0)
The Insanity Detector and Writing
Johannes C. Mayer (johannes-c-mayer) · 2025-03-07T11:19:10.758Z · comments (3)
Monet: Mixture of Monosemantic Experts for Transformers Explained
CalebMaresca (caleb-maresca) · 2025-01-25T19:37:09.078Z · comments (2)
[link] Slopworld 2035: The dangers of mediocre AI
titotal (lombertini) · 2025-04-14T13:14:08.390Z · comments (6)
The optimizer won’t just guess your intended semantics
Thomas Kehrenberg (thomas-kehrenberg) · 2025-03-06T19:42:12.682Z · comments (1)
Will US tariffs push data centers for large model training offshore?
ChristianKl · 2025-04-12T12:47:12.917Z · comments (3)
[link] Can Knowledge Hurt You? The Dangers of Infohazards (and Exfohazards)
aggliu · 2025-02-08T15:51:43.143Z · comments (0)
Wiki on Suspects in Lind, Zajko, and Maland Killings
Rebecca_Records · 2025-02-08T04:16:08.589Z · comments (4)
QFT and neural nets: the basic idea
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-24T13:54:45.099Z · comments (0)
Don't go bankrupt, don't go rogue
Nathan Young · 2025-02-06T10:31:14.312Z · comments (1)
Space-Faring Civilization density estimates and models - Review
Maxime Riché (maxime-riche) · 2025-02-27T11:44:21.101Z · comments (0)
Improved visualizations of METR Time Horizons paper.
LDJ (luigi-d) · 2025-03-19T23:36:52.771Z · comments (4)
[link] The Geometry of Linear Regression versus PCA
criticalpoints · 2025-02-23T21:01:33.415Z · comments (7)
The Internal Model Principle: A Straightforward Explanation
Alfred Harwood · 2025-04-12T10:58:51.479Z · comments (1)
[question] How far along Metr's law can AI start automating or helping with alignment research?
Christopher King (christopher-king) · 2025-03-20T15:58:08.369Z · answers+comments (21)
AI Strategy Updates that You Should Make
Alice Blair (Diatom) · 2025-01-27T21:10:41.838Z · comments (2)
Weird Random Newcomb Problem
Tapatakt · 2025-04-11T13:09:01.856Z · comments (15)
Leverage, Exit Costs, and Anger: Re-examining Why We Explode at Home, Not at Work
at_the_zoo · 2025-04-01T18:28:26.611Z · comments (2)
[link] Poetic Methods I: Meter as Communication Protocol
adamShimi · 2025-02-01T18:22:39.676Z · comments (0)
[link] AI Model History is Being Lost
Vale · 2025-03-16T12:38:47.907Z · comments (1)
Distillation of Meta's Large Concept Models Paper
NickyP (Nicky) · 2025-03-04T17:33:40.116Z · comments (3)
Moral Hazard in Democratic Voting
lsusr · 2025-02-12T23:17:39.355Z · comments (8)
Finding Emergent Misalignment
Jan Betley (jan-betley) · 2025-03-26T17:33:46.792Z · comments (0)
[link] "Self-Blackmail" and Alternatives
jessicata (jessica.liu.taylor) · 2025-02-09T23:20:19.895Z · comments (12)