LessWrong 2.0 Reader


← previous page (newer posts) · next page (older posts) →

[link] Nursing doubts
dynomight · 2024-08-30T02:25:36.826Z · comments (23)
Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (58)
[link] China Hawks are Manufacturing an AI Arms Race
garrison · 2024-11-20T18:17:51.958Z · comments (44)
Value Claims (In Particular) Are Usually Bullshit
johnswentworth · 2024-05-30T06:26:21.151Z · comments (18)
Alignment Faking Revisited: Improved Classifiers and Open Source Extensions
John Hughes (john-hughes) · 2025-04-08T17:32:55.315Z · comments (17)
[link] Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study
Adam Karvonen (karvonenadam) · 2025-04-14T17:38:02.918Z · comments (38)
[link] Fields that I reference when thinking about AI takeover prevention
Buck · 2024-08-13T23:08:54.950Z · comments (16)
The "Think It Faster" Exercise
Raemon · 2024-12-11T19:14:10.427Z · comments (35)
Applying traditional economic thinking to AGI: a trilemma
Steven Byrnes (steve2152) · 2025-01-13T01:23:00.397Z · comments (32)
When is a mind me?
Rob Bensinger (RobbBB) · 2024-04-17T05:56:38.482Z · comments (130)
[link] That Alien Message - The Animation
Writer · 2024-09-07T14:53:30.604Z · comments (9)
Momentum of Light in Glass
Ben (ben-lang) · 2024-10-09T20:19:42.088Z · comments (44)
The Most Forbidden Technique
Zvi · 2025-03-12T13:20:04.732Z · comments (9)
Survey: How Do Elite Chinese Students Feel About the Risks of AI?
Nick Corvino (nick-corvino) · 2024-09-02T18:11:11.867Z · comments (13)
OpenAI #12: Battle of the Board Redux
Zvi · 2025-03-31T15:50:02.156Z · comments (1)
Human takeover might be worse than AI takeover
Tom Davidson (tom-davidson-1) · 2025-01-10T16:53:27.043Z · comments (55)
Passages I Highlighted in The Letters of J.R.R. Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (38)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (5)
[link] A computational no-coincidence principle
Eric Neyman (UnexpectedValues) · 2025-02-14T21:39:39.277Z · comments (38)
Ten people on the inside
Buck · 2025-01-28T16:41:22.990Z · comments (28)
Auditing language models for hidden objectives
Sam Marks (samuel-marks) · 2025-03-13T19:18:32.638Z · comments (15)
[link] The Hidden Cost of Our Lies to AI
Nicholas Andresen (nicholas-andresen) · 2025-03-06T05:03:47.239Z · comments (17)
What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (57)
Hire (or Become) a Thinking Assistant
Raemon · 2024-12-23T03:58:42.061Z · comments (49)
My experience using financial commitments to overcome akrasia
William Howard (william-howard) · 2024-04-15T22:57:32.574Z · comments (33)
[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty
tandem · 2025-01-07T19:11:21.238Z · comments (5)
[link] The Failed Strategy of Artificial Intelligence Doomers
Ben Pace (Benito) · 2025-01-31T18:56:06.784Z · comments (78)
Anomalous Tokens in DeepSeek-V3 and r1
henry (henry-bass) · 2025-01-25T22:55:41.232Z · comments (2)
On saying "Thank you" instead of "I'm Sorry"
Michael Cohn (michael-cohn) · 2024-07-08T03:13:50.663Z · comments (16)
The Milton Friedman Model of Policy Change
JohnofCharleston · 2025-03-04T00:38:56.778Z · comments (17)
[Completed] The 2024 Petrov Day Scenario
Ben Pace (Benito) · 2024-09-26T08:08:32.495Z · comments (114)
How it All Went Down: The Puzzle Hunt that took us way, way Less Online
A* (agendra) · 2024-06-02T08:01:40.109Z · comments (5)
[question] How Much Are LLMs Actually Boosting Real-World Programmer Productivity?
Thane Ruthenis · 2025-03-04T16:23:39.296Z · answers+comments (51)
[question] Which things were you surprised to learn are not metaphors?
Eric Neyman (UnexpectedValues) · 2024-11-21T18:56:18.025Z · answers+comments (88)
Loving a world you don’t trust
Joe Carlsmith (joekc) · 2024-06-18T19:31:36.581Z · comments (13)
An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2
Neel Nanda (neel-nanda-1) · 2024-07-07T17:39:35.064Z · comments (16)
Limitations on Formal Verification for AI Safety
Andrew Dickson · 2024-08-19T23:03:52.706Z · comments (60)
Why I don't believe in the placebo effect
transhumanist_atom_understander · 2024-06-10T02:37:07.776Z · comments (22)
[link] Simple probes can catch sleeper agents
Monte M (montemac) · 2024-04-23T21:10:47.784Z · comments (21)
[link] "AI achieves silver-medal standard solving International Mathematical Olympiad problems"
gjm · 2024-07-25T15:58:57.638Z · comments (38)
Parasites (not a metaphor)
lemonhope (lcmgcd) · 2024-08-08T20:07:13.593Z · comments (19)
A Dozen Ways to Get More Dakka
Davidmanheim · 2024-04-08T04:45:19.427Z · comments (11)
[link] Training on Documents About Reward Hacking Induces Reward Hacking
evhub · 2025-01-21T21:32:24.691Z · comments (15)
[link] "Can AI Scaling Continue Through 2030?", Epoch AI (yes)
gwern · 2024-08-24T01:40:32.929Z · comments (4)
Circuits in Superposition: Compressing many small neural networks into one
Lucius Bushnaq (Lblack) · 2024-10-14T13:06:14.596Z · comments (9)
Some articles in “International Security” that I enjoyed
Buck · 2025-01-31T16:23:27.061Z · comments (10)
The Paris AI Anti-Safety Summit
Zvi · 2025-02-12T14:00:07.383Z · comments (21)
Tell me about yourself: LLMs are aware of their learned behaviors
Martín Soto (martinsq) · 2025-01-22T00:47:15.023Z · comments (5)
Building AI Research Fleets
Ben Goldhaber (bgold) · 2025-01-12T18:23:09.682Z · comments (11)
Near-mode thinking on AI
Olli Järviniemi (jarviniemi) · 2024-08-04T20:47:28.085Z · comments (9)