LessWrong 2.0 Reader

[link] Report on Frontier Model Training
YafahEdelman (yafah-edelman-1) · 2023-08-30T20:02:46.317Z · comments (21)
[link] The smallest possible button (or: moth traps!)
Neil (neil-warren) · 2023-09-02T15:24:20.453Z · comments (18)
Reducing sycophancy and improving honesty via activation steering
Nina Panickssery (NinaR) · 2023-07-28T02:46:23.122Z · comments (18)
Law of No Evidence
Zvi · 2021-12-20T13:50:01.189Z · comments (20)
Soft takeoff can still lead to decisive strategic advantage
Daniel Kokotajlo (daniel-kokotajlo) · 2019-08-23T16:39:31.317Z · comments (47)
Hire (or Become) a Thinking Assistant
Raemon · 2024-12-23T03:58:42.061Z · comments (46)
[link] Who regulates the regulators? We need to go beyond the review-and-approval paradigm
jasoncrawford · 2023-05-04T22:11:17.465Z · comments (29)
[link] Investigating the Chart of the Century: Why is food so expensive?
Maxwell Tabarrok (maxwell-tabarrok) · 2024-08-16T13:21:23.596Z · comments (26)
Book review: The Checklist Manifesto
Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2021-09-17T23:09:09.590Z · comments (13)
Principles for Alignment/Agency Projects
johnswentworth · 2022-07-07T02:07:36.156Z · comments (20)
How bad a future do ML researchers expect?
KatjaGrace · 2023-03-09T04:50:05.122Z · comments (8)
Book review: "A Thousand Brains" by Jeff Hawkins
Steven Byrnes (steve2152) · 2021-03-04T05:10:44.929Z · comments (18)
[link] Cohabitive Games so Far
mako yass (MakoYass) · 2023-09-28T15:41:27.986Z · comments (141)
[New LW Feature] "Debates"
Ruby · 2023-04-01T07:00:24.466Z · comments (35)
[link] Philosophy of Therapy
DaystarEld · 2020-10-10T20:12:38.204Z · comments (27)
Why I take short timelines seriously
NicholasKees (nick_kees) · 2024-01-28T22:27:21.098Z · comments (29)
Transcript of Sam Altman's interview touching on AI safety
Andy_McKenzie · 2023-01-20T16:14:18.974Z · comments (42)
A proposed method for forecasting transformative AI
Matthew Barnett (matthew-barnett) · 2023-02-10T19:34:01.358Z · comments (21)
What Comes After Epistemic Spot Checks?
Elizabeth (pktechgirl) · 2019-10-22T17:00:00.758Z · comments (9)
LW Petrov Day 2022 (Monday, 9/26)
Ruby · 2022-09-22T02:56:19.738Z · comments (111)
GPT-175bee
Adam Scherlis (adam-scherlis) · 2023-02-08T18:58:01.364Z · comments (14)
Choice Writings of Dominic Cummings
Connor_Flexman · 2021-10-13T02:41:44.291Z · comments (75)
Soares, Tallinn, and Yudkowsky discuss AGI cognition
So8res · 2021-11-29T19:26:33.232Z · comments (39)
[question] What will 2040 probably look like assuming no singularity?
Daniel Kokotajlo (daniel-kokotajlo) · 2021-05-16T22:10:38.542Z · answers+comments (86)
Why was the AI Alignment community so unprepared for this moment?
Ras1513 · 2023-07-15T00:26:29.769Z · comments (65)
Ukraine Situation Report 2022/03/01
lsusr · 2022-03-02T05:07:59.763Z · comments (59)
Harms and possibilities of schooling
TsviBT · 2022-02-22T07:48:09.542Z · comments (38)
[link] On hiding the source of knowledge
jessicata (jessica.liu.taylor) · 2020-01-26T02:48:51.310Z · comments (40)
"Zero Sum" is a misnomer.
abramdemski · 2020-09-30T18:25:30.603Z · comments (34)
[link] DontDoxScottAlexander.com - A Petition
Ben Pace (Benito) · 2020-06-25T05:44:50.050Z · comments (32)
[link] Paper: LLMs trained on “A is B” fail to learn “B is A”
[deleted] · 2023-09-23T19:55:53.427Z · comments (74)
[link] The Alignment Problem: Machine Learning and Human Values
Rohin Shah (rohinmshah) · 2020-10-06T17:41:21.138Z · comments (7)
Mnestics
Jarred Filmer (4thWayWastrel) · 2022-10-23T00:30:11.159Z · comments (6)
Quintin's alignment papers roundup - week 1
Quintin Pope (quintin-pope) · 2022-09-10T06:39:01.773Z · comments (6)
Stampy's AI Safety Info soft launch
steven0461 · 2023-10-05T22:13:04.632Z · comments (9)
Geometric Exploration, Arithmetic Exploitation
Scott Garrabrant · 2022-11-24T15:36:30.334Z · comments (4)
Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions)
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2021-10-11T07:16:45.950Z · comments (36)
Propagating Facts into Aesthetics
Raemon · 2019-12-19T04:09:17.816Z · comments (37)
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network
Erik Jenner (ejenner) · 2024-06-04T15:50:47.475Z · comments (14)
Convincing All Capability Researchers
Logan Riggs (elriggs) · 2022-04-08T17:40:25.488Z · comments (70)
How to Bounded Distrust
Zvi · 2023-01-09T13:10:00.942Z · comments (17)
Utilitarianism Meets Egalitarianism
Scott Garrabrant · 2022-11-21T19:00:12.168Z · comments (16)
Compendium of problems with RLHF
Charbel-Raphaël (charbel-raphael-segerie) · 2023-01-29T11:40:53.147Z · comments (16)
My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda
Chi Nguyen · 2020-08-15T20:02:00.205Z · comments (20)
Taking the parameters which seem to matter and rotating them until they don't
Garrett Baker (D0TheMath) · 2022-08-26T18:26:47.667Z · comments (48)
Moloch and the sandpile catastrophe
Eric Raymond (eric-raymond) · 2022-04-02T15:35:12.552Z · comments (25)
Land Ho!
Zvi · 2022-01-20T13:30:01.262Z · comments (4)
Omicron Variant Post #2
Zvi · 2021-11-29T16:30:01.368Z · comments (34)
[link] Matt Levine on "Fraud is no fun without friends."
Raemon · 2021-01-19T18:23:20.614Z · comments (24)
I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines
307th · 2023-10-20T16:37:46.541Z · comments (33)