LessWrong 2.0 Reader

Notice When People Are Directionally Correct
Chris_Leong · 2024-01-14T14:12:37.090Z · comments (7)
Utility ≠ Reward
Vlad Mikulik (vlad_m) · 2019-09-05T17:28:13.222Z · comments (24)
Ukraine Situation Report 2022/03/01
lsusr · 2022-03-02T05:07:59.763Z · comments (59)
Choice Writings of Dominic Cummings
Connor_Flexman · 2021-10-13T02:41:44.291Z · comments (75)
AGI safety from first principles: Introduction
Richard_Ngo (ricraz) · 2020-09-28T19:53:22.849Z · comments (18)
Why I'm joining Anthropic
evhub · 2023-01-05T01:12:13.822Z · comments (4)
[link] Gene drives: why the wait?
Metacelsus · 2022-09-19T23:37:17.595Z · comments (50)
What Comes After Epistemic Spot Checks?
Elizabeth (pktechgirl) · 2019-10-22T17:00:00.758Z · comments (9)
Soares, Tallinn, and Yudkowsky discuss AGI cognition
So8res · 2021-11-29T19:26:33.232Z · comments (39)
An Update on Academia vs. Industry (one year into my faculty job)
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2022-09-03T20:43:37.701Z · comments (18)
A proposed method for forecasting transformative AI
Matthew Barnett (matthew-barnett) · 2023-02-10T19:34:01.358Z · comments (20)
Law of No Evidence
Zvi · 2021-12-20T13:50:01.189Z · comments (19)
[link] Paper: LLMs trained on “A is B” fail to learn “B is A”
lberglund (brglnd) · 2023-09-23T19:55:53.427Z · comments (73)
My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda
Chi Nguyen · 2020-08-15T20:02:00.205Z · comments (20)
Reward Is Not Enough
Steven Byrnes (steve2152) · 2021-06-16T13:52:33.745Z · comments (19)
[link] The Alignment Problem: Machine Learning and Human Values
Rohin Shah (rohinmshah) · 2020-10-06T17:41:21.138Z · comments (7)
Land Ho!
Zvi · 2022-01-20T13:30:01.262Z · comments (4)
[link] Matt Levine on "Fraud is no fun without friends."
Raemon · 2021-01-19T18:23:20.614Z · comments (24)
Stampy's AI Safety Info soft launch
steven0461 · 2023-10-05T22:13:04.632Z · comments (9)
Moloch and the sandpile catastrophe
Eric Raymond (eric-raymond) · 2022-04-02T15:35:12.552Z · comments (25)
Compendium of problems with RLHF
Charbel-Raphaël (charbel-raphael-segerie) · 2023-01-29T11:40:53.147Z · comments (16)
Convincing All Capability Researchers
Logan Riggs (elriggs) · 2022-04-08T17:40:25.488Z · comments (70)
Omicron Variant Post #2
Zvi · 2021-11-29T16:30:01.368Z · comments (34)
[link] DontDoxScottAlexander.com - A Petition
Ben Pace (Benito) · 2020-06-25T05:44:50.050Z · comments (32)
Conversation with Eliezer: What do you want the system to do?
Akash (akash-wasil) · 2022-06-25T17:36:14.145Z · comments (38)
Quintin's alignment papers roundup - week 1
Quintin Pope (quintin-pope) · 2022-09-10T06:39:01.773Z · comments (6)
Taking the parameters which seem to matter and rotating them until they don't
Garrett Baker (D0TheMath) · 2022-08-26T18:26:47.667Z · comments (48)
Perpetual Dickensian Poverty?
jefftk (jkaufman) · 2021-12-21T13:30:03.543Z · comments (18)
Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1
StefanHex (Stefan42) · 2023-05-09T19:41:10.528Z · comments (1)
The Territory
LoganStrohl (BrienneYudkowsky) · 2022-02-15T18:56:36.992Z · comments (12)
What good is G-factor if you're dumped in the woods? A field report from a camp counselor.
Hastings (hastings-greer) · 2024-01-12T13:17:23.829Z · comments (22)
A Significant Portion of COVID-19 Transmission Is Presymptomatic
jimrandomh · 2020-03-14T05:52:33.734Z · comments (22)
Christiano, Cotra, and Yudkowsky on AI progress
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T16:45:32.482Z · comments (95)
Unwitting cult leaders
Kaj_Sotala · 2021-02-11T11:10:04.504Z · comments (9)
[link] Introducing the Center for AI Policy (& we're hiring!)
Thomas Larsen (thomas-larsen) · 2023-08-28T21:17:11.703Z · comments (50)
Some background for reasoning about dual-use alignment research
Charlie Steiner · 2023-05-18T14:50:54.401Z · comments (19)
RTFB: On the New Proposed CAIP AI Bill
Zvi · 2024-04-10T18:30:08.410Z · comments (13)
Harms and possibilities of schooling
TsviBT · 2022-02-22T07:48:09.542Z · comments (38)
Late 2021 MIRI Conversations: AMA / Discussion
Rob Bensinger (RobbBB) · 2022-02-28T20:03:05.318Z · comments (199)
Delta Strain: Fact Dump and Some Policy Takeaways
Connor_Flexman · 2021-07-28T03:38:34.455Z · comments (60)
[link] Steering Llama-2 with contrastive activation additions
Nina Rimsky (NinaR) · 2024-01-02T00:47:04.621Z · comments (29)
One-layer transformers aren’t equivalent to a set of skip-trigrams
Buck · 2023-02-17T17:26:13.819Z · comments (10)
How to Bounded Distrust
Zvi · 2023-01-09T13:10:00.942Z · comments (15)
GPT-175bee
Adam Scherlis (adam-scherlis) · 2023-02-08T18:58:01.364Z · comments (13)
FHI paper published in Science: interventions against COVID-19
[deleted] · 2020-12-16T21:19:00.441Z · comments (0)
Theses on Sleep
guzey · 2022-02-11T12:58:15.300Z · comments (104)
Narrative Syncing
AnnaSalamon · 2022-05-01T01:48:45.889Z · comments (48)
Why was the AI Alignment community so unprepared for this moment?
Ras1513 · 2023-07-15T00:26:29.769Z · comments (64)
Problem relaxation as a tactic
TurnTrout · 2020-04-22T23:44:42.398Z · comments (8)
Future ML Systems Will Be Qualitatively Different
jsteinhardt · 2022-01-11T19:50:11.377Z · comments (10)