LessWrong 2.0 Reader

Geometric Exploration, Arithmetic Exploitation
Scott Garrabrant · 2022-11-24T15:36:30.334Z · comments (4)
Taking the parameters which seem to matter and rotating them until they don't
Garrett Baker (D0TheMath) · 2022-08-26T18:26:47.667Z · comments (48)
Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions)
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2021-10-11T07:16:45.950Z · comments (36)
Utilitarianism Meets Egalitarianism
Scott Garrabrant · 2022-11-21T19:00:12.168Z · comments (16)
My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda
Chi Nguyen · 2020-08-15T20:02:00.205Z · comments (20)
[link] Matt Levine on "Fraud is no fun without friends."
Raemon · 2021-01-19T18:23:20.614Z · comments (24)
[link] DontDoxScottAlexander.com - A Petition
Ben Pace (Benito) · 2020-06-25T05:44:50.050Z · comments (32)
[link] On hiding the source of knowledge
jessicata (jessica.liu.taylor) · 2020-01-26T02:48:51.310Z · comments (40)
Quintin's alignment papers roundup - week 1
Quintin Pope (quintin-pope) · 2022-09-10T06:39:01.773Z · comments (6)
How to Bounded Distrust
Zvi · 2023-01-09T13:10:00.942Z · comments (17)
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network
Erik Jenner (ejenner) · 2024-06-04T15:50:47.475Z · comments (14)
AI Alignment Metastrategy
Vanessa Kosoy (vanessa-kosoy) · 2023-12-31T12:06:11.433Z · comments (13)
Stampy's AI Safety Info soft launch
steven0461 · 2023-10-05T22:13:04.632Z · comments (9)
Compendium of problems with RLHF
Charbel-Raphaël (charbel-raphael-segerie) · 2023-01-29T11:40:53.147Z · comments (16)
"Zero Sum" is a misnomer.
abramdemski · 2020-09-30T18:25:30.603Z · comments (34)
Land Ho!
Zvi · 2022-01-20T13:30:01.262Z · comments (4)
[link] The Alignment Problem: Machine Learning and Human Values
Rohin Shah (rohinmshah) · 2020-10-06T17:41:21.138Z · comments (7)
Omicron Variant Post #2
Zvi · 2021-11-29T16:30:01.368Z · comments (34)
[link] Paper: LLMs trained on “A is B” fail to learn “B is A”
[deleted] · 2023-09-23T19:55:53.427Z · comments (74)
Convincing All Capability Researchers
Logan Riggs (elriggs) · 2022-04-08T17:40:25.488Z · comments (70)
[question] Which skincare products are evidence-based?
Vanessa Kosoy (vanessa-kosoy) · 2024-05-02T15:22:12.597Z · answers+comments (48)
Future ML Systems Will Be Qualitatively Different
jsteinhardt · 2022-01-11T19:50:11.377Z · comments (10)
Views on when AGI comes and on strategy to reduce existential risk
TsviBT · 2023-07-08T09:00:19.735Z · comments (55)
Problem relaxation as a tactic
TurnTrout · 2020-04-22T23:44:42.398Z · comments (8)
Late 2021 MIRI Conversations: AMA / Discussion
Rob Bensinger (RobbBB) · 2022-02-28T20:03:05.318Z · comments (199)
Delta Strain: Fact Dump and Some Policy Takeaways
Connor_Flexman · 2021-07-28T03:38:34.455Z · comments (60)
[link] The Failed Strategy of Artificial Intelligence Doomers
Ben Pace (Benito) · 2025-01-31T18:56:06.784Z · comments (69)
Perpetual Dickensian Poverty?
jefftk (jkaufman) · 2021-12-21T13:30:03.543Z · comments (18)
Revealing Intentionality In Language Models Through AdaVAE Guided Sampling
jdp · 2023-10-20T07:32:28.749Z · comments (15)
FHI paper published in Science: interventions against COVID-19
SoerenMind · 2020-12-16T21:19:00.441Z · comments (0)
[link] The Dangers of Mirrored Life
Niko_McCarty (niko-2) · 2024-12-12T20:58:32.750Z · comments (7)
RTFB: On the New Proposed CAIP AI Bill
Zvi · 2024-04-10T18:30:08.410Z · comments (14)
Christiano, Cotra, and Yudkowsky on AI progress
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T16:45:32.482Z · comments (95)
Preventing Language Models from hiding their reasoning
Fabien Roger (Fabien) · 2023-10-31T14:34:04.633Z · comments (15)
Unwitting cult leaders
Kaj_Sotala · 2021-02-11T11:10:04.504Z · comments (9)
AI catastrophes and rogue deployments
Buck · 2024-06-03T17:04:51.206Z · comments (16)
A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)
A Significant Portion of COVID-19 Transmission Is Presymptomatic
jimrandomh · 2020-03-14T05:52:33.734Z · comments (22)
Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1
StefanHex (Stefan42) · 2023-05-09T19:41:10.528Z · comments (1)
[question] What are your greatest one-shot life improvements?
Mark Xu (mark-xu) · 2020-05-16T16:53:40.608Z · answers+comments (171)
Parable of the Dammed
johnswentworth · 2020-12-10T00:08:44.493Z · comments (29)
Full-time AGI Safety!
Steven Byrnes (steve2152) · 2021-03-01T12:42:14.813Z · comments (3)
[question] How do we prepare for final crunch time?
Eli Tyre (elityre) · 2021-03-30T05:47:54.654Z · answers+comments (30)
Why I'm joining Anthropic
evhub · 2023-01-05T01:12:13.822Z · comments (4)
AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years
basil.halperin (bhalperin) · 2023-01-10T16:06:52.329Z · comments (44)
[question] Why The Focus on Expected Utility Maximisers?
DragonGod · 2022-12-27T15:49:36.536Z · answers+comments (84)
Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (32)
The Standard Analogy
Zack_M_Davis · 2024-06-03T17:15:42.327Z · comments (28)
Mental health benefits and downsides of psychedelic use in ACX readers: survey results
RationalElf · 2021-10-25T22:55:09.522Z · comments (18)
A List of 45+ Mech Interp Project Ideas from Apollo Research’s Interpretability Team
Lee Sharkey (Lee_Sharkey) · 2024-07-18T14:15:50.248Z · comments (18)