LessWrong 2.0 Reader

There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs
Taran · 2023-02-19T12:25:52.212Z · comments (34)

[link] Fiber arts, mysterious dodecahedrons, and waiting on “Eureka!”
eukaryote · 2022-08-04T20:37:59.388Z · comments (15)

[question] What do coherence arguments actually prove about agentic behavior?
[deleted] · 2024-06-01T09:37:28.451Z · answers+comments (37)

My Effortless Weightloss Story: A Quick Runthrough
CuoreDiVetro · 2023-09-30T23:02:45.128Z · comments (78)

[link] Did ChatGPT just gaslight me?
TW123 (ThomasWoodside) · 2022-12-01T05:41:46.560Z · comments (45)

[link] Responsible Scaling Policies Are Risk Management Done Wrong
simeon_c (WayZ) · 2023-10-25T23:46:34.247Z · comments (35)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (13)

[link] Introducing the Center for AI Policy (& we're hiring!)
Thomas Larsen (thomas-larsen) · 2023-08-28T21:17:11.703Z · comments (50)

Deep Forgetting & Unlearning for Safely-Scoped LLMs
scasper · 2023-12-05T16:48:18.177Z · comments (30)

Reward Is Not Enough
Steven Byrnes (steve2152) · 2021-06-16T13:52:33.745Z · comments (19)

Goodhart's Law inside the human mind
Kaj_Sotala · 2023-04-17T13:48:13.183Z · comments (13)

High schoolers can apply to the Atlas Fellowship: $50k scholarship + summer program
sydney (sydney-von-arx) · 2022-04-03T00:53:05.397Z · comments (18)

$250 prize for checking Jake Cannell's Brain Efficiency
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2023-04-26T16:21:06.035Z · comments (170)

Let's See You Write That Corrigibility Tag
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-06-19T21:11:03.505Z · comments (70)

[link] The 300-year journey to the covid vaccine
jasoncrawford · 2020-11-09T23:06:45.790Z · comments (9)

In Defense of Chatbot Romance
Kaj_Sotala · 2023-02-11T14:30:05.696Z · comments (52)

Natural Latents: The Math
johnswentworth · 2023-12-27T19:03:01.923Z · comments (40)

[link] Report on Frontier Model Training
YafahEdelman (yafah-edelman-1) · 2023-08-30T20:02:46.317Z · comments (21)

Law of No Evidence
Zvi · 2021-12-20T13:50:01.189Z · comments (20)

Principles for Alignment/Agency Projects
johnswentworth · 2022-07-07T02:07:36.156Z · comments (20)

Book review: The Checklist Manifesto
Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2021-09-17T23:09:09.590Z · comments (13)

An Update on Academia vs. Industry (one year into my faculty job)
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2022-09-03T20:43:37.701Z · comments (18)

Book review: "A Thousand Brains" by Jeff Hawkins
Steven Byrnes (steve2152) · 2021-03-04T05:10:44.929Z · comments (18)

[link] Who regulates the regulators? We need to go beyond the review-and-approval paradigm
jasoncrawford · 2023-05-04T22:11:17.465Z · comments (29)

Awakening
lsusr · 2024-05-30T07:03:00.821Z · comments (79)

How bad a future do ML researchers expect?
KatjaGrace · 2023-03-09T04:50:05.122Z · comments (8)

Why I take short timelines seriously
NicholasKees (nick_kees) · 2024-01-28T22:27:21.098Z · comments (29)

Reducing sycophancy and improving honesty via activation steering
Nina Panickssery (NinaR) · 2023-07-28T02:46:23.122Z · comments (18)

What Comes After Epistemic Spot Checks?
Elizabeth (pktechgirl) · 2019-10-22T17:00:00.758Z · comments (9)

[question] What will 2040 probably look like assuming no singularity?
Daniel Kokotajlo (daniel-kokotajlo) · 2021-05-16T22:10:38.542Z · answers+comments (86)

[link] Investigating the Chart of the Century: Why is food so expensive?
Maxwell Tabarrok (maxwell-tabarrok) · 2024-08-16T13:21:23.596Z · comments (26)

[link] The smallest possible button (or: moth traps!)
Neil (neil-warren) · 2023-09-02T15:24:20.453Z · comments (18)

Do you believe in hundred dollar bills lying on the ground? Consider humming
Elizabeth (pktechgirl) · 2024-05-16T00:00:05.257Z · comments (22)

Greyed Out Options
ozymandias · 2022-04-04T20:43:13.566Z · comments (12)

Transcript of Sam Altman's interview touching on AI safety
Andy_McKenzie · 2023-01-20T16:14:18.974Z · comments (42)

Harms and possibilities of schooling
TsviBT · 2022-02-22T07:48:09.542Z · comments (38)

GPT-175bee
Adam Scherlis (adam-scherlis) · 2023-02-08T18:58:01.364Z · comments (14)

LW Petrov Day 2022 (Monday, 9/26)
Ruby · 2022-09-22T02:56:19.738Z · comments (111)

A proposed method for forecasting transformative AI
Matthew Barnett (matthew-barnett) · 2023-02-10T19:34:01.358Z · comments (21)

I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines
307th · 2023-10-20T16:37:46.541Z · comments (33)

[link] Philosophy of Therapy
DaystarEld · 2020-10-10T20:12:38.204Z · comments (27)

[link] My Number 1 Epistemology Book Recommendation: Inventing Temperature
adamShimi · 2024-09-08T14:30:40.456Z · comments (18)

Soares, Tallinn, and Yudkowsky discuss AGI cognition
So8res · 2021-11-29T19:26:33.232Z · comments (39)

Ukraine Situation Report 2022/03/01
lsusr · 2022-03-02T05:07:59.763Z · comments (59)

Choice Writings of Dominic Cummings
Connor_Flexman · 2021-10-13T02:41:44.291Z · comments (75)

Why was the AI Alignment community so unprepared for this moment?
Ras1513 · 2023-07-15T00:26:29.769Z · comments (65)

[New LW Feature] "Debates"
Ruby · 2023-04-01T07:00:24.466Z · comments (35)

My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda
Chi Nguyen · 2020-08-15T20:02:00.205Z · comments (20)

Geometric Exploration, Arithmetic Exploitation
Scott Garrabrant · 2022-11-24T15:36:30.334Z · comments (4)

Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions)
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2021-10-11T07:16:45.950Z · comments (36)
