LessWrong 2.0 Reader

What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (55)
Comments on Carlsmith's “Is power-seeking AI an existential risk?”
So8res · 2021-11-13T04:29:30.673Z · comments (15)
ELK prize results
paulfchristiano · 2022-03-09T00:01:02.085Z · comments (50)
The theory-practice gap
Buck · 2021-09-17T22:51:46.307Z · comments (15)
[link] EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised
gwern · 2021-11-02T02:32:41.856Z · comments (52)
Cosmopolitan values don't come free
So8res · 2023-05-31T15:58:16.974Z · comments (85)
Announcing the LessWrong Curated Podcast
Ben Pace (Benito) · 2022-06-22T22:16:58.170Z · comments (27)
[link] AI-Written Critiques Help Humans Notice Flaws
paulfchristiano · 2022-06-25T17:22:56.959Z · comments (5)
My experience using financial commitments to overcome akrasia
William Howard (william-howard) · 2024-04-15T22:57:32.574Z · comments (33)
Inner Alignment in Salt-Starved Rats
Steven Byrnes (steve2152) · 2020-11-19T02:40:10.232Z · comments (41)
Yudkowsky vs Hanson on FOOM: Whose Predictions Were Better?
1a3orn · 2023-06-01T19:36:48.351Z · comments (76)
Inflection.ai is a major AGI lab
Nikola Jurkovic (nikolaisalreadytaken) · 2023-08-09T01:05:54.604Z · comments (13)
[$10k bounty] Read and compile Robin Hanson’s best posts
Richard_Ngo (ricraz) · 2021-10-20T22:03:47.376Z · comments (29)
2020 AI Alignment Literature Review and Charity Comparison
Larks · 2020-12-21T15:27:19.303Z · comments (14)
Honoring Petrov Day on LessWrong, in 2019
Ben Pace (Benito) · 2019-09-26T09:10:27.783Z · comments (168)
Comparing Anthropic's Dictionary Learning to Ours
Robert_AIZI · 2023-10-07T23:30:32.402Z · comments (8)
Defending the non-central fallacy
Matthew Barnett (matthew-barnett) · 2021-03-09T21:42:17.068Z · comments (38)
The Seeker’s Game – Vignettes from the Bay
Yulia · 2023-07-09T19:32:58.717Z · comments (19)
Beyond Blame Minimization
physicaleconomics · 2022-03-27T00:03:31.650Z · comments (47)
[Completed] The 2024 Petrov Day Scenario
Ben Pace (Benito) · 2024-09-26T08:08:32.495Z · comments (114)
Read the Roon
Zvi · 2024-03-05T13:50:04.967Z · comments (6)
Contra EY: Can AGI destroy us without trial & error?
Nikita Sokolsky (nikita-sokolsky) · 2022-06-13T18:26:09.460Z · comments (72)
But why would the AI kill us?
So8res · 2023-04-17T18:42:39.720Z · comments (96)
Carrying the Torch: A Response to Anna Salamon by the Guild of the Rose
moridinamael · 2022-07-06T14:20:14.847Z · comments (16)
EA orgs' legal structure inhibits risk taking and information sharing on the margin
Elizabeth (pktechgirl) · 2023-11-05T19:13:56.135Z · comments (17)
The Alignment Community Is Culturally Broken
sudo · 2022-11-13T18:53:55.054Z · comments (68)
Four mindset disagreements behind existential risk disagreements in ML
Rob Bensinger (RobbBB) · 2023-04-11T04:53:48.427Z · comments (12)
[link] Ten Thousand Years of Solitude
agp (antonio-papa) · 2023-08-15T17:45:34.556Z · comments (19)
A mechanistic model of meditation
Kaj_Sotala · 2019-11-06T21:37:03.819Z · comments (11)
Covid 1/21: Turning the Corner
Zvi · 2021-01-21T16:40:00.941Z · comments (41)
Interpretability/Tool-ness/Alignment/Corrigibility are not Composable
johnswentworth · 2022-08-08T18:05:11.982Z · comments (12)
"Rationalist Discourse" Is Like "Physicist Motors"
Zack_M_Davis · 2023-02-26T05:58:29.249Z · comments (153)
[question] LessWrong Coronavirus Agenda
Elizabeth (pktechgirl) · 2020-03-18T04:48:56.769Z · answers+comments (65)
Five Ways To Prioritize Better
lynettebye · 2020-06-27T18:40:26.600Z · comments (7)
On Bounded Distrust
Zvi · 2022-02-03T14:50:00.883Z · comments (19)
Monitoring for deceptive alignment
evhub · 2022-09-08T23:07:03.327Z · comments (8)
[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty
tandem · 2025-01-07T19:11:21.238Z · comments (5)
Anomalous Tokens in DeepSeek-V3 and r1
henry (henry-bass) · 2025-01-25T22:55:41.232Z · comments (2)
Don't Dismiss Simple Alignment Approaches
Chris_Leong · 2023-10-07T00:35:26.789Z · comments (9)
The 99% principle for personal problems
Kaj_Sotala · 2023-10-02T08:20:07.379Z · comments (20)
2018 Review: Voting Results!
Ben Pace (Benito) · 2020-01-24T02:00:34.656Z · comments (59)
LessWrong Now Has Dark Mode
jimrandomh · 2022-05-10T01:21:44.065Z · comments (31)
Pretraining Language Models with Human Preferences
Tomek Korbak (tomek-korbak) · 2023-02-21T17:57:09.774Z · comments (20)
Possible takeaways from the coronavirus pandemic for slow AI takeoff
Vika · 2020-05-31T17:51:26.437Z · comments (36)
Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley
maxnadeau · 2022-10-27T01:32:44.750Z · comments (14)
Debate update: Obfuscated arguments problem
Beth Barnes (beth-barnes) · 2020-12-23T03:24:38.191Z · comments (24)
[link] Neuronpedia
Johnny Lin (hijohnnylin) · 2023-07-26T16:29:28.884Z · comments (51)
Planning for Extreme AI Risks
joshc (joshua-clymer) · 2025-01-29T18:33:14.844Z · comments (3)
An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2
Neel Nanda (neel-nanda-1) · 2024-07-07T17:39:35.064Z · comments (16)
Mechanistic anomaly detection and ELK
paulfchristiano · 2022-11-25T18:50:04.447Z · comments (22)