LessWrong 2.0 Reader

Mid-Generation Self-Correction: A Simple Tool for Safer AI
MrThink (ViktorThink) · 2024-12-19T23:41:00.702Z · comments (0)

[link] How do fictional stories illustrate AI misalignment?
Vishakha (vishakha-agrawal) · 2025-01-15T06:11:44.336Z · comments (4)

Comparing the AirFanta 3Pro to the Coway AP-1512
jefftk (jkaufman) · 2024-12-16T01:40:01.522Z · comments (0)

Curriculum of Ascension
andrew sauer (andrew-sauer) · 2024-11-07T23:54:18.983Z · comments (0)

[link] Keeping Capital is the Challenge
LTM · 2025-02-03T02:04:27.142Z · comments (2)

[question] What are some scenarios where an aligned AGI actually helps humanity, but many/most people don't like it?
RomanS · 2025-01-10T18:13:11.900Z · answers+comments (6)

[link] Forecasting AGI: Insights from Prediction Markets and Metaculus
Alvin Ånestrand (alvin-anestrand) · 2025-02-04T13:03:45.927Z · comments (0)

(My) self-referential reason to believe in free will
jacek (jacek-karwowski) · 2025-01-06T23:35:02.809Z · comments (6)

Detecting out of distribution text with surprisal and entropy
Sandy Fraser (alex-fraser) · 2025-01-28T18:46:46.977Z · comments (4)

Exploring the petertodd / Leilan duality in GPT-2 and GPT-J
mwatkins · 2024-12-23T13:17:53.755Z · comments (1)

Basics of Bayesian learning
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-14T10:00:46.000Z · comments (0)

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure
DanielFilan · 2024-11-16T23:30:09.098Z · comments (0)

A Ground-Level Perspective on Capacity Building in International Development
Sean Aubin (sean-aubin) · 2025-01-05T20:36:54.308Z · comments (1)

Reflections on ML4Good
james__p · 2024-11-25T02:40:32.586Z · comments (0)

[question] AI for medical care for hard-to-treat diseases?
CronoDAS · 2025-01-10T23:55:39.902Z · answers+comments (1)

Preliminary Thoughts on Flirting Theory
Alice Blair (Diatom) · 2024-12-24T07:37:47.045Z · comments (6)

A problem shared by many different alignment targets
ThomasCederborg · 2025-01-15T14:22:12.754Z · comments (11)

[link] Linkpost: Rat Traps by Sheon Han in Asterisk Mag
Chris_Leong · 2024-12-03T03:22:45.424Z · comments (7)

How Much to Give is a Pragmatic Question
jefftk (jkaufman) · 2024-12-24T04:20:01.480Z · comments (1)

[question] Who are the worthwhile non-European pre-Industrial thinkers?
Lorec · 2024-12-03T01:45:31.445Z · answers+comments (4)

[link] Markov's Inequality Explained
criticalpoints · 2025-01-08T00:31:55.125Z · comments (2)

Playing with Otamatones
jefftk (jkaufman) · 2025-01-02T19:50:01.781Z · comments (0)

curate
technicalities · 2025-01-14T14:40:30.510Z · comments (0)

Commenting Patterns by Platform
jefftk (jkaufman) · 2024-12-01T11:50:06.932Z · comments (0)

[link] My AI timelines
samuelshadrach (xpostah) · 2024-12-22T21:06:41.722Z · comments (2)

Book review: Range by David Epstein
PatrickDFarley · 2025-01-08T04:27:26.391Z · comments (0)

Maximally Eggy Crepes
jefftk (jkaufman) · 2025-01-19T20:40:03.709Z · comments (0)

GPT-4o Can In Some Cases Solve Moderately Complicated Captchas
dirk (abandon) · 2024-11-09T04:04:37.782Z · comments (2)

Approaches to Group Singing
jefftk (jkaufman) · 2025-01-01T12:50:01.877Z · comments (1)

No Internally-Crispy Mac and Cheese
jefftk (jkaufman) · 2024-12-20T03:20:01.798Z · comments (5)

[question] Recommendations for Recent Posts/Sequences on Instrumental Rationality?
Benjamin Hendricks (benjamin-hendricks) · 2025-01-26T00:41:08.577Z · answers+comments (3)

[question] Would anyone be interested in pursuing the Virtue of Scholarship with me?
japancolorado (russell-white) · 2025-02-02T04:02:27.116Z · answers+comments (1)

Sideloading: creating a model of a person via LLM with very large prompt
avturchin · 2024-11-22T16:41:28.293Z · comments (4)

[question] How counterfactual are logical counterfactuals?
Donald Hobson (donald-hobson) · 2024-12-15T21:16:40.515Z · answers+comments (10)

Fundamental Uncertainty: Chapter 9 - How do we live with uncertainty?
Gordon Seidoh Worley (gworley) · 2024-11-07T18:15:45.049Z · comments (2)

[link] Progress links and short notes, 2024-12-27: Clinical trial abundance, grid-scale fusion, permitting vs. compliance, crossword mania, and more
jasoncrawford · 2024-12-27T23:34:43.807Z · comments (0)

Seasonal Patterns in BIDA's Attendance
jefftk (jkaufman) · 2025-02-02T02:40:03.768Z · comments (0)

Introducing the Coalition for a Baruch Plan for AI: A Call for a Radical Treaty-Making process for the Global Governance of AI
rguerreschi · 2025-01-30T15:26:09.482Z · comments (0)

My Mental Model of AI Optimist Opinions
tailcalled · 2025-01-29T18:44:36.485Z · comments (2)

The Clueless Sniper and the Principle of Indifference
Jim Buhler (jim-buhler) · 2025-01-27T11:52:57.978Z · comments (26)

The Three Warnings of the Zentradi
Trevor Hill-Hand (Jadael) · 2024-11-21T20:28:45.567Z · comments (1)

Do you need a better map of your myriad of maps to the territory?
CstineSublime · 2024-12-24T02:00:30.426Z · comments (2)

Panology
JenniferRM · 2024-12-23T21:40:14.540Z · comments (8)

[link] Uncontrollable: A Surprisingly Good Introduction to AI Risk
PeterMcCluskey · 2025-01-24T04:30:37.499Z · comments (0)

Contra Dances Getting Shorter and Earlier
jefftk (jkaufman) · 2025-01-23T23:30:03.595Z · comments (0)

Rethinking Laplace's Rule of Succession
Cleo Nardo (strawberry calm) · 2024-11-22T18:46:25.156Z · comments (5)

What does success look like?
Raymond D · 2025-01-23T17:48:35.618Z · comments (0)

Reward Bases: A simple mechanism for adaptive acquisition of multiple reward type
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-11-23T12:45:01.067Z · comments (0)

LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (7)

7. Iterate the Game: Racing Where?
Allison Duettmann (allison-duettmann) · 2025-01-02T19:06:22.165Z · comments (0)

Recent comments

ruby on Open Thread Winter 2024/2025

Welcome! Don't be too worried, you can try posting some stuff and see how it's received. Based on how you wrote this comment, I think you won't have much trouble. The New User Guide and other stuff gets worded a bit sternly because of the people who tend not to put in much effort at all and expect to be well received – which doesn't sound like you at all. It's hard to write one document that's stern to those who need it and more welcoming to those who need that, unfortunately.

tsvibt on ozziegooen's Shortform

"I assume that at [year(TAI) - 3] we'll have a decent idea of what's needed"

Why?? What happened to the bitter lesson?

jbash on artifex0's Shortform

"My gut says it's now at least 5%, which seems easily high enough to start putting together an emigration plan. Is that alarmist?"

That's a crazy low probability.

"More generally, what would be an appropriate smoke alarm for this sort of thing?"

You're already beyond the "smoke alarm" stage and into the "worrying whether the fire extinguisher will work" stage.

chris_leong on Chris_Leong's Shortform

Thanks, this seems pretty good on a quick skim. I'm a bit less certain about the corrigibility section, and more issues might become apparent if I read through it more slowly.

perry-cai on Perry Cai's Shortform

Anyone have a logical solution to exactly why we should act altruistically? I know it makes sense evolutionarily through game theory and statistics, but human decision making is still controlled by emotions, and it's still most advantageous for an individual actor to follow their own self-interest to a degree in a social community. I know how altruistic actors develop, but not why unconstrained intelligences should choose to do so.

milan-w on Introducing Collective Action for Existential Safety: 80+ actions individuals, organizations, and nations can take to improve our existential safety

I am very much not a fan of this project's website linking to its leaders' coaching program. I'm sorry, but that just screams grift. James Norris: If you are even a real person (your personal website screams SEO-optimized LLM slop), please do better. You are being actively unhelpful.

aram-panasenco on AGI Ruin: A List of Lethalities

"if there are any survivors, you solved alignment"

I believe deploying the Observer satisfies this requirement. The Observer is an ASI that's interested in the continuation of humanity's story. It will intervene and not let humanity get wiped out, though it gets to choose how many casualties there are before it intervenes, which could well be in the billions.

seth-herd on OpenAI releases deep research agent

Apparently people have been trying to do such comparisons:

Hugging Face researchers aim to build an ‘open’ version of OpenAI’s deep research tool

nicholas-heather-kross on We Fell For It

In hindsight, I over-updated on my previous success with a poorly-written angry short post with a clickbait title and lots of inline links criticizing the rationality community. Oops.

jack-vandrunen on Subjective Naturalism in Decision Theory: Savage vs. Jeffrey–Bolker

You might be interested in this paper by Wolfgang Spohn on auto-epistemology and Sleeping Beauty (and related) problems (Sleeping Beauty starts on p. 388). Auto-epistemic models have more machinery than the basic model described in this post has, but I'm not sure there's anything special about your example that prevents it being modeled in a similar way.