LessWrong 2.0 Reader

The Translucent Thoughts Hypotheses and Their Implications
Fabien Roger (Fabien) · 2023-03-09T16:30:02.355Z · comments (7)

Steam
abramdemski · 2022-06-20T17:38:58.548Z · comments (13)

High-stakes alignment via adversarial training [Redwood Research report]
dmz (DMZ) · 2022-05-05T00:59:18.848Z · comments (29)

MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models"
Rob Bensinger (RobbBB) · 2021-03-05T23:43:54.186Z · comments (13)

A Semitechnical Introductory Dialogue on Solomonoff Induction
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-03-04T17:27:35.591Z · comments (33)

MIRI location optimization (and related topics) discussion
Rob Bensinger (RobbBB) · 2021-05-08T23:12:02.476Z · comments (163)

Request to AGI organizations: Share your views on pausing AI progress
[deleted] · 2023-04-11T17:30:46.707Z · comments (11)

Hashing out long-standing disagreements seems low-value to me
So8res · 2023-02-16T06:20:00.899Z · comments (34)

Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers
[deleted] · 2021-04-09T19:19:42.826Z · comments (17)

Survey: How Do Elite Chinese Students Feel About the Risks of AI?
Nick Corvino (nick-corvino) · 2024-09-02T18:11:11.867Z · comments (13)

Age changes what you care about
Dentin · 2022-10-16T15:36:36.148Z · comments (37)

How long does it take to become Gaussian?
Maxwell Peterson (maxwell-peterson) · 2020-12-08T07:23:41.725Z · comments (40)

Exercises in Comprehensive Information Gathering
johnswentworth · 2020-02-15T17:27:19.753Z · comments (18)

A revolution in philosophy: the rise of conceptual engineering
Suspended Reason (suspended-reason) · 2020-06-02T18:30:30.495Z · comments (50)

[link] Against LLM Reductionism
Erich_Grunewald · 2023-03-08T15:52:38.741Z · comments (17)

Against GDP as a metric for timelines and takeoff speeds
Daniel Kokotajlo (daniel-kokotajlo) · 2020-12-29T17:42:24.788Z · comments (19)

Graphical tensor notation for interpretability
Jordan Taylor (Nadroj) · 2023-10-04T08:04:33.341Z · comments (11)

The Parable of the Boy Who Cried 5% Chance of Wolf
KatWoods (ea247) · 2022-08-15T14:33:21.649Z · comments (24)

The Learning-Theoretic Agenda: Status 2023
Vanessa Kosoy (vanessa-kosoy) · 2023-04-19T05:21:29.177Z · comments (17)

Developmental Stages of GPTs
orthonormal · 2020-07-26T22:03:19.588Z · comments (72)

Coordination Schemes Are Capital Investments
Raemon · 2021-09-06T23:27:28.384Z · comments (31)

Another RadVac Testing Update
johnswentworth · 2021-03-23T17:29:10.741Z · comments (19)

How might we align transformative AI if it’s developed very soon?
HoldenKarnofsky · 2022-08-29T15:42:08.985Z · comments (55)

Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent
ArthurB · 2023-03-09T09:26:25.383Z · comments (33)

[question] How to Convince my Son that Drugs are Bad
concerned_dad · 2022-12-17T18:47:24.398Z · answers+comments (84)

Understanding Infra-Bayesianism: A Beginner-Friendly Video Series
Jack Parker · 2022-09-22T13:25:04.254Z · comments (6)

LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem
Steven Byrnes (steve2152) · 2023-05-08T19:35:19.180Z · comments (37)

A review of Steven Pinker's new book on rationality
Matthew Barnett (matthew-barnett) · 2021-09-29T01:29:58.151Z · comments (43)

More Is Different for AI
jsteinhardt · 2022-01-04T19:30:20.352Z · comments (24)

Fixing The Good Regulator Theorem
johnswentworth · 2021-02-09T20:30:16.888Z · comments (39)

The Curse Of The Counterfactual
pjeby · 2019-11-01T18:34:41.186Z · comments (35)

What o3 Becomes by 2028
Vladimir_Nesov · 2024-12-22T12:37:20.929Z · comments (15)

Takeoff speeds have a huge effect on what it means to work on AI x-risk
Buck · 2022-04-13T17:38:11.990Z · comments (27)

Resolve Cycles
CFAR!Duncan (CFAR 2017) · 2022-07-16T23:17:13.037Z · comments (8)

Applying traditional economic thinking to AGI: a trilemma
Steven Byrnes (steve2152) · 2025-01-13T01:23:00.397Z · comments (32)

Going Crazy and Getting Better Again
Evenstar · 2023-07-02T18:55:25.790Z · comments (13)

Category Theory Without The Baggage
johnswentworth · 2020-02-03T20:03:13.586Z · comments (51)

Working in Virtual Reality: A Review
ozziegooen · 2020-11-20T23:14:28.707Z · comments (40)

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments
Andrew_Critch · 2022-04-19T20:25:35.018Z · comments (55)

AI Timelines via Cumulative Optimization Power: Less Long, More Short
jacob_cannell · 2022-10-06T00:21:02.447Z · comments (33)

The Apprentice Thread
Zvi · 2021-06-17T13:10:01.175Z · comments (59)

What good is G-factor if you're dumped in the woods? A field report from a camp counselor.
Hastings (hastings-greer) · 2024-01-12T13:17:23.829Z · comments (22)

[link] Why didn't we get the four-hour workday?
jasoncrawford · 2023-01-06T21:29:38.995Z · comments (34)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (38)

Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (57)

[link] Anomalous tokens reveal the original identities of Instruct models
janus · 2023-02-09T01:30:56.609Z · comments (16)

A descriptive, not prescriptive, overview of current AI Alignment Research
Jan (jan-2) · 2022-06-06T21:59:22.344Z · comments (21)

ELK prize results
paulfchristiano · 2022-03-09T00:01:02.085Z · comments (50)

The theory-practice gap
Buck · 2021-09-17T22:51:46.307Z · comments (15)

I'm Sorry Fluttershy
sapphire (deluks917) · 2021-05-22T20:09:27.342Z · comments (4)


It seems to me that reflectively endorsing evil (objectively describing what is happening and then saying "this is good," rather than euphemizing and coping) is rare. Most people are altruistic on some level, but conformism overrides that, and animal abuse is so normalized that they usually don't notice it. When they see basic information such as "cows have best friends and get stressed when they are separated," they seem to become empathetic and reflective (check the comments).
