LessWrong 2.0 Reader

AGI Ruin: A List of Lethalities
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-06-05T22:05:52.224Z · comments (690)
Where I agree and disagree with Eliezer
paulfchristiano · 2022-06-19T19:15:55.698Z · comments (219)
What an actually pessimistic containment strategy looks like
lc · 2022-04-05T00:19:50.212Z · comments (138)
[link] Simulators
janus · 2022-09-02T12:45:33.723Z · comments (161)
Let’s think about slowing down AI
KatjaGrace · 2022-12-22T17:40:04.787Z · comments (183)
The Redaction Machine
Ben (ben-lang) · 2022-09-20T22:03:15.309Z · comments (46)
[link] Luck based medicine: my resentful story of becoming a medical miracle
Elizabeth (pktechgirl) · 2022-10-16T17:40:03.702Z · comments (119)
Losing the root for the tree
Adam Zerner (adamzerner) · 2022-09-20T04:53:53.435Z · comments (30)
It’s Probably Not Lithium
Natália (Natália Mendonça) · 2022-06-28T21:24:10.246Z · comments (186)
Counter-theses on Sleep
Natália (Natália Mendonça) · 2022-03-21T23:21:07.943Z · comments (131)
(My understanding of) What Everyone in Technical Alignment is Doing and Why
Thomas Larsen (thomas-larsen) · 2022-08-29T01:23:58.073Z · comments (89)
chinchilla's wild implications
nostalgebraist · 2022-07-31T01:18:28.254Z · comments (128)
[link] It Looks Like You're Trying To Take Over The World
gwern · 2022-03-09T16:35:35.326Z · comments (120)
[link] Reflections on six months of fatherhood
jasoncrawford · 2022-01-31T05:28:09.154Z · comments (24)
DeepMind alignment team opinions on AGI ruin arguments
Vika · 2022-08-12T21:06:40.582Z · comments (37)
Lies Told To Children
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-04-14T11:25:10.282Z · comments (94)
[link] A Mechanistic Interpretability Analysis of Grokking
Neel Nanda (neel-nanda-1) · 2022-08-15T02:41:36.245Z · comments (47)
Counterarguments to the basic AI x-risk case
KatjaGrace · 2022-10-14T13:00:05.903Z · comments (124)
You Are Not Measuring What You Think You Are Measuring
johnswentworth · 2022-09-20T20:04:22.899Z · comments (44)
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra (ajeya-cotra) · 2022-07-18T19:06:14.670Z · comments (94)
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
elspood · 2022-06-21T23:55:39.918Z · comments (42)
Accounting For College Costs
johnswentworth · 2022-04-01T17:28:19.409Z · comments (41)
What DALL-E 2 can and cannot do
Swimmer963 (Miranda Dixon-Luinenburg) · 2022-05-01T23:51:22.310Z · comments (303)
Reward is not the optimization target
TurnTrout · 2022-07-25T00:03:18.307Z · comments (123)
MIRI announces new "Death With Dignity" strategy
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-04-02T00:43:19.814Z · comments (543)
Beware boasting about non-existent forecasting track records
Jotto999 · 2022-05-20T19:20:03.854Z · comments (112)
What should you change in response to an "emergency"? And AI risk
AnnaSalamon · 2022-07-18T01:11:14.667Z · comments (60)
Why I think strong general AI is coming soon
porby · 2022-09-28T05:40:38.395Z · comments (139)
Staring into the abyss as a core life skill
benkuhn · 2022-12-22T15:30:05.093Z · comments (20)
Looking back on my alignment PhD
TurnTrout · 2022-07-01T03:19:59.497Z · comments (63)
Models Don't "Get Reward"
Sam Ringer · 2022-12-30T10:37:11.798Z · comments (61)
Epistemic Legibility
Elizabeth (pktechgirl) · 2022-02-09T18:10:06.591Z · comments (30)
Optimality is the tiger, and agents are its teeth
Veedrac · 2022-04-02T00:46:27.138Z · comments (42)
On how various plans miss the hard bits of the alignment challenge
So8res · 2022-07-12T02:49:50.454Z · comments (88)
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger (RobbBB) · 2022-12-01T23:11:44.279Z · comments (33)
Six Dimensions of Operational Adequacy in AGI Projects
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-05-30T17:00:30.833Z · comments (66)
Why Agent Foundations? An Overly Abstract Explanation
johnswentworth · 2022-03-25T23:17:10.324Z · comments (56)
Two-year update on my personal AI timelines
Ajeya Cotra (ajeya-cotra) · 2022-08-02T23:07:48.698Z · comments (60)
Mysteries of mode collapse
janus · 2022-11-08T10:37:57.760Z · comments (56)
A central AI alignment problem: capabilities generalization, and the sharp left turn
So8res · 2022-06-15T13:10:18.658Z · comments (53)
We Choose To Align AI
johnswentworth · 2022-01-01T20:06:23.307Z · comments (16)
Is AI Progress Impossible To Predict?
alyssavance · 2022-05-15T18:30:12.103Z · comments (39)
What Are You Tracking In Your Head?
johnswentworth · 2022-06-28T19:30:06.164Z · comments (81)
Sazen
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2022-12-21T07:54:51.415Z · comments (83)
Don't die with dignity; instead play to your outs
Jeffrey Ladish (jeff-ladish) · 2022-04-06T07:53:05.172Z · comments (59)
Toni Kurz and the Insanity of Climbing Mountains
GeneSmith · 2022-07-03T20:51:58.429Z · comments (67)
Humans are very reliable agents
alyssavance · 2022-06-16T22:02:10.892Z · comments (35)
12 interesting things I learned studying the discovery of nature's laws
Ben Pace (Benito) · 2022-02-19T23:39:47.841Z · comments (40)
Changing the world through slack & hobbies
Steven Byrnes (steve2152) · 2022-07-21T18:11:05.636Z · comments (13)
Safetywashing
Adam Scholl (adam_scholl) · 2022-07-01T11:56:33.495Z · comments (20)