LessWrong 2.0 Reader

[link] Podcast with Oli Habryka on LessWrong / Lightcone Infrastructure
DanielFilan · 2023-02-05T02:52:06.632Z · comments (20)

[question] Seriously, what goes wrong with "reward the agent when it makes you smile"?
TurnTrout · 2022-08-11T22:22:32.198Z · answers+comments (42)

Let's Terraform West Texas
blackstampede · 2022-09-04T16:24:15.151Z · comments (33)

How well do truth probes generalise?
mishajw · 2024-02-24T14:12:19.729Z · comments (11)

[link] Google's PaLM-E: An Embodied Multimodal Language Model
SandXbox (PandaFusion) · 2023-03-07T04:11:18.183Z · comments (7)

I'm a bit skeptical of AlphaFold 3
Oleg Trott (oleg-trott) · 2024-06-25T00:04:41.274Z · comments (14)

Bioinfohazards
Spiracular · 2019-09-17T02:41:30.175Z · comments (14)

Clarifying the palatability theory of obesity
Matthew Barnett (matthew-barnett) · 2022-02-10T19:16:03.555Z · comments (19)

[link] Re: Anthropic's suggested SB-1047 amendments
RobertM (T3t) · 2024-07-27T22:32:39.447Z · comments (13)

[link] More Hyphenation
Arjun Panickssery (arjun-panickssery) · 2024-02-07T19:43:29.086Z · comments (19)

But exactly how complex and fragile?
KatjaGrace · 2019-11-03T18:20:01.268Z · comments (32)

2024 Petrov Day Retrospective
Ben Pace (Benito) · 2024-09-28T21:30:14.952Z · comments (25)

This is my 100ᵗʰ post on Less Wrong
lsusr · 2021-02-14T03:11:44.055Z · comments (9)

A Bayesian Aggregation Paradox
Jsevillamol · 2021-11-22T10:39:59.935Z · comments (23)

Coronavirus Justified Practical Advice Summary
Elizabeth (pktechgirl) · 2020-03-15T22:25:17.492Z · comments (53)

Covid-19 Points of Leverage, Travel Bans and Eradication
Roko · 2020-03-19T09:08:28.846Z · comments (48)

How DeepMind's Generally Capable Agents Were Trained
1a3orn · 2021-08-20T18:52:52.512Z · comments (6)

Helping the kids post
jefftk (jkaufman) · 2020-04-17T21:40:03.774Z · comments (9)

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators
Eric Neyman (UnexpectedValues) · 2024-10-07T19:29:29.033Z · comments (2)

PSA: The Sequences don't need to be read in sequence
kave · 2022-05-23T02:53:41.957Z · comments (7)

Naturalism
LoganStrohl (BrienneYudkowsky) · 2022-02-24T19:45:21.659Z · comments (22)

Growth and Form in a Toy Model of Superposition
Liam Carroll (liam-carroll) · 2023-11-08T11:08:04.359Z · comments (7)

Request for proposals for projects in AI alignment that work with deep learning systems
abergal · 2021-10-29T07:26:58.754Z · comments (0)

Why Balsa Research is Worthwhile
Zvi · 2022-10-10T13:50:00.950Z · comments (12)

Covid 5/6: Vaccine Patent Suspension
Zvi · 2021-05-06T20:20:00.639Z · comments (58)

Ceiling Air Purifier
jefftk (jkaufman) · 2022-05-30T19:20:02.130Z · comments (11)

Testing The Natural Abstraction Hypothesis: Project Update
johnswentworth · 2021-09-20T03:44:43.061Z · comments (17)

Solving adversarial attacks in computer vision as a baby version of general AI alignment
Stanislav Fort (stanislavfort) · 2024-08-29T17:17:47.136Z · comments (8)

Two explanations for variation in human abilities
Matthew Barnett (matthew-barnett) · 2019-10-25T22:06:26.329Z · comments (28)

You can use GPT-4 to create prompt injections against GPT-4
WitchBOT · 2023-04-06T20:39:51.584Z · comments (7)

Actually possible: thoughts on Utopia
Joe Carlsmith (joekc) · 2021-01-18T08:27:39.428Z · comments (7)

Alignment versus AI Alignment
Alex Flint (alexflint) · 2022-02-04T22:59:09.794Z · comments (15)

[link] Self-Help Corner: Loop Detection
adamShimi · 2024-10-02T08:33:23.487Z · comments (6)

Polysemanticity and Capacity in Neural Networks
Buck · 2022-10-07T17:51:06.686Z · comments (14)

Practical Pitfalls of Causal Scrubbing
Jérémy Scheurer (JerrySch) · 2023-03-27T07:47:31.309Z · comments (17)

AGI safety from first principles: Superintelligence
Richard_Ngo (ricraz) · 2020-09-28T19:53:40.888Z · comments (8)

You’re Measuring Model Complexity Wrong
Jesse Hoogland (jhoogland) · 2023-10-11T11:46:12.466Z · comments (15)

[link] Singularities against the Singularity: Announcing Workshop on Singular Learning Theory and Alignment
Jesse Hoogland (jhoogland) · 2023-04-01T09:58:22.764Z · comments (0)

A Confession about the LessWrong Team
Ruby · 2023-04-01T21:47:11.572Z · comments (5)

[link] Detecting Genetically Engineered Viruses With Metagenomic Sequencing
jefftk (jkaufman) · 2024-06-27T14:01:34.868Z · comments (10)

New User's Guide to LessWrong
Ruby · 2023-05-17T00:55:49.814Z · comments (50)

How to Diversify Conceptual Alignment: the Model Behind Refine
adamShimi · 2022-07-20T10:44:02.637Z · comments (11)

We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"
Lukas_Gloor · 2024-05-09T15:43:11.490Z · comments (36)

Apply to be a Safety Engineer at Lockheed Martin!
yanni kyriacos (yanni) · 2024-03-31T21:02:08.499Z · comments (3)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

MATS Models
johnswentworth · 2022-07-09T00:14:24.812Z · comments (5)

Ineffective Altruism
lsusr · 2022-04-23T22:07:09.946Z · comments (17)

Novum Organum: Introduction
Ruby · 2019-09-19T22:34:21.223Z · comments (5)

[link] Benchmarks for Detecting Measurement Tampering [Redwood Research]
ryan_greenblatt · 2023-09-05T16:44:48.032Z · comments (19)

The Aspiring Rationalist Congregation
maia · 2024-01-10T22:52:54.298Z · comments (23)

I say "some version of" to allow for a distinction between (a) the "larger argument" of Eliezer_2007's which this post was meant to support in 2007, and (b) whatever version of the same "larger argument" was a standard MIRI position as of roughly 2016-2017.

As far as I can tell, Matthew is only interested in evaluating the 2016-2017 MIRI position, not the 2007 EY position (insofar as the latter is different, if in fact it is). When he cites older EY material, he does so as a means to an end: either as indirect evidence of later MIRI positions, or because it was itself cited in the later MIRI material which is his main topic.

Note that the current version of Matthew's 2023 post includes multiple caveats that he's not making the mistake referred to in the May 2024 update.

Note also that Matthew's post only mentions this post in two relatively minor ways, first to clarify that he doesn't make the mistake referred to in the update (unlike some "Non-MIRI people" who do make the mistake), and second to support an argument about whether "Yudkowsky and other MIRI people" believe that it could be sufficient to get a single human's values into the AI, or whether something like CEV would be required instead.

I bring up the mentions of this post in Matthew's post in order to clarify what role "is 'The Hidden Complexity of Wishes' correct in isolation, considered apart from anything outside it?" plays in Matthew's critique – namely, none at all, IIUC.

(I realize that Matthew's post has been edited over time, so I can only speak to the current version.)

To be fully explicit: I'm not claiming anything about whether the May 2024 update was about Matthew's 2023 post (alone or in combination with anything else). I'm just rephrasing what Matthew said in the first comment of this thread (which was also agnostic on the topic of whether the update referred to him).
