LessWrong 2.0 Reader

Inner Alignment: Explain like I'm 12 Edition
Rafael Harth (sil-ver) · 2020-08-01T15:24:33.799Z · comments (47)
There should be more AI safety orgs
Marius Hobbhahn (marius-hobbhahn) · 2023-09-21T14:53:52.779Z · comments (25)
[link] The topic is not the content
Aaron Bergman (aaronb50) · 2021-07-06T00:14:26.106Z · comments (25)
Conjecture: a retrospective after 8 months of work
Connor Leahy (NPCollapse) · 2022-11-23T17:10:23.510Z · comments (9)
Towards Developmental Interpretability
Jesse Hoogland (jhoogland) · 2023-07-12T19:33:44.788Z · comments (10)
Talking publicly about AI risk
Jan_Kulveit · 2023-04-21T11:28:16.665Z · comments (9)
[link] I still think it's very unlikely we're observing alien aircraft
dynomight · 2023-06-15T13:01:27.734Z · comments (70)
Deliberate Grieving
Raemon · 2022-05-30T20:49:19.860Z · comments (16)
re: Yudkowsky on biological materials
bhauth · 2023-12-11T13:28:10.639Z · comments (30)
Cortés, Pizarro, and Afonso as Precedents for Takeover
Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-01T03:49:44.573Z · comments (78)
When is Goodhart catastrophic?
Drake Thomas (RavenclawPrefect) · 2023-05-09T03:59:16.043Z · comments (29)
ChatGPT (and now GPT4) is very easily distracted from its rules
dmcs (dmcsh) · 2023-03-15T17:55:04.356Z · comments (42)
The prototypical catastrophic AI action is getting root access to its datacenter
Buck · 2022-06-02T23:46:31.360Z · comments (13)
A report about LessWrong karma volatility from a different universe
Ben Pace (Benito) · 2023-04-01T21:48:32.503Z · comments (7)
IMO challenge bet with Eliezer
paulfchristiano · 2022-02-26T04:50:06.033Z · comments (26)
Postmortem on DIY Recombinant Covid Vaccine
caffemacchiavelli · 2022-01-22T14:12:58.030Z · comments (27)
Intro to Naturalism: Orientation
LoganStrohl (BrienneYudkowsky) · 2022-02-13T07:52:03.503Z · comments (23)
[link] Neural networks generalize because of this one weird trick
Jesse Hoogland (jhoogland) · 2023-01-18T00:10:36.998Z · comments (29)
The Solomonoff Prior is Malign
Mark Xu (mark-xu) · 2020-10-14T01:33:58.440Z · comments (52)
Every "Every Bay Area House Party" Bay Area House Party
Richard_Ngo (ricraz) · 2024-02-16T18:53:28.567Z · comments (6)
[question] Why is o1 so deceptive?
abramdemski · 2024-09-27T17:27:35.439Z · answers+comments (24)
Announcing the Alignment Research Center
paulfchristiano · 2021-04-26T23:30:02.685Z · comments (6)
The Best Software For Every Need
Tomás B. (Bjartur Tómas) · 2021-09-10T02:40:13.731Z · comments (225)
Jean Monnet: The Guerilla Bureaucrat
Martin Sustrik (sustrik) · 2021-03-20T10:37:27.466Z · comments (25)
The salt in pasta water fallacy
Thomas Sepulchre · 2023-03-27T14:53:07.718Z · comments (42)
[link] Toward a Broader Conception of Adverse Selection
Ricki Heicklen (bayesshammai) · 2024-03-14T22:40:57.920Z · comments (61)
Alexander and Yudkowsky on AGI goals
Scott Alexander (Yvain) · 2023-01-24T21:09:16.938Z · comments (53)
LLMs Sometimes Generate Purely Negatively-Reinforced Text
Fabien Roger (Fabien) · 2023-06-16T16:31:32.848Z · comments (11)
Some conceptual alignment research projects
Richard_Ngo (ricraz) · 2022-08-25T22:51:33.478Z · comments (15)
Evaluating the historical value misspecification argument
Matthew Barnett (matthew-barnett) · 2023-10-05T18:34:15.695Z · comments (161)
[link] FHI (Future of Humanity Institute) has shut down (2005–2024)
gwern · 2024-04-17T13:54:16.791Z · comments (22)
[link] microCOVID.org: A tool to estimate COVID risk from common activities
catherio · 2020-08-29T23:01:02.081Z · comments (36)
Struggling like a Shadowmoth
Raemon · 2024-09-24T00:47:05.030Z · comments (38)
Predictive Coding has been Unified with Backpropagation
lsusr · 2021-04-02T21:42:12.937Z · comments (51)
Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems
abramdemski · 2020-09-14T22:13:01.236Z · comments (36)
7 traps that (we think) new alignment researchers often fall into
Akash (akash-wasil) · 2022-09-27T23:13:46.697Z · comments (10)
This is already your second chance
Malmesbury (Elmer of Malmesbury) · 2024-07-28T17:13:57.680Z · comments (13)
Motive Ambiguity
Zvi · 2020-12-15T18:10:01.372Z · comments (58)
AGI ruin scenarios are likely (and disjunctive)
So8res · 2022-07-27T03:21:57.615Z · comments (38)
WTH is Cerebrolysin, actually?
gsfitzgerald (neuroplume) · 2024-08-06T20:40:53.378Z · comments (23)
Decision Theory with the Magic Parts Highlighted
moridinamael · 2023-05-16T17:39:55.038Z · comments (24)
What AI Safety Materials Do ML Researchers Find Compelling?
Vael Gates · 2022-12-28T02:03:31.894Z · comments (34)
[link] [Linkpost] Introducing Superalignment
beren · 2023-07-05T18:23:18.419Z · comments (69)
Can you control the past?
Joe Carlsmith (joekc) · 2021-08-27T19:39:29.993Z · comments (90)
Choosing the Zero Point
orthonormal · 2020-04-06T23:44:02.083Z · comments (24)
Gears-Level Models are Capital Investments
johnswentworth · 2019-11-22T22:41:52.943Z · comments (28)
The next decades might be wild
Marius Hobbhahn (marius-hobbhahn) · 2022-12-15T16:10:04.750Z · comments (42)
A rough and incomplete review of some of John Wentworth's research
So8res · 2023-03-28T18:52:50.553Z · comments (18)
Book Launch: The Engines of Cognition
Ben Pace (Benito) · 2021-12-21T07:24:45.170Z · comments (56)
Specializing in Problems We Don't Understand
johnswentworth · 2021-04-10T22:40:40.690Z · comments (29)