LessWrong 2.0 Reader

Response to Aschenbrenner's "Situational Awareness"
Rob Bensinger (RobbBB) · 2024-06-06T22:57:11.737Z · comments (27)

Optimistic Assumptions, Longterm Planning, and "Cope"
Raemon · 2024-07-17T22:14:24.090Z · comments (46)

Self-Other Overlap: A Neglected Approach to AI Alignment
Marc Carauleanu (Marc-Everin Carauleanu) · 2024-07-30T16:22:29.561Z · comments (43)

How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage
orthonormal · 2024-08-06T02:32:41.364Z · comments (26)

[link] Sam Altman fired from OpenAI
LawrenceC (LawChan) · 2023-11-17T20:42:30.759Z · comments (75)

What's Going on With OpenAI's Messaging?
ozziegooen · 2024-05-21T02:22:04.171Z · comments (13)

My AI Model Delta Compared To Christiano
johnswentworth · 2024-06-12T18:19:44.768Z · comments (73)

Two easy things that maybe Just Work to improve AI discourse
jacobjacob · 2024-06-08T15:51:18.078Z · comments (35)

[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (16)

My Interview With Cade Metz on His Reporting About Slate Star Codex
Zack_M_Davis · 2024-03-26T17:18:05.114Z · comments (187)

On Not Pulling The Ladder Up Behind You
Screwtape · 2024-04-26T21:58:29.455Z · comments (21)

Announcing Timaeus
Jesse Hoogland (jhoogland) · 2023-10-22T11:59:03.938Z · comments (15)

OMMC Announces RIP
Adam Scholl (adam_scholl) · 2024-04-01T23:20:00.433Z · comments (5)

A basic systems architecture for AI agents that do autonomous research
Buck · 2024-09-23T13:58:27.185Z · comments (12)

[link] The Compendium, A full argument about extinction risk from AGI
adamShimi · 2024-10-31T12:01:51.714Z · comments (39)

[link] Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”
Ricki Heicklen (bayesshammai) · 2024-02-22T23:56:02.318Z · comments (5)

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
JanB (JanBrauner) · 2023-09-28T18:53:58.896Z · comments (38)

Thinking By The Clock
Screwtape · 2023-11-08T07:40:59.936Z · comments (27)

The other side of the tidal wave
KatjaGrace · 2023-11-03T05:40:05.363Z · comments (85)

[link] Survival without dignity
L Rudolf L (LRudL) · 2024-11-04T02:29:38.758Z · comments (5)

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer
johnswentworth · 2024-04-18T00:27:43.451Z · comments (21)

[link] Daniel Kahneman has died
DanielFilan · 2024-03-27T15:59:14.517Z · comments (11)

AI as a science, and three obstacles to alignment strategies
So8res · 2023-10-25T21:00:16.003Z · comments (80)

Humming is not a free $100 bill
Elizabeth (pktechgirl) · 2024-06-06T20:10:02.457Z · comments (6)

Introducing Alignment Stress-Testing at Anthropic
evhub · 2024-01-12T23:51:25.875Z · comments (23)

Safety consultations for AI lab employees
Zach Stein-Perlman · 2024-07-27T15:00:27.276Z · comments (4)

[link] Why I’m not a Bayesian
Richard_Ngo (ricraz) · 2024-10-06T15:22:45.644Z · comments (89)

There should be more AI safety orgs
Marius Hobbhahn (marius-hobbhahn) · 2023-09-21T14:53:52.779Z · comments (25)

"Humanity vs. AGI" Will Never Look Like "Humanity vs. AGI" to Humanity
Thane Ruthenis · 2023-12-16T20:08:39.375Z · comments (34)

re: Yudkowsky on biological materials
bhauth · 2023-12-11T13:28:10.639Z · comments (30)

The Hopium Wars: the AGI Entente Delusion
Max Tegmark (MaxTegmark) · 2024-10-13T17:00:29.033Z · comments (53)

Contra papers claiming superhuman AI forecasting
nikos (followtheargument) · 2024-09-12T18:10:50.582Z · comments (16)

Cryonics is free
Mati_Roy (MathieuRoy) · 2024-09-29T17:58:17.108Z · comments (35)

Every "Every Bay Area House Party" Bay Area House Party
Richard_Ngo (ricraz) · 2024-02-16T18:53:28.567Z · comments (6)

[link] Toward a Broader Conception of Adverse Selection
Ricki Heicklen (bayesshammai) · 2024-03-14T22:40:57.920Z · comments (61)

[question] Why is o1 so deceptive?
abramdemski · 2024-09-27T17:27:35.439Z · answers+comments (23)

[link] FHI (Future of Humanity Institute) has shut down (2005–2024)
gwern · 2024-04-17T13:54:16.791Z · comments (22)

Skills from a year of Purposeful Rationality Practice
Raemon · 2024-09-18T02:05:58.726Z · comments (18)

Struggling like a Shadowmoth
Raemon · 2024-09-24T00:47:05.030Z · comments (38)

Effective Aspersions: How the Nonlinear Investigation Went Wrong
TracingWoodgrains (tracingwoodgrains) · 2023-12-19T12:00:23.529Z · comments (170)

Architects of Our Own Demise: We Should Stop Developing AI Carelessly
Roko · 2023-10-26T00:36:05.126Z · comments (75)

WTH is Cerebrolysin, actually?
gsfitzgerald (neuroplume) · 2024-08-06T20:40:53.378Z · comments (23)

Thomas Kwa's MIRI research experience
Thomas Kwa (thomas-kwa) · 2023-10-02T16:42:37.886Z · comments (53)

Timaeus's First Four Months
Jesse Hoogland (jhoogland) · 2024-02-28T17:01:53.437Z · comments (6)

Critical review of Christiano's disagreements with Yudkowsky
Vanessa Kosoy (vanessa-kosoy) · 2023-12-27T16:02:50.499Z · comments (40)

This is already your second chance
Malmesbury (Elmer of Malmesbury) · 2024-07-28T17:13:57.680Z · comments (13)

Evaluating the historical value misspecification argument
Matthew Barnett (matthew-barnett) · 2023-10-05T18:34:15.695Z · comments (142)

[link] President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence
Tristan Williams (tristan-williams) · 2023-10-30T11:15:38.422Z · comments (39)

Did Christopher Hitchens change his mind about waterboarding?
Isaac King (KingSupernova) · 2024-09-15T08:28:09.451Z · comments (22)

'Empiricism!' as Anti-Epistemology
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-03-14T02:02:59.723Z · comments (90)

Recent comments

kvmanthinking on Chapter 27: Empathy

Harry's brain tried to calculate the ramifications and implications of this and ran out of swap space.

this is very relatable

ryankidd44 on Ryan Kidd's Shortform

I'm not sure!

yair-halberstadt on Could orcas be (trained to be) smarter than humans?

Douglas Adams answered this long ago of course:

For instance, on the planet Earth, man had always assumed that he was more intelligent than dolphins because he had achieved so much—the wheel, New York, wars and so on—whilst all the dolphins had ever done was muck about in the water having a good time. But conversely, the dolphins had always believed that they were far more intelligent than man—for precisely the same reasons.

petermccluskey on Investing for a World Transformed by AI

No, I don't recall any ethical concerns. Just basic concerns such as the difficulty of finding a boss that I'm comfortable with, having control over my hours, etc.

steve2152 on Complete Feedback

This is a confusing post from my perspective, because I think of LI as being about beliefs and corrigibility being about desires.

If I want my AGI to believe that the sky is green, I guess it’s good if it’s possible to do that. But it’s kinda weird, and not a central example of corrigibility.

Admittedly, one can try to squish beliefs and desires into the same framework. The Active Inference people do that. Does LI do that too? If so, well, I’m generally very skeptical of attempts to do that kind of thing. See here, especially Section 7. In the case of humans, it’s perfectly possible for a plan to seem desirable but not plausible, or for a plan to seem plausible but not desirable. I think there are very good reasons that our brains are set up that way.

gordon-seidoh-worley on What if muscle tension is sometimes signal jamming?

I don't know, but I can say that after a lot of hours of Alexander lessons my posture and movement improved in ways that would be described as "having less muscle tension" and this having less tension happened in conjunction with various sorts of opening and being more awake and moving closer to PNSE.

mitchell_porter on We can survive

If I understand you correctly, you want to create an unprecedentedly efficient and coordinated network, made out of intelligent people with goodwill, that will solve humanity's problems in theory and in practice?

steve2152 on What's a good book for a technically-minded 11-year old?

My 9yo has recently enjoyed Ender’s Game, Harry Potter, Hitchhiker’s Guide to the Galaxy, and What If. He recently asked to borrow my The Vital Question (it came up in conversation about abiogenesis) and he’s mostly following it so far but has occasional questions for me, we’ll see how far he gets or if he loses steam.

For non-books, he wanted to do Khan academy cosmology / astronomy, I think he did one big unit of Khan academy math before losing interest, he likes Eureka crates (little kits to build your own soap dispenser, rivet press, ukulele, whatever, they come once a month, good gift), lotsa video games, and he was doing DuoLingo Spanish every night (he has a streak, he’s a total sucker for gamification) but to my dismay decided to switch to the rather less practical DuoLingo Klingon. ¯\_(ツ)_/¯

crissman on Join a LessWrong Team for the Unaging System Challenge

Streamlined the registration page, and added a field to note that you want to join a Less Wrong team: https://www.unaging.com/unaging-system-2/

rhollerith_dot_com on Is OpenAI net negative for AI Safety?

Yes, humanity would be more likely to survive if OpenAI never existed or if it closed tomorrow.

I wonder why no one else has answered. Am I stepping on some landmine I don't know about?

What would drastically increase P(survival) is stopping all large training runs. If that turns out to be impossible, then some labs might be "better" than the average lab in that if their model is the first one to become capable of extincting humanity, it might choose not to do that because of technical details in how the lab made the model. (I don't put a lot of hope in this possibility.)

To my knowledge, no one who is not employed at OpenAI and who is not an investor in OpenAI believes that OpenAI is one of these "better" labs. OK, that is not literally true, but it is unlikely that anyone who is not employed at OpenAI and who is not an investor in OpenAI and who believes OpenAI is one of the "better" labs can provide an argument longer than 50 words that you or I would consider rational and coherent.

But even if OpenAI is shut down tomorrow and everyone working there is permanently prevented from working in AI (a pleasant thought!) the AI enterprise would still be the thorniest danger facing humanity.