LessWrong 2.0 Reader

Notes on Dwarkesh Patel’s Podcast with Demis Hassabis
Zvi · 2024-03-01T16:30:08.687Z · comments (0)

Apollo Research 1-year update
Marius Hobbhahn (marius-hobbhahn) · 2024-05-29T17:44:32.484Z · comments (0)

We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"
Lukas_Gloor · 2024-05-09T15:43:11.490Z · comments (36)

Dragon Agnosticism
jefftk (jkaufman) · 2024-08-01T17:00:06.434Z · comments (75)

OpenAI: The Board Expands
Zvi · 2024-03-12T14:00:04.110Z · comments (1)

SB 1047: Final Takes and Also AB 3211
Zvi · 2024-08-27T22:10:07.647Z · comments (11)

LLMs Look Increasingly Like General Reasoners
eggsyntax · 2024-11-08T23:47:28.886Z · comments (45)

Science advances one funeral at a time
Cameron Berg (cameron-berg) · 2024-11-01T23:06:19.381Z · comments (9)

[question] Am I confused about the "malign universal prior" argument?
nostalgebraist · 2024-08-27T23:17:22.779Z · answers+comments (35)

Takeoff speeds presentation at Anthropic
Tom Davidson (tom-davidson-1) · 2024-06-04T22:46:35.448Z · comments (0)

Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry · 2024-04-29T20:57:35.127Z · comments (8)

Catastrophic sabotage as a major threat model for human-level AI systems
evhub · 2024-10-22T20:57:11.395Z · comments (11)

New page: Integrity
Zach Stein-Perlman · 2024-07-10T15:00:41.050Z · comments (3)

Introducing Squiggle AI
ozziegooen · 2025-01-03T17:53:42.915Z · comments (15)

Quotes from Leopold Aschenbrenner’s Situational Awareness Paper
Zvi · 2024-06-07T11:40:03.981Z · comments (10)

Defining alignment research
Richard_Ngo (ricraz) · 2024-08-19T20:42:29.279Z · comments (23)

Zvi’s Thoughts on His 2nd Round of SFF
Zvi · 2024-11-20T13:40:08.092Z · comments (2)

What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (23)

Circular Reasoning
abramdemski · 2024-08-05T18:10:32.736Z · comments (37)

My model of what is going on with LLMs
Cole Wyeth (Amyr) · 2025-02-13T03:43:29.447Z · comments (45)

Just admit that you’ve zoned out
joec · 2024-06-04T02:51:27.594Z · comments (22)

[link] Introducing METR's Autonomy Evaluation Resources
Megan Kinniment (megan-kinniment) · 2024-03-15T23:16:59.696Z · comments (0)

The Rising Sea
Jesse Hoogland (jhoogland) · 2025-01-25T20:48:52.971Z · comments (2)

Anvil Shortage
Screwtape · 2024-11-13T22:57:41.974Z · comments (16)

A very strange probability paradox
notfnofn · 2024-11-22T14:01:36.587Z · comments (27)

[link] "AI Safety for Fleshy Humans" an AI Safety explainer by Nicky Case
habryka (habryka4) · 2024-05-03T18:10:12.478Z · comments (11)

Thoughts on the conservative assumptions in AI control
Buck · 2025-01-17T19:23:38.575Z · comments (5)

Matryoshka Sparse Autoencoders
Noa Nabeshima (noa-nabeshima) · 2024-12-14T02:52:32.017Z · comments (15)

Natural Latents: The Concepts
johnswentworth · 2024-03-20T18:21:19.878Z · comments (18)

[link] Should you be worried about H5N1?
gw · 2024-12-05T21:11:06.996Z · comments (2)

Partial value takeover without world takeover
KatjaGrace · 2024-04-05T06:20:03.961Z · comments (23)

Rejecting Television
Declan Molony (declan-molony) · 2024-04-23T04:59:50.253Z · comments (10)

Implications of the inference scaling paradigm for AI safety
Ryan Kidd (ryankidd44) · 2025-01-14T02:14:53.562Z · comments (69)

Fake thinking and real thinking
Joe Carlsmith (joekc) · 2025-01-28T20:05:06.735Z · comments (10)

Teaching CS During Take-Off
andrew carle (andrew-carle) · 2024-05-14T22:45:39.447Z · comments (13)

AI #73: Openly Evil AI
Zvi · 2024-07-18T14:40:05.770Z · comments (20)

Tips On Empirical Research Slides
James Chua (james-chua) · 2025-01-08T05:06:44.942Z · comments (4)

Covert Malicious Finetuning
Tony Wang (tw) · 2024-07-02T02:41:51.698Z · comments (4)

[Intuitive self-models] 1. Preliminaries
Steven Byrnes (steve2152) · 2024-09-19T13:45:27.976Z · comments (23)

[link] New report: Safety Cases for AI
joshc (joshua-clymer) · 2024-03-20T16:45:27.984Z · comments (14)

Is "VNM-agent" one of several options, for what minds can grow up into?
AnnaSalamon · 2024-12-30T06:36:20.890Z · comments (54)

AIs Will Increasingly Fake Alignment
Zvi · 2024-12-24T13:00:07.770Z · comments (0)

Three Notions of "Power"
johnswentworth · 2024-10-30T06:10:08.326Z · comments (44)

Timaeus in 2024
Jesse Hoogland (jhoogland) · 2025-02-20T23:54:56.939Z · comments (1)

[link] Self-Help Corner: Loop Detection
adamShimi · 2024-10-02T08:33:23.487Z · comments (6)

Agent Foundations 2025 at CMU
Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel) · 2025-01-19T23:48:22.569Z · comments (10)

[link] On Eating the Sun
jessicata (jessica.liu.taylor) · 2025-01-08T04:57:20.457Z · comments (93)

[link] The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review
jessicata (jessica.liu.taylor) · 2024-03-27T19:59:27.893Z · comments (37)

How well do truth probes generalise?
mishajw · 2024-02-24T14:12:19.729Z · comments (11)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

Recent comments

michael-roe on [NSFW] The Subspace Jhana

I think there might be something to this, so the rest of what I have to say is nit-picking, not an objection to the basic premise.

1. In karmamudra, one imagines oneself (and one’s partner) as enlightened beings. The intention to act as you imagine an enlightened being would act might be an important safeguard against all sorts of badness.

2. An obvious question is whether chöd is rather more BDSM-y than other forms of meditation.

gurkenglas on Gauging Interest for a Learning-Theoretic Agenda Mentorship Programme

I deferred my decision until after visiting the Learning Theory course. At the time, the timing had made them seem vaguely affiliated with this programme.

genesmith on How to Make Superbabies

Ha, sadly it is a pseudonym. My parents were neither that lucky nor that prescient when it came to naming me.

ea247 on How to Make Superbabies

Very not important question: is Gene Smith your actual name or a pseudonym?

Either way, it's the perfect name for the author of this post.

Hats off to you, Gene Smith.

[Image: a muscular blacksmith in a leather apron, in a traditional forge, hammering a glowing strand of DNA into shape on an anvil as sparks fly.]

marcus-williams on Annapurna's Shortform

Personally it doesn't feel reassuring that a single person can change the production system prompt without any internal discussion/review and that they would decide to blame a single person/competitor for the problem.

vanessa-kosoy on Gauging Interest for a Learning-Theoretic Agenda Mentorship Programme

So far, interest in the programme has been modest. I would appreciate hearing from people who either (i) deliberated whether to apply and decided against it or (ii) feel that they might meet the requirements but are not interested. Specifically, what held you back, and what changes (if any) would persuade you to apply?

lsusr on Export Surplusses

It doesn't work at that small of a scale. More generally, this principle doesn't work on any scale too small to support an international industrial economy. It wouldn't even work for trade between different tribes of farmers. This is a phenomenon that you only see at very large scales of human behavior. You need massive coordination failures colliding with each other for these ideas to kick in.

cousin_it on Export Surplusses

I don't understand Eliezer's explanation. Imagine Alice is hard-working and Bob is lazy. Then Alice can make goods and sell them to Bob. Half the money she'll spend on having fun, the other half she'll save. In this situation she's rich and has a trade surplus, but the other parts of the explanation - different productivity between different parts of Alice (?) and inability to judge her own work fairly (?) - don't seem to be present.

annapurna on Annapurna's Shortform

Update:

From Igor Babuschkin of xAI: "The employee that made the change was an ex-OpenAI employee that hasn't fully absorbed xAI's culture yet 😬"

https://x.com/ibab/status/1893774017376485466?t=vqJvcSPltsMI5sdGYZJnjg&s=19

saulius on How to Make Superbabies

Thanks for clarifying. If you ever pitch your ideas to potential investors or something, I recommend you avoid talking about hundreds of embryos, at least not without acknowledging that this is unrealistic with current technologies. When reading, I was a bit worried that you might be divorced from reality, thinking in sci-fi terms, not knowing the basic realities about IVF. This made it difficult for me to trust other things you were saying about domains I know nothing about. Just letting you know in case it's helpful :)