LessWrong 2.0 Reader

[Linkpost] Visual roadmap to strong human germline engineering
TsviBT · 2025-04-05T22:22:57.744Z · comments (0)

Selection Pressures on LM Personas
Raymond D · 2025-03-28T20:33:09.918Z · comments (0)

How to evaluate control measures for LLM agents? A trajectory from today to superintelligence
Tomek Korbak (tomek-korbak) · 2025-04-14T16:45:46.584Z · comments (1)

[link] Unbendable Arm as Test Case for Religious Belief
Ivan Vendrov (ivan-vendrov) · 2025-04-14T01:57:12.013Z · comments (46)

MONA: Three Month Later - Updates and Steganography Without Optimization Pressure
David Lindner · 2025-04-12T23:15:07.964Z · comments (0)

[link] Reasoning models don't always say what they think
Joe Benton · 2025-04-09T19:48:58.733Z · comments (4)

AI #112: Release the Everything
Zvi · 2025-04-17T15:10:02.029Z · comments (6)

How much does it cost to back up solar with batteries?
jasoncrawford · 2025-03-25T16:35:52.834Z · comments (6)

GPT-4.1 Is a Mini Upgrade
Zvi · 2025-04-16T19:00:03.181Z · comments (6)

Thoughts on the Double Impact Project
Mati_Roy (MathieuRoy) · 2025-04-13T19:07:57.687Z · comments (10)

[link] Fundraising for Mox: coworking & events in SF
Austin Chen (austin-chen) · 2025-03-31T18:25:03.571Z · comments (0)

[link] OpenAI lost $5 billion in 2024 (and its losses are increasing)
Remmelt (remmelt-ellen) · 2025-03-31T04:17:27.242Z · comments (15)

Introducing WAIT to Save Humanity
carterallen · 2025-04-01T21:47:17.857Z · comments (1)

AI #111: Giving Us Pause
Zvi · 2025-04-10T14:00:04.194Z · comments (4)

Changing my mind about Christiano's malign prior argument
Cole Wyeth (Amyr) · 2025-04-04T00:54:44.199Z · comments (34)

AI could cause a drop in GDP, even if markets are competitive and efficient
Casey Barkan (casey-barkan) · 2025-04-10T22:35:16.290Z · comments (0)

[link] Understanding and overcoming AGI apathy
Dhruv Sumathi (dhruv-sumathi) · 2025-04-17T01:04:53.853Z · comments (1)

Explaining the Joke: Pausing is The Way
WillPetillo · 2025-04-04T09:04:38.847Z · comments (2)

[link] Nucleic Acid Observatory Updates, April 2025
jefftk (jkaufman) · 2025-04-15T18:58:29.839Z · comments (0)

Navigation by Moonlight
Jacob Falkovich (Jacobian) · 2025-04-07T15:32:17.353Z · comments (39)

[question] What faithfulness metrics should general claims about CoT faithfulness be based upon?
Rauno Arike (rauno-arike) · 2025-04-08T15:27:20.346Z · answers+comments (0)

Against podcasts
Adam Zerner (adamzerner) · 2025-04-05T19:20:00.716Z · comments (19)

[link] Forging A New AGI Social Contract
Deric Cheng (deric-cheng) · 2025-04-10T13:41:11.817Z · comments (3)

How to mitigate sandbagging
Teun van der Weij (teun-van-der-weij) · 2025-03-23T17:19:07.452Z · comments (0)

Monthly Roundup #29: April 2025
Zvi · 2025-04-14T11:50:02.324Z · comments (6)

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability
DanielFilan · 2025-03-28T18:40:01.856Z · comments (0)

The Last Light
Bridgett Kay (bridgett-kay) · 2025-04-14T15:41:02.745Z · comments (2)

[question] LessWrong merch?
Brendan Long (korin43) · 2025-04-03T21:51:47.190Z · answers+comments (2)

[link] Currency Collapse
prue (prue0) · 2025-04-11T03:48:01.469Z · comments (3)

Prodromes and Biomarkers in Chronic Disease
sarahconstantin · 2025-04-16T21:30:02.978Z · comments (2)

A Bunch of Matryoshka SAEs
chanind · 2025-04-04T14:53:56.805Z · comments (0)

The Leapfrogging Terminus and the Fuzzy Cut
Jim Pivarski (jim-pivarski) · 2025-03-31T04:08:24.023Z · comments (6)

Notes on handling non-concentrated failures with AI control: high level methods and different regimes
ryan_greenblatt · 2025-03-24T01:00:38.222Z · comments (3)

[question] Does the AI control agenda broadly rely on no FOOM being possible?
Noosphere89 (sharmake-farah) · 2025-03-29T19:38:23.971Z · answers+comments (3)

Introduction to Representing Sentences as Logical Statements
Towards_Keeperhood (Simon Skade) · 2025-04-05T20:35:31.422Z · comments (9)

Interesting ACX 2024 Book Review Entries
jenn (pixx) · 2025-04-20T18:10:04.973Z · comments (1)

Understanding Trust: Overview Presentations
abramdemski · 2025-04-16T18:08:31.064Z · comments (0)

Doing principle-of-charity better
Sniffnoy · 2025-03-27T05:19:52.195Z · comments (1)

Grok3 On Kant On AI Slavery
JenniferRM · 2025-04-01T04:10:48.093Z · comments (3)

Opportunity Space: Renormalization for AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T20:55:52.155Z · comments (0)

Why Were We Wrong About China and AI? A Case Study in Failed Rationality
thedudeabides · 2025-03-22T05:13:52.181Z · comments (38)

[link] "Long" timelines to advanced AI have gotten crazy short
Matrice Jacobine · 2025-04-03T22:46:39.416Z · comments (0)

Spending on Ourselves
jefftk (jkaufman) · 2025-04-20T18:40:07.988Z · comments (0)

[link] Ferrer, Pilar, and Me
Askwho · 2025-04-06T11:22:57.758Z · comments (1)

Consequentialism is for making decisions
Sniffnoy · 2025-03-27T04:00:07.020Z · comments (9)

[Replication] Crosscoder-based Stage-Wise Model Diffing
annas (annasoli) · 2025-03-22T18:35:19.003Z · comments (0)

Towards an understanding of the Chinese AI scene
Mitchell_Porter · 2025-03-24T09:10:19.498Z · comments (0)

Feature Hedging: Another way correlated features break SAEs
chanind · 2025-03-25T14:33:08.694Z · comments (0)

[link] Inside OpenAI's Controversial Plan to Abandon its Nonprofit Roots
garrison · 2025-04-18T18:46:57.310Z · comments (0)

Leverage, Exit Costs, and Anger: Re-examining Why We Explode at Home, Not at Work
at_the_zoo · 2025-04-01T18:28:26.611Z · comments (2)

Recent comments

chris_leong on aog's Shortform

The root cause may be that there is too much inferential distance

Perhaps, although I generally become more sympathetic to someone's point of view the more I read from them.

And it's part of why I think it's useful to create scenes that operate on different worldview assumptions: it's worth working out the implications of specific beliefs without needing to justify those beliefs each time.

I used to lean more strongly towards more schools of thought being good; however, I've updated slightly on the margin towards believing that some schools of thought just end up muddying the waters.

That said, Epoch has done some great research, so I'm overall happy the scene exists. And I think Matthew Barnett is extremely talented; I just think he's unfortunately become confused.

mikhail-samin on VDT: a solution to decision theory

This paradox doesn't occur because a computation trying to prove its own output (and give the opposite output) will have to simulate itself

Due to Löb, if a computation knows that if it finds a proof that it outputs A, then it will output A, then it proves that it outputs A, without any need for recursion. This is why you really shouldn’t output something just because you’ve proved that you will.
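
For reference, the step this comment relies on can be written in standard provability-logic notation. This is a sketch of Löb's theorem only; the symbols $T$ (the proof system the computation reasons in), $\Box$ ("provable in $T$") and $P$ (the statement "the computation outputs A") are assumptions introduced here and do not appear in the comment.

```latex
% Löb's theorem (rule form): if T proves "provability of P implies P",
% then T already proves P.
\[
  T \vdash (\Box P \to P) \quad\Longrightarrow\quad T \vdash P
\]
% Internalized form, provable inside T itself:
\[
  T \vdash \Box(\Box P \to P) \to \Box P
\]
% Reading P as "the computation outputs A": a computation built so that
% finding a proof of P makes it output A satisfies the premise,
% so P is provable with no self-simulation required.
```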

mikhail-samin on VDT: a solution to decision theory

Yeah, from the claim that pi starts with two you can easily prove anything. But I think:

(1) something like logical induction should somewhat help: maybe the agent doesn’t know whether some statement is true and isn’t going to run for long enough to start encountering contradictions.

(2) Omega can also maybe intervene on the agent’s experience/knowledge of more accessible logical statements while leaving other things intact, sort of like making you experience what Eliezer describes here as convincing that 2+2=3: https://www.lesswrong.com/posts/6FmqiAgS8h4EJm86s/how-to-convince-me-that-2-2-3, and if that’s what it is doing, we should basically ignore our knowledge of maths for the purpose of thinking about logical counterfactuals.

snewman on AI 2027 is a Bet Against Amdahl's Law

That's not how the math works. Suppose there are 200 activities under the heading of "AI R&D" that each comprise at least 0.1% of the workload. Suppose we reach a point where AI is vastly superhuman at 150 of those activities (which would include any activities that humans are particularly bad at), moderately superhuman at 40 more, and not much better than human (or even worse than human) at the remaining 10. Those 10 activities where AI is not providing much uplift comprise at least 1% of the AI R&D workload, and so progress can be accelerated at most 100x.

This is oversimplified; there is some room for superhuman ability in one area (making excellent choices of experiments to run) to compensate for a lack of uplift in others (time to code and execute individual experiments). But the fundamental point remains: a complex process can be bottlenecked by its slowest step. Amdahl's Law is not symmetric: a chain can't be as strong as its strongest link.
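
The arithmetic in this comment can be checked with a minimal sketch of Amdahl's Law. The function name `amdahl_speedup`, the three-way grouping of activities, and the specific workload shares and per-group speedups below are illustrative assumptions; only the 1% bottleneck share and the resulting 100x ceiling come from the comment.

```python
import math


def amdahl_speedup(fractions, speedups):
    """Overall speedup when each workload fraction is accelerated separately.

    fractions: share of the original workload per group (must sum to 1).
    speedups:  acceleration factor per group (must be > 0; inf is allowed).
    """
    if len(fractions) != len(speedups):
        raise ValueError("fractions and speedups must have the same length")
    if not math.isclose(sum(fractions), 1.0, abs_tol=1e-9):
        raise ValueError("fractions must sum to 1")
    if any(s <= 0 for s in speedups):
        raise ValueError("speedups must be positive")
    # New total time is the sum of each group's share divided by its speedup.
    new_time = sum(f / s for f, s in zip(fractions, speedups))
    return 1.0 / new_time


# Ceiling: 1% of the work gets no uplift, everything else becomes free.
ceiling = amdahl_speedup([0.99, 0.01], [float("inf"), 1.0])

# Hypothetical split: 80% vastly accelerated, 19% moderately, 1% not at all.
example = amdahl_speedup([0.80, 0.19, 0.01], [1e6, 10.0, 1.0])

print(f"ceiling: {ceiling:.1f}x, example: {example:.1f}x")
```

Under these assumed shares the example lands well below the 100x ceiling, because the moderately accelerated group also contributes to the remaining time.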

tag on Moral patienthood of simulated minds allows uncountable infinity of value on finite hardware

I claim that it is possible to create a program, which can be interpreted as running uncountably infinite number of simulations

I can't see that such a programme would have to be interpreted as running an uncountably infinite number of simulations.

My response is to just discard those frameworks, and use something else.

something else that's another philosophical framework, or something else entirely?

mr-beastly on An Alternate History of the Future, 2025-2040

All services not running behind AWS, GCP or Azure will be banned from access to the newly branded "Internet 2.0", as they are proven vulnerable to attack from any newer "PhD+ level reasoning/coding ai agent".

See also:

"Critical infrastructure systems often suffer from "patch lag," resulting in software remaining unpatched for extended periods, sometimes years or decades. In many cases, patches cannot be applied in a timely manner because systems must operate without interruption, the software remains outdated because its developer went out of business, or interoperability constraints require specific legacy software." -- Superintelligence Strategy by Dan Hendrycks, Eric Schmidt, Alexandr Wang, Mar 2025 https://www.nationalsecurity.ai/chapter/ai-is-pivotal-for-national-security, https://arxiv.org/abs/2503.05628

"Partner with critical national infrastructure companies (e.g. power utilities) to patch vulnerabilities" -- "An Approach to Technical AGI Safety and Security" Google DeepMind Team, Apr 2025 https://arxiv.org/html/2504.01849v1#S5

kilgoar on Illiteracy in Silicon Valley

The argument here is incredibly unconvincing and utterly puzzling. Moses is a mythological figure

kilgoar on Illiteracy in Silicon Valley

These vast sweeping claims you're making are not original thoughts that you've gotten from firsthand sources, but rather they are from 18th and 19th century historians. That is, the narrative of gradual improvement over time in what's called Whiggish history. It's very popular among non-historians or amateur historians but 20th century historians were very critical of this view. Experts in the field, the people who are making a career of "looking at historical documents," have largely flipped on this view.

Herbert Butterfield wrote a famous takedown, The Whig Interpretation of History (1931). P. B. M. Blaas, in his seminal work on historiography, Continuity and Anachronism (1978), felt the style had already passed by 1914. His term for your idea that people in the past acted on the concept of survival of the fittest before its conceptualization is Presentism, a form of anachronism, and it's the biggest stumbling block for understanding the people of the past.

David Cannadine, the Dodge Professor of History at Princeton, said, "Whig history was, in short, an extremely biased view of the past: eager to hand out moral judgements, and distorted by teleology, anachronism and present-mindedness."

Frederic William Maitland is widely considered the first of a new breed of historians. The answer to Whiggish history was in fact utilizing more data than ever. For him, that meant actually reading as much of English law as possible and understanding it in its own terms, rather than treating it as a more vague process inevitably leading to the present. Contrary to your claim, the firsthand sources in fact shattered Whiggish pretensions.

I'm trying to very politely tell y'all in this thread that this crap is the Newtonian Physics of history. Sure, Edward Gibbon and other ye olde history is a decent starting point, but if that's all you have you're pretty much out of touch with the field.

Does a bacterium "practice survival of the fittest" in a way that matches the expressly Darwinist ideology of Hitler? Of course not. And neither does a chimp.

tag on Moral patienthood of simulated minds allows uncountable infinity of value on finite hardware

The easiest explanation for high measure of biological minds is simulated minds lacking consciousness.

big_friendly_kiwi on Not All Beliefs Are Created Equal: Diagnosing Toxic Ideologies

You can acknowledge that critics are deluded or self-interested whilst also admitting they have some substantial points. This is more in the vein of using that as a justification to ignore all criticism, even valid criticism.