LessWrong 2.0 Reader

A model of the final phase: the current frontier AIs as de facto CEOs of their own companies
Mitchell_Porter · 2025-03-08T22:15:35.260Z · comments (2)

Prodromes and Biomarkers in Chronic Disease
sarahconstantin · 2025-04-16T21:30:02.978Z · comments (2)

[link] Forging A New AGI Social Contract
Deric Cheng (deric-cheng) · 2025-04-10T13:41:11.817Z · comments (3)

Non-Monotonic Infra-Bayesian Physicalism
Marcus Ogren · 2025-04-02T12:14:19.783Z · comments (0)

[question] LessWrong merch?
Brendan Long (korin43) · 2025-04-03T21:51:47.190Z · answers+comments (2)

[link] Currency Collapse
prue (prue0) · 2025-04-11T03:48:01.469Z · comments (3)

A Bunch of Matryoshka SAEs
chanind · 2025-04-04T14:53:56.805Z · comments (0)

Understanding Trust: Overview Presentations
abramdemski · 2025-04-16T18:08:31.064Z · comments (0)

Grok3 On Kant On AI Slavery
JenniferRM · 2025-04-01T04:10:48.093Z · comments (3)

Notes on handling non-concentrated failures with AI control: high level methods and different regimes
ryan_greenblatt · 2025-03-24T01:00:38.222Z · comments (3)

The Leapfrogging Terminus and the Fuzzy Cut
Jim Pivarski (jim-pivarski) · 2025-03-31T04:08:24.023Z · comments (6)

Doing principle-of-charity better
Sniffnoy · 2025-03-27T05:19:52.195Z · comments (1)

Opportunity Space: Renormalization for AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T20:55:52.155Z · comments (0)

[question] Does the AI control agenda broadly rely on no FOOM being possible?
Noosphere89 (sharmake-farah) · 2025-03-29T19:38:23.971Z · answers+comments (3)

Introduction to Representing Sentences as Logical Statements
Towards_Keeperhood (Simon Skade) · 2025-04-05T20:35:31.422Z · comments (9)

Read More News
utilistrutil · 2025-03-16T21:31:28.817Z · comments (2)

Towards an understanding of the Chinese AI scene
Mitchell_Porter · 2025-03-24T09:10:19.498Z · comments (0)

[question] Can we ever ensure AI alignment if we can only test AI personas?
Karl von Wendt · 2025-03-16T08:06:42.345Z · answers+comments (8)

[Replication] Crosscoder-based Stage-Wise Model Diffing
annas (annasoli) · 2025-03-22T18:35:19.003Z · comments (0)

Why Were We Wrong About China and AI? A Case Study in Failed Rationality
thedudeabides · 2025-03-22T05:13:52.181Z · comments (38)

[link] AI Tools for Existential Security
Lizka · 2025-03-14T18:38:06.110Z · comments (4)

[link] Ferrer, Pilar, and Me
Askwho · 2025-04-06T11:22:57.758Z · comments (1)

Defense Against The Super-Worms
viemccoy · 2025-03-20T07:24:56.975Z · comments (1)

[link] "Long" timelines to advanced AI have gotten crazy short
Matrice Jacobine · 2025-04-03T22:46:39.416Z · comments (0)

Consequentialism is for making decisions
Sniffnoy · 2025-03-27T04:00:07.020Z · comments (9)

Feature Hedging: Another way correlated features break SAEs
chanind · 2025-03-25T14:33:08.694Z · comments (0)

[link] Slopworld 2035: The dangers of mediocre AI
titotal (lombertini) · 2025-04-14T13:14:08.390Z · comments (6)

[link] Inside OpenAI's Controversial Plan to Abandon its Nonprofit Roots
garrison · 2025-04-18T18:46:57.310Z · comments (0)

The Internal Model Principle: A Straightforward Explanation
Alfred Harwood · 2025-04-12T10:58:51.479Z · comments (1)

Will US tariffs push data centers for large model training offshore?
ChristianKl · 2025-04-12T12:47:12.917Z · comments (3)

[question] How far along Metr's law can AI start automating or helping with alignment research?
Christopher King (christopher-king) · 2025-03-20T15:58:08.369Z · answers+comments (21)

Improved visualizations of METR Time Horizons paper.
LDJ (luigi-d) · 2025-03-19T23:36:52.771Z · comments (4)

Leverage, Exit Costs, and Anger: Re-examining Why We Explode at Home, Not at Work
at_the_zoo · 2025-04-01T18:28:26.611Z · comments (2)

Weird Random Newcomb Problem
Tapatakt · 2025-04-11T13:09:01.856Z · comments (15)

Edge Cases in AI Alignment
Florian_Dietz · 2025-03-24T09:27:58.164Z · comments (3)

[link] AI Model History is Being Lost
Vale · 2025-03-16T12:38:47.907Z · comments (1)

Does Summarization Affect LLM Performance?
atharva · 2025-04-01T02:14:31.826Z · comments (2)

[link] My day in 2035
Tenoke · 2025-04-11T16:31:19.610Z · comments (2)

Comments on "AI 2027"
Randaly · 2025-04-11T20:32:34.419Z · comments (10)

AI could cause a drop in GDP, even if markets are competitive and efficient
Casey Barkan (casey-barkan) · 2025-04-10T22:35:16.290Z · comments (0)

Finding Emergent Misalignment
Jan Betley (jan-betley) · 2025-03-26T17:33:46.792Z · comments (0)

The Last Light
Bridgett Kay (bridgett-kay) · 2025-04-14T15:41:02.745Z · comments (0)

Offer: Team Conflict Counseling for AI Safety Orgs
Severin T. Seehrich (sts) · 2025-04-14T15:17:00.835Z · comments (1)

Dusty Hands and Geo-arbitrage
Tomás B. (Bjartur Tómas) · 2025-03-22T16:05:30.364Z · comments (3)

Experts have it easy
beyarkay · 2025-04-12T19:32:17.158Z · comments (3)

Legibility
lsusr · 2025-03-22T06:54:35.259Z · comments (22)

Everything's An Emergency
omnizoid · 2025-03-20T17:12:23.006Z · comments (0)

Ghiblification is good, actually
Ozyrus · 2025-04-02T10:48:57.135Z · comments (1)

Why does Claude Speak Byzantine Music Notation?
Lennart Finke (l-f) · 2025-03-31T15:13:10.753Z · comments (2)

Technical Claims
Vladimir_Nesov · 2025-04-03T00:30:56.185Z · comments (0)

Recent comments

hpcfung on Rationalist Should Win. Not Dying with Dignity and Funding WBE.

I'm also interested, have you made any progress since your comment?

lc on Three Months In, Evaluating Three Rationalist Cases for Trump

The doubling down is delusional but I think you're simplifying the failure of projection a bit. The inability of markets and forecasters to predict Trump's second term is quite interesting. A lot of different models of politics failed.

gjm on o3 Will Use Its Tools For You

Pedantic note: there are many instances of "syncopathy" that I am fairly sure should be "sycophancy".

(It's an understandable mistake -- "syncopathy" is composed of familiar components, which could plausibly be put together to mean something like "the disease of agreeing too much" which is, at least in the context of AI, not far off what sycophancy in fact means. Whereas if you can parse "sycophancy" at all you might work out that it means "fig-showing" which obviously has nothing to do with anything. So far as I can tell, no one actually knows how "fig-showing" came to be the term for servile flattery.)

michaeldickens on Planning for Extreme AI Risks

I think the right way to self-destruct isn't to shut down entirely. It's to spend all your remaining assets on safety (whether that be lobbying for regulations, or research, or whatever). This would greatly increase the total amount of money spent on safety efforts so it might help quite a lot.

I do believe shutting down does have a decent chance, although not a comfortingly large one, of scaring government and/or other AI companies into taking the risks seriously.

anthonyc on What Makes an AI Startup "Net Positive" for Safety?

I won't comment on your specific startup, but I wonder in general how an AI Safety startup becomes a successful business. What's the business model? Who is the target customer? Why do they buy? Unless the goal is to get acquired by one of the big labs, in which case, sure, but again, why or when do they buy, and at what price? Especially since they already don't seem to be putting much effort into solving the problem themselves despite having better tools and more money to do so than any new entrant startup.

anthonyc on Three Months In, Evaluating Three Rationalist Cases for Trump

I really, really hope at some point the Democrats will acknowledge the reason they lost is that they failed to persuade the median voter of their ideas, and/or adopt ideas that appeal to said voters. At least among those I interact with, there seems to be a denial of the idea that this is how you win elections, which is a prerequisite for governing.

saidachmiz on A Dissent on Honesty

The hard cases are much more interesting. What about lying to my landlord about renting a room on airbnb? What about saying your class will make people millionaires for the low low price of $1,000 (hey, it could happen)? What about hiding the rats from the health inspector?

None of these seem like hard cases to me. Lying is wrong (and pretty obviously so) in all three of these cases.

anthonyc on Why Does It Feel Like Something? An Evolutionary Path to Subjectivity

That seems very possible to me, and if and when we can show whether something like that is the case, I do think it would represent significant progress. If nothing else, it would help tell us what the thing we need to be examining actually is, in a way we don't currently have an easy way to specify.

elizabeth-1 on A Dissent on Honesty

I liked this post a lot more than I expected to, but I'm disappointed the only examples of lying are a combination of people who have no right to the information and people who are better off for you lying (in a way that gives them truer beliefs than if you'd told the literal truth).

The hard cases are much more interesting. What about lying to my landlord about renting a room on airbnb? What about saying your class will make people millionaires for the low low price of $1,000 (hey, it could happen)? What about hiding the rats from the health inspector?

hpcfung on hpcfung's Shortform

Is there any attempt at compiling a list of all publicly available university course materials (lecture notes, videos, reference books, syllabi), across all institutions? I seem to remember cosmolearning.org but the site is no longer running.

I imagine this kind of infrastructure is really helpful, or even necessary, to self-learners.

The equivalent for researchers would be conferences, summer schools/workshops, powerpoints for talks, etc.