LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Monthly Roundup #28: March 2025
Zvi · 2025-03-17T12:50:03.097Z · comments (8)
Meetups Notes (Q1 2025)
jenn (pixx) · 2025-03-31T01:12:11.774Z · comments (2)
Prospects for Alignment Automation: Interpretability Case Study
Jacob Pfau (jacob-pfau) · 2025-03-21T14:05:51.528Z · comments (4)
How much progress actually happens in theoretical physics?
ChristianKl · 2025-04-04T23:08:00.633Z · comments (32)
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
Alex Mallen (alex-mallen) · 2025-03-24T17:55:59.358Z · comments (0)
Most Questionable Details in 'AI 2027'
scarcegreengrass · 2025-04-05T00:32:54.896Z · comments (4)
Selection Pressures on LM Personas
Raymond D · 2025-03-28T20:33:09.918Z · comments (0)
[Linkpost] Visual roadmap to strong human germline engineering
TsviBT · 2025-04-05T22:22:57.744Z · comments (0)
EIS XV: A New Proof of Concept for Useful Interpretability
scasper · 2025-03-17T20:05:30.580Z · comments (2)
Call for Collaboration: Renormalization for AI safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T21:01:56.500Z · comments (0)
How much does it cost to back up solar with batteries?
jasoncrawford · 2025-03-25T16:35:52.834Z · comments (6)
[link] Fundraising for Mox: coworking & events in SF
Austin Chen (austin-chen) · 2025-03-31T18:25:03.571Z · comments (0)
[link] Smelling Nice is Good, Actually
Gordon Seidoh Worley (gworley) · 2025-03-18T16:54:43.324Z · comments (8)
Non-Consensual Consent: The Performance of Choice in a Coercive World
Alex_Steiner · 2025-03-20T17:12:16.302Z · comments (4)
Reflections on Neuralese
Alice Blair (Diatom) · 2025-03-12T16:29:31.230Z · comments (0)
Introducing WAIT to Save Humanity
carterallen · 2025-04-01T21:47:17.857Z · comments (1)
Proof-of-Concept Debugger for a Small LLM
Peter Lai (peter-lai) · 2025-03-17T22:27:52.386Z · comments (0)
[link] Your Communication Preferences Aren’t Law
Jonathan Moregård (JonathanMoregard) · 2025-03-12T17:20:11.117Z · comments (4)
Why Were We Wrong About China and AI? A Case Study in Failed Rationality
thedudeabides · 2025-03-22T05:13:52.181Z · comments (35)
Existing UDTs test the limits of Bayesianism (and consistency)
Cole Wyeth (Amyr) · 2025-03-12T04:09:11.615Z · comments (18)
Austin Chen on Winning, Risk-Taking, and FTX
Elizabeth (pktechgirl) · 2025-04-07T19:00:08.039Z · comments (0)
[link] Sentinel minutes #10/2025: Trump tariffs, US/China tensions, Claude code reward hacking.
NunoSempere (Radamantis) · 2025-03-10T19:00:25.808Z · comments (0)
What Uniparental Disomy Tells Us About Improper Imprinting in Humans
Morpheus · 2025-03-28T11:24:47.133Z · comments (1)
[link] OpenAI lost $5 billion in 2024 (and its losses are increasing)
Remmelt (remmelt-ellen) · 2025-03-31T04:17:27.242Z · comments (15)
Changing my mind about Christiano's malign prior argument
Cole Wyeth (Amyr) · 2025-04-04T00:54:44.199Z · comments (34)
Report & retrospective on the Dovetail fellowship
Alex_Altair · 2025-03-14T23:20:17.940Z · comments (3)
[link] How prediction markets can create harmful outcomes: a case study
B Jacobs (Bob Jacobs) · 2025-04-02T15:37:09.285Z · comments (2)
Whether governments will control AGI is important and neglected
Seth Herd · 2025-03-14T09:48:34.062Z · comments (2)
Bike Lights are Cheap Enough to Give Away
jefftk (jkaufman) · 2025-03-14T02:10:02.482Z · comments (0)
Explaining the Joke: Pausing is The Way
WillPetillo · 2025-04-04T09:04:38.847Z · comments (2)
I grade every NBA basketball game I watch based on enjoyability
proshowersinger · 2025-03-12T21:46:26.791Z · comments (2)
How to mitigate sandbagging
Teun van der Weij (teun-van-der-weij) · 2025-03-23T17:19:07.452Z · comments (0)
A model of the final phase: the current frontier AIs as de facto CEOs of their own companies
Mitchell_Porter · 2025-03-08T22:15:35.260Z · comments (2)
[link] Well-foundedness as an organizing principle of healthy minds and societies
Richard_Ngo (ricraz) · 2025-04-07T00:31:34.098Z · comments (5)
Grok3 On Kant On AI Slavery
JenniferRM · 2025-04-01T04:10:48.093Z · comments (3)
Against podcasts
Adam Zerner (adamzerner) · 2025-04-05T19:20:00.716Z · comments (8)
[question] Does the AI control agenda broadly rely on no FOOM being possible?
Noosphere89 (sharmake-farah) · 2025-03-29T19:38:23.971Z · answers+comments (3)
Notes on handling non-concentrated failures with AI control: high level methods and different regimes
ryan_greenblatt · 2025-03-24T01:00:38.222Z · comments (3)
Doing principle-of-charity better
Sniffnoy · 2025-03-27T05:19:52.195Z · comments (1)
AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability
DanielFilan · 2025-03-28T18:40:01.856Z · comments (0)
Opportunity Space: Renormalization for AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-03-31T20:55:52.155Z · comments (0)
The Leapfrogging Terminus and the Fuzzy Cut
Jim Pivarski (jim-pivarski) · 2025-03-31T04:08:24.023Z · comments (6)
Read More News
utilistrutil · 2025-03-16T21:31:28.817Z · comments (2)
[question] Can we ever ensure AI alignment if we can only test AI personas?
Karl von Wendt · 2025-03-16T08:06:42.345Z · answers+comments (8)
[link] AI Tools for Existential Security
Lizka · 2025-03-14T18:38:06.110Z · comments (4)
Defense Against The Super-Worms
viemccoy · 2025-03-20T07:24:56.975Z · comments (1)
Consequentialism is for making decisions
Sniffnoy · 2025-03-27T04:00:07.020Z · comments (9)
[link] "Long" timelines to advanced AI have gotten crazy short
Matrice Jacobine · 2025-04-03T22:46:39.416Z · comments (0)
Towards an understanding of the Chinese AI scene
Mitchell_Porter · 2025-03-24T09:10:19.498Z · comments (0)
[question] LessWrong merch?
Brendan Long (korin43) · 2025-04-03T21:51:47.190Z · answers+comments (1)