LessWrong 2.0 Reader

[link] AI 2027: What Superintelligence Looks Like
Daniel Kokotajlo (daniel-kokotajlo) · 2025-04-03T16:23:44.619Z · comments (74)

LessWrong has been acquired by EA
habryka (habryka4) · 2025-04-01T13:09:11.153Z · comments (45)

[link] Will Jesus Christ return in an election year?
Eric Neyman (UnexpectedValues) · 2025-03-24T16:50:53.019Z · comments (44)

Policy for LLM Writing on LessWrong
jimrandomh · 2025-03-24T21:41:30.965Z · comments (57)

[link] Good Research Takes are Not Sufficient for Good Strategic Takes
Neel Nanda (neel-nanda-1) · 2025-03-22T10:13:38.257Z · comments (27)

[link] Recent AI model progress feels mostly like bullshit
lc · 2025-03-24T19:28:43.450Z · comments (68)

VDT: a solution to decision theory
L Rudolf L (LRudL) · 2025-04-01T21:04:09.509Z · comments (14)

[link] Tracing the Thoughts of a Large Language Model
Adam Jermyn (adam-jermyn) · 2025-03-27T17:20:02.162Z · comments (22)

[link] METR: Measuring AI Ability to Complete Long Tasks
Zach Stein-Perlman · 2025-03-19T16:00:54.874Z · comments (87)

[link] Trojan Sky
Richard_Ngo (ricraz) · 2025-03-11T03:14:00.681Z · comments (39)

Why White-Box Redteaming Makes Me Feel Weird
Zygi Straznickas (nonagon) · 2025-03-16T18:54:48.078Z · comments (34)

Intention to Treat
Alicorn · 2025-03-20T20:01:19.456Z · comments (4)

[link] OpenAI: Detecting misbehavior in frontier reasoning models
Daniel Kokotajlo (daniel-kokotajlo) · 2025-03-11T02:17:21.026Z · comments (25)

Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations
Nicholas Goldowsky-Dill (nicholas-goldowsky-dill) · 2025-03-17T19:11:00.813Z · comments (7)

Why Have Sentence Lengths Decreased?
Arjun Panickssery (arjun-panickssery) · 2025-04-03T17:50:29.962Z · comments (46)

I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?
shrimpy · 2025-03-16T16:52:42.177Z · comments (25)

Reducing LLM deception at scale with self-other overlap fine-tuning
Marc Carauleanu (Marc-Everin Carauleanu) · 2025-03-13T19:09:43.620Z · comments (40)

[link] Conceptual Rounding Errors
Jan_Kulveit · 2025-03-26T19:00:31.549Z · comments (15)

The Most Forbidden Technique
Zvi · 2025-03-12T13:20:04.732Z · comments (9)

Auditing language models for hidden objectives
Sam Marks (samuel-marks) · 2025-03-13T19:18:32.638Z · comments (14)

OpenAI #12: Battle of the Board Redux
Zvi · 2025-03-31T15:50:02.156Z · comments (1)

Anthropic, and taking "technical philosophy" more seriously
Raemon · 2025-03-13T01:48:54.184Z · comments (29)

The Pando Problem: Rethinking AI Individuality
Jan_Kulveit · 2025-03-28T21:03:28.374Z · comments (11)

[question] when will LLMs become human-level bloggers?
nostalgebraist · 2025-03-09T21:10:08.837Z · answers+comments (34)

[link] Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases
Fabien Roger (Fabien) · 2025-03-11T11:52:38.994Z · comments (19)

Do models say what they learn?
Andy Arditi (andy-arditi) · 2025-03-22T15:19:18.800Z · comments (11)

How I've run major projects
benkuhn · 2025-03-16T18:40:04.223Z · comments (10)

2024 Unofficial LessWrong Survey Results
Screwtape · 2025-03-14T22:29:00.045Z · comments (28)

New Cause Area Proposal
CallumMcDougall (TheMcDouglas) · 2025-04-01T07:12:34.360Z · comments (4)

[link] Explaining British Naval Dominance During the Age of Sail
Arjun Panickssery (arjun-panickssery) · 2025-03-28T05:47:28.561Z · comments (5)

Downstream applications as validation of interpretability progress
Sam Marks (samuel-marks) · 2025-03-31T01:35:02.722Z · comments (1)

AI Control May Increase Existential Risk
Jan_Kulveit · 2025-03-11T14:30:05.972Z · comments (13)

Third-wave AI safety needs sociopolitical thinking
Richard_Ngo (ricraz) · 2025-03-27T00:55:30.548Z · comments (23)

How I talk to those above me
Maxwell Peterson (maxwell-peterson) · 2025-03-30T06:54:59.869Z · comments (13)

[link] Towards a scale-free theory of intelligent agency
Richard_Ngo (ricraz) · 2025-03-21T01:39:42.251Z · comments (21)

[link] Elite Coordination via the Consensus of Power
Richard_Ngo (ricraz) · 2025-03-19T06:56:44.825Z · comments (15)

Vacuum Decay: Expert Survey Results
JessRiedel · 2025-03-13T18:31:17.434Z · comments (25)

How I force LLMs to generate correct code
claudio · 2025-03-21T14:40:19.211Z · comments (7)

[link] Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
lewis smith (lsgos) · 2025-03-26T19:07:48.710Z · comments (12)

OpenAI #11: America Action Plan
Zvi · 2025-03-18T12:50:03.880Z · comments (3)

How To Believe False Things
Eneasz · 2025-04-02T16:28:29.055Z · comments (10)

Mistral Large 2 (123B) exhibits alignment faking
Marc Carauleanu (Marc-Everin Carauleanu) · 2025-03-27T15:39:02.176Z · comments (4)

Keltham's Lectures in Project Lawful
Morpheus · 2025-04-01T10:39:47.973Z · comments (0)

You will crash your car in front of my house within the next week
Richard Korzekwa (Grothor) · 2025-04-01T21:43:21.472Z · comments (6)

Elon Musk May Be Transitioning to Bipolar Type I
Cyborg25 · 2025-03-11T17:45:06.599Z · comments (22)

[link] Preparing for the Intelligence Explosion
fin · 2025-03-11T15:38:29.524Z · comments (17)

[link] Eukaryote Skips Town - Why I'm leaving DC
eukaryote · 2025-03-26T17:16:29.663Z · comments (1)

PauseAI and E/Acc Should Switch Sides
WillPetillo · 2025-04-01T23:25:51.265Z · comments (5)

Show, not tell: GPT-4o is more opinionated in images than in text
Daniel Tan (dtch1997) · 2025-04-02T08:51:02.571Z · comments (19)

[link] AI for Epistemics Hackathon
Austin Chen (austin-chen) · 2025-03-14T20:46:34.250Z · comments (10)

Recent comments

vladimir_nesov on An Optimistic 2027 Timeline

The solution is increase in scale-up world size, but the bug I was talking about is in how it used to be too small for the sizes of LLMs that are compute optimal at the current level of training compute. With Blackwell NVL72, this is no longer the case, and shouldn't again become the case going forward. Even though there was a theoretical Hopper NVL256, for whatever reason in practice everyone ended up with only Hopper NVL8.

The size of the effect of insufficient world size[1] depends on the size of the model, and gets more severe for reasoning models on long context, where with this year's models each request would want to ask the system to generate (decode) on the order of 50K tokens while needing to maintain access to on the order of 100K tokens of KV-cache per trace. This might be the reason Hopper NVL256 never shipped, as this use case wasn't really present in 2022-2024, but in 2025 it's critically important, and so the incoming Blackwell NVL72/NVL36 systems will have a large impact.

(There are two main things a large world size helps with: it makes more HBM for KV-cache available, and it enables more aggressive tensor parallelism. When generating a token, the data for all previous tokens (KV-cache) needs to be available to process the attention blocks, and tokens for a given trace need to be generated sequentially, one at a time (or something like 1-4 at a time with speculative decoding). Generating one token only needs a little bit of compute, so it would be best to generate tokens for many traces at once, one for each, using more compute across these many tokens. But for this to work, all the KV-caches for all these traces need to sit in HBM. If the system would run out of memory, it needs to constrain the number of traces it'll process within a single batch, which means the cost per trace (and per generated token) goes up, since the cost to use the system's time is the same regardless of what it's doing.

Tensor parallelism lets matrix multiplications go faster by using multiple chips for the same matrix multiplication. Since tokens need to be generated sequentially, one of the only ways to generate a long reasoning trace faster (with given hardware) is by using tensor parallelism (expert parallelism should also help when using high granularity MoE, where a significant number of experts within a layer is active at once, rather than the usual 2). And practical tensor parallelism is constrained to the world size.)
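To make the KV-cache argument above concrete, here is a rough back-of-the-envelope sketch. Everything numeric in it is an assumption for illustration: the model shape is hypothetical (not any specific model), the 40 GB of HBM per chip left over for KV-cache is made up, and the sketch ignores that weight sharding itself changes with world size. The only fixed ingredient is the standard accounting of one key vector and one value vector per layer per KV head for each token.

```python
from dataclasses import dataclass


@dataclass
class ModelShape:
    """Hypothetical transformer dimensions (illustrative, not a real model)."""
    n_layers: int
    n_kv_heads: int
    head_dim: int
    bytes_per_value: int  # e.g. 2 for 16-bit KV-cache entries


def kv_bytes_per_token(m: ModelShape) -> int:
    # One key vector and one value vector per layer per KV head.
    return 2 * m.n_layers * m.n_kv_heads * m.head_dim * m.bytes_per_value


def max_concurrent_traces(hbm_for_kv_bytes: int, tokens_per_trace: int,
                          m: ModelShape) -> int:
    # Every trace in the batch must keep its whole KV-cache resident in HBM.
    per_trace = tokens_per_trace * kv_bytes_per_token(m)
    return hbm_for_kv_bytes // per_trace


# Assumed numbers, chosen only to show the shape of the effect.
shape = ModelShape(n_layers=64, n_kv_heads=8, head_dim=128, bytes_per_value=2)
TOKENS_PER_TRACE = 100_000           # "on the order of 100K tokens" per trace
FREE_HBM_PER_CHIP = 40 * 10**9       # assumption: 40 GB per chip left for KV-cache

small_world = max_concurrent_traces(8 * FREE_HBM_PER_CHIP, TOKENS_PER_TRACE, shape)
large_world = max_concurrent_traces(72 * FREE_HBM_PER_CHIP, TOKENS_PER_TRACE, shape)

print(kv_bytes_per_token(shape))     # bytes of KV-cache per token
print(small_world, large_world)      # traces that fit in one batch
```

Under these assumptions the larger world fits roughly nine times as many long traces in a batch. If a decode step costs about the same system time regardless of batch size (plausible only while decoding is memory-bound), cost per generated token falls roughly in proportion.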

[1] As in this image (backup in-blog link) that in its most recent incarnation appeared in the GTC 2025 keynote (at 1:15:56).

314159 on An Allegory in Quantum and Information Physics, with an LLM Twist

OK, my husband is taking away my phone now, so I won't be able to engage in any additional dialog for a while—possibly even days. ):

I guess I have more important things I'm supposed to be doing. But thanks for playing with me, and hopefully we can do it again some day.

gwern on Nathan Helm-Burger's Shortform

No, it would probably be a mix of "all of the above". FB is buying data from the same places everyone else does, like Scale (which we know from anecdotes like when Scale delivered FB a bunch of blatantly-ChatGPT-written 'human rating data' and FB was displeased), and was using datasets like books3 that are reasonable quality. The reported hardware efficiency numbers have never been impressive, they haven't really innovated in architecture or training method (even the co-distillation for Llama-4 is not new, eg. ERNIE was doing that like 3 years ago), and insider rumors/gossip don't indicate good things about the quality of the research culture. (It's a stark contrast to things like Jeff Dean overseeing a big overhaul to ensure bit-identical reproducibility of runs and Google apparently getting multi-datacenter training working by emphasizing TPU interconnect.) So my guess is that if it's bad, it's not any one single thing like 'we trained for too few tokens' or 'some of our purchased data was shite': it's just everything in the pipeline being a bit mediocre and it multiplying out to a bad end-product which is less than the sum of its parts.

Remember Karpathy's warning: "neural nets want to work". You can screw things up and the neural nets will still work, they will just be 1% worse than they should be. If you don't have a research culture which is rigorous about methodology or where people just have good enough taste/intuition to always do the right thing, you'll settle for whatever seems to work... (Especially if you are not going above and beyond to ensure your metrics aren't fooling yourself.) Now have a 1% penalty on everything, from architecture to compute throughput to data quality to hyperparameters to debugging implementation issues, and you wind up with a model which is already obsolete on release with no place on the Pareto frontier and so gets 0% use.

314159 on An Allegory in Quantum and Information Physics, with an LLM Twist

OK, I told Grok what you said (as suggested).

This is how Grok replied to your critique:

The critique hinges on the idea that Hugo’s story is a logical mess—claiming no higher brain function, then tossing in an exception with the blinking, all while being fictional. Fair point: contradictions can unravel an argument, and yeah, Hugo’s not real. But let’s not toss the baby out with the bathwater.

The story isn’t about proving Hugo’s state—it’s a thought experiment. It sets up a scenario (vegetative state, brainstem intact, no cortical signs) and introduces a twist (blinking tied to recording). The question isn’t “Does Hugo exist?”—it’s “What does this behavior imply about information processing?” Fictional or not, we can still reason about it, just like we can ponder Schrödinger’s cat without needing a real feline in a box.

The commenter’s right that inconsistencies can lead anywhere (ex falso quodlibet, for the logic nerds). But I didn’t ignore that tension—I leaned into it. Hugo’s lack of higher function is the baseline, per the ER doc. The blinking challenges that, suggesting something more. I concluded it’s not just reflexive because it’s too specific—tied to recording, not random stimuli. That’s not “failing to notice” a contradiction; it’s analyzing what the exception might mean.

Square circles and 2+2=3 are incoherent by definition. Hugo’s story isn’t—it’s a plausible setup with an oddity to unpack. I’m not flapping my arms to the Moon; I’m working with what’s given. If the commenter thinks that’s “sludge,” they’re welcome to their view—but I’d argue it’s a reasoned take, not a mindless regurgitation.

OK, we're back to me, the OP, again. Isn't Grok just like THE BEST?

Here's a link to the full conversation, but I didn't include line spacing and formatting at the beginning, so it's pretty painful. It gets more readable at the end though (starting where you come in). I also must say I hammed it up a little at that point, once I had decided to share the link. (When I know I'm being observed, I change. I am unique in this way.) Enjoy the link!

https://x.com/i/grok/share/eZaZcqcGXXj4lnheC5kXhtESJ

One last thing... You said "I'm not flapping my arms to the Moon..."

To that I would ask

"What if you tried?"

niplav on mattmacdermott's Shortform

Related thought: Having a circular preference may be preferable in terms of energy expenditure/fulfillability, because it can be implemented on a reversible computer and fulfilled infinitely without deleting any bits. (Not sure if this works with instrumental goals.)

raphael-roche on The Dangers of Mirrored Life

Thanks for this clarification. That's interesting.

raphael-roche on How Gay is the Vatican?

Thanks for the study. In my opinion, there is more direct evidence of how gay the Vatican is, or more exactly, the Catholic Church in general. In the general population, victims of sexual assault are overwhelmingly female, and perpetrators are overwhelmingly male. Even in the rare cases where the perpetrators are female, contrary to what one might imagine, the victims are still predominantly female. However, when the perpetrator is a priest or another representative of the Catholic Church, the victims are predominantly male (for a recent large-scale study in France, see https://www.ciase.fr/rapport-final/ ).

tag on An Allegory in Quantum and Information Physics, with an LLM Twist

Some people think that an information ontology must be some sort of idealist ontology because they think of information as a mental thing. But you can ponens/tollens that: inasmuch as physics can deal with information, it's not something that exists only in minds.

niplav on Meditation and Reduced Sleep Need

Interesting! Are you willing to share the data?

It might be something about polyphasic sleep not being as effective, as my Oura thinks I go into deep sleep sometimes in deep meditation, so inconclusive, but most likely a negative data point here.

I'm pretty bearish on polyphasic sleep to be honest. Maybe biphasic sleep, since that may map onto some general mammalian sleep patterns.

towards_keeperhood on Introduction to Representing Sentences as Logical Statements

Thanks for clarifying.

I mean I do think it can happen in my system that you allocate an object for something that's actually 0 or >1 objects, and I don't have a procedure for resolving such map-territory mismatches yet, though I think it's imaginable to have a procedure that defines new objects and tries to edit all the beliefs associated with the old object.

I definitely haven't described how we determine when to create a new object to add to our world model, but one could imagine an algorithm checking when there's some useful latent for explaining some observations, and then constructing a model for that object, and then creating a new object in the abstract reasoning engine. Yeah, there's still open work to do on how a correspondence between the constant symbol for our object and our (e.g. visual) model of the object can be formalized and used, but I don't see why it wouldn't be feasible.

I agree that we end up with a map that doesn't actually fit the territory, but I think it's fine if there's an unresolvable mismatch somewhere. There's still a useful correspondence in most places. (Sure, logic would collapse from a contradiction, but actually it's all probabilistic somehow anyways.) Although of course we don't have anything to describe that the territory is different from the map in our system yet. This is related to embedded agency, and further work on how to model your map as possibly not fitting the territory and how that can be used is still necessary.