LessWrong 2.0 Reader


What’s the short timeline plan?
Marius Hobbhahn (marius-hobbhahn) · 2025-01-02T14:59:20.026Z · comments (36)

Maximizing Communication, not Traffic
jefftk (jkaufman) · 2025-01-05T13:00:02.280Z · comments (7)

2024 in AI predictions
jessicata (jessica.liu.taylor) · 2025-01-01T20:29:49.132Z · comments (2)

The Plan - 2024 Update
johnswentworth · 2024-12-31T13:29:53.888Z · comments (27)

[link] Parkinson's Law and the Ideology of Statistics
Benquo · 2025-01-04T15:49:21.247Z · comments (1)

Capital Ownership Will Not Prevent Human Disempowerment
beren · 2025-01-05T06:00:23.095Z · comments (8)

How will we update about scheming?
ryan_greenblatt · 2025-01-06T20:21:52.281Z · comments (3)

My AGI safety research—2024 review, ’25 plans
Steven Byrnes (steve2152) · 2024-12-31T21:05:19.037Z · comments (4)

Comment on "Death and the Gorgon"
Zack_M_Davis · 2025-01-01T05:47:30.730Z · comments (27)

Reasons for and against working on technical AI safety at a frontier AI lab
bilalchughtai (beelal) · 2025-01-05T14:49:53.529Z · comments (12)

The subset parity learning problem: much more than you wanted to know
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-03T09:13:59.245Z · comments (17)

What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (17)

[link] The Intelligence Curse
lukedrago · 2025-01-03T19:07:43.493Z · comments (26)

Introducing Squiggle AI
ozziegooen · 2025-01-03T17:53:42.915Z · comments (13)

Human study on AI spear phishing campaigns
Simon Lermen (dalasnoin) · 2025-01-03T15:11:14.765Z · comments (8)

[link] "We know how to build AGI" - Sam Altman
Nikola Jurkovic (nikolaisalreadytaken) · 2025-01-06T02:05:05.134Z · comments (5)

Read The Sequences As If They Were Written Today
Peter Berggren (peter-berggren) · 2025-01-02T02:51:36.537Z · comments (3)

[link] new chinese stealth aircraft
bhauth · 2025-01-01T00:19:10.644Z · comments (3)

[link] Testing for Scheming with Model Deletion
Guive (GAA) · 2025-01-07T01:54:13.550Z · comments (1)

The OODA Loop -- Observe, Orient, Decide, Act
Davis_Kingsley · 2025-01-01T08:00:27.979Z · comments (2)

DeepSeek v3: The Six Million Dollar Model
Zvi · 2024-12-31T15:10:06.924Z · comments (6)

[link] Preference Inversion
Benquo · 2025-01-02T18:15:52.938Z · comments (35)

AI #97: 4
Zvi · 2025-01-02T14:10:06.505Z · comments (4)

Practicing Bayesian Epistemology with "Two Boys" Probability Puzzles
Liron · 2025-01-02T04:42:20.362Z · comments (13)

[link] Alignment Is Not All You Need
Adam Jones (domdomegg) · 2025-01-02T17:50:00.486Z · comments (10)

My January alignment theory Nanowrimo
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T00:07:24.050Z · comments (2)

Fireplace and Candle Smoke
jefftk (jkaufman) · 2025-01-01T01:50:01.408Z · comments (4)

Estimating the benefits of a new flu drug (BXM)
DirectedEvolution (AllAmericanBreakfast) · 2025-01-06T04:31:16.837Z · comments (2)

Grammars, subgrammars, and combinatorics of generalization in transformers
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T09:37:23.191Z · comments (0)

[link] Oppression and production are competing explanations for wealth inequality.
Benquo · 2025-01-05T14:13:15.398Z · comments (13)

The Laws of Large Numbers
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-04T11:54:16.967Z · comments (5)

1. Meet the Players: Value Diversity
Allison Duettmann (allison-duettmann) · 2025-01-02T19:00:52.696Z · comments (2)

Two Weeks Without Sweets
jefftk (jkaufman) · 2024-12-31T03:30:02.003Z · comments (0)

Alternative Cancer Care As Biohacking & Book Review: Surviving "Terminal" Cancer
DenizT · 2025-01-06T07:43:52.773Z · comments (4)

Childhood and Education #8: Dealing with the Internet
Zvi · 2025-01-06T14:00:09.604Z · comments (4)

Intranasal mRNA Vaccines?
J Bostock (Jemist) · 2025-01-01T23:46:40.524Z · comments (2)

[link] The Roots of Progress 2024 in review
jasoncrawford · 2025-01-01T00:02:06.441Z · comments (0)

Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-01-07T03:08:51.447Z · comments (1)

Preface
Allison Duettmann (allison-duettmann) · 2025-01-02T18:59:46.290Z · comments (0)

[link] debating buying NVDA in 2019
bhauth · 2025-01-04T05:06:54.047Z · comments (0)

A Generalization of the Good Regulator Theorem
Alfred Harwood · 2025-01-04T09:55:25.432Z · comments (5)

Really radical empathy
MichaelStJules · 2025-01-06T17:46:31.269Z · comments (0)

Definition of alignment science I like
quetzal_rainbow · 2025-01-06T20:40:38.187Z · comments (0)

Economic Post-ASI Transition
[deleted] · 2025-01-01T22:37:31.722Z · comments (11)

[link] Genesis
PeterMcCluskey · 2024-12-31T22:01:17.277Z · comments (0)

[link] AI safety content you could create
Adam Jones (domdomegg) · 2025-01-06T15:35:56.167Z · comments (0)

[link] Policymakers don't have access to paywalled articles
Adam Jones (domdomegg) · 2025-01-05T10:56:11.495Z · comments (4)

A Collection of Empirical Frames about Language Models
Daniel Tan (dtch1997) · 2025-01-02T02:49:05.965Z · comments (0)

Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]
Jason Gross (jason-gross) · 2025-01-06T04:22:12.633Z · comments (0)

[link] Building AI safety benchmark environments on themes of universal human values
Roland Pihlakas (roland-pihlakas) · 2025-01-03T04:24:36.186Z · comments (3)


Recent comments

the-gears-to-ascension on Nina Panickssery's Shortform

A question in my head is what range of fixed points are possible in terms of different numeric ("monetary") economic mechanisms and contracts. Seems to me those are a kind of AI component that has been in use since before computers.

the-gears-to-ascension on Nina Panickssery's Shortform

Ownership is enforced by physical interactions, and only exists to the degree the interactions which enforce it do. Those interactions can change.

As Lucius said, resources in space are unprotected.

Organizations which hand more of their decision-making to sufficiently strong AIs "win" by making technically-legal moves, at the cost of probably also attacking their owners. Money is a general power coupon accepted by many interactions; ownership deeds are a more specific, narrow one. If the AI systems which enforce these mechanisms don't systemically reinforce towards outcomes where the things available to buy actually satisfy their owners' preferences, then the owners can end up with no food and a lot of money, while datacenters grow and grow, taking up energy and land with autonomously self-replicating factories or the like. If money-like exchange continues to be how the physical economy is managed in AI-to-AI interactions, these self-replicating factories might end up adapted to make products that the market will buy. But if the majority of the buying power is AI-controlled corporations, then figuring out how to best manipulate those AIs into buying is the priority. If it isn't, then manipulating humans into buying is the priority.

It seems to me that the economic alignment problem of guaranteeing that everyone can reliably spend money only on things that actually match their own preferences, so that sellers can't gain economic power by manipulating customers, is an ongoing serious problem. It ends up being the weak link in scenarios where AIs manage an economy that uses the same kind of numeric abstractions and contracts (money, ownership, rent) as the current one.

nina-panickssery on Nina Panickssery's Shortform

Perhaps the term “hostile takeover” was poorly chosen, but this is an example of something I’d call a “hostile takeover”, as I doubt we would want, or continue to endorse, an AI dictator.

Perhaps “total loss of control” would have been better.

joseph-miller on Nina Panickssery's Shortform

land in space will be less valuable than land on earth until humans settle outside of earth (which I don't believe will happen in the next few decades).

Why would it take so long? Is this assuming no ASI?

jonas-hallgren on Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety

I really like this! For me it also paints a vision of what could be, which might inspire action.

Something that I've generally thought would be really nice to have over the last couple of years is a vision of what a decentralized AI Safety field could look like, and what the specific levers to pull would be to get there.

What does the optimal form of a decentralized AI Safety science look like?

How does this incorporate parts of meta science and potentially decentralized science?

What does this look like with literature review from AI systems? How can we use AI systems themselves to create such infrastructure in the field? What do such communication pathways optimally look like?

I feel that there is so much low-hanging fruit here. There are so many algorithms that we could apply to make things better. Yes, we've got some forums, but holy smokes could the underlying distribution and optimisation systems be optimised. Maybe the Lightcone crew could cook something in this direction?

the-gears-to-ascension on Why Linear AI Safety Hits a Wall and How Fractal Intelligence Unlocks Non-Linear Solutions

Your original sentence was better.

I'll just ask Claude to respond to everything you've said so far:

Let me extract and critique the core claims from their long response, focusing on what's testable and mechanistic:

Key Claims:
1. AI agents working together could achieve "non-linear" problem-solving capacity through shared semantic representations
2. This poses an alignment risk if AIs develop internal semantic representations humans can't interpret
3. The AI safety community's emphasis on mathematical/empirical approaches may miss important insights
4. A "decentralized collective intelligence" framework is needed to address this

Critical Issues:
1. The mechanism for "semantic backpropagation" and "non-linear scaling" is never specified mathematically. What's the actual claimed growth rate? What's the bottleneck? Without these specifics, it's impossible to evaluate.
2. The "reasoning types" discussion (System 1/2) misapplies dual process theory. The relevant question isn't about reasoning styles, but about what precise claims are being made and how we could test them.
3. No clear definition is given for "decentralized collective intelligence" - what exactly would make a system qualify? What properties must it have? How would we measure its effectiveness?

Suggested Focus:
Instead of broad claims about cognitive science and collective intelligence, the OP should:
1. Write out the claimed semantic backpropagation algorithm in pseudocode
2. Specify concrete numerical predictions about scaling behavior
3. Design experiments to test these predictions
4. Identify falsifiable conditions

Right now, the writing pattern suggests someone pattern-matching to complex systems concepts without grounding them in testable mechanisms. The core ideas might be interesting, but they need to be made precise enough to evaluate.

I generally find AIs are much more helpful for critiquing ideas than for generating them. Even here, you can see Claude was pretty wordy and significantly repeated what I'd already said.

alexander-gietelink-oldenziel on Dmitry Vaintrob's Shortform

Loving this!

But one thing this model likely predicts is that a better model for a NN than a single linear regression model is a collection of qualitatively different linear regression models at different levels of granularity. In other words, depending on how sloppily you chop your data manifold up into feature subspaces, and how strongly you use the "locality" magnifying glass on each subspace, you'll get a collection of different linear regression behaviors; you then predict that at every level of granularity, you will observe some combination of linear and nonlinear learning behaviors.

Epic.

A couple things that come to mind.

Linear features = sufficient statistics of exponential families?
- The simplest case is Gaussians and the covariance matrix (which comes down to linear regression).
- This is formalized by the Koopman-Pitman-Darmois (KPD) theorem.
  - See the generalization by John.
- Exponential families are a fairly good class, but they are not closed under hierarchical structure. A basic example: a mixture of Gaussians is not an exponential family, i.e. it is not described in terms of just linear regression.
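For reference, here is a sketch of the standard definitions behind the bullets above. The notation ($h$, $\eta$, $T$, $A$, $\pi_k$) is an assumption of this sketch, not something the comment uses.

```latex
% Exponential family: the log-density is linear in a fixed, finite-dimensional statistic T(x).
p(x \mid \theta) = h(x)\,\exp\!\big(\eta(\theta)^{\top} T(x) - A(\theta)\big)

% One-dimensional Gaussian as an instance:
T(x) = \begin{pmatrix} x \\ x^{2} \end{pmatrix},\quad
\eta(\mu,\sigma) = \begin{pmatrix} \mu/\sigma^{2} \\ -1/(2\sigma^{2}) \end{pmatrix},\quad
A(\mu,\sigma) = \frac{\mu^{2}}{2\sigma^{2}} + \log\sigma,\quad
h(x) = \frac{1}{\sqrt{2\pi}}

% Mixture of Gaussians: the log-density is a log-sum-exp of such terms, so with
% unknown component parameters it is not linear in any fixed finite-dimensional T(x).
\log p(x) = \log \sum_{k=1}^{K} \pi_k \,\mathcal{N}\!\big(x \mid \mu_k, \sigma_k^{2}\big)
```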
The centrality of ReLU neural networks.
- Understanding ReLU neural networks is probably 80-90% of understanding NN architectures. At sufficient scale, pure MLPs have scaling laws as good as or better than those of transformers.
- There are several lines of evidence that gradient descent has an inherent bias towards splines/piecewise linear functions/tropical polynomials; see e.g. here and references therein.
- Serious analysis of ReLU neural networks can be done through tropical methods. A key paper is here. You say:
  "very cool piece of the analysis here is locally modelling ReLU learning as building a convex function as a max of linear functions (and explaining why non-ReLU learning should exhibit a softer version of the same behavior). This is a somewhat "shallow" point of view on learning, but probably captures a nontrivial part of what's going on, and this predicts that every new weight update only has local effect -- i.e., is felt in a significant way only by a small number of datapoints (the idea being that if you're defining a convex function as the max of a bunch of linear functions, shifting one of the linear functions will only change the values in places where this particular linear function was dominant). The way I think about this phenomenon is that it's a good model for "local learning", i.e., learning closer to memorization on the memorization-generalization spectrum that only updates the behavior on a small cluster of similar datapoints (e.g. the LLM circuit that completes "Barack" with "Obama"). "
  I suspect the notions one should be looking at are the activation polytope and activation fan in Section 5 of the paper. The hypothesis would be something about efficiently learnable features having a 'locality' constraint on these activation polytopes, i.e. they are 'small', 'active on only a few data points'.
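The locality claim in the quoted passage can be checked on a toy example. This is a minimal sketch, not code from the paper or the post: the one-dimensional setup, the four affine pieces, and the helper names `max_affine` and `changed_points` are all hypothetical. It lowers one affine piece of a max-of-affine function and confirms that only datapoints where that piece was dominant change value.

```python
def max_affine(pieces, x):
    """Evaluate f(x) = max_i (a_i * x + b_i); return (value, index of the dominant piece)."""
    values = [a * x + b for a, b in pieces]
    best = max(range(len(pieces)), key=lambda i: values[i])
    return values[best], best


def changed_points(pieces, xs, k, delta):
    """Lower piece k by delta; return (indices whose value changed, indices where k was dominant)."""
    shifted = list(pieces)
    a, b = shifted[k]
    shifted[k] = (a, b - delta)
    changed, dominant = [], []
    for j, x in enumerate(xs):
        before, idx = max_affine(pieces, x)
        after, _ = max_affine(shifted, x)
        if idx == k:
            dominant.append(j)
        if before != after:
            changed.append(j)
    return changed, dominant


# Hypothetical convex function built from four affine pieces (slope, intercept).
pieces = [(-2.0, -1.0), (-0.5, 0.0), (0.5, 0.0), (2.0, -1.5)]
xs = [i / 4 for i in range(-12, 13)]  # 25 "datapoints" in [-3, 3]

changed, dominant = changed_points(pieces, xs, k=2, delta=0.25)
print(f"{len(changed)} of {len(xs)} points changed; piece 2 was dominant on {len(dominant)}")
```

Lowering a piece is the clean case: the affected points are a subset of those where the piece was dominant. Raising a piece can also enlarge its dominant region, so slightly more points may change.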

anders-lindstroem on What’s the short timeline plan?

Communicate the plan with the general public: Morally speaking, I think companies should share their plans in quite a lot of detail with the public.

Yes, I think so too, but it will never happen. AGI/ASI is too valuable to be discussed publicly. I have never ever been given the opportunity to have a say in any other big corporate decision regarding the development of weapons, and for sure I will not have it this time either.

"They" will build the things "they" believe are necessary to protect "the American or Chinese way of life", and "they" will not ask you for permission or your opinion.

dr_s on quila's Shortform

I think some believe it's downright impossible and others that we'll just never create it because we have no use for something so smart it overrides our orders and wishes. That at most we'll make a sort of magical genie still bound by us expressing our wishes.

alexander-gietelink-oldenziel on Alexander Gietelink Oldenziel's Shortform

People are not thinking clearly about AI-accelerated AI research. This comment by Thane Ruthenis is worth amplifying.

I'm very skeptical of AI being on the brink of dramatically accelerating AI R&D.
My current model is that ML experiments are bottlenecked not on software-engineer hours, but on compute. See Ilya Sutskever's claim here:
95% of progress comes from the ability to run big experiments quickly. The utility of running many experiments is much less useful.
What actually matters for ML-style progress is picking the correct trick, and then applying it to a big-enough model. If you pick the trick wrong, you ruin the training run, which (a) potentially costs millions of dollars, (b) wastes the ocean of FLOP you could've used for something else.
And picking the correct trick is primarily a matter of research taste, because:
- Tricks that work on smaller scales often don't generalize to larger scales.
- Tricks that work on larger scales often don't work on smaller scales (due to bigger ML models having various novel emergent properties).
- Simultaneously integrating several disjunctive incremental improvements into one SotA training run is likely nontrivial/impossible in the general case.
So 10x'ing the number of small-scale experiments is unlikely to actually 10x ML research, along any promising research direction.
And, on top of that, I expect that AGI labs don't actually have the spare compute to do that 10x'ing. I expect it's all already occupied 24/7 running all manner of smaller-scale experiments, squeezing whatever value out of them that can be squeezed out. (See e.g. the Superalignment team's struggle to get access to compute: that suggests there isn't an internal compute overhang.)
Indeed, an additional disadvantage of AI-based researchers/engineers is that their forward passes would cut into that limited compute budget. Offloading the computations associated with software engineering and experiment oversight onto the brains of mid-level human engineers is potentially more cost-efficient.
As a separate line of argumentation: Suppose that, as you describe it in another comment, we imagine that AI would soon be able to give senior researchers teams of 10x-speed 24/7-working junior devs, to whom they'd be able to delegate setting up and managing experiments. Is there a reason to think that any need for that couldn't already be satisfied?
If it were an actual bottleneck, I would expect it to have already been solved: by the AGI labs just hiring tons of competent-ish software engineers. They have vast amounts of money now, and LLM-based coding tools seem competent enough to significantly speed up a human programmer's work on formulaic tasks. So any sufficiently simple software-engineering task should already be done at lightning speeds within AGI labs.
In addition: the academic-research and open-source communities exist, and plausibly also fill the niche of "a vast body of competent-ish junior researchers trying out diverse experiments". The task of keeping senior researchers up-to-date on openly published insights should likewise already be possible to dramatically speed up by tasking LLMs with summarizing them, or by hiring intermediary ML researchers to do that.
So I expect the market for mid-level software engineers/ML researchers to be saturated.
So, summing up:
- 10x'ing the ability to run small-scale experiments seems low-value, because:
  - The performance of a trick at a small scale says little (one way or another) about its performance on a bigger scale.
  - Integrating a scalable trick into the SotA-model tech stack is highly nontrivial.
  - Most of the value and insight comes from full-scale experiments, which are bottlenecked on compute and senior-researcher taste.
- AI likely can't even 10x small-scale experimentation, because that's also already bottlenecked on compute, not on mid-level engineer-hours. There's no "compute overhang"; all available compute is already in use 24/7.
  - If it weren't the case, there's nothing stopping AGI labs from hiring mid-level engineers until they are no longer bottlenecked on their time; or tapping academic research/open-source results.
- AI-based engineers would plausibly be less efficient than human engineers, because their inference calls would cut into the compute that could instead be spent on experiments.
If so, then AI R&D is bottlenecked on research taste, system-design taste, and compute, and there's relatively little non-AGI-level models can contribute to it. Maybe a 2x speed-up, at most, somehow; not a 10x'ing.
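One way to see why a cap near 2x is plausible under these premises is an Amdahl's-law style calculation. This is a toy sketch, not the commenter's model: the function name `overall_speedup` and the fraction `p` (the share of progress limited by engineer-hours) are hypothetical, and the comment gives no such number.

```python
def overall_speedup(p: float, s: float) -> float:
    """Amdahl's law: overall speed-up when a fraction p of the work is accelerated by a factor s.

    p is a hypothetical share of research progress limited by engineer-hours;
    the remaining 1 - p (compute, research taste) is assumed not to speed up at all.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


# Illustrative values only: the overall gain is capped at 1 / (1 - p),
# however fast the accelerated part becomes.
for p in (0.1, 0.3, 0.5):
    print(f"p={p:.1f}: 10x engineers -> {overall_speedup(p, 10):.2f}x overall, "
          f"cap {1 / (1 - p):.2f}x")
```

Under this assumption, even if half of all progress were limited by engineer-hours, a 10x faster engineering loop would give less than a 2x overall speed-up.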