LessWrong 2.0 Reader


Maximizing Communication, not Traffic
jefftk (jkaufman) · 2025-01-05T13:00:02.280Z · comments (7)
Capital Ownership Will Not Prevent Human Disempowerment
beren · 2025-01-05T06:00:23.095Z · comments (8)
How will we update about scheming?
ryan_greenblatt · 2025-01-06T20:21:52.281Z · comments (3)
Reasons for and against working on technical AI safety at a frontier AI lab
bilalchughtai (beelal) · 2025-01-05T14:49:53.529Z · comments (12)
What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (17)
[link] "We know how to build AGI" - Sam Altman
Nikola Jurkovic (nikolaisalreadytaken) · 2025-01-06T02:05:05.134Z · comments (5)
[link] Testing for Scheming with Model Deletion
Guive (GAA) · 2025-01-07T01:54:13.550Z · comments (1)
Estimating the benefits of a new flu drug (BXM)
DirectedEvolution (AllAmericanBreakfast) · 2025-01-06T04:31:16.837Z · comments (2)
[link] Oppression and production are competing explanations for wealth inequality.
Benquo · 2025-01-05T14:13:15.398Z · comments (13)
Childhood and Education #8: Dealing with the Internet
Zvi · 2025-01-06T14:00:09.604Z · comments (4)
Alternative Cancer Care As Biohacking & Book Review: Surviving "Terminal" Cancer
DenizT · 2025-01-06T07:43:52.773Z · comments (4)
Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-01-07T03:08:51.447Z · comments (1)
Definition of alignment science I like
quetzal_rainbow · 2025-01-06T20:40:38.187Z · comments (0)
Really radical empathy
MichaelStJules · 2025-01-06T17:46:31.269Z · comments (0)
[link] AI safety content you could create
Adam Jones (domdomegg) · 2025-01-06T15:35:56.167Z · comments (0)
Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]
Jason Gross (jason-gross) · 2025-01-06T04:22:12.633Z · comments (0)
Incredibow
jefftk (jkaufman) · 2025-01-07T03:30:02.197Z · comments (1)
[question] Meal Replacements in 2025?
alkjash · 2025-01-06T15:37:25.041Z · answers+comments (8)
[link] Policymakers don't have access to paywalled articles
Adam Jones (domdomegg) · 2025-01-05T10:56:11.495Z · comments (4)
(My) self-referential reason to believe in free will
jacek (jacek-karwowski) · 2025-01-06T23:35:02.809Z · comments (4)
[link] You should delay engineering-heavy research in light of R&D automation
Daniel Paleka · 2025-01-07T02:11:11.501Z · comments (1)
Guilt, Shame, and Depravity
Benquo · 2025-01-07T01:16:00.273Z · comments (2)
A Ground-Level Perspective on Capacity Building in International Development
Sean Aubin (sean-aubin) · 2025-01-05T20:36:54.308Z · comments (1)
Orange and Strawberry Truffles
jefftk (jkaufman) · 2025-01-05T01:50:01.587Z · comments (1)
[question] Is "hidden complexity of wishes problem" solved?
Roman Malov · 2025-01-05T22:59:30.911Z · answers+comments (4)
AXRP Episode 38.4 - Shakeel Hashim on AI Journalism
DanielFilan · 2025-01-05T00:20:05.096Z · comments (0)
Latent Adversarial Training (LAT) Improves the Representation of Refusal
alexandraabbas · 2025-01-06T10:24:53.419Z · comments (2)
D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset
aphyer · 2025-01-07T05:02:25.929Z · comments (0)
Turning up the Heat on Deceptively-Misaligned AI
J Bostock (Jemist) · 2025-01-07T00:13:28.191Z · comments (0)
[question] Is my distinctiveness evidence for being in a simulation?
AynonymousPrsn123 · 2025-01-06T21:27:13.280Z · answers+comments (25)
[link] How to Do a PhD (in AI Safety)
Lewis Hammond (lewis-hammond-1) · 2025-01-05T16:57:35.409Z · comments (0)
Generating Cognateful Sentences with Large Language Models
vkethana (vijay-k) · 2025-01-06T18:40:09.564Z · comments (0)
Speedrunning Rationality: Day II
aproteinengine · 2025-01-06T03:59:25.488Z · comments (3)
Meditation insights as phase shifts in your self-model
Jonas Hallgren · 2025-01-07T10:09:35.854Z · comments (0)
We Will Likely Go Extinct Before the Unemployment Rate Reaches 99%. How Could That Happen?
Koki (Koki Takeda) · 2025-01-06T21:29:48.647Z · comments (0)
[link] Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude
rife (edgar-muniz) · 2025-01-06T17:34:01.505Z · comments (7)
[link] My Experience Biohacking
Vale · 2025-01-07T03:01:21.410Z · comments (0)
Why Linear AI Safety Hits a Wall and How Fractal Intelligence Unlocks Non-Linear Solutions
Andy E Williams (andy-e-williams) · 2025-01-05T17:08:06.734Z · comments (5)
Alleviating shrimp pain is immoral.
G Wood (geoffrey-wood) · 2025-01-07T07:28:49.432Z · comments (0)
[link] Chinese Researchers Crack ChatGPT: Replicating OpenAI’s Advanced AI Model
Evan_Gaensbauer · 2025-01-05T03:50:34.245Z · comments (1)
next page (older posts) →