Rolling Thresholds for AGI Scaling Regulation
post by Larks · 2025-01-12T01:30:23.797Z · LW · GW · 6 comments
Comments sorted by top scores.
comment by Aaron_Scher · 2025-01-14T07:59:39.059Z · LW(p) · GW(p)
I like this blog post. I think this plan has a few problems, which you mention, e.g., Potential Problem 1: getting the will and oversight to enact this domestically, and the will and oversight/verification to enact it internationally.
There's a sense in which any plan like this, which coordinates AI development and deployment at a slower-than-ludicrous pace, seems like it reduces risk substantially. To me it seems like most of the challenge comes from getting some authority to the point of having the political will to actually do that (and in the international context there could be substantial trust/verification needs). But it is nevertheless interesting and useful to think through what some of the details of such a coordinated-slow-down regime might be. And I think this post does a good job explaining an interesting idea in that space.
Replies from: Larks
↑ comment by Larks · 2025-01-15T03:49:02.688Z · LW(p) · GW(p)
Thanks very much! Yeah, I agree political will seems like a big issue. But I also hear people saying that they don't know what to push for, so I wanted to try to offer a concrete example of a system that wasn't as destructive to any constituency's interests as e.g. a total pause.
comment by Nathan Helm-Burger (nathan-helm-burger) · 2025-01-12T01:39:10.524Z · LW(p) · GW(p)
Sigh. Ok. I'm giving an upvote for good-faith effort to think this through and come up with a plan, but I just disagree with your world-model and its projections about training costs and associated danger levels so strongly that it seems hard to figure out how to even begin a discussion.
I'll just leave a link here [LW(p) · GW(p)] to a different comment talking about the same problem.
Replies from: Larks, RussellThor
↑ comment by Larks · 2025-01-12T02:15:32.504Z · LW(p) · GW(p)
Thanks very much for your feedback, though I confess I'm not entirely sure where to go with it. My interpretation is you have basically two concerns:
- This policy doesn't really directly regulate algorithmic progress, e.g. if it happens on smaller amounts of compute.
- Algorithmic theft/leakage is easy.
The first one is true, as I alluded to in the problems section. Part of my perspective here comes from skepticism about regulatory competence - I basically believe we can get regulators to control total compute usage and to evaluate specific models against pre-established evals, but I'm not sure I'd trust them to be able to determine "this is a new algorithmic advance, we need to evaluate it". To the extent you have less libertarian priors, you could try to apply something like the above scheme to algorithms as well, but I wouldn't expect it to work as smoothly, since you lack the cardinal structure that compute size provides.
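To make the "cardinal structure" point concrete, here is a minimal sketch of the kind of compute-based gate I have in mind; the FLOP figures, eval names, and step factor are purely illustrative assumptions, not part of the proposal:

```python
# Minimal sketch of a cardinal, compute-based gate (illustrative numbers only).
# Assumes the regulator tracks a single rolling cap on training FLOP plus a list
# of pre-established evals that a frontier model must pass before the cap moves.

from dataclasses import dataclass

@dataclass
class RollingThreshold:
    max_training_flop: float = 1e26          # current cap (hypothetical starting value)
    step_factor: float = 3.0                 # how much the cap rises after each approval
    required_evals: tuple = ("bio_misuse", "cyber_offense", "autonomy")

    def may_train(self, planned_flop: float) -> bool:
        """Training runs above the current cap require prior approval."""
        return planned_flop <= self.max_training_flop

    def review(self, eval_results: dict) -> bool:
        """If a frontier model passes all pre-established evals, raise the cap."""
        if all(eval_results.get(name) == "pass" for name in self.required_evals):
            self.max_training_flop *= self.step_factor
            return True
        return False

# Example: a 2e26 FLOP run is blocked until a prior frontier model clears review.
regime = RollingThreshold()
print(regime.may_train(2e26))   # False
regime.review({"bio_misuse": "pass", "cyber_offense": "pass", "autonomy": "pass"})
print(regime.may_train(2e26))   # True
```

The point is just that total training FLOP gives the regulator a single ordered quantity to ratchet up or hold fixed, whereas "algorithmic advances" come with no comparable scale to hang a threshold on.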
In terms of theft/leakage, you're right this plan doesn't discuss it much, and I agree it's worth working on.
Replies from: nathan-helm-burger
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2025-01-15T18:00:53.972Z · LW(p) · GW(p)
I agree that I have no faith in current governments to implement and enforce policies that are more complex than things on the order of compute governance and chip export controls.
I think the conclusion this points towards is that we need new forms of governance. Not to replace existing governments, but to complement them: voluntary mutual inspection contracts using privacy-respecting technology and AI inspectors, something of that sort.
Here's some recent evidence of compute thresholds not being reliable: https://novasky-ai.github.io/posts/sky-t1/
Here's another self-link to some of my thoughts on this: https://www.lesswrong.com/posts/tdrK7r4QA3ifbt2Ty/is-ai-alignment-enough?commentId=An6L68WETg3zCQrHT [LW(p) · GW(p)]
↑ comment by RussellThor · 2025-01-12T03:37:43.043Z · LW(p) · GW(p)
Yes, I think that's the problem - my biggest worry is sudden algorithmic progress, which becomes almost certain as AI tends towards superintelligence. An AI lab on the threshold of the overhang is going to have incentives to push through, even if they don't plan to submit their model for approval. At the very least they would "suddenly" have a model that uses 10-100x fewer resources for existing tasks, giving them a massive commercial lead. They would, of course, also be tempted to use it internally to solve aging, build a Dyson swarm, and so on.
Another concern I have is that I expect the regulator to impose a de facto unlimited pause, if it is within their power to do so, as we approach superintelligence, since the models would be objectively at least somewhat dangerous.