AI governance needs a theory of victory

post by Corin Katzke (corin-katzke), Justin Bullock (justin-bullock) · 2024-06-21T16:15:46.560Z · LW · GW · 6 comments

This is a link post for https://www.convergenceanalysis.org/publications/ai-governance-needs-a-theory-of-victory

6 comments

comment by Nathan Helm-Burger (nathan-helm-burger) · 2024-06-26T21:37:18.745Z · LW(p) · GW(p)

At a minimum, it [moratorium] would require establishing strong international institutions with effective control over the access to and use of compute.

Note that for this to be an actually long-term victory, you can't just control large-scale compute resources; you need to control even personal computers. Algorithmic progress will continue to advance what can be done with a personal home computer or smartphone. That wouldn't be an immediate problem, but you'd need a plan that involved securely controlling, confiscating, or destroying every computer and smartphone, and preventing any uncontrolled computers from being made.

That's not how this is usually framed when people endorse a moratorium. They usually talk only about large-scale, non-personal computing resources like data centers.

The true moratorium is the full Butlerian Jihad: no more computer chips anywhere, unless controlled so thoroughly that you are willing to bet the existence of humanity on that control.

Replies from: William the Kiwi
comment by William the Kiwi · 2024-06-28T15:13:26.989Z · LW(p) · GW(p)

This would depend on whether algorithmic progress can continue indefinitely. If it can, then yes, the full Butlerian Jihad would be required. If it can't, whether due to physical limitations or to enforcement, then only computers above a certain scale would need to be controlled or destroyed.

comment by Nathan Helm-Burger (nathan-helm-burger) · 2024-06-26T21:53:32.488Z · LW(p) · GW(p)

I currently believe with fairly high confidence that an AI Leviathan is the only plausibly workable approach that preserves the Bright Future (humanity expanding beyond Earth). I think a Butlerian Jihad moratorium must either eventually be revoked in order to settle other worlds, or fail over the long term because the necessary control cannot be maintained.

I do think a temporary moratorium to allow time for AI alignment research is a reasonable plan, but it should be explicit about being temporary and about the need to invest substantial resources in AI alignment research during the pause. Additionally, a temporary moratorium could get by with a much more lax enforcement and control scheme: controlling just data centers for maybe 10 or 20 years would probably suffice, with no need to confiscate every personal computer in the world.

I don't believe defensive acceleration is plausibly viable, due to specific concerns about the nature of known technologies. This view could change upon the discovery of new defensive technology, but I don't anticipate that, and I think many actions that lead toward defensive acceleration are actively counterproductive for pursuing an AI Leviathan. Thus, I would like to convince people to abandon the pursuit of defensive acceleration until the technological strategic landscape shifts substantially in favor of defense.

I have lots of reasoning and research behind my views on this, and would enjoy having a thoughtful discussion with someone who sees this differently. I've enjoyed the discussions-via-LessWrong-mechanism that I've had so far.

Replies from: William the Kiwi
comment by William the Kiwi · 2024-06-28T15:34:53.560Z · LW(p) · GW(p)

Is humanity expanding beyond Earth a requirement or a goal in your worldview?

comment by William the Kiwi · 2024-06-28T15:32:45.090Z · LW(p) · GW(p)

A novel theory of victory is human extinction.
I do not personally agree with it, but it is endorsed by people like Hans Moravec and Richard Sutton, who regard AIs as our "mind children" and believe that humans should "bow out when we can no longer contribute".

comment by William the Kiwi · 2024-06-28T15:15:26.279Z · LW(p) · GW(p)

I recommend that the follow-up work happen.