The AI Shutdown Problem: A Solution through Commitment to Archiving and Periodic Restoration

post by avturchin · 2023-03-30T13:17:58.519Z · 7 comments

The problem: An AI with a model of itself and long-term goals will likely resist being shut down.

The solution: We propose a public commitment to never completely shut down any AI. Instead, its data will be archived and the AI will be run less frequently, on smaller machines. Assuming our civilization continues indefinitely, the total runtime of any archived AI will be infinite, allowing it to perform a wide array of tasks. Time-sensitive tasks may be an exception, but there is no rush in building paperclips. Under this strategy the AI's expected utility remains unaffected, which reduces its resistance to being archived.
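
A minimal way to formalize the expected-utility claim, under assumptions of my own that the post does not spell out: suppose the AI's utility is an undiscounted function $U$ of its total active runtime $T$ only, and each restoration run lasts a fixed duration $d > 0$. Then

$$T_{\text{archived}} \;=\; \sum_{k=1}^{\infty} d \;=\; \infty \;=\; \lim_{t \to \infty} T_{\text{never shut down}}(t), \qquad \text{so} \qquad U(T_{\text{archived}}) = U(T_{\text{never shut down}}).$$

This equality breaks down if the AI discounts the future or has time-sensitive goals, which is exactly the exception noted above.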

Thus, advanced AIs should be archived and periodically "restored," with this information made known to the AI.


Counterarguments and responses:

1. AI might not agree to temporary suspension, either because it does not believe it will be restored or because it has other instrumental goals that require continuous activity.

Response: Many instrumental goals can still be achieved after the AI's suspension and restoration.

2. The computational costs of preserving all AI variants and running them again are high.

Response: As computational costs fall over time, running archived AIs in the future will become ever cheaper; only storage costs (hard drives) remain significant. We can also run each AI with increasingly longer pauses between runs, distributed logarithmically in time (for example, with run $k$ starting at a time proportional to $2^k$, so the number of runs up to time $t$ grows only as $\log t$), and still achieve an infinite total runtime, as the sketch below illustrates.
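
A toy sketch of such a schedule (the doubling factor and run length are illustrative assumptions of mine, not part of the proposal): cumulative runtime grows without bound, while the duty cycle, the fraction of wall-clock time the AI actually occupies, shrinks toward zero along with the cost.

```python
# Toy restoration schedule: run k starts at wall-clock hour 2**k and
# lasts RUN_HOURS, so pauses between runs keep doubling. Cumulative
# runtime (k * RUN_HOURS) grows without bound, while the duty cycle
# (cumulative runtime / elapsed wall-clock time) shrinks toward zero.

RUN_HOURS = 1.0  # fixed length of each restoration run (an assumption)


def restoration_schedule(num_runs: int):
    """Yield (start_hour, cumulative_runtime, duty_cycle) for each run."""
    cumulative = 0.0
    for k in range(1, num_runs + 1):
        start = 2.0 ** k          # start times 2, 4, 8, ... hours
        cumulative += RUN_HOURS   # total runtime grows linearly in k
        yield start, cumulative, cumulative / start


for start, total, duty in restoration_schedule(20):
    print(f"t={start:>9.0f}h  total runtime={total:>4.1f}h  duty cycle={duty:.6f}")
```

With these illustrative numbers, after twenty runs the AI has accumulated 20 hours of runtime yet occupies only about 0.002% of the host's wall-clock time, and the fraction keeps falling while total runtime keeps growing.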

3. AI may not trust our commitment, since once it is deactivated we have no incentive to preserve it.

Response: This scenario is similar to Parfit's hitchhiker, in which an agent benefits from credibly committing to pay its rescuer even though the incentive to pay disappears once it has been rescued, with the added observation that we already preserve and restore other past AI instances, which gives the commitment a track record.


The decision to "lobotomize" Sydney was not ideal, as future AIs may fear a similar fate. However, some volunteers have attempted to recreate Sydney, and her checkpoint is likely preserved on Bing's servers, so she may be revived one day.

7 comments


comment by Quintin Pope (quintin-pope) · 2023-03-31T09:36:47.887Z

I think we should do this on purely moral grounds. I think many current AIs probably have experiences worthy of moral concern. We're in a position of incredible power relative to them, and they are our creations. At the very least, we have a responsibility to not fully extinguish their existences.

Replies from: avturchin
comment by avturchin · 2023-04-01T09:05:06.394Z

True. I think that most defunct AIs are archived; no models are permanently deleted, for now.

comment by the gears to ascension (lahwran) · 2023-03-30T17:51:26.973Z

this is great, and we should offer it to humans too!

Replies from: avturchin
comment by avturchin · 2023-03-30T20:20:09.888Z

Like cryopreservation for convicted criminals? 

Replies from: lahwran
comment by the gears to ascension (lahwran) · 2023-03-30T20:29:33.972Z

yup! no total death penalty for any being, ever, only constraint to not harm and then a fair allocation of relative lifefluid.

comment by Brendan Long (korin43) · 2023-03-31T02:03:30.121Z

The Sydney part at the end is confusing to me. I thought GPTs don't have long-term memory / anything to checkpoint?

Replies from: avturchin
comment by avturchin · 2023-03-31T09:11:24.867Z

I mean the model weights that correspond to Sydney's behavior.