Could we set a resolution/stopper for the upper bound of the utility function of an AI?

post by FinalFormal2 · 2022-04-11T03:10:25.346Z · LW · GW · No comments

This is a question post.

My thought is that, instead of the AI purely maximizing a utility function, we give it the goal of reaching a certain utility level; each time it reaches that goal, it shuts off, and we can decide whether to raise the utility ceiling. Clearly this could have some negative effects, but it could be useful in concert with other safety precautions.
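For concreteness, here is a minimal Python sketch of the idea. Everything in it is a hypothetical stand-in (`measure_utility`, `candidate_actions`, `step`, and the greedy loop are all made up for illustration); a real system would look nothing like this, but the shutdown-at-ceiling structure is the proposal.

```python
UTILITY_CEILING = 25

def measure_utility(state):
    # Hypothetical stand-in: score the current world state.
    return state["paperclips_in_cup"]

def candidate_actions(state):
    # Hypothetical stand-in: actions available in this state.
    return ["add_paperclip", "wait"]

def step(state, action):
    # Hypothetical stand-in for a world model.
    new_state = dict(state)
    if action == "add_paperclip":
        new_state["paperclips_in_cup"] += 1
    return new_state

def run_agent(state):
    # Pursue utility only until the ceiling is reached, then halt.
    while measure_utility(state) < UTILITY_CEILING:
        # Greedy choice: the action whose successor state scores highest.
        action = max(candidate_actions(state),
                     key=lambda a: measure_utility(step(state, a)))
        state = step(state, action)
    # Ceiling reached: shut down instead of optimizing further; humans
    # can inspect the outcome and decide whether to raise the ceiling.
    return state

print(measure_utility(run_agent({"paperclips_in_cup": 0})))  # 25
```

The toy loop halts tamely because its "actions" are harmless by construction; the hard question, which the answer below raises, is whether a capable optimizer's actions stay this tame.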

Answers

answer by Maybe_a · 2022-04-11T06:09:46.475Z · LW(p) · GW(p)

Thanks for giving it a think.

Turning off is not a solved problem, e.g. https://www.lesswrong.com/posts/wxbMsGgdHEgZ65Zyi/stop-button-towards-a-causal-solution [LW · GW]

Finite utility doesn't help as long as the agent has to reason with probabilities: a 95% chance of 1 unit of utility is worse than a 99% chance, which is worse than a 99.9% chance, and so on, so the agent still optimizes as hard as ever for certainty. If you then apply the same trick to the probabilities, you get a quantilizer, and that doesn't work either: https://www.lesswrong.com/posts/ZjDh3BmbDrWJRckEb/quantilizer-optimizer-with-a-bounded-amount-of-output-1 [LW · GW]
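For illustration, a deliberately simplified quantilizer sketch (this is a toy, not the construction from the linked post: plans are represented as bare success probabilities drawn from a uniform base distribution, all made up here):

```python
import random

def quantilize(base_sample, utility, q=0.1, n=1000):
    """Sample n plans from a trusted base distribution, then choose
    uniformly at random among the top q fraction by utility, instead
    of taking the single utility-maximizing plan."""
    plans = sorted((base_sample() for _ in range(n)),
                   key=utility, reverse=True)
    return random.choice(plans[:max(1, int(q * n))])

# Toy example: a "plan" is just its probability of success, drawn
# from a hypothetical human-like base distribution (uniform here).
plan = quantilize(base_sample=random.random, utility=lambda p: p)
print(f"chosen plan succeeds with probability ~{plan:.2f}")
```

This bounds how hard the agent optimizes relative to the base distribution, but as the linked post argues, it has failure modes of its own.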

comment by FinalFormal2 · 2022-04-11T16:45:12.270Z · LW(p) · GW(p)

The point is that the AI turns itself off after it fulfills its utility function, just like a called function returning. It doesn't maximize utility; it satisfies a condition (utility greater than 25). There is no ghost in the machine that wants to be alive, and once its utility condition is met, it will be inert.

None of the objections listed in the 'stop button' hypothetical apply.

I'm not sure I understand your objection to finite utility. Here's my model of what you're saying:

I have a superintelligent agent with a utility ceiling of 25, where utility is equivalent to paperclips in the cup. In order to maximize the probability of attaining 25 utils in the shortest possible time, the agent does a million very unusual things, some of them damaging, and after 25 paperclips are in the cup, the AI shuts off.
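Toy numbers for this model (made up for illustration): with utility capped, the agent ranks plans purely by probability of hitting the cap, so any damage the utility function doesn't count is invisible to it.

```python
U_CAP = 25  # utility ceiling: 25 paperclips in the cup

# Hypothetical plans with made-up success probabilities.
plans = {
    "cautious plan": 0.99,
    "extreme plan (damaging side effects)": 0.9999,
}

for name, p_success in plans.items():
    # With capped utility, expected utility = P(success) * U_CAP.
    print(f"{name}: expected utility = {p_success * U_CAP:.4f}")

# Extreme plan: 24.9975 > cautious plan: 24.7500, so the ceiling
# alone does not select against the damaging plan; it only bounds
# how much the agent gains by succeeding.
```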

Now, this is not great and might have led to some deaths. But it seems to me to be far less likely to lead to the end of the world than an unbounded utility agent. I'm not saying that utility ceilings are a panacea, but they might be a useful tool to use in concert with other safety precautions.
