[link] Simplifying the environment: a new convergent instrumental goal

post by Kaj_Sotala · 2016-04-22T06:48:33.718Z · LW · GW · Legacy · 4 comments

http://kajsotala.fi/2016/04/simplifying-the-environment-a-new-convergent-instrumental-goal/

Convergent instrumental goals (also known as basic AI drives) are goals that are useful for pursuing almost any other goal, and are thus likely to be pursued by any agent that is intelligent enough to understand why they’re useful. They are interesting because they may allow us to roughly predict the behavior of even AI systems that are much more intelligent than we are.

Instrumental goals are also a strong argument for why sufficiently advanced AI systems that were indifferent towards human values could be dangerous to humans, even if they weren’t actively malicious: an AI pursuing instrumental goals such as self-preservation or resource acquisition could come into conflict with human well-being. “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”

I’ve thought of a candidate for a new convergent instrumental drive: simplifying the environment to make it more predictable in a way that aligns with your goals.
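As a minimal toy sketch of the idea (my own illustration, not from the post), suppose we equate “predictability” with the number of distinct future states an action leaves possible; all names below are hypothetical. An agent that prefers whichever action leaves the fewest possible outcomes will favor simplifying interventions, such as paving an ecosystem over:

```python
# Toy model (hypothetical): "predictability" is measured as the number of
# distinct future states an action leaves possible; fewer possibilities
# means a simpler, more predictable environment.

def num_possible_outcomes(action, world_model):
    """Count the distinct future states that remain possible after `action`."""
    return len(world_model[action])

def most_simplifying_action(world_model):
    """Pick the action that leaves the fewest possible future states."""
    return min(world_model, key=lambda a: num_possible_outcomes(a, world_model))

# Hypothetical world model: each action maps to the set of states that
# could follow it. Destroying structure collapses many possibilities
# into one, so it scores best on predictability.
world_model = {
    "leave_ecosystem_alone": {f"state_{i}" for i in range(1000)},
    "pave_it_over": {"flat_concrete"},
}

print(most_simplifying_action(world_model))  # -> pave_it_over
```

The sketch obviously leaves out everything interesting about how such an agent weighs predictability against its other goals; it only shows why collapsing possibilities scores well on this kind of metric.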

4 comments


comment by gwern · 2016-04-22T14:49:07.851Z · LW(p) · GW(p)

All stable processes we shall predict. All unstable processes we shall control.

comment by Gunnar_Zarncke · 2016-04-24T20:34:41.054Z · LW(p) · GW(p)

Sounds like the Serenity Prayer for AI.