ACI #3: The Origin of Goals and Utility

post by Akira Pyinya · 2023-05-17T20:47:48.996Z · LW · GW · 0 comments

Contents

  Goals = Futures that resemble the precedent
  Utility = The probability of doing the right things
  FAQs
  Appendix
    Define History, World, and Precedent 
    Define Goals 
    Define Utility Function, Values, and Reward
    Proof of Theorem 1

Goals and utility are central ideas in the rational agent approach [? · GW] to AI, in which the meaning of intelligence is to achieve goals or, more precisely, to maximize expected utility.

What goal or utility function should an AI choose? This question is meaningless in the rational agent framework. It's like asking what program a computer should run: the answer is that a universal computer can run any well-written program.

However, the rational agent is an idealization of real-world intelligence. The ACI model argues that both pursuing a goal and maximizing utility are imprecise descriptions of intelligent behavior that tries to follow the precedent. The rational agent model is a quasistatic approximation of the ACI model.

Following this claim, we now derive goals and utility functions from the principles of ACI.

(The previous version had many errors, so I have rewritten this chapter.)

 

Goals = Futures that resemble the precedent

The right thing for an ACI agent is to follow the precedent, while the right thing for a rational agent is to achieve its goals. Since a goal is a desired future, we can speculate that the best goal-directed approximation of the ACI model takes as its goal a desired future that follows the precedent.

Consider an agent G that keeps doing the right things. We call its history of doing the right things the precedent. It is reasonable to assume that if G continues to behave in the same way, it is likely to keep doing the right things in the environments it has experienced.

If G is goal-directed, its goal should be a future that resembles the precedent as closely as possible. If the precedent is a sequence made of observations and actions, the goal should be the best possible continuation of that sequence. This brings us to the conclusion:

Setting goals for an agent is the same process as predicting the sequence of the precedent.

A formal description is given in the appendix at the end of the post.
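As a minimal illustrative sketch of this conclusion (not part of the formal model): assume some sequence model seq_prob that can score whole interaction sequences, and some finite set of candidate futures; both are placeholders standing in for the full construction.

```python
# Purely illustrative sketch: goal setting as sequence prediction.
# `seq_prob` stands in for whatever sequence model the agent uses
# (ideally Solomonoff Induction); here it is just an assumed callable.

def choose_goal(precedent, candidate_futures, seq_prob):
    """Return the candidate future that best continues the precedent.

    precedent         -- list of past (action, perception) pairs
    candidate_futures -- iterable of possible future sequences (same format)
    seq_prob(seq)     -- assumed: probability the model assigns to a sequence
    """
    def continuation_prob(future):
        # P(future | precedent) = P(precedent + future) / P(precedent)
        return seq_prob(precedent + future) / seq_prob(precedent)

    return max(candidate_futures, key=continuation_prob)
```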

A goal should have the following properties:

  1. An agent can have multiple, possibly infinitely many, goals, because there is a goal at every future moment. Compromising between multiple goals would be difficult.
  2. A goal may not always represent the right future even if it can be achieved. It has the highest probability of being the right future, but it can also turn out to be wrong.
  3. When a goal is achieved, the agent may or may not be notified whether things are actually right. The information about right and wrong can arrive in any form, including real-time feedback and delayed notification. For example, video game players may not learn whether they have won or lost until the end of a round. As a universal model of intelligence, ACI determines what is right without relying on any particular mechanism, be it natural selection or artificial control.

That's why an agent can't act on goals directly. It is more conventional to describe an agent's behavior with expected utility.

 

Utility = The probability of doing the right things

People prefer thinking in terms of goals but working with utilities.

Since a goal should be assigned the highest expected utility among all possible worlds, and a goal is defined as the future world with the highest probability of becoming a precedent (of doing the right things), it is reasonable to define expected utility as that probability.

In other words, the utility of a future world is its probability of being the continuation of the precedent sequence.
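In symbols, using the judgment function $J$ and precedent $p_t$ that are defined formally in the appendix, this reads

$$EU(w_m) = P\big(J(w_m) = 1 \mid J(p_t) = 1\big)$$

and the appendix shows how to compute this probability as a sequence-prediction problem via Solomonoff Induction.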

It is easy to check that ACI's definition of utility obeys the four axioms of VNM-rationality: completeness, transitivity, continuity, and independence. We can also define the total expected utility as a value (see the appendix).

 

FAQs

Q: OK, following the precedent might be right, but what if the agent lives in a carefree scenario, where anything it does is right?

A: If anything the agent does is right, it is more likely to follow simple policies than complex ones, so the precedent is most likely to be a simple sequence, such as repeating one action or simple reflexes to the environment. Conversely, if we find rather complicated structure in the precedent, it is highly unlikely that the agent is in a carefree situation.
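As a rough illustration of this point (not part of the ACI formalism), one could use an off-the-shelf compressor as a crude stand-in for description length and check whether a recorded precedent is more complex than a trivial constant sequence; the byte encoding and the margin below are arbitrary placeholder choices.

```python
import zlib

def complexity_proxy(data: bytes) -> int:
    """Crude stand-in for description length: zlib-compressed size in bytes."""
    return len(zlib.compress(data, 9))

def has_nontrivial_structure(precedent: bytes, margin: int = 16) -> bool:
    """Heuristic version of the FAQ answer: a 'carefree' agent's precedent
    should be nearly as simple as a constant sequence; if the precedent needs
    noticeably more description length, it probably encodes real policy
    structure. `margin` is an arbitrary threshold, not something ACI fixes."""
    trivial = bytes(len(precedent))  # constant sequence of the same length
    return complexity_proxy(precedent) > complexity_proxy(trivial) + margin
```

Compression length is only a very loose proxy for the algorithmic complexity the argument appeals to, but it conveys the idea: structure in the precedent is evidence that not everything the agent could have done was right.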

Q: With well-defined utility functions, should ACI maximize the expected utility like a rational agent or AIXI?

A: Not really. In relatively stable environments, rational agents can serve as acceptable approximations of ACI agents. However, they are likely to encounter the alignment problem when faced with unanticipated scenarios:  

  1. As soon as the precedent receives new data points, the utility function changes, making it unsuitable for straightforward optimization.
  2. Up to this point, we have been discussing ideal ACI agents that have unlimited computing power and memory and are able to achieve any possible future goal. However, real-world agents cannot perform Solomonoff Induction because of its inherent uncomputability. Only a constrained version of ACI, known as ACItl, can be implemented on practical computers. Whenever an ACItl agent's performance level improves, its approximation of the utility function changes.

 

Appendix

Define History, World, and Precedent 

We begin with formal definitions of history, world, and doing the right things.

There is an agent that interacts with an unknown environment in time cycles $t = 1, 2, 3, \ldots$. In cycle $t$, $x_t$ is the perception (input) from the environment, and $y_t$ is the action (output) of the agent.

Define the agent's interaction History as $h_t = y_1 x_1 y_2 x_2 \ldots y_t x_t$, and let the possible Worlds with history $h_t$ form the set $W_{h_t}$. Worlds are stratified by histories.

Let $H$ be the set of histories, and $W$ be the set of worlds. For any $h_t \in H$, there is a subset $W_{h_t} \subseteq W$ consisting of all worlds with history $h_t$ (Armstrong 2018 [LW · GW]).

Define the Judgment Function as a function from a world or a history to 1 or 0:

$$J : W \cup H \to \{0, 1\}$$

The history of doing the right things should have $J(h_t) = 1$.

We can define a Precedent as a history of doing the right things:

Definition 1 (Precedent). A precedent is a history $p_t$ such that

$$J(p_t) = 1$$

Any subset of a precedent is also a precedent.

For an ACI agent, the precedent contains all the information we have about what is right; thus it sets the standard for right things. A right future world, one that will become a precedent, should meet this standard.
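To make these definitions concrete, here is a minimal, purely illustrative encoding of histories, the judgment function, and the precedent check; the type names are mine, not part of the formal model, and the judgment function is assumed to be supplied from outside the agent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A history h_t is a sequence of (action y, perception x) pairs.
History = List[Tuple[str, str]]

# The judgment function J maps a history (or world) to 1 or 0.
Judgment = Callable[[History], int]

@dataclass
class ACIRecord:
    judge: Judgment  # J, supplied externally rather than chosen by the agent

    def is_precedent(self, history: History) -> bool:
        """Definition 1: a precedent is a history p_t with J(p_t) = 1."""
        return self.judge(history) == 1
```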

 

Define Goals 

A goal can be defined as a future world that has the highest likelihood of doing the right things, i.e. of becoming a precedent.

Definition 2 (Goal). At time $t$, given a precedent $p_t$, the goal for time $m$ ($m > t$) is a world $w^g_m$ such that

$$w^g_m = \arg\max_{w_m \in W_{p_t}} P\big(J(w_m) = 1 \mid J(p_t) = 1\big)$$

There is a simple and intuitive theorem about goals:

Theorem 1: An agent's goal given a precedent $p_t$ equals the most probable continuation of the precedent sequence.

The proof is given at the end of the post. With this theorem, the goal-calculation problem becomes a sequence prediction task. Following Hutter's AIXI [? · GW], ACI uses Solomonoff Induction [? · GW] as an all-purpose sequence prediction tool. Solomonoff Induction considers all possible hypotheses about a sequence and continuously updates the estimated probability of each hypothesis.
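Solomonoff Induction itself is uncomputable, so any real implementation has to approximate it. The sketch below is a tiny computable caricature of the "keep all hypotheses and update their probabilities" idea: a Bayesian mixture over an explicit, finite hypothesis class with $2^{-\text{complexity}}$ prior weights. The hypothesis class and the complexity assignment are placeholder assumptions, not the actual ACI construction.

```python
from typing import Callable, Dict, List

# Each hypothesis maps the history so far to a distribution over next symbols.
Hypothesis = Callable[[List[str]], Dict[str, float]]

class MixturePredictor:
    """Finite Bayesian mixture: a computable caricature of Solomonoff Induction."""

    def __init__(self, hypotheses: Dict[str, Hypothesis], complexity: Dict[str, int]):
        self.hypotheses = hypotheses
        # Prior weight ~ 2^(-complexity), standing in for 2^(-K(mu)).
        self.weights = {name: 2.0 ** -complexity[name] for name in hypotheses}
        self.history: List[str] = []

    def predict(self) -> Dict[str, float]:
        """Mixture probability of each possible next symbol."""
        total = sum(self.weights.values())
        mix: Dict[str, float] = {}
        for name, hyp in self.hypotheses.items():
            for symbol, p in hyp(self.history).items():
                mix[symbol] = mix.get(symbol, 0.0) + self.weights[name] * p / total
        return mix

    def update(self, symbol: str) -> None:
        """Bayesian update: reweight each hypothesis by how well it predicted."""
        for name, hyp in self.hypotheses.items():
            self.weights[name] *= hyp(self.history).get(symbol, 0.0)
        self.history.append(symbol)
```

An ACI-style agent would then score candidate futures by the probability such a mixture assigns to them and take the highest-scoring one as its goal (Definition 2).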

 

Define Utility Function, Values, and Reward

The Utility Function is defined as a function from worlds to real numbers:

$$U : W \to \mathbb{R}$$

Definition 3 (Expected Utility). The expected utility of any possible world $w_m$ is its probability of doing the right thing:

$$EU(w_m) = P\big(J(w_m) = 1 \mid J(p_t) = 1\big)$$

In other words, the utility of a world is its probability of doing the right thing, given that the known precedent did the right thing.

We will calculate the utility function using Solomonoff Induction in the last part of this article.

We can also define the total expected utility as a value:

Definition 4 (Value). The total expected utility, or value, for a policy $\pi$, history $h_t$, and precedent $p_t$ is

$$V^{\pi}(h_t, p_t) = \sum_{w_m \in W_{h_t}} P(w_m \mid \pi, h_t)\, EU(w_m)$$

where a policy $\pi$ for an agent is a map from histories to a probability distribution over actions, $\pi : H \to \Delta(Y)$.
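As a sketch of how Definition 4 could be approximated in practice (this is not part of the formal model): sample worlds by rolling the policy forward through some assumed environment model and average their expected utilities. The environment model, the horizon, and the sample count below are all placeholder assumptions.

```python
import random
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str]]                 # (action y, perception x) pairs
Policy = Callable[[History], Dict[str, float]]  # pi: history -> action distribution

def estimate_value(policy: Policy,
                   history: History,
                   env_step: Callable[[History, str], str],      # assumed environment model
                   expected_utility: Callable[[History], float], # EU(w) from Definition 3
                   horizon: int = 10,
                   samples: int = 100) -> float:
    """Monte Carlo estimate of V^pi(h_t, p_t): average the expected utility of
    worlds sampled by rolling the policy forward from the given history."""
    total = 0.0
    for _ in range(samples):
        world = list(history)
        for _ in range(horizon):
            dist = policy(world)
            action = random.choices(list(dist), weights=list(dist.values()))[0]
            perception = env_step(world, action)
            world.append((action, perception))
        total += expected_utility(world)
    return total / samples
```

The reward defined next would then be just the difference between two such estimates.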

And define the reward function as the difference between two total expected utilities (Armstrong 2018 [LW · GW]):

Definition 5 (Reward). The reward between two histories $h_t$ and $h_{t'}$ ($t < t'$) for a policy $\pi$ and precedent $p_t$ is

$$R^{\pi}(h_t, h_{t'}, p_t) = V^{\pi}(h_{t'}, p_t) - V^{\pi}(h_t, p_t)$$

 

Proof of Theorem 1

According to Solomonoff Induction, the probability that $w_m$ is the future of the precedent sequence $p_t$, taking all hypotheses into account, is

$$P(w_m \mid p_t) = \frac{M(p_t w_m)}{M(p_t)}$$

where $M$ is the precedent's prior distribution over all possible worlds when we take all the hypotheses into account:

$$M(x) = \sum_{\mu \in \mathcal{M}} Q^{-K(\mu)} \mu(x)$$

where $Q^{-K(\mu)}$ is the prior weight which assigns a probability to hypothesis $\mu$, $\mathcal{M}$ is the set of all recursive semi-measures, $Q$ is the number of symbols in the sequences' alphabet, and $K(\mu)$ is the length of the shortest program that computes $\mu$ (Legg 1996).

We cannot directly use this equation to predict the future precedent, because for an agent there might be more than one possible right choice, in contrast to a sequence that has only one continuation.

Let's consider a sequence $s$, in which a judgment variable $j$ is inserted into the history or world sequence every $n$ steps. For example:

$$s = y_1 x_1 \ldots y_n x_n \, j_1 \, y_{n+1} x_{n+1} \ldots y_{2n} x_{2n} \, j_2 \ldots$$

for $k = 1, 2, 3, \ldots$,

$$j_k = J(y_1 x_1 \ldots y_{kn} x_{kn})$$

If $J(w_m) = 1$ (so that all the $j$s from $j_1$ to $j_{m/n}$ equal 1), $w_m$ would be a world of doing the right thing. Write $s(w_m)$ for the sequence of $p_t w_m$ augmented with judgment variables all equal to 1, and $s(p_t)$ for the augmented precedent. Thus the problem of utility becomes a problem of sequence prediction; the utility of $w_m$ is the probability of $s(w_m)$ continuing $s(p_t)$:

$$EU(w_m) = P\big(J(w_m) = 1 \mid J(p_t) = 1\big) = \frac{M(s(w_m))}{M(s(p_t))}$$

 

Then we can try to prove that a goal given a precedent $p_t$ equals the most probable continuation of the precedent sequence.

Let $w_g$ be one of the worlds in $W_{p_t}$ that has the highest probability of being the continuation of the precedent sequence, which means:

$$M(p_t w_g) \geq M(p_t w_m) \quad \text{for all } w_m \in W_{p_t}$$

and because $P(w_m \mid p_t) = \dfrac{M(p_t w_m)}{M(p_t)}$,

$$P(w_g \mid p_t) \geq P(w_m \mid p_t) \quad \text{for all } w_m \in W_{p_t}$$

And we know all the $j$s in $p_t$ and $w_g$ equal 1; they could be the output of a program of fixed length and so have a fixed effect on the prior probability of a sequence, meaning that for some constant $c$:

$$M(s(w_g)) = c \, M(p_t w_g)$$

and

$$M(s(w_m)) = c \, M(p_t w_m)$$

Then we can have:

$$EU(w_g) = \frac{M(s(w_g))}{M(s(p_t))} \geq \frac{M(s(w_m))}{M(s(p_t))} = EU(w_m) \quad \text{for all } w_m \in W_{p_t}$$

$$w_g = \arg\max_{w_m \in W_{p_t}} P\big(J(w_m) = 1 \mid J(p_t) = 1\big)$$

which equals the definition of a goal (Definition 2). $\square$
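For concreteness, the augmented sequence used in this proof, with a judgment symbol inserted after every $n$ interaction steps, could be built as follows; the encoding is purely illustrative and the names are mine.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (action y, perception x) pairs

def augment_with_judgments(history: History,
                           judge: Callable[[History], int],
                           n: int = 1) -> List[str]:
    """Interleave judgment values into the interaction sequence: after every
    n (action, perception) pairs, append j_k = J(h_{kn})."""
    out: List[str] = []
    for k, (action, perception) in enumerate(history, start=1):
        out.extend([action, perception])
        if k % n == 0:
            out.append(str(judge(history[:k])))
    return out
```

Predicting a continuation in which every inserted judgment symbol is 1 is then the same task as estimating the probability that the future world keeps doing the right thing, which is what the utility definition above requires.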
