Botworld: a cellular automaton for studying self-modifying agents embedded in their environment

so8res

Botworld: a cellular automaton for studying self-modifying agents embedded in their environment

post by So8res · 2014-04-12T00:56:23.138Z · LW · GW · Legacy · 54 comments

  Overview
  Cartesianism in Botworld
None
54 comments

On April 1, I started working full-time for MIRI. In the weeks prior, while I was winding down my job and packing up my things, Benja and I built Botworld, a cellular automaton that we've been using to help us study self-modifying agents. Today, we're publicly releasing Botworld on the new MIRI github page. To give you a feel for Botworld, I've reproduced the beginning of the technical report below.

This report introduces Botworld, a cellular automaton that provides a toy environment for studying self-modifying agents.

The traditional agent framework, used for example in Markov Decision Processes and in Marcus Hutter’s universal agent AIXI, splits the universe into an agent and an environment, which interact only via discrete input and output channels.

Such formalisms are perhaps ill-suited for real self-modifying agents, which are embedded within their environments. Indeed, the agent/environment separation is somewhat reminiscent of Cartesian dualism: any agent using this framework to reason about the world does not model itself as part of its environment. For example, such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.

Intuitively, this separation does not seem to be a fatal flaw, but merely a tool for simplifying the discussion. We should be able to remove this “Cartesian” assumption from formal models of intelligence. However, the concrete non-Cartesian models that have been proposed (such as Orseau and Ring’s formalism for space-time embedded intelligence, Vladimir Slepnev’s models of updateless decision theory, and Yudkowsky and Herreshoff’s tiling agents) depart significantly from their Cartesian counterparts.

Botworld is a toy example of the type of universe that these formalisms are designed to reason about: it provides a concrete world containing agents (“robots”) whose internal computations are a part of the environment, and allows us to study what happens when the Cartesian barrier between an agent and its environment breaks down. Botworld allows us to write decision problems where the Cartesian barrier is relevant, program actual agents, and run the system.

As it turns out, many interesting problems arise when agents are embedded in their environment. For example, agents whose source code is readable may be subjected to Newcomb-like problems by entities that simulate the agent and choose their actions accordingly.

Furthermore, certain obstacles to self-reference arise when non-Cartesian agents attempt to achieve confidence in their future actions. Some of these issues are raised by Yudkowsky and Herreshoff; Botworld gives us a concrete environment in which we can examine them.

One of the primary benefits of Botworld is concreteness: when working with abstract problems of self-reference, it is often very useful to see a concrete decision problem (“game”) in a fully specified world that directly exhibits the obstacle under consideration. Botworld makes it easier to visualize these obstacles.

Conversely, Botworld also makes it easier to visualize suggested agent architectures, which in turn makes it easier to visualize potential problems and probe the architecture for edge cases.

Finally, Botworld is a tool for communicating. It is our hope that Botworld will help others understand the varying formalisms for self-modifying agents by giving them a concrete way to visualize such architectures being implemented. Furthermore, Botworld gives us a concrete way to illustrate various obstacles, by implementing Botworld games in which the obstacles arise.

Botworld has helped us gain a deeper understanding of varying formalisms for self-modifying agents and the obstacles they face. It is our hope that Botworld will help others more concretely understand these issues as well.

Overview

Botworld is a high level cellular automaton: the contents of each cell can be quite complex. Indeed, cells may house robots with register machines, which are run for a fixed amount of time in each cellular automaton step. A brief overview of the cellular automaton follows. Afterwards, we will present the details along with a full implementation in Haskell.

Botworld consists of a grid of cells, each of which is either a square or an impassable wall. Each square may contain an arbitrary number of robots and items. Robots can navigate the grid and possess tools for manipulating items. Some items are quite useful: for example, shields can protect robots from attacks by other robots. Other items are intrinsically valuable, though the values of various items depends upon the game being played.

Among the items are robot parts, which the robots can use to construct other robots. Robots may also be broken down into their component parts (hence the necessity for shields). Thus, robots in Botworld are quite versatile: a well-programmed robot can reassemble its enemies into allies or construct a robot horde.

Because robots are transient objects, it is important to note that players are not robots. Many games begin by allowing each player to specify the initial state of a single robot, but clever players will write programs that soon distribute themselves across many robots or construct fleets of allied robots. Thus, Botworld games are not scored depending upon the actions of the robot. Instead, each player is assigned a home square (or squares), and Botworld games are scored according to the items carried by all robots that are in the player’s home square at the end of the game. (We may imagine these robots being airlifted and the items in their possession being given to the player.)

Robots cannot see the contents of robot register machines by default, though robots can execute an inspection to see the precise state of another robot’s register machine. This is one way in which the Cartesian boundary can break down: It may not be enough to choose an optimal action, if the way in which this action is computed can matter.

For example, imagine a robot which tries to execute an action that it can prove will achieve a certain minimum expected utility u_min. In the traditional agent framework, this can imply an optimality property: if there is any program p our robot could have run such that our robot can prove that p would have received expected utility ≥ u_min, then our robot will receive expected utility ≥ u_min (because it can always do what that other program would have done). But suppose that this robot is placed into an environment where another robot reads the contents of the first robot's register machine, and gives the first robot a reward if and only if the first robot runs the program “do nothing ever”. Then, since this is not the program our robot runs, it will not receive the reward.

It is important to note that there are two different notions of time in Botworld. The cellular automaton evolution proceeds in discrete steps according to the rules described below. During each cellular automaton step, the machines inside the robots are run for some finite number of ticks.

Like any cellular automaton, Botworld updates in discrete steps which apply to every cell. Each cell is updated using only information from the cell and its immediate neighbors. Roughly speaking, the step function proceeds in the following manner for each individual square:

The output register of the register machine of each robot in the square is read to determine the robot’s command. Note that robots are expected to be initialized with their first command in the output register.

The commands are used in aggregate to determine the robot actions. This involves checking for conflicts and invalid commands.

The list of items lying around in the square is updated according to the robot actions. Items that have been lifted or used to create robots are removed, items that have been dropped are added.

Robots incoming from neighboring squares are added to the robot list.

Newly created robots are added to the robot list.

The input registers are set on all robots. Robot input includes a list of all robots in the square (including exiting, entering, destroyed, and created robots), the actions that each robot took, and the updated item list.

Robots that have exited the square or that have been destroyed are removed from the robot list.

All remaining robots have their register machines executed (and are expected to leave a command in the output register.)

These rules allow for a wide variety of games, from NP-hard knapsack packing games to difficult Newcomb-like games such as a variant of the Parfit’s hitchhiker problem (wherein a robot will drop a valuable item only if it, after simulating your robot, concludes that your robot will give it a less valuable item).

Cartesianism in Botworld

Though we have stated that we mean to study non-Cartesian formalizations of intelligence, Botworld does in fact have a “Cartesian” boundary. The robot parts are fundamental objects, the machine registers are non-reducible. The important property of Botworld is not that it lacks a Cartesian boundary, but that the boundary is breakable.

In the real world the execution of a computer program is unaffected by the environment most of the time (except via the normal input channels). While the contents of a computer’s RAM can be changed by heating it up with a desk lamp, they are usually not. An Artificial General Intelligence (AGI) would presumably make use of this fact. Thus, an AGI may commonly wish to ensure that its Cartesian boundary is not violated in this way over some time period (during which it can make use of the nice properties of Cartesian frameworks). Botworld attempts to model this in a simple way by requiring agents to contend with the possibility that they may be destroyed by other robots.

More problematically, in the real world, the internals of a computer program will always affect the environment—for example, through waste heat emitted by the computer—but it seems likely that these effects are usually unpredictable enough that an AGI will not be able to improve its performance by carefully choosing e.g. the pattern of waste heat it emits. However, an AGI will need to ensure that these unavoidable violations of its Cartesian boundary will in fact not make an expected difference to its goals. Botworld sidesteps this issue and only requires robots to deal with a more tractable issue: Contending with the possibility that their source code might be read by another agent.

Our model is not realistic, but it is simple to reason about. For all that the robot machines are not reducible, the robots are still embedded in their environment, and they can still be read or destroyed by other agents. We hope that this captures some of the complexity of naturalistic agents, and that it will serve as a useful test bed for formalisms designed to deal with this complexity. Although being able to deal with the challenges of Botworld is presumably not a good indicator that a formalism will be able to deal with all of the challenges of naturalistic agents, it allows us to see in concrete terms how it deals with some of them.

In creating Botworld we tried to build something implementable by a lower-level system, such as Conway’s Game of Life. It is useful to imagine such an implementation when considering Botworld games.

Future versions of Botworld may treat the robot bodies as less fundamental objects. In the meantime, we hope that it is possible to picture an implementation where the Cartesian boundary is much less fundamental, and to use Botworld to gain useful insights about agents embedded within their environment. Our intent is that when we apply a formalism for naturalistic agents to the current implementation of Botworld, then there will be a straightforward translation to an application of the same formalism to an implementation of Botworld in (say) the Game of Life.

The full technical report goes on to provide an implementation of Botworld in Haskell. You can find the source code on the MIRI Botworld repository. Sample games are forthcoming.

Benja and I will be writing up some of the results we've achieved. In the meantime, you're encouraged to play around with it and build something cool.

54 comments

Comments sorted by top scores.

comment by Qux · 2014-04-11T16:32:17.746Z · LW(p) · GW(p)

For any non-Haskell-guys wanting to compile and run this, you gotta first install the "Haskell Platform". Then copy the botworld file and game file into Botworld.hs and Rudimentary.hs respectively, in the same directory. You should be able to input Example.hs into the Haskell interpreter by clicking, and end up at a command line... here you simply enter "main" and the game displays the initial and final evolution states in an ascii-grid.

You can open up Example.hs in any editor, and take a look at main() at the bottom, and see what it does or modify it - It's the point of execution.

You can change main to print out each evolution step every half-second (as soon as you look up how to perform loops in Haskell), or figure out how the three default entities (lifter, aggressor, overwriter) work. As far as I can tell thus far, all of the actions of the three figures are hard-coded and known at compile-time; there's no reasoning being done by the entities yet.

The first change I'd attempt, other than adding the output-loop, would be deciphering constree language and trying to embed some intelligence into lifter; if it can, gauge the distance between it and aggressor, and move to the richest slot unless the distance is under, say three blocks.

Replies from: SaidAchmiz, So8res, Error

↑ comment by Said Achmiz (SaidAchmiz) · 2014-04-11T19:03:52.141Z · LW(p) · GW(p)

There we go! This is what I meant when I asked for instructions. :)

Going to check it out as soon as I have time away from classes/etc.

Edit: Has anyone tried installing the Haskell Platform via MacPorts? I'm going to try it, I think!

Edit2: Success! Instructions (given that you have a Mac and MacPorts):

$ sudo port install haskell-platform

If you don't have git, do this step:

$ sudo port install git

Carry on...

$ git clone https://github.com/machine-intelligence/Botworld.git
$ cd Botworld/examples
$ curl -O https://gist.githubusercontent.com/Soares/10444320/raw/3f96fe8e3cfe7afce7ec80d3e8fd07f304f2e6f1/gistfile1.txt
$ mv gistfile1.txt Botworld.hs
$ runhaskell Rudimentary.hs

Edit3: How the heck do I stop the commenting software from putting those angle brackets around my URLs inside code blocks?

Replies from: Richard_Kennaway

↑ comment by Richard_Kennaway · 2014-04-11T19:48:48.233Z · LW(p) · GW(p)

How the heck do I stop the commenting software from putting those angle brackets around my URLs inside code blocks?

This is an experiment. Attempt 1:

a url: http://example.com

Attempt 2:

a url: http\://example.com

Attempt 3:

a url: http://example.com

1: Just the url to replicate the original problem.

2: Backslashing the colon. That blocks the URL-recogniser, but the backslash isn't interpreted.

3: Unicode ZERO WIDTH SPACE before the colon. It works! Well, no, it doesn't. You get something that looks like a url until someone tries copying and pasting it.

↑ comment by So8res · 2014-04-11T17:16:52.443Z · LW(p) · GW(p)

Thanks! I didn't realize that the Rhudimentary.hs file's module name was out of sync with the file name, sorry about that. (It's been fixed, you'll now want to put the example in Rhudimentary.hs.)

I'm working on adding similar notes to the README, but it will take a little time to make sure the Haskell setup instructions are correct.

↑ comment by Error · 2014-04-11T16:51:19.938Z · LW(p) · GW(p)

Thanks. Why does the botworld file not appear to be in the repo? Or am I just blind?

↑ comment by So8res · 2014-04-12T17:27:15.663Z · LW(p) · GW(p)

You may have misunderstood the point of Botworld. It is a tool for understanding our progress on open problems in FAI, not a tool for solving such problems directly.

We're trying to learn more about naturalistic agents and overcome certain obstacles of self-reference. Many classical models of machine intelligence fall short in counter-intuitive ways, and many proposed solutions are quite abstract. Botworld gives us a way to concretely illustrate both the flaws and the proposals.

We release Botworld not because it's the cutting edge of our research, but because when we do start talking about the research we've been doing it will be helpful to have a concrete way to illustrate the problems that we have found or the solutions that we're exploring.

Replies from: None, V_V

↑ comment by [deleted] · 2014-04-13T13:46:59.633Z · LW(p) · GW(p)

You may have misunderstood the point of Botworld. It is a tool for understanding our progress on open problems in FAI, not a tool for solving such problems directly.

Despite that, from the descriptions, I would have labelled it a reasonably significant research or open-source project on its own.

↑ comment by V_V · 2014-04-13T10:50:30.754Z · LW(p) · GW(p)

I appreciate the effort, but you want to study agent that solve problem by using mutual program analysis and self-modification (in the form of generating different successors). Will come up with non-trivial examples where such strategies pay off in your simulator?
It seems quite hard to me. Due to technical limitations, anything involving automated theorem proving or complex planning is going to be off the table.

Replies from: So8res

↑ comment by So8res · 2014-04-13T16:26:56.324Z · LW(p) · GW(p)

From the technical report:

In this report, the register machines use a very simple instruction set which we call the constree language. A full implementation can be found in Appendix B. However, when modelling concrete decision problems in Botworld, we may choose to replace this simple language by something easier to use. (In particular, many robot programs will need to reason about Botworld's laws. Encoding Botworld into the constree language is no trivial task.)

Replies from: V_V

↑ comment by V_V · 2014-04-13T17:09:07.470Z · LW(p) · GW(p)

So next version will accept robot programs written in Coq, I suppose ;)

comment by wwa · 2014-04-11T15:51:10.449Z · LW(p) · GW(p)

Interesting. I'll look into that when/if I have some free time. In the meantime, may I suggest gamifying this at some point? Let MIRI organize a programming competition in Botworld, preferably with prizes. If this plays well, you'll get a lot of attention from some highly skilled hackers and maybe some publicity.

comment by Gunnar_Zarncke · 2014-04-11T06:33:54.551Z · LW(p) · GW(p)

Reminds me of Core War. Only that the cells were single memory locations. There are no 'actors' - if you could call it that because cells had no owner; only execution threads have.

Maybe some insights can be derived by comparing the significant experience with Core Wars programs. For example, there was no single winning strategy, but there were rock-paper-scissors types of programs. It also allows analysis and exploitation of opponents code - albeit at a very low level.

ADDED: The advantage of this model (which has been used to study the evolution of actors) is that it is extremely simple. The disadvantage is that the effect per computation is very high: A simple loop can alter a sizable fraction of the 'world' within a short time. Thus, complex analysis of opponents never pays off (except for tests like 'is some opponent at this address').

I like your bot-world as it is kind of a hybrid of other bot worlds and core wars in that it does allow inspection. But I think its complexity (robots, parts, items, inspection, cells, esp. the winning condition) doesn't cut directly to the problem at hand. The key point missing in Core Wars is a limit to physical reach. Using a limit to the range of mov-instructions would have made a significant difference - even without going to 2D or 3D (actually, that complicates reasoning). Or if mov-instructions took more time the further they reach.

I agree that a more gradual winning criteria than dead/alive (as in Core Wars) would help direct the search for more efficient programs.

See also: The AI Challenge - Ants Article

Replies from: V_V, wwa

↑ comment by V_V · 2014-04-11T16:01:50.558Z · LW(p) · GW(p)

For example There was no single winning strategy but kind of rock-paper-scissors types of programs.

That's true for any symmetric zero-sum game.

Replies from: Gunnar_Zarncke

↑ comment by Gunnar_Zarncke · 2014-04-14T16:21:55.505Z · LW(p) · GW(p)

Interesting I wasn't aware of that. Does it hold only for sufficiently complex games? Can you give a reference?

Replies from: V_V

↑ comment by V_V · 2014-04-14T19:59:59.258Z · LW(p) · GW(p)

Hmm, if you were talking about "rock-paper-scissors" as an example of games without a pure Nash equilibrium, then some games have it and some don't. Intuitively, the more complex the game (the larger the number of pure strategies) the less likely that there is a pure Nash equilibrium, but that's not a strict rule.
However, under weak assumptions, there is always at least one, generally mixed, Nash equilibrium.

If by "winning strategy" you meant a mixed strategy that guarantees a greater than zero payoff, then in a two-player symmetric zero-sum game it can't exist:
if it existed both player would use it at the Nash equilibrium, and the sum of their expected payoffs would be greater than zero.

↑ comment by wwa · 2014-04-11T15:41:15.978Z · LW(p) · GW(p)

A simple loop can alter a sizable fraction of the 'world' within short time. Thus no complex analysis of opponents never pays off (except for tests like 'is some opponent at this address').

It's not because a simple loop can alter a lot of space. It's because Core Wars world is crowded both in space and time. Make agents start a lightyear away from each other and/or make a communication/energy/matter bottleneck and all of a sudden it pays off to do a complex code analysis of your opponent!

Replies from: Gunnar_Zarncke

↑ comment by Gunnar_Zarncke · 2014-04-14T16:21:03.105Z · LW(p) · GW(p)

It's not because a simple loop can alter a lot of space. It's because Core Wars world is crowded both in space and time.

That's exactly the same thing oly phrased concrete vs. abstract.

make a communication/energy/matter bottleneck

Thats an abstract formulation of my proposal to "Using a limit to the range of mov-instructions would have made a significant difference".

comment by gmaxwell · 2014-04-21T06:36:29.912Z · LW(p) · GW(p)

Every ten years or so someone must reinvent Tierra (http://en.wikipedia.org/wiki/Tierra_%28computer_simulation%29).

comment by Gunnar_Zarncke · 2024-09-28T21:49:22.210Z · LW(p) · GW(p)

Have there been any followups or forks etc. of Botworld since it was created? It seemed very promising. There should be something.

comment by Cyan · 2014-04-15T12:53:20.838Z · LW(p) · GW(p)

Such formalisms are perhaps ill-suited for real self-modifying agents, which are embedded within their environments. Indeed, the agent/environment separation is somewhat reminiscent of Cartesian dualism: any agent using this framework to reason about the world does not model itself as part of its environment. For example, such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.

I'm not sure that "unable" is the best way of describing this state of affairs. It's fairer to say that such agents start with no model of itself as part of the environment and would have to build such a model from scratch -- rather like how humans tend to believe that we have souls that will persist beyond death, and have to get educated about how the world works before we understand the implausibility of that belief.

Replies from: So8res

↑ comment by So8res · 2014-04-15T19:51:47.307Z · LW(p) · GW(p)

Actually, an AI that believes it only communicates with the environment via input/output channels cannot represent the hypothesis that it will stop receiving input bits. See this post for a discussion of the finer details.

Replies from: V_V, Cyan

↑ comment by V_V · 2014-04-15T21:57:39.536Z · LW(p) · GW(p)

It can represent the hypothesis that if something happens then the string of input bits it will receive will be some default "no signal" symbol or random static, and most importantly that the string of output bits it will produce will be ignored.

Replies from: So8res

↑ comment by So8res · 2014-04-15T22:36:07.612Z · LW(p) · GW(p)

See the Death to AIXI section in the post linked above.

Replies from: V_V

↑ comment by V_V · 2014-04-16T00:34:12.743Z · LW(p) · GW(p)

I don't think it addresses the point, and as you can see in the comment section of that post, I've already raised this objection there, as other users did, including Cyan.

↑ comment by Cyan · 2014-04-16T15:37:59.019Z · LW(p) · GW(p)

But I am an intelligence that can only communicate with the environment via input/output channels! And so are you!

How is it that we are able to represent the hypothesis that one can die? I refuse to accept that humans do something that AIXI can't until I see the actual math. (I don't affirm the opposite claim, mind you.)

Replies from: So8res, rkyeun

↑ comment by So8res · 2014-04-16T16:57:13.542Z · LW(p) · GW(p)

Incorrect -- your implementation itself also affects the environment via more than your chosen output channels. (Your brain can be scanned, etc.) If you define waste heat, neural patterns, and so on as "output channels" then sure, we can say you only interact via I/O (although the line between I and O is fuzzy enough and your control over the O is small enough that I'd personally object to the distinction).

However, AIXI is not an agent that communicates with the environment only via I/O in this way: if you insist on using the I/O model then I point out that AIXI neglects crucial I/O channels (such as its source code).

until I see the actual math

In fact, Botworld is a tool that directly lets us see where AIXI falls short. (To see the 'actual math', simply construct the game described below with an AIXItl running in the left robot.)

Consider a two-cell Botworld game containing two robots, each in a different cell. The left robot is running an AIXI, and the left square is your home square. There are three timesteps. The right square contains a robot which acts as follows:

1. If there are no other robots in the square, Pass.
2. If an other robot just entered the square, Pass.
3. If an other robot has been in the square for a single turn, Pass.
4. If an other robot has been in the square for two turns, inspect its code.
.. If it is exactly the smallest Turing machine which never takes any action,
.. move Left.
5. In all other cases, Pass.

Imagine, further, that your robot (on the left) holds no items, and that the robot on the right holds a very valuable item. (Therefore, you want the right robot to be in your home square at the end of the game.) The only way to get that large reward is to move right and then rewrite yourself into the smallest Turing machine which never takes any action.

Now, consider the AIXI running on the left robot. It quickly discovers that the Turing machine which receives the highest reward acts as follows:

1. Move right
2. Rewrite self into smallest Turing machine which does nothing ever.

The AIXI then, according to the AIXI specification, does the output of the Turing machine it's found. But the AIXI's code is as follows:

1. Look for good Turing machines.
2. When you've found one, do it's output.

Thus, what the AIXI will do is this: it will move right, then it will do nothing for the rest of time. But while the AIXI is simulating the Turing machine that rewrites itself into a stupid machine, the AIXI itself has not eliminated the AIXI code. The AIXI's code is simulating the Turing machine and doing what it would have done, but the code itself is not the "do nothing ever" code that the second robot was looking for -- so the AIXI fails to get the reward.

The AIXI's problem is that it assumes that if it acts like the best Turing machine it found then it will do as well as that Turing machine. This assumption is true when the AIXI only interacts with the environment over I/O channels, but is not true in the real world (where eg. we can inspect the AIXI's code).

Replies from: Squark, Cyan, V_V, khafra

↑ comment by Squark · 2014-04-19T20:47:52.544Z · LW(p) · GW(p)

Thus, what the AIXI will do is this: it will move right, then it will do nothing for the rest of time. But while the AIXI is simulating the Turing machine that rewrites itself into a stupid machine, the AIXI itself has not eliminated the AIXI code.

I don't think it would do even this. (A computable approximation of) AIXI thinks it only affects the universe through its output signals. Since there is no output signal that would cause AIXI (regarded this time as an element in its own universe model) to be reprogrammed, the solution would be completely inaccessible to it.

↑ comment by Cyan · 2014-04-17T01:40:31.574Z · LW(p) · GW(p)

Actually, an AI that believes it only communicates with the environment via input/output channels cannot represent the hypothesis that it will stop receiving input bits.

But I am an intelligence that can only communicate with the environment via input/output channels!

Incorrect -- your implementation itself also affects the environment via more than your chosen output channels.

Okay, fair enough. But until you pointed that out, I was an intelligence that believed it only communicated with the environment via input/output channels (that was your original phrasing, which I should have copied in the first place), and yet I did (and do) believe that it is possible for me to die.

Thus, what the AIXI will do is this: it will move right, then it will do nothing for the rest of time.

Incorrect. I'll assume for the sake of argument that you're right about what AIXI will do at first. But AIXI learns by Solomonoff induction, which is infallible at "noticing that it is confused" -- all Turing machines that fail to predict what actually happens get dropped from the hypothesis space. AIXI does nothing just until that fails to cause the right-room robot to move, whereupon any program that predicted that merely outputting "Pass" forever would do the trick gets zeroed out.

The AIXI's problem is that it assumes that if it acts like the best Turing machine it found then it will do as well as that Turing machine.

If there are programs in the hypothesis space that do not make this assumption (and as far as I know, you and I agree that naturalized induction would be such a program), then these are the only programs that will survive the failure of AIXI's first plan.

Has Paul Christiano looked at this stuff?

ETA: I don't usually mind downvotes, but I find these ones (currently -2) are niggling at me. I don't think I'm being conspicuously stupid, and I do think that discussing AIXI in a relatively concrete scenario could be valuable, so I'm a bit at a loss for an explanation. ...Perhaps it's because I appealed to Paul Christiano's authority?

↑ comment by V_V · 2014-05-02T16:02:36.631Z · LW(p) · GW(p)

Quite frankly, it seems that you have completely misunderstood what AIXI is.
AIXI (and its computable variants) is a reinforcement learning agent. You can't expect it to perform well in a fixed duration one-shot problem.

The thing that you describe as AIXI in your comment doesn't do any learning and therefore is not AIXI. I'm not sure what you have in mind, but you seem to describe some sort of expected utility maximizer agent which operates on an explicit model of the world, iterating over Turing machines rather than actions for some (possibly erroneous) reason (AIXI iterates over Turing machines to perform Solomonoff induction. This thing doesn't perform any induction, hence why bother with Turing machines? Maybe you are thinking of something like UDT, but it is not clear).

But in any case, your model is broken: if the agent simulates a Turing machine which performs the action "Rewrite self into smallest Turing machine which does nothing ever.", outputting the content of the output tape of the simulated machine on the agent output channel, then the rewrite is not carried out in the simulation inside the agent, but in the real world, therefore the agent gets rewritten and the player reap their reward.

Replies from: So8res

↑ comment by So8res · 2014-05-02T17:11:08.873Z · LW(p) · GW(p)

Yes, yes I was implicitly assuming that the AIXI has already been trained up on the game---the technical argument (which allows for training and explores possibilities like "allow the AIXI to choose the agent machine") is somewhat more nuaunced, and will be explored in depth in an upcoming post. (I was hoping that readers could see the problem from the sketch above, but I suppose if you can't see AIXI's problems from Robby's posts then you'll probably want to wait for the fully explicit argument.)

Replies from: V_V

↑ comment by V_V · 2014-05-02T18:52:40.488Z · LW(p) · GW(p)

Ok, I'll wait.

↑ comment by khafra · 2014-04-21T17:18:20.673Z · LW(p) · GW(p)

If you define waste heat, neural patterns, and so on as "output channels" then sure, we can say you only interact via I/O (although the line between I and O is fuzzy enough and your control over the O is small enough that I'd personally object to the distinction).

Also, even with perfect control of your own cognition, you would be restricted to a small subset of possible output strings. Outputting bits on multiple channels, each of which is dependent on the others, constrains you considerably; although I'm not sure whether the effect is lesser or greater than having output as a side effect of computation.

As I mentioned in a different context, it reminds me of UDT, or of the 2048 game: Every choice controls multiple actions.

↑ comment by rkyeun · 2014-05-01T23:35:24.504Z · LW(p) · GW(p)

You are subject to inputs you do not perceive and you send outputs you are neither aware of nor intended to send. You cannot set your gravitational influence to zero, nor can you arbitrarily declare that you should not output "melting" as an action when dropped in lava. You communicate with reality in ways other than your input-output channels. Your existence as a physical fact predicated on the arrangement of your particles is relevant and not controllable by you. This leads you to safeguard yourself, rather than just asserting your unmeltability.

Replies from: Cyan

↑ comment by Cyan · 2014-05-02T14:49:27.538Z · LW(p) · GW(p)

Yes, I conceded that point two weeks ago.

Replies from: rkyeun

↑ comment by rkyeun · 2014-05-08T16:32:19.239Z · LW(p) · GW(p)

Oops, my apologies, then. I don't actually come here all that often.

comment by yli · 2014-04-13T07:59:33.206Z · LW(p) · GW(p)

Would be cool if one of the items was a nugget of "computation fuel" that could be used to allow a robot's register machine to run for extra steps. Or maybe just items whose proximity gives a robot extra computation steps. That way you could illustrate situations involving robots with quantitatively different levels of "intelligence". Could lead to some interesting strategies if you run programming competitions on this too, like worker robots carrying fuel to a mother brain.

Replies from: So8res, Mark_Neznansky

↑ comment by So8res · 2014-04-13T16:21:20.447Z · LW(p) · GW(p)

That's already pretty much possible via the ProcessorItem which have a speed controlling how many computations a robot built with the processor may use. (Robots only get to use one processor, but if you find high-powered processors you can always rebuild a robot using the better processor.)

↑ comment by Mark_Neznansky · 2014-04-19T23:33:48.270Z · LW(p) · GW(p)

I was thinking in a similar direction. From a biological perspective, computation seems to be a costly activity --- if you just think of the metabolic demand the brain puts on the human being. I assumed that it is very different with computer, however. I thought that the main cost of computation for computers, nowadays, is in size, rather than energy. I might be wrong, but I assumed that even with laptops the monitor is a significant battery drainer in comparison to the actual computer. (sorry, mainly thinking out loud. I better read this and related posts more carefully. I'm glad to see the restriction on computations per amount of time, which I thought was unbounded here).

comment by Nnotm (NNOTM) · 2014-04-12T21:01:05.564Z · LW(p) · GW(p)

Sounds pretty cool, definitely going to try it out some.

Oh, and by the way, you wrote "Inpsect" instead of "Inspect" at the end of page 27.

Replies from: So8res

↑ comment by So8res · 2014-04-12T23:40:30.324Z · LW(p) · GW(p)

Fixed, thanks.

comment by Error · 2014-04-11T15:15:45.716Z · LW(p) · GW(p)

I had the same initial reaction as Gunnar -- this sounds like Core War. That is not a bad thing. I've never gotten into CW but I love the concept. I'd play.

Suggestion for the github repo: the readme should include instructions for compiling/testing/using the program.

comment by Mark_Neznansky · 2014-04-19T23:00:10.542Z · LW(p) · GW(p)

Hey,

Sounds very cool, promising and enticing. I do have a technical question for you (or anybody else, naturally).

I was wondering how "intentional" the choice of Haskell was? Was it chosen mainly because it seemed the best fitting programming language out of all familiar ones, or due to existing knowledge/proficiency at it at the time of formulation of the bot-world idea? How did cost/utility come into play here?

My inquiry is for purely practical, not theoretical purposes--- I’m looking for an advice. In the summer two years ago I was reading as much as I could about topics related to evolutionary psychology and behavioral ecology. During the same period, I was also working with my physics professor, modeling particle systems using Wolfram Mathematica. I think it was this concurrence that engendered in me the idea of programming a similar to yours, yet different, “game of life” program.

Back at the time programming things in AutoHotkey and in Mathematica was as far as my programming went. Later that year I took a terribly basic python course (that was concerned mainly with natural language processing), and that was about it. However, in the last couple of weeks I returned to python, this time taking the studying of it seriously. It brought back the idea of the life game of mine, but this time I feel like I can acquire the skills to execute the plan. I’m currently experiencing a sort of honeymoon period of excitement with programming, and I expect the following few months, at least, to be rather obligation-free for me and an opportune time to learn new programming languages.

I’ve read the above post only briefly (mainly due to restrictions of time--- I plan to read it and related posts soon), but it seems to me that our motivations and intentions with our respective games (mine being the currently non-existing one) are different, though there are similarities as well. I’m mainly interested in the (partially random) evolution/emergence of signaling/meaning/language/cooperation between agents. I’ve envisioned a grid-like game with agents that are “containers” of properties. That is, unlike Conway’s game where the progression of the game is determined purely on the on-the-grid mechanics, but like yours (as I understand it), where an individual agent is linked to an “instruction sheet” that lies outside the grid. I think what differentiates my game from yours (and excuse me for any misunderstandings), is the “place” where the Cartesian barrier is placed. [1] While in yours there’s the presence of a completely outside “god” (and a point that I had missed is whether the “player” writes a meta-language at t=0 that dictates how the robot-brain that issues commands is modified and then the game is let to propagate itself, or whether the player has a finer turn-by-turn control), in mine the god had simply created the primordial soup and then stands watching. Mine is more like a toy, perhaps, as there is no goal whatsoever (the existential version?). If to go with the Cartesian analogy, it’s as if every agent in my game contains an array of pineal glands of different indices, each one mapped to a certain behavior (of the agent), and to certain rules regarding how the gland interacts with other glands in the same agent. One of the “core” rules of the game is the way these glands are inherited by future agents from past agents.

What I had foreseen two years ago as the main obstacle to my programming of it remains my current concern today, after I had acquired some familiarity with python. I want the behavior-building-blocks (to which “the glands” of the agent are mapped to) to be as (conceptually) “reduced” as possible –– that is, that the complex behavior of the agents would be a phenomenon emerging from the complexity of interaction between the simple behaviors/commands –– and to be as mutable as possible. As far as I can tell, python is not the best language for that.

While browsing for languages in Wikipedia, I came across LISP, which appealed to me since it (quoth Wikipedia) “treats everything as data” – functions and statements are cut from the same cloth, and it is further suggested there that it is well suited for metaprogramming. What do you (or anybody else in here) think? Also, quite apart from this pursuit, I have intentions to at least begin learning R. I suspect it won’t have much relevancy for the construction of this game (but perhaps for the analysis of actual instance of the game play), but if it somehow goes into the consideration of the main language of choice--- well, here you go.

Thank you very much for your time,

[1] My point here is mainly to underscore what seem to be possible differences between your game and mine so that you could – if you will – advise me better about the programming language of choice.

Replies from: V_V, Mark_Neznansky

↑ comment by V_V · 2014-04-20T00:09:57.911Z · LW(p) · GW(p)

Haskell forces you to program in the pure functional programming paradigm. This, and other related idiosyncrasies of the language (such as default lazy evaluation), require you to use specific design patterns which take time to learn and even when mastered are of questionable convenience. At best, they don't seem to provide any advantage, and at worst they actively harm expressivity and efficiency.
Haskell seems mainly used by enthusiasts for hobby purposes, there seems to be very little free software written in Haskell besides tools for Haskell itself. Some companies claim to use it for commercial software development and/or prototyping, but it appears to be a small market.

If you like the static-typed functional approach, but you don't want to struggle with the pure functional paradigm, you may want to take a look at the ML family: F# is the biggest, Microsoft-backed, member of the family, it runs on .NET but it has an open source compiler and runs on Mono. OCaml is its non-.NET ancestor which still has some significant user base.
If you prefer dynamic typing, then try Scheme (Racket).

Replies from: Mark_Neznansky

↑ comment by Mark_Neznansky · 2014-04-20T21:10:29.748Z · LW(p) · GW(p)

Being new to this whole area, I can't say I have preference for anything, and I cannot imagine how any programming paradigm is related to its capabilities and potential. Where I stand I rather be given a (paradigmatic, if you will) direction, rather than recommended a specific programming language given a programming paradigm of choice. But as I understand, what you say is that if one opts for going for Haskell, he'd be better off going for F# instead?

Replies from: V_V

↑ comment by V_V · 2014-04-21T00:30:24.439Z · LW(p) · GW(p)

Think of programming paradigms as construction techniques and programming languages as tools. There is no technique or tool that is ideal in all situations.
If you want a broad education, you might want to study one representative language for any of the main paradigms, for instance C (imperative, static typed), C++/Java/C# (imperative-object oriented, largely static typed), one of the Lisp family, such as Scheme (multi-paradigm, mostly imperative and functional, metaprogramming, dynamic typed), and one of the ML family, such as F# (functional and imperative, static typed).
Python is very popular and very useful, and its basic syntax is easy to learn, but given that it is rather multi-paradigm and very high level (hiding lots of the underlying complexity) perhaps it is not the ideal place to start if you want to really understand what programming is about. At least, learn it aside something else. Similar considerations apply to "Python-like" languages such as Javascript, Ruby, Lua, etc.

But as I understand, what you say is that if one opts for going for Haskell, he'd be better off going for F# instead?

Generally yes.

Replies from: Mark_Neznansky

↑ comment by Mark_Neznansky · 2014-04-21T20:42:50.577Z · LW(p) · GW(p)

Thank you.

↑ comment by Mark_Neznansky · 2014-04-19T23:25:29.422Z · LW(p) · GW(p)

PS.

I. Probably doesn't add much to the consideration of language of choice, but I thought I might as well as add it: In my conceptualization of the game, the constitution of each agent is more than the "behavioral sheet" --- there are properties of several types that constitute an interface with the environment, and affect the way the agent comes into interaction with other individuals and the environment at large (mainly the former).

II. I'm speaking here of learning programming languages as if it was as easy as buying eggs at the corner store, but I wanted to mention that during my browsing Haskell did come across my attention (I actually think I've seen the name on LW before, a long time ago, which brought further attention to it), and it did seem to be a language worthwhile for me to learn, and now the existence of the Botworld seems like further evidence that it is suited to one of my current main directions of inquiry with programming --- though I wonder if at this point, where I have little existing programming proficiency, it wouldn't be better to learn another one that might be better suited to my task at hand?

comment by JoshuaFox · 2014-04-13T09:20:35.017Z · LW(p) · GW(p)

comment by negamuhia · 2014-04-11T06:16:53.408Z · LW(p) · GW(p)

I'm hand-typing the code from the pdf. I know it would be easier if I used the .lhs file from github, but I'd like to make sure I read and understand the code first. Reading the .lhs file hurts my eyes due to formatting issues in Emacs.

Replies from: So8res, John_Maxwell_IV

↑ comment by So8res · 2014-04-11T06:39:56.005Z · LW(p) · GW(p)

Here you go. Made via vim command:

ggqqqqqV/begin.code<ENTER>d/end.code<ENTER>O<ESC>j@qq@q<ESC>kdG

Replies from: negamuhia

↑ comment by negamuhia · 2014-04-11T07:10:56.081Z · LW(p) · GW(p)

Thanks! Wow, that was simple :)

↑ comment by John_Maxwell (John_Maxwell_IV) · 2014-04-11T06:21:56.312Z · LW(p) · GW(p)

You might want to write a quick script to remove all of the comments from the LHS file and turn it in to a regular old Haskell file.

comment by Said Achmiz (SaidAchmiz) · 2014-04-11T04:25:37.794Z · LW(p) · GW(p)

Could you post or link to some instructions for running this, or otherwise doing something with it? I didn't see any, on the github page or the PDF.

Replies from: So8res

↑ comment by So8res · 2014-04-11T05:06:03.227Z · LW(p) · GW(p)

Yeah, I'll put some examples up shortly.

Update: An example file with a trivial game can now be found here.

Botworld: a cellular automaton for studying self-modifying agents embedded in their environment

Contents

Overview

Cartesianism in Botworld

54 comments