What would flourishing look like in Conway's Game of Life?
post by sudhanshu_kasewa · 2020-05-12T11:22:31.089Z · LW · GW · 2 comments
This is a question post.
Contents
Answers 14 Raemon 6 Daniel Kokotajlo 5 Dagon 4 Michaël Trazzi 1 Cato the Eider None 2 comments
I'm thinking of designing a reinforcement learning environment based on Conway's Game of Life (GoL). In it, at every timestep, an agent can change the state of some cells.
As is the case with most interesting RL problems, agent behaviour would be determined by the reward function.
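As a rough illustration of the setup described above, here is a minimal sketch of such an environment. The names `life_step` and `LifeEnv`, the wrap-around (toroidal) board, and the `max_flips` limit are all assumptions made for this example; the post does not specify them.

```python
from typing import Callable, Iterable, List, Tuple

Grid = List[List[int]]


def life_step(grid: Grid) -> Grid:
    """Advance a toroidal (wrap-around) board by one generation."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Standard rules: birth on 3 neighbours, survival on 2 or 3.
            if neighbours == 3 or (grid[r][c] == 1 and neighbours == 2):
                new[r][c] = 1
    return new


class LifeEnv:
    """Hypothetical environment: the agent flips cells, then the board evolves."""

    def __init__(self, rows: int, cols: int, max_flips: int = 1) -> None:
        self.grid: Grid = [[0] * cols for _ in range(rows)]
        self.max_flips = max_flips

    def step(
        self,
        flips: Iterable[Tuple[int, int]],
        reward_fn: Callable[[Grid], float],
    ) -> Tuple[Grid, float]:
        # Apply at most max_flips of the requested flips, then run one generation.
        for r, c in list(flips)[: self.max_flips]:
            self.grid[r][c] ^= 1
        self.grid = life_step(self.grid)
        return self.grid, reward_fn(self.grid)
```

The reward function is passed in as a parameter so that different candidate rewards can be swapped without touching the dynamics. The board should be at least 3×3, since smaller wrap-around boards count the same neighbour more than once.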
In this scenario, I see some issues with simple reward functions:
1) Total life:
Something like this glider gun would represent a technically correct unbounded score, with a stream of lonely travellers in an interstellar abyss. Say we want to stick to a finite memory.
2) Highest density:
This is generation 50 of the Max spacefiller. The agent might find some way of maintaining the striped pattern from the middle. Uninspiring, to say the least; like
***spoiler alert*** being batteries in The Matrix.
Other reward functions might consider variation over time.
3) Some function with a penalty for static life:
Consider Karel's p177:
This oscillator has a period of 177 time steps. There's another one with p312, but I couldn't find a good-enough gif of it. Such patterns would likely game this reward specification too.
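For concreteness, the three candidate rewards above might be written roughly as follows. This is only a sketch: the function names are hypothetical, and the exact form of the static-life penalty (subtracting a fixed amount for every cell that is alive in two consecutive generations) is an assumption, since the post only says "some function with a penalty for static life".

```python
from typing import List

Grid = List[List[int]]


def total_life(grid: Grid) -> int:
    """Reward 1: the number of live cells on the board."""
    return sum(sum(row) for row in grid)


def density(grid: Grid) -> float:
    """Reward 2: the fraction of the board that is alive."""
    return total_life(grid) / (len(grid) * len(grid[0]))


def static_penalised(prev: Grid, curr: Grid, penalty: float = 1.0) -> float:
    """Reward 3 (assumed form): live cells, minus a penalty for each cell
    that was alive in the previous generation and is still alive now."""
    unchanged_live = sum(
        p & c
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )
    return total_life(curr) - penalty * unchanged_live
```

With `penalty=1.0`, a still life such as a 2×2 block earns nothing under the third reward, while a long-period oscillator keeps most of its score, which is the gaming concern raised above.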
I've thought of a few other simple functions, which all look flawed in some obvious ways.
That's not to say there is some ungameable reward function. But I wonder, if each cell symbolically represented some small unit of sentience (say a single person, or a family, or a planet) what would flourishing look like?
Answers
Just wanted to say this question pinged my "huh, this is a neat question" detector. (I consider myself pretty confused about the broader topic so don't have good answers, but found your question neat because it tackled a problem at a level where it still personally feels meaty to me)
Doublechecking my understanding of your "implied question" – it sounds like what you want is a reward function that is simple, but somehow analogous to the complexity of human value? And it sounds like maybe the underspecified bit is "you, as a human, have some vague notion that some sorts of value-generation are 'cheating'", and your true goal is "the most interesting outcome that doesn't feel like Somehow Cheating to me?"
Some thoughts I have at the moment:
- I think the actual direct comparison between Game of Life and Real Life is "one cell == an atom" (or some small physical particle), rather than one cell representing a bit of sentience, or even a single biological cell. I'd expect truly analogous "value" in Game of Life to look less like "stuff happening" and more like "particular types of patterns are more common, without being too repetitive" (i.e. in real life, I don't optimize for "atoms moving around", I optimize for something more like "larger patterns of atoms doing particular things").
- Assuming "one cell = minimum viable bit of value-weight", there are still some questions I'd struggle with here that seem analogous to my philosophical confusions about human value. How 'good' is it to have a repeating loop of, say, a billion flourishing human lives? Is it better than a billion human lives that happens exactly once and ends?
- To some degree, I think "moral value" (or, "value") in real life is about the process of solving "what is valuable and how do I get it?", and gaining more value depends somewhat on that question being "unsolved". I'm not sure, if I knew exactly what value was with infinite compute, that there would be as much point to actually having it.
If I'm taking your current (implied?) assumptions of "one cell == minimum viable value-weight" and "the goal is to have as simple a function as you can that sort of feels like it's getting at something analogous to human value", I think the answer is something like "maximize the number of unique states that happen before things start looping" (maybe with a finite board, so that glider guns can't "game" the system by generating technically infinite "variety").
In this case it might mean that the system optimizes either for true continuous novelty, or the longest possible loop?
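To make "unique states before things start looping" concrete: on a finite board left to evolve without intervention, the dynamics are deterministic, so the first repeated state marks the start of a loop. The sketch below counts the distinct states seen up to that point and reports the loop length. The function names, the set-of-live-cells representation, and the wrap-around board are assumptions for illustration, not something specified in the thread.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Optional, Tuple

Cell = Tuple[int, int]


def step(live: FrozenSet[Cell], rows: int, cols: int) -> FrozenSet[Cell]:
    """One generation on a toroidal board, with the board stored as live cells."""
    # Count, for every cell, how many live neighbours it has.
    counts = Counter(
        ((r + dr) % rows, (c + dc) % cols)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    return frozenset(
        cell for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    )


def states_before_loop(
    live: Iterable[Cell], rows: int, cols: int, max_steps: int = 10_000
) -> Tuple[int, Optional[int]]:
    """Return (number of distinct states seen, loop period).

    The period is None if no repeat was found within max_steps.
    """
    first_seen = {}
    state = frozenset(live)
    for t in range(max_steps):
        if state in first_seen:
            return len(first_seen), t - first_seen[state]
        first_seen[state] = t
        state = step(state, rows, cols)
    return len(first_seen), None
```

Under this measure a blinker scores 2 distinct states with a loop of length 2, and any still life scores 1. Once an agent is allowed to flip cells the trajectory is no longer determined by the board alone, so this function only describes the uncontrolled case.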
I do suspect that figuring out which of your assumptions are "valid" is an important part of the question here.
↑ comment by sudhanshu_kasewa · 2020-05-14T16:29:47.702Z · LW(p) · GW(p)
Thanks for the detailed response. Meta: It feels good to receive a signal that this was a 'neat question', or in general, a positive-seeming contribution to LW. I have several unexpressed thoughts from fear of not actually creating value for the community.
it sounds like what you want is a reward function that is simple, but somehow analogous to the complexity of human value? And it sounds like maybe the underspecified bit is "you, as a human, have some vague notion that some sorts of value-generation are 'cheating'", and your true goal is "the most interesting outcome that doesn't feel like Somehow Cheating to me?"
This is about correct. A secondary reason for simplicity is to attempt to be computationally efficient (for the environment that generates the reward).
"one cell == an atom"
I can see that being the case, but, again, computational tractability. Actual interesting structures in GoL can be incredibly massive, for example, this Tetris Processor (2,940,928 x 10,295,296 cells). Maybe there's some middle ground between truly fascinating GoL patterns made from atoms and my cell-as-a-planet level abstraction, as suggested by Daniel Kokotajlo in another comment.
How 'good' is it to have a repeating loop of, say, a billion flourishing human lives? Is it better than a billion human lives that happens exactly once and ends?
Wouldn't most argue that, in general, more life is better than less life? (But I see some of my hidden assumptions here, such as "the lives we're talking about here are qualitatively similar, e.g. the repeating life doesn't feel trapped/irrelevant/futile because it is aware that it is repeating".)
I think "moral value" (or, "value") in real life is about the process of solving "what is valuable and how do I get it?"
I don't disagree, but I also think this is sort of outside the scope of finite-space cellular automata.
In this case it might mean that the system optimizes either for true continuous novelty, or the longest possible loop?
Given the constraints of CA, I'm mostly in agreement with this suggestion. Thanks.
I do suspect that figuring out which of your assumptions are "valid" is an important part of the question here.
Yes, I agree. Concretely, to me it looks like 'if I saw X happening in GoL, and I imagine being a sentient being (at some scale, TBD) in that world (well, with my human values), then would I want to live in it?', and translating that into some rules that promote or disincentivise X.
I do think taking this approach is broadly difficult, though. Perhaps it's worth getting a v0.1 out with reward being tied to instantiations of novel states to begin with, and then seeing whether to build on that or try a new approach.
Replies from: Raemon
Here's an inelegant answer that might work as a good proxy:
Hand-code a list of substructures that you identify as "good:" Gliders, glider guns, maybe some interesting shapes, etc. Make this list at least 10 items long.
Score is the product, over substructure types, of (total number × duration) for each type. Use a large but finite grid, with a large but finite time.
I think this answer is better than it sounds. I think that value is complex; flourishing is not something that should be possible to specify simply in the real world, so why should it be possible in GoL? Moreover, flourishing in the real world involves a combination of order and disorder: Macro-structures like people and societies and spaceships (order) but they have to be different from each other and combined and recombined in lots of new ways (disorder).
I think one criticism of this answer is that it too could potentially be "gamed" by some big grid of densely packed substructures from the list. However, in practice I don't think this will be a problem because I doubt it will be possible to construct such a grid. Nevertheless I think thinking more about this question for an hour would produce a better answer.
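A minimal sketch of this product score follows, assuming some pattern detector (not shown, and not specified in the answer) already reports how many instances of each structure type are present at each timestep. The name `product_score` and the shape of `history` are hypothetical. Summing a type's count over all timesteps is used here as one reading of "number × duration".

```python
import math
from typing import Dict, List, Sequence


def product_score(
    history: List[Dict[str, int]], structure_types: Sequence[str]
) -> int:
    """Multiply together, over all listed types, the total number of
    (instance, timestep) pairs observed for that type.

    history[t] maps a structure name to its count at timestep t; it is
    assumed to come from a separate, hypothetical detector.
    """
    totals = {name: 0 for name in structure_types}
    for counts in history:
        for name in structure_types:
            totals[name] += counts.get(name, 0)
    return math.prod(totals.values())
```

Because the totals are multiplied, any type on the list that never appears drives the whole score to zero, which is what pushes the agent toward building every kind of structure instead of mass-producing one.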
↑ comment by Raemon · 2020-05-13T23:50:50.729Z · LW(p) · GW(p)
If you want to be able to track "can we create a variety of good structures, plus avoid bad structures?" you could weight the structures differently, and label some "bad".
(I'm not sure this actually accomplishes the underlying philosophical goal, but seemed like the obvious extension from your current line of thinking)
Replies from: daniel-kokotajlo
↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-05-14T00:06:03.687Z · LW(p) · GW(p)
Yep, that seems like an improvement over the original version.
↑ comment by sudhanshu_kasewa · 2020-05-14T15:29:22.478Z · LW(p) · GW(p)
Interesting thoughts, thanks. My concerns: 1) Diversity would be restricted to what I specify as interesting shapes, while perhaps what I really want is for the AI to be able to discover new ways to accomplish some target value. 2) From a technological perspective, this may be too expensive to implement, in that every pass must search over all subsets of space and check them against all (suitably sized) patterns in the database in order to determine what reward to provide.
Replies from: daniel-kokotajlo
↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-05-14T22:35:49.783Z · LW(p) · GW(p)
Both good points. I think the AI will find new ways to accomplish your value, for pretty much anything you set as your value, including this one. (I for one have very few ideas how the AI would manage to build all those shapes; wouldn't they collide with each other? Probably some structure or organization would be needed. Etc.)
I don't have good intuitions for what is easy or hard. Instead of checking all sub-regions at all times, you could randomly sample some sub-regions at some times; that would drastically reduce the expense while incentivizing the same behavior.
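The sampling idea could look something like the sketch below: instead of scanning every window of the board against every pattern, draw a fixed number of random (pattern, position) pairs per pass and count exact matches. The function name, the pattern-library format, and the convention that each library pattern carries its own border of dead cells (so that a match implies the structure is isolated) are all assumptions for this example.

```python
import random
from typing import Dict, List, Tuple

Grid = List[List[int]]
Pattern = Tuple[Tuple[int, ...], ...]


def sample_matches(
    grid: Grid,
    patterns: Dict[str, Pattern],
    num_samples: int,
    rng: random.Random,
) -> Dict[str, int]:
    """Count exact pattern matches in randomly sampled windows of the board."""
    rows, cols = len(grid), len(grid[0])
    names = sorted(patterns)
    hits = {name: 0 for name in names}
    for _ in range(num_samples):
        name = rng.choice(names)
        pattern = patterns[name]
        height, width = len(pattern), len(pattern[0])
        if height > rows or width > cols:
            continue  # this pattern cannot fit on the board
        # Pick a random top-left corner such that the window stays in bounds.
        r = rng.randrange(rows - height + 1)
        c = rng.randrange(cols - width + 1)
        window = tuple(tuple(grid[r + i][c:c + width]) for i in range(height))
        if window == pattern:
            hits[name] += 1
    return hits
```

The cost per pass is then proportional to `num_samples` times the pattern area, independent of board size. The hit counts are noisy estimates, so a reward built on them would be stochastic.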
↑ comment by Raemon · 2020-05-14T00:11:48.199Z · LW(p) · GW(p)
I notice something like "glider guns feel more valuable to me than gliders, and glider-gun-guns (is that a thing?) feel more valuable than glider-guns, but I wouldn't want the universe tiled with either of them."
Replies from: daniel-kokotajlo
↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-05-14T10:20:57.372Z · LW(p) · GW(p)
I agree. I think I don't want the universe tiled with anything really; one (or a handful) of tiles is enough. Maybe a step in the right direction:
Find a library of interesting GoL structures. Declare some to be bad and the rest good. Score = the number of good types that are instantiated minus the number of bad. So making more than one of the same structure doesn't help. A more sophisticated algorithm would give you points equal to the log of the number of a structure.
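Both scoring rules in this comment can be sketched as follows, given per-structure instance counts from some detector (assumed, not shown). The function names are hypothetical. For the logarithmic variant, `log(1 + n)` is used in place of `log(n)` so that a single instance scores more than zero and absent structures score exactly zero; that substitution, and subtracting the same quantity for "bad" structures, are assumptions of this sketch.

```python
import math
from typing import Dict, Set


def type_score(counts: Dict[str, int], good: Set[str], bad: Set[str]) -> int:
    """Number of good types present minus number of bad types present.

    Extra copies of the same structure do not change the score.
    """
    present = {name for name, n in counts.items() if n > 0}
    return len(present & good) - len(present & bad)


def log_score(counts: Dict[str, int], good: Set[str], bad: Set[str]) -> float:
    """Assumed logarithmic variant: each type contributes log(1 + count),
    added for good types and subtracted for bad ones."""
    total = 0.0
    for name, n in counts.items():
        if name in good:
            total += math.log1p(n)
        elif name in bad:
            total -= math.log1p(n)
    return total
```

Under `type_score`, tiling the board with gliders is worth no more than a single glider, while `log_score` gives diminishing but nonzero returns for additional copies.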
↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-05-13T23:38:16.980Z · LW(p) · GW(p)
Thinking about the question from another angle: What would I want the RL agent to end up doing? What would make me most pleased? Answer: If it constructed some sort of infinitely growing pattern, that spirals out from the middle filling the void with interesting oscillating structures of infinite diversity, such that eventually all stable structures of all sizes will be created. Even better would be if not all, but most, stable structures would be created -- and some category of "bad" ones would never be.
One option would be maximum variation. Score goes up based on number of distinct states of the finite board in a finite time. This is kind of interesting because the solution(s) could be very different for different sizes and shapes of board, over different timeframes.
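In an RL setting, one way to express this is as a per-step novelty bonus: the agent receives a reward of 1 each time the board enters a configuration not yet seen in the episode, so the episode return equals the number of distinct states visited. The class below is a sketch under that assumption; the name `NoveltyReward` and the choice to store every visited state in memory are illustrative only.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]


class NoveltyReward:
    """Hypothetical reward: 1.0 for each board state not seen before
    in the current episode, 0.0 for a repeat."""

    def __init__(self) -> None:
        self.seen: Set[Tuple[Tuple[int, ...], ...]] = set()

    def reset(self) -> None:
        """Forget all visited states at the start of a new episode."""
        self.seen.clear()

    def __call__(self, grid: Grid) -> float:
        # Convert the mutable grid into a hashable snapshot.
        key = tuple(tuple(row) for row in grid)
        if key in self.seen:
            return 0.0
        self.seen.add(key)
        return 1.0
```

Memory grows with the number of distinct states visited, so for large boards or long horizons one would probably store hashes of states instead, at the cost of occasional collisions.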
↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-05-14T10:22:13.521Z · LW(p) · GW(p)
This is an elegant solution, but I worry that it will result in something that looks very boring and high-entropy. EDIT: After playing around a bit with GoL, I no longer am worried about this.
↑ comment by sudhanshu_kasewa · 2020-05-14T15:20:18.132Z · LW(p) · GW(p)
After reading through the suggestions, including yours and Raemon's, I'm also sort of circling around this idea. Thanks.
Some friends tried (inconclusively) to apply AlphaZero to a two-player GoL. I can put you in touch if you want their feedback.
↑ comment by sudhanshu_kasewa · 2020-05-14T15:17:16.887Z · LW(p) · GW(p)
Thanks for the note. I'll let you know if my explorations take me that way.
Perhaps "the boundary between order and chaos", see rule 110.
↑ comment by sudhanshu_kasewa · 2020-05-12T19:21:54.930Z · LW(p) · GW(p)
Fascinating. Thanks. My sense is GoL already has this property; any intuitions on how to formalise it?
2 comments
Comments sorted by top scores.
comment by TurnTrout · 2020-05-12T12:24:05.280Z · LW(p) · GW(p)
Are you aware of SafeLife?
Replies from: sudhanshu_kasewa
↑ comment by sudhanshu_kasewa · 2020-05-12T13:33:15.922Z · LW(p) · GW(p)
I was not; thanks for the pointer!
A quick look suggests that it's not quite what I had in mind; nonetheless a reference worth looking at.