# The Darwin Game

post by Zvi · 2017-11-15T23:20:00.294Z

Epistemic Status: True story

The plan is that this post will begin the sequence Zbybpu’f Nezl.

In college I once took a class called Rational Choice. Because obviously.

Each week we got the rules for, played and discussed a game. It was awesome.

For the grand finale, and to determine the winner of the prestigious Golden Shark Award for best overall performance, we submitted computer program architectures (we’d tell the professor what our program did, within reason, and he’d code it for us) to play The Darwin Game.

The Darwin Game is a variation slash extension of the iterated prisoner’s dilemma. It works like this:

For the first round, each player gets 100 copies of their program in the pool, and the pool pairs those programs at random. You can and often will play against yourself.

Each pair now plays an iterated prisoner’s dilemma variation, as follows. Each turn, each player simultaneously submits a number from 0 to 5. If the two numbers add up to 5 or less, both players earn points equal to their number. If the two numbers add up to 6 or more, neither player gets points. This game then lasts for a large but unknown number of turns, so no one knows when the game is about to end; for us this turned out to be 102 turns.
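In code, the per-turn payoff rule is simple (a minimal sketch; the function name is my own):

```python
def turn_payoffs(a: int, b: int) -> tuple[int, int]:
    """Score one simultaneous turn of the Darwin Game.

    Each player submits a number from 0 to 5. If the two numbers
    add up to 5 or less, each player earns their own number;
    if they add up to 6 or more, neither player earns anything.
    """
    if a + b <= 5:
        return a, b
    return 0, 0
```

So `turn_payoffs(2, 3)` yields `(2, 3)`, splitting the full pot of 5, while `turn_payoffs(3, 3)` yields `(0, 0)`.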

Each pairing is independent of every other pairing. You do not know what round of the game it is, whether you are facing a copy of yourself, or any history of the game to this point. Your decision algorithm does the same thing each pairing.

At the end of the round, all of the points scored by all of your copies are combined. Your percentage of all the points scored by all programs becomes the percentage of the pool your program gets in the next round. So if you score 10% more points, you get 10% more copies next round, and over time successful programs will displace less successful programs. Hence the name, The Darwin Game.
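The replication step can be sketched as follows (names and the rounding choice are mine; the professor's actual implementation is unknown):

```python
def next_pool(points: dict[str, float], pool_size: int) -> dict[str, int]:
    """Replicator step: a program's share of next round's pool
    equals its share of all points scored this round."""
    total = sum(points.values())
    return {name: round(pool_size * pts / total)
            for name, pts in points.items()}
```

For example, if program A scored 300 points and program B scored 100 in a pool of 100 copies, `next_pool({"A": 300.0, "B": 100.0}, pool_size=100)` gives A 75 copies and B 25 for the next round.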

Your goal is to have as many copies in the pool at the end of the 200th round as possible, or failing that, to survive as many rounds as possible with at least one copy.

If both players coordinate to split the pot, they will score 2.5 per round.

To create some common terminology for discussions: to ‘attack’ means to submit 3 (or a higher number) more than half the time against an opponent willing to not do that, and to ‘fold’ or ‘surrender’ is to submit 2 (or a lower number) more than half the time, with ‘full surrender’ being to always submit 2. To ‘cooperate’ is to alternate 2 and 3 such that you each score 2.5 per turn.

In this particular example we expected and got about 30 entries, and I was in second place in the points standings, so to win The Golden Shark, I had to beat David by a substantial amount and not lose horribly to the students in third or fourth.

What program do you submit?

(I recommend actually taking some time to think about this before you proceed.)

Some basic considerations I thought about:

1. The late game can come down to very small advantages that compound over time.

2. You need to survive the early game *and* win the late game. This means you need to succeed in a pool of mostly-not-smart programs, and then win in a pool of smart programs, and then in a pool of smart programs that outlasted other smart programs.

3. Scoring the maximum now regardless of what your opponent scores helps you early, but kills you late. In the late game, not letting your opponent score more points than you is very important, especially once you are down to two or three programs.

4. In the late game, how efficiently you cooperate with yourself is very important.

5. Your reaction to different programs in the mid game will help determine your opponents in the end game. If an opponent that outscores you in a pairing survives into the late game and cooperates with itself, you lose.

6. It is all right to surrender, even fully surrender, to an opponent if and only if they will be wiped out by other programs before you become too big a portion of the pool, provided you can’t do better.

7. It is much more important to get to a good steady state than to get there quickly, although see point one. Getting people to surrender to you would be big game.

8. Some of the people entering care *way* more than others. Some programs will be complex and consider many cases and be trying hard to win, others will be very simple and not trying to be optimal.

9. It is hard to tell what others will interpret as cooperation and defection, and it might be easy to accidentally make them think you’re attacking them.

10. There will be some deeply silly programs out there at the start. One cannot assume early programs are doing remotely sensible things.

That leaves out many other considerations, including at least one central one. Next time, I’ll go over what happened on game day.

## 16 comments


Collusion seems possible here, with a player sacrificing itself for another's benefit.

When bot A (master) and bot B (slave) play each other, start with a few handshake turns that will let them recognize each other. Then have bot A play all 5's and bot B play all 0's.

Sample handshake: B starts 1,0. A plays 5 on its second turn if its opponent played 1 on its first turn.
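A minimal sketch of this master/slave handshake (the function names, and the placeholder "generic play" of flat 2s and 3s, are my own inventions, not part of the proposal):

```python
def slave_move(turn: int, opp: list[int]) -> int:
    """Hypothetical 'slave' bot: open with the 1, 0 handshake, then
    surrender completely if the opponent answered turn 2 with a 5."""
    if turn == 0:
        return 1
    if turn == 1:
        return 0
    if opp[1] == 5:            # the master recognized us
        return 0
    return 2                   # placeholder generic play otherwise

def master_move(turn: int, opp: list[int]) -> int:
    """Hypothetical 'master' bot: if the opponent opened with a 1,
    assume it is a slave and take 5 every remaining turn."""
    if turn >= 1 and opp[0] == 1:
        return 5
    return 3                   # placeholder generic play otherwise

# Pair them up: after the one-turn handshake the master banks 5/turn.
a_hist, b_hist, a_pts, b_pts = [], [], 0, 0
for t in range(10):
    a, b = master_move(t, b_hist), slave_move(t, a_hist)
    if a + b <= 5:             # over 5 and nobody scores
        a_pts, b_pts = a_pts + a, b_pts + b
    a_hist.append(a)
    b_hist.append(b)
```

Over ten turns of this pairing the master collects 48 points to the slave's 1, which is the whole point of the scheme.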

Set each of them up to cooperate against itself.

When playing someone else, they should mainly be aiming for a generic "do good", but bot B should put more emphasis on its absolute score (e.g., it's okay to capitulate to "always 3" because it capitulates to bot A even harder). Bot A should put more emphasis on its relative score since it needs to win.

If you can find multiple students who are willing to throw the game for your benefit, get them all to submit the same bot B (or variations which do slightly different things against other bots).

Starting with a simpler problem, let's say that I want to build a bot that cooperates with itself as well as possible and also tries to distribute points as evenly as possible between copies of itself. This is the best that I've come up with:

Always play either 2 (low) or 3 (high).

On the first turn, play 2 with probability p and 3 with probability 1-p.

If we got 5 points last turn, then play what my opponent played last turn. (This makes mutual cooperation continue forever, while alternating high-low.)

If we got less than 5 points last turn, then play 2 with probability q and 3 with probability 1-q.

If we went over 5 last turn, then play 2 with probability r and 3 with probability 1-r.

Let's call this fairbot(p,q,r).

If my calculations are correct, expected score in self-play is maximized with p=q=r=0.69 (to the nearest .01). On average this brings in a combined 2.24 total points less than the fair maximum (5 per turn) for the two bots, because it might take a few turns to coordinate on which copy plays 3 vs. 2. (The quickest coordination happens with p=q=r=0.5, but that would miss out on 3 points because half the coordination failures cost the whole pot; with more weight on 2 the coordination failure happens more often but usually only shrinks the pot by 1 point.)
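A sketch of fairbot(p, q, r) as described, plus a self-play check (the function names are mine; randomness comes from Python's `random` module):

```python
import random

def fairbot_move(p, q, r, my_last, opp_last, last_pot):
    """One move of fairbot(p, q, r): play only 2s and 3s, and lock
    into 2/3 alternation once a turn pots the full 5."""
    if my_last is None:                        # first turn
        return 2 if random.random() < p else 3
    if last_pot == 5:                          # mirror the opponent:
        return opp_last                        # alternation continues
    if last_pot < 5:                           # both played 2 (pot of 4)
        return 2 if random.random() < q else 3
    return 2 if random.random() < r else 3     # both played 3 (pot of 6)

def self_play(p=0.69, turns=102, seed=0):
    """Pair fairbot(p, p, p) against a copy of itself."""
    random.seed(seed)
    a_last = b_last = last_pot = None
    pots = []
    for _ in range(turns):
        a = fairbot_move(p, p, p, a_last, b_last, last_pot)
        b = fairbot_move(p, p, p, b_last, a_last, last_pot)
        a_last, b_last, last_pot = a, b, a + b
        pots.append(last_pot)
    return pots

pots = self_play()  # once a pot of 5 occurs, it should persist forever
```

The mirroring rule is what makes coordination sticky: once the submitted numbers sum to 5, each copy plays what the other just played, so the pot stays at 5 for the rest of the pairing.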

Starting to think about how fairbot plays against other bots:

Fairbot never settles into a stable pattern against "always 3" or "always 2"; it continues to switch between 2 and 3 (and plays the same thing as its opponent more often than not). Presumably it would be better to stick with 3s against "always 2" and to either stick with 2s or stick with 3s against "always 3" (depending on whether I prioritize relative or absolute score).

If I knew that fairbot(0.69,0.69,0.69) was my opponent, I think I'd want to start by playing 3s, and then settle into mutual alternating coordination once we got a pot of 5. I would do slightly better than this bot does against me or against itself. I could get a much larger relative score by playing always 3, but it would hurt my absolute score by a lot. Or possibly I could do something in between.

An obvious way to make fairbot(p,q,r) more aggressive / less exploitable is to reduce p, q, or especially r.

We can add other rules which cover situations that never arise in self-play, such as when the opponent plays something other than 2 or 3, or when 5-point mutual cooperation gets broken (a non-5 pot after the previous turn had a pot of 5).

It occurs to me that there are a bunch of different patterns that one could aim for to cooperate with oneself. Fairbot aims to play 2,3,2,3,2,3,... (or the complement of this: 3,2,3,2,3,2,...) so that each round has a pot of 5 and the two bots get approximately equal shares. But you could also aim to cooperate with yourself with the pattern 1,4,1,4,1,4,..., or 2,3,3,2,2,3,3,2,2,3,3,..., or 0,5,1,4,2,3,0,5,..., or whatever.

If you want to mutually cooperate with other bots, you either need to pick the Schelling pattern, or plan with the other humans to ensure that you pick the same pattern, or have a bot that's smart enough to recognize which pattern they're aiming for and then play it (though that will still leave you less efficient at cooperating than a bot which starts by aiming for that pattern).

A simple way to make a bot nearly inexploitable is to add a payback rule: if my opponent has outscored me thus far, then play 3. At worst, my opponent will outscore me by 1 point for this whole pairing.
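The payback rule is a one-line wrapper around any base strategy (a sketch; the name is mine):

```python
def payback_move(base_move: int, my_total: int, opp_total: int) -> int:
    """Payback rule: if the opponent has outscored us so far in this
    pairing, override the base strategy and play 3."""
    return 3 if opp_total > my_total else base_move
```

While we are behind and playing 3, the opponent can only score at all by submitting 2 or less, which lets us bank 3 and shrink the gap; matching our 3 busts the pot and freezes the score instead.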

This doesn't interfere with self-cooperation, and it generally shouldn't interfere with finding mutual fair coordination with bots that are seeking that.

I think I want bot A (master) in my other comment to have this feature.

I think I found a way to do slightly better than .69 for 2 and .31 for 3 when cooperating with yourself:

Play 1 with probability .12, 2 with .60, and 3 with .28. If you played the same number, repeat. If you played different numbers, the higher one can play a 3 and the lower one can play a 2 (or whichever cooperating pattern you choose), and they can start alternating from there.

If my calculations are correct, this only loses 2.10 points on average from the maximum possible, which seems to be the best you can do. The goal of the randomization is to pick a different number than your opponent so that you can start cooperating; the reason this does better is that by picking from 3 numbers, you have a higher chance of not picking the same number.
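A quick Monte Carlo check of the two claimed shortfalls (a sketch under the stated probabilities; it assumes coordination is lossless once the two copies pick different numbers):

```python
import random

def expected_loss(moves, probs, trials=200_000, seed=1):
    """Estimate the points a pair of copies leaves on the table
    (vs. the fair 5 per turn) while breaking symmetry: both draw
    i.i.d. from `probs` each turn until they differ, after which
    coordination is assumed to be perfect."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        while True:
            a, b = rng.choices(moves, probs, k=2)
            pot = a + b
            total += 5 - (pot if pot <= 5 else 0)  # pot > 5 scores nothing
            if a != b:
                break                              # symmetry broken
    return total / trials

two = expected_loss([2, 3], [0.69, 0.31])             # about 2.24
three = expected_loss([1, 2, 3], [0.12, 0.60, 0.28])  # about 2.10
```

The estimates land near 2.24 and 2.10, consistent with the numbers claimed above.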

Featured this post for being the first in an exciting and finished sequence (I should figure out a better way of causing concluded sequences to be Featured, when I only realise how awesome they are by the end). The second post was super fun to read, and the splitting into three posts meant that I noticed how much my models were surprised by the third and final post - the outcome.

For these reasons, and also in the hope that people will build on the Darwin Game's setup, I've Featured it.

Promoted to the frontpage: I have some guesses on where this is going, but I am not confident and am definitely curious, and it seems like a great start to a good explanation.

I in turn would be curious to hear (privately) where you and anyone else with a guess think this is going and any other thoughts about how best to get there. Or just any entries someone wants to record for posterity.

(For those who don't have my email you can message me on LW1.0 under the same username.)

EDIT: I'll also add that my plan as of Friday afternoon is to wait until mid-next week to post the follow up as there's some interesting discussions I don't want to anchor/bias.

Oh my gosh this sounds really fun.

Are programs allowed to have either a way to differentiate themselves from each other (like one program gets a 1 and the other gets a 0 for each game) or a source of randomness? Because otherwise it doesn't seem possible to cooperate with yourself.

It would be so cool if someone organized a game of this here or something...

You are a normal computer program, so pseudo-randomness is available if you want it (which you do).

You do not know if you are player 0 or player 1.

The mockingbird: Find whatever method the current leader(s) is/are using to enable self-cooperation, and find a way to mimic them with a small advantage. (E.g., if they use a string of 0s, 1s, 4s, and 5s to self-identify, spam 4 until they identify as you, then work out how to get onto the side of the mutual cooperation that is sometimes up a point.)

Tit-for-tat with correction: Start with an even distribution, then play what they played last round, except: if the total last round was above five and they played higher, reduce the value you played by the amount that exceeded five; if the total last round was below five and they played lower, increase the value you play this round by the shortfall. (If the values played were the same, adjust by half the difference, randomly selecting between two values if a .5 change is indicated.) Loses at most 5 points to fivebot, loses about half a point per round to threebot, leaves some on the table with twobot, but self-cooperates on round two with 80% probability.
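A sketch of tit-for-tat with correction as described (the function name and the exact rounding of half-steps are my reading of the rule):

```python
import random

def ttfc_move(my_last, opp_last, rng=random):
    """Tit-for-tat with correction: echo the opponent's last play,
    but steer the total back toward a pot of 5."""
    if my_last is None:
        return rng.randrange(6)          # even-distribution opener
    gap = 5 - (my_last + opp_last)       # >0 shortfall, <0 excess
    if my_last == opp_last:
        # Equal plays make the gap odd: adjust by half of it,
        # choosing randomly between the two nearest integers.
        return my_last + rng.choice([gap // 2, (gap + 1) // 2])
    if gap < 0 and opp_last > my_last:
        return max(0, my_last + gap)     # shed the excess over 5
    if gap > 0 and opp_last < my_last:
        return min(5, my_last + gap)     # claim the shortfall
    return opp_last                      # plain tit-for-tat
```

For instance, after a (2, 3) turn the pot was exactly 5, so the bot simply echoes the 3; after playing 1 against a 5 it cuts its play by the excess, down to 0; and after both copies play 3 it drops to 2 or 3 at random.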

Really, really interesting. I'm going to do some formal analysis later. Some brainstorming first:

- I want my bot to have memory. My bot needs to track the opponent bot's previous moves in a given round.
- My bot would have hypotheses on the kind of bot it is playing. It would particularly need the following types of hypothesis: {a fixed probability distribution over all numbers, a reactionary program (the current move of the program depends on my bot's previous moves)}.
- I want my bot to cooperate against reactionary bots that adopt tit for tat with forgiveness. If the reactionary bot does not forgive, my bot may be unable to cooperate, especially if the reactionary bot tries to exploit or punish my bot based on my bot's initial moves.
- I want my bot to identify itself as soon as possible. This may involve my bot playing suboptimally in the first few moves, but should also make it easier to identify reactionary bots.
- I want my bot to cooperate against itself.
- I want my bot to exploit submissive bots ($P(x \in [0,2]) > 0.5$, where $x$ is the number the opponent bot writes). The degree of exploitation (how often it exploits) depends on my bot's hypothesis of how submissive its opponent is.
- I want my bot to punish offensive bots ($P(x \in [3,5]) > 0.5$, where $x$ is the number the opponent bot writes). The degree of punishment (how often it punishes) depends on my bot's hypothesis of how offensive its opponent is.
- I want my bot to be Bayesian. My bot would use Bayesian inference to update its hypotheses about its opponent.
- I would specify a utility function that specifies the appropriate trade off my bot should make between gaining points, and figuring out the kind of opponent it is facing. (I already have a rough idea of how my bot would go about gaining information due to my project on optimising scientific research).
- I expect that my bot becomes more optimal as n tends to infinity, where n is the number of turns in a given round. This is because my bot may waste the first few turns in order to gain information.
- I expect that given the source code of any bot, it is possible to script a predator bot, such that the predator bot always performs at least as well as the prey bot. As such, I expect to not disclose certain parts of my bot's algorithm once I complete the design.

I'm thinking of having my bot randomly play 5 or 0 in the first round, so it can quickly identify itself. If the opponent played something else, it adapts accordingly. If the opponent played 5 or 0, it adopts another sequence of rituals (I don't want to say it here so that a predator bot can't be scripted against my bot) which should help confirm that it is playing against itself.

I initially thought along those lines, but I realized that if your Bayesian update includes your own strategy, you can very quickly converge to playing optimally against yourself without an explicit handshake. See my thought process here.

There are 6^N possible strings for a handshake that lasts N turns (six legal plays per turn). Select the handshake string randomly. If the handshake is successful, the probability that the bot it is playing against is itself is (1 - 6^-N).

Using N = 3, we can establish with Pr ≈ 0.995 that we are playing against a copy of ourself within 3 turns.

2 turns would be spent renormalising if we are not playing against ourself.
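Since each turn has six legal plays (0 through 5), a bot playing uniformly at random reproduces a random 3-turn handshake with probability 6^-3, about 0.005. A quick simulation sketch of that false-positive rate (names are mine; real opponents are of course not uniform):

```python
import random

def random_handshake_collision(n: int, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of trials in which a uniformly random opponent
    happens to replay our randomly chosen n-turn handshake string."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ours = [rng.randrange(6) for _ in range(n)]    # 6 legal plays: 0-5
        theirs = [rng.randrange(6) for _ in range(n)]
        hits += ours == theirs
    return hits / trials

rate = random_handshake_collision(3)   # expect about 6**-3, i.e. ~0.005
```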

Copying over my thought process from your main blog:

Start with a short identifying sequence of 0s,1s,4s, and 5s. Seeing any 2 or 3 drops into oppositional mode immediately. Have the last two characters of this sequence only convey one bit of identification each; {0,5} vs. {1,4}. Determine which of them to use randomly, to establish who will be the low side and who the high side. This gives you a 75% handshake chance and 75% polarity-match chance, vs. 75/50 for finding them separately, at a cost of about 1 point in expectation.

In adversarial cases, start out with literal tit-for-tat; play the last number the opponent played. Come up with a short list of possible default strategies: always-2, always-3, always-N, 0/5-alternating, 2/3-alternating, 1/4-alternating, and probably a couple of others. If the output doesn’t look quite like tit-for-tat, and the discrepancy with one of these other simple strategies is smaller, play the FDT-best-response strategy. For alternating strategies it’s the same alternation but in reverse; for always-N with N ≤ 2 it’s to play always-3. This will involve several special cases, but they only need to handle common strategies, so it’s limited.

Actually, skip the handshake code. This will get optimality against itself without that by starting out with random 2/3 for ~4 rounds and then proceeding from there.

Updates from reading other thoughts; use a 2-probability of 0.69 rather than 0.5 for the initial random run, for better performance as per Unnamed. This would also help against dumb no-forgiveness or minimal-forgiveness bots.

I've read about an experiment like this while learning game theory. The simple Tit-for-Tat strategy (play what your opponent played last turn) won 3 tournaments straight. It automatically enables cooperation, (almost) whatever your opponent's strategy is, and if they want to exploit you - well, they can't.

You may need to think how to initialize to cooperate with yourself.