Another Iterated Prisoner's Dilemma Tournament?

andreas_giger

Another Iterated Prisoner's Dilemma Tournament?

post by Andreas_Giger · 2012-05-25T14:16:30.529Z · LW · GW · Legacy · 23 comments

23 comments

Last year, there was a lot of interest in the IPD tournament with people asking for regular events of this sort and developing new strategies (like Afterparty) within hours after the results were published and also expressing interest in re-running the tournament with new rules that allowed for submitted strategies to evolve or read their opponent's source code. I noticed that many of the submitted strategies performed poorly because of a lack of understanding of the underlying mechanics, so I wrote a comprehensive article on IPD math that sparked some interesting comments.

And then the whole thing was never spoken of again.

So now I'd like to know: How many LWers would commit to competing in another tournament of this kind, and would someone be interested in hosting it?

23 comments

Comments sorted by top scores.

comment by prase · 2012-05-27T15:14:54.735Z · LW(p) · GW(p)

I have been planning to organise a new tournament, which would include: variable match length, possibility to simulate opponents (up to a fixed depth of recursion), scripting language for the strategies and an interpreter thereof and perhaps evolution with mutating strategies. However, relative lack of time together with akrasia stood in the way.

If there is a strong interest in this, I will try to precommit to do it in, say, three weaks. I have already some usable pieces of code.

Replies from: Andreas_Giger, shokwave

↑ comment by Andreas_Giger · 2012-05-27T19:29:10.773Z · LW(p) · GW(p)

You did an awesome job hosting the last tournament, and supplying competitors with a scripting language for their strategies is an excellent idea, so I hope you can do it. How would you implement evolution and variable match length though?

Also, I think if you could quantify "strong interest", more people might be willing to sign up for this. I definitely have a strong interest in this.

Replies from: prase

↑ comment by prase · 2012-05-28T23:37:20.553Z · LW(p) · GW(p)

Evolution:

Each agent is represented by a function; for concreteness let it be a real function of some relevant subset of previous turns. If the return value on a given turn is positive, the agent cooperates, else it defects. The function is represented as a tree; each node is a function with a fixed number of parameters (zero parameters = constant). Mutations are realised as cutting random branches and replacing them by constants. Sexual reproduction would take two functions and swap some branches. Population dynamics would be basically the same as in the last tournament.

Variable match length:

I would simply make up a number N which I will not tell anyone before the game starts. Then, each match would be preceded by generating a random real number x between 0 and 1, and the match length would be set to exp(xN) rounded up to integers, or something similar.

Replies from: Andreas_Giger

↑ comment by Andreas_Giger · 2012-05-29T14:55:38.356Z · LW(p) · GW(p)

This sounds interesting, however the longer I think about it, the less I'm in support of it.

Regarding match length, I think the main issue here is not that it is variable, but that it is unknown to the agents. This will lead to a tournament of nice strategies first exterminating all non-nice strategies, then always cooperating with each other. This won't change even if you can simulate your opponents, because the smartest way to play against TFT is still TFT (CooperateBot). If you take away the whole concept of parasites, I don't see much of interest left.

I don't see random mutations coupled with sexual reproduction work either; primarily because there is no compressed blueprint that can be altered, which means that entropy will increase. I feel that purposeful evolution of an intelligent design, like shokwave's proposal of a parameter with a gaussian distribution that can be included in any way you like, is a more viable option.

I'm also not sure whether evolution and simulation go well together at all, because ultimately you want to play optimally against every other strategy, no matter how many and which other strategies are in the pool. If you can't simulate your opponent, evolution is there to help you adapting to the changing pool of strategies, but if you can, it seems kind of pointless to me.

↑ comment by shokwave · 2012-05-28T17:43:25.441Z · LW(p) · GW(p)

We shouldn't duplicate work; I'm currently writing up a tournament simulator myself. I'm confident on variable match length, evolution, noisy responses, cheap talk rounds, and natural selection population growth tournaments. I am less confident on simulating opponents. For submitted strategies, I was prepared to accept code, pseudocode, and English descriptions, and then translate them into coffeescript myself. Probably I will also ask for some example games to test my translations (e.g. "My bot should cooperate 1-98 and defect 99, 100 against pure tit-for-tat").

So your scripting language sounds particularly interesting, as well as your ideas for simulating opponents. I gather the two are tied together. PM me for details!

ETA: I offer to save you the cost of overcoming akrasia and lack of time.

Replies from: prase

↑ comment by prase · 2012-05-28T23:22:09.372Z · LW(p) · GW(p)

I was prepared to accept code, pseudocode, and English descriptions, and then translate them into coffeescript myself.

I did it last time and it was pain in the arse. It's fun to code one PD agent, not so much when they are forty and you must follow descriptions which you are not sure that you understand correctly.

I have written an interpreter which evaluates functions written in a lisp-like prefix notation, which was originally intended to be used in a program evolving automatic traders using genetic algorithms and real stock exchange data - the traders were represented as functions. Changing it to be applicable to PD simulations would be quite easy. Mutations are already coded too.

The idea of simulating opponents was mentioned in the discussion and relates to all sorts of LW decision theory idiosyncrasies. I think it's worth doing, even if it were as primitive as having a function "simulate(X)" which would return the result of the opponent's decision function where all instances of "simulate" are replaced by X (which is either C or D).

Replies from: shokwave

↑ comment by shokwave · 2012-05-29T02:15:12.613Z · LW(p) · GW(p)

even if it were as primitive as having a function "simulate(X)"

I'm toying with each Bot having a "simulated" flag, and at the beginning of each game, create a real and a simulated version of each Bot, and let both Bots look at what the simulated Bots are doing. Haven't thought properly about it yet, but the rest of the project is coming along much faster than I'd anticipated so I'll have time to dig into this problem soon.

EDIT: I think I can do better. If bots have a single source code in some language L, and inferior interpreters I-weak, and the game has superior interpreter I-strong, then I-weak(L) can approximate I-strong(L). Especially neat is that I-weak's inferiority can come from inability to recurse; it could bottom out and choose arbitrary values to feed back in - or it could come from instruction limitations; no more than 10,000 or so.

comment by shokwave · 2012-05-27T10:43:28.679Z · LW(p) · GW(p)

I'm interested; also, one of my side-projects was writing a sort of fully extensible IPD tournament simulator. It does not yet do clever top-level stuff like population tournaments, but it will pretty soon. I would therefore submit that I am interested in hosting it.

Replies from: Andreas_Giger

↑ comment by Andreas_Giger · 2012-05-27T19:33:12.659Z · LW(p) · GW(p)

I liked this suggestion of yours from the tournament results thread:

Idea: everyone has access to a constant 'm', which they can use in their bot's code however they like. m is set by the botwriter's initial conditions, then when a bot has offspring in the natural selection tournament, one-third of the offspring have their m incremented by 1, one-third have theirs decremented by 1, and one third has their m remain the same. In this manner, you may plug m into any formulas you want to mutate over time.

Did you do any experiments on this since then?

Replies from: shokwave, shokwave

↑ comment by shokwave · 2012-05-28T18:00:03.879Z · LW(p) · GW(p)

Your post on IPD mechanics is very interesting! I'm going to implement TFT-Dn, TFT-DnC, and CRD alongside TFT, Cooperate, Defect, and GrimTrigger as basic types of bots. Not sure if I'll include them in the tournaments, though.

↑ comment by shokwave · 2012-05-28T12:57:27.835Z · LW(p) · GW(p)

No; but that particular concept is trivially easy to include (and will be included unless I come up with something better). It's literally a change from YourBot.new(gamelength) toYourBot.new(gamelength, m). I'm thinking about other ways to allow interesting evolution. The current best idea is something like providing an Evolution object that has all sorts of interesting numbers that change in all sorts of interesting ways with each population step. (The biggest bother I had with the m concept was that you can't have monotonically increasing numbers to plug into bots that want to do interesting things over time - for example, "play tit for tat until population step 1000, and now that all the silly strategies are dead and it's a pool of tit-for-tatters, play CliqueBot")

Replies from: Andreas_Giger

↑ comment by Andreas_Giger · 2012-05-29T14:47:56.979Z · LW(p) · GW(p)

I don't see this as a problem at all; in fact if you could turn TFT into a CliqueBot at an arbitrary point in the tournament, this could hardly be called evolution. I mean the whole point of this idea was to theoretically have a gaussian distribution around your starting value, and then see which value is the most successful, right? So for example, you could start with m = 10, and have a strategy TFT-nD with n = floor(abs(m/10)). This might result in high values for n if the pool is such that TFT-1D strategies are successful at first; causing lower m to die out and higher m to strive, thus shifting the bell curve.

Replies from: shokwave

↑ comment by shokwave · 2012-06-04T06:57:08.774Z · LW(p) · GW(p)

Mmm, I thought about this and you're right. Monotonically increasing numbers might make a meta-level "defect last N" game out of the evolution - turn Clique a turn before the others do. Of course, now I also think there's a way to turn m into an increasing number as well.

I mean the whole point of this idea was to theoretically have a gaussian distribution around your starting value, and then see which value is the most successful, right?

Actually the interesting part is whether you can use m cleverly, such that your mutation/evolution takes advantage of likely mutations/evolutions in other bots. Staying two steps ahead in the arms race at all times would allow you to dominate the genetic pool, because the change in bots would be the new constant. (My first idea along these lines was to have TFT-nD with n being decremented by 2 when m is decremented by one, and likewise for increments - so that my bot would become the parasite of other likely TFT-variable-nDs).

comment by benelliott · 2012-05-26T15:35:20.471Z · LW(p) · GW(p)

I'd play

comment by Andy_McKenzie · 2012-05-26T05:47:48.893Z · LW(p) · GW(p)

I would happily participate in such a tournament.

comment by wedrifid · 2012-05-26T04:21:57.698Z · LW(p) · GW(p)

I'm using the same algorithm. Defect! Always defect!

Replies from: benelliott

↑ comment by benelliott · 2012-05-26T15:35:50.323Z · LW(p) · GW(p)

Because it did so well in the previous one :P

Replies from: Andreas_Giger

↑ comment by Andreas_Giger · 2012-05-26T20:59:12.855Z · LW(p) · GW(p)

I think DefectBot, RandomBot and at least TFT-0D should be included by default. The last tournament was pretty much a battle of TFT-nD with varying n; I wonder if a new tournament would only result in an equivalent battle of Afterparty (TFT-DnC) strategies or whether we would actually see more successful CliqueBots and maybe completely new approaches. The tools are there.

comment by CasioTheSane · 2012-05-27T08:09:02.686Z · LW(p) · GW(p)

Why fixed length instead of random/unlimited length tournaments? Are there any real world situations where a fixed length scenario is more analogous than a random length one?

Personally I would find such a competition most interesting and fun when the rules are such that good strategies in the game are likely to be good strategies in other real world situations as well.

Replies from: Andreas_Giger, shokwave

↑ comment by Andreas_Giger · 2012-05-27T19:24:46.145Z · LW(p) · GW(p)

Why fixed length instead of random/unlimited length tournaments?

In short: because of this.

Your suggestion would very obviously turn into a tournament of nice strategies that will always cooperate once the non-nice strategies are eliminated. The "real world" isn't analogous to any kind of IPD at all, mainly because defectors often can't be punished. If you have a suggestion for a tournament that would to some extent actually model reality, I'm all ears; but random/unlimited length doesn't work.

Replies from: shokwave

↑ comment by shokwave · 2012-06-04T06:59:50.147Z · LW(p) · GW(p)

mainly because defectors often can't be punished.

If each round came with a "situation variable" per bot that determined how 'unfair' the situation was - i.e, high variable means you can get away with defecting - and it wasn't somehow just noise, that might come closer to reality. Outside of the scope of my project, though.

↑ comment by shokwave · 2012-06-04T07:02:52.737Z · LW(p) · GW(p)

Are there any real world situations where a fixed length scenario is more analogous than a random length one?

Most of them. Dating, a night out at the bar, employment, driving somewhere, etc, etc. If you take a Gods-eye view and mash these all together, sure, our life is one big indefinite-length IPD, but at that point you've lost so much information that it's not clear that chunks of a good indefinite strategy map to good definite strategies.

Replies from: CasioTheSane

↑ comment by CasioTheSane · 2012-06-05T07:02:44.429Z · LW(p) · GW(p)

LOL @ dating being analogous to a fixed length iterated Prisoner's Dilemma. [insert tired cliché joke about nerds here]

Another Iterated Prisoner's Dilemma Tournament?

Contents

23 comments