A free to enter, 240 character, open-source iterated prisoner's dilemma tournament

kingsupernova

A free to enter, 240 character, open-source iterated prisoner's dilemma tournament

post by Isaac King (KingSupernova) · 2023-11-09T08:24:43.277Z · LW · GW · 19 comments

This is a link post for https://manifold.markets/IsaacKing/which-240-character-program-wins-th

19 comments

I'm running an iterated prisoner's dilemma tournament where all programs are restricted to 240 characters maximum. The exact rules are posted in the Manifold Markets link; I figured I'd cross-post the contest here to reach more potentially-interested people.

You don't need a Manifold account to participate, you can just put your program in the comments on LessWrong or PM me. If you win, I'll donate your winnings to the charity of your choice.

19 comments

Comments sorted by top scores.

comment by Isaac King (KingSupernova) · 2023-11-29T19:46:25.115Z · LW(p) · GW(p)

The winner was the following program:

try{eval(`{let f=(d,m,c,s,f,h,i)=>{let r=9;${c};return +!!r};r=f}`);let θ='r=h.at(-1);r=!r||r.o',λ=Ω=>r(m,d,θ,c,f,Ω,Ω.map(χ=>({m:χ.o,o:χ.m}))),Σ=(μ,π)=>[...μ,{m:π,o:+!1}],α=λ([...i]),β=λ(Σ(i,α));r=f(θ)&α&!β&!λ(Σ(Σ(i,α),β))|d==m}catch{r = 1}

We're running a sequel, see here to participate.

Replies from: habryka4

↑ comment by habryka (habryka4) · 2023-11-30T00:17:34.286Z · LW(p) · GW(p)

This is the ideal agent. You may not like it, but this is what peak performance looks like.

More seriously though, does anyone want to explain how it works?

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-30T09:57:24.320Z · LW(p) · GW(p)

It's conceptually pretty simple; 240 characters isn't room for a lot. Here's how the writer explained it:

Here's the annotated version of my bot: https://pastebin.com/1a9UPKQk
The basic strategy is:
Simulate what the opponent will do on the current turn, and what they would do on the next two turns if I defect twice in a row.
If the results of the simulations are [cooperate, defect, defect], play tit-for-tat. Otherwise, defect.
This will defect against DefectBots, CooperateBots, and most of the silly bots that don't pay attention to the opponent's moves. Meanwhile, it cooperates against tit-for-tats and grim triggers and the like.
There are a bunch of details and optimizations and code-golf-ifications, but the big one is that instead of simulating the opponent using the provided function f, I use a code-golfed version of the code used to create bots in the tournament program (in createBot()). This prevents the anti-simulator bots from realizing they're in a simulation - arguments.callee.name is "f", and the variable a isn't defined.

The simulation this bot is doing pits the opposing bot against tit-for-tat, rather than against the winning bot itself, to avoid the infinite regress that would occur if the opposing bot also runs a simulation.

The last paragraph is because I wrote the tournament code poorly, and the function that's provided to the bots to let them simulate other bots was slightly different from the way the top-level bots were run, which allowed bots to tell if they were in a simulation and output different behavior. (In particular someone submitted a "simulator killer" that would call process.exit() any time it detected it was in a simulation, which would shut down the entire tournament, disqualifying the top-level bot that had run the simulation.) This submission modifies the simulation function to make it indistinguishable.

The greek letters as variable names were for style points.

Replies from: RedMan

↑ comment by RedMan · 2023-12-12T19:54:37.613Z · LW(p) · GW(p)

Is it safe to call this bot 'tit for tat with foresight and feigned ignorance'?

I'm wondering what its' actual games looked like and how much of a role the hidden foresight actually played.

comment by Ilio · 2023-11-12T05:25:02.607Z · LW(p) · GW(p)

Thanks for organizing this, here’s the pseudocode for my entry.

Robot 1: Cooperate at first, then tit for tat for 42 rounds, then identify yourself by playing: [0, 1, 1, 0, 0, 0, 1,1, 1, 1], then cooperate if the opponent did the same, otherwise defect.

Robot 2: Same as robot 1, ending with: … otherwise tit for tat

Robot 3 (secret): Same as robot 1, with a secret identifying sequence and number of initial ronds (you pick Isaac).

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-12T18:36:23.212Z · LW(p) · GW(p)

Done!

comment by Ericf · 2023-11-09T18:25:44.274Z · LW(p) · GW(p)

How does buying "none of the above" work as you add more entries? If someone buys NOTA today, and the winning entry is #13, does everyone who bought NOTA before it was posted also win?

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-09T18:37:28.135Z · LW(p) · GW(p)

Yes, if you buy "other" it splits those shares among all new answers.

comment by denyeverywhere (daniel-radetsky) · 2023-11-10T06:01:29.879Z · LW(p) · GW(p)

The tournament allows you to enter 3 bots, so we should be able to cheat if our bots can recognize each other. If so we have 1 bot, "ringer", and 2 bots "shill0" and "shill1" and follow this strategy:

ringer: if my opponent is shill0 or shill1, always defect. otherwise, play tit-for-tat.

shill0: if my opponent is ringer, always cooperate. otherwise play tit-for-tat.

To recognize each other, since we have access to the source code of both bots during the game, we need a hash function which is efficient in terms of characters used. One possibility would be to simply count the character codes. So we have

// one of the programs I grabbed at random as an example
let s = "r = h.length === 0 ? 1 : (h[h.length - 1].o === 1 ? h[h.length - 1].m : 1 - h[h.length - 1].m)"
// 6589
let id = [...s].reduce((a,c)=>a+c.charCodeAt(0),0)

This expression takes 41 chars, so we should be able to fit that in along with the required logic.

The obvious difficulty is that somebody could attempt to piggyback: craft a bot whose hash adds up to the same as the one our shills will auto-cooperate with. However, we should be able to solve this:

shill: if hash(c+s) = N, always cooperate. otherwise, play tit-for-tat

ringer: if hash(c+s) = M always defect. otherwise, play tit-for-tat

where c is the source code of the opponent and s is the source code of ourself.

We note that the rules of the challenge permit us to submit one of our bots by private message so that other players cannot see it. Therefore, we submit the source of ringer privately, and nobody should be able to piggyback.

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-10T06:17:57.953Z · LW(p) · GW(p)

That's not cheating, that's just "a strategy". :)

Replies from: daniel-radetsky

↑ comment by denyeverywhere (daniel-radetsky) · 2023-11-10T07:08:38.599Z · LW(p) · GW(p)

I mean, yes, it's not currently against the rules, but it obviously should be (or technical measures should be added to render it impossible, like adding random multiline comment blocks to programs).

Presumably the purpose of having 3 bots is to allow me to try 3 different strategies. But if my strategy takes less than about 160 characters to express, I have to use the ringer/shill version of the same strategy since otherwise I will always lose to a ringer/shill player using the same strategy but also shilling for his ringer. And the benefit of the ringer/shill strategy will be constant, since it shouldn't be triggered by other players, just my bots. So it just means all the strategies are inflated by a constant amount. And it requires everyone to adopt the strategy, which is a waste of effort.

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-10T09:21:37.241Z · LW(p) · GW(p)

A much shorter strategy is always suboptimal in problems like these. The higher level your program can go, the better, and that requires more information.

It seems unlikely to me that this ringer/shill strategy will be particularly good compared to the other options, and you haven't provided a compelling reason why I need to disallow it. What problems does it cause?

Replies from: daniel-radetsky

↑ comment by denyeverywhere (daniel-radetsky) · 2023-11-10T09:47:41.623Z · LW(p) · GW(p)

It seems unlikely to me that this ringer/shill strategy will be particularly good compared to the other options

It will absolutely be guaranteed to be better than equivalent strategies without ringer/shill. Remember that ringer/shill modifies an existing strategy like tit-for-tat. Ringer tit-for-tat will always beat tit-for-tat since it will score the same as tit-for-tat except when it goes up against shill tit-for-tat, where it will always get the best possible score.

This means that whatever the strongest ~160-character strategy is, the ringer/shill version will beat 2 more strategies than it. Intuitively, it seems unlikely that anyone will come up with a 240-character strategy which is that much stronger than the best ~160-character strategy. Partly this is because I suspect that the more sophisticated strategies that people will actually come up with will start running up against the execution time barrier and won't have time to make positive use of all that complexity.

you haven't provided a compelling reason why I need to disallow it.

You don't need to disallow it, I'm just saying that it would be ideal if it could be disallowed. It could easily not be worth the trouble.

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-10T15:35:07.429Z · LW(p) · GW(p)

Ringer tit-for-tat will always beat tit-for-tat since it will score the same as tit-for-tat except when it goes up against shill tit-for-tat, where it will always get the best possible score.

It will do worse than tit-for-tat against any bot that cooperates with tit-for-tat and defects against ringer tit-for-tat.

Replies from: daniel-radetsky

↑ comment by denyeverywhere (daniel-radetsky) · 2023-11-10T18:05:25.255Z · LW(p) · GW(p)

But not much worse; against counterexamplebot, ringer tit-for-tat will defect (so almost full double-defect), and tit-for-tat will always cooperate, so for that match ringer tit-for-tat is down about 50 points (assuming 50 rounds so the score is 100-49). Ringer tit-for-tat then picks up 150 points for each match against the 2 ringers, and the score is now (300-349). And it's only this close because the modified strategy is tit-for-tat rather than something more clever.

Also, this assumes that a bot even can defect specifically against ringer tit-for-tat. Insofar as ringer's source is secret and the identification is a function of the source of both ringer and shill, this may not be possible. If I understood correctly we only have access to ourselves and our opponent during the match, so we can't ask if some 3rd bot would always cooperate against our opponent, who would always defect against the 3rd.

Replies from: KingSupernova

↑ comment by Isaac King (KingSupernova) · 2023-11-11T23:49:30.162Z · LW(p) · GW(p)

Well, I'd encourage you to submit this strategy and see how it does. :)

comment by RedMan · 2023-11-10T04:08:56.031Z · LW(p) · GW(p)

I expect the tit for tat bot to win.

Replies from: KingSupernova, positivesum

↑ comment by Isaac King (KingSupernova) · 2023-11-10T05:25:59.401Z · LW(p) · GW(p)

I would bet against you on that. :)

↑ comment by positivesum · 2023-11-11T23:39:50.450Z · LW(p) · GW(p)

*Generous Tit for Tat!

A free to enter, 240 character, open-source iterated prisoner's dilemma tournament

Contents

19 comments