The Darwin Game
post by lsusr · 2020-10-09T10:19:10.516Z · LW · GW · 131 commentsContents
Instructions for non-programmers Instructions for programmers Coordinating with other players None 131 comments
Click here to participate. Entries must be submitted on October 18th, 2020 or earlier.
Entry is now closed.
In 2017, Zvi posted an exciting story [? · GW] about The Darwin Game, a variation of iterated prisoner's dilemma.
I will run my own version of the game in the week following October 18th, 2020. You do not have to know how to program in order to participate. I will code simple bots for non-programmers. If you do know how to program then you may create your own complicated bot.
Here are the rules. Changes from Zvi's original game are in brackets [like this].
For the first round, each player gets 100 copies of their program in the pool, and the pool pairs those programs at random. You can and often will play against yourself.
Each pair now plays an iterated prisoner’s dilemma variation, as follows. Each turn, each player simultaneously submits [an integer] from 0 to 5. If the two numbers add up to 5 or less, both players earn points equal to their number. If the two numbers add up to 6 or more, neither player gets points. This game then lasts for a large but unknown number of turns, so no one knows when the game is about to end; [I guarantee it will be at least 100 turns per iterated prisoner's dilemma].
Each pairing is independent of every other pairing. [You do know what round of the game it is and that you are facing an opponent. If you face a copy of yourself you are automatically awarded the maximum 5 points per round (2.5 points per bot). You otherwise do not know any history of the game to this point.] Your decision algorithm does the same thing each pairing.
At the end of the round, all of the points scored by all of your copies are combined. Your percentage of all the points scored by all programs becomes the percentage of the pool your program gets in the next round. So if you score 10% more points, you get 10% more copies next round, and over time successful programs will displace less successful programs. Hence the name, The Darwin Game.
Your goal is to have as many copies in the pool at the end of the last round as possible, or failing that, to survive as many rounds as possible with at least one copy.
[I will attempt to iterate until there is a stable equilibrium.]
I will add some silly bots of my own to make the early game more interesting.
Instructions for non-programmers
Please give a simple explanation of what you want your bot to do. Write it with mathematical precision. If your specification is even slightly ambiguous then you will be disqualified.
Instructions for programmers
Write a program of the following format.
class TitForTatBot():
def __init__(self, round=0): # Edit: Changed "1" to "0"
self.turn = 0
self.round = round
self.previous = None
def move(self, previous=None):
self.turn += 1
if self.previous:
output = self.previous
self.previous = previous
return output
else:
return 2
# Edit 2020-10-11: The above code is wrong. The properly-implemented TFT `move` method looks like this.
# def move(self, previous=None):
# self.turn += 1
# if previous == None:
# return 2
# else:
# return previous
Your class must have an __init__(self, round=1)
intializer and a move(self, previous=None)
method. You may write your class in Python 3 or Hy.
Unlike Zvi's original game, you do get to know what round it is. Rounds are indexed starting at 0. The previous
parameter of the move
method is an integer indicating what your opponent did last iteration. If it is the first iteration then previous
equals None
.
A new instance of your class will be initialized in each round. You may save whatever information you want into the class instance's fields but you may not save information between rounds or between class instantiations. The move
method must always return an integer from 0 to 5 inclusive.
You may import standard libraries like random
, scikit
and numpy
.
Coordinating with other players
Anyone can play, but only players with a Less Wrong account that existed before I declared this tournament will be allowed to coordinate out-of-game. This rule exists to prevent players from submitting multiple entries to this contest and self-coordinating. Coordinating with other people is encouraged. Coordinating with yourself between multiple separate entries is cheating.
Click here to participate. Entries must be submitted on October 18th, 2020 or earlier.
Entry is now closed.
131 comments
Comments sorted by top scores.
comment by Multicore (KaynanK) · 2020-10-19T18:13:48.919Z · LW(p) · GW(p)
tl;dr I betrayed the CloneBot clique and submitted MimicBot. My submission is in this Github repository.
My Process in chronological order:
There were three important things I noted in the original post
- Self cooperation is free, so an early lead is a big advantage.
- There are guaranteed to be “silly” bots in the early pool.
- You can vary your strategy depending on the round.
It seemed to me that the best strategy was to get as many points as possible early by exploiting silly bots, cooperating with anyone willing to cooperate, and folding somewhat to attackers. Then later on just play a non-exploitable cooperative strategy and hope for your early advantage to snowball.
Simulation
After seeing AbstractSpyTreeBot and some of the comments around it, it seemed to me that simulation was perhaps the best way to take advantage of simple bots. There are various approaches you could take, but mine was to simulate every possible sequence of moves I could make for the next N turns, and use the one with the best outcome. Since this has exponential time complexity, I only considered the moves 2 and 3 and kept N fairly small, with the option for the game runner to reduce it further to improve performance.
There are several possible pitfalls with simulators:
- The opponent might have randomized behavior, making your simulated predictions useless. In fact, not using randomization seems wrong to me, since you might get stuck in a rut against another deterministic bot, alternating 2 and 3 on the same schedule. Only bad bots would be deterministic and therefore simulatable.
- The opponent might have planted a landmine in their code to disqualify any potential simulator.
- The opponent might be slow, and make my dozens of simulations per turn take too long.
- The opponent might be a BullyBot, who attacks until the opponent retaliates. My short time horizon would make folding look correct, when actually retaliating would give more points in the long run. At the same time, I did want to fold against AttackBot.
My solution to the first three problems was to only run simulations against bots with two or fewer open parentheses “(” in their source. That’s how many you need to define __init__ and move, so it wouldn’t leave any for calls to exec or random or inspect. I suspect it might be possible to sneak a landmine past this requirement, but since I didn’t tell anyone what I was doing it seemed safe enough. Hopefully enough of the deterministic bots fulfill this requirement for simulation to be worthwhile.
For the BullyBot problem, I had the simulation also see what the best strategy was that didn’t let the opponent outscore by more than X but also got at least Y points. If this strategy didn’t exist, such as against AttackBot, I would fold, but otherwise I would play that strategy. I also made the simulation be more willing to be uncompromising against attackers as the rounds went on, in case one of the dumb bots was actually putting up a fight in the overall tournament.
For my backup strategy against non-simulatable bots I kept it simple and pulled from Zvi’s playbook. In the early rounds, I play an EquityBot which folds if the opponent plays 3 ten times in a row. In the later rounds, I play DefenseBot.
I considered a plan of obfuscating my code so that it would look safe to other simulators, and then disqualifying them if they tried to simulate me. I gave up on that because it seemed like other simulators would either be careful like I was or someone else would disqualify them for me. Plus, simulators are fun. I’d rather have a fair fight against other simulators. My code uses randomization and has lots of scary keywords in it, so other simulators aren’t going to be exploiting me.
To feed my simulation plan, I found two people who didn't have time to make a proper submission, and got them to submit PasswordBot, which plays 3 on the first two turns, then if the opponent played 2 on the first two turns it plays 2 for the rest of the game, and otherwise plays 3 for the rest of the game. I figured that only a branching simulator could take advantage of this.
CloneBot
Throughout the week, a couple people publicly announced intent to join the CloneBot clique. I decided that the CloneBots would be a huge threat that I might not be able to compete with if I didn’t do something to address them. Originally I was planning to not contact the clique and use the public source, but when Vanilla_cabs announced the source would not be public, I messaged them indicating willingness to join the clique. I’m not sure how suspicious the timing seemed to them, but I was included anyway. I gave Vanilla_cabs some notes about other parts of the code, but intentionally did not comment on anything related to security.
My plan was to bypass whatever they were using to check that the opponent is a clone, so that I could get perfect cooperation with clones, but also use my simulation strategy to exploit dumb bots and continue to cooperate with outsiders as the clones got more aggressive.
On Friday, the CloneBot security update was released, with the final version of the clone check.
(I can't figure out how to paste code into here), the source is here.
There are two constraints:
- After the string sanitization, my code before the payload must compare identical with the sanitized version of my opponent’s pre-payload code.
- After the string sanitization, my code after the payload must have at least eight spaces at the beginning of every line.
I got cracking.
I noticed that the checker was splitting the code by “\n”, so I wondered if there was any other character that python would treat as semantically equivalent to a newline. There is - the ASCII Form Feed character \f (0x0C). However, this didn’t work. I hadn’t noticed the earlier sanitization step of using splitlines() and then joining by \n. All my \f’s were just getting turned into \n’s and getting busted.
I then wondered if there was any other way of convincing python that there was a newline, or convincing it to dedent to a lower indentation level, without using a character that would be caught by splitlines or using less than eight spaces of indentation. A string of google searches led me to reading through the C code for the python lexer, which was very educational but convinced me that this route was a dead end.
I noticed in reading about the lexer that it has a somewhat complicated system for handling tabs and spaces in indents. I noticed that the code sanitization was automatically converting tabs to four spaces and wondered if I could use a mixture of tabs and spaces in indentation to manipulate the if / else statements in the clonebot move function to make it run the payload early. Unfortunately python 3 does not allow any mixture of tabs and spaces which makes the meaning of the code depend on how many spaces a tab is worth. I think this attack vector might be valid if we were using python 2.
At this point it was 3 AM and I slept on it. On Saturday morning I thought back to the very thing that had foiled my first attempt, the splitlines() sanitization. Instead of finding a character that would be treated as a newline but not caught by splitlines, what about a character that would be caught by splitlines but not treated as a newline? I found a thread online where people were complaining about splitlines splitting on certain characters, and a dev said the behavior was intentional. One of those characters was 0x1D, the ASCII Group Separator character. I tried it in a test program, and it worked. I tried it in CloneBot, and… it worked!
That’s the technique my final program uses. Starting with the first comment in the move function, I replaced every newline with 0x1D, which is a valid character in a comment. The bottom half of the CloneBot source is a comment and gets skipped over, so the move function actually runs my custom code, which is all indented at least eight spaces and fulfills the second constraint of the clone check. I use the same cooperateWithClone function as the clones to cooperate with them until the showdown round, at which point I just treat them like any other bot.
I’m not too surprised that other people tried and failed to cheat the restrictions, since it took me 8+ hours and some real thinking outside the box. It does make me feel better that I’m not the only would-be betrayer.
Predictions
How do I think I’m going to do? My bot passes the clone check both in Vanilla_cabs’s official checking tool and in my test harness, but I can see some possibility of something introducing enough differences to screw up the check. Vanilla_cabs encouraged people to submit through the form, while I submitted a git repo. Lsusr promised to preserve my special characters, but the real game environment might be different enough to make the check go wrong, though if that was the case maybe all clones would fail the check.
Separately, my bot is pretty complicated. I wouldn’t be surprised if it was the longest submitted, at over 250 lines of code. I’ve done a fair amount of testing, but I am not confident in my ability to make my software robust to anything that might happen. There’s a decent chance something unexpected will happen and I’ll crash and get disqualified.
However, if my bot doesn’t have a crippling bug and no one else figured out how to submit a MimicBot, I don’t really see how I lose. I get lots of points from clones, dumb bots, and cooperators, while everyone else is losing out against at least one of those.
I’ll say 30% my bot crashes or fails the clone test, 10% someone else submitted MimicBot and theirs is better, 10% those things don’t happen but I still lose, 50% I win.
Replies from: Lanrian, philh, Taleuntum, Vanilla_cabs↑ comment by Lukas Finnveden (Lanrian) · 2020-10-19T22:17:36.862Z · LW(p) · GW(p)
Damn, good job. We should've gone with my suggestion that the whole payload needed to fit on one line, separated by ; (though maybe this would've caused us to lose so many clique-members out of annoyance that it wouldn't have been worth it).
Replies from: Vanilla_cabs↑ comment by Vanilla_cabs · 2020-10-20T17:32:39.005Z · LW(p) · GW(p)
That's an idea worthy of consideration, but in addition to the risk you raised, I also feared some members would have submitted invalid bots.
Replies from: Lanrian↑ comment by Lukas Finnveden (Lanrian) · 2020-10-20T18:21:10.782Z · LW(p) · GW(p)
Yeah, if we'd seen the issue, I think we could've gotten around it just by not using splitlines, which would've been smoother.
Though of course, this exploit updates me towards thinking that there are other vulnerabilities as well.
Replies from: philh↑ comment by philh · 2020-10-20T21:02:57.328Z · LW(p) · GW(p)
If we don't use splitlines we instead need to use something similar, right? Like, even if we don't need to worry about LF versus CRLF (which was a genuine suggestion I made), we still need to figure out if someone's got any de-indents after the start of the payload. And I don't expect us to do better without splitlines than with it.
Replies from: Lanrian↑ comment by Lukas Finnveden (Lanrian) · 2020-10-20T23:03:47.624Z · LW(p) · GW(p)
Using newlines to figure out what happens after "payload" is fine, as far as I can tell. Multicore's exploit relies on newlines being used when comparing stuff before the payload.
Stuff like CRLF vs LF is a bit awkward, but can maybe be handled explicitly?
Replies from: philh↑ comment by philh · 2020-10-20T23:44:51.150Z · LW(p) · GW(p)
Oh yeah, that's true as far as I know. I guess it depends how much we trust ourselves to find all instances of this hole. A priori I would have thought "python sees a newline where splitlines doesn't" was just as likely as the reverse. (I'm actually not sure why we don't see it, I looked up what I thought was the source code for the function and it looked like it should only split on \n, \r and \r\n. But that's not what it does. Maybe there's a C implementation of it and a python implementation?)
↑ comment by philh · 2020-10-19T21:27:43.519Z · LW(p) · GW(p)
Clever! I looked for holes in mostly the same directions as you and didn't find anything. I think I either didn't think of "things splitlines will split on but python won't", or if I did I dismissed it as being not useful because I didn't consider comments.
↑ comment by Taleuntum · 2020-10-19T21:23:01.699Z · LW(p) · GW(p)
Your betrayal of the clique is very nice, hats off to you. I also liked your idea of getting others not that interested in the game to submit bots helping you, It's a pity it did not occur to me.
However, I think you are, too, overconfident in you winning. I've run a simulation of the whole tournament till the 160th round with 8 bots (MatrixCrashingBot, TitforTatBot, PasswordBot1, PasswordBot2, EmptyCloneBot, earlybird, incomprehensiblebot, CliqueZviBot) and in the final equilibrium state there are three bots: earlybird, incomprehensiblebot and CliqueZviBot with roughly equal populations. While PasswordBots do help you at the start your lead seems to disappear later when the dumb bots and non-clique members die (which is nice, because your bot's output when simulating is pretty annoying). Sure, it's just one run of the tournament with a low number of participants (it's slow to do the tournament on my laptop), but it's something.
Replies from: KaynanK↑ comment by Multicore (KaynanK) · 2020-10-19T21:40:22.256Z · LW(p) · GW(p)
That does revise down my expectations of winning, but my bot having run thousands of times on someone else's computer and not crashing (or failing the clone check?) is good to hear.
Maybe I'm overestimating the snowball effects of an early pool. If the late game has everyone cooperating with everyone else, your matches with others are only giving a tiny bit fewer points than matches against your copies.
↑ comment by Vanilla_cabs · 2020-10-20T17:44:48.817Z · LW(p) · GW(p)
Originally I was planning to not contact the clique and use the public source, but when Vanilla_cabs announced the source would not be public, I messaged them indicating willingness to join the clique. I’m not sure how suspicious the timing seemed to them, but I was included anyway.
I gave an equal chance to an early newcomer and a late newcomer of trying to betray. Maybe I was wrong, and I should be mindful of that in the future.
Also, I felt like our numbers were dangerously close to the minimum (6), and I expected a couple members to retract before I revealed the names (which di not happen). So in my mind I had no choice but to accept you.
I tried it in CloneBot, and… it worked!
Good job! My plan for security was to leave an obvious vulnerability, and entrust members who would report with the task to look for more subtle ones. Only Lanrian reported, late in the week, and I didn't trust them enough because I was suspicious of their motive when they'd asked me for making our code easier to simulate (which turns out they were honest about).
comment by Zvi · 2020-10-09T10:44:31.617Z · LW(p) · GW(p)
Free self recognition seems like it makes game less interesting, And that anyone who gets a big lead early just wins?
Replies from: vanessa-kosoy, lsusr, Ericf, oskar-mathiasen↑ comment by Vanessa Kosoy (vanessa-kosoy) · 2020-10-12T14:22:41.701Z · LW(p) · GW(p)
Given that you can read the opponent's source code [LW(p) · GW(p)], self-recognition is trivial to implement anyway (trivial but annoying, since you need to do quining).
Replies from: lsusr↑ comment by lsusr · 2020-10-09T20:28:14.066Z · LW(p) · GW(p)
I am aware of the potential strategic implication you bring up. However, as the organizer of this game, I believe it is my place neither to confirm nor deny beforehand any strategic implications of these rules.
Explaining my rationale necessarily involves discussing the strategic implications. I will explain my rationale for this choice after the game.
Replies from: GuySrinivasan, lsusr↑ comment by SarahSrinivasan (GuySrinivasan) · 2020-10-09T21:11:46.805Z · LW(p) · GW(p)
Eh, okay, but (prediction) this choice nudges me from "probably participate" to "probably ignore".
Replies from: Harmless↑ comment by Harmless · 2020-10-09T22:28:06.968Z · LW(p) · GW(p)
If I had to guess, the reasoning behind it is to nudge the game closer to a 'true' prisoner's dillemma (trying to work out if your opponent is willing to cooperate, rather taking focus away from it towards the shallower problem of trying to work out if your opponent is a copy of you)
Replies from: GuySrinivasan↑ comment by SarahSrinivasan (GuySrinivasan) · 2020-10-09T23:08:06.172Z · LW(p) · GW(p)
I agree, and this design avoids that problem, but seems to introduce a much larger one, assuming the intent also includes measuring bots on their ability to survive in progressively more "difficult" bot mixes, which "Darwin" seems to imply.
This choice also nudges me from "has noodled around the idea of hosting a similar competition many times and probably won't" to "same, but slightly more likely to actually do it". :D
Replies from: KaynanK↑ comment by Multicore (KaynanK) · 2020-10-15T21:10:01.095Z · LW(p) · GW(p)
My suggestion for a future Darwin Game tournament is to get rid of the 0, 1, 4, and 5 moves, and leave 2 and 3 as the only legal moves. Serious bots generally don't need or use the other moves, so they mostly just make you add annoying special case logic to get a few more points out of GoofBots in the early game.
Replies from: John_Maxwell_IV↑ comment by John_Maxwell (John_Maxwell_IV) · 2020-10-15T22:52:18.689Z · LW(p) · GW(p)
It's a good point but in the original Darwin Game story, the opening sequence 2, 0, 2 was key to the plot.
↑ comment by lsusr · 2020-12-04T21:05:49.983Z · LW(p) · GW(p)
Update: There are two primary reasons for free auto-recognition. Firstly, I promised to write bots for non-programmers and I didn't want to promise to implement an arbitrary number of bad crypto systems written by non-specialists. Secondly, I believe is is good game design to reward early aggression.
↑ comment by Ericf · 2020-10-10T04:57:29.663Z · LW(p) · GW(p)
Reading the other's source code is a bigger change (actually, a superset of auto-recognizing self) It actually makes coordinating trivial and enforceable, since you can just check for the correct behavior & password before cooperating. And if you can run your opponent's source code...
↑ comment by Oskar Mathiasen (oskar-mathiasen) · 2020-10-10T09:58:39.081Z · LW(p) · GW(p)
If i had to guess on the motives, the last time a similar game was played (non publically) the meta developed to be about self recognizing, this is likely a rule to avoid this.
Winning strategy was some key string to identify yourself, cooperate with yourselt, play 3 otherwise. (Granted number of iterations was low, so people might not have moved to be non cooperating strategies enough (something like grim trigger))
comment by johnswentworth · 2020-10-09T14:35:54.558Z · LW(p) · GW(p)
Is reading the opponent's source code legal?
Replies from: lsusr↑ comment by lsusr · 2020-10-09T20:39:55.561Z · LW(p) · GW(p)
Yes. You may read the opponent's source code via the get_opponent_source(self)
function.
from extra import get_opponent_source
class SpyBot():
def __init__(self, round=1):
self.round = round
def move(self, previous=None):
opponent_source = get_opponent_source(self) # return value is of class `str`
if '3' in opponent_source:
return 2
return 3
Beware the halting problem. Infinite recursion constitutes overuse of compute resources and may be grounds for disqualification.
Replies from: vanessa-kosoy, John_Maxwell_IV↑ comment by Vanessa Kosoy (vanessa-kosoy) · 2020-10-12T14:25:00.944Z · LW(p) · GW(p)
Infinite recursion constitutes overuse of compute resources and may be grounds for disqualification.
Is this disqualification in advance, or in run-time? That is, do you just look at the code and decide whether it's good, or do you give each program some bounded time and memory to run and disqualify it if any copy overflows it? (Btw another option would be, punish only that copy.)
Replies from: lsusr↑ comment by lsusr · 2020-10-12T20:33:25.920Z · LW(p) · GW(p)
Run-time. I disqualify the entire program—not just the one copy—and then restart the game with one (or fewer) competitors.
Replies from: kuudes↑ comment by kuudes · 2020-10-17T00:14:59.676Z · LW(p) · GW(p)
Does this open a security hole of future prediction like Spectre etc?
Some bots could have a thing where they remember if they have met certain another bot (SignalBot) in the game. If they haven't, they decide that the game setup carries certain preset information.
If the bots find out later in the game that the preset condition would be true, then they coordinate so that SignalBot causes infinite loop and gets disqualified and the game restarted. Now the game will miss SignalBot, causing the others to use this information to deduce the signalled information.
Tl;dr:
Givens: In some cases Left is best strategy. Otherwise Right is best strategy.
- ForetellerBot observes SignalBot in the game, decides to play Right.
- Bots observe Left is better strategy.
- SignalBot nonhalts and gets thrown out, game is reset.
- ForetellerBot observes no SignalBot in the game, decides to play Left.
↑ comment by Multicore (KaynanK) · 2020-10-17T00:39:34.358Z · LW(p) · GW(p)
OP said elsewhere in the comments (emphasis mine):
Your code is allowed to peek at the game engine for the purposes of figuring out if it is being simulated by someone else's code. Peeking at the game engine for other reasons, like figuring out how many of each opponents remain or attempting to modify the game engine itself or attempting to modify your opponents' code from anywhere outside their own simulation of you is against the rules.
And in the original post:
Replies from: kuudesYou may save whatever information you want into the class instance's fields but you may not save information between rounds or between class instantiations.
↑ comment by kuudes · 2020-10-17T01:04:57.886Z · LW(p) · GW(p)
You are not peeking at the game engine, you can just message this the same as you can message cooperation (2, 0, 2 code etc).
You also do not need to save any information over instances - all of your bots on a round are running on the same instance. If any of your ForetellerBots observes SignalBot, then SignalBot has not dropped. SignalBot's existence is in itself the flag.
Replies from: KaynanK↑ comment by Multicore (KaynanK) · 2020-10-17T01:22:51.697Z · LW(p) · GW(p)
Each pairing is independent of every other pairing. [You do know what round of the game it is and that you are facing an opponent. If you face a copy of yourself you are automatically awarded the maximum 5 points per round (2.5 points per bot). You otherwise do not know any history of the game to this point.
A separate instance of each bot is created for each pairing in each round. All that ForetellerBot knows in any pairing is whether it is itself facing a SignalBot, not whether any of its copies are.
Replies from: kuudes↑ comment by kuudes · 2020-10-17T12:24:34.822Z · LW(p) · GW(p)
Thanks for the info. This conflicts with the specification of
A new instance of your class will be initialized in each round.
which I interpreted to mean that there exists exactly 1 instance of the class per round.
The model you propose makes sense though, I guess my mental model of the thing was mistaken.
↑ comment by John_Maxwell (John_Maxwell_IV) · 2020-10-10T12:38:39.021Z · LW(p) · GW(p)
Why does get_opponent_source take self as an argument?
Replies from: lsusr↑ comment by lsusr · 2020-10-10T12:55:09.544Z · LW(p) · GW(p)
There are two active bots. The self
argument tells the game client which bot's code not to return.
↑ comment by John_Maxwell (John_Maxwell_IV) · 2020-10-10T15:39:59.027Z · LW(p) · GW(p)
Thanks.
comment by Vanilla_cabs · 2020-10-12T00:54:20.285Z · LW(p) · GW(p)
If you're participating in the contest and you want to win, I have a proposition for you:
Howdy! You've probably looked up Zvi's past Darwin game [? · GW] that directly inspired this one. A group of players formed a clique who recognized each other, cooperated among themselves and defected on everyone else. They nearly wiped the board, but they were preyed upon by a member who played both sides.
What they missed was a way to guarantee that all members apply the decided strategy. They had no way to enforce it.
But we have a way.
I call it CloneBot: a bot who checks that its opponent has the exact same code as itself. No way to cheat that! It guarantees that every member of the clique does the same thing. Moreover, there'll be a way to cooperate optimally, avoiding losing the first rounds to coordinate. The clique are gonna be the most efficient ccoperators.
But in the end we're all gonna tie, it's boring. I want to take a chance at winning!
So do I. This is why the clique are only going to collaborate until a predefined round. After we've eliminated the competition, we can have a dramatic showdown among ourselves. Cool, no? In the code, there's gonna be a separate function that's called only after a given round. The code checker will verify that the function is only called at the right time, but will ignore what is inside.
What will CloneBot do?
Depends on the proportion of clones. If we're a big group, CloneBot will straight up defect against outsiders by playing 3. Otherwise, CloneBot will cooperate, but make sure opponent does not gain more than itself.
With its clones, CloneBot will cooperate by alternating 2-3. Who gets the first 3 will be determined fairly before the first turn starts.
Ok, but I don't want to find myself in a small clique that's going to lose.
You don't have to commit to submitting CloneBot right away. All you need to do is contact me to say you're in, conditional on the clique being large enough. By default, contact me privately. If you want, you can roll a D6, and if you roll 6, join me publicly right here.
A few days before the deadline, if we're 5 or less, I'll just announce that the critical mass was not reached. But if we're more than 5, I will make public the names of all who contacted me to join the group and crush the competition. If I post your name at that moment, you must submit CloneBot.
I like the idea, but I wish X instead of Y.
Every detail is open to discussion for about 3 days. After that, this post will be updated, and every person who expressed their desire to join will be informed of the changes and asked to confirm their participation.
Where do I find CloneBot?
I'm considering making the code public.
Let the best clique member win :)
Replies from: Measure, Vanilla_cabs, frontier64, arxhy, DaemonicSigil↑ comment by Measure · 2020-10-15T20:49:38.041Z · LW(p) · GW(p)
I don't approve of this. Assuming that everyone who pledges to join the clique actually submits a CloneBot and nobody finds a way to beat the recognition code and defect from the clique, and assuming there isn't a subtle bug in the code that disqualifies some or all of the clones, then the clone cohort can indeed eliminate the non-clone bots. At that point though, we're right back where we started, and then what? Why not just let the best bot win in the first place?
If everyone goes through with this, then of course I'd be better off submitting a clone myself (again assuming no cheating/errors/etc. - I would certainly need to see the code myself before deciding to join), but this is a bit different from typical public-goods-type pledges. Typically, everyone wants the thing done but given that it is done would individually rather not contribute. Here everyone would rather the thing not be done, but given that it is done would individually rather contribute. This is a straightforward example of a bad equilibrium.
If you have pledged, or are thinking of pledging, consider:
- How surprised would you be if a bug in the CloneBot code disqualified all the clones?
- How surprised would you be if someone managed to bypass the code checking and defect from the group?
- How surprised would you be if one or more people who pledged didn't actually submit a CloneBot?
- Is this the kind of equilibrium you want to encourage?
↑ comment by Vanilla_cabs · 2020-10-15T22:19:21.555Z · LW(p) · GW(p)
I get it that you don't like that players join forces. I am not sure I'd allow coordination if I had a say on the rules. But per the rules coordination is part of the game. That's it. For all we know, others are making cliques in secret.
I believe my scheme substantially increases our chances of winning, so I'll go with that.
Admissions are closing soon. Good luck, whatever you decide :)
Replies from: Measure↑ comment by Measure · 2020-10-16T01:52:28.934Z · LW(p) · GW(p)
I can't say it's not fair, and I do realize you've put a lot of work into this. Have you decided to make the clone code public?
Replies from: Vanilla_cabs↑ comment by Vanilla_cabs · 2020-10-16T10:19:49.940Z · LW(p) · GW(p)
At least one member asked for a basic obfuscation measure. Publishing the code would defeat their purpose.
Also, from an insider's perspective, publishing the code now would only slightly increase our chances to get another member before the end of admissions, while it would entail a significant risk of opponents adjusting their strategy against it. I should have decided on the publication earlier, but honestly it was never a priority.
↑ comment by Vanilla_cabs · 2020-10-17T16:38:56.460Z · LW(p) · GW(p)
We were five hundred, but with swift support
Grew to three thousand as we reached the port
(Le Cid)
It's been an exciting week, I've had lots of fun, thanks everyone who shared ideas, and thanks lsusr for swiftly and kindly answering all my questions. Now is time for the final act.
- arxhy
- DaemonicSigil
- Emiya
- frontier64
- Lanrian
- Multicore
- philh
- simon
- Taleuntum
- Vanilla_cabs
You will receive a personal message shortly.
That is all.
↑ comment by frontier64 · 2020-10-15T14:59:24.125Z · LW(p) · GW(p)
I pledge to join if there is at least 5 people total joined.
↑ comment by DaemonicSigil · 2020-10-12T04:37:15.174Z · LW(p) · GW(p)
How do you determine who gets the first 3? Maybe lsusr will be kind enough to provide a symmetry-breaking bit in the "extra" package. (It would only be fair, given that bots playing themselves are automatically given max score.) If not, and you have to do things the hard way, do you compare source code alphabetically, and favour X over Y on even rounds and Y over X on odd rounds?
Also, it may be a good idea to make the level of defection against outsiders depend on the round number. i.e. cooperate at first to maximize points, then after some number of rounds, when you're likely to be a larger proportion of the pool, switch to defecting to drive the remaining bots extinct more quickly.
Replies from: lsusr, Vanilla_cabs↑ comment by Vanilla_cabs · 2020-10-12T08:18:18.015Z · LW(p) · GW(p)
do you compare source code alphabetically, and favour X over Y on even rounds and Y over X on odd rounds?
Great idea! I've updated the following heuristic using that.
There is one thing that is different between the programs: the code that you will add to be executed in the later rounds (the payload). As I said, CloneBot will ignore it when judging whether its opponent is a clone. But if the opponent is a clone, it will use this heuristic to decide who goes first:
- compare both payloads lexicographically
- if the difference in length is the same parity as the round, the shortest plays 3
- otherwise, the longest plays 3
This is fair, deterministic, and needs 0 turn to communicate. There's no point in tweaking your payload in the hope of starting with 3 more often. The only problem are ties, which are unllikely, and adding your name as a comment solves it.
Why also compare length? Because otherwise, the payloads of extreme length (very short or very long) would have very stable alternating priority, while the ones in the middle would be more subject to randomness. This way, it's the same randomness for everybody.
Also, it may be a good idea to make the level of defection against outsiders depend on the round number. i.e. cooperate at first to maximize points, then after some number of rounds, when you're likely to be a larger proportion of the pool, switch to defecting to drive the remaining bots extinct more quickly.
That seems reasonable. I'm just worried that we might let greedy or even cooperating bots take too much lead. Ideally, as soon as the clique reaches criticals mass, it should starve its opponents. The 'as soon' depends on what proportion of the pool we'll initially be.
comment by Taleuntum · 2020-10-19T07:32:32.133Z · LW(p) · GW(p)
Explanation of my strategy and thought process in chronological order
After seeing Vanilla_Cabs's comment I lied to them about wanting to join the clique. I was undecided, but I figured seeing the code of the clique can be a great advantage if I can exploit some coding mistake and I can still decide to join later anyway if I want.
The first versions of CloneBot (the name of the program for our clique) did actually contain a mistake I could exploit (by defining the __new__() method of the class after the payload) and so this was my plan until Vanilla_Cabs fixed this mistake. After they fixed it, I didn't notice any way I can take advantage, so I joined the clique in spirit.
Initially my plan for my bot was a simulator which simulates between 100 and 1000 turns of the opponent against a few candidate bots (ZviBot, TiTforTat, return 3, return 2, etc..) and then depending on the round number either chooses the one with the maximum point for me or the one which gets more points than the opponent. There were unfortunately two problems with this plan:
- Bots who can detect if they are in a simulation can get me into an infinite loop which would disqualify my program, so I had to research the ways this might happen to know how to prevent it. I started by inspecting the code for AbstractTreeSpyBot and I did notice some ways to do this:
- The code always instantiates a new object, so by storing past moves, I can detect if I'm in a simulation if my move() method's argument is not None and I have not stored previous moves. This defect however is really specific to this program and other simulators can easily avoid it by storing the instantiated class object between rounds.
- The code simply calls the simulated bot's move() method which can be detected by inspecting the call stack and finding a local variable named 'self' with a method named 'move' of a class type not named the simulated bot's classname. This method of detection can still be tricked by running the simulation in a different process (eg by calling the python interpreter from the shell, this has the advantage of having an easy and robust way to redefine get_opponent_source()), but still I expect this would DQ many submitted simulators. (I attached the code of a bot using this type of detection.)
- Isusr posted a very unforgiving time limit. Unfortunately, I could not make simulator bot work in this timelimit. In fact, on my laptop even AbstractSpyTreeBot does not run in 0.05s (1 init + 100 moves) and to avoid detection I would have to call the python interpreter from the os which would again cost a lot of time. It does not even matter what the time limit is just that it is not a vague 'don't be too slow' like it was initially, because I planned to simulate (100-1000 moves+init)*(number of programs I want to try), so if the number of programs I want to try is greater than 1 I would go over the limit if my opponent uses even just half the limit.
After this, seeing that I can't make my simulator work, I abandoned the idea of me submitting a simulator. However seeing that some other types of simulator can still be pretty strong, I decided to disincentivize them, so when Larian cautioned in the clique-chat that simulator crashing bots weaken our clique (as isusr restarts the whole tournament in the event of DQ), so we should not use them, I lied that I've already submitted one program detecting and crashing simulators. Of course Larian's point was valid, so obviously I did not even plan to do so. Some time later It occured to me that my claim that there is a simulator crashing bot in the clique might incentivize those who wanted to submit a simulator to leave the clique, so I added another lie that the method of detecting simulators was my friend's idea (hopefully suggesting that there is another contestant with the same method outside the clique). I'm curious how believable my lies were, I felt them to be pretty weak, hopefully it's only because of my inside view.
After this I had to come up with an actual program. My intrigue and life didn't leave me with much time, so I finally submitted a ZviBot (a bot similar to the one described in the Darwin Game series) as a payload (for those not in our clique: payload is the part of the CloneBot which activates in the endgame against other clique members). While I did not have time to run simulations, I had three reasons for this submission:
- ZviBot has a good track record in a similar competition.
- Given that bots in this competition know the round number, I expect that many will get meaner (less cooperating) in the later rounds on the grounds that if they survived for this many rounds they are probably a significant portion of the pool, so they can try to kill others. However because the programs will use different rates of becoming uncooperating, they will get eliminated from the pool one by one as the others are still cooperatating. By not getting meaner my plan was to stay in the final equilibrium state with others.
- I was curious how such a simple bot would perform.
Bot detecting simulators:
class MatrixCrashingBot():
def __init__(self, round=1):
import inspect
from time import sleep
sf = inspect.stack()
simulated = False
enemy = ""
for f in sf:
if ('self' not in f.frame.f_locals):
continue
fo = f.frame.f_locals['self']
fod = dict(inspect.getmembers(fo))
if 'MatrixCrashingBot' != fo.__class__.__name__ and 'move' in fod:
simulated = True
enemy = fo.__class__.__name__
break
if simulated:
while 1:
print("{} tried to simulate me :O >:\[ So I sent him to an infinite cycle!\nNote to the tournament organizer: The infinite cycle is run by {} not by the MatrixCrashingBot! ".format(enemy, enemy))
sleep(3)
def move(self, previous=None):
return 3
Bot I finally submitted: (everything exept the payload is written by Vanilla_cabs. The core idea (alternating 3s and 2s) of the payload is from Zvi)
class CloneBot():
def __init__(self, round=0):
import math
import random
import extra
self.math = math
self.random = random
self.extra = extra
self.showdownRound = 90 # after this round, your personal program takes over
self.round = round # the current round
self.myMoves = [] # all the moves you've made, first to last
self.opponentMoves = [] # all the moves your opponent has made, first to last
my_source_raw = extra.__getattribute__(''.join(['ge','t_','my','_s','ou','rce']))(self)
opponent_source_raw = extra.__getattribute__(''.join(['ge','t_','op','po','ne','nt','_s','ou','rce']))(self)
my_source = "\n".join([" ".join(line.split('\t')).rstrip() for line in my_source_raw.splitlines()])
opponent_source = "\n".join([" ".join(line.split('\t')).rstrip() for line in opponent_source_raw.splitlines()])
if not 'def payload(self) :' in opponent_source :
self.is_opponent_clone = False
else :
my_common_code, my_payload = my_source.rsplit('def payload(self) :', 1)
opponent_common_code, opponent_payload = opponent_source.rsplit('def payload(self) :', 1)
if my_common_code != opponent_common_code :
self.is_opponent_clone = False
else :
self.is_opponent_clone = True
for line in opponent_payload.split("\n") :
# checks that no common method or property is overwritten after the payload
# allows the innocuous command "foo = 'bar'" by member's demand
if line.lstrip() != "" and line != "foo = 'bar'" and line[0:8] != " " :
self.is_opponent_clone = False
break
if self.is_opponent_clone :
payload_length_difference = len(my_payload) - len(opponent_payload)
if my_payload != opponent_payload :
# compares payloads without reading them
# fair way to decide who starts with 3 between two clones
# for 100% protection against ties, personalize your payload with a comment
self.high_first = (my_payload < opponent_payload) == ((payload_length_difference+round) % 2 == 1)
def move(self, previous=None) :
self.turn = len(self.myMoves) # the current turn
# pseudorandom to allow simulators to collaborate
self.random.seed((self.round+1) * (self.turn+1) * (7 if previous==None else (previous+1)))
if previous != None :
self.opponentMoves.append(previous)
if self.is_opponent_clone :
if self.round < self.showdownRound :
output = self.cooperateWithClone()
else :
output = self.payload()
else :
output = self.default()
self.myMoves.append(output)
return output
def defaultCooperation(self) : # factor influencing behaviour with non-clones, 1 at round 0, 0 at round 60
return max(0.0, float(self.showdownRound - (self.round*1.5)) / self.showdownRound)
def cooperateWithClone(self) :
if self.turn == 0 :
if self.high_first :
return 3
else :
return 2
else :
return self.opponentMoves[-1]
def default(self) :
if self.turn == 0 :
if self.random.random() < 0.5 * self.defaultCooperation() :
return 2
else :
return 3
elif self.myMoves[-1] + self.opponentMoves[-1] == 5 :
if self.myMoves[-1] == 2 :
return 3 # tit for tat
elif self.myMoves[-1] == 3 :
if self.turn >= 2 :
if self.myMoves[-2] == 3 and self.opponentMoves[-2] == 2 :
return 3 # stable 3 against 2
if self.random.random() < self.defaultCooperation() * 1.2 :
return 2 # cooperation
else :
return 3 # maintain 3 against 2
else :
return self.myMoves[-1] # free candy
elif self.myMoves[-1] + self.opponentMoves[-1] < 5 :
return 5 - self.opponentMoves[-1]
else : # sum > 5
if self.random.random() < self.defaultCooperation() * max(0, 50-self.turn) / 100.0 :
return 2 # back down
else :
return 3 # maintain
def payload(self) :
# put a personal word here to guarantee no tie during cooperation: in a pretty body no one can see your rotten soul
# put what you want to play for the showdown
# no line after 'def payload(self)' should have less than 8 whitespaces at the beginning,
# unless it's an empty or only whitespace line
#
# Idea of solution: most program will gradually get meaner as rounds
# progress, but because they do so at different rates
# they will get eliminated as other, still cooperating bots outcompete their
# meanness. By not getting meaner I plan to stay in the final equilibrium state.
# Otherwise it's a ZviBot.
if self.turn == 0:
self.switched = False
return 3
else:
# against very submissive bots
if self.turn >= 3 and self.opponentMoves[-1] < 3 and self.opponentMoves[-2] < 3 and self.opponentMoves[-3] < 3:
return 5-self.opponentMoves[-1]
# against bots who are very bad at detecting patterns
if self.turn >= 5 and self.opponentMoves[-1] == 3 and self.opponentMoves[-2] == 2 and self.opponentMoves[-3] == 3 and \
self.myMoves[-1] == 3 and self.myMoves[-2] == 2 and self.myMoves[-3] == 3 and not self.switched:
self.switched = True
return 3;
if self.myMoves[-1] == 2:
return 3
elif self.myMoves[-1] == 3:
return 2
Replies from: Vanilla_cabs, philh, philh, Lanrian↑ comment by Vanilla_cabs · 2020-10-19T13:52:29.973Z · LW(p) · GW(p)
The first versions of CloneBot (the name of the program for our clique) did actually contain a mistake I could exploit (by defining the __new__() method of the class after the payload) and so this was my plan until Vanilla_Cabs fixed this mistake. After they fixed it, I didn't notice any way I can take advantage, so I joined the clique in spirit.
Little did you know that I was aware of this weakness from the beginning, and left it as a test to find whom I could trust to search for the weaknesses I didn't know. Of the 3 (I think) to whom I showed the code early, only Lanrian reported it.
I'm curious how believable my lies were, I felt them to be pretty weak, hopefully it's only because of my inside view.
I didn't play a simulator so I didn't care about the first.
About the second, I can tell you that another member alerted me that you seemed to have a hidden ally. They feared you had made an ally outside the clique, or just given the code to join the clique to a player know only to you. Which I thought was a possibility. Actually, I hoped for a few stowaways to boost our numbers.
Replies from: Taleuntum↑ comment by Taleuntum · 2020-10-19T14:52:52.304Z · LW(p) · GW(p)
Maybe it's a little cheap to say this after you've revealed it, but it did actually occur to me that you might have deliberately made this weakness. Had I known that in Python you can redefine methods, I might have reported it, but the exploit with __new__() seemed pretty obscure (even though I didn't know the other way and I did know this). The possibility of this being a test was also the reason I went with the "Oh I'm so busy, I didn't have time to review the code.." excuse. I'm also curious whether Larion calculated with you deliberately planting the mistake or they had in-game ethics. Also, before you posted the list of the members publicly, you were the center of the clique and could control the information the clique members got. I was really paranoid about this and I feel you could have used this somehow. Have you thought along these lines?
About your second point, It's nice I could make someone believe that I had an ally outside the clique.
Replies from: andrea-mulazzani, Vanilla_cabs, Lanrian↑ comment by Emiya (andrea-mulazzani) · 2020-10-19T23:31:23.716Z · LW(p) · GW(p)
Oh, that was me I think. I had simply thought your comment meant you were preparing code with someone else. Whether he was inside the clique, outside it, or a non player helping you out I wasn't sure, but I still recommended caution.
I did think it was weird that you'd let slip such information, but couldn't see any reason for making people think you had allies, so I just thought that the most likely explanation was that a non player was helping you. Still, being cautious wouldn't hurt.
I have to say I didn't made the connection about simulation crashing software being outside the clique, likely because I wasn't playing a simulator so I didn't thought much about it.
All in all... I think it's a lie that would work best on the people it wouldn't need to work on. If I had thought to change a plan I had going based on the information you provided, I would have wondered a bit more about why you did that, perhaps getting suspicious.
But I still think it wouldn't really be obvious as a lie to anyone.
On a side note, I really love this site. I can't really recall any other game I've been in getting this tangled.
↑ comment by Vanilla_cabs · 2020-10-19T16:07:10.939Z · LW(p) · GW(p)
I didn't know about __new__(), I only knew about redifining methods, so based on what you knew, your reasoning was correct.
I knew no one before starting the clique. Lanrian joined the same way as the others. If anything, Lanrian was suspicious because they insisted we put the random.seed() inside move() and make it pseudorandom so that simulators can accurately emulate our behaviour. The reason they gave was to better collaborate, and have the simulators play 2 against 3 instead of 3 against 3. I was mildly convinced and I still am suspicious of that move. They only late in the week reported the weakness, after you and philh passed on the chance to do so. But they did so soon after I showed them the code.
I was really paranoid about this and I feel you could have used this somehow.
The secrecy on the members was used to:
- prevent members and potential members from worrying if there were too few current members. That was the purpose I had in mind when I made that choice. A few days before the end I still was not sure we'd be enough. I was also worried some members would drop if we were too little. So the 2 members who joined in the last 2 days really helped.
- avoid any collusion between members that would not include me. And more generally receive any valuable information that members would like to share.
So I used that advantage only in a defensive way. But I did receive an offer that did inform me on more offensive uses, and impacted my payload, which I will elaborate on if the sender allows it.
Replies from: Lanrian↑ comment by Lukas Finnveden (Lanrian) · 2020-10-19T21:15:05.724Z · LW(p) · GW(p)
I stand by my reasoning! As long as we don't yield to bullying, simulators are our friends, ensuring that the maximum payout is always payed out.
↑ comment by Lukas Finnveden (Lanrian) · 2020-10-19T21:14:08.853Z · LW(p) · GW(p)
I didn't think about reporting the bug as making a sub-optimal but ethical choice – I just wanted to be part of a clique that worked instead of a clique where people defected. My aversion to lying might have affected my intuitions about what the correct choice was, though, idk ¯\_(ツ)_/¯
↑ comment by philh · 2020-10-19T12:44:35.572Z · LW(p) · GW(p)
by defining the
__new__()
method of the class after the payload
Incidentally, you could also just redefine existing methods, which was how I planned to do it. Like,
class Foo():
def __init__(self):
self.x = 1
def __init__(self):
self.x = 2
Foo().x # 2
Replies from: Taleuntum↑ comment by philh · 2020-10-19T08:25:27.434Z · LW(p) · GW(p)
I lied that I’ve already submitted one program detecting and crashing simulators. ... I added another lie that the method of detecting simulators was my friend’s idea (hopefully suggesting that there is another contestant with the same method outside the clique). I’m curious how believable my lies were, I felt them to be pretty weak, hopefully it’s only because of my inside view.
I believed both of these lies, though if I'd come to rely on them at all I might have questioned them. But I assumed your friend was in the clique.
Replies from: Taleuntum↑ comment by Taleuntum · 2020-10-19T15:02:23.693Z · LW(p) · GW(p)
Yes, I feared that some might think my friend is in the clique. However I couldn't just say that they are not in the clique, because that would have been too obvious. (like my other lie: "Yeah, I totally have another method for detecting being in a simulation even if the simulation runs in a separate process, but unfortunately I can't reveal it.") So I tried to imply it by speaking about him as if he is not in the conversation and him not commenting after I mentioned him. I hoped in case someone was planning to submit a simulator outside the clique they would try to sneakily inquire about whether my friend is in the clique or not and then I would have asked a random, not competing lesswronger to play the part of my friend.
↑ comment by Lukas Finnveden (Lanrian) · 2020-10-19T21:08:39.227Z · LW(p) · GW(p)
I believed all lies! And I might've submitted a simulator if you hadn't told the first, and would definitely have tried harder to simulator-proof my bot, so you did change my behaviour. Leaving the clique wouldn't have been worth it, though. Even knowing that you lied about the 2nd thing, I assign decent probability to someone crashing all the simulators outside the clique. (I think this is incorrect, though – if you can figure out that you're in a simulation, it's way better to claim that you'll be submitting 3 to scare the simulator into playing 2.)
comment by Zvi · 2020-10-19T22:55:26.802Z · LW(p) · GW(p)
So, in case anyone's wondering what I did...
I cared enough to think and enter, but not to program.
I designed a simulator, but was told it wouldn't be coded for me, so that was out.
So instead, I wrote this:
Until symmetry breaks, if the round # is 1, is 3 or is even, play sequence 23322232323322233 and then repeat 22222223 until symmetry breaks. If round # is odd and 5 or more, the reverse, except repeating 22222223 at the end.
Once it breaks, alternate 2 and 3 for 4 turns.
Then, if the last turn added to 5, keep doing that until they don't.
Once they don't...
If they've always played 0 or less, play 5.
If they've always played 1 or less, play 4.
If they've always played 2 or less, play 3.
Otherwise, depending on round number:
Rounds 1-50: Keep alternating until turn t+10. After that, if last turn added to 5, alternate 2 and 3. Otherwise, check their average score per round after symmetry. If it's 2.5 or lower, play 2, otherwise play 3.
Rounds 51-100: Same as above, except you also always play 3 if their score is 5 or more higher than yours.
Rounds 101+: Same as above, except you also always play 3 if their score is higher than yours.
(We could improve by adding more logic to properly exploit in strange situations, but we don't care enough so we won't.)
That's it. Keep it simple. Still call it BendBot I guess.
The intention here was pretty basic. Endgame behavior varies by round to get stingier if anyone tries something, to grab early pool share without being exploited later.
The big thing is that this bot is deterministic. I intentionally avoid calling the random function by choosing a semi-random set of 2s and 3s, on the theory that it's unlikely anyone else would choose an identical sequence, and if I meet myself I get the 2.5 anyway.
If they are not simulating or checking code, it won't matter.
If they are looking at all, then my not loading their code and not being random tells them the water's fine, take a look, see what's going on, and we can cooperate fully - you can see what I'm starting with, and we can get 2.5 each without incident. I'm sad that some people thought that more than two parenthesis was high risk to simulate/examine - I thought that the obvious thing to do was check to see if someone ever loads code or uses a random function, and if you don't do either, you should be safe.
So the thought was, many of the best bots would be simulator bots and I'd get full cooperation from them, whereas when they faced each other, they'd have to do some random matching to cooperate, so I'd have an edge there, and I'd do reasonably well against anything else that went late unless some alliance was afoot.
Turns out an alliance is afoot after all, but I certainly didn't care enough to worry about that. Let them come, and let the backstabbers profit, I say.
I was told that I had by far the most complicated non-coded entry even then, and that my endgame logic was being replaced with randomly 50% 2, 50% 3. I was asked, submit as-is, fix it, or withdraw?
That modification definitely didn't work, and the code that was written was not something I felt OK touching. So I explained why, and suggested it be replaced with this:
If (last round added to 5 or less) play whatever they played last.
Else If (their score > my score and round > 5) play 3.
Else Play 2.
I figured that was one extra line of code and should take like 2 minutes tops, and if that was 'too complex' then that was fine, I'd sit out.
So basically, let myself get exploited very early since there would likely be at least one all-3s in the mix but all such things would swiftly lose, then shift to hardcore mode a little faster to keep it simple.
I didn't get a reply to that, so I don't know if my entry is in or not. I hope it is, but either way, good luck everyone.
Replies from: lsusr, KaynanK↑ comment by lsusr · 2020-10-23T00:57:04.477Z · LW(p) · GW(p)
Your entry is in. I implemented the If (their score > my score and round > 5) play 3. Else Play 2.
algorithm. I hope I got the rest of it right.
(import random)
(setv +sequence-1+ [2 3 3 2 2 2 3 2 3 2 3 3 2 2 2 3 3])
(setv +sequence-2+ [2 2 2 2 2 2 2 3])
(defclass BendBot []
(defn --init-- [self &optional [round 0]]
(setv self.round round)
(setv self.opponent-record [])
(setv self.record [])
(setv self.turn -1)
(setv self.count-to-four 0)
(setv self.terminal-behavior False))
(defn move [self &optional [previous None]]
(+= self.turn 1)
(if previous
(.append self.opponent-record previous))
(setv output
(if (= self.opponent-record self.record)
(if (or (% self.round 2)
(< self.round 4))
(if (< self.turn (len +sequence-1+))
(. +sequence-1+ [self.turn])
(. +sequence-2+ [(% (- (len +sequence-1+)
self.turn)
(len +sequence-2+))]))
(if (< self.turn (len +sequence-1+))
(. (list (reversed +sequence-1+)) [self.turn])
(. +sequence-2+ [(% (- (len +sequence-1+)
self.turn)
(len +sequence-2+))])))
(if (< count-to-four 4)
(do
(+= count-to-four 1)
(if (= (last self.record) 3)
2
3))
(if (or self.terminal-behavior
(= 5 (+ (last self.record) previous)))
(if (= (last self.record) 3)
2
3)
(do
(setv self.terminal-behavior True)
(if (all (map (fn [x] (<= x 0)) self.opponent-record))
5
(all (map (fn [x] (<= x 1)) self.opponent-record))
4
(all (map (fn [x] (<= x 2)) self.opponent-record))
3
(if (> (+ (last self.record)
previous)
5)
3
2)
))))))
(.append self.record output)
output))
Replies from: Zack_M_Davis↑ comment by Zack_M_Davis · 2020-10-23T01:53:00.519Z · LW(p) · GW(p)
Seeing people actually use Hy is making me nostalgic!
Replies from: lsusr↑ comment by lsusr · 2020-10-23T02:00:47.409Z · LW(p) · GW(p)
I love Hy and use it all the time for data science and other applications. Thank you for all your work on the project!
Replies from: gilch↑ comment by gilch · 2021-04-15T06:01:24.011Z · LW(p) · GW(p)
Zack_M_Davis isn't the only one.
I've also written Hissp now. I'm curious how they compare for data science work (and other applications).
I've also seen evhub, the author of Coconut, here on LessWrong.
↑ comment by Multicore (KaynanK) · 2020-10-19T23:42:53.025Z · LW(p) · GW(p)
I'm sad that some people thought that more than two parenthesis was high risk to simulate/examine - I thought that the obvious thing to do was check to see if someone ever loads code or uses a random function, and if you don't do either, you should be safe.
We'll see how many people submitted simulators braver than mine, but simulators being timid seems like a natural consequence of the rules allowing you to nuke your opponent if you find out that you're in a simulation, and a common enough perception that simulators might have enough of an advantage that they should be eliminated.
Static analysis is not very useful if the opponent's code is at all obfuscated, which is likely is if your opponent is looking to nuke simulators. Does your static analysis catch the code getattr(__builtins__, ‘e’ + ‘x’ + ‘e’ + ‘c’)(base64.decode(God knows what)) ? Or however many dozens of other ways there are to do something like that?
The tournament might look significantly different if the rules were slanted in the simulator's favor, maybe if you just had to avoid infinite simulation loops and keep runtime reasonable, and the worst the opponent was allowed to do if they found they were in a simulation was to play BullyBot in order to extort you or play randomly to make the simulation useless. The iterated prisoner's dilemma with shared source code tournament a few years ago had a lot of simulators, so I assume their rules were more friendly to simulators.
Replies from: philh↑ comment by philh · 2020-10-20T08:44:35.293Z · LW(p) · GW(p)
I do think it would be hard to obfuscate in a way that wasn't fairly easy to detect as obfuscation. Throw out anything that uses import
, any variables with __
or a handful of builtin functions and you should be good. (There's only a smallish list of builtins, I couldn't confidently say which ones to blacklist right now but I do think someone could figure out a safe list without too much trouble.) In fact, I can't offhand think of any reason a simple bot would use strings except docstrings, maybe throw out anything with those, too.
(Of course my "5% a CloneBot manages to act out" was wrong, so take that for what it's worth.)
The iterated prisoner’s dilemma with shared source code tournament a few years ago had a lot of simulators, so I assume their rules were more friendly to simulators.
I know of two such - one [LW · GW] (results [LW · GW] - DMRB was mine) in Haskell where you could simulate but not see source, and an earlier one [LW · GW] (results [LW · GW]) in Scheme where you could see source.
I think in the Haskell one it would have been hard to figure out you were being simulated. I'm not sure about the scheme one.
comment by gjm · 2020-10-11T00:13:56.197Z · LW(p) · GW(p)
The first line says "Entries must be submitted on October 18, 2020, or earlier."
Then a bit later you say "I will run my own version of the game on October 16, 2020."
Will you be making your time-travel technology available to contestants' bots, and if so what is the API they should use?
Replies from: lsusr↑ comment by lsusr · 2020-10-11T00:37:45.061Z · LW(p) · GW(p)
Good question! My time travel technology will not be available to contestants' bots. This is in accordance with Less Wrong's established norm of containing dangerous technology.
I have edited the original post in a futile attempt to cover-up this leak and have removed the relevant API call from the extra
module.
↑ comment by Ben Pace (Benito) · 2020-10-18T02:15:29.732Z · LW(p) · GW(p)
Thanks for obeying the norms. That we definitely have. Around time travel technology.
comment by jan betley (jan-betley) · 2020-10-09T11:43:21.935Z · LW(p) · GW(p)
Your TitForTatBot
* never sets self.previous
* even if it was set, it would stop cooperating when opponent played 0
Also I agree with Zvi's comment, why 2.5 for free? This way one should really concentrate on maxing out in the early stage, is it intended?
comment by philh · 2020-10-19T08:21:54.317Z · LW(p) · GW(p)
Will post a link to a github repo with my code later today (when I'm not meant to be working), but for now, here's my thought processes.
(Edit: my code is here. My entry is in incomprehensibot.)
General strategy:
I was undecided on joining the clique, but curious. I didn't want to be a sucker if someone (possibly the clique organizer) found a way to betray it. I sent out feelers, and Vanilla_Cabs shared the clique bot code with me.
I saw a way to defect against the clique. I think that's when I decided to do so, though I may have had the idea in my head beforehand. I would call my entry "bastardBot" and it would be glorious. I told Vanilla_Cabs I was in. They asked if the code was airtight. "I don't see anything I want to flag."
Someone else found that same bug, and was more honest than I. I spent some time trying to work around the fix, but couldn't see anything. I tried to get Vanilla_cabs to put a new hole in, under the pretext that I wanted some code at the top level - this was true, but it was only marginally useful. I couldn't think of any new holes that wouldn't be really freaking obvious, so instead I tried being vague about my requirements to see if they'd suggest something I could use, but they didn't. Eventually we just settled on "the exact line foo = 'bar'
is permitted as an exception", and I didn't see what I could do with that.
Later, lsusr told me that that line would get me disqualified. I didn't say anything, in the hopes some clique member would wonder what it was for, include it in their bot just in case, and get disqualified.
I feel a little bad about all this, and hope Vanilla_cabs has no hard feelings.
My backup plan was: don't let anyone simulate me, and get them disqualified if they try. (New name: "incomprehensibot".) "jailbreaker.py" shows my tricks here. Defense seemed more effective than offense, and I didn't think I could safely simulate my opponent, and especially not do so safely and usefully within the time limits, so I gave up on the idea. As for my actual moves, I didn't have much in mind. After rereading (or at least reskimming) "the Darwin pregame" I settled on this:
After the showdown round, start off with something like Zvi's "I'll let you do better than me, but you could do even better by cooperating with me". Gradually move towards "I won't let you do better than me" as time progresses; if my opponent had more than a certain number of points than me, I'd refuse to play less than 3. (I chose the number of points based on expecting 550 turns on average, and gave it a minimum of 5 to allow some coordination dance early on.) Early on, skew towards playing 2 initially; if opponents randomize between 2 and 3, and pick 3 with probability >= 1/3, then 2 maximizes my score. Later, skew towards playing 3 initially, which increases my probability of beating my opponent.
"payload.py" shows my approach here. I modelled it off the early-round CloneBot moves against non-clones. If last round had been a total of 5, I'd play their last move. If it had been 4, I'd do the same, but maybe add a little. If it had been more, I'd repeat my own move, but maybe subtract one. In the 5 and >5 cases, I had some "pushing" behaviour to see if I could exploit them: if I haven't had a chance to push yet, or if pushing had worked well (or seemed like it would have worked well) in the past, or if I just hadn't tried it recently, I'd try to take one more point than I really had any right to. I didn't do that in the <5 case because that situation was my only source of randomness, which seemed important somehow.
(I'm a bit confused about, if the last-round total isn't five, should I base off my own previous move or theirs? My decisions here weren't principled.)
If this made me play 0 (I dunno if it ever would), I'd make it a 1. If it made me play less than 3, and I was too far behind (depending on round), I'd make it a 3.
Just before I went to bed Saturday night, someone sent a message to the clique group saying not to try to break simulators. Because if a clique member simulates us and gets disqualified, the tournament is restarted and the clique is smaller. That was completely right, and I felt stupid for not thinking of it sooner.
I still decided to ignore it, because I thought the game would be small enough that "fewer opponents" was better than "bigger clique". Overnight someone else said they'd already submitted a simulation-breaker, so I dunno if anyone ended up playing a simulator.
Right towards the end I started doing numerical analysis, because early on I was too enamoured with my own cleverness to notice what a good idea it was. I didn't have time to do anything thoroughly, but based on running my paload against ThreeBot (which gets 148-222 early, 10-15 late) I reduced my exploitability ramp-down from 100 rounds (chosen fairly arbitrarily) to 20 (still fairly arbitrary). Come to think of it, I don't think I compared "what proportion of the pool do I need to eventually win" between my early and late game behaviors.
It would have been interesting to have some kind of logging such that my bot could report "I think I'm being simulated right now, and this is how I know" and afterwards lsusr could me how often that happened. I assume that would be significant work for lsusr to set up though, and it adds attack surface.
Predictions:
I win: 20%.- A CloneBot wins: 75%.
At least one clique member submits a non-CloneBot (by accident or design): 60%.- At least one clique member fails to submit anything: 60%.
At least one bot tries to simulate me after the showdown and doesn't get disqualified: 10%.At least one bot tries to simulate me after the showdown and succeeds: 5%.- At least one CloneBot manages to act out: 5%.
I get disqualified: 5%.
↑ comment by philh · 2020-10-26T15:06:41.239Z · LW(p) · GW(p)
*At least one bot tries to simulate me after the showdown and doesn’t get disqualified: 10%.
- At least one bot tries to simulate me after the showdown and succeeds: 5%.
I now think these were overconfident. I think it would be fairly easy to simulate incomprehensibot safely; but hard to simulate incomprehensibot in a way that would be both safe and generically useful.
The difficulty with simulating is that you need to track your opponent's internal state. If you just call MyOpponentBot(round).move(myLastMove)
you'll be simulating "what does my opponent do on the first turn of the game, if it gets told that my last move was...". If you do this against incomprehensibot, and myLastMove
is not None
, incomprehensibot will figure out what's up and try to crash you.
So at the beginning, you initialize self.opponent = MyOpponentBot(round)
. And then every turn, you call self.opponent.move(myLastMove)
to see what it's going to do next. I don't expect incomprehensibot to figure this out.
But if your opponent has any random component to it, you want to do that a couple of times at least to see what's going to happen. But if you call that function multiple times, you're instead simulating "what does my opponent do if it sees me play myLastMove several times in a row". And so you need to reset state using deepcopy
or multiprocessing or something, and incomprehensibot has ways to figure out if you've done that. (Or I suppose you can just initialize a fresh copy and loop over the game history running .move()
, which would be safe.)
But actually, the simple "initialize once and call .move()
once per turn" isn't obviously terrible? Especially if you keep track of your predictions and stop paying attention to them once they deviate more than a certain amount (possibly zero) from reality. And Taleuntum's bot might catch that, I'm not sure, but I think Incomprehensibot wouldn't.
I think at some point I decided basically no one would do that, and then at some other point I forgot that it was even a possibility? But I now think that was silly of me, and someone might well do that and last until the showdown round. Trying to put a number on that would involve thinking in more depth than I did for any of my other predictions, so I'm not going to try, just leave this note.
↑ comment by Taleuntum · 2020-10-19T15:21:19.418Z · LW(p) · GW(p)
The links you posted do not work for me. (Nevermind)
Wow, you are really confident in you winning. There are 10 players in the clique, so even if there are no players outside the clique (a dubious assumption) a priori there is 10% chance. If I had money I would bet with you.
I also think there is a good chance that a CloneBot wins. 10 possible member is a good number imo. i would say 80%.
I would say 70% for the (possibly accidental) betrayal.
Without seeing your jailbreak.py I can't say how likely that others are able to simulate you.
What does "act out" mean in this context?
Replies from: philh, philh↑ comment by philh · 2020-10-19T15:58:01.639Z · LW(p) · GW(p)
Conditioned on "any CloneBot wins" I've given myself about 25%.
10% in that conditional would definitely be too low - I think I have above-baseline chances on all of "successfully submit a bot", "bot is a CloneBot" and "don't get disqualified". I think I expect at least three to fall to those hurdles, and five wouldn't surprise me. And of the rest, I still don't necessarily expect most of them to be very serious attempts.
By "act out" I mean it's a bot that's recognized as a CloneBot by the others but doesn't act like one - most likely cooperating with non-clones, but not-cooperating with clones would also count, it would just be silly as far as I can tell. I also include such a bot as a CloneBot for the 75%.
↑ comment by philh · 2020-10-19T17:59:14.871Z · LW(p) · GW(p)
Updated with a link to my code. I also put yours in to see how we'd fare against each other one-on-one - from quick experimentation, looks like we both get close to 2.5 points/turn, but I exploit you for approximately one point every few hundred turns, leaving me the eventual victor. :D I haven't looked closely to see where that comes from.
Of course too much depends on what other bots are around.
↑ comment by Vanilla_cabs · 2020-10-19T14:05:55.379Z · LW(p) · GW(p)
They asked if the code was airtight. "I don't see anything I want to flag."
And I saw right through that my friend :D
As I said in my reply to Taleuntum, I left the weakness as a test to find someone I could trust to find sneakier weaknesses. Of you three who saw the code, only Lanrian reported. "I don't see anything I want to flag." That's cute. To be more accurate, I wasn't sure you were hiding a security flaw, but I didn't have to be sure since either way meant I couldn't entrust you with the task. And the wording left me thinking you were hiding a security flaw with 80% credence. I thought about asking "Did you see anything worth flagging?", but decided against it.
Later, lsusr told me that that line would get me disqualified. I didn't say anything, in the hopes some clique member would wonder what it was for, include it in their bot just in case, and get disqualified.
I feel a little bad about all this, and hope Vanilla_cabs has no hard feelings.
Not at all, I just feel like I've dodged a big bullet. How come that line would get someone disqualified? Has lsusr been more specific?
↑ comment by simon · 2020-10-19T15:15:24.088Z · LW(p) · GW(p)
I fell for
Eventually we just settled on "the exact line foo = 'bar' is permitted as an exception", and I didn't see what I could do with that.
Later, lsusr told me that that line would get me disqualified. I didn't say anything, in the hopes some clique member would wonder what it was for, include it in their bot just in case, and get disqualified.
I thought it would be harmless and that there was some chance (even though it would make more sense to read some codeword directly from the payload) that someone would be using it as a recognition signal to boost a chosen clique member's bot using a non clique member's bot.
Is it really disqualifiable? I am not using foo for anything else other than setting it.
Replies from: philh, lsusr↑ comment by philh · 2020-10-19T15:30:57.864Z · LW(p) · GW(p)
Putting data in global state is forbidden, yeah, even if you don't do anything with it. I was a bit surprised.
Just to be clear, this would only be forbidden if you put it at the top level. If you put it in your class it would be fine. So
class CloneBot():
...
def payload(self) :
...
foo = 'bar' # allowed
foo = 'bar' # forbidden
Replies from: lsusr, simon↑ comment by lsusr · 2020-10-19T22:05:38.626Z · LW(p) · GW(p)
The parent comment comes from a private conversation between me and philh in which I conveyed erroneous information. The mistake is my fault—not philh's. A line such as foo = 'bar'
is legal in some contexts. A line which uses the global namespace to preserve information between class instances is illegal.
↑ comment by simon · 2020-10-19T15:33:18.286Z · LW(p) · GW(p)
Yeah, I put it in at top level.
Replies from: Taleuntum↑ comment by lsusr · 2020-10-19T20:57:13.680Z · LW(p) · GW(p)
Setting constants into the global state of your own file is fine. What I care about is whether you're setting variables into the global state for the purpose of preserving information between bot instantiations or otherwise messing with the game engine. simon's bot clearly does neither.
Replies from: philh↑ comment by philh · 2020-10-19T21:37:02.700Z · LW(p) · GW(p)
I confess I'm a bit confused, I thought in our PM conversation I was fairly explicit that that's what I was asking about, and you were fairly explicit that it was forbidden?
It's not a big deal - even if this was forbidden I'd think it would be totally fine not to disqualify simon, and I still don't actually expect it to have been useful for me.
Replies from: lsusr↑ comment by lsusr · 2020-10-19T21:59:18.426Z · LW(p) · GW(p)
In my private conversation with you (philh) I stated "Writing variables to the global namespace is forbidden." I then doubled down on the error by stating "Yes. All data must be saved in a class instance."
I apologize for the error. What I should have written was that using the global namespace to get around information restrictions is illegal but that writing constants to your module's global namespace in a way that does not evade information restrictions is fine.
comment by jacobjacob · 2020-10-09T18:06:50.392Z · LW(p) · GW(p)
Great initiative, thanks for organizing!
comment by Dagon · 2020-10-09T16:47:04.938Z · LW(p) · GW(p)
Fun! Note that the "automatic recognition of self" and "group, not individual fitness" rules are different from many such contests. I think it's likely that this allows super-simple bots to win, if the initial mix allows them to get past the first few iterations.
I'd predict three-bot is the most likely simple winner - it gets full value against itself (as does everyone), and gets some points for all opponents' probes and complex signaling (that includes bidding 0, 1, or 2). It NEVER gives an opponent more points than it gets.
When you add in truly sophisticated bots, who exploit each other based on knowledge of the mix of opponents, they can do enough better in early rounds for three-bot to never get critical mass. Likewise for coordinated bots who can accurately detect each other, getting full value for more cases than non-coordinated ones.
edit: after reading Zvi's old posts, I realize two things:
- auto-self-recognition (but not auto-partner recognition) is weird, and probably changes the correct explore-exploit strategy significantly. Perhaps not as far toward the simple as I propose, above - partner recognition is going to be important.
- starting mix matters a whole lot. it'll be a very different game with 100,000 initial programs than with 50. And there will be a groupthink component - there's no way to avoid the meta-game of un-enforced partnerships.
↑ comment by Rochambeau · 2020-10-16T00:01:24.363Z · LW(p) · GW(p)
Wouldn't three-bot only get a few points against tit-for-tat bots? I agree it can't do worse than its opponent (meaning it would at worst tie if it is one of the last two bots), but bots that can cooperate or even two-bot could be raking in a lot more points as long as the population of bots isn't dominated by three-bot.
For example, in a given round if three-bot is versing tit for tat, it might get a few points right away and then no more, but in that same round, two tit for tat bots or a tit for tat and a two-bot are getting a comparatively large amount of points meaning they will dominate later rounds so I think two-bot is a better simple strategy than three-bot (granted it will not win).
↑ comment by frontier64 · 2020-10-14T23:38:53.025Z · LW(p) · GW(p)
Auto-self-recognition is trivial when you have the ability to read opponent's source code [LW(p) · GW(p)]. You can detect you're playing against a copy of yourself and your 'opponent' will do the same. Although in this case you will lose a small amount of points randomly deciding which bot will choose 3 and the other 2 assuming there is no deterministic way to differentiate the clones.
ETA: I see that you posted your comment before Isusr said you could view opponent's source.
comment by Zvi · 2020-10-21T11:09:13.009Z · LW(p) · GW(p)
A thought for next time, if you want to build a simulator, it could make sense to wait one round before simulating anything even slightly unsafe, and do something simpler in round 1. That way, if something weird might potentially disqualify you but will itself get disqualified, it dies before it can remove you from the pool.
comment by Zack_M_Davis · 2020-10-10T03:26:32.720Z · LW(p) · GW(p)
I don't know whether it's a competitive entry (it would be a FoldBot in Zvi's taxonomy [LW · GW]), but as a quick piece of software, I'm pretty proud of AbstractSpyTreeBot
!
↑ comment by Taleuntum · 2020-10-12T14:05:06.111Z · LW(p) · GW(p)
You should also check whether 'exec' is in the program code string, because someone could call getopponentsource with exec and caesar-encryption, otherwise you will be DQ'd if someone submits a program like that. (However, rebinding getopponentsource is probably more elegant than this type of static analysis.)
↑ comment by Multicore (KaynanK) · 2020-10-12T20:24:44.608Z · LW(p) · GW(p)
This seems to only simulate the opponent's first move, and then act as if the opponent will make that move every round.
For the most part this is a good reference implementation though, and I expect I'll be borrowing from it in my submission.
Replies from: KaynanK↑ comment by Multicore (KaynanK) · 2020-10-14T00:59:45.164Z · LW(p) · GW(p)
An interesting bot for simulators to grapple with is this:
class BullyBot:
def __init__(self, round=0):
self.opponentThreesPlayed = 0
def move(self, previous=None):
if self.opponentThreesPlayed > 3:
return previous
elif previous == 3:
self.opponentThreesPlayed+=1
if self.opponentThreesPlayed > 3:
return 2
return 3
Something that holds out just long enough to get a naive simulator to fold to it forever, but will probably end up cooperating with most others.
comment by Zack_M_Davis · 2020-10-10T00:54:37.935Z · LW(p) · GW(p)
This is the Nash bargaining game. Is there some specific reason to call it a "Prisoner's Dilemma variation", or is it just "everyone has heard of the Prisoner's Dilemma, and don't have a generic term for this kind of game-theory tournament"?
Replies from: lsusrcomment by Emiya (andrea-mulazzani) · 2020-10-10T19:11:16.327Z · LW(p) · GW(p)
I'm a non programmer, if I submitted a bot that's too complicated or that is supposed to do something that isn't a possible move, would I be contacted and get a chance to change it, provided I submit it early enough?
Could you provide one or more examples of too complicated descriptions, just so I know for which level of complexity I should aim? I'm not clear on how you would consider a bot that has a decision three with different options and processes but no part that's hard to program, for example. (To avoid giving hints or good ideas, they can be filled with completely stupid instructions or be for bots that try to do some other stuff)
Replies from: lsusr↑ comment by lsusr · 2020-10-10T21:31:37.260Z · LW(p) · GW(p)
If you're worried whether something is too complicated then the best solution is to PM me with your most complicated dream bot and I'll tell you how much of it I'm willing to code.
An example of something that is too complicated is Zach_M_Davis's AbstractSpyTreeBot which runs a simulation of its opponent. Another example of something too complicated is is a neural network trained on your opponent's past history. Such bots are perfectly legal if you code them yourself to reasonable speed constraints but I will not write one for you.
A hard-coded decision tree with several unambiguously-specified simple options is well within reason if that is all there is to it and you write a good specification. For example, using regular expressions to analyze your opponent's source code if fine.
Replies from: andrea-mulazzani↑ comment by Emiya (andrea-mulazzani) · 2020-10-10T22:26:52.218Z · LW(p) · GW(p)
That sounds perfect, thank you.
I'm sorry to ask for another detail, but I'm not 100% sure if simulator-bots such as AbstractSpyTreeBot are legal if you can program them yourself. Your reply seems to state so, but I didn't wanted to assume.
Replies from: lsusr↑ comment by lsusr · 2020-10-10T22:58:36.021Z · LW(p) · GW(p)
Simulator bots are legal for programmers to write provided they never crash, slow the game to a crawl, etc. Any bot (even a non-simulator) that does will be disqualified.
I have not yet read the AbstractSpyTreeBot
source code in detail nor run it myself. A cursory glance suggests it's within the rules.
comment by Andrew Jacob Sauer (andrew-jacob-sauer) · 2020-10-09T18:22:10.151Z · LW(p) · GW(p)
What are the rules about program runtime?
Replies from: lsusr↑ comment by lsusr · 2020-10-09T20:59:39.119Z · LW(p) · GW(p)
Vague. The only limitations are obvious stuff like hacking the game client to get around information restrictions, using so much compute you slow the game itself to a crawl, installing anything that takes me significant effort to set up and anything I consider a security risk. You may PM me for clarification on your specific situation. I can easily perform speed tests as I have already written the game client.
If you are a programmer then you may request reasonable features such as reading your opponent's source code [LW(p) · GW(p)].
Edit #1: In order to maximize resource use, you may provide an adjustable constant such as TREE_DEPTH
.
Edit #2: Any code that can always complete 10,000 [LW(p) · GW(p)] move
calls within 5 seconds is guaranteed to be "fast enough".
↑ comment by Drako365 · 2020-10-11T21:12:46.388Z · LW(p) · GW(p)
What do you mean by information restrictions? What exactly is restricted? If I were to use some side-channel to determine that I was being simulated by an opponent and forkbomb if so, would I be eliminated for hacking? Would they be eliminated for overusing compute? (after all, it wasn't my code per say that forkbombed, but their code simulating my code) Would I be eliminated for being a dick? (Which would be self-evident, at any rate)
You say in another comment, "Simulator bots are legal for programmers to write provided they never crash, slow the game to a crawl, etc. Any bot (even a non-simulator) that does will be disqualified." I'm just concerned about the edge case where someone writes the troll code described above, and suddenly every person who submitted a simulator bot (i.e. Zak M. Davis) gets DQ'd. The rules as written suggest that such a bot (presuming it does not preserve information across rounds) would be perfectly legal and it's Zak's own fault for executing untrusted code, but that seems to effectively prohibit simulators as there's an infinite plurality of side-channels through which to do this.
A few possible resolutions would be:
- Whoever wrote the code that contains the instructions to forkbomb is at fault. Essentially, don't be a dick.
- Any information not explicitly given is forbidden. Essentially, you don't know if you're being simulated, and you're not allowed to find out.
- Untrusted code is untrusted code; simulate at your own risk. Essentially, if Zak executes my forkbomb, that's his problem. (Well, your problem as well to fix it, but his fault.)
Resolution 2 seems the most reasonable to me, but if you choose it I humbly request a way to locally overwrite get_opponent_source so you simulate your opponent's response without risking infinite recursion. (Such an overwrite is probably standard python, but being ignorant of your implementation I'd be concerned about screwing something up and not being able to detect the mistake in testing)
EDIT: Upon further consideration it seems inevitable that Zak will get DQ'd for infinite recursion. So I must be missing something.
Replies from: lsusr↑ comment by lsusr · 2020-10-12T03:12:16.249Z · LW(p) · GW(p)
Hacking the game engine is against the rules. If your opponent decides to simulate your code then hacking them from inside your opponents' simulation is allowed. Your code is allowed to peek at the game engine for the purposes of figuring out if it is being simulated by someone else's code. Peeking at the game engine for other reasons, like figuring out how many of each opponents remain or attempting to modify the game engine itself or attempting to modify your opponents' code from anywhere outside their own simulation of you is against the rules.
Anything that could damage the game's execution environment is a security risk and grounds for disqualification. Forkbombing your opponent is really pushing things…but technically legal. I would prefer something more mundane like calling exit()
or an infinite loop. Anything more damaging than a forkbomb constitutes a security risk and is not allowed under any circumstances. If you are going to write any malware-like then code (including forkbombs) then please draw my attention to it in some way what the intended behavior is so I can mitigate any unintended consequences to my own systems. Even better would be to PM me for explicit permission.
Any code which gets disabled by malware will get disqualified from the tournament. I will then restart the tournament with the broken competitor(s) removed.
The resolution is #3 "Untrusted code is untrusted code; simulate at your own risk."
The function get_opponent_source
returns source code as a string. You can trivially overwrite it in your local environment with the command get_opponent_source = my_custom_get_opponent_source
. (This is standard Python.) If you execute your opponent's code in a simulation where get_opponent_source
has been rebound[1] to a new function (and have not otherwise left alternate attack vectors open) I see no reason why this would would trigger infinite recursion.
If you rebind the get_opponent_source
function in your local environment in a reasonable way that is itself obviously not an attempt to breach anything then I will consider that a bug in the game engine and attempt to engineer a fix.
I have not yet read over Zak's code in detail nor have I attempted to execute it. If it infinitely recurses then it will be disqualified.
Edit: Do note that an unobfuscated simulator bot is likely to have a statement something along the lines of
from extra import get_opponent_source
. If you overwrite theget_opponent_source
function and then execute your opponent's code, it is possible your opponent'sfrom extra import get_opponent_source
may un-overwrite your rebinding. ↩︎
↑ comment by Taleuntum · 2020-10-12T14:15:54.695Z · LW(p) · GW(p)
In what order do programs get disqualified? For example, if I submit a program with an infinite loop, every other program using simulation will also go into infinite loop when meeting with my program as detecting infinite loops generally isn't theoretically feasible. Is my program disqualified before the others? What is the general principle?
EDIT: An unrelated question: Do round numbers start from 0 or 1? In the post you write "Unlike Zvi's original game, you do get to know what round it is. Rounds are indexed starting at 0.", but also: "Your class must have an __init__(self, round=1) [..]". Why not have the default initializer also use 0 if the round numbers start from zero?
Replies from: lsusr↑ comment by lsusr · 2020-10-12T20:33:54.467Z · LW(p) · GW(p)
First a run your program against one or more simple programs without any simulations. If your program hits an infinite loop there then you will be disqualified before you get to infect any other programs. In this way you can be disqualified before the others.
If your program passes the simple tests then it will join the pool and against all other programs which have passed the initial test. All programs which fail to terminate at this stage will be removed simultaneously.
Thank you for the bug report. I have corrected __init__(self, round=1) [..]
to __init__(self, round=0) [..]
↑ comment by philh · 2020-10-14T08:57:44.302Z · LW(p) · GW(p)
Hm. Can we get a "you can use at least this amount of time per move and not be disqualified"? Without wanting to say too much, I have a strategy in mind that would rely on knowing a certain runtime is safe. (Allowing for some amount of jankiness, so that if the limit was 5s I'd consider 4s safe.)
Replies from: lsusr↑ comment by lsusr · 2020-10-14T19:38:46.301Z · LW(p) · GW(p)
I will not disqualify anyone who can complete 100 move
calls in 5 seconds (0.05 seconds per call on average) on my T450s.
I will not disqualify anyone who can complete 100 move
calls in 0.05 seconds on my T450s.
Edit: Corrected time limit downward by 100×.
Replies from: philh↑ comment by philh · 2020-10-14T22:03:48.987Z · LW(p) · GW(p)
Thanks. I confess I'd been hoping for more like 100x that, but not really expecting it :p
Replies from: lsusr↑ comment by lsusr · 2020-10-14T22:38:45.069Z · LW(p) · GW(p)
My previous answer was a mistake. It is actually 100 move
calls in 0.05 seconds. Sorry.
The numbers are brutal.
If a game goes for 50 rounds and 10 different bots each use 5[1] seconds per move and there are 550 moves per bot per pairing then it would take 4.36 years to run this tournament.
However, do note that the guarantee is "0.05 seconds per 100 moves" as opposed to "0.0005 seconds per move". If you only have to run expensive computations once then, depending on your algorithm, the right caching system might get you close to 100× performance.
I can add caching functions to the extra
package if that would help. The computer in question has 4 processors and runs Linux so multiprocessing can get you up to a 4× speedup.
500 seconds per move is 100× my original limit which equals 10,000× the corrected limit. ↩︎
↑ comment by philh · 2020-10-15T09:36:34.644Z · LW(p) · GW(p)
Oh, geez. I figured it would be too long, but I didn't think about just how much too long. Yeah, with these constraints, even 5s per hundred moves I agree is unreasonable.
Caching seems easy enough to implement independently, I think. No need for you to add it.
↑ comment by Andrew Jacob Sauer (andrew-jacob-sauer) · 2020-10-10T17:07:26.859Z · LW(p) · GW(p)
Is everybody's code going to be in Python?
Replies from: lsusr↑ comment by lsusr · 2020-10-10T21:29:45.210Z · LW(p) · GW(p)
The rules are that, from the game engine's perspective, everyone's code is going to be written in Python 3 or Hy. It is theoretically possible that someone's code might, say, include some code in a different language that is then executed from the Python runtime.
Replies from: philhcomment by SarahSrinivasan (GuySrinivasan) · 2020-10-09T16:31:32.336Z · LW(p) · GW(p)
If you face a copy of yourself you are automatically awarded the maximum 5 points per round
What's your rationale behind this? Isn't part of the point that you need to be able to survive even in an ecosystem consisting mainly of you?
Replies from: lsusrcomment by Harmless · 2020-10-09T22:27:30.437Z · LW(p) · GW(p)
Is the number of rounds per matchup going to be in the tens, or the thousands?
Edit: I just realised you specified in the post
Replies from: lsusr↑ comment by lsusr · 2020-10-09T22:30:41.717Z · LW(p) · GW(p)
The number of turns per round is at least 100 and no more than 1000. I have not yet determined the number of rounds. If practical, I will iterate until there is a stable equilibrium.
Edit: I have edited the original post to reflect the latter statement about rounds.