AI Challenge: Ants - Post Mortem
post by D_Alex · 2012-01-12T07:23:08.593Z · LW · GW · Legacy · 21 comments
Late last year a LessWrong team was being mooted for the Google AI challenge (http://aichallenge.org/; http://lesswrong.com/r/discussion/lw/8ay/ai_challenge_ants/). Sadly, after a brief burst of activity, no "official" LessWrong entry appeared (AFAICT, and please let me know if I am mistaken). The best individual effort from this site's regulars (AFAICT) came from lavalamp, who finished around #300.
This is a pity: it was an opportunity to achieve, or at least have a go at, a bunch of worthwhile things, including developing methods of cooperation between site members, gathering positive publicity, and yes, even advancing the understanding of AI-related issues.
So - how can things be improved for the next AI challenge (which I think is about 6 months away)?
Comments sorted by top scores.
comment by jimrandomh · 2012-01-14T22:16:01.169Z · LW(p) · GW(p)
The problem with programming challenges is that if you win a programming challenge, you probably could've spent the same time doing a very similar thing but produced something of much greater value (money, software that people will actually use, research that will actually be built on). The unfortunate consequence of this is that you aren't likely to see many rationalists entering these competitions without strong motivation, and a little prestige and publicity isn't enough.
Also, building a functioning team of programmers out of loosely-committed geographically-dispersed acquaintances is ridiculously hard, and if you could do it, you wouldn't waste that power on a game.
↑ comment by D_Alex · 2012-01-16T07:53:16.109Z · LW(p) · GW(p)
you probably could've spent the same time doing a very similar thing but produced something of much greater value (money, software that people will actually use)
What, specifically, would you suggest?
research that will actually be built on
This is one desired outcome of these AI challenges.
building a functioning team of programmers out of loosely-committed geographically-dispersed acquaintances is ridiculously hard, and if you could do it, you wouldn't waste that power on a game.
No, this is completely arse-about. The "game" should be used as a medium to develop cooperation skills.
comment by JonathanLivengood · 2012-01-12T08:31:00.828Z · LW(p) · GW(p)
Just spitballing here:
- Promote the AI challenge as a rationalist meetup topic, with the goal of having several working groups
- Instead of trying to get one big group with a leader right from the start, appoint (or whatever) several leaders; assign to each leader a small collection of interested people
- Be clear about what you want the leaders to do: what are the short- and medium-range goals?
- Put up an early post asking people to express interest and (maybe) skill-sets, so that teams can be assembled with some balance / hope of accomplishing something
- Keep in contact with the various leaders and see where people are getting stuck (I'm assuming that you are ultimately the person in charge of this project); periodically, have the leaders talk to each other -- but not extremely often; post regular discussion threads focusing on solving specific "We're stuck on this" problems
- Try to reframe the problem or parts of the problem in a way that connects to generic rationality, so that non-programmers can contribute something -- looking over the old thread, it seems that a lot of people were intimidated by the threat of having to code stuff, but the programmers might nonetheless get a good idea or two from what non-programmers have to say about generic rationalist-type problems
- Make some direct suggestions about the "worthwhile things" you mention. For example, apart from the AI project itself, what methods would you suggest site members use to cooperate, and why? (Okay, maybe there isn't much more to be said directly about positive publicity and advancing AI ... but then, maybe there is ...)
- Set benchmarks for when things should be done, even if those benchmarks have to be re-set several times along the way
↑ comment by printing-spoon · 2012-01-13T04:25:57.686Z · LW(p) · GW(p)
Try to reframe the problem or parts of the problem in a way that connects to generic rationality, so that non-programmers can contribute something
This is harder than it sounds.
↑ comment by JonathanLivengood · 2012-01-13T06:32:04.946Z · LW(p) · GW(p)
I didn't think it sounded all that easy ... :)
↑ comment by printing-spoon · 2012-01-13T21:36:23.406Z · LW(p) · GW(p)
Can you give an example for Ants?
↑ comment by JonathanLivengood · 2012-01-14T12:16:13.019Z · LW(p) · GW(p)
I can try. Or, at least give a sketch. (Hand-waving ahead ...)
The Ants problem -- if I'm understanding it correctly -- is a problem of coordinated action. We have a community of ants, and the community has some goals: collecting food, taking over opposing hills, defending friendly hills. Imagine you are an ant in the community. What does rational behavior look like for you?
I think that is already enough to launch us on lots of hard problems:
- What does winning look like for a single ant in the Ants game? Does winning for a single ant even make sense, or is winning completely parasitic on the community or colony in this case? Does that tell us anything about humans?
- If all of the ants in my community share the same decision theory and preferences, will the colony succeed or fail? Why?
- If the ants have different decision theories and/or different preferences, how can they work together? (In this case, working together isn't very hard to describe ... it's not like the ants fight among themselves, but we might ask what kinds of communities work well -- i.e. is there an optimal assortment of decision theories and/or preferences for individuals?)
- If the ants have different preferences, how might we apply results like Arrow's Theorem, or how might we work around it?
...
So, there's a hand-wavy sketch of what I had in mind. But I don't know, is it still too vague to be useful?
EDIT: I should say that I realize the game works with a bot controlling the whole colony, but I don't think that changes the problems in principle, anyway. But maybe I'm missing something there.
↑ comment by gwern · 2012-01-14T19:29:42.991Z · LW(p) · GW(p)
The Ants problem -- if I'm understanding it correctly -- is a problem of coordinated action.
One of the interesting aspects of the winning entry's post-mortem is its description of how dumb and how local the winner's basic strategy was:
There’s been a lot of talking about overall strategies. Unfortunately, i don’t really have one. I do not make decisions based on the number of ants i have or the size of my territory, my bot does not play different when it’s losing or winning, it does not even know that. I also never look which turn it is, in the first turn everything is done exactly the same as in the 999th turn. I treat all enemies the same, even in combat situations and i don’t save any hill locations.
Other than moving ants away from my hills via missions, every move i make depends entirely on the local environment of the ant.
Interesting reading, overall.
EDIT: Another example of overthinking it: http://lesswrong.com/lw/8ay/ai_challenge_ants/56ug One wonders if the winner could understand even half those links.
↑ comment by malthrin · 2012-01-14T21:13:08.273Z · LW(p) · GW(p)
Agreed. We can certainly do better than that. Unless I have a major life-event before the next AI challenge, I'll enter and get the LW community involved in the effort.
↑ comment by JonathanLivengood · 2012-01-15T22:27:56.249Z · LW(p) · GW(p)
Yes, the write-up is very interesting. But while the strategy was very local, he did end up having mechanisms for coordinating action between ants with otherwise pretty simple decision rules, especially for combat. At least, that's the way it looks to me. Did you mean for your comment to be a criticism of what I wrote? If so, could you say a bit more?
↑ comment by printing-spoon · 2012-01-14T19:16:07.915Z · LW(p) · GW(p)
If the ants have different decision theories and/or different preferences, how can they work together?
EDIT: I should say that I realize the game works with a bot controlling the whole colony, but I don't think that changes the problems in principle, anyway.
What?
The ants are not even close to individuals. They're dots. They're dots that you move around.
↑ comment by JonathanLivengood · 2012-01-15T22:17:08.139Z · LW(p) · GW(p)
The ants are not even close to individuals. They're dots. They're dots that you move around.
This just seems like a failure of imagination to me.
You could think of the game as just pushing around dots. But if you write a rule for pushing the dots that works for each dot and has no global constraints, then you are treating the dots like individuals with individual decision rules.
Example. On each turn, roll a fair four-sided die. If the result is '1', go North. If the result is '2', go South. Etc.
The effect is to push around all the dots each turn. But it's not at all silly to describe what you would be coding here as giving each ant a very simple decision rule. Any global behavior -- behavior exhibited by the colony -- is due to each ant having this specific decision rule.
If you want a colony filled with real individuals, tweak the dumb rule by weighting the die slightly differently for each new ant generated. Then every ant will have a slightly different decision rule.
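To make that concrete, here is a minimal Python sketch of the "each dot gets its own decision rule" idea. Everything in it -- the Ant class, the coordinates, the weight perturbation -- is hypothetical illustration, not part of the contest API:

```python
import random

# Map each face of the four-sided die to a (row, col) step.
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

class Ant:
    """Hypothetical per-ant agent: each ant owns a slightly different
    weighting over the four moves, i.e. its own decision rule."""

    def __init__(self, row, col, rng):
        self.row, self.col = row, col
        # Perturb a fair die a little for each new ant generated.
        self.weights = [0.25 + rng.uniform(-0.05, 0.05) for _ in range(4)]

    def step(self, rng):
        # Roll this ant's personal weighted die and take the step.
        face = rng.choices(list(MOVES), weights=self.weights)[0]
        dr, dc = MOVES[face]
        self.row += dr
        self.col += dc

rng = random.Random(42)
colony = [Ant(10, 10, rng) for _ in range(5)]
for turn in range(3):
    for ant in colony:  # the "colony behavior" is just each ant's rule firing
        ant.step(rng)
```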
Note that I am not trying to say anything smart about what rule(s) should be implemented for the ants, only illustrating the thought that it is not crazy -- and might even be helpful -- to think about the ants as individuals with individual decision rules.
comment by lavalamp · 2012-01-12T16:14:44.186Z · LW(p) · GW(p)
...which I think is about 6 months away...
Outside view suggests it might be more like a year. (Ants was originally planned--at least by one of the contest's officials--for about 6 months after Planet Wars.)
I'm not too unhappy with my #326 finish, considering that life intervened and I basically didn't work on it at all for the second half of the contest. Percentage-wise, it's an improvement over the last one, anyway. I was also struck by how mundane the winner's strategy was (http://www.xathis.com/posts/ai-challenge-2011-ants.html). It seems he didn't win with just a few overarching principles, but rather with a lot of little pieces done well.
I have some more postmortem thoughts that I'll try to write down in a bit.
comment by [deleted] · 2012-01-12T20:55:22.892Z · LW(p) · GW(p)
Don't expect people to commit up front to representing LessWrong. The idea of representing the entire community sounds kind of intimidating. In general, don't put too much stress on benefits like positive publicity or learning important things about cooperation. I think the people who'll participate in a competition like that will be those who find it intrinsically rewarding. So stress the fun and coolness factor, I guess? Maybe have the people who participate initially write down accounts of their early experiences, in order to capture the imagination and draw in others.
It would be good if people could try their hand at participating in a team without too much commitment. This sounds hard and probably heavily depends on the structure of the particular programming problem and maybe rules of the competition.
Have a ready-to-use platform for cooperation so that people can immediately jump into team activities. Preparing such could amount to researching various websites providing such services, choosing the best combination of them and then writing a short tutorial on their use. In the extreme case, a group of volunteers could give the chosen solution a test run by cooperating on solving a problem from some prior programming contest and then writing down a report of their experience.
comment by lavalamp · 2012-01-16T16:39:29.124Z · LW(p) · GW(p)
As promised, a few more postmortem thoughts.
I made a number of mistakes while writing my bot. (I'm about to compare my approach with that of the winner, xathis, so you might want to read his write up first: http://www.xathis.com/posts/ai-challenge-2011-ants.html)
I assumed I wouldn't have time to do the number of local searches that xathis did, and I never tested this assumption.
For the combat code, I wrote both minimax and probabilistic algorithms; both worked OK, but I never tried any branch pruning, so the minimax code only ran when small numbers of ants were fighting. I should have tried some branch pruning. I spent too much time writing the combat code for the benefit it gave my bot.
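For reference, the branch pruning I mean is standard alpha-beta. This is just the textbook Python skeleton, not code from my bot; state.is_terminal(), state.children() and state.evaluate() stand in for hooks a real combat resolver would have to supply:

```python
def alphabeta(state, depth, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Minimax with alpha-beta pruning over a game-state tree.
    `state` is a hypothetical object: children() would enumerate one
    side's joint move sets, evaluate() would score the combat outcome."""
    if depth == 0 or state.is_terminal():
        return state.evaluate()
    if maximizing:
        value = float("-inf")
        for child in state.children(maximizing=True):
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:  # prune: the minimizer will never allow this line
                break
        return value
    value = float("inf")
    for child in state.children(maximizing=False):
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if beta <= alpha:  # prune: the maximizer already has a better option
            break
    return value
```

The cutoffs mean whole subtrees of joint move assignments never get scored, which is what would have made minimax affordable for more than a handful of fighting ants.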
My biggest mistake, or perhaps the unifying theme of my mistakes, is that I spent way too much time searching for the Deep Underlying Principle; instead of just writing code that directly did what I wanted, I was trying to write code that produced the behavior I wanted automatically as emergent behavior.
I spent too much time trying to fine-tune doomed approaches (of course, that is at least partly hindsight bias). I should have spent more time exploring completely different approaches.
One thing I did do well was an algorithm for listing all local combats, so that my fighting code could run on small(er) numbers of ants. (xathis did something similar.) I'm also not unhappy with my BFS algorithms; they ran quickly and I should have thought to use them more heavily.
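About the BFS: the workhorse in most Ants bots was a multi-source flood fill producing a distance map (e.g. distance to the nearest food), which each ant then descends greedily. A sketch, assuming the map is a list of strings with '%' marking water as in the contest's map files:

```python
from collections import deque

def bfs_distances(grid, sources):
    """Multi-source BFS over a wraparound grid (Ants maps wrap at the
    edges). `grid` is a list of equal-length strings with '%' for water;
    `sources` is an iterable of (row, col) goal tiles, e.g. food.
    Returns a dict mapping each reachable tile to the distance of its
    nearest source."""
    rows, cols = len(grid), len(grid[0])
    dist = {(r, c): 0 for (r, c) in sources}
    queue = deque(dist)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = (r + dr) % rows, (c + dc) % cols  # wrap around the map
            if grid[nr][nc] != '%' and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist
```

With the map filled once per goal type, each ant's move choice is O(1): step onto the neighboring tile with the smallest distance.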
As far as working with an "official" Less Wrong team, I didn't feel sufficiently confident in my abilities to want to represent LW officially (and I'm not updating that belief much, because I still think it's quite likely that lots of people here could have done better). I think if there is interest next time, we should at least figure out a method of working together beforehand.
I'm not sure what else I should say, ask questions if you have them.
comment by [deleted] · 2012-01-12T09:37:41.153Z · LW(p) · GW(p)
There are a lot of standard software development processes that you can follow (e.g. Scrum).
What those usually boil down to is:
- Know exactly what tasks need to be done.
- Know exactly who does what task and when it will be completed.
- Check both assumptions regularly.