↑ comment by ESRogs ·
2019-01-26T19:38:02.994Z · LW(p) · GW(p)
I think it's quite possible that when they instituted the cap they thought it was fair, however from the actual gameplay it should be obvious to anyone who is even somewhat familiar with Starcraft II (e.g., many members of the AlphaStar team) that AlphaStar had a large advantage in "micro", which in part came from the APM cap still allowing superhumanly fast and accurate actions at crucial times. It's also possible that the blogpost and misleading APM comparison graph were written by someone who did not realize this, but then those who did realize should have objected to it and had it changed after they noticed.
It's not so obvious to me that someone who realizes that AlphaStar is superior at "micro" should have objected to those graphs.
Think about it like this -- you're on the DeepMind team, developing AlphaStar, and the whole point is to make it superhuman at StarCraft. So there's going to be some part of the game that it's superhuman at, and to some extent this will be "unfair" to humans. The team decided to try not to let AlphaStar have "physical" advantages, but I don't see any indication that they explicitly decided that it should not be better at "micro" or unit control in general, and should only win on "strategy".
Also, separating "micro" from "strategy" is probably not that simple for a model-free RL system like this. So I think they made a very reasonable decision to focus on a relatively easy-to-measure APM metric. When the resulting system doesn't play exactly as humans do, or in a way that would be easy for humans to replicate, to me it doesn't seem so-obvious-that-you're-being-deceptive-if-you-don't-notice-it that this is "unfair" and that you should go back to the drawing board with your handicapping system.
It seems to me that which ways for AlphaStar to be superhuman are "fair" or "unfair" is to some extent a matter of taste, and there will be many cases that are ambiguous. To give a non-"micro" example -- suppose AlphaStar is able to keep track of exactly how many units its opponent has (and at what hit point levels) throughout the game better than a human can, and this allows it to make slightly more fine-grained decisions about which units it should produce. This might allow it to win a game in a way that's not replicable by humans. It didn't find a new strategy -- it just executed better. Is that fair or unfair? It feels maybe less unfair than just being super good at micro, but exactly where the dividing line falls between "interesting" and "uninteresting" ways of winning does not seem clear.
Of course, now that a much broader group of StarCraft players has seen these games, and a consensus has emerged that this super-micro does not really seem fair, it would be weird if DeepMind did not take that into account for its next release. I will be quite surprised if they don't adjust their setup to reduce the micro advantage going forward.
↑ comment by Wei_Dai ·
2019-01-26T21:14:27.983Z · LW(p) · GW(p)
When the resulting system doesn’t play exactly as humans do, or in a way that would be easy for humans to replicate, to me it doesn’t seem so-obvious-that-you’re-being-deceptive-if-you-don’t-notice-it that this is “unfair” and that you should go back to the drawing board with your handicapping system.
This is not the complaint that people (including me) have. Instead, the complaint is that, given it's clear that AlphaStar won mostly through micro, that graph highlighted statistics (i.e., average APM over the whole game, including humans spamming keys to keep their fingers warm) that would be irrelevant to SC2 experts for judging whether or not AlphaStar did win through micro, but would reliably mislead non-experts into thinking "no" on that question. Both of these effects should have been easy to foresee.
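To make the arithmetic concrete, here is a minimal sketch of how a whole-game average can hide a short burst. All numbers are made up for illustration and are not taken from AlphaStar or from any real player; the function names `average_apm` and `peak_apm` are hypothetical, and the 5-second window is an arbitrary choice.

```python
def average_apm(timestamps_ms, game_length_ms):
    """Actions per minute averaged over the whole game."""
    return len(timestamps_ms) * 60_000 / game_length_ms


def peak_apm(timestamps_ms, window_ms=5_000):
    """Highest actions-per-minute rate seen in any sliding window of window_ms."""
    ts = sorted(timestamps_ms)
    best = 0
    start = 0
    for end, t in enumerate(ts):
        # Drop actions that are window_ms or more older than the current one.
        while t - ts[start] >= window_ms:
            start += 1
        best = max(best, end - start + 1)
    return best * 60_000 / window_ms


GAME_MS = 600_000  # hypothetical 10-minute game

# Hypothetical human: steady key spam, one action every 200 ms all game long.
human = list(range(0, GAME_MS, 200))

# Hypothetical agent: a slower baseline (one action every 300 ms) plus a single
# 5-second fight in which it acts every 50 ms.
agent = list(range(0, GAME_MS, 300)) + list(range(300_000, 305_000, 50))

human_avg, human_peak = average_apm(human, GAME_MS), peak_apm(human)
agent_avg, agent_peak = average_apm(agent, GAME_MS), peak_apm(agent)

print(f"human: average {human_avg:.0f} APM, peak {human_peak:.0f} APM")
print(f"agent: average {agent_avg:.0f} APM, peak {agent_peak:.0f} APM")
```

In this toy setup the agent's whole-game average comes out lower than the human's, while its peak rate during the one fight is several times higher. A graph of averages alone would show only the first comparison.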