[Link] Neural networks trained on expert Go games have just made a major leappost by ESRogs · 2015-01-02T15:48:16.283Z · score: 15 (16 votes) · LW · GW · Legacy · 5 comments
From the arXiv:
Move Evaluation in Go Using Deep Convolutional Neural Networks
Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver
The game of Go is more challenging than other board games, due to the difficulty of constructing a position or move evaluation function. In this paper we investigate whether deep convolutional networks can be used to directly represent and learn this knowledge. We train a large 12-layer convolutional neural network by supervised learning from a database of human professional games. The network correctly predicts the expert move in 55% of positions, equalling the accuracy of a 6 dan human player. When the trained convolutional network was used directly to play games of Go, without any search, it beat the traditional search program GnuGo in 97% of games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move.
This approach looks like it could be combined with MCTS. Here's their conclusion:
In this work, we showed that large deep convolutional neural networks can predict the next move made by Go experts with an accuracy that exceeds previous methods by a large margin, approximately matching human performance. Furthermore, this predictive accuracy translates into much stronger move evaluation and playing strength than has previously been possible. Without any search, the network is able to outperform traditional search based programs such as GnuGo, and compete with state-of-the-art MCTS programs such as Pachi and Fuego.
In Figure 2 we present a sample game played by the 12-layer CNN (with no search) versus Fuego (searching 100K rollouts per move) which was won by the neural network player. It is clear that the neural network has implicitly understood many sophisticated aspects of Go, including good shape (patterns that maximise long term effectiveness of stones), Fuseki (opening sequences), Joseki (corner patterns), Tesuji (tactical patterns), Ko fights (intricate tactical battles involving repeated recapture of the same stones), territory (ownership of points), and influence (long-term potential for territory). It is remarkable that a single, unified, straightforward architecture can master these elements of the game to such a degree, and without any explicit lookahead.
On the other hand, we note that the network still has weaknesses: notably it sometimes fails to understand the global picture, behaving as if the life and death status of large groups has been incorrectly assessed. Interestingly, it is precisely these global aspects of the game for which Monte-Carlo search excels, suggesting that these two techniques may be largely complementary. We have provided a preliminary proof-of-concept that MCTS and deep neural networks may be combined effectively. It appears that we now have two core elements that scale effectively with increased computational resource: scalable planning, using Monte-Carlo search; and scalable evaluation functions, using deep neural networks. In the future, as parallel computation units such as GPUs continue to increase in performance, we believe that this trajectory of research will lead to considerably stronger programs than are currently possible.
H/T: Ken Regan
Edit -- see also: Teaching Deep Convolutional Neural Networks to Play Go (also published to the arXiv in December 2014), and Why Neural Networks Look Set to Thrash the Best Human Go Players for the First Time (MIT Technology Review article)
Comments sorted by top scores.