Neural nets designing neural nets

post by Stuart_Armstrong · 2017-01-18T15:27:43.000Z · LW · GW · 1 comments

This is a link post for https://arxiv.org/abs/1611.01578


1 comment


comment by IAFF-User-177 (Imported-IAFF-User-177) · 2017-01-20T04:20:49.000Z · LW(p) · GW(p)

> On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.84, which is only 0.1 percent worse and 1.2x faster than the current state-of-the-art model. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art.

Ummm... if I'm reading this correctly, they had to do extra training for the architecture learner, and then they didn't do that much better than grad-student descent. Interesting, but not necessarily what I would call self-improving AI.
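The "extra training for the architecture learner" refers to the paper's outer loop: a controller proposes an architecture, a child network with that architecture is trained and evaluated, and the validation accuracy is fed back as a reward to update the controller. A minimal sketch of that loop, with many assumptions: the paper's RNN controller is replaced by a toy preference distribution over a single hypothetical choice (layer width), "training the child network" is replaced by a stand-in reward function, and the update is a simplified REINFORCE-style step with a moving-average baseline. All names and numbers here are illustrative, not from the paper.

```python
import math
import random

random.seed(0)

CHOICES = [16, 32, 64, 128]          # hypothetical layer widths to search over
prefs = {c: 0.0 for c in CHOICES}    # controller's preference score per choice


def sample(prefs):
    """Sample one architecture choice from a softmax over preferences."""
    weights = [math.exp(prefs[c]) for c in CHOICES]
    r = random.random() * sum(weights)
    for c, w in zip(CHOICES, weights):
        r -= w
        if r <= 0:
            return c
    return CHOICES[-1]


def child_reward(width):
    """Stand-in for 'train the child network, return validation accuracy'.

    Here the made-up optimum is width 64, with a little noise.
    """
    base = {16: 0.80, 32: 0.88, 64: 0.93, 128: 0.91}[width]
    return base + random.gauss(0, 0.01)


baseline = 0.0
for step in range(200):
    w = sample(prefs)                    # controller proposes an architecture
    r = child_reward(w)                  # "train" it, observe reward
    baseline = 0.9 * baseline + 0.1 * r  # moving-average reward baseline
    prefs[w] += 0.5 * (r - baseline)     # simplified REINFORCE-style update

best = max(prefs, key=prefs.get)
print(best)
```

The point of the sketch is the cost structure the comment complains about: every controller update requires training a child network from scratch, which is why the real method needed hundreds of GPUs.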