New program can beat AlphaGo, didn't need input from human games

post by NancyLebovitz · 2017-10-18T20:01:39.616Z · LW · GW · Legacy · 14 comments

This is a link post for https://deepmind.com/blog/alphago-zero-learning-scratch/

14 comments

Comments sorted by top scores.

comment by Mitchell_Porter · 2017-10-19T21:31:45.389Z · LW(p) · GW(p)

A voice tells me that we're out of time. The future of the world will now be decided at DeepMind, or by some other group at their level.

Replies from: IlyaShpitser, Kawoomba
comment by IlyaShpitser · 2017-10-26T14:34:18.909Z · LW(p) · GW(p)

You should probably stop listening to random voices.


More seriously, do you want to make a concrete bet on something?

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2017-10-29T20:28:11.776Z · LW(p) · GW(p)

How much are you willing to lose?

Replies from: IlyaShpitser
comment by IlyaShpitser · 2017-10-30T14:42:38.378Z · LW(p) · GW(p)

Let's say 100 dollars, but the amount is largely symbolic. The function of the bet is to try to clarify what specifically you are worried about. I am happy to do less -- whatever is comfortable.

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2017-10-31T03:50:11.073Z · LW(p) · GW(p)

Wake up! In three days, that AI went from knowing nothing to comprehensively beating an earlier AI which had been trained on a distillation of the best human experience. Do you think there's a force in the world that can stand against that kind of strategic intelligence?
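The training regime being described — a program that starts with zero human data and improves purely by playing against itself — can be illustrated with a toy self-play learner. This is a minimal sketch of tabular self-play Q-learning on a 5-stone game of Nim; the game, hyperparameters, and all names are my own illustrative choices, not DeepMind's method or scale:

```python
import random

# Toy illustration of learning a game from zero human data via self-play.
# Game: Nim with 5 stones; players alternate taking 1 or 2 stones, and
# whoever takes the last stone wins.  The agent plays both sides.

random.seed(0)
Q = {}  # (stones_remaining, move) -> value estimate for the player to move

def moves(n):
    return [m for m in (1, 2) if m <= n]

def q(n, m):
    return Q.get((n, m), 0.0)

def train(episodes=5000, alpha=0.5, eps=0.2):
    for _ in range(episodes):
        n = 5
        history = []  # (state, move) pairs, players alternating each ply
        while n > 0:
            ms = moves(n)
            if random.random() < eps:
                m = random.choice(ms)      # explore
            else:
                m = max(ms, key=lambda m: q(n, m))  # exploit current policy
            history.append((n, m))
            n -= m
        # The player who made the last move won.  Walk back through the
        # game, crediting +1 to the winner's moves and -1 to the loser's.
        reward = 1.0
        for state, move in reversed(history):
            Q[(state, move)] = q(state, move) + alpha * (reward - q(state, move))
            reward = -reward  # flip perspective each ply

train()
best = max(moves(5), key=lambda m: q(5, m))
print(best)  # optimal play from 5 stones is to take 2, leaving a multiple of 3
```

AlphaGo Zero combines self-play with a deep network and Monte Carlo tree search; this sketch keeps only the self-play ingredient, yet it still discovers optimal play without ever seeing a human game.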

Replies from: IlyaShpitser, whpearson
comment by IlyaShpitser · 2017-10-31T04:19:46.096Z · LW(p) · GW(p)

So, a concrete bet then? What specifically are you worried about? In the form of a falsifiable claim, please.


edit: I am trying to make you feel better, the real way. The empiricist way.

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2017-11-01T21:19:15.661Z · LW(p) · GW(p)

Just answer the question.

Replies from: IlyaShpitser
comment by whpearson · 2017-11-02T17:08:11.437Z · LW(p) · GW(p)

A brief reply.

Strategy is nothing without knowledge of the terrain.

Knowledge of the terrain might be hard to get reliably.

Therefore there might be some time between AGI being developed and its being able to reliably acquire that knowledge. If the people who develop it are friendly, they might decide to distribute it to others, making it harder for any one project to take off.

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2017-11-06T12:54:48.094Z · LW(p) · GW(p)

Knowledge of the terrain might be hard to get reliably

Knowing that the world is made of atoms should take an AI a long way.

If the people who develop [AGI] are friendly, they might decide to distribute it to others, making it harder for any one project to take off.

I hold to the classic definition of friendly AI as being AI with friendly values, which retains them (or even improves them) as it surpasses human intelligence and otherwise self-modifies. As far as I'm concerned, AlphaGo Zero demonstrates that raw problem-solving ability has crossed a dangerous threshold. We need to know what sort of "values" and "laws" should govern the choices of intelligent agents with such power.

comment by Kawoomba · 2017-10-22T09:15:10.463Z · LW(p) · GW(p)

... and there is only one choice I'd expect them to make; in other words, no actual decision at all.

comment by gwern · 2017-10-20T01:45:08.328Z · LW(p) · GW(p)

If anyone wants more details, I have extensive discussion & excerpts from the paper & DM QAs at https://www.reddit.com/r/reinforcementlearning/comments/778vbk/mastering_the_game_of_go_without_human_knowledge/

comment by Manfred · 2017-10-18T22:59:35.379Z · LW(p) · GW(p)

Interesting that resnets still seem state of the art. I was expecting them to have been replaced by something more heterogeneous by now. But I might be overrating the usefulness of discrete composition because it's easy to understand.
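For readers unfamiliar with the residual ("resnet") blocks being discussed: a residual block computes y = x + F(x), so the layer learns a correction on top of the identity map rather than a full transformation. A minimal NumPy sketch (my own illustration; the blocks in the AlphaGo Zero paper use convolutions and batch normalization):

```python
import numpy as np

# Sketch of a residual connection: y = x + F(x).  With F initialized to
# zero the block is exactly the identity, which is part of why very deep
# residual stacks remain trainable.

rng = np.random.default_rng(0)

def residual_block(x, w1, w2):
    # F(x): two small linear maps with a ReLU in between
    h = np.maximum(w1 @ x, 0.0)
    fx = w2 @ h
    return x + fx  # skip connection: gradients also flow through the identity path

x = rng.normal(size=4)
w1 = rng.normal(size=(4, 4)) * 0.1
w2 = np.zeros((4, 4))  # zero-initialized residual branch
y = residual_block(x, w1, w2)
print(np.allclose(y, x))  # True: the block passes its input through unchanged
```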