New program can beat AlphaGo, didn't need input from human games
post by NancyLebovitz · 2017-10-18T20:01:39.616Z · LW · GW · Legacy · 14 comments
This is a link post for https://deepmind.com/blog/alphago-zero-learning-scratch/
14 comments
Comments sorted by top scores.
comment by Mitchell_Porter · 2017-10-19T21:31:45.389Z · LW(p) · GW(p)
A voice tells me that we're out of time. The future of the world will now be decided at DeepMind, or by some other group at their level.
Replies from: IlyaShpitser, Kawoomba
↑ comment by IlyaShpitser · 2017-10-26T14:34:18.909Z · LW(p) · GW(p)
You should probably stop listening to random voices.
More seriously, do you want to make a concrete bet on something?
Replies from: Mitchell_Porter
↑ comment by Mitchell_Porter · 2017-10-29T20:28:11.776Z · LW(p) · GW(p)
How much are you willing to lose?
Replies from: IlyaShpitser
↑ comment by IlyaShpitser · 2017-10-30T14:42:38.378Z · LW(p) · GW(p)
Let's say 100 dollars, but the amount is largely symbolic. The function of the bet is to try to clarify what specifically you are worried about. I am happy to do less -- whatever is comfortable.
Replies from: Mitchell_Porter
↑ comment by Mitchell_Porter · 2017-10-31T03:50:11.073Z · LW(p) · GW(p)
Wake up! In three days, that AI evolved from knowing nothing to comprehensively beating an earlier AI that had been trained on a distillation of the best human experience. Do you think there's a force in the world that can stand against that kind of strategic intelligence?
Replies from: IlyaShpitser, whpearson
↑ comment by IlyaShpitser · 2017-10-31T04:19:46.096Z · LW(p) · GW(p)
So, a concrete bet then? What specifically are you worried about? In the form of a falsifiable claim, please.
edit: I am trying to make you feel better, the real way. The empiricist way.
Replies from: Mitchell_Porter
↑ comment by Mitchell_Porter · 2017-11-01T21:19:15.661Z · LW(p) · GW(p)
Just answer the question.
Replies from: IlyaShpitser
↑ comment by IlyaShpitser · 2017-11-01T21:43:20.438Z · LW(p) · GW(p)
http://marginalrevolution.com/marginalrevolution/2012/11/a-bet-is-a-tax-on-bullshit.html
Replies from: Mitchell_Porter
↑ comment by Mitchell_Porter · 2017-11-01T22:28:48.446Z · LW(p) · GW(p)
And you're the tax collector? Answer the question.
↑ comment by whpearson · 2017-11-02T17:08:11.437Z · LW(p) · GW(p)
A brief reply.
Strategy is nothing without knowledge of the terrain.
Knowledge of the terrain might be hard to get reliably.
Therefore there might be some time between AGI being developed and it being able to reliably acquire that knowledge. If the people who develop it are friendly, they might decide to distribute it to other people, making it harder for any one project to take off.
Replies from: Mitchell_Porter
↑ comment by Mitchell_Porter · 2017-11-06T12:54:48.094Z · LW(p) · GW(p)
> Knowledge of the terrain might be hard to get reliably.
Knowing that the world is made of atoms should take an AI a long way.
> If the people who develop [AGI] are friendly, they might decide to distribute it to other people, making it harder for any one project to take off.
I hold to the classic definition of friendly AI: AI with friendly values, which retains (or even improves) those values as it surpasses human intelligence and otherwise self-modifies. As far as I'm concerned, AlphaGo Zero demonstrates that raw problem-solving ability has crossed a dangerous threshold. We need to know what sort of "values" and "laws" should govern the choices of intelligent agents with such power.
comment by gwern · 2017-10-20T01:45:08.328Z · LW(p) · GW(p)
If anyone wants more details, I have extensive discussion & excerpts from the paper & the DeepMind Q&As at https://www.reddit.com/r/reinforcementlearning/comments/778vbk/mastering_the_game_of_go_without_human_knowledge/