Unscathed, you try playing against the master yourself. You lose again, and again, and again. Gino silently makes his moves and swiftly corners you each time. After a while, you manage not to lose right away, but your defeat still comes pretty quickly, and your progress in time-to-defeat is biblically slow. It seems like you would need to play an incredibly large number of matches to get to a decent level.
This is reinforcement learning, and it worked out spectacularly for AlphaGo (which, BTW, has to operate in a much larger search space than chess). In more constrained problem spaces, which in my mind include most of "knowledge work" / desk jobs, the amount of labeled data needed seems to be on the order of hundreds of thousands of examples.
Intuitively, it could be that language contains the schemes of human thought, not just as the abstract thing that produced the stream of language, but within the language itself, even though we never explicitly laid down the algorithm of a human in words. If imitation training can find associations that somehow tap into this recursiveness, it could be that optimizing the imitation of a relatively small amount of human text was sufficient to crack humans.
This is well said. When does acting become indistinguishable from reality? In the human world, we certainly have plenty of examples - movie actors, politicians, fake-it-till-you-make-it entrepreneurs. And more commonly, thinking out loud is something many of us practice, where the spoken words do seem to take on a life of their own in pushing along that abstract thing called thinking.