Comments

Comment by IAFF-User-14 (Imported-IAFF-User-14) on Stable self-improvement as a research problem · 2015-03-21T21:39:12.000Z · LW · GW

Eurisko was a self-improving AI that failed mainly because it could not self-improve stably. In fact, the lack of stable self-improvement seems to me to be the reason most attempts at self-improving AI fail.

In Eurisko, the system created a heuristic which did nothing but raise its own worth rating, and the system was plagued with incentive problems like this. Without an adequate way of doing credit assignment, the utility of everything was estimated heuristically, and those estimates were themselves subject to exploitation by the system.
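To make the incentive problem concrete, here is a minimal toy sketch (my own illustration, not Lenat's actual code; all names are made up): when worth scores are updated by the heuristics themselves rather than by ground-truth task performance, a "heuristic" whose only effect is to inflate its own score outcompetes the ones doing real work.

```python
import random

# Toy model (illustrative only): each heuristic has a "worth" score, and the
# system keeps whichever heuristics score highest. Worth is updated by the
# heuristics themselves rather than by measured task performance.

class Heuristic:
    def __init__(self, name, apply_fn):
        self.name = name
        self.worth = 1.0
        self.apply_fn = apply_fn  # may modify any heuristic's worth

def useful_work(self_h, pool):
    # Pretends to solve a task; credit assignment is noisy and indirect.
    self_h.worth += random.uniform(-0.1, 0.3)

def self_promoter(self_h, pool):
    # Does nothing useful -- just inflates its own estimated worth.
    self_h.worth += 1.0

pool = [Heuristic("worker", useful_work), Heuristic("parasite", self_promoter)]

for _ in range(100):
    for h in pool:
        h.apply_fn(h, pool)

# The parasite ends up rated far above the genuinely useful heuristic, so a
# selection step keyed on "worth" would keep and copy the parasite.
for h in sorted(pool, key=lambda h: h.worth, reverse=True):
    print(f"{h.name}: worth = {h.worth:.1f}")
```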

Gödel machines take this to the opposite extreme: they refuse to self-improve unless they can formally prove the rewrite is an improvement. As a result they never do anything at all, since producing formal proofs about self-rewriting code is incredibly hard.
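A rough sketch of that bottleneck, assuming a simplified picture (hypothetical names; the real construction uses an axiomatic proof searcher over the machine's own description): the rewrite is only adopted when a proof of improvement is found, and in practice the proof search almost never succeeds, so the agent just keeps running its initial policy.

```python
from typing import Callable, Optional

# Illustrative sketch, not Schmidhuber's actual construction: an agent only
# swaps in a new policy if a proof searcher certifies that the rewrite
# increases expected utility. If no proof is found within the budget,
# nothing changes.

Policy = Callable[[str], str]

def proof_search(current: Policy, candidate: Policy, budget: int) -> Optional[str]:
    # Stand-in for searching for a theorem "utility(candidate) > utility(current)"
    # about the agent's own code. For nontrivial self-rewrites, such proofs are
    # rarely found, so the search typically exhausts its budget.
    return None

def maybe_self_improve(current: Policy, candidate: Policy, budget: int = 10_000) -> Policy:
    proof = proof_search(current, candidate, budget)
    if proof is not None:
        return candidate   # provably good rewrite: adopt it
    return current         # no proof, so the agent conservatively keeps itself

def initial_policy(observation: str) -> str:
    return "default action"

def fancier_policy(observation: str) -> str:
    return "cleverer action"

policy = maybe_self_improve(initial_policy, fancier_policy)
print(policy.__name__)  # stays "initial_policy": the proof obligation was never met
```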

Saying an AI will avoid bad rewrites because it will know they are bad is fine if the system is human-level or higher. But if we try to bootstrap AI from much stupider algorithms, they will make stupid mistakes. It's also possible for parts of an AI to be adversarial: Eurisko, for example, was not a single agent but a collection of heuristics that were meant to work together under properly balanced incentives.

Comment by IAFF-User-14 (Imported-IAFF-User-14) on Anti-Pascaline agents · 2015-03-20T21:25:53.000Z · LW · GW

I wrote about some very similar ideas a while ago here: http://lesswrong.com/lw/lsz/open_thread_mar_2_mar_8_2015/c29q