Does there exist an AGI-level parameter setting for modern DRL architectures?

post by TurnTrout · 2020-02-09T05:09:55.012Z · LW · GW · No comments

This is a question post.


Suppose the architecture includes memory (in the form of a recurrent state) and acts as the policy network for an observation-based RL agent. If we evaluate the agent from a reasonable initial state, would you guess that, for current architectures, there exists a parameter setting with robustly human+ capabilities?
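For concreteness, here is a minimal sketch of the kind of recurrent, observation-based policy network the question has in mind. The layer sizes and module names are purely illustrative assumptions, not prescribed by the question:

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """A minimal observation-in, action-logits-out policy with recurrent memory."""
    def __init__(self, obs_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.memory = nn.GRUCell(hidden_dim, hidden_dim)  # the recurrent state
        self.policy_head = nn.Linear(hidden_dim, action_dim)

    def forward(self, obs, hidden):
        x = torch.relu(self.encoder(obs))
        hidden = self.memory(x, hidden)    # update memory from the new observation
        logits = self.policy_head(hidden)  # scores over possible actions
        return logits, hidden
```

The question is then whether some assignment of this network's weights (at some scale) yields robustly human+ behavior, regardless of whether any training procedure would ever find it.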

How many parameters would it take before you estimate there's a fifty-fifty chance of such a parameter setting existing? 1 billion? 1 trillion? More?

Answers

answer by Gurkenglas · 2020-02-09T21:07:46.077Z · LW(p) · GW(p)

Yes. Modelspace is huge and we're only exploring a smidgen. The busy beaver sequence hints at how much you can do with a small number of parts and exponential luck. I think feeding a random number generator into a compiler could theoretically have spawned an AGI in the eighties. Given a memory tape, transformers (and much simpler architectures) are Turing-complete. Even if all my reasoning is wrong, can't the model just be hardcoded to output instructions on how to write an AGI?
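To put a rough number on "modelspace is huge", here is a back-of-the-envelope count of distinct parameter settings, assuming 8-bit quantization purely to make the counting possible (an illustrative assumption, not part of the answer above):

```python
import math

def log10_num_settings(num_params, bits_per_param=8):
    # Number of distinct settings is 2^(num_params * bits_per_param);
    # return its base-10 logarithm so the result fits in a float.
    return num_params * bits_per_param * math.log10(2)

for n in [1e6, 1e9, 1e12]:
    print(f"{n:.0e} params -> about 10^{log10_num_settings(n):.2e} distinct settings")
```

Even a million-parameter network has on the order of 10^2,400,000 distinct 8-bit settings, so any search we run explores only a vanishing sliver of the space.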

comment by Steven Byrnes (steve2152) · 2020-02-10T14:51:29.053Z · LW(p) · GW(p)

Very clever! Yes, I agree with you that there is a parameter setting for modern DRL architectures for an agent that has an "instinct" to walk over to the nearest computer, then write and execute code that turns on a real-deal superintelligent AGI. Or for a program that manually steps through the execution of an AGI Turing machine. I guess I interpreted the question to say that that kind of thing doesn't count. :-P

answer by Steven Byrnes (steve2152) · 2020-02-09T18:26:27.268Z · LW(p) · GW(p)

Going out on a limb (and I might change my mind next week), I would say "no" for currently popular mainstream DRL techniques, because these lack (1) foresight (i.e., running a generative model to predict the results of different possible courses of action, and choosing on the basis of those predictions), and (2) analysis-by-synthesis (processing inputs by continually searching through a space of generative models for the model that best matches the input). I think humans do both [LW · GW], and without both (among other requirements), I picture systems as sorta "operating on instinct" rather than "intelligent".
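As a rough illustration of what "foresight" means operationally, here is a hypothetical one-step lookahead sketch. The function and argument names (dynamics_model, value_fn) are assumptions for illustration, not an implementation from this answer:

```python
def choose_action_with_foresight(state, candidate_actions, dynamics_model, value_fn):
    """Pick the action whose *predicted* outcome looks best, rather than
    reacting to the current observation alone ("operating on instinct")."""
    best_action, best_value = None, float("-inf")
    for action in candidate_actions:
        predicted_next_state = dynamics_model(state, action)  # imagine the outcome
        value = value_fn(predicted_next_state)                # evaluate the imagined outcome
        if value > best_value:
            best_action, best_value = action, value
    return best_action
```

A purely reactive ("instinct") policy skips the inner loop entirely and maps the current observation straight to an action.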

So (in my mind), your question becomes "can we get 'robustly human+ capabilities' from a system operating on instinct?", and the answer is "obviously yes when restricted to any finite set of tasks in any finite set of situations", e.g. AlphaStar. With enough parameters, the set of tasks and situations could get awfully large, and maybe that counts as "robustly human+"—just as a large enough Giant Lookup Table might count as "robustly human+". But my hunch is that systems with foresight and analysis-by-synthesis will be "robustly human+" earlier than any systems that operate on instinct.

No comments
