DeepMind has made a general inductor ("Making sense of sensory input")
post by mako yass (MakoYass) · 2021-02-02T02:54:26.404Z · LW · GW · 10 comments
This is a link post for https://www.sciencedirect.com/science/article/pii/S0004370220301855
Our system [the Apperception Engine] is able to produce interpretable human-readable causal theories from very small amounts of data, because of the strong inductive bias provided by the unity conditions. A causal theory produced by our system is able to predict future sensor readings, as well as retrodict earlier readings, and impute (fill in the blanks of) missing sensory readings, in any combination.
We tested the engine in a diverse variety of domains, including cellular automata, rhythms and simple nursery tunes, multi-modal binding problems, occlusion tasks, and sequence induction intelligence tests. In each domain, we test our engine's ability to predict future sensor values, retrodict earlier sensor values, and impute missing sensory data. The Apperception Engine performs well in all these domains, significantly out-performing neural net baselines. We note in particular that in the sequence induction intelligence tests, our system achieved human-level performance. This is notable because our system is not a bespoke system designed specifically to solve intelligence tests, but a general-purpose system that was designed to make sense of any sensory sequence.
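To make the quoted predict / retrodict / impute framing concrete, here is a toy sketch in Python: a hand-written causal rule over a cellular-automaton-like sensor tape. This is illustrative only and is not the Apperception Engine's actual machinery, which induces such rules from the readings rather than being given them.

```python
from itertools import product

# Hypothetical causal rule for the toy: elementary cellular automaton rule 110.
RULE_110 = {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
            (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0}

def step(state):
    """One application of the rule to a sensor reading, with wraparound."""
    n = len(state)
    return tuple(RULE_110[(state[(i - 1) % n], state[i], state[(i + 1) % n])]
                 for i in range(n))

def predict(state, t):
    """Roll the rule forward t steps to get future sensor readings."""
    for _ in range(t):
        state = step(state)
    return state

def retrodict(state):
    """Search for earlier readings the rule would map onto the observed one."""
    return [s for s in product((0, 1), repeat=len(state)) if step(s) == state]

def impute(before, after):
    """Fill in a missing reading consistent with the rule on both sides."""
    return [s for s in product((0, 1), repeat=len(before))
            if step(before) == s and step(s) == after]

obs = (0, 1, 1, 0, 1)
print(predict(obs, 3))                # future readings
print(retrodict(obs))                 # candidate earlier readings
print(impute(obs, predict(obs, 2)))   # the missing reading in between
```

The point of the toy is only that a single compact rule supports all three queries; the engine's contribution is finding such rules from the sensory sequence alone.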
If we were to take AIXI literally, we'd be concerned that induction (the generation of predictive models from observation) appears to provide about half of general intelligence (the rest being decision theory). It also seems noteworthy that the models the Apperception Engine produces are reductive enough to be readable to humans: analyzable, classifiable, and comprehensible enough to be intelligently worked with as components in an intellectual medium. That is to say, they may be amenable to a process of self-improvement informed by consciously applied principles and meta-knowledge, which in turn might be improved in similar ways. So we should probably pay attention to this sort of thing.
10 comments
Comments sorted by top scores.
comment by Richard_Ngo (ricraz) · 2021-02-03T15:08:20.758Z · LW(p) · GW(p)
If we were to take AIXI literally, we'd be concerned that induction (the generation of predictive models from observation) appears to provide about half of general intelligence (the rest is decision theory).
I don't think "taking AIXI literally" in this way makes sense; nor does saying that decision theory is about half of general intelligence.
Thanks for the link, though.
↑ comment by interstice · 2021-02-08T07:25:45.861Z · LW(p) · GW(p)
I mean, it's not exactly provable from first principles, but using the architecture of AIXI as a heuristic for what a general intelligence will look like seems to make sense to me. 'Do reinforcement learning on a learned world model' is, I think, also what many people expect a GAI may in fact end up looking like, and saying that that's half decision theory and half predictive model doesn't seem too far off.
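As a very rough sketch of that decomposition (names and interfaces here are illustrative, not taken from the paper or from AIXI's formal definition): an inductive component that learns a predictive model of observations, and a decision-theoretic component that selects actions by expected utility under that model.

```python
from typing import Callable, Dict, List, Tuple

History = Tuple[Tuple[str, int], ...]  # past (action, observation) pairs

class WorldModel:
    """Inductive half: a crude frequency estimate of P(obs | history, action)."""
    def __init__(self) -> None:
        self.counts: Dict[Tuple[History, str, int], int] = {}

    def update(self, history: History, action: str, obs: int) -> None:
        key = (history, action, obs)
        self.counts[key] = self.counts.get(key, 0) + 1

    def predict(self, history: History, action: str, obs: int) -> float:
        total = sum(c for (h, a, _), c in self.counts.items()
                    if h == history and a == action)
        if total == 0:
            return 0.5  # uninformed prior over a binary observation
        return self.counts.get((history, action, obs), 0) / total

def choose_action(model: WorldModel, history: History,
                  actions: List[str], utility: Callable[[int], float]) -> str:
    """Decision-theoretic half: one-step expected-utility maximisation
    under the model's predictive distribution."""
    def expected_utility(a: str) -> float:
        return sum(model.predict(history, a, o) * utility(o) for o in (0, 1))
    return max(actions, key=expected_utility)
```

A real system would of course need a far better inductor and multi-step planning; the only point is that the two halves are separable in roughly the way being discussed.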
↑ comment by mako yass (MakoYass) · 2021-02-09T04:02:37.763Z · LW(p) · GW(p)
Well, I'm not sure there's any reason to think that we can tell, by looking at the mathematical idealizations, that the inductive parts will take about the same amount of work to create as the agentic parts, just because the formalisms seem to weigh similar amounts (and what does that seeming mean?). I'm not sure our intuitions about the weights of the components mean anything.
↑ comment by interstice · 2021-02-09T05:35:25.450Z · LW(p) · GW(p)
If a thing has two main distinct parts, it seems reasonable to say that the thing is half part-1 and half part-2. This does not necessarily imply that the parts are equally difficult to create, although that would be a reasonable prior if you didn't know much about how the parts worked.
comment by interstice · 2021-02-02T03:44:24.901Z · LW(p) · GW(p)
Is there any evidence that this is actually a general inductor, i.e. that as a prior it dominates some large class of functions? From skimming the paper it sounds like this could be interesting progress in ILP, but not necessarily groundbreaking or close to being a fully general inductor. At the moment I'd be more concerned about the transformer architecture potentially being used as (part of) a general inductor.
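For reference, the usual Solomonoff-style sense of "dominates" (assuming that is the sense meant here): a prior dominates a class of environments if it never assigns more than a constant factor less probability to any observation string than any member of the class does.

```latex
% Standard dominance condition for a prior \xi over a class \mathcal{M} of environments:
\[
  \xi \text{ dominates } \mathcal{M}
  \quad\Longleftrightarrow\quad
  \forall \nu \in \mathcal{M}\;\; \exists\, c_\nu > 0\;\; \forall x :\;
  \xi(x) \ge c_\nu\, \nu(x).
\]
```

Solomonoff's universal prior dominates the class of all lower semicomputable semimeasures in this sense, which is roughly the benchmark a "fully general inductor" would be measured against.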
↑ comment by Vanessa Kosoy (vanessa-kosoy) · 2021-02-02T10:13:34.404Z · LW(p) · GW(p)
My impression is that it's interesting because it's good at some functions that deep learning is bad at (although unfortunately the paper doesn't make any head-to-head comparisons), but certainly there are a lot of things in which transformers would beat it. In particular, I would be very surprised if it could reproduce GPT-3 or DALL-E. So, if this leads to a major breakthrough it would probably be through merging it with deep learning somehow.
↑ comment by mako yass (MakoYass) · 2021-02-02T18:32:17.295Z · LW(p) · GW(p)
I'm not aware of a technical definition of "general inductor". I meant that it's an inductor that is quite general.
↑ comment by mako yass (MakoYass) · 2021-02-04T19:30:45.502Z · LW(p) · GW(p)
Wondering whether Integrated Information Theory dictates that most anthropic moments have internet access
↑ comment by mako yass (MakoYass) · 2021-02-04T04:53:12.281Z · LW(p) · GW(p)
Hm, to clarify, by "consciously" I didn't mean experiential weight/anthropic measure; in this case I meant the behaviors generally associated with consciousness: metacognition, centralized narratization of thought, that stuff, which I seem to equate to deliberateness... though maybe those things are only roughly equivalent in humans.