Meta Programming GPT: A route to Superintelligence?
post by dmtea · 2020-07-11T14:51:38.919Z · LW · GW · 7 comments
Imagine typing the following meta-question into GPT-4, a revolutionary new 20-trillion-parameter language model released in 2021:
"I asked the superintelligence how to cure cancer. The superintelligence responded __"
How likely are we to get an actual cure for cancer, complete with manufacturing blueprints? Or will we get yet another "nice-sounding, vague suggestion" like "by combining genetic engineering and fungi-based medicine" - the sort GPT-2/3 is likely to suggest?
The response depends on whether GPT does one of two things:
1. Says what GPT thinks humans think the superintelligence would say; or
2. Uses basic reasoning to solve for what the character (an actual superintelligence) would say if this scenario were playing out in real life.
If GPT takes the second approach, by imitating the idealised superintelligence, it would in essence have to act superintelligent.
The difference between the two lies on a fine semantic line: whether GPT thinks the conversation is a human imitating a superintelligence, or the actual words of a superintelligence. Arguably, since it only has training samples of the former, it will do the former. Yet that's not what it did with numbers - it learnt the underlying principle and extrapolated to tasks it had never seen.
If #1 is true, that still implies that GPT-3/4 could be very useful as an AXI (Artificial eXpert Intelligence): we just need it to imitate a really smart human. More on this below under "Human Augmentation".
Human-Like Learning?
Human intelligence ['ability to achieve goals'] can be modelled purely as an optimisation process towards imitating an agent that achieves those goals. Insofar as these goals can be expressed in language, GPT exhibits a similar capacity to "imagine up an agent" that is likely to fulfil a particular goal. Ergo, GPT exhibits primitive intelligence of the same kind as human intelligence.
More specifically, I'm trying to clarify that there is a spectrum between imitation and meta-imitation; and bigger GPT models are getting progressively better at meta-imitation.
- Meta-imitation is the imitation of the underlying type of thinking that is represented by a class of real or fictional actors: e.g. mathematics.
- Imitation is direct (perfect or imperfect) copying of an observed behaviour: e.g. recalling the atomic number of uranium.
Language allows humans to imagine ideas that they then imitate - it gives us the ability to imitate the abstract.
Suppose you were a general in ancient Athens, and the problem of house lamps occasionally spilling and setting neighbourhoods aflame was brought to you. "We should build a fire-fighting squad," you pronounce. The words "fire-fighting squad" may never have been used in history before that (as a sufficient density of human population requiring such measures didn't occur earlier) - yet the meaning would be, to a great degree, plain to onlookers. The fire-fighting squad thus formed can go about their duties without much further instruction, by making decisions based on substituting the question "what do I do?" with "what would a hypothetical idealised firefighter do?".
With a simple use of language, we're able to get people to optimize for brand new tasks. Could this same sort of reasoning be used with GPT? Evidence of word substitution would suggest so.
So, in one line: is meta-imitation = intelligence? And will GPT ever be capable of human-level meta-imitation?
Larger GPT models appear to show an increase in meta-imitation over literal imitation. For example, if you ask GPT-2:
"What is 17+244?"
it replies "11".
This is closer to literal imitation: it knows numbers come after a question containing other numbers and an operator ("+"). Incidentally, young children seem to acquire language in a somewhat similar fashion:
They begin by imitating utterances (a baby might initially describe many things as "baba"); their utterances grow increasingly sensitive to nuances of context over time: "doggy" < "Labrador" < "Tommy's Labrador named Kappy". I'm arguing that GPT shows a similar increase in contextual sensitivity as the model size grows, implying increasing meta-imitation.
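(A minimal sketch of how one might probe this, assuming the publicly released GPT-2 weights and the Hugging Face transformers library; the prompt wording and decoding settings below are illustrative assumptions, and the completions will vary.)

```python
# Minimal sketch: probe GPT-2's "arithmetic" with raw completion prompts via
# the Hugging Face `transformers` library. Prompt wording and decoding
# settings are illustrative assumptions, not a definitive experiment.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompts = [
    "Q: What is 17+244?\nA:",                                                 # zero-shot
    "Q: What is 2+2?\nA: 4\nQ: What is 3+5?\nA: 8\nQ: What is 17+244?\nA:",   # few-shot
]

for p in prompts:
    out = generator(p, max_new_tokens=4, do_sample=False)[0]["generated_text"]
    print(repr(out[len(p):]))  # show only the completion appended to the prompt
```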
Human Augmentation
My definition of AXI relies on a Turing test comprising a foremost expert in a field conversing with another expert (or an AI). If the expert finds the conversation highly informative and indistinguishable from one with a human expert, we've created a useful AXI.
GPT-2 and GPT-3 appear to show progression towards such intelligence - GPT-written research papers providing interesting ideas being one example. Thus, even if GPT-4 isn't superintelligent, I feel it is highly likely to qualify as AXI [especially when trained on research from the relevant field]. And while it may not be able to answer the question on cancer, maybe it will respond to subtler prompts that induce it to imitate a human expert who has solved the problem. So the following might be how a human announces finding the cure for cancer, and GPT-4's completion might yield interesting results:
"Our team has performed in-vivo experiments where we were able to target and destroy cancerous cells, while leaving healthy ones untouched. We achieved this by targeting certain inactivated genes through a lentivirus-delivered Cas9–sgRNA system. The pooled lentiviruses target several genes, including "
[Epistemic status: weak - I'm not a geneticist and this is likely not the best prompt - but it suggests that human experts would need to work in unison with the AXI to coax meaningful answers out of it.]
Failure Modes
GPT has some interesting failure modes very distinct from a human's - going into repetitive loops, for one, and, with GPT-3 in particular, an increasing tendency to reproduce texts verbatim. Maybe we'll find that GPT-4 is just a really good memoriser and lacks abstract thinking and creativity. Or maybe it falls into even more loops than GPT-3. It is hard to say.
To me, the main argument against GPT-4 acquiring superintelligence is simply its reward function: it is trained to copy humans, so perhaps it will not be able to do things humans can't (since there is no point optimising for them). However, this is a fairly weak position, because, to be precise, GPT attempts to imitate anything, real or hypothetical, in an attempt to get the next word right. The examples of maths and invented words show that GPT appears to be learning the processes behind the words, and extrapolating them to unseen scenarios.
Finally, the word "superintelligence" carries a lot of baggage from its usage in sci-fi and other writing by humans. Perhaps, to remove that human-linked baggage, we could instead define specific scenarios, focusing the AI on imitating the new concept rather than recalling previous human usage. For example:
"RQRST is a robot capable of devising scientific theories that accurately predict reality. When asked to devise a theory on Dark Energy, RQRST responds,"
Or
"Robert Riley was the finest geneticist of the 21st century. His work on genetic screening of embryos relied on "
Or
"Apple has invented a new battery to replace lithium Ion, that lasts 20x as long. It relies on"
I'd love to see GPT-3 complete expert-sounding claims of as-yet unachieved scientific breakthroughs. I'm sure it can already give researchers working in the domain interesting answers, especially once fine-tuning on relevant work is possible.
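(A minimal sketch of what such an experiment might look like, assuming the completion endpoint OpenAI exposed for GPT-3 at the time; the engine name, API key handling, and sampling settings are assumptions, and the prompts are the ones above.)

```python
# Minimal sketch: feed expert-sounding prompts to GPT-3 via OpenAI's
# completion API as it existed circa 2020 and inspect the continuations.
# Engine name and sampling settings are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompts = [
    "RQRST is a robot capable of devising scientific theories that accurately "
    "predict reality. When asked to devise a theory on Dark Energy, RQRST responds,",
    "Robert Riley was the finest geneticist of the 21st century. His work on "
    "genetic screening of embryos relied on",
    "Apple has invented a new battery to replace lithium-ion that lasts 20x as "
    "long. It relies on",
]

for prompt in prompts:
    response = openai.Completion.create(
        engine="davinci",   # assumed name of the GPT-3 base engine
        prompt=prompt,
        max_tokens=100,
        temperature=0.7,
    )
    print(prompt + response["choices"][0]["text"])
    print("---")
```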
7 comments
comment by Vanessa Kosoy (vanessa-kosoy) · 2020-07-12T09:46:08.056Z · LW(p) · GW(p)
Imagine typing the following meta-question into GPT-4, a revolutionary new 20 Trillion parameter language model released in 2021:
"I asked the superintelligence how to cure cancer. The superintelligence responded __"
How likely are we to get an actual cure for cancer, complete with manufacturing blueprints?
...The difference between the two lies on a fine semantic line: whether GPT thinks the conversation is a human imitating a superintelligence, or the actual words of a superintelligence. Arguably, since it only has training samples of the former, it will do the former. Yet that's not what it did with numbers - it learnt the underlying principle and extrapolated to tasks it had never seen.
What GPT is actually trying to do is predict the continuation of random texts found on the Internet. So, if you let it continue "24+51=", what it does is answer the question "Suppose a random text on the Internet contained the string '24+51='. What do I expect to come next?" In this case, it seems fairly reasonable to expect the correct answer. More so if this is preceded by a number of correct exercises in arithmetic (otherwise, maybe it's e.g. one of those puzzles in which the symbols of arithmetic are used to denote something different).
On the other hand, your text about curing cancer is extremely unlikely to be generated by an actual superintelligence. If you told me that you found this text on the Internet, I would bet against the continuation being an actual cure for cancer. I expect any version of GPT which is as smart as me or more to reason similarly (except for complex reasons to do with subagents and acausal bargaining which are beside the point here), and any version of GPT that is less smart than me to be unable to cure cancer (roughly speaking: intelligence is not really one-dimensional).
It seems more likely that you'd get an actual cure for cancer if your initial text is a realistic imitation of something like an academic paper describing a novel cure for cancer, or a paper in AI describing a superintelligence that can cure cancer.
comment by avturchin · 2020-07-11T21:40:39.213Z · LW(p) · GW(p)
It would be interesting to train GPT-4 on the raw code of the GPT-3 neural net (its weights table), so it will be able to output the code of larger networks.
Replies from: gwern
↑ comment by gwern · 2023-03-13T00:47:58.705Z · LW(p) · GW(p)
You wouldn't be able to do that because the raw weights would require context windows of millions or billions of tokens. Meta-learning fast weights requires more tailored approaches; a good recent example is the meta-learning diffusion model "G.pt". (Yes, that is really its name - possibly the worst-named DL result of 2022.)
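(For scale, a rough back-of-the-envelope sketch, assuming GPT-3's ~175 billion parameters, its 2048-token context window, and - very optimistically - one context token per serialized weight:)

```python
# Rough back-of-the-envelope: why GPT-3's raw weights dwarf any plausible
# context window. Assumes ~175B parameters, one token per weight (optimistic),
# and GPT-3's 2048-token context length.
gpt3_params = 175_000_000_000
tokens_per_weight = 1
context_window = 2048

tokens_needed = gpt3_params * tokens_per_weight
print(f"Tokens needed: {tokens_needed:,}")
print(f"Multiple of GPT-3's own context window: {tokens_needed / context_window:,.0f}")
# -> on the order of 85 million times the 2048-token window
```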
comment by Felix Bergmann (felix-bergmann) · 2020-07-13T11:35:08.642Z · LW(p) · GW(p)
I think you misunderstood the aim of GPT. While what you're saying sounds great, it's not the aim of the project and trying to achieve such tasks via GPT within the next 10 years is extremely inefficient. GPT simply predicts the next word in a sentence. Word by word. It only produces what sounds right to us.
I get your idea though
comment by beezlebub33 · 2020-07-13T11:33:31.410Z · LW(p) · GW(p)
I think that the article is very interesting, and that the meta-intelligence point of view is a useful one.
However, I'd like to add the following argument: we need to consider why humans are intelligent, and why they are intelligent in the way that they are. Other animals get by just fine being less intelligent, including other primates, and perform very well in their environments. The (likely?) answer is that we are intelligent not to perform well against the overall environment but to perform well against each other. Your greatest competitor (evolutionarily speaking) is not the task of gathering food or defending against a lion, but the person down the street who is trying to take your mate, either overtly by pairing up with them long term or covertly by mating with them when you are not around.
In that environment, it is important for an animal to understand the motives, intentions, and plans of its competitors. That means it needs to have a model of them: what they know, what their options are, and what they will decide to do. That is meta-intelligence. Humans are good at forming mental models of certain types, inferring consequences, and predicting outcomes. That ability, writ large, is the basis of math and science: form a mental model of a situation, process it, manipulate it, pick a useful outcome.
comment by Pattern · 2020-07-11T17:10:13.400Z · LW(p) · GW(p)
The question I had until halfway through the post:
What is an AXI? ASI?
(Artificial eXpert Intelligence and...artificial scientist?)
The firefighters know their enemy is fires (sort of), and don't require a notion of 'firefighter'.
Why GPT-4 would magically know how to cure cancer isn't clear.
Replies from: dmtea
↑ comment by dmtea · 2020-07-11T18:09:10.771Z · LW(p) · GW(p)
Artificial eXpert Intelligence and Artificial Super Intelligence. Sorry for not being clear - I've edited the title to be more obvious.
What I'm going for here is that the few-shot examples show GPT-3 is a sort of DWIM [LW · GW] AI (Do What I Mean): something it has implicitly learned from examples of humans responding to other humans' requests. It is able to understand simple requests (like unscrambling words, using a word in a sentence, etc.) with about as much data as a human would need: understanding the underlying motive of the request and attempting to fulfil it.
On firefighters, the instruction would work even with children who had never seen a fire, but only heard its attributes verbally described ("hot bright thing that causes skin damage when touched; usually diminished by covering with water or blocking air").
On cancer, have a look at the CRISPR completion - what do you think GPT-4 would say? Is it really that far out to believe that, in an endeavour to predict the next word in thousands of biology research papers, GPT-4 will gain an implicit understanding of biology? In a literal sense, GPT would've "read" more papers than any human possibly could, and might be better placed than the best human researcher (who is also relying on a less probabilistic grasp of the same papers) to probabilistically rank all the genes that might be involved in a cancer cure.