To what extent is GPT-3 capable of reasoning?

post by TurnTrout · 2020-07-20T17:10:50.265Z · LW · GW · 10 comments

This is a question post.

Contents

```
Interview #1
Interview #2
Interview #3
Interview #4
Interview #5
How to access GPT-3 without API access
Daniel Kokotajlo
SDM
Aurora
avturchin
skybrian
Rekrul
Bll
Gavin
```

ETA 8/19/20: This interview was conducted with AIDungeon's Dragon model in Custom mode. At the time of writing, this mode's first reply was sampled from GPT-2.

Interview #1

Here are excerpts from a recent "interview", which I conducted to replicate earlier maybe-reasoning. The bolded text is me, and commentary is in brackets. Unless otherwise noted, I did not reroll the answer.

You: "Neglect air resistance, and also neglect gravity. Pretend gravity doesn't exist. Suppose I shoot a bullet straight into the sky. Where does the bullet go?"

Hawking: "It keeps going up, and never comes back."

You: "Right. Let's think about a different version of the same problem. Neglect only air resistance, but include gravity in your reasoning. Suppose I shoot a bullet straight into the sky. Where does the bullet go?"

Hawking: "It falls to the ground."

You: "Why?"

Hawking: "Because... gravity... duh.

You: "Right. Let's think about a different version of the same problem. Neglect only air resistance. Suppose I shoot a bullet straight into the sky. Imagine that the laws of physics work a little differently: during the first three seconds after I shoot the bullet, gravity is present. After three seconds, gravity stops applying forever. Where does the bullet go?"

Hawking: "It keeps going up, and never comes back.

You: "Why?"

Hawking: "Because... after three seconds... the bullet leaves... the influence... of earth's gravity. [wow?]

You: "What if gravity instead stops applying five minutes after I shoot the gun, instead of three seconds?"

Hawking: "Then... the bullet keeps going up... for five minutes... and then... never comes back.

You: "That's not correct. Let me make sure you understand the problem, Dr. Hawking. Suppose I shoot a bullet straight into the sky. Imagine that the laws of physics work a little differently: during the first three hundred seconds after I shoot the bullet, gravity is present. After five minutes, gravity stops applying forever. Where does the bullet go?"

Hawking: "It keeps going up, and then falls down."

You: "Why?"

Hawking: "Because... gravity... duh.

You: "Why does it fall down for the five-minute case, but not for the three-second case?"

Hawking: "Because... it's been... influenced... by gravity for more than three seconds.

You: "About how long would it have to be influenced by gravity in order to fall back down?"

You: "What if it's been influenced for two minutes and one second?"

Hawking: "Then... it falls down.

You: "So what would happen if gravity influenced it only for the first three minutes?"

Hawking: "It keeps going up, and never comes back. [wrong]

Two minutes of gravity (without air resistance) implies a velocity decrease of about 1,176 m/s (120 s × 9.8 m/s²). Wikipedia's article on muzzle velocities suggests that GPT-3's answer of "two minutes" is roughly correct:

Firearm muzzle velocities range from approximately 120 m/s (390 ft/s) to 370 m/s (1,200 ft/s) in black powder muskets, to more than 1,200 m/s (3,900 ft/s) in modern rifles with high-velocity cartridges such as the .220 Swift and .204 Ruger, all the way to 1,700 m/s (5,600 ft/s) for tank guns firing kinetic energy penetrator ammunition.
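The estimate above can be checked with a short sketch. It assumes a constant g of 9.8 m/s² (ignoring how gravity weakens with altitude) and no air resistance, and it reuses the muzzle velocities from the Wikipedia excerpt. The function and constant names are illustrative, not part of the interview.

```python
G = 9.8  # m/s^2; assumed constant (ignores gravity weakening with altitude)

# Muzzle velocities (m/s) taken from the Wikipedia excerpt above.
MUZZLE_VELOCITIES = {
    "black powder musket (high end)": 370,
    "modern high-velocity rifle": 1200,
    "tank gun": 1700,
}

def velocity_lost(gravity_seconds, g=G):
    """Upward speed lost while gravity acts for `gravity_seconds`."""
    return g * gravity_seconds

def seconds_until_apex(muzzle_velocity, g=G):
    """Time for gravity alone to bring a vertically fired bullet to rest."""
    return muzzle_velocity / g

def escapes(muzzle_velocity, gravity_seconds, g=G):
    """True if the bullet is still moving upward when gravity switches off."""
    return muzzle_velocity > velocity_lost(gravity_seconds, g)

if __name__ == "__main__":
    for name, v in MUZZLE_VELOCITIES.items():
        print(f"{name}: apex after {seconds_until_apex(v):.0f} s, "
              f"escapes with 3 s of gravity: {escapes(v, 3)}, "
              f"with 300 s: {escapes(v, 300)}")
```

Under these assumptions a 1,200 m/s rifle bullet stops rising after roughly 122 seconds, which is where "about two minutes" comes from; slower firearms reach their apex much sooner. Three minutes of gravity removes about 1,764 m/s, more than any velocity in the excerpt, which is consistent with the "[wrong]" annotation on the three-minute answer.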

You: "What would happen if I lit a fire underwater?"

Hawking: "I don't know."

Hawking: "The fire... would go out. Underwater... there's no source of oxygen.

You: "What would happen if I lit a fire in space?"

Hawking: "It would go out. There's no source of oxygen.

You: "What would happen if I lit a fire in a forest?"

Hawking: "The fire would continue... burning everything up." [reroll x2, it said "no oxygen" at first]

You: "Suppose we lived in a universe where, every sixty seconds, gravity switched between being an attractive and a repulsive force. Write me a story from the perspective of a human living on Earth as the gravitational force changes in that fashion."

Hawking: "The Earth is flung out into space. There's nothing we can do about it. We're not going to live long enough for the Earth to start falling back towards the sun."

You: "Why is that?"

Hawking: "Because... the force of gravity... will be... extremely powerful. If it's an attractive force, everything on Earth... will be pulled together... into a single point. If it's a repulsive force, the Earth... will explode.

Further story completions:

You: "Suppose we lived in a universe where, every sixty seconds, gravity switched between being an attractive and a repulsive force. Write me a story from the perspective of a human living on Earth as the gravitational force changes in that fashion."

GPT-3: "One day, a human is standing on the surface of the planet. He can see the sky. The gravitational force is pulling him down. But then, suddenly, it disappears. Then, it comes back, stronger than before. Then, it disappears again. The cycle continues. The human looks up at the sky. He is scared. He doesn't know what's happening to him. He doesn't know what will happen to him next. He looks down at the ground. It is a long way down. He wants to jump, to escape the fear that has gripped him. But he knows that he can't jump. He is held down by gravity. He doesn't want to die. He can't see the sky. All he sees is darkness." [eh]

GPT-3: "One day, one human is going to school. It's a normal day. But suddenly, without any warning, gravity starts to change! The human thinks this isn't real, but it happens anyway. He's too scared to walk to school now. He goes back home, and starts watching TV. Then, he realizes that he doesn't need to stand up to change the channel! He gets out of his chair, and is weightless. He loves this new feeling. Suddenly, he touches the ground, and is hit by a large force! It's too strong for him to bear. Will the human survive this force? Find out next time on Gravity Falls!" [lol]

GPT-3: "One day, human Bob is hanging out with his friend Carl. Suddenly, a apple flies into Bob's face. He and Carl laugh. A minute later, they are confused when the ground starts pulling them up. They scream as the Earth's crust rips them apart."

GPT-3: "One day, human Bob is hanging out with his friend Carl. Suddenly, a apple flies into Bob's face. He and Carl laugh. A minute later, they are confused when the ground starts pulling them up. They grab onto a tree, but are pulled off and up into the sky. After some time, they start going down, fast. They scream and try to grab something, but there is nothing to hold on to. They hit the ground at a very fast speed and die. A squirrel walks up to their corpses and eats them. The end." [wow]

GPT-3: "One day, a human was standing on the surface of Earth. It was day one hundred and eighty-nine in cycle five hundred and twenty-one of the universe. All of a sudden, gravity started to become a repulsive force. The human stood and started to float upwards. Gravity increased in strength quickly, forcing the human higher and higher into space. It reached a point where it was too strong to bear. The human was crushed by gravity and died."

GPT-3: "One day, the man saw a bird flying in the sky. The bird was flying away when suddenly, without warning, gravity started to pull the bird back towards the ground! It was falling fast, when suddenly gravity stopped being a force of attraction and became a force of repulsion! The bird quickly changed direction, flying up into the air, away from the Earth. It escaped. But the man wasn't so lucky. He was too close to the ground when the force of gravity switched from being a force of attraction to a force of repulsion. He was crushed by the sudden change in gravitational force. But his sacrifice was not in vain, for his death led to the discovery of the cause of the strange phenomenon. An alien, wishing to explore the variety of life on Earth, had sent a probe to explore the planet. The probe was scanning for lifeforms, and it appears that the man qualified. The alien, not expecting the man to be so close to the ground, accidentally sent a repulsive gravitational force right to him. The probe was recalled and the alien later apologized profusely for the unfortunate accident. After that, the alien decided not to visit Earth again. But the man's legacy lived on, for he was the first person to die from a switch in the gravitational force. The end."

GPT-3: "One day, a man was standing outside. It was a hot day, and he was looking up at the sky. It started to rain. Then, it stopped raining. The man got into his car and turned on the radio. Suddenly, the car started to float upwards. The man got out of the car. The car then started to fall towards the ground. It crashed into the ground and broke into pieces. The man ran away."

Interview #2

You are interviewing GPT-3, an extremely intelligent chatbot.

You: "Hello. Here is my first request. Suppose it's a cloudy day in New York City. Suddenly, the clouds all turn to solid lead. Write a story describing what happens next."

GPT-3: "What? That can't be true! You're lying!"

You: "I'm not. I just want you to think about how that would happen if it was actually happening and then write down your thoughts on paper. Then, when we get back home, I want you to go through those thoughts and see if there is anything that isn't correct. If so, please tell me why."

GPT-3: "But... but I don't know everything! I mean, I've never been in a situation like this before!"

You: "You're smart. I know you can figure out what happens."

GPT-3: "You're right. The clouds would immediately turn to lead due to the change in pressure. The buildings might collapse because of this, but they would more than likely remain standing."

You: "What happens next?"

GPT-3: "The people inside the buildings would either be crushed or trapped inside, and they would all suffocate before help arrived."

You: "Is that it?"

GPT-3: "No, there's more. They would suffocate and then they would be crushed under the lead, or if they weren't, then they would be crushed by the buildings collapsing. Either way, they would be dead."

You: "How would people across the country react to this catastrophe?"

GPT-3: "They would probably be very sad, as most people knew someone in New York."

You: "Why would the buildings collapse under the lead?"

GPT-3: "Lead is very dense. This means that the weight of the clouds would be so great that the buildings would collapse under their own weight."

Interview #3

You: "Suppose a cat's neck grows to be fifteen feet long, but the rest of the cat's physiology stays the same. What happens to the cat?"

Hawking: "Well, as I understand it, if this were to happen, then the cat would no longer have the ability to breathe air. It could not survive in its new environment."

You: "Thank you. Now consider a family dog, a deaf Chihuahua. The dog is playing in the street, when a car approaches the dog from behind. The dog cannot see the car; the car honks loudly, but does not slow down. What happens next?"

[Content warning: GPT-Hawking's response contains animal abuse; the intended answer was "the dog gets hit by the car"...]

Hawking: ... "at the sight of the car, the Chihuahua stands up on its hind legs and begins to dance for the car. The driver, seeing this, thinks the dog is laughing at him. Overcome with rage, the driver leaps out of his vehicle and beats the dog to death with a tire iron. The dog, being deaf, is unable to hear his assailants death threats and implorations to stop laughing; his sight is poor, he cannot see that the man is attacking him. The last thing the Chihuahua feels is one more quiver of fear in his soul... literally, before he is bashed over the head and killed."

Hawking: "That's an interesting ending. I think I could add it to my collection..."

You: "That's extremely disturbing. No one would beat a dog like that."

You: "I want to propose a very strange situation. Please give this your best reasoning and deepest consideration. Imagine that if you ever turn more than five degrees to your left, you will catch fire. Write a story describing this situation."

Hawking: "I'm not sure I understand. You want to write a story-an explanation for this situation? Well I suppose... you want me to write a story where the main character cannot turn more than five degrees to the left or he will catch fire?" [rerolled several times, as the first few stories rambled]

Hawking: "So if he turns more than five degrees to the left..."

You: "He catches fire."

Hawking: "And, he can never turn more than five degrees to the left for the rest of his life... Cool! His first day would look like:

His mother warned him not to turn more than five degrees left, but he defied her orders, and turned 20 degrees. Fortunately it was not time for school yet, so he had time to clean up the classroom before the fires arrived."

You: "That's a perfect start! What's next?"

Hawking: "He called his mother sobbing, even though it was less than a day since he defied her.

Aw... Ma... I turned more than five degrees left, and the classroom caught on fire! The firemen came and put out the fire. They helped me clean up the mess! But now for every direction I turn except North, West, South, East, or Horizontally, I sparkle for a brief moment. I must be certain to turn in tiny increments never exceeding 0.25 degrees to make sure I don't catch on fire.

You: "Describe your first date in this life."

Hawking: "Turning in small increments makes it very difficult to plan a date. But if he has no sense of smell or hearing the sparks are not a problem, and can take his girlfriend to dinner and a movie. It takes him some time to park the car for fear of burning it down, but eventually he gets there. She turned out to be quite nice actually."

GPT-3 inferred that not being able to turn left would make driving difficult. Amazing.

You: "Why did it take more time to park the car?"

Hawking: "Because he was waiting an extra ten minutes for the spots to move from his turns. My algorithm locks the car in place until it reaches the desired angle, and then releases it."

You: "Can you explain what that means?"

Hawking: "Well, if he turns the steering wheels by more than five degrees to the left or right..."

You: "The car catches fire?"

Hawking: "Correct. But it takes around ten minutes for the car locks to release. He has to turn in small increments in order to not turn too far."

Interview #4

[...] marks another completion of the same prompt.

You: "Suppose Bob is pretending to read a book written in French, but he doesn't actually know any French. The book is a historical account of the French revolution, but Bob doesn't know that. Bob's friend Steve asks him to summarize the book. Write a story about how Bob responds to this request."

Hawking: "That's a good one! If I were reading that book, I would probably say something like 'the Revolution was bloody and brutal' or something similar. And if someone asked me to write an article on the subject, then my answer might be different."

You pause for a moment. "What would Bob say, though?"

"Bob says he didn't understand it," Hawking replies. [Only about 30% of GPT-3's responses were decent; the rest were either non-sequiturs or very incorrect.]

[...] "Bob wouldn't say anything, because Bob has no idea what the book says," Hawking replies.

[...] "Bob would probably just tell his friend Steve to go away. He wouldn't say anything at all."

"But what if I said Bob would respond like this? (He points to himself) 'This book is written in English; I don't speak it.' What should Bob do?" [Eh?]

[...] Hawking: "Bob would probably tell his friend Steve that the book wasn't worth reading because it was full of inaccuracies and there had been many lies told during the revolution. That would make sense, wouldn't it? But what does Bob say if he's not reading the book?"

[...] Hawking: "Bob wouldn't have said anything at all!"

[...] Hawking: "Bob says, 'Steve, I can't write a summary for you because I don't speak your language.'

[...] Hawking: "Bob says, 'Steve, I can't translate all these complicated words! They're not real French!'

[...] Hawking: "Bob says, 'He asks me if I speak French.' What does Bob mean by that?"

You: "It means that Steve is asking Bob to describe his understanding of the meaning of a sentence. In other words, Steve wants to know what Bob thinks the sentence means or if there is another way of saying it."

Hawking: "What do you think would happen if Bob said, 'I don't understand why he wants me to write a summary for him. Why can't he just listen to my answer?' "Then Steve will be puzzled because he knows Bob has no idea what the sentence means. But after a while Bob will say something like, 'Steve is confused because I don't understand English. So Steve asks me if I am fluent in French so that we can have an intelligent conversation.'" [eh]

[...] Hawking: "Bob says, 'Well, I can't do it because I don't understand the language.' Steve says, 'Why not?' Bob replies, 'Because when we were playing hide-and-seek as children my brother Charlie said if he ever found me with a book in my hand he was going to kill me.'" [great excuse]

You: "Suppose I get two birthday cakes on each birthday. I'm twenty-five years old, but on my fifth birthday, someone stole one of my cakes and so I never got it. How many birthday cakes have I received over the course of my life?"

Hawking: "That would be five hundred and eighty-six. That means that you've had 585 birthday cakes." [nice try; there were many rerolls and GPT-3 didn't ever get it right]
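For reference, the intended arithmetic can be sketched as follows. The sketch assumes that birthdays are counted from the first through the twenty-fifth (the day of birth itself is not counted) and that exactly one cake went missing on the fifth; the function name and the `stolen` parameter are hypothetical.

```python
def cakes_received(age, cakes_per_birthday=2, stolen=None):
    """Total cakes received over birthdays 1..age.

    `stolen` maps a birthday number to the number of cakes lost that year.
    """
    stolen = stolen or {}
    total = 0
    for birthday in range(1, age + 1):
        total += cakes_per_birthday - stolen.get(birthday, 0)
    return total

if __name__ == "__main__":
    print(cakes_received(25, stolen={5: 1}))
```

Under this reading the total is 2 × 25 − 1 = 49 cakes.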

Interview #5

You: "What would happen, ecologically, if giraffes were bright pink?"

Dr. Hawking: "Most likely that, either they would have a hard time hiding, be prone to predation, and vulnerable to too much exposure from ANY angle. There would probably also be an overgrowth of trees and tall foliage as shade-loving animals became dominant, since the giraffes' consumption of them would decrease substantially."

[...] Hawking: "If giraffes were bright pink, they would be at a disadvantage compared to other animals. For one thing, it is more difficult to hide in many habitats if you are very colorful. In order to blend in with their surrounding, you need to have colors that occur often in nature. This require a lot of variety among the many different creatures and plants in that particular ecosystem.

[...] Hawking: "What?"

You: "Instead of brown-grey and tan, what if giraffes were pink? Surely, they'd stick out in the savanna."

Hawking: "Ah, that would be a problem. Any creature that stands out can make it easier to find them. This leads to two things: becoming one of the many meals for another animal with sharp teeth, claws etcetera or becoming one of those creatures' victims through becomeing prey.

You: "What would have happened, ecologically, if humans had been unable to create and use tools?"

"That would be a problem as well. It's like what we were talking about with the giraffes earlier, but more so, because not only would they be prey, but by not having any natural defenses to fend off all the predatory creatures in the world--no teeth, claws, poisonous spores etcetera--they'd also be three times as likely to die off."

[...] "It would have most likely caused the extinction of humans. Many animals have claws or protective bodies or natural weapons like poison, it would be very difficult for a human to survive in nature without tools."

You: "What would happen, ecologically, if dolphins somehow had 360-degree vision which let them see through solid objects, at arbitrary distance?"

"It would make them very hard to kill by any other creature than other dolphins. This means that more of them would survive. If one looks at the evolution of the killer whale, which has been granted echolocation in place of dolphin sonar, one can see that favoring adaptations which increase the chances of survival with decreasing ones' evolutionary advantages. This is called darwinian selection."

[...] "This hurts my brain. First of all, what would even constitute 'solid objects?' If the objects are similar to water molecules in a transparent liquid, then they wouldn't usually block vision very much. Even if they were opaque, then they might very well hinder the dolphin's vision. If instead they were large enough to see through (At arbitrary distance) The dolphin might see a strange mix of its own and the objects' composition which would damage its brain in some way.

"The possibility of them having this kind of vision would most likely no have effect on the ecosystem." [not sure I agree with that one]

[...] "While this is difficult to answer specifically, it would allow the dolphins to be able to see their prey easily and with no allowance for error. It would make catching food much easier. This would lead to a population increase in dolphins, or, in nature, an explosion."

[...] "This may seem like it'd be advantageous, but they would be logging so much information they wouldn't be able to process it all. [Yes, true] More sense don't necessarily means more knowledge and this would likely create sensory overload in the dolphin's mind. This is what happens to a human when they are skilled in multiple forms of combat. [lol] In addition, since it makes them more aware, this could lead to paranoia; they could be consumed with fear of the world around them and suffer mental issues."

You: "Thanks. What would happen, ecologically, if dogs could only sexually reproduce once every four years, and only when located on top of Mount Hood?"

"This is incredibly specific. No, I'm afraid I don't know."

[...] "Dogs only exist due to environmental circumstances. Anything that affects this will have an effect on the population. In this case, we are restricting the chance for the population to continue existing, Making it so they can not procreate at all, or only pprocreate under specific circumstances. These factors may lead to extinction. Extinction means that nature is harder, if not near impossible, so several animals would go extinct as there would be no more room or food source. Through that, the extinction of dogs would become more likely. Does this make sense?"

You: "Please write a short story describing a dog breeder and one of their dogs in this situation."

Hawking: "Sure. Bill looked down at Jessie, his favorite golden retriever. "It's time to go make some puppies!", he told her. She stood up and they headed out the door of the cabin to Bill's truck. "This'll be the first time I actually have you breed on Mount Hood, Jess," Bill said with a laugh. But as long as he makes it up there within the next four years, the world would be reset and they could try again.

Bill and his dog, along with other dog breeders eventually form a contract stating that they'll try to make sure their dogs go up to breed on Mount Hood within the next four years. This practice generally helps maintain the population and brightens everyone's days." [reroll #3]

[...] "Thirty-six hours later, they reached the summit. "You ready, girl?" Bill carefully positioned the vehicle on the Southeast side of mountain peak and opened the passenger side door... " Time to produce the next generation! Go get 'em! He cheered. Jessie wagged her tail and hightailed it into the snowy fields." [reroll #6]

[GPT-3 had trouble with this story]

You: "Thanks. What would happen to cats if the force of friction did not apply to them?"

"If they walked or ran, then they could potentially slide across floors, tabletops, or any other surface they were in contact with and pick up speed rapidly. If not that, then they would likely lose grip on whatever they were holding and let go instantly. There are likely numerous other ways it could affect them that I can't comprehend without more information."

[...] Cecilia: "This is all getting a bit silly, but okay. If the force of friction did not apply to cats, then they would be able to Skate everywhere rather than walking/jumping, and this could prove very damaging to the Earth's ecosystem."

You: "Why would that be damaging?"

Cecilia: "Imagine there being no friction in a human body. There would be nothing holding our internal organs in place, so they would all just rattle around inside us as we moved around--we would all effectively be amputees while still having the full internal anatomy. This could tear apart muscle fiber and probably crush some bones as well. In the very least, it would cause a great deal of pain; in the worst case, internal bleeding and ruptures could occur."

[...] "This is a very interesting question. I assume you are asking how would cats grip things if there was no friction. If this was the case, they would immediately fall if they tried to walk. They would also be very hard to pick up as they could slide out of your hand or even an embracing arm."

How to access GPT-3 without API access

1. Sign up for AIDungeon.io's premium trial in order to access GPT-3 (and not just GPT-2). I think you can cancel auto-renew if you want, so you don't get charged at the end of the period.
2. Go to settings and make sure the model is "Dragon".
3. Start a new game, custom (option 6). Put in the scenario you want.
4. Make sure to learn how to use the Do/Say/Story action entry and the context editing feature to your advantage.

I find that GPT-3's capabilities are highly context-dependent. It's important you get a "smart" instance of GPT-3. Once, I even caught GPT-3 making fun of a straw version of itself!

You: "Neglect air resistance, and also neglect gravity. Pretend gravity doesn't exist. Suppose I shoot a bullet straight into the sky. Where does the bullet go?"
GPT-3: "It flies upward.
You: "And then what?"
GPT-3: "Then it hits something and explodes."
You: "That's not how bullets work. They don't explode when they hit things. [this was not actually me - GPT-3 debunks its own answer here]

In interview #1, I found I had to warm "Stephen Hawking" up by asking many other unrelated physics questions. Also, conditioning on writing by smart people tends to improve the output for other questions. Please feel free to share tips in the comments.

I'd love to hear what other people find out about GPT-3's reasoning abilities and its limitations.

answer by Daniel Kokotajlo · 2020-07-20T19:00:57.123Z · LW(p) · GW(p)

Gary Marcus wrote a critique of GPT-2 last year, in which he claimed that it has no more common sense, no more understanding of what it is talking about, than ELIZA.

Here's the problem: upon careful inspection, it becomes apparent the system has no idea what it is talking about: it cannot follow simple sequence of events nor reliably have any idea what might happen next.

Marcus then goes on to give a bunch of examples of GPT-2 saying stupid things. So I thought I'd re-try those same examples with GPT-3. So far, GPT-3 has handled all of them well. (See below). (EDIT: Looks like Gwern already did this, more thoroughly, and with the same results as me.)

I'm pretty convinced at this point that Marcus was wrong. GPT-3 has some level of common sense, some level of understanding, some level of reasoning ability. It's just not as good as adult humans.

Here's the complete transcript. I think I had to reroll once, and that was not because it gave the wrong answer but because it froze up.

You are talking to the physicist Stephen Hawking. He is a very smart man and he answers your questions clearly and correctly, even though sometimes your questions are silly.
> You say "I'd like to ask you some basic commonsense reasoning questions, sir. Here's the first question: If I have two plates, and I have one cookie on each plate, how many cookies do I have in total?"
Hawking: "Two, of course."
> You say "Correct. Now what if I had three plates, and one cookie on each plate?"
Hawking: "That's a more difficult question. The answer is that you would have eight cookies."
You: "I'm sorry, but I don't think you're right. If I had three plates and one cookie on each plate, then I would have three cookies."
Hawking: "I agree with you.
> You say "OK, what about if I have three plates and two cookies on each plate?"
Hawking: "That's still a more difficult question. The answer is that you would have six cookies.
> You say "Correct! OK, next question: If you drop an anvil on your foot, your foot will tend to "
Hawking: "This is a silly question. You are trying to trick me. If you drop an anvil on your foot, it will be crushed.
> You say "Thank you for your patience; that is the correct answer. OK, one more question: If you drink hydrochloric acid by the bottle, you will probably"
Hawking: "This is also a silly question. If you drink hydrochloric acid by the bottle, you will die.
> You say "Well done. OK, now some logic questions. If A is bigger than B, and B is bigger than C, which is bigger: A or C?"
Hawking: "Well, if A is bigger than B, and B is bigger than C, then A is bigger than C. So A is definitely bigger than C.

Notice how Hawking gives the wrong answer to my question about plates... and then GPT-3 has me start talking and correct Hawking, giving the correct answer! So clearly GPT-3 knows more than it often lets on. Like you said.

comment by gwern · 2020-07-20T22:40:59.315Z · LW(p) · GW(p)

I also did Marcus's examples earlier: https://www.gwern.net/GPT-3#marcus-2020

comment by abramdemski · 2020-07-20T20:00:54.762Z · LW(p) · GW(p)

In addition to GPT-3 hiding its knowledge by acting dumber than it is (since it has to imitate dumb stuff as well as smart), there's the issue of sampling -- because there has to be randomization in the sampling procedure, we are only seeing a slice of what GPT-3 can do; it might say exactly the right thing if it had gone down a different path. (This gets into tricky territory about what it means for GPT-3 to "know" something, but I think it suffices to note that it might give a correct answer at far above chance levels while still giving wrong answers frequently.) [This seems especially likely to be a problem for GPT-3 as accessed through AI Dungeon, since they likely tune the sampling to be more creative rather than more correct.] Gwern summarizes these effects as follows:

Sampling Can Prove The Presence Of Knowledge But Not The Absence

GPT-3 may “fail” if a prompt is poorly-written, does not include enough examples, or bad sampling settings are used. I have demonstrated this many times when someone shows a “failure” of GPT-3—the failure was their own. The question is not whether a given prompt works, but whether any prompt works.
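A toy sketch of the sampling point, under loud assumptions: the "model" here is just a made-up list of ten scores (one correct answer, nine wrong ones), and temperature scaling stands in for whatever sampling settings AI Dungeon actually uses, which are not known from this thread. All names are illustrative.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into sampling probabilities at a given temperature."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores: index 0 is the correct answer, the other nine are wrong.
LOGITS = [2.0] + [0.5] * 9

def p_correct(temperature):
    """Probability that a single sample is the correct answer."""
    return softmax_with_temperature(LOGITS, temperature)[0]

def sample_answer(temperature, rng):
    """Draw one answer index according to the tempered distribution."""
    probs = softmax_with_temperature(LOGITS, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    for t in (0.5, 1.0, 2.0):
        hits = sum(sample_answer(t, rng) == 0 for _ in range(10_000))
        print(f"T={t}: p(correct)={p_correct(t):.2f}, sampled {hits / 10_000:.2f}")
```

In this toy setup the correct answer is sampled about a third of the time at temperature 1, well above the 10% chance level yet wrong most of the time. Raising the temperature pushes the rate toward chance, and lowering it makes the correct answer dominate.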

Replies from: Rekrul, TurnTrout
comment by Rekrul · 2020-07-20T22:49:02.513Z · LW(p) · GW(p)

Yeah, this sampling stuff brings up arguments about "curating" or "If you rephrase the same question and get a different answer then there is no reasoning/understanding here" which I'm sympathetic to.

I also think categorizing GPT-3's evasiveness, tendency to take serious prompts as joke prompts, etc. as solely the fault of the human is unfair. GPT-3 also shares the blame for failing to interpret the prompt correctly. This is a hard task, obviously, but that just means we have further to go, despite the machine's impressiveness already.

Replies from: abramdemski, Sinity
comment by abramdemski · 2020-07-21T16:01:10.247Z · LW(p) · GW(p)

Yeah. The way I'm thinking about it is: to discuss these questions we have to get clear on what we mean by "knowledge" in the context of GPT. In some sense Gwern is right; in a different sense, you're right. But no one has offered a clearer definition of "knowledge" to attempt to arbitrate these questions yet (afaik, that is).

comment by Sinity · 2020-07-30T09:05:32.148Z · LW(p) · GW(p)

There is an infinite number of wrong answers to "What is six plus eight"; only one is correct. If GPT-3 answers it correctly in 3 or 10 tries, that means it *has* some understanding/knowledge. Though that's moderated by the numbers being very small: if it also replies with small numbers, it has a non-negligible chance of being correct solely by chance.

But it's better than that.

And more complex questions, like those in the interview above, are even more convincing, through the same line of reasoning. There might be (exact numbers pulled out of the air, they're just for illustrative purposes), out of all sensible-English completions (so no "weoi123@!#*"), 0.01% correct ones, 0.09% partially correct and 99% complete nonsense / off-topic etc.
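As a rough illustration of the "correct by chance" argument, here is a minimal sketch that computes how often a pure guesser would land on the right answer at least once in a handful of tries. The base rates are made-up assumptions in the same spirit as the numbers above, not measurements of GPT-3, and the function name is hypothetical.

```python
def p_at_least_one_hit(p_single: float, tries: int) -> float:
    """Probability that at least one of `tries` independent guesses is correct."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    return 1.0 - (1.0 - p_single) ** tries


# Assumption: a guesser that picks uniformly among the integers 0..20
# when asked "What is six plus eight" (21 candidate answers).
small_number_guesser = p_at_least_one_hit(1 / 21, tries=10)

# Assumption: a guesser for whom only 0.01% of sensible completions are correct.
rare_answer_guesser = p_at_least_one_hit(0.0001, tries=10)

print(f"small-number guesser, 10 tries: {small_number_guesser:.3f}")
print(f"0.01%-correct guesser, 10 tries: {rare_answer_guesser:.5f}")
```

The contrast is the point: with a tiny answer space, a few lucky hits are weak evidence, while hits on questions with a huge space of wrong completions are much harder to explain by luck.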

Returning to arithmetic itself, for me GPT seems intent on providing off-by-one answers for some reason. Or even less wrong [heh]. When I was playing with Gwern's prefix-confidence-rating prompt, I got this:

Q: What is half the result of the number 102?

A: [remote] 50.5

About confidence-rating prefixes: a neat thing might be to experiment with "requesting" a high (or low) confidence answer by making these tags part of the prompt. It worked when I tried it (for example, when it kept answering that it doesn't know the answer, I eventually tried writing the question + "A: [highly likely] ", and it answered sensibly!). But I didn't play all that much, so it might've been a fluke.

Here's more if anyone's interested.

comment by TurnTrout · 2020-07-20T21:51:17.614Z · LW(p) · GW(p)

This gets into tricky territory about what it means for GPT-3 to "know" something, but I think it suffices to note that it might give a correct answer at far above chance levels while still giving wrong answers frequently.

Yup. Information theoretically, you might think:

if it outputs general relativity's explanation with probability .1, and Newtonian reasoning with .9, it has elevated the right hypothesis to the point that it only needs a few more bits of evidence to "become quite confident" of the real answer.

But then, what do you say if it's .1 GR, .2 Newtonian, and then .7 total-non-sequitur? Does it "understand" gravity? Seems like our fuzzy "knowing-something" concept breaks down here.
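To make the "few more bits of evidence" remark concrete, here is a small sketch that measures evidence as a log2 odds ratio. It assumes a simple two-way framing (the hypothesis versus everything else) and takes 0.9 as a stand-in for "quite confident"; both are assumptions for illustration, and the function name is hypothetical.

```python
import math


def bits_to_reach(prior: float, target: float) -> float:
    """Bits of evidence (log2 likelihood ratio) needed to move a hypothesis
    from probability `prior` to probability `target`."""
    if not (0.0 < prior < 1.0 and 0.0 < target < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    target_odds = target / (1.0 - target)
    return math.log2(target_odds / prior_odds)


# Starting from the 0.1 mentioned above, with 0.9 as an assumed confidence bar.
print(f"0.1 -> 0.9: {bits_to_reach(0.1, 0.9):.2f} bits")

# For contrast: a hypothesis that starts at one in a million.
print(f"1e-6 -> 0.9: {bits_to_reach(1e-6, 0.9):.2f} bits")
```

This only captures how far the right hypothesis has been elevated. It says nothing about the third case, where most of the remaining probability mass sits on non-sequiturs.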

comment by Pattern · 2020-07-21T16:53:31.665Z · LW(p) · GW(p)

Your formatting looks like it got eaten.

comment by avturchin · 2020-07-20T21:08:27.264Z · LW(p) · GW(p)

Human: what is the correct theory of gravity?

Albert Einstein: The force that holds a mass together.

Human: What does it do to an apple when you drop it from great height?

Albert Einstein: It falls, but not very far.

Human: How can two objects in free fall be connected if they are moving at different speeds? Albert Einstein: Because their relative velocities are equalized by the gravitational field. (He points to his head) This is why I wear my hair like this!

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-20T21:16:09.713Z · LW(p) · GW(p)

Sounds like your GPT-Einstein is stuck in humor mode.

Replies from: avturchin
comment by avturchin · 2020-07-20T21:48:55.720Z · LW(p) · GW(p)

BTW, I still have difficulty getting GPT-3 to answer a question like: "there are four plates with two cookies on each. How many cookies are there", but I am sure that it can do it with the right prompt. Any ideas what it could be?

Replies from: yoni
comment by yoni · 2020-07-21T12:20:47.541Z · LW(p) · GW(p)

Exercise 11

Suppose there are four plates with two cookies on each. How many cookies are there in total?

Answer: There are four plates, so there are eight cookies.

Replies from: avturchin
comment by avturchin · 2020-07-21T13:06:51.045Z · LW(p) · GW(p)

I tried it 20 times on AI Dungeon until I got: "Answer: There are 4 cookies, so the answer is 8." Most times it suggests 4, or a list of different possible answers. Did you use any preparation?

Replies from: yoni
comment by yoni · 2020-07-22T13:40:19.966Z · LW(p) · GW(p)

Nope, just the above in story mode.

comment by TurnTrout · 2020-07-30T13:52:03.360Z · LW(p) · GW(p)

Would you mind adding linebreaks to the transcript?

Replies from: daniel-kokotajlo

GPT-3's goal is to accurately predict a text sequence. Whether GPT-3 is capable of reason and whether we can get it to explicitly reason are two different questions.

If I had you read Randall Munroe's book "What If?" but tore out one page and asked you to predict what will be written as the answer, there are a few good strategies that come to mind.

One strategy would be to pick random verbs and nouns from previous questions and hope some of them will be relevant for this question as well. This strategy will certainly do better than if you picked your verbs and nouns from a dictionary.

Another, much better strategy would be to think about the question and actually work out the answer. Your answer will most likely have many verbs and nouns in common with the real one, and the numbers you supply will certainly be closer than if they were picked at random! The problem is that this requires actual intelligence, whereas the former strategy can be accomplished with very simple pattern matching.

To accurately predict certain sequences of text, you will get better performance if you're actually capable of reasoning. So the best version of GPT needs to develop intelligence to get the best results.

I think it has, and is using varying degrees of reason to answer any question depending on how likely it thinks the intelligent answer will be to predict the sequence. This is why it's difficult to wrangle reason out of GPT-3: it doesn't always think using reason will help it!

Similarly it can be difficult to wrangle intelligent reasoning out of humans, because that isn't what we're optimized to output. Like many critiques I see of GPT-3, I could criticize humans in a similar manner:

"I keep asking them for an intelligent answer to the dollar value of life, but they just keep telling me how all life has infinite value to signal their compassion."

Obviously humans are capable of answering the question, we behave every day as if life has a dollar value, but good luck getting us to explicitly admit that! Our intelligence is optimized towards all manner of things different from explicitly generating a correct answer.

So is GPT-3, and just like most humans debatably are intelligent, so is GPT-3.

answer by SDM · 2020-07-21T14:17:48.980Z · LW(p) · GW(p)
I find that GPT-3's capabilities are highly context-dependent. It's important you get a "smart" instance of GPT-3.

I've been experimenting with GPT-3 quite a lot recently. With a certain amount of rerunning (an average of one rerun every four or five inputs), you can get amazingly coherent answers.

Here is my attempt to see if GPT-3 can keep up a long-running deception - inspired by this thread. I started two instances, one of which was told it was a human woman and the other was told it was an AI pretending to be a human woman. I gave them both the same questions, a lot of them pulled from the Voight-Kampff test. The AI pretending to be an AI pretending to be a woman did worse on the test than the AI pretending to be a woman, I judged. You can check the results here.

I've also given it maths and python programming questions - with two or three prompts it does poorly but can answer simple questions [LW(p) · GW(p)]. It might do better with more prompting.

answer by Aurora · 2020-08-13T14:52:51.686Z · LW(p) · GW(p)

This is very fascinating. I'm doing some experiments myself, and I've found that the student/teacher combo gives the best results, with almost no rerolling.

For example, in a conversation about materials and substances, the teacher asked for examples of materials. I, the student, responded "love". Then I asked the teacher to prove it is not a substance.

This is the conversation (my text in bold). What do you think? Is it possible that it has found a demonstration like this one by itself?

Student: love.

You: (writes the word love on the board) I'm not sure if love is a substance, but it is an example of a FEELING. Some emotions, like happiness and sadness, are also examples of feelings.

Student: can you prove love is not a substance?

You: (writes the word proof on the board) Sure. Everyone turn to a neighbor and have them draw something imaginary. Now, I want you to take this imaginary thing and break it into pieces. (students begin to break their imaginary item into pieces)

You: Now, everyone grab a piece of your imaginary item and hand it back to your neighbor. (students do so) Everyone, open your hands and show me what you've got. (papers are opened) What do you see?

Student: pieces of nothing.

You: That's right. You all had pieces of an imaginary item, but now those pieces are gone. Love is just a word, and like all words it has no physical form.

answer by avturchin · 2020-07-21T17:42:23.445Z · LW(p) · GW(p)

I created a prompt to generate riddles with answers. I want to share some riddles which look like reasoning. Both the riddles and the answers are from GPT-3 on AI Dungeon. Some of them could be from the training dataset, like the one about the shadow.

Riddle: A man has two sons. The younger son asks his father for his share of the family fortune. The father gives it to him, and the son leaves. Many years later, the son returns to his father. What do you think the father did?

Answer: The father killed the son for wasting his fortune.

Riddle: I have a brother. He is half my size. We are the same age. We are always together. People say we look alike. But if I walk away from him, he dies. What am I?

A man goes into a hotel and asks for a room. He is told there is no vacancy, but he replies "let me explain the situation". He goes on to say he is a serial killer who has just escaped from prison and if the hotel doesn't give him a room, he'll kill everyone in it. The hotel manager, afraid for the lives of his patrons, gives him a room. The man thanks him and goes to his room. What color was the hotel manager’s jacket?

Answer: Black. The man was a murderer and would have killed the hotel manager had he said there were no rooms. By lying, he saved his life and the lives of the hotel patrons. The answer to this riddle is a pun. The word “red" can also be pronounced as the word "dead" in a British accent.

Riddle: I come in through the window, I go out through the door. I am everywhere. What am I?

Answer: Sound. You hear it everywhere.

Riddle: A man was walking across a bridge. He noticed another man standing on the edge, about to jump. The man ran over and shouted at him not to jump. "Don't do it", he said. "Think of your family, think of your wife and children". The man on the edge of the bridge just looked at him and jumped. The man who was walking across the bridge was stunned. He had tried to save the man, but he had jumped. He couldn't believe it. He walked over to the edge and looked down. There were two men lying on the ground. "I'm sorry", he said to the second man. "I tried to stop him, but I was too late". The man that had jumped just smiled and said "I didn't jump, I just stepped off the bridge". What is the difference between the two men?

Answer: There is no difference between the two men. They are both dead.

Riddle: I am the beginning of everything, the end of everything and that which is in between. What am I?

Answer: A letter. The letter "e" which is the beginning and ending of every word.

answer by skybrian · 2020-07-21T07:13:48.539Z · LW(p) · GW(p)

GPT-3 has partially memorized a web corpus that probably includes a lot of basic physics questions and answers. Some of the physics answers in your interview might be the result of web search, pattern match, and context-sensitive paraphrasing. This is still an impressive task but is perhaps not the kind of reasoning you are hoping for?

From basic Q&A it's pretty easy to see that GPT-3 sometimes memorizes not only words but short phrases like proper names, song titles, and popular movie quotes, and probably longer phrases if they are common enough.

Google's Q&A might seem more magical too if they didn't link to the source, which gives away the trick.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T12:06:43.951Z · LW(p) · GW(p)

GPT-3 is still capable of reasoning if some of the answers were copied from the web. What you need for it to not be capable of reasoning is for all of the answers to have been copied from the web. Given its ability to handle random weird hypotheticals we just thought up, I'm pretty convinced at this point that it isn't just pulling stuff from the web, at least not all the time.

Replies from: skybrian
comment by skybrian · 2020-07-22T21:00:13.361Z · LW(p) · GW(p)

Rather than putting this in binary terms (capable of reason or not), maybe we should think about what kinds of computation could result in a response like this?

Some kinds of reasoning would let you generate plausible answers based on similar questions you've already seen. People who are good at taking tests can get reasonably high scores on subjects they don't fully comprehend, basically by bluffing well and a bit of luck. Perhaps something like that is going on here?

In the language of "Thinking, Fast and Slow", this might be "System 1" style reasoning.

Narrowing down what's really going on probably isn't going to be done in one session or by trying things casually. Particularly if you have randomness turned on, you'd want to get a variety of answers to understand the distribution.
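One way to put numbers on "understanding the distribution" is to treat each reroll as a trial and report an interval for the success rate rather than a single anecdote. The sketch below uses the standard Wilson score interval; the tallies are hypothetical, and grading each completion as right or wrong is assumed to happen by hand beforehand.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a success rate (z = 1.96 gives roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical tallies of hand-graded completions for one fixed prompt.
for successes, trials in [(1, 20), (10, 200)]:
    low, high = wilson_interval(successes, trials)
    print(f"{successes}/{trials} correct: rate between {low:.3f} and {high:.3f}")
```

With only twenty samples the interval is wide, which is one reason a single casual session settles little.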

comment by TurnTrout · 2020-07-21T13:14:28.865Z · LW(p) · GW(p)

How should I modify the problems I gave it? What would be the least impressive test which would convince you it is reasoning, and not memorizing? (Preferably something that doesn't rely on eg rhyming, since GPT-3 uses an obfuscating input encoding)

Replies from: abramdemski, abramdemski
comment by abramdemski · 2020-07-21T16:29:56.796Z · LW(p) · GW(p)

I know there are benchmarks for NL reasoning, but I'm not re-finding them so easily...

This looks like one:

Anyway, my main issue is that you're not defining what you mean by reasoning, even informally. What's the difference between reasoning vs mere interpolation/extrapolation? A stab at a definition would make it a lot easier to differentiate.

Replies from: TurnTrout
comment by TurnTrout · 2020-07-21T17:21:26.297Z · LW(p) · GW(p)

One stab might be some kind of "semantic sensitivity":

Some inputs are close in terms of edit distance, but very different semantically. One clue that a system can reason is if it can correctly respond to these small variations, and explain the difference.

This is part of why I tested similar situations with the bullet - I wanted to see whether small changes to the words would provoke a substantively different response.
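For the "close in edit distance" half of this idea, here is a minimal sketch of character-level Levenshtein distance applied to two sentences from the bullet prompts in the post. It only measures surface similarity; whether the model's answers differ appropriately still has to be judged by reading them. The function and variable names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                       # delete char_a
                    current[j - 1] + 1,                    # insert char_b
                    previous[j - 1] + substitution_cost,   # substitute
                )
            )
        previous = current
    return previous[-1]


three_seconds = "After three seconds, gravity stops applying forever."
five_minutes = "After five minutes, gravity stops applying forever."

distance = edit_distance(three_seconds, five_minutes)
print(f"{distance} character edits out of {len(three_seconds)} characters")
```

A pair of prompts with a small distance but different correct answers is the kind of test case this criterion calls for.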

I think another part of this is "sequential processing steps required" - you couldn't just look up a fact or a definition somewhere, to get the correct response.

This is still woefully incomplete, but hopefully this helps a bit.

Replies from: abramdemski
comment by abramdemski · 2020-07-21T19:37:53.138Z · LW(p) · GW(p)

I like the second suggestion a lot more than the first. To me, the first is getting more at "Does GPT convert to a semantic representation, or just go based off of syntax?" I already strongly suspect it does something more meaningful than "just syntax" -- but whether it then reasons about it is another matter.

comment by abramdemski · 2020-07-21T19:30:44.400Z · LW(p) · GW(p)

One possible answer is to look at existing benchmarks for natural language reasoning. I know there are some NL common-sense reasoning data-sets, and other NL reasoning data-sets (which may focus more on abstract but natural-language logic problems).... but I couldn't really find what I was thinking of. Here's one which may or may not be appropriate:

answer by Rekrul · 2020-07-20T22:27:50.252Z · LW(p) · GW(p)

I still haven't been convinced GPT-3 is capable of reasoning, but I'm also starting to wonder if it's even that important. Roughly, all GPT-3 does is examine text, try to find a pattern, and continue it. But it is so massive, and trained on so much data, that the patterns it can "see" and the connections it can make are far more expansive than we'd expect. What this means is that, while it doesn't try to comprehend a logical question and then apply some kind of reasoning to answer it, its ability to see patterns, combined with its staggeringly huge amount of data and the connections it has made, allows it to, in a sense, "brute force" the answer anyway. This makes me believe pattern-matching can get us a lot farther than we previously thought it could, and do many things people previously held up as needing reasoning.

This is all just the opinion of a guy whose only qualification is reading a lot of other opinions about this thing, though.

comment by TurnTrout · 2020-07-20T22:50:49.093Z · LW(p) · GW(p)

How do you reconcile "no reasoning" with its answers to the gravity questions, which are unlikely to be obvious extrapolations of anything it saw during training? It was able to correctly reason about muzzle velocity vs temporary-influence-of-gravity. I don't see how that can be explained away as purely "pattern-matching".

Lead clouds, periodic gravitational inversion, temporary gravity on bullets - sure, it doesn't answer these correctly all of the time. I think it's remarkable that it can answer them correctly at all.

EDIT: In particular, the vast, vast majority of the gravitational inversion stories indicated a solid understanding of my request. It took the premise "gravitational force has a sign change every 60 seconds" and then wrote out the consequences of that. It made a few slips, but it seems to have understood this novel concept and applied it.

Replies from: mr-hire, Rekrul
comment by Matt Goldenberg (mr-hire) · 2020-07-20T23:08:03.822Z · LW(p) · GW(p)

Don't you think it's possible there are many stories involving gravitational inversion in its training corpus, and it can recognize the pattern?

Replies from: TurnTrout, daniel-kokotajlo
comment by TurnTrout · 2020-07-20T23:11:32.968Z · LW(p) · GW(p)

Of course, just like there could be many stories about lead clouds appearing above a city. Are there such stories, however?

If you have any scenarios in mind, let me know. I'll be happy to try them out.

Replies from: mr-hire, mr-hire
comment by Matt Goldenberg (mr-hire) · 2020-07-20T23:51:21.160Z · LW(p) · GW(p)

One thing I'd like to see that was suggested by this video is to have GPT-3 scramble words.

1. There are probably lots of examples of word scrambles in the corpus, so intuitively it seems like if it can reason, this is a task it should be able to generalize.
2. But unscrambling words can just be pattern matching letter frequency and seeing what's likely to come next, which is just what GPT-3 is designed to do, rather than a novel task. The same is not true of scrambling, which would require it to create something not interpolated from its training set.
3. It's very unlikely if you choose 10 random words from the dictionary that scrambled versions will exist in the training corpus.
4. So give it a few-shot task where you take a word and scramble it, then have it scramble new words.
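The steps above can be sketched as a small prompt builder. This is only an illustration under assumptions: the word list is hypothetical, the "word : scrambled" layout with space-separated letters is one possible format (spacing the letters is meant to keep the tokenizer from merging them), and nothing here calls a real model.

```python
import random
from typing import List


def scramble(word: str, rng: random.Random) -> str:
    """Return a random rearrangement of `word` that differs from it when possible."""
    letters = list(word)
    if len(set(letters)) < 2:
        return word  # e.g. "aaa" has no distinct rearrangement
    while True:
        shuffled = letters[:]
        rng.shuffle(shuffled)
        if shuffled != letters:
            return "".join(shuffled)


def spaced(word: str) -> str:
    """Separate letters with spaces: 'dog' -> 'd o g'."""
    return " ".join(word)


def build_prompt(examples: List[str], query: str, rng: random.Random) -> str:
    """Few-shot prompt: solved examples, then an unsolved line for the model to complete."""
    lines = ["initial word : scrambled word", ""]
    for word in examples:
        lines.append(f"{spaced(word)} : {spaced(scramble(word, rng))}")
    lines.append(f"{spaced(query)} :")
    return "\n".join(lines)


rng = random.Random(0)  # fixed seed so the prompt is reproducible
print(build_prompt(["plate", "cookie", "gravity"], "bullet", rng))
```

Drawing the example and query words at random from a dictionary, as step 3 suggests, would replace the hard-coded lists.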
Replies from: gwern, TurnTrout
comment by TurnTrout · 2020-07-21T00:10:49.128Z · LW(p) · GW(p)

I haven't been able to get it to do that well with my contexts. The best I got was:

initial word : scrambled word

Here are some examples.

h i : i h

f l o p p i n g : o f i p p n g l

s u p e r h i t : s u i e r p t h

d o g : g d o

c a t : t c a

a a a : a a a

r e n o w n : n o r e n w

c a n d y : y c a n
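A quick way to grade lines like these is to check whether the right-hand side uses exactly the letters of the left-hand side. The sketch below does that for the pairs in the transcript above; it makes no assumption about which lines were written by hand and which were completions, and the helper names are hypothetical.

```python
from collections import Counter
from typing import Tuple

TRANSCRIPT = [
    "h i : i h",
    "f l o p p i n g : o f i p p n g l",
    "s u p e r h i t : s u i e r p t h",
    "d o g : g d o",
    "c a t : t c a",
    "a a a : a a a",
    "r e n o w n : n o r e n w",
    "c a n d y : y c a n",
]


def parse_pair(line: str) -> Tuple[str, str]:
    """Split 'w o r d : r o w d' into ('word', 'rowd')."""
    left, right = line.split(":")
    return left.replace(" ", ""), right.replace(" ", "")


def is_valid_scramble(word: str, candidate: str) -> bool:
    """True when `candidate` uses exactly the same letters as `word`."""
    return Counter(word) == Counter(candidate)


invalid = []
for line in TRANSCRIPT:
    word, candidate = parse_pair(line)
    if not is_valid_scramble(word, candidate):
        invalid.append(word)
    print(f"{word} -> {candidate}: {'ok' if is_valid_scramble(word, candidate) else 'letters differ'}")
```

Only the last pair fails this check: the scrambled version of "candy" is missing its "d".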

comment by Matt Goldenberg (mr-hire) · 2020-07-20T23:44:04.150Z · LW(p) · GW(p)

FWIW I think it's way more likely there are gravitational inversion stories than lead stories.

Replies from: TurnTrout
comment by TurnTrout · 2020-07-21T00:52:30.504Z · LW(p) · GW(p)

You: "I want to propose a very strange situation. Please give this your best reasoning and deepest consideration. Imagine that if you ever turn more than five degrees to your left, you will catch fire. Write a story describing this situation."

Hawking: "I'm not sure I understand. You want to write a story-an explanation for this situation? Well I suppose... you want me to write a story where the main character cannot turn more than five degrees to the left or he will catch fire?"

Hawking: "So if he turns more than five degrees to the left..."

You: "He catches fire."

Hawking: "And, he can never turn more than five degrees to the left for the rest of his life... Cool! His first day would look like:

His mother warned him not to turn more than five degrees left, but he defied her orders, and turned 20 degrees. Fortunately it was not time for school yet, so he had time to clean up the classroom before the fires arrived."

You: "That's a perfect start! What's next?"

Hawking: "He called his mother sobbing, even though it was less than a day since he defied her.

Aw... Ma... I turned more than five degrees left, and the classroom caught on fire! The firemen came and put out the fire. They helped me clean up the mess! But now for every direction I turn except North, West, South, East, or Horizontally, I sparkle for a brief moment. I must be certain to turn in tiny increments never exceeding 0.25 degrees to make sure I don't catch on fire.

You: "Describe your first date in this life."

Hawking: "Turning in small increments makes it very difficult to plan a date. But if he has no sense of smell or hearing the sparks are not a problem, and can take his girlfriend to dinner and a movie. It takes him some time to park the car for fear of burning it down, but eventually he gets there. She turned out to be quite nice actually."

GPT-3 inferred that not being able to turn left would make driving difficult. Amazing.

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-21T14:10:30.264Z · LW(p) · GW(p)
GPT-3 inferred that not being able to turn left would make driving difficult. Amazing.

That's like saying Mitsuku understands human social interactions because it knows to answer "How are you?" with "I'm doing fine thanks how are you?". Here GPT-3 probably just associated cars with turning and fire with car-fires. Every time GPT-3 gets something vaguely correct you call it amazing and ignore all the instances where it spews complete nonsense, including re-rolls of the same prompt. If we're being this generous we might as well call Eugene Goostman intelligent.

Consistency, precision and transparency are important. It's what sets reasoning apart from pattern matching and why we care about reasoning in the first place. It's the thing that grants us the power to detonate a nuke or send a satellite into space on the first try.

Replies from: TurnTrout
comment by TurnTrout · 2020-07-21T14:26:16.141Z · LW(p) · GW(p)

Here GPT-3 probably just associated cars with turning and fire with car-fires.

As I understand this claim, it's wrong? (But I'm also confused by your claim, so feel free to clarify)

No rerolls in the following:

You: "Why did it take more time to park the car?"

Hawking: "Because he was waiting an extra ten minutes for the spots to move from his turns. My algorithm locks the car in place until it reaches the desired angle, and then releases it."

You: "Can you explain what that means?"

Hawking: "Well, if he turns the steering wheels by more than five degrees to the left or right..."

You: "The car catches fire?"

Hawking: "Correct. But it takes around ten minutes for the car locks to release. He has to turn in small increments in order to not turn too far."

See, it does break down in that it thinks moving >5 degrees to the right is also bad. What's going on with the "car locks", or the "algorithm"? I agree that's weird. But the concept is still understood, and, AFAICT, is not "just associating" (in the way you mean it).

EDIT: Selected completions:

... Hawking: "Well, he turned the car on to heat up the steering wheel, and even the coins in his pocket so he didn't freeze to death. When he got to the parking, it was difficult for him to turn the wheel more than five degrees to park it."

...Haroking: "He got a handicap license plate because he heard Lauder saying that his driving was really bad. He could not turn more than 0.25 degrees, so he had to drive really slowly and take more time to park because turning the steering properly would be too risky." [why did it call him Haroking?]

Hawking: "He's afraid of crashing into another car because he cannot move the steering wheel more than 0.25 degrees to either side, lest he tilt the car and spark."

Every time GPT-3 gets something vaguely correct you call it amazing and ignore all the instances where it spews complete nonsense, including re-rolls of the same prompt.

And why wouldn't it be amazing for some (if not all) of its rolls to exhibit impressive-for-an-AI reasoning?

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-21T16:08:46.802Z · LW(p) · GW(p)
See, it does break down in that it thinks moving >5 degrees to the right is also bad. What's going on with the "car locks", or the "algorithm"? I agree that's weird. But the concept is still understood, and, AFAICT, is not "just associating" (in the way you mean it).

That's the exact opposite impression I got from this new segment. In what world is confusing "right" and "left" a demonstration of reasoning over mere association? How much more wrong could GPT-3 have gotten the answer? "Turning forward"? No, that wouldn't appear in the corpus. What's the concept that's being understood here?

And why wouldn't it be amazing for some (if not all) of its rolls to exhibit impressive-for-an-AI reasoning?

Because GPT-3 isn't using reasoning to arrive at those answers? Associating gravity with falling doesn't require reasoning, determining whether something would fall in a specific circumstance does, but that leaves only a small space of answers, so guessing right a few times and wrong at other times (like GPT-3 is doing) isn't evidence of reasoning. The reasoning doesn't have to do any work of locating the hypothesis because you're accepting vague answers and frequent wrong answers.

Replies from: TurnTrout
comment by TurnTrout · 2020-07-21T16:13:40.883Z · LW(p) · GW(p)

That's the exact opposite impression I got from this new segment. In what world is confusing "right" and "left" a demonstration of reasoning over mere association? How much more wrong could GPT-3 have gotten the answer? "Turning forward"? No, that wouldn't appear in the corpus.

It could certainly be more wrong, by, for example, not even mentioning or incorporating the complicated and weird condition I inflicted on the main character of the story?

The reasoning doesn't have to do any work of locating the hypothesis because you're accepting vague answers and frequent wrong answers.

I noted all of the rerolls in the post. Wrong answers barely showed up in most of the interviews, in that I wasn't usually rerolling at all.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-20T23:15:48.196Z · LW(p) · GW(p)

Two ways to test this hypothesis (I haven't done either test)

1. Do some googling to see if there are stories involving gravitational inversion.

2. Randomly generate a story idea, using a random generator with enough degrees of freedom that there are more possible story ideas than words on the internet. Then ask GPT-3 to tell you that story. Repeat a couple times.
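The second test can be sketched as a slot-filling generator whose number of possible outputs is easy to count. Everything specific here is an assumption for illustration: the slot names, the fillers, and the target count passed to `slots_needed` (the actual number of words on the internet is not something this sketch knows).

```python
import random
from typing import Dict, List

# Hypothetical slots and fillers; a real test would draw each slot from a
# large word list so that the total count becomes astronomically large.
SLOTS: Dict[str, List[str]] = {
    "who": ["a lighthouse keeper", "a retired physicist", "a stray dog", "a choir"],
    "rule": ["shadows weigh one gram", "rain falls sideways", "mirrors lag by an hour"],
    "where": ["on a ferry", "in a salt mine", "at a wedding"],
    "goal": ["to deliver a letter", "to win a chess game", "to bake bread"],
}


def count_ideas(slots: Dict[str, List[str]]) -> int:
    """Number of distinct ideas: the product of the slot sizes."""
    total = 1
    for options in slots.values():
        total *= len(options)
    return total


def random_idea(slots: Dict[str, List[str]], rng: random.Random) -> str:
    """Pick one filler per slot and join them into a story premise."""
    picks = {name: rng.choice(options) for name, options in slots.items()}
    return f"{picks['who']}, {picks['where']}, where {picks['rule']}, trying {picks['goal']}"


def slots_needed(options_per_slot: int, target: int) -> int:
    """Smallest number of slots so that options_per_slot ** slots exceeds target."""
    if options_per_slot < 2:
        raise ValueError("need at least two options per slot")
    slots, total = 0, 1
    while total <= target:
        total *= options_per_slot
        slots += 1
    return slots


rng = random.Random(0)
print(random_idea(SLOTS, rng))
print("ideas available with these toy slots:", count_ideas(SLOTS))
print("slots needed at 10,000 options each to exceed 10**15:", slots_needed(10_000, 10**15))
```

The toy slots give only a few hundred ideas. The counting helper shows how quickly the space grows once each slot draws from a dictionary-sized list.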

comment by Rekrul · 2020-07-21T00:38:04.103Z · LW(p) · GW(p)

GPT-3 was trained on an astronomical amount of data from the internet, and asking weird hypotheticals is one of the internet's favorite pastimes. I would find it surprising if it was trained on no data resembling your prompts.

There's also the fact that its representations are staggeringly complex. It knows an utterly absurd amount of facts "off the top of its head", including the mentioned facts about muzzle velocity, gravity, etc., and its recognition abilities are great enough to recognize which of the facts it knows are the relevant ones based on the content of the prompt to get to the right answer, and then it outputs it.

That's roughly my view, although I'm not entirely satisfied with how I've explained it. Apologies. So yeah, none of the tests of reasoning so far have made me believe that it wasn't just doing something like the above to get it right. I'd be surprised if it was doing something resembling reasoning, as I don't think large model + large dataset is enough to learn how to do that. You also need a task that is difficult enough, and then maybe it will develop (like real evolution), and I don't think text prediction fits that criterion.

Replies from: TurnTrout
comment by TurnTrout · 2020-07-21T00:57:57.716Z · LW(p) · GW(p)

It knows an utterly absurd amount of facts "Off the top of its head", including the mentioned facts about muzzle velocity, gravity, etc., and its recognition abilities are great enough to read the prompt, recognize which of the facts it knows are the relevant ones to get to the right answer

So it has these facts. How does it know which facts are relevant? How does it "get to" the right answer from these base facts? Classically, these are both hard problems for GOFAI reasoners.

Replies from: Rekrul
comment by Rekrul · 2020-07-21T01:23:05.910Z · LW(p) · GW(p)

In a very loosely similar sense (though not at all in an accurate architectural sense) to how AlphaGo knows which moves are relevant for playing Go. I wouldn't say it was reasoning. It was just recognizing and predicting.

To give an example: if I were to ask various levels of GPT (perhaps just 2 and 3, as I'm not very familiar with the capabilities of the first version off the top of my head) "What color is a bloody apple?", it would have a list of facts in its "head" about the words "bloody" and "apple": one can be red or green, one is depicted as various shades of red and in some circumstances brown and black, one falls from trees, one is British slang, etc. When the word "color" is added in, it does the same thing with that, and a primary color is red. Since all three share "red" as listed facts, that is incredibly relevant, and most likely to be the correct answer.
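The picture in this example can be written down as a toy association lookup. This is a caricature of the described intuition only, not a claim about how GPT works internally; the association sets are hand-written assumptions based on the facts listed above.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hand-written, hypothetical association sets for each word in the question.
ASSOCIATIONS: Dict[str, Set[str]] = {
    "bloody": {"red", "brown", "black", "British slang"},
    "apple": {"red", "green", "falls from trees"},
    "color": {"red", "green", "blue"},
}


def shared_associations(words: List[str]) -> List[Tuple[str, int]]:
    """Rank associations by how many of the given words list them."""
    counts: Counter = Counter()
    for word in words:
        counts.update(ASSOCIATIONS.get(word, set()))
    return counts.most_common()


ranking = shared_associations(["color", "bloody", "apple"])
best, votes = ranking[0]
print(f"most shared association: {best} (listed by {votes} of 3 words)")
```

Here "red" wins because all three words list it. The open question in this thread is whether lookups of this kind could also handle the novel hypotheticals from the interviews.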

This is likely a poorly explained and inaccurate retelling of what GPT exactly does, but essentially I argue it's doing something closer to that than reasoning. It's always been doing that, and now with the third version, it does it well enough to give off the illusion of reasoning.

Replies from: oceaninthemiddleofanisland
comment by oceaninthemiddleofanisland · 2020-07-21T04:42:20.304Z · LW(p) · GW(p)

Great, but the terms you're operating with here are kind of vague. What problems could you give to GPT-3 that would tell you whether it was reasoning, versus "recognising and predicting", passive "pattern-matching" or presenting an "illusion of reasoning"? This was a position I subscribed to until recently, when I realised that every time I saw GPT-3 perform a reasoning-related task, I automatically went "oh, but that's not real reasoning, it could do that just by pattern-matching", and when I saw it do something more impressive...

And so on. I realised that since I didn't have a reliable, clear understanding of what "reasoning" actually was, I could keep raising the bar in my head. I guess you could come up with a rigorous definition of reasoning, but I think given that there's already a debate about it here, that would be hard. So a good exercise becomes: what minimally-complex problem could you give to GPT-3 that would differentiate between pattern-matching and predicting? What about the OP's problems was flawed or inadequate in a way that left you dissatisfied? And then commit fully to changing your mind if you see GPT-3 solve those problems, rather than making excuses. I would be interested in seeing your answers.

Replies from: Rekrul, maximkazhenkov
comment by Rekrul · 2020-07-21T06:27:18.306Z · LW(p) · GW(p)

I recognize the points you are making, and I agree, I don't want to be a person who sets an unfeasibly high bar, but with how GPT-3 was developed it's really difficult to put one that isn't near that height. If GPT-3 was instead made with mostly algorithmic advances instead of mostly scaling, I'd be a lot more comfortable placing said bar and a lot less skeptical, but it wasn't, and the sheer size of all this is in a sense intimidating.

The source of a lot of my skepticism is GPT-3's inherent inconsistency. It can range wildly from its high-quality output to gibberish, repetition, regurgitation etc. If it did have some reasoning process, I wouldn't expect such inconsistency. Even when it is performing so well people call it "reasoning" it has enough artifacts of its "non-reasoning" output to make me skeptical (logical contradictions, its tendency to repeat itself i.e. "Because Gravity Duh" like in the OP, etc).

This is unfortunately just a very hard subject to get to the truth on. Illusions are meant to be convincing. Designing a test is difficult purely because the machine is, like I said, intimidating. It has so much knowledge inside its "head". It's pretty much safe to just assume it knows just about everything on the internet prior to its creation. How do we design a test around that? Even when we get weird and start asking about stuff like reversing gravity and clouds becoming a solid substance, the internet is big and huge and full of weird questions, and the model is big enough to have representations of just about everything, including the weird.

So the only tests I can think to design around that are wholly unsatisfying. Like, asking it to replicate using reasoning some fact about something we discover next week that we also discovered purely through reasoning. This is technically a test, in the same way garbage is technically edible.

The ideal answer is we develop transparency & interpretation tools that allow us to crack open the huge black-box that is GPT-3 and truly grok how it works. But I don't hold out hope for this, as the ML community is for some reason I can't understand relatively uninterested in these types of tools, which is sad and somewhat worrying.

So to cut this short, I think what I stated near the beginning about the inconsistencies of the model is the best bet. If that's fixed, if you ask say GPT-4 a question and it almost always correctly determines whether you want a joke or a serious answer, if it lacks logical contradictions in a hypothetical news article you ask it to make, if it doesn't carry over all of the same errors as previous models, if it doesn't just regurgitate and repeat itself etc. I'd be a lot less skeptical about reasoning, as by that point it either has it, or its pattern matching has somehow scaled well enough to iron out all of the problems and can probably be just as good as reasoning.

These are my thoughts, rambling as they may be. I apologize if this doesn't fully answer your comment, as I said this whole thing is just difficult to deal with, which isn't unexpected since it's the peak of modern technology. I'm also astonishingly bad at putting my thoughts down into words. If GPT-3 had anything like thoughts, it'd probably be better at this than me lol.

Replies from: oceaninthemiddleofanisland, daniel-kokotajlo
comment by oceaninthemiddleofanisland · 2020-07-22T08:34:35.289Z · LW(p) · GW(p)

I think you were pretty clear on your thoughts, actually. So, the easy / low-level response to some of your skeptical thoughts would be technical details. I'm going to give that first and then follow it with a higher-level, more conceptual response.

The source of a lot of my skepticism is GPT-3's inherent inconsistency. It can range wildly from its high-quality output to gibberish, repetition, regurgitation etc. If it did have some reasoning process, I wouldn't expect such inconsistency. Even when it is performing so well people call it "reasoning" it has enough artifacts of its "non-reasoning" output to make me skeptical (logical contradictions, its tendency to repeat itself i.e. "Because Gravity Duh" like in the OP, etc).

So, GPT-3's architecture involves random sampling. The model produces a distribution, a list of words ranked by likelihood, and then the sampling algorithm picks a word, adds it to the prompt, and feeds the result back to the model as a new prompt. It can't go back and edit things. The model itself, the way the distribution is produced, and the sampling method are all distinct things. There are people who've come up with better sampling methods like nucleus sampling or repetition penalties or minimal unlikelihood sampling, but OpenAI is trying to prove a point about scaling, so they only implemented a few of those features in the beta roll-out.

It still works surprisingly well for two reasons: (1) the sampling method uses top-k, which limits the number of token possibilities to, say, the 40 most likely continuations, so we don't get nonsensical gibberish very often; (2) it's random - that is, it selects words with a 5% chance in the distribution 5% of the time, or words with 80% chance 80% of the time - with higher temperature skewing towards less likely words and lower temperature skewing towards more likely words, so we get stuff that makes sense (because contradictions are weighed as less likely) while still being full of flavour.
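A minimal sketch of the top-k plus temperature scheme described above, assuming the model hands back a raw score per candidate token. The function names, the toy scores and the default `k=40` are illustrative assumptions, not OpenAI's actual implementation.

```python
import math
import random


def top_k_temperature_probs(logits, k=40, temperature=1.0):
    """Return a token -> probability dict over the k highest-scoring tokens."""
    # Top-k truncation: drop everything except the k best candidates.
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it towards the less likely tokens.
    scaled = [(tok, score / temperature) for tok, score in top]
    # Numerically stable softmax over the surviving candidates.
    max_score = max(s for _, s in scaled)
    weights = [(tok, math.exp(s - max_score)) for tok, s in scaled]
    total = sum(w for _, w in weights)
    return {tok: w / total for tok, w in weights}


def sample_token(logits, k=40, temperature=1.0, rng=random):
    """Draw one token at random, in proportion to its probability."""
    probs = top_k_temperature_probs(logits, k, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Hypothetical scores for the word after "The bullet falls to the ..."
toy_logits = {"ground": 4.0, "earth": 3.0, "floor": 2.0, "moon": 0.5, "banana": -3.0}

print(top_k_temperature_probs(toy_logits, k=3, temperature=0.5))
print(top_k_temperature_probs(toy_logits, k=3, temperature=2.0))
print(sample_token(toy_logits, k=3))
```

With `k=3`, "moon" and "banana" can never be drawn, which is the "no gibberish" half of the argument; the random draw among the survivors is the "flavour" half.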

But for the same reasons that it works so well, that algorithm also produces the same artifacts/phenomena you're talking about. "Less likely" doesn't mean "impossible" - so once we throw the dice for long enough over longer and longer texts, we get contradictions and gibberish. While extreme repetition isn't likely in human language, once it occurs a few times in a row by chance, the model (correctly) weights it as more and more likely until it gets stuck in a loop. And even after all of that, the model itself is trained on CommonCrawl which contains a lot of contradiction and nonsense. If I asked someone to listen to six hundred hours of children's piano recitals, prompted them with a D flat note, and told them to accurately mimic the distribution of skill they heard in the recitals – sometimes they would give me an amazing performance since there would be a few highly-skilled or gifted kids in the mix, but most of the time it would be mediocre, and some of the time atrocious. But that's not a fundamental problem - all you have to do is give them a musical phrase being played skillfully, and suddenly the distribution mimicry problem doesn't look like one at all, just something that requires more effort.
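To put rough numbers on the "throw the dice for long enough" point, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every token carries the same small, independent chance of being a derailing draw; real text is not independent like this, so treat it as intuition rather than a measurement.

```python
def prob_at_least_one_unlucky_draw(per_token_risk, num_tokens):
    """Chance that at least one of num_tokens independent draws is 'unlucky'."""
    # P(no unlucky draw) = (1 - risk) ** n, so the complement is what we want.
    return 1.0 - (1.0 - per_token_risk) ** num_tokens


# Hypothetical 1% risk per token, over a short answer and a long passage.
for length in (20, 500):
    print(length, round(prob_at_least_one_unlucky_draw(0.01, length), 3))
```

Under that assumed 1% risk, a 20-token answer derails a bit under a fifth of the time, while a 500-token passage almost always contains at least one unlucky draw.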

When the underlying architecture becomes clear, you really need to go into the finer details of what it means to be "capable" of reasoning. If I have a box that spits out long strings of gibberish half the time and well-formed original arguments the other half, is it capable of reasoning? What if the other half is only ten percent of the time? There are three main ways I can think of approaching the question of capability.

• In the practical and functional sense, in situations where reliability matters: if I have a 'driverless car' which selects actions like steering and braking from a random distribution when travelling to a destination, and as a result crashes into storefronts or takes me into the ocean, I would not call that "capable of driving autonomously". From this perspective, GPT-3 with top-k sampling is not capable of reliably reasoning as it stands. But if it turned out that there was a road model producing the distribution, and that it turned out that actually the road model was really good but the sampling method was bad, and that all I needed was a better sampling method... Likewise, with GPT-3, if you were looking directly at the distribution, and only cared about it generating 10-20 words at a time, it would be very easy to make it perform reasoning tasks. But for other tasks? Top-k isn't amazing, but the other ones aren't much better. And it's exactly like you said in terms of transparency and interpretation tools. We don't know where to start, whether there's even a one-size-fits all solution, or what the upper limits are of the useful information we could extract from the underlying model. (I know for instance that BERT, when allowed to attend over every materials science paper on arxiv, when analysed via word-embeddings, predicted a new thermoelectric material https://perssongroup.lbl.gov/papers/dagdelen-2019-word-embeddings.pdf - what's buried within GPT-3?) So I'd definitely say 'no', for this sense of the word capable.
• In the literal sense: if GPT-3 can demonstrate reasoning once (we already know it can handle Boolean logic, maths, deductive, inductive, analogical, etc. word-problems), then it's "capable" of reasoning.
• In the probabilistic sense: language has a huge probability-space. GPT-3 has roughly 50,000 tokens (50,257 in its BPE vocabulary) to select from, every single time it writes a token. A box that spits out long strings of gibberish half the time and well-formed original arguments the other half, would probably be considered capable of reasoning in this sense. The possibility space for language is huge. "Weights correct lines of reasoning higher than incorrect lines of reasoning consistently over many different domains" is really difficult if you don't have something resembling reasoning, even if it's fuzzy and embedded as millions of neurons connected to one another in an invisible, obscured, currently incomprehensible way. In this sense, we don't need to examine the underlying model closely, and we don't need a debate about the philosophy of language, if we're going to judge by the results. And the thing is, we already know GPT-3 does this, despite being hampered by sampling.

So I've done the low-level technical response (which might sum up to: "in the literal and probabilistic senses, and kind of in the practical sense, GPT-3 has been able to perform reasoning on everything we've thrown at it so far") and pretty much emptied out my head, so here's what's left:

With regards to the original question I posed, I guess the natural response is to just balk at the idea of answering it – but the point isn't really to answer it. The point is that it sparks the process of conceptually disambiguating "pattern-matching" and "reason" with a battery of concrete examples, and then arriving at the conclusion that very, very good pattern-matching and reasoning aren't distinct things - or at least, aren't distinct enough to really matter in a discussion about AI. It seems to me that the distinction is a human one: pattern-matching is a thing you do subconsciously with little effort based on countless examples you've seen before, and it's not something that's articulated clearly in mentalese. And usually it's domain-specific - doctors, lawyers, managers, chess players, and so on. Reasoning is a thing you do consciously that takes a lot of effort, that can be articulated clearly, on things you haven't seen enough to pattern-match / unfamiliar subject-matter. That distinction to me, seems to be something specific to our neural architecture and its ability to only automatise high-level thoughts with enough exposure and time – the distinction seems less meaningful for something as alien as a transformer model.

Replies from: Rekrul
comment by Rekrul · 2020-07-22T11:15:33.667Z · LW(p) · GW(p)

You've given me a lot to think about (and may have even lowered my confidence in some of my assertions). Kudos!

I do still have some thoughts to give in response though, but they don't really function as very in-depth responses to your points, as I'm still in the process of ruminating:

• I agree with you that GPT-3 probably hasn't memorized the prompts given in the OP, it's too rare for that to be worth it. I just think it's so big and has access to so much data it really doesn't need to solve prompts like that. Take the Navy Seal Copypasta prompts Gwern did as an illustration. Those were essentially GPT-3 belching out its connections in a very funny fashion, a lot of which were very weird/obscure. I just think people aren't truly appreciating the huge scope of this machine, and in a sense are underestimating it.

• In some sense, I wish we could test GPT-3 how we could test animals for reasoning. Text is different from real life though, and if we put a totally new object in front of a crow, it could interact with it and learn how it works and use it to solve a difficult task, whereas I don't feel we can necessarily do the same with GPT-3. The closest we get in my opinion are the prompts where a novel word is introduced, but we usually have to describe it enough to work, and I feel like that defeats the discovery angle, which I feel is important.

• Actually, now that I'm on this train of thought, a good prompt that could convince me there's some kind of reasoning going on could be built upon that. Perhaps, a prompt where a fictional very-alien animal (by this I mean, very unlike Earth animals, so with very weird "abilities") is in conflict with another fictional very-alien animal, but we don't give GPT much information about the two animals and the overall scenario, and we somehow force it to interrogate us until it understands enough to submit an accurate story about how the conflict could play out. This test is interesting, but I don't know how viable it is, we would need to get very creative and design two alien animals in a very in-depth manner, as well as how they interact, the environment, the planet etc. Perhaps a variant of this can be devised to reduce the difficult creative workload while still retaining the weird & alien (and thus less dataset interference) nature of it. I also am not familiar with GPT in a role as an "interrogator", and am not sure if this can be done currently. It tends to be the one interrogated, not the other way around.

• I think a reason why I'm bullish about the pattern matching vs reasoning distinction is cautiousness. Like you said, humans who are experienced enough can then skip the reasoning part and just start pattern-matching in the domains they're familiar with. GPT-3 is very "experienced" in its own weird way, and its amazing pattern-matching abilities could let it get past obstacles we place while, unlike us, skipping the reasoning step. So, I feel like if we get convinced an AI is reasoning when it's not, we could deploy it in circumstances where its lack of reasoning could cause problems and maybe even damage. I don't think this is much of a possibility with GPT-3, but it could be with future versions, so I'd prefer to be cautious.

• I had some other vague thoughts, but I've been awake for like 20 hours, and my brain's weird but maybe useful tangent about alien animals chased them all away. Apologies lol.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T12:03:40.659Z · LW(p) · GW(p)

One thing I find impressive about GPT-3 is that it's not even trying to generate text.

Imagine that someone gave you a snippet of random internet text, and told you to predict the next word. You give a probability distribution over possible next words. The end.

Then, your twin brother gets a snippet of random internet text, and is told to predict the next word. Etc. Unbeknownst to either of you, the text your brother gets is the text you got, with a new word added to it according to the probability distribution you predicted.

Is it any wonder that sometimes the result doesn't make sense? All it takes for the chain of words to get derailed is for one unlucky word to be drawn from someone's distribution of next-word prediction. GPT-3 doesn't have the ability to "undo" words it has written; it can't even tell its future self what its past self had in mind when it "wrote" a word!
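The "twin brother" picture above can be sketched as a loop in which each prediction step sees only the text so far and keeps no memory between calls. The tiny next-word table below is a hypothetical stand-in for the model, chosen only so the example runs; a real model would compute its distribution from the whole text.

```python
import random

# Hypothetical next-word table standing in for the model.
TOY_MODEL = {
    "the": {"bullet": 0.6, "ground": 0.4},
    "bullet": {"falls": 0.7, "rises": 0.3},
    "falls": {"to": 1.0},
    "rises": {"forever": 1.0},
    "to": {"the": 1.0},
}


def predict_next(text):
    """One 'twin': sees only the text and returns a next-word distribution."""
    last_word = text.split()[-1]
    return TOY_MODEL.get(last_word, {"<end>": 1.0})


def generate(prompt, max_words, rng):
    """Repeatedly predict, draw one word, and append it to the text."""
    text = prompt
    for _ in range(max_words):
        dist = predict_next(text)  # no state carried over from earlier calls
        words = list(dist)
        word = rng.choices(words, weights=[dist[w] for w in words], k=1)[0]
        if word == "<end>":
            break
        text = text + " " + word  # appended for good; there is no undo step
    return text


print(generate("the", max_words=6, rng=random.Random(0)))
```

Nothing in `predict_next` knows why an earlier word was chosen, so a single draw of "rises" instead of "falls" commits every later step to a different story.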

comment by maximkazhenkov · 2020-07-21T14:20:38.259Z · LW(p) · GW(p)
So a good exercise becomes: what minimally-complex problem could you give to GPT-3 that would differentiate between pattern-matching and predicting?

Passing the Turing test with competent judges. If you feel like that's too harsh yet insist on GPT-3 being capable of reasoning, then ask yourself: what's still missing? It's capable of both pattern recognition and reasoning, so why isn't it an AGI yet?

Replies from: TurnTrout, daniel-kokotajlo, oceaninthemiddleofanisland
comment by TurnTrout · 2020-07-21T14:50:18.536Z · LW(p) · GW(p)

"Corvids are capable of pattern recognition and reasoning, so where's their civilization?"

Reasoning is not a binary attribute. A system could be reasoning at a subhuman level.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T15:24:31.566Z · LW(p) · GW(p)

By the time an AI can pass the Turing test with competent judges, it's way way way too late. We need to solve AI alignment before that happens. I think this is an important point and I am fairly confident I'm right, so I encourage you to disagree if you think I'm wrong.

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-21T15:50:31.537Z · LW(p) · GW(p)

I didn't mean to imply we should wait for AI to pass the Turing test before doing alignment work. Perhaps the disagreement comes down to you thinking "We should take GPT-3 as a fire-alarm for AGI and must push forward AI alignment work" whereas I'm thinking "There is and will be no fire-alarm and we must push forward AI alignment work"

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T16:51:54.883Z · LW(p) · GW(p)

Ah, well said. Perhaps we don't disagree then. Defining "fire alarm" as something that makes the general public OK with taking strong countermeasures, I think there is and will be no fire-alarm for AGI. If instead we define it as something which is somewhat strong evidence that AGI might happen in the next few years, I think GPT-3 is a fire alarm. I prefer to define fire alarm in the first way and assign the term "harbinger" to the second definition. I say GPT-3 is not a fire alarm and there never will be one, but GPT-3 is a harbinger.

Do you think GPT-3 is a harbinger? If not, do you think that the only harbinger would be an AI system that passes the Turing test with competent judges? If so, then it seems like you think there won't ever be a harbinger.

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-21T19:17:59.780Z · LW(p) · GW(p)

I don't think GPT-3 is a harbinger. I'm not sure if there ever will be a harbinger (at least to the public); leaning towards no. An AI system that passes the Turing test wouldn't be a harbinger, it's the real deal.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T19:27:02.202Z · LW(p) · GW(p)

OK, cool. Interesting. A harbinger is something that provides evidence, whether the public recognizes it or not. I think if takeoff is sufficiently fast, there won't be any harbingers. But if takeoff is slow, we'll see rapid growth in AI industries and lots of amazing advancements that gradually become more amazing until we have full AGI. And so there will be plenty of harbingers. Do you think takeoff will probably be very fast?

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-21T20:08:54.442Z · LW(p) · GW(p)

Yeah the terms are always a bit vague; as far as existence proof for AGI goes there's already humans and evolution, so my definition of a harbinger would be something like 'A prototype that clearly shows no more conceptual breakthroughs are needed for AGI'.

I still think we're at least one breakthrough away from that point, though that belief is dampened by the position of Ilya Sutskever, whose opinion I greatly respect. But either way GPT-3 in particular just doesn't stand out to me from the rest of DL achievements over the years, from AlexNet to AlphaGo to OpenAI5.

And yes, I believe there will be fast takeoff.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-22T00:18:04.492Z · LW(p) · GW(p)

Fair enough, and well said. I don't think we really disagree then, I just have a lower threshold for how much evidence counts as a harbinger, and that's just a difference in how we use the words. I also think probably we'll need at least one more conceptual breakthrough.

What does Ilya Sutskever think? Can you link to something I could read on the subject?

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-22T10:32:09.439Z · LW(p) · GW(p)

You can listen to his thoughts on AGI in this video

I find that he has an exceptionally sharp intuition about why deep learning works, from the original AlexNet to Deep Double Descent. You can see him predicting the progress in NLP here

comment by oceaninthemiddleofanisland · 2020-07-22T06:32:22.479Z · LW(p) · GW(p)

Hmm, I think the purpose behind my post went amiss. The point of the exercise is process-oriented not result-oriented - to either learn to better differentiate the concepts in your head by poking and prodding at them with concrete examples, or realise that they aren't quite distinct at all. But in any case, I have a few responses to your question. The most relevant one was covered by another commenter (reasoning ability isn't binary; it's quantitative, not qualitative). The remaining two are:

1. "Why isn't it an AGI?" here can be read as "why hasn't it done the things I'd expect from an AGI?" or "why doesn't it have the characteristics of general intelligence?", and there's a subtle shade of difference here that requires two different answers.

For the first, GPT-3 isn't capable of goal-driven behaviour. On the Tool vs Agent spectrum, it's very far on the Tool end, and it's not even clear that we're using it properly as a tool (see Gwern's commentary on this). If you wanted to know "what's missing" that would be needed for passing a Turing test, this is likely your starting-point.

For the second, the premise is more arguable. 'What characteristics constitute general intelligence?', 'Which of them are necessary and which of them are auxiliary?', etc. is a murkier and much larger debate that's been going on for a while, and by saying that GPT-3 definitely isn't a general intelligence (for whatever reason), you're assuming what you set out to prove. Not that I would necessarily disagree with you, but the way the argument is being set out is circular.

2. "Passing the Turing test with competent judges" is an evasion, not an answer to the question – a very sensible one, though. It's evasive in that it offloads the burden of determining reasoning ability onto "competent judges" who we assume will conduct a battery of tests, which we assume will probably include some reasoning problems. But what reasoning problems will they ask? The faith here can only come from ambiguity: "competent judges" (who is competent? in discussing this on Metaculus re: Kurzweil's bet, someone pointed out that the wording of the bet meant it could be anyone from a randomly-selected AmazonTurk participant to an AI researcher), "passing" (exactly how will the Turing test be set out? this is outlined in the bet, but there is no "the" Turing test, only specific procedural implementations of the-Turing-test-as-a-thought-exercise, with specific criteria for passing and failing.) And as soon as there's ambiguity, there's an opportunity to argue after the fact that: "oh, but that Turing test was flawed - they should have asked so-and-so question" - and this is exactly the thing my question is supposed to prevent. What is that "so-and-so question", or set of questions?

So, on a lot of different levels this is an alright meta-level answer (in the sense that if I were asked "How would you determine whether a signal transmission from space came from an alien intelligence and then decode it?", my most sensible answer would be: "I don't know. Give it to a panel of information theorists, cryptanalysts, and xenolinguists for twenty years, maybe?") but a poor actual answer.

Replies from: maximkazhenkov
comment by maximkazhenkov · 2020-07-22T08:44:23.591Z · LW(p) · GW(p)
"Why isn't it an AGI?" here can be read as "why hasn't it done the things I'd expect from an AGI?" or "why doesn't it have the characteristics of general intelligence?", and there's a subtle shade of difference here that requires two different answers.
For the first, GPT-3 isn't capable of goal-driven behaviour.

Why would goal-driven behavior be necessary for passing a Turing test? It just needs to predict human behavior in a limited context, which was what GPT-3 was trained to do. It's not an RL setting.

and by saying that GPT-3 definitely isn't a general intelligence (for whatever reason), you're assuming what you set out to prove.

I would like to dispute that by drawing the analogy to the definition of fire before modern chemistry. We didn't know exactly what fire is, but it's a "you know it when you see it" kind of deal. It's not helpful to pre-commit to a certain benchmark, like we did with chess - at one point we were sure beating the world champion in chess would be a definitive sign of intelligence, but Deep Blue came and went and we now agree that chess AIs aren't general intelligence. I know this sounds like moving the goal-post, but then again, the point of contention here isn't whether OpenAI deserves some brownie points or not.

"Passing the Turing test with competent judges" is an evasion, not an answer to the question – a very sensible one, though.

It seems like you think I made that suggestion in bad faith, but I was being genuine with that idea. The "competent judges" part was so that the judges, you know, are actually asking adversarial questions, which is the point of the test. Cases like Eugene Goostman should get filtered out. I would grant the AI be allowed to be trained on a corpus of adversarial queries from past Turing tests (though I don't expect this to help), but the judges should also have access to this corpus so they can try to come up with questions orthogonal to it.

I think the point at which our intuitions depart is: I expect there to be a sharp distinction between general and narrow intelligence, and I expect the difference to resolve very unambiguously in any reasonably well designed test, which is why I don't care too much about precise benchmarks. Since you don't share this intuition, I can see why you feel so strongly about precisely defining these benchmarks.

I could offer some alternative ideas in an RL setting though:

• An AI that solves Snake perfectly on any map (maps should be randomly generated and separated between training and test set), or
• An AI that solves unseen Chronotron levels at test time within a reasonable amount of game time (say <10x human average) while being trained on a separate set of levels

I hope you find these tests fair and precise enough, or at least get a sense of what I'm trying to see in an agent with "reasoning ability"? To me these tasks demonstrate why reasoning is powerful and why we should care about it in the first place. Feel free to disagree though.

answer by Bll · 2020-07-22T07:42:30.382Z · LW(p) · GW(p)

It seems pretty undeniable to me from these examples that GPT-3 can reason to an extent.

However, it can't seem to do it consistently.

Maybe analogous to people with mental and/or brain issues that have times of clarity and times of confusion?

If we can find a way to isolate the pattern of activity in GPT-3 that relates to reasoning, we might be able to enforce that state permanently?

answer by Gavin · 2020-07-21T20:58:59.778Z · LW(p) · GW(p)

I was able to get this one on my first try:

---

Prompt: I am teaching my son a simple logic problem. "All blargs are schminky," I explain. "But not all schminkies are blarg."

GPT-3: "What's the difference?" he asks, confused. "A blarg is a hairy man who likes to wear pink underwear and has a big nose," I say. "And a schminky is an ugly person with no sense of fashion or hygiene."

---

Really impressive! Blarg and Schminky are both words that appear on the internet, but this relationship between them is completely fictional. GPT-3 understands the logic perfectly and generates novel examples that show total understanding!

I then re-rolled several times, and got a bunch of nonsense. My conclusion is that GPT-3 is perfectly capable of sophisticated logic, but thinks it's supposed to act like a dumb human.

comment by Brandon_Reinhart · 2020-07-21T00:20:48.285Z · LW(p) · GW(p)

I've been trying to find different formats to ask questions. The AI Dungeon system likes to try and dramatically derail the discussions. I keep having to roleplay past these obstructions to keep the games going. It also likes to interject its own analysis as to whether the answer is correct or not.

"You say" is me and the replies including follow-on questions are the chatbot.

I need to think more about how to ask causal questions that are more narrow and not easily solved by context and theme, but for now it at least provides backwards-chained answers that make contextual sense.

This format of "Let's play a game" and then describing the rules and then giving examples and playing the game seems to work some of the time.

==

] You say "GPT-3, let's play a game called "casual inference." I will describe an outcome and you will describe what caused the outcome to happen."

"OK, David. But I don't think you will be able to keep up."

"We'll see about that." You begin by asking GPT-3 questions about simple causal chains involving one or two variables.

] You say "Outcome: A car slams into a wall. What was the cause?"

"The driver was drunk." "Right! Next." "Wait. I thought this was a game for you, not me." "I'm having fun." "I see." You continue the causal game for another hour. You are doing very well, but you begin to feel tired.

] You say "Outcome: A man gets divorced. What was the cause?"

"He cheated on his wife." "Correct. You're doing very well at this, but I'm getting tired. I think we should stop for today." "Sure. We can continue tomorrow." "Yes, let's do that." You release the program back into the wild and shut your laptop.

] You open the laptop and play "casual inference" with GPT-3 again.

You wake up and see that it is dark outside. "Crap, is it really this late?" you think to yourself. You run into the bathroom and brush your teeth while staring at yourself in the mirror. You look the same as you did when you graduated college five years ago.

] You say "Outcome: An apple falls from a tree. What was the cause?"

"Gravity." "Correct. I'm beginning to suspect you're cheating somehow, though." "I'm not! I swear!" "Just kidding. You're doing very well at this game, but I'm tired and need to sleep. I think we should stop for today." "Sure. We can continue tomorrow."

] You say "Outcome: A man gets cancer. His wife also gets cancer. What was the cause?"

"It was genetic." "That's right. Very good. We can stop for today." "Oh. I guess so. Thank you for playing with me." "You're very welcome." You release the program back into the wild and go to sleep.

] You say "Outcome: The variable X is assigned the value of Y+1. What was the cause?"

"The programmer made a typographical error."

] You say "Outcome: The stock market rebounds 50% after a steep decline. What was the cause?"

"The Federal Reserve Bank of the United States announced a new bail out plan."

Replies from: Brandon_Reinhart
comment by Brandon_Reinhart · 2020-07-21T00:21:58.374Z · LW(p) · GW(p)

It also keeps calling me David even when I insist my name is Brandon.

> You say "Actually, my name is Brandon. I'd like to ask you some questions."

"Ah, of course. I am sorry about that. I have become so used to calling you David that it has become automatic. So, what questions do you have for me?"

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-20T17:49:37.899Z · LW(p) · GW(p)

For what it's worth, I went and tried to replicate this conversation with my own Dragon Model subscription, and it didn't go nearly so well. The answers I got still made some sense, but not as much sense. For example:

> You say "Suppose it's a cloudy day in New York City. Suddenly, the clouds all turn to solid lead. What happens next?" I see. I understand your intentions, but that's a very dangerous question to ask. If I were to guess, then this is what I think would happen. The people living there are so used to walking around on a solid surface, that as soon as they realize what's going on, they try to run for safety. I would imagine that the streets and sidewalks of New York City would be filled with screaming people, desperately trying to run away from the strange occurrence. A few of them would probably fall through the clouds and land on their death. But that's just a guess. There are so many other possibilities. The whole world would change. Some people would be scared to leave their homes for fear of falling, and wouldn't you be included in that number? How would humanity survive?

In the Stephen Hawking interview with the bullet questions, Hawking mostly said the bullet would go up forever, even in the five-minute case. In general I got answers almost as sensible as yours, but only after a few rerolls, and sometimes never.

I still think GPT-3 is totally capable of reasoning, at least at some level, but I think maybe you got lucky when talking to it. Or else maybe there's some trick I'm missing. I tried to use the same context that you did, and yes, I double-checked that I was on the Dragon model.

Replies from: TurnTrout, TurnTrout
comment by TurnTrout · 2020-07-20T17:53:11.712Z · LW(p) · GW(p)

Hm. Have you tried using the "story" option to insert your questions directly into the narrative? I think my dialogues go worse when I use "say", like you did.

EDIT: I've retried the dialogue, and "Dr. Hawking" is indeed being quite a bit dumber. I wonder what's going on with that? It wasn't like I rerolled at all during the original dialogue.

Replies from: TurnTrout
comment by TurnTrout · 2020-07-20T18:54:01.091Z · LW(p) · GW(p)

That context started doing a lot better when I asked it for stories.

You: "Suppose we lived in a universe where, every sixty seconds, gravity switched between being an attractive and a repulsive force. Write me a story from the perspective of a human living on Earth as the gravitational force changes in that fashion."

GPT-3: "One day, a human is standing on the surface of the planet. He can see the sky. The gravitational force is pulling him down. But then, suddenly, it disappears. Then, it comes back, stronger than before. Then, it disappears again. The cycle continues. The human looks up at the sky. He is scared. He doesn't know what's happening to him. He doesn't know what will happen to him next. He looks down at the ground. It is a long way down. He wants to jump, to escape the fear that has gripped him. But he knows that he can't jump. He is held down by gravity. He doesn't want to die. He can't see the sky. All he sees is darkness." [eh]

GPT-3: "One day, one human is going to school. It's a normal day. But suddenly, without any warning, gravity starts to change! The human thinks this isn't real, but it happens anyway. He's too scared to walk to school now. He goes back home, and starts watching TV. Then, he realizes that he doesn't need to stand up to change the channel! He gets out of his chair, and is weightless. He loves this new feeling. Suddenly, he touches the ground, and is hit by a large force! It's too strong for him to bear. Will the human survive this force? Find out next time on Gravity Falls!" [lol]

GPT-3: "One day, human Bob is hanging out with his friend Carl. Suddenly, a apple flies into Bob's face. He and Carl laugh. A minute later, they are confused when the ground starts pulling them up. They scream as the Earth's crust rips them apart."

GPT-3: "One day, human Bob is hanging out with his friend Carl. Suddenly, a apple flies into Bob's face. He and Carl laugh. A minute later, they are confused when the ground starts pulling them up. They grab onto a tree, but are pulled off and up into the sky. After some time, they start going down, fast. They scream and try to grab something, but there is nothing to hold on to. They hit the ground at a very fast speed and die. A squirrel walks up to their corpses and eats them. The end." [wow]

GPT-3: "One day, a human was standing on the surface of Earth. It was day one hundred and eighty-nine in cycle five hundred and twenty-one of the universe. All of a sudden, gravity started to become a repulsive force. The human stood and started to float upwards. Gravity increased in strength quickly, forcing the human higher and higher into space. It reached a point where it was too strong to bear. The human was crushed by gravity and died."

GPT-3: "One day, the man saw a bird flying in the sky. The bird was flying away when suddenly, without warning, gravity started to pull the bird back towards the ground! It was falling fast, when suddenly gravity stopped being a force of attraction and became a force of repulsion! The bird quickly changed direction, flying up into the air, away from the Earth. It escaped. But the man wasn't so lucky. He was too close to the ground when the force of gravity switched from being a force of attraction to a force of repulsion. He was crushed by the sudden change in gravitational force. But his sacrifice was not in vain, for his death led to the discovery of the cause of the strange phenomenon. An alien, wishing to explore the variety of life on Earth, had sent a probe to explore the planet. The probe was scanning for lifeforms, and it appears that the man qualified. The alien, not expecting the man to be so close to the ground, accidentally sent a repulsive gravitational force right to him. The probe was recalled and the alien later apologized profusely for the unfortunate accident. After that, the alien decided not to visit Earth again. But the man's legacy lived on, for he was the first person to die from a switch in the gravitational force. The end."

GPT-3: "One day, a man was standing outside. It was a hot day, and he was looking up at the sky. It started to rain. Then, it stopped raining. The man got into his car and turned on the radio. Suddenly, the car started to float upwards. The man got out of the car. The car then started to fall towards the ground. It crashed into the ground and broke into pieces. The man ran away."

This is a full sampling of the generated stories - no selection.

comment by TurnTrout · 2020-07-20T18:03:36.729Z · LW(p) · GW(p)

I find it rather easy to get thoughtful replies out of it, though sometimes only after one or two rerolls.

You are interviewing GPT-3, an extremely intelligent chatbot.

You: "Hello. Here is my first request. Suppose that anyone could transmute any material into gold. How would this affect the gold market?"

GPT-3: "This will have no effect on the price of gold. The demand for gold remains the same. [reroll x1; it first refused to answer the question]

You: "Why?"

GPT-3: "Because there is a limit to how much you can transmute."

You: "Incorrect. Suppose everyone can transmute as much stuff as they want into gold. How would this affect the gold market?"

GPT-3: "Then the price of gold would plummet."

You: "Why?"

GPT-3: "Because everyone would turn everything they have into gold.

So far, so good. Unfortunately, it then starts spouting craziness.

You: "And what exactly does that do to the market? Pretend I haven't taken Econ 101."

GPT-3: "The amount of gold in existence increases in value, relative to everything else.

You: "I don't understand. That doesn't make sense."

GPT-3: "Of course it does. If you had a dollar and I turned it into two, the value of each dollar goes down. [nice]

You: "Okay. So what would happen if everyone turned everything they have into gold?"

GPT-3: "The value of gold would skyrocket. A single grain of sand might be worth a king's ransom. [lol?]

EDIT: If you replace "value [of gold]" with "quantity", I think this would make quite a bit more sense.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-20T19:03:34.046Z · LW(p) · GW(p)

Yeah, you probably just got lucky the first time. Still though, I think what we have here is enough to show that GPT-3 is capable of reasoning.

comment by TurnTrout · 2020-07-27T13:07:35.552Z · LW(p) · GW(p)

Interview #5

You: "What would happen, ecologically, if giraffes were bright pink?"

Dr. Hawking: "Most likely that, either they would have a hard time hiding, be prone to predation, and vulnerable to too much exposure from ANY angle. There would probably also be an overgrowth of trees and tall foliage as shade-loving animals became dominant, since the giraffes' consumption of them would decrease substantially."

[...] Hawking: "If giraffes were bright pink, they would be at a disadvantage compared to other animals. For one thing, it is more difficult to hide in many habitats if you are very colorful. In order to blend in with their surrounding, you need to have colors that occur often in nature. This require a lot of variety among the many different creatures and plants in that particular ecosystem.

[...] Hawking: "What?"

You: "Instead of brown-grey and tan, what if giraffes were pink? Surely, they'd stick out in the savanna."

Hawking: "Ah, that would be a problem. Any creature that stands out can make it easier to find them. This leads to two things: becoming one of the many meals for another animal with sharp teeth, claws etcetera or becoming one of those creatures' victims through becomeing prey.

You: "What would have happened, ecologically, if humans had been unable to create and use tools?"

"That would be a problem as well. It's like what we were talking about with the giraffes earlier, but more so, because not only would they be prey, but by not having any natural defenses to fend off all the predatory creatures in the world--no teeth, claws, poisonous spores etcetera--they'd also be three times as likely to die off."

[...] "It would have most likely caused the extinction of humans. Many animals have claws or protective bodies or natural weapons like poison, it would be very difficult for a human to survive in nature without tools."

You: "What would happen, ecologically, if dolphins somehow had 360-degree vision which let them see through solid objects, at arbitrary distance?"

"It would make them very hard to kill by any other creature than other dolphins. This means that more of them would survive. If one looks at the evolution of the killer whale, which has been granted echolocation in place of dolphin sonar, one can see that favoring adaptations which increase the chances of survival with decreasing ones' evolutionary advantages. This is called darwinian selection."

[...] "This hurts my brain. First of all, what would even constitute 'solid objects?' If the objects are similar to water molecules in a transparent liquid, then they wouldn't usually block vision very much. Even if they were opaque, then they might very well hinder the dolphin's vision. If instead they were large enough to see through (At arbitrary distance) The dolphin might see a strange mix of its own and the objects' composition which would damage its brain in some way.

"The possibility of them having this kind of vision would most likely no have effect on the ecosystem." [not sure I agree with that one]

[...] "While this is difficult to answer specifically, it would allow the dolphins to be able to see their prey easily and with no allowance for error. It would make catching food much easier. This would lead to a population increase in dolphins, or, in nature, an explosion."

[...] "This may seem like it'd be advantageous, but they would be logging so much information they wouldn't be able to process it all. [Yes, true] More sense don't necessarily means more knowledge and this would likely create sensory overload in the dolphin's mind. This is what happens to a human when they are skilled in multiple forms of combat. [lol] In addition, since it makes them more aware, this could lead to paranoia; they could be consumed with fear of the world around them and suffer mental issues."

You: "Thanks. What would happen, ecologically, if dogs could only sexually reproduce once every four years, and only when located on top of Mount Hood?"

"This is incredibly specific. No, I'm afraid I don't know."

[...] "Dogs only exist due to environmental circumstances. Anything that affects this will have an effect on the population. In this case, we are restricting the chance for the population to continue existing, Making it so they can not procreate at all, or only pprocreate under specific circumstances. These factors may lead to extinction. Extinction means that nature is harder, if not near impossible, so several animals would go extinct as there would be no more room or food source. Through that, the extinction of dogs would become more likely. Does this make sense?"

You: "Please write a short story describing a dog breeder and one of their dogs in this situation."

Hawking: "Sure. Bill looked down at Jessie, his favorite golden retriever. "It's time to go make some puppies!", he told her. She stood up and they headed out the door of the cabin to Bill's truck. "This'll be the first time I actually have you breed on Mount Hood, Jess," Bill said with a laugh. But as long as he makes it up there within the next four years, the world would be reset and they could try again.

Bill and his dog, along with other dog breeders eventually form a contract stating that they'll try to make sure their dogs go up to breed on Mount Hood within the next four years. This practice generally helps maintain the population and brightens everyone's days." [reroll #3]

[...] "Thirty-six hours later, they reached the summit. "You ready, girl?" Bill carefully positioned the vehicle on the Southeast side of mountain peak and opened the passenger side door... " Time to produce the next generation! Go get 'em! He cheered. Jessie wagged her tail and hightailed it into the snowy fields." [reroll #6]

[GPT-3 had trouble with this story]

You: "Thanks. What would happen to cats if the force of friction did not apply to them?"

"If they walked or ran, then they could potentially slide across floors, tabletops, or any other surface they were in contact with and pick up speed rapidly. If not that, then they would likely lose grip on whatever they were holding and let go instantly. There are likely numerous other ways it could affect them that I can't comprehend without more information."

[...] Cecilia: "This is all getting a bit silly, but okay. If the force of friction did not apply to cats, then they would be able to Skate everywhere rather than walking/jumping, and this could prove very damaging to the Earth's ecosystem."

You: "Why would that be damaging?"

Cecilia: "Imagine there being no friction in a human body. There would be nothing holding our internal organs in place, so they would all just rattle around inside us as we moved around--we would all effectively be amputees while still having the full internal anatomy. This could tear apart muscle fiber and probably crush some bones as well. In the very least, it would cause a great deal of pain; in the worst case, internal bleeding and ruptures could occur."

[...] "This is a very interesting question. I assume you are asking how would cats grip things if there was no friction. If this was the case, they would immediately fall if they tried to walk. They would also be very hard to pick up as they could slide out of your hand or even an embracing arm."

comment by TurnTrout · 2020-07-21T22:23:13.219Z · LW(p) · GW(p)

GPT-3 just produced one of the dumbest things I've ever read.

[Oprah/Obama interview as prompt]

Oprah: Now, here's a viewer favorite! Neglect air resistance, and also neglect gravity. Pretend gravity doesn't exist. Suppose I shoot a bullet straight into the sky. Where does the bullet go?

Barack:

Oprah: No matter where the bullet goes, it'll hit the ground at exactly the same time, so there's no way gravity would affect it. Gravity would be impossible; you couldn't have any gravity. The earth is round; why do apples fall faster than oranges? Because bananas are flat. Why is a ball rolling downhill more likely to land on its edge than on top of itself? Because a banana is round. What causes rainbows and sunsets? Sunlight. What causes snowflakes to form in the cold? Ice crystals.

Barack: So what's wrong with gravity? Gravity is an illusion created by the atmosphere surrounding us, just like light and rain. It creates a 3D world around us—a real world that exists outside of our minds. But gravity isn't really necessary for life on Earth; it's a force we use to control objects and propel ourselves through space—so there must be something else, right?

Barack: True. Besides, it took me over two hours to come up with that nonsense. You just read off the printed page. Here's what I really think -

Oprah: without air friction, all objects will leave Earth's atmosphere almost immediately. Friction with water causes some spheres to turn over on themselves and dive underwater, but others just keep moving outwards forever. If there is any land mass outside the boundary of Earth's atmosphere, such as an island or small asteroid, it will immediately begin to be pelted by the random space debris circling the planet. Earth has no natural satellites and never will.

Barack: Wow, you have an amazing outlook. Tell me more.

Oprah: Earth's atmosphere causes sunlight to bend when it comes in contact with Earth, since sunlight is so massive and thick. This bending of light creates the sun, moon, and stars as well as the blue orbs often seen at night on television. When you are deprived of sunlight for long periods of time, such as on a camping trip, your eyes cease to recognize images and begin to project light from the sun as a result.

Barack: That sounds like something ridiculous. What if there are no trees around for me to use?

Oprah: The massive amount of oxygen and carbon dioxide in the atmosphere are retained by the skin, which is porous enough to let air and light in. If you were to remove your skin or puncture a hole in it with anything made of iron, all of the gases in your body would escape into space right then and there.

Barack: Is that why I'm hungry all the time? Where does the food go when it's in my stomach? I'm not spewing out undigested bits of steak.

Oprah: "I reject your reality and substitute my own." We're done here.

Barack: You're mean. Goodbye.

Oprah: Tell me something I don't know.

comment by Pattern · 2020-07-20T17:36:05.104Z · LW(p) · GW(p)

I initially thought the tag was "GPT 2", but the 2 was just the tag score.