Blake Richards on Why he is Skeptical of Existential Risk from AI

post by Michaël Trazzi (mtrazzi) · 2022-06-14T19:09:26.783Z · LW · GW · 12 comments

This is a link post for https://theinsideview.ai/blake

Contents

  Why I Interviewed Blake
  Why you Might Want to Talk to Skeptics
  Generalizing to "All Sort of Tasks We Might Want It To do"
  Contra Transfer Learning from Scaling
  On Recursive Self-Improvement
  Scaling is "Something" You Need
  On the Bitter Lesson being Half True
  On the Difficulty of Long-term Credit Assignment
  On What Would Make him Change his Mind

I recently interviewed Blake Richards, an Assistant Professor at the Montreal Neurological Institute and the School of Computer Science at McGill University, and a Core Faculty Member at Mila. Below you will find some quotes summarizing his takes on AGI.

Blake is not really concerned about existential risk from AI. Like Yann LeCun, he thinks that AGI is not a coherent concept, and that it would be impossible for an AI to be truly general (even if we restrict the no free lunch argument from all possible tasks to economically valuable ones).

Why I Interviewed Blake

Although I do not agree with everything he says, I think there is value [LW · GW] in interacting with AI researchers outside of the AI Alignment bubble, understanding exactly which arguments they do and do not buy, and eventually nailing down cruxes that would convince them that AI existential risk is worth thinking about.

Better understanding LeCun's position has been valuable for many on LessWrong (see for instance the 2019 debate with Bengio and Russell [LW · GW]), and Blake's thinking is close to Yann's, given that they share a similar philosophical bent.

Why you Might Want to Talk to Skeptics

Another exercise I found insightful was (mostly incorrectly) assessing people's views on AI Alignment and AI timelines, which helped me better understand (thanks, Cunningham's law!) the views of optimists (they turned out to be pretty close to Richard Ngo's reasons for optimism at 11:36 here [LW · GW]).

In any case, I recommend that people who are in touch with ML researchers or practitioners 1) get to a level where they feel comfortable steelmanning them, and 2) write up their positions on LW/EAF. That would help nail down the community's understanding of which arguments they find convincing and what would make them change their mind.

To that end, here is what Blake has to say about his position on AGI and what could make him change his mind about existential risk.

 

Generalizing to "All Sort of Tasks We Might Want It To do"

"We know from the no free lunch theorem that you cannot have a learning algorithm that outperforms all other learning algorithms across all tasks. [...] Because the set of all possible tasks will include some really bizarre stuff that we certainly don’t need our AI systems to do. And in that case, we can ask, “Well, might there be a system that is good at all the sorts of tasks that we might want it to do?” Here, we don’t have a mathematical proof, but again, I suspect Yann’s intuition is similar to mine, which is that you could have systems that are good at a remarkably wide range of things, but it’s not going to cover everything you could possibly hope to do with AI or want to do with AI."

Contra Transfer Learning from Scaling

"What’s happened with scaling laws is that we’ve seen really impressive ability to transfer to related tasks. So if you train a large language model, it can transfer to a whole bunch of language-related stuff, very impressively. And there’s been some funny work that shows that it can even transfer to some out-of-domain stuff a bit, but there hasn’t been any convincing demonstration that it transfers to anything you want. And in fact, I think that the recent paper… The Gato paper from DeepMind actually shows, if you look at their data, that they’re still getting better transfer effects if you train in domain than if you train across all possible tasks."

On Recursive Self-Improvement

"Per this specificity argument, my intuition is that an AI that is good at writing AI code  might not have other types of intelligence. And so this is where I’m less concerned about the singularity because if I have an AI system that’s really good at coding, I’m not convinced that it’s going to be good at other things. [...] Instead, what I can imagine is that you have an AI that’s really good at writing code, it generates other AI that might be good at other things. And if it generates another AI that’s really good at code, that new one is just going to be that: an AI that’s good at writing code."

Scaling is "Something" You Need

"Will scale be literally all you need? No, I don’t think so. In so far as… I think that right off the bat, in addition to scale, you’re going to need careful consideration of the data that you train it on. And you’re never going to be able to escape that. So human-like decisions on the data you need is something you cannot put aside totally. But the other thing is, I suspect that architecture is going to matter in the long run.

I think we’re going to find that systems that have appropriate architectures for solving particular types of problems will again outperform those that don’t have the appropriate architectures for those problems. [...] my personal bet is that we will find new ways of doing transformers or self-attention plus other stuff that again makes a big step change in our capabilities."

On the Bitter Lesson being Half True

"For RL meta-learning systems have yet to outperform other systems that are trained specifically using model-free components. [...] a lot of the current models are based on diffusion stuff, not just bigger transformers. If you didn’t have diffusion models and you didn’t have transformers, both of which were invented in the last five years, you wouldn’t have GPT-3 or DALL-E. And so I think it’s silly to say that scale was the only thing that was necessary because that’s just clearly not true."

On the Difficulty of Long-term Credit Assignment

"One of the questions that I already alluded to earlier is the issue of really long-term credit assignment. So, if you take an action and then the outcome of that action is felt a month later, how do you connect that? How do you make the connection to those things? Current AI systems can’t do that."

"the reason Montezuma’s revenge was so difficult for standard RL algorithms is, if you just do random exploration in Montezuma’s revenge, it’s garbage, you die constantly. Because there’s all sorts of ways to die. And so you can’t take that approach. You need to basically take that approach of like, “Okay up to here is good. Let’s explore from this point on.” Which is basically what Uber developed."

On What Would Make him Change his Mind

"I suppose what would change my mind on this is, if we saw that with increasing scale, but not radically changing the way that we train the… Like the data we train them on or the architectures we use. And I even want to take out the word radically without changing the architectures or the way we feed data. And if what we saw were systems that really… You couldn’t find weird behaviors, no matter how hard you tried. It always seemed to be doing intelligent things. Then I would really buy it. I think what’s interesting about the existing systems, is they’re very impressive and it’s pretty crazy what they can do, but it doesn’t take that much probing to also find weird silly behaviors still. Now maybe those silly behaviors will disappear in another couple orders of magnitude in which case I will probably take a step back and go, “Well, maybe scale is all you need”."

(disclaimer for commenters: even if you disagree with the reasoning, remember that these are just intuitions from a podcast whose sole purpose is to convey why some ML researchers are not really concerned about existential risk from AI).

12 comments

Comments sorted by top scores.

comment by Razied · 2022-06-14T20:15:56.652Z · LW(p) · GW(p)

The "AGI doesn't exist" point is so disappointingly stupid. The word "general" in the name is meant to point to The Thing Humans Do That Machines Don't That Is Very Useful. If your definition of generality implies that humans aren't general, and think that this is somehow an argument against the danger of AI, then you are badly confused. People care about AGI because they have an intuition like "ok when my computer can do the sort of stuff I can do, shit quickly gets crazy for humanity". Somehow defining your way into making generality impossible doesn't address any of the concerns behind the intuitive definition. Sure, its quite hard to pinpoint what exactly we mean by generality, just like you can't explain to me how you recognize a cat, yet cats still exist as a coherent useful concept, and any arguments about the inexistence of cats should be carefully separated from the part of your brain that actually anticipates future experience.

Replies from: weathersystems, brian-edwards
comment by weathersystems · 2022-06-15T01:25:28.019Z · LW(p) · GW(p)

The quotes above are not the complete conversation. In the section of the discussion about AGI, Blake says:

Blake: Because the set of all possible tasks will include some really bizarre stuff that we certainly don’t need our AI systems to do. And in that case, we can ask, “Well, might there be a system that is good at all the sorts of tasks that we might want it to do?” Here, we don’t have a mathematical proof, but again, I suspect Yann’s intuition is similar to mine, which is that you could have systems that are good at a remarkably wide range of things, but it’s not going to cover everything you could possibly hope to do with AI or want to do with AI.

Blake: At some point, you’re going to have to decide where your system is actually going to place its bets as it were. And that can be as general as say a human being. So we could, of course, obviously humans are a proof of concept that way. We know that an intelligence with a level of generality equivalent to humans is possible and maybe it’s even possible to have an intelligence that is even more general than humans to some extent. I wouldn’t discount it as a possibility, but I don’t think you’re ever going to have something that can truly do anything you want, whether it be protein folding, predictions, managing traffic, manufacturing new materials, and also having a conversation with you about your grand’s latest visit that can’t be… There is going to be no system that does all of that for you.

I don't think he's making the mistake you're pointing to.  Looks like he's willing to allow for AI with at least as much generality as humans.

And he doesn't seem too committed to one definition of generality. Instead he talks about different types/levels of generality.

Replies from: yair-halberstadt, mtrazzi
comment by Yair Halberstadt (yair-halberstadt) · 2022-06-15T03:12:56.415Z · LW(p) · GW(p)

In that case it's still an extremely poor argument.

He's successfully pointed out that something nobody ever cared about can't exist (due to the no free lunch theorem). We know this argument doesn't apply to humans, since humans are better at all the things he discussed than apes, and polymaths are better at all the things he discussed than your average human.

So he's basically got no evidence at all for his assertion, and the no free lunch theorem is completely irrelevant.

Replies from: mtrazzi
comment by Michaël Trazzi (mtrazzi) · 2022-06-15T08:56:49.559Z · LW(p) · GW(p)

The goal of the podcast is to explore why people believe certain things while discussing their inside views about AI. In this particular case, the guest gives roughly three reasons for his views:

  • the no free lunch theorem, showing why you cannot have a learning algorithm that outperforms all others across all tasks.
  • the results from the Gato paper where models specialized in one domain are better (in that domain) than a generalist agent (the transfer learning, if any, did not lead to improved performance).
  • society as a whole being similar to some "general intelligence", with humans being the individual constituents who have a more specialized intelligence.

If I were to steelman his point about humans being specialized, I think he basically meant that what happened with society is we have many specialized agents, and that's probably what will happen as AIs automate our economy, as AIs specialized in one domain will be better than general ones at specific tasks.

He is also saying that, with respect to general agents, we have evidence from humans, the impossibility result from the no free lunch theorem, and basically no evidence for anything in between. For the current models, there is evidence for positive transfer for NLP tasks but less evidence for a broad set of tasks like in Gato.

The best version of the "different levels of generality" argument I can think of (though I don't buy it) goes something like: "The reason humans are able to do impressive things like building smartphones is that they are many specialized agents who teach other humans what they have done before they die. No human alive today could build the latest iPhone from scratch, yet as a society we build it. It is not clear that a single ML model that is never turned off would be trivially capable of learning to do virtually everything needed to build smartphones, spaceships, and whatever else humans might not have discovered yet that is necessary to expand through space; and even if that is possible, what will most likely happen (and sooner) is a society full of many specialized agents (cf. CAIS)."

Replies from: Viliam
comment by Viliam · 2022-06-15T17:16:46.039Z · LW(p) · GW(p)

what happened with society is we have many specialized agents, and that's probably what will happen as AIs automate our economy, as AIs specialized in one domain will be better than general ones at specific tasks.

Humans specialize, because their brains are limited.

If an AI with certain computing capacity has to choose whether to be an expert at X or an expert at Y, an AI with twice as much capacity could choose to be an expert on both X and Y.

From this perspective, maybe a human-level AI is not a risk, but a humankind-level AI could be.

comment by Michaël Trazzi (mtrazzi) · 2022-06-15T09:05:23.657Z · LW(p) · GW(p)

Thanks for bringing up the rest of the conversation. It is indeed unfortunate that I cut certain quotes out of their full context. For completeness' sake, here is the full excerpt without interruptions, including my prompts. Emphasis mine.

Michaël: Got you. And I think Yann LeCun’s point is that there is no such thing as AGI because it’s impossible to build something truly general across all domains.

Blake: That’s right. So that is indeed one of the sources of my concerns as well. I would say I have two concerns with the terminology AGI, but let’s start with Yann’s, which he’s articulated a few times. And as I said, I agree with him on it. We know from the no free lunch theorem that you cannot have a learning algorithm that outperforms all other learning algorithms across all tasks. It’s just an impossibility. So necessarily, any learning algorithm is going to have certain things that it’s good at and certain things that it’s bad at. Or alternatively, if it’s truly a Jack of all trades, it’s going to be just mediocre at everything. Right? So with that reality in place, you can say concretely that if you take AGI to mean literally good at anything, it’s just an impossibility, it cannot exist. And that’s been mathematically proven.

Blake: Now, all that being said, the proof for the no free lunch theorem refers to all possible tasks. And that’s a very different thing from the set of tasks that we might actually care about. Right?

Michaël: Right.

Blake: Because the set of all possible tasks will include some really bizarre stuff that we certainly don’t need our AI systems to do. And in that case, we can ask, “Well, might there be a system that is good at all the sorts of tasks that we might want it to do?” Here, we don’t have a mathematical proof, but again, I suspect Yann’s intuition is similar to mine, which is that you could have systems that are good at a remarkably wide range of things, but it’s not going to cover everything you could possibly hope to do with AI or want to do with AI.

Blake: At some point, you’re going to have to decide where your system is actually going to place its bets as it were. And that can be as general as say a human being. So we could, of course, obviously humans are a proof of concept that way. We know that an intelligence with a level of generality equivalent to humans is possible and maybe it’s even possible to have an intelligence that is even more general than humans to some extent. I wouldn’t discount it as a possibility, but I don’t think you’re ever going to have something that can truly do anything you want, whether it be protein folding, predictions, managing traffic, manufacturing new materials, and also having a conversation with you about your grand’s latest visit that can’t be… There is going to be no system that does all of that for you.

Michaël: So we will have systems that do those things separately, but not at the same time?

Blake: Yeah, exactly. I think that we will have AI systems that are good at different domains. So, we might have AI systems that are good for scientific discovery, AI systems that are good for motor control and robotics, AI systems that are good for general conversation and being assistants for people, all these sorts of things, but not a single system that does it all for you.

Michaël: Why do you think that?

Blake: Well, I think that just because of the practical realities that one finds when one trains these networks. So, what has happened with, for example, scaling laws? And I said this to Ethan the other day on Twitter. What’s happened with scaling laws is that we’ve seen really impressive ability to transfer to related tasks. So if you train a large language model, it can transfer to a whole bunch of language-related stuff, very impressively. And there’s been some funny work that shows that it can even transfer to some out-of-domain stuff a bit, but there hasn’t been any convincing demonstration that it transfers to anything you want. And in fact, I think that the recent paper… The Gato paper from DeepMind actually shows, if you look at their data, that they’re still getting better transfer effects if you train in domain than if you train across all possible tasks.

comment by Brian Edwards (brian-edwards) · 2022-06-15T00:18:20.359Z · LW(p) · GW(p)

If that's what "general" means, why not just say "conscious AI"? I suspect the answer is because the field has already come to terms with the fact that conscious machines are philosophically unattainable. Another word was needed that was both sufficiently meaningful and also sufficiently meaningless to refocus (or more accurately misdirect) attention to "The Thing Humans Do That Machines Don't That Is Very Useful".

The burden of defining concepts like "AGI" is on the true believers, not the skeptics. Labeling someone "disappointingly stupid" who isn't making any non-falsifiable claims about binary systems doing the "sort of stuff I can do", and simply making fun of your critics for lacking sufficient imagination to comprehend your epistemically incoherent claims, is nothing more than lazy burden shifting.

I do get a kick out of statements like "but you can't explain to me how you recognize a cat" as if the epistemically weak explanations for human general intelligence excuse or even somehow validate epistemically weak explanations for AGI.

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2022-06-15T02:58:26.908Z · LW(p) · GW(p)

If that's what "general" means, why not just say "conscious AI"?

What? What does consciousness have to do with 'the thing humans do that let us land on the Moon, invent plastic, etc.'?

I suspect the answer is because the field has already come to terms with the fact that conscious machines are philosophically unattainable.

Huh? Seems neither relevant nor true. (Like, it's obviously false that conscious machines are impossible, and obviously false that there's consensus 'conscious machines are impossible'.) I don't understand what you even mean here.

Labeling someone "disappointingly stupid" who isn't making any non falsifiable claims about binary systems doing the "sort of stuff I can do". 

The dumb thing isn't rejecting the label "AGI". It's thinking that this rejection matters for any of the arguments about AGI risk.

... Also, how on earth is the idea of AGI unfalsifiable? In a world where machines melt whenever you try to get them to par-human capabilities on science, you would have falsified AGI.

Replies from: JBlack, brian-edwards
comment by JBlack · 2022-06-15T23:39:40.505Z · LW(p) · GW(p)

The concept "AGI" is indeed unfalsifiable, and we shouldn't expect it to be falsifiable. It's a descriptor, not a model or a prediction, and so is the wrong type of thing for falsifiability. It's also slightly vague, but that's a different objection altogether.

The nearest corresponding falsifiable predictions are along the lines of "an AGI will be publicly verified to exist by year X", for every value of X, or maybe "this particular design with these specifications will create AGI". Lots of these have already been falsified. Others can be, but have not yet been. There are of course also predictions involving the descriptor "AGI" that cannot be falsified.

comment by Brian Edwards (brian-edwards) · 2022-06-16T13:14:55.250Z · LW(p) · GW(p)

"AGI" doesn't actually make ANY claim at all. That is my primary point, it is an utterly useless term, other than that is sufficiently meaningful and meaningless at the same time that it can be the basis for conveying an intangible concept.

YOU, specifically, have not made a single claim that can be falsified. Please point me at your claim if you think I missed it.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-06-15T06:00:13.141Z · LW(p) · GW(p)

I'd like to understand this perspective better:

I suppose what would change my mind on this is, ... You couldn’t find weird behaviors, no matter how hard you tried. It always seemed to be doing intelligent things. Then I would really buy it. I think what’s interesting about the existing systems, is they’re very impressive and it’s pretty crazy what they can do, but it doesn’t take that much probing to also find weird silly behaviors still. Now maybe those silly behaviors will disappear in another couple orders of magnitude in which case I will probably take a step back and go, “Well, maybe scale is all you need”.

Blake and many other people I know seem to think that weird silly behaviors mean we aren't close to AGI. Whereas I think the AGI that accelerates R&D, takes over the world, etc. may do so while also exhibiting occasional weird silly behaviors. AIs are not humans, they are going to be better in some areas and worse in others, and their failures will sometimes be similar to human failures but not always. They'll have various weird silly deficiencies, just as (from their perspective) we have various weird silly deficiencies.
 

comment by Søren Elverlin (soren-elverlin-1) · 2022-06-15T08:58:24.230Z · LW(p) · GW(p)

The Gato paper from DeepMind actually shows, if you look at their data, that they’re still getting better transfer effects if you train in domain than if you train across all possible tasks.

This probably refers to figure 9 in A Generalist Agent, which compares generalization given:

  1. Training in irrelevant domain (Blue line)
  2. Training in relevant domain (Green line)
  3. Training in both domains (Yellow line)

From DeepMind's results in the figure, it looks like 3. almost always outperforms 2., though I would hesitate to draw strong conclusions from this figure (or Gato in general).

(Blake Richards' claim is trivially true given a fixed number of tasks or episodes. Ceteris paribus, you'll get better results from more relevant data.)