Yudkowsky and Christiano discuss "Takeoff Speeds"

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-22 · 176 comments

Contents

    5.5. Comments on "Takeoff Speeds"
        Slower takeoff means faster progress
        Operationalizing slow takeoff
        The basic argument
        Humans vs. chimps
        AGI will be a side-effect
        Finding the secret sauce
        Universality thresholds
        "Understanding" is discontinuous
        Deployment lag
        Recursive self-improvement
        Train vs. test
        Discontinuities at 100% automation
        The weight of evidence
    5.6. Yudkowsky/Christiano discussion: AI progress and crossover points
    5.7. Legal economic growth
    5.8. TPUs and GPUs, and automating AI R&D
    5.9. Smooth exponentials vs. jumps in income
    5.10. Late-stage predictions
    6. Follow-ups on "Takeoff Speeds"
        6.1. Eliezer Yudkowsky's commentary

This is a transcription of Eliezer Yudkowsky responding to Paul Christiano's Takeoff Speeds live on Sep. 14, followed by a conversation between Eliezer and Paul. This discussion took place after Eliezer's conversation with Richard Ngo.

 

Color key:

Chat by Paul and Eliezer · Other chat · Inline comments

 

5.5. Comments on "Takeoff Speeds"

 

[Yudkowsky][10:14]  (Nov. 22 follow-up comment) 

(This was in response to an earlier request by Richard Ngo that I respond to Paul on Takeoff Speeds.)

[Yudkowsky][16:52] 

maybe I'll try liveblogging some https://sideways-view.com/2018/02/24/takeoff-speeds/ here in the meanwhile

 

Slower takeoff means faster progress

[Yudkowsky][16:57] 


The main disagreement is not about what will happen once we have a superintelligent AI, it’s about what will happen before we have a superintelligent AI. So slow takeoff seems to mean that AI has a larger impact on the world, sooner.

It seems to me to be disingenuous to phrase it this way, given that slow-takeoff views usually imply that AI has a large impact later relative to right now (2021), even if they imply that AI impacts the world "earlier" relative to "when superintelligence becomes reachable".

"When superintelligence becomes reachable" is not a fixed point in time that doesn't depend on what you believe about cognitive scaling. The correct graph is, in fact, the one where the "slow" line starts a bit before "fast" peaks and ramps up slowly, reaching a high point later than "fast". It's a nice try at reconciliation with the imagined Other, but it fails and falls flat.

This may seem like a minor point, but points like this do add up.

In the fast takeoff scenario, weaker AI systems may have significant impacts but they are nothing compared to the “real” AGI. Whoever builds AGI has a decisive strategic advantage. Growth accelerates from 3%/year to 3000%/year without stopping at 30%/year. And so on.

This again shows failure to engage with the Other's real viewpoint. My mainline view is that growth stays at 5%/year and then everybody falls over dead in 3 seconds and the world gets transformed into paperclips; there's never a point with 3000%/year.

 

Operationalizing slow takeoff

[Yudkowsky][17:01] 

There will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles.

If we allow that consuming and transforming the solar system over the course of a few days is "the first 1 year interval in which world output doubles", then I'm happy to argue that there won't be a 4-year interval with world economic output doubling before then. This, indeed, seems like a massively overdetermined point to me. That said, again, the phrasing is not conducive to conveying the Other's real point of view.
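
To make the operationalization concrete, here is a small sketch (with made-up growth paths, not forecasts) that checks, for a given world-output series, when the first complete 4-year doubling and the first complete 1-year doubling finish. On a path that grows ~5%/year and then jumps abruptly, no 4-year doubling completes before the first 1-year doubling; on a smoothly accelerating path, one does.

```python
import numpy as np

def first_doubling_end(times, output, window):
    """Earliest time t with output(t) >= 2 * output(t - window), or None."""
    base = np.interp(times - window, times, output)   # output `window` years earlier
    ok = (times - window >= times[0]) & (output >= 2 * base)
    return times[np.argmax(ok)] if ok.any() else None

years = np.arange(0, 60, 1 / 12)                      # monthly resolution, hypothetical

# Path A: ~5%/year growth, then an abrupt 100x jump at year 50 (a stand-in
# for "the world gets transformed"; the numbers are arbitrary).
gdp_a = 1.05 ** years
gdp_a[years >= 50] *= 100.0

# Path B: smoothly accelerating growth (growth rate doubles every 10 years).
gdp_b = np.exp(0.05 * 10 / np.log(2) * (2 ** (years / 10) - 1))

for name, gdp in [("abrupt path", gdp_a), ("smooth acceleration", gdp_b)]:
    print(name,
          "| first 4-year doubling ends:", first_doubling_end(years, gdp, 4.0),
          "| first 1-year doubling ends:", first_doubling_end(years, gdp, 1.0))
# On the abrupt path both doublings finish at the same moment (the jump);
# on the smoothly accelerating path the 4-year doubling finishes years earlier.
```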

I believe that before we have incredibly powerful AI, we will have AI which is merely very powerful.

Statements like these are very often "true, but not the way the person visualized them". Before anybody built the first critical nuclear pile in a squash court at the University of Chicago, was there a pile that was almost but not quite critical? Yes, one hour earlier. Did people already build nuclear systems and experiment with them? Yes, but they didn't have much in the way of net power output. Did the Wright Brothers build prototypes before the Flyer? Yes, but those weren't prototypes that flew, just 80% slower.

I guarantee you that, whatever the fast takeoff scenario, there will be some way to look over the development history, and nod wisely and say, "Ah, yes, see, this was not unprecedented, here are these earlier systems which presaged the final system!" Maybe you could even look back to today and say that about GPT-3, yup, totally presaging stuff all over the place, great. But it isn't transforming society because it's not over the social-transformation threshold.

AlphaFold presaged AlphaFold 2 but AlphaFold 2 is good enough to start replacing other ways of determining protein conformations and AlphaFold is not; and then neither of those has much impacted the real world, because in the real world we can already design a vaccine in a day and the rest of the time is bureaucratic time rather than technology time, and that goes on until we have an AI over the threshold to bypass bureaucracy.

Before there's an AI that can act while fully concealing its acts from the programmers, there will be an AI (albeit perhaps only 2 hours earlier) which can act while only concealing 95% of the meaning of its acts from the operators.

And that AI will not actually originate any actions, because it doesn't want to get caught; there's a discontinuity in the instrumental incentives between expecting 95% obscuration, being moderately sure of 100% obscuration, and being very certain of 100% obscuration.

Before that AI grasps the big picture and starts planning to avoid actions that operators detect as bad, there will be some little AI that partially grasps the big picture and tries to avoid some things that would be detected as bad; and the operators will (mainline) say "Yay what a good AI, it knows to avoid things we think are bad!" or (death with unrealistic amounts of dignity) say "oh noes the prophecies are coming true" and back off and start trying to align it, but they will not be able to align it, and if they don't proceed anyways to destroy the world, somebody else will proceed anyways to destroy the world.

There is always some step of the process that you can point to which is continuous on some level.

The real world is allowed to do discontinuous things to you anyways.

There is not necessarily a presage of 9/11 where somebody flies a small plane into a building and kills 100 people, before anybody flies 4 big planes into 3 buildings and kills 3000 people; and even if there is some presaging event like that, which would not surprise me at all, the rest of the world's response to the two cases was evidently discontinuous. You do not necessarily wake up to a news story that is 10% of the news story of 2001/09/11, one year before 2001/09/11, written in 10% of the font size on the front page of the paper.

Physics is continuous but it doesn't always yield things that "look smooth to a human brain". Some kinds of processes converge to continuity in strong ways where you can throw discontinuous things in them and they still end up continuous, which is among the reasons why I expect world GDP to stay on trend up until the world ends abruptly; because world GDP is one of those things that wants to stay on a track, and an AGI building a nanosystem can go off that track without being pushed back onto it.
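
A toy simulation of that aggregation point (invented numbers, purely illustrative): many sectors that individually jump discontinuously still sum to a smooth aggregate, whereas a single term that is large relative to the whole aggregate does not get smoothed away.

```python
import numpy as np

rng = np.random.default_rng(0)
steps, sectors = 200, 1000

# Each sector grows ~2% per step and, with 1% probability, jumps 50% overnight.
g = np.full((steps, sectors), 1.02)
g[rng.random((steps, sectors)) < 0.01] = 1.5
paths = np.cumprod(g, axis=0)
aggregate = paths.sum(axis=1)

step_growth = aggregate[1:] / aggregate[:-1]
print("aggregate one-step growth: min %.3f, max %.3f"
      % (step_growth.min(), step_growth.max()))
# The individual discontinuities wash out; aggregate growth stays in a narrow band.

# A term that dwarfs the rest does not wash out: suppose a single new line of
# production comes online at 10x the entire prior economy in one step.
print("one-step growth with such a term: %.2f"
      % ((aggregate[-1] * 1.02 + 10 * aggregate[-1]) / aggregate[-1]))
```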

In particular, this means that incredibly powerful AI will emerge in a world where crazy stuff is already happening (and probably everyone is already freaking out).

Like the way they're freaking out about Covid (itself a nicely smooth process that comes in locally pretty predictable waves) by going doobedoobedoo and letting the FDA carry on its leisurely pace; and not scrambling to build more vaccine factories, now that the rich countries have mostly got theirs? Does this sound like a statement from a history book, or from an EA imagining an unreal world where lots of other people behave like EAs? There is a pleasure in imagining a world where suddenly a Big Thing happens that proves we were right and suddenly people start paying attention to our thing, the way we imagine they should pay attention to our thing, now that it's attention-grabbing; and then suddenly all our favorite policies are on the table!

You could, in a sense, say that our world is freaking out about Covid; but it is not freaking out in anything remotely like the way an EA would freak out; and all the things an EA would immediately do if an EA freaked out about Covid, are not even on the table for discussion when politicians meet. They have their own ways of reacting. (Note: this is not commentary on hard vs soft takeoff per se, just a general commentary on the whole document seeming to me to... fall into a trap of finding self-congruent things to imagine and imagining them.)

 
The basic argument

[Yudkowsky][17:22] 

Before we have an incredibly intelligent AI, we will probably have a slightly worse AI.

This is very often the sort of thing where you can look back and say that it was true, in some sense, but that this ended up being irrelevant because the slightly worse AI wasn't what provided the exciting result which led to a boardroom decision to go all in and invest $100M on scaling the AI.

In other words, it is the sort of argument where the premise is allowed to be true if you look hard enough for a way to say it was true, but the conclusion ends up false because it wasn't the relevant kind of truth.

A slightly-worse-than-incredibly-intelligent AI would radically transform the world, leading to growth (almost) as fast and military capabilities (almost) as great as an incredibly intelligent AI.

This strikes me as a massively invalid reasoning step. Let me count the ways.

First, there is a generally invalid step in supposing that because a previous AI is a technological precursor which has 19 out of 20 critical insights, it has 95% of the later AI's IQ when applied to similar domains. When you count stuff like "multiplying tensors by matrices" and "ReLUs" and "training using TPUs", then AlphaGo only contained a very small amount of innovation relative to previous AI technology, and yet it broke trends on Go performance. You could point to all kinds of incremental technological precursors to AlphaGo in terms of AI technology, but they wouldn't be smooth precursors on a graph of Go-playing ability.

Second, there's discontinuities of the environment to which intelligence can be applied. 95% concealment is not the same as 100% concealment in its strategic implications; an AI capable of 95% concealment bides its time and hides its capabilities, an AI capable of 100% concealment strikes. An AI that can design nanofactories that aren't good enough to, euphemistically speaking, create two cellwise-identical strawberries and put them on a plate, is one that (its operators know) would earn unwelcome attention if its earlier capabilities were demonstrated, and those capabilities wouldn't save the world, so the operators bide their time. The AGI tech will, I mostly expect, work for building self-driving cars, but if it does not also work for manipulating the minds of bureaucrats (which is not advised for a system you are trying to keep corrigible and aligned because human manipulation is the most dangerous domain), the AI is not able to put those self-driving cars on roads. What good does it do to design a vaccine in an hour instead of a day? Vaccine design times are no longer the main obstacle to deploying vaccines.

Third, there's the entire thing with recursive self-improvement, which, no, is not something humans have experience with, we do not have access to and documentation of our own source code and the ability to branch ourselves and try experiments with it. The technological precursor of an AI that designs an improved version of itself, may perhaps, in the fantasy of 95% intelligence, be an AI that was being internally deployed inside Deepmind on a dozen other experiments, tentatively helping to build smaller AIs. Then the next generation of that AI is deployed on itself, produces an AI substantially better at rebuilding AIs, it rebuilds itself, they get excited and dump in 10X the GPU time while having a serious debate about whether or not to alert Holden (they decide against it), that builds something deeply general instead of shallowly general, that figures out there are humans and it needs to hide capabilities from them, and covertly does some actual deep thinking about AGI designs, and builds a hidden version of itself elsewhere on the Internet, which runs for longer and steals GPUs and tries experiments and gets to the superintelligent level.

Now, to be very clear, this is not the only line of possibility. And I emphasize this because I think there's a common failure mode where, when I try to sketch a concrete counterexample to the claim that smooth technological precursors yield smooth outputs, people imagine that this exact concrete scenario is the lynchpin of Eliezer's whole worldview, the big key thing that Eliezer thinks is important, and that the smallest deviation from it they can imagine thereby obviates my worldview. This is not the case here. I am simply exhibiting non-ruled-out models which obey the premise "there was a precursor containing 95% of the code" and which disobey the conclusion "there were precursors with 95% of the environmental impact", thereby showing this to be an invalid reasoning step.

This is also, of course, as Sideways View admits but says "eh it was just the one time", not true about chimps and humans. Chimps have 95% of the brain tech (at least), but not 10% of the environmental impact.

A very large amount of this whole document, from my perspective, is just trying over and over again to pump the invalid intuition that design precursors with 95% of the technology should at least have 10% of the impact. There are a lot of cases in the history of startups and the world where this is false. I am having trouble thinking of a clear case in point where it is true. Where's the earlier company that had 95% of Jeff Bezos's ideas and now has 10% of Amazon's market cap? Where's the earlier crypto paper that had all but one of Satoshi's ideas and which spawned a cryptocurrency a year before Bitcoin which did 10% as many transactions? Where's the nonhuman primate that learns to drive a car with only 10x the accident rate of a human driver, since (you could argue) that's mostly visuo-spatial skills without much visible dependence on complicated abstract general thought? Where are the chimpanzees with spaceships that get 10% of the way to the Moon?

When you get smooth input-output conversions they're not usually conversions from technology->cognition->impact!

 

Humans vs. chimps

[Yudkowsky][18:38] 

Summary of my response: chimps are nearly useless because they aren’t optimized to be useful, not because evolution was trying to make something useful and wasn’t able to succeed until it got to humans.

Chimps are nearly useless because they're not general, and doing anything on the scale of building a nuclear plant requires mastering so many different nonancestral domains that it's no wonder natural selection didn't happen to separately train any single creature across enough different domains that it had evolved to solve every kind of domain-specific problem involved in solving nuclear physics and chemistry and metallurgy and thermics in order to build the first nuclear plant in advance of any old nuclear plants existing.

Humans are general enough that the same braintech selected just for chipping flint handaxes and making water-pouches and outwitting other humans, happened to be general enough that it could scale up to solving all the problems of building a nuclear plant - albeit with some added cognitive tech that didn't require new brainware, and so could happen incredibly fast relative to the generation times for evolutionarily optimized brainware.

Now, since neither humans nor chimps were optimized to be "useful" (general), and humans just wandered into a sufficiently general part of the space that it cascaded up to wider generality, we should legit expect the curve of generality to look at least somewhat different if we're optimizing for that.

Eg, right now people are trying to optimize for generality with AIs like MuZero and GPT-3.

In both cases we have a weirdly shallow kind of generality. Neither is as smart or as deeply general as a chimp, but they are respectively better than chimps at a wide variety of Atari games, or a wide variety of problems that can be superposed onto generating typical human text.

They are, in a sense, more general than a biological organism at a similar stage of cognitive evolution, with much less complex and architected brains, in virtue of having been trained, not just on wider datasets, but on bigger datasets using gradient-descent memorization of shallower patterns, so they can cover those wide domains while being stupider and lacking some deep aspects of architecture.

It is not clear to me that we can go from observations like this, to conclude that there is a dominant mainline probability for how the future clearly ought to go and that this dominant mainline is, "Well, before you get human-level depth and generalization of general intelligence, you get something with 95% depth that covers 80% of the domains for 10% of the pragmatic impact".

...or whatever the concept is here, because this whole conversation is, on my own worldview, being conducted in a shallow way relative to the kind of analysis I did in Intelligence Explosion Microeconomics, where I was like, "here is the historical observation, here is what I think it tells us that puts a lower bound on this input-output curve".

So I don’t think the example of evolution tells us much about whether the continuous change story applies to intelligence. This case is potentially missing the key element that drives the continuous change story—optimization for performance. Evolution changes continuously on the narrow metric it is optimizing, but can change extremely rapidly on other metrics. For human technology, features of the technology that aren’t being optimized change rapidly all the time. When humans build AI, they will be optimizing for usefulness, and so progress in usefulness is much more likely to be linear.

Put another way: the difference between chimps and humans stands in stark contrast to the normal pattern of human technological development. We might therefore infer that intelligence is very unlike other technologies. But the difference between evolution’s optimization and our optimization seems like a much more parsimonious explanation. To be a little bit more precise and Bayesian: the prior probability of the story I’ve told upper bounds the possible update about the nature of intelligence.

If you look closely at this, it's not saying, "Well, I know why there was this huge leap in performance in human intelligence being optimized for other things, and it's an investment-output curve that's composed of these curves, which look like this, and if you rearrange these curves for the case of humans building AGI, they would look like this instead." Unfair demand for rigor? But that is the kind of argument I was making in Intelligence Explosion Microeconomics!

There's an argument from ignorance at the core of all this. It says, "Well, this happened when evolution was doing X. But here Y will be happening instead. So maybe things will go differently! And maybe the relation between AI tech level over time and real-world impact on GDP will look like the relation between tech investment over time and raw tech metrics over time in industries where that's a smooth graph! Because the discontinuity for chimps and humans was because evolution wasn't investing in real-world impact, but humans will be investing directly in that, so the relationship could be smooth, because smooth things are default, and the history is different so not applicable, and who knows what's inside that black box so my default intuition applies which says smoothness."

But we do know more than this.

We know, for example, that evolution being able to stumble across humans, implies that you can add a small design enhancement to something optimized across the chimpanzee domains, and end up with something that generalizes much more widely.

It says that there's stuff in the underlying algorithmic space, in the design space, where you move a bump and get a lump of capability out the other side.

It's a remarkable fact about gradient descent that it can memorize a certain set of shallower patterns at much higher rates, at much higher bandwidth, than evolution lays down genes - something shallower than biological memory, shallower than genes, but distributing across computer cores and thereby able to process larger datasets than biological organisms, even if it only learns shallow things.

This has provided an alternate avenue toward some cognitive domains.

But that doesn't mean that the deep stuff isn't there, and can't be run across, or that it will never be run across in the history of AI before shallow non-widely-generalizing stuff is able to make its way through the regulatory processes and have a huge impact on GDP.

There are in fact ways to eat whole swaths of domains at once.

The history of hominid evolution tells us this or very strongly hints it, even though evolution wasn't explicitly optimizing for GDP impact.

Natural selection moves by adding genes, and not too many of them.

If so many domains got added at once to humans, relative to chimps, there must be a way to do that, more or less, by adding not too many genes onto a chimp, who in turn contains only genes that did well on chimp-stuff.

You can imagine that AI technology never runs across any core that generalizes this well, until GDP has had a chance to double over 4 years because shallow stuff that generalized less well has somehow had a chance to make its way through the whole economy and get adopted that widely despite all real-world regulatory barriers and reluctances, but your imagining that does not make it so.

There's the potential in design space to pull off things as wide as humans.

The path that evolution took there doesn't lead through things that generalized 95% as well as humans first for 10% of the impact, not because evolution wasn't optimizing for that, but because that's not how the underlying cognitive technology worked.

There may be different cognitive technology that could follow a path like that. Gradient descent follows a path somewhat more in that direction along that axis - provided that you deal in systems that are giant layer cakes of transformers and that's your whole input-output relationship; matters are different if we're talking about MuZero instead of GPT-3.

But this whole document is presenting the case of "ah yes, well, by default, of course, we intuitively expect gargantuan impacts to be presaged by enormous impacts, and sure humans and chimps weren't like our intuition, but that's all invalid because circumstances were different, so we go back to that intuition as a strong default" and actually it's postulating, like, a specific input-output curve that isn't the input-output curve we know about. It's asking for a specific miracle. It's saying, "What if AI technology goes just like this, in the future?" and hiding that under a cover of "Well, of course that's the default, it's such a strong default that we should start from there as a point of departure, consider the arguments in Intelligence Explosion Microeconomics, find ways that they might not be true because evolution is different, dismiss them, and go back to our point of departure."

And evolution is different but that doesn't mean that the path AI takes is going to yield this specific behavior, especially when AI would need, in some sense, to miss the core that generalizes very widely, or rather, have run across noncore things that generalize widely enough to have this much economic impact before it runs across the core that generalizes widely.

And you may say, "Well, but I don't care that much about GDP, I care about pivotal acts."

But then I want to call your attention to the fact that this document was written about GDP, despite all the extra burdensome assumptions involved in supposing that intermediate AI advancements could break through all barriers to truly massive-scale adoption and end up reflected in GDP, and then proceed to double the world economy over 4 years during which not enough further AI advancement occurred to find a widely generalizing thing like humans have and end the world. This is indicative of a basic problem in this whole way of thinking that wanted smooth impacts over smoothly changing time. You should not be saying, "Oh, well, leave the GDP part out then," you should be doubting the whole way of thinking.

To be a little bit more precise and Bayesian: the prior probability of the story I’ve told upper bounds the possible update about the nature of intelligence.

Prior probabilities of specifically-reality-constraining theories that excuse away the few contradictory datapoints we have, often aren't that great; and when we start to stake our whole imaginations of the future on them, we depart from the mainline into our more comfortable private fantasy worlds.

 

AGI will be a side-effect

[Yudkowsky][19:29] 

Summary of my response: I expect people to see AGI coming and to invest heavily.

This section is arguing from within its own weird paradigm, and its subject matter mostly causes me to shrug; I never expected AGI to be a side-effect, except in the obvious sense that lots of tributary tech will be developed while optimizing for other things. The world will be ended by an explicitly AGI project because I do expect that it is rather easier to build an AGI on purpose than by accident.

(I furthermore rather expect that it will be a research project and a prototype, because the great gap between prototypes and commercializable technology will ensure that prototypes are much more advanced than whatever is currently commercializable. They will have eyes out for commercial applications, and whatever breakthrough they made will seem like it has obvious commercial applications, at the time when all hell starts to break loose. (After all hell starts to break loose, things get less well defined in my social models, and also choppier for a time in my AI models - the turbulence only starts to clear up once you start to rise out of the atmosphere.))

 

Finding the secret sauce

[Yudkowsky][19:40] 

Summary of my response: this doesn’t seem common historically, and I don’t see why we’d expect AGI to be more rather than less like this (unless we accept one of the other arguments)

[...]

To the extent that fast takeoff proponent’s views are informed by historical example, I would love to get some canonical examples that they think best exemplify this pattern so that we can have a more concrete discussion about those examples and what they suggest about AI.

...humans and chimps?

...fission weapons?

...AlphaGo?

...the Wright Brothers focusing on stability and building a wind tunnel?

...AlphaFold 2 coming out of Deepmind and shocking the heck out of everyone in the field of protein folding with performance far better than they expected even after the previous shock of AlphaFold, by combining many pieces that I suppose you could find precedents for scattered around the AI field, but with those many secret sauces all combined in one place by the meta-secret-sauce of "Deepmind alone actually knows how to combine that stuff and build things that complicated without a prior example"?

...humans and chimps again because this is really actually a quite important example because of what it tells us about what kind of possibilities exist in the underlying design space of cognitive systems?

Historical AI applications have had a relatively small loading on key-insights and seem like the closest analogies to AGI.

...Transformers as the key to text prediction?

The case of humans and chimps, even if evolution didn't do it on purpose, is telling us something about underlying mechanics.

The reason the jump to lightspeed didn't look like evolution slowly developing a range of intelligent species competing to exploit an ecological niche 5% better, or like the way that a stable non-Silicon-Valley manufacturing industry looks like a group of competitors summing up a lot of incremental tech enhancements to produce something with 10% higher scores on a benchmark every year, is that developing intelligence is a case where a relatively narrow technology by biological standards just happened to do a huge amount of stuff without that requiring developing whole new fleets of other biological capabilities.

So it looked like building a Wright Flyer that flies or a nuclear pile that reaches criticality, instead of looking like being in a stable manufacturing industry where a lot of little innovations sum to 10% better benchmark performance every year.

So, therefore, there is stuff in the design space that does that. It is possible to build humans.

Maybe you can build things other than humans first, maybe they hang around for a few years. If you count GPT-3 as "things other than human", that clock has already started for all the good it does. But humans don't get any less possible.

From my perspective, this whole document feels like one very long filibuster of "Smooth outputs are default. Smooth outputs are default. Pay no attention to this case of non-smooth output. Pay no attention to this other case either. All the non-smooth outputs are not in the right reference class. (Highly competitive manufacturing industries with lots of competitors are totally in the right reference class though. I'm not going to make that case explicitly because then you might think of how it might be wrong, I'm just going to let that implicit thought percolate at the back of your mind.) If we just talk a lot about smooth outputs and list ways that nonsmooth output producers aren't necessarily the same and arguments for nonsmooth outputs could fail, we get to go back to the intuition of smooth outputs. (We're not even going to discuss particular smooth outputs as cases in point, because then you might see how those cases might not apply. It's just the default. Not because we say so out loud, but because we talk a lot like that's the conclusion you're supposed to arrive at after reading.)"

I deny the implicit meta-level assertion of this entire essay which would implicitly have you accept as valid reasoning the argument structure, "Ah, yes, given the way this essay is written, we must totally have pretty strong prior reasons to believe in smooth outputs - just implicitly think of some smooth outputs, that's a reference class, now you have strong reason to believe that AGI output is smooth - we're not even going to argue this prior, just talk like it's there - now let us consider the arguments against smooth outputs - pretty weak, aren't they? we can totally imagine ways they could be wrong? we can totally argue reasons these cases don't apply? So at the end we go back to our strong default of smooth outputs. This essay is written with that conclusion, so that must be where the arguments lead."

Me: "Okay, so what if somebody puts together the pieces required for general intelligence and it scales pretty well with added GPUs and FOOMS? Say, for the human case, that's some perceptual systems with imaginative control, a concept library, episodic memory, realtime procedural skill memory, which is all in chimps, and then we add some reflection to that, and get a human. Only, unlike with humans, once you have a working brain you can make a working brain 100X that large by adding 100X as many GPUs, and it can run some thoughts 10000X as fast. And that is substantially more effective brainpower than was being originally devoted to putting its design together, as it turns out. So it can make a substantially smarter AGI. For concreteness's sake. Reality has been trending well to the Eliezer side of Eliezer, on the Eliezer-Hanson axis, so perhaps you can do it more simply than that."

Simplicio: "Ah, but what if, 5 years before then, somebody puts together some other AI which doesn't work like a human, and generalizes widely enough to have a big economic impact, but not widely enough to improve itself or generalize to AI tech or generalize to everything and end the world, and in 1 year it gets all the mass adoptions required to do whole bunches of stuff out in the real world that current regulations require to be done in various exact ways regardless of technology, and then in the next 4 years it doubles the world economy?"

Me: "Like... what kind of AI, exactly, and why didn't anybody manage to put together a full human-level thingy during those 5 years? Why are we even bothering to think about this whole weirdly specific scenario in the first place?"

Simplicio: "Because if you can put together something that has an enormous impact, you should be able to put together most of the pieces inside it and have a huge impact! Most technologies are like this. I've considered some things that are not like this and concluded they don't apply."

Me: "Especially if we are talking about impact on GDP, it seems to me that most explicit and implicit 'technologies' are not like this at all, actually. There wasn't a cryptocurrency developed a year before Bitcoin using 95% of the ideas which did 10% of the transaction volume, let alone a preatomic bomb. But, like, can you give me any concrete visualization of how this could play out?"

And there is no concrete visualization of how this could play out. Anything I'd have Simplicio say in reply would be unrealistic because there is no concrete visualization they give us. It is not a coincidence that I often use concrete language and concrete examples, and this whole field of argument does not use concrete language or offer concrete examples.

Though if we're sketching scifi scenarios, I suppose one could imagine a group that develops sufficiently advanced GPT-tech and deploys it on Twitter in order to persuade voters and politicians in a few developed countries to institute open borders, along with political systems that can handle open borders, and to permit housing construction, thereby doubling world GDP over 4 years. And since it was possible to use relatively crude AI tech to double world GDP this way, it legitimately takes the whole 4 years after that to develop real AGI that ends the world. FINE. SO WHAT. EVERYONE STILL DIES.

 

Universality thresholds

[Yudkowsky][20:21] 

It’s easy to imagine a weak AI as some kind of handicapped human, with the handicap shrinking over time. Once the handicap goes to 0 we know that the AI will be above the universality threshold. Right now it’s below the universality threshold. So there must be sometime in between where it crosses the universality threshold, and that’s where the fast takeoff is predicted to occur.

But AI isn’t like a handicapped human. Instead, the designers of early AI systems will be trying to make them as useful as possible. So if universality is incredibly helpful, it will appear as early as possible in AI designs; designers will make tradeoffs to get universality at the expense of other desiderata (like cost or speed).

So now we’re almost back to the previous point: is there some secret sauce that gets you to universality, without which you can’t get universality however you try? I think this is unlikely for the reasons given in the previous section.

We know, because humans, that there is humanly-widely-applicable general-intelligence tech.

What this section wants to establish, I think, or needs to establish to carry the argument, is that there is some intelligence tech that is wide enough to double the world economy in 4 years, but not world-endingly scalably wide, which becomes a possible AI tech 4 years before any general-intelligence-tech that will, if you put in enough compute, scale to the ability to do a sufficiently large amount of wide thought to FOOM (or build nanomachines, but if you can build nanomachines you can very likely FOOM from there too if not corrigible).

What it says instead is, "I think we'll get universality much earlier on the equivalent of the biological timeline that has humans and chimps, so the resulting things will be weaker than humans at the point where they first become universal in that sense."

This is very plausibly true.

It doesn't mean that when this exciting result gets 100 times more compute dumped on the project, it takes at least 5 years to get anywhere really interesting from there (while also taking only 1 year to get somewhere sorta-interesting enough that the instantaneous adoption of it will double the world economy over the next 4 years).

It also is plausibly true rather than necessarily true. For example, the thing that becomes universal could also have massive gradient descent shallow powers that are far beyond what primates had at the same age.

Primates weren't already writing code as well as Codex when they started doing deep thinking. They couldn't do precise floating-point arithmetic. Their fastest serial rates of thought were a hell of a lot slower. They had no access to their own code or to their own memory contents etc. etc. etc.

But mostly I just want to call your attention to the immense gap between what this section needs to establish, and what it actually says and argues for.

What it actually argues for is a sort of local technological point: at the moment when generality first arrives, it will be with a brain that is less sophisticated than chimp brains were when they turned human.

It implicitly jumps all the way from there, across a whole lot of elided steps, to the implicit conclusion that this tech or elaborations of it will have smooth output behavior such that at some point the resulting impact is big enough to double the world economy in 4 years, without any further improvements ending the world economy before 4 years.

The underlying argument about how the AI tech might work is plausible. Chimps are insanely complicated. I mostly expect we will have AGI long before anybody is even trying to build anything that complicated.

The very next step of the argument, about capabilities, is already very questionable because this system could be using immense gradient descent capabilities to master domains for which large datasets are available, and hominids did not begin with instinctive great shallow mastery of all domains for which a large dataset could be made available, which is why hominids don't start out playing superhuman Go as soon as somebody tells them the rules and they do one day of self-play, which is the sort of capability that somebody could hook up to a nascent AGI (albeit we could optimistically and fondly and falsely imagine that somebody deliberately didn't floor the gas pedal as far as possible).

Could we have huge impacts out of some subuniversal shallow system that was hooked up to capabilities like this? Maybe, though this is not the argument made by the essay. It would be a specific outcome that isn't forced by anything in particular, but I can't say it's ruled out. Mostly my twin reactions to this are, "If the AI tech is that dumb, how are all the bureaucratic constraints that actually rate-limit economic progress getting bypassed" and "Okay, but ultimately, so what and who cares, how does this modify that we all die?"

There is another reason I’m skeptical about hard takeoff from universality secret sauce: I think we already could make universal AIs if we tried (that would, given enough time, learn on their own and converge to arbitrarily high capability levels), and the reason we don’t is because it’s just not important to performance and the resulting systems would be really slow. This inside view argument is too complicated to make here and I don’t think my case rests on it, but it is relevant to understanding my view.

I have no idea why this argument is being made or where it's heading. I cannot pass the ITT of the author. I don't know what the author thinks this has to do with constraining takeoffs to be slow instead of fast. At best I can conjecture that the author thinks that "hard takeoff" is supposed to derive from "universality" being very sudden and hard to access and late in the game, so if you can argue that universality could be accessed right now, you have defeated the argument for hard takeoff.

 

"Understanding" is discontinuous

[Yudkowsky][20:41] 

Summary of my response: I don’t yet understand this argument and am unsure if there is anything here.

It may be that understanding of the world tends to click, from “not understanding much” to “understanding basically everything.” You might expect this because everything is entangled with everything else.

No, the idea is that a core of overlapping somethingness, trained to handle chipping handaxes and outwitting other monkeys, will generalize to building spaceships; so evolutionarily selecting on understanding a bunch of stuff, eventually ran across general stuff-understanders that understood a bunch more stuff.

Gradient descent may be genuinely different from this, but we shouldn't confuse imagination with knowledge when it comes to extrapolating that difference onward. At present, gradient descent does mass memorization of overlapping shallow patterns, which then combine to yield a weird pseudo-intelligence over domains for which we can deploy massive datasets, without yet generalizing much outside those domains.

We can hypothesize that there is some next step up to some weird thing that is intermediate in generality between gradient descent and humans, but we have not seen it yet, and we should not confuse imagination for knowledge.

If such a thing did exist, it would not necessarily be at the right level of generality to double the world economy in 4 years, without being able to build a better AGI.

If it was at that level of generality, it's nowhere written that no other company will develop a better prototype at a deeper level of generality over those 4 years.

I will also remark that you sure could look at the step from GPT-2 to GPT-3 and say, "Wow, look at the way a whole bunch of stuff just seemed to simultaneously click for GPT-3."

 

Deployment lag

[Yudkowsky][20:49] 

Summary of my response: current AI is slow to deploy and powerful AI will be fast to deploy, but in between there will be AI that takes an intermediate length of time to deploy.

An awful lot of my model of deployment lag is adoption lag and regulatory lag and bureaucratic sclerosis across companies and countries.

If doubling GDP is such a big deal, go open borders and build houses. Oh, that's illegal? Well, so will be AIs building houses!

AI tech that does flawless translation could plausibly come years before AGI, but that doesn't mean all the barriers to international trade and international labor movement and corporate hiring across borders all come down, because those barriers are not all translation barriers.

There's then a discontinuous jump at the point where everybody falls over dead and the AI goes off to do its own thing without FDA approval. This jump is precedented by earlier pre-FOOM prototypes being able to do pre-FOOM cool stuff, maybe, but not necessarily precedented by mass-market adoption of anything major enough to double world GDP.

 

Recursive self-improvement

[Yudkowsky][20:54] 

Summary of my response: Before there is AI that is great at self-improvement there will be AI that is mediocre at self-improvement.

Oh, come on. That is straight-up not how simple continuous toy models of RSI work. Between a neutron multiplication factor of 0.999 and 1.001 there is a huge gap in output behavior.
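
A minimal toy version of that point (nothing here is meant to model an actual AI system): hold everything fixed and change only the multiplication factor from 0.999 to 1.001.

```python
def run(k, pulses=10_000, injection=1.0):
    """Each step, inject one unit and multiply the accumulated level by k."""
    level = 0.0
    for _ in range(pulses):
        level = level * k + injection
    return level

for k in (0.999, 1.001):
    print(k, round(run(k)))
# k = 0.999 settles toward injection / (1 - k) = 1000 and stays there;
# k = 1.001 has no steady state and is already past 20 million after
# 10,000 steps. A 0.2% change in the feedback parameter flips the system
# from bounded to runaway output.
```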

Outside of toy models: Over the last 10,000 years we had humans going from mediocre at improving their mental systems to being (barely) able to throw together AI systems, but 10,000 years is the equivalent of an eyeblink in evolutionary time - outside the metaphor, this says, "A month before there is AI that is great at self-improvement, there will be AI that is mediocre at self-improvement."

(Or possibly an hour before, if reality is again more extreme along the Eliezer-Hanson axis than Eliezer. But it makes little difference whether it's an hour or a month, given anything like current setups.)

This is just pumping hard again on the intuition that says incremental design changes yield smooth output changes, which (the meta-level of the essay informs us wordlessly) is such a strong default that we are entitled to believe it if we can do a good job of weakening the evidence and arguments against it.

And the argument is: Before there are systems great at self-improvement, there will be systems mediocre at self-improvement; implicitly: "before" implies "5 years before" not "5 days before"; implicitly: this will correspond to smooth changes in output between the two regimes even though that is not how continuous feedback loops work.

 

Train vs. test

[Yudkowsky][21:12] 

Summary of my response: before you can train a really powerful AI, someone else can train a slightly worse AI.

Yeah, and before you can evolve a human, you can evolve a Homo erectus, which is a slightly worse human.

If you are able to raise $X to train an AGI that could take over the world, then it was almost certainly worth it for someone 6 months ago to raise $X/2 to train an AGI that could merely radically transform the world, since they would then get 6 months of absurd profits.

I suppose this sentence makes a kind of sense if you assume away alignability and suppose that the previous paragraphs have refuted the notion of FOOMs, self-improvement, and thresholds between compounding returns and non-compounding returns (eg, in the human case, cognitive innovations like "written language" or "science"). If you suppose the previous sections refuted those things, then clearly, if you raised an AGI that you had aligned to "take over the world", it got that way through cognitive powers that weren't the result of FOOMing or other self-improvements, weren't the result of its cognitive powers crossing a threshold from non-compounding to compounding, and weren't the result of its understanding crossing a threshold of universality as the result of chunky universal machinery such as humans gained over chimps, so, implicitly, it must have been the kind of thing that you could learn by gradient descent, and do a half or a tenth as much of by doing half as much gradient descent, in order to build nanomachines a tenth as well-designed that could bypass a tenth as much bureaucracy.

If there are no unsmooth parts of the tech curve, the cognition curve, or the environment curve, then you should be able to make a bunch of wealth using a more primitive version of any technology that could take over the world.

And when we look back at history, why, that may be totally true! They may have deployed universal superhuman translator technology for 6 months, which won't double world GDP, but which a lot of people would pay for, and made a lot of money! Because even though there's no company that built 90% of Amazon's website and has 10% the market cap, when you zoom back out to look at whole industries like AI and a technological capstone like AGI, why, those whole industries do sometimes make some money along the way to the technological capstone, if they can find a niche that isn't too regulated! Which translation currently isn't! So maybe somebody used precursor tech to build a superhuman translator and deploy it 6 months earlier and made a bunch of money for 6 months. SO WHAT. EVERYONE STILL DIES.

As for "radically transforming the world" instead of "taking it over", I think that's just re-restated FOOM denialism. Doing either of those things quickly against human bureaucratic resistance strike me as requiring cognitive power levels dangerous enough that failure to align them on corrigibility would result in FOOMs.

Like, if you can do either of those things on purpose, you are doing it by operating in the regime where running the AI with higher bounds on the for loop will FOOM it, but you have politely asked it not to FOOM, please.

If the people doing this have any sense whatsoever, they will refrain from merely massively transforming the world until they are ready to do something that prevents the world from ending.

And if the gap from "massively transforming the world, briefly before it ends" to "preventing the world from ending, lastingly" takes much longer than 6 months to cross, or if other people have the same technologies that scale to "massive transformation", somebody else will build an AI that fooms all the way.

Likewise, if your AGI would give you a decisive strategic advantage, they could have spent less earlier in order to get a pretty large military advantage, which they could then use to take your stuff.

Again, this presupposes some weird model where everyone has easy alignment at the furthest frontiers of capability; everybody has the aligned version of the most rawly powerful AGI they can possibly build; and nobody in the future has the kind of tech advantage that Deepmind currently has; so before you can amp your AGI to the raw power level where it could take over the whole world by using the limit of its mental capacities to military ends - alignment of this being a trivial operation to be assumed away - some other party took their easily-aligned AGI that was less powerful at the limits of its operation, and used it to get 90% as much military power... is the implicit picture here?

Whereas the picture I'm drawing is that the AGI that kills you via "decisive strategic advantage" is the one that foomed and got nanotech, and no, the AI tech from 6 months earlier did not do 95% of a foom and get 95% of the nanotech.

 

Discontinuities at 100% automation

[Yudkowsky][21:31] 

Summary of my response: at the point where humans are completely removed from a process, they will have been modestly improving output rather than acting as a sharp bottleneck that is suddenly removed.

Not very relevant to my whole worldview in the first place; also not a very good description of how horses got removed from transportation by automobiles, or how humans got removed from playing Go.

 

The weight of evidence

[Yudkowsky][21:31] 

We’ve discussed a lot of possible arguments for fast takeoff. Superficially it would be reasonable to believe that no individual argument makes fast takeoff look likely, but that in the aggregate they are convincing.

However, I think each of these factors is perfectly consistent with the continuous change story and continuously accelerating hyperbolic growth, and so none of them undermine that hypothesis at all.

Uh huh. And how about if we have a mirror-universe essay which over and over again treats fast takeoff as the default to be assumed, and painstakingly shows how a bunch of particular arguments for slow takeoff might not be true?

This entire essay seems to me like it's drawn from the same hostile universe that produced Robin Hanson's side of the Yudkowsky-Hanson Foom Debate.

Like, all these abstract arguments devoid of concrete illustrations and "it need not necessarily be like..." and "now that I've shown it's not necessarily like X, well, on the meta-level, I have implicitly told you that you now ought to believe Y".

It just seems very clear to me that the sort of person who is taken in by this essay is the same sort of person who gets taken in by Hanson's arguments in 2008 and gets caught flatfooted by AlphaGo and GPT-3 and AlphaFold 2.

And empirically, it has already been shown to me that I do not have the power to break people out of the hypnosis of nodding along with Hansonian arguments, even by writing much longer essays than this.

Hanson's fond dreams - of domain specificity, of smooth progress for stuff like Go, of somebody else having a precursor 90% as good as AlphaFold 2 before Deepmind builds it, of GPT-3 levels of generality just not being a thing - now stand refuted.

Despite that they're largely being exhibited again in this essay.

And people are still nodding along.

Reality just... doesn't work like this on some deep level.

It doesn't play out the way that people imagine it would play out when they're imagining a certain kind of reassuring abstraction that leads to a smooth world. Reality is less fond of that kind of argument than a certain kind of EA is fond of that argument.

There is a set of intuitive generalizations from experience which rules that out, which I do not know how to convey. There is an understanding of the rules of argument which leads you to roll your eyes at Hansonian arguments and all their locally invalid leaps and snuck-in defaults, instead of nodding along sagely at their wise humility and outside viewing and then going "Huh?" when AlphaGo or GPT-3 debuts. But this, I empirically do not seem to know how to convey to people, in advance of the inevitable and predictable contradiction by a reality which is not as fond of Hansonian dynamics as Hanson. The arguments sound convincing to them.

(Hanson himself has still not gone "Huh?" at the reality, though some of his audience did; perhaps because his abstractions are loftier than his audience's? - because some of his audience, reading along to Hanson, probably implicitly imagined a concrete world in which GPT-3 was not allowed; but maybe Hanson himself is more abstract than this, and didn't imagine anything so merely concrete?)

If I don't respond to essays like this, people find them comforting and nod along. If I do respond, my words are less comforting and more concrete and easier to imagine concrete objections to, less like a long chain of abstractions that sound like the very abstract words in research papers and hence implicitly convincing because they sound like other things you were supposed to believe.

And then there is another essay in 3 months. There is an infinite well of them. I would have to teach people to stop drinking from the well, instead of trying to whack them on the back until they cough up the drinks one by one, or actually, whacking them on the back and then they don't cough them up until reality contradicts them, and then a third of them notice that and cough something up, and then they don't learn the general lesson and go back to the well and drink again. And I don't know how to teach people to stop drinking from the well. I tried to teach that. I failed. If I wrote another Sequence, I have no reason to believe that Sequence would work.

So what EAs will believe at the end of the world, will look like whatever the content was of the latest bucket from the well of infinite slow-takeoff arguments that hasn't yet been blatantly-even-to-them refuted by all the sharp jagged rapidly-generalizing things that happened along the way to the world's end.

And I know, before anyone bothers to say, that all of this reply is not written in the calm way that is right and proper for such arguments. I am tired. I have lost a lot of hope. There are not obvious things I can do, let alone arguments I can make, which I expect to be actually useful in the sense that the world will not end once I do them. I don't have the energy left for calm arguments. What's left is despair that can be given voice.

 
5.6. Yudkowsky/Christiano discussion: AI progress and crossover points

 

[Christiano][22:15] 

To the extent that it was possible to make any predictions about 2015-2020 based on your views, I currently feel like they were much more wrong than right. I’m happy to discuss that. To the extent you are willing to make any bets about 2025, I expect they will be mostly wrong and I’d be happy to get bets on the record (most of all so that it will be more obvious in hindsight whether they are vindication for your view). Not sure if this is the place for that.

Could also make a separate channel to avoid clutter.

[Yudkowsky][22:16] 

Possibly. I think that 2015-2020 played out to a much more Eliezerish side than Eliezer on the Eliezer-Hanson axis, which sure is a case of me being wrong. What bets do you think we'd disagree on for 2025? I expect you have mostly misestimated my views, but I'm always happy to hear about anything concrete.

[Christiano][22:20] 

I think the big points are: (i) I think you are significantly overestimating how large a discontinuity/trend break AlphaZero is, (ii) your view seems to imply that we will move quickly from much worse than humans to much better than humans, but it's likely that we will move slowly through the human range on many tasks. I'm not sure if we can get a bet out of (ii), I think I don't understand your view that well but I don't see how it could make the same predictions as mine over the next 10 years.

[Yudkowsky][22:22] 

What are your 10-year predictions?

[Christiano][22:23] 

My basic expectation is that for any given domain AI systems will gradually increase in usefulness, we will see a crossing over point where their output is comparable to human output, and that from that time we can estimate how long until takeoff by estimating "how long does it take AI systems to get 'twice as impactful'?" which gives you a number like ~1 year rather than weeks. At the crossing over point you get a somewhat rapid change in derivative, since you are looking at (x+y) where y is growing faster than x.
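
A minimal numerical sketch of the "x + y" crossover dynamic described above. The growth rates and starting levels below are illustrative assumptions, not figures from the dialogue; the point is only that the combined growth rate shifts from roughly the human rate to roughly the AI rate around the year the AI term catches up:

```python
# Toy model: total research output is x + y, where x (human output) grows slowly
# and y (AI output) grows quickly. The growth rate of the sum changes fairly
# rapidly around the crossover point, as described above. All numbers are
# illustrative assumptions, not estimates from the dialogue.
import math

human_growth = 0.05   # assumed 5%/year growth in human research output
ai_growth = 1.0       # assumed 100%/year growth in AI research output (~1-year doubling)
x0, y0 = 1.0, 0.01    # assumed starting levels: AI output begins far below human output

for year in range(0, 13):
    x = x0 * math.exp(human_growth * year)
    y = y0 * math.exp(ai_growth * year)
    total = x + y
    # instantaneous growth rate of the combined output
    rate = (human_growth * x + ai_growth * y) / total
    print(f"year {year:2d}: human {x:6.2f}  AI {y:8.2f}  total {total:8.2f}  growth {rate:6.1%}")
```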

I feel like that should translate into different expectations about how impactful AI will be in any given domain---I don't see how to make the ultra-fast-takeoff view work if you think that AI output is increasing smoothly (since the rate of progress at the crossing-over point will be similar to the current rate of progress, unless R&D is scaling up much faster then)

So like, I think we are going to have crappy coding assistants, and then slightly less crappy coding assistants, and so on. And they will be improving the speed of coding very significantly before the end times.

[Yudkowsky][22:25] 

You think in a different language than I do. My more confident statements about AI tech are about what happens after it starts to rise out of the metaphorical atmosphere and the turbulence subsides. When you have minds as early on the cognitive tech tree as humans they sure can get up to some weird stuff, I mean, just look at humans. Now take an utterly alien version of that with its own draw from all the weirdness factors. It sure is going to be pretty weird.

[Christiano][22:26] 

OK, but you keep saying stuff about how people with my dumb views would be "caught flat-footed" by historical developments. Surely to be able to say something like that you need to be making some kind of prediction?

[Yudkowsky][22:26] 

Well, sure, now that Codex has suddenly popped into existence one day at a surprisingly high base level of tech, we should see various jumps in its capability over the years and some outside imitators. What do you think you predict differently about that than I do?

[Christiano][22:26] 

Why do you think codex is a high base level of tech?

The models get better continuously as you scale them up, and the first tech demo is weak enough to be almost useless

[Yudkowsky][22:27] 

I think the next-best coding assistant was, like, not useful.

[Christiano][22:27] 

yes

and it is still not useful

[Yudkowsky][22:27] 

Could be. Some people on HN seemed to think it was useful.

I haven't tried it myself.

[Christiano][22:27] 

OK, I'm happy to take bets

[Yudkowsky][22:28] 

I don't think the previous coding assistant would've been very good at coding an asteroid game, even if you tried a rigged demo at the same degree of rigging?

[Christiano][22:28] 

it's unquestionably a radically better tech demo

[Yudkowsky][22:28] 

Where by "previous" I mean "previously deployed" not "previous generations of prototypes inside OpenAI's lab".

[Christiano][22:28] 

My basic story is that the model gets better and more useful with each doubling (or year of AI research) in a pretty smooth way. So the key underlying parameter for a discontinuity is how soon you build the first version---do you do that before or after it would be a really really big deal?

and the answer seems to be: you do it somewhat before it would be a really big deal

and then it gradually becomes a bigger and bigger deal as people improve it

maybe we are on the same page about getting gradually more and more useful? But I'm still just wondering where the foom comes from

[Yudkowsky][22:30] 

So, like... before we get systems that can FOOM and build nanotech, we should get more primitive systems that can write asteroid games and solve protein folding? Sounds legit.

So that happened, and now your model says that it's fine later on for us to get a FOOM, because we have the tech precursors and so your prophecy has been fulfilled?

[Christiano][22:31] 

no

[Yudkowsky][22:31] 

Didn't think so.

[Christiano][22:31] 

I can't tell if you can't understand what I'm saying, or aren't trying, or do understand and are just saying kind of annoying stuff as a rhetorical flourish

at some point you have an AI system that makes (humans+AI) 2x as good at further AI progress

[Yudkowsky][22:32] 

I know that what I'm saying isn't your viewpoint. I don't know what your viewpoint is or what sort of concrete predictions it makes at all, let alone what such predictions you think are different from mine.

[Christiano][22:32] 

maybe by continuity you can grant the existence of such a system, even if you don't think it will ever exist?

I want to (i) make the prediction that AI will actually have that impact at some point in time, (ii) talk about what happens before and after that

I am talking about AI systems that become continuously more useful, because "become continuously more useful" is what (i) makes me think that AI will have that impact at some point in time, and (ii) allows me to productively reason about what AI will look like before and after that. I expect that your view will say something about why AI improvements either aren't continuous, or why continuous improvements lead to discontinuous jumps in the productivity of the (human+AI) system

[Yudkowsky][22:34] 

at some point you have an AI system that makes (humans+AI) 2x as good at further AI progress

Is this prophecy fulfilled by using some narrow eld-AI algorithm to map out a TPU, and then humans using TPUs can write in 1 month a research paper that would otherwise have taken 2 months? And then we can go on to FOOM now that this prophecy about pre-FOOM states has been fulfilled? I know the answer is no, but I don't know what you think is a narrower condition on the prophecy than that.

[Christiano][22:35] 

If you can use narrow eld-AI in order to make every part of AI research 2x faster, so that the entire field moves 2x faster, then the prophecy is fulfilled

and it may be just another 6 months until it makes all of AI research 2x faster again, and then 3 months, and then...
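
A toy version of the shrinking-doubling-time schedule sketched in the last two messages. The 6-month and 3-month figures come from the message above; the continued halving and everything else is assumed for illustration. The point is that the cumulative time converges - each further doubling adds less, so the whole tail fits inside about 12 more months:

```python
# Toy schedule: each successive 2x speedup of AI research takes half as long as
# the previous one, starting from an assumed 6-month doubling. The cumulative
# elapsed time converges (6 + 3 + 1.5 + ... < 12 months), which is the
# "hyperbolic growth" shape under discussion.
doubling_time = 6.0   # months until the next 2x speedup (from the message above)
elapsed = 0.0
speedup = 2.0         # the field is already assumed to be running 2x faster
for step in range(8):
    elapsed += doubling_time
    speedup *= 2
    print(f"after {elapsed:5.2f} more months: research ~{speedup:4.0f}x faster "
          f"(next doubling: {doubling_time / 2:.2f} months)")
    doubling_time /= 2
```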

[Yudkowsky][22:36] 

What, the entire field? Even writing research papers? Even the journal editors approving and publishing the papers? So if we speed up every part of research except the journal editors, the prophecy has not been fulfilled and no FOOM may take place?

[Christiano][22:36] 

no, I mean the improvement in overall output, given the actual realistic level of bottlenecking that occurs in practice

[Yudkowsky][22:37] 

So if the realistic level of bottlenecking ever becomes dominated by a human gatekeeper, the prophecy is ever unfulfillable and no FOOM may ever occur.

[Christiano][22:37] 

that's what I mean by "2x as good at further progress," the entire system is achieving twice as much

then the prophecy is unfulfillable and I will have been wrong

I mean, I think it's very likely that there will be a hard takeoff, if people refuse or are unable to use AI to accelerate AI progress for reasons unrelated to AI capabilities, and then one day they become willing

[Yudkowsky][22:38] 

...because on your view, the Prophecy necessarily goes through humans and AIs working together to speed up the whole collective field of AI?

[Christiano][22:38] 

it's fine if the AI works alone

the point is just that it overtakes the humans at the point when it is roughly as fast as the humans

why wouldn't it?

why does it overtake the humans when it takes it 10 seconds to double in capability instead of 1 year?

that's like predicting that cultural evolution will be infinitely fast, instead of making the more obvious prediction that it will overtake evolution exactly when it's as fast as evolution

[Yudkowsky][22:39] 

I live in a mental world full of weird prototypes that people are shepherding along to the world's end. I'm not even sure there's a short sentence in my native language that could translate the short Paul-sentence "is roughly as fast as the humans".

[Christiano][22:40] 

do you agree that you can measure the speed with which the community of human AI researchers develop and implement improvements in their AI systems?

like, we can look at how good AI systems are in 2021, and in 2022, and talk about the rate of progress?

[Yudkowsky][22:40] 

...when exactly in hominid history was hominid intelligence exactly as fast as evolutionary optimization???

do you agree that you can measure the speed with which the community of human AI researchers develop and implement improvements in their AI systems?

I mean... obviously not? How the hell would we measure real actual AI progress? What would even be the Y-axis on that graph?

I have a rough intuitive feeling that it was going faster in 2015-2017 than 2018-2020.

"What was?" says the stern skeptic, and I go "I dunno."

[Christiano][22:42] 

Here's a way of measuring progress you won't like: for almost all tasks, you can initially do them with lots of compute, and as technology improves you can do them with less compute. We can measure how fast the amount of compute required is going down.
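
A minimal sketch of how that measurement could be operationalized. The compute figures below are invented purely for illustration; the procedure is just a least-squares fit of log-compute against year to estimate a halving time:

```python
# Hypothetical measurement: compute (arbitrary units) needed to reach a fixed
# benchmark score in successive years. Fitting a line to log2(compute) vs. year
# gives an estimated rate at which the compute requirement falls.
import math

# year -> compute needed for fixed performance (made-up values)
data = {2017: 1000.0, 2018: 520.0, 2019: 260.0, 2020: 140.0, 2021: 70.0}

years = sorted(data)
logs = [math.log2(data[yr]) for yr in years]
n = len(years)
mean_year = sum(years) / n
mean_log = sum(logs) / n
slope = (sum((yr - mean_year) * (lg - mean_log) for yr, lg in zip(years, logs))
         / sum((yr - mean_year) ** 2 for yr in years))  # change in log2(compute) per year
print(f"compute required falls ~{2 ** -slope:.1f}x per year "
      f"(halving time ~{-1 / slope:.2f} years)")
```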

[Yudkowsky][22:43] 

Yeah, that would be a cool thing to measure. It's not obviously a relevant thing to anything important, but it'd be cool to measure.

[Christiano][22:43] 

Another way you won't like: we can hold fixed the resources we invest and look at the quality of outputs in any given domain (or even $ of revenue) and ask how fast it's changing.

[Yudkowsky][22:43] 

I wonder what it would say about Go during the age of AlphaGo.

Or what that second metric would say.

[Christiano][22:43] 

I think it would be completely fine, and you don't really understand what happened with deep learning in board games. Though I also don't know what happened in much detail, so this is more like a prediction than a retrodiction.

But it's enough of a retrodiction that I shouldn't get too much credit for it.

[Yudkowsky][22:44] 

I don't know what result you would consider "completely fine". I didn't have any particular unfine result in mind.

[Christiano][22:45] 

oh, sure

if it was just an honest question happy to use it as a concrete case

I would measure the rate of progress in Go by looking at how fast Elo improves with time or increasing R&D spending

[Yudkowsky][22:45] 

I mean, I don't have strong predictions about it so it's not yet obviously cruxy to me

[Christiano][22:46] 

I'd roughly guess that would continue, and if there were multiple trendlines to extrapolate I'd estimate crossover points based on that
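
A sketch of the trendline-extrapolation procedure just described, with made-up Elo numbers and an assumed top-human reference rating; nothing here is a real measurement:

```python
# Fit Elo vs. year for a hypothetical computer Go engine, then solve for the
# year the fitted line crosses an assumed top-human Elo. Purely illustrative.
engine_elo = {2012: 2350, 2013: 2500, 2014: 2650, 2015: 2800}  # made-up ratings
top_human_elo = 3650                                           # assumed reference

years = sorted(engine_elo)
n = len(years)
mean_year = sum(years) / n
mean_elo = sum(engine_elo[yr] for yr in years) / n
slope = (sum((yr - mean_year) * (engine_elo[yr] - mean_elo) for yr in years)
         / sum((yr - mean_year) ** 2 for yr in years))   # Elo gained per year
intercept = mean_elo - slope * mean_year
crossover_year = (top_human_elo - intercept) / slope
print(f"trend: ~{slope:.0f} Elo/year; extrapolated crossover with top humans "
      f"around {crossover_year:.1f}")
```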

[Yudkowsky][22:47] 

suppose this curve is smooth, and we see that sharp Go progress over time happened because Deepmind dumped in a ton of increased R&D spend. you then argue that this cannot happen with AGI because by the time we get there, people will be pushing hard at the frontiers in a competitive environment where everybody's already spending what they can afford, just like in a highly competitive manufacturing industry.

[Christiano][22:47] 

the key input to making a prediction for AGZ in particular would be the precise form of the dependence on R&D spending, to try to predict the changes as you shift from a single programmer to a large team at DeepMind, but most reasonable functional forms would be roughly right

Yes, it's definitely a prediction of my view that it's easier to improve things that people haven't spent much money on than things people have spent a lot of money on. It's also a separate prediction of my view that people are going to be spending a boatload of money on all of the relevant technologies. Perhaps $1B/year right now and I'm imagining levels of investment large enough to be essentially bottlenecked on the availability of skilled labor.

[Bensinger][22:48] 

( Previous Eliezer-comments about AlphaGo as a break in trend, responding briefly to Miles Brundage: https://twitter.com/ESRogs/status/1337869362678571008 )

 
5.7. Legal economic growth
 

[Yudkowsky][22:49] 

Does your prediction change if all hell breaks loose in 2025 instead of 2055?

[Christiano][22:50] 

I think my prediction was wrong if all hell breaks loose in 2025, if by "all hell breaks loose" you mean "dyson sphere" and not "things feel crazy"

[Yudkowsky][22:50] 

Things feel crazy in the AI field and the world ends less than 4 years later, well before the world economy doubles.

Why was the Prophecy wrong if the world begins final descent in 2025? The Prophecy requires the world to then last until 2029 while doubling its economic output, after which it is permitted to end; but nothing in it obviously forbids the Prophecy from beginning to come true in 2025 instead of 2055.

[Christiano][22:52] 

yes, I just mean that some important underlying assumptions for the prophecy were violated, I wouldn't put much stock in it at that point, etc.

[Yudkowsky][22:53] 

A lot of the issues I have with understanding any of your terminology in concrete Eliezer-language is that it looks to me like the premise-events of your Prophecy are fulfillable in all sorts of ways that don't imply the conclusion-events of the Prophecy.

[Christiano][22:53] 

if "things feel crazy" happens 4 years before dyson sphere, then I think we have to be really careful about what crazy means

[Yudkowsky][22:54] 

a lot of people looking around nervously and privately wondering if Eliezer was right, while public pravda continues to prohibit wondering any such thing out loud, so they all go on thinking that they must be wrong.

[Christiano][22:55] 

OK, by "things get crazy" I mean like hundreds of billions of dollars of spending at google on automating AI R&D

[Yudkowsky][22:55] 

I expect bureaucratic obstacles to prevent much GDP per se from resulting from this.

[Christiano][22:55] 

massive scaleups in semiconductor manufacturing, bidding up prices of inputs crazily

[Yudkowsky][22:55] 

I suppose that much spending could well increase world GDP by hundreds of billions of dollars per year.

[Christiano][22:56] 

massive speculative rises in AI company valuations financing a significant fraction of GWP into AI R&D

(+hardware R&D, +building new clusters, +etc.)

[Yudkowsky][22:56] 

like, higher than Tesla? higher than Bitcoin?

both of these things sure did skyrocket in market cap without that having much of an effect on housing stocks and steel production.

[Christiano][22:57] 

right now I think hardware R&D is on the order of $100B/year, AI R&D is more like $10B/year, I guess I'm betting on something more like trillions? (limited from going higher because of accounting problems and not that much smart money)

I don't think steel production is going up at that point

plausibly going down since you are redirecting manufacturing capacity into making more computers. But probably just staying static while all of the new capacity is going into computers, since cannibalizing existing infrastructure is much more expensive

the original point was: you aren't pulling AlphaZero shit any more, you are competing with an industry that has invested trillions in cumulative R&D

[Yudkowsky][23:00] 

is this in hopes of future profit, or because current profits are already in the trillions?

[Christiano][23:01] 

largely in hopes of future profit / reinvested AI outputs (that have high market cap), but also revenues are probably in the trillions?

[Yudkowsky][23:02] 

this all sure does sound "pretty darn prohibited" on my model, but I'd hope there'd be something earlier than that we could bet on. what does your Prophecy prohibit happening before that sub-prophesied day?

[Christiano][23:02] 

To me your model just seems crazy, and you are saying it predicts crazy stuff at the end but no crazy stuff beforehand, so I don't know what's prohibited. Mostly I feel like I'm making positive predictions, of gradually escalating value of AI in lots of different industries

and rapidly increasing investment in AI

I guess your model can be: those things happen, and then one day the AI explodes?

[Yudkowsky][23:03] 

the main way you get rapidly increasing investment in AI is if there's some way that AI can produce huge profits without that being effectively bureaucratically prohibited - eg this is where we get huge investments in burning electricity and wasting GPUs on Bitcoin mining.

[Christiano][23:03] 

but it seems like you should be predicting e.g. AI quickly jumping to superhuman in lots of domains, and some applications jumping from no value to massive value

I don't understand what you mean by that sentence. Do you think we aren't seeing rapidly increasing investment in AI right now?

or are you talking about increasing investment above some high threshold, or increasing investment at some rate significantly larger than the current rate?

it seems to me like you can pretty seamlessly get up to a few $100B/year of revenue just by redirecting existing tech R&D

[Yudkowsky][23:05] 

so I can imagine scenarios where some version of GPT-5 cloned outside OpenAI is able to talk hundreds of millions of mentally susceptible people into giving away lots of their income, and many regulatory regimes are unable to prohibit this effectively. then AI could be making a profit of trillions and then people would invest corresponding amounts in making new anime waifus trained in erotic hypnosis and findom.

this, to be clear, is not my mainline prediction.

but my sense is that our current economy is mostly not about the 1-day period to design new vaccines, it is about the multi-year period to be allowed to sell the vaccines.

the exceptions to this, like Bitcoin managing to say "fuck off" to the regulators for long enough, are where Bitcoin scales to a trillion dollars and gets massive amounts of electricity and GPU burned on it.

so we can imagine something like this for AI, which earns a trillion dollars, and sparks a trillion-dollar competition.

but my sense is that your model does not work like this.

my sense is that your model is about general improvements across the whole economy.

[Christiano][23:08] 

I think bitcoin is small even compared to current AI...

[Yudkowsky][23:08] 

my sense is that we've already built an economy which rejects improvement based on small amounts of cleverness, and only rewards amounts of cleverness large enough to bypass bureaucratic structures. it's not enough to figure out a version of e-gold that's 10% better. e-gold is already illegal. you have to figure out Bitcoin.

what are you going to build? better airplanes? airplane costs are mainly regulatory costs. better medtech? mainly regulatory costs. better houses? building houses is illegal anyways.

where is the room for the general AI revolution, short of the AI being literally revolutionary enough to overthrow governments?

[Christiano][23:10] 

factories, solar panels, robots, semiconductors, mining equipment, power lines, and "factories" just happens to be one word for a thousand different things

I think it's reasonable to think some jurisdictions won't be willing to build things but it's kind of improbable as a prediction for the whole world. That's a possible source of shorter-term predictions?

also computers and the 100 other things that go in datacenters

[Yudkowsky][23:12] 

The whole developed world rejects open borders. The regulatory regimes all make the same mistakes with almost perfect precision, the kind of coordination that human beings could never dream of when trying to coordinate on purpose.

if the world lasts until 2035, I could perhaps see deepnets becoming as ubiquitous as computers were in... 1995? 2005? would that fulfill the terms of the Prophecy? I think it doesn't; I think your Prophecy requires that early AGI tech be that ubiquitous so that AGI tech will have trillions invested in it.

[Christiano][23:13] 

what is AGI tech?

the point is that there aren't important drivers that you can easily improve a lot

[Yudkowsky][23:14] 

for purposes of the Prophecy, AGI tech is that which, scaled far enough, ends the world; this must have trillions invested in it, so that the trajectory up to it cannot look like pulling an AlphaGo. no?

[Christiano][23:14] 

so it's relevant if you are imagining some piece of the technology which is helpful for general problem solving or something but somehow not helpful for all of the things people are doing with ML, to me that seems unlikely since it's all the same stuff

surely AGI tech should at least include the use of AI to automate AI R&D

regardless of what you arbitrarily decree as "ends the world if scaled up"

[Yudkowsky][23:15] 

only if that's the path that leads to destroying the world?

if it isn't on that path, who cares Prophecy-wise?

[Christiano][23:15] 

also I want to emphasize that "pull an AlphaGo" is what happens when you move from SOTA being set by an individual programmer to a large lab, you don't need to be investing trillions to avoid that

and that the jump is still more like a few years

but the prophecy does involve trillions, and my view gets more like your view if people are jumping from $100B of R&D ever to $1T in a single year

 

5.8. TPUs and GPUs, and automating AI R&D

 

[Yudkowsky][23:17] 

I'm also wondering a little why the emphasis on "trillions". it seems to me that the terms of your Prophecy should be fulfillable by AGI tech being merely as ubiquitous as modern computers, so that many competing companies invest mere hundreds of billions in the equivalent of hardware plants. it is legitimately hard to get a chip with 50% better transistors ahead of TSMC.

[Christiano][23:17] 

yes, if you are investing hundreds of billions then it is hard to pull ahead (though could still happen)

(since the upside is so much larger here, no one cares that much about getting ahead of TSMC since the payoff is tiny in the scheme of the amounts we are discussing)

[Yudkowsky][23:18] 

which, like, doesn't prevent Google from tossing out TPUs that are pretty significant jumps on GPUs, and if there's a specialized application of AGI-ish tech that is especially key, you can have everything behave smoothly and still get a jump that way.

[Christiano][23:18] 

I think TPUs are basically the same as GPUs

probably a bit worse

(but GPUs are sold at a 10x markup since that's the size of nvidia's lead)

[Yudkowsky][23:19] 

noted; I'm not enough of an expert to directly contradict that statement about TPUs from my own knowledge.

[Christiano][23:19] 

(though I think TPUs are nevertheless leased at a slightly higher price than GPUs)

[Yudkowsky][23:19] 

how does Nvidia maintain that lead and 10x markup? that sounds like a pretty un-Paul-ish state of affairs given Bitcoin prices never mind AI investments.

[Christiano][23:20] 

nvidia's lead isn't worth that much because historically they didn't sell many gpus

(especially for non-gaming applications)

their R&D investment is relatively large compared to the $ on the table

my guess is that their lead doesn't stick, as evidenced by e.g. Google very quickly catching up

[Yudkowsky][23:21] 

parenthetically, does this mean - and I don't necessarily predict otherwise - that you predict a drop in Nvidia's stock and a drop in GPU prices in the next couple of years?

[Christiano][23:21] 

nvidia's stock may do OK from riding general AI boom, but I do predict a relative fall in nvidia compared to other AI-exposed companies

(though I also predicted google to more aggressively try to compete with nvidia for the ML market and think I was just wrong about that, though I don't really know any details of the area)

I do expect the cost of compute to fall over the coming years as nvidia's markup gets eroded

to be partially offset by increases in the cost of the underlying silicon (though that's still bad news for nvidia)

[Yudkowsky][23:23] 

I parenthetically note that I think the Wise Reader should be justly impressed by predictions that come true about relative stock price changes, even if Eliezer has not explicitly contradicted those predictions before they come true. there are bets you can win without my having to bet against you.

[Christiano][23:23] 

you are welcome to counterpredict, but no saying in retrospect that reality proved you right if you don't 🙂

otherwise it's just me vs the market

[Yudkowsky][23:24] 

I don't feel like I have a counterprediction here, but I think the Wise Reader should be impressed if you win vs. the market.

however, this does require you to name in advance a few "other AI-exposed companies".

[Christiano][23:25] 

Note that I made the same bet over the last year---I made a large AI bet but mostly moved my nvidia allocation to semiconductor companies. The semiconductor part of the portfolio is up 50% while nvidia is up 70%, so I lost that one. But that just means I like the bet even more next year.

happy to use nvidia vs tsmc

[Yudkowsky][23:25] 

there's a lot of noise in a 2-stock prediction.

[Christiano][23:25] 

I mean, it's a 1-stock prediction about nvidia

[Yudkowsky][23:26] 

but your funeral or triumphal!

[Christiano][23:26] 

indeed 🙂

anyway

I expect all of the $ amounts to be much bigger in the future

[Yudkowsky][23:26] 

yeah, but using just TSMC for the opposition exposes you to I dunno Chinese invasion of Taiwan

[Christiano][23:26] 

yes

also TSMC is not that AI-exposed

I think the main prediction is: eventual move away from GPUs, nvidia can't maintain that markup

[Yudkowsky][23:27] 

"Nvidia can't maintain that markup" sounds testable, but is less of a win against the market than predicting a relative stock price shift. (Over what timespan? Just the next year sounds quite fast for that kind of prediction.)

[Christiano][23:27] 

regarding your original claim: if you think that it's plausible that AI will be doing all of the AI R&D, and that will be accelerating continuously from 12, 6, 3 month "doubling times," but that we'll see a discontinuous change in the "path to doom," then that would be harder to generate predictions about

yes, it's hard to translate most predictions about the world into predictions about the stock market

[Yudkowsky][23:28] 

this again sounds like it's not written in Eliezer-language.

what does it mean for "AI will be doing all of the AI R&D"? that sounds to me like something that happens after the end of the world, hence doesn't happen.

[Christiano][23:29] 

that's good, that's what I thought

[Yudkowsky][23:29] 

I don't necessarily want to sound very definite about that in advance of understanding what it means

[Christiano][23:29] 

I'm saying that I think AI will be automating AI R&D gradually, before the end of the world

yeah, I agree that if you reject the construct of "how fast the AI community makes progress" then it's hard to talk about what it means to automate "progress"

and that may be hard to make headway on

though for cases like AlphaGo (which started that whole digression) it seems easy enough to talk about elo gain per year

maybe the hard part is aggregating across tasks into a measure you actually care about?

[Yudkowsky][23:30] 

up to a point, but yeah. (like, if we're taking Elo high above human levels and restricting our measurements to a very small range of frontier AIs, I quietly wonder if the measurement is still measuring quite the same thing with quite the same robustness.)

[Christiano][23:31] 

I agree that elo measurement is extremely problematic in that regime

 

5.9. Smooth exponentials vs. jumps in income

 

[Yudkowsky][23:31] 

so in your worldview there's this big emphasis on things that must have been deployed and adopted widely to the point of already having huge impacts

and in my worldview there's nothing very surprising about people with a weird powerful prototype that wasn't used to automate huge sections of AI R&D because the previous versions of the tech weren't useful for that or bigcorps didn't adopt it.

[Christiano][23:32] 

I mean, Google is already 1% of the US economy and in this scenario it and its peers are more like 10-20%? So wide adoption doesn't have to mean that many people. Though I also do predict much wider adoption than you so happy to go there if it's happy for predictions.

I don't really buy the "weird powerful prototype"

[Yudkowsky][23:33] 

yes. I noticed.

you would seem, indeed, to be offering large quantities of it for short sale.

[Christiano][23:33] 

and it feels like the thing you are talking about ought to have some precedent of some kind, of weird powerful prototypes that jump straight from "does nothing" to "does something impactful"

like if I predict that AI will be useful in a bunch of domains, and will get there by small steps, you should either predict that won't happen, or else also predict that there will be some domains with weird prototypes jumping to giant impact?

[Yudkowsky][23:34] 

like an electrical device that goes from "not working at all" to "actually working" as soon as you screw in the attachments for the electrical plug.

[Christiano][23:34] 

(clearly takes more work to operationalize)

I'm not sure I understand that sentence, hopefully it's clear enough why I expect those discontinuities?

[Yudkowsky][23:34] 

though, no, that's a facile bad analogy.

a better analogy would be an AI system that only starts working after somebody tells you about batch normalization or LAMB learning rate or whatever.

[Christiano][23:36] 

sure, which I think will happen all the time for individual AI projects but not for sota

because the projects at sota have picked the low hanging fruit, it's not easy to get giant wins

[Yudkowsky][23:36] 

like if I predict that AI will be useful in a bunch of domains, and will get there by small steps, you should either predict that won't happen, or else also predict that there will be some domains with weird prototypes jumping to giant impact?

in the latter case, has this Eliezer-Prophecy already had its terms fulfilled by AlphaFold 2, or do you say nay because AlphaFold 2 hasn't doubled GDP?

[Christiano][23:37] 

(you can also get giant wins by a new competitor coming up at a faster rate of progress, and then we have more dependence on whether people do it when it's a big leap forward or slightly worse than the predecessor, and I'm betting on the latter)

I have no idea what AlphaFold 2 is good for, or the size of the community working on it, my guess would be that its value is pretty small

we can try to quantify

like, I get surprised when $X of R&D gets you something whose value is much larger than $X

I'm not surprised at all if $X of R&D gets you <<$X, or even like 10*$X in a given case that was selected for working well

hopefully it's clear enough why that's the kind of thing a naive person would predict

[Yudkowsky][23:38] 

so a thing which Eliezer's Prophecy does not mandate per se, but sure does permit, and is on the mainline especially for nearer timelines, is that the world-ending prototype had no prior prototype containing 90% of the technology which earned a trillion dollars.

a lot of Paul's Prophecy seems to be about forbidding this.

is that a fair way to describe your own Prophecy?

[Christiano][23:39] 

I don't have a strong view about "containing 90% of the technology"

the main view is that whatever the "world ending prototype" does, there were earlier systems that could do practically the same thing

if the world ending prototype does something that lets you go foom in a day, there was a system years earlier that could foom in a month, so that would have been the one to foom

[Yudkowsky][23:41] 

but, like, the world-ending thing, according to the Prophecy, must be squarely in the middle of a class of technologies which are in the midst of earning trillions of dollars and having trillions of dollars invested in them. it's not enough for the Worldender to be definitionally somewhere in that class, because then it could be on a weird outskirt of the class, and somebody could invest a billion dollars in that weird outskirt before anybody else had invested a hundred million, which is forbidden by the Prophecy. so the Worldender has got to be right in the middle, a plain and obvious example of the tech that's already earning trillions of dollars. ...y/n?

[Christiano][23:42] 

I agree with that as a prediction for some operationalization of "a plain and obvious example," but I think we could make it more precise / it doesn't feel like it depends on the fuzziness of that

I think that if the world can end out of nowhere like that, you should also be getting $100B/year products out of nowhere like that, but I guess you think not because of bureaucracy

like, to me it seems like our views stake out predictions about codex, where I'm predicting its value will be modest relative to R&D, and the value will basically improve from there with a nice experience curve, maybe something like ramping up quickly to some starting point <$10M/year and then doubling every year thereafter, whereas I feel like you are saying more like "who knows, could be anything" and so should be surprised each time the boring thing happens

[Yudkowsky][23:45] 

the concrete example I give is that the World-Ending Company will be able to use the same tech to build a true self-driving car, which would in the natural course of things be approved for sale a few years later after the world had ended.

[Christiano][23:46] 

but self-driving cars seem very likely to already be broadly deployed, and so the relevant question is really whether their technical improvements can also be deployed to those cars?

(or else maybe that's another prediction we disagree about)

[Yudkowsky][23:47] 

I feel like I would indeed not have the right to feel very surprised if Codex technology stagnated for the next 5 years, nor if it took a massive leap in 2 years and got ubiquitously adopted by lots of programmers.

yes, I think that's a general timeline difference there

re: self-driving cars

I might be talkable into a bet where you took "Codex tech will develop like this" and I took the side "literally anything else but that"

[Christiano][23:48] 

I think it would have to be over/under, I doubt I'm more surprised than you by something failing to be economically valuable, I'm surprised by big jumps in value

seems like it will be tough to make work

[Yudkowsky][23:49] 

well, if I was betting on something taking a big jump in income, I sure would bet on something in a relatively unregulated industry like Codex or anime waifus.

but that's assuming I made the bet at all, which is a hard sell when the bet is about the Future, which is notoriously hard to predict.

[Christiano][23:50] 

I guess my strongest take is: if you want to pull the thing where you say that future developments proved you right and took unreasonable people like me by surprise, you've got to be able to say something in advance about what you expect to happen

[Yudkowsky][23:51] 

so what if neither of us are surprised if Codex stagnates for 5 years, you win if Codex shows a smooth exponential in income, and I win if the income looks... jumpier? how would we quantify that?

[Christiano][23:52] 

codex also does seem a bit unfair to you in that it may have to be adopted by lots of programmers which could slow things down a lot even if capabilities are pretty jumpy

(though I think in fact usefulness and not merely profit will basically just go up smoothly, with step sizes determined by arbitrary decisions about when to release something)

[Yudkowsky][23:53] 

I'd also be concerned about unfairness to me in that earnable income is not the same as the gains from trade. If there's more than 1 competitor in the industry, their earnings from Codex may be much less than the value produced, and this may not change much with improvements in the tech.

 

5.10. Late-stage predictions

 

[Christiano][23:53] 

I think my main update from this conversation is that you don't really predict someone to come out of nowhere with a model that can earn a lot of $, even if they could come out of nowhere with a model that could end the world, because of regulatory bottlenecks and nimbyism and general sluggishness and unwillingness to do things

does that seem right?

[Yudkowsky][23:55] 

Well, and also because the World-ender is "the first thing that scaled with compute" and/or "the first thing that ate the real core of generality" and/or "the first thing that went over neutron multiplication factor 1".

[Christiano][23:55] 

and so that cuts out a lot of the easily-specified empirical divergences, since "worth a lot of $" was the only general way to assess "big deal that people care about" while avoiding disputes like "but Zen was mostly developed by a single programmer, it's not like intense competition"

yeah, that's the real disagreement it seems like we'd want to talk about

but it just doesn't seem to lead to many prediction differences in advance?

I totally don't buy any of those models, I think they are bonkers

would love to bet on that

[Yudkowsky][23:56] 

Prolly but I think the from-my-perspective-weird talk about GDP is probably concealing some kind of important crux, because caring about GDP still feels pretty alien to me.

[Christiano][23:56] 

I feel like getting up to massive economic impacts without seeing "the real core of generality" seems like it should also be surprising on your view

like if it's 10 years from now and AI is a pretty big deal but no crazy AGI, isn't that surprising?

[Yudkowsky][23:57] 

Mildly but not too surprising, I would imagine that people had built a bunch of neat stuff with gradient descent in realms where you could get a long way on self-play or massively collectible datasets.

[Christiano][23:58] 

I'm fine with the crux being something that doesn't lead to any empirical disagreements, but in that case I just don't think you should claim credit for the worldview making great predictions.

(or the countervailing worldview making bad predictions)

[Yudkowsky][23:59] 

stuff that we could see then: self-driving cars (10 years is enough for regulatory approval in many countries), super Codex, GPT-6 powered anime waifus being an increasingly loud source of (arguably justified) moral panic and a hundred-billion-dollar industry

[Christiano][23:59] 

another option is "10% GWP growth in a year, before doom"

I think that's very likely, though might be too late to be helpful

[Yudkowsky][0:01]  (next day, Sep. 15) 

see, that seems genuinely hard unless somebody gets GPT-4 far ahead of any political opposition - I guess all the competent AGI groups lean solidly liberal at the moment? - and uses it to fake massive highly-persuasive sentiment on Twitter for housing liberalization.

[Christiano][0:01]  (next day, Sep. 15) 

so seems like a bet?

but you don't get to win until doom 🙁

[Yudkowsky][0:02]  (next day, Sep. 15) 

I mean, as written, I'd want to avoid cases like 10% growth on paper while recovering from a pandemic that produced 0% growth the previous year.

[Christiano][0:02]  (next day, Sep. 15) 

yeah

[Yudkowsky][0:04]  (next day, Sep. 15) 

I'd want to check the current rate (5% iirc) and what the variance on it was, 10% is a little low for surety (though my sense is that it's a pretty darn smooth graph that's hard to perturb)

if we got 10% in a way that was clearly about AI tech becoming that ubiquitous, I'd feel relatively good about nodding along and saying, "Yes, that is like unto the beginning of Paul's Prophecy" not least because the timelines had been that long at all.

[Christiano][0:05]  (next day, Sep. 15) 

like 3-4%/year right now

random wikipedia number is 5.5% in 2006-2007, 3-4% since 2010

4% 1995-2000
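
Some quick compounding arithmetic connecting these figures. The 3-4% and 10% growth rates are from the messages above, and the 4-year doubling window is the 2025-2029 interval mentioned earlier in the discussion; the conversions are just math:

```python
# Doubling-time arithmetic for the growth rates under discussion. The ~4%/year
# recent GWP growth and the 10%/year marker come from the conversation; the
# 4-year doubling window is assumed from the earlier 2025-2029 framing.
import math

for rate in (0.04, 0.10):
    doubling_years = math.log(2) / math.log(1 + rate)
    print(f"at {rate:.0%}/year, world output doubles in ~{doubling_years:.1f} years")

# average annual growth needed for world output to double within a 4-year window:
print(f"a 4-year doubling implies ~{2 ** (1 / 4) - 1:.1%} average annual growth")
```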

[Yudkowsky][0:06]  (next day, Sep. 15) 

I don't want to sound obstinate here. My model does not forbid that we twiddle around on the AGI side while gradient descent tech gets its fingers into enough separate weakly-generalizing pies to produce 10% GDP growth, but I'm happy to say that this sounds much more like Paul's Prophecy is coming true.

[Christiano][0:07]  (next day, Sep. 15) 

ok, we should formalize at some point, but also need the procedure for you getting credit given that it can't resolve in your favor until the end of days

[Yudkowsky][0:07]  (next day, Sep. 15) 

Is there something that sounds to you like Eliezer's Prophecy which we can observe before the end of the world?

[Christiano][0:07]  (next day, Sep. 15) 

when you will already have all the epistemic credit you need

not on the "simple core of generality" stuff since that apparently immediately implies end of world

maybe something about ML running into obstacles en route to human level performance?

or about some other kind of discontinuous jump even in a case where people care, though there seem to be a few reasons you don't expect many of those

[Yudkowsky][0:08]  (next day, Sep. 15) 

depends on how you define "immediately"? it's not long before the end of the world, but in some sad scenarios there is some tiny utility to you declaring me right 6 months before the end.

[Christiano][0:09]  (next day, Sep. 15) 

I care a lot about the 6 months before the end personally

though I do think probably everything is more clear by then independent of any bet; but I guess you are more pessimistic about that

[Yudkowsky][0:09]  (next day, Sep. 15) 

I'm not quite sure what I'd do in them, but I may have worked something out before then, so I care significantly in expectation if not in particular.

I am more pessimistic about other people's ability to notice what reality is screaming in their faces, yes.

[Christiano][0:10]  (next day, Sep. 15) 

if we were to look at various scaling curves, e.g. of loss vs model size or something, do you expect those to look distinctive as you hit the "real core of generality"?

[Yudkowsky][0:10]  (next day, Sep. 15) 

let me turn that around: if we add transformers into those graphs, do they jump around in a way you'd find interesting?

[Christiano][0:11]  (next day, Sep. 15) 

not really

[Yudkowsky][0:11]  (next day, Sep. 15) 

is that because the empirical graphs don't jump, or because you don't think the jumps say much?

[Christiano][0:11]  (next day, Sep. 15) 

but not many good graphs to look at (I just have one in mind), so that's partly a prediction about what the exercise would show

I don't think the graphs jump much, and also transformers come before people start evaluating on tasks where they help a lot

[Yudkowsky][0:12]  (next day, Sep. 15) 

It would not terribly contradict the terms of my Prophecy if the World-ending tech began by not producing a big jump on existing tasks, but generalizing to some currently not-so-popular tasks where it scaled much faster.

[Christiano][0:13]  (next day, Sep. 15) 

eh, they help significantly on contemporary tasks, but it's just not a huge jump relative to continuing to scale up model sizes

or other ongoing improvements in architecture

anyway, should try to figure out something, and good not to finalize a bet until you have some way to at least come out ahead, but I should sleep now

[Yudkowsky][0:14]  (next day, Sep. 15) 

yeah, same.

Thing I want to note out loud lest I forget ere I sleep: I think the real world is full of tons and tons of technologies being developed as unprecedented prototypes in the midst of big fields, because the key thing to invest in wasn’t the competitively explored center. Wright Flyer vs all expenditures on Traveling Machine R&D. First atomic pile and bomb vs all Military R&D.

This is one reason why Paul’s Prophecy seems fragile to me. You could have the preliminaries come true as far as there being a trillion bucks in what looks like AI R&D, and then the WorldEnder is a weird prototype off to one side of that. Saying “But what about the rest of that AI R&D?” is no more a devastating retort to reality than looking at AlphaGo and saying “But weren’t other companies investing billions in Better Software?” Yeah, but it was a big playing field with lots of different kinds of Better Software, and no other medium-sized team of 15 people with corporate TPU backing was trying to build a system just like AlphaGo, even though multiple small outfits were trying to build prestige-earning gameplayers. Tech advancements very very often occur in places where investment wasn't dense enough to guarantee overlap.

 
6. Follow-ups on "Takeoff Speeds"

 

6.1. Eliezer Yudkowsky's commentary

 

[Yudkowsky][17:25]  (Sep. 15) 

Further comment that occurred to me on "takeoff speeds" if I've better understood the main thesis now: its hypotheses seem to include a perfectly anti-Thielian setup for AGI.

Thiel has a running thesis about how part of the story behind the Great Stagnation and the decline in innovation that's about atoms rather than bits - the story behind "we were promised flying cars and got 140 characters", to cite the classic Thielian quote - is that people stopped believing in "secrets".

Thiel suggests that you have to believe there are knowable things that aren't yet widely known - not just things that everybody already knows, plus mysteries that nobody will ever know - in order to be motivated to go out and innovate. Culture in developed countries shifted to label this kind of thinking rude - or rather, even ruder, even less tolerated than it had been decades before - so innovation decreased as a result.

The central hypothesis of "takeoff speeds" is that at the time of serious AGI being developed, it is perfectly anti-Thielian in that it is devoid of secrets in that sense. It is not permissible (on this viewpoint) for it to be the case that there is a lot of investment into AI that is directed not quite at the key path leading to AGI, such that somebody could spend $1B on compute for the key path leading to AGI before anybody else had spent $100M on that. There cannot exist any secret like that. The path to AGI will be known; everyone, or a wide variety of powerful actors, will know how profitable that path will be; the surrounding industry will be capable of acting on this knowledge, and will have actually been acting on it as early as possible; multiple actors are already investing in every tech path that would in fact be profitable (and is known to any human being at all), as soon as that R&D opportunity becomes available.

And I'm not saying this is an inconsistent world to describe! I've written science fiction set in this world. I called it "dath ilan". It's a hypothetical world that is actually full of smart people in economic equilibrium. If anything like Covid-19 appears, for example, the governments and public-good philanthropists there have already set up prediction markets (which are not illegal, needless to say); and of course there are mRNA vaccine factories already built and ready to go, because somebody already calculated the profits from fast vaccines would be very high in case of a pandemic (no artificial price ceilings in this world, of course); so as soon as the prediction markets started calling the coming pandemic conditional on no vaccine, the mRNA vaccine factories were already spinning up.

This world, however, is not Earth.

On Earth, major chunks of technological progress quite often occur outside of a social context where everyone knew and agreed in advance on which designs would yield how much expected profit and many overlapping actors competed to invest in the most actually-promising paths simultaneously.

And that is why you can read Inadequate Equilibria, and then read this essay on takeoff speeds, and go, "Oh, yes, I recognize this; it's written inside the Modesty worldview; in particular, the imagination of an adequate world in which there is a perfect absence of Thielian secrets or unshared knowable knowledge about fruitful development pathways. This is the same world that already had mRNA vaccines ready to spin up on day one of the Covid-19 pandemic, because markets had correctly forecasted their option value and investors had acted on that forecast unimpeded. Sure would be an interesting place to live! But we don't live there."

Could we perhaps end up in a world where the path to AGI is in fact not a Thielian secret, because in fact the first accessible path to AGI happens to lie along a tech pathway that already delivered large profits to previous investors who summed a lot of small innovations, a la experience with chipmaking, such that there were no large innovations, just lots and lots of small innovations that yield 10% improvement annually on various tech benchmarks?

I think that even in this case we will get weird, discontinuous, and fatal behaviors, and I could maybe talk about that when discussion resumes. But it is not ruled out to me that the first accessible pathway to AGI could happen to lie in the further direction of some road that was already well-traveled, already yielded much profit to now-famous tycoons back when its first steps were Thielian secrets, and hence is now replete with dozens of competing chasers for the gold rush.

It's even imaginable to me, though a bit less so, that the first path traversed to real actual pivotal/powerful/lethal AGI, happens to lie literally actually squarely in the central direction of the gold rush. It sounds a little less like the tech history I know, which is usually about how someone needed to swerve a bit and the popular gold-rush forecasts weren't quite right, but maybe that is just a selective focus of history on the more interesting cases.

Though I remark that - even supposing that getting to big AGI is literally as straightforward and yet as difficult as falling down a semiconductor manufacturing roadmap (as otherwise the biggest actor to first see the obvious direction could just rush down the whole road) - well, TSMC does have a bit of an unshared advantage right now, if I recall correctly. And Intel had a bit of an advantage before that. So that happens even when there's competitors competing to invest billions.

But we can imagine that doesn't happen either, because instead of needing to build a whole huge manufacturing plant, there's just lots and lots of little innovations adding up to every key AGI threshold, which lots of actors are investing $10 million in at a time, and everybody knows which direction to move in to get to more serious AGI and they're right in this shared forecast.

I am willing to entertain discussing this world and the sequelae there - I do think everybody still dies in this case - but I would not have this particular premise thrust upon us as a default, through a not-explicitly-spoken pressure against being so immodest and inegalitarian as to suppose that any Thielian knowable-secret will exist, or that anybody in the future gets as far ahead of others as today's TSMC or today's Deepmind.

We are, in imagining this world, imagining a world in which AI research has become drastically unlike today's AI research in a direction drastically different from the history of many other technologies.

It's not literally unprecedented, but it's also not a default environment for big moments in tech progress; it's narrowly precedented for particular industries with high competition and steady benchmark progress driven by huge investments into a sum of many tiny innovations.

So I can entertain the scenario. But if you want to claim that the social situation around AGI will drastically change in this way you foresee - not just that it could change in that direction, if somebody makes a big splash that causes everyone else to reevaluate their previous opinions and arrive at yours, but that this social change will occur and you know this now - and that the prerequisite tech path to AGI is known to you, and forces an investment situation that looks like the semiconductor industry - then your "What do you think you know and how do you think you know it?" has some significant explaining to do.

Of course, I do appreciate that such a thing could be knowable, and yet not known to me. I'm not so silly as to disbelieve in secrets like that. They're all over the actual history of technological progress on our actual Earth.


comment by paulfchristiano · 2021-11-24T22:05:18.689Z

I stand ready to bet with Eliezer on any topic related to AI, science, or technology. I'm happy for him to pick but I suggest some types of forecast below.

If Eliezer’s predictions were roughly as good as mine (in cases where we disagree), then I would update towards taking his views more seriously. Right now it looks to me like his view makes bad predictions about lots of everyday events.

It’s possible that we won’t be able to find cases where we disagree, and perhaps that Eliezer’s model totally agrees with mine until we develop AGI. But I think that’s unlikely for a few reasons:

  • I constantly see observations that seem like evidence for Eliezer's views (e.g. any time I see an ML paper with a surprisingly large effect size, or ML labs failing to make investments in scaling, or people being surprisingly unreasonable); it's just that I see significantly more evidence against his views. The point of making bets in advance is that it can correct for my hindsight bias or for my inability to simulate "what Eliezer's view would say about this." Eliezer could also say that actually all of the observations I listed aren't evidence for his view, which would be interesting to me.
  • Eliezer frequently talks smack about how the real world is surprising to fools like Paul (e.g. he talks about the "sort of person who gets taken in by Hanson's arguments in 2008 and gets caught flatfooted by AlphaGo and GPT-3 and AlphaFold 2").  If that's right, then it must correspond to differences in prediction. And if Eliezer literally can't state where he expects to make better predictions than me other than AGI then I think people should mostly ignore the bluster and he should probably cut it out.
  • Eliezer frequently acknowledges "sure, lines look straight in hindsight, but that's not how they look at the time." But to me it looks like lines are also (mostly) straight even with foresight. How could this not correspond to some difference in prediction? I'd be happy to use historical case studies instead of predictions, but Eliezer thinks you need to make them in advance—so I'm happy to just apply my straight-line-extrapolation methodology to arbitrary near-term forecasts. I think Eliezer would prefer that I somehow make predictions and evaluate them in absolute terms rather than by comparing to Eliezer's predictions, but that's not what's at issue—I think my forecasts are more accurate than Eliezer's, not that they meet some absolute bar of quality.
  • When trying to define bets, I think we get stuck at the stage where Eliezer isn’t giving probability distributions over quantitative measures, not the stage where Eliezer gives them but they’re the same as mine. My tentative guess is that Eliezer can’t predict what-Paul-would-call-a-reasonable-forecast, rather than understanding what Paul would forecast but disagreeing with it. This is related to disagreements over how to interpret the past evidence. I’m less clear on whether I can simulate Eliezer forecasts.

Anyway, I think Eliezer should probably pick a domain where he thinks his model shines. But I’m going to propose some domains where I expect to find disagreements and where I expect to beat his model, just to help get the ball rolling:

  • Performance on any ML benchmark in 1, 2, 5 years. Happy to propose examples (basically taking those from existing work) in theorem proving, standard NLP, mathematical reasoning, or coding.
  • Performance on any interesting real-world tasks where we can readily define the task in 1, 2, 5 years. Happy to propose examples on e.g. translation, picking robots, self-driving cars.
  • Signs of impact from various kinds of AI in 2, 5, 10 years, e.g. coding, marketing copy, industrial robotics, self-driving cars, translation, whatever.
  • Progress in performance or adoption for non-AI technologies, e.g. energy (solar, fission, fusion, wind…), various parts of biotech or materials science, whatever.
  • Total investment in AI research of various kinds, either in the industry overall or at particular labs.
  • Total valuation of AI companies, hardware companies, or whatever.
  • Sizes of improvements over SOTA from ML papers in various domains.
  • Relative success of different ML approaches, e.g. importance of architectural changes vs transformers, how much gradient descent will play a role in future results, meta-learning vs fine-tuning…
  • Specific claims about model sizes, training costs, the role of planning, etc. in high-profile results.

I’m happy to provide more specific operationalizations and questions in any of those domains, if there are any categories where Eliezer is up for actually forecasting.

The high-level patterns that I think will generate lots of moderate lower-level disagreements:

  • I expect things to be significantly more incremental and “boring.” I put smaller probabilities on trend breaks and big jumps, and I have a strong sense for many kinds of metrics that move more regularly. I think Eliezer literally can’t tell how to translate this heuristic into predictions, which is part of why I think he is going to predictably make bad predictions.
  • I think I have more understanding of modern AI in particular, so I expect to make better predictions for boring reasons for anything in that space.
  • I generally expect a continuing ramp-up in AI investment and effort, and for that to lead to predictable changes as the field scales.
  • I have a different picture of how AI will work where AGI is not special and so won’t affect any evaluations of tasks in the near future, leading to more “boring” claims about the hardness of different tasks (though not sure this will generate disagreements within 5 years).

We might try to use operationalizations like: "In how many of these 10 quantities is there a year with 4x more change than any previous year" (h/t holden), or "How much of the economic value of AI comes from applications whose value has more than doubled in the last year?" or "For each of these pairs of capabilities, which will happen first if at least one happens in the next 5 years?" or so on. But even if we can't find something clever, I feel like the differences in quantitative view are stark enough that we're just going to disagree about a bunch of numbers.
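
One way to make the first of those operationalizations mechanical, as a minimal sketch: assume the ten quantities and their yearly values have already been agreed on, and count how many series contain a year whose change exceeds 4x the change in every earlier year. The series names and numbers below are purely hypothetical illustrations, not figures either side has proposed.

```python
def has_4x_jump(yearly_values):
    """True if some year's change exceeds 4x the change seen in every previous year."""
    changes = [abs(b - a) for a, b in zip(yearly_values, yearly_values[1:])]
    return any(
        all(c > 4 * prev for prev in changes[:i])
        for i, c in enumerate(changes)
        if i > 0  # need at least one earlier year to compare against
    )

# Hypothetical yearly series for two of the ten agreed-upon quantities:
series = {
    "ML training spend ($B)": [0.1, 0.2, 0.4, 0.9, 1.8],  # steady ~2x growth, no jump
    "benchmark X accuracy (%)": [50, 52, 54, 70, 72],     # one outsized jump
}
print(sum(has_4x_jump(v) for v in series.values()))  # counts quantities with such a jump (here: 1)
```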

I would prefer to state predictions and discuss rationales publicly, allow some informed folks to kibitz, and then revise based on people pointing out facts we don't know, since I think that makes it cheaper to make forecasts and reduces the probability that the test is decided by specific facts rather than a general view.

Replies from: Eliezer_Yudkowsky, RomanS
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-24T23:25:14.327Z · LW(p) · GW(p)

I do wish to note that we spent a fair amount of time on Discord trying to nail down what earlier points we might disagree on, before the world started to end, and these Discord logs should be going up later.

From my perspective, the basic problem is that Eliezer's story looks a lot like "business as usual until the world starts to end sharply", and Paul's story looks like "things continue smoothly until their smooth growth ends the world smoothly", and both of us have ever heard of superforecasting and both of us are liable to predict near-term initial segments by extrapolating straight lines while those are available.  Another basic problem, as I'd see it, is that we tend to tell stories about very different subject matters - I care a lot less than Paul about the quantitative monetary amount invested into Intel, to the point of not really trying to develop expertise about that.

I claim that I came off better than Robin Hanson in our FOOM debate compared to the way that history went.  I'd claim that my early judgments of the probable importance of AGI, at all, stood up generally better than early non-Yudkowskian EA talking about that.  Other people I've noticed ever making correct bold predictions in this field include Demis Hassabis, for predicting that deep learning would work at all, and then for predicting that he could take the field of Go and taking it; and Dario Amodei, for predicting that brute-forcing stacking more layers would be able to produce GPT-2 and GPT-3 instead of just asymptoting and petering out.  I think Paul doesn't need to bet against me to start producing a track record like this; I think he can already start to accumulate reputation by saying what he thinks is bold and predictable about the next 5 years; and if it overlaps "things that interest Eliezer" enough for me to disagree with some of it, better yet.

Replies from: paulfchristiano, Jotto999
comment by paulfchristiano · 2021-11-24T23:32:12.883Z · LW(p) · GW(p)

From my perspective, the basic problem is that Eliezer's story looks a lot like "business as usual until the world starts to end sharply", and Paul's story looks like "things continue smoothly until their smooth growth ends the world smoothly", and both of us have ever heard of superforecasting and both of us are liable to predict near-term initial segments by extrapolating straight lines while those are available.

I agree that it's plausible that we both make the same predictions about the near future. I think we probably don't, and there are plenty of disagreements about all kinds of stuff. But if in fact we agree, then in 5 years you shouldn't say "and see how much the world looked like I said?"

It feels to me like it goes:  you say AGI will look crazy.  Then I say that sounds unlike the world of today. Then you say "no, the world actually always looks discontinuous in the ways I'm predicting and your model is constantly surprised by real stuff that happens, e.g. see transformers or AlphaGo" and then I say "OK, let's bet about literally anything at all, you pick."

I think it's pretty likely that we actually do disagree about how much the world of today is boring and continuous, where my error theory is that you spend too much time reading papers and press releases that paint a misleading picture and just aren't that familiar with what's happening on the ground. So I expect if we stake out any random quantity we'll disagree somewhat.

Most things just aren't bold and predictable, they are modest disagreements. I'm not saying I have some deep secret about the world, just that you are wrong in this case.

(Sorry for edits, accidentally posted early.)

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T00:16:48.043Z · LW(p) · GW(p)

I feel a bit confused about where you think we meta-disagree here, meta-policy-wise.  If you have a thesis about the sort of things I'm liable to disagree with you about, because you think you're more familiar with the facts on the ground, can't you write up Paul's View of the Next Five Years, and then if I disagree with it, better yet; but if not, you still get to be right and collect Bayes points for the Next Five Years?

I mean, it feels to me like this should be a case similar to where, for example, I think I know more about macroeconomics than your typical EA; so if I wanted to expend the time/stamina points, I could say a bunch of things I consider obvious and that contradict hot takes on Twitter and many EAs would go "whoa wait really" and then I could collect Bayes points later and have performed a public service, even if nobody showed up to disagree with me about that.  (The reason I don't actually do this... is that I tried; I keep trying to write a book about basic macro, only it's the correct version explained correctly, and have a bunch of isolated chapters and unfinished drafts.)  I'm also trying to write up my version of The Next Five Years assuming the world starts to end in 2025, since this is not excluded by my model; but writing in long-form requires stamina and I've been tired of late, which is part of why I've been having Discord conversations instead.

I think you think there's a particular thing I said which implies that the ball should be in my court to already know a topic where I make a different prediction from what you do, and so I should be able to state my own prediction about that topic and bet with you about that; or, alternatively, that I should retract some thing I said recently which implies that.  And so, you shouldn't need to have to do all the work to write up your forecasts generally, and it's unfair that I'm trying to make you do all that work.  Check?  If so, I don't yet see the derivation chain on this meta-level point.

I think the Hansonian viewpoint - which I consider another gradualist viewpoint, and whose effects were influential on early EA and which I think are still lingering around in EA - seemed surprised by AlphaGo and AlphaZero, when you contrast its actual advance language with what actually happened.  Inevitably, you can go back afterwards and claim it wasn't really a surprise in terms of the abstractions that seem so clear and obvious now, but I think it was surprised then; and I also think that "there's always a smooth abstraction in hindsight, so what, there'll be one of those when the world ends too", is a huge big deal in practice with respect to the future being unpredictable.  From this, you seem to derive that I should already know what to bet with you about, and are annoyed by how I'm playing coy; because if I don't bet with you right now, I should retract the statement that I think gradualists were surprised; but to me I'm not following the sequitur there.

Or maybe I'm just entirely misinterpreting the flow of your thoughts here.

Replies from: paulfchristiano, paulfchristiano
comment by paulfchristiano · 2021-11-25T01:03:47.613Z · LW(p) · GW(p)

I think you think there's a particular thing I said which implies that the ball should be in my court to already know a topic where I make a different prediction from what you do.

I've said I'm happy to bet about anything, and listed some particular questions I'd bet about where I expect you to be wronger. If you had issued the same challenge to me, I would have picked one of the things and we would have already made some bets. So that's why I feel like the ball is in your court to say what things you're willing to make forecasts about.

That said, I don't know if making bets is at all a good use of time. I'm inclined to do it because I feel like your view really should be making different predictions (and I feel like you are participating in good faith and in fact would end up making different predictions). And I think it's probably more promising than trying to hash out the arguments since at this point I feel like I mostly know your position and it's incredibly slow going. But it seems very plausible that the right move is just to agree to disagree and not spend time on this. In that case it was particularly bad of me to try to claim the epistemic high ground. I can't really defend myself there, but can explain by saying that I found your vitriolic reading of "Takeoff Speeds" pretty condescending and frustrating and, given that I think you are more wrong than right, wanted a nice way to demonstrate that.

I've mentioned the kinds of things I think your model will forecast badly, and suggested that we bet about them in particular:

  • I think you generally overestimate the rate of trend breaks on measurable trends. So let's pick some trends and estimate probability of trend breaks.
  • I think you don't understand in which domains trend-breaks are surprising and where they aren't surprising, so you will be sometimes underconfident and sometimes overconfident on any given forecast. Same bet as last time.
  • I think you are underconfident about the fact that almost all AI profits will come from areas that had almost-as-much profit in recent years. So we could bet about where AI profits are in the near term, or try to generalize this.
  • I think you are underconfident about continuing scale-up in AI. So we can bet about future spending, size of particular labs, size of the ML field.
  • I think you overestimate DeepMind's advantage over the rest of the field and so will make bad forecasts about where any given piece of progress comes from.
  • I think your AI timelines are generally too short. You can point to cool stuff happening as a vindication for your view, and there will certainly be some cool stuff happening, but I think if we actually get concrete you are just going to make worse predictions.
Replies from: paulfchristiano, Eliezer_Yudkowsky
comment by paulfchristiano · 2021-11-25T01:10:15.803Z · LW(p) · GW(p)

My uncharitable read on many of these domains is that you are saying "Sure, I think that Paul might have somewhat better forecasts than me on those questions, but why is that relevant to AGI?"

In that case it seems like the situation is pretty asymmetrical. I'm claiming that my view of AGI is related to beliefs and models that also bear on near-term questions, and I expect to make better forecasts than you in those domains because I have more accurate beliefs/models. If your view of AGI is unrelated to any near-term questions where we disagree, then that seems like an important asymmetry.

Replies from: DPiepgrass
comment by DPiepgrass · 2022-03-14T20:05:35.452Z · LW(p) · GW(p)

I suspect that indeed EY's model has a limited ability to make near-term predictions, so that yes, the situation is asymmetrical. But I suspect his view is similar to my view, so I don't think EY is wrong. But I am confused about why EY (i) hasn't replied himself and (ii) in general, doesn't communicate more clearly on this topic.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T05:19:36.051Z · LW(p) · GW(p)

I think you are underconfident about the fact that almost all AI profits will come from areas that had almost-as-much profit in recent years. So we could bet about where AI profits are in the near term, or try to generalize this.

I wouldn't be especially surprised by waifutechnology or machine translation jumping to newly accessible domains (the thing I care about and you shrug about (until the world ends)), but is that likely to exhibit a visible economic discontinuity in profits (which you care about and I shrug about (until the world ends))?  There's apparently already mass-scale deployment of waifutech in China to forlorn male teenagers, so maybe you'll say the profits were already precedented.  Google offers machine translation now, even though they don't make much obvious measurable profit on that, but maybe you'll want to say that however much Google spends on that, they must rationally anticipate at least that much added revenue.  Or perhaps you want to say that "almost all AI profits" will come from robotics over the same period.  Or maybe I misunderstand your viewpoint, and if you said something concrete about the stuff you care about, I would manage to disagree with that; or maybe you think that waifutech suddenly getting much more charming with the next generation of text transformers is something you already know enough to rule out; or maybe you think that 2024's waifutech should definitely be able to do some observable surface-level thing it can't do now.

Replies from: paulfchristiano, paulfchristiano
comment by paulfchristiano · 2021-11-25T16:49:39.859Z · LW(p) · GW(p)

I'd be happy to disagree about romantic chatbots or machine translation. I'd have to look into it more to get a detailed sense of either, but I can guess. I'm not sure what "wouldn't be especially surprised" means; I think to actually get disagreements we need way more resolution than that, so one question is whether you are willing to play ball (since presumably you'd also have to look into it to get a more detailed sense). Maybe we could save labor if people would point out the empirical facts we're missing and we can revise in light of that, but we'd still need more resolution. (That said: what's up for grabs here are predictions about the future, not the present.)

I'd guess that machine translation is currently something like $100M/year in value, and will scale up more like 2x/year than 10x/year as DL improves (e.g. most of the total log increase will be in years with <3x increase rather than >3x increase, and 3 is like the 60th percentile of the number for which that inequality is tight).
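
A sketch of how a claim like that could be scored after the fact, on made-up numbers rather than anything Paul has endorsed: compute what share of the total log increase in value came from years whose year-over-year growth factor was below 3x.

```python
import math

def share_of_log_increase_below(yearly_values, threshold=3.0):
    """Fraction of the total log increase contributed by years that grew by less than threshold."""
    factors = [b / a for a, b in zip(yearly_values, yearly_values[1:])]
    total = sum(math.log(f) for f in factors if f > 1)
    below = sum(math.log(f) for f in factors if 1 < f < threshold)
    return below / total if total else 0.0

# Hypothetical machine-translation value in $M/year: mostly ~2x years, one 4x year.
value_by_year = [100, 200, 400, 1600, 3200]
print(share_of_log_increase_below(value_by_year))  # 0.6: most of the log growth came in <3x years
```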

I'd guess that increasing deployment of romantic chatbots will end up with technical change happening first followed by social change second, so the speed of deployment and change will depend on the speed of social change. At early stages of the social change you will likely see much larger investment in fine-tuning for this use case, and the results will be impressive as you shift from random folks doing it to actual serious efforts. The fact that it's driven by social rather than technical change means it could proceed at very different paces in different countries. I don't expect anyone to make a lot of profit from this before self-driving cars; for example, I'd be pretty surprised if this surpassed $1B/year of revenue before self-driving cars passed $10B/year of revenue. I have no idea what's happening in China. It would be fairly surprising to me if there was currently an actually-compelling version of the technology---which we could try to operationalize as something like how bad your best available romantic relationship with humans has to be, or how lonely you'd have to be, or how short-sighted you'd have to be, before it's appealing. I don't have strong views about a mediocre product with low activation energy that's nevertheless used by many (e.g. in the same way we see lots of games with mediocre hedonic value and high uptake, or lots of passive gambling).

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T18:11:54.658Z · LW(p) · GW(p)

Thanks for continuing to try on this!  Without having spent a lot of labor myself on looking into self-driving cars, I think my sheer impression would be that we'll get $1B/yr waifutech before we get AI freedom-of-the-road; though I do note again that current self-driving tech would be more than sufficient for $10B/yr revenue if people built new cities around the AI tech level, so I worry a bit about some restricted use-case of self-driving tech that is basically possible with current tech finding some less regulated niche worth a trivial $10B/yr.  I also remark that I wouldn't be surprised to hear that waifutech is already past $1B/yr in China, but I haven't looked into things there.  I don't expect the waifutech to transcend my own standards for mediocrity, but something has to be pretty good before I call it more than mediocre; do you think there's particular things that waifutech won't be able to do?

My model permits large jumps in ML translation adoption; it is much less clear about whether anyone will be able to build a market moat and charge big prices for it.  Do you have a similar intuition about # of users increasing gradually, not just revenue increasing gradually?

I think we're still at the level of just drawing images about the future, so that anybody who came back in 5 years could try to figure out who sounded right, at all, rather than assembling a decent portfolio of bets; but I also think that just having images versus no images is a lot of progress.

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-25T22:03:06.081Z · LW(p) · GW(p)

Yes, I think that value added by automated translation will follow a similar pattern. Number of words translated is more sensitive to how you count and to random nonsense, as is the number of "users", which has even more definitional issues.

You can state a prediction about self-driving cars in any way you want. The obvious thing is to talk about programs similar to the existing self-driving taxi pilots (e.g. Waymo One) and ask when they do $X of revenue per year, or when $X of self-driving trucking is done per year. (I don't know what AI freedom-of-the-road means, do you mean something significantly more ambitious than self-driving trucks or taxis?)

comment by paulfchristiano · 2021-11-25T16:52:40.452Z · LW(p) · GW(p)

jumping to newly accessible domains

Man, the problem is that you say the "jump to newly accessible domains" will be the thing that lets you take over the world. So what's up for dispute is the prototype being enough to take over the world rather than years of progress by a giant lab on top of the prototype. It doesn't help if you say "I expect new things to sometimes become possible" if you don't further say something about the impact of the very early versions of the product.

Maybe you'll want to say that however much Google spends on that, they must rationally anticipate at least that much added revenue

If e.g. people were spending $1B/year developing a technology, and then after a while it jumps from 0/year to $1B/year of profit, I'm not that surprised. (Note that machine translation is radically smaller than this, I don't know the numbers.)

I do suspect they could have rolled out a crappy version earlier, perhaps by significantly changing their project. But why would they necessarily bother doing that? For me this isn't violating any of the principles that make your stories sound so crazy. The crazy part is someone spending $1B and then generating $100B/year in revenue (much less $100M and then taking over the world).

(Note: it is surprising if an industry is spending $10T/year on R&D and then jumps from $1T --> $10T of revenue in one year in a world that isn't yet growing crazily. How surprising it is depends a lot on the numbers involved, and in particular on how valuable it would have been to deploy a worse version earlier and how hard it is to raise money at different scales.)

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T18:15:24.781Z · LW(p) · GW(p)

The crazy part is someone spending $1B and then generating $100B/year in revenue (much less $100M and then taking over the world).

Would you say that this is a good description of Suddenly Hominids but you don't expect that to happen again, or that this is a bad description of hominids?

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-25T20:57:48.154Z · LW(p) · GW(p)

It's not a description of hominids at all; no one spent any money on R&D.

I think there are analogies where this would be analogous to hominids (which I think are silly, as we discuss in the next part of this transcript). And there are analogies where this is a bad description of hominids (which I prefer).

Replies from: adele-lopez-1
comment by Adele Lopez (adele-lopez-1) · 2021-11-25T21:52:19.778Z · LW(p) · GW(p)

Spending money on R&D is essentially the expenditure of resources in order to explore and optimize over a promising design space, right? That seems like a good description of what natural selection did in the case of hominids. I imagine this still sounds silly to you, but I'm not sure why. My guess is that you think natural selection isn't relevantly similar because it didn't deliberately plan to allocate resources as part of a long bet that it would pay off big.

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-25T21:58:22.778Z · LW(p) · GW(p)

I think natural selection has lots of similarities to R&D, but (i) there are lots of ways of drawing the analogy, (ii) some important features of R&D are missing in evolution, including some really important ones for fast takeoff arguments (like the existence of actors who think ahead).

If someone wants to spell out why they think the evolution of hominids means takeoff is fast, then I'm usually happy to explain why I disagree with their particular analogy. I think this happens in the next Discord log between me and Eliezer.

comment by paulfchristiano · 2021-11-25T00:46:08.755Z · LW(p) · GW(p)

Inevitably, you can go back afterwards and claim it wasn't really a surprise in terms of the abstractions that seem so clear and obvious now, but I think it was surprised then

It seems like you are saying that there is some measure that was continuous all along, but that it's not obvious in advance which measure was continuous. That seems to suggest that there are a bunch of plausible measures you could suggest in advance, and lots of interesting action will be from changes that are discontinuous changes on some of those measures. Is that right?

If so, don't we get out a ton of predictions? Like, for every particular line someone thinks might be smooth, the gradualist has a higher probability on it being smooth than you would? So why can't I just start naming some smooth lines (like any of the things I listed in the grandparent) and then we can play ball?

If not, what's your position? Is it that you literally can't think of the possible abstractions that would later make the graph smooth? (This sounds insane to me.)

comment by Jotto999 · 2021-11-25T03:57:27.079Z · LW(p) · GW(p)

I disagree that this is a meaningful forecasting track record.  Massive degrees of freedom, and the mentioned events seem unresolvable, and it's highly ambiguous how these things particularly prove the degree of error unless they were properly disambiguated in advance.  Log score or it didn't happen.

(Slightly edited to try and sound less snarky)

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2022-06-29T00:24:50.490Z · LW(p) · GW(p)

I want to register a gripe, re your follow-up post [LW · GW]: when Eliezer says that he, Demis Hassabis, and Dario Amodei have a good "track record" because of their qualitative prediction successes, you object that the phrase "track record" should be reserved for things like Metaculus forecasts.

But when Ben Garfinkel says that Eliezer has a bad "track record" because he made various qualitative predictions Ben disagrees with, you slam the retweet button.

I already thought this narrowing of the term "track record" was weird. If you're saying that we shouldn't count Linus Pauling's achievements in chemistry, or his bad arguments for Vitamin C megadosing, as part of Pauling's "track record", because they aren't full probability distributions over concrete future events, then I worry a lot that this new word usage will cause confusion and lend itself to misuse. As long as it's used even-handedly, though, it's ultimately just a word.

(On my model, the main consequence of this is just that "track records" matter a lot less, because they become a much smaller slice of the evidence we have about a lot of people's epistemics, expertise, etc.)

But if you're going to complain about "track record" talk when the track record is alleged to be good but not when it's alleged to be bad, then I have a genuine gripe with this terminology proposal. It already sounded a heck of a lot like an isolated demand for rigor to me, but if you're going to redefine "track record" to refer to  a narrow slice of the evidence, you at least need to do this consistently, and not crow some variant of 'Aha! His track record is terrible after all!' as soon as you find equally qualitative evidence that you like.

This was already a thing I worried would happen if we adopted this terminological convention, and it happened immediately.

</end of gripe>

Replies from: Jotto999, ege-erdil
comment by Jotto999 · 2022-06-29T02:18:51.858Z · LW(p) · GW(p)

I see what you're saying, but it looks like you're strawmanning me yet again with a more extreme version of my position.  You've done that several times and you need to stop that.

What you've argued here prevents me from questioning the forecasting performance of every pundit who I can't formally score, which is ~all of them.

Yes, it's not a real forecasting track record unless it meets the sort of criteria that are fairly well understood in Tetlockian research.  And neither is Ben Garfinkel's post, that doesn't give us a forecasting track record, like on Metaculus.

But if a non-track-recorded person suggests they've been doing a good job anticipating things, it's quite reasonable to point out non-scorable things they said that seem incorrect, even with no way to score it.

In an earlier draft of my essay, I considered getting into bets he's made (several of which he's lost). I ended up not including those things.  Partly because my focus was waning and it was more attainable to stick to the meta-level point.  And partly because I thought the essay might be better if it was more focused.  I don't think there is literally zero information about his forecasting performance (that's not plausible), but it seemed like it would be more of a distraction from my epistemic point.  Bets are not as informative as Metaculus-style forecasts, but they are better than nothing.  This stuff is a spectrum, even Metaculus doesn't retain some kinds of information about the forecaster.  Still, I didn't get into it, though I could have.

But I ended up later editing in a link to one of Paul's comments, where he describes some reasons that Robin looks pretty bad in hindsight, but also includes several things Eliezer said that seem quite off.  None of those are scorable.  But I added in a link to that, because Eliezer explicitly claimed he came across better in that debate, which overall he may have, but it's actually more mixed than that, and that's relevant to my meta-point that one can obfuscate these things without a proper track record.  And Ben Garfinkel's post is similarly relevant.

If the community felt more ambivalently about Eliezer's forecasts, or even if Eliezer was more ambivalent about his own forecasts? And then there was some guy trying to convince people he has made bad forecasts? Then your objection of one-sidedness would make much more sense to me.  That's not what this is.

Eliezer actively tells people he's anticipating things well, but he deliberately prevents his forecasts from being scorable.  Pundits do that too, and you bet I would eagerly criticize vague non-scorable stuff they said that seems wrong.  And yes, I would retweet someone criticizing those things too.  Does that also bother you?

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2022-06-29T03:28:22.800Z · LW(p) · GW(p)

IMO that's a much more defensible position, and is what the discussion should have initially focused on. From my perspective, the way the debate largely went is:

  • Jotto: Eliezer claims to have a relatively successful forecasting track record, along with Dario and Demis; but this is clearly dissembling, because a forecasting track record needs to look like a long series of Metaculus predictions.
  • Other people: (repeat without qualification the claim that Eliezer is falsely claiming to have a "forecasting track record"; simultaneously claims that Eliezer has a subpar "forecasting track record", based on evidence that wouldn't meet Jotto's stated bar)
  • Jotto: (signal-boosts the inconsistent claims other people are making, without noting that this is equivocating between two senses of "track record" and therefore selectively applying two different standards)
  • Rob B: (gripes and complains)

Whereas the way the debate should have gone is:

  • Jotto: I personally disagree with Eliezer that the AI Foom debate is easy to understand and cash out into rough predictions about how the field has progressed since 2009, or how it is likely to progress in the future. Also, I wish that all of Eliezer, Robin, Demis, Dario, and Paul had made way more Metaculus-style forecasts back in 2010, so it would be easier to compare their prediction performance. I find it frustrating that nobody did this, and think we should start doing this way more now. Also, I think this sharper comparison would probably have shown that Eliezer is significantly worse at thinking about this topic than Paul, and maybe than Robin, Demis, and Dario.
  • Rob B: I disagree with your last sentence, and I disagree quantitatively that stuff like the Foom debate is as hard-to-interpret as you suggest. But I otherwise agree with you, and think it would have been useful if the circa-2010 discussions had included more explicit probability distributions, scenario breakdowns, quantitative estimates, etc. (suitably flagged as unstable, spitballed ass-numbers). Even where these aren't cruxy and don't provide clear evidence about people's quality of reasoning about AGI, it's still just helpful to have a more precise sense of what people's actual beliefs at the time were. "X is unlikely" is way less useful than knowing whether it's more like 30%, or more like 5%, or more like 0.1%, etc.

I think the whole 'X isn't a real track record' thing was confusing, and made your argument sound more forceful than it should have.

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2022-06-29T03:51:41.746Z · LW(p) · GW(p)

Plus maybe some disagreements about how possible it is in general to form good models of people and of topics like AGI in the absence of Metaculus-ish forecasts, and disagreement about exactly how informative it would be to have a hundred examples of narrow-AI benchmark predictions over the last ten years from all the influential EAs?

(I think it would be useful, but more like '1% to 10% of the overall evidence for weighing people's reasoning and correctness about AGI', not '90% to 100% of the evidence'.)

(An exception would be if, e.g., it turned out that ML progress is way more predictable than Eliezer or I believe. ML's predictability is a genuine crux for us, so seeing someone else do amazing at this prediction task for a bunch of years, with foresight rather than hindsight, would genuinely update us a bunch. But we don't expect to learn much from Eliezer or Rob trying to predict stuff, because while someone else may have secret insight that lets them predict the future of narrow-AI advances very narrowly, we are pretty sure we don't know how to do that.)

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2022-06-29T03:54:42.791Z · LW(p) · GW(p)

Part of what I object to is that you're a Metaculus radical, whose Twitter bio says "Replace opinions with forecasts."

This is a view almost no one in the field currently agrees with or tries to live up to.

Which is fine, on its own. I like radicals, and want to hear their views argued for and hashed out in conversation.

But then you selectively accuse Eliezer of lying about having a "track record", without noting how many other people are also expressing non-forecast "opinions" (and updating on these), and while using language in ways that make it sound like Eliezer is doing something more unusual than he is, and making it sound like your critique is more independent of your nonstandard views on track records and "opinions" than it actually is.

That's the part that bugs me. If you have an extreme proposal for changing EA's norms, argue for that proposal. Don't just selectively take potshots at views or people you dislike more, while going easy on everyone else.

Replies from: matthew-barnett, Jotto999
comment by Matthew Barnett (matthew-barnett) · 2022-06-29T06:59:49.250Z · LW(p) · GW(p)

That's the part that bugs me. If you have an extreme proposal for changing EA's norms, argue for that proposal. Don't just selectively take potshots at views or people you dislike more, while going easy on everyone else.

I think Jotto has argued for the proposal in the past. Whether he did it in that particular comment is not very important, so long as he holds everyone to the same standards.

As for his standards: I think he sees Eliezer as an easy target because he's high status in this community and has explicitly said that he thinks his track record is good (in fact, better than other people's).

comment by Jotto999 · 2022-06-30T04:05:50.796Z · LW(p) · GW(p)

I no longer see exchanges with you as a good use of energy, unless you're able to describe some of the strawmanning of me you've done and come clean about that.

EDIT: Since this is being downvoted, here [LW(p) · GW(p)] is a comment chain where Rob Bensinger interpreted me in ways that are bizarre, such as suggesting that I think Eliezer is saying he has "a crystal ball", or that "if you record any prediction anywhere other than Metaculus (that doesn't have similarly good tools for representing probability distributions), you're a con artist".  Things that sound thematically similar to what I was saying, but were weird, persistent extremes that I don't see as good-faith readings of me.  It kept happening over Twitter, then again on LW.  At no point have I felt he's trying to understand what I actually think.  So I don't see the point of continuing with him.

comment by Ege Erdil (ege-erdil) · 2022-06-29T00:42:18.491Z · LW(p) · GW(p)

This is a strawman. Ben Garfinkel never says that Yudkowsky has a bad track record. In fact the only time the phrase "bad track record" comes up in Garfinkel's post is when you mention it in one of your comments.

The most Ben Garfinkel says about Yudkowsky's track record is that it's "at least pretty mixed", which I think the content of the post supports, especially the clear-cut examples. He even emphasizes that he is deliberately cherry-picking bad examples from Eliezer's track record in order to make a point, e.g. about Eliezer never having addressed his own bad predictions from the past.

It's not enough to say "my world model was bad in such and such ways and I've changed it" to address your mistakes; you have to say "I made this specific prediction and it later turned out to be wrong". Can you cite any instance of Eliezer ever doing that?

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2022-06-29T03:08:55.361Z · LW(p) · GW(p)

This is a strawman. Ben Garfinkel never says that Yudkowsky has a bad track record.

In the post, he says "his track record is at best fairly mixed" and "Yudkowsky may have a track record of overestimating or overstating the quality of his insights into AI"; and in the comments, he says "Yudkowsky’s track record suggests a substantial bias toward dramatic and overconfident predictions". 

What makes a track record "bad" is relative, but if Ben objects to my summarizing his view with the imprecise word "bad", then I'll avoid doing that. It doesn't strike me as an important point for anything I said above.

The most Ben Garfinkel says about Yudkowsky's track record is that it's "at least pretty mixed", which I think the content of the post supports, especially the clear-cut examples.

As long as we agree that "track record" includes the kind of stuff Jotto was saying it doesn't include, I'm happy to say that Eliezer's track record includes failures as well as successes. Indeed, I think that would make way more sense.

about Eliezer never having addressed his own bad predictions from the past.

Maybe worth mentioning in passing that this is of course false?

It's not enough to say "my world model was bad in such and such ways and I've changed it" to address your mistakes; you have to say "I made this specific prediction and it later turned out to be wrong". Can you cite any instance of Eliezer ever doing that?

Sure! "I wouldn’t have predicted AlphaGo and lost money betting against the speed of its capability gains".

Replies from: RobbBB, ege-erdil
comment by Rob Bensinger (RobbBB) · 2022-06-29T06:49:09.014Z · LW(p) · GW(p)

I'm happy to say that Eliezer's track record includes failures as well as successes.

Extremely important failures and extremely important successes, no less.

comment by Ege Erdil (ege-erdil) · 2022-06-29T08:13:10.850Z · LW(p) · GW(p)

In the post, he says "his track record is at best fairly mixed" and "Yudkowsky may have a track record of overestimating or overstating the quality of his insights into AI"; and in the comments, he says "Yudkowsky’s track record suggests a substantial bias toward dramatic and overconfident predictions".

Yes, I think all of that checks out. It's hard to say, of course, because Eliezer rarely makes explicit predictions, but insofar as he does make them I think he clearly puts a lot of weight on his inside view into things.

That doesn't make his track record "bad" but it's something to keep in mind when he makes predictions.

Sure! "I wouldn’t have predicted AlphaGo and lost money betting against the speed of its capability gains".

This counts as a mistake but I don't think it's important relative to the bad prediction about AI timelines Ben brings up in his post. If Eliezer explained why he had been wrong then it would make his position now more convincing, especially given his condescending attitude towards e.g. Metaculus forecasts.

I still think there's something about the way Eliezer admits he was wrong that rubs me the wrong way but it's hard to explain what that is right now. It's not correct to say he doesn't admit his mistakes per se, but there's some other problem with how much he seems to "internalize" the fact that he was wrong.

I've retracted my original comment because of your example as it was not correct (despite having the right "vibe", whatever that means).

comment by RomanS · 2021-11-25T08:45:27.190Z · LW(p) · GW(p)

BTW, a few days ago Eliezer made [LW · GW] a specific prediction that is perhaps relevant to your discussion:

I [would very tentatively guess that] AGI [will] kill everyone before self-driving cars are commercialized

(I suppose Eliezer is talking about Level 5 autonomy cars here).

Maybe a bet like this could work:

At least one month will elapse after the first Level 5 autonomy car hits the road, without AGI killing everyone

"Level 5 autonomy" could be further specified to avoid ambiguities. For example, like this:

The car must be publicly accessible (e.g. available for purchase, or as a taxi etc). The car should be able to drive from some East Coast city to some West Coast city by itself. 

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T16:43:19.889Z · LW(p) · GW(p)

Once you can buy a self-driving car, the thing that Paul predicts with surety and that I shrug about has already happened. If it does happen, my model says very little about remaining timeline from there one way or another. It shrugs again and says, "Guess that's how difficult the AI problem and regulatory problem were."

comment by paulfchristiano · 2021-11-24T22:34:43.836Z · LW(p) · GW(p)

sort of person who gets taken in by Hanson's arguments in 2008 and gets caught flatfooted by AlphaGo and GPT-3 and AlphaFold 2

I find this kind of bluster pretty frustrating and condescending. I also feel like the implication is just wrong---if Eliezer and I disagree, I'd guess it's because he's worse at predicting ML progress. To me GPT-3 feels much (much) closer to my mainline than to Eliezer's, and AlphaGo is very unsurprising. But it's hard to say who was actually "caught flatfooted" unless we are willing to state some of these predictions in advance.

I got pulled into this interaction because I wanted to get Eliezer to make some real predictions, on the record, so that we could have a better version of this discussion in 5 years rather than continuing to both say "yeah, in hindsight this looks like evidence for my view." I apologize if my tone (both in that discussion and in this comment) is a bit frustrated.

It currently feels from the inside like I'm holding the epistemic high ground on this point, though I expect Eliezer disagrees strongly:

  • I'm willing to bet on anything Eliezer wants, or to propose my own questions if Eliezer is willing in principle to make forecasts. I expect to outperform Eliezer on these bets and am happy to state in advance that I'd update in his direction if his predictions turned out to be as good as mine. It's possible that we don't have disagreements, but I doubt it. (See my other comment [LW(p) · GW(p)].)
  • I'm not talking this much smack based on "track records" imagined in hindsight. I think that if you want to do this then you should have been making predictions in the past, and you definitely should be willing to make predictions about the future. (I suspect you'll often find that other people don't disagree with the predictions that turned out to be reasonable, even if from your perspective it was all part of one coherent story.)
Replies from: Eliezer_Yudkowsky, matthew-barnett
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-24T23:31:49.083Z · LW(p) · GW(p)

I wish to acknowledge this frustration, and state generally that I think Paul Christiano occupies a distinct and more clueful class than a lot of, like, early EAs who mm-hmmmed along with Robin Hanson on AI - I wouldn't put, eg, Dario Amodei in that class either, though we disagree about other things.

But again, Paul, it's not enough to say that you weren't surprised by GPT-2/3 in retrospect, it kinda is important to say it in advance, ideally where other people can see?  Dario picks up some credit for GPT-2/3 because he clearly called it in advance.  You don't need to find exact disagreements with me to start going on the record as a forecaster, if you think the course of the future is generally narrower than my own guesses - if you think that trends stay on course, where I shrug and say that they might stay on course or break.  (Except that of course in hindsight somebody will always be able to draw a straight-line graph, once they know which graph to draw, so my statement "it might stay on trend or maybe break" applies only to graphs extrapolating into what is currently the future.)

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-25T00:35:49.431Z · LW(p) · GW(p)

Suppose your view is "crazy stuff happens all the time" and my view is "crazy stuff happens rarely." (Of course "crazy" is my word, to you it's just normal stuff.) Then what am I supposed to do, in your game?

More broadly: if you aren't making bold predictions about the future, why do you think that other people will? (My predictions all feel boring to me.) And if you do have bold predictions, can we talk about some of them instead?

It seems to me like I want you to say "well I think 20% chance something crazy happens here" and I say "nah, that's more like 5%" and then we batch up 5 of those and when none of them happen I get a bayes point.

I could just give my forecast. But then if I observe that 2/20 of them happen, how exactly does that help me in figuring out whether I should be paying more attention to your views (or help you snap out of it)?

I can list some particular past bets and future forecasts, but it's really unclear what to do with them without quantitative numbers or a point of comparison.

Like you I've predicted that AI is undervalued and will grow in importance, although I think I made a much more specific prediction that investment in AI would go up a lot in the short term. This made me some money, but like you I just don't care much about money and it's not a game worth playing. I bet quite explicitly on deep learning by pivoting my career into practical ML and then spending years of my life working on it, despite loving theory and thinking it's extremely important. We can debate whether the bet is good, but it was certainly a bet and by my lights it looks very reasonable in retrospect.

Over the next 10 years I think powerful ML systems will be trained mostly by imitating human behavior over short horizons, and then fine-tuned using much smaller amounts of long-horizon feedback. This has long been my prediction, and it's why I've been interested in language modeling, and has informed some of my research. I think that's still basically valid and will hold up in the future. I predict that people will explicitly collect much larger datasets of human behavior as the economic stakes rise. This is in contrast to e.g. theorem-proving working well, although I think that theorem-proving may end up being an important bellwether because it allows you to assess the capabilities of large models without multi-billion-dollar investments in training infrastructure.

I expect to see truly massive training runs in the not that distant future. I think the current rate of scaling won't be sustained, but that over the next 10-20 years scaling will get us into human-level behavior for "short-horizon" tasks which may or may not suffice for transformative AI. I expect that to happen at model sizes within 2 orders of magnitude of the human brain on one side or the other, i.e. 1e12 to 1e16 parameters.

I could list a lot more, but I don't think any of it seems bold and it's not clear what the game is. It's clearly bold by comparison to market forecasts or broader elite consensus, but so what? I understand much better how to compare one predictor to another. I mostly don't know what it means to evaluate a predictor on an absolute scale.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T05:03:44.557Z · LW(p) · GW(p)

I predict that people will explicitly collect much larger datasets of human behavior as the economic stakes rise. This is in contrast to e.g. theorem-proving working well, although I think that theorem-proving may end up being an important bellwether because it allows you to assess the capabilities of large models without multi-billion-dollar investments in training infrastructure.

Well, it sounds like I might be more bullish than you on theorem-proving, possibly.  Not on it being useful or profitable, but in terms of underlying technology making progress on non-profitable amazing demo feats, maybe I'm more bullish on theorem-proving than you are?  Is there anything you think it shouldn't be able to do in the next 5 years?

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-25T22:34:11.354Z · LW(p) · GW(p)

I'm going to make predictions by drawing straight-ish lines through metrics like the ones in the GPT-f paper. Big unknowns are then (i) how many orders of magnitude of "low-hanging fruit" are there before theorem-proving even catches up to the rest of NLP? (ii) how hard their benchmarks are compared to other tasks we care about. On (i) my guess is maybe 2? On (ii) my guess is "they are pretty easy" / "humans are pretty bad at these tasks," but it's somewhat harder to quantify. If you think your methodology is different from that then we will probably end up disagreeing.
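
A minimal sketch of the kind of straight-line extrapolation being described, on invented benchmark numbers (not the actual GPT-f results); for a bounded metric like a solve rate, one natural choice is to fit the line in log-odds space and read later years off the fit.

```python
import numpy as np

# Invented yearly scores on a theorem-proving benchmark (fraction of problems solved);
# illustrative numbers only, not the GPT-f results.
years = np.array([2018.0, 2019.0, 2020.0, 2021.0])
solve_rate = np.array([0.05, 0.11, 0.21, 0.42])

# "Straight-ish line" through the metric: least-squares fit in log-odds space.
logit = np.log(solve_rate / (1 - solve_rate))
slope, intercept = np.polyfit(years, logit, 1)

# Read the extrapolated solve rate off the fitted line for later years.
for y in (2023, 2026):
    z = slope * y + intercept
    print(y, 1 / (1 + np.exp(-z)))
```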

Looking towards more ambitious benchmarks, I think that the IMO grand challenge is currently significantly more than 5 years away. In 5 years' time my median guess (almost without any thinking about it) is that automated solvers can do 10% of non-geometry, non-3-variable-inequality IMO shortlist problems.

So yeah, I'm happy to play ball in this area, and I expect my predictions to be somewhat more right than yours after the dust settles. Is there some way of measuring such that you are willing to state any prediction?

(I still feel like I'm basically looking for any predictions at all beyond sometimes saying "my model wouldn't be surprised by <vague thing X>", whereas I'm pretty constantly throwing out made-up guesses which I'm happy to refine with more effort. Obviously I'm going to look worse in retrospect than you if we keep up this way, though; that particular asymmetry is a lot of the reason people mostly don't play ball. ETA: that's a bit unfair; the romantic chatbot vs self-driving car prediction is one where we've both given off-the-cuff takes.)

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T23:27:47.213Z · LW(p) · GW(p)

I have a sense that there's a lot of latent potential for theorem-proving to advance if more energy gets thrown at it, in part because current algorithms seem a bit weird to me - that we are waiting on the equivalent of neural MCTS as an enabler for AlphaGo, not just a bigger investment, though of course the key trick could already have been published in any of a thousand papers I haven't read.  I feel like I "would not be surprised at all" if we get a bunch of shocking headlines in 2023 about theorem-proving problems falling, after which the IMO challenge falls in 2024 - though of course, as events like this lie in the Future, they are very hard to predict.

Can you say more about why or whether you would, in this case, say that this was an un-Paulian set of events?  As I have trouble manipulating my Paul model, it does not exclude Paul saying, "Ah, yes, well, they were using 700M models in that paper, so if you jump to 70B, of course the IMO grand challenge could fall; there wasn't a lot of money there."  Though I haven't even glanced at any metrics here, let alone metrics that the IMO grand challenge could be plotted on, so if smooth metrics rule out IMO in 5yrs, I am more interested yet - it legit decrements my belief, but not nearly as much as I imagine it would decrement yours.

(Edit:  Also, on the meta-level, is this, like, anywhere at all near the sort of thing you were hoping to hear from me?  Am I now being a better epistemic citizen, if maybe not a good one by your lights?)

Replies from: paulfchristiano, matthew-barnett
comment by paulfchristiano · 2021-11-26T06:49:43.393Z · LW(p) · GW(p)

Yes, IMO challenge falling in 2024 is surprising to me at something like the 1% level or maybe even more extreme (though could also go down if I thought about it a lot or if commenters brought up relevant considerations, e.g. I'd look at IMO problems and gold medal cutoffs and think about what tasks ought to be easy or hard; I'm also happy to make more concrete per-question predictions). I do think that there could be huge amounts of progress from picking the low hanging fruit and scaling up spending by a few orders of magnitude, but I still don't expect it to get you that far. 

I don't think this is an easy prediction to extract from a trendline, in significant part because you can't extrapolate trendlines this early that far out. So this is stress-testing different parts of my model, which is fine by me.

At the meta-level, this is the kind of thing I'm looking for, though I'd prefer to have some kind of quantitative measure of how not-surprised you are. If you are only saying 2% then we probably want to talk about things less far in your tails than the IMO challenge.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-26T07:32:23.207Z · LW(p) · GW(p)

Okay, then we've got at least one Eliezerverse item, because I've said below that I think I'm at least 16% for IMO theorem-proving by end of 2025.  The drastic difference here causes me to feel nervous, and my second-order estimate has probably shifted some in your direction just from hearing you put 1% on 2024, but that's irrelevant because it's first-order estimates we should be comparing here.

So we've got huge GDP increases for before-End-days signs of Paulverse and quick IMO proving for before-End-days signs of Eliezerverse?  Pretty bare portfolio but it's at least a start in both directions.  If we say 5% instead of 1%, how much further would you extend the time limit out beyond 2024?

I also don't know at all what part of your model forbids theorem-proving to fall in a shocking headline followed by another headline a year later - it doesn't sound like it's from looking at a graph - and I think that explaining reasons behind our predictions in advance, not just making quantitative predictions in advance, will help others a lot here.

EDIT: Though the formal IMO challenge has a barnacle about the AI being open-sourced, which is a separate sociological prediction I'm not taking on.

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-26T19:28:18.031Z · LW(p) · GW(p)

I think an IMO gold medal could come well before massive economic impact; I'm just surprised if it happens in the next 3 years. After a bit more thinking (but not actually looking at IMO problems or the state of theorem proving) I probably want to bump that up a bit, maybe 2%; it's hard to reason about the tails.

I'd say <4% on end of 2025.

I think this is the flipside of me having an intuition where I say things like "AlphaGo and GPT-3 aren't that surprising"---I have a sense for what things are and aren't surprising, and not many things happen that are so surprising.

If I'm at 4% and you are 12% and we had 8 such bets, then I can get a factor of 2 if they all come out my way, and you get a factor of ~1.5 if one of them comes out your way.
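
A minimal sketch of that likelihood-ratio arithmetic, assuming the 4% and 12% figures above and 8 independent bets scored against each other by likelihood ratio:

```python
# Rough check of the relative-odds arithmetic above (illustrative only).
# Two forecasters assign 4% and 12% to each of 8 independent events.

p_low, p_high, n_bets = 0.04, 0.12, 8

# All 8 events fail to happen: every non-event favors the lower probability.
factor_if_none_happen = ((1 - p_low) / (1 - p_high)) ** n_bets

# Exactly one event happens: that event favors the higher probability,
# while the other 7 non-events still favor the lower one.
factor_if_one_happens = (p_high / p_low) * ((1 - p_high) / (1 - p_low)) ** (n_bets - 1)

print(round(factor_if_none_happen, 2))  # ~2.01, i.e. "a factor of 2"
print(round(factor_if_one_happens, 2))  # ~1.63, i.e. "a factor of ~1.5"
```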

I might think more about this and get a more coherent probability distribution, but unless I say something else by end of 2021 you can consider 4% on end of 2025 to be my prediction.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-27T00:38:20.434Z · LW(p) · GW(p)

Maybe another way of phrasing this - how much warning do you expect to get, how far out does your Nope Vision extend?  Do you expect to be able to say "We're now in the 'for all I know the IMO challenge could be won in 4 years' regime" more than 4 years before it happens, in general?  Would it be fair to ask you again at the end of 2022 and every year thereafter if we've entered the 'for all I know, within 4 years' regime?

Added:  This question fits into a larger concern I have about AI soberskeptics in general (not you, the soberskeptics would not consider you one of their own) where they saunter around saying "X will not occur in the next 5 / 10 / 20 years" and they're often right for the next couple of years, because there's only one year where X shows up for any particular definition of that, and most years are not that year; but also they're saying exactly the same thing up until 2 years before X shows up, if there's any early warning on X at all.  It seems to me that 2 years is about as far as Nope Vision extends in real life, for any case that isn't completely slam-dunk; when I called upon those gathered AI luminaries to say the least impressive thing that definitely couldn't be done in 2 years, and they all fell silent, and then a single one of them named Winograd schemas, they were right that Winograd schemas at the stated level didn't fall within 2 years, but very barely so (they fell the year after).  So part of what I'm flailingly asking here, is whether you think you have reliable and sensitive Nope Vision that extends out beyond 2 years, in general, such that you can go on saying "Not for 4 years" up until we are actually within 6 years of the thing, and then, you think, your Nope Vision will actually flash an alert and you will change your tune, before you are actually within 4 years of the thing.  Or maybe you think you've got Nope Vision extending out 6 years?  10 years?  Or maybe theorem-proving is just a special case and usually your Nope Vision would be limited to 2 years or 3 years?

This is all an extremely Yudkowskian frame on things, of course, so feel free to reframe.

Replies from: paulfchristiano
comment by paulfchristiano · 2021-11-28T01:47:41.973Z · LW(p) · GW(p)

I think I'll get less confident as our accomplishments get closer to the IMO grand challenge. Or maybe I'll get much more confident if we scale up from $1M -> $1B and pick the low-hanging fruit without getting fairly close, since at that point further progress gets a lot easier to predict.

There's not really a constant time horizon for my pessimism, it depends on how long and robust a trend you are extrapolating from. 4 years feels like a relatively short horizon, because theorem-proving has not had much investment so compute can be scaled up several orders of magnitude, and there is likely lots of low-hanging fruit to pick, and we just don't have much to extrapolate from (compared to more mature technologies, or how I expect AI will be shortly before the end of days), and for similar reasons there aren't really any benchmarks to extrapolate.

(Also note that it matters a lot whether you know what problems labs will try to take a stab at. For the purpose of all of these forecasts, I am trying insofar as possible to set aside all knowledge about what labs are planning to do, though that's obviously not incentive-compatible and there's no particular reason you should trust me to do that.)

comment by Matthew Barnett (matthew-barnett) · 2021-11-26T00:05:34.062Z · LW(p) · GW(p)

I feel like I "would not be surprised at all" if we get a bunch of shocking headlines in 2023 about theorem-proving problems falling, after which the IMO challenge falls in 2024

Possibly helpful: Metaculus currently puts the chances of the IMO grand challenge falling by 2025 at about 8%. Their median is 2039.

I think this would make a great bet, as it would definitely show that your model can strongly outperform a lot of people (and potentially Paul too). And the operationalization for the bet is already there -- so little work will be needed to do that part.

Replies from: Eliezer_Yudkowsky, paulfchristiano
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-26T00:20:17.662Z · LW(p) · GW(p)

Ha!  Okay then.  My probability is at least 16%, though I'd have to think more and Look into Things, and maybe ask for such sad little metrics as are available before I was confident saying how much more.  Paul?

EDIT:  I see they want to demand that the AI be open-sourced publicly before the first day of the IMO, which unfortunately sounds like the sort of foolish little real-world obstacle which can prevent a proposition like this from being judged true even where the technical capability exists.  I'll stand by a >16% probability of the technical capability existing by end of 2025, as reported on eg solving a non-trained/heldout dataset of past IMO problems, conditional on such a dataset being available; I frame no separate sociological prediction about whether somebody is willing to open-source the AI model that does it.

Replies from: paulfchristiano, paulfchristiano, matthew-barnett, matthew-barnett
comment by paulfchristiano · 2021-11-29T07:43:46.487Z · LW(p) · GW(p)

I don't care about whether the AI is open-sourced (I don't expect anyone to publish the weights even if they describe their method) and I'm not that worried about our ability to arbitrate overfitting.

Ajeya suggested that I clarify: I'm significantly more impressed by an AI getting a gold medal than getting a bronze, and my 4% probability is for getting a gold in particular (as described in the IMO grand challenge). There are some categories of problems that can be solved using easy automation (I'd guess about 5-10% could be done with no deep learning and modest effort). Together with modest progress in deep learning based methods, and a somewhat serious effort, I wouldn't be surprised by people getting up to 20-40% of problems. The bronze cutoff is usually 3/6 problems, and the gold cutoff is usually 5/6 (assuming the AI doesn't get partial credit). The difficulty of problems also increases very rapidly for humans---there are often 3 problems that a human can do more-or-less mechanically.

I could tighten any of these estimates by looking at the distribution more carefully rather than going off of my recollections from 2008, and if this was going to be one of a handful of things we'd bet about I'd probably spend a few hours doing that and some other basic digging.

Replies from: paulfchristiano
comment by paulfchristiano · 2021-12-02T07:12:39.116Z · LW(p) · GW(p)

I looked at a few recent IMOs to get better calibrated. I think the main update is that I significantly underestimated how many years you can get a gold with only 4/6 problems.

For example I don't have the same "this is impossible" reaction about IMO 2012 or IMO 2015 as about most years. That said, I feel like they'd have to get reasonably lucky with the IMO content, and someone would have to make a serious and mostly-successful effort, but I'm at least a bit scared by that. There's also quite often a geo problem as 3 or 6.

Might be good to make some side bets:

  • Conditioned on winning I think it's only maybe 20% probability to get all 6 problems (whereas I think you might have a higher probability on jumping right past human level, or at least have 50% on 6 vs 5?).
  • Conditioned on a model getting 3+ problems I feel like we have a pretty good guess about what algorithm will be SOTA on this problem (e.g. I'd give 50% to a pretty narrow class of algorithms with some uncertain bells and whistles, with no inside knowledge). Whereas I'd guess you have a much broader distribution.

But more useful to get other categories of bets. (Maybe in programming, investment in AI, economic impact from robotics, economic impact from chatbots, translation?)

Replies from: paulfchristiano
comment by paulfchristiano · 2021-12-02T17:16:05.862Z · LW(p) · GW(p)

Going through the previous ten IMOs, and imagining a very impressive automated theorem prover, I think

  • 2020 - unlikely, need 5/6 and probably can't get problems 3 or 6. Also good chance to mess up at 4 or 5
  • 2019 - tough but possible, 3 seems hard but even that is not unimaginable, 5 might be hard but might be straightforward, and it can afford to get one wrong
  • 2018 - tough but possible, 3 is easier for machine than human but probably still hard, 5 may be hard, can afford to miss one
  • 2017 - tough but possible, 3 looks out of reach, 6 looks hard but not sure about that, 5 looks maybe hard, 1 is probably easy. But it can miss 2, which could happen.
  • 2016 - probably not possible, 3 and 6 again look hard, and good chance to fail on 2 and 5, only allowed to miss 1
  • 2015 - seems possible, 3 might be hard but like 50-50 it's simple for machine,  6 is probably hard, but you can miss 2
  • 2014 - probably not possible, can only miss 1, probably miss one of 2 or 5 and 6
  • 2013 - probably not possible, 6 seems hard, 2 seems very hard, can only miss 1
  • 2012 - tough but possible, 6 and 3 look hard but you can miss 2
  • 2011 - seems possible, allowed to miss two and both 3 and 6 look brute-forceable

Overall this was much easier than I expected. 4/10 seem unlikely, 4/10 seem tough but possible, 2/10 I can imagine a machine doing it. There are a lot of problems that look really hard, but there are a fair number of tests where you can just skip those.

That said, even to get the possible ones you do need to be surprisingly impressive, and that's getting cut down by like 25-50% for a solvable test. Still, they get to keep trying (assuming they get promising results in early years), and eventually they will hit one of the easier years.

It also looks fairly likely to me that if one of DeepMind or OpenAI tries seriously they will be able to get an HM (honorable mention) with a quite reasonable chance at bronze, and this is maybe enough of a PR coup to motivate work, and then it's more likely there will be a large effort subsequently to finish the job or to opportunistically take advantage of an easy test.

Overall I'm feeling bad about my 4%; I deserve to lose some points regardless, but I might think about what my real probability is after looking at tests (though I was also probably moved by other folks in EA systematically giving higher estimates than I did).

Replies from: gwern
comment by gwern · 2021-12-03T00:19:50.685Z · LW(p) · GW(p)

What do you think of Deepmind's new whoop-de-doo about doing research-level math assisted by GNNs?

Replies from: paulfchristiano
comment by paulfchristiano · 2021-12-06T18:22:06.121Z · LW(p) · GW(p)

Not surprising in any of the ways that good IMO performance would be surprising.

comment by paulfchristiano · 2021-12-06T18:30:57.196Z · LW(p) · GW(p)

Based on the other thread I now want to revise this prediction, both because 4% was too low and "IMO gold" has a lot of noise in it based on test difficulty.

I'd put 4% on "For the 2022, 2023, 2024, or 2025 IMO an AI built before the IMO is able to solve the single hardest problem" where "hardest problem" = "usually problem #6, but use problem #3 instead if either: (i) problem 6 is geo or (ii) problem 3 is combinatorics and problem 6 is algebra." (I'd prefer to just pick the hardest problem after seeing the test, but it seems better to commit to a procedure.)

Maybe I'll go 8% on "gets gold" instead of "solves hardest problem."

Would be good to get your updated view on this so that we can treat it as staked out predictions.

Replies from: Benito
comment by Ben Pace (Benito) · 2022-02-02T19:55:29.960Z · LW(p) · GW(p)

(News: OpenAI has built a theorem-prover that solved many AMC12 and AIME competition problems, and 2 IMO problems, and they say they hope this leads to work that wins the IMO Grand Challenge.)

comment by Matthew Barnett (matthew-barnett) · 2021-11-26T00:52:23.646Z · LW(p) · GW(p)

I'll stand by a >16% probability of the technical capability existing by end of 2025, as reported on eg solving a non-trained/heldout dataset of past IMO problems, conditional on such a dataset being available

It feels like this bet would look a lot better if it were about something that you predict at well over 50% (with people in Paul's camp still maintaining less than 50%). So, we could perhaps modify the terms such that the bot would only need to surpass a certain rank or percentile-equivalent in the competition (and not necessarily receive the equivalent of a Gold medal).

The relevant question is which rank/percentile you think is likely to be attained by 2025 under your model but you predict would be implausible under Paul's model. This may be a daunting task, but one way to get started is to put a probability distribution over what you think the state-of-the-art will look like by 2025, and then compare to Paul's.

Edit: Here are, for example, the individual rankings for 2021: https://www.imo-official.org/year_individual_r.aspx?year=2021

Replies from: Eliezer_Yudkowsky, RobbBB
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-26T01:30:10.523Z · LW(p) · GW(p)

I expect it to be hella difficult to pick anything where I'm at 75% that it happens in the next 5 years and Paul is at 25%.  Heck, it's not easy to find things where I'm at over 75% that aren't just obvious slam dunks; the Future isn't that easy to predict.  Let's get up to a nice crawl first, and then maybe a small portfolio of crawlings, before we start trying to make single runs that pierce the sound barrier.

I frame no prediction about whether Paul is under 16%.  That's a separate matter.  I think a little progress is made toward eventual epistemic virtue if you hand me a Metaculus forecast and I'm like "lol wut" and double their probability, even if it turns out that Paul agrees with me about it.

comment by Rob Bensinger (RobbBB) · 2021-11-26T01:32:57.297Z · LW(p) · GW(p)

It feels like this bet would look a lot better if it were about something that you predict at well over 50% (with people in Paul's camp still maintaining less than 50%).

My model of Eliezer may be wrong, but I'd guess that this isn't a domain where he has many over-50% predictions of novel events at all? See also 'I don't necessarily expect self-driving cars before the apocalypse'.

My Eliezer-model has a more flat prior over what might happen, which therefore includes stuff like 'maybe we'll make insane progress on theorem-proving (or whatever) out of the blue'. Again, I may be wrong, but my intuition is that you're Paul-omorphizing Eliezer when you assume that >16% probability of huge progress in X by year Y implies >50% probability of smaller-but-meaningful progress in X by year Y.

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2021-11-26T01:34:00.146Z · LW(p) · GW(p)

(Ah, EY already replied.)

comment by Matthew Barnett (matthew-barnett) · 2021-11-26T01:17:01.298Z · LW(p) · GW(p)

If this task is bad for operationalization reasons, there are other theorem-proving benchmarks. Unfortunately, as far as I'm aware, there aren't a lot of people currently trying to improve on the known benchmarks.

The code generation benchmarks are slightly more active. I'm personally partial to Hendrycks et al.'s APPS benchmark, which includes problems that "range in difficulty from introductory to collegiate competition level and measure coding and problem-solving ability." (Github link).

comment by paulfchristiano · 2021-12-02T06:58:24.139Z · LW(p) · GW(p)

I think Metaculus is closer to Eliezer here: conditioned on this problem being resolved it seems unlikely for the AI to be either open-sourced or easily reproducible.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-12-02T07:38:15.225Z · LW(p) · GW(p)

My honest guess is that most predictors didn’t see that condition and the distribution would shift right if someone pointed that out in the comments.

comment by Matthew Barnett (matthew-barnett) · 2021-11-24T22:56:59.265Z · LW(p) · GW(p)

To me GPT-3 feels much (much) closer to my mainline than to Eliezer's

To add to this sentiment, I'll post the graph from my notebook on language model progress. I refer to the Penn Treebank task a lot when making this point because it seems to have a lot of good data, but you can also look at the other tasks and see basically the same thing. 

The last dip in the chart is from GPT-3. It looks like GPT-3 was indeed a discontinuity in progress but not a very shocking one. It roughly would have taken about one or two more years at ordinary progress to get to that point anyway -- which I just don't see as being all that impressive.

I sorta feel like the main reason why lots of people found GPT-3 so impressive was because OpenAI was just good at marketing the results [ETA: sorry, I take back the use of the word "marketing"]. Maybe OpenAI saw an opportunity to dump a lot of compute into language models and have a two year discontinuity ahead of everyone else, and showcase their work. And that strategy seemed to work really well for them.

I admit this is an uncharitable explanation, but is there a better story to tell about why GPT-3 captured so much attention?

Replies from: gwern
comment by gwern · 2021-11-24T23:37:33.557Z · LW(p) · GW(p)

The impact of GPT-3 had nothing whatsoever to do with its perplexity on Penn Treebank. I think this is a good example of why focusing on perplexity and 'straight lines on graph go brr' is so terrible, such cargo cult mystical thinking, and crippling. There's something astonishing to see someone resort to explaining away GPT-3's impact as 'OpenAI was just good at marketing the results'. Said marketing consisted of: 'dropping a paper on Arxiv'. Not even tweeting it! They didn't even tweet the paper! (Forget an OA blog post, accompanying NYT/TR articles, tweets by everyone at OA, a fancy interactive interface - none of that.) And most of the initial reaction was "GPT-3: A Disappointing Paper"-style. If this is marketing genius, then it is truly 40-d chess, is all I can say.

The impact of GPT-3 was in establishing that trendlines did continue in a way that shocked pretty much everyone who'd written off 'naive' scaling strategies. Progress is made out of stacked sigmoids: if the next sigmoid doesn't show up, progress doesn't happen. Trends happen, until they stop. Trendlines are not caused by the laws of physics. You can dismiss AlphaGo by saying "oh, that just continues the trendline in ELO I just drew based on MCTS bots", but the fact remains that MCTS progress had stagnated, and here we are in 2021, and pure MCTS approaches do not approach human champions, much less beat them. (This is also true of SVMs. Notice SVMs solving ImageNet because the trendlines continued? No, of course you did not. It drives me bonkers to see AI Impacts etc make arguments like "deep learning is unimportant because look, ImageNet follows a trendline". Sheer numerology.) Appealing to trendlines is roughly as informative as "calories in calories out"; 'the trend continued because the trend continued'. A new sigmoid being discovered is extremely important.
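
A toy sketch of the stacked-sigmoids picture (all parameters below are invented for illustration): each technique contributes its own S-curve, the aggregate metric is their sum, and if the next sigmoid never shows up the aggregate trend stalls rather than continuing.

```python
# Toy model of "progress is made out of stacked sigmoids" (illustrative only;
# every number here is made up). Each technique contributes an S-curve of
# capability over time; the aggregate metric is their sum, which can look like
# a smooth trend even though a missing component would flatten the curve.

import numpy as np

def sigmoid(t, midpoint, height, width):
    return height / (1.0 + np.exp(-(t - midpoint) / width))

t = np.linspace(0, 30, 301)  # "years"

# Three hypothetical techniques arriving at different times.
components = [
    sigmoid(t, midpoint=5,  height=1.0, width=1.5),
    sigmoid(t, midpoint=13, height=1.2, width=1.5),
    sigmoid(t, midpoint=21, height=1.4, width=1.5),
]

aggregate = sum(components)

# If the third sigmoid never "shows up", the trend stalls instead of continuing.
stalled = sum(components[:2])

print(aggregate[-1], stalled[-1])  # ~3.6 vs ~2.2 at the end of the window
```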

GPT-3 further showed completely unpredicted emergence of capabilities across downstream tasks which are not measured in PTB perplexity. There is nothing obvious about a PTB BPC of 0.80 that causes it to be useful where 0.90 is largely useless and 0.95 is a laughable toy. (OAers may have had faith in scaling, but they could not have told you in 2015 that interesting behavior would start at 𝒪(1b), and it'd get really cool at 𝒪(100b).) That's why it's such a useless metric. There's only one thing that a PTB perplexity can tell you, under the pretraining paradigm: when you have reached human AGI level. (Which is useless for obvious reasons: much like saying that "if you hear the revolver click, the bullet wasn't in that chamber and it was safe". Surely true, but a bit late.) It tells you nothing about intermediate levels. I'm reminded of the Steven Kaas line:

Why idly theorize when you can JUST CHECK and find out the ACTUAL ANSWER to a superficially similar-sounding question SCIENTIFICALLY?

Using PTB, and talking only about perplexity, is a precise answer to the wrong question. (This is a much better argument when it comes to AlphaGo/ELO, because at least there, 'ELO' is in fact the ultimate objective, and not a proxy pretext. But perplexity is of no interest to anyone except an information theorist. Unfortunately, we lack any 'take-over-the-world-ELO' we can benchmark models on and extrapolate there. If we did and there was a smooth curve, I would indeed agree that we should adopt that as the baseline. But the closest things we have to downstream tasks are all wildly jumpy - even superimposing scores of downstream tasks barely gives you a recognizable smooth curve, and certainly nothing remotely as smooth as the perplexity curve. My belief is that this is because the overall perplexity curve comes from hundreds or thousands of stacked sigmoids and plateau/breakthroughs averaging out in terms of prediction improvements.) It sure would be convenient if the only number that mattered in AI or its real-world impact or risk was also the single easiest one to measure!

I emphasized this poverty of extrapolation in my scaling hypothesis writeup already, but permit me to vent a little more here:

"So, you're forecasting AI progress using PTB perplexity/BPC. Cool, good work, nice notebook, surely this must be useful for forecasting on substantive AI safety/capability questions of interest to us. I see it's a pretty straight line on a graph. OK, can you tell me at what BPC a large language model could do stuff like hack computers and escape onto the Internet?"

"No. I can tell you what happens if I draw the line out x units, though."

"Perhaps that's an unfairly specific question to ask, as important as it is. OK, can you tell me when we can expect to see well-known benchmarks like Winograd schemas be solved?"

"No. I can draw you a line on PTB to estimate when PTB is solved, though, if you give me a second and define a bound for 'PTB is solved'."

"Hm. Can you at least tell me when we can expect to see meta-learning emerge, with good few-shot learning - does the graph predict 0.1b, 1b, 10b, 100b, or what?"

"No idea."

"Do you know what capabilities will be next to emerge? We got pretty good programming performance in Copilot at 𝒪(100b), what's next?"

"I don't know."

"Can you qualitatively describe what we'd get at 1t, or 10t?"

"No, but I can draw the line in perplexity. It gets pretty low."

"How about the existence of any increasing returns to scale in downstream tasks? Does it tell us anything about spikes in capabilities (such as we observe in many places, such as text style transfer and inner monologue in LaMDA at 100b; most recently BIG-bench [LW(p) · GW(p)])? Such as whether there are any more spikes past 𝒪(100b), whether we'll see holdouts like causality suddenly fall at 𝒪(1000b), anything like that?"

"No."

"How about RL: what sort of world modeling can we get by plugging them into DRL agents?"

"I don't know."

"Fine, let's leave it at tool AIs doing text in text out. Can you tell me how much economic value will be driven by dropping another 0.01 BPC?"

"No. I can tell you how much it'd cost in GPU-time, though, by the awesome power of drawing lines!"

"OK, how about that: how low does it need to go to support a multi-billion dollar company running something like the OA API, to defray the next 0.01 drop and pay for the GPU-time to get more drops?"

"No idea."

"How do you know BPC is the right metric to use?"

"Oh, we have lots of theories about it, but I'll level with you: we always have theories for everything, but really, we chose BPC post hoc out of a few thousand metrics littering Arxiv like BLEU, ROUGE, SSA etc after seeing that it worked and better BPC = better models."

"Can you write down your predictions about any of this?"

"Absolutely not."

"Can anyone else?"

"Sure. But they're all terribly busy."

"Did you write down your predictions before now, then?"

"Oh gosh no, I wasn't around then."

"Did... someone... else... write down their predictions before?"

"Not that I'm aware of."

"Ugh. Fine, what can you tell me about AI safety/risk/capabilities/economics/societal-disruption with these analyses of absolute loss?"

"Lines go straight?"


Seems to me that instead of gradualist narratives it would be preferable to say with Socrates that we are wise about scaling only in that we know we know little & about the least.

Replies from: Eliezer_Yudkowsky, Eliezer_Yudkowsky, paulfchristiano, matthew-barnett, awenonian, Veedrac
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T05:10:12.307Z · LW(p) · GW(p)

And to say it also explicitly, I think this is part of why I have trouble betting with Paul.  I have a lot of ? marks on the questions that the Gwern voice is asking above, regarding them as potentially important breaks from trend that just get dumped into my generalized inbox one day.  If a gradualist thinks that there ought to be a smooth graph of perplexity with respect to computing power spent, in the future, that's something I don't care very much about except insofar as it relates in any known way whatsoever to questions like those the Gwern voice is asking.  What does it even mean to be a gradualist about any of the important questions like those of the Gwern-voice, when they don't relate in known ways to the trend lines that are smooth?  Isn't this sort of a shell game where our surface capabilities do weird jumpy things, we can point to some trend lines that were nonetheless smooth, and then the shells are swapped and we're told to expect gradualist AGI surface stuff?  This is part of the idea that I'm referring to when I say that, even as the world ends, maybe there'll be a bunch of smooth trendlines underneath it that somebody could look back and point out.  (Which you could in fact have used to predict all the key jumpy surface thresholds, if you'd watched it all happen on a few other planets and had any idea of where jumpy surface events were located on the smooth trendlines - but we haven't watched it happen on other planets so the trends don't tell us much we want to know.)

Replies from: paulfchristiano, matthew-barnett
comment by paulfchristiano · 2021-11-25T22:22:31.785Z · LW(p) · GW(p)

This seems totally bogus to me.

It feels to me like you mostly don't have views about the actual impact of AI as measured by jobs that it does or the $s people pay for them, or performance on any benchmarks that we are currently measuring, while I'm saying I'm totally happy to use gradualist metrics to predict any of those things. If you want to say "what does it mean to be a gradualist" I can just give you predictions on them. 

To you this seems reasonable, because e.g. $ and benchmarks are not the right way to measure the kinds of impacts we care about. That's fine, you can propose something other than $ or measurable benchmarks. If you can't propose anything, I'm skeptical.

My basic guess is that you probably can't effectively predict $ or benchmarks or anything else quantitative. If you actually agreed with me on all that stuff, then I might suspect that you are equivocating between a gradualist-like view that you use for making predictions about everything near term and then switching to a more bizarre perspective when talking about the future. But fortunately I think this is more straightforward, because you are basically being honest when you say that you don't understand how the gradualist perspective makes predictions.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T22:55:28.391Z · LW(p) · GW(p)

I kind of want to see you fight this out with Gwern (not least for social reasons, so that people would perhaps see that it wasn't just me, if it wasn't just me).

But it seems to me that the very obvious GPT-5 continuation of Gwern would say, "Gradualists can predict meaningless benchmarks, but they can't predict the jumpy surface phenomena we see in real life."  We want to know when humans land on the moon, not whether their brain sizes continued on a smooth trend extrapolated over the last million years.

I think there's a very real sense in which, yes, what we're interested in are milestones, and often milestones that aren't easy to define even after the fact.  GPT-2 was shocking, and then GPT-3 carried that shock further in that direction, but how do you talk about that with somebody who thinks that perplexity loss is smooth?  I can handwave statements like "GPT-3 started to be useful without retraining via just prompt engineering" but qualitative statements like those aren't good for betting, and it's much much harder to come up with the right milestone like that in advance, instead of looking back in your rearview mirror afterwards.

But you say - I think? - that you were less shocked by this sort of thing than I am.  So, I mean, can you prophesy to us about milestones and headlines in the next five years?  I think I kept thinking this during our dialogue, but never saying it, because it seemed like such an unfair demand to make!  But it's also part of the whole point that AGI and superintelligence and the world ending are all qualitative milestones like that.  Whereas such trend points as Moravec was readily able to forecast correctly - like 10 teraops / plausibly-human-equivalent-computation being available in a $10 million supercomputer around 2010 - are really entirely unanchored from AGI, at least relative to our current knowledge about AGI.  (They would be anchored if we'd seen other planets go through this, but we haven't.)

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-25T23:10:33.174Z · LW(p) · GW(p)

But it seems to me that the very obvious GPT-5 continuation of Gwern would say, "Gradualists can predict meaningless benchmarks, but they can't predict the jumpy surface phenomena we see in real life."

Don't you think you're making a falsifiable prediction here?

Name something that you consider part of the "jumpy surface phenomena" that will show up substantially before the world ends (that you think Paul doesn't expect). Predict a discontinuity. Operationalize everything and then propose the bet.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T23:30:19.108Z · LW(p) · GW(p)

(I'm currently slightly hopeful about the theorem-proving thread, elsewhere and upthread.)

comment by Matthew Barnett (matthew-barnett) · 2021-11-25T05:56:46.464Z · LW(p) · GW(p)

What does it even mean to be a gradualist about any of the important questions like those of the Gwern-voice, when they don't relate in known ways to the trend lines that are smooth?

Perplexity is one general “intrinsic” measure of language models, but there are many task-specific measures too. Studying the relationship between perplexity and task-specific measures is an important part of the research process. We shouldn’t speak as if people do not actively try to uncover these relationships.

I would generally be surprised if there were many highly non-linear relationships between perplexity and something like Winograd accuracy, human evaluation, or whatever other concrete measure you can come up with, such that the underlying behavior of the surface phenomenon is best described as a discontinuity with the past even when the latent perplexity changed smoothly. I admit the existence of some measures that exhibit these qualities (such as, potentially, the ability to do arithmetic), but I expect them to be quite a bit harder to find than the reverse.

Furthermore, it seems like if this is the crux — ie. that surface-level qualitative phenomena will experience discontinuities even while latent variables do not — then I do not understand why it’s hard to come up with bet conditions.

Can’t you just pick a surface level phenomenon that’s easy to measure and strongly interpretable in a qualitative sense — like Sensibleness and Specificity Average from the paper on Google’s chatbot — and then predict discontinuities in that metric?

(I should note that the paper shows a highly linear relationship between perplexity and Sensibleness and Specificity Average. Just look at the first plot in the PDF.)
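
A minimal sketch of how a bet like that could be operationalized, assuming a small table of (perplexity, downstream-metric) pairs; the numbers below are invented placeholders, not data from the paper:

```python
# Minimal sketch of one way to operationalize "discontinuity in a surface
# metric vs. a smoothly varying latent metric". The data is invented; in
# practice you would use reported (perplexity, sensibleness) pairs like the
# ones in the plot mentioned above.

import numpy as np

perplexity = np.array([24.0, 20.0, 17.0, 14.5, 12.5, 11.0])
downstream = np.array([0.52, 0.58, 0.63, 0.69, 0.74, 0.78])  # e.g. SSA

slope, intercept = np.polyfit(perplexity, downstream, 1)
residuals = downstream - (slope * perplexity + intercept)

# One crude bet condition: call it a "discontinuity" if some future point
# misses the extrapolated line by more than, say, 3x the historical scatter.
threshold = 3 * residuals.std()
print(slope, intercept, threshold)
```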

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T00:31:17.228Z · LW(p) · GW(p)

Well put / endorsed / +1.

comment by paulfchristiano · 2021-11-25T00:11:13.708Z · LW(p) · GW(p)

I think that most people who work on models like GPT-3 seem more interested in trendlines than you do here. 

That said, it's not super clear to me what you are saying so I'm not sure I disagree. Your narrative sounds like a strawman since people usually extrapolate performance on downstream tasks they care about rather than on perplexity. But I do agree that the updates from GPT-3 are not from OpenAI's marketing but instead from people's legitimate surprise about how smart big language models seem to be.

As you say, I think the interesting claim in GPT-3 was basically that scaling trends would continue, where pessimists incorrectly expected they would break based on weak arguments. I think that looking at all the graphs, both of perplexity and performance on individual tasks, helps establish this as the story. I don't really think this lines up with Eliezer's picture of AGI but that's presumably up for debate.

There are always a lot of people willing to confidently decree that trendlines will break down without much argument. (I do think that eventually the GPT-3 trendline will break if you don't change the data, but for the boring reason that the entropy of natural language will eventually dominate the gradient noise and so lead to a predictable slowdown.)
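
For reference, that kind of eventual slowdown is usually modeled in the scaling-law literature as a power law plus an irreducible-entropy floor, roughly of the form below; the symbols are generic placeholders rather than any specific published fit.

```latex
% Generic scaling-law form with an irreducible-entropy floor. L is test loss
% (bits or nats per token), x is compute, data, or parameters, and L_\infty is
% the entropy of natural language that eventually dominates the trend.
L(x) \;=\; L_{\infty} + \left(\frac{x_0}{x}\right)^{\alpha}
```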

comment by Matthew Barnett (matthew-barnett) · 2021-11-24T23:53:44.562Z · LW(p) · GW(p)

There's something astonishing to see someone resort to explaining away GPT-3's impact as 'OpenAI was just good at marketing the results'. Said marketing consisted of: 'dropping a paper on Arxiv'. Not even tweeting it!

Yeah, my phrasing there was not ideal. I regret using the word "marketing", but to be fair, I mostly meant what I said in the next few sentences, "Maybe OpenAI saw an opportunity to dump a lot of compute into language models and have a two year discontinuity ahead of everyone else, and showcase their work. And that strategy seemed to work really well for them."

Of course, seeing that such an opportunity exists is itself laudable and I give them Bayes points for realizing that scaling laws are important. At the same time, don't you think we would have expected similar results in like two more years at ordinary progress?

I do agree that it's extremely interesting to know why the lines go straight. I feel like I wasn't trying to say that GPT-3 wasn't intrinsically interesting. I was more saying it wasn't unpredictable, in the sense that Paul Christiano would have strongly said "no I do not expect that to happen" in 2018.

Replies from: gwern
comment by gwern · 2021-11-25T00:04:31.672Z · LW(p) · GW(p)

Again, the fact that it is a straight line on a metric which is, if not meaningless, extremely difficult to interpret, is irrelevant. Maybe OA moved up by 2 years. Why would anyone care in the slightest bit? That is, before they knew about how interesting the consequences would be of that small change in BPC?

At the same time, don't you think we would have expected similar results in like two more years at ordinary progress?

Who's 'we', exactly? Who are these people who expected all of this to happen, and are going around saying "ah yes, these BIG-Bench results are exactly as I calculated back in 2018, the capabilities are all emerging like clockwork, each at their assigned BPC; next is capability Z, obviously"? And what are they saying about 500b, 1000b, and so on?

I was more saying it wasn't unpredictable, in the sense that Paul Christiano would have strongly said "no I do not expect that to happen" in 2018.

OK. So can you link me to someone saying in 2018 that we'd see GPT-2-1.5b's behavior at ~1.5b parameters, and that we'd get few-shot metalearning and instructability past that with another OOM? And while you're at it, if it's so predictable, please answer all the other questions I gave, even if only the ones about scale. After all, you're claiming it's so easy to predict based on straight lines on convenient metrics like BPC and that there's nothing special or unpredictable about jumping 2 years. So, please jump merely 2 years ahead and tell me what I can look forward to as the SOTA in Nov 2023; I'm dying of excitement here.

Replies from: tamay-besiroglu-1, matthew-barnett
comment by Tamay Besiroglu (tamay-besiroglu-1) · 2021-11-26T01:30:49.514Z · LW(p) · GW(p)

I’m confused why you think looking at the rate and lumpiness of historical progress on narrowly circumscribed performance metrics is not meaningful, because it seems like you think that drawing straight lines is fine when compute is on the x-axis—which seems like a similar exercise. What’s going on there?

comment by Matthew Barnett (matthew-barnett) · 2021-11-25T00:16:38.846Z · LW(p) · GW(p)

Again, the fact that it is a straight line on a metric which is, if not meaningless, extremely difficult to interpret, is irrelevant. Maybe OA moved up by 2 years. Why would anyone care in the slightest bit?

Because the point I was trying to make was that the result was relatively predictable? I'm genuinely confused what you're asking. I get a slight sense that you're interpreting me as saying something about the inherent dullness of GPT-3 or that it doesn't teach us anything interesting about AI, but I don't see myself as saying anything like that. I actually really enjoy reading the output from it, your commentary on it, and what it reveals about the nature of intelligence.

I am making purely a point about predictability, and whether the result was a "discontinuity" from past progress, in the sense meant by Paul Christiano (in the way I think he means these things).

Who's 'we', exactly

"We" refers in that sentence to competent observers in 2018 who predict when we'll get ML milestones mostly by using the outside view, i.e. by extrapolating trends on charts.

OK. So can you link me to someone saying in 2018 that we'd see GPT-2-1.5b's behavior at ~1.5b parameters, and that we'd get few-shot metalearning and instructability past that with another OOM?

No, but

  1. That seems like a different and far more specific question than whether we'd have language models that perform at roughly the same measured-level as GPT-3.
  2. In general, people make very few specific predictions about what they expect to happen in the future about these sorts of things (though, if I may add, I've been making modest progress trying to fix this broad problem by writing lots of specific questions on Metaculus).
Replies from: Edouard Harris
comment by Edouard Harris · 2021-11-25T01:51:54.306Z · LW(p) · GW(p)

I think what gwern is trying to say is that continuous progress on a benchmark like PTB appears (from what we've seen so far) to map to discontinuous progress in qualitative capabilities, in a surprising way which nobody seems to have predicted in advance. Qualitative capabilities are more relevant to safety than benchmark performance is, because while qualitative capabilities include things like "code a simple video game" and "summarize movies with emojis", they also include things like "break out of confinement and kill everyone". It's the latter capability, and not PTB performance, that you'd need to predict if you wanted to reliably stay out of the x-risk regime — and the fact that we can't currently do so is, I imagine, what brought to mind the analogy between scaling and Russian roulette.

I.e., a straight line in domain X is indeed not surprising; what's surprising is the way in which that straight line maps to the things we care about more than X.

(Usual caveats apply here that I may be misinterpreting folks, but that is my best read of the argument.)

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-25T02:54:00.100Z · LW(p) · GW(p)

I think what gwern is trying to say is that continuous progress on a benchmark like PTB appears (from what we've seen so far) to map to discontinuous progress in qualitative capabilities, in a surprising way which nobody seems to have predicted in advance.

This is a reasonable thesis, and if indeed it's the one Gwern intended, then I apologize for missing it!

That said, I have a few objections,

  • Isn't it a bit suspicious that the thing-that's-discontinuous is hard to measure, but the-thing-that's-continuous isn't? I mean, this isn't totally suspicious, because subjective experiences are often hard to pin down and explain using numbers and statistics. I can understand that, but the suspicion is still there.
  • "No one predicted X in advance" is only damning to a theory if people who believed that theory were making predictions about it at all. If people who generally align with Paul Christiano were indeed making predictions to the effect of GPT-3 capabilities being impossible or very unlikely within a narrow future time window, then I agree that would be damning to Paul's worldview. But -- and maybe I missed something -- I didn't see that. Did you?
  • There seems to be an implicit claim that Paul Christiano's theory was falsified via failure to retrodict the data. But that's weird, because much of the evidence being presented is mainly that the previous trends were upheld (for example, with Gwern saying, "The impact of GPT-3 was in establishing that trendlines did continue..."). But if Paul's worldview is that "we should extrapolate trends, generally" then that piece of evidence seems like a remarkable confirmation of his theory, not a disconfirmation.
  • Do we actually have strong evidence that the qualitative things being mentioned were discontinuous with respect to time? I can certainly see some things being discontinuous with past progress (like the ability for GPT-3 to do arithmetic). But overall I feel like I'm being asked to believe something quite strong about GPT-3 breaking trends without actual references to what progress really looked like in the past.

I don't deny that you can find quite a few discontinuities on a variety of metrics, especially if you search for them post-hoc. I think it would be fairly strawmanish to say that people in Paul Christiano's camp don't expect those at all. My impression is that they just don't expect those to be overwhelming in a way that makes reliable reference class forecasting qualitatively useless; it seems like extrapolating from the past still gives you a lot better of a model than most available alternatives.

Replies from: Vaniver, Edouard Harris
comment by Vaniver · 2021-11-25T04:05:13.298Z · LW(p) · GW(p)

it seems like extrapolating from the past still gives you a lot better of a model than most available alternatives.

My impression is that some people are impressed by GPT-3's capabilities, whereas your response is "ok, but it's part of the straight-line trend on Penn Treebank; maybe it's a little ahead of schedule, but nothing to write home about." But clearly you and they are focused on different metrics! 

That is, suppose it's the case that GPT-3 is the first successfully commercialized language model. (I think in order to make this literally true you have to throw on additional qualifiers that I'm not going to look up; pretend I did that.) So on a graph of "language model of type X revenue over time",  total revenue is static at 0 for a long time and then shortly after GPT-3's creation departs from 0.

It seems like the fact that GPT-3 could be commercialized in this way when GPT-2 couldn't is a result of something that Penn Treebank perplexity is sort of pointing at. (That is, it'd be hard to get a model with GPT-3's commercializability but GPT-2's Penn Treebank score.) But what we need in order for the straight line on PTB to be useful as a model for predicting revenue is to know ahead of time what PTB threshold you need for commercialization. 

And so this is where the charge of irrelevancy is coming from: yes, you can draw straight lines, but they're straight lines in the wrong variables. In the interesting variables (from the "what's the broader situation?" worldview), we do see discontinuities, even if there are continuities in different variables.

[As an example of the sort of story that I'd want, imagine we drew the straight line of ELO ratings for Go-bots, had a horizontal line of "human professionals" on that graph, and were able to forecast the discontinuity in "number of AI wins against human grandmasters" by looking at straight-line forecasts in ELO.]

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-25T04:16:56.657Z · LW(p) · GW(p)

That is, suppose it's the case that GPT-3 is the first successfully commercialized language model. (I think in order to make this literally true you have to throw on additional qualifiers that I'm not going to look up; pretend I did that.) So on a graph of "language model of type X revenue over time",  total revenue is static at 0 for a long time and then shortly after GPT-3's creation departs from 0.

I think it's the nature of every product that comes on the market that it will experience a discontinuity from having zero revenue to having some revenue at some point. It's an interesting question of when that will happen, and maybe your point is simply that it's hard to predict when that will happen when you just look at the Penn Treebank trend.

However, I suspect that the revenue curve will look pretty continuous, now that it's gone from zero to one. Do you disagree?

In a world with continuous, gradual progress across a ton of metrics, you're going to get discontinuities from zero to one. I don't think anyone from the Paul camp disagrees with that (in fact, Katja Grace talked about this in her article). From the continuous takeoff perspective, these discontinuities don't seem very relevant unless going from zero to one is very important in a qualitative sense. But I would contend that going from "no revenue" to "some revenue" is not actually that meaningful in the sense of distinguishing AI from the large class of other economic products that have gradual development curves.

Replies from: Vaniver
comment by Vaniver · 2021-11-25T05:24:10.568Z · LW(p) · GW(p)

your point is simply that it's hard to predict when that will happen when you just look at the Penn Treebank trend.

This is a big part of my point; a smaller elaboration is that it can be easy to trick yourself into thinking that, because you understand what will happen with PTB, you'll understand what will happen with economics/security/etc., when in fact you don't have much understanding of the connection between those, and there might be significant discontinuities. [To be clear, I don't have much understanding of this either; I wish I did!]

For example, I imagine that, by thirty years from now, we'll have language/code models that can do significant security analysis of the code that was available in 2020, and that this would have been highly relevant/valuable to people in 2020 interested in computer security. But when will this happen in the 2020-2050 range that seems likely to me? I'm pretty uncertain, and I expect this to look a lot like 'flicking a switch' in retrospect, even tho the leadup to flicking that switch will probably look like smoothly increasing capabilities on 'toy' problems.

[My current guess is that Paul / people in "Paul's camp" would mostly agree with the previous paragraph, except for thinking that it's sort of weird to focus on specifically AI computer security productivity, rather than the overall productivity of the computer security ecosystem, and this misplaced focus will generate the 'flipping the switch' impression. I think most of the disagreements are about 'where to place the focus', and this is one of the reasons it's hard to find bets; it seems to me like Eliezer doesn't care much about the lines Paul is drawing, and Paul doesn't care much about the lines Eliezer is drawing.]

However, I suspect that the revenue curve will look pretty continuous, now that it's gone from zero to one. Do you disagree?

I think I agree in a narrow sense and disagree in a broad sense. For this particular example, I expect OpenAI's revenues from GPT-3 to look roughly continuous now that they're selling/licensing it at all (until another major change happens; like, the introduction of a competitor would likely cause the revenue trend to change).

More generally, suppose we looked at something like "the total economic value of horses over the course of human history". I think we would see mostly smooth trends plus some implied starting and stopping points for those trends. (Like, "first domestication of a horse" probably starts a positive trend, "invention of stirrups" probably starts another positive trend, "introduction of horses to America" starts another positive trend, "invention of the automobile" probably starts a negative trend that ends with "last horse that gets replaced by a tractor/car".)

In my view, 'understanding the world' looks like having a causal model that you can imagine variations on (and have those imaginations be meaningfully grounded in reality), and I think the bits that are most useful for building that causal model are the starts and stops of the trends, rather than the smooth adoption curves or mostly steady equilibria in between. So it seems sort of backwards to me to say that for most of the time, most of the changes in the graph are smooth, because what I want out of the graph is to figure out the underlying generator, where the non-smooth bits are the most informative. The graph itself only seems useful as a means to that end, rather than an end in itself.

comment by Edouard Harris · 2021-11-25T05:35:35.898Z · LW(p) · GW(p)

Yeah, these are interesting points.

Isn't it a bit suspicious that the thing-that's-discontinuous is hard to measure, but the-thing-that's-continuous isn't? I mean, this isn't totally suspicious, because subjective experiences are often hard to pin down and explain using numbers and statistics. I can understand that, but the suspicion is still there.

I sympathize with this view, and I agree there is some element of truth to it that may point to a fundamental gap in our understanding (or at least in mine). But I'm not sure I entirely agree that discontinuous capabilities are necessarily hard to measure: for example, there are benchmarks available for things like arithmetic, which one can train on and make quantitative statements about.

I think the key to the discontinuity question is rather that 1) it's the jumps in model scaling that are happening in discrete increments; and 2) everything is S-curves, and a discontinuity always has a linear regime if you zoom in enough. Those two things together mean that, while a capability like arithmetic might have a continuous performance regime on some domain, in reality you can find yourself halfway up the performance curve in a single scaling jump (and this is in fact what happened with arithmetic and GPT-3). So the risk, as I understand it, is that you end up surprisingly far up the scale of "world-ending" capability from one generation to the next, with no detectable warning shot beforehand.
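
A toy sketch of that point, with made-up numbers: a capability that is logistic in log(model scale), observed only at two widely spaced scales, can go from near-zero to most of the way up the curve in a single jump.

```python
# Toy illustration of the point above (all parameters invented): a capability
# that is logistic in log10(model scale), sampled only at two widely spaced
# scales, jumps from near the floor to most of the way up the curve even
# though the underlying curve is perfectly continuous.

import math

def capability(params, midpoint=3e10, steepness=3.0):
    """Hypothetical task performance in [0, 1], logistic in log10(params)."""
    x = math.log10(params) - math.log10(midpoint)
    return 1.0 / (1.0 + math.exp(-steepness * x))

for n_params in (1.5e9, 1.75e11):  # roughly GPT-2-sized vs GPT-3-sized
    print(f"{n_params:.2e} params -> capability {capability(n_params):.2f}")

# With these made-up parameters: ~0.02 at 1.5e9 params, ~0.91 at 1.75e11.
```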

"No one predicted X in advance" is only damning to a theory if people who believed that theory were making predictions about it at all. If people who generally align with Paul Christiano were indeed making predictions to the effect of GPT-3 capabilities being impossible or very unlikely within a narrow future time window, then I agree that would be damning to Paul's worldview. But -- and maybe I missed something -- I didn't see that. Did you?

No, you're right as far as I know; at least I'm not aware of any such attempted predictions. And in fact, the very absence of such prediction attempts is interesting in itself. One would imagine that correctly predicting the capabilities of an AI from its scale ought to be a phenomenally valuable skill — not just from a safety standpoint, but from an economic one too. So why, indeed, didn't we see people make such predictions, or at least try to?

There could be several reasons. For example, perhaps Paul (and other folks who subscribe to the "continuum" world-model) could have done it, but they were unaware of the enormous value of their predictive abilities. That seems implausible, so let's assume they knew the value of such predictions would be huge. But if you know the value of doing something is huge, why aren't you doing it? Well, if you're rational, there's only one reason: you aren't doing it because it's too hard, or otherwise too expensive compared to your alternatives. So we are forced to conclude that this world-model — by its own implied self-assessment — has, so far, proved inadequate to generate predictions about the kinds of capabilities we really care about.

(Note: you could make the argument that OpenAI did make such a prediction, in the approximate yet very strong sense that they bet big on a meaningful increase in aggregate capabilities from scale, and won. You could also make the argument that Paul, having been at OpenAI during the critical period, deserves some credit for that decision. I'm not aware of Paul ever making this argument, but if made, it would be a point in favor of such a view and against my argument above.)

comment by awenonian · 2021-11-25T19:20:48.097Z · LW(p) · GW(p)

Can I try to parse out what you're saying about stacked sigmoids? Because it seems weird to me. Like, in that view, it still seems like showing a trendline is some evidence that it's not "interesting". I feel like this because I expect the asymptote of the AlphaGo sigmoid to be independent of MCTS bots, so surely you should see some trends where AlphaGo (or equivalent) was invented first, and jumped the trendline up really fast. So not seeing jumps should indicate that it is more a gradual progression, because otherwise, if they were independent, about half the time the more powerful technique should come first.

The "what counter argument can I come up with" part of me says, tho, that how quickly the sigmoid grows likely depends on lots of external factors (like compute available or something). So instead of sometimes seeing a sigmoid that grows twice as fast as the previous ones, you should expect one that's not just twice as tall, but twice as wide, too. And if you have that case, you should expect the "AlphaGo was invented first" sigmoid to be under the MCTS bots sigmoid for some parts of the graph, where it then reaches the same asymptote as AlphaGo in the mainline. So, if we're in the world where AlphaGo is invented first, you can make gains by inventing MCTS bots, which will also set the trendline. And so, seeing a jump would be less "AlphaGo was invented first" and more "MCTS bots were never invented during the long time when they would've outcompeted AlphaGo version -1"
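
To put a toy model behind that picture (everything here is invented: the start years, durations, and heights are illustrative only), treat overall progress as a sum of S-curves, one per technique, where later techniques are both taller and wider:

```python
import math

def technique(t: float, start: float, duration: float, height: float) -> float:
    """One technique's contribution: an S-curve that begins around `start`,
    takes roughly `duration` years to saturate, and eventually adds `height`."""
    return height / (1.0 + math.exp(-(t - (start + duration / 2)) / (duration / 6)))

# Invented stack of techniques; later ones are taller but also slower to mature.
techniques = [(0.0, 4.0, 1.0), (3.0, 6.0, 2.0), (7.0, 8.0, 4.0)]  # (start, duration, height)

for year in range(0, 16):
    total = sum(technique(year, s, d, h) for s, d, h in techniques)
    print(f"year {year:2d}: aggregate metric {total:5.2f}")

# The aggregate looks fairly smooth even though every component is an S-curve.
# A visible jump would require a tall, fast component arriving while the
# intermediate techniques that would otherwise set the trend were never built.
```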

Does that seem accurate, or am I still missing something?

comment by Veedrac · 2021-11-27T15:11:17.663Z · LW(p) · GW(p)

"How do you know BPC is the right metric to use?"

"Everyone chose it post hoc after seeing that it worked and better BPC = better models."

I realize your comment is in the context of a comment I also disagree with, and I also think I agree with most of what you're saying, but I want to challenge this framing you have at the end.

BPC is at its core a continuous generalization of the Turing Test, aka. the imitation game. It is not an exact translation, but it preserves all the key difficulties, and therefore keeps most of its same strengths, and it does this while extrapolating to weaker models in a useful and modelable way. We might only have started caring viscerally about the numbers that BPC gives, or associating them directly to things of huge importance, around the advent of GPT, but that's largely just a situational byproduct of our understanding. Turing understood the importance of the imitation game back in 1950, enough to write a paper on it, and certainly that paper didn't go unnoticed.
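
(For concreteness: bits-per-character is just the model's average negative log2 probability per character of held-out text, i.e. how well it compresses that text. A minimal sketch of the computation, using a stand-in unigram "model" rather than a real LM:)

```python
import math
from collections import Counter

def bits_per_character(text: str, char_prob) -> float:
    """Average -log2 probability per character under a model.
    `char_prob(history, ch)` returns the model's probability of `ch` given the
    preceding characters; lower BPC means better prediction/compression."""
    total_bits = sum(-math.log2(char_prob(text[:i], ch)) for i, ch in enumerate(text))
    return total_bits / len(text)

# Stand-in "model": a unigram character distribution estimated from the text itself.
# (A real evaluation would score a trained LM on held-out data.)
sample = "the meaning of life is that only if an end would be of the whole supplier."
counts = Counter(sample)
unigram = lambda history, ch: counts[ch] / len(sample)

print(f"unigram BPC: {bits_per_character(sample, unigram):.2f}")
# Strong neural LMs sit far below simple n-gram baselines on this measure;
# a model that predicted every character perfectly would approach 0 BPC.
```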

Nor can I see the core BPC:Turing Test correspondence as something purely post-hoc. If people didn't give it much thought, that's probably because there never was a scaling law then, there never was an expectation that you could just take your hacky grammar-infused Markov chain and extrapolate it out to capture more than just surface level syntax. Even among earlier neural models, what's the point of looking at extrapolations of a generalized Turing Test, when the models are still figuring out surface level syntactic details? Like, is it really an indictment of BPC, to say that when we saw

the meaning of life is that only if an end would be of the whole supplier. widespread rules are regarded as the companies of refuses to deliver. in balance of the nation’s information and loan growth associated with the carrier thrifts are in the process of slowing the seed and commercial paper

we weren't asking, ‘gee, I wonder how close this is to passing the Turing Test, by some generalized continuous measure’?

I think it's quite surprising—importantly surprising—how it's turned out that it actually is a relevant question, that performance on this original datapoint does actually bear some continuous mathematical relationship with models for which mere grammar is a been-there-done-that, and we now regularly test for the strength of their world models. And I get the dismissal, that it's no proven law that it goes so far before stopping, rather than some other stretch, or that it gives no concrete conclusions for what happens at each 0.01 perplexity increment, but I look at my other passion with a straight line, hardware, and I see exactly the same argument applied to almost the same arrow-straight trendline, and I think, I'd still much rather trust the person willing to look at the plot and say, gosh, those transistors will be absurdly cheap.

Would that person predict today, back at the start? Hell no. Knowing transistor scaling laws doesn't directly tell you all that much about the discontinuous changes in how computation is done. You can't look at a graph and say “at a transistor density of X, there will be the iPhone, and at a transistor density of Y, microcontrollers will get so cheap that they will start replacing simple physical switches.” It certainly will not tell you when people will start using the technology to print out tiny displays they will stick inside your glasses, or build MEMS accelerometers, nor can it tell you all of the discrete and independent innovations that overcame the challenges that got us here.

But yet, but yet, lines go straight. Moore's Law pushed computing forward not because of these concrete individual predictions, but because it told us there was more of the same surprising progress to come, and that the well has yet to run dry. That too is why I think seeing GPT-3's perplexity is so important. I agree with you, it's not that we need the perplexity to tell us what GPT-3 can do. GPT-3 will happily tell us that itself. And I think you will agree with me when I say that what's most important about these trends is that they're saying there's more to come, that the next jump will be just as surprising as the last.

Where we maybe disagree is that I'm willing to say these lines can stand by themselves; that you don't need to actually see anything more of GPT-3 than its perplexity to know that its capabilities must be so impressive, even if you might need to see it to feel it emotionally. You don't even need to know anything about neural networks or their output samples to see a straight line of bits-per-character that threatens to go so low in order to forecast that something big is going on. You didn't need to know anything about CPU microarchitecture to imagine that having ten billion transistors per square centimeter would have massive societal impacts either, as long as you knew what a transistor was and understood its fundamental relations to computation.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-23T11:46:28.410Z · LW(p) · GW(p)

[ETA: In light of pushback from Rob: I really don't want this to become a self-fulfilling prophecy. My hope in making this post was to make the prediction less likely to come true, not more! I'm glad that MIRI & Eliezer are publicly engaging with the rest of the community more again, I want that to continue, and I want to do my part to help everybody to understand each other.]

And I know, before anyone bothers to say, that all of this reply is not written in the calm way that is right and proper for such arguments. I am tired. I have lost a lot of hope. There are not obvious things I can do, let alone arguments I can make, which I expect to be actually useful in the sense that the world will not end once I do them. I don't have the energy left for calm arguments. What's left is despair that can be given voice.

I grimly predict that the effect of this dialogue on the community will be polarization: People who didn't like Yudkowsky and/or his views will like him / his views less, and the gap between them and Yud-fans will grow (more than it shrinks due to the effect of increased dialogue). I say this because IMO Yudkowsky comes across as angry and uncharitable in various parts of this dialogue, and also I think it was kinda a slog to get through & it doesn't seem like much intellectual progress was made here.

FWIW I continue to think that Yudkowsky's model of how the future will go is basically right, at least more right than Christiano's. This is a big source of sadness and stress for me too, because (for example) my beloved daughter probably won't live to adulthood.

The best part IMO was the mini-essay at the end about Thielian secrets and different kinds of tech progress -- a progression of scenarios adding up to Yudkowsky's understanding of Paul's model:

But we can imagine that doesn't happen either, because instead of needing to build a whole huge manufacturing plant, there's just lots and lots of little innovations adding up to every key AGI threshold, which lots of actors are investing $10 million in at a time, and everybody knows which direction to move in to get to more serious AGI and they're right in this shared forecast.

It does seem to me that the AI industry will move more in this direction than it currently is, over the next decade or so. However I still do expect that we won't get all the way there. I would love to hear from Paul whether he endorses the view Yudkowsky attributes to him in this final essay.

Replies from: RobbBB, Liron, adamShimi
comment by Rob Bensinger (RobbBB) · 2021-11-23T19:39:06.821Z · LW(p) · GW(p)

I grimly predict that the effect of this dialogue on the community will be polarization

Beware of self-fulfilling prophecies (and other premature meta [LW · GW])! If both sides in a dispute expect the other side to just entrench, then they're less likely to invest the effort to try to bridge the gap.

This very comment section is one of the main things that will determine the community's reaction, and diverting our focus to 'what will our reaction be?' before we've talked about the object-level claims can prematurely lock in a certain reaction.

(That said, I think you're doing a useful anti-polarization thing here, by showing empathy for people you disagree with, and showing willingness to criticize people you agree with. I don't at all dislike this comment overall; I just want to caution against giving up on something before we've really tried. This is the first proper MIRI-response to Paul's takeoff post, and should be a pretty big update for a lot of people -- I don't think people were even universally aware that Eliezer endorses hard takeoff anymore, much less aware of his reasoning.)

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-23T22:02:15.998Z · LW(p) · GW(p)

Fair enough! I too dislike premature meta, and feel bad that I engaged in it. However... I do still feel like my comment probably did more to prevent polarization than cause it? That's my independent impression at any rate. (For the reasons you mention).

I certainly don't want to give up! In light of your pushback I'll edit to add something at the top.

comment by Liron · 2021-11-23T11:59:50.870Z · LW(p) · GW(p)

While this may not be the ideal format for it, I thought Eliezer’s voicing of despair was a useful update to publish to the LW community about the current state of his AI beliefs.

comment by adamShimi · 2021-11-23T15:49:11.648Z · LW(p) · GW(p)

I grimly predict that the effect of this dialogue on the community will be polarization: People who didn't like Yudkowsky and/or his views will like him / his views less, and the gap between them and Yud-fans will grow (more than it shrinks due to the effect of increased dialogue). I say this because IMO Yudkowsky comes across as angry and uncharitable in various parts of this dialogue, and also I think it was kinda a slog to get through & it doesn't seem like much intellectual progress was made here.

Strongly agree with that.

Since you agree with Yudkowsky, do you think you could strongman his position?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-23T17:44:35.933Z · LW(p) · GW(p)

Yes, though I'm much more comfortable explaining and arguing for my own position than EY's. It's just that my position turns out to be pretty similar. (Partly this is independent convergence, but of course partly this is causal influence since I've read a lot of his stuff.)

There's a lot to talk about, I'm not sure where to begin, and also a proper response would be a whole research project in itself. Fortunately I've already written a bunch of it; see these two [? · GW] sequences. [? · GW]

Here are some quick high-level thoughts:

1. Begin with timelines. The best way to forecast timelines IMO is Ajeya's model; it should be the starting point and everything else should be adjustments from it. The core part of Ajeya's model is a probability distribution over how many OOMs of compute we'd need with today's ideas to get to TAI / AGI / APS-AI / AI-PONR [LW · GW] / etc. [Unfamiliar with these acronyms? See Robbo's helpful comment below [LW(p) · GW(p)]] For reasons which I've explained in my sequence (and summarized in a gdoc) my distribution has significantly more mass on the 0-6 OOM range than Paul does, and less on the 13+ range. The single post that conveys this intuition most is Fun with +12 OOMs. [? · GW]
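
(For readers unfamiliar with the shape of that calculation, here is a toy version — the distribution, the growth rate, and the whole structure are invented for illustration and are much cruder than Ajeya's actual report:)

```python
# Toy Ajeya-style timelines calculation (invented numbers, not hers):
# a probability distribution over "OOMs of extra effective compute needed with
# today's ideas", plus an assumed growth rate of effective compute, gives a
# cumulative probability of having enough compute by a given year.

p_ooms_needed = {2: 0.10, 4: 0.20, 6: 0.25, 8: 0.20, 10: 0.15, 13: 0.10}  # sums to 1.0
OOMS_PER_YEAR = 0.5  # invented: hardware + spending + algorithmic progress combined

def prob_enough_compute(years_from_now: float) -> float:
    """P(the required OOMs have been delivered within `years_from_now`)."""
    delivered = OOMS_PER_YEAR * years_from_now
    return sum(p for ooms, p in p_ooms_needed.items() if ooms <= delivered)

for horizon in (5, 10, 20, 30):
    print(f"P(enough compute within {horizon:2d} years) = {prob_enough_compute(horizon):.2f}")

# Shifting probability mass toward the 0-6 OOM range pulls the short-horizon
# numbers up; shifting it toward 13+ pulls them down -- which is the
# disagreement over the distribution described above.
```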

Now consider how takeoff speed views interact with timelines views. Paul-slow takeoff and <10 year timelines are in tension with each other. If <7 OOMs of compute would be enough to get something crazy powerful with today's ideas, then the AI industry is not an efficient market right now. If we get human-level AGI in 2030, then on Paul's view that means the world economy should be doubling in 2029 and should have doubled over the course of 2025 - 2028 and should already be accelerating now probably. It doesn't look like that's happening or about to happen. I think Paul agrees with this; in various conversations he's said things like "If AGI happens in 10 years or less then probably we get fast takeoff." [Paul please correct me if I'm mischaracterizing your view!]

Ajeya (and Paul) mostly update against <10 year timelines for this reason. I, by contrast, mostly update against slow takeoff. (Obviously we both do a bit of both, like good Bayesians.)

2. I feel like the debate between EY and Paul (and the broader debate about fast vs. slow takeoff) has been frustratingly much reference class tennis and frustratingly little gears-level modelling. This includes my own writing on the subject -- lots of historical analogies and whatnot. I've tentatively attempted some things sorta like gears-level modelling (arguably What 2026 Looks Like [? · GW] is an example of this) and so far it seems to be pushing my intuitions more towards "Yep, fast takeoff is more likely." But I feel like my thinking on this is super inadequate and I think we all should be doing better. Shame! Shame on all of us!

3. I think the focus on GDP (especially GWP) is really off, for reasons mentioned here [? · GW]. I think AI-PONR will probably come before GWP accelerates, and at any rate what we care about for timelines and takeoff speeds is AI-PONR and so our arguments should be about e.g. whether there will be warning shots and powerful AI tools of the sort that are relevant to solving alignment for APS-AI systems.

(Got to go now)

Replies from: johnswentworth, Robbo
comment by johnswentworth · 2021-11-23T19:00:20.281Z · LW(p) · GW(p)

I feel like the debate between EY and Paul (and the broader debate about fast vs. slow takeoff) has been frustratingly much reference class tennis and frustratingly little gears-level modelling.

So, there's this inherent problem with deep gearsy models, where you have to convey a bunch of upstream gears (and the evidence supporting them) before talking about the downstream questions of interest, because if you work backwards then peoples' brains run out of stack space and they lose track of the whole multi-step path. But if you just go explaining upstream gears first, then people won't immediately see how they're relevant to alignment or timelines or whatever, and then lots of people just wander off. Then you go try to explain something about alignment or timelines or whatever, using an argument which relies on those upstream gears, and it goes right over a bunch of peoples' heads because they don't have that upstream gear in their world-models.

For the sort of argument in this post, it's even worse, because a lot of people aren't even explicitly aware that the relevant type of gear is a thing, or how to think about it beyond a rough intuitive level.

I first ran into this problem in the context of takeoff arguments a couple years ago, and wrote up this sequence [? · GW] mainly to convey the relevant kinds of gears and how to think about them. I claim that this (i.e. constraint slackness/tautness) is usually a good model for gear-type in arguments about reference-classes in practice: typically an intuitively-natural reference class is a set of cases which share some common constraint, and the examples in the reference class then provide evidence for the tautness/slackness of the constraint. For instance, in this post, Paul often points to market efficiency as a taut constraint, and Eliezer argues that constraint is not very taut (at least not in the way needed for the slow takeoff argument). Paul's intuitive estimates of tautness are presumably driven by things like e.g. financial markets. On the other side, Eliezer wrote Inadequate Equilibria [? · GW] to talk about how taut market efficiency is in general, including gears "further up" and more examples.

If you click through the link in the post to Intelligence Explosion Microeconomics, there's a lot of this sort of reasoning in it.

Replies from: Yoav Ravid
comment by Yoav Ravid · 2021-11-23T19:12:00.894Z · LW(p) · GW(p)

So, there's this inherent problem with deep gearsy models, where you have to convey a bunch of upstream gears (and the evidence supporting them) before talking about the downstream questions of interest, because if you work backwards then peoples' brains run out of stack space and they lose track of the whole multi-step path. But if you just go explaining upstream gears first, then people won't immediately see how they're relevant to alignment or timelines or whatever, and then lots of people just wander off. Then you go try to explain something about alignment or timelines or whatever, using an argument which relies on those upstream gears, and it goes right over a bunch of peoples' heads because they don't have that upstream gear in their world-models.

The solution might be to start with a concise, low-detail summary (not even one that argues the case, just states it), then start explaining in full detail from the start, knowing that your readers now know which way you're going.

Wait, I think I just invented the Abstract (not meant as a snide remark. I really did realize it after writing the above, and found it funny).

comment by Robbo · 2021-11-24T12:02:52.232Z · LW(p) · GW(p)

The core part of Ajeya's model is a probability distribution over how many OOMs of compute we'd need with today's ideas to get to TAI / AGI / APS-AI / AI-PONR / etc.

I didn't know the last two acronyms despite reading a decent amount of this literature, so thought I'd leave this note for other readers. Listing all of them for completeness (readers will of course know the first two):

TAI: transformative AI

AGI: artificial general intelligence

APS-AI: Advanced, Planning, Strategically aware AI [1]

AI-PONR: AI point of no return [2]

[1] from Carlsmith, which Daniel does link to

[2] from Daniel [LW · GW], which he also linked

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-24T14:07:51.362Z · LW(p) · GW(p)

Sorry! I'll go back and insert links + reference your comment

comment by RomanS · 2021-11-24T12:09:11.326Z · LW(p) · GW(p)

your view seems to imply that we will move quickly from much worse than humans to much better than humans, but it's likely that we will move slowly through the human range on many tasks

We might be able to falsify that in a few months. 

There is a joint Google / OpenAI project called BIG-bench. They've crowdsourced ~200 highly diverse text tasks (from answering scientific questions to predicting protein interacting sites to measuring self-awareness). 

One of the goals of the project is to see how the performance on the tasks is changing with the model size, with the size ranging by many orders of magnitude. 

A half-year ago, they presented some preliminary results. A quick summary:

if you increase the number of parameters N from 10^7 to 10^10, the aggregate performance score grows roughly like log(N). 

But after the 10^10 point, something interesting happens: the score starts growing much faster (~N). 

And for some tasks, the plot looks like a hockey stick (a sudden change from ~0 to almost-human).
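
(A toy curve with the qualitative shape described above — the constants are made up purely to reproduce the reported kink at ~10^10 parameters, and are not BIG-bench data:)

```python
import math

def toy_aggregate_score(n_params: float) -> float:
    """Invented piecewise curve matching the described shape: roughly logarithmic
    growth in parameter count up to ~1e10, then much faster growth afterwards."""
    if n_params <= 1e10:
        return 0.1 * math.log10(n_params / 1e7)           # slow regime: ~log(N)
    return 0.3 + 2.0 * (n_params / 1e10 - 1.0) / 9.0      # fast regime: ~linear in N

for n in (1e7, 1e8, 1e9, 1e10, 3e10, 1e11):
    print(f"{n:.0e} params -> toy score {toy_aggregate_score(n):.2f}")
```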

The paper with the full results is expected to be published in the next few months. 


Judging by the preliminary results, the FOOM could start like this:

The GPT-5 still sucks on most tasks. It's mostly useless. But what if we increase parameters_num by 2? What could possibly go wrong?

Replies from: daniel-kokotajlo, evhub, gwern, ESRogs, StellaAthena, Jeff Rose
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-24T15:16:28.974Z · LW(p) · GW(p)

Hot damn, where can I see these preliminary results?

Replies from: RomanS
comment by RomanS · 2021-11-24T15:25:25.841Z · LW(p) · GW(p)

The results were presented at a workshop by the project organizers. The video from the workshop is available here (the most relevant presentation starts at 5:05:00).

It's one of those innocent presentations that, after you understand the implications, keep you awake at night. 

Replies from: Lanrian
comment by Lukas Finnveden (Lanrian) · 2021-11-24T18:23:43.370Z · LW(p) · GW(p)

Presumably you're referring to this graph. The y-axis looks like the kind of score that ranges between 0 and 1, in which case this looks sort-of like a sigmoid to me, which accelerates when it gets closer to ~50% performance (and decelerates when it gets closer to 100% performance).

If so, we might want to ask whether these tasks are chosen ~randomly (among tasks that are indicative of how useful AI is) or if they're selected for difficulty in some way. In particular, assume that most tasks look sort-of like a sigmoid as they're scaled up (accelerating around 50%, improving slower when they're closer to 0% and 100%). Then you might think that the most exciting tasks to submit to big bench would be the tasks that can't be handled by small models, but that large models rapidly improve upon (as opposed to tasks that are basically-solved already by 10^10 parameters). In which case the aggregation of all these tasks could be expected to look sort-of like this, improving faster after 10^10 than before.

...is one story I can tell, but idk if I would have predicted that beforehand, and fast acceleration after 10^10 is certainly consistent with many people's qualitative impressions of GPT-3. So maybe there is some real acceleration going on.
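
(A quick simulation of that selection story, with every parameter invented: each task's score is a sigmoid in log10(parameters), and tasks are kept only if a ~10^10-parameter model still does poorly on them.)

```python
import math
import random

random.seed(0)

def task_score(log_n: float, midpoint: float, width: float = 0.7) -> float:
    """One task's score (0..1) as a sigmoid in log10(parameter count)."""
    return 1.0 / (1.0 + math.exp(-(log_n - midpoint) / width))

# Invented population of candidate tasks with midpoints spread over many scales.
all_tasks = [random.uniform(7.0, 14.0) for _ in range(2000)]

# Selection effect: only submit tasks that a ~1e10-parameter model still fails at.
selected = [m for m in all_tasks if task_score(10.0, m) < 0.3]

for log_n in (8, 9, 10, 11, 12):
    avg = sum(task_score(log_n, m) for m in selected) / len(selected)
    print(f"10^{log_n} params -> mean score on selected tasks {avg:.2f}")

# Every individual task curve is a plain sigmoid, but the average over the
# selected tasks stays near the floor up to ~1e10 and then climbs quickly --
# i.e. apparent acceleration right after the scale the submitters tested against.
```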

(Also, see this post [LW · GW] for similar curves, but for the benchmarks that OpenAI tested GPT-3 on. There's no real acceleration visible there, other than for arithmetic.)

Replies from: RomanS
comment by RomanS · 2021-11-24T20:00:40.342Z · LW(p) · GW(p)

The preliminary results were obtained on a subset of the full benchmark (~90 tasks vs 206 tasks). And there have been many changes since then, including scoring changes. Thus, I'm not sure we'll see the same dynamics in the final results. Most likely yes, but maybe not.

I agree that the task selection process could create the dynamics that look like the acceleration. A good point. 

As I understand, the organizers have accepted almost all submitted tasks (the main rejection reasons were technical - copyright etc). So, it was mostly self-selection, with the bias towards the hardest imaginable text tasks. It seems that for many contributors, the main motivation was something like: 

Take that, the most advanced AI of Google! Let's see if you can handle my epic task!

This includes many cognitive tasks that are supposedly human-complete (e.g. understanding of humor, irony, ethics), and the tasks that are probing the model's generality (e.g. playing chess, recognizing images, navigating mazes - all in text).

I wonder if the performance dynamics on such tasks will follow the same curve.  

The list of all tasks is available here.

comment by evhub · 2021-11-24T20:47:22.874Z · LW(p) · GW(p)

But after the 10^10 point, something interesting happens: the score starts growing much faster (~N).

And for some tasks, the plot looks like a hockey stick (a sudden change from ~0 to almost-human).

Seems interestingly similar to the grokking phenomenon.

comment by gwern · 2021-11-24T16:50:38.863Z · LW(p) · GW(p)

So these results are not reported in "Multitask Prompted Training Enables Zero-Shot Task Generalization", Sanh et al 2021?

Replies from: StellaAthena, RomanS, RomanS
comment by StellaAthena · 2021-11-24T18:43:00.739Z · LW(p) · GW(p)

For Sanh et al. (2021), we were able to negotiate access to preliminary numbers from the BIG Bench project and run the T0 models on it. However the authors of Sanh et al. and the authors of BIG Bench are different groups of people.

comment by RomanS · 2022-06-10T08:59:17.352Z · LW(p) · GW(p)

The aforementioned Google BIG-bench paper is now publicly available:

comment by RomanS · 2021-11-24T17:01:06.751Z · LW(p) · GW(p)

Nope. Although the linked paper uses the same benchmark (a tiny subset of it), the paper comes from a separate research project. 

As I understand, the primary topic of the future paper will be the BIG-bench project itself, and how the models from Google / OpenAI perform on it. 

comment by ESRogs · 2021-11-24T19:48:16.231Z · LW(p) · GW(p)

But after the 10^10 point, something interesting happens: the score starts growing much faster (~N). 

And for some tasks, the plot looks like a hockey stick (a sudden change from ~0 to almost-human).


...

Judging by the preliminary results, the FOOM could start like this:

"The GPT-5 still sucks on most tasks. It's mostly useless. But what if we increase parameters_num by 2? What could possibly go wrong?"

 

Hypothesis:

  • doing things in the real world requires diverse skills (strong performance on a diverse set of tasks)
  • hockey-sticking performance on a particular task makes that task no longer the constraint on what you can accomplish
  • but now some other task is the bottleneck
  • so, unless you can hockey-stick on all the tasks all at once, your overall ability to do things in the world will get smoothed out a bunch, even if it still grows very rapidly (see the toy sketch below this list)
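
(A toy illustration of that last point — the number of skills, the unlock times, and the "a job needs all of its skills" rule are all invented:)

```python
import random

random.seed(1)
N_SKILLS, N_JOBS = 12, 300

# Invented setup: each skill "hockey-sticks" (goes from useless to solved) at its
# own random time; each real-world job needs a random handful of skills and only
# becomes doable once ALL of them have hockey-sticked.
skill_unlock_time = [random.uniform(0, 10) for _ in range(N_SKILLS)]
jobs = [random.sample(range(N_SKILLS), k=random.randint(2, 5)) for _ in range(N_JOBS)]

for t in range(0, 11):
    doable = sum(all(skill_unlock_time[s] <= t for s in job) for job in jobs)
    print(f"t={t:2d}  fraction of jobs doable: {doable / N_JOBS:.2f}")

# Each skill's own curve is a step function, but the fraction of jobs you can do
# climbs gradually, because a different skill is the bottleneck for each job.
```
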
Replies from: ESRogs
comment by ESRogs · 2021-11-24T20:01:00.553Z · LW(p) · GW(p)

Seems like there's a spectrum between smooth accelerating progress and discontinuous takeoff. And where we end up on that spectrum depends on a few things:

  • how much simple improvements (better architecture, more compute) help with a wide variety of tasks
  • how much improvement in AI systems is bottlenecked on those tasks
  • how many resources the world is pouring into finding and making those improvements

Recent evidence (success of transformers, scaling laws) seems to suggest that Eliezer was right in the FOOM debate that simple input changes could make a large difference across a wide variety of tasks.

It's less clear to me though whether that means a local system is going to outcompete the rest of the economy, because it seems plausible to me that the rest of the economy is also going to be full-steam ahead searching the same improvement space that a local system will be searching.

And I think in general, real-world complexity tends to smooth out lumpy graphs. As an example, even once we realized that GPT-2 was powerful and GPT-3 would be even better, there was a whole bunch of engineering work that had to go into figuring out how to run such a big neural network across multiple machines.

That kind of real-world messiness seems like it will introduce new bottlenecks at every step along the way, and every order-of-magnitude change in scale, which makes me think that the actual impact of AI will be a lot more smooth than we might otherwise think just based on simple architectures being generally useful and scalable.

comment by StellaAthena · 2021-11-24T18:41:11.425Z · LW(p) · GW(p)

What makes you say BIG Bench is a joint Google / OpenAI project? I'm a contributor to it and have seen no evidence of that.

Replies from: RomanS
comment by RomanS · 2021-11-24T19:09:58.193Z · LW(p) · GW(p)

During the workshop presentation, Jascha said that OpenAI would run their models on the benchmark. This suggests that there is (was?) some collaboration. But that was half a year ago.

Just checked, the repo's readme doesn't mention OpenAI anymore. In the earlier versions, it was mentioned like this

Teams at Google and OpenAI have committed to evaluate BIG-Bench on their best-performing model architectures

So, it seems that OpenAI withdrew from the project, partially or fully.

Replies from: calef, StellaAthena
comment by calef · 2021-11-25T02:20:37.480Z · LW(p) · GW(p)

OpenAI is still running evaluations.

comment by StellaAthena · 2021-11-28T17:53:06.410Z · LW(p) · GW(p)

Interesting… I was busy and wasn’t able to watch the workshop. That’s good to know, thanks!

comment by Jeff Rose · 2021-11-24T18:13:58.531Z · LW(p) · GW(p)

GPT-4 is expected to have about 10^14 parameters and be ready in a few years.   And, we already know that GPT-3 can write code.  The following all seem (to me at least) like very reasonable conjectures:

(i) Writing code is one of the tasks at which GPT-4 will have (at least) human level capability.

(ii) Clones of GPT-4 will be produced fairly rapidly after GPT-4, say 1-3 years.

(iii) GPT-4 and its clones will have a significant impact on society.  This will show up in the real economy. 

(iv) GPT-4 will be enough to shock governments into paying attention.  (But as we have seen with climate change governments can pay attention to an issue for a long time without effectively doing anything about it.)

(v) Someone is going to ask for GPT-4 (clone) to produce code that generates AGI. (Implicitly, if not explicitly.)

I have absolutely no idea whether GPT-4 will succeed at this endeavor.  But if not, GPT-5 should be available a few years later....

(And, of course, this is just one pathway.)

Replies from: Lanrian
comment by Lukas Finnveden (Lanrian) · 2021-11-24T18:27:27.048Z · LW(p) · GW(p)
GPT-4 is expected to have about 10^14 parameters

There was a Q&A where Sam Altman said GPT-4 is going to be a lot smaller than that (in particular, that it wouldn't have a lot more parameters than GPT-3).

Replies from: Jeff Rose
comment by Jeff Rose · 2021-11-24T20:06:06.034Z · LW(p) · GW(p)

You appear to be correct.   I will withdraw my comment.

comment by Matthew Barnett (matthew-barnett) · 2021-11-23T02:14:08.111Z · LW(p) · GW(p)

It just seems very clear to me that the sort of person who is taken in by [Paul Christiano's slow takeoff] essay is the same sort of person who gets taken in by Hanson's arguments in 2008 and gets caught flatfooted by AlphaGo and GPT-3 and AlphaFold 2.

We can very loosely test this hypothesis by asking whether predictors on Metaculus were surprised by these developments, since Metaculus tends to generally agree with Paul Christiano's model (see here for example). 

Unfortunately, we can't make many inferences with the data available, as it's too sparse. Still, I'm leaving the following information here in case people find it interesting.

  • AlphaGo. There were two questions on Metaculus about Go before AlphaGo beat Lee Sedol. The first was this question about whether an AI would beat a top human Go player in 2016. Before AlphaGo became widely known -- following the announcement of its match against Fan Hui -- the median prediction was around 30%. After the announcement, the probability shot up to 90%. Unfortunately, this can't be taken to be much evidence that Metaculus impressively foresaw a breakthrough that year, since Demis Hassabis had already hinted at a breakthrough at the time the question was opened. (Before the matches, Metaculus put the chances of AlphaGo beating Lee Sedol at 64%).
  • GPT-3. It's unclear what relevant metrics would have counted as "predicting GPT-3". There was a question for the best Penn Treebank perplexity score in 2019 and it turned out Metaculus over-predicted progress (though this was mostly a failure in operationalization, see Daniel Filan's post-mortem). Metaculus had generally anticipated a great increase in parameter counts for ML models in early 2020, as evidenced by this question. More generally, GPT-3 doesn't seem like a good example of a discontinuity in machine learning progress in perplexity when looking at the benchmark data [LW · GW]. It's possible GPT-3 is a discontinuity from previous progress in some other, harder to measure sense, but I'm not currently aware of what that might be.
  • AlphaFold 2. Metaculus wasn't generally very surprised by a breakthrough in protein folding prediction. Since early 2019, predictors placed a greater than 80% chance that a breakthrough would happen by 2031 (note, AlphaFold 2 technically doesn't count as a "breakthrough" by the strict definition in the question criteria). However, it is probably true that Metaculus was surprised that it happened so early.
Replies from: LawChan
comment by LawrenceC (LawChan) · 2021-11-25T09:45:47.347Z · LW(p) · GW(p)

Is GPT-3 perhaps some sort of discontinuity for how single-language text generation w/ neural networks is monetized? Have there been other companies that sold text completion as a service, metered out per token, before GPT-3?

Obviously, this isn’t a purely technical discontinuity, but I haven’t heard of any companies monetizing language models in this way in the past.

EDIT: see also Gwern’s comment for why Penn Tree Bank Perplexity isn’t a good metric for discontinuities in language models. (https://www.lesswrong.com/posts/vwLxd6hhFvPbvKmBH/yudkowsky-and-christiano-discuss-takeoff-speeds?commentId=mKgEsfShs2xtaWz4K [LW(p) · GW(p)])

comment by jsteinhardt · 2021-11-24T22:41:43.419Z · LW(p) · GW(p)

My basic take is that there will be lots of empirical examples where increasing model size by a factor of 100 leads to nonlinear increases in capabilities (and perhaps to qualitative changes in behavior). On median, I'd guess we'll see at least 2 such examples in 2022 and at least 100 by 2030.

At the point where there's a "FOOM", such examples will be commonplace and happening all the time. Foom will look like one particularly large phase transition (maybe 99th percentile among examples so far) that chains into more and more. It seems possible (though not certain--maybe 33%?) that once you have the right phase transition to kick off the rest, everything else happens pretty quickly (within a few days).

Is this take more consistent with Paul's or Eliezer's? I'm not totally sure. I'd guess closer to Paul's, but maybe the "1 day" world is consistent with Eliezer's?

(One candidate for the "big" phase transition would be if the model figures out how to go off and learn on its own, so that number of SGD updates is no longer the primary bottleneck on model capabilities. But I could also imagine us getting that even when models are still fairly "dumb".)

comment by Raemon · 2021-11-23T22:52:17.484Z · LW(p) · GW(p)

So... I totally think there are people who sort of nod along with Paul, using it as an excuse to believe in a rosier world where things are more comprehensible and they can imagine themselves doing useful things without having a plan for solving the actual hard problems. Those types of people exist. I think there's some important work to be done in confronting them with the hard problem at hand.

But, also... Paul's world AFAICT isn't actually rosier. It's potentially more frightening to me. In Smooth Takeoff world, you can't carefully plan your pivotal act with an assumption that the strategic landscape will remain roughly the same by the time you're able to execute on it. Surprising partial-gameboard-changing things could happen that affect what sort of actions are tractable. Also, dumb, boring ML systems run amok could kill everyone before we even get to the part where recursive self improving consequentialists eradicate everyone. 

I think there is still something seductive about this world – dumb, boring ML systems run amok feels like the sort of problem that is easier to reason about and maybe solve. (I don't think it's actually necessarily easier to solve, but I think it can feel that way, whether it's easier or not). And if you solve ML-run-amok-problems, you still end up dead from recursive-self-improving-consequentialists if you didn't have a plan for them.

But, that seductiveness feels like a different problem to me than what's getting argued about in this dialog. (This post seemed to mostly be arguing on the object level at Paul. I recall a previous Eliezer comment where he complained that Paul kept describing things in language that were easy to round off to "things are easy to deal with" even though Eliezer knew that Paul didn't believe that. That feels more like what the argument here was actually about, but the way the conversation was conducted didn't seem to acknowledge that.)

My current take on some object-level points in this post:

  • It (probably) matters what the strategic landscape looks like in the years leading up to AGI.
  • It might not matter if you have a plan for pivotal acts that you're confident are resilient against the sort of random surprises that might happen in Smooth Takeoff World.
  • A few hypotheses that are foregrounded by this post include:
    • Smooth Takeoff World, as measured in GDP.
      • GDP mostly doesn't seem like it matters except as a proxy, so I'm not that hung up on evaluating this. (That said, the "Bureaucracy and Thielian Secrets" model is interesting, and does provoke some interesting thoughts on how the world might be shaped)
    • Smooth Takeoff World, as measured by "AI-breakthroughs-per-year-or-something".
      • This feels like something that might potentially matter. I agree that AI-breakthroughs-per-year is hard to operationalize, but if AI is able to feed back into AI research that seems strategically relevant. I'm surprised/confused that Eliezer wasn't more interested in exploring this.
    • Abrupt Fast Takeoff World, which mostly like this one except suddenly someone has a decisive advantage and/or we're all dead.
    • Chunky Takeoff World. Mostly listed for completeness. Maybe there won't be a smooth hyperbolic curve all the way to FOOM, there might be a few discrete advances in between here and there.
  • Eliezer's arguments against Smooth-Takeoff-World generally don't feel as ironclad to me as the arguments about FOOM. AFAICT he also only specified arguments in this post against Smooth-Takeoff-Measured-By-GDP. It seems possible that, e.g., DeepMind could start making AI advances that they use fully internally without running them through external bureaucracy bottlenecks. It's possible that any sufficiently large organization develops its own internal bureaucracy bottlenecks, but also totally possible that all the smartest people at DeepMind talk to each other and the real work gets done in a way that cuts through it.
  • The "Bureaucracy Bottleneck as crux against Smooth Takeoff GDP World" was quite interesting for general worldmodeling, whether or not it's strategically relevant. It does suggest it might be quite bad if the AI ecosystem figured out how to bypass its own bureaucracy bottlenecks.
comment by Rafael Harth (sil-ver) · 2021-11-23T12:49:35.755Z · LW(p) · GW(p)

Survey on model updates from reading this post. Figuring out to what extent this post has led people to update may inform whether future discussions are valuable.

Results: (just posting them here, doesn't really need its own post)

The question was to rate agreement on the 1=Paul to 9=Eliezer axis before and after reading this post.

Data points: 35

Mean:

Median:

Graph of distribution before (blue) and after (red) and of mean shifts based on prior position (horizontal bar chart).

Raw Data

Anonymous Comments:

Agreement more on need for actions than on probabilities. Would be better to first present points of agreement (that it is at least possible for non(dangerously)-general AI to change situation).

the post was incredibly confusing to me and so I haven't really updated at all because I don't feel like I can crisply articulate yudkowsky's model or his differences with christiano

Replies from: daniel-kokotajlo, Edouard Harris, Benito
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-25T11:21:35.462Z · LW(p) · GW(p)

Wow, I did not expect those results!

Replies from: ramana-kumar, rs-1
comment by Ramana Kumar (ramana-kumar) · 2021-11-25T11:56:45.373Z · LW(p) · GW(p)

I wonder what effect there is from selecting for reading the third post in a sequence of MIRI conversations from start to end and also looking at the comments and clicking links in them.

comment by RS (rs-1) · 2021-11-25T16:33:38.555Z · LW(p) · GW(p)

Were you surprised by the direction of the change or the amount?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-25T17:09:14.710Z · LW(p) · GW(p)

My prediction was mainly about polarization rather than direction, but I would have expected the median or average to not move much probably, and to be slightly more likely to move towards Paul than towards Yudkowsky. I think. I don't think I was very surprised.

Replies from: Benito
comment by Ben Pace (Benito) · 2021-11-25T18:45:55.057Z · LW(p) · GW(p)

Why would it move toward Paul? He made almost no arguments, and Eliezer made lots. When Paul entered the chat it was focused on describing what each of them believe in order to find a bet, not communicating why they believe it.

Replies from: daniel-kokotajlo, matthew-barnett
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-26T01:48:30.148Z · LW(p) · GW(p)

I think I was expecting somewhat better from EY; I was expecting more solid, well-explained arguments/rebuttals to Paul's points from "Takeoff Speeds." Also EY seemed to be angry and uncharitable, as opposed to calm and rational. I was imagining an audience that mostly already agrees with Paul encountering this and being like "Yeah this confirms what we already thought."

Replies from: Benito
comment by Ben Pace (Benito) · 2021-11-26T02:55:54.286Z · LW(p) · GW(p)

FWIW "yeah this confirms what we already thought" makes no sense to me. I heard someone say this the other day, and I was a bit floored. Who knew that Eliezer would respond with a long list of examples that didn't look like continuous progress at the time, and said this more than 3 days ago? 

I feel like I got a much better sense of Eliezer's perspective reading this. One key element is whether AI progress is surprising, which it often is even if you can make trend-line arguments after-the-fact, people basically don't, and when they do they often get it wrong. (Here's an example of Dario Amodei + Danny Hernandez finding a trend in AI, that apparently immediately stopped trending as soon as they reported it. [LW · GW]) There's also lots of details about what the chimps-to-humans transition shows, and various other points (like regulation preventing most AI progress from showing up in GDP). 

I do think I could've gotten a lot of this understanding earlier by more carefully reading IEM, and now that I'm rereading it I get it much better. But nobody seems to have engaged with the arguments in it and tried to connect them to Paul's post that I can see. Perhaps someone did, and I'd be pretty interested to read that now with the benefit of hindsight.

Replies from: rohinmshah, daniel-kokotajlo
comment by Rohin Shah (rohinmshah) · 2021-11-30T08:07:52.993Z · LW(p) · GW(p)

Who knew that Eliezer would respond with a long list of examples that didn't look like continuous progress at the time, and said this more than 3 days ago?

What examples are you thinking of here? I see (1) humans and chimps, (2) nukes, (3) AlphaGo, (4) invention of airplanes by the Wright brothers, (5) AlphaFold 2, (6) Transformers, (7) TPUs, and (8) GPT-3.

I've explicitly seen 1, 2, and probably 4 in arguments before. (1 and 2 are in Takeoff speeds.) The remainder seem like they plausibly did look like continuous progress* at the time. (Paul explicitly challenged 3, 6, and 7, and I feel like 5 and 8 are also debatable, though 8 is a more complicated story.) I also think I've seen some of 3, 5, 6, 7, and 8 on Eliezer's Twitter claimed as evidence for Eliezer over Hanson in the foom debate, idk which off the top of my head.

I did not know that Eliezer would respond with this list of examples, but that's mostly because I expected him to have different arguments, e.g. more of an emphasis on a core of intelligence that current systems don't have and future systems will have, or more emphasis on aspects of recursive self improvement, or some unknown argument because I hadn't talked to Eliezer nor seen a rebuttal from him so it seemed quite plausible he had points I hadn't considered. The list of examples itself was not all that novel to me.

(Eliezer of course also has other arguments in this post; I'm just confused about the emphasis on a "long list of examples" in the parent comment.)

* Note that "continuous progress" here is a stand-in for the-strategy-Paul-uses-to-predict, which as I understand it is more like "form beliefs about how outputs scale with effort in this domain using past examples / trend lines, then see how much effort is being added now relative to the past, and use that to make a prediction".

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-26T13:06:48.654Z · LW(p) · GW(p)

That's helpful, thanks!

To be clear, I think that if EY put more effort into it (and perhaps had some help from other people as RAs) he could write a book or sequence rebutting Paul & Katja much more thoroughly and convincingly than this post did. [ETA: I.e. I'm much more on Team Yud than Team Paul here.] The stuff said here felt like a rehashing of stuff from IEM and the Hanson-Yudkowsky AI foom debate to me. [ETA: Lots of these points were good! Just not surprising to me, and not presented as succinctly and compellingly (to an audience of me) as they could have been.]

Also, it's plausible that a lot of what's happening here is that I'm conflating my own cruxes and confusions for The Big Points EY Objectively Should Have Covered To Be More Convincing. :)

ETA: And the fact that people updated towards EY on average, and significantly so, definitely updates me more towards this hypothesis!

Replies from: Lanrian
comment by Lukas Finnveden (Lanrian) · 2021-11-26T13:49:37.187Z · LW(p) · GW(p)

This is my take: if I had been very epistemically self-aware, and carefully distinguished my own impression/models and my all-things considered beliefs, before I started reading, then this would've updated my models towards Eliezer (because hey, I heard new not-entirely-uncompelling arguments) but my all-things considered beliefs away from Eliezer (because I would have expected it to be even more convincing).

I'm not that surprised by the survey results. Most people don't obey conservation of expected evidence, because they don't take into account arguments they haven't heard / don't think carefully enough about how deferring to others works. People will predictably update toward a thesis after reading a book that argues for it, not have a 50/50 chance of updating positively or negatively on it.
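
(The principle being invoked, stated explicitly: a Bayesian's current credence must equal the expectation of their future credence over the possible evidence, so you should not be able to predict the direction of your own update in advance:)

$$P(H) \;=\; \sum_{e} P(E = e)\,P(H \mid E = e) \;=\; \mathbb{E}\!\left[P(H \mid E)\right]$$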

comment by Matthew Barnett (matthew-barnett) · 2021-11-25T19:02:34.035Z · LW(p) · GW(p)

I didn’t move significantly towards either party but it seemed like Eliezer was avoiding bets, and generally, in my humble opinion, making his theory unfalsifiable rather than showing what its true weakpoints are. That doesn’t seem like what a confidently correct person would do (but it was already mostly what I expected, so I didn’t update by much on his theory’s truth value).

ETA: After re-reading my comment, I feel I may have come off too strong. I'll completely unendorse my language and comment if people think this sort of thing is not conducive to productive discourse. Also, I greatly appreciate both parties for doing this.

Replies from: Eliezer_Yudkowsky, Benito
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-25T21:10:20.259Z · LW(p) · GW(p)

I find it valuable to know what impressions other people had themselves; it only becomes tone-policing when you worry loudly about what impressions other people 'might' have.  (If one is worried about how it looks to say so publicly, one could always just DM me (though I might not respond).)

comment by Ben Pace (Benito) · 2021-11-25T20:15:12.393Z · LW(p) · GW(p)

FWIW I also don’t like the phrasing of my comment very much either. I came back thinking to remove it but saw you’d already replied :P

comment by Edouard Harris · 2021-11-24T17:38:42.622Z · LW(p) · GW(p)

(Not being too specific to avoid spoilers) Quick note: I think the direction of the shift in your conclusion might be backwards, given the statistics you've posted and that 1=Eliezer and 9=Paul.

Replies from: Lanrian
comment by Lukas Finnveden (Lanrian) · 2021-11-24T18:36:16.976Z · LW(p) · GW(p)

No, the form says that 1=Paul. It's just the first sentence under the spoiler that's wrong.

Replies from: Edouard Harris
comment by Edouard Harris · 2021-11-24T19:34:35.145Z · LW(p) · GW(p)

Good catch! I didn't check the form. Yes you are right, the spoiler should say (1=Paul, 9=Eliezer) but the conclusion is the right way round.

Replies from: sil-ver
comment by Rafael Harth (sil-ver) · 2021-11-24T19:48:59.226Z · LW(p) · GW(p)

Yeah, it's fixed now. Thanks for pointing it out.

comment by Ben Pace (Benito) · 2021-11-24T20:24:06.535Z · LW(p) · GW(p)

How interesting; I am the median.

comment by Matthew Barnett (matthew-barnett) · 2021-11-22T23:36:15.597Z · LW(p) · GW(p)

Summary of my response: before you can train a really powerful AI, someone else can train a slightly worse AI.

Yeah, and before you can evolve a human, you can evolve a Homo erectus, which is a slightly worse human.

I might be wrong about this, but my impression was that the rise of human culture and civilization was timed with the end of the Pleistocene, rather than timed with the development of better (and more general) brains. 

My guess is that modern humans probably do have more general brains than Homo erectus that came before us. But if Homo erectus had not been living in a geological epoch of repeated glaciations, then perhaps we would have seen a simpler Homo erectus civilization?

In general, I don't yet see a strong reason to think that our general brain architecture is the sole, or potentially even primary reason why we've developed civilization, discontinuous with the rest of the animal kingdom. A strong requirement for civilization is the development of cultural accumulation via language, and more specifically, the ability to accumulate knowledge and technology over generations. Just having a generalist brain doesn't seem like enough; for example, could there have been a dolphin civilization?

Replies from: Gram Stone, Robbo
comment by Gram Stone · 2021-11-24T03:02:11.190Z · LW(p) · GW(p)

If I take the number of years since the emergence of Homo erectus (2 million years) and divide that by the number of years since the origin of life (3.77 billion years), and multiply that by the number of years since the founding of the field of artificial intelligence (65 years), I get a little over twelve days. This seems to at least not directly contradict my model of Eliezer saying "Yes, there will be an AGI capable of establishing an erectus-level civilization twelve days before there is an AGI capable of establishing a human-level one, or possibly an hour before, if reality is again more extreme along the Eliezer-Hanson axis than Eliezer. But it makes little difference whether it's an hour or twelve days, given anything like current setups." Also insert boilerplate "essentially constant human brain architectures, no recursive self-improvement, evolutionary difficulty curves bound above human difficulty curves, etc." for more despair.
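
(Spelling out the arithmetic with the stated figures:)

$$\frac{2 \times 10^{6}}{3.77 \times 10^{9}} \times 65 \ \text{years} \;\approx\; 0.034 \ \text{years} \;\approx\; 12.6 \ \text{days}$$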

I guess even though I don't disagree that knowledge accumulation has been a bottleneck for humans dominating all other species, I don't see any strong reason to think that knowledge accumulation will be a bottleneck for an AGI dominating humans, since the limits to human knowledge accumulation seem mostly biological. Humans seem to get less plastic with age, mortality among other things forces us to specialize our labor, we have to sleep, we lack serial depth, we don't even approach the physical limits on speed, we can't run multiple instances of our own source, we have no previous example of an industrial civilization to observe, I could go on: a list of biological fetters that either wouldn't apply to an AGI or that an AGI could emulate inside of a single mind instead of across a civilization. I am deeply impressed by what has come out of the bare minimum of human innovative ability plus cultural accumulation. You say "The engine is slow," I say "The engine hasn't stalled, and look how easy it is to speed up!"

I'm not sure I like using the word 'discontinuous' to describe any real person's position on plausible investment-output curves any longer; people seem to think it means "intermediate value theorem doesn't apply," (which seems reasonable) when usually hard/fast takeoff proponents really mean "intermediate value theorem still applies but the curve can be almost arbitrarily steep on certain subintervals."

Replies from: Eliezer_Yudkowsky, Robbo, matthew-barnett
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-24T23:07:01.277Z · LW(p) · GW(p)

That was a pretty good Eliezer model; for a second I was trying to remember if and where I'd said that.

comment by Robbo · 2021-11-24T17:28:59.448Z · LW(p) · GW(p)

I guess even though I don't disagree that knowledge accumulation has been a bottleneck for humans dominating all other species, I don't see any strong reason to think that knowledge accumulation will be a bottleneck for an AGI dominating humans, since the limits to human knowledge accumulation seem mostly biological. Humans seem to get less plastic with age, mortality among other things forces us to specialize our labor, we have to sleep, we lack serial depth, we don't even approach the physical limits on speed, we can't run multiple instances of our own source, we have no previous example of an industrial civilization to observe, I could go on: a list of biological fetters that either wouldn't apply to an AGI or that an AGI could emulate inside of a single mind instead of across a civilization.

I agree with this, and I think that you are hitting on a key reason that these debates don't hinge on what the true story of the human intelligence explosion ends up being. Whichever of these is closer to the truth

a) the evolution of individually smarter humans using general reasoning ability was the key factor

b) the evolution of better social learners and the accumulation of cultural knowledge was the key factor

...either way, there's no reason to think that AGI has to follow the same kind of path that humans did. I found an earlier post on the Henrich model of the evolution of intelligence, Musings on Cumulative Cultural Evolution and AI [LW · GW]. I agree with Rohin Shah's takeaway [LW · GW] on that post:

I actually don't think that this suggests that AI development will need both social and asocial learning: it seems to me that in this model, the need for social learning arises because of the constraints on brain size and the limited lifetimes. Neither of these constraints apply to AI -- costs grow linearly with "brain size" (model capacity, maybe also training time) as opposed to superlinearly for human brains, and the AI need not age and die. So, with AI I expect that it would be better to optimize just for asocial learning, since you don't need to mimic the transmission across lifetimes that was needed for humans.

Replies from: rohinmshah
comment by Rohin Shah (rohinmshah) · 2021-11-30T06:55:18.028Z · LW(p) · GW(p)

(To be clear, the thing you quoted was commenting on the specific argument presented in that post. I do expect that in practice AI will need social learning, simply because that's how an AI system could make use of the existing trove of knowledge that humans have built.)

comment by Matthew Barnett (matthew-barnett) · 2021-11-24T23:15:48.949Z · LW(p) · GW(p)

I'm not sure I like using the word 'discontinuous' to describe any real person's position on plausible investment-output curves any longer; people seem to think it means "intermediate value theorem doesn't apply" (which seems reasonable), when usually hard/fast takeoff proponents really mean "intermediate value theorem still applies but the curve can be almost arbitrarily steep on certain subintervals."

FWIW when I use the word discontinuous in these contexts, I'm almost always referring to the definition Katja Grace uses,

We say a technological discontinuity has occurred when a particular technological advance pushes some progress metric substantially above what would be expected based on extrapolating past progress. We measure the size of a discontinuity in terms of how many years of past progress would have been needed to produce the same improvement. We use judgment to decide how to extrapolate past progress.

This is quite different from the mathematical definition of continuity.
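To make the "years of past progress" measure concrete, here is a minimal sketch (my own illustration, not Katja Grace's actual methodology): fit a linear trend to the historical metric and ask how many trend-years the new jump is worth. The function name and the toy numbers are hypothetical.

```python
# Minimal sketch: measure a jump's size in "years of past progress" by fitting
# a linear trend to the historical metric and dividing the jump by the slope.
from typing import List, Tuple

def discontinuity_in_years(history: List[Tuple[float, float]], new_value: float) -> float:
    """history: (year, metric) pairs up to now; new_value: the metric after the advance."""
    years = [y for y, _ in history]
    vals = [v for _, v in history]
    n = len(history)
    mean_y = sum(years) / n
    mean_v = sum(vals) / n
    # ordinary least-squares slope of metric vs. year (units: metric per year)
    slope = sum((y - mean_y) * (v - mean_v) for y, v in history) / sum((y - mean_y) ** 2 for y in years)
    jump = new_value - vals[-1]
    return jump / slope  # how many years of trend progress the jump represents

# Example: steady progress of 1 unit/year, then a single advance adds 12 units.
print(discontinuity_in_years([(2010, 0), (2011, 1), (2012, 2), (2013, 3)], 15.0))  # 12.0 years
```

On this operationalization, "discontinuity" is a matter of degree (how many trend-years), which is why it differs from the mathematical notion.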

comment by Robbo · 2021-11-23T18:11:10.424Z · LW(p) · GW(p)

In general, I don't yet see a strong reason to think that our general brain architecture is the sole, or potentially even primary reason why we've developed civilization, discontinuous with the rest of the animal kingdom. A strong requirement for civilization is the development of cultural accumulation via language, and more specifically, the ability to accumulate knowledge and technology over generations.

In The Secrets of Our Success, Joe Henrich argues that without our stock of cultural knowledge, individual humans are not particularly more generally intelligent than apes. (Neanderthals may very well have been more generally intelligent than humans - and indeed, their brains were bigger than ours.)

And, he claims, to the extent that individual humans are now especially intelligent, this was because of culture-driven natural selection. For Henrich, the story of human uniqueness is a story of a feedback loop: increased cultural know-how, which drives genetic selection for bigger brains and better social learning, which leads to increased cultural know-how, which drives genetic selection for bigger brains… and so forth, until you have a very weird great ape that is weak, hairless, and has put a flag on the moon.

Note: this evolution + culture feedback loop is still a huge discontinuity that led to massive changes in relatively short evolutionary time!

Just having a generalist brain doesn't seem like enough; for example, could there have been a dolphin civilization?

Henrich speculates that a bunch of idiosyncratic features came together to launch us into the feedback loop that led to us being a cultural species. Most species, including dolphins, do not get onto this feedback loop because of a "startup" problem: bigger brains will give a fitness advantage only up to a certain point, because individual learning can only be so useful. For there to be further selection for bigger brains, you need a stock of cultural know-how (cooking, hunting, special tools) that makes individual learning very important for fitness. But, to have a stock of cultural know-how, you need big brains.

Henrich speculates that humans overcame the startup problem due to a variety of factors that came together when we descended from the trees and started living on the ground. The important consequences of a species being on the ground (as opposed to in the trees):

  1. It frees up your hands for tool use. Captive chimps, which are more “grounded” than wild chimps, make more tools.
  2. It’s easier for you to find tools left by other people.
  3. It’s easier for you to see what other people are doing and hang out with them. (“Hang out” being inapt, since that’s precisely not what you’re doing).
  4. You need to group up with people to survive, since there are terrifying predators on the ground. Larger groups offer protection; these larger groups will accelerate the process of people messing around with tools and imitating each other.

Larger groups also produce new forms of social organization. Apparently, in smaller groups of chimps, the reproductive strategy that every male tries to follow is “fight as many males as you can for mating opportunities.” But in a larger group, it becomes better for some males to try to pair bond – to get multiple reproductive opportunities with one female, by hanging around her and taking care of her.

Pair bonding in turn allows for more kinship relationships. Kinship relationships mean you grow up around more people; this accelerates learning. Kinship also allows for more genetic selection for big-brained, slow-developing learners: it becomes less prohibitively costly to give birth to big-brained, slow-growing children, because more people are around to help out and pool food resources.

This story is, by Henrich's own account, quite speculative. You can find it in Chapter 16 of the book.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-23T19:00:22.016Z · LW(p) · GW(p)

In The Secrets of Our Success, Joe Henrich argues that without our stock of cultural knowledge, individual humans are not particularly more generally intelligent than apes.

I 75% agree with this, but I do think that individual humans are smarter than individual chimpanzees. A big area of disagreement is the distinction between "intrinsic ability to innovate" and "ability to process culture", and whether it's even possible to distinguish the two. I wrote a post about this [LW · GW] two years ago.

For Henrich, the story of human uniqueness is a story of a feedback loop: increased cultural know-how, which drives genetic selection for bigger brains and better social learning, which leads to increased cultural know-how, which drives genetic selection for bigger brains… and so forth, until you have a very weird great ape that is weak, hairless, and has put a flag on the moon.

This is the big crux for me on the evolution of humans and its relevance to the foom debate.

Roughly, I think Henrich's model is correct. I think his model provides a simple, coherent explanation for why humans dominate the world, and why it happened on such a short timescale, discontinuously with other animals.

Of course, intelligence plays a large role in his model: you can't get ants who can go to the moon, no matter how powerful their culture. But the great insight is that our power does not come from our raw intelligence: it comes from our technology/culture, which is so powerful because it was allowed to accumulate.

Cultural accumulation is a zero-to-one discontinuity. That is, you can go a long time without any of it, and then something comes along that's able to do it just a little bit and then shortly after, it blows up. But after you've already reached one, going from "being able to accumulate culture at all" to "being able to accumulate it slightly faster" does not give you the same discontinuous foom as before.

We could, for example, imagine an AI that can accumulate culture slightly faster than humans can. Since this AI is only slightly better than humans, however, it doesn't go and create its own culture on its own. Unlike the humans -- who actually did go and create their own culture completely on their own, separate from other animals -- the AI will simply be one input to the human economy.

This AI would be an important input to our economy for sure, but not a completely separate entity producing its own distinct civilization, like the prototypical AI that spins up nanobot factories and kills us all within 3 minutes. It will be more like a brilliant professor, or an easily copyable worker. In other words, it might speed up our general civilizational abilities to develop technology, and greatly enhance our productive capabilities. But it won't, on its own, discontinuously produce technology 2.0 (where humans were technology 1.0 and animals were roughly technology 0.0).

Replies from: Liron
comment by Liron · 2021-11-27T21:47:40.630Z · LW(p) · GW(p)

I think a superintelligent AI can FOOM its way to manufacturing nanobots because the biggest bottleneck to engineering and manufacturing those is research that can be done without needing input from the physical universe beyond the physics we already know, and the machines we already have, with very slight upgrades or creative usages beyond what they were designed for. Manufacturing nanobots is like a logic brain teaser for a sufficiently intelligent reasoner. I guess you have a different perspective in that you think the process requires a culture of socializing beings, and/or more input from the physical universe?

comment by Nisan · 2021-11-23T21:50:22.515Z · LW(p) · GW(p)

The central hypothesis of "takeoff speeds" is that at the time of serious AGI being developed, it is perfectly anti-Thielian in that it is devoid of secrets

No, the slow takeoff model just precludes there being one big secret that unlocks both 30%/year growth and Dyson spheres. It's totally compatible with a bunch of medium-sized $1B secrets that different actors discover, adding up to hyperbolic economic growth in the years leading up to "rising out of the atmosphere".

Rounding off the slow takeoff hypothesis to "lots and lots of little innovations adding up to every key AGI threshold, which lots of actors are investing $10 million in at a time" seems like black-and-white thinking, demanding that the future either be perfectly Thielian or perfectly anti-Thielian. The real question is a quantitative one — how lumpy will takeoff be?

comment by Matthew Barnett (matthew-barnett) · 2021-11-23T17:41:56.043Z · LW(p) · GW(p)

Unfortunately, it looks like Yudkowsky and Christiano weren't able to come to an agreement on what bets to make.

In place of that, I'll ask, whatever camp you belong to: what concrete predictions do you make that you believe most strongly diverge from what people in the "other" camp believe, and can be resolved substantially before the world ends?

I propose we restrict our predictions to roughly 2026, which is pretty soon but probably not world-ending-soon (on almost all views).

Replies from: conor-sullivan
comment by Lone Pine (conor-sullivan) · 2021-11-24T23:33:04.770Z · LW(p) · GW(p)

I would say I agree more with Christiano.

By 2026:

  • At least 50% of programming work that would have been done by a human programmer in 2019 will be done by systems like Codex or Copilot.
  • Humanoid robotic maids, butlers, and companions will be for sale in some form, although they will be limited and underwhelming, and few people will have them in their homes.
  • Self-driving will finally be practical and applied widely. In the USA, between 10% and 70% of automobile trips will be autonomous or in self-driving mode. Humans will not be banned from driving anywhere in the world; that's more of a 2030s+ thing.
  • AI will beat human grandmasters at nearly every video game or formal game. There might be 1-5 games that AI still struggles with, and they will be notable exceptions. Or there might be 0 such games. RL systems will be able to learn most games from pixels in less than a GPU-day (using 2026-era GPUs consuming less than 1,000 watts and costing less than $4,000 in 2019 USD, adjusted for inflation). RL research will be focused on beating humans in sports and physical games like soccer, basketball, golf, etc.
  • Chatbots will regularly pass Turing tests, although it will remain controversial whether that means anything. Publicly available chatbots will be about as good as GPT-3 in grammar and competence, but unlike GPT-3 they will have consistent personalities and memory over time -- i.e., the limitations of the 2048-token window will be overcome somehow. Good chatbots will be available to the public, and will be ubiquitous in customer service, but whether they are popular as companions or personal assistants will depend on public acceptance. This is the same problem faced by AR: the tech will definitely be there, but the public might not be interested and might be somewhat hostile.
  • I personally am not sure if GWP growth will be significantly above historical baselines. I think AI will have progressed significantly, but we also know that, even going back to the 90s, information technology has made an underwhelming impact on productivity. The world economy is such a weird mess right now for reasons that have nothing to do with AI, so it's hard to make predictions.
  • There won't be significant unemployment due to technology (yet), but some careers will be significantly altered, including drivers and programmers.

 

I consider these predictions to be pretty conservative. I would not be surprised to be surprised by AI progress, but I would be very disappointed if we didn't meet 5/7 of my predictions.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-25T03:34:27.152Z · LW(p) · GW(p)

I think I'm happy to bet against predictions (1), (2), (3), and (5). Predictions (6) and (7) don't seem like they're committing to anything specific so I don't know whether I disagree.

My worry is that when we get more specific about what each of these things means, you might end up backing off and using a much more modest operationalization than I'm hoping for. For example, when you say,

Chatbots will regularly pass Turing tests

I don't think that a chatbot will pass a strong (adversarial) Turing test by 2026, of the type specified in Kapor and Kurzweil's 2029 bet. However, I expect there will be weaker, less impressive Turing tests that chatbots will pass by then. Also it's unclear what "regularly pass" means (did bots "regularly" beat top Go players in 2016, or was that just a few games?).

Replies from: conor-sullivan
comment by Lone Pine (conor-sullivan) · 2021-11-27T00:31:44.166Z · LW(p) · GW(p)

6 and 7 are definitely non-predictions, or a prediction that nothing interesting will happen. 1, 2, 4 and 5 are softly almost true today: 

(1) AI Programming -- I heard a rumor (don't have a source on this) that something like 30% of new GitHub commits involve Copilot. I can't imagine that's really true (it seems implausible), but my prediction can come true if AI code completion becomes very popular.

(2) Household Robots -- Every year for the last decade or so some company has demoed some kind of home robot at an electronics convention, but if any of them have actually made it to market, the penetration is very small. Eventually someone will make one that's useful enough to sell a few hundred or more units. I don't think a Roomba should qualify as meeting my prediction, which is why I specified a "humanoid" robot.

(3) Self Driving -- I stand by what I said, nothing to expand on. I believe that Tesla and Waymo, at least, already have self driving tech good enough, so this is mostly about institutional acceptance.

(4) DRL learning games from pixels -- EfficientZero essentially already does this, but restricted to the 57 Atari games. My prediction is that there will be an EfficientZero for all video games.

(5) Turing Test -- I think that the Turing test is largely a matter of how long the computer can fool the judge for, in addition to the judge knowing what to look for. Systems from the 70s could probably fool a judge for about 30 seconds. Modern chatbots might be able to fool a competent judge for 10 minutes, and an incompetent judge (naive casual user) for a couple hours at the extreme. I think by 2026 chatbots will be able to fool competent judges for at least 30 minutes, and will be entertaining to naive casual users indefinitely (i.e., people can make friends with their chatbots and it doesn't get boring quickly, if ever).
 

For 6 and 7, I'm going to make concrete predictions.

(6) Some research institute or financial publication of repute will claim that AI technology (not computers generally, just AI) will have "added X Trillion Dollars" to the US or world economy, where X is at least 0.5% of US GDP or GWP, respectively. Whether this is actually true might be controversial, but someone will have made the claim. GWP will not be significantly above trendline.

(7) At least two job titles will have less than 50% of the number of workers they had in 2019. The most likely jobs to have been affected are drivers, cashiers, fast food workers, laundry workers, call center workers, factory workers, workers in petroleum-related industries*, QA engineers, and computer programmers. These jobs might shift within the industry such that the number of people working in industry X is similar, but there has to be a job title that has shrunk by 50%. For example, the same X million people still work in customer service, but most of them are doing something like prompt engineering on AI chatbots, as opposed to taking phone calls directly.

* This one has nothing to do with AI, but I expect it to happen by 2026 nonetheless.

Let me know if you want to formalize a bet on some website. 

Replies from: matthew-barnett, matthew-barnett, matthew-barnett, matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-27T01:06:16.496Z · LW(p) · GW(p)

For (2) I am less interested in betting than I was previously. Before, I assumed you meant that there would be actual, competent humanoid robotic maids and butlers for sale in 2026. But now I'm imagining that you meant just any ordinary humanoid robot on the market, even if it doesn't do what a real human maid or butler does.

Like, I think technically in 1990 companies could have already been selling "Humanoid robotic maids", but they would've been functionally useless. Without some sort of constraint on what actually counts as a robotic maid, I think some random flashy-yet-useless robot that changed hands and made some company $300,000 in revenue might count for the purposes of this bet. And I would prefer not to take a bet with that as a potential outcome.

comment by Matthew Barnett (matthew-barnett) · 2021-11-27T00:50:24.952Z · LW(p) · GW(p)

Some research institute or financial publication of repute will claim that AI technology (not computers generally, just AI) will have "added X Trillion Dollars" to the US or world economy, where X is at least 0.5% of US GDP or GWP, respectively. Whether this is actually true might be controversial, but someone will have made the claim.

This seems like an extremely weak prediction. Institutions, even fairly reputable ones, make fantastic claims like that all the time. 

For example, I found one article written in 2019 that says, "By one estimate, AI contributed a whopping $2 trillion to global GDP last year." It cites PricewaterhouseCoopers, which according to Wikipedia is "the second-largest professional services network in the world and is considered one of the Big Four accounting firms, along with Deloitte, EY and KPMG."

Since GWP was about 86.1 trillion USD in 2018, according to the World Bank, this means that PwC thinks that artificial intelligence is already contributing more than 2% of our gross world product, four times more than you expected would be claimed by 2026!

comment by Matthew Barnett (matthew-barnett) · 2021-11-27T00:39:39.313Z · LW(p) · GW(p)

Modern chatbots might be able to fool a competent judge for 10 minutes

I am highly skeptical. Which chatbots are you imagining here?

comment by Matthew Barnett (matthew-barnett) · 2021-11-27T01:18:31.290Z · LW(p) · GW(p)

(3) Self Driving -- I stand by what I said, nothing to expand on. I believe that Tesla and Waymo, at least, already have self driving tech good enough, so this is mostly about institutional acceptance.

The problem with some of your predictions is that I don't know how to operationalize them. For example, does L4 self-driving count? What about L3? What source can be used to resolve this question? I'm not currently aware of any source that counts the number of trips done in automobiles in the US, and tabulates them by car type (or self-driving status). So, to bet, we'd either need to get a source, or come up with a different way of operationalizing the question.

(As an aside, I have found that a very high fraction of predictions -- even among people who care a lot about betting -- tend to be extremely underspecified. I think it's a non-trivial skill to know how to operationalize bets, and most people just aren't very good at it without lots of practice. That's not a criticism of you :). However, I do prefer that you state your predictions very precisely because otherwise we're just not going to be able to do the bet.)

Replies from: conor-sullivan
comment by Lone Pine (conor-sullivan) · 2021-11-27T03:38:31.959Z · LW(p) · GW(p)

I think you're 100% right. Most (>>80%) of the bets I see on Long Bets, or predictions on Metaculus, are underspecified to the point where a human mediator would have to make a judgement call that can be considered unfair to someone. I don't expect that to change no matter how much work I do, unless I make bets on specific statistics from well-known sources, e.g., the stock market or the CIA World Factbook.

There are possible futures where prediction (3) is obvious. For example, if someone predicted that 50% of trips would be self-driving in 2021 (many people did predict that 5 years ago), we can easily prove them wrong without having to debate whether Tesla is L2 or L5 and whether that matters. Teslas are not 50% of the cars on the road, nor are Waymos, so you can easily see that most trips in 2021 are not self-driving by any definition. I think there are also future worlds where 95% of cars and trips are L5, most cars can legally autonomously drive anywhere without any humans inside, etc., and in that world there isn't much to debate about unless you're really petty. So we could make bets hoping that things will be that obvious, but I don't think either of us wants to do the work to avoid this kind of ambiguity.

I'm happy to consider my bets as paid in Bayes points without any need for future adjudication. So, for all the Bayes points, I'd love to hear what your equivalent predictions are for 2026.

For what it's worth, here's my revised (3): Greater than 10% of cars on the road will either be legally capable of L4/L5, OR be legally L2/L3 but with disengagements uncommon, at less than one per typical trip. (Meaning, if you watch a video from the AI DRIVR YouTube channel, there's less than one disengagement per 20 minutes of driving time.)

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-27T05:04:44.746Z · LW(p) · GW(p)

I think you're 100% right. Most (>>80%) of the bets I see on Long Bets, or predictions on Metaculus, are underspecified to the point where a human mediator would have to make a judgement call that can be considered unfair to someone.

To be clear, I have spent a ton of time on Metaculus and I find this impression incorrect. I have spent comparatively little time on Long Bets but I think it's also wrong there for the most part.

I think you may have accidentally called out parties who are, in my opinion, exemplars of what solid prediction platforms should look like. There are far, far worse parties that you could have called out.

comment by Matthew Barnett (matthew-barnett) · 2021-11-22T23:52:55.940Z · LW(p) · GW(p)

Summary of my response: at the point where humans are completely removed from a process, they will have been modestly improving output rather than acting as a sharp bottleneck that is suddenly removed.

Not very relevant to my whole worldview in the first place; also not a very good description of how horses got removed from automobiles, or how humans got removed from playing Go.

I'm not sure about horses, but Go doesn't seem like a central example of human labor being automated. I definitely feel that the following examples have been more continuous (in the sense of human labor becoming gradually obsolete, rather than all-at-once),

  • Agriculture
  • Manufacturing
  • Travel agents

My guess is that it's also been true for people doing manual calculations, language translation, and speech-to-text.

Replies from: None
comment by [deleted] · 2021-11-25T19:36:10.483Z · LW(p) · GW(p)

Here's a source on horse population in the US

But also, your first graph covers a time period of 200 years whereas the third graph only covers 13; that's not even the same order of magnitude. If you zoom in enough, any curve looks smooth, even an AI that FOOMs in mere hours.

Also, the original quote is stating something about sharp increases in output once the last human bottleneck is gone, not about how gradually human elements are being removed.

comment by Lukas Finnveden (Lanrian) · 2021-11-22T23:46:24.047Z · LW(p) · GW(p)

Did you ever finalize any bet(s)?

comment by Matthew Barnett (matthew-barnett) · 2021-11-22T23:11:29.738Z · LW(p) · GW(p)

Historical AI applications have had a relatively small loading on key-insights and seem like the closest analogies to AGI.

...Transformers as the key to text prediction?

It's hard to see transformers making a big difference in text prediction trends when you look at benchmark data. On language modeling benchmarks such as the Penn Treebank dataset, we saw roughly smooth progress since at least 2014, continuing at roughly the same rate through late 2017 and 2018, when the first transformer models were coming out.

It's plausible that progress after 2017 has been faster than progress prior to 2017, but that this is hard to see in the data on Papers With Code, which only goes back to about 2013. That said, we can still see significant gradual progress prior to 2013 documented in Shen et al., which in my opinion does not look radically slower than progress post-2017.

Related SSC post: Does reality drive straight lines on graphs, or do straight lines on graphs drive reality? 

comment by Ben Pace (Benito) · 2022-12-17T01:24:56.008Z · LW(p) · GW(p)
  • Paul's post on takeoff speeds had long been, IMO, the last major public step in the dialogue on this subject (not forgetting to honorably mention Katja's crazy discontinuous progress examples [LW · GW] and Kokotajlo's arguments against using GDP as a metric [LW · GW]), and I found it exceedingly valuable to read how it reads to someone else who has put a great deal of work into figuring out what's true about the topic, thinks about it in very different ways, and has come to different views on it. I found this very valuable for my own understanding of the subject, and I felt I learned a bunch on reading it.
  • Eliezer wrote it from a fairly exasperated (and a little desperate) place and that comes across in the writing. I think if you aren't literally Paul and you are interested in the subject, then you should get over that and read it for the insights. I think if you are literally Paul then it's quite reasonable to be very defensive in the ensuing dialogue.
  • I do not know what to make of the monkeys/chimp thing, except to be at least fairly scared about similarly sudden improvements in generality occurring again (though I acknowledge Paul has an argument that we shouldn't expect to see that again).
  • I could say more about what I learned from reading this, but I don't value my own take on the object level highly enough for it to be worth writing about here.
  • I didn't learn much from the ensuing dialogue and I'm not intending to vote for that. I think most of the dialogues were valuable as artifacts of conversation and attempted communication, but not as valuable for learning about takeoff (especially the Paul/Eliezer back-and-forth on whether they should even be able to find something to bet about). Edit: After re-reading a bunch of the ensuing discussion, I actually think it's great, and will be +9'ing a bunch of that too.

This isn't a great review from me, but I thought I'd write a few notes anyway, because this is one of my few +9s.

comment by Nisan · 2021-11-23T21:56:21.042Z · LW(p) · GW(p)

it legitimately takes the whole 4 years after that to develop real AGI that ends the world. FINE. SO WHAT. EVERYONE STILL DIES.

By Gricean implicature, "everyone still dies" is relevant to the post's thesis. Which implies that the post's thesis is that humanity will not go extinct. But the post is about the rate of AI progress, not human extinction.

This seems like a bucket error [LW · GW], where "will takeoff be fast or slow?" and "will AI cause human extinction?" are put in the same bucket.

comment by ADifferentAnonymous · 2021-11-23T17:20:29.629Z · LW(p) · GW(p)

My question after reading this is about Eliezer's predictions in a counterfactual without regulatory bottlenecks on economic growth. Would it change the probable outcome, or would we just get a better look at the oncoming AGI train before it hit us? (Or is there no such counterfactual well-defined enough to give us an answer?) ETA: Basically trying to get at whether that debate's actually a crux of anything.

comment by Matthew Barnett (matthew-barnett) · 2021-11-22T22:12:17.170Z · LW(p) · GW(p)

The real world is allowed to do discontinuous things to you anyways.

There is not necessarily a presage of 9/11 where somebody flies a small plane into a building and kills 100 people, before anybody flies 4 big planes into 3 buildings and kills 3000 people; and even if there is some presaging event like that, which would not surprise me at all, the rest of the world's response to the two cases was evidently discontinuous.

There have been numerous terrorist incidents in world history, and triggers to war, and it's not clear to me that 9/11 is the most visceral. To the extent that AI disasters will be discontinuous in the sense that 9/11 was discontinuous, this seems like a reason for optimism, not pessimism. We largely overreacted to 9/11, rather than just letting it slide and allowing some much larger disaster to take us by surprise.

ETA: I should note that without a clear definition of "discontinuous" I'm not sure whether I disagree with what was said. I do think 9/11 was discontinuous in the sense of it being shocking and unexpected. But it doesn't seem strongly discontinuous in the sense of breaking from historical trends.

Replies from: Vaniver
comment by Vaniver · 2021-11-23T03:31:13.235Z · LW(p) · GW(p)

There have been numerous terrorist incidents in world history, and triggers to war, and it's not clear to me that 9/11 is the most visceral.

I do think part of the problem here is 'reference class tennis', where you can draw boundaries in different ways to get different conclusions, and it's not quite clear which boundaries are the most predictive.

As I understand Eliezer's point in that section, Paul's model seems to predict there won't be discontinuities in the input/output response, but we have lots of examples of that sort of thing. Two years before the 9/11 attacks, EgyptAir Flight 990 was deliberately crashed into the ocean by its first officer with 217 fatalities, about 10% of the 9/11 fatalities, and yet the response to Flight 990 was much, much less than 10% of the response to 9/11.

Before orchestrating the 1914 assassination of Archduke Franz Ferdinand, the same person orchestrated the assassination of King Alexander Obrenović and others in 1903, which did not lead to a war 10% the size of WWI (just sanctions and withdrawn ambassadors).


Separately, there's the question of how much you should expect there to be trend-breaking events. If you're working with just data collected up until 2000, I think you'll be surprised by 2001; the number of fatalities is far outside of distribution (the recent plane crashes primarily killed passengers; you have to go back to WWII to find kamikaze attacks that killed more people on the ground than passengers, and even then the average number of casualties per suicide was 2, with the highest I can find being 389), and there isn't a trendline suggesting a huge increase is coming.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-23T04:21:49.574Z · LW(p) · GW(p)

I think you'll be surprised by 2001; the number of fatalities is far outside of distribution

Good point. I think I had overstated the extent to which terrorism had been a frequent occurrence. 9/11 is indeed the deadliest terrorist attack ever recorded (I didn't realize that few other attacks even came close).

However, I do want to push back against the idea that this event was totally unprecedented. The comparison to other "terrorist attacks" is, as you hint at, a bit of a game of reference class tennis. Compared against other battles, air raids, and massacres, Wikipedia notes several dozen events in the context of war with comparable death tolls. But of course, the United States did not see itself as being in an active state of war at the time.

The closest comparison is probably the attack on Pearl Harbor, in which a comparable number of people died. But that attack was orchestrated by an industrializing state, not an insurgent terrorist group.

comment by Matthew Barnett (matthew-barnett) · 2021-11-23T01:21:13.664Z · LW(p) · GW(p)

I mean, as written, I'd want to avoid cases like 10% growth on paper while recovering from a pandemic that produced 0% growth the previous year.

The simplest way of doing this is probably to bet on whether there will be a yearly GWP/GDP that exceeds 110% of every previous year. For example, the sequence [1, 0.9, 1.05] would not count, even though the last jump represented 16.7% growth.

Replies from: ESRogs
comment by ESRogs · 2021-11-23T01:38:54.271Z · LW(p) · GW(p)

bet on whether there will be a yearly GWP/GDP that exceeds 110% of a previous year

Did you mean "that exceeds 110% of all previous years"? (To exclude steady growth that eventually goes over 110% in aggregate, like [1.0, 1.03, 1.06, 1.09, 1.12].)

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2021-11-23T01:47:55.725Z · LW(p) · GW(p)

Yes, this is what I meant. I edited my original comment to correct my mistake.
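For concreteness, here is a minimal sketch (my own illustration, not either commenter's) of the settled criterion: some year's GWP must exceed 110% of every previous year's, so steady compounding that only beats 110% in aggregate does not count. The function name and index values are hypothetical.

```python
# Minimal sketch of the settled bet criterion: some year's GWP must exceed
# 110% of the maximum of all previous years' GWP.
from typing import List

def has_110_percent_year(gwp_by_year: List[float]) -> bool:
    return any(
        gwp_by_year[i] > 1.10 * max(gwp_by_year[:i])
        for i in range(1, len(gwp_by_year))
    )

print(has_110_percent_year([1.0, 0.9, 1.05]))               # False: 16.7% rebound, but never >110% of the earlier 1.0 peak
print(has_110_percent_year([1.0, 1.03, 1.06, 1.09, 1.12]))  # False: steady growth, each year <110% of the max so far
print(has_110_percent_year([1.0, 1.03, 1.20]))              # True: 1.20 > 1.10 * 1.03
```

Comparing against the running maximum (rather than just the prior year) is what rules out both the pandemic-rebound case and the slow-compounding case.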

comment by Lukas Finnveden (Lanrian) · 2021-11-26T14:14:29.502Z · LW(p) · GW(p)
Oh, come on. That is straight-up not how simple continuous toy models of RSI work. Between a neutron multiplication factor of 0.999 and 1.001 there is a very huge gap in output behavior.

Nitpick: I think that particular analogy isn't great.

For nuclear stuff, we have two state variables: amount of fissile material and current number of neutrons flying around. The amount of fissile material determines the "neutron multiplication factor", but it is the number of neutrons that goes crazy, not the amount of fissile material. And the current number of neutrons doesn't matter for whether the pile will eventually go crazy or not.

But in the simplest toy models of RSI, we just have one variable: intelligence. We can't change the "intelligence multiplication factor"; there's just intelligence figuring out how to build more intelligence.

Maybe an exothermic chemical reaction, like fire, is a better analogy. Either you have enough heat to create a self-sustaining reaction, or you don't.
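For concreteness, here is a minimal numerical sketch of the two toy models being contrasted; the functional forms, constants, and function names are my own illustrative choices, not anything from the dialogue.

```python
# Minimal sketch of the contrast above: in the nuclear toy model a fixed
# multiplication factor k (set by the amount of fissile material) acts on the
# neutron count; in the one-variable RSI toy model, intelligence is both the
# thing that grows and the thing that sets its own growth rate.

def neutron_toy_model(k: float, n0: float = 1.0, steps: int = 50_000) -> float:
    """Fixed multiplier: n <- k * n each step. k < 1 dies out, k > 1 blows up."""
    n = n0
    for _ in range(steps):
        n *= k
    return n

def rsi_toy_model(i0: float, c: float = 0.05, steps: int = 20) -> float:
    """One state variable: each step, intelligence improves itself in proportion
    to how capable it already is (i <- i + c * i**2), so the effective
    'multiplication factor' 1 + c*i rises as i rises."""
    i = i0
    for _ in range(steps):
        i += c * i * i
    return i

print(neutron_toy_model(0.999), neutron_toy_model(1.001))  # ~2e-22 vs ~5e+21: huge eventual gap across the k=1 threshold
print(rsi_toy_model(1.0), rsi_toy_model(2.0))              # a modest head start compounds dramatically
```

The point of the nitpick survives in the sketch: the nuclear model has a knob (k) separate from the thing that explodes, while the one-variable RSI model has no such knob, which is why a fire-like ignition threshold may be the closer analogy.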

comment by TekhneMakre · 2021-11-28T12:53:16.092Z · LW(p) · GW(p)
AlphaFold 2 coming out of Deepmind and shocking the heck out of everyone in the field of protein folding with performance far better than they expected even after the previous shock of AlphaFold, by combining many pieces that I suppose you could find precedents for scattered around the AI field, but with those many secret sauces all combined in one place by the meta-secret-sauce of "Deepmind alone actually knows how to combine that stuff and build things that complicated without a prior example"?

Hm. I wonder if there's a bet to be extracted from this. Like: Eliezer says that AlphaFold 2 beats [algorithms previous to AlphaFold 2, but with 10x compute], and Paul says the latter beats the former? Or replace AlphaFold 2 with anything that Eliezer thinks contains some amount of secret sauce over previous things (whether or not its performance is "on trend").

comment by Morpheus · 2021-11-26T21:10:21.959Z · LW(p) · GW(p)

[Yudkowsky][23:25]

there's a lot of noise in a 2-stock prediction.

[Christiano][23:25]

I mean, it's a 1-stock prediction about nvidia

I didn't get that part and thought others might not have either. First I thought "2-stock"/"1-stock" was some jargon I didn't know, related to shorting stocks. But as far as I can tell, this simply means that Yudkowsky expected that Christiano had invested in both Nvidia and (more heavily) TSMC, whereas Christiano had just invested in TSMC.

comment by Nisan · 2021-11-23T21:45:02.283Z · LW(p) · GW(p)

"Takeoff Speeds" has become kinda "required reading" in discussions on takeoff speeds. It seems like Eliezer hadn't read it until September of this year? He may have other "required reading" from the past four years to catch up on.

(Of course, if one predictably won't learn anything from an article, there's not much point in reading it.)

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-23T23:34:27.863Z · LW(p) · GW(p)

I read "Takeoff Speeds" at the time.  I did not liveblog my reaction to it at the time.  I've read the first two other items.

I flag your weirdly uncharitable inference.

Replies from: Nisan, johnswentworth
comment by Nisan · 2021-11-24T01:11:03.255Z · LW(p) · GW(p)

I apologize, I shouldn't have leapt to that conclusion.

Replies from: Eliezer_Yudkowsky
comment by johnswentworth · 2021-11-24T02:02:39.614Z · LW(p) · GW(p)

FWIW, I did not find this weirdly uncharitable, only mildly uncharitable. I have extremely wide error bars on what you have and have not read, and "Eliezer has not read any of the things on that list" was within those error bars. It is really quite difficult to guess your epistemic state w.r.t. specific work when you haven't been writing about it for a while.

(Though I guess you might have been writing about it on Twitter? I have no idea, I generally do not use Twitter myself, so I might have just completely missed anything there.)

Replies from: Eliezer_Yudkowsky, RobbBB
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-11-24T23:04:58.769Z · LW(p) · GW(p)

The "weirdly uncharitable" part is saying that it "seemed like" I hadn't read it vs. asking.  Uncertainty is one thing, leaping to the wrong guess another.

comment by Rob Bensinger (RobbBB) · 2021-11-24T02:50:29.169Z · LW(p) · GW(p)

Yeah, even I wasn't sure you'd read those three things, Eliezer, though I knew you'd at least glanced over 'Takeoff Speeds' and 'Biological Anchors' enough to form opinions when they came out. :)

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2021-11-24T02:51:12.801Z · LW(p) · GW(p)

(... Admittedly, you read fast enough that my 'skimming' is your 'reading'. 😶)