Updating my AI timelines

post by Matthew Barnett (matthew-barnett) · 2022-12-05T20:46:28.161Z · LW · GW · 50 comments

Earlier this year I offered to bet people who had short AI timelines [LW · GW].

While it was never my intention to be known as "a long AI timelines guy", I began to feel that this was how people perceived me. Nonetheless, in the last few months, I've modified my views substantially. I therefore offer this short post, which will hopefully make my current position clearer.

There are several reasons for my update towards shorter AI timelines, each of them relatively straightforward. In the spirit of writing something short rather than nothing at all, my explanations here will be brief, although I may be willing to elaborate in a comment below.

In order, these reasons include, but are not limited to:

  1. I became convinced that the barriers to language models adopting human-level reasoning were much weaker than I had believed. Previously, I had imagined that it would be difficult to get a language model to perform reliable reasoning over long sequences, in which each step in the sequence requires making a non-trivial inference, and one mistake in understanding the sequence can make the difference between a coherent and incoherent response.

    Yet, my personal experience with language models, including but not limited to ChatGPT, has persuaded me that this type of problem is not a strong barrier, and is closer in difficulty to other challenges like "summarizing a document" or "understanding what's going on in a plot", which I had already thought language models were making good progress on. In hindsight, I should perhaps have trusted the model I had constructed myself [LW · GW], which forecasted human-level language models by 2030. Note: I don't think this update reflects major new capabilities found in GPT-3.5, but rather my own prior state of ignorance.
  2. I built a TAI timelines model [LW · GW], and after fitting the model, it came out with a median timeline of 2037. While I don't put a high degree of confidence in my model, or the parameters that I used, I believe it's still more reliable than my own intuition, which suggested much later dates were more plausible.
  3. I reflected more on the possibility that short-term AI progress will accelerate AI progress [LW · GW].
  4. I noticed that I had been underestimating the returns to scaling, and the possibility of large companies scaling their training budgets quickly to the $10B-$100B level. I am still unsure that this will happen within the next 10 years, but it no longer seems like something I should dismiss.
  5. I saw almost everyone else updating towards shorter timelines, except for people who already had 5-15 year timelines, and a few other people like Robin Hanson. Even after adjusting for the bandwagon effect, I think it's now appropriate to update substantially as well.

I still feel like my arguments for expecting delays from regulation [LW(p) · GW(p)] are being underrated. Yet, after reflection, I've become less confident about how long we should expect these delays to last. Instead of imagining a 20-year delay, a 3- to 10-year delay from regulation now seems more reasonable to me.

If you want me to get specific, my unconditional median TAI timeline is now something like 2047, with a mode around 2035, defined by the first year we get >30% yearly GWP growth as measured from a prior peak, or an event of comparable significance. Note that I think AI can likely be highly competent, general, and dangerous well before it has the potential to accelerate GWP growth to >30%, meaning that my AGI timelines may be quite a lot shorter than this, depending on one's definition of AGI.
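For concreteness, the growth criterion could be checked against an annual GWP series with something like the following sketch (the function and the example numbers are purely illustrative, not part of my model):

```python
def first_tai_year(gwp_by_year, threshold=0.30):
    """Return the first year whose GWP exceeds the prior peak by more
    than `threshold` (i.e. >30% growth measured from a prior peak).
    Purely illustrative; the real operationalization may differ."""
    years = sorted(gwp_by_year)
    prior_peak = gwp_by_year[years[0]]
    for year in years[1:]:
        if gwp_by_year[year] > prior_peak * (1 + threshold):
            return year
        prior_peak = max(prior_peak, gwp_by_year[year])
    return None  # threshold never crossed in the series

# Made-up GWP figures (trillions of dollars):
print(first_tai_year({2045: 200, 2046: 220, 2047: 290}))  # -> 2047
```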

Overall, this timeline may still appear too long to many people, yet my explanation is that it's what I get when I account for potential coordinated delays, unrelated catastrophes, and a 15% chance that we're fundamentally wrong about all of this stuff. Conditional on nothing like that happening, I'd be inclined to weakly bet on TAI before 2039.

50 comments

Comments sorted by top scores.

comment by leogao · 2022-12-06T17:52:55.147Z · LW(p) · GW(p)

I expect delays from regulation to not substantially affect the time at which AI can cause an x-risk, whereas they do substantially affect when TAI is deployed broadly. I think it's plausible that at the time AI x-risk happens, even in "slower" takeoffs, most of the economy is still not automated, even if contemporary AI could in theory automate it.

Replies from: habryka4
comment by habryka (habryka4) · 2022-12-07T01:27:35.608Z · LW(p) · GW(p)

Huh, this seems surprising to me. I feel like regulation to (for example) tax GPUs would have a pretty straightforward effect on prolonging timelines.

Replies from: leogao, matthew-barnett, gerald-monroe, CarlShulman
comment by leogao · 2022-12-07T17:49:58.527Z · LW(p) · GW(p)

I meant specifically regulations preventing TAI from being broadly deployed and replacing jobs. Regulation slowing down the development of x-risk-level AI would in fact slow down x-risk, but I expect that by default this is much harder to make happen.

comment by Matthew Barnett (matthew-barnett) · 2022-12-07T02:26:28.786Z · LW(p) · GW(p)

Agreed. Taxing or imposing limits on GPU production and usage is also the main route through which I imagine we might regulate AI.

Replies from: CarlShulman
comment by CarlShulman · 2023-03-14T03:17:18.431Z · LW(p) · GW(p)

What level of taxation do you think would delay timelines by even one year?

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2023-03-14T05:26:43.505Z · LW(p) · GW(p)

I'm not sure. It depends greatly on the rate of general algorithmic progress, which I think is unknown at this time. I think it is not implausible (>10% chance) that we will see draconian controls that limit GPU production and usage, decreasing effective compute available to the largest actors by more than 99% from the trajectory under laissez faire. Such controls would be unprecedented in human history, but justified on the merits, if AI is both transformative and highly dangerous. 

It should be noted that, to the extent that more hardware allows for more algorithmic experimentation, such controls would also slow down algorithmic progress.

comment by Gerald Monroe (gerald-monroe) · 2023-03-16T18:50:02.412Z · LW(p) · GW(p)

A GPU tax would not apply in countries that don't implement it. It would suddenly give competing companies able to design a training and inference accelerator, which is much simpler than a GPU, a large competitive advantage: design an accelerator and sell it in untaxed countries. See China and Bitmain.

Bitmain is a chip designer that makes very high-performance Bitcoin and Ethereum mining accelerators, and has moved into AI. The tasks are similar.

comment by CarlShulman · 2023-03-14T03:16:44.776Z · LW(p) · GW(p)

With effective compute for AI doubling more than once per year, a global 100% surtax on GPUs and AI ASICs seems like it would be a difference of only months to AGI timelines.
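Roughly, the arithmetic looks like this (illustrative numbers; assumes compute purchased scales inversely with price):

```python
import math

# If a tax multiplies hardware prices by `price_multiplier`, a fixed budget
# buys that many times less compute, and the lost ground is regained after
# log2(price_multiplier) doublings of effective compute.
def delay_years(price_multiplier, doubling_time_years):
    return math.log2(price_multiplier) * doubling_time_years

# 100% surtax = prices double; effective compute doubling every ~0.7 years:
print(delay_years(2.0, 0.7))    # ~0.7 years of delay, i.e. months
print(delay_years(100.0, 0.7))  # even a 100x cost increase: ~4.7 years
```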

Replies from: matthew-barnett, Lukas_Gloor, habryka4
comment by Matthew Barnett (matthew-barnett) · 2023-03-14T04:48:40.552Z · LW(p) · GW(p)

What is your source for the claim that effective compute for AI is doubling more than once per year? And do you mean effective compute in the largest training runs, or effective compute available in the world more generally?

comment by Lukas_Gloor · 2023-03-29T16:36:42.908Z · LW(p) · GW(p)

"Effective compute" is the combination of hardware growth and algorithmic progress? If those are multiplicative rather than additive, slowing one of the factors may only accomplish little on its own, but maybe it could pave the way for more significant changes when you slow both at the same time? 

Unfortunately, it seems hard to significantly slow algorithmic progress. I can think of changes to publishing behaviors (and improving security) and pausing research on scary models (for instance via safety evals). Maybe things like handicapping talent pools via changes to immigration policy, or encouraging capability researchers to do other work. But that's about it. 

Still, combining different measures could be promising if the effects are multiplicative rather than additive. 

Edit: Ah, but I guess your point is that even a 100% tax on compute wouldn't really change the slope of the compute growth curve – it would only move the curve rightward and delay a little. So we don't get a multiplicative effect, unfortunately. We'd need to find an intervention that changes the steepness of the curve.   
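A toy way to see the slope-versus-shift distinction (illustrative growth rates, not a forecast):

```python
import math

# Toy model: effective compute E(t) = E0 * 2**((g_hw + g_algo) * t) / tax
# A one-time tax divides E by a constant, shifting the curve right by
# log2(tax) / (g_hw + g_algo) years without changing its slope.
# Slowing hardware or algorithmic progress changes the slope itself,
# and slowing both compounds because the growth rates add in the exponent.
def rightward_shift(tax_factor, g_hw, g_algo):
    return math.log2(tax_factor) / (g_hw + g_algo)

print(rightward_shift(2.0, 1.0, 0.5))   # 100% tax at 1.5 doublings/yr: ~0.67 yr
print(rightward_shift(2.0, 0.5, 0.25))  # halve both growth rates: shift doubles,
                                        # but the real win is the flatter slope
```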

comment by habryka (habryka4) · 2023-03-14T04:25:03.029Z · LW(p) · GW(p)

If the explicit goal of the regulation is to delay AI capabilities, and it's implemented via taxes, it seems like one could figure out something that delays things for longer. Also, a few months still seems quite helpful, and would count as "substantially" in my mind.

comment by Zach Stein-Perlman · 2022-12-06T02:31:54.699Z · LW(p) · GW(p)

I built an (unpublished) TAI timelines model

I'd be excited to see this if it's substantially different from existing published models. (Edit: yay, it's https://www.lesswrong.com/posts/4ufbirCCLsFiscWuY/a-proposed-method-for-forecasting-ai [LW · GW])

I account for potential coordinated delays, catastrophes, and a 15% chance that we're fundamentally wrong about all of this stuff.

+1 to noting this explicitly; everyone should distinguish between their models conditional on no major disruptions and their unconditional models.

comment by Douglas_Knight · 2022-12-06T21:51:54.351Z · LW(p) · GW(p)

What is the role of ChatGPT? Do you see it as progress over GPT-3, or is it just a tool for discovering capabilities that were already available in GPT-3 to good prompt engineers? I see it as the latter, and I'm confused by the large number of people who seem to be impressed by it as progress. But in your previous post, you mentioned our ignorance of GPT-3, so you seemed to already have large error bars. Is the importance that Chat is revealing those abilities and narrowing the ignorance?

Replies from: matthew-barnett, ViktoriaMalyasova
comment by Matthew Barnett (matthew-barnett) · 2022-12-07T01:07:44.737Z · LW(p) · GW(p)

What is the role of ChatGPT? Do you see it as progress over GPT-3, or is it just a tool for discovering capabilities that were already available in GPT-3 to good prompt engineers? [...] Is the importance that Chat is revealing those abilities and narrowing the ignorance?

Yes, it revealed to me that GPT-3 was stronger than I had thought. I played with GPT-3 prior to ChatGPT, but it seems I was never very good at finding a good prompt. For example, I had tried to make it produce dialogue, in a manner similar to ChatGPT's, but its replies were often surprisingly incoherent. On top of that, it would often produce boilerplate replies in the dialogue that were quite superficial, almost like the much worse BlenderBot from Meta.

After playing with ChatGPT, however, and after seeing many impressive results on Twitter, I realized that the model's fundamental capabilities were solidly on the right end of the distribution of what I had previously believed. I truly underestimated the power of getting the right prompt, or of fine-tuning. It was a stronger update than almost anything else I have seen from any language model.

Replies from: Vladimir_Nesov, david-johnston, janus
comment by Vladimir_Nesov · 2022-12-07T04:11:23.078Z · LW(p) · GW(p)

What I get from essentially the same observations of ChatGPT is an increase in AI risk without a shortening of timelines, which for me already had a median at 2032-2042. My model is that there is a single missing piece to the puzzle (of AGI, not alignment): generation of datasets for SSL (and then an IDA loop does the rest). This covers a current bottleneck [LW · GW], but also feels like a natural way of fixing the robustness woes.

Before ChatGPT, I expected that GPT-n was insufficiently coherent to be set up directly in something like HCH bureaucracies, and that fine-tuned versions tend to lose their map of the world [LW · GW], so that what they generate can no longer be straightforwardly reframed as an improvement over (amplification of) what the non-fine-tuned SSL phase trained on. This is good, because I expect a more principled method of filling the gaps in the datasets for SSL to be the sort of reflection (in the usual human sense) that boosts natural abstraction, makes learning less lazy [LW · GW], and promotes easier alignment. If straightforward bureaucracies of GPT-n can't implement reflection, that is a motivation to figure out how to do this better.

But now I'm more worried that GPT-n with some fine-tuning and longer-term memory for model instances could be sufficiently close to human level to do reflection/generation directly, without a better algorithm. And that's an alignment hazard, unless there is a stronger resolve to only use this for strawberry alignment tasks not too far away from human level of capability, which I'm not seeing at all.

comment by David Johnston (david-johnston) · 2022-12-07T04:59:10.532Z · LW(p) · GW(p)

FWIW they call ChatGPT "GPT-3.5", but text-davinci-002 was also in this series

comment by janus · 2022-12-07T01:51:47.403Z · LW(p) · GW(p)

Which model were you playing with before (davinci/text-davinci-002/code-davinci-002)?

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2022-12-07T02:29:02.049Z · LW(p) · GW(p)

I played with davinci, text-davinci-002, and text-davinci-003, if I recall correctly. The last model had only been out for a few days at most, however, before ChatGPT was released.

Of course, I didn't play with any of these models in enough detail to become an expert prompt engineer. I mean, otherwise I would have made the update sooner.

comment by ViktoriaMalyasova · 2022-12-09T17:47:18.037Z · LW(p) · GW(p)

I played around with text-davinci-002, trying to get it to do causal reasoning. I found it couldn't solve overdetermined cause problems, such as this one:

"Pete threw a rock at a window. Steve also threw a rock at the same window at the same time. The window shattered. If Pete had not thrown the rock, would the window shatter? Let us think step by step."

It gave nonsense answers half the time. ChatGPT handles this kind of question fine. So I was under the impression that its capabilities had improved.

Replies from: Douglas_Knight, lahwran
comment by Douglas_Knight · 2022-12-09T22:12:09.186Z · LW(p) · GW(p)

Could you give an example of this nonsense?

Replies from: ViktoriaMalyasova
comment by ViktoriaMalyasova · 2022-12-10T02:49:36.045Z · LW(p) · GW(p)

Prompt
"Question.
Bob the policeman was running after a thief. Bob ran very slowly because he was not fit. The thief was running very fast. When Bob was crossing a railroad, he slipped and fell. The road was slippery because Ann spilled some oil there. The thief got away. If Ann had not spilled oil, would the thief had gotten away? Explain your answer.

Let us think."

 Reply: "If Ann had not spilled oil, would the thief had gotten away?

The answer is no. If there was no oil, Bob would not have slipped, and would have been able to catch the thief."

Prompt
"Pete threw a rock at a window. Steve also threw a rock at the same window. The window shattered. If Pete did not throw the rock, but Steve still threw the rock, would the window shatter?
Let us think logically."
Reply:
"If Pete did not throw the rock, then the rock would not have hit the window and the window would not have shattered."

Replies from: Douglas_Knight
comment by Douglas_Knight · 2022-12-10T05:03:58.984Z · LW(p) · GW(p)

Thanks!

comment by the gears to ascension (lahwran) · 2022-12-09T23:19:47.891Z · LW(p) · GW(p)

how does -003 compare?

Replies from: Douglas_Knight
comment by Douglas_Knight · 2023-03-26T03:04:18.036Z · LW(p) · GW(p)

Using nat.dev, I find that 002, 003, and Turbo all get the same result: wrong on the first and right on the second. This is an example of Turbo being inferior to Chat. I also tried Cohere, which got both. I also tried Claude. Full v1.2 got both wrong. Instant 1.0, which should be inferior, got the second correct. It also produced a wordy answer to the first, which I give half credit because it said that it was difficult but possible for the slow policeman to catch the fast thief. I only tried each twice, with and without "Let us think," which made no difference to the first. I almost didn't bother adding it to the second since they did so well without it. Adding it made 002 and Claude-instant fail, but Claude 1.2 succeed. (I also tried LLaMA and Alpaca, but they timed out.)

comment by Vladimir_Nesov · 2022-12-06T06:35:43.798Z · LW(p) · GW(p)

This sounds about right. And that's a lot of weight on simulacra [LW · GW] carrying out alignment, as opposed to perfect imitations [LW · GW] of specific humans (uploads) doing that.

If simulacra are used for strawberry alignment [LW · GW] of uploading, that already requires natural abstraction [? · GW] to work for avoiding weird side effects or unbounded optimization while performing a task, itself a major miracle. But if they are used for anything else, that requires natural abstraction to work for values [LW · GW] in order for things to go well, an even less plausible miracle.

comment by Tomás B. (Bjartur Tómas) · 2022-12-06T19:56:24.663Z · LW(p) · GW(p)

You will note that onerous nuclear regulation happened after the bomb was developed. If it had turned out that uranium was ultra cheap to refine, it's not obvious to me that some anarchists would not have blown up some cities before a regulatory apparatus was put in place.

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2022-12-06T19:59:48.134Z · LW(p) · GW(p)

That's true, but I expect many negative effects of AI will be present and will have already happened before the tech is fully developed, unlike nuclear weapons.

Replies from: Bjartur Tómas
comment by Tomás B. (Bjartur Tómas) · 2022-12-06T20:24:09.872Z · LW(p) · GW(p)

Fair enough. I suppose I take RSI more seriously than most people here, so I wonder if there will be much of a fire alarm.

It's terrifying to consider how good language models are at writing code, given that there is still a lot of low-hanging fruit unplucked. Under my model, 2023 is going to be a crazy year; an acquaintance of mine knows some people at OpenAI and he claims they are indeed doing all the obvious things.

I predict that by this date in 2023 your median will be at least 5 years sooner.

Replies from: matthew-barnett, gerald-monroe
comment by Matthew Barnett (matthew-barnett) · 2022-12-06T21:25:20.804Z · LW(p) · GW(p)

I predict that by this date in 2023 your median will be at least 5 years sooner.

That's possible. I'm already trying to "price in" what I expect from GPT-4 (which I expect to be very impressive) into my timeline.

It's perhaps worth re-emphasizing that my median timeline is so far in the future primarily because I'm factoring in delays, and because I set a very high bar: >30% GWP growth has never happened in human history. I think we've seen up to 14% growth in some very fast-growing nations a few times, but that's been localized, and never at the technological frontier.

By these standards, the internet and tech revolution of the 1990s barely mattered. I could definitely see something as large as the rise of the internet happening in the next 10 years. But to meet my high bar, we'll likely need to see something radically changing the way we live our lives (or something that makes us go extinct).

Replies from: Aidan O'Gara
comment by aogara (Aidan O'Gara) · 2022-12-06T22:00:52.461Z · LW(p) · GW(p)

I think it's worth forecasting AI risk timelines instead of GDP timelines, because the former is what we really care about while the latter raises a bunch of economics concerns that don't necessarily change the odds of x-risk. Daniel Kokotajlo made this point [LW · GW] well a few years ago. 

On a separate note, you might be interested in Erik Brynjolfsson's work on the economic impact of AI and other technologies. For example, this paper argues that general purpose technologies have an implementation lag, where many people can see the transformative potential of the technology decades before the economic impact is realized. This would explain the Solow Paradox, named after economist Robert Solow's 1987 remark that "you can see the computer age everywhere but in the productivity statistics." Solow was right that the long-heralded technology had not had significant economic impact at that point in time, but the coming decade would change that narrative, with >4% real GDP growth in the United States driven primarily by IT. I've been taking notes on these and other economics papers relevant to AI timelines forecasting; send me your email if you'd like to check it out. 

Overall I was similarly bearish on short timelines, and have updated this year towards a much higher probability on 5-15 year timelines, while maintaining a long tail especially on the metric of GDP growth. 

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2022-12-07T01:01:27.158Z · LW(p) · GW(p)

I think it's worth forecasting AI risk timelines instead of GDP timelines, because the former is what we really care about while the latter raises a bunch of economics concerns that don't necessarily change the odds of x-risk.

I agree that's probably the more important variable to forecast. On the other hand, if your model of AI is more continuous, you might expect a slow-rolling catastrophe, like a slow takeover of humanity's institutions, making it harder to determine the exact "date" that we lost control. Predicting GDP growth is the easy way out of this problem, though I admit it's not ideal.

On a separate note, you might be interested in Erik Brynjolfsson's work on the economic impact of AI and other technologies. For example, this paper argues that general purpose technologies have an implementation lag, where many people can see the transformative potential of the technology decades before the economic impact is realized.

In fact, I cited this strand of research in my original post [LW · GW] on long timelines. It was one of the main reasons why I had long timelines, and can help explain why it seems I still have somewhat long timelines (a median of 2047) despite having made, in my opinion, a strong update.

comment by Gerald Monroe (gerald-monroe) · 2023-03-16T18:51:47.251Z · LW(p) · GW(p)

It's 2023 and uhh...yeah.

Replies from: Bjartur Tómas
comment by Tomás B. (Bjartur Tómas) · 2023-04-11T16:03:40.177Z · LW(p) · GW(p)

Yeah, unfortunately I think it will still get crazier. 

comment by Vitor · 2022-12-08T13:29:09.412Z · LW(p) · GW(p)

I also tend to find myself arguing against short timelines by default, even though I feel like I take AI safety way more seriously than most people.

At this point, how many people with long timelines are there still around here? I haven't explicitly modeled mine, but it seems clear that they're much, much longer (with significant weight on "never") than the average LessWronger's. The next few years will for sure be interesting as we see the "median LessWrong timeline" clash with reality.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-12-08T17:26:19.961Z · LW(p) · GW(p)

A year and a half ago I wrote this detailed story of how the next five years would go [LW · GW]. Which parts of it do you disagree with?

Replies from: Vitor
comment by Vitor · 2022-12-09T00:46:34.005Z · LW(p) · GW(p)

Sure, let me do this as an exercise (ep stat: babble mode). Your predictions are pretty sane overall, but I'd say you handwave away problems (like integration over a variety of domains, long-term coherent behavior, and so on) that I see as (potentially) hard barriers to progress.

2022

  • 2022 is basically over and I can't get a GPT instance to order me a USB stick online.

2023

  • basically agree, this is where we're at right now (perhaps with the intensity turned down a notch)

2024

  • you're postulating that "It’s easy to make a bureaucracy and fine-tune it and get it to do some pretty impressive stuff, but for most tasks it’s not yet possible to get it to do OK all the time." I have a fundamental disagreement here. I don't think these tools will be effective at doing any task autonomously (fooling other humans doesn't count, neither does forcing humans to only interact with a company through one of these). Currently (2022) chatGPT is arguably useful as a babbling tool, stimulating human creativity and allowing it to make templating easier (this includes things like easy coding tasks). I don't see anything in your post that justifies the implicit jump in capabilities you've snuck in here.

  • broadly agree with your ideas on propaganda, from the production side (i.e. that lots of companies/governments will be doing lots of this stuff). But I think that general attitudes in the population will shift (cynicism etc) and provide some amount of herd immunity. Note that the influence of the woke movement is already fading, shortly after it went truly mainstream and started having visible influence in average people's lives. This is not a coincidence.

2025

  • Doing well at diplomacy is not very related to general reasoning skills. I broadly agree with Zvi's take [LW · GW] and also left some of my thoughts there.

  • I'm very skeptical that bureaucracies will be the way forward. They work for trivial tasks but reliably get lost in the weeds and start talking to themselves in circles for anything requiring a non-trivial amount of context.

  • disagree on orders of magnitude improvements in hardware. You're proposing a 100x decrease in costs compared to 2020, when it's not even clear our civilization is capable of keeping hardware at current levels generally available, let alone cope with a significant increase in demand. Semiconductor production is much more centralized/fragile than people think, so even though billions of these things are produced per year, the efficient market hypothesis does not apply to this domain.

2026

  • Here you're again postulating jumps in capabilities that I don't see justified. You talk about the "general understanding and knowledge of pretrained transformers", when understanding is definitely not there, and knowledge keeps getting corrupted by the AI's tendency to synthesize falsities as confidently as truths. Insofar as the AI can be said to be intelligent at all, it's all symbol manipulation at a high simulacrum level. Integration with real-world tasks keeps mysteriously failing as the AI flounders around in a way that is simultaneously very sophisticated, but oh so very reminiscent of 2022.

  • disagree about your thoughts on propaganda, which is just an obvious extension of my 2024 thoughts above. I also notice that social changes this large take orders of magnitude longer to percolate through society than what you predict, so I disagree with your predictions even conditioned on your views of the raw effectiveness of these systems.

  • "chatbots quickly learn about themselves" etc. Here you're conflating the regurgitation of desirable phrases with actual understanding. I notice that as you write your timeline, your language morphs to make your AIs more and more conscious, but you're not justifying this in any way other than... something something self-referential, something something trained on their own arxiv papers. I don't mean to be overly harsh, but here you seem to be sneaking in the very thing that's under debate!

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-12-09T05:38:03.773Z · LW(p) · GW(p)

Excellent, thanks for this detailed critique! I think this might be the best critique that post has gotten thus far; I'll probably link to it in the future.

Point by point reply, in case you are interested:

2022-2023: Agree. Note that I didn't forecast that an AI could buy you a USB stick by 2022; I said people were dreaming of such things but that they didn't actually work yet.

2024: We definitely have a real disagreement about AI capabilities here; I do expect fine-tuned bureaucracies to be useful for some fairly autonomous things by 2024. (For example, the USB stick thing I expect to work fine by 2024). Not just babbling and fooling humans and forcing people to interact with a company through them. 

Re propaganda/persuasion: I am not sure we disagree here, but insofar as we disagree I think you are correct. We agree about what various political actors will be doing with their models--propaganda, censorship, etc. We disagree about how big an effect this will have on the populace. Or at least, 2021-me disagrees with 2022-you. I think 2022-me has probably come around to your position as well; like you say, it just takes time for these sorts of things to influence the public + there'll probably be a backlash / immunity effect. Idk.

2025: I admit I overestimated how hard diplomacy would turn out to be. In my defense, Cicero only won because the humans didn't know they were up against a bot. Moreover it's a hyper-specialized architecture trained extensively on Diplomacy, so it indeed doesn't have general reasoning skills at all.

We continue to disagree about the potential effectiveness of fine-tuned bureaucracies. To be clear I'm not confident, but it's my median prediction.

I projected a 10x decrease in hardware costs, and also a 10x improvement in algorithms/software, from 2020 to 2025. I stand by that prediction.

2026: 

We disagree about whether understanding is (or will be) there. I think yes, you think no. I don't think that these AIs will be "merely symbol manipulators" etc. I don't think the data-poisoning effect will be strong enough to prevent this.

As mentioned above, I do take the point that society takes a long time to change and probably I shouldn't expect the propaganda etc. to make that much of a difference in just a few years. Idk.

I'm not conflating those things, I know they are different. I am and was asserting that the chatbots would actually have understanding, at least in all the behaviorally relevant senses (though I'd argue also in the philosophical senses as well). You are correct that I didn't argue for this in the text -- but that wasn't the point of the text, the text was stating my predictions, not attempting to argue for them.


ETA: I almost forgot, it sounds like you mostly agree with my predictions, but think AGI still won't be nigh even in my 2026 world? Or do you instead think that the various capabilities demonstrated in the story won't occur in real life by 2026? This is important because if 2026 comes around and things look more or less like I said they would, I will be saying that AGI is very near. Your original claim was that in the next few years the median LW timeline would start visibly clashing with reality; so you must think that things in real-life 2026 won't look very much like my story at all. I'm guessing the main way it'll be visibly different, according to you, is that AI still won't be able to do autonomous things like go buy USB sticks? Also they won't have true understanding -- but what will that look like? Anything else?

Replies from: Vitor
comment by Vitor · 2022-12-09T23:12:40.302Z · LW(p) · GW(p)

I do roughly agree with your predictions, except that I rate the economic impact in general to be lower. Many headlines, much handwringing, but large changes won't materialize in a way that matters.

To put my main objection succinctly, I simply don't see why AGI would follow soon from your 2026 world. Can you walk me through it?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-12-09T23:30:40.876Z · LW(p) · GW(p)

OK, well, you should retract your claim that the median LW timeline will soon start to clash with reality then! It sounds like you think reality will look basically as I predicted! (I can't speak for all of LW of course but I actually have shorter timelines than the median LWer, I think.)

Re AGI happening in 2027 in my world: Yep good question. I wish I had had the nerve to publish my 2027 story. A thorough answer to your question will take hours (days?) to write, and so I beg pardon for instead giving this hasty and incomplete answer:

--For R&D, when I break down the process that happens at AI labs, the process that produces a steady stream of better algorithms, it sure seems like there are large chunks of that loop that can be automated by the kinds of coding-and-research-assistant-bots that exist by 2026 in my story. Plus a few wild cards besides, that could accelerate R&D still further. I actually think completely automating the process is likely, but even if that doesn't happen, a substantial speedup would be enough to reach the next tier of improvements which would then get us to the tier after that etc.
--For takeover, the story is similar. I think about what sorts of skills/abilities an AI would need to take over the world, e.g. it would need to be APS-AI as defined in the Carlsmith report on existential risk from power-seeking AI. Then I think about whether the chatbots of 2026 will have all of those skills, and it seems like the answer is yes.
--Separately, I struggle to think of any important skill/ability that isn't likely to happen by 2026 in this story. Long-horizon agency? True understanding? General reasoning ability? The strongest candidate is ability to control robots in messy real-world environments, but alas that's not a blocker, even if AIs can't do that, they can still accelerate R&D and take over the world.

What do you think the blockers are -- the important skills/abilities that no AI will have by 2026?

Replies from: Vitor, daniel-kokotajlo, nathan-helm-burger
comment by Vitor · 2022-12-13T01:10:59.500Z · LW(p) · GW(p)

OK, well, you should retract your claim that the median LW timeline will soon start to clash with reality then! It sounds like you think reality will look basically as I predicted! (I can't speak for all of LW of course but I actually have shorter timelines than the median LWer, I think.)

I retract the claim in the sense that it was a vague statement that I didn't expect to be taken literally, which I should have made clearer! But it's you who operationalized "a few years" as 2026 and "the median LessWrong view" as your view.

Anyway, I think I see the outline of our disagreement now, but it's still kind of hard to pin down.

First, I don't think that AIs will be put to unsupervised use in any domain where correctness matters, i.e., given fully automated access to valuable resources, like money or compute infrastructure. The algorithms that currently do this have a very constrained set of actions they can take (e.g. an AI chooses an ad to show out of a database of possible ads), and this will remain so.

Second, perhaps I didn't make clear enough that I think all of the applications will remain in this twilight of almost working, showing some promise, etc, but not actually deployed (that's what I meant by the economic impact remaining small). So, more thinkpieces about what could happen (with isolated, splashy examples), rather than things actually happening.

Third, I don't think AIs will be capable of performing tasks that require long attention spans, or that trade off multiple complicated objectives against each other. With current technology, I see AIs constrained to be used for short, self-contained tasks only, with a separate session for each task.

Does that make the disagreement clearer?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-12-13T04:22:34.082Z · LW(p) · GW(p)

I stand by my decision to operationalize "a few years" as 2026, and I stand by my decision to use my view as a proxy for the median LW view: since you were claiming that the median LW view was too short-timelinesy, and would soon clash with reality, and I have even shorter timelines than the median LW view and yet (you backtrack-claim) my view won't soon clash with reality.

Thank you for the clarification of your predictions! It definitely helps, but unfortunately I predict that goalpost-moving will still be a problem. What counts as "domain where correctness matters?" What counts as "very constrained set of actions?" Would e.g. a language-model-based assistant that can browse the internet and buy things for you on Amazon (with your permission of course) be in line with what you expect, or violate your expectations?

What about the applications that I discuss in the story, e.g. the aforementioned smart buyer assistant, the video-game-companion-chatbot, etc.? Do they not count as fully working? Are you predicting that there'll be prototypes but no such chatbot with more than, say, 100,000 daily paying users?

(Also, what about Copilot? Isn't it already an example of an application that genuinely works, and isn't just in the twilight zone?)

What counts as a long attention span? 1000 forward passes? A million? What counts as trading off multiple complicated objectives against each other, and why doesn't ChatGPT already qualify?

Replies from: Vitor
comment by Vitor · 2022-12-13T15:20:51.398Z · LW(p) · GW(p)

Mmm, I would say the general shape of your view won't clash with reality, but the magnitude of the impact will.

It's plausible to me that a smart buyer will go and find the best deal for you when you tell it to buy laptop model X. It's not plausible to me that you'll be able to instruct it "buy an updated laptop for me whenever a new model comes out that is good value and sufficiently better than what I already have," and then let it do its thing completely unsupervised (with direct access to your bank account). That's what I mean by multiple complicated objectives.

What counts as "domain where correctness matters?" What counts as "very constrained set of actions?" Would e.g. a language-model-based assistant that can browse the internet and buy things for you on Amazon (with your permission of course) be in line with what you expect, or violate your expectations?

Something that goes beyond current widespread use of AI such as spam-filtering. Spam-filtering (or selecting ads on facebook, or flagging hate speech etc) is a domain where the AI is doing a huge number of identical tasks, and a certain % of wrong decisions is acceptable. One wrong decision won't tank the business. Each copy of the task is done in an independent session (no memory).

An example application where that doesn't hold is putting the AI in charge of ordering all the material inputs for your factory. Here, a single stupid mistake (didn't buy something because the price will go down in the future, replaced one product with another, misinterpret seasonal cycles) will lead to a catastrophic stop of the entire operation.

(Also, what about Copilot? Isn't it already an example of an application that genuinely works, and isn't just in the twilight zone?)

Copilot is not autonomous. There's a human tightly integrated into everything it's doing. The jury is still out on whether it works, i.e., do we have anything more than some programmers' self-reports to substantiate that it increases productivity? Even if it does work, it's just a productivity tool for humans, not something that replaces humans at their tasks directly.

Replies from: gwern
comment by gwern · 2022-12-13T17:33:34.539Z · LW(p) · GW(p)

Copilot is not autonomous.

A distinction which makes no difference. Copilot-like models are already being used in autonomous code-writing ways, such as AlphaCode which executes generated code to check against test cases, or evolving code, or LaMDA calling out to a calculator to run expressions, or ChatGPT writing and then 'executing' its own code (or writing code like SVG which can be interpreted by the browser as an image), or Adept running large Transformers which generate & execute code in response to user commands, or the dozens of people hooking up the OA API to a shell, or... Tool AIs want to be agent AIs.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-12-09T23:39:29.617Z · LW(p) · GW(p)

(Oh, also: When I wrote the 2026 story, I did it using my timelines which were something like median 2029. And I had trends to extrapolate, underlying models, etc. And also: Bio-anchors style models, when corrected to have better settings of the various inputs, yield something like 2029 median also. In fact that's why my median was what it was. So I'd say that multiple lines of evidence are converging.)

comment by Nathan Helm-Burger (nathan-helm-burger) · 2022-12-12T02:10:43.583Z · LW(p) · GW(p)

My expectations are more focused around the parallel paths of Reflective General Reasoning and Recursive Self-Improvement. I think that both of these paths have thresholds beyond which there is a mode shift to a much faster (and accelerating) development pace, and that we are pretty close to both of these thresholds.

comment by WilliamKiely · 2022-12-08T00:16:33.588Z · LW(p) · GW(p)

my unconditional median TAI timeline is now something like 2047, with a mode around 2035, defined by the first year we get >30% yearly GWP growth as measured from a prior peak, or an event of comparable significance.

Given it's about to be 2023, this means your mode is 12 years away and your median is 24 years away. I'd expect your mode to be nearer than your median, but probably not that much nearer.

I haven't forecasted when we might get >30% yearly GWP growth or an event of comparable significance (e.g. x-risk) specifically, but naively I'd guess that (for example) 2040 is more likely than 2035 to be the first year in which there is >30% annual GWP growth (or x-risk).

Replies from: matthew-barnett
comment by Matthew Barnett (matthew-barnett) · 2022-12-08T00:24:52.328Z · LW(p) · GW(p)

These numbers were based on the TAI timelines model I built, which produced a highly skewed distribution. I also added several years to the timeline due to anticipated delays and unrelated catastrophes, and some chance that the model is totally wrong. My inside view prediction given no delays is more like a median of 2037 with a mode of 2029.

I agree it appears the mode is much too near, but I encourage you to build a model yourself. I think you might be surprised at how much sooner the mode can be compared to the median.
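As a toy illustration of how strongly a right-skewed distribution separates its mode from its median (a lognormal picked to roughly match the numbers above, not the actual fitted model):

```python
import math

# Lognormal "years until TAI" from the end of 2022, chosen so the
# median lands near 2047; not the actual model, just a toy example.
mu, sigma = math.log(24), 0.8

median = math.exp(mu)               # 24.0 years  -> ~2047
mode = math.exp(mu - sigma**2)      # ~12.7 years -> ~2035
mean = math.exp(mu + sigma**2 / 2)  # ~33.0 years -> ~2056

print(f"mode ~{mode:.1f} y, median ~{median:.1f} y, mean ~{mean:.1f} y")
```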

comment by the gears to ascension (lahwran) · 2022-12-08T00:35:21.793Z · LW(p) · GW(p)

TAI by 2028, get your head out of your ass and study capabilities! Don't be wooed by how paralyzed MIRI is, deep learning has not hit a wall!

Replies from: nathan-helm-burger
comment by Nathan Helm-Burger (nathan-helm-burger) · 2022-12-12T02:06:01.970Z · LW(p) · GW(p)

Let's not conflate lack of open publishing with lack of action. After all, isn't one of the major things being discussed and recommended that groups do less open publishing around AI capabilities topics?