snewman

Posts
Comments

Posts

AI 2027 is a Bet Against Amdahl's Law 2025-04-21T03:09:40.751Z

What Indicators Should We Watch to Disambiguate AGI Timelines? 2025-01-06T19:57:43.398Z

Towards Better Milestones for Monitoring AI Capabilities 2023-09-27T21:18:30.966Z

The AI Explosion Might Never Happen 2023-09-19T23:20:25.597Z

Comments

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-22T18:44:51.604Z · LW · GW

Thanks everyone for all the feedback and answers to my unending questions! The branching comments are starting to become too much to handle, so I'm going to take a breather and then write a followup post – hopefully by the end of the week but we'll see – in which I'll share some consolidated thoughts on the new (to me) ideas that surfaced here and also respond to some specific points.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-22T14:54:23.093Z · LW · GW

Thanks.

I'm now very strongly feeling the need to explore the question of what sorts of activities go into creating better models, what sorts of expertise are needed, and how that might change as things move forward. Which unfortunately I know ~nothing about, so I'll have to find some folks who are willing to let me pick their brains...

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-22T02:42:28.318Z · LW · GW

Thanks! I agree that my statements about Amdahl's Law primarily hinge on my misunderstanding of the milestones, as elucidated in the back-and-forth with Ryan. I need to digest that; as Ryan anticipates, possibly I'll wind up with thoughts worth sharing regarding the "human-only, software-only" time estimates, especially for the earlier stages, but it'll take me some time to chew on that.

(As a minor point of feedback, I'd suggest adding a bit of material near the top of the timelines and/or takeoff forecasts, clarifying the range of activities meant to be included in "superhuman coder" and "superhuman AI researcher", e.g. listing some activities that are and are not in scope. I was startled to see Ryan say "my sense is that an SAR has to be better than humans at basically everything except vision"; I would never have guessed that was the intended interpretation.)

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-22T00:25:03.950Z · LW · GW

I've (briefly) addressed the compute bottleneck question on a different comment branch, and "hard-to-automate activities aren't a problem" on another (confusion regarding the definition of various milestones).

[Dependence on Narrow Data Sets] is only applicable to the timeline to the superhuman coder milestone, not to takeoff speeds once we have a superhuman coder. (Or maybe you think a similar argument applies to the time between superhuman coder and SAR.)

I do think it applies, if indirectly. Most data relating to progress in AI capabilities comes from benchmarks of crisply encapsulated tasks. I worry this may skew our collective intuitions regarding progress toward broader capabilities, especially as I haven't seen much attention paid to exploring the delta between things we currently benchmark and "everything".

Hofstadter's Law As Prior
Math: We're talking about speed up relative to what the human researchers would have done by default, so this just divides both sides equally and cancels out.

This feels like one of those "the difference between theory and practice is smaller in theory than in practice" situations... Hofstadter's Law would imply that Hofstadter's Law applies here. :-)

For one concrete example of how that could manifest, perhaps there is a delay between "AI models exist that are superhuman at all activities involved in developing better models" and "those models have been fully adopted across the organization". Interior to a frontier lab, that specific delay might be immaterial, it's just meant as an existence proof that there's room for us to be missing things.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-22T00:14:32.690Z · LW · GW

I think my short, narrowly technical response to this would be "agreed".

Additional thoughts, which I would love your perspective on:

1. I feel like the idea that human activities involved in creating better models are broader than just, like, stereotypical things an ML Ph.D would do, is under-explored. Elsewhere in this thread you say "my sense is that an SAR has to be better than humans at basically everything except vision." There's a lot to unpack there, and I don't think I've seen it discussed anywhere, including in AI 2027. Do stereotypical things an ML Ph.D would do constitute 95% of the work? 50%? Less? Does the rest of the work mostly consist of other sorts of narrowly technical software work (coding, distributed systems design, etc.), or is there broad spillover into other areas of expertise, including non-STEM expertise? What does that look like? Etc.

(I try to make this point a lot, generally don't get much acknowledgement, and as a result have started to feel a bit like a crazy person. I appreciate you giving some validation to the idea. Please let me know if you suspect I've over-interpreted that validation.)

1a. Why "except vision"? Does an SAR have to be superhuman at creative writing, so that it can push forward creative writing capabilities in future models? (Obviously, substitute any number of other expertise domains for "creative writing".) If yes, then why doesn't it also need to be superhuman at vision (so that it can push forward vision capabilities)? If no, then presumably creative writing is one of the exceptions implied by the "basically" qualifier, what else falls in there?

2. "Superhuman AI researcher" feels like a very bad term for a system that is meant to be superhuman at the full range of activities involved in producing better models. It strongly suggests a narrower set of capabilities, thus making it hard to hold onto the idea that a broad definition is intended. Less critically, it also seems worthwhile to better define what is meant to fall within the umbrella of "superhuman coder".

3. As I read through AI 2027 and then wrote my post here, I was confused as to the breadth of skills meant to be implied by "superhuman coder" and (especially) "superhuman AI researcher", and probably did not maintain a consistent definition in my head, which may have confused my thinking.

4. I didn't spend much time evaluating the reasoning behind the estimated speedups at each milestone (5x, 25x, 250x, 2000x). I might have more to say after digging into that. If/when I find the time, that, plus the discussion we've just had here, might be enough grist for a followup post.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-22T00:09:29.951Z · LW · GW

We now have several branches going, I'm going to consolidate most of my response in just one branch since they're converting onto similar questions anyway. Here, I'll just address this:

But, when considering activities that aren't bottlenecked on the environment, then to achieve 10x acceleration you just need 10 more speed at the same level of capability.

I'm imagining that, at some intermediate stages of development, there will be skills for which AI does not even match human capability (for the relevant humans), and its outputs are of unusably low quality.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-21T22:59:56.418Z · LW · GW

This is valid, but doesn't really engage with the specific arguments here. By definition, when we consider the potential for AI to accelerate the path to ASI, we are contemplating the capabilities of something that is not a full ASI. Today's models have extremely jagged capabilities, with lots of holes, and (I would argue) they aren't anywhere near exhibiting sophisticated high-level planning skills able to route around their own limitations. So the question becomes, what is the shape of the curve of AI filling in weak capabilities and/or developing sophisticated strategies for routing around those weaknesses?

Maybe just by training very small models very quickly, they can discover a ton of new technologies which can scale to large models.

This is exactly missing the point. Training a cutting-edge model today involves a broad range of activities, not all of which fall under the heading of "discovering technologies" or "improving algorithms" or whatever. I am arguing that if all you can do is find better algorithms rapidly, that's valuable but it's not going to speed up overall progress by very large factors. Also, it may be that "by training very small models very quickly", the AI would discover new technologies that improve some aspects of models but fail to advance some other important aspects.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-21T18:50:24.677Z · LW · GW

Sure, but for output quality better than what humans could (ever) do to matter for the relative speed up, you have to argue about compute bottlenecks, not Amdahl's law for just the automation itself!

I'm having trouble parsing this sentence... which may not be important – the rest of what you've said seems clear, so unless there's a separate idea here that needs responding to then it's fine.

It sounds like your actual objection is in the human-only, software-only time from superhuman coder to SAR (you think this would take more than 1.5-10 years).
Or perhaps your objection is that you think there will be a smaller AI R&D multiplier for superhuman coders. (But this isn't relevant once you hit full automation!)

Agreed that these two statements do a fairly good job of characterizing my objection. I think the discussion is somewhat confused by the term "AI researcher". Presumably, for an SAR to accelerate R&D by 25x, "AI researcher" needs to cover nearly all human activities that go into AI R&D? And even more so for SAIR/250x. While I've never worked at an AI lab, I presume that the full set of activities involved in producing better models is pretty broad, with tails extending into domains pretty far from the subject matter of an ML Ph.D and sometimes carried out by people whose job titles and career paths bear no resemblance to "AI researcher". Is that a fair statement?

If "producing better models" (AI R&D) requires more than just narrow "AI research" skills, then either SAR and SAIR need to be defined to cover that broader skill set (in which case, yes, I'd argue that 1.5-10 years is unreasonably short for unaccelerated SC->SAR), or if we stick with narrower definitions for SAR and SAIR then, yes, I'd argue for smaller multipliers.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-21T18:30:34.357Z · LW · GW

This is valid for activities which benefit from speed and scale. But when output quality is paramount, speed and scale may not always provide much help?

My mental model is that, for some time to come, there will be activities where AIs simply aren't very competent at all, such that even many copies running at high speed won't provide uplift. For instance, if AIs aren't in general able to make good choices regarding which experiments to run next, then even an army of very fast poor-experiment-choosers might not be worth much, we might still need to rely on people to choose experiments. Or if AIs aren't much good at evaluating strategic business plans, it might be hard to train AIs to be better at running a business (a component of the SAIR -> ASI transition) without relying on human input for that task.

For Amdah's Law purposes, I've been shorthanding "incompetent AIs that don't become useful for a task even when taking speed + scale into account" as "AI doesn't provide uplift for that task".

EDIT: of course, in practice it's generally at least somewhat possible to trade speed+scale for quality, e.g. using consensus algorithms, or generate-and-test if you have a good way of identifying the best output. So a further refinement is to say that very high acceleration requires us to assume that this does not reach importantly diminishing returns in a significant set of activities.

EDIT2:

(My sense is that the progress multipliers in AI 2027 are too high but also that the human-only times between milestones are somewhat too long. On net, this makes me expect somewhat slower takeoff with a substantial chance on much slower takeoff.)

I find this quite plausible.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-21T17:46:07.867Z · LW · GW

Yes, but you're assuming that human-driven AI R&D is very highly bottlenecked on a single, highly serial task, which is simply not the case. (If you disagree: which specific narrow activity are you referring to that constitutes the non-parallelizable bottleneck?)

Amdahl's Law isn't just a bit of math, it's a bit of math coupled with long experience of how complex systems tend to decompose in practice.

Comment by snewman on AI 2027 is a Bet Against Amdahl's Law · 2025-04-21T15:28:37.143Z · LW · GW

That's not how the math works. Suppose there are 200 activities under the heading of "AI R&D" that each comprise at least 0.1% of the workload. Suppose we reach a point where AI is vastly superhuman at 150 of those activities (which would include any activities that humans are particularly bad at), moderately superhuman at 40 more, and not much better than human (or even worse than human) at the remaining 10. Those 10 activities where AI is not providing much uplift comprise at least 1% of the AI R&D workload, and so progress can be accelerated at most 100x.

This is oversimplified; there is some room for superhuman ability (making excellent choices of experiments to run) can compensate for lack of uplift in other areas (time to code and execute individual experiments). But the fundamental point remains: a complex process can be bottlenecked by its slowest step. Amdahl's Law is not symmetric – a chain can't be as strong as its strongest link.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-02-21T17:47:03.928Z · LW · GW

You omitted "with a straight face". I do not believe that the scenario you've described is plausible (in the timeframe where we don't already have ASI by other means, i.e. as a path to ASI rather than a ramification of it).

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-08T18:04:41.164Z · LW · GW

FWIW my vibe is closer to Thane's. Yesterday I commented that this discussion has been raising some topics that seem worthy of a systematic writeup as fodder for further discussion. I think here we've hit on another such topic: enumerating important dimensions of AI capability – such as generation of deep insights, or taking broader context into account – and then kicking off a discussion of the past trajectory / expected future progress on each dimension.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-08T01:03:27.557Z · LW · GW

Just posting to express my appreciation for the rich discussion. I see two broad topics emerging that seem worthy of systematic exploration:

What does a world look like in which AI is accelerating the productivity of a team of knowledge workers by 2x? 10x? 50x? In each scenario, how is the team interacting with the AIs, what capabilities would the AIs need, what strengths would the person need? How do junior and senior team members fit into this transition? For what sorts of work would this work well / poorly?
1. Validate this model against current practice, e.g. the ratio of junior vs. senior staff in effective organizations and how work is distributed across seniority.
How does this play out specifically for AI R&D?
1. Revisiting the questions from item 1.
2. How does increased R&D team productivity affect progress: to what extent is compute a bottleneck, how could the R&D organization adjust activities in response to reduced cost of labor relative to compute, does this open an opportunity to explore more points in the architecture space, etc.

(This is just a very brief sketch of the questions to be explored.)

I'm planning to review the entire discussion here and try to distill it into an early exploration of these questions, which I'll then post, probably later this month.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-07T03:58:57.920Z · LW · GW

better capabilities than average adult human in almost all respects in late 2024

I see people say things like this, but I don't understand it at all. The average adult human can do all sorts of things that current AIs are hopeless at, such as planning a weekend getaway. Have you, literally you personally today, automated 90% of the things you do at your computer? If current AI has better capabilities than the average adult human, shouldn't it be able to do most of what you do? (Setting aside anything where you have special expertise, but we all spend big chunks of our day doing things where we don't have special expertise – replying to routine emails, for instance.)

FWIW, I touched on this in a recent blog post: https://amistrongeryet.substack.com/p/speed-and-distance.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-07T03:34:05.698Z · LW · GW

Thanks for engaging so deeply on this!

AIs don't just substitute for human researchers, they can specialize differently. Suppose (for simplicity) there are 2 roughly equally good lines of research that can substitute (e.g. they create some fungible algorithmic progress) and capability researchers currently do 50% of each. Further, suppose that AIs can 30x accelerate the first line of research, but are worthless for the second. This could yield >10x acceleration via researchers just focusing on the first line of research (depending on how diminishing returns go).

Good point, this would have some impact.

As an intution pump, imagine you had nearly free junior hires who run 10x faster, but also work all hours. Because they are free, you can run tons of copies. I think this could pretty plausibly speed things up by 10x.

Wouldn't you drown in the overhead of generating tasks, evaluating the results, etc.? As a senior dev, I've had plenty of situations where junior devs were very helpful, but I've also had plenty of situations where it was more work for me to manage them than it would have been to do the job myself. These weren't incompetent people, they just didn't understand the situation well enough to make good choices and it wasn't easy to impart that understanding. And I don't think I've ever been sole tech lead for a team that was overall more than, say, 5x more productive than I am on my own – even when many of the people on the team were quite senior themselves. I can't imagine trying to farm out enough work to achieve 10x of my personal productivity. There's only so much you can delegate unless the system you're delegating to has the sort of taste, judgement, and contextual awareness that a junior hire more or less by definition does not. Also you might run into the issue I mentioned where the senior person in the center of all this is no longer getting their hands dirty enough to collect the input needed to drive their high-level intuition and do their high-value senior things.

Hmm, I suppose it's possible that AI R&D has a different flavor than what I'm used to. The software projects I've spent my career on are usually not very experimental in nature; the goal is generally not to learn whether an idea shows promise, it's to design and implement code to implement a feature spec, for integration into the production system. If a junior dev does a so-so job, I have to work with them to bring it up to a higher standard, because we don't want to incur the tech debt of integrating so-so code, we'd be paying for it for years. Maybe that plays out differently in AI R&D?

Incidentally, in this scenario, do you actually get to 10x the productivity of all your staff? Or do you just get to fire your junior staff? Seems like that depends on the distribution of staff levels today and on whether, in this world, junior staff can step up and productively manage AIs themselves.

Suppose a company specifically trained an AI system to be very familiar with its code base and infrastructure and relatively good at doing experiments for it. Then, it seems plausible that (with some misc schlep) the only needed context would be project specific context. ...

These are fascinating questions but beyond what I think I can usefully contribute to in the format of a discussion thread. I might reach out at some point to see whether you're open to discussing further. Ultimately I'm interested in developing a somewhat detailed model, with well-identified variables / assumptions that can be tested against reality.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-07T02:02:36.744Z · LW · GW

I see a bunch of good questions explicitly or implicitly posed here. I'll touch on each one.

1. What level of capabilities would be needed to achieve "AIs that 10x AI R&D labor"? My guess is, pretty high. Obviously you'd need to be able to automate at least 90% of what capabilities researchers do today. But 90% is a lot, you'll be pushing out into the long tail of tasks that require taste, subtle tacit knowledge, etc. I am handicapped here by having absolutely no experience with / exposure to what goes on inside an AI research lab. I have 35 years of experience as a software engineer but precisely zero experience working on AI. So on this question I somewhat defer to folks like you. But I do suspect there is a tendency to underestimate how difficult / annoying these tail effects will be, this is the same fundamental principle as Hofstadter's Law, the Programmer's Credo, etc.

I have a personal suspicion that a surprisingly large fraction of work (possibly but not necessarily limited to "knowledge work") will turn out to be "AGI complete", meaning that it will require something approaching full AGI to undertake it at human level. But I haven't really developed this idea beyond an intuition. It's a crux and I would like to find a way to develop / articulate it further.

2. What does it even mean to accelerate someone's work by 10x? It may be that if your experts are no longer doing any grunt work, they are no longer getting the input they need to do the parts of their job that are hardest to automate and/or where they're really adding magic-sauce value. Or there may be other sources of friction / loss. In some cases it may over time be possible to find adaptations, in other cases it may be a more fundamental issue. (A possible counterbalance: if AIs can become highly superhuman at some aspects of the job, not just in speed/cost but in quality of output, that could compensate for delivering a less-than-10x time speedup on the overall workflow.)

3. If AIs that 10x AI R&D labor are 20% likely to arrive and be adopted by Jan 2027, would that update my view on the possibility of AGI-as-I-defined-it by 2030? It would, because (per the above) I think that delivering that 10x productivity boost would require something pretty close to AGI. In other words, conditional on AI R&D labor being accelerated 10x by Jan 2027, I would expect that we have something close to AGI by Jan 2027, which also implies that we were able to make huge advances in capabilities in 24 months. Whereas I think your model is that we could get that level of productivity boost from something well short of AGI.

If it turns out that we get 10x AI R&D labor by Jan 2027 but the AIs that enabled this are pretty far from AGI... then my world model is very confused and I can't predict how I would update, I'd need to know more about how that worked out. I suppose it would probably push me toward shorter timelines, because it would suggest that "almost all work is easy" and RSI starts to really kick in earlier than my expectation.

4. Is this 10x milestone achievable just by scaling up existing approaches? My intuition is no. I think that milestone requires very capable AI (items 1+2 in this list). And I don't see current approaches delivering much progress on things I think will be needed for such capability, such as long-term memory, continuous learning, ability to "break out of the chatbox" and deal with open-ended information sources and extraneous information, or other factors that I mentioned in the original post.

I am very interested in discussing any or all of these questions further.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-06T22:20:57.260Z · LW · GW

See my response to Daniel (https://www.lesswrong.com/posts/auGYErf5QqiTihTsJ/what-indicators-should-we-watch-to-disambiguate-agi?commentId=WRJMsp2bZCBp5egvr). In brief: I won't defend my vague characterization of "breakthroughs" nor my handwavy estimates of how how many are needed to reach AGI, how often they occur, and how the rate of breakthroughs might evolve. I would love to see someone attempt a more rigorous analysis along these lines (I don't feel particularly qualified to do so). I wouldn't expect that to result in a precise figure for the arrival of AGI, but I would hope for it to add to the conversation.

Comment by snewman on What Indicators Should We Watch to Disambiguate AGI Timelines? · 2025-01-06T22:14:21.310Z · LW · GW

This is my "slow scenario". Not sure whether it's clear that I meant the things I said here to lean pessimistic – I struggled with whether to clutter each scenario with a lot of "might" and "if things go quickly / slowly" and so forth.

In any case, you are absolutely correct that I am handwaving here, independent of whether I am attempting to wave in the general direction of my median prediction or something else. The same is true in other places, for instance when I argue that even in what I am dubbing a "fast scenario" AGI (as defined here) is at least four years away. Perhaps I should have added additional qualifiers in the handful of places where I mention specific calendar timelines.

What I am primarily hoping to contribute is a focus on specific(ish) qualitative changes that (I argue) will need to emerge in AI capabilities along the path to AGI. A lot of the discourse seems to treat capabilities as a scalar, one-dimensional variable, with the implication that we can project timelines by measuring the rate of increase in that variable. At this point I don't think that's the best framing, or at least not the only useful framing.

One hope I have is that others can step in and help construct better-grounded estimates on things I'm gesturing at, such as how many "breakthroughs" (a term I have notably not attempted to define) would be needed to reach AGI and how many we might expect per year. But I'd be satisfied if my only contribution would be that people start talking a bit less about benchmark scores and a bit more about the indicators I list toward the end of the post – or, even better, some improved set of indicators.

Comment by snewman on o3 · 2024-12-31T00:08:23.993Z · LW · GW

Yes, test time compute can be worthwhile to scale. My argument is that it is less worthwhile than scaling training compute. We should expect to see scaling of test time compute, but (I suggest) we shouldn't expect this scaling to go as far as it has for training compute, and we should expect it to be employed sparingly.

The main reason I think this is worth bringing up is that people have been talking about test-time compute as "the new scaling law", with the implication that it will pick up right where scaling of training compute left off, just keep turning the dial and you'll keep getting better results. I think the idea that there is no wall, everything is going to continue just as it was except now the compute scaling happens on the inference side, is exaggerated.

Comment by snewman on o3 · 2024-12-30T16:56:54.807Z · LW · GW

Jumping in late just to say one thing very directly: I believe you are correct to be skeptical of the framing that inference compute introduces a "new scaling law". Yes, we now have two ways of using more compute to get better performance – at training time or at inference time. But (as you're presumably thinking) training compute can be amortized across all occasions when the model is used, while inference compute cannot, which means it won't be worthwhile to go very far down the road of scaling inference compute.

We will continue to increase inference compute, for problems that are difficult enough to call for it, and more so as efficiency gains reduce the cost. But given the log-linear nature of the scaling law, and the inability to amortize, I don't think we'll see the many-order-of-magnitude journey that we've seen for training compute.

As others have said, what we should presumably expect from o4, o5, etc. is that they'll make better use of a given amount of compute (and/or be able to throw compute at a broader range of problems), not that they'll primarily be about pushing farther up that log-linear graph.

Of course in the domain of natural intelligence, it is sometimes worth having a person go off and spend a full day on a problem, or even have a large team spend several years on a high-level problem. In other words, to spend lots of inference-time compute on a single high-level task. I have not tried to wrap my head around how that relates to scaling of inference-time compute. Is the relationship between the performance of a team on a task, and the number of person-days the team has to spend, log-linear???

Comment by snewman on Two easy things that maybe Just Work to improve AI discourse · 2024-06-08T21:42:17.596Z · LW · GW

I'd participate.

Comment by snewman on Two easy things that maybe Just Work to improve AI discourse · 2024-06-08T18:47:31.543Z · LW · GW

I love this. Strong upvoted. I wonder if there's a "silent majority" of folks who would tend to post (and upvote) reasonable things, but don't bother because "everyone knows there's no point in trying to have a civil discussion on Twitter".

Might there be a bit of a collective action problem here? Like, we need a critical mass of reasonable people participating in the discussion so that reasonable participation gets engagement and thus the reasonable people are motivated to continue? I wonder what might be done about that.

Comment by snewman on Transformers Represent Belief State Geometry in their Residual Stream · 2024-05-24T00:19:11.365Z · LW · GW

I think we're saying the same thing? "The LLM being given less information [about the internal state of the actor it is imitating]" and "the LLM needs to maintain a probability distribution over possible internal states of the actor it is imitating" seem pretty equivalent.

Comment by snewman on Transformers Represent Belief State Geometry in their Residual Stream · 2024-05-13T18:05:52.986Z · LW · GW

As I go about my day, I need to maintain a probability distribution over states of the world. If an LLM tries to imitate me (i.e. repeatedly predict my next output token), it needs to maintain a probability distribution, not just over states of the world, but also over my internal state (i.e. the state of the agent whose outputs it is predicting). I don't need to keep track of multiple states that I myself might be in, but the LLM does. Seems like that makes its task more difficult?

Or to put an entirely different frame on the the whole thing: the job of a traditional agent, such as you or me, is to make intelligent decisions. An LLM's job is to make the exact same intelligent decision that a certain specific actor being imitated would make. Seems harder?

Comment by snewman on Transformers Represent Belief State Geometry in their Residual Stream · 2024-05-01T21:54:39.944Z · LW · GW

I am trying to wrap my head around the high-level implications of this statement. I can come up with two interpretations:

What LLMs are doing is similar to what people do as they go about their day. When I walk down the street, I am simultaneously using visual and other input to assess the state of the world around me ("that looks like a car"), running a world model based on that assessment ("the car is coming this way"), and then using some other internal mechanism to decide what to do ("I'd better move to the sidewalk").
What LLMs are doing is harder than what people do. When I converse with someone, I have some internal state, and I run some process in my head – based on that state – to generate my side of the conversation. When an LLM converses with someone, instead of maintaining internal state, needs to maintain a probability distribution over possible states, make next-token predictions according to that distribution, and simultaneously update the distribution.

(2) seems more technically correct, but my intuition dislikes the conclusion, for reasons I am struggling to articulate. ...aha, I think this may be what is bothering me: I have glossed over the distinction between input and output tokens. When an LLM is processing input tokens, it is working to synchronize its state to the state of the generator. Once it switches to output mode, there is no functional benefit to continuing to synchronize state (what is it synchronizing to?), so ideally we'd move to a simpler neural net that does not carry the weight of needing to maintain and update a probability distribution of possible states. (Glossing over the fact that LLMs as used in practice sometimes need to repeatedly transition between input and output modes.) LLMs need the capability to ease themselves into any conversation without knowing the complete history of the participant they are emulating, while people have (in principle) access to their own complete history and so don't need to be able to jump into a random point in their life and synchronize state on the fly.

So the implication is that the computational task faced by an LLM which can emulate Einstein is harder than the computational task of being Einstein... is that right? If so, that in turn leads to the question of whether there are alternative modalities for AI which have the advantages of LLMs (lots of high-quality training data) but don't impose this extra burden. It also raises the question of how substantial this burden is in practice, in particular for leading-edge models.

Comment by snewman on We are headed into an extreme compute overhang · 2024-04-28T15:23:25.186Z · LW · GW

All of this is plausible, but I'd encourage you to go through the exercise of working out these ideas in more detail. It'd be interesting reading and you might encounter some surprises / discover some things along the way.

Note, for example, that the AGIs would be unlikely to focus on AI research and self-improvement if there were more economically valuable things for them to be doing, and if (very plausibly!) there were not more economically valuable things for them to be doing, why wouldn't a big chunk of the 8 billion humans have been working on AI research already (such that an additional 1.6 million agents working on this might not be an immediate game changer)? There might be good arguments to be made that the AGIs would make an important difference, but I think it's worth spelling them out.

Comment by snewman on We are headed into an extreme compute overhang · 2024-04-28T15:19:10.238Z · LW · GW

Can you elaborate? This might be true but I don't think it's self-evidently obvious.

In fact it could in some ways be a disadvantage; as Cole Wyeth notes in a separate top-level comment, "There are probably substantial gains from diversity among humans". 1.6 million identical twins might all share certain weaknesses or blind spots.

Comment by snewman on We are headed into an extreme compute overhang · 2024-04-26T22:15:31.552Z · LW · GW

Assuming we require a performance of 40 tokens/s, the training cluster can run concurrent instances of the resulting 70B model

Nit: you mixed up 30 and 40 here (should both be 30 or both be 40).

I will assume that the above ratios hold for an AGI level model.

If you train a model with 10x as many parameters, but use the same training data, then it will cost 10x as much to train and 10x as much to operate, so the ratios will hold.

In practice, I believe it is universal to use more training data when training larger models? Implying that the ratio would actually increase (which further supports your thesis).

On the other hand, the world already contains over 8 billion human intelligences. So I think you are assuming that a few million AGIs, possibly running at several times human speed (and able to work 24/7, exchange information electronically, etc.), will be able to significantly "outcompete" (in some fashion) 8 billion humans? This seems worth further exploration / justification.

Comment by snewman on Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk · 2024-01-19T21:59:58.441Z · LW · GW

They do mention a justification for the restrictions – "to maintain consistency across cells". One needn't agree with the approach, but it seems at least to be within the realm of reasonable tradeoffs.

Nowadays of course textbooks are generally available online as well. They don't indicate whether paid materials are within scope, but of course that would be a question for paper textbooks as well.

What I like about this study is that the teams are investing a relatively large amount of effort ("Each team was given a limit of seven calendar weeks and no more than 80 hours of red-teaming effort per member"), which seems much more realistic than brief attempts to get an LLM to answer a specific question. And of course they're comparing against a baseline of folks who still have Internet access.

Comment by snewman on Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk · 2024-01-19T21:21:10.493Z · LW · GW

I recently encountered a study which appears aimed at producing a more rigorous answer to the question of how much use current LLMs would be in abetting a biological attack: https://www.rand.org/pubs/research_reports/RRA2977-1.html. This is still work in progress, they do not yet have results. @1a3orn I'm curious what you think of the methodology?

Comment by snewman on Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense · 2023-12-03T16:38:10.273Z · LW · GW

Imagine someone offers you an extremely high-paying job. Unfortunately, the job involves something you find morally repulsive – say, child trafficking. But the recruiter offers you a pill that will rewrite your brain chemistry so that you'll no longer find it repulsive. Would you take the pill?

I think that pill would reasonably be categorized as "updating your goals". If you take it, you can then accept the lucrative job and presumably you'll be well positioned to satisfy your new/remaining goals, i.e. you'll be "happy". But you'd be acting against your pre-pill goal (I am glossing over exactly what that goal is, perhaps "not harming children" although I'm sure there's more to unpack here).

I pose this example in an attempt to get at the heart of "distinguishing between terminal and instrumental goals" as suggested by quetzal_rainbow. This is also my intuition, that it's a question of terminal vs. instrumental goals.

Comment by snewman on AI Timelines · 2023-11-20T22:12:08.801Z · LW · GW

Likewise, thanks for the thoughtful and detailed response. (And I hope you aren't too impacted by current events...)

I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit I was totally wrong about something). Maybe part of the disagreement between us is that the stuff you think are mere hacky workarounds, I think might work sufficiently well (with a few years of tinkering and experimentation perhaps).

Wanna make some predictions we could bet on? Some AI capability I expect to see in the next 3 years that you expect to not see?

Sure, that'd be fun, and seems like about the only reasonable next step on this branch of the conversation. Setting good prediction targets is difficult, and as it happens I just blogged about this. Off the top of my head, predictions could be around the ability of a coding AI to work independently over an extended period of time (at which point, it is arguably an "engineering AI"). Two different ways of framing it:

An AI coding assistant can independently complete 80% of real-world tasks that would take X amount of time for a reasonably skilled engineer who is already familiar with the general subject matter and the project/codebase to which the task applies.
An AI coding assistant can usefully operate independently for X amount of time, i.e. it is often productive to assign it a task and allow it to process for X time before checking in on it.

At first glance, (1) strikes me as a better, less-ambiguous framing. Of course it becomes dramatically more or less ambitious depending on X, also the 80% could be tweaked but I think this is less interesting (low percentages allow for a fluky, unreliable AI to pass the test; very high percentages seem likely to require superhuman performance in a way that is not relevant to what we're trying to measure here).

It would be nice to have some prediction targets that more directly get at long-term memory and iterative/exploratory work processes, but as I discuss in the blog post, I don't know how to construct such a target – open to suggestions.

Coding, in the sense that GPT4 can do it, is nowhere near the top of the hierarchy of skills involved in serious software engineering. And so I believe this is a bit like saying that, because a certain robot is already pretty decent at chiseling, it will soon be able to produce works of art at the same level as any human sculptor.
I think I just don't buy this. I work at OpenAI R&D. I see how the sausage gets made. I'm not saying the whole sausage is coding, I'm saying a significant part of it is, and moreover that many of the bits GPT4 currently can't do seem to me that they'll be doable in the next few years.

Intuitively, I struggle with this, but you have inside data and I do not. Maybe we just set this point aside for now, we have plenty of other points we can discuss.

To be clear, I do NOT think that today's systems could replace 99% of remote jobs even with a century of schlep. And in particular I don't think they are capable of massively automating AI R&D even with a century of schlep. I just think they could be producing, say, at least an OOM more economic value. ...

This, I would agree with. And on re-reading, I think I may have been mixed up as to what you and Ajeya were saying in the section I was quoting from here, so I'll drop this.

[Ege] I think when you try to use the systems in practical situations; they might lose coherence over long chains of thought, or be unable to effectively debug non-performant complex code, or not be able to have as good intuitions about which research directions would be promising, et cetera.
This was a nice answer from Ege. My follow up questions would be: Why? I have theories about what coherence is and why current models often lose it over long chains of thought (spoiler: they weren't trained to have trains of thought) and theories about why they aren't already excellent complex-code-debuggers (spoiler: they weren't trained to be) etc. What's your theory for why all the things AI labs will try between now and 2030 to make AIs good at these things will fail?

I would not confidently argue that it won't happen by 2030; I am suggesting that these problems are unlikely to be well solved in a usable-in-the-field form by 2027 (four years from now). My thinking:

The rapid progress in LLM capabilities has been substantially fueled by the availability of stupendous amounts of training data.
There is no similar abundance of low-hanging training data for extended (day/week/more) chains of thought, nor for complex debugging tasks. Hence, it will not be easy to extend LLMs (and/or train some non-LLM model) to high performance at these tasks.
A lot of energy will go into the attempt, which will eventually succeed. But per (2), I think some new techniques will be needed, which will take time to identify, refine, scale, and productize; a heavy lift in four years. (Basically: Hofstadter's Law.)
Especially because I wouldn't be surprised if complex-code-debugging turns out to be essentially "AGI-complete", i.e. it may require a sufficiently varied mix of exploration, logical reasoning, code analysis, etc. that you pretty much have to be a general AGI to be able to do it well.

I understand you might be skeptical that it can be done but I encourage you to red-team your position, and ask yourself 'how would I do it, if I were an AI lab hell-bent on winning the AGI race?' You might be able to think of some things.

In a nearby universe, I would be fundraising for a startup to do exactly that, it sounds like a hell of fun problem. :-) And I'm sure you're right... I just wouldn't expect to get to "capable of 99% of all remote work" within four years.

I realize you’re not explicitly labeling this as a prediction, but… isn’t this precisely the sort of thought process to which Hofstadter's Law applies?
Indeed. Like I said, my timelines are based on a portfolio of different models/worlds; the very short-timelines models/worlds are basically like "look we basically already have the ingredients, we just need to assemble them, here is how to do it..." and the planning fallacy / hofstadter's law 100% applies to this. The 5-year-and-beyond worlds are not like that; they are ... looking at lines on graphs and then extrapolating them ...
So my timelines do indeed take into account Hofstadter's Law. If I wasn't accounting for it already, my median would be lower than 2027. However, I am open to the criticism that maybe I am not accounting for it enough.

To be clear, I'm only attempting to argue about the short-timeline worlds. I agree that Hofstadter's Law doesn't apply to curve extrapolation. (My intuition for 5-year-and-beyond worlds is more like Ege's, but I have nothing coherent to add to the discussion on that front.) And so, yes, I think my position boils down to "I believe that, in your short-timeline worlds, you are not accounting for Hofstadter's Law enough".

As you proposed, I think the interesting place to go from here would be some predictions. I'll noodle on this, and I'd be very interested to hear any thoughts you have – milestones along the path you envision in your default model of what rapid progress looks like; or at least, whatever implications thereof you feel comfortable talking about.

Comment by snewman on AI Timelines · 2023-11-15T01:16:28.086Z · LW · GW

This post taught me a lot about different ways of thinking about timelines, thanks to everyone involved!

I’d like to offer some arguments that, contra Daniel’s view, AI systems are highly unlikely to be able to replace 99% of current fully remote jobs anytime in the next 4 years. As a sample task, I’ll reference software engineering projects that take a reasonably skilled human practitioner one week to complete. I imagine that, for AIs to be ready for 99% of current fully remote jobs, they would need to be able to accomplish such a task. (That specific category might be less than 1% of all remote jobs, but I imagine that the class of remote jobs requiring at least this level of cognitive ability is more than 1%.)

Rather than referencing scaling laws, my arguments stem from analysis of two specific mechanisms which I believe are missing from current LLMs:

Long-term memory. LLMs of course have no native mechanism for retaining new information beyond the scope of their token buffer. I don’t think it is possible to carry out a complex extended task, such as a week-long software engineering project, without long-term memory to manage the task, keep track of intermediate thoughts regarding design approaches, etc.
Iterative / exploratory work processes. The LLM training process focuses on producing final work output in a single pass, with no planning process, design exploration, intermediate drafts, revisions, etc. I don’t think it is possible to accomplish a week-long software engineering task in a single pass; at least, not without very strongly superhuman capabilities (unlikely to be reached in just four years).

Of course there are workarounds for each of these issues, such as RAG for long-term memory, and multi-prompt approaches (chain-of-thought, tree-of-thought, AutoGPT, etc.) for exploratory work processes. But I see no reason to believe that they will work sufficiently well to tackle a week-long project. Briefly, my intuitive argument is that these are old school, rigid, GOFAI, Software 1.0 sorts of approaches, the sort of thing that tends to not work out very well in messy real-world situations. Many people have observed that even in the era of GPT-4, there is a conspicuous lack of LLMs accomplishing any really meaty creative work; I think these missing capabilities lie at the heart of the problem.

Nor do I see how we could expect another round or two of scaling to introduce the missing capabilities. The core problem is that we don’t have massive amounts of training data for managing long-term memory or carrying out exploratory work processes. Generating such data at the necessary scale, if it’s even possible, seems much harder than what we’ve been doing up to this point to marshall training data for LLMs.

The upshot is that I think that we have been seeing the rapid increase in capabilities of generative AI, failing to notice that this progress is confined to a particular subclass of tasks – namely, tasks which can pretty much be accomplished using System 1 alone – and collectively fooling ourselves into thinking that the trend of increasing capabilities is going to quickly roll through the remainder of human capabilities. In other words, I believe the assertion that the recent rate of progress will continue up through AGI is based on an overgeneralization. For an extended version of this claim, see a post I wrote a few months ago: The AI Progress Paradox. I've also written at greater length about the issues of Long-term memory and Exploratory work processes.

In the remainder of this comment, I’m going to comment what I believe are some weak points in the argument for short timelines (as presented in the original post).

[Daniel] It seems to me that GPT-4 is already pretty good at coding, and a big part of accelerating AI R&D seems very much in reach -- like, it doesn't seem to me like there is a 10-year, 4-OOM-training-FLOP gap between GPT4 and a system which is basically a remote-working OpenAI engineer that thinks at 10x serial speed.

Coding, in the sense that GPT4 can do it, is nowhere near the top of the hierarchy of skills involved in serious software engineering. And so I believe this is a bit like saying that, because a certain robot is already pretty decent at chiseling, it will soon be able to produce works of art at the same level as any human sculptor.

[Ajeya] I don't know, 4 OOM is less than two GPTs, so we're talking less than GPT-6. Given how consistently I've been wrong about how well "impressive capabilities in the lab" will translate to "high economic value" since 2020, this seems roughly right to me?

[Daniel] I disagree with this update -- I think the update should be "it takes a lot of schlep and time for the kinks to be worked out and for products to find market fit" rather than "the systems aren't actually capable of this." Like, I bet if AI progress stopped now, but people continued to make apps and widgets using fine-tunes of various GPTs, there would be OOMs more economic value being produced by AI in 2030 than today.

If the delay in real-world economic value were due to “schlep”, shouldn’t we already see one-off demonstrations of LLMs performing economically-valuable-caliber tasks in the lab? For instance, regarding software engineering, maybe it takes a long time to create a packaged product that can be deployed in the field, absorb the context of a legacy codebase, etc. and perform useful high-level work. But if that’s the only problem, shouldn’t there already be at least one demonstration of an LLM doing some meaty software engineering project in a friendly lab environment somewhere?

More generally, how do we define “schlep” such that the need for schlep explains the lack of visible accomplishments today, but also allows for AI systems be able to replace 99% of remote jobs within just four years?

[Daniel] And so I think that the AI labs will be using AI remote engineers much sooner than the general economy will be. (Part of my view here is that around the time it is capable of being a remote engineer, the process of working out the kinks / pushing through schlep will itself be largely automatable.)

What is your definition of “schlep”? I’d assumed it referred to the innumerable details of figuring out how to adapt and integrate a raw LLM into a finished product which can handle all of the messy requirements of real-world use cases – the “last mile” of unspoken requirements and funky edge cases. Shouldn’t we expect such things to be rather difficult to automate? Or do you mean something else by “schlep”?

[Daniel] …when I say 2027 as my median, that's kinda because I can actually quite easily see it happening in 2025, but things take longer than I expect, so I double it.

Can you see LLMs acquiring long-term memory and an expert-level, nuanced ability to carry out extended exploratory processes by 2025? If yes, how do you see that coming about? If no, does that cause you to update at all?

[Daniel] I take it that in this scenario, despite getting IMO gold etc. the systems of 2030 are not able to do the work of today's OAI engineer? Just clarifying. Can you say more about what goes wrong when you try to use them in such a role?

Anecdote: I got IMO silver (granted, not gold) twice, in my junior and senior years of high school. At that point I had already been programming for close to ten years, and spent considerably more time coding than I spent studying math, but I would not have been much of an asset to an engineering team. I had no concept of how to plan a project, organize a codebase, design maintainable code, strategize a debugging session, evaluate tradeoffs, see between the lines of a poorly written requirements document, etc. Ege described it pretty well:

I think when you try to use the systems in practical situations; they might lose coherence over long chains of thought, or be unable to effectively debug non-performant complex code, or not be able to have as good intuitions about which research directions would be promising, et cetera.

This probably underestimates the degree to which IMO-silver-winning me would have struggled. For instance, I remember really struggling to debug binary tree rotation (a fairly simple bit of data-structure-and-algorithm work) for a college class, almost 2.5 years after my first silver.

[Ajeya] I think by the time systems are transformative enough to massively accelerate AI R&D, they will still not be that close to savannah-to-boardroom level transfer, but it will be fine because they will be trained on exactly what we wanted them to do for us.

This assumes we’re able to train them on exactly what we want them to do. It’s not obvious to me how we would train a model to do, for example, high-level software engineering? (In any case, I suspect that this is not far off from being AGI-complete; I would suspect the same of high-level work in most fields; see again my earlier-linked post on the skills involved in engineering.)

[Daniel] …here's a scenario I think it would be productive to discuss:
(1) Q1 2024: A bigger, better model than GPT-4 is released by some lab. It's multimodal; it can take a screenshot as input and output not just tokens but keystrokes and mouseclicks and images. Just like with GPT-4 vs. GPT-3.5 vs. GPT-3, it turns out to have new emergent capabilities. Everything GPT-4 can do, it can do better, but there are also some qualitatively new things that it can do (though not super reliably) that GPT-4 couldn't do.
…
(6) Q3 2026 Superintelligent AGI happens, by whatever definition is your favorite. And you see it with your own eyes.

I realize you’re not explicitly labeling this as a prediction, but… isn’t this precisely the sort of thought process to which Hofstadter's Law applies?

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-23T01:05:50.678Z · LW · GW

Thanks for the thoughtful and detailed comments! I'll respond to a few points, otherwise in general I'm just nodding in agreement.

I think it's important to emphasize (a) that Davidson's model is mostly about pre-AGI takeoff (20% automation to 100%) rather than post-AGI takeoff (100% to superintelligence) but it strongly suggests that the latter will be very fast (relative to what most people naively expect) on the order of weeks probably and very likely less than a year.

And it's a good model, so we need to take this seriously. My only quibble would be to raise again the possibility (only a possibility!) that progress becomes more difficult around the point where we reach AGI, because that is the point where we'd be outgrowing human training data. I haven't tried to play with the model and see whether that would significantly affect the post-AGI takeoff timeline.

(Oh, and now that I think about it more, I'd guess that Davidson's model significantly underestimates the speed of post-AGI takeoff, because it might just treat anything above AGI as merely 100% automation, whereas actually there are different degrees of 100% automation corresponding to different levels of quality intelligence; 100% automation by ASI will be significantly more research-oomph than 100% automation by AGI. But I'd need to reread the model to decide whether this is true or not. You've read it recently, what do you think?)

I want to say that he models this by equating the contribution of one ASI to more than one AGI, i.e. treating additional intelligence as equivalent to a speed boost. But I could be mis-remembering, and I certainly don't remember how he translates intelligence into speed. If it's just that each post-AGI factor of two in algorithm / silicon improvements is modeled as yielding twice as many AGIs per dollar, then I'd agree that might be an underestimate (because one IQ 300 AI might be worth a very large number of IQ 150 AIs, or whatever).

And (b) Davidson's model says that while there is significant uncertainty over how fast takeoff will be if it happens in the 30's or beyond, if it happens in the 20's -- i.e. if AGI is achieved in the 20's -- then it's pretty much gotta be pretty fast. Again this can be seen by playing around with the widget on takeoffspeeds.com.

Yeah, even without consulting any models, I would expect that any scenario where we achieve AGI in the 20s is a very scary scenario for many reasons.

--I work at OpenAI and I see how the sausage gets made. Already things like Copilot and ChatGPT are (barely, but noticeably) accelerating AI R&D. I can see a clear path to automating more and more parts of the research process, and my estimate is that going 10x faster is something like a lower bound on what would happen if we had AGI (e.g. if AutoGPT worked well enough that we could basically use it as a virtual engineer + scientist) and my central estimate would be "it's probably about 10x when we first reach AGI, but then it quickly becomes 100x, 1000x, etc. as qualitative improvements kick in." There's a related issue of how much 'room to grow' is there, i.e. how much low-hanging fruit is there to pick that would improve our algorithms, supposing we started from something like "It's AutoGPT but good, as good as an OAI employee." My answer is "Several OOMs at least." So my nose-to-the-ground impression is if anything more bullish/fast-takeoff-y than Davidson's model predicts.

What is your feeling regarding the importance of other inputs, i.e. training data and compute?

> I think of AI progress as being driven by a mix of cognitive input, training data, training FLOPs, and inference FLOPs. Davidson models the impact of cognitive input and inference FLOPs, but I didn't see training data or training FLOPs taken into account. ("Doesn’t model data/environment inputs to AI development.") My expectation that as RSI drives an increase in cognitive input, training data and training FLOPs will be a drag on progress. (Training FLOPs will be increasing, but not as quickly as cognitive inputs.)
Training FLOPs is literally the most important and prominent variable in the model, it's the "AGI training requirements" variable. I agree that possible data bottlenecks are ignored; if it turns out that data is the bottleneck, timelines to AGI will be longer (and possibly takeoff slower? Depends on how the data problem eventually gets solved; takeoff could be faster in some scenarios...) Personally I don't think the data bottleneck will slow us down much, but I could be wrong.

Ugh! This was a big miss on my part, thank you for calling it out. I skimmed too rapidly through the introduction. I saw references to biological anchors and I think I assumed that meant the model was starting from an estimate of FLOPS performed by the brain (i.e. during "inference") and projecting when the combination of more-efficient algorithms and larger FLOPS budgets (due to more $$$ plus better hardware) would cross that threshold. But on re-read, of course you are correct and the model does focus on training FLOPS.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-23T00:41:23.944Z · LW · GW

So to be clear, I am not suggesting that a foom is impossible. The title of the post contains the phrase "might never happen".

I guess you might reasonably argue that, from the perspective of (say) a person living 20,000 years ago, modern life does in fact sit on the far side of a singularity. When I see the word 'singularity', I think of the classic Peace War usage of technology spiraling to effectively infinity, or at least far beyond present-day technology. I suppose that led me to be a bit sloppy in my use of the term.

The point I was trying to make by referencing those various historical events is that all of the feedback loops in question petered out short of a Vingian singularity. And it's a fair correction that some of those loops are actually still in play. But many are not – forest fires burn out, the Cambrian explosion stopped exploding – so we do have existence proofs that feedback loops can come to a halt. I know that's not any big revelation, I was merely attempting to bring the concept to mind in the context of RSI.

In any case, all I'm really trying to do is to argue that the following syllogism is invalid:

As AI approaches human level, it will be able to contribute to AI R&D, thus increasing the pace of AI improvement.
This process can be repeated indefinitely.
Therefore, as soon as AI is able to meaningfully contribute to its own development, we will quickly spiral to a Vingian singularity.

This scenario is certainly plausible, but I frequently see it treated as a mathematical certainty. And that is simply not the case. The improvement cycle will only exhibit a rapid upward spiral under certain assumptions regarding the relationship of R&D inputs to gains in AI capability – the r term in Davidson's model.

(Then I spend some time explaining why I think r might be lower than expected during the period where AI is passing through human level. Again, "might be".)

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-22T02:56:21.801Z · LW · GW

OK, having read through much of the detailed report, here's my best attempt to summarize my and Davidson's opinions. I think they're mostly compatible, but I am more conservative regarding the impact of RSI in particular, and takeoff speeds in general.

My attempt to summarize Davidson on recursive self-improvement

AI will probably be able to contribute to AI R&D (improvements to training algorithms, chip design, etc.) somewhat ahead of its contributions to the broader economy. Taking this into account, he predicts that the "takeoff time" (transition from "AI could readily automate 20% of cognitive tasks" to "AI could readily automate 100% of cognitive tasks") will take a few years: median 2.9, 80% confidence interval 0.8 to 12.5.

He notes that the RSI feedback loop could converge or diverge:

The feedback loop is:
Better software → more 2020-FLOP → more software R&D → better software
It turns out that, with this feedback loop, there are two broad possibilities.
1. Software singularity - quicker and quicker doublings. If returns to software R&D exceed a certain threshold, the feedback loop is so powerful that there’s a “software only singularity”. The level of software, quantified here as 2020-FLOP per FLOP, grows faster and faster, theoretically going to infinity in finite time. And this happens even using a fixed quantity of physical FLOP to run the AIs. In practice, of course, the software returns become worse before we go to infinity and we move to the second possibility.
2. Software fizzle - slower and slower doublings. If returns to software R&D are below a certain threshold, the level of software grows more and more slowly over time, assuming a fixed quantity of physical FLOP. (If the amount of physical FLOP is in fact growing increasingly quickly, then the level of software can do the same. But software progress is reliant on the growth of physical FLOP.)
Which possibility will obtain? It turns out that there is a software singularity just if r > 1, where r is defined as in section 4:
For each doubling of cumulative R&D inputs, the output metric will double r times.

He projects a 65% probability of scenario 1 (RSI leads to an accelerating capabilities ramp) occurring based strictly on software improvements, but thinks it would not last indefinitely:

Overall, I’m roughly ~65% on a software-only singularity occurring, and my median best guess is that it would last for ~2-3 OOMs if it happened.

Here, I believe he is only taking into account the contributions of AI to algorithm (software) improvements. Presumably, taking AI contributions to hardware design into account would produce a somewhat more aggressive estimate; this is part of the overall model, but I didn't see it broken out into a specific probability estimate for a period of quicker-and-quicker doublings.

My position

I agree that RSI might or might not result in an accelerating capabilities ramp as we approach human-level AGI. I keep encountering people assuming that what Davidson calls a "software singularity" is self-evidently inevitable, and my main goal was to argue that, while possible, it is not inevitable. Davidson himself expresses a related sentiment that his model is less aggressive than some people's stated expectations; for instance:

My impression is that Eliezer Yudkowsky expects takeoff to be very fast, happening in time scales of days or months. By contrast, this framework puts the bulk of its probability on takeoff taking multiple years.

I have not attempted to produce a concrete estimate of takeoff speed.

I expect the impact of RSI to be somewhat less than Davidson models:

I expect that during the transition period, where humans and AIs are each making significant contributions to AI R&D, that there will be significant lags in taking full advantage of AI (project management and individual work habits will continually need to be adjusted), with resulting inefficiencies. Davidson touches on these ideas, but AFAICT does not include them in the model; for instance, "Assumes no lag in reallocating human talent when tasks have been automated."
It may be that when AI has automated X% of human inputs into AI R&D, the remaining inputs are the most sophisticated part of the job, and can only be done by senior researchers, meaning that most of the people being freed up are not immediately able to be redirected to non-automated tasks. It might even be the case that, by abandoning the lower-level work, we (humans) would lose our grounding in the nuts and bolts of the field, and the quality of the higher-level work we are still doing might gradually decline.
I think of AI progress as being driven by a mix of cognitive input, training data, training FLOPs, and inference FLOPs. Davidson models the impact of cognitive input and inference FLOPs, but I didn't see training data or training FLOPs taken into account. ("Doesn’t model data/environment inputs to AI development.") My expectation that as RSI drives an increase in cognitive input, training data and training FLOPs will be a drag on progress. (Training FLOPs will be increasing, but not as quickly as cognitive inputs.)
I specifically expect progress to become more difficult as we approach human-level AGI, as human-generated training data will become less useful at that point. We will also be outrunning our existence proof for intelligence; I expect superhuman intelligence to be feasible, but we don't know for certain that extreme superhuman performance is reasonably achievable, and so we should allow for some probability that progress beyond human performance will be significantly more difficult.
As we approach human-level AGI, we may encounter other complications: coordination problems and transition delays as the economy begins to evolve rapidly, increased security overhead as AI becomes increasingly strategic for both corporations and nations (and as risks hopefully are taken more seriously), etc.

Most of this can be summarized as "everything is always harder and always takes longer than you think, even when you take this into account".

For the same reasons, I am somewhat more conservative than Davidson on the timeline from human-level AGI to superintelligence (which he guesses will take less than a year); but I'm not in a position to quantify that.

Davidson does note some of these possibilities. For instance, he cites a few factors that could result in superintelligence taking longer than a year (even though he does not expect that to be the case), including two of the factors I emphasize:

Software progress may have become much harder by the time we reach AGI.
Progress might become bottlenecked by the need to run expensive computational experiments, or to rerun multi-month long AI training runs.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-21T15:36:56.762Z · LW · GW

Thanks, I appreciate the feedback. I originally wrote this piece for a less technical audience, for whom I try to write articles that are self-contained. It's a good point that if I'm going to post here, I should take a different approach.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-21T15:15:06.007Z · LW · GW

You don't need a new generation of fab equipment to make advances in GPU design. A lot of improvements of the last few years were not about having constantly a new generation of fab equipment.

Ah, by "producing GPUs" I thought you meant physical manufacturing. Yes, there has been rapid progress of late in getting more FLOPs per transistor for training and inference workloads, and yes, RSI will presumably have an impact here. The cycle time would still be slower than for software: an improved model can be immediately deployed to all existing GPUs, while an improved GPU design only impacts chips produced in the future.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-21T03:19:41.927Z · LW · GW

Thanks. I had seen Davidson's model, it's a nice piece of work. I had not previously read it closely enough to note that he does discuss the question of whether RSI is likely to converge or diverge, but I see that now. For instance (emphasis added):

We are restricting ourselves only to efficiency software improvements, i.e. ones that decrease the physical FLOP/s to achieve a given capability. With this restriction, the mathematical condition for a singularity here is the same as before: each doubling of cumulative inputs must more than double the efficiency of AI algorithms. If this holds, then the efficiency of running AGIs (of fixed ability) will double faster and faster over time. Let’s call this an “efficiency-only singularity”, which is of course an example of a software-only singularity.

I'll need some time to thoroughly digest what he has to say on this topic.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-21T03:16:14.988Z · LW · GW

I'll try to summarize your point, as I understand it:
Intelligence is just one of many components. If you get huge amounts of intelligence, at that point you will be bottlenecked by something else, and even more intelligence will not help you significantly. (Company R&D doesn't bring a "research explosion".)

The core idea I'm trying to propose (but seem to have communicated poorly) is that the AI self-improvement feedback loop might (at some point) converge, rather than diverging. In very crude terms, suppose that GPT-8 has IQ 180, and we use ten million instances of it to design GPT-9, then perhaps we get a system with IQ 190. Then we use ten million instances of GPT-9 to design GPT-10, perhaps that has IQ 195, and eventually GPT-∞ converges at IQ 200.

I do not claim this is inevitable, merely that it seems possible, or at any rate is not ruled out by any mathematical principle. It comes down to an empirical question of how much incremental R&D effort is needed to achieve each incremental increase in AI capability.

The point about the possibility of bottlenecks other than intelligence feeds into that question about R&D effort vs. increase in capability; if we double R&D effort but are bottlenecked on, say, training data, than we might get a disappointing increase in capability.

IIUC, much of the argument you're making here is that the existing dynamic of IP laws, employee churn, etc. puts a limit on the amount of R&D investment that any given company is willing to make, and that these incentives might soon shift in a way that could unleash a drastic increase in AI R&D spending? That seems plausible, but I don't see how it ultimately changes the slope of the feedback loop – it merely allows for a boost up the early part of the curve?

Also, please note that LLMs are just one possible paradigm of AI. Yes, currently the best one, but who knows what tomorrow may bring. I think most people among AI doomers would agree that LLMs are not the kind of AI they fear. LLMs succeed to piggyback on humanity's written output, but they are also bottlenecked by it.

Agreed that there's a very good chance that AGI may not look all that much like an LLM. And so when we contemplate the outcome of recursive self-improvement, a key question will be what the R&D vs. increase-in-capability curve looks like for whatever architecture emerges.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-21T02:59:25.174Z · LW · GW

Sure, it's easy to imagine scenarios where a specific given company could be larger than it is today. But are you envisioning that if we eliminated antitrust laws and made a few other specific changes, then it would become plausible for a single company to take over the entire economy?

My thesis boils down to the simple assertion that feedback loops need not diverge indefinitely, exponential growth can resolve into an S-curve. In the case of a corporation, the technological advantages, company culture, and other factors that allow a company to thrive in one domain (e.g. Google, web search) might not serve it well in another domain (Google, social networks). In the case of AI self-improvement, it might turn out that we eventually enter a domain – for instance, the point where we've exhausted human-generated training data – where the cognitive effort required to push capabilities forwards increases faster than the cognitive effort supplied by those same capabilities. In other words, we might reach a point where each successive generation of recursively-designed AI delivers a decreasing improvement over its predecessor. Note that I don't claim this is guaranteed to happen, I merely argue that it is possible, but that seems to be enough of a claim to be controversial.

We can look at a skill that's about applying human intelligence like playing Go. It would be possible that the maximum skill level is near what professional go players are able to accomplish. AlphaGo managed to go very much past what humans can accomplish in a very short timeframe and AlphaGo doesn't even do any self-recursive editing of it's own code.

Certainly. I think we see that the ease with which computers can definitively surpass humans depends on the domain. For multiplying large numbers, it's no contest at all. For Go, computers win definitively, but by a smaller margin than for multiplication. Perhaps, as we move toward more and more complex and open-ended problems, it will get harder and harder to leave humans in the dust? (Not impossible, just harder?) I discuss this briefly in a recent blog post, I'd love to hear thoughts / evidence in either direction.

AI can help with producing GPU's as well. It's possible to direct a lot more of the worlds economic output into producing GPU's than is currently done.

Sure. I'm just suggesting that the self-improvement feedback loop would be slower here, because designing and deploying a new generation of fab equipment has a much longer cycle time than training a new model, no?

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-21T02:46:32.263Z · LW · GW

All of these things are possible, but it's not clear to me that they're likely, at least in the early stages of AGI. In other words: once we have significantly-superhuman AGI, then agreed, all sorts of crazy things may become possible. But first we have to somehow achieve superhuman AGI. One of the things I'm trying to do in this post is explore the path that gets us to superhuman AGI in the first place. That path, by definition, can't rely on anything that requires superhuman capabilities.

If I understand correctly, you're envisioning that we will be able to construct AGIs that have human-level capability, and far greater than human speed, in order to bootstrap superhuman AGI? What makes you confident that this speed advantage will exist early on? Current leading-edge models like GPT-4 are not drastically faster than human beings, and presumably until we get to human-level AGI we'll be spending most of our algorithmic improvements and increases in FLOP budget on increased capabilities, rather than performance. In fact, it's quite possible that we'll have to (temporarily) accept reduced speed in order to achieve human-level performance; for instance, by introducing tree search into the thought process (tree-of-thought prompting, heuristic search techniques in Yann LeCun's "A Path Towards Autonomous Machine Intelligence", etc.).

Once we achieve human-level, human-speed AGI, then yes, further algorithm or FLOPs improvements could be spent on speed; this comes back to the basic question of how whether the cognitive effort required for further progress increases more or less rapidly than the extent to which progress (and/or increased budgets) enables increased cognitive effort, i.e. does the self-improvement feedback loop converge or diverge. Are you proposing that it definitely diverges? What points you in that direction?

I would also caution against being highly confident that AGI will automatically be some sort of ideal omnimath. Such ability would require more than merely assimilating all of human knowledge and abilities; it would require knowing exactly which sub-specialties to draw on in any given moment. Otherwise, the AI would risk drowning in millions of superfluous connections to its every thought. Some examples of human genius might in part depend on a particular individual just happening to have precisely the right combination of knowledge, without a lot of other superfluous considerations to distract them.

Also, is it obvious that a single early-human-level AI could be trained with deep mastery of every field of human knowledge? Biological-anchor analysis aims to project when we can create a human-level AI, and humans are not omnimaths. Deep expertise across every subspeciality might easily require many more parameters than the number of synapses in the human brain. Many things look simple until you try to implement them; I touch on this in a recent blog post, The AI Progress Paradox, but you just need to look at the history of self-driving cars (or photorealistic CGI, or many other examples) to see how things that seem simple in principle can require many rounds of iteration to fully achieve in practice.

Comment by snewman on The AI Explosion Might Never Happen · 2023-09-20T02:31:34.667Z · LW · GW

I think you're saying that the fact that no historical feedback loop has ever destroyed the Earth (nor transformed it into a state which would not support human life) could be explained by the Anthropic Principle? Sure, that's true enough. I was aiming more to provide an intuition for the idea that it's very common and normal for feedback loops to eventually reach a limit, as there are many examples in the historical record.

Intuition aside: given the sheer number of historical feedback loops that have failed to destroy the Earth, it seems unavoidable that either (a) there are some fundamental principles at play that tend to place a cap on feedback loops, at least in the family of alternative universes that this universe has been sampled from, or (b) we have to lean on the Anthropic Principle very very hard indeed. It's not hard to articulate causes for (a); for instance, any given feedback loop arises under a particular set of conditions, and once it has progressed sufficiently, it will begin to alter its own environment to the point where those conditions may no longer apply. (The forest fire consumes all available fuel, etc.)

Comment by snewman on Report on Frontier Model Training · 2023-09-10T23:15:14.705Z · LW · GW

Speaking as someone who has had to manage multi-million dollar cloud budgets (though not in an AI / ML context), I agree that this is hard.

As you note, there are many ways to think about the cost of a given number of GPU-hours. No one approach is "correct", as it depends heavily on circumstances. But we can narrow it down a bit: I would suggest that the cost is always substantially higher than the theoretical optimum one might get by taking the raw GPU cost and applying a depreciation factor.

As soon as you try to start optimizing costs – say, by reselling your GPUs after training is complete, or reusing training GPUs for inference – you run into enormous challenges. For example:

When is training "complete"? Maybe you discover a problem and need to re-run part of the training process.
You may expect to train another large model in N months, but if you sell your training GPUs, you can't necessarily be confident (in the current market) of being able to buy new ones on demand.
If you plan to reuse GPUs for inference once training is done... well, it's unlikely that the day after training is complete, your inference workload immediately soaks up all of those GPUs. Production (inference) workloads are almost always quite variable, and 100% hardware utilization is an unattainable goal.
The actual process of buying and selling hardware entails all sorts of overhead costs, from physically racking and un-racking the hardware, to finding a supplier / buyer, etc.

The closest you can come to the theoretical optimum is if you are willing to scale your workload to the available hardware, i.e. you buy a bunch of GPUs (or lease them at a three-year-commitment rate) and then scale your training runs to precisely utilize the GPUs you bought. In theory, you are then getting your GPU-hours at the naive "hardware cost divided by depreciation period" rate. However, you are now allowing your hardware capacity to dictate your R&D schedule, which is its own implicit cost – you may be paying an opportunity cost by training more slowly than you'd like, or you may be performing unnecessary training runs just to soak up the hardware.

User info

Posts

Comments