Comments on "AI 2027"
post by Randaly · 2025-04-11T20:32:34.419Z · LW · GW · 10 comments
I find the decision to brand the forecast as "AI 2027" very odd. The authors do not in fact believe this; they explicitly give 2028, 2030, or 2033 for their median dates for a superhuman coder.
The point of this project was presumably to warn about a possible outcome; by the authors' own beliefs, their warning will be falsified immediately before it is needed.
When presenting predictions, forecasters always face tradeoffs regarding how much precision to present. Precise forecasting attracts attention and motivates action; adding many concrete details produces a compelling story, stimulating discussion; and it yields falsifiable predictions. Emphasizing uncertainty avoids losing credibility when some parts of the story inevitably fail, prevents overconfidence, and encourages more robust strategies that can work across a range of outcomes. But I can't think of any reason to consider only a single high-precision story that you don't think is all that likely.
I think that the excessive precision is pretty important in this case: the current pace of AI R&D spending is unsustainable, so it matters exactly how much more progress is needed for superhuman coders.
***
I don't believe that METR's time horizons forecast is sufficiently strong evidence for a precise timeline:
- *In general*, AI competence relative to humans isn't ordered by time: AIs can complete some tasks that would take humans centuries, and can't complete others that would take humans seconds.
- METR produced their analysis by taking a subset of activities: SWAA, HCAST, and RE-BENCH. SWAA has 66 tasks of 1-30 seconds; HCAST has 97 tasks of 1 minute to 30 hours; and RE-BENCH has 7 tasks of ~8 hours each. SWAA and HCAST were created by METR for this research.
- METR did not (and cannot) use an obvious, straightforward, canonical set of human tasks: current AIs cannot accomplish most human economic tasks, because most tasks do not solely involve interacting with computers.
- METR created SWAA and HCAST *after* having observed AI progress. Their selection of tasks may have been unconsciously biased by knowing how difficult each task is for LLMs. (That said, they also checked SWE-Bench Verified times and found a similar result within its tasks.)
- The publicly shared Task Summaries from HCAST do not represent tasks separated solely by time. Instead, they represent *different* tasks, which are measured as taking different amounts of time, but which also have different difficulties. This undermines METR's attempt to provide a precise future forecast: the time taken for a task can be measured precisely, but its difficulty cannot. This poses two problems:
- METR provides examples of future tasks (e.g. "Optimize code for custom chip"); a forecast of when they will be solvable depends on METR's having accurately evaluated their difficulty, in addition to the time it takes humans to accomplish them.
- Regarding task difficulty, METR writes that "If this is the case, we may be underestimating the pace of model improvement." They seem to view this as models becoming capable of solving more difficult problems *in addition to* time horizons increasing exponentially. However, in their dataset the difficulty of tasks is correlated with their time horizons (see section B.1.1), so increased capabilities might contribute to increasing measured time horizons by allowing the completion of tasks which were previously too difficult, but not too long, to accomplish.
Two concrete examples: humans can read ~75 tokens per minute, so any task that GPT-2 could do that filled its entire 1024-token context length[1] would be a 15-minute task; meanwhile, a leap of mathematical intuition takes a human only a few seconds, but is arguably still impossible for LLMs. Adding many tasks like these would slow the calculated rate of improvement in task length.
- It is plausible that the future projection for solvable tasks is correct. However, this is not a straightforward projection from past progress; instead, it is dependent on METR researchers' intuitions about the relative difficulties of tasks being correct.
- Unfortunately, while I don't believe in it or its results, there is no better alternative; METR's time horizons forecast is pretty clearly the best available attempt at making a precise forecast of AI improvement.
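For concreteness, here is a minimal sketch of the extrapolation that METR's headline result implies, using illustrative numbers of my own (a ~1-hour 50%-success horizon at the anchor date, a ~7-month doubling time, and a target of one month of human work; none of these are METR's exact fitted values). My objection is to the task set and difficulty judgments underlying the trend, not to this arithmetic:

```python
import math

# Illustrative assumptions (not METR's exact fitted values):
current_horizon_minutes = 60.0        # ~1-hour 50%-success time horizon at the anchor date
doubling_time_months = 7.0            # METR's reported ~7-month doubling time
target_horizon_minutes = 167.0 * 60.0 # ~1 month of human working time (167 hours)

# Doublings needed to reach the target horizon, then convert to calendar time
doublings_needed = math.log2(target_horizon_minutes / current_horizon_minutes)
months_needed = doublings_needed * doubling_time_months
print(f"~{doublings_needed:.1f} doublings, ~{months_needed / 12:.1f} years from the anchor date")
```

With these placeholder numbers the extrapolation lands a bit over four years out; the point is that everything hinges on whether the fitted trend and the chosen task set actually generalize.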
***
I don't believe that the "benchmarks-and-gaps" model is the correct way to forecast future AI development:
- "benchmarks-and-gaps" has the form of a Fermi estimate; I'm suspicious that this is how/why it was adopted. However, Fermi estimates are typically used in areas where estimates within an order of magnitude are useful- not to give a precise estimate. On a typical Fermi estimate reasoning, the authors should have rounded their timeline to one year, and instead written "AI 2026". Of course they were right to not do this- because this is a question where precision matters. However, taking a Fermi estimate's "2 months-2 decades" estimate and and producing a timeline of 2 years conveys excessive precision.
- There is no particular reason to endorse the particular set of gaps chosen. The most prominent gap that I've seen discussed, the ability of LLMs to come up with new ideas or paradigms, wasn't included.
- "benchmarks-and-gaps" has historically proven to be an unreliable way of forecasting AI development. The problem is that human intuitions about what capabilities are required for specific tasks aren't very good, and so more "gaps" are discovered once the original gaps have been passed.
- I believe that the discussion of "Feedback loops: Working without externally provided feedback" considerably underestimates the difficulty involved: it claims that removing Best-of-K represents half of the difficulty. Eli writes:
I and others have consistently been surprised by progress on easy-to-evaluate, nicely factorable benchmark tasks, while seeing some corresponding real-world impact but less than I would have expected. Perhaps AIs will continue to get better on checkable tasks in substantial part by relying on trying a bunch of stuff and seeing what works, rather than general reasoning which applies to more vague tasks. And perhaps I’m underestimating the importance of work that is hard to even describe as “tasks”.
Narrowly, labs are often optimizing against benchmarks, meaning that non-benchmarked progress is slower than you would otherwise measure. More broadly, it's plausible that important tasks or knowledge *can't* be benchmarked, even with a benchmark that maxes out METR's messiness metric. James C. Scott uses the word "legibility" to describe the process of making it possible for outsiders to gain understanding: creating simplified, standardized categories; imposing uniform systems of measurement; and favoring formal, official knowledge over practical, local knowledge. I like that term in this context because it emphasizes that LLMs trained on text and passing objectively measured benchmarks will have trouble with some "illegible" types of tasks that humans can do: think intuitive leaps, context-dependent judgments, or embodied skills.[3]
***
I believe that the "AI 2027" scaling forecasts are implausible.
- "AI 2027" forecasts that OpenAI's revenue will reach 140 billion in 2027. This considerably exceeds even OpenAI's own forecast, which surpasses 125 billion in revenue in 2029. I believe that the AI 2027 forecast is implausible.[4]
- FutureSearch's estimate of paid subscribers for April 2025 was 27 million; the actual figure is 20 million. They justify high expected consumer growth with reference to the month-on-month increase in unpaid users from December 2024 -> February 2025. Data from Semrush replicates that increase, but also shows that traffic has since declined rather than continuing to increase.
- Looking at market size estimates, FutureSearch seems to implicitly assume that OpenAI will achieve a near-monopoly on Agents, the same way they have for Consumer subscriptions. Enterprise sales are significantly different from consumer signups, and OpenAI doesn't currently have a significant technical advantage.
- Current alignment techniques may be insufficient, not just to align AGI, but to align the independently acting Agents which are expected to make up 60% of OpenAI's revenue.
- "AI 2027" uses an implausible forecast of compute/algorithm improvement past 2028. It assumes that each continues exponential progress, but at half the rate (so 2.35x compute/year, and 1.5x algorithmic improvement/year).
- Algorithmic improvements have been driven in part by exponential increases in spending (to hire researchers), and in part by exponential increases in compute. "AI 2027" indeed attributes the decreased speed of algorithmic improvement to "the human research population...growing at a slower rate". However, technological improvements often outright plateau, whether due to objective limits or because the low-hanging fruit has been picked. For a concrete example, some of the most significant algorithmic improvements have been the moves from fp32 down to fp8 training; Nvidia Blackwell also improved support for fp4 training; further progress along this axis, if any, has natural limits.
- Given their separate assumption of a 1.35x annual increase in FLOP per dollar, this implies a 1.7x annual increase in AI R&D spending, down from today's 2.6x.[5] However, it's much more likely that AI R&D spending will instead drop by an order of magnitude or more: an "AI Winter". In the next section, I'll run through some concrete numbers on future AI R&D spending.
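Before that, a quick check of the implied spending growth rate (a minimal sketch using the "AI 2027" figures cited above; reading the 1.35x as FLOP-per-dollar improvement follows footnote [5]):

```python
# AI 2027's post-2028 assumptions, as cited above
compute_growth_per_year = 2.35   # annual growth in effective training compute
flop_per_dollar_growth = 1.35    # annual growth in hardware price-performance

# If compute grows 2.35x/year while each dollar buys 1.35x more FLOP each year,
# dollars spent must grow by the ratio of the two.
spending_growth_per_year = compute_growth_per_year / flop_per_dollar_growth
print(f"Implied R&D spending growth: ~{spending_growth_per_year:.2f}x/year (vs ~2.6x today)")
```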
***
Projected single-company R&D spending, in billions:[6]
| 2024 | 2025 | 2026 | 2027 | 2028 | 2029 | 2030 |
|------|------|------|------|------|------|------|
| 4 | 10 | 27 | 70 | 183 | 311 | 528 |
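This projection can be reproduced from the method in footnote [6] (a small sketch; the 4 billion starting point and the switch to slower growth after 2028 are as described there):

```python
# Reproduce the table above: ~4 billion in 2024, grown 2.6x/year through 2028,
# then 1.7x/year thereafter (see footnote 6).
spending = {2024: 4.0}  # billions of dollars
for year in range(2025, 2031):
    growth = 2.6 if year <= 2028 else 1.7
    spending[year] = spending[year - 1] * growth

print({year: round(amount) for year, amount in spending.items()})
# -> {2024: 4, 2025: 10, 2026: 27, 2027: 70, 2028: 183, 2029: 311, 2030: 528}
```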
Here are a few relevant points:
- OpenAI's R&D spending *thus far* has been primarily driven by equity sales, not revenue. This is a typical model for a Silicon Valley startup, whose growth will eventually plateau at a profitable level. But OpenAI isn't like other startups.
- OpenAI's leadership is unlike that of other startups: they are racing to an endpoint, not trying to build a sustainable business. Even if they could slow down R&D spending in order to guarantee that they can avoid bankruptcy, they aren't likely to do so.
- OpenAI has a very different cost structure from other software companies. Typically, R&D costs are high but roughly constant, whereas marginal costs are near zero. OpenAI (and other AI companies) have high marginal costs due to the compute needed to run their models. Per OpenAI's 2024 budget, they had 4 billion in revenue, and spent 2 billion on inference, 400 million on hosting, 300 million on marketing, and probably 280 million on non-R&D salaries (assuming salaries split between R&D and non-R&D in the same training:inference ratio as compute; see the rough sketch after this list). For comparison, here are the costs of revenue for some major companies:
- For OpenAI in particular, turning a profit at all, even with zero R&D spending, will likely be impossible even if Altman were to stop racing. OpenAI has *also* signed a revenue-sharing agreement with Microsoft that sends Microsoft another 20% of OpenAI's revenue. Even worse, OpenAI and the associated Project Stargate are already planning on racking up substantial amounts of debt even before 2026, let alone by 2029. Realistically, it will be impossible for OpenAI to even pay the interest on its debt, let alone spend *anything* on R&D.[7]
- Even OpenAI reaching 140 billion in revenue in 2027, as per "AI 2027", won't even come close to making it profitable. It's not clear whether OpenAI (or Anthropic, or xAI) can survive increasing their R&D spending at the current rate through 2027.
- OpenAI, or other AI companies, will likely have trouble raising further funding. "AI 2027" implies that it will control 400 billion in capex by 2027; raising approximately that much would be the largest financial transaction in human history, for only a fraction of a company which had never been profitable. OpenAI's *existing* fundraising has been explicable as normal investments;[8] this would not be.[9]
- Other pre-existing tech companies are stronger contenders for continuing AI R&D. They don't have the revenue sharing agreement, and they have tens of billions in cash-on-hand and current profits to fund R&D without taking on debt. Meta isn't directly making revenue from its models; Amazon doesn't appear to be doing frontier AI research; Microsoft is pulling back from AI; which leaves Google as the likely winner if there's no takeoff by 2029.
- If Google has a similar cost of revenue to OpenAI, it would need to make at least 1.2 trillion in new revenue to pay for the new R&D spending in 2028, or 2 trillion in new revenue in 2029. That's assuming no increase in Google's profits from today, and no increase in General and Administrative spending. That's also assuming a superhuman coder (SC) hasn't been reached. This all seems *highly* unlikely.[10] (Note that Google currently makes 350 billion in revenue and spends 50 billion on R&D.)[11]
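Here is the rough sketch of OpenAI's 2024 cost structure referenced above (the non-R&D salary figure is my estimate, and this ignores G&A and interest entirely):

```python
# OpenAI's approximate 2024 figures as cited above, in billions of dollars.
# The non-R&D salary figure is an estimate, not a reported number.
revenue = 4.0
inference = 2.0
hosting = 0.4
marketing = 0.3
non_rd_salaries = 0.28
microsoft_revenue_share = 0.20   # fraction of revenue paid to Microsoft

cost_of_revenue = inference + hosting + marketing + non_rd_salaries
cost_share = cost_of_revenue / revenue
margin_before_rd = 1.0 - cost_share - microsoft_revenue_share

print(f"Cost of revenue: ~{cost_share:.1%} of revenue")                      # ~74.5%
print(f"Left over before R&D, G&A, and interest: ~{margin_before_rd:.1%}")   # ~5.5%
```

On these numbers, roughly 75% of revenue goes to cost of revenue and another 20% to Microsoft, leaving only about 5% before any R&D, G&A, or interest payments.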
One way to summarize this section: any forecast of AGI arrival is *highly* sensitive to the exact point during current scaling at which AGI becomes achievable, because current scaling cannot continue for long.[12] If current growth rates can deliver superhuman coding capabilities by 2027, we might actually see it happen. However, if those same capabilities would require waiting until 2028, then on some plausible financial models we wouldn't see AGI until the mid-2030s or later. It's therefore *extremely* unfortunate that (a) per the timelines discussion, there is no way to get much confidence in a forecast of the near future, and (b) the AI 2027 team's median timeline is longer than what they shared.
- ^
Note that from GPT-2 to today, excluding Llama 4 Scout, the length of the context window has increased with a doubling time of about 6.6 months, very similar to METR's claimed 7-month doubling time.
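As a rough check of this figure (a minimal sketch with endpoints I chose for illustration: GPT-2's 1,024-token window in February 2019 and a ~2M-token context window in early 2025; other reasonable endpoints shift the result somewhat):

```python
import math

# Illustrative endpoints (my assumption): GPT-2's 1,024-token window (Feb 2019)
# to a ~2M-token context window in early 2025, roughly 73 months later.
months_elapsed = 73
context_ratio = 2_000_000 / 1024

doubling_time_months = months_elapsed / math.log2(context_ratio)
print(f"Context-window doubling time: ~{doubling_time_months:.1f} months")  # ~6.7
```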
- ^
This is just repeating my first point.
- ^
This point is also an objection to using METR's time horizon forecasting, as it's also based on benchmarks.
- ^
Note: "AI 2027" chooses to call the leading lab "OpenBrain", but FutureSearch is explicit that they're talking about OpenAI.
- ^
These numbers are sourced from Epoch, but their estimates don't add up: a 1.35x annual improvement in computational price-performance and a 2.6x annual increase in training costs should yield a 3.5x annual increase in training compute, not the calculated 4.7x. They are using two different datasets for the calculations; presumably one is more correct than the other.
- ^
Taking 2024's 4 billion level from OpenAI's spending, and multiplying by 2.6x per year through 2028, then by 1.7x per year, i.e. using the "AI 2027" numbers. Note that actual spending will be spikier than this: some years will see higher capex spending to build datacenters, while other years may see less datacenter construction and instead amortize that capex plus spend on electricity and salaries.
- ^
I haven't even considered that General and Administrative spending might also scale with revenue.
- ^
Total spending of 10-40 billion on R&D projects is high but not unprecedented- although spending of 40 billion per year is completely unprecedented. (The Manhattan project cost ~35 billion in today's dollars, over ~3 years; the largest private sector R&D effort ever, the 787, cost ~45 billion in today's dollars, over ~8 years.) The current spending of ~4 billion per year is pretty normal for a major R&D project. It's not even the highest current spending relevant to AI- Nvidia is spending ~13 billion on R&D this year.
- ^
An *additional* problem here is that 2028 is approximately when data availability and data movement may become significant issues- this will slow down the future rate at which money buys effective compute, which will in turn dissuade people from funding further development.
- ^
There are, to be fair, also some ways in which this forecast is overly negative. An AI Winter would drive down the cost of GPUs; this is only helpful for companies which don't own their own datacenters, i.e. Anthropic. (And if companies increasingly use their own GPU designs, they may have trouble quickly integrating different GPUs into their setups.) Also, part of companies' current inference spend is on free users: this can be eliminated entirely, at the cost of giving up on public mindshare and on raising more money. Also, especially if other AI companies are going bankrupt, prices can be raised.
- ^
Even Google would need to either sell significant equity or take on significant debt to fund AI development until 2028; its current cash and profits aren't nearly enough. Note that this halt would probably co-occur with a recession, given that hundreds of billions of dollars of AI capex would have suddenly vanished. AI companies would first cut new capex spending, and might try to sell their GPUs in this scenario; this also plausibly kills off Nvidia and halts semiconductor improvements.
- ^
It's also highly sensitive to exactly how long scaling can continue, of course. Scaling might plausibly continue to 2030 or beyond if national governments take over funding, for example.
10 comments
comment by elifland · 2025-04-11T21:19:25.140Z · LW(p) · GW(p)
Thanks for these detailed comments! I'll aim to respond to some of the meat of your post within a few days at the latest, but real quick regarding the top portion:
I find the decision to brand the forecast as "AI 2027" very odd. The authors do not in fact believe this; they explicitly give 2028, 2030, or 2033 for their median dates for a superhuman coder.
The point of this project was presumably to warn about a possible outcome; by the authors' own beliefs, their warning will be falsified immediately before it is needed.
Adding some more context: each of the timelines forecasts authors' modal superhuman coder year is roughly 2027. The FutureSearch forecasters who have a 2033 median aren't authors on the scenario itself (but neither is Nikola with the 2028 median). Of the AI 2027 authors, all have a modal year of roughly 2027 and give at least ~20% to getting it by 2027. Daniel, the lead author, has a median of early 2028.
IMO it seems reasonable to portray 2027 as the arrival year of superhuman coders, given the above. It's not clear whether the median or modal year is better here, conditional on having substantial probability by the modal year (i.e. each of us has >=20% by 2027, Daniel has nearly 50%).
To be transparent though, we originally had it at 2027 because that was Daniel's median year when we started the project. We decided against changing it when he lengthened his median because (a) it would have been a bunch of work and we'd already spent over a year on the project and (b) as I said above, it seemed roughly as justified as 2028 anyway from an epistemic perspective.
Overall though I sympathize with the concern that we will lose a bunch of credibility if we don't get superhuman coders by 2027. Seems plausible that we should have lengthened the story despite the reasoning above.
When presenting predictions, forecasters always face tradeoffs regarding how much confidence to present. Confident, precise forecasting attracts attention and motivates action; adding many concrete details produces a compelling story, stimulating discussion; this also involves falsifiable predictions. Emphasizing uncertainty avoids losing credibility when some parts of the story inevitably fail; prevents overconfidence; and encourages more robust strategies that can work across a range of outcomes. But I can't think of any reason to give a confident, high precision story that you don't even believe in!
I'd be curious to hear more about what made you perceive our scenario as confident. We included caveats signaling uncertainty in a bunch of places, for example in "Why is it valuable?" and several expandables and footnotes. Interestingly, this popular YouTuber made a quip that it seemed like we were adding tons of caveats everywhere.
Replies from: Randaly↑ comment by Randaly · 2025-04-11T21:45:35.734Z · LW(p) · GW(p)
I'd be curious to hear more about what made you perceive our scenario as confident. We included caveats signaling uncertainty in a bunch of places, for example in "Why is it valuable?" and several expandables and footnotes. Interestingly, this popular YouTuber made a quip that it seemed like we were adding tons of caveats everywhere.
I was imprecise (ha ha) with my terminology here: I should have only talked about a precise forecast rather than a confident one; I meant solely the attempt to highlight a single story about a single year. My bad. Edited the post.
comment by Vladimir_Nesov · 2025-04-11T21:37:58.296Z · LW(p) · GW(p)
I can't think of any reason to give a confident, high precision story that you don't even believe in!
Datapoints generalize: a high precision story holds gears that can be reused in other hypotheticals. I'm not sure what you mean by the story being presented as "confident" (in some sense it's always wrong to say that a point prediction is "confident" rather than zero probability, even if it's the mode of a distribution, the most probable point). But in any case I think giving high precision stories is a good methodology for communicating a framing, pointing out which considerations seem to be more important in thinking about possibilities, and also which events (that happen to occur in the story) seem more plausible than their alternatives.
comment by elifland · 2025-04-13T18:25:44.874Z · LW(p) · GW(p)
Responses to some of your points:
There is no particular reason to endorse the particular set of gaps chosen. The most prominent gap that I've seen discussed, the ability of LLMs to come up with new ideas or paradigms, wasn't included.
This skill doesn't seem that necessary for superhuman coding, but separately I think that AIs can already do this to some extent and it's unclear that it will lag much behind other skills.
"benchmarks-and-gaps" has historically proven to be an unreliable way of forecasting AI development. The problem is that human intuitions about what capabilities are required for specific tasks aren't very good, and so more "gaps" are discovered once the original gaps have been passed.
I think with previous benchmarks it was generally clearer that solving them would be nowhere near what is needed for superhuman coding or AGI. But I agree that we should notice similar skulls with e.g. solving chess being considered AGI-complete.
"AI 2027" uses an implausible forecast of compute/algorithm improvement past 2028. It assumes that each continues exponential progress, but at half the rate (so 2.35x compute/year, and 1.5x algorithmic improvement/year).
Seems plausible; I implemented these as quick guesses, though this wouldn't affect the mode or median forecasts much. I agree that we should have a long tail due to considerations like this, e.g. my 90th percentile is >2050.
If current growth rates can deliver superhuman coding capabilities by 2027, we might actually see it happen. However, if those same capabilities would require until 2028, then on some plausible financial models we wouldn't see AGI until the mid-2030's or later.
I'm very skeptical that 2028 with current growth rates would be pushed all the way back to mid-2030s and that the cliff will be so steep. My intuitions are more continuous here. If AGI is close in 2027 I think that will mean increased revenue and continued investment, even if the rate slows down some.
Replies from: Randaly↑ comment by Randaly · 2025-04-14T09:27:30.918Z · LW(p) · GW(p)
My intuitions are more continuous here. If AGI is close in 2027 I think that will mean increased revenue and continued investment
Gotcha, I disagree. Lemme zoom in on this part of my reasoning, to explain why I think profitability matters (and growth matters less):
(1) Investors always only terminally value profit; they never terminally value growth. Most of the economy doesn't focus much on growth compared to profitability, even instrumentally. However, one group of investors, VCs, does: software companies generally have high fixed costs and low marginal costs, so sufficient growth will almost always make them profitable. But (a) VCs have never invested anywhere even close to the sums we're talking about, and (b) even if they had, OpenAI continuing to lose money will eventually make them skeptical.
(For normal companies: if they aren't profitable, they run out of money and die. Any R&D spending needs to come out of their profits.)
(2) Another way of phrasing point 1: I very much doubt that OpenAI's investors actually believe in AGI (Satya Nadella explicitly doesn't, and others seem to use it as an empty slogan). What they believe in is getting a return on their money. So I believe that OpenAI making profits would lead to investment, but that OpenAI nearing AGI without profits won't trigger more investment.
(3) Even if VCs were to continue investing, the absolute numbers are nearly impossible. OpenAI's forecasted 2028 R&D budget is 183 billion; that exceeds the total global VC funding for enterprise software in 2024, which was 155 billion. This money would be going to purchase a fraction of a company which would be tens of billions in debt, which had already burned through 60 billion in equity, and which had never turned a profit. (OpenAI needing to raise more money also probably means that xAI and Anthropic have run out of money, since they've raised less so far.)
In practice OpenAI won't even be able to raise its current amount of money ever again: (a) it's now piling on debt and burning through more equity, and is at a higher valuation; (b) recent OpenAI investor Masayoshi Son's SoftBank is famously bad at evaluating business models (they invested in WeWork) and is uniquely high-spending- but is now essentially out of money to invest.
So my expectation is that OpenAI cannot raise exponentially more money without turning a profit, which it cannot do.
Replies from: Vladimir_Nesov↑ comment by Vladimir_Nesov · 2025-04-15T00:12:01.267Z · LW(p) · GW(p)
OpenAI continuing to lose money
They are losing money only if you include all the R&D (where the unusual thing is very expensive training compute for experiments), which is only important while capabilities keep improving. If/when the capabilities stop improving quickly, somewhat cutting research spending won't affect their standing in the market that much. And also after revenue grows some more, essential research (in the slow capability growth mode) will consume a smaller fraction. So it doesn't seem like they are centrally "losing money", the plausible scenarios still end in profitability (where they don't end the world) if they don't lose the market for normal reasons like failing on products or company culture.
OpenAI cannot raise exponentially more money without turning a profit, which it cannot do
This does seem plausible in some no-slowdown worlds (where they ~can't reduce R&D spending in order to start turning profit), if in fact more investors don't turn up there. On the other hand, if every AI company is forced to reduce R&D spending because they can't raise money to cover it, then they won't be outcompeted by a company that keeps R&D spending flowing, because such a competitor won't exist.
Replies from: Randaly↑ comment by Randaly · 2025-04-15T06:07:59.491Z · LW(p) · GW(p)
I want to clarify that I'm criticizing "AI 2027"'s projection of R&D spending, i.e. this table [LW · GW]. If companies cut R&D spending, that falsifies the "AI 2027" forecast.
In particular, the comment I'm replying to proposed that while the current money would run out in ~2027, companies could raise more to continue expanding R&D spending. Raising money for 2028 R&D would need to occur in 2027; and it would need to occur on the basis of financial statements of at least a quarter before the raise. So in this scenario, they need to slash R&D spending in 2027- something the "AI 2027" authors definitely don't anticipate.
Furthermore, your claim that "they are losing money only if you include all the R&D" may be false. We lack a sufficiently detailed breakdown of OpenAI's budget to be certain. My estimate from the post was that most AI companies have ~75% cost of revenue; OpenAI specifically has a 20% revenue-sharing agreement with Microsoft; and the remaining 5% needs to cover General and Administrative expenses. Depending on the exact fraction of salary and G&A expenses attributable to R&D, it's plausible that eliminating R&D entirely wouldn't make OpenAI profitable today. And in the future OpenAI will also need to pay interest on tens of billions in debt.
Replies from: Vladimir_Nesov↑ comment by Vladimir_Nesov · 2025-04-15T16:48:01.990Z · LW(p) · GW(p)
I see what you mean (I did mostly change the topic to the slowdown hypothetical). There is another strange thing about AI companies: I think giving the ~50% cost of inference too much precision for the foreseeable future is wrong, as it's highly uncertain and malleable in a way that's hard for even the company itself to anticipate.
About ~2x difference in inference cost (or size of a model) can be merely hard to notice when nothing substantial changes in the training recipe (and training cost), and better post-training (which is relatively cheap) can get that kind of advantage or more, but not reliably. Pretraining knowledge distillation might get another ~1.5x at the cost of training a larger teacher model (plausibly GPT-4.1 has this because of the base model for GPT-4.5, but GPT-4o doesn't). And there are all the other compute multipliers that become less fake if the scale stops advancing. The company itself won't be able to plan with any degree of certainty how good its near future models will be relative to their cost, or how much its competitors will be able to cut prices. So the current state of cost of inference doesn't seem like a good anchor for where it might settle in the slowdown timelines.
comment by romeo · 2025-04-12T01:34:57.573Z · LW(p) · GW(p)
Thanks for the detailed comments! We really appreciate it. Regarding revenue, here's some thoughts:
"AI 2027" forecasts that OpenAI's revenue will reach 140 billion in 2027. This considerably exceeds even OpenAI's own forecast, which surpasses 125 billion in revenue in 2029. I believe that the AI 2027 forecast is implausible.[4]
AI 2027 is not a median forecast but a modal forecast, so a plausible story for the faster side of the capability progression expected by the team. If you condition on the capability progression in the scenario, I actually think $140B in 2027 is potentially on the conservative side. My favourite part of the FutureSearch report is the examples from the ~$100B/year reference class, e.g., 'Microsoft's Productivity and Business Process segment.' If you take the AI's agentic capabilities and reliability from the scenario seriously, I think it feels intuitively easy to imagine how a similar-scale business booms relatively quickly, and I'm glad that FutureSearch was able to give a breakdown as an example of how that could look.
So maybe I should just ask whether you are conditioning on the capabilities progression or not with this disagreement? Do you think $140b in 2027 is implausible even if you condition on the AI 2027 capability progression?
If you just think $140B in 2027 is not a good unconditional median forecast all things considered, then I think we all agree!
Note: "AI 2027" chooses to call the leading lab "OpenBrain", but FutureSearch is explicit that they're talking about OpenAI.
We aren't forecasting OpenAI revenue but OpenBrain revenue, which is different because it's ~MAX(OpenAI, Anthropic, GDM (AI-only), xAI, etc.).[1] In some places FutureSearch indeed seems to have given the 'plausible $100B ARR breakdown' under the assumption that OpenAI is the leading company in 2027, but that doesn't mean the two are supposed to be equal, either in their own revenue forecast or in any of the AI 2027 work.
- FutureSearch's estimate of paid subscribers for April 2025 was 27 million; the actual figure is 20 million. They justify high expected consumer growth with reference to the month-on-month increase in unpaid users from December 2024 -> February 2025. Data from Semrush replicates that increase, but also shows that traffic has since declined rather than continuing to increase.
The exact breakdown FutureSearch use seems relatively unimportant compared to the high level argument that the headline (1) $/month and (2) no. of subscribers, very plausibly reaches the $100B ARR range, given the expected quality of agents that they will be able to offer.
- Looking at market size estimates, FutureSearch seems to implicitly assume that OpenAI will achieve a near-monopoly on Agents, the same way they have for Consumer subscriptions. Enterprise sales are significantly different from consumer signups, and OpenAI doesn't currently have a significant technical advantage.
I don't think a monopoly is necessary, there's a significant OpenBrain lead-time in the scenario, and I think it seems plausible that OpenBrain would convert that into a significant market share.
- ^
Not exactly equal since maybe the leading company in AI capabilities (measured by AI R&D prog. multiplier), i.e., OpenBrain, is not the one making the most revenue.
↑ comment by Randaly · 2025-04-12T05:30:57.420Z · LW(p) · GW(p)
Thanks for the response!
So maybe I should just ask whether you are conditioning on the capabilities progression or not with this disagreement? Do you think $140b in 2027 is implausible even if you condition on the AI 2027 capability progression?
I am conditioning on the capabilities progression.
Based on your later comments, I think you are expecting a much faster/stronger/more direct translation of capabilities into revenue than I am- such that conditioning on faster progress makes more of a difference.
The exact breakdown FutureSearch use seems relatively unimportant compared to the high level argument that the headline (1) $/month and (2) no. of subscribers, very plausibly reaches the $100B ARR range, given the expected quality of agents that they will be able to offer.
Sure, I disagree with that too. I recognize that most of the growth comes from the Agents category rather than the Consumer category, but overstating growth in the only period we can evaluate is evidence that the model or intuition will also overstate growth of other types in other periods.
I don't think a monopoly is necessary, there's a significant OpenBrain lead-time in the scenario, and I think it seems plausible that OpenBrain would convert that into a significant market share.
OpenBrain doesn't actually have a significant lead time by the standards of the "normal" economy. The assumed lead time is "3-9 months"; both from my very limited personal experience (involved very tangentially in 2 such sales attempts) and from checking online, enterprise sales in the 6+ digits range often take longer than that to close anyways.
I'm suspicious that both you and FutureSearch are trying to apply intuitions from free-to-use, consumer-focused software companies to massive enterprise SaaS sales. (FutureSearch compares OpenAI with Google, Facebook, and TikTok.) Beyond the length of sales cycles, another difference is that enterprise software is infamously low quality; there are various purported causes, but relevant ones include principal-agent problems: the people making decisions have trouble evaluating software, won't necessarily be using it themselves, and care more about things other than technical quality: "Nobody ever got fired for buying IBM".