Is OpenAI losing money on each request?

post by thenoviceoof · 2023-12-01T03:27:23.929Z · LW · GW · No comments

This is a question post.

Contents

  Other Factors
  Answers
    14 jacob_cannell
    11 johnswentworth
    4 RogerDearnaley
    2 Nathan Helm-Burger

While working on another post, I decided to follow up on some details by doing some naive modeling of OpenAI's LLM API revenue stream. The naive approach seems inadequate, because it implies OpenAI requires many years to break even just on the cost of GPUs.

Other Factors

Answers

answer by jacob_cannell · 2023-12-01T05:43:47.334Z · LW(p) · GW(p)

OpenAI's prices seem too low to recoup even part of their capital costs in a reasonable time given the volatile nature of the AI industry. Surely I'm missing something obvious?

Yes: batching. Efficient GPU inference uses matrix-matrix multiplication, not vector-matrix multiplication.
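To see why batching matters so much, here is a toy arithmetic-intensity calculation for a single weight matrix. All numbers are illustrative: `d` is set to GPT-3's published hidden dimension, which says nothing about what OpenAI actually serves, and fp16 weights are an assumption.

```python
# Toy arithmetic-intensity sketch for one d x d weight matrix stored in fp16
# (2 bytes per weight). Illustrative only, not OpenAI's actual configuration.
def arithmetic_intensity(d, batch):
    flops = 2 * d * d * batch   # one multiply-accumulate (2 FLOPs) per weight per token
    bytes_moved = 2 * d * d     # the weights are read from memory once, shared by the batch
    return flops / bytes_moved  # FLOPs performed per byte of weight traffic

d = 12288  # GPT-3 175B's published hidden dimension
print(arithmetic_intensity(d, 1))    # 1.0 FLOP/byte: hopelessly memory-bandwidth-bound
print(arithmetic_intensity(d, 64))   # 64.0 FLOPs/byte: much closer to compute-bound
```

The ratio scales linearly with batch size, so serving 64 users at once makes each token roughly 64x cheaper in weight-memory traffic than serving one user at a time.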

answer by johnswentworth · 2023-12-01T17:17:09.043Z · LW(p) · GW(p)

+1 to Cannell's answer, and I'll also add pipelining.

Let's say (one instance of) the system is distributed across 10 GPUs, arranged in series - to do a forward pass, the first GPU does some stuff, passes its result to the second GPU, which passes to the third, etc. If only one user at a time were being serviced, then 90% of those GPUs would be idle at any given time. But pipelining means that, once the first GPU in line has finished one request (or, realistically, batch of requests), it can immediately start on another batch of requests.

More generally: the rough estimate in the post above tries to estimate throughput from latency, which doesn't really work. Parallelism/pipelining mean that latency isn't a good way to measure throughput, unless we also know how many requests are processed in parallel at a time.
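A minimal sketch of the latency/throughput distinction, using made-up numbers (10 pipeline stages, 5 ms per stage):

```python
# Steady-state pipeline: latency is the sum of the stage times, but throughput
# is set by a single stage time, because every stage is busy with a different batch.
def pipeline_stats(n_stages, stage_ms):
    latency_ms = n_stages * stage_ms   # time for one batch to traverse all GPUs
    batches_per_s = 1000.0 / stage_ms  # one batch completes every stage interval
    return latency_ms, batches_per_s

lat, thr = pipeline_stats(10, 5.0)
print(lat, thr)  # 50.0 200.0
```

Dividing 1 second by the 50 ms latency would suggest only 20 batches/s, a 10x underestimate of the actual 200 batches/s steady-state throughput, which is exactly the error made when estimating throughput from latency alone.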

(Also I have been operating under the assumption that OpenAI is not profitable at-the-margin, and I'm curious to see an estimate.)

answer by RogerDearnaley · 2023-12-02T02:47:55.628Z · LW(p) · GW(p)

It seems very unlikely that they're running their models at 32-bit precision. 8-bit seems more likely, or at most 16-bit. And yes, obviously batching and pipelining, and probably things comparable to all the attention-cost improvements that have been going on on the open-source side (if they didn't invent them in parallel, they'll certainly adopt them). Plus they mostly run Turbo models now: recent rumors about projects named Arrakis and Gobi plus the launch of GPT-4 Turbo suggest that making inference more efficient is very important to them.
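For a sense of scale, here is the weight-memory arithmetic at different precisions, using GPT-3's published 175B parameter count as a stand-in (GPT-4's size and OpenAI's serving precision are not public):

```python
# Weight memory alone (ignoring KV cache and activations), in GiB.
def weight_gib(n_params, bytes_per_param):
    return n_params * bytes_per_param / 2**30

n = 175e9  # GPT-3's published parameter count, used purely for illustration
print(round(weight_gib(n, 4)))  # fp32: 652 GiB
print(round(weight_gib(n, 2)))  # fp16: 326 GiB
print(round(weight_gib(n, 1)))  # int8: 163 GiB
```

Going from fp32 to int8 cuts the weight footprint 4x, which directly translates into fewer (or smaller) GPUs per serving replica.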

Despite all that, I still wouldn't be surprised if they were charging below cost, but I suspect they're charging a price around where they think they can soon(ish) reduce inference costs to, between algorithmic improvements and Moore's Law for GPUs.

Basically, they're a start-up: they don't need to be profitable yet, they need to persuade their investors that they have a credible plan for reaching profitability in the next few years.

answer by Nathan Helm-Burger · 2023-12-01T05:13:43.106Z · LW(p) · GW(p)

I think they might be loss-leading to overcome status-quo bias: the default of not using a model at all. Once companies pay the cost of incorporating LLMs into their workflows, I see no reason why OpenAI can't just increase the price. I think this might happen by simply releasing a new, improved model at a much higher price. If everyone is already using and benefiting from the old model, and the new one is clearly better, the higher price will be easier to justify as a good investment for businesses.

comment by O O (o-o) · 2023-12-01T05:51:31.462Z · LW(p) · GW(p)

With basically a blank check from VC, they’ll instead invest in making their models and infra more efficient/better instead of raising prices. They can run a large loss for a very long time.

Replies from: korin43
comment by Brendan Long (korin43) · 2023-12-02T00:23:01.332Z · LW(p) · GW(p)

Why though? They have a capped profit model (theoretically) so there's less value in this strategy, and their biggest investor would probably prefer that people use Bing instead.

Replies from: None, o-o
comment by [deleted] · 2023-12-02T00:31:13.699Z · LW(p) · GW(p)

General AI services are a natural monopoly: there is a large fixed cost to develop a competitive model, and a much lower marginal cost to deliver it.
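That fixed-vs-marginal structure can be sketched with hypothetical numbers; the $100M training run and $0.001 marginal cost per request below are assumptions for illustration, not known figures for any lab:

```python
# Average cost per request falls toward marginal cost as volume grows,
# the defining cost curve of a natural monopoly.
def avg_cost(fixed, marginal, n_requests):
    return fixed / n_requests + marginal

print(avg_cost(100e6, 0.001, 1e9))    # ~0.101: fixed cost dominates for a small player
print(avg_cost(100e6, 0.001, 100e9))  # ~0.002: the volume leader can undercut everyone
```

At 100x the volume, the average cost per request drops by roughly 50x, so the biggest provider can profitably price below any smaller competitor's break-even point.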

The best* model will have the most paying customers. It's a monopoly flywheel: the niche occupant reinvests in the most compute and the best engineers for model improvement, the N+1 model is even more dominant, and so on.

There is a second network effect involved in hosting platforms for AI services. This can be an even stronger monopoly. Assuming the "app store" has some common copyrighted APIs for intercommunication between AI tools, it could make it impractical for companies offering models on the store to sell their wares anywhere else. This sends revenue to the monopoly platform owner even after they no longer offer the best model.

OpenAI seems to be pursuing both avenues like any for profit startup would. Their board has recently voted to lift the profit cap by 20 percent per year. ( https://www.economist.com/business/2023/11/21/inside-openais-weird-governance-structure )

*Refusing certain services, and refusing to offer long-term guarantees, such as forever access to a frozen-weight model, means OpenAI is leaving the door open to be evicted from this market niche.

comment by O O (o-o) · 2023-12-02T00:58:53.185Z · LW(p) · GW(p)

The news is that the cap grows 20% a year, so it will effectively last until AGI.
