Zachary's Shortform

post by Zachary · 2024-06-25T17:40:02.212Z · LW · GW · 7 comments

Comments sorted by top scores.

comment by Zachary · 2024-06-25T17:40:02.325Z · LW(p) · GW(p)

I have moderately strong evidence that OpenAI has pushed back GPT-5 to late 2025 (not naming source for confidentiality reasons). Conditional on this being true:

What do you think the most likely explanation is as to why it's being delayed?
How would this affect your AI timelines?
What impact would you expect this news to have on AI relevant stocks?

Replies from: Vladimir_Nesov, p.b., Decaeneus

↑ comment by Vladimir_Nesov · 2024-06-25T19:43:51.132Z · LW(p) · GW(p)

Since they said on May 28 that they are training the next frontier model now, tests on an intermediate checkpoint probably indicate it's not worthy of the moniker "GPT-5" (which was hyped as a significant advance), and late 2025 is the plan for deployment of an even bigger model that's scaled one step further than that. So this is evidence that the late-2024/early-2025 deployment model finishing training now will be called something else, like GPT-4.5. This also agrees with the recent iterative deployment buzz: a sudden GPT-5 worthy of the name would be discordant with it.

(If the currently training model was turning out too strong instead, other labs would also be approaching similarly powerful models, in which case having plans to delay to a specific distant date but not further seems strange. If training was getting unstable late into the training run, it might be too early to call a specific delay by at least half a year relative to prior plans.)

Replies from: niplav

↑ comment by niplav · 2024-06-25T22:28:54.162Z · LW(p) · GW(p)

Could also be that the next model is just going to take a bunch of time to train & test & fine-tune, which with GPT-4 already took 6 months + 7 months. Given that this is a bigger and more advanced model, they might just be Hofstadter's Law-ing their deployment pipeline.

Maybe they're even planning for some more time for safety-testing? A man can hope.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2024-06-25T22:58:05.200Z · LW(p) · GW(p)

These are considerations about prior plans, not change of plans caused by recent events ("pushed back GPT-5 to late 2025"). They don't necessarily need much more compute than for other recent projects either, just ease up on massive overtraining to translate similar compute into more capability at greater inference cost, and then catch up on efficiency with "turbo" variants later.

↑ comment by p.b. · 2024-06-25T17:55:31.022Z · LW(p) · GW(p)

Mira Murati said publicly that "next gen models" will come out in 18 months, so your confidential source seems likely to be correct.

Replies from: Zachary

↑ comment by Zachary · 2024-06-25T20:58:32.405Z · LW(p) · GW(p)

It's very possible that Murati's talk at Dartmouth was my source's source, i.e. the embedded video around 13:30. She doesn't say GPT-5 specifically but does sort of imply it by mentioning the jump from GPT-3 to GPT-4, then says "And then in the next couple of years we're looking at PhD-level intelligence for specific tasks...Yeah, a year and a half let's say."

↑ comment by Decaeneus · 2024-06-26T15:50:46.180Z · LW(p) · GW(p)

Things slow down when Ilya isn't there to YOLO in the right direction in an otherwise very high-dimensional space.
