I have moderately strong evidence that OpenAI has pushed back GPT-5 to late 2025 (not naming source for confidentiality reasons). Conditional on this being true:
What do you think the most likely explanation is as to why it’s being delayed?
How would this affect your AI timelines?
What impact would you expect this news to have on AI relevant stocks?
Since they said they are training the next frontier model now (which was on May 28), probably tests on an intermediate checkpoint indicate it’s not worthy of the moniker “GPT-5” (which was hyped as a significant advance), and late 2025 is the plan for deployment of an even bigger model that’s scaled one step further than that. So this is evidence that the late-2024/early-2025 deployment model finishing training now will be called something else, like GPT-4.5. This also agrees with the recent iterative deployment buzz; a sudden GPT-5 worthy of the name would be discordant with it.
(If the currently training model were turning out too strong instead, other labs would also be approaching similarly powerful models, in which case having plans to delay to a specific distant date but not further seems strange. And if training were getting unstable late into the run, it would be too early to commit to a specific delay of at least half a year relative to prior plans.)
Could also be that the next model is just going to take a bunch of time to train & test & fine-tune, which with GPT-4 already took 6 months + 7 months. Given that this is a bigger and more advanced model, they might just be Hofstadter’s Law-ing their deployment pipeline.
Maybe they’re even planning for some more time for safety-testing? A man can hope.
These are considerations about prior plans, not a change of plans caused by recent events (“pushed back GPT-5 to late 2025”). They don’t necessarily need much more compute than for other recent projects either: just ease up on massive overtraining to translate similar compute into more capability at greater inference cost, and then catch up on efficiency with “turbo” variants later.
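To make the overtraining tradeoff concrete, here’s a minimal back-of-the-envelope sketch. It assumes the standard Chinchilla approximations (training compute C ≈ 6·N·D for N parameters and D tokens, compute-optimal at roughly 20 tokens per parameter, inference cost ≈ 2N FLOPs per token) and an illustrative training budget; none of these numbers are claims about OpenAI’s actual models.

```python
import math

def model_for_budget(train_flops, tokens_per_param):
    """Given a training budget C ≈ 6*N*D (Chinchilla approximation) and a
    chosen tokens-per-parameter ratio D = r*N, solve for N and D."""
    n_params = math.sqrt(train_flops / (6 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

C = 2e25  # illustrative training budget in FLOPs (assumed, not a real figure)

# Heavily overtrained, cheap to serve: ~150 tokens per parameter
n_over, d_over = model_for_budget(C, 150)
# Roughly compute-optimal (Chinchilla's ~20 tokens per parameter)
n_opt, d_opt = model_for_budget(C, 20)

print(f"overtrained:    {n_over:.2e} params, {d_over:.2e} tokens")
print(f"compute-optimal: {n_opt:.2e} params, {d_opt:.2e} tokens")
# Same training budget, but the compute-optimal model has sqrt(150/20) ≈ 2.7x
# the parameters -- and therefore ~2.7x the inference FLOPs per token (≈ 2N).
print(f"param ratio: {n_opt / n_over:.2f}x")
```

So the same training budget buys a larger (and, per scaling laws, more capable) model if you stop overtraining, at the price of serving cost per token scaling up with it, which is exactly the “more capability at greater inference cost, turbo variants later” move.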
Mira Murati said publicly that “next gen models” will come out in 18 months, so your confidential source seems likely to be correct.
It’s very possible that Murati’s talk at Dartmouth was my source’s source, i.e. the embedded video around 13:30. She doesn’t say GPT-5 specifically but does sort of imply that by mentioning the jump from GPT-3 to GPT-4, then says “And then in the next couple of years we’re looking at PhD-level intelligence for specific tasks...Yeah, a year and a half let’s say”
Things slow down when Ilya isn’t there to YOLO in the right direction in an otherwise very high-dimensional space.