Since they said they are training the next frontier model now (which was on May 28), probably tests on an intermediate checkpoint indicate it’s not worthy of the moniker “GPT-5” (which was hyped as a significant advance), and late 2025 is the plan for deployment of an even bigger model that’s scaled one step further than that. So this is evidence that the late-2024/early-2025 deployment model finishing training now will be called something else, like GPT-4.5. This also agrees with the recent iterative deployment buzz; a sudden GPT-5 worthy of the name would be discordant with it.
(If the currently training model was turning out too strong instead, other labs would also be approaching similarly powerful models, in which case it would seem strange to plan a delay to a specific distant date but no further. If training was getting unstable late into the training run, it might be too early to call a specific delay of at least half a year relative to prior plans.)
Could also be that the next model is just going to take a bunch of time to train & test & fine-tune, which with GPT-4 already took 6 months + 7 months. Given that this is a bigger and more advanced model, they might just be Hofstadter’s Law-ing their deployment pipeline.
Maybe they’re even planning for some more time for safety-testing? A man can hope.
These are considerations about prior plans, not change of plans caused by recent events (“pushed back GPT-5 to late 2025”). They don’t necessarily need much more compute than for other recent projects either, just ease up on massive overtraining to translate similar compute into more capability at greater inference cost, and then catch up on efficiency with “turbo” variants later.