Since they said they are training the next frontier model now (which was on May 28), probably tests on an intermediate checkpoint indicate it’s not worthy of the moniker “GPT-5” (which was hyped as a significant advance), and late 2025 is the plan for deployment of an even bigger model that’s scaled one step further than that. So this is evidence that the late-2024/early-2025 deployment model finishing training now will be called something else, like GPT-4.5. This also agrees with the recent iterative deployment buzz; a sudden GPT-5 worthy of the name would be discordant with it.
(If the currently training model was turning out too strong instead, other labs would also be approaching similarly powerful models, in which case it would seem strange to plan a delay to a specific distant date but no further. If training was getting unstable late into the training run, it might be too early to call a specific delay of at least half a year relative to prior plans.)
Could also be that the next model is just going to take a bunch of time to train & test & fine-tune, which with GPT-4 already took 6 months + 7 months. Given that this is a bigger and more advanced model, they might just be Hofstadter’s Law-ing their deployment pipeline.
Maybe they’re even planning for some more time for safety-testing? A man can hope.
These are considerations about prior plans, not change of plans caused by recent events (“pushed back GPT-5 to late 2025”). They don’t necessarily need much more compute than for other recent projects either, just ease up on massive overtraining to translate similar compute into more capability at greater inference cost, and then catch up on efficiency with “turbo” variants later.