Could also be that the next model is just going to take a bunch of time to train & test & fine-tune, which with GPT-4 already took 6 months + 7 months. Given that this is a bigger and more advanced model, they might just be Hofstadter's Law-ing their deployment pipeline.
Maybe they’re even planning for some more time for safety-testing? A man can hope.
These are considerations about prior plans, not a change of plans caused by recent events ("pushed back GPT-5 to late 2025"). They don't necessarily need much more compute than for other recent projects either: just ease up on massive overtraining to translate similar compute into more capability at greater inference cost, then catch up on efficiency with "turbo" variants later.
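
For intuition on that overtraining tradeoff, here's a rough back-of-the-envelope sketch using a Chinchilla-style loss fit (constants are the published Hoffmann et al. 2022 estimates; the compute budget and model sizes are made-up illustrative numbers, not anything we know about OpenAI's plans). At a fixed training-compute budget, shifting FLOPs from tokens to parameters buys lower predicted loss up to the compute-optimal point, but every served token then costs roughly 2N FLOPs more:

```python
# Back-of-the-envelope sketch of the overtraining tradeoff.
# Loss fit L(N, D) = E + A/N^alpha + B/D^beta with the Hoffmann et al. (2022)
# constants; budget and model sizes below are purely illustrative.

E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N, D):
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / N**alpha + B / D**beta

C = 2e25  # fixed training-compute budget in FLOPs (made up for illustration)

def tokens_for(N, C=C):
    """Training tokens affordable at budget C, using the C ~= 6*N*D rule of thumb."""
    return C / (6 * N)

# Same training compute, different parameter counts: the bigger (less
# overtrained) model gets lower loss, but costs more per generated token.
for N in (2e10, 7e10, 2e11):  # 20B, 70B, 200B parameters
    D = tokens_for(N)
    print(f"N={N:.0e}  D={D:.1e}  loss={loss(N, D):.3f}  "
          f"inference FLOPs/token ~{2 * N:.1e}")
```

The "turbo" catch-up move is then the reverse direction: recover the inference efficiency later via smaller or distilled variants once the capability point has been made.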