GPT-5 should be released late 2025 at the earliest if OpenAI follows the usual naming convention of roughly 100x in raw compute per full version increment. With GPT-4 at 2e25 FLOPs, GPT-4.5 should have about 2e26 FLOPs and GPT-5 about 2e27 FLOPs. A 100K H100 training system, like the one in Goodyear (or Musk’s Memphis datacenter as it was in late 2024), can train a 3e26 FLOPs model, which fits the name GPT-4.5, but it can’t train a 2e27 FLOPs model.
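As a sanity check, the arithmetic above can be reproduced in a few lines; the per-chip throughput, MFU, and run length below are my assumed round numbers, not figures from any announcement:

```python
# Back-of-envelope check of the 100x naming convention and what a
# 100K H100 cluster can train. Assumptions (mine): ~1e15 FLOP/s dense
# BF16 per H100, ~35% model FLOPs utilization (MFU), ~4-month run.

GPT4_FLOPS = 2e25               # rumored GPT-4 pretraining compute
STEP = 10                       # 10x per half-generation, 100x per generation

gpt45_target = GPT4_FLOPS * STEP      # 2e26 FLOPs
gpt5_target = GPT4_FLOPS * STEP**2    # 2e27 FLOPs

h100_peak = 1e15                # FLOP/s per chip (assumption)
mfu = 0.35                      # utilization (assumption)
run_seconds = 4 * 30 * 86400    # ~4 months of pretraining

cluster_flops = 100_000 * h100_peak * mfu * run_seconds
print(f"targets: GPT-4.5 {gpt45_target:.0e}, GPT-5 {gpt5_target:.0e}")
print(f"100K H100s, 4 months: {cluster_flops:.1e} FLOPs")
# ~3.6e26: enough for a 2e26 GPT-4.5, well short of a 2e27 GPT-5
```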
The new Stargate site in Abilene might be preparing to host 200K-300K chips in GB200 NVL72 racks. Each of these chips produces about 2.5x the compute of an H100, so 200K of them would be sufficient to get 2e27 FLOPs and train a GPT-5. If there’s already enough power (about 400 MW all-in for 200K chips), bulk GB200 shipments start in early 2025, the chips get installed at xAI’s pace and go into pretraining for 4 months, then with 1 more month of post-training it’s already November.
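The same sketch for the Abilene numbers, keeping the 2.5x-per-chip figure from the paragraph above and treating everything else as my assumption:

```python
# Same back-of-envelope for a 200K-chip GB200 NVL72 system.
gb200_peak = 2.5 * 1e15         # FLOP/s per chip, 2.5x an H100 (as above)
mfu = 0.35                      # assumption
run_seconds = 4 * 30 * 86400    # ~4 months of pretraining

print(f"200K GB200, 4 months: {200_000 * gb200_peak * mfu * run_seconds:.1e} FLOPs")
# ~1.8e27, i.e. roughly the 2e27 GPT-5 target

# Power cross-check: 400 MW all-in across 200K chips is 2 kW per chip,
# which has to cover the GPU plus its CPU share, networking, and cooling.
print(f"all-in power per chip: {400e6 / 200_000:.0f} W")

# Timeline: bulk shipments in early 2025 + installation + 4 months of
# pretraining + 1 month of post-training lands around November 2025.
```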
So the rumors about GPT-5 in late May 2025 either represent a change in the naming convention, or correspond to some intermediate milestone in training GPT-5, likely the training system being in principle ready to start pretraining.
Per Altman:

“In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model.”
I think he’s pretty plainly saying that this “GPT-5” will be a completely different thing from a 100x’d GPT-4.
This is perfectly consistent with GPT-5 being 100x GPT-4 compute. Announcing specific features that will go into it suggests they have a prototype; in this case, I’m guessing the LLM will itself be trained to decide whether to go into the reasoning mode, triggering it when needed and affordable, like any other tool.
I don’t see it. He says that GPT-5 will be a system that “integrates o3”. This isn’t his sloppy way of saying “integrates the reasoning techniques”: when he wants to express that idea, he talks about “unifying o-series models and GPT-series models”. The wording regarding GPT-5 is consistent with him literally saying that the model o3 will be part of GPT-5.
Furthermore, I take “as” in “GPT-5 as a system that integrates a lot of our technology” to mean “GPT-5 is defined as {a system that integrates a lot of our technology, including o3}”. Not “GPT-5 will be trained to automatically switch between a standard mode, a reasoning mode, a Deep Research mode, etc.”, not even “GPT-5 will be trained to recognize when to fall back to o3, a lesser model”, but literally “we’re slapping the GPT-5 label on a glorified wrapper over all our current models”.
The “glorified wrapper” could still be a 2e27 FLOPs model; it could even use literal o3 as one of its tools (in addition to all the other tools, with native GPT-5 long reasoning mostly reserved for premium tier). This is in line with the “agents” agenda, where better reliability in taking irreversible actions unlocks new use cases, in this case deciding whether to make use of expensive reasoning calls.
Since “GPT-4.5” will actually be released rather than skipped, it’s less plausible for “GPT-5” to come out shortly after. If it’s announced in ~Dec 2025 (the way o3 was), it’s still “within months”, and then it can actually get released in ~Feb 2026.
Hm, fair enough. Seems like a stretch, though, especially given the need to interpret his “ETA in months” as “will be officially announced in months and released in a year”.
There was also Murati in Jun 2024 predicting PhD-level AI in 18 months. If they succeed in achieving parity with xAI in terms of safety procedures, they might even release a preview checkpoint in Dec 2025 for Pro users. So an actual release in a year is not strictly necessary for this hypothesis; it’s just closer to what they’ve done in the past.
I doubt this is a real convention. I think OpenAI wanted to call Orion GPT-5 if they thought it was good enough to deserve the name.
I’m merely referring to the historical precedent; whether there are informal commitments in the minds of the leadership is not something I can speak to. This pattern might continue or it might break. What I’m guessing about the training system buildout from vague clues seems consistent with it continuing, so the naming pattern can be used as another clue toward a more concrete point-estimate prediction.