“We are not currently training GPT-5. We’re working on doing more things with GPT-4.” – Sam Altman at MIT
Count me surprised if they’re not working on GPT-5. I wonder what’s going on with this?
I saw rumors that this is because they’re waiting on supercomputer improvements (H100s?), but I would have expected at least early work, like establishing their GPT-5 scaling laws and whatnot; there’s a sketch of what that kind of fit looks like below. If so, perhaps they’re working on it and just haven’t started what would be considered the main training run?
I’d be interested to know whether Sam said any other relevant details in that talk, if anyone knows.
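For anyone unfamiliar, “establishing scaling laws” here just means doing many small training runs and fitting a curve that predicts loss at larger scales before committing to the expensive run. Below is a minimal, purely illustrative sketch of that idea using the Chinchilla-style parametric form L(N, D) = E + A/N^α + B/D^β from Hoffmann et al. (2022); the “measurements” are placeholders generated from the published Chinchilla constants, and nothing in it is specific to OpenAI or GPT-5.

```python
import numpy as np
from scipy.optimize import curve_fit

# Chinchilla-style parametric loss: L(N, D) = E + A / N**alpha + B / D**beta,
# where N is parameter count and D is training tokens (Hoffmann et al. 2022).
def parametric_loss(ND, E, A, alpha, B, beta):
    N, D = ND
    return E + A / N**alpha + B / D**beta

# Pretend "small run" grid. The losses are synthetic placeholders generated
# from the published Chinchilla fit (E~1.69, A~406.4, alpha~0.34, B~410.7,
# beta~0.28) just so the example runs -- they are not real measurements.
N_grid = np.array([1e8, 3e8, 1e9, 3e9, 1e10])   # parameters
D_grid = np.array([2e9, 2e10, 2e11])            # training tokens
N_runs, D_runs = (a.ravel() for a in np.meshgrid(N_grid, D_grid))
L_runs = parametric_loss((N_runs, D_runs), 1.69, 406.4, 0.34, 410.7, 0.28)

# Fit the parametric form to the small-run results...
popt, _ = curve_fit(parametric_loss, (N_runs, D_runs), L_runs,
                    p0=[2.0, 400.0, 0.3, 400.0, 0.3], maxfev=20000)

# ...then extrapolate to a hypothetical much larger run before paying for it.
N_big, D_big = 1e12, 2e13
print("predicted loss at the big run:",
      parametric_loss((N_big, D_big), *popt))
```

The hard part, of course, is whether a curve fitted at small scale keeps holding two-plus orders of magnitude out, which is presumably what that kind of early work would be trying to establish.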
I’m not sure if you’ve seen it or not, but here’s a relevant clip where he mentions that they aren’t training GPT-5. I don’t quite know how to update on it. It doesn’t seem likely that they paused out of a desire to do more safety work, but I would also be surprised if they’re somehow hitting a performance limit from model size.
However, as Zvi mentions, Sam did say: “I think we’re at the end of the era where it’s going to be these, like, giant, giant models. We’ll make them better in other ways.”
The expectation is that GPT-5 would be the next GPT-N, with roughly 100x the training compute of GPT-4, but that would probably cost tens of billions of dollars, so GPT-N scaling is over for now.
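As a rough sanity check on that cost figure: the only public anchor I know of is Altman’s reported comment that GPT-4 cost more than $100 million to train, so here’s a back-of-envelope sketch under loudly assumed numbers (the GPT-4 cost, the 100x multiplier, and the hardware-efficiency discount are all assumptions). The point is the order of magnitude, not the exact figure.

```python
# Order-of-magnitude sketch of what "100x GPT-4 training compute" might cost.
# Assumptions, not facts: GPT-4's run cost on the order of $100M (Altman has
# reportedly said "more than $100 million"), and H100-class hardware roughly
# halves the cost per unit of training compute versus the GPT-4 cluster.

gpt4_cost_usd = 1e8              # assumed ~$100M for GPT-4's training run
compute_multiplier = 100         # GPT-5 imagined as 100x GPT-4's compute
hardware_efficiency_gain = 2.0   # assumed cheaper FLOPs on newer hardware

naive = gpt4_cost_usd * compute_multiplier
with_cheaper_flops = naive / hardware_efficiency_gain

print(f"naive scale-up:       ${naive / 1e9:.0f}B")               # ~$10B
print(f"with cheaper compute: ${with_cheaper_flops / 1e9:.0f}B")  # ~$5B
```

Depending on how you push the assumptions (a higher true GPT-4 cost, no efficiency gain, overhead for failed runs, data, and infrastructure), you land anywhere from several billion to tens of billions, which is the sense in which a straightforward 100x GPT-N jump stops being routine.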