GPT-3 to GPT-4 took 3 years. Why is it surprising that the training run for GPT-5 has not yet started?
The crucial thing is that OpenAI never stopped developing, training, and tweaking GPT-3 during that time, and its capabilities made significant progress. They certainly won't stop pouring compute, data, and algorithmic ingenuity into GPT-4 until they feel that compute and algorithms have reached a point where training a completely new model from scratch makes sense.
Given that the image perception functionality of GPT-4 hasn’t even been rolled out yet, they also haven’t been able to collect feedback on which parts need tweaking most.
One could argue that the takeaway from this piece of information is that the capabilities jump that will come with GPT-5 is not as imminent as it would be if training had already started. But I strongly suspect that the later the training starts, the larger the jump will be.
I wouldn't be worried about a GPT-5 that had already started training; I would expect it to be a bigger GPT-4. I'm definitely more worried about a GPT-5 that only starts training after the lessons of massive-scale deployment and a thousand different use cases have been incorporated into its architecture.