So it’s very possible (albeit unlikely) that the total number of GPUs used for GPT-4 training could have been higher than 15,000!
OAers have noted that the cluster has, of course, been expanded heavily since the original 10k (though at the time it was not yet what it is now).
Morgan Stanley is saying that GPT-5 is being trained right now on 25,000 GPUs, up heavily from the original 10k, and implying that ‘most’ of the GPT-5 GPUs were used for GPT-4, which finished ‘some time ago’; the mean of 10k & 25k is 17.5k, so >15k seems entirely possible, especially if those GPUs weren’t just installed.
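As a rough sanity check (my own back-of-envelope assumption of roughly linear cluster growth over the GPT-4 training window, not a reported figure), the time-averaged GPU count would sit at the midpoint of the two endpoints:

$$\bar{N} \approx \frac{N_{\text{start}} + N_{\text{end}}}{2} = \frac{10{,}000 + 25{,}000}{2} = 17{,}500 > 15{,}000$$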