Thanks for writing this!
I think the crux of your estimate of compute usage is the following line:
In May 2020 (!) Microsoft announced that they had built a supercomputer with 10,000 GPUs for OpenAI, which is often suggested to be the machine GPT-3 was trained on: https://news.microsoft.com/source/features/ai/openai-azure-supercomputer/
So it’s very possible (albeit unlikely) that the number of total GPUs used for GPT-4 training could be higher than 15,000!
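To make it concrete why that line is the crux: under the usual estimation approach, total training compute scales roughly linearly with the number of GPUs, so moving from 10,000 to >15,000 GPUs shifts the whole estimate proportionally. Here is a minimal back-of-the-envelope sketch; the per-GPU throughput, utilization, and training duration are illustrative assumptions, not figures from the post or the comments.

```python
# Back-of-the-envelope training-compute estimate as a function of cluster size.
# All constants below are illustrative assumptions (roughly A100-class hardware),
# not numbers taken from the post or the comments.

PEAK_FLOP_PER_SEC = 312e12   # assumed per-GPU peak throughput (A100 BF16), FLOP/s
UTILIZATION = 0.35           # assumed effective hardware utilization
TRAINING_DAYS = 90           # assumed wall-clock training time

def training_flop(n_gpus: int) -> float:
    """Total training compute (FLOP) for a cluster of n_gpus under the assumptions above."""
    seconds = TRAINING_DAYS * 24 * 3600
    return n_gpus * PEAK_FLOP_PER_SEC * UTILIZATION * seconds

for n_gpus in (10_000, 15_000, 25_000):
    print(f"{n_gpus:>6} GPUs -> ~{training_flop(n_gpus):.2e} FLOP")
```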
Corrections/nitpicks:
GPT-3.5 finished training in early 2022, was released in November 2022, and demonstrated better quality answers than GPT-3. In December 2022, OpenAI released ChatGPT which is based on GPT-3.5 and fine-tuned for conversation.
code-davinci-002 and text-davinci-002 were first released in mid-March 2022, soon after the InstructGPT paper, not November 2022. Source: https://openai.com/blog/gpt-3-edit-insert/ (See also this reddit thread talking about text-davinci-002.) Also, a nitpick: ChatGPT was released November 30th, 2022: https://openai.com/blog/chatgpt/
OAers have noted that the cluster has, of course, been expanded heavily since the original 10k (albeit not what it is now). Morgan Stanley is saying that GPT-5 is being trained right now on 25,000 GPUs, up heavily from the original 10k, and implying that ‘most’ of the GPT-5 GPUs were used for GPT-4 which finished ‘some time ago’; the mean of 10 & 25 is 17.5, so >15k seems entirely possible, especially if those GPUs weren’t just installed.
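The ‘mean of 10 & 25’ step is just linear interpolation between the two reported cluster sizes; a trivial sketch of that arithmetic (the midpoint is only a rough guess, not a claim about the actual GPT-4 cluster):

```python
# Midpoint of the two reported cluster sizes, as in the comment above.
original_cluster = 10_000   # GPUs in the 2020 Azure supercomputer
current_cluster = 25_000    # GPUs reportedly in use for GPT-5 training

midpoint = (original_cluster + current_cluster) / 2
print(midpoint)            # 17500.0
print(midpoint > 15_000)   # True: a >15k-GPU GPT-4 run is at least arithmetically plausible
```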
Thanks for the comment! I updated the paragraph to:
The GPT-3.5 models finished training and were released in 2022, and demonstrated better quality answers than GPT-3. In late 2022, OpenAI released ChatGPT which is based on GPT-3.5 and fine-tuned for conversation.
The March blog post mentions text-davinci-003, but you say only text-davinci-002 was released in March. The latter seems more plausible, since it matches with the newsletter OpenAI sent out at the end of November: “New GPT-3 model: text-davinci-003”.
Starting today, you can access text-davinci-003 through our API and playground at the same price as our other Davinci base language models ($0.0200 / 1k tokens).
So I think the “March” blog post has probably been edited and isn’t decisive evidence that code-davinci-002 (the GPT-3.5 base model) actually came out in March.