You assume that in 2023 “The multimodal transformers are now even bigger; the biggest are about half a trillion parameters”, while GPT-3 had 175 billion parameters in 2020 (and was not multimodal). That is roughly 3× growth in 3 years, compared with an order-of-magnitude jump in about 3 months just before GPT-3. So you assume a significant slowdown in parameter growth.
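As a sanity check on those figures, here is a minimal sketch of the implied growth rates. The pre-GPT-3 “order of magnitude in ~3 months” is read here as Turing-NLG (17B, Feb 2020) → GPT-3 (175B, May 2020); that reading, and treating the ~500B multimodal model as the 2023 endpoint, are assumptions for illustration, not claims from the original comment:

```python
# Rough sanity check on the growth factors quoted above.
# Assumed readings (not from the original comment):
#   - pre-GPT-3 burst: Turing-NLG (17B, Feb 2020) -> GPT-3 (175B, May 2020)
#   - 2023 endpoint: the ~500B multimodal model from the quoted prediction

def annualized_growth(start_params: float, end_params: float, years: float) -> float:
    """Equivalent per-year multiplicative growth factor for the given jump."""
    return (end_params / start_params) ** (1.0 / years)

# Pre-GPT-3 burst: ~10x in about a quarter of a year.
pre_gpt3 = annualized_growth(17e9, 175e9, 0.25)

# Predicted 2020 -> 2023 growth: ~3x in 3 years.
predicted = annualized_growth(175e9, 500e9, 3.0)

print(f"Pre-GPT-3 pace, extrapolated: ~{pre_gpt3:,.0f}x per year")
print(f"Predicted 2020-2023 pace:     ~{predicted:.1f}x per year")
```

Extrapolating the 3-month burst gives an absurdly large per-year factor (~10,000×), while the predicted path works out to roughly 1.4× per year, which is what makes it a “significant slowdown.”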
I heard a rumor that GPT-4 could be as large as 32 trillion parameters. If it turns out to be true, will that affect your prediction?
Indeed, my median future involves a significant slowdown in dense-network parameter growth.
If there is a 32 trillion parameter dense model by 2023, I’ll be surprised and update towards shorter timelines, unless it turns out to be underwhelming compared to the performance predicted by the scaling trends.
What would your new median be? (If you do observe a 32-trillion-parameter model in 2023.)
Hard to say; it depends a lot on the rest of the details. If the performance is as good as the scaling trends would predict, it’ll be almost human-level at text prediction, multiple-choice questions on diverse topics, and so forth. After fine-tuning it would probably be a beast.
I suppose I’d update my 50% mark to, like, 2027 or so? IDK.
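For concreteness, here is a rough sketch of what “as good as the scaling trends would predict” could mean, using the pre-Chinchilla power-law fit of loss versus non-embedding parameter count from Kaplan et al. (2020). The fitted constants are from that paper; the 32T figure is just the rumored size above, and the fit assumes training to convergence on ample data, so treat the output as an illustrative upper bound rather than a forecast:

```python
# Sketch: loss predicted by the Kaplan et al. (2020) parameter scaling law,
#   L(N) = (N_c / N) ** alpha_N
# with the paper's fitted constants. Pre-Chinchilla, data/compute limits
# are ignored, so this is an optimistic illustration, not a prediction.

ALPHA_N = 0.076   # fitted exponent (Kaplan et al. 2020)
N_C = 8.8e13      # fitted constant, in non-embedding parameters

def predicted_loss(n_params: float) -> float:
    """Cross-entropy loss (nats/token) given by the power-law fit."""
    return (N_C / n_params) ** ALPHA_N

for name, n in [("GPT-3 (175B)", 175e9), ("rumored 32T", 32e12)]:
    print(f"{name:>14}: ~{predicted_loss(n):.2f} nats/token")
```

Plugging in gives roughly 1.6 nats/token at 175B versus roughly 1.1 at 32T, i.e. the kind of gap that motivates the “almost human-level at text prediction” expectation above.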
I get the idea. I would also update to a very short timeline (4-5 years) given no slowdown in dense-network parameter growth and performance that keeps following the scaling trend. And I was pretty scared when GPT-3 was released. Like many here, I expected further growth in that direction very soon, which did not happen. So I am less scared now.
This was all well before the Chinchilla scaling paper, but it has still turned out to be absolutely true by 2023: we have PaLM-E 540B, just for starters.