Maybe Sam knows a lot that I don’t, but here are some reasons why I’m skeptical about the end of scaling large language models:
From scaling laws we know that more compute and data reliably lead to better performance, so scaling seems like a low-risk investment.
I’m not sure how much GPT-4 cost to train, but GPT-3 reportedly cost only $5-10 million, which isn’t much for large tech companies (e.g. Meta spends billions on the metaverse every year).
There are limits to how big and expensive supercomputers can be, but I doubt we’re near them. I’ve heard that GPT-4 was trained on ~10,000 GPUs, which is a lot but not an insane amount (~$300m worth of GPUs). If you could pack 100 GPUs per square metre, all 10,000 GPUs would fit in a 10m x 10m room (100 m²). A model trained with millions of GPUs is not inconceivable and is probably technically and economically possible today.
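To make that arithmetic explicit, here is a tiny sketch; the GPU count, price per GPU, and packing density are the same rough assumptions as above, not reported figures:

```python
# Back-of-the-envelope numbers; every constant here is an assumption,
# not a reported figure.
n_gpus = 10_000            # rumoured GPT-4 training cluster size
price_per_gpu = 30_000     # assumed ~$30k per data-centre GPU
gpus_per_m2 = 100          # assumed packing density, racks included

total_cost = n_gpus * price_per_gpu    # -> $300,000,000
floor_area = n_gpus / gpus_per_m2      # -> 100 m^2
side_length = floor_area ** 0.5        # -> 10m x 10m room

print(f"hardware cost: ${total_cost:,}")
print(f"floor area: {floor_area:.0f} m^2 (~{side_length:.0f}m x {side_length:.0f}m)")
```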
Because scaling laws are power laws (loss falls off as a small power of compute, so each fixed improvement requires a multiplicative increase in resources), there are diminishing returns to more compute, but I doubt we’ve reached the point where the marginal cost of training larger models exceeds the marginal benefit. Think of a company like Google: building the biggest and best model is immensely valuable in a global, winner-takes-all market like search.
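As a quick illustration of what "power law" means for returns, here is a toy sketch; the coefficients a and alpha are made up for illustration, not fitted values from any scaling-law paper:

```python
# Illustrative power-law scaling curve: L(C) = a * C**(-alpha).
# a and alpha below are invented, purely to show the shape of the curve.
a, alpha = 10.0, 0.05

def loss(compute):
    """Hypothetical loss as a function of training compute (arbitrary units)."""
    return a * compute ** (-alpha)

# Each 10x increase in compute buys a smaller absolute improvement in loss...
for c in [1e21, 1e22, 1e23, 1e24]:
    print(f"compute={c:.0e}  loss={loss(c):.3f}")
# ...but whether that improvement is worth paying for depends on the value of
# the marginal gain (e.g. in a winner-takes-all market), not just its size.
```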
I’m reaching the same conclusions.
And this is in a world where Google has already announced that they’re going to build an even bigger model of their own.
We have to upgrade our cluster with a fresh batch of Nvidia gadgets.