Thanks! Haven’t found good comments on that paper (and lack the technical insights to evaluate it myself)
Are you implying that China has access to the compute required for a) GPT-4 type models or b) AGI?
Well, they get to run however much compute they do have for 6 more months with no competition. Probably several years, since this pause would obviously get renewed again and again until someone honoring it defects. Note that how enormous a model you can train is a function of total cluster memory and interconnect. Many current clusters have enough memory for theoretically enormous models, 10 trillion weights plus. Having too few GPUs, so that training takes a year or more, is a problem unless your competition is all idle.
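To put rough numbers on the memory point, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a figure from any real cluster: 80 GB of memory per GPU, 2 bytes per weight just to hold the model in 16-bit precision, and roughly 16 bytes per weight for mixed-precision training with an Adam-style optimizer (weights, gradients, and optimizer state). It ignores activations and interconnect entirely, and `gpus_needed` is a made-up helper name.

```python
def gpus_needed(n_params: int, bytes_per_param: int, gpu_mem_gb: int = 80) -> int:
    """Minimum number of GPUs whose combined memory holds the model state.

    Assumes memory is pooled perfectly across GPUs (no replication, no
    activation memory), so this is a lower bound, not a sizing guide.
    """
    total_bytes = n_params * bytes_per_param
    gpu_bytes = gpu_mem_gb * 10**9
    # Integer ceiling division: round up to a whole number of GPUs.
    return -(-total_bytes // gpu_bytes)


n_params = 10 * 10**12  # 10 trillion weights, the figure from the comment

# Assumed 2 bytes per weight: just storing the weights in 16-bit precision.
hold_weights = gpus_needed(n_params, bytes_per_param=2)

# Assumed 16 bytes per weight: weights + gradients + Adam optimizer state.
train_state = gpus_needed(n_params, bytes_per_param=16)

print(f"GPUs to hold weights only: {hold_weights}")
print(f"GPUs to hold training state: {train_state}")
```

Under those assumptions the count comes out in the hundreds to low thousands of GPUs, which is why memory alone is not the binding constraint. Training time is.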
a.