I asked numerous people about this while writing the essay, but nobody told me these estimates were off, so I went ahead and used them. I should have known to get in touch with you instead.
I think Cotra says the main cost of training these models is not compute but researcher time & effort. How many researcher-hours would you estimate went into designing and training GPT-3? A $20 million cost estimate implies that a team of 100 people (roughly OpenAI's headcount in 2020) working at $200/hr would each spend about 120 full-time days on the model. That sounds too low to me, but you're likely better informed about this, and a factor of 3 or so isn't going to change much in this context anyway.
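For what it's worth, here is the arithmetic behind that ~120-day figure (a sketch; the $200/hr rate and the 100-person headcount are my assumptions, not reported figures):

```python
# Back-of-envelope check on the researcher-time estimate.
# Assumed inputs (my guesses, not reported figures): total cost, headcount, hourly rate.
total_cost_usd = 20_000_000   # ~$20M attributed to researcher time & effort
team_size = 100               # roughly OpenAI's headcount in 2020
hourly_rate_usd = 200         # assumed loaded cost per researcher-hour
hours_per_day = 8             # one full-time day

hours_per_person = total_cost_usd / (team_size * hourly_rate_usd)
days_per_person = hours_per_person / hours_per_day
print(f"{hours_per_person:.0f} hours ≈ {days_per_person:.0f} full-time days per person")
# -> 1000 hours ≈ 125 full-time days per person
```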
One problem I've had: I mention GPT-3 because it's the only model for which I could find explicit revenue estimates, but I think these hyperbolic growth laws work much better with total investment into an area, since you still benefit from scale effects when working in a field even if you account for only a small fraction of the spending. It was too much work for this piece, but I would really like to see a big spreadsheet of all the large models trained in recent years, with estimates of both their training costs and the revenue they've generated.
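To be explicit about what I mean by a hyperbolic growth law here (this is the generic form, a sketch rather than the essay's exact parametrization): the growth rate is superlinear in the current level,

$$\frac{dX}{dt} = a X^{1+\epsilon}, \qquad a, \epsilon > 0,$$

which integrates to

$$X(t) = \left( a \epsilon \, (t^* - t) \right)^{-1/\epsilon},$$

a finite-time singularity at $t = t^*$. The relevance of scale effects is that $X$ here should be total investment in the field, not any single lab's spending.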
In the end, to manage this, I went with estimates of AI investment & revenue that are on the high end, in the sense that they cover very general notions of "AI" that aren't representative of what the frontier of research actually looks like. I would have preferred to use estimates from the hypothetical spreadsheet above, but unfortunately it doesn't exist.