If human-level AI is reached quickly mainly by spending more money on compute (which I understood to be Kokotajlo’s viewpoint; sorry if I misunderstood), it’d also be quite expensive to do inference with, no? I’ll try to estimate how it compares to humans.
Let’s use Cotra’s “tens of billions” for training compared to GPT-4's $100m+, for roughly a 300x multiplier. Suppose inference costs are multiplied by the same 300x: instead of GPT-4's $0.06 per 1000 output tokens, you'd be paying GPT-N $18 per 1000 output tokens. I think of GPT output as analogous to a human stream of consciousness, so let's compare to human talking speed, which is roughly 130 wpm. Assuming 3/4 words per token, that converts to an hourly cost of 18/1000 / (3/4) × 130 × 60 ≈ $187/hr.
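As a sanity check, here's a minimal Python sketch of that conversion (the 300x multiplier, the 130 wpm talking speed, and the 3/4 words-per-token ratio are just the assumptions above, not measured values):

```python
# Back-of-the-envelope: convert a per-token price into a human-equivalent wage.
price_per_1k_tokens = 0.06 * 300   # assumed 300x GPT-4's $0.06 per 1k output tokens = $18
words_per_minute = 130             # rough human talking speed
words_per_token = 3 / 4            # rough words-per-token assumption

tokens_per_hour = words_per_minute * 60 / words_per_token    # ~10,400 tokens/hr
hourly_cost = price_per_1k_tokens / 1000 * tokens_per_hour   # ~$187/hr
annual_cost = hourly_cost * 2000                             # ~$375k at 2000 hrs/yr

print(f"${hourly_cost:.0f}/hr, ~${annual_cost / 1000:.0f}k/yr")
```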
So, under these assumptions (which admittedly skew high), operating this hypothetical human-level GPT-N would cost the equivalent of paying a human about $200/hr. That's expensive, but cheaper than some high-end jobs, such as CEO or elite professional. To convert to a salary, assume 2000 hours per year, for roughly a $400k salary. For comparison, that's less than OpenAI software engineers reportedly earn.
This is counterintuitive, because automation-by-computer has traditionally had very low variable costs. Based on the back-of-the-envelope calculation above, I think this cost is worth keeping in mind when discussing human-level-AI-soon scenarios.
Nice analysis. Some thoughts:
1. If scaling continues with something like Chinchilla scaling laws, the 300x compute multiplier will not all be lumped into increasing parameters / inference cost. Instead it'll be split roughly half and half on a log scale: maybe 20x more data/training time and 15x more parameters/inference cost. So, instead of $200/hr, we are talking more like $10/hr (see the sketch after this list).
2. Hardware continues to improve in the near term: FLOP/$ keeps dropping, as far as I know. Of course, during AI boom times prices will be artificially high due to demand, so I'm not sure which direction the net effect goes.
3. Reaching human-level AI might involve trading off inference compute against training compute, as discussed in Davidson's model (see takeoffspeeds.com and the linked report). That tradeoff would probably increase the inference compute of the first AGIs, perhaps by multiple OOMs, while shortening timelines-to-AGI.
4. However much it costs, labs will be willing to pay. An engineer that works 5x, 10x, or 100x faster than a human is incredibly valuable, much more valuable than one working at only 1x speed, like all the extremely high-salaried human engineers at AI labs.
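To make point 1 concrete, here's a minimal sketch, assuming compute-optimal (Chinchilla-style) scaling splits a compute multiplier roughly evenly between parameters and training tokens on a log scale; the exact exponents differ slightly in practice, and the $187/hr figure is carried over from the estimate above:

```python
import math

compute_multiplier = 300
# Chinchilla-optimal scaling grows parameters and training tokens together,
# so each gets roughly the square root of the compute multiplier.
param_multiplier = math.sqrt(compute_multiplier)         # ~17x (point 1 rounds to 15x)
data_multiplier = compute_multiplier / param_multiplier  # ~17x (point 1 rounds to 20x)

# Per-token inference cost scales with parameter count, not training tokens,
# so only the parameter multiplier carries over to the hourly figure.
naive_hourly = 187  # $/hr if inference cost scaled with the full 300x
adjusted_hourly = naive_hourly * param_multiplier / compute_multiplier
print(f"~${adjusted_hourly:.0f}/hr")  # ~$11/hr (~$9/hr with the rounder 15x/20x split)
```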
Thanks, that's an excellent and important point that I overlooked: under compute-optimal scaling, inference cost grows roughly as the square root of training compute, so its growth rate (on a log scale) is about half that of training cost.