I’m trying to reconcile:

consider that writing 750 words with GPT-3 costs 6 cents.
vs
It costs well under $1/hour to rent hardware that performs 100 trillion operations per second. If a model using that much compute (something like 3 orders of magnitude more than GPT-3)...
That estimate puts GPT-3 at about 500 billion floating point operations per word, 200x less than 100 trillion. If you assume a human reads at 250 words per minute, then 6 cents for 750 words works out to $1.20/hour. At the ~430 trillion (ops/s) per ($/hour) hardware rate cited below, GPT-3’s ~2.1 trillion ops/s at reading speed would cost about $0.005/hour, so the two estimates differ by about 250x. Someone can correct me if I’m misunderstanding.
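To make the 250x explicit, here is the same arithmetic as a quick Python sketch. All inputs are the thread’s own estimates; the 430 trillion (ops/s)/($/hour) figure comes from the hardware citation further down.

```python
# Back-of-the-envelope reconciliation of the two estimates.
# All inputs are rough estimates quoted in this thread.

ops_per_word = 500e9   # ~500 billion FLOPs per generated word (GPT-3)
words_per_min = 250    # assumed human reading speed
api_price = 0.06       # $ per 750 words (OpenAI API)
hw_rate = 430e12       # (ops/s) per ($/hour), from the EC2/A100 figures below

# What the API charges, expressed per hour of reading-speed output:
words_per_hour = words_per_min * 60                    # 15,000 words/hour
api_price_per_hour = api_price * words_per_hour / 750  # $1.20/hour

# What raw hardware rental would cost for the same output:
ops_per_sec = ops_per_word * words_per_min / 60  # ~2.1e12 ops/s
hw_price_per_hour = ops_per_sec / hw_rate        # ~$0.005/hour

print(f"API:      ${api_price_per_hour:.2f}/hour")
print(f"Hardware: ${hw_price_per_hour:.4f}/hour")
print(f"Ratio:    {api_price_per_hour / hw_price_per_hour:.0f}x")  # ~250x
```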
That’s easy to reconcile! OpenAI is selling access to GPT-3 wayyyy above its own marginal hardware rental cost. Right? That would hardly be surprising; pricing decisions usually involve other things besides marginal costs, like price elasticity of demand, capacity to scale up, and so on. (And/or OpenAI’s marginal costs include things that are not hardware rental, e.g. human monitoring and approval processes.) But as soon as there’s some competition (especially competition from open-source projects) I expect the price to rapidly approach the hardware rental cost (including electricity).
As a citation for the hardware cost:
P4d instances on EC2 currently cost $11.57/h if reserved for 3 years. They contain 8 A100s.
An A100 does about 624 trillion half-precision ops/second.
So that’s 430 trillion (operations per second) per ($/hour).
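As a quick check, those two numbers multiply out to the quoted figure:

```python
# Sanity check of the ops-per-dollar figure (numbers as quoted above).
p4d_price_per_hour = 11.57  # $/hour, EC2 P4d, 3-year reserved
a100s_per_instance = 8
ops_per_a100 = 624e12       # half-precision ops/second per A100

ops_per_dollar_hour = a100s_per_instance * ops_per_a100 / p4d_price_per_hour
print(f"{ops_per_dollar_hour / 1e12:.0f} trillion (ops/s) per ($/hour)")  # -> 431
```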
You shouldn’t expect to be able to get full utilization out of that for a variety of reasons, but in the very long run you should be getting reasonably close, certainly more than 100 trillion operations per second per ($/hour).
(ETA: But note that a service like the OpenAI API using EC2 would need to use on-demand prices, which are about 10x higher per flop if you want reasonable availability.)
Limitation: the price has to cover more than the cost of compute. It also includes additions for:
a) profit
b) recouping the cost of training or acquiring the model
Offering an additional feature like human monitoring/approval also raises costs (though in principle it could increase quality). The toy sketch below illustrates how these components could stack.
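A minimal sketch of how those components could combine into a per-word price. Every input below is a hypothetical placeholder, not a claim about OpenAI’s actual costs; the point is just that under assumptions like these, amortizing training can dominate the raw compute cost.

```python
# Toy price decomposition (all inputs hypothetical, for illustration only).
hardware_cost_per_word = 0.005 / 15000  # $/word, from the ~$0.005/hour estimate above
training_cost = 5e6                     # assumed one-time training/acquisition cost ($)
lifetime_words = 1e11                   # assumed total words served over the model's life
monitoring_multiplier = 2.0             # assumed overhead for human monitoring/approval
profit_margin = 0.5                     # assumed 50% margin

amortized_training = training_cost / lifetime_words
price_per_word = (hardware_cost_per_word + amortized_training) \
                 * monitoring_multiplier * (1 + profit_margin)
print(f"${price_per_word * 750:.3f} per 750 words")  # compare: $0.06 quoted above
```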