So, it’s true that NVIDIA probably has a very high markup on their ML GPUs. I discuss this a bit in the NVIDIA’s Monopoly section, but I’ll add a bit more detail here.
Google’s TPU v4 seems to be competitive with the A100, and has similar cost per hour.
I think the current prices do in fact reflect demand.
My best guess is that the software licensing would not be a significant barrier for someone spending hundreds of millions of dollars on a training run.
Even after accounting for markup,[1] a quick rough estimate still implies a fairly significant gap vs. gaming GPUs that FLOP/s per dollar doesn’t account for, though it does shrink that gap considerably.[2]
All this aside, my basic take is that “what people are actually paying” is the most straightforward and least speculative means we have of defining near-term “cost”.
My cached state is that the A/H100 vs 4090 price gap is mostly price discrimination rather than a large difference in the actual manufacturing cost.
I think price discrimination is very common in computing hardware, and NVIDIA happens to have a quite powerful monopoly right now for various reasons.
Note that 4090s technically can’t be used in data centers with CUDA due to licensing, which makes this a particularly effective approach to price discrimination.
75-80% for H100 and … 40-50% for gaming would be my guess?
Being generous, I get 0.2*24,000/(1,599*0.6) ≈ 5.0, which implies the H100 costs more than 5x as much to manufacture as the RTX 4090 despite having closer to 3x the FLOP/s.
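To make that arithmetic easy to check, here is a minimal Python sketch. It assumes the percentages guessed above are margins as a fraction of sale price (which is how the formula uses them), and that 24000 and 1,599 are the H100 and RTX 4090 prices in dollars. The function and variable names are mine, purely for illustration.

```python
# Rough manufacturing-cost ratio implied by guessed margins.
# Assumption: "margin" is the fraction of the sale price that is not
# manufacturing cost, so cost = price * (1 - margin).

H100_PRICE = 24_000    # dollars, figure used in the comment above
RTX4090_PRICE = 1_599  # dollars, figure used in the comment above


def manufacturing_cost(price: float, margin: float) -> float:
    """Implied manufacturing cost for a given price and margin."""
    return price * (1.0 - margin)


def cost_ratio(h100_margin: float, gaming_margin: float) -> float:
    """How many times more the H100 costs to make than the RTX 4090."""
    return (manufacturing_cost(H100_PRICE, h100_margin)
            / manufacturing_cost(RTX4090_PRICE, gaming_margin))


# "Generous" end of the guessed ranges: 80% for the H100, 40% for gaming.
generous_ratio = cost_ratio(0.80, 0.40)

# Opposite end of the guessed ranges: 75% for the H100, 50% for gaming.
other_end_ratio = cost_ratio(0.75, 0.50)

print(f"generous case:  {generous_ratio:.2f}x")
print(f"other end:      {other_end_ratio:.2f}x")
```

Under these assumptions the generous case lands just above 5x, and the other end of the guessed ranges lands around 7.5x, so the "more than 5x" conclusion holds across the whole range of guesses.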