As I remarked in other comments on this post, this is a plot of price-performance. The denominator is price, which can fall very fast. As demand for AI inference ramps up over the coming decade, the price of chips could fall fast enough to drive this curve even if chip speed grows much more slowly. The argument is primarily economic, not purely technological.
For the purposes of forecasting, and of understanding what the coming decade will look like, I think we care more about price-performance than about raw chip speed. This is particularly true in a regime where both training and inference of large models benefit from massive parallelism. That means you can scale by buying more chips, and from a business or consumer perspective you benefit whether those chips get cheaper or get faster at the same price.
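To make the arithmetic concrete, here is a minimal sketch with made-up numbers (not figures from the plot): halving the price of a chip moves price-performance exactly as much as doubling its speed.

```python
# Illustrative sketch: price-performance (FLOP/s per dollar) improves
# just as much from falling chip prices as from rising chip speed.
# All numbers below are invented for illustration.

def price_performance(flops_per_chip: float, price_per_chip: float) -> float:
    """FLOP/s per dollar for a single chip."""
    return flops_per_chip / price_per_chip

baseline = price_performance(flops_per_chip=1e15, price_per_chip=30_000)

# Scenario A: chips get 2x faster at the same price.
faster = price_performance(flops_per_chip=2e15, price_per_chip=30_000)

# Scenario B: chip speed stays flat but the price halves
# (e.g. demand-driven volume pushes unit costs down).
cheaper = price_performance(flops_per_chip=1e15, price_per_chip=15_000)

print(faster / baseline)   # 2.0
print(cheaper / baseline)  # 2.0 -- the same move along a price-performance curve
```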