No, I think we mostly agree—I’d expect TPUs to be with say 4x of practically optimal for the things they do. The remaining one OOM I think is possible for non-novel tasks has more to do with specialisation, eg model-specific hardware design, and that definitely has an asymtote.
The interesting case is if we can get TPU-equivalent hardware days after designing a new architecture, instead of years after, because (IMO) 1,000x speedups over CPUs are plausible.
No, I think we mostly agree—I’d expect TPUs to be with say 4x of practically optimal for the things they do. The remaining one OOM I think is possible for non-novel tasks has more to do with specialisation, eg model-specific hardware design, and that definitely has an asymtote.
The interesting case is if we can get TPU-equivalent hardware days after designing a new architecture, instead of years after, because (IMO) 1,000x speedups over CPUs are plausible.