[Linkpost] Growth in FLOPS used to train ML models
This is a linkpost for https://shape-of-code.com/2022/03/13/growth-in-flops-used-to-train-ml-models/
Given the ongoing history of continually increasing compute power, what is the maximum compute power that might be available to train ML models in the coming years?
Speaking of compute and experience curves, Karpathy just posted about replicating Le Cun’s 1989 pre-MNIST digit-classification results and what difference compute & methods make: https://karpathy.github.io/2022/03/14/lecun1989/
Thanks, an interesting read until the author peers into the future. Moore’s law is on its last legs, so the historical speed-ups will soon be just that: something that once happened. There are still some performance improvements to come from special-purpose CPUs, and half-precision floating point will reduce memory traffic (which can then be traded for CPU performance).
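A quick illustration of that last point (a minimal sketch using NumPy, with an arbitrary weight-matrix size chosen only for the example): the same parameters stored in half precision occupy half as many bytes as in single precision, so each pass over them moves half the data, and that saved memory bandwidth can be spent keeping the compute units busy.

```python
import numpy as np

# Hypothetical 4096 x 4096 weight matrix, stored first in single
# precision (fp32) and then in half precision (fp16).
weights_fp32 = np.zeros((4096, 4096), dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 67108864 bytes (~64 MiB)
print(weights_fp16.nbytes)  # 33554432 bytes (~32 MiB): half the memory traffic per pass
```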
Thanks for writing! I don’t see an actual answer to the question asked at the beginning: “Given the ongoing history of continually increasing compute power, what is the maximum compute power that might be available to train ML models in the coming years?” Did I miss it?