In AI and Compute, the authors compare their theoretical estimates (how many computations each training run should require in theory) with estimates derived from actual GPU running time, and find that the two roughly match (with the caveat that GPU utilization is assumed to be ~30% of the reported peak performance).
Our team has extended this analysis to other architectures and found similar results; we will say more about this in an upcoming article.
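To make the cross-check concrete, here is a minimal sketch of the two estimates being compared. The hardware numbers, function names, and example figures are illustrative assumptions, and the ~6·N·D rule of thumb stands in for the paper's own per-architecture operation counts; only the ~30% utilization factor comes from the caveat above.

```python
# Illustrative sketch, not the authors' code: cross-checking a theoretical
# estimate of training compute against one derived from GPU running time.

PEAK_FLOPS = 15.7e12   # assumed GPU peak throughput in FLOP/s (e.g. V100 fp32)
UTILIZATION = 0.30     # ~30% of reported peak performance, per the caveat above

def hardware_estimate(num_gpus: int, training_days: float) -> float:
    """FLOPs actually delivered: GPUs * peak FLOP/s * utilization * seconds."""
    return num_gpus * PEAK_FLOPS * UTILIZATION * training_days * 24 * 3600

def theoretical_estimate(params: float, tokens: float) -> float:
    """Theoretical FLOP count via the common ~6*N*D rule of thumb (an
    assumption here, standing in for a per-architecture operation count)."""
    return 6 * params * tokens

# Made-up example: the two estimates should land within a small factor of each other.
hw = hardware_estimate(num_gpus=64, training_days=10)    # ~2.6e20 FLOPs
th = theoretical_estimate(params=1.5e9, tokens=30e9)     # ~2.7e20 FLOPs
print(f"hardware: {hw:.2e}  theoretical: {th:.2e}  ratio: {th / hw:.2f}")
```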
EDIT: The comparison is available here.