Compute is used in two ways: to run a single big experiment quickly, and to run many experiments in parallel.
95% of progress comes from the ability to run big experiments quickly. Running many experiments in parallel is far less useful. In the old days, a large cluster could help you run more experiments, but it could not help you run a single large experiment quickly.
For this reason, an academic lab could compete with Google, because Google’s only advantage was the ability to run many experiments. This is not a great advantage.
Recently, it has become possible to combine 100s of GPUs and 100s of CPUs to run an experiment that’s 100x bigger than what is possible on a single machine while requiring comparable time. This has become possible due to the work of many different groups. As a result, the minimum necessary cluster for being competitive is now 10–100x larger than it was before.
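The mechanism behind this kind of scaling is typically data parallelism: shard a large batch across workers, compute gradients locally, and average them (an all-reduce), which yields the same update as one giant machine processing the full batch. A minimal single-process sketch of that idea, with simulated workers (all names and the toy least-squares problem are illustrative, not from the text above):

```python
import numpy as np

def grad(w, X, y):
    # Gradient of the mean squared error 0.5 * mean((Xw - y)^2) w.r.t. w.
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y = X @ w_true
w = np.zeros(5)

# Single-machine gradient on the full batch.
g_single = grad(w, X, y)

# "Data parallel": shard the batch across 4 simulated workers,
# compute local gradients, then average them (the all-reduce step).
shards = np.array_split(np.arange(len(y)), 4)
g_workers = [grad(w, X[idx], y[idx]) for idx in shards]
g_avg = np.mean(g_workers, axis=0)

# With equal-sized shards, the averaged gradient matches the
# full-batch gradient, so each step can be split across machines.
print(np.allclose(g_single, g_avg))
```

Because the averaged gradient equals the full-batch gradient, adding workers shrinks per-step wall time without changing the experiment being run, which is what makes one 100x-bigger experiment feasible in comparable time.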
I think Ilya said almost exactly this in his NeurIPS talk yesterday. Sounds like “doing the biggest experiments fast is more important than doing many experiments in parallel” is a core piece of wisdom to him.
Seems like there was a misunderstanding here: Musk meant "the AI community is taking a long time to figure stuff out", but Ilya took him to mean "figure out how to make AI that can think in terms of concepts".