Potential gears-level explanations of smooth progress
(Epistemic status: exploratory. Also, this post isn’t thorough; I wanted to write quickly.)
(Thanks to Mary Phuong, Pedro Freire, Tao Lin, Rohin Shah, and probably a few other people I’m forgetting for discussing this topic with me.)
My perception is that it is a common belief that (after investment spending becomes sufficiently large) AI progress in domains of focus will likely be smooth. That is, it will consistently improve at similar rates on a year-over-year basis (or perhaps over somewhat shorter periods).[1] Note that this doesn’t imply that growth will necessarily be exponential; the growth rate could steadily decelerate or accelerate and I would still consider it smooth growth.
I find this idea somewhat surprising because progress in current ML domains has come from relatively few meaningful advances, which seems to imply that each of these advances should yield a spike in performance. Yet, empirically, progress in many domains of focus has been relatively consistent. I won’t make the case for these claims here. Additionally, progress in most domains seems reasonably continuous even after selecting for domains which are likely to contain discontinuities.
This post will consider some possible gears level explanations of smooth progress. It’s partially inspired by this post in the 2021 MIRI conversations.
Many parts
If a system consists of many parts and people are working on many of those parts at the same time, then the large number of mostly independent factors will drive down variance. For instance, consider a plane. There are many, many physical parts and some software components. If engineers are working on all of these simultaneously, then progress will tend to be smooth both throughout the development of a given aircraft and in the field as a whole.
(I don’t actually know anything about aircraft, I’m just using this as a model. If anyone has anything more accurate to claim about planes, please do so in the comments.)
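To make the variance-reduction intuition concrete, here is a minimal toy simulation (my own illustration with made-up numbers, not anything from aircraft engineering): a system’s capability gain is the average of improvements across N independently-developed components, each of which is individually quite noisy.

```python
import random

def yearly_gains(n_parts, years=20, seed=0):
    """Average annual improvement across n_parts independent components."""
    rng = random.Random(seed)
    # Each part's annual improvement is exponentially distributed:
    # mean 1 but standard deviation 1, so individually very noisy.
    return [sum(rng.expovariate(1.0) for _ in range(n_parts)) / n_parts
            for _ in range(years)]

for n in (1, 10, 1000):
    gains = yearly_gains(n)
    mean = sum(gains) / len(gains)
    std = (sum((g - mean) ** 2 for g in gains) / len(gains)) ** 0.5
    print(f"{n:5d} parts: relative std of yearly progress = {std / mean:.2f}")
```

With one part, yearly progress fluctuates about as much as its mean; with a thousand parts, fluctuations shrink roughly as 1/sqrt(N). That scaling is the whole “many parts” story.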
So will AIs have many parts? Current AI doesn’t seem to have many parts which merit much individual optimization. Architectures are relatively uniform and not all that complex. However, there are potentially a large number of hyperparameters, and many parts of the training and inference stack (hardware, optimized code, distributed algorithms, etc.). While hyperparameters can be searched over far more easily than aircraft parts can be redesigned, there is still relevant human optimization work. The training stack as a whole seems likely to have smooth progress (particularly hardware). So if/while compute remains the limiting factor, smooth training-stack progress could imply smooth AI progress.
Further, it’s plausible that future architectures will be deliberately engineered to have more distinct parts, to better enable parallel optimization by more people.
Knowledge as a latent variable
Perhaps the actual determining factor of the progress of many fields is the underlying knowledge of individuals. So progress is steady because individuals tend to learn at stable rates and the overall knowledge of the field grows similarly. This explanation seems very difficult to test or verify, but it would imply potential for steady progress even in domains where there are bottlenecks. Perhaps mathematics demonstrates this to some extent.
Large numbers of low-impact discoveries
If all discoveries are low-impact and the number of discoveries per year is reasonably large (and has low variance), then the total progress per year would be smooth. This could be the case even if all discoveries are concentrated in one specific component which doesn’t allow for parallel progress. It seems quite difficult to estimate the impact of future discoveries in AI.
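Here is a hedged sketch of that aggregation argument (my construction; the rates and impact sizes are arbitrary). Both regimes below have the same expected progress per year, split across either a few large discoveries or many small ones:

```python
import random

def simulate(rate, impact, years=5000, seed=1):
    """Discoveries per year ~ Poisson(rate), each worth `impact`."""
    rng = random.Random(seed)
    totals = []
    for _ in range(years):
        # Count arrivals of a unit-rate Poisson process in a window of
        # length `rate`; this count is Poisson(rate)-distributed.
        count, t = 0, rng.expovariate(1.0)
        while t < rate:
            count += 1
            t += rng.expovariate(1.0)
        totals.append(count * impact)
    mean = sum(totals) / years
    std = (sum((x - mean) ** 2 for x in totals) / years) ** 0.5
    return mean, std

# Both regimes have expected progress of 10 units/year.
for rate, impact in [(0.5, 20.0), (100.0, 0.1)]:
    mean, std = simulate(rate, impact)
    print(f"{rate:5.1f} discoveries/yr x impact {impact:5.2f}: "
          f"relative std of yearly progress = {std / mean:.2f}")
```

The few-large-discoveries regime produces yearly progress that swings wildly (many years with nothing, occasional jumps), while the many-small regime looks smooth, even though in both cases all discoveries land in the same bottlenecked component.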
Some other latent variable
There do seem to be a surprisingly large number of areas where progress is steady (at least to me). Perhaps this indicates an incorrect human bias (or personal bias) toward expecting progress to be less steady. It could also indicate the existence of some unknown and unaccounted-for latent variable common to many domains. If this variable applied to future AI development, that would suggest future AI development will be unexpectedly smooth.
[1] Note that while smooth progress probably correlates with slower takeoffs, takeoff speeds and smoothness of progress aren’t the same: it’s possible to have a very fast takeoff in which progress is steady before and after some inflection point (but not through the inflection point). Similarly, it’s possible to have slower takeoffs where progress is quite erratic.
I find “large numbers of low-impact discoveries” to be the most compelling, though I’d also add two points (both illustrated in the sketch after this list):

- Variance becomes lower as people put more effort into finding discoveries (intuitively, each person has some chance of finding a discovery, and these chances are approximately independent of each other).
- There can be high-impact discoveries, but those are found earlier (partly because people are looking for them, and partly because there are more ways to stumble upon high-impact discoveries).

Matt Clancy’s “Combinatorial innovation and technological progress in the very long run” might be worth looking at.
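A toy model combining both points (entirely my construction: the pool size, per-researcher discovery probability, and Pareto-tailed impact distribution are all arbitrary assumptions). A finite pool of discoveries with heavy-tailed impacts is searched by many researchers in parallel, with draws weighted toward high-impact discoveries:

```python
import random

rng = random.Random(2)
# Assumed: a finite pool of 5,000 possible discoveries with
# heavy-tailed (Pareto) impacts -- a few huge wins, many small ones.
pool = [rng.paretovariate(1.5) for _ in range(5000)]
researchers, find_prob = 200, 0.05  # arbitrary effort level

for year in range(1, 11):
    found = []
    for _ in range(researchers):
        if pool and rng.random() < find_prob:
            # Impact-weighted draw: high-impact discoveries are easier to
            # find (people look for them; there are more ways to stumble
            # on them), so they get depleted early.
            i = rng.choices(range(len(pool)), weights=pool)[0]
            found.append(pool.pop(i))
    print(f"year {year:2d}: progress {sum(found):8.1f} "
          f"from {len(found):2d} discoveries")
```

More researchers make the yearly count of discoveries more predictable, and the impact-weighted draws exhaust the big wins early; both effects push later progress toward smoothness.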