Planned summary for the Alignment Newsletter:

This post presents a dataset of the parameter counts of 139 ML models from 1952 to 2021. The resulting graph is fairly noisy and hard to interpret, but it suggests that:

1. There was no discontinuity in model size in 2012 (the year AlexNet was published, generally acknowledged as the start of the deep learning revolution).
2. There was a discontinuity in model size for language models in particular, sometime between 2016 and 2018.
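Both claims are read off a scatter plot of parameter counts against publication year. As a rough illustration of how such a plot might be produced, here is a minimal Python sketch; the file name `parameter_counts.csv` and the column names `year`, `parameters`, and `domain` are assumptions for illustration, not the dataset's actual schema.

```python
# Minimal sketch: parameter counts over time on a log scale, with language
# models separated out so a domain-specific jump can be eyeballed.
# NOTE: the CSV path and column names below are hypothetical placeholders.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("parameter_counts.csv")  # hypothetical export of the dataset

# Boolean mask separating language models from everything else.
is_language = df["domain"].str.lower() == "language"

fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(df.loc[~is_language, "year"], df.loc[~is_language, "parameters"],
           label="other domains", alpha=0.6)
ax.scatter(df.loc[is_language, "year"], df.loc[is_language, "parameters"],
           label="language", alpha=0.8)

# Parameter counts span many orders of magnitude, so a log scale is essential.
ax.set_yscale("log")
ax.axvline(2012, linestyle="--", color="grey", label="AlexNet (2012)")
ax.set_xlabel("Publication year")
ax.set_ylabel("Trainable parameters")
ax.legend()
plt.show()
```

On a plot like this, a discontinuity would show up as a visible jump in the typical order of magnitude of new models within a short span of years, rather than a smooth exponential trend.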
Planned opinion: