My current understanding is that all major AI labs have already figured out the Chinchilla results on their own, but that younger or less in-the-loop AI orgs may have needed to run experiments that took a couple of months of staff time. This post was one of the most-read posts on LW this month, and was shared heavily around Twitter. It's plausible to me that spreading these arguments speeds up AI timelines by 1-4 weeks on average.
What is the mechanism you’re imagining for this speedup? What happens that would not have happened without this post?
Consider that:
The Chinchilla paper was released over four months ago, on 3/29/22.
It did not take long for the paper to get noticed among people interested in ML scaling, including here on LW.
On 3/29, the same day it was released, the paper was linked on r/mlscaling.
On 3/31, I heard about it through the EleutherAI discord, and immediately made an LW linkpost.
On 4/1, 1a3orn posted a more detailed explainer.
I’m struggling to imagine a situation where a relevant AI org is doing Chinchilla-like scaling experiments, yet somehow has managed to miss this paper (or to ignore/misunderstand it) for 4+ months. The paper is not exactly a secret, and it’s not even especially difficult to read as these things go.
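To make concrete just how little there is to miss: the paper's headline takeaway fits in a few lines. Here is a minimal sketch, assuming the paper's approximate FLOP accounting (C ≈ 6·N·D) and its roughly equal scaling exponents, which work out to the well-known ~20-tokens-per-parameter rule of thumb. The function name is mine, and the 20:1 ratio is an approximation from the paper's fits, not an exact constant.

```python
# Rough sketch of the Chinchilla compute-optimal rule of thumb.
# Assumptions (approximate, from the paper's fits): training FLOPs
# C ~= 6 * N * D, and compute-optimal N and D both scale as ~C^0.5,
# which works out to roughly 20 training tokens per parameter.

def chinchilla_optimal(flops_budget, tokens_per_param=20.0):
    """Return (params, tokens) that roughly exhaust `flops_budget`
    at the Chinchilla-optimal ratio. `tokens_per_param` is the
    paper's ~20:1 rule of thumb, not an exact constant."""
    # Solve 6 * N * (tokens_per_param * N) = C for N.
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla itself used ~5.8e23 training FLOPs (70B params,
# 1.4T tokens); plugging that budget in recovers numbers of the
# same order.
params, tokens = chinchilla_optimal(5.8e23)
print(f"params ~ {params:.2e}, tokens ~ {tokens:.2e}")
```

If a team running scaling experiments can't extract that much from the paper in four months, the bottleneck isn't access to explainer posts.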
More broadly, I doubt LW has significant leverage to decrease the overall supply of these kinds of conversations. There are lots of venues for cutting-edge ML discussion, and the conversation is going to happen somewhere. (See Connor’s comments here.)