I wrote:
(2) Is it plausible that one group’s training run will have importantly new and different capabilities from the best relevant previous one? (I say yes: consider grokking, or algorithmic improvements, or autonomous learning per my other comment.)
And then you wrote:
I don’t expect any specific developer to suddenly get access to 5-9 OOMs more compute than any previous developer.
Isn’t that kinda a strawman? I can imagine a lot of scenarios where a training run results in a qualitatively better trained model than any that came before (I mentioned three of them), and I think “5-9 OOMs more compute than any previous developer” is a much much less plausible scenario than any of the three I mentioned.
This post mainly argues that evolution does not provide evidence for the sharp left turn. Sudden capabilities jumps from other sources, such as those you mention, are more likely, IMO. My first reply to your comment is arguing that the mechanisms behind the human sharp left turn wrt evolution probably still won’t arise in AI development, even if you go up an abstraction level. One of those mechanisms is a 5-9 OOM jump in usable optimization power, which I think is unlikely.