I tend to agree with Carey that the compute necessary to reach human-level AI lies somewhere around the 18-year and 300-year milestones.
I’m sure there’s a better discussion about which milestones to use somewhere else, but since I’m rereading older posts to catch up, and others may be doing the same, I’ll make a brief comment here.
I think this is going to be an important crux between people who estimate timelines differently.
If you categorically disregard the evolutionary milestones, wouldn’t you be saying that searching for the right architecture isn’t the bottleneck, but training is? However, isn’t it standardly the case that architecture search takes more compute in ML than training does? I guess the terminology is confusing here. In ML, the part that takes the most compute is often called “training,” but it’s not analogous to what happens in a single human’s lifetime, because it also includes architecture tweaks, hyperparameter tuning, and so on. It feels like what ML researchers call “training” is analogous to hominid evolution, or something like that, whereas the part that is analogous to a single human’s lifetime is AlphaZero going from zero to superhuman capability in three days of runtime. That second step took a lot less compute than the architecture search that came before it!
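To make the “single lifetime” side of that analogy concrete, here is a back-of-envelope sketch. The FLOP/s-equivalent of a human brain is very uncertain (estimates span several orders of magnitude); 1e15 is just one commonly used middle-of-the-road guess, so treat this as an order-of-magnitude illustration rather than anyone’s official number:

```python
# Rough size of the "single lifetime" analogue: run one brain-equivalent for 18 years.
# BRAIN_FLOPS is an assumed, highly uncertain figure, not a settled estimate.

BRAIN_FLOPS = 1e15          # assumed FLOP/s-equivalent of a human brain
SECONDS_PER_YEAR = 3.15e7
LIFETIME_YEARS = 18         # the childhood-to-adulthood "training run"

lifetime_flop = BRAIN_FLOPS * LIFETIME_YEARS * SECONDS_PER_YEAR
print(f"~{lifetime_flop:.1e} FLOP for one 18-year run")   # roughly 6e23 FLOP
```

Whatever the exact figure, the point is that this “one run” quantity is what the 18-year milestone tracks, and it is small next to whatever the surrounding search process cost.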
Therefore, I would discount the 18-year and 300-year milestones quite a bit. That said, the 18-year estimate was never a proper lower bound anyway, since the human brain may not be particularly optimal.
So, I feel like all we can say with confidence is that brain evolution is a proper upper bound, and AGI might arrive much sooner depending on how much human foresight, being smarter than evolution, can cut that down. I think what we need most is conceptual progress on how much of architecture search in ML is “random” vs. how much human foresight can cut corners and speed things up.
I actually don’t know what the “brain evolution” estimate refers to, exactly. If it counts compute wasted on lineages like birds, that seems needlessly inefficient. (Any smart simulator would realize that mammals are more likely to develop civilization, since they don’t face the size constraints that come with flight.) But probably the “brain evolution” estimate just refers to how much compute it takes to run all the direct ancestors of a present-day human, back to the Cambrian period or something like that?
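If that reading is right, the estimate would be a tally along these lines. The numbers below are made-up placeholders purely to show what such a tally counts (a single parental lineage, tiny nervous systems for most of history); the real analyses presumably use far more careful figures, and may count whole populations rather than one lineage:

```python
# Toy tally of "run every direct ancestor back to the Cambrian".
# Every constant here is a placeholder assumption, not a sourced figure.

CAMBRIAN_YEARS_AGO = 540e6        # ~540 million years, roughly
AVG_GENERATION_YEARS = 1.0        # placeholder average generation length
ANCESTORS_PER_GENERATION = 2      # just the direct parental line, no side branches
AVG_NEURAL_FLOPS = 1e9            # placeholder: most ancestors had tiny nervous systems

SECONDS_PER_YEAR = 3.15e7

generations = CAMBRIAN_YEARS_AGO / AVG_GENERATION_YEARS
flop_per_ancestor = AVG_NEURAL_FLOPS * AVG_GENERATION_YEARS * SECONDS_PER_YEAR
total_flop = generations * ANCESTORS_PER_GENERATION * flop_per_ancestor

print(f"generations since the Cambrian: {generations:.1e}")
print(f"total FLOP for the direct-ancestor line: {total_flop:.1e}")
```

Under these toy assumptions the answer comes out much larger than the single-lifetime figure above, which is the only feature of the comparison I’d put weight on.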
I’m sure others have done extensive analyses on these things, so I’m looking forward to reading all of that once I find it.