For context, in a sibling comment Ryan said the following, and Steven agreed:
It sounds like your disagreement isn’t with drawing a link from RE-bench to (forecasts for) automating research engineering, but is instead with thinking that you can get AGI shortly after automating research engineering due to AI R&D acceleration and already being pretty close. Is that right?
Note that the comment says research engineering, not research scientists.
Now responding on whether I think the no new paradigms assumption is needed:
(Obviously you’re entitled to argue / believe that we don’t need new AI paradigms and concepts to get to AGI! It’s a topic where I think reasonable people disagree. I’m just suggesting that it’s a necessary assumption for your argument to hang together, right?)
I generally have not been thinking in these sorts of binary terms but instead thinking in terms more like “Algorithmic progress research is moving at pace X today, if we had automated research engineers it would be sped up to N*X.” I’m not necessarily taking a stand on whether the progress will involve new paradigms or not, so I don’t think it requires an assumption of no new paradigms.
However:
If you think almost all new progress in some important sense will come from paradigm shifts, the forecasting method becomes weaker because the incremental progress doesn’t say as much about progress toward automated research engineering or AGI.
You might think that it’s more confusing than clarifying to think in terms of collapsing all research progress into a single “speed” and forecasting based on that.
Requiring a paradigm shift might lead to placing less weight on lower amounts of required research effort, and even if the probability distribution is the same, what we should expect to see in the world leading up to AGI is not.
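To make the "pace X today, sped up to N*X" framing concrete, here is a minimal sketch. Everything in it is an illustrative assumption: the function name, the idea of measuring the remaining work as a single `effort_required` number, and the example values are hypothetical and are not taken from the actual forecast. Note that the model says nothing about whether the effort is spent on incremental progress or on paradigm shifts, which is exactly the single-"speed" simplification the caveats above are about.

```python
def years_to_milestone(effort_required: float, pace_today: float, speedup: float = 1.0) -> float:
    """Years needed to accumulate `effort_required` units of research progress.

    pace_today: progress per year at today's pace (the "X" in the text).
    speedup:    multiplier from automated research engineers (the "N" in the text).
    All quantities are hypothetical; the units are arbitrary.
    """
    if pace_today <= 0 or speedup <= 0:
        raise ValueError("pace_today and speedup must be positive")
    return effort_required / (speedup * pace_today)


# Illustrative numbers only, not a forecast.
baseline_years = years_to_milestone(effort_required=10.0, pace_today=1.0)
automated_years = years_to_milestone(effort_required=10.0, pace_today=1.0, speedup=5.0)
print(baseline_years, automated_years)
```

If a paradigm shift is required, one way to express that in this sketch would be a distribution over `effort_required` with less weight on small values, rather than a different formula.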
I’d also add that:
Regarding what research tasks I’m forecasting for the automated research engineer: RE-bench is not supposed to fully represent the tasks involved in actual research engineering. That’s why we have the gaps.
Regarding to what extent having an automated research engineer would speed up progress in worlds in which we need a paradigm shift: I think it’s hard to separate out conceptual from engineering/empirical work in terms of progress toward new paradigms. My guess is that being able to implement experiments very cheaply would substantially increase the expected number of paradigm shifts per unit time.