I agree with most of this. My claim here is mainly that if this is the case, then there’s at least one remaining necessary breakthrough, of unknown difficulty, before AGI, and so we can’t naively extrapolate timelines from LLM progress to date.
I additionally think that if this is the case, then LLMs’ difficulty with planning is evidence, though hardly conclusive, that they may not be great at automating the search for new and better algorithms.
Yeah, I think my claim needs evidence to support it. That’s why I’m personally very excited to design evals targeted at detecting self-improvement capabilities.
We shouldn’t be stuck guessing about something so important!