(I might misunderstand you. My impression was that you’re saying it’s valid to extrapolate from “model XYZ does well at RE-Bench” to “model XYZ does well at developing new paradigms and concepts.” But maybe you’re saying that the trend of LLM success at various things suggests we don’t need new paradigms and concepts to get AGI in the first place? My reply below assumes the former:)
I’m not saying LLMs can’t develop new paradigms and concepts, though. The original claim you were responding to was that success at RE-Bench in particular doesn’t tell us much about success at developing new paradigms and concepts. “LLMs have done various things some people didn’t expect them to be able to do” doesn’t strike me as much of an argument against that.
More broadly, re: your burden of proof claim, I don’t buy that “LLMs have done various things some people didn’t expect them to be able to do” determinately pins down an extrapolation to “the current paradigm(s) will suffice for AGI, within 2-3 years.” That’s not a privileged reference class forecast; it’s a fairly specific prediction.
I feel like this sub-thread is going in circles; perhaps we should go back to the start of it. I said:
I don’t think this distinction between old-paradigm/old-concepts and new-paradigm/new-concepts is going to hold up very well to philosophical inspection or continued ML progress; it smells similar to ye olde “do LLMs truly understand, or are they merely stochastic parrots?” and “Can they extrapolate, or do they merely interpolate?”
You replied:
I find this kind of pattern-match pretty unconvincing without more object-level explanation. Why exactly do you think this distinction isn’t important? (I’m also not sure “Can they extrapolate, or do they merely interpolate?” qualifies as “ye olde,” still seems like a good question to me at least w.r.t. sufficiently out-of-distribution extrapolation.)
Now, elsewhere in this comment section, various people (Carl, Radford) have jumped in to say the sorts of object-level things I also would have said if I were going to get into it. E.g. that old vs. new paradigm isn’t a binary but a spectrum, that automating research engineering WOULD actually speed up new-paradigm discovery, etc. What do you think of the points they made?
Also, I’m still waiting to hear answers to these questions: “Can you say more about this distinction—is it a binary, or a dimension? If it’s a dimension, how can we measure progress along it, and are we sure there hasn’t been significant progress on it already in the last few years, within the current paradigm? If there has indeed been no significant progress (as with ARC-AGI until 2024), is there another explanation for why that might be, besides your favored one (that your distinction is super important and that because of it a new paradigm is needed to get to AGI)?”