Thanks for the mention, Thane. I think you make excellent points, and agree with all of them to some degree. Yet I'm expecting huge progress in AI algorithms to be unlocked by AI researchers. I'll quote from my comments on the other recent AI timeline discussion.
How closely are they adhering to the “main path” of scaling existing techniques with minor tweaks? If you want to know how a minor tweak affects your current large model at scale, that is a very compute-heavy, researcher-time-light type of experiment. On the other hand, if you want to test a lot of novel paths at much smaller scales, then you are in a relatively compute-light but researcher-time-heavy regime.
What fraction of the available compute resources is the company assigning to each of training/inference/experiments? My guess is that the current split is somewhere around 63/33/4. If this were true, and the company decided to pivot away from training to focus on experiments (0/33/67), this would be something like a 16x increase in compute for experiments. So maybe that changes the bottleneck?
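A toy calculation just to make that arithmetic explicit; the splits are my guesses from above, not reported figures:

```python
# Hypothetical compute splits: (training, inference, experiments), in percent.
# Both the status-quo split and the pivoted split are guesses, not lab data.
def experiment_multiplier(current, pivoted):
    """How much experiment compute grows if the split changes."""
    return pivoted[2] / current[2]

current_split = (63, 33, 4)   # guessed status quo
pivoted_split = (0, 33, 67)   # training budget reassigned to experiments

print(experiment_multiplier(current_split, pivoted_split))  # ~16.75x
```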
I think that Ilya and the AGI labs are part of a school of thought that is very focused on tweaking the existing architecture slightly. This then is a researcher-time-light and compute-heavy paradigm.
I think the big advancements require going further afield, outside the current search-space of the major players.
Which is not to say that I think LLMs have to be thrown out as useless. I expect some kind of combo system to work. The question is, combined with what?
Well, my prejudice, as someone from a neuroscience background, is that there are untapped insights from studying the brain.
Look at the limitations of current AI that François Chollet discusses in his various interviews and lectures. I think he’s pointing at real flaws. Look how many data points it takes a typical ML model to learn a new task! How limited in-context learning is!
Brains are clearly doing something different. I think our current models are much more powerful than a mouse brain, and yet there are some things that mice learn better.
So, if you stopped spending your compute on big expensive experiments, and instead spent it on combing through the neuroscience literature looking for clues… Would the AI researchers make a breakthrough? My guess is yes.
I also suspect that there are ideas in computer science, paths not yet explored with modern compute, that are hiding revolutionary insights. But to find them you’d need to go way outside the current paradigm. Set deep learning entirely aside and look at fundamental ideas. I doubt that this describes even 1% of the time currently being spent by researchers at the big companies. Their path seems to be working, so why should they look elsewhere? The cost to them personally of reorienting to entirely different fields of research would be huge. Not so for AI researchers. They can search everything, and quickly.
I think the big advancements require going further afield, outside the current search-space of the major players.
Oh, I very much agree. But any associated software engineering and experiments would then be nontrivial: setting up a new architecture, correctly interpreting whether it’s failing due to a bug or because it’s fundamentally flawed, figuring out which tweaks are okay to make and which would defeat the point of the experiment, et cetera. Something requiring sophisticated research taste; not something you can trivially delegate-and-forget to a junior researcher (as per @ryan_greenblatt’s vision). (And importantly, if this can be delegated to (AI models isomorphic to) juniors, this is something AGI labs can already do just by hiring juniors.)
Same regarding looking for clues in the neuroscience/computer-science literature. In order to pick out good ideas, you need great research taste and plausibly a bird’s-eye view of the entire hardware-software research stack. I wouldn’t trust a median ML researcher/engineer’s summary; I would expect them to miss great ideas while bringing slop to my attention, such that it’d be more time-efficient to skim over the literature myself.
In addition, this is likely also where “95% of progress comes from the ability to run big experiments” comes into play. Tons of novel tricks/architectures would perform well at a small scale and flounder at a big scale, or vice versa. You need to pick a new approach and go hard on trying to make it work, not just lazily throw an experiment at it. Which is something that’s bottlenecked on the attention of a senior researcher, not a junior worker.
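A made-up illustration of that failure mode: two toy power-law scaling curves where a “new trick” looks better at small scale but gets overtaken at large scale. Every constant here is invented for demonstration, not fit to any real model:

```python
import numpy as np

def loss(compute, a, alpha, floor):
    """Toy scaling law: loss = a * compute**-alpha + floor."""
    return a * compute ** -alpha + floor

compute = np.logspace(0, 8, 9)  # 1 to 1e8, arbitrary units

# Baseline improves slowly but steadily; the "new trick" starts ahead
# with a flatter curve, so it falls behind past the crossover (~1e6 here).
baseline = loss(compute, a=10.0, alpha=0.10, floor=0.5)
new_trick = loss(compute, a=5.0, alpha=0.05, floor=0.5)

for c, b, n in zip(compute, baseline, new_trick):
    winner = "new trick" if n < b else "baseline"
    print(f"compute={c:.0e}  baseline={b:.3f}  new={n:.3f}  -> {winner}")
```

Run a cheap experiment at the left end of that table and you draw exactly the wrong conclusion about the right end.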
Overall, it sounds as if… you expect dramatically faster capabilities progress from the AGI labs pivoting towards exploring a breadth of new research directions, with the whole “AI researchers” thing being an unrelated feature? (They can do this pivot with or without them. And as per the compute-constraints arguments, borderline-competent AI researchers aren’t going to nontrivially improve on the companies’ ability to execute this pivot.)
So, I’ve been focusing on giving more of a generic view in my comments. Something that I think someone with a similar background in neuroscience and ML would endorse as roughly plausible.
I also have an inside view which says more specific things. Like, I don’t just vaguely think that there are probably some fruitful directions in neglected parts of computer science history and in recent neuroscience. What I actually have are specific hypotheses that I’ve been working hard on trying to code up experiments for.
If someone gave me engineering support and compute sufficient to actually run my currently planned experiments, and the results looked like dead-ends, I think my timelines would go from 2-3 years out to 5-10 years. I’d also be much less confident that we’d see rapid efficiency and capability gains from algorithmic research post-AGI, because I’d be more in the mindset of minor tweaks to existing paradigms and further expensive scaling.
This is why I basically think I mostly agree with you, Thane, except for this inside view I have about specific approaches that I think are currently neglected but unlikely to stay neglected.
Yeah, pretty much. Although I don’t expect this with super high confidence. Maybe 75%?
This is part of why I think a “pause” focused on large models / large training runs would actually dangerously accelerate progress towards AGI. I think a lot of well-resourced high-skill researchers would suddenly shift their focus onto breadth of exploration.
Another point:
I don’t think we’ll ever see AI agents that are exactly isomorphic to junior researchers. Why? Because of the weird spikiness of skills we see. In some ways the LLMs we have are much more skillful than junior researchers, in other ways they are pathetically bad. If you held their competencies constant except for improving the places where they are really bad, you’d suddenly have assistants much better than the median junior!
So when considering the details of how to apply the AI assistants we’re likely to get (based on extrapolating current spiky skill patterns), the set of affordances this offers to the top researchers is quite different from what having a bunch of juniors would offer. I think this means we should expect things to be weirder and less smooth than Ryan’s straightforward speed-up prediction.
If you look at the recent AI-scientist work that’s been done, you find this weird spiky portfolio. Having LLMs look through a bunch of papers and try to come up with new research directions? Mostly, but not entirely, crap… But since it’s relatively cheap and quick to do, and not too costly to filter, the trade-off ends up seeming worthwhile?
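A back-of-the-envelope version of that trade-off, with every number invented purely for illustration:

```python
# Invented costs for LLM-generated research ideas; not measured figures.
idea_cost_usd = 0.50     # generating one candidate idea from the literature
filter_cost_usd = 20.00  # researcher time to skim and accept/reject one idea
good_idea_rate = 0.02    # fraction of candidates actually worth pursuing

cost_per_good_idea = (idea_cost_usd + filter_cost_usd) / good_idea_rate
print(f"${cost_per_good_idea:,.0f} per usable idea")  # ~$1,025
```

Even if the hit rate is tiny, cheap generation plus cheap filtering can still pencil out.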
As for new experiments in totally new regimes, yeah. That’s harder for current LLMs to help with than the well-trodden zones. But I think the specific skills beginning to be unlocked by the o1/o3 direction may make coding agents reliable enough to handle a much larger share of this novel experiment setup.
So… It’s complicated. Can’t be sure of success. Can’t be sure of a wall.