If you accept the premise of AI remaining within the human capability range in some broad sense, where it brings great productivity improvements and rewards those who use it well but remains foundationally a tool, and everything seems basically normal (essentially the AI-Fizzle world), then we have disagreements.
There is good reason to believe that AI will have a soft cap at roughly human ability (and by “soft cap” I mean that anything beyond the cap will be much harder to achieve) for the same reason that humans have a soft cap at human ability: copying existing capabilities is much easier than discovering new capabilities.
A human being born today can relatively easily achieve abilities that other humans have achieved, because you just copy them; lots of 12-year-olds can learn calculus, which is much easier than inventing it. AI will have the same issue.
Sutskever’s response to Dwarkesh in their interview was a convincing refutation of this argument for me:
Dwarkesh Patel
So you could argue that next-token prediction can only help us match human performance and maybe not surpass it? What would it take to surpass human performance?
Ilya Sutskever
I challenge the claim that next-token prediction cannot surpass human performance. On the surface, it looks like it cannot. It looks like if you just learn to imitate, to predict what people do, it means that you can only copy people. But here is a counter argument for why it might not be quite so. If your base neural net is smart enough, you just ask it — What would a person with great insight, wisdom, and capability do? Maybe such a person doesn’t exist, but there’s a pretty good chance that the neural net will be able to extrapolate how such a person would behave. Do you see what I mean?
Dwarkesh Patel
Yes, although where would it get that sort of insight about what that person would do? If not from…
Ilya Sutskever
From the data of regular people. Because if you think about it, what does it mean to predict the next token well enough? It’s actually a much deeper question than it seems. Predicting the next token well means that you understand the underlying reality that led to the creation of that token. It’s not statistics. Like it is statistics but what is statistics? In order to understand those statistics to compress them, you need to understand what is it about the world that creates this set of statistics? And so then you say — Well, I have all those people. What is it about people that creates their behaviors? Well they have thoughts and their feelings, and they have ideas, and they do things in certain ways. All of those could be deduced from next-token prediction. And I’d argue that this should make it possible, not indefinitely but to a pretty decent degree to say — Well, can you guess what you’d do if you took a person with this characteristic and that characteristic? Like such a person doesn’t exist but because you’re so good at predicting the next token, you should still be able to guess what that person would do. This hypothetical, imaginary person with far greater mental ability than the rest of us.
I respect Sutskever a lot, but if he believed he could get an equivalent world model by spending an equivalent amount of compute on next-token prediction over any other set of real-world data samples, why would they go to such lengths to obtain human-generated text for training specifically? They might as well record lots of random signals (e.g., video, audio, radio) and pump it all into the model. In principle that could probably work, but it would be very inefficient.
Human language is a very high-density encoding of world models, so by training on human language, models get much of their world model “for free”, because humanity has already done a lot of pre-work by sampling reality in a wide variety of ways and compressing it into the structure of language. However, our use of language still doesn’t capture all of reality exactly, and I would argue it’s not even close. (Saying otherwise is equivalent to saying we’ve already discovered almost all possible capabilities, which would entail that AI actually has a hard cap at roughly human ability.)
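To make the prediction-as-compression framing concrete: a model’s average next-token cross-entropy, measured in bits per token, is essentially the code length an arithmetic coder would need when driven by that model’s predictions, so better prediction literally means better compression. A minimal sketch, with made-up probabilities:

```python
import math

def bits_per_token(probs_of_true_tokens):
    """Average cross-entropy (bits/token) ~= average code length under arithmetic coding."""
    return sum(-math.log2(p) for p in probs_of_true_tokens) / len(probs_of_true_tokens)

# Hypothetical probabilities two models assigned to the actual next tokens of the same text.
sharp_model = [0.6, 0.5, 0.8, 0.7]    # predicts well
vague_model = [0.1, 0.2, 0.05, 0.1]   # predicts poorly

print(bits_per_token(sharp_model))    # ~0.64 bits/token -> short encoding
print(bits_per_token(vague_model))    # ~3.32 bits/token -> much longer encoding
```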
In order to expand its world model beyond human ability, AI has to sample reality itself, which is much less sample-efficient than sampling human behavior, hence the “soft cap”.
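As a toy illustration of that sample-efficiency gap (my own framing, with made-up numbers): an imitator can copy a demonstrated solution from a single observation, while a learner probing a noisy environment directly has to pay for many samples per candidate before it can even rank them.

```python
import random

random.seed(0)
N_ARMS = 50          # candidate "capabilities"
TRIALS_PER_ARM = 20  # probes the discoverer spends on each one
payoffs = [random.random() for _ in range(N_ARMS)]       # ground truth, unknown to the learner
best_arm = max(range(N_ARMS), key=lambda a: payoffs[a])

# Imitator: copies what a demonstrator already does. Cost: one observation.
imitation_cost = 1
imitator_pick = best_arm

# Discoverer: has to estimate every option from noisy trials of "reality".
totals = [0.0] * N_ARMS
for arm in range(N_ARMS):
    for _ in range(TRIALS_PER_ARM):
        totals[arm] += payoffs[arm] + random.gauss(0, 0.1)  # noisy sample of reality
discovery_cost = N_ARMS * TRIALS_PER_ARM                    # 1000 observations
discoverer_pick = max(range(N_ARMS), key=lambda a: totals[a])

print(f"imitator:   {imitation_cost} sample,  picked arm {imitator_pick}")
print(f"discoverer: {discovery_cost} samples, picked arm {discoverer_pick}")
```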
You can bypass this by just running 1000 instances of imitating-genius-ML-researchers AI.
In theory, yes, but that’s obviously a lot more costly than running just one instance. And you’ll need to keep these virtual researchers running in order to keep the new capabilities coming. At some point this will probably happen and totally eclipse human ability, but I think the soft cap will slow things down by a lot (i.e., no foom). That’s assuming compute and the number of researchers are even the bottleneck to new discoveries; the bottleneck could just as well be empirical data.
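For a rough sense of scale (all numbers below are hypothetical placeholders, not estimates from anyone in this thread): cost grows linearly with the number of always-on instances, so a fleet of 1000 is three orders of magnitude more expensive to keep running than a single copy.

```python
def fleet_cost_per_year(n_instances, gpu_hours_per_day, usd_per_gpu_hour):
    """Rough cost of keeping n always-on virtual-researcher instances running."""
    return n_instances * gpu_hours_per_day * usd_per_gpu_hour * 365

single = fleet_cost_per_year(1, 24, 2.0)       # one instance at $2/GPU-hour (placeholder rate)
fleet  = fleet_cost_per_year(1000, 24, 2.0)    # the proposed 1000 instances

print(f"1 instance:     ${single:,.0f}/year")  # $17,520/year
print(f"1000 instances: ${fleet:,.0f}/year")   # $17,520,000/year
```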