While the claim itself is true (the task ‘predict the next token on the internet’ absolutely does not imply that learning it caps out at human-level intelligence), some parts of the post, and of the reasoning leading to the claims at its end, are confused or wrong.
Let’s start from the end and try to figure out what goes wrong.
GPT-4 is still not as smart as a human in many ways, but it’s naked mathematical truth that the task GPTs are being trained on is harder than being an actual human.
And since the task that GPTs are being trained on is different from and harder than the task of being a human, it would be surprising—even leaving aside all the ways that gradient descent differs from natural selection—if GPTs ended up thinking the way humans do, in order to solve that problem.
From a high-level perspective, it is clear that this is just wrong. Part of what human brains are doing is minimising prediction error with regard to sensory inputs. The unbounded version of that task is basically of the same generality and difficulty as what GPT is doing, and is roughly equivalent to understanding everything that is understandable in the observable universe. For example: a friend of mine worked on analysing data from the LHC, leading to the Higgs detection paper. Doing this type of work basically requires a human brain to have a predictive model of aggregate outputs of a very large number of high-energy particle collisions, processed by a complex configuration of computers and detectors.
Where GPT and humans differ is not in some general mathematical fact about the task, but in what sensory data a human and a GPT are trying to predict, and in their cognitive architectures and the ways the systems are bounded. The different landscapes of boundedness and architecture can lead both to convergent cognition (thinking the way a human would) and to the opposite: predicting what a human would output in a highly non-human way.
Boundedness is the central concept here. Neither humans nor GPTs are attempting to solve ‘how to predict stuff with unlimited resources’; both are solving a problem of cognitive economy: how to allocate limited computational resources to minimise prediction error.
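One way to write the cognitive-economy framing down (my gloss, not a formula from the post) is that both systems approximately solve a budget-constrained prediction problem:

$$
\min_{q \in \mathcal{Q}_B} \; \mathbb{E}_{x \sim P}\!\left[-\log q\!\left(x_t \mid x_{<t}\right)\right]
$$

where $P$ is the stream the system actually gets to see (photoreceptor activations for the brain, internet text for GPT) and $\mathcal{Q}_B$ is the family of predictors realisable within the system's computational and architectural budget $B$. The unbounded objective is the same in both cases; the interesting differences live in $P$ and $\mathcal{Q}_B$.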
Or maybe simplest: Imagine somebody telling you to make up random words, and you say, “Morvelkainen bloombla ringa mongo.”
Imagine a mind of a level—where, to be clear, I’m not saying GPTs are at this level yet -
Imagine a Mind of a level where it can hear you say ‘morvelkainen blaambla ringa’, and maybe also read your entire social media history, and then manage to assign 20% probability that your next utterance is ‘mongo’.
The fact that this Mind could double as a really good actor playing your character, does not mean They are only exactly as smart as you.
When you’re trying to be human-equivalent at writing text, you can just make up whatever output, and it’s now a human output because you’re human and you chose to output that.
GPT-4 is being asked to predict all that stuff you’re making up. It doesn’t get to make up whatever. It is being asked to model what you were thinking—the thoughts in your mind whose shadow is your text output—so as to assign as much probability as possible to your true next word.
If I try to imagine a mind which is able to predict my next word when I’m asked to make up random words, and which succeeds at assigning 20% probability to my true output, I’m firmly in the realm of weird and incomprehensible Gods. If the Mind is imaginably bounded and smart, it seems likely it would not devote much cognitive capacity to modelling in detail strings prefaced by a context like ‘this is a list of random numbers’, in particular if inverting the process generating those strings looks really costly. Being this good at this task would require so much data and so much cheap computation that it is way beyond superintelligence, in the realm of philosophical thought experiments.
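To put rough, illustrative numbers on why a bounded predictor gains so little from this corner of the task (the vocabulary size and the ‘1,000 plausible made-up words’ are assumptions of mine, not measurements):

```python
import math

VOCAB_SIZE = 50_000  # assumed tokenizer vocabulary size, for illustration only

def surprisal_bits(p: float) -> float:
    """Log-loss, in bits, when the true next token was assigned probability p."""
    return -math.log2(p)

godlike = surprisal_bits(0.20)            # ~2.3 bits: the Mind that models me in detail
bounded = surprisal_bits(1 / 1_000)       # ~10 bits: mass spread over ~1,000 plausible-sounding made-up words
uniform = surprisal_bits(1 / VOCAB_SIZE)  # ~15.6 bits: no model at all, uniform over the vocabulary

print(f"godlike {godlike:.1f} bits, bounded {bounded:.1f} bits, uniform {uniform:.1f} bits")
```

Going from the cheap ‘bounded’ strategy to the godlike one buys a few bits of loss on a tiny slice of the training distribution, at the price of inverting my private word-generation process; a cognitively economical predictor spends its capacity elsewhere.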
Overall I think this is a really unfortunate way to think about the problem: a system which is moderately hard to comprehend (like GPT) is replaced by something much more incomprehensible. It also seems like a bit of a reverse intuition pump; I’m pretty confident most people’s intuitive thinking about this ’simplest’ thing will be utterly confused.
How did we get here?
A human can write a rap battle in an hour. A GPT loss function would like the GPT to be intelligent enough to predict it on the fly.
Apart from the fact that humans are also able to rap battle or improvise on the fly, notice that “what the loss function would like the system to do” in principle tells you very little about what the system will do. For example, the human ‘loss function’ makes some people attempt to predict winning lottery numbers. This is an impossible task for humans, and you can’t say much about a human based on it. You can speculate about minds which would be able to succeed at it, but you soon end up in the realm of Gods, outside of physics.
Consider that sometimes human beings, in the course of talking, make errors.
GPTs are not being trained to imitate human error. They’re being trained to *predict* human error.
Consider the asymmetry between you, who makes an error, and an outside mind that knows you well enough and in enough detail to predict *which* errors you’ll make.
Again, from the cognitive-economy perspective, predicting my errors would often be wasteful. With some simplification, you can imagine I make two types of errors: systematic and random. Often the simplest way to predict a systematic error is to emulate the process which led to it. Random errors are … random, and a mind which knows me in enough detail to predict which random errors I’ll make seems a bit like the mind predicting lottery numbers.
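A toy sketch of the distinction (the ‘model of me’ here is entirely made up for illustration): a predictor can drive the loss from my systematic error to zero simply by emulating me, while my random errors leave an irreducible entropy floor that no amount of intelligence removes.

```python
import math
import random

def my_answer(x: int) -> int:
    """Toy 'model of me': a consistent, learnable bug plus an unpredictable slip."""
    systematic = x * 2 + 1              # I reliably add 1 where I shouldn't
    slip = random.choice([0, 0, 0, 1])  # a genuinely random error, p = 0.25
    return systematic + slip

# Emulating the process above predicts the systematic part perfectly.
# The slip contributes an irreducible entropy floor, however smart the predictor:
p = 0.25
floor_bits = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
print(f"irreducible loss from my random errors: {floor_bits:.2f} bits per answer")
```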
Consider that somewhere on the internet is probably a list of thruples: <product of 2 prime numbers, first prime, second prime>.
GPT obviously isn’t going to predict that successfully for significantly-sized primes, but it illustrates the basic point:
There is no law saying that a predictor only needs to be as intelligent as the generator, in order to predict the generator’s next token.
The general claim that some predictions are really hard, and that you need superhuman powers to be good at them, is true; but notice that this does not tell us much about what GPT-x will actually learn.
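To make the quoted generator/predictor asymmetry concrete, here is a toy sketch using sympy (the sizes are chosen so the demo actually finishes; at cryptographic sizes the prediction side becomes infeasible):

```python
from sympy import randprime, factorint

# The "author" of the list generates a triple cheaply:
p, q = randprime(10**9, 10**10), randprime(10**9, 10**10)
line = f"{p * q}, {p}, {q}"

# A predictor that has seen only "p*q, " and wants to put probability mass on
# the continuation "p, q" effectively has to factor the product, which is far
# harder than the generation step (and hopeless at RSA-sized primes):
recovered = factorint(p * q)
print(line)
print(recovered)  # {p: 1, q: 1}
```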
Imagine yourself in a box, trying to predict the next word—assign as much probability mass to the next token as possible—for all the text on the Internet.
Koan: Is this a task whose difficulty caps out as human intelligence, or at the intelligence level of the smartest human who wrote any Internet text? What factors make that task easier, or harder?
Yes, this is clearly true: in the limit, the task is of unlimited difficulty.
I didn’t say that GPT’s task is harder than any possible perspective on a form of work you could regard a human brain as trying to do; I said that GPT’s task is harder than being an actual human; in other words, being an actual human is not enough to solve GPT’s task.
I don’t see how the comparison of the hardness of the ‘GPT task’ and of ‘being an actual human’ is supposed to work technically; to me it mostly seems like a type error.
- The task ‘predict the activation of photoreceptors in the human retina’ clearly has the same difficulty as ‘predict the next word on the internet’ in the limit. (cf. Why Simulator AIs want to be Active Inference AIs)
- Maybe you mean something like task + performance threshold. Here ‘predict the activation of photoreceptors in the human retina well enough to function as a typical human’ is clearly less difficult than the task + performance threshold ‘predict the next word on the internet almost perfectly’. But this comparison does not seem particularly informative.
- Going in this direction, we can compare thresholds closer to reality, e.g. ‘predict the activation of photoreceptors in the human retina, and do other similar computation, well enough to function as a typical human’ vs. ‘predict the next word on the internet at the level of GPT-4’. This seems hard to order: humans are usually able to do the human task and would fail at the GPT-4 task at GPT-4 level; GPT-4 is able to do the GPT-4 task and would fail at the human task.
- You can’t build an ordering between cognitive systems out of ‘system A can’t do task T which system B can, therefore B > A’. There are many tasks which humans can’t solve, but this implies very little. E.g. a human is unable to remember a 50-thousand-digit random number while my phone can do so easily, yet there are also many things a human can do that my phone can’t.
Given the above, a possibly interesting direction for comparing ‘human skills’ and ‘GPT-4 skills’ is something like: ‘why can’t GPT-4 solve the human task at human level?’, ‘why can’t a human solve the GPT task at GPT-4 level?’, and ‘why are the skills a bit hard to compare?’.
Some thoughts on this:
- GPT-4 clearly is “width superhuman”: its task is roughly modelling the textual output of the whole of humanity. This isn’t a great fit for the architecture and bounds of a single human mind, for roughly the same reasons a single human mind would do worse than the Amazon recommender at recommending products to each of a hundred million users. In contrast, a human would probably do better at recommending products to one specific user whose preferences the human recommender tries to predict in detail.
Humanity as a whole would probably do significantly better at this task if you e.g. imagine assigning every human one other human to model (and study in depth, read all their text outputs, etc.).
- GPT-4 clearly isn’t better than humans at “samples → abstractions”: it needs more data to learn a given pattern.
- As for the overall ability to find abstractions, it seems unclear to what extent GPT “learned smart algorithms independently because they are useful for predicting human outputs” vs. “learned smart algorithms because they are implicitly reflected in human text”; at the current level I would expect a mixture of both.
What the main post is responding to is the argument: “We’re just training AIs to imitate human text, right, so that process can’t make them get any smarter than the text they’re imitating, right? So AIs shouldn’t learn abilities that humans don’t have; because why would you need those abilities to learn to imitate humans?” And to this the main post says, “Nope.”
The main post is not arguing: “If you abstract away the tasks humans evolved to solve, from human levels of performance at those tasks, the tasks AIs are being trained to solve are harder than those tasks in principle even if they were being solved perfectly.” I agree this is just false, and did not think my post said otherwise.
I do agree the argument “We’re just training AIs to imitate human text, right, so that process can’t make them get any smarter than the text they’re imitating, right? So AIs shouldn’t learn abilities that humans don’t have; because why would you need those abilities to learn to imitate humans?” is wrong and clearly the answer is “Nope”.
At the same time, I do not think parts of your argument in the post are locally valid, or a good justification for the claim.
A correct and locally valid argument for why GPTs are not capped at human level was already written here.
In a very compressed form: you can just imagine that GPTs have text as their “sensory inputs”, generated by the entire universe, similarly to how your sensory inputs are generated by the entire universe. Neither human intelligence nor GPTs are constrained by the complexity of the task (also: in the abstract, it’s the same task). Because of that, “task difficulty” is not a promising way to compare these systems, and it is necessary to look at the actual cognitive architectures and bounds.
Regarding the last paragraph, I’m somewhat confused by what you mean by “tasks humans evolved to solve”. Does e.g. sending humans to the Moon, or detecting the Higgs boson, count as a “task humans evolved to solve” or not?
I’d really like to see Eliezer engage with this comment, because to me it looks like the following sentence’s well-foundedness is rightly being questioned.
it’s naked mathematical truth that the task GPTs are being trained on is harder than being an actual human.
While I generally agree that powerful optimizers are dangerous, the fact that the GPT task and the “being an actual human” task are somewhat different has nothing to do with it.
From a high-level perspective, it is clear that this is just wrong. Part of what human brains are doing is minimising prediction error with regard to sensory inputs. The unbounded version of that task is basically of the same generality and difficulty as what GPT is doing, and is roughly equivalent to understanding everything that is understandable in the observable universe.
Yes, human brains can be regarded as trying to solve the problem of minimizing prediction error given their own sensory inputs, but no one is trying to push up the capabilities of an individual human brain as fast as possible to make it better at actually doing so. Lots of people are definitely trying this for GPTs, measuring their progress on harder and harder tasks as they do so, some of which humans already cannot do on their own.
Or, another way of putting it: during training, a GPT is asked to solve a concrete problem no human is capable of or expected to solve. When GPT fails to make an accurate prediction, it gets modified into something that might do better next time. No one performs brain surgery on a human any time they make a prediction error.
This seems like the same confusion again. Upon opening your eyes, your visual cortex is asked to solve a concrete problem no brain is able, or expected, to solve perfectly: predict the sensory inputs. When the patterns of firing don’t predict the photoreceptor activations, your brain gets modified into something else, which may do better next time. Every time your brain fails to predict its visual field, there is a bit of modification, based on computing what’s locally a good update.
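To make “a bit of modification, based on computing what’s locally a good update” concrete in miniature, here is a toy online-learning sketch; it is a crude analogy for the shared predict-fail-adjust shape, not a model of cortical learning or of how GPTs are actually trained:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(3)                        # the predictor's current parameters

for _ in range(1000):
    x = rng.normal(size=3)             # "sensory input" features
    target = 1.5 * x[0] - 0.5 * x[2]   # what the world actually does next
    prediction = w @ x
    error = prediction - target
    w -= 0.01 * error * x              # small, local, error-driven modification

print("learned weights:", w.round(2))  # ends up close to [1.5, 0.0, -0.5]
```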
There is no fundamental difference in the nature of the task.
Where the actual difference lies is in the computational and architectural bounds of the systems.
Neither humans’ nor GPTs’ smartness is bottlenecked by the difficulty of the task, and you cannot tell how smart the systems are by looking at the problems they are given. To illustrate the fallacy with a very concrete example:
Please do this task: prove P ≠ NP in the next 5 minutes. You will get $1M if you do.
Done?
Do you think you have become a much smarter mind because of that? I doubt so, but you were given a very hard task and a high reward.
The actual strategic difference, and what’s scary, isn’t the difficulty of the task, but the fact that human brains don’t multiply their size every few months.
Do you think you have become a much smarter mind because of that? I doubt so, but you were given a very hard task and a high reward.
No, but I was able to predict my own sensory input pretty well, for those 5 minutes. (I was sitting in a quiet room, mostly pondering how I would respond to this comment, rather than the actual problem you posed. When I closed my eyes, the sensory prediction problem got even easier.)
You could probably also train a GPT on sensory inputs (suitably encoded) instead of text, and get pretty good predictions about future sensory inputs.
Stepping back, the fact that you can draw a high-level analogy between neuroplasticity in human brains ⇔ SGD in transformer networks, and sensory input prediction ⇔ next token prediction doesn’t mean you can declare there is “no fundamental difference” in the nature of these things, even if you are careful to avoid the type error in your last example.
In the limit (maybe) a sufficiently good predictor could perfectly predict both sensory input and tokens, but the point is that the analogy breaks down in the ordinary, limited case, on the kinds of concrete tasks that GPTs and humans are being asked to solve today. There are plenty of text manipulation and summarization problems that GPT-4 is already superhuman at, and SGD can already re-weight a transformer network much more than neuroplasticity can reshape a human brain.