The question is not about the very general claim, or general argument, but about this specific reasoning step:
GPT-4 is still not as smart as a human in many ways, but it’s naked mathematical truth that the task GPTs are being trained on is harder than being an actual human.
And since the task that GPTs are being trained on is different from and harder than the task of being a human, ….
I do claim this is not locally valid, that’s all (and recommend reading the linked essay). I do not claim that the broad argument (that the text prediction objective doesn’t stop incentivizing higher capabilities once you reach human-level capability) is wrong.
I do agree communication can be hard, and maybe I misunderstand the two quoted sentences, but it seems very natural to read them as making a comparison between tasks at the level of the math.
I don’t understand the problem with this sentence. Yes, the task is harder than the task of being a human (as good as a human is at that task). Many objectives that humans optimize for are not optimized to 100%, and as such humans face many tasks that they would like to get better at, tasks which are therefore harder than the task of simply being a human. Indeed, if you optimized an AI system on those, you would also get no guarantee that the system would end up only as competent as a human.
This is a fact about practically all tasks (including things like calculating the nth digit of pi, or playing chess), but it is indeed a fact that lots of people get wrong.
(I affirm this as my intended reading.)
There are multiple ways to interpret “being an actual human”. I interpret it as pointing at an ability level.
“the task GPTs are being trained on is harder” ⇒ the prediction objective doesn’t top out at (i.e. the task has more difficulty in it than).
“than being an actual human” ⇒ the ability level of a human (i.e. the task of matching the human ability level at the relevant set of tasks).
Or as Eliezer said:
I said that GPT’s task is harder than being an actual human; in other words, being an actual human is not enough to solve GPT’s task.
In different words again: the tasks GPTs are being incentivised to solve aren’t all solvable at a human level of capability.
You almost had it when you said:
- Maybe you mean something like task + performance threshold. Here ‘predict the activation of photoreceptors in human retina well enough to be able to function as a typical human’ is clearly less difficult than task + performance threshold ‘predict next word on the internet, almost perfectly’. But this comparison does not seem to be particularly informative.
It’s more accurate if I edit it to:
- Maybe you mean something like task + performance threshold. Here ‘predict the ~~activation of photoreceptors in human retina~~ [text] well enough to be able to function as a typical human’ is clearly less difficult than task + performance threshold ‘predict next word on the internet, almost perfectly’.
You say it’s not particularly informative. Eliezer responds by explaining the argument that this comparison responds to, which provides the context in which it is an informative statement about the training incentives of a GPT.
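The gloss above, that the prediction objective doesn’t top out at human ability, can be made concrete with a minimal numerical sketch. This is my illustration, not part of the original exchange, and the distributions are invented for the purpose: the cross-entropy loss GPTs are trained on is only minimized by the true conditional distribution over next tokens, so a predictor that merely matches a human still sits above the loss floor, and the gradient incentive keeps pointing past human level.

```python
# Toy illustration (invented numbers): the next-token cross-entropy objective
# is only minimized by the true conditional distribution of the text, so a
# predictor that is merely "human-level" still leaves a gap for training to close.
import math

def cross_entropy(true_dist, predicted_dist):
    """Expected negative log-probability of the next token under the predictor."""
    return -sum(p * math.log(q) for p, q in zip(true_dist, predicted_dist))

# Hypothetical "true" next-token distribution over a 4-token vocabulary.
true_dist = [0.70, 0.15, 0.10, 0.05]

# Hypothetical "human-level" predictor: right ranking, imprecise probabilities.
human_level = [0.55, 0.25, 0.15, 0.05]

floor = cross_entropy(true_dist, true_dist)         # entropy of the true distribution, ~0.91 nats
human_loss = cross_entropy(true_dist, human_level)  # ~0.97 nats

print(f"loss floor:          {floor:.3f} nats")
print(f"human-level loss:    {human_loss:.3f} nats")
print(f"remaining incentive: {human_loss - floor:.3f} nats")  # ~0.05 nats still to be gained
```

The thread’s claim is that on real internet text this gap is large, which is the sense in which being an actual human is not enough to solve GPT’s task.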