There are multiple ways to interpret “being an actual human”. I interpret it as pointing at an ability level.
“the task GPTs are being trained on is harder” ⇒ the prediction objective doesn’t top out at (i.e. the task has more difficulty in it than).
“than being an actual human” ⇒ the ability level of a human (i.e. the task of matching the human ability level at the relevant set of tasks).
Or as Eliezer said:
- I said that GPT’s task is harder than being an actual human; in other words, being an actual human is not enough to solve GPT’s task.
In different words again: the tasks GPTs are being incentivised to solve aren’t all solvable at a human level of capability.
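(As a minimal sketch of why the objective doesn’t top out at the human level, assuming the standard next-token cross-entropy loss, which neither comment spells out explicitly:)

```latex
% Assumed standard next-token prediction objective (not stated explicitly in the thread)
\mathcal{L}(\theta) = -\,\mathbb{E}_{x \sim \mathcal{D}}\Big[\textstyle\sum_t \log p_\theta(x_t \mid x_{<t})\Big]
```

This loss is only fully minimised when p_θ matches the true conditional distribution of the training corpus; a predictor that merely matches human-level text prediction still leaves excess loss on the table, so the training incentive keeps pointing past the human ability level.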
You almost had it when you said:
- Maybe you mean something like task + performance threshold. Here ‘predict the activation of photoreceptors in human retina well enough to be able to function as a typical human’ is clearly less difficult than task + performance threshold ‘predict next word on the internet, almost perfectly’. But this comparison does not seem to be particularly informative.
It’s more accurate if I edit it to:
- Maybe you mean something like task + performance threshold. Here ‘predict ~~the activation of photoreceptors in human retina~~ [text] well enough to be able to function as a typical human’ is clearly less difficult than task + performance threshold ‘predict next word on the internet, almost perfectly’.
You say it’s not particularly informative. Eliezer responds by explaining the argument his statement was responding to, which provides the context in which it is an informative statement about the training incentives of a GPT.