Are you saying that GPT-3's training corpus was preprocessed to remove information about the author, title, and publication venue? Or are you only talking about what happens when this info is outside the context window?
No, it’s a more philosophical point. Even if such things appear in the context window, they’re simply more text, and convey the same kind of information: not “the denotation of these words is factually true,” but “these words are part of the text.”
For example, the mere appearance of something like
Title: Why GPT wants to mesa-optimize & how we might change this
Author: John_Maxwell
does not guarantee that the text following it bears that title, or was written by that author. (As I am illustrating right now.)
Of course, one can design datasets where information like this is provided more authoritatively—say, always at the start of each text, curated for quality, etc. (GPT isn’t like that, but Grover and CTRL kind of are, in different ways.)
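Concretely, a curated-metadata corpus of that kind amounts to something like the sketch below (the function and field layout here are hypothetical, not Grover's or CTRL's actual format); the point is that the header sits in a fixed, curated slot rather than being arbitrary scraped text:

def make_training_example(title, author, body):
    # Hypothetical sketch: metadata lives in a trusted, fixed position at the
    # start of every training text, unlike a "Title:/Author:" header that just
    # happens to appear in the middle of scraped web text.
    return f"Title: {title}\nAuthor: {author}\n\n{body}"

example = make_training_example(
    "Why GPT wants to mesa-optimize & how we might change this",
    "John_Maxwell",
    "(post body goes here)",
)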
But even that can only go so far. If the author is “Julius Caesar,” does that mean the historical figure, some internet poster with that handle, or any number of other possibilities? A passage of fiction written in a character’s voice—is the appropriate author cue the actual writer (who may have written in many different voices over their career) or the character? (Note that the character is a much better answer to the question “who does this sound like?”) And doesn’t the date matter too, so we know whether this post in the venue “Less Wrong” was on 2010’s LW or 2020’s?
Fundamentally, language modeling is about understanding structures in decontextualized blocks of contiguous words. You can try to hack in some sidechannels to provide context, but there’s no way they will capture everything needed to locate the text fully in its social, physical, and temporal position within the broader world. And just as a definitional matter, these sidechannels are modifications to “language modeling,” which in its purest sense is just about filling in an arbitrary text from substrings of it (and no other information).
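In symbols, that purest sense is just the autoregressive objective over raw token strings,

p_\theta(x_1, \dots, x_T) = \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t}), \qquad \mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t}),

with nothing but the preceding tokens in the conditioning set; a metadata sidechannel amounts to an extra conditioning variable c, i.e. p_\theta(x_t \mid x_{<t}, c), which is already a modification of that objective.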
My intuition is that small-L lookahead could be close to large-L lookahead in programspace for something like an RNN, but not for GPT-3's transformer architecture.
Yeah, not for transformers I think.
Anyway, the question here isn’t whether lookahead will be perfectly accurate, but whether the post-lookahead distribution of next words will allow for improvement over the pre-lookahead distribution.
capybaralet’s point about conservation of expected evidence applies here—GPT is trying to be optimal at next-step prediction, and an optimal next-step predictor should not get improved by lookahead, it should already have those facts priced in to its next-step prediction.
If we then say “the mechanism for pricing them in is doing internal lookahead,” then we are imagining that lookahead operating over some predictor that is otherwise good but hasn’t priced in lookahead yet. But I don’t know why we should imagine the computation would naturally factor this way, when the benefits of lookahead are small and beam search would take a lot of parameters to implement internally.
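One way to spell out “priced in”: for the distribution the model is trying to match, the next-token probability is already the marginal over every possible continuation,

p(x_t \mid x_{<t}) = \sum_{x_{t+1:t+k}} p(x_{t:t+k} \mid x_{<t}),

so k-step lookahead can only change the answer where the model’s one-step marginal disagrees with its own joint over continuations, i.e. exactly where it is already suboptimal.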
Your philosophical point is interesting; I have a post in the queue about that. However, I don’t think it really proves what you want it to.
Having John_Maxwell in the byline makes it far more likely that I’m the author of the post.
If humans can make useful judgements re: whether this is something I wrote, vs something nostalgebraist wrote to make a point about bylines, I don’t see why a language model can’t do the same, in principle.
GPT is trying to be optimal at next-step prediction, and an optimal next-step predictor should not get improved by lookahead, it should already have those facts priced in to its next-step prediction.
A perfectly optimal next-step predictor would not be improved by lookahead or anything else, it’s perfectly optimal. I’m talking about computational structures which might be incentivized during training when the predictor is suboptimal. (It’s still going to be suboptimal after training with current technology, of course.)
In orthonormal’s post they wrote:
...GPT-3’s ability to write fiction is impressive - unlike GPT-2, it doesn’t lose track of the plot, it has sensible things happen, it just can’t plan its way to a satisfying resolution.
I’d be somewhat surprised if GPT-4 shared that last problem.
I suspect that either GPT-4 will still be unable to plan its way to a satisfying resolution, or GPT-4 will develop some kind of internal lookahead (probably not beam search, but beam search could be a useful model for understanding it) which is sufficiently general to be re-used across many different writing tasks. (Generality takes fewer parameters.) I don’t know what the relative likelihoods of those possibilities are. But the whole idea of AI safety is to ask what happens if we succeed.
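To make the beam-search picture concrete, here is a minimal toy sketch (the stand-in model and all names are hypothetical, not anything GPT actually implements) of choosing the next token by rescoring k-step greedy rollouts instead of trusting the one-step distribution alone:

import math
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]

def next_token_logprobs(prefix):
    # Stand-in for a trained model: a deterministic toy distribution over VOCAB
    # given the prefix. (Hypothetical; any real model would go here.)
    rng = random.Random(hash(tuple(prefix)) % (2 ** 32))
    weights = [rng.random() + 1e-9 for _ in VOCAB]
    total = sum(weights)
    return {tok: math.log(w / total) for tok, w in zip(VOCAB, weights)}

def rollout_score(prefix, first_token, depth):
    # Log-prob of the continuation that starts with first_token and then
    # follows the toy model greedily for `depth` further steps.
    logp = next_token_logprobs(prefix)[first_token]
    seq = list(prefix) + [first_token]
    for _ in range(depth):
        dist = next_token_logprobs(seq)
        best = max(dist, key=dist.get)
        logp += dist[best]
        seq.append(best)
    return logp

def pick_next_token(prefix, depth=0):
    # depth=0 is ordinary next-step prediction; depth>0 adds lookahead.
    scores = {tok: rollout_score(prefix, tok, depth) for tok in VOCAB}
    return max(scores, key=scores.get)

prefix = ["the", "cat"]
print(pick_next_token(prefix, depth=0))  # pre-lookahead choice
print(pick_next_token(prefix, depth=3))  # choice after 3-step lookahead

For a model whose one-step marginals already agree with its own rollouts, depth makes no difference; the interesting case is the suboptimal model, where the two choices can diverge.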