we already see that; we’re constantly amazed by it, despite little meaning of created texts
But GPT-3 is only trained to minimize prediction loss, not to maximize response. GPT-N may be able to crowd-please if it’s trained on approval, but I don’t think that’s what’s currently happening.
Upon reflection, you’re right that it won’t be maximizing response per se.
But as we get deeper it’s not so straightforward. GTP-3 models can be trained to minimize prediction loss (or, plainly speaking, to simply predict more accurately) on many different tasks, which usually are very simply stated (eg. choose a word that would fill the blank).
But we end up with people taking models trained thusly and use them to generate a long texts based on some primer. And yes, in most cases such abuse of the model will end up with text that is simply coherent. But I would expect humans to have a tendency to conflate coherence and persuasiveness.
I suppose one can fairly easily choose such prediction loss for GTP-3 models that the longer texts would have some desired characteristics. But also even standard tasks probably shape GTP-3 so that it would keep producing vague sentences that continue the primer and that give the reader a feel of “it making sense”. That would entail possibly producing fairly persuasive texts reinforcing primer thesis.
But GPT-3 is only trained to minimize prediction loss, not to maximize response. GPT-N may be able to crowd-please if it’s trained on approval, but I don’t think that’s what’s currently happening.
Upon reflection, you’re right that it won’t be maximizing response per se.
But as we get deeper it’s not so straightforward. GTP-3 models can be trained to minimize prediction loss (or, plainly speaking, to simply predict more accurately) on many different tasks, which usually are very simply stated (eg. choose a word that would fill the blank).
But we end up with people taking models trained thusly and use them to generate a long texts based on some primer. And yes, in most cases such abuse of the model will end up with text that is simply coherent. But I would expect humans to have a tendency to conflate coherence and persuasiveness.
I suppose one can fairly easily choose such prediction loss for GTP-3 models that the longer texts would have some desired characteristics. But also even standard tasks probably shape GTP-3 so that it would keep producing vague sentences that continue the primer and that give the reader a feel of “it making sense”. That would entail possibly producing fairly persuasive texts reinforcing primer thesis.