Sentences 1 and 4 should have higher probability than sentences 2 and 3. What they find is that GPT-2 does worse than chance on these kinds of problems. If a sentence is likely, a variation on the sentence with opposite meaning tends to have similar likelihood.
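For concreteness, here is a rough sketch of how that kind of check could be scored. The criterion (sentences 1 and 4 must each beat both 2 and 3) is my reading of the setup, and `quad_is_correct`, `accuracy`, and the `score` callable are hypothetical names; in practice `score` would be whatever returns a sentence log-probability from the model.

```python
from typing import Callable, Sequence

Quad = Sequence[str]  # sentences 1..4, in the order used above


def quad_is_correct(scores: Sequence[float]) -> bool:
    """True when sentences 1 and 4 both outscore sentences 2 and 3.

    Assumed criterion; the original evaluation may compare pairs differently.
    """
    s1, s2, s3, s4 = scores
    return min(s1, s4) > max(s2, s3)


def accuracy(quads: Sequence[Quad], score: Callable[[str], float]) -> float:
    """Fraction of quadruples the scorer ranks correctly."""
    if not quads:
        raise ValueError("need at least one quadruple")
    hits = sum(quad_is_correct([score(s) for s in quad]) for quad in quads)
    return hits / len(quads)
```

A scorer that assigns nearly the same likelihood to a sentence and its negation will fail this check on most quadruples, which is the failure mode described above.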
I can anecdotally confirm this; I’ve been personally calling this the “GPT swerve”, i.e., sentences of the form “We are in favor of recycling, because recycling doesn’t actually improve the environment, and that’s why we are against recycling.”
The proposed explanation makes sense as well. Is anyone trying to pre-train a GPT-2 with unlikelihood avoidance?
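To make the question concrete: if what I have in mind is something like an unlikelihood-style objective, a per-token loss might look like the sketch below. This is my own toy version, not anyone's actual training code; `unlikelihood_step_loss`, `negatives`, and `eps` are hypothetical names, and how the negative candidates would be chosen during pre-training is exactly the open part.

```python
import math
from typing import Iterable, Mapping


def unlikelihood_step_loss(
    probs: Mapping[str, float],
    target: str,
    negatives: Iterable[str],
    eps: float = 1e-12,
) -> float:
    """Loss for one prediction step.

    probs: the model's next-token distribution (token -> probability).
    target: the token that should be likely.
    negatives: tokens that should be unlikely here (assumed to be given).
    """
    # Usual likelihood term: reward probability on the target token.
    likelihood_term = -math.log(max(probs[target], eps))
    # Unlikelihood term: penalize probability on each negative token.
    unlikelihood_term = -sum(
        math.log(max(1.0 - probs[c], eps)) for c in negatives
    )
    return likelihood_term + unlikelihood_term
```

With "for" as the target and "against" as a negative, a model that splits its probability evenly between the two is penalized more than one that commits to "for".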