I agree with you about hype management in general, I think. The following does seem like a point of concrete disagreement:
It sounds like you expected “GPT” to mean something more like “paradigm-breaker” and so you were disappointed, but this feels like a ding on your expectations more than a ding on the paper.
If the paper had not done few-shot learning, and had just reviewed LM task performance / generation quality / zero-shot (note that zero-shot scales up well too!), I would agree with you.
However, as I read the paper, it touts few-shot as this new, exciting capability that only properly emerges at the new scale. I expected that, if any given person found the paper impressive, it would be for this purported newness and not only “LM scaling continues,” and this does seem to be the case (e.g. gwern, dxu). So there is a real, object-level dispute over the extent to which this is a qualitative jump.
I’m not sure I have concrete social epistemology goals beyond “fewer false beliefs” -- that is, I am concerned with group beliefs, but only because they point to which truths will be most impactful to voice. I predicted people would be overly impressed with few-shot, and I wanted to counter that. Arguably I should have concentrated less on “does this deserve the title GPT-3?” and more on few-shot itself, as I’ve done more recently.