With massive size comes massive generalization ability: GPT-3 is competitive in many benchmarks without even tuning on the target task. [...] Perhaps the most impressive part, though, is that even at such a massive scale, the model still scales smoothly in performance instead of plateauing, implying that still-larger models would perform even better.
GPT-3: A Summary