You don’t ‘scale on a subtask’; you scale the model, which can then be applied to many tasks. The question is not whether this or that task scales, but whether the model improves enough on enough tasks of importance to justify the costs of scaling. Since many of the tasks do look like they will scale well, that is prima facie plausible, and the burden is on people arguing otherwise.
Personally, I have not seen any OA people dismiss the idea of scaling. Slack comments certainly sound like they expect further scaling, other people report gossip about 100–1,000x scaling being planned, and scaling to solve tasks like Winogrande sounds like it would be useful; given how much the benchmarks undersell the reality of GPT-3, I wouldn’t put too much stock in them anyway.
I mean that a subtask is projected to be valuable enough to be worth the trouble. This is the first I’ve heard about the 100–1,000x scaling; that’s helpful to know. Thanks.