My understanding is that the general opinion at OpenAI and among other ML (but not AI) researchers is that it wouldn't be worth scaling on the sub-tasks that show good scaling so far, given training costs and researcher time. Sorry, I don't have a cite, as I've only been following this loosely.
Except if you have an idea to monetize one of these sub-tasks? An investment on the order of $10M in compute is not very large if you can create a Pokemon Comedy TV channel out of it, or something like that.
fair
You don't 'scale on a subtask'; you scale the model, which can then be applied to many tasks. The question is not whether this or that task scales, but whether the model improves enough on enough tasks of importance to justify the costs of scaling, and since lots of the tasks do look like they will scale well, that is prima facie plausible and the burden is on people arguing otherwise.
Personally, I have not seen any OA people dismiss the idea of scaling; Slack comments certainly sound like they expect further scaling; other people report gossip about 100-1,000x scaling being planned; scaling to solve tasks like WinoGrande sounds like it would be useful; and given how much the benchmarks undersell the reality of GPT-3, I wouldn't put too much stock in them anyway.
I mean that a subtask is projected to be valuable enough to be worth the trouble. This is the first I've heard about the 100-1,000x scaling; that's helpful to know. Thanks.