We’ve known for a while that it’s possible to get good performance with far fewer parameters than BERT/GPT architectures use, e.g., ALBERT. The key point is that Gopher is much smaller and less capable than the human brain, even if we don’t know the appropriate metric for comparing such systems.
Agreed. Per Sam Altman’s statements, improving performance without scaling is also OpenAI’s plan for GPT-4. And Gopher is far less capable than a human brain. It’s just the “synapses as parameters” analogy that irks me. I see it everywhere, but it isn’t reliable and (despite disclaimers that the analogy isn’t one-to-one) it leads people to even less reliable extrapolations. Hopefully, a better metric will be devised soon.