280 billion parameters is still far fewer than the human brain’s synapse count (commonly estimated around 100 trillion). It’s closer to a rat’s brain, maybe even smaller than that.
Sure, but people do worry about harming rats, too, and, more importantly, by the time we get to actual human level it may already be too late. There is no prepared procedure for halting the whole scaling process, no robust humanity-meter to tell us when it is safe to proceed, and not even a consensus on the relevant abstract ethics.
DeepMind’s recent research pokes some holes in the already shaky analogy between synapses and parameters: RETRO achieved performance comparable to GPT-3 despite having 25x fewer parameters.
A human with Google also gets way better performance than a human without Google on “predict the next word of this website,” and RETRO similarly gets to lean on its retrieval database at inference time, so I’m not sure this undermines the analogy.
We’ve known for a while that it’s possible to get good performance with far fewer parameters than BERT/GPT-style architectures use; e.g., ALBERT. The key point is that Gopher is much smaller and less capable than the human brain, even if we don’t know the appropriate metric by which to compare such systems.
Agreed. Per Sam Altman’s statements, improving performance without scaling is also OpenAI’s plan for GPT-4. And Gopher is far less capable than a human brain. It’s just the “synapses as parameters” analogy that irks me: I see it everywhere, but it isn’t reliable, and (despite disclaimers that the analogy isn’t one-to-one) it leads people to even less reliable extrapolations. Hopefully a better metric will be devised soon. In the meantime, the rough numbers traded in this thread are sketched below.
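Since the thread trades in rough orders of magnitude, here is a minimal back-of-envelope sketch in Python. The parameter counts (RETRO at 7.5B for its largest variant, GPT-3 at 175B, Gopher at 280B) are from the respective papers; the synapse figures are loose order-of-magnitude assumptions (~10^12 for a rat brain, ~10^14 for a human brain), and parameters-per-synapse is exactly the shaky analogy under discussion, so treat the ratios as illustration only.

```python
# Back-of-envelope comparison of model parameter counts against rough
# synapse-count estimates. The synapse numbers are order-of-magnitude
# assumptions, not figures from the papers.

param_counts = {
    "RETRO (largest, 7.5B)": 7.5e9,
    "GPT-3": 175e9,
    "Gopher": 280e9,
}

synapse_estimates = {
    "rat brain (assumed ~1e12 synapses)": 1e12,
    "human brain (assumed ~1e14 synapses)": 1e14,
}

for model, n_params in param_counts.items():
    for brain, n_synapses in synapse_estimates.items():
        print(f"{model}: {n_params / n_synapses:.2%} of a {brain}")

# The "25x fewer parameters" claim: GPT-3 vs. the largest RETRO variant.
print(f"GPT-3 / RETRO: {175e9 / 7.5e9:.0f}x")  # ~23x, i.e. roughly the quoted 25x
```

Even on this crude accounting, Gopher lands at roughly a quarter of the assumed rat-scale synapse count and well under 1% of the human-scale one, which is all the “rat’s brain” comparison above is claiming.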