I came to the comments section to say something similar. Right now, the easiest way for companies to push the frontier of capabilities is to throw more hardware and electricity at the problem, along with making some efficiency improvements. If the cost or unavailability of electrical power or hardware were to become a bottleneck, that would simply tilt the equation further toward searching for more compute-efficient methods.
I believe there's plenty of room to spend more on research there and get decent returns on investment, so I doubt a compute bottleneck would make much of a difference. I'm fairly sure we're already well into a compute-overhang regime relative to what more efficient model architectures could accomplish with the compute we already have.
I think the same is true for data: there's similar potential to direct more research investment toward finding more data-efficient algorithms.