I’d say bottlenecks usually aren’t absolute, but rather quantifiable and flexible, trading off against cost, time, etc.?
One could say that we’ve reached the threshold where we’re bottlenecked on inference-compute, whereas previously talk of compute bottlenecks was about training-compute.
This seems to matter for some FOOM scenarios, since e.g. it limits how much FOOM can be achieved via self-duplication.
But the fact that AI companies are trying their hardest to scale up compute, and are also actively researching more compute-efficient algorithms, means IMO that the inference-compute bottleneck will be short-lived.
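To make the “quantifiable, not absolute” point concrete, here’s a minimal back-of-envelope sketch. All figures are hypothetical placeholders of my own, not estimates from the thread: the number of copies an AI could spin up is capped by available inference compute, and that cap only moves as compute is scaled up or algorithms become more compute-efficient.

```python
# Back-of-envelope sketch: inference compute bounds self-duplication.
# All numbers are made-up placeholders purely for illustration.

def max_copies(total_inference_flops_per_s: float,
               flops_per_copy_per_s: float) -> float:
    """How many full-speed copies a fixed inference budget can host."""
    return total_inference_flops_per_s / flops_per_copy_per_s

fleet = 1e21      # FLOP/s of inference hardware available (assumed)
per_copy = 1e15   # FLOP/s to run one copy in real time (assumed)

print(max_copies(fleet, per_copy))  # -> 1e6 copies: the ceiling moves only
                                    # by adding hardware or by making each
                                    # copy cheaper to run
```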
In what sense are they “not trying their hardest”?
I think you inserted an extra “not”.
Oh gosh, how did I hallucinate that?
Maybe you’re an LLM.