No, clock speed stays the same, but the clock-cycle latency of communication between regions increases, just as CPUs now need more clock cycles to access memory than they used to.
Sure, but from the point of view of per-token latency that’s going to have a similar effect, no?
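A rough sketch of why the two cases can look alike per token, assuming a toy model where per-token latency is total cycles divided by clock frequency, with cycles split into compute and inter-region communication and no overlap between them. The function name and all numbers are hypothetical, chosen only for illustration.

```python
def per_token_latency_s(compute_cycles, comm_cycles, clock_hz):
    """Seconds per token in a toy model: compute and communication
    cycles are serialized (no overlap), then divided by the clock rate."""
    return (compute_cycles + comm_cycles) / clock_hz


# Hypothetical baseline: 1 GHz clock, 1e6 compute cycles, 1e6 communication cycles.
baseline = per_token_latency_s(1_000_000, 1_000_000, 1e9)

# Case A: clock unchanged, but communication now costs twice as many cycles.
more_comm_cycles = per_token_latency_s(1_000_000, 2_000_000, 1e9)

# Case B: cycle counts unchanged, but the clock runs at 2/3 of the baseline rate.
slower_clock = per_token_latency_s(1_000_000, 1_000_000, 1e9 * 2 / 3)

print(baseline, more_comm_cycles, slower_clock)
```

In this model both cases land at the same wall-clock time per token. The difference is that a slower clock stretches every cycle, while higher communication latency only stretches the communication share, so how similar the two really are depends on how much of the per-token work is communication-bound.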