In contrast, an AGI could grow its capabilities [...] by manufacturing or otherwise acquiring new processors to expand its computing power.
For this to be a solid argument, please explain why Amdahl’s Law does not apply.
You cannot take an arbitrary program and simply toss more processors at it. Machine learning is nearly the best case[1] for Amdahl’s Law we have in our current regime. Scaling to the extent that you’re talking about for AI… it may be possible, but is nowhere near ‘simply toss more processors at it’.
[1] It’s “all” just predictable element-wise applications of functions and matrix multiplications, give or take.
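For reference, since the exchange turns on it: Amdahl’s Law bounds the speedup from adding processors by the serial fraction of the work. If a fraction p of a program parallelizes perfectly across n processors, the speedup is 1 / ((1 - p) + p/n). A minimal sketch with purely illustrative numbers, not anything estimated in this thread:

```python
# Minimal sketch of Amdahl's Law (illustrative numbers only): speedup from
# n processors when a fraction p of the work parallelizes perfectly.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# Even a workload that is 99% parallel (roughly the "best case" regime the
# comment describes for matrix-multiply-heavy ML) saturates quickly:
for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} processors: {amdahl_speedup(0.99, n):6.1f}x speedup")
# ~9.2x, ~50.3x, ~91.0x, ~99.0x: capped near 1 / (1 - p) = 100x no matter
# how many processors are added.
```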
I agree with you that Amdahl’s law applies; this claim isn’t meant to be read as “it’s possible to make programs faster indefinitely by getting more processors”. Two points:
The final value of r that I estimate for AI is only one to two orders of magnitude above the value that Roodman estimates for human civilization. This is much faster, of course, but the fact that it’s not much larger suggests there are indeed obstacles to making models run arbitrarily fast, and that those obstacles operate on timescales not too far from the ones at which we grow.
I think it’s worth pointing out that we can make architectures more parallelizable, and have done so in the past. RNNs were abandoned not only because of the quadratic scaling with hidden state dimension, but also because backpropagation through them is not a parallelizable computation. It seems that when we run into Amdahl’s-law-style constraints, we can get around them to a substantial extent by replacing our architectures with more parallelizable ones, such as by moving from RNNs to Transformers.
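A rough sketch of that contrast in plain NumPy (made-up shapes and sizes, not code from the post): the RNN recurrence is a serial loop over timesteps, while self-attention touches every timestep with a few batched matrix multiplications and so can be spread across processors.

```python
# Sketch of the parallelism difference (arbitrary sizes, purely illustrative).
import numpy as np

T, d = 128, 64                      # sequence length, hidden size
x = np.random.randn(T, d)
W_h = np.random.randn(d, d) * 0.01  # recurrent weights: a d x d matrix,
W_x = np.random.randn(d, d) * 0.01  # hence the quadratic cost in hidden size

# RNN: each step depends on the previous one, so these T iterations are
# inherently serial, exactly the kind of fraction Amdahl's Law punishes.
h = np.zeros(d)
for t in range(T):
    h = np.tanh(W_h @ h + W_x @ x[t])

# Self-attention (single head, no learned projections, for brevity): every
# position attends to every other in one batched computation, so all T
# timesteps can be processed at once.
scores = (x @ x.T) / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
attn_out = weights @ x              # shape (T, d), computed without a time loop
```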