AGIs will run on neural networks that scale up 3x on a regular basis, which can be copied very cheaply after training, and which can process signals orders of magnitude faster than biological brains.
The implication of this phrase, that AGI will think much faster than humans, producing gigabytes of thoughts and plans in an instant, is not evident to me (I intend to explore this question more deeply soon, to increase my confidence). Smarter models are larger models, and larger models are generally slower than smaller ones. Today’s large language models already think at a speed comparable to humans’, roughly 50 ms per generated token.
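For concreteness, here is a minimal back-of-envelope sketch of that comparison; the words-per-token ratio and the human speech rate are rough assumptions on my part, not measurements:

```python
# Back-of-envelope sketch of the speed comparison above.
# The 0.75 words-per-token ratio and the human speech rate are rough
# assumptions, not measurements.
ms_per_token = 50                                  # latency figure cited above
tokens_per_second = 1000 / ms_per_token            # -> 20 tokens/s
words_per_token = 0.75                             # rough rule of thumb for English tokenizers
words_per_second = tokens_per_second * words_per_token  # -> ~15 words/s
human_speech_words_per_second = 2.5                # ~150 words per minute

print(f"LLM: ~{words_per_second:.0f} words/s, "
      f"human speech: ~{human_speech_words_per_second} words/s")
# Sequential generation is within roughly an order of magnitude of human
# verbal thinking speed, not thousands of times faster.
```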
So, unless some technology, from quantum computing to novel memory hardware, revolutionizes the basic performance characteristics of neural-net inference, I currently don’t expect AGI to think radically faster than humans do. It might even be slower, perhaps by an order of magnitude.
A separate question is how much AGI can parallelise: even a single GPU cluster can run on the order of 100 inference threads in parallel, and the model could also be replicated thousands of times on different physical hardware in different datacenters. AGI could use these parallel thought threads to run thought experiments or simulations and analyse all the results later to arrive at its “final” answer to some question, or the “best” plan for something. Or it could simply handle different issues in different “thought threads” or copies of itself. In either case, maintaining coherence and predictability would require a lot of communication, “negotiation”, cross-validation, etc. between these threads and copies, which won’t make the output arrive any sooner; see the sketch below.
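A minimal sketch of that fan-out/aggregate pattern; `run_thought_thread` and `cross_validate` are hypothetical stand-ins for whatever inference and reconciliation machinery would actually be used:

```python
# Sketch of the fan-out/aggregate pattern described above.
from concurrent.futures import ThreadPoolExecutor

def run_thought_thread(question: str, seed: int) -> str:
    """Hypothetical: one independent line of reasoning about `question`."""
    return f"hypothesis {seed} about {question!r}"

def cross_validate(candidates: list[str]) -> str:
    """Hypothetical: reconcile the parallel answers into a final one.
    This reconciliation step is inherently sequential, which is why
    fan-out alone does not make the final answer arrive sooner."""
    return max(candidates, key=len)  # placeholder aggregation rule

def answer(question: str, n_threads: int = 100) -> str:
    # Fan out ~100 parallel "thought threads", as a single GPU cluster might.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        candidates = list(pool.map(
            lambda s: run_thought_thread(question, s), range(n_threads)))
    # Then pay the sequential cost of communication and cross-validation.
    return cross_validate(candidates)

print(answer("what is the best plan?"))
```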
Besides, a similar idea, that digital minds might think “thousands or millions of times faster than humans”, appears in Shulman and Bostrom (2020), so it also looks dubious to me.
Thanks for this comment! I think it raises a very fair point. I do expect algorithmic improvements to make a significant difference—even if AGIs start off thinking more slowly than humans, algorithmic improvements would then allow us to train a smaller model with the same level of performance, but much faster. (This paper isn’t quite measuring the same thing, but “efficiency doubling every 16 months” seems like a reasonable baseline to me.) And techniques like model distillation would also help with that.
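To make that compounding concrete, here is a quick sketch of what a 16-month efficiency doubling time would imply over a few horizons (the horizons themselves are arbitrary illustrations):

```python
# Rough compounding of the "efficiency doubling every 16 months" figure
# quoted above; the chosen horizons are arbitrary illustrations.
doubling_months = 16
for years in (1, 3, 5, 10):
    speedup = 2 ** (years * 12 / doubling_months)
    print(f"{years:>2} years -> ~{speedup:.1f}x efficiency gain")
# e.g. 5 years -> 2**(60/16) ≈ 13.5x
```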
However, I think the claim as originally written was probably misleading, so I’ve rephrased it to focus on algorithmic improvements.