Meant to comment on this a while back but forgot. I have also thought about this and broadly agree that early AGI with ‘thoughts’ at GHz rates is highly unlikely. Originally this expectation arose because pre-ML EY and the community broadly associated thoughts with CPU ops, but in practice thoughts are more like forward passes through the model.
As Connor Sullivan says, the reason brains can get away with low clock rates is that our intelligence algorithms are embarrassingly parallel, as is current ML. Funnily enough, for large models (and definitely if we were to run forward passes through NNs as large as the brain), inference latency is already within an OOM or so of the brain’s (~100ms). Thanks to parallelisation, you can distribute a forward pass across many GPUs to decrease latency, but eventually you get throttled by the networking overhead.
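To make the networking-throttling point concrete, here is a toy latency model (pure illustration; every number in it is a made-up placeholder, not a measurement): the compute per layer shrinks as you shard across more GPUs, but a fixed per-layer communication cost does not, so per-pass latency bottoms out at a communication floor.

```python
# Toy model: per-pass latency when layer compute is sharded across N GPUs,
# with a fixed inter-GPU communication cost paid at every layer.
# All numbers are hypothetical placeholders.

def forward_pass_latency_ms(n_gpus,
                            layers=100,               # assumed model depth
                            compute_per_layer_ms=1.0, # single-GPU compute per layer
                            comm_per_layer_ms=0.05):  # per-layer networking overhead
    compute = layers * compute_per_layer_ms / n_gpus          # parallel part shrinks
    comm = layers * comm_per_layer_ms if n_gpus > 1 else 0.0  # overhead does not
    return compute + comm

for n in (1, 8, 64, 512):
    print(n, round(forward_pass_latency_ms(n), 2))
# 1 -> 100.0, 8 -> 17.5, 64 -> 6.56, 512 -> 5.2: latency falls quickly at first,
# then flattens out near the communication floor (layers * comm_per_layer_ms = 5ms).
```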
The brain, interestingly, achieves its relatively low latency by being highly parallel and shallow. The brain is not that many ‘layers’ deep. Even though each neuron is slow, the brain can perform core object recognition in <300ms at about 10 synaptic transmissions from retina → IT. Compare this to current resnets, which are >>10 layers deep. The brain does this through some combination of better architecture, a better inference algorithm, and adaptive compute which trades space for time: you don’t have to do all your thinking in a single forward pass but instead have recurrent connections, so you can keep pondering and improving your estimates over multiple ‘passes’.
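A minimal sketch of the recurrent ‘pondering’ idea (illustrative only: the weights are random and untrained, and the residual tanh update is just a placeholder): a shallow circuit is reused over multiple passes, so extra wall-clock time substitutes for extra depth.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(64, 64))  # one shallow recurrent 'layer' (random, untrained)

def ponder(x, n_passes):
    h = x
    for _ in range(n_passes):       # each extra pass ~ one more sweep through
        h = np.tanh(h @ W) + x      # the same shallow circuit (with a residual input)
    return h

x = rng.normal(size=64)
quick_estimate = ponder(x, n_passes=2)   # fast, rough answer
slow_estimate = ponder(x, n_passes=20)   # keep pondering for a more refined one
print(np.linalg.norm(slow_estimate - quick_estimate))
```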
Neuromorphic hardware can ameliorate some of these issues but not others. Potentially, it allows for much more efficient parallel processing and lets you replace a multi-GPU cluster with one really big neuromorphic chip. Theoretically this could enable forward passes at GHz speeds, though probably not within the next decade (technically, if you use pure analog or optical chips you can get even faster forward passes!). Downsides are the unknown hardware difficulty of the more exotic designs and general data-movement costs on chip. Energy intensity will also be huge at these speeds. Another bottleneck you run into in practice is simply the speed of encoding/decoding data at the analog-digital interface.
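On the energy point, a back-of-envelope (with a completely made-up per-pass energy figure, just to show the scaling): even a fixed, modest energy cost per forward pass becomes an enormous power draw once you run passes at GHz rates.

```python
# Back-of-envelope only; energy_per_pass_joules is a hypothetical placeholder.
energy_per_pass_joules = 0.1   # assumed cost of one large-model forward pass
passes_per_second = 1e9        # 'GHz-level thoughts'

power_watts = energy_per_pass_joules * passes_per_second
print(f"{power_watts:.1e} W")  # 1.0e+08 W, i.e. ~100 MW with these placeholder numbers
```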
Even on GPU clusters, early AGI can probably improve inference latency by a few OOMs, to 100-1000s of forward passes per second, just from low-hanging hardware/software improvements. Additional advantages an AGI could have are:
1.) Batching. GPUs are great at handling batches rapidly. The AGI can ‘think’ about ~1000 things in parallel, whereas the brain has to operate at batch size 1 (see the toy sketch after this list). Interestingly, this is also a potential limitation of a lot of neuromorphic hardware.
2.) Direct internal access to serial compute. Imagine you had a Python REPL in your brain that you could query and instantly get responses from. Same with instant internal database lookups.
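Here is the toy sketch referenced in point 1 (again, all numbers are hypothetical placeholders): because the fixed per-pass overhead is amortised across the batch, throughput in ‘thoughts per second’ scales far better than latency, an option the batch-size-1 brain simply doesn’t have.

```python
# Toy throughput model for batched inference; numbers are made-up placeholders.

def thoughts_per_second(batch_size,
                        base_latency_ms=50.0,  # fixed per-pass overhead
                        per_item_ms=0.05):     # marginal cost per extra batch item
    latency_ms = base_latency_ms + per_item_ms * batch_size
    return batch_size / (latency_ms / 1000.0)

for b in (1, 32, 1024):
    print(b, round(thoughts_per_second(b)))
# 1 -> ~20/s, 32 -> ~620/s, 1024 -> ~10000/s: roughly 500x the throughput
# for only ~2x the per-pass latency under these assumptions.
```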
Strongly upvoted, I found this very valuable/enlightening. I think you should make this a top level answer.