PhD Student at University of Washington & Visiting Researcher at Facebook AI Research
tim_dettmers(Tim Dettmers)
The brain does not use quantum computing as far as I know. At least not in the sense that would be highly beneficial computationally. The brain achieves a compute density many orders of magnitude denser than any classical computer we will ever able to design at about 20 W of energy that is the primary advantage of the brain. We cannot build such a structure with silicon, because it would overheat. Even if we would be able to cool it (room temperature superconductors), there is no way to manufacture it. 3D memory is just a couple of layers extra but the yield is so bad that it is almost not economically viable to produce it because the manufacturing process is so unreliable. It is unlikely that we will able to manufacture it more reliably in the future because our tools are already very close to physical limits.
You might say “Why not just build a brain biologically?”. Well for that we would need to understand the brain at the protein level and how to coordinated proteins from scratch. From our tools it is unlikely that that will ever happen. You cannot measure a collection of tiny things that are closely bundled together when you need large instruments to measure the tiny things. There is just not enough space, geometrically, to do that. And there are more problems after you understood how the brain works on the protein level. I think efficient biological computation it is just not a physically plausible concept.
With those two eliminated there is just not a way to replicate a human brain. There are different ways to achieve super-human compute capabilities by exploiting some physical dimensions which are limited for brains but such an intelligence would be very different from a human talent. Maybe superior in many aspects, but not AGI in the general sense.
My estimate is very different from what others suggest and this stems from my background and my definition of AGI. I see AGI as human-level intelligence. If we present a problem to an AGI system, we would expect that it does not make any “silly” mistakes, but that it makes reasonable responses like a competent human would do.
My background: I work in deep learning on very large language models. I worked on the parallelization of deep learning in the past. I also have in-depth knowledge of GPUs and accelerators in general. I developed the fasted algorithm for top-k sparse-sparse matrix multiplication on a GPU.
I wrote about this 5-years ago, but since then my opinion has not changed: I believe that AGI will be physically impossible with classical computers.
It is very clear that intelligence is all about compute capabilities. The intelligence of mammals is currently limited energy intake — including the intelligence of humans. I believe that the same is true for AI algorithms and these patterns seem to be very clear if you look at the trajectory of compute requirements over the past years.
The main issues are these: You cannot build structures smaller than atoms; you can only dissipate a certain amount of heat per square area; the smaller the structures are that you print with lasers, the higher the probability of artifacts; light can only go a certain distance per second; the speed of SRAM scales sub-linearly with its size. These are all hard physical boundaries that we cannot alter. Yet, all these physical boundaries will be hit within a couple of years and we will fall very, very far short of human processing capabilities and our models will not improve much further. Two orders of magnitude of additional capability are realistic, but anything beyond that is just wishful thinking.
You may say it is just a matter of scale. Our hardware will not be as fast as brains but we just build more of them. Well, the speed of light is fixed and networking scales abysmally. With that, you have a maximum cluster size that you cannot exceed without losing processing power. The thing is, even in current neural networks, doubling the number of GPUs sometimes doubles training time. We will design neural networks that scale better, such as mixtures of experts, but this will not be nearly enough (this will give us another 1-2 orders of magnitude).
We will switch to wafer-scale compute in the next years and this will yield a huge boost in performance, but even wafer-scale chips will not yield the long-term performance that we need to get anywhere near AGI.
The only real possibility that I see is quantum computing. It is currently not clear what quantum AI algorithms would look like but we seem to be able to scale quantum computers double exponentially over time aka Neven’s Law. The real question is if quantum error correction also scales exponentially (the current data suggests this) or if it can scale sub-exponentially. If it scales exponentially, quantum computers will be interesting for chemistry, but they would be useless for AI. If it scales sub-exponentially we will hit quantum computers that are faster than classical computers in 2035. Due to double exponential scaling, the quantum computers in 2037 would be an unbelievable amount more powerful than all classical computers before. We might not be able to reach AGI still because we cannot feed such computer data quickly enough to keep up with its processing power, but I am sure we might be able to figure something out to feed a single powerful quantum computer. Currently, the input requirements for pretraining large deep learning models are minimal for natural language but high for images and videos. As such, we might still not have AGI with quantum computers, but we would have computers with excellent natural language processing capabilities.
If quantum computer do not work, I would predict that we never will reach AGI — hence the probability is zero after the peak in 2035-2037.
If the definition of AGI is relaxed and models are allowed to make “stupid mistakes” and the requirement is just that they on average solve problems better than humans. I would be pretty much in line with Ethan’s predictions (Ethan and I chatted about this before).
Edit: A friend told me that it is better to put my probability after 2100 if I believe it is not happening after the peak 2037. Updated the graph.
Thanks, Ethan! I updated the post with that info!