> why there has been so little discussion about his analysis since if true it seems to be quite important
I certainly don’t expect any prize for this, but I can at least address this part from my perspective.
Some of the energy-efficiency discussion (particularly interconnect losses) seems wrong to me, but it seems not to be a crux for anything, so I don’t care to spend time looking into it and arguing about it. If a silicon-chip AGI server were 1000× the power consumption of a human brain, with comparable performance, its electricity costs would still be well below my local minimum wage. So who cares? And the world will run out of GPUs long before it runs out of the electricity needed to run them. And making more chips (or brains-in-vats or whatever) is a far harder problem than making enough solar cells to power them, and that remains true even if we substantially sacrifice energy-efficiency for e.g. higher speed.
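(A quick back-of-envelope sketch of that electricity-cost point, using assumed round numbers that aren’t in the comment itself: a human brain runs on roughly 20 W, and electricity is taken to be about $0.15/kWh.)

```python
# Back-of-envelope: electricity cost of an AGI server drawing 1000x brain power.
# Assumed round numbers (not from the comment above): brain ~20 W,
# electricity ~$0.15 per kWh.
BRAIN_POWER_W = 20
POWER_RATIO = 1000
ELECTRICITY_USD_PER_KWH = 0.15

server_power_kw = BRAIN_POWER_W * POWER_RATIO / 1000          # 20 kW
cost_per_hour = server_power_kw * ELECTRICITY_USD_PER_KWH

print(f"Server draw: {server_power_kw:.0f} kW")
print(f"Electricity cost: ${cost_per_hour:.2f}/hour")          # ~$3/hour, below most minimum wages
```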
If we (or an AI) master synthetic biology and can make brains-in-vats, tended and fed by teleoperated robots, then we (or the AI) can make whole warehouses of millions of them, each far larger (and hence smarter) than would be practical in humans who had to schlep their brains around the savannah, and they can have far better cooling systems (liquid-cooled with 1°C liquid coolant coming out of the HVAC system, rather than blood-temperature which is only slightly cooler than the brain), and each can have an ethernet/radio connection to a distant teleoperated robot body, etc. This all works even when I’m assuming “merely brain efficiency”. It doesn’t seem important to me whether it’s possible to do even better than that.
Likewise, the post argues that existing fabs are pumping out the equivalent of ~~5 million~~ (5000 maybe? See thread below.) brains per year, which to me seems like plenty for AI takeover—cf. the conquistadors, or Hitler / Stalin taking over a noticeable fraction of humanity with a mere 1 brain each. Again, maybe there’s room for improvement in chip tech / efficiency compared to today, or maybe not, it doesn’t really seem to matter IMO.
Another thing is: Jacob & I agree that “the cortex/cerebellum/BG/thalamus system is a generic universal learning system”, but he argues that this system isn’t doing anything fundamentally different from the MACs and ReLUs and gradient descent that we know and love from deep learning, and I think he’s wrong, but I don’t want to talk about it for infohazard reasons. Obviously, you have no reason to believe me. Oh well. We’ll find out sooner or later. (I will point out this paper arguing that correlations between DNN-learned-model activations and brain-voxel activations are weaker evidence than they seem. The paper is mostly about vision but also has an LLM discussion in Section 5.) Anyway, there are a zillion important model differences that are all downstream of that core disagreement, e.g. how many GPUs it will take for human-level capabilities, how soon and how gradually-vs-suddenly we’ll get human-level capabilities, etc. And hence I have a hard time discussing those too ¯\_(ツ)_/¯
Jacob & I have numerous other AI-risk-relevant disagreements too, but they didn’t come up in the “Brain Efficiency” post.
> If a silicon-chip AGI server were 1000× the power consumption of a human brain, with comparable performance, its electricity costs would still be well below my local minimum wage. So who cares? And the world will run out of GPUs long before it runs out of the electricity needed to run them. And making more chips (or brains-in-vats or whatever) is a far harder problem than making enough solar cells to power them, and that remains true even if we substantially sacrifice energy-efficiency for e.g. higher speed.
I largely agree with this, except I will note that energy efficiency is extremely important for robotics, which is partly why robotics lags and will continue to lag until we have more neuromorphic computing.
But also, again: the entire world currently produces less than 5 TW of power, so even if we diverted all of it to running 20 kW AGIs, that would only yield a population of 250 million AGIs. But yes, given that Nvidia produces only a few hundred thousand high-end GPUs per year, GPU production is by far the current bottleneck.
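(Spelling out that arithmetic as a sketch, using the round numbers above:)

```python
# Upper bound on AGI population from world power output alone.
WORLD_POWER_W = 5e12      # "less than 5 TW" of total world power production
AGI_POWER_W = 20e3        # a 20 kW AGI server, as above

max_population = WORLD_POWER_W / AGI_POWER_W
print(f"Power-limited AGI population: {max_population:,.0f}")   # 250,000,000
```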
> If we (or an AI) master synthetic biology and can make brains-in-vats…
Yes but they take too long to train. The whole advantage of silicon AI is faster speed of thought (at the cost of enormous energy use).
> Likewise, the post argues that existing fabs are pumping out the equivalent of 5 million brains per year, which to me seems like plenty for AI takeover…
Err where? My last estimate is a few hundred thousand high-end GPUs per year, and it currently takes well more than one GPU to equal one brain (although that comparison is more complex).
> Another thing is: Jacob & I agree that “the cortex/cerebellum/BG/thalamus system is a generic universal learning system”, but he argues that this system isn’t doing anything fundamentally different from the MACs and ReLUs and gradient descent that we know and love from deep learning, and I think he’s wrong, but I don’t want to talk about it for infohazard reasons.
Not quite: GPTs just do a bit more than MACs and ReLUs—they also have softmax, normalization, transpose, etc. And in that sense the toolkit is complete; it’s more about what you implement with it and how efficient it is—but it’s obviously a universal circuit toolkit.
But in general I do think there are approaches likely to exceed the current GPT paradigm, and more to learn/apply from the brain—though further discussion in that direction should be offline.
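(For concreteness, here is a minimal numpy sketch of the circuit toolkit being described—matmuls/MACs, ReLU, softmax, normalization, and a transpose—wired into a single toy attention-plus-MLP block. It is purely illustrative, with random weights standing in for learned parameters; it is not anyone’s actual model.)

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def toy_transformer_block(x, d):
    """One toy attention + MLP block built from the op toolkit:
    matmuls (MACs), transpose, softmax, normalization, ReLU."""
    # Random weights stand in for learned parameters.
    Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
    W1 = rng.normal(size=(d, 4 * d)) / np.sqrt(d)
    W2 = rng.normal(size=(4 * d, d)) / np.sqrt(4 * d)

    h = layer_norm(x)
    q, k, v = h @ Wq, h @ Wk, h @ Wv             # MACs
    attn = softmax(q @ k.T / np.sqrt(d))         # transpose + softmax
    x = x + attn @ v                             # residual connection
    h = layer_norm(x)
    x = x + np.maximum(0.0, h @ W1) @ W2         # ReLU MLP
    return x

tokens = rng.normal(size=(8, 16))                # 8 tokens, width d=16
out = toy_transformer_block(tokens, d=16)
print(out.shape)                                 # (8, 16)
```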
I stand by brains-in-vats being relevant in at least some doom scenarios, notwithstanding the slow training. For example, I sometimes have arguments like:
ME: A power-seeking AGI might wipe out human civilization with a super-plague plus drone strikes on the survivors.
THEM: Even if the AGI could do that, it wouldn’t want to, because it wants to survive into the indefinite future, and that’s impossible without having humans around to manufacture chips, mine minerals, run the power grid, etc.
ME: Even if the AGI merely had access to a few dexterous teleoperated robot bodies and its own grid-isolated solar cell, at first, then once it wipes out all the humans, it could gradually (over decades) build its way back to industrial civilization.
THEM: Nope. Fabs are too labor-intensive to run, supply, and maintain. The AGI could scavenge existing chips but it could never make new ones. Eventually the scavenge-able chips would all break down and the AGI would be dead. The AGI would know that, and therefore it would never wipe out humanity in the first place.
ME: What about brains-in-vats?!
(I have other possible responses too—I actually wouldn’t concede the claim that nanofab is out of the question—but anyway, this is a context where brains-in-vats are plausibly relevant.)
I presume you’re imagining different argument chains, in which case, yeah, brains-in-vats that need 10 years to train might well not be relevant. :)
> Likewise, the post argues that existing fabs are pumping out the equivalent of 5 million brains per year, which to me seems like plenty for AI takeover…
> Err where?
In “Brain Efficiency” you wrote: “Nvidia—the single company producing most of the relevant flops today—produced roughly 5e21 flops of GPU compute in 2021, or the equivalent of about 5 million brains, perhaps surpassing the compute of the 3.6 million humans born in the US. With 200% growth in net flops output per year from all sources it will take about a decade for net GPU compute to exceed net world brain compute.”
…Whoops. I see. In this paragraph you were talking about FLOP/s, whereas you think the main constraint is memory capacity, which cuts it down by [I think you said] 3 OOM? But I think 5000 brains is enough for takeover too. Again, Hitler & Stalin had one each.
I will strike through my mistake above, sorry about that.
Oh I see. Memory capacity does limit the size of a model you can fit on a reasonable number of GPUs, but flops and bandwidth constrain the speed. In “Brain Efficiency” I was just looking at total net compute, counting all GPUs; more recently I was counting only flagship GPUs (as the small consumer GPUs aren’t used much for AI due to low RAM).
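(A sketch reconciling the two numbers in this exchange, using the figures stated above: the Brain Efficiency quote’s 5e21 FLOP/s over 5 million brains implies ~1e15 FLOP/s per brain, and the memory-capacity constraint is taken as the roughly 3 OOM haircut mentioned here.)

```python
# The two brain-equivalent counts being reconciled above.
NVIDIA_FLOPS_2021 = 5e21    # aggregate GPU FLOP/s produced in 2021, per the quote
BRAIN_FLOPS = 1e15          # FLOP/s per brain implied by the same quote (5e21 / 5e6)
MEMORY_HAIRCUT_OOM = 3      # the ~3 OOM memory-capacity correction discussed here

by_flops = NVIDIA_FLOPS_2021 / BRAIN_FLOPS
by_memory = by_flops / 10**MEMORY_HAIRCUT_OOM

print(f"Brain-equivalents by raw FLOP/s: {by_flops:,.0f}")    # 5,000,000
print(f"After the memory haircut:        {by_memory:,.0f}")   # 5,000
```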
I encourage you to share your knowledge concerning energy-efficiency & interconnect losses! I will split the prize between all serious entries.
(to me the supposed implications for DOOM & FOOM are not so interesting. fwiw I probably agree with what you say here, including and especially your last paragraph)
Oh fine, you talked me into it :)
😊 yayy