Which brings me to the second line of very obvious-seeming reasoning that converges upon the same conclusion—that it is in principle possible to build an AGI much more computationally efficient than a human brain—namely that biology is simply not that efficient, and especially when it comes to huge complicated things that it has started doing relatively recently.
Biological cells are computers which must copy bits to copy DNA. So we can ask biology: how much energy do cells use to copy each base pair? Seems they use just 4 ATP per base pair, or 1 ATP/bit, and thus within an OOM of the ‘Landauer bound’. Which is more impressive if you consider that the typically quoted ‘Landauer bound’ of kT ln 2 is overly optimistic, as it only applies when the error probability is 50% or the computation takes infinite time. Useful computation requires finite time and better-than-chance reliability.
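For a rough sense of scale, the ATP-versus-Landauer comparison can be sketched numerically. This is a back-of-envelope sketch, not the commenter's own calculation: the 310 K body temperature and the ATP energy (standard free energy of hydrolysis, about 30.5 kJ/mol) are assumptions added here, and the 1 ATP/bit figure is simply taken from the comment as stated.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K (exact SI value)
N_A = 6.02214076e23     # Avogadro constant, 1/mol (exact SI value)
EV = 1.602176634e-19    # joules per electronvolt (exact SI value)


def landauer_ev(temp_kelvin):
    """Landauer bound kT ln 2, expressed in eV."""
    return K_B * temp_kelvin * math.log(2) / EV


# Assumption (not from the thread): standard free energy of ATP hydrolysis.
ATP_KJ_PER_MOL = 30.5
# Figure quoted in the comment above.
ATP_PER_BIT = 1.0

atp_ev = ATP_KJ_PER_MOL * 1e3 / N_A / EV   # energy of one ATP, in eV
bound_ev = landauer_ev(310.0)              # assumed body temperature
ratio = ATP_PER_BIT * atp_ev / bound_ev    # copy energy per bit in units of kT ln 2

print(f"kT ln 2 at 310 K : {bound_ev:.4f} eV")
print(f"one ATP          : {atp_ev:.3f} eV")
print(f"energy per bit   : {ratio:.1f} x Landauer")
```

With these inputs the cost per copied bit lands somewhere between 10x and 20x kT ln 2. In-cell ATP free energy is usually quoted somewhat higher than the standard value, which would push the ratio up accordingly.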
Brains have to pump thousands of ions in and out of each stretch of axon and dendrite, in order to restore their ability to fire another fast neural spike. The result is that the brain’s computation is something like half a million times less efficient than the thermodynamic limit for its temperature—so around two millionths as efficient as ATP synthase.
The fact that cell replication operates at the Landauer bound already suggests a prior that neurons should be efficient.
The Landauer bound at room temp is ~0.02 to 0.03 eV (kT ln 2 is ~0.018 eV; kT itself is ~0.026 eV). Given that an electron is something of an obvious minimal unit for an electrical computer, the Landauer bound can be thought of as a ~30 mV thermal noise barrier. Digital computers operate roughly 30x that for speed and reliability, but if you look at neuron swing voltages it’s clear they are operating only ~3x or so above the noise voltage (optimizing hard for energy efficiency at the expense of speed).
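The voltage comparison can be checked with a few lines of arithmetic. This is only a sketch: the 1 V digital logic swing and the 100 mV neuron swing are round numbers assumed here for illustration, not values given in the thread.

```python
K_B = 1.380649e-23      # Boltzmann constant, J/K (exact SI value)
Q_E = 1.602176634e-19   # elementary charge, C (exact SI value)


def thermal_voltage_mv(temp_kelvin):
    """Thermal voltage kT/q in millivolts: the energy kT carried by one electron charge."""
    return K_B * temp_kelvin / Q_E * 1e3


# Assumptions (illustrative round numbers, not from the thread).
DIGITAL_SWING_MV = 1000.0   # ~1 V logic swing
NEURON_SWING_MV = 100.0     # ~100 mV action potential swing

v_noise_mv = thermal_voltage_mv(300.0)
digital_ratio = DIGITAL_SWING_MV / v_noise_mv
neuron_ratio = NEURON_SWING_MV / v_noise_mv

print(f"kT/q at 300 K      : {v_noise_mv:.1f} mV")
print(f"digital swing / kT : {digital_ratio:.0f}x")
print(f"neuron swing / kT  : {neuron_ratio:.1f}x")
```

Under these assumptions the digital swing sits a few tens of times above the thermal voltage and the neuron swing only a few times above it, which is the contrast the comment is pointing at.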
Assuming 1 Hz * 10^14 synapses / 10 watts = 10^13 synops/J (synaptic ops per second per watt), each synop gets about 10^-13 J, or about 10^7 electron charges at Landauer voltage. A synaptic op is at least doing analog signal multiplication, which requires far more energy/charges than a simple binary op. IIRC you need roughly 2^(2K) carriers and thus erasures to have precision equivalent to K-bit digital, so an 8-bit synaptic op (which IIRC is near where digital/analog mult energy intersects) would be 10^4 or 10^5 carriers. I had a relevant ref for this, can’t find it now (but think you can derive it from the binomial distribution when std dev/precision is equivalent to 2^-8).
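The arithmetic in this estimate is easy to reproduce. The sketch below uses only the round numbers from the comment (1 Hz, 10^14 synapses, 10 W, ~0.03 eV per carrier); the variable and function names are made up for illustration.

```python
EV = 1.602176634e-19    # joules per electronvolt (exact SI value)

# Round numbers taken from the comment above.
BRAIN_POWER_W = 10.0
SYNAPSES = 1e14
RATE_HZ = 1.0
LANDAUER_EV = 0.03      # the comment's ~0.03 eV per carrier

synops_per_joule = RATE_HZ * SYNAPSES / BRAIN_POWER_W   # ops per second per watt
ev_per_synop = 1.0 / synops_per_joule / EV               # energy budget per synaptic op
charges_per_synop = ev_per_synop / LANDAUER_EV           # carriers at the Landauer energy


def carriers_for_bits(k):
    """Carrier count 2^(2K) the comment cites for K-bit-equivalent analog precision."""
    return 2 ** (2 * k)


print(f"synops per joule      : {synops_per_joule:.1e}")
print(f"eV per synop          : {ev_per_synop:.2e}")
print(f"charges per synop     : {charges_per_synop:.1e}")
print(f"carriers for 8 bits   : {carriers_for_bits(8)}")
```

The per-synop budget comes out on the order of 10^7 Landauer-energy carriers, while an 8-bit-equivalent analog value under the 2^(2K) rule needs 2^16 carriers, which is where the "10^4 or 10^5" figure comes from.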
Now most synapses are probably smaller/cheaper than 8-bit equiv, but most of the energy cost involved is in pushing data down irreversible dissipative wires (just as true in the brain as it is in a GPU). Now add in the additional costs of synaptic adjustment machinery for learning, cell maintenance tax, dendritic computation, etc and it’s suddenly not clear at all that the brain is really far from energy efficient.
As further and final Bayesian evidence, Moore’s Law is running out of steam as we run up against the limits of physics (for irreversible computation using irreversible wires), and at best is just catching up to brain energy efficiency.
Imprecisely multiplying two analog numbers should not require 10^5 times the minimum bit energy in a well-designed computer.
A well-designed computer would also use, say, optical interconnects that worked by pushing one or two photons around at the speed of light. So if neurons are in some sense being relatively efficient at the given task of pumping thousands upon thousands of ions in and out of a depolarizing membrane in order to transmit signals at 100m/sec—every ion of which necessarily uses at least the Landauer minimum energy—they are being vastly far from optimally efficient.
The moment you see ions going in and out of a depolarizing membrane, and contrast that to the possibility of firing a photon down a fiber, you ought to be done asking whether or not biology has built an optimally efficient computer. It actually isn’t any more complicated than that. You are driving yourself further from sanity if you then try to do very complicated reasoning about how it must be close to the limit of efficiency to pump thousands of ions in and out of a membrane instead.
Much depends on your exact definition of ‘imprecisely’. But if we assume exactly 8-bit equivalent SNR, as I was using above, then you can look up this question in the research literature and/or ask an LLM, and the standard answer is in fact close to 1e5 eV.
This multiplication op is a masking operation and not inherently reversible, so it erases/destroys about 1/2 of the energy of the photonic input signal (100% if you multiply by 0, etc). So the min energy boils down to that required to represent an 8-bit number reliably as an analog signal (so for example you could convert a digital 8-bit signal to analog and back to digital losslessly all at the same standard sufficient 1 eV reliability).
Analog signals effectively represent numbers as the 1st moment of a binomial distribution over carrier particles, and the information content is basically the entropy of a binomial over 1eV carriers which is ~0.5 log2(N) and thus N ~ 2^(2b) quanta for b bits of precision.
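The scaling claimed here, entropy growing like 0.5 log2(N) so that b bits need N ~ 2^(2b) carriers, can be checked numerically. The sketch below is an illustration rather than the commenter's derivation: it assumes a fair binomial (p = 0.5), computes its exact entropy, and compares it with the standard Gaussian approximation 0.5 log2(2πe·N·p·(1−p)).

```python
import math


def binomial_entropy_bits(n, p=0.5):
    """Exact Shannon entropy (in bits) of a Binomial(n, p) distribution."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for k in range(n + 1):
        # log of the binomial pmf, via log-gamma to avoid overflow for large n
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * log_p + (n - k) * log_q)
        pmf = math.exp(log_pmf)
        if pmf > 0.0:  # far tails underflow to zero and contribute nothing
            total -= pmf * log_pmf / math.log(2)
    return total


def gaussian_approx_bits(n, p=0.5):
    """Gaussian approximation to the binomial entropy, in bits."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * n * p * (1.0 - p))


for b in (4, 6, 8):
    n = 2 ** (2 * b)  # carrier count N = 2^(2b) from the comment
    print(b, n, round(binomial_entropy_bits(n), 3), round(gaussian_approx_bits(n), 3))
```

For p = 0.5 the entropy works out to 0.5 log2(N) plus a constant of roughly one bit, so N = 2^(2b) carriers carry about b bits up to that additive offset, consistent with the rule of thumb in the comment.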
The energy to represent an analog signal doesn’t depend much on the medium—whether you are using photons or electrons/ions. The advantage of the electronic medium is the much smaller practical device dimensions possible when using much heavier/denser particles as bit carriers: 1eV photons are micrometer scale, much larger than the smallest synapses/transistors/biodevices. The obvious advantage of photons is their much higher transmission speed: thus they are used for longer range interconnect (but mostly only for distances larger than the brain radius).
Sorry, explain again why floods of neurotransmitter molecules bopping around are ideally thermodynamically efficient? You’re assuming that they’re trying to do multiplication out to 8-bit precision using analog quantities? Why suppose the 8-bit precision? Even if that part was actually important, why not perhaps ding biology a few engineering points for trying to represent it using analog quantities requiring 2^16 particles bopping around? Optimally doing something incredibly inefficient is incredibly inefficient.
I’m not assuming that, but it’s nonetheless useful as a benchmark for comparison. It helps illustrate that 1e5 eV is really not much: it just allows a single 8-bit analog mult, for example.
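Earlier in the thread I said: “Now most synapses are probably smaller/cheaper than 8-bit equiv, but most of the energy cost involved is in pushing data down irreversible dissipative wires (just as true in the brain as it is in a GPU). Now add in the additional costs of synaptic adjustment machinery for learning, cell maintenance tax, dendritic computation, etc”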
The synapse is clearly doing something somewhat more complex than just analog multiplication.
And in terms of communication costs (which are paid at the synaptic junction for the synapse → dendrite → soma path), that 1e5 eV is only enough to carry a reliable 1-bit signal about ~100 μm (1e5 nm) through irreversible nano/micro scale wires (the wire bit energy for axons/dendrites and modern CMOS is about the same).
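The distance figure follows directly from the ~1 eV/nm wire energy used elsewhere in this thread. A minimal sketch of the unit conversion (the function name is made up for illustration):

```python
# Wire energy per bit per unit length, as quoted in this thread (~1 eV/nm).
WIRE_EV_PER_NM = 1.0


def reach_nm(budget_ev, ev_per_nm=WIRE_EV_PER_NM):
    """Distance in nm that one bit can travel on an irreversible wire for a given energy budget."""
    return budget_ev / ev_per_nm


reach = reach_nm(1e5)       # the 1e5 eV budget discussed above
reach_um = reach * 1e-3     # nanometres to micrometres
reach_mm = reach * 1e-6     # nanometres to millimetres

print(f"{reach:.0f} nm = {reach_um:.0f} um = {reach_mm:.1f} mm")
```

So a 1e5 eV budget buys on the order of a tenth of a millimetre of wire at that rate, which is short compared with typical axon and dendrite path lengths.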
Reversible interconnect is much more complex—requires communicating through fully isolated particles over the wire distance, which is obviously much more practical for photons for various reasons, but they are very large etc. Many complex tradeoffs.
This model of interconnect energy has been thoroughly debunked here, as coax cables violate it by a factor of 200: https://www.lesswrong.com/posts/fm88c8SvXvemk3BhW/brain-efficiency-cannell-prize-contest-award-ceremony
If it applies in the specific cases of axons and cmos there should be justification of why it does, though given the amount of prior discussion I don’t think this would be fruitful.
No. Coax cables are enormous in radius (EM wavelengths), and do not achieve much better than 1 eV/nm in practice. In the same waveguide radius you can just remove the copper filler and go pure optical and then get significantly below 1 eV/nm anyway, so why even mention coax?
The only thing that was ‘debunked’ was in a tangent conversation that had no bearing on the main point (about nanoscale wire interconnect smaller than EM wavelength—which is irreversible and consumes close to 1 eV/nm in both brains and computers), and it was just my initial conception that coax cables could be modeled in simplification as relays like RC interconnect.
There are many complex tradeoffs between size, speed, energy, etc. Reversible and irreversible comms occupy different regions of that Pareto surface. Reversible communication is isomorphic to transmitting particles (in practice always photons) and requires complex/large transmitters/receivers, photon-sized waveguides, etc. Irreversible communication is isomorphic to domino-based computing: it has the advantage, and the cost, of full error correction/erasure at every cycle, and it is easier to guide down narrow and complex paths.
In general, efficiency at the level of logic gates doesn’t translate into the efficiency at the CPU level.
For example, imagine you’re tasked to correctly identify the faces of your classmates from 1 billion photos of random human faces. If you fail to identify a face, you must re-do the job.
Your neurons are perfectly efficient. You have a highly optimized face-recognition circuitry.
Yet you’ll consume more energy on the task than, say, an Apple M1 CPU:
you’ll waste at least 30% of your time on sleep
your highly optimized face-recognition circuitry is still rather inefficient
you’ll make mistakes, forcing you to re-do the job
you can’t hold your attention long enough to complete such a task, even if your life depends on it
Even if the human brain is efficient on the level of neural circuits, it is unlikely to be the most efficient vessel for a general intelligence.
In general, high-level biological designs are a crappy mess, mostly made of kludgy bugfixes to previous dirty hacks, which were made to fix other kludgy bugfixes (an example).
And the newer the design, the crappier it is. For example, compare:
the almost perfect DNA replication (optimized for ~10^9 years)
the faulty and biased human brain (optimized for ~10^5 years)
With the exception of a few molecular-level designs, I expect that human engineers can produce much more efficient solutions than natural selection, in some cases by orders of magnitude.
Human technology is rarely more efficient than biology along the quantitative dimensions that are important to biology, but human technology is not limited to building out of evolved wetware nanobots and can instead employ high energy manufacturing to create ultra durable materials that then enable very high energy density solutions. Our flying machines may not compete with birds in energy efficiency, but they harness power densities of a completely different scale to that available to biology. Basically the same applies to computers vs brains. AGI will outcompete human brains by brute scale, speed, and power rather than energy efficiency.
The human brain is just a scaled up primate brain, which is just a tweaked, more scalable mammal brain, but mammal brains have the same general architecture—which is closer to ~10^8 years old. It is hardly ‘faulty and biased’ - bias is in the mind.
A lot of the advantage of human technology is due to human technology figuring out how to use covalent bonds and metallic bonds, where biology sticks to ionic bonds and proteins held together by van der Waals forces (static cling, basically). This doesn’t fit into your paradigm; it’s just biology mucking around in a part of the design space easily accessible to mutation error, while humans work in a much more powerful design space because they can move around using abstract cognition.
Covalent/metallic vs ionic bonds implements the high energy density vs wetware constrained distinction I was referring to, so we are mostly in agreement; that is my paradigm. But the evidence is pretty clear that “ionic bond and protein” tech does approach the Landauer limit—at least for protein computation. As for the brain, end of Moore’s Law high end chip research is very much neuromorphic (memristor crossbars, etc), and some designs do claim perhaps 10x or so greater synop/J than the brain (roughly), but they aren’t built yet. So if you had wider uncertainty in your claim, with most mass in the region of the brain being 1 to 3 OOMs from the limit, I probably wouldn’t have commented, but for me that one claim distracted from your larger valid points.
You’re missing the point!
Your arguments apply mostly toward arguing that brains are optimized for energy efficiency, but the important quantity in question is computational efficiency! You even admit that neurons are “optimizing hard for energy efficiency at the expense of speed”, but don’t seem to have noticed that this fact makes almost everything else you said completely irrelevant!
The point of my comment (from my perspective) was to focus very specifically on a few claims about biology/brains that I found questionable, which is relevant because the OP specifically was using energy as an efficiency metric.
It’s relevant because energy efficiency is one of the standard key measures of low level hardware substrate computational efficiency.
At a higher level if you are talking about overall efficiency for some complex task, well then software/algorithm efficiency is obviously super important which is a more complex subject. And there are other low level metrics of importance as well such as feature size, speed, etc.
So what did you mean by computational efficiency?
FWIW I agree that bit also rang hollow to me—my sense was also that neurons are basically as energy-efficient as you can get—but by “computational efficiency” one means something like “amount of energy expended to achieve a computational result.”
For example, imagine multiplying two four-digit numbers in your head vs. in a calculator. Each transistor operation in the calculator will be much more expensive than each neuron spike; however, the calculator needs many fewer transistor operations than the brain needs neuron spikes, because the calculator is optimized to efficiently compute those sorts of multiplications whereas the brain needs to expensively emulate the calculator. Overall the calculator will spend fewer joules than the brain will.
All that being said—yes there is reversible computation, but it appears to be a much harder longer tech path (so probably not until after AGI).
This was super interesting.
I don’t think you can directly compare brain voltage to Landauer limit, because brains operate chemically, so we also care about differences in chemical potential (e.g. of sodium vs potassium, which are importantly segregated across cell membranes even though both have the same charge). To really illustrate this, we might imagine information-processing biology that uses no electrical charges, only signalling via gradients of electrically-neutral chemicals. I think this raises the total potential relative to Landauer and cuts down the amount of molecules we should estimate as transported per signal.
Neuron computation is electro-chemical through voltage gated ion channels. If the voltage is at or below the Landauer voltage, then ion motion through the gate is pure noise. As the voltage climbs above the Landauer limit, you start to get meaningful probabilistic state transitions (error rate below 50%) in reasonable time; you can then implement analog computation using many such unreliable carriers, reducing error/noise by averaging over them (the binomial/central limit effect).
‘Pure’ chemical computation is protein machinery. Biology evolved voltage based signaling for high speed longer distance communication/computation.