Firstly, part of the confusion here is my regrettable use of ‘fundamental’ in the original article:
Thus the fundamental (nano) wire energy is: ~1 Eb/bit/nm
But just after that I mentioned typical exceptions:
For long distance interconnect or communication reversible (ie optical) signaling is obviously vastly superior in asymptotic energy efficiency,
So I only meant ‘fundamental’ in the narrower pareto tradeoff sense: if your interconnect is fully dissipative/irreversible then the energy will be around at least Eb/d, where d is the distance scale of physical interconnect bits. For a macro scale domino computer, the distance scale is the size/spacing between dominoes. For electronic devices at maximum packing density you naturally represent bits with single electrons, and the de Broglie wavelength is then quite relevant as a constraint on maximum packing density due to quantum tunneling etc.
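As a sanity check on these scales, here is the Landauer limit at room temperature next to the ~1 eV/bit/nm figure (the ~1 nm single-electron bit pitch is the assumption from the article; the rest is standard constants):

```python
import math

k_B = 1.380649e-23    # Boltzmann constant, J/K
T = 300.0             # room temperature, K
eV = 1.602176634e-19  # 1 eV in J

# Landauer limit: minimum dissipation to erase one bit
E_landauer = k_B * T * math.log(2)   # ~2.9e-21 J

# A reliable ~1 eV bit sits well above the Landauer limit
print(f"Landauer limit at 300 K: {E_landauer:.2e} J/bit")
print(f"1 eV in units of kT ln 2: {eV / E_landauer:.0f}x")

# Dissipative interconnect bound from the text: ~Eb per bit per spacing d;
# with d ~ 1 nm single-electron bits this is ~1.6e-19 J/bit/nm.
d_nm = 1.0
print(f"Wire energy at Eb = 1 eV: {eV / d_nm:.1e} J/bit/nm")
```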
It’s also obviously energy inefficient to use nanoscale single electron bits for wires/interconnect—but that represents a core space vs energy tradeoff (amongst other optimization dimensions on the pareto surface). You can somewhat easily get much better wire energy efficiency by using physically larger bit representations—like EM waves—but those are also much larger at ~1 eV energies.
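The size penalty for EM-wave bits is easy to quantify: at the same ~1 eV bit energy, a photon's wavelength is over three OOM larger than the ~1 nm electron-scale bit (standard constants; the comparison is just illustrative arithmetic):

```python
h = 6.62607015e-34    # Planck constant, J*s
c = 2.99792458e8      # speed of light, m/s
eV = 1.602176634e-19  # 1 eV in J

# Wavelength of a 1 eV photon vs the ~1 nm single-electron bit scale
lam_nm = h * c / (1.0 * eV) * 1e9
print(f"1 eV photon wavelength: {lam_nm:.0f} nm (~3 OOM above 1 nm)")
```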
One can also perhaps beat that bound by using anisotropic interconnect tiles where electrons move more ballistically as in some hypothetical carbon nanotube interconnect that could have a mean free path 3 OOM beyond copper[1] and proportionally lower bit energy per nm around perhaps 1e-22 J/bit/nm.
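The CNT number follows directly from the assumed scaling (the copper-scale baseline and the 3 OOM mean-free-path ratio are taken from the text; treating bit energy per nm as inversely proportional to mean free path is the hypothesis here, not established device physics):

```python
# If bit energy per nm scales inversely with electron mean free path,
# a 3 OOM longer mean free path gives a 3 OOM lower bit energy per nm.
E_copper = 1e-19   # J/bit/nm, single-electron copper-scale baseline (assumed)
mfp_ratio = 1e3    # CNT mean free path / copper mean free path (from [1])
E_cnt = E_copper / mfp_ratio
print(f"Hypothetical CNT interconnect: {E_cnt:.0e} J/bit/nm")
```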
The aspect of this I consider fundamental is the more subtle notion of a pareto tradeoff surface around the Landauer energy and nanometer scale for dissipative nanoscale devices. The hypothetical CNT single electron interconnect tile device is fundamentally much slower than copper interconnect—as just one example.
Other interesting examples come from biology, such as kinesin, the motor walking protein, which can walk surprisingly large microvesicle ‘balloons’ down microtubules reliably using around an ATP per few nm—ie nearly the same natural energy bit scale. Typical microvesicles probably do not have high bit information content, but if they were storing a large snippet of DNA that could increase the bits per unit distance at the same energy scale by OOMs—though naturally at the cost of slower transmission.
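The kinesin figure can be estimated from standard values (assuming ~50 kJ/mol free energy per ATP hydrolysis, and kinesin's well-known ~8 nm step per ATP):

```python
N_A = 6.02214076e23    # Avogadro's number
eV = 1.602176634e-19   # 1 eV in J

dG_ATP = 50e3 / N_A    # free energy per ATP, ~8.3e-20 J (~0.5 eV, assumed)
step_nm = 8.0          # kinesin advances ~8 nm per ATP hydrolyzed

E_per_nm = dG_ATP / step_nm
print(f"Per ATP: {dG_ATP:.1e} J (~{dG_ATP / eV:.1f} eV)")
print(f"Transport cost: {E_per_nm:.1e} J/nm")
```

This lands within roughly an OOM of the ~1 eV/bit/nm wire scale discussed above.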
Neural axon signal propagation has a bit energy scale tradeoff very similar to electronic devices, where the thinnest axons use around ~1 eV/bit/nm (~1.6e-19 J/bit/nm), and myelinated axons are over an OOM more efficient at around 5e-21 J/nm or so[2], very similar to modern on-chip copper interconnect and coax cable wires (although myelinated axons are thinner than coax cable at the same energy efficiency).
So I take the fact that human engineering and biology have ended up on the same pareto surface for interconnect space & energy efficiency—despite being mostly unrelated optimization processes using very different materials—as evidence of a hard pareto surface rather than mere coincidence.
Thanks for replying. This is a lot clearer to me than prior threads, although it also seems as though you’re walking back some of your stronger statements.
I think this is still not quite a correct picture. I agree with this:
For electronic devices at maximum packing density you naturally represent bits with single electrons, and the de Broglie wavelength is then quite relevant as a constraint on maximum packing density due to quantum tunneling etc.
However, at maximum packing density with single-electron switches, the energy requirements per area of interconnect space are still not related to dissipation, nor to irreversible-bit-erasure costs from sending signals tile by tile. Rather, the Cavin/Zhirnov argument is that the extra energy per area of interconnect should be viewed as necessary to overcome charge shot noise in the bit-copy operations required by fan-out after each switch. Abstractly, you need to pay the Landauer energy per copy operation, and you happen to use a couple interconnect tiles for every new input you’re copying the switch output to. Physically, longer interconnect reduces signal-to-noise ratio per electron because a single electron’s wavefunction is spread across the interconnect, and so is less likely to be counted at any one tile in the interconnect.
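A toy model of that fan-out/SNR argument (the 1/N detection probability and the target error rate are illustrative assumptions, not the Cavin/Zhirnov derivation itself): if a single electron spread over N interconnect tiles is found at the output tile with probability ~1/N, reliable copying needs more electrons, and hence more energy, as the interconnect lengthens:

```python
import math

eV = 1.602176634e-19  # 1 eV in J
eps = 1e-3            # target per-bit error rate (assumed)

# Electrons needed so the chance that none is seen at the output tile,
# (1 - 1/N)**n, falls below eps -- energy grows roughly linearly with N.
for N in (10, 100, 1000):
    n = math.ceil(math.log(1 / eps) / -math.log(1 - 1 / N))
    print(f"{N:4d} tiles: ~{n:4d} electrons, ~{n * eV:.1e} J/bit")
```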
Thinking of this as accumulating noise on the Landauer scale at each nanoscale transmission step will give incorrect results in other contexts. In particular, it isn’t a per-length cost for end-to-end communication schemes that don’t spread a single electron across the entire interconnect. If you have a long interconnect or coaxial cable, you’ll signal using voltage transmitted at the speed of light over conduction electrons, and then you can just think in terms of resistance and capacitance per unit length and so on. And because you need 1 V at the output, present devices signal using 1 V even though 1 mV would suffice to overcome voltage noise in the wire. This is the kind of interconnect people are mostly talking about when they discuss reducing interconnect power consumption.
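For that conventional voltage-mode regime, the energy per unit length is just CV² switching energy over wire capacitance (the 0.2 fF/µm figure is an assumed typical on-chip value, not a number from the thread):

```python
C_per_um = 0.2e-15   # F/um, assumed on-chip wire capacitance per length
nm_per_um = 1000.0

def wire_energy_per_nm(V):
    """CV^2/2 switching energy per bit transition, per nm of wire."""
    return 0.5 * C_per_um * V**2 / nm_per_um

E_1V = wire_energy_per_nm(1.0)     # ~1e-19 J/bit/nm at a 1 V swing
E_1mV = wire_energy_per_nm(1e-3)   # a million times lower at 1 mV
print(f"1 V swing:  {E_1V:.0e} J/bit/nm")
print(f"1 mV swing: {E_1mV:.0e} J/bit/nm")
```

The quadratic dependence on swing is why lowering signaling voltage, not per-tile Landauer accounting, dominates this regime.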
[1] Purewal, Meninder S. Electron transport in single-walled carbon nanotubes. Columbia University, 2008.

[2] Derived from Ralph Merkle’s classic essay on brain limits.