I had the same experience. I was essentially going to say “meta is only useful insofar as it helps you and others do object-level things, so focus on building object-level things...” oh.
This whole post seems to mostly be answering “who has the best ethnic restaurants in Europe/America?” along with “which country has the best variety of good restaurants?” and not “who has the best food?” I think that’s an important distinction. Clearly, Indian, Chinese, and Middle Eastern foods are the best.
I haven’t heard of ECL before, so I’m sorry if this comes off as naive, but I’m getting stuck on the intro.
For one, I assume that you care about what happens outside our light cone. But more strongly, I’m looking at values with the following property: If you could have a sufficiently large impact outside our lightcone, then the value of taking different actions would be dominated by the impact that those actions had outside our lightcone.
The laws of physics as we know them state that we cannot have any impact outside our light cone. Does ECL (or this post) require this to be wrong?
From the summary you linked,
Many ethical theories (in particular most versions of consequentialism) do not consider geographical distance of relevance to moral value. After all, suffering and the frustration of one’s preferences is bad for someone regardless of where (or when) it happens. This principle should apply even when we consider worlds so far away from us that we can never receive any information from there.
...
Multiverse-wide cooperation via superrationality (abbreviation: MSR) is the idea that, if I think about different value systems and their respective priorities in the world, I should not work on the highest priority according to my own values, but on whatever my comparative advantage is amongst all the interventions favored by the value systems of agents interested in multiverse-wide cooperation.
Is the claim (loosely) that we should take actions we think are morally inferior according to us because … there might be other intelligent beings outside of our light cone with different preferences? I would want them to act a little bit more like me, so in turn I will act a little more like them, in a strange game of blind prisoner’s dilemma.
This is obviously hogwash to me, so I want to make sure I understand it before proceeding.
No, you’re right, use 2 or 3 instead of 4 as an average dielectric constant. The document you linked cites https://ieeexplore.ieee.org/abstract/document/7325600 which gives measured resistances and capacitances for the various layers. For Intel’s 14 nm process making use of low-k, ultra-low-k dielectrics, and air gaps, they show numbers down to 0.15 fF/micron, about 15 times higher than $\epsilon_0$.
I remember learning that aspect ratio and dielectric constant alone don’t suffice to explain the high capacitances of interconnects. Instead, you have to include fringe fields—turns out they’re not actually infinite parallel plates (gasp!).
Again, it’s not a big deal and doesn’t detract much from your analysis. I somewhat regret even bringing it up because of how not important it is :)
This is an excellent writeup.
Minor nit: your assertion of $C \approx \epsilon_0 L$ is too simple imo, even for a Fermi estimate. At the very least, include a factor of 4 for the dielectric constant of SiO2, and iirc in real interconnects there is a relatively high “minimum” from fringing fields. I can try to find a source for that later tonight, but I would expect it ends up significantly more than $\epsilon_0 L$. This will actually make your estimate agree even better with Jacob’s.
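To put rough numbers on both pieces of this, here’s a quick Python sketch (the 0.15 fF/micron figure is the measured value from the IEEE paper linked above; the 1 mm length and 1 V swing are just illustrative assumptions):

```python
# Compare measured interconnect capacitance per unit length to epsilon_0,
# then estimate the energy to charge 1 mm of such a wire to 1 V.
EPS0 = 8.854e-12              # F/m, vacuum permittivity (~8.85 aF/micron)

c_per_len = 0.15e-15 / 1e-6   # 0.15 fF/micron, converted to F/m
print(c_per_len / EPS0)       # ~17: measured capacitance vs bare epsilon_0

# Energy of roughly (1/2) C V^2 per charging event, for a 1 mm wire at 1 V:
length_m, volts = 1e-3, 1.0   # assumed, illustrative values
energy = 0.5 * c_per_len * length_m * volts**2
print(energy * 1e15)          # ~75 fJ, i.e. tens of fJ/bit/mm at 1 V
```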
Active copper cable at 0.5W for 40G over 15 meters is ~$10^{-21}$ J/nm, assuming it actually hits 40G at the max length of 15m.
I can’t access the linked article, but an active cable is not simple to model because its listed power includes the active components. We are interested in the loss within the wire between the active components.
This source has specs for a passive copper wire capable of up to 40G @5m using <1W, which works out to ~$5 \times 10^{-21}$ J/nm, or a bit less.
They write <1 W for every length of wire, so all you can say is <5 fJ/mm. You don’t know how much less. They are likely writing <1 W for comparison to active wires that consume more than a watt. Also, these cables seem to have a powered transceiver built in on each end that multiplexes the signal out to four twisted-pair 10G lines.
Compare to 10G from here which may use up to 5W to hit up to 10G at 100M, for ~$5 \times 10^{-21}$ J/nm.
Again, these have a powered transceiver on each end.
So for all of these, all we know is that the sum of the losses of the powered components and the wire itself is of order 1 fJ/mm. Edit: I would guess that the powered components have very low power draw (10s of mW, perhaps) and that the majority of the loss is attenuation in the wire.
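For reference, the arithmetic behind all of these figures is just listed power divided by bitrate and length. A quick sketch (these are upper bounds only, since the listed power lumps the transceivers in with the wire):

```python
# Upper bound on energy per bit per mm implied by a cable's power spec.
# Listed power includes the transceivers, so the wire itself loses less.
def bound_fj_per_bit_mm(power_w, bitrate_bps, length_m):
    joules_per_bit = power_w / bitrate_bps
    return joules_per_bit / (length_m * 1e3) * 1e15  # J/mm -> fJ/mm

print(bound_fj_per_bit_mm(0.5, 40e9, 15))   # active 40G over 15 m: ~0.8
print(bound_fj_per_bit_mm(1.0, 40e9, 5))    # passive 40G, <1 W, 5 m: <5
print(bound_fj_per_bit_mm(5.0, 10e9, 100))  # 10G, up to 5 W, 100 m: ~5
```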
The numbers I gave are essentially the theoretical minimum energy loss per bit per mm of that particular cable at that particular signal power. It’s not surprising that multiple twisted-pair cables do worse. They’ll have higher attenuation and lower bandwidth, and the standard transceivers on either side require larger signals because they have cheaper DAC/ADCs. Also, their error correction is not perfect, and they don’t make full use of their channel capacity. In return, the cables are cheap, flexible, standard, etc.
There’s nothing special about kT/1 nm.
Indeed, the theoretical lower bound is very, very low.
Do you think this is actually achievable with a good enough sensor if we used this exact cable for information transmission, but simply used very low input energies?
The minimum is set by the sensor resolution and noise. A nice oscilloscope, for instance, will have, say, 12 bits of voltage resolution and something like 10 V full scale, so ~2 mV minimum voltage. If you measure across a 50 Ohm load then the minimum received power you can see is $V^2/R \approx (2\ \text{mV})^2 / 50\ \Omega \approx 10^{-7}$ W. This is an underestimate, but that’s the idea.
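In numbers, with the example scope settings above:

```python
# Minimum received power visible to a 12-bit scope at 10 V full scale,
# measured across a 50 ohm load.
v_min = 10.0 / 2**12     # ~2.4 mV, one least-significant step
p_min = v_min**2 / 50.0  # P = V^2 / R
print(p_min)             # ~1.2e-7 W, i.e. on the order of 100 nW
```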
Maybe I’m interpreting these energies in a wrong way and we could violate Jacob’s postulated bounds by taking an Ethernet cable and transmitting 40 Gbps of information at a long distance, but I doubt that would actually work.
Ethernet cables are twisted pair and will probably never be able to go that fast. You can get above 10 GHz with rigid coax cables, although you still have significant attenuation.
Let’s compute heat loss in a 100 m LDF5-50A, which evidently has 10.9 dB/100 m attenuation at 5 GHz. This is very low in my experience, but it’s what they claim.
Say we put 1 W of signal power at 5 GHz in one side. Because of the 10.9 dB attenuation, we receive about 81 mW out the other side, with 919 mW lost to heat.
The Shannon-Hartley theorem says that we can compute the capacity of the wire as $C = B \log_2(1 + S/N)$, where $B$ is the bandwidth, $S$ is the received signal power, and $N$ is the noise power.
Let’s assume Johnson noise. These cables are rated up to 100 C, so I’ll use that temperature, although it doesn’t make a big difference.
If I plug in 5 GHz for $B$, 81 mW for $S$, and $N = k_B T B \approx 26$ pW, then I get a channel capacity of about 160 Gbps.
The heat lost is then $(919\ \text{mW}) / (160\ \text{Gbps} \times 100\ \text{m}) \approx 0.06$ fJ/bit/mm. Quite low compared to Jacob’s ~10 fJ/mm “theoretical lower bound.”
One free parameter is the signal power. The heat loss over the cable is linear in the signal power, while the channel capacity is sublinear, so lowering the signal power reduces the energy cost per bit. It only reaches 10 fJ/bit/mm at a few hundred watts of input power, quite a lot!
Another is noise power. I assumed Johnson noise, which may be a reasonable assumption for an isolated coax cable, but not for an interconnect on a CPU. Adding an order of magnitude or two to the noise power does not substantially change the final energy cost per bit (0.06 goes to 0.07); however, I doubt even that covers the amount of noise in a CPU interconnect.
Similarly, raising the cable attenuation to 50 dB/100 m does not even double the heat loss per bit. Shannon’s theorem still allows a significant capacity. It’s just a question of whether or not the receiver can read such small signals.
The reason that typical interconnects in CPUs and the like tend to be in the realm of 10-100 fJ/bit/mm is because of a wide range of engineering constraints, not because there is a theoretical minimum. Feel free to check my numbers of course. I did this pretty quickly.
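For anyone who wants to check, here is a minimal script reproducing the numbers above (it assumes Johnson noise only and the quoted 10.9 dB/100 m attenuation, both idealizations):

```python
import math

K_B = 1.381e-23  # J/K, Boltzmann constant

def fj_per_bit_mm(p_in_w, atten_db=10.9, length_m=100.0,
                  bandwidth_hz=5e9, temp_k=373.0):
    """Heat dissipated per bit per mm for the coax run described above."""
    p_received = p_in_w * 10 ** (-atten_db / 10)  # signal power at far end
    p_heat = p_in_w - p_received                  # power lost in the cable
    noise = K_B * temp_k * bandwidth_hz           # Johnson noise power
    capacity = bandwidth_hz * math.log2(1 + p_received / noise)  # bits/s
    return p_heat / capacity / (length_m * 1e3) * 1e15

print(fj_per_bit_mm(1.0))                 # ~0.06 fJ/bit/mm
print(fj_per_bit_mm(1.0, atten_db=50.0))  # ~0.1: not even double
print(fj_per_bit_mm(300.0))               # ~14: now in the 10 fJ realm
```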
Ah, I was definitely unclear in the previous comment. I’ll try to rephrase.
When you complete a circuit, say containing a battery, a wire, and a light bulb, a complicated dance has to happen for the light bulb to turn on. At near the speed of light, electric and magnetic fields around the wire carry energy to the light bulb. At the same time, the voltage throughout the wire establishes itself at the values you would expect from Ohm’s law and Kirchhoff’s rules and such. At the same time, electrons throughout the wire begin to feel a small force from an electric field pointing along the direction of the wire, even if the wire has bends and such. These fields and voltages, outside and inside the wire, are the result of a complicated, self-consistent arrangement of surface charges on the wire.
See this YouTube video for a nice demonstration of a nonintuitive result of this process. The video cites this paper among others, which has a nice introduction and overview.
The key point is that establishing these surface charges and propagating the signal along the wire amounts to moving an extremely small amount of electric charge. In that YouTube video he asserts without citation that the electrons move “the radius of a proton” (something like a femtometer) to set up these surface charges. I don’t think it’s always so little, but again I don’t remember where I got my number from. I can try to either look up numbers or calculate it myself if you’d like.
Signals (low vs high voltages, say) do not propagate through circuits by hopping from electron to electron within a wire. In a very real sense they do not even propagate through the wire, but through electric and magnetic fields around and within the wire. This broad statement is also true at high frequencies, although there the details become even more complicated.
To maybe belabor the point: to send a bit across a wire, we set the voltage at one side high or low. That voltage propagates across the wire via the song and dance I just described. It is the heat lost in propagating this voltage that we are interested in for computing the energy of sending the bit over, and this heat loss is typically extremely small, because the electrons barely have to move and so they lose very little energy to collisions.
The part your calculation fails to address is what happens if we attempt to drive this transmission by moving electrons around inside a wire made of an ordinary resistive material such as copper.
I have a number floating around in my head. I’m not sure if it’s right, but I think that at GHz frequencies, electrons in typical wires are moving sub-picometer distances (possibly even femtometers?) per clock cycle.
The underlying intuition is that electron charge is “high” in some sense, so that 1. adding or removing a small number of electrons corresponds to a huge amount of energy (remove 1% of electrons from an apple and it will destroy the Earth in its explosion!) and 2. moving the electrons in a metal by a tiny distance (sub-picometer) can lead to large enough electric fields to transmit signals with high fidelity.
Feel free to check these numbers, as I’m just going by memory.
The end result is that we can transmit signals with high fidelity by moving electrons many orders of magnitude less distance than their mean free path, which means intuitively it can be done more or less loss-free. This is not a rigorous calculation, of course.
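A rough version of that estimate, with assumed round numbers (a 1 mm² copper wire carrying 1 A of signal current, toggled at 1 GHz; all values illustrative):

```python
# How far do conduction electrons drift in one clock cycle?
# v_drift = I / (n * e * A); distance per cycle = v_drift / f
E_CHARGE = 1.602e-19  # C, elementary charge
N_CU = 8.5e28         # 1/m^3, conduction electron density of copper

current = 1.0         # A, assumed signal current
area = 1e-6           # m^2, assumed 1 mm^2 cross-section
freq = 1e9            # Hz, assumed toggle rate

v_drift = current / (N_CU * E_CHARGE * area)  # ~7e-5 m/s
dist = v_drift / freq                         # ~7e-14 m per cycle
print(dist)  # ~70 femtometers, far below copper's ~40 nm mean free path
```

Smaller cross-sections and higher current densities push this number up, but the qualitative point stands.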
I come more from the physics side and less from the EE side, so for me it would be Datta’s “Electronic Transport in Mesoscopic Systems”, assuming the standard solid state books survive (Kittel, Ashcroft & Mermin, L&L stat mech, etc). For something closer to EE, I would say “Principles of Semiconductor Devices” by Zeghbroeck because it is what I have used and it was good, but I know less about that landscape.
Hi Alexander,
I would be happy to discuss the physics related to the topic with others. I don’t want to keep repeating the same argument endlessly, however.
Note that it appears that EY had a similar experience of repeatedly not having their point addressed:
I’m confused at how somebody ends up calculating that a brain—where each synaptic spike is transmitted by ~10,000 neurotransmitter molecules (according to a quick online check), which then get pumped back out of the membrane and taken back up by the synapse; and the impulse is then shepherded along cellular channels via thousands of ions flooding through a membrane to depolarize it and then getting pumped back out using ATP, all of which are thermodynamically irreversible operations individually—could possibly be within three orders of magnitude of max thermodynamic efficiency at 300 Kelvin. I have skimmed “Brain Efficiency” though not checked any numbers, and not seen anything inside it which seems to address this sanity check.
Then, after a reply:
This does not explain how thousands of neurotransmitter molecules impinging on a neuron and thousands of ions flooding into and out of cell membranes, all irreversible operations, in order to transmit one spike, could possibly be within one OOM of the thermodynamic limit on efficiency for a cognitive system (running at that temperature).
Then, after another reply:
Nothing about any of those claims explains why the 10,000-fold redundancy of neurotransmitter molecules and ions being pumped in and out of the system is necessary for doing the alleged complicated stuff.
Then, nothing more (that I saw, but I might have missed comments; this is a popular thread!).
:), spxtr
It depends on your background in physics.
For the theory of sending information across wires, I don’t think there is any better source than Shannon’s “A Mathematical Theory of Communication.”
I’m not aware of any self-contained sources that are enough to understand the physics of electronics. You need to have a very solid grasp of E&M, the basics of solid state, and at least a small amount of QM. These subjects can be pretty unintuitive. As an example of the nuance even in classical E&M, and an explanation of why I keep insisting that “signals do not propagate in wires by hopping from electron to electron,” see this YouTube video.
You don’t actually need all of that in order to argue that the brain cannot be efficient from a thermodynamic perspective. EY does not understand the intricacies of nanoelectronics (probably), but he correctly stated that the final result from the original post cannot be correct, because obviously you can imagine a computing machine that is more thermodynamically efficient than pumping tens of thousands of ions across membranes and back. This intuition probably comes from some thermodynamics or statistical mechanics books.
This is the right idea, but in these circuits there are quite a few more noise sources than Johnson noise. So it won’t be as straightforward to analyze, but you’ll still end up with a relatively small (compared to $L$/nm) constant times $kT$.
Ok, I will disengage. I don’t think there is a plausible way for me to convince you that your model is unphysical.
I know that you disagree with what I am saying, but from my perspective, yours is a crackpot theory. I typically avoid arguing with crackpots, because the arguments always proceed basically how this one did. However, because of apparent interest from others, as well as the fact that nanoelectronics is literally my field of study, I engaged. In this case, it was a mistake.
Sorry for wasting our time.
Please respond to the meat of the argument:
1. Resistive heat loss is not the same as heat loss from Landauer’s principle. (You agree!)
2. The Landauer limit is an energy loss per bit flip, with units of energy/bit. This is the thermodynamic minimum (with irreversible computing). It is extremely small and difficult to measure. It is unphysical to divide it by 1 nm to model an interconnect, because signals do not propagate through wires by hopping from electron to electron.
3. The Cavin/Zhirnov paper you cite does not concern the Landauer principle. It models ordinary dissipative interconnects. Due to a wide array of engineering optimizations, these elements tend to have similar energy loss per bit per mm; however, this is not a fundamental constraint. This number can be changed more or less arbitrarily, by multiple orders of magnitude.
4. You claim that your modified Landauer energy matches the Cavin/Zhirnov numbers, but this is a nonsense comparison because they are different things. One can be varied by orders of magnitude while the other cannot. Because they are different heat sources, their heat losses add.
5. We have known how wires work for a very long time. There is a thorough and mature field of physics regarding heat and information transport in wires. If we were off by a factor of 2 in heat loss (which is what you are claiming, possibly without knowing it), then we would have known it long ago. The Landauer principle would not be a very esoteric idea at the fringes of computation and physics; it would be front and center, necessary to understand heat dissipation in wires. It would have been measured a hundred years ago.
I’m not going to repeat this again. If you ignore the argument again then I will assume bad faith and quit the conversation.
1 nm is somewhat arbitrary but around that scale is a sensible estimate for minimal single electron device spacing ala Cavin/Zhirnov. If you haven’t actually read those refs you should—as they justify that scale and the tile model.
They use this model to figure out how to pack devices within a given area and estimate their heat loss. It is true that heating of a wire is best described with a resistivity (or parasitic capacitance) that scales as 1/L. If you want to build a model out of tiles, each of which is a few nm on a side (because the FETs are roughly that size), then you are perfectly allowed to do so. IMO the model is a little oversimplified to be particularly useful, but it’s physically reasonable at least.
This is just false, unless you are claiming you have found some error in the cavin/zhirnov papers.
No, the papers are fine. They don’t say what you think they say. They are describing ordinary resistive losses and such. In order to compare different types of interconnects running at different bitrates, they put these losses in units of energy/bit/nm. This has no relation to Landauer’s principle.
Resistive heat loss in a wire is fundamentally different than heat loss from Landauer’s principle. I can communicate 0 bits of information across a wire while losing tons of energy to resistive heat, by just flowing a large constant current through it.
It’s also false in the sense that the model makes reasonable predictions.
As pointed out by Steven Byrnes, your model predicts excess heat loss in a well-understood system. In my linked comment, I pointed out another way that it makes wrong predictions.
I don’t think the thermal de Broglie wavelength is at all relevant in this context, nor the mean free path, and instead I’m trying to shift discussion to “how wires work”.
This is the crux of it. I made the same comment here before seeing this comment chain.
People have been sending binary information over wires since 1840, right? I don’t buy that there are important formulas related to electrical noise that are not captured by the textbook formulas. It’s an extremely mature field.
Also a valid point. @jacob_cannell is making a strong claim: that the energy lost by communicating a bit is the same scale as the energy lost by all other means, by arbitrarily dividing by 1 nm so that the units can be compared. If this were the case, then we would have known about it for a hundred years. Instead, it is extremely difficult to measure the extremely tiny amounts of heat that are actually generated by deleting a bit, such that it’s only been done within the last decade.
This arbitrary choice leads to a dramatically overestimated heat cost of computation, and it ruins the rest of the analysis.
@Alexander Gietelink Oldenziel, for whatever it is worth, I, a physicist working in nanoelectronics, recommend @Steven Byrnes for the $250. (Although, EY’s “it’s wrong because it’s obviously physically wrong” is also correct. You don’t need to dig into details to show that a perpetual motion machine is wrong. You can assert it outright.)
The post is making somewhat outlandish claims about thermodynamics. My initial response was along the lines of “of course this is wrong. Moving on.” I gave it another look today. In one of the first sections I found (what I think is) a crucial mistake. As such, I didn’t read the rest. I assume it is also wrong.
The original post said:
A non-superconducting electronic wire (or axon) dissipates energy according to the same Landauer limit per minimal wire element. Thus we can estimate a bound on wire energy based on the minimal assumption of 1 minimal energy unit per bit per fundamental device tile, where the tile size for computation using electrons is simply the probabilistic radius or De Broglie wavelength of an electron[7:1], which is conveniently ~1nm for 1eV electrons, or about ~3nm for 0.1eV electrons. Silicon crystal spacing is about ~0.5nm and molecules are around ~1nm, all on the same scale.
Thus the fundamental (nano) wire energy is: ~1 $E_b$/bit/nm, with $E_b$ in the range of 0.1eV (low reliability) to 1eV (high reliability).
The predicted wire energy is ~$10^{-19}$ J/bit/nm or ~100 fJ/bit/mm for semi-reliable signaling at 1V with $E_b$ = 1eV, down to ~10 fJ/bit/mm at 100mV with complex error correction, which is an excellent fit for actual interconnect wire energy[8][9][10][11], [...]
The measured/simulated interconnect wire energies from the citations in the realm of 10s-100s of fJ/bit/mm are a result of physical properties of the interconnects. These include things like resistances (they’re very small wires) and stray capacitances. In principle, these numbers could be made basically arbitrarily worse by using smaller (cross sectional) interconnects, more resistive materials, or tighter packing of components. They can also be made significantly better, especially if you’re allowed to consider alternative materials. Importantly, this loss can be dramatically reduced by reducing the operating voltage of the system, but some components do not work well at lower voltages, so there’s a tradeoff.
… and we’re supposed to compare that number with ~1 $E_b$/bit/nm. I might be willing to buy the wide range of $E_b$, but the choice of de Broglie wavelength as “minimal wire element” is completely arbitrary. The author seems to know this, because they give a few more examples of length scales that are around a nanometer. I can do that too: The spacing of conduction electrons in copper ($n^{-1/3}$) is roughly 0.2 nm. The mean free path of electrons in copper is tens of nm. None of that matters, because signals do not propagate through wires by hopping from electron to electron. There is a complicated dance of electric field, magnetic field, conducting electrons as well as dielectrics that all work together to make signals move. The equation above asserts that none of that matters, and it is simply unphysical. Sorry, there’s not a nice way to put that.
The author seems to want us to compare the two equations, but they are truly two different things. I can communicate the same information in a circuit ($E_b$/bit/nm held fixed) but dramatically vary the cited “excellent fit” numbers by orders of magnitude by changing the interconnect material or lowering the voltage.
The Landauer energy is very, very small compared to just about every other energy that we care about. It is basically a non-factor in all but a few very esoteric experiments. 20 meV is a decent chunk of energy if it is given to every electron in a system, but it is extremely small as a one-off. It is absurd to think that computers, with their resistive wires and leaky gates, are anywhere near perfect efficiency. It is even more absurd to think that brains, squishy beasts that literally physically pump ions across membranes and back, are anywhere near that limit.
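For scale (the ~100 fJ/bit/mm is the interconnect figure quoted above):

```python
import math

K_B = 1.381e-23                     # J/K
landauer = K_B * 300 * math.log(2)  # ~2.9e-21 J to erase one bit at 300 K
wire_mm = 100e-15                   # J, ~100 fJ to send a bit through 1 mm

print(landauer)            # ~2.9e-21 J/bit, i.e. ~18 meV
print(wire_mm / landauer)  # ~3e7: real wire losses dwarf the Landauer limit
```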
Looking through the comments, I now see that another user has correctly pointed out the same mistakes. See here, and the comments under that. Give them the $250. EY also pointed out the absurdity of brains being considered anywhere near efficient. Nice work!
I downvoted, because
… are not low-status fun, but long-term life decisions that should not be taken lightly.
… are just “be rude to friends,” which I consider immoral.
… are probably illegal.