So I predict in advance these approaches will fail, or succeed only through using some reversible mechanism (with attendant tradeoffs).
If you accept the Landauer analysis then the only question that remains for nano devices (where interconnect tiles are about the same size as your compute devices) is why you would ever use irreversible copy-tiles for interconnect instead of reversible move-tiles. It really doesn't matter whether you are using ballistic electrons or electron waves or mechanical rods; you just get different variations of ways to represent a bit (which still mostly look like a ½CV² relation, but the form isn't especially relevant).
A copy tile copies a bit from one side to the other. It has an internal memory state M (1 bit); it takes an input bit from, say, the left and produces an output bit on the right. Its logic table looks like:
| O | I | M |
|---|---|---|
| 1 | 1 | 0 |
| 1 | 1 | 1 |
| 0 | 0 | 0 |
| 0 | 0 | 1 |
In other words, every cycle it erases whatever leftover bit it was storing and copies the input bit to the output, so it always erases one bit. This predicts nanowire energy near exactly; there is a reason Cavin et al. use it, etc.
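To make the tile concrete, here is a minimal sketch (my own illustration, not from any reference) of the update rule implied by the table: the stored bit M is unconditionally overwritten every cycle, which is the one-bit erasure the Landauer argument charges ~kT·ln 2 for.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def copy_tile_step(input_bit: int, stored_bit: int) -> tuple[int, int]:
    """One cycle of the irreversible copy tile.

    The old stored bit is discarded (erased) regardless of its value,
    and the input is copied to both the new stored bit and the output.
    """
    output_bit = input_bit      # O = I, per the logic table above
    new_stored_bit = input_bit  # M is overwritten, so one bit is erased
    return output_bit, new_stored_bit

# Minimum dissipation charged per erased bit at 300 K:
print(K_B * 300 * math.log(2), "J")  # ~2.9e-21 J
```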
But why do that instead of just moving a bit? That is the part which I think is less obvious.
I believe it has to do with the difficulties of noise buildup. The copy device doesn’t allow any error to accumulate at all. Your bits can be right on your reliability threshold (1 eV or whatever depending on the required reliability and speed tradeoffs), and error doesn’t accumulate regardless of wire length, because you are erasing at every step.
The reversible move device seems much better—and obviously is for energy efficiency—but it accumulates a bit of noise on the Landauer scale at every cycle, because of various thermal/quantum noise sources, as you are probably aware: your device is always coupled to a thermal bath, or still subject to cosmic rays even in outer space, and producing its own heat regardless, at least for error correction. And if you aren't erasing noise, then you are accumulating noise.
Edit: After writing this out I just stumbled on this paper by Siamak Taati[1] which makes the same argument about exponential noise accumulation much more formally. Looks like fully reversible computers are as challenging as scaling quantum computers. Quantum computers are naturally reversible and have all the same noise accumulation issues, resulting in quick decoherence—so you end up trying to decouple them from the environment as much as possible (absolute zero temp).
You can also have interconnect through free particle transmission as in lasers/optics, but that of course doesn’t completely avoid the noise accumulation issue. Optical interconnect also just greatly increases the device size which is obviously a huge downside but helps further reduce energy losses by just massively scaling up the interaction length or equivalent tile size.
Your whole reply here just doesn’t compute for me. An interconnect is a wire. We know how wires work. They have resistance-per-length, and capacitance-per-length, and characteristic impedance, and Johnson noise, and all the other normal things about wires that we learned in EE 101. If the wire is very small—even down to nanometers—it’s still a wire, it’s just a wire with a higher resistance-per-length (both for the obvious reason of lower cross-sectional area, and because of surface scattering and grain-boundary scattering).
I don’t know why you’re talking about “tiles”. Wires are not made of tiles, right? I know it’s kinda rude of me to not engage with your effortful comment, but I just find it very confusing and foreign, right from the beginning.
If it helps, here is the first random paper I found about on-chip metal interconnects. It treats them exactly like normal (albeit small!) metal wires—it talks about resistance, resistivity, capacitance, current density, and so on. That’s the kind of analysis that I claim is appropriate.
Your whole reply here just doesn’t compute for me. An interconnect is a wire. We know how wires work. They have resistance-per-length, and capacitance-per-length, and characteristic impedance, and Johnson noise, and all the other normal things about wires that we learned in EE 101
None of those are fundamental—all those rules/laws are derived—or should be derivable—from simpler molecular/atomic-level simulations.
I don’t know why you’re talking about “tiles”. Wires are not made of tiles, right?
A wire carries a current and can be used to power devices, and/or it can be used to transmit information—bits. In the latter usage noise analysis is crucial.
Let me state a chain of propositions to see where you disagree:
1. The Landauer energy/bit/noise analysis is correct (so high-speed reliable bits correspond to ~1 eV).
2. The analysis applies to computers of all scales, down to individual atoms/molecules.
3. For a minimal molecular nanowire, the natural tile size is the electron radius.
4. An interconnect (wire) tile can be reversible or irreversible.
5. Reversible tiles rapidly accumulate noise/error ala Taati et al. and so aren't used for nanoscale interconnect in brains or computers.
6. From 1–4 we can calculate the natural wire energy: it's just 1 electron charge per 1 electron radius, and it reproduces the wire equation near exactly (recall from that other thread in Brain Efficiency).
Let’s consider a ≤1mm wire on a 1GHz processor. Given the transmission line propagation speed, we can basically assume that the whole wire is always at a single voltage. I want to treat the whole wire as a unit. We can net add charge from the wire, anywhere in the wire, and the voltage of the whole wire will go up. Or we can remove charge from the wire, anywhere in the wire, and the voltage of the whole wire will go down.
Thus we have a mechanism for communication. We can electrically isolate the wire, and I can stand at one end of the wire, and you can stand at the other. I pull charge off of the wire at my end, and you notice that the voltage of the whole wire has gone down. And then I add charge into the wire, and you notice that the voltage of the whole wire has gone up. So now we’re communicating. And this is how different transistors within a chip communicate with each other, right?
I don’t think electron radius is relevant in this story. And there are no “tiles”. And this is irreversible. (When we bring the whole wire from low voltage to high voltage or vice-versa, energy is irrecoverably dissipated.) And the length of the wire only matters insofar as that changes its capacitance, resistance, inductance, etc. There will be voltage fluctuations (that depend on the frequency band, characteristic impedance, and ohmic losses), but I believe that they’re negligibly small for our purposes (normal chips are sending maybe 600 mV signals through the interconnects, so based on ½CV² we should get 2 OOM lower interconnect losses by “merely” going to 60 mV, whereas the Johnson noise floor at 1GHz is <<1mV I think). The loss involved in switching the whole wire from high voltage to low voltage or vice versa is certainly going to be >>1eV.
I’m still not sure where you disagree with my points 1-5, but I’m guessing 3?
The relevance of 3 is that your wire is made of molecules with electron orbitals, each of which is a computer subject to the Landauer analysis. To send a bit very reliably across just one electron-radius length of wire requires about 1 eV (not exactly, but using the equations). So for a minimal nanowire of single-electron width that corresponds to 1 V, but a wider wire can naturally represent a bit using more electrons and a lower voltage.
Either way, each individual molecule/electron-radius length of the wire is a computer tile which must either 1) copy a bit, and thus erase a bit on order 1 eV, or 2) move a bit without erasure, but thus accumulate noise ala Taati et al.
So if we plug in those equations it near exactly agrees with the spherical cow wire model of nanowires, and you get about 81 fJ/mm.
The only way to greatly improve on this is to increase the interaction distance (and thus tile size), which requires the electrons to move a much larger distance before interacting in the relay chain. That doesn't seem very feasible for conventional wires made of a dense crystal lattice, but obviously is possible for non-relay-based interconnect like photonics (with its size disadvantage).
So in short, at the nanoscale it’s better to model interconnect as molecular computers, not macro wires. Do you believe Cavin/Zhirnov are incorrect?
Specifically the tile model[1], and also more generally the claim that adiabatic interconnect basically doesn’t work at the nanolevel for conventional computers due to noise accumulation[2], agreeing with Taati:
The presence of thermal noise dictates that an energy barrier is needed to preserve a binary state. Therefore, all electronic devices contain at least one energy barrier to control electron flow. The barrier properties determine the operating characteristics of electronic devices. Furthermore, changes in the barrier shape require changes in charge density. Operation of all charge transport devices includes charging/discharging capacitances to change barrier height. We analyze energy dissipation for several schemes of charging capacitors. A basic assumption of Reversible Computing is that the computing system is completely isolated from the thermal bath. An isolated system is a mathematical abstraction never perfectly realized in practice. Errors due to thermal excitations are equivalent to information erasure, and thus computation dissipates energy. Another source of energy dissipation is due to the need of measurement and control. To analyze this side of the problem, the Maxwell’s Demon is a useful abstraction. We hold that apparent “energy savings” in models of adiabatic circuits result from neglecting the total energy needed by other parts of the system to implement the circuit.
Here’s a toy model. There’s a vacuum-gap coax of length L. The inside is a solid cylindrical wire of diameter D and resistivity ρ. The outside is grounded, and has diameter Dₒ=10×D. I stand at one end and you stand at the other end. The inside starts out at ground. Your end is electrically isolated (open-circuit). If I want to communicate the bit “1” to you, then I raise the voltage at my end to V=+10mV, otherwise I lower the voltage at my end to V=–10mV.
On my end, the energy I need to spend is:
½CV² = πε₀V²L / ln(Dₒ/D) = L × 0.0012 fJ/mm
On your end, you’re just measuring a voltage so the required energy is zero in principle.
The resistivity ρ and diameter D don't enter this equation, as it turns out, although they do affect the timing. If D is as small as 1 nm, that's fine, as long as the wire continues to be electrically conductive (i.e. satisfy Ohm's law).
Anyway, I have now communicated 1 bit to you with 60,000× less energy expenditure than your supposed limit of 81 fJ/mm. But I don’t see anything going wrong here. Do you? Like, what law of physics or assumption am I violating here?
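For what it's worth, here's the arithmetic of the toy model spelled out in a short sketch (the geometry, the Dₒ/D = 10 ratio, and the ±10 mV swing are all from the setup above):

```python
import math

eps0 = 8.854e-12      # F/m, vacuum permittivity
V = 10e-3             # V, the +/-10 mV swing from the setup
ratio = 10            # Do / D

# Coax capacitance per unit length: C/L = 2*pi*eps0 / ln(Do/D)
C_per_m = 2 * math.pi * eps0 / math.log(ratio)
E_per_mm = 0.5 * C_per_m * V**2 * 1e-3      # J dissipated per mm of wire
print(E_per_mm * 1e15, "fJ/mm")             # ~0.0012 fJ/mm
print(81 / (E_per_mm * 1e15), "x below the claimed 81 fJ/mm")  # ~6e4
```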
a wider wire can naturally represent a bit using more electrons and a lower voltage
I don’t think it’s relevant, but for what it’s worth, 1nm³ of copper contains 90 conduction electrons.
This may be obvious—but this fails to predict the actual wire energy, whereas my preferred model does. So if this model is correct—why does it completely fail to predict interconnect wire energy despite an entire engineering industry optimizing such parameters? Where do you believe the error is?
My first guess is perhaps you are failing to account for the complex error/noise buildup per unit length of wire. A bit is an approximation of a probability distribution. So you start out with a waveform on one end of the wire which minimally can represent 1 bit against noise (well maybe not even that—your starting voltage seems unrealistic), but then it quickly degrades to something which can not.
Actually, looking back at the old thread, I believe you are incorrect that 10 mV is realistic for anything near a nanowire. You need to increase your voltage by 100x or use an enormous number of charge carriers, which isn't possible for a nanowire (and is just a different way to arrive at 1 eV per computational relay bit).
And in terms of larger wires, my model from brain efficiency actually comes pretty close to predicting actual wire energy for large copper wires—see this comment.
Kwa estimates 5e-21 J/nm, which is only 2x the lower Landauer bound and corresponds to a ~75% bit probability (although the uncertainty in these estimates is probably around 2x itself). My explanation is that such very low bit energies approaching the lower Landauer limit are possible, but only with complex error correction—which is exactly what Ethernet/InfiniBand cards are doing. But obviously not viable for nanoscale interconnect.
Or put another way—why do you believe that Cavin/Zhirnov are incorrect?
The easiest way to actuate an electronic switch is to use a voltage around 20·kT/q ≈ 500 mV (where 20 is to get way above the noise floor).
The most efficient way to send information down a wire is to use a voltage around 20·√(kT·Z₀·f) ≈ 0.3 mV (where 20 is to get way above the noise floor and Z₀ is the wire's characteristic impedance, which is kinda-inevitably somewhat lower than the 377 Ω impedance of free space, typically 50–100 Ω in practice).
So there’s a giant (>3 OOM) mismatch.
The easy way to deal with that giant mismatch is to ignore it. Just use the same 500mV voltage for both the switches and the wires, even though that entails wasting tons and tons of power unnecessarily in the latter—specifically 6.5 orders of magnitude more interconnect losses than if the voltage were tailored to the wire properties.
The hard way to deal with that giant mismatch is to make billions of nano-sized weird stacks of piezoelectric blocks so that each transistor gate has its own little step-up voltage-converter, or other funny things like that as in my top comment.
But people aren’t doing it the “hard way”, they’re doing it the “easy way”, and always have been.
Given that this is in fact the strategy, we can start doing Fermi estimates about interconnect losses. We have V ≈ 20·kT/q, C ≈ ε₀ × L (where L = typical device dimension), and if we ask how much loss there is in a "square tile" it would be ½CV²/L ≈ 200·(kT/q)²·ε₀ = 1.2e-21 J/nm, which isn't wildly far from Kwa's estimate that you cite.
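A quick numerical version of that Fermi estimate (kT/q ≈ 26 mV at room temperature is the only physical input):

```python
eps0 = 8.854e-12       # F/m, vacuum permittivity
kT_over_q = 0.0259     # V at ~300 K

V = 20 * kT_over_q             # ~0.5 V, gate-compatible swing
# Per "square tile" of side L: C ~ eps0 * L, so (1/2 C V^2) / L ~ 200 (kT/q)^2 * eps0
E_per_m = 0.5 * eps0 * V**2    # J per metre of wire, independent of L
print(E_per_m * 1e-9, "J/nm")  # ~1.2e-21 J/nm
```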
So in summary, I claim that Kwa gets reasonable numbers (compared to actual devices) by implicitly / effectively assuming somewhere-or-other that wire voltage is high enough to also simultaneously be adequate for a transistor gate voltage, even though such a high voltage is not remotely necessary for the wire to function well as a wire. Maybe he thinks otherwise, and if he does, I think he’s wrong. ¯\_(ツ)_/¯
To be clear, Kwa did not provide a model—so estimate is not really the right word. He provided a link to the actual wire consumption of some current coaxial Ethernet, did the math, and got 5e−21 J/nm, which is near the lower bound I predicted based on the Landauer analysis—which only works using sophisticated error correction codes (which require entire chips). You obviously can't use a whole CPU or ASIC for error correction for every little nanowire interconnect, so interconnect wires need to be closer to the 1 eV/nm wire energy to have reliability. So your most efficient model could approach the 2e−21 J/nm level, but only using some big bulky mechanism—if not error correction coding then perhaps the billions of piezoelectric blocks.
Now you could believe that I had already looked up all those values and knew that, but actually I did not. I did of course test the Landauer model on a few examples, and then just wrote it in as it seemed to work.
So I predict that getting below the 2e−21 J/nm limit at room temp is impossible for irreversible electronic relay-based communication (systems that send signals relayed through electrons on dense crystal lattices).
If you want to know the noise in a wire, you pull out your EE 101 textbook and you get formulas like V_noise,rms ≈ √(kT·Z₀·f) = 0.015 mV (@ 1 GHz & 50 Ω), where Z₀ is the wire's characteristic impedance and f is the frequency bandwidth. (Assuming the wire has a low-impedance [voltage source] termination on at least one side, as expected in this context.) Right? (I might be omitting a factor of 2 or 4? Hmm, actually I'm a bit unsure about various details here. Maybe in practice the noise would be similar to the voltage source noise, which could be even lower. But OTOH there are other noise sources like cross-talk.) The number of charge carriers is not part of this equation, and neither is the wire diameter. If we connect one end of the wire to a +10mV versus −10mV source, that's 1000× higher than the wire's voltage noise, even averaging over as short as a nanosecond, so error correction is unnecessary, right?
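For reference, a sketch of the numbers behind that formula (taking the stated 50 Ω and 1 GHz, and ignoring the possible factors of 2 to 4 mentioned):

```python
import math

kT = 1.380649e-23 * 300   # J at room temperature
Z0 = 50.0                 # ohm, characteristic impedance
f = 1e9                   # Hz, bandwidth

V_noise = math.sqrt(kT * Z0 * f)
print(V_noise * 1e3, "mV rms")                        # ~0.014 mV
print(10e-3 / V_noise, "x margin for a 10 mV swing")  # several hundred x
```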
I feel like your appeal to “big bulky mechanism” is special pleading. I don’t think Landauer’s analysis concluded “…therefore there is an inevitable energy dissipation of kT per bit erasure, oh unless you have a big bulky mechanism involving lots and lots of electrons, in which case energy dissipation can be as low as you like”. Right? Or if there’s a formula describing how “Landauer’s limit for interconnects” gets progressively weaker as the wire gets bigger, then what’s that formula? And why isn’t a 1nm-diameter wire already enough to get to the supposed large-wire-limit, given that copper has 90 conduction electrons per nm³?
Hmm, I think I should get back to my actual job now. You’re welcome to reply, and maybe other people will jump in with opinions. Thanks for the interesting discussion! :)
This is frustrating for me as I have already laid out my core claims and you haven't clarified which (if any) you disagree with. Perhaps you are uncertain—that's fine, and I can kind of guess based on your arguments, but it still means we are talking past each other more than I'd prefer.
If we connect one end of the wire to a +10mV versus −10mV source, that’s 1000× higher than the wire’s voltage noise, even averaging over as short as a nanosecond, so error correction is unnecessary, right?
It doesn’t matter whether you use 10mV or 0.015mV as in your example above, as Landauer analysis bounds the energy of a bit, not the voltage. For high reliability interconnect you need ~1eV which could be achieved in theory by one electron at one volt naturally, but using 10mV would require ~100 electron charges and 0.015mV would require almost 1e5 electron charges, the latter of which doesn’t seem viable for nanowire interconnect, and doesn’t change the energy per bit requirements regardless.
The wire must use ~1eV to represent and transmit one bit (for high reliability interconnect) to the receiving device across the wire exit surface, regardless of the wire width.
Now we notice that we can divide the wire in half, and the first half is also a wire which must transmit to the second half, so now we know it must use at least 2 eV to transmit a bit across both sections, each of which we can subdivide again, resulting in 4 eV, and so on, until you naturally bottom out at the minimal wire length of one electron radius.
Hmm, I think I should get back to my actual job now
Agreed—this site was designed to nerdsnipe us away from creating AGI ;)
Gah, against my better judgment I’m gonna carry on for at least one more reply.
This is frustrating for me as I have already laid out my core claims and you haven’t clarified which (if any) you disagree with.
I think it’s wrong to think of a wire as being divided into a bunch of tiles each of which should be treated like a separate bit.
Back to the basic Landauer analysis: Why does a bit-copy operation require kT of energy dissipation? Because we go from four configurations (00,01,10,11) to two (00,11). Thermodynamics says we can’t reduce the number of microstates overall, so if the number of possible chip states goes down, we need to make up for it by increasing the temperature (and hence number of occupied microstates) elsewhere in the environment, i.e. we need to dissipate energy / dump heat.
OK, now consider a situation where we're transferring information by raising or lowering the voltage on a wire. Define V(X) = voltage of the wire at location X and V(X+1nm) = voltage of the wire at location X+1nm (or whatever the supposed "tile size" is). As it turns out, under practical conditions and at the level of accuracy that matters, V(X) = V(X+1nm) always. No surprise—wires are conductors, and conductors oppose voltage gradients. There was never a time when we went from more microstates to fewer microstates, because there was never a time when V(X) ≠ V(X+1nm) in the first place. They are yoked together, always equal to each other. They are one bit, not two. For example, we don't need an energy barrier preventing V(X) from contaminating the state of V(X+1nm) or whatever; in fact, that's exactly the opposite of what we want.
(Nitpicky side note: I’m assuming that, when we switch the wire voltage between low and high, we do so by ramping it very gradually compared to (1nm / speed of light). This will obviously be the case in practice. Then V(X) = V(X+1nm) even during the transient as the wire voltage switches.)
The thing you're proposing is, to my ears, kinda like saying that the voltage of each individual atom within a single RAM capacitor plate is 1 bit, and it just so happens that all those "bits" within a single capacitor plate are equal to each other at any given time, and since there's billions of atoms on the one capacitor plate it must take billions of dissipative copy operations every time we flip that one RAM bit.
None of those are fundamental—all those rules/laws are derived—or should be derivable—from simpler molecular/atomic-level simulations.
I'm confident that I can walk through any of the steps to get from the standard model of particle physics, to Bloch waves and electron scattering, to the drift-diffusion equation and then Ohm's law, and to the telegrapher's equations, and to Johnson noise and all the other textbook formulas for voltage noise on wires. (Note that I kinda mangled my discussions of voltage noise above, in various ways; I'm happy to elaborate but I don't think that's a crux here.)
Whereas “wires should be modeled as a series of discrete tiles with dissipative copy operations between them” is not derivable from fundamental physics, I claim. In particular, I don’t think there is any first-principles story behind your assertion that “the natural tile size is the electron radius”. I think it’s telling that “electron radius” is not a thing that I recall ever being mentioned in discussions of electrical conduction, including numerous courses that I’ve taken and textbooks that I’ve read in solid-state physics, semiconductor physics, nanofabrication, and electronics. Honestly I’m not even sure what you mean by “electron radius” in the first place.
I think it’s wrong to think of a wire as being divided into a bunch of tiles each of which should be treated like a separate bit.
Why? Does not each minimal length of wire need to represent and transmit a bit? Does the Landauer principle somehow not apply at the micro or nanoscale?
It is not the case that the wire represents a single bit, stretched out across the length of the wire, as I believe you will agree. Each individual section of wire stores and transmits different individual bits in the sequence chain at each moment in time, such that the number of bits on the wire is a function of length.
As it turns out, under practical conditions and at the level of accuracy that matters, V(X) = V(X+1nm) always.
Only if the wire is perfectly insulated from the external environment—which crucially perhaps is our crux. If the wire is in a noisy conventional environment, it accumulates noise on the Landauer scale at each nanoscale transmission step, and at the minimal Landauer bit energy scale this noise rapidly collapses the bit representation (decays to noise) exponentially quickly, unless erased (because the Landauer energy scale is defined as the minimal bit energy reasonably distinguishable from noise, so it has no room for more error).
There was never a time when we went from more microstates to fewer microstates, because there was never a time when V(X) ≠ V(X+1nm) in the first place.
I don’t believe this is true in practice as again any conventional system is not perfectly reversible unless (unrealistically) there is no noise coupling.
The thing you’re proposing is, to my ears, kinda like saying that the voltage of each individual atom within a single RAM capacitor plate is 1 bit, and it just so happens that all those “bits” within a single capacitor plate are equal to each other at any given time, and since there’s billions of atoms on the one capacitor plate
I'm not sure how you got that? There are many ways to represent a bit, and for electronic relay systems the bit representation is distributed over some small fraction of the electrons moving between outer orbitals. The bit representation is a design constraint in terms of a conceptual partition of microstates, and as I already stated earlier you can represent a tiny Landauer-energy bit using partitions of an almost unlimited number of atoms and their microstates (at least cross-sectionally for an interconnect wire, but for density reasons the wires need to be thin).
I sometimes use single electron examples, as those are relevant for nanoscale interconnect, and nanoscale computational models end up being molecule sized cellular automata where bits are represented by few electron gaps (but obviously not all electrons participate).
Whereas “wires should be modeled as a series of discrete tiles with dissipative copy operations between them” is not derivable from fundamental physics, I claim
Do you not believe that wires can be modeled as smaller units, recursively down to the level of atoms?
And I clearly do not believe that wires are somehow only capable of dissipative copy operations in theory. In theory they are perfectly capable of non-dissipative reversible move operations, but in practice that has 1) never been successfully achieved in any conventional practical use that I am aware of, and 2) is probably impossible in practical use without exotic noise isolation, given the terminal rapid noise buildup problems I mentioned (I have some relevant refs in earlier comments).
In particular, I don’t think there is any first-principles story behind your assertion that “the natural tile size is the electron radius”.
The Landauer principle doesn't suddenly stop applying at the nanoscale; it bounds atoms and electrons at all sizes and scales. The wire equations are just abstractions; the reality at nanoscale should be better modeled by a detailed nanoscale cellular automaton. By "electron radius" I meant the de Broglie wavelength, which I'm using as a reasonable but admittedly vague-ish guess for the interaction distance (the smallest distance scale at which we can model it as a cellular automaton switching between distinct bit states, which I admit is not a concept I can yet tightly define, but I derive that concept from studies of the absolute minimum spacing between compute elements due to QM electron de Broglie wavelength effects, and I expect it's close to the directional mean free path length, but haven't checked). So for an interconnect wire I used ~1.23 nm at 1 volt, from this thread:
The Landauer/Tile model predicts in advance that a natural value of this parameter will be 1 electron charge per 1 volt per 1 electron radius, i.e. 1.602e-19 F / 1.23 nm, or 1.3026e-10 F/m.
Naturally it's not a fixed quantity, as it depends on the electron energy and thus voltage, the thermal noise, etc., but it doesn't seem like that can make a huge difference for room-temp conventional wires. (This page estimates a wavelength of 8 angstroms, or 0.8 nm, for typical metals, so fairly close.) I admit that my assertion that the natural interaction length (and thus cellular automaton scale) is the electron de Broglie wavelength seems ad hoc, but I believe it is justifiable and very much seems to make the right predictions so far.
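Spelling out the arithmetic behind that figure in a short sketch (the 1.23 nm tile length and 1 V are the assumptions stated above):

```python
q = 1.602e-19        # C, electron charge
V = 1.0              # V, assumed bit voltage
tile = 1.23e-9       # m, assumed de Broglie wavelength / tile length

# "1 electron charge per volt per tile length" read as a capacitance per length
C_per_m = (q / V) / tile
print(C_per_m, "F/m")      # ~1.30e-10 F/m, the figure quoted above
```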
But in that sense I should reassert that my model applies most directly only to any device which conveys bits relayed through electrons exchanging orbitals, as that is the generalized electronic cellular automata model, and wires should not be able to beat that bound. But if there is some way to make the interaction distance much much larger—for example via electrons moving ballistically OOM greater than the ~1 nm atomic scale before interacting, then the model will break down.
So what would cause you to update?
For me, I will update immediately if someone can find a single example of a conventional wire communication device (room temp etc) which has been measured to transmit information using energy confidently less than 2e−21 J/bit/nm. In your model this doesn’t seem super hard to build.
But in that sense I should reassert that my model applies most directly only to any device which conveys bits relayed through electrons exchanging orbitals, as that is the generalized electronic cellular automata model, and wires should not be able to beat that bound. But if there is some way to make the interaction distance much much larger—for example via electrons moving ballistically OOM greater than the ~1 nm atomic scale before interacting, then the model will break down.
The mean free path of conduction electrons in copper at room temperature is ~40 nm. Cold pure metals can have much greater mean free paths. Also, a copper atom is ~0.1 nm, not ~1 nm.
For me, I will update immediately if someone can find a single example of a conventional wire communication device (room temp etc) which has been measured to transmit information using energy confidently less than 2e−21 J/bit/nm. In your model this doesn’t seem super hard to build.
I guess we could buy a 30-meter cat8 ethernet cable, send 40Gbps of data through it, coil up the cable very far away from both the transmitter and the receiver, and put that coil into a thermally-insulated box (or ideally, a calorimeter), and see if the heat getting dumped off the cable is less than 2.4 watts, right? I think that 2.4 watts is enough to be pretty noticeable without special equipment.
My expectation is… Well, I'm a bit concerned that I'm misunderstanding ethernet specs, but it seems that there are 4 twisted pairs with 75 Ω characteristic impedance, and the voltage levels go up to ±1V. That would amount to a power flow of up to 4V²/Z ≈ 0.05 W. The amount dissipated within the 30-meter cable is of course ~~much~~ less than that, or else there would be nothing left for the receiver to measure. So my prediction for the thermally-insulated box experiment above is "the heat getting dumped off the ethernet cable will be ~~well~~ under 0.05 W (unless I'm misunderstanding the ethernet specs)".
(Update: I struck-through the intensifiers “much” and “well” in the previous paragraph. Maybe they’re justified, but I’m not 100% sure and they’re unnecessary for my point anyway. See bhauth reply below.)
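For concreteness, here is the arithmetic behind both numbers as a sketch (40 Gbps, 30 m, and the 4 twisted pairs at 75 Ω and ±1 V are the figures from the two paragraphs above):

```python
# Claimed bound: 2e-21 J per bit per nm
E_per_bit = 2e-21 * 30 * 1e9     # J/bit over 30 m (30 m = 3e10 nm)
P_claim = E_per_bit * 40e9       # W at 40 Gbps
print(P_claim, "W")              # ~2.4 W

# Upper bound on power actually flowing into the cable: 4 pairs, +/-1 V, 75 ohm
P_flow = 4 * 1.0**2 / 75
print(P_flow, "W")               # ~0.05 W
```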
what would cause you to update?
I can easily imagine being convinced by a discussion that talks about wires in the way that I consider “normal”, like if we’re interested in voltage noise then we use the Johnson noise formula (or shot noise or crosstalk noise or whatever it is), or if we’re interested in the spatial profile of the waves then we use the telegrapher’s equations and talk about wavelength, etc.
For example, you wrote “it accumulates noise on the landauer scale at each nanoscale transmission step, and at the minimal landauer bit energy scale this noise rapidly collapses the bit representation (decays to noise) exponentially quickly”. I think if this were a real phenomenon, we should be able to equivalently describe that phenomenon using the formulas for electrical noise that I can find in the noise chapter of my electronics textbook. People have been sending binary information over wires since 1840, right? I don’t buy that there are important formulas related to electrical noise that are not captured by the textbook formulas. It’s an extremely mature field. I once read a whole textbook on transistor noise, it just went on and on about every imaginable effect.
As another example, you wrote:
It is not the case that the wire represents a single bit, stretched out across the length of the wire, as I believe you will agree. Each individual section of wire stores and transmits different individual bits in the sequence chain at each moment in time, such that the number of bits on the wire is a function of length.
Again, I want to use conventional wire formulas here. Let’s say:
- It takes 0.1 nanosecond for the voltage to swing from low to high (thanks to the transistor's own capacitance, for example)
- The interconnect has a transmission-line signal velocity comparable to the speed of light
- We're talking about a 100 μm-long interconnect.
Then you can do the math: the entire interconnect will be for all intents and purposes at a uniform voltage throughout the entire voltage-switching process. If you look at a graph of the voltage as a function of position, it will look like a flat horizontal line at each moment, and that horizontal line will smoothly move up or down over the course of the 0.1 ns swing. It won’t look like a propagating wave.
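Here is that math spelled out under the three assumptions just listed (taking the signal velocity as c; a more realistic velocity of roughly c/2 only changes the ratio by 2×):

```python
c = 3e8            # m/s, assume signal velocity ~ speed of light
L = 100e-6         # m, interconnect length
t_swing = 0.1e-9   # s, low-to-high transition time

t_transit = L / c
print(t_transit, "s")        # ~0.3 ps end-to-end propagation time
print(t_swing / t_transit)   # ~300: the swing is ~300x slower than propagation,
                             # so the wire is effectively at one voltage throughout
```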
As a meta-commentary, you can see what’s happening here—I don’t think the thermal de Broglie wavelength is at all relevant in this context, nor the mean free path, and instead I’m trying to shift discussion to “how wires work”.
non-dissipative reversible move operations
One of the weird things in this discussion from my perspective is that you’re OK with photons carrying information with less than 2e-21 J/bit/nm energy dissipation but you’re not OK with wires carrying information with less than 2e-21 J/bit/nm energy dissipation. But they’re not so different in my perspective—both of those things are fundamentally electromagnetic waves traveling down transmission lines. Obviously the frequency is different and the electromagnetic mode profile is different, but I don’t see how those are relevant.
I don’t think the thermal de Broglie wavelength is at all relevant in this context, nor the mean free path, and instead I’m trying to shift discussion to “how wires work”.
This is the crux of it. I made the same comment here before seeing this comment chain.
People have been sending binary information over wires since 1840, right? I don’t buy that there are important formulas related to electrical noise that are not captured by the textbook formulas. It’s an extremely mature field.
Also a valid point. @jacob_cannell is making a strong claim: that the energy lost by communicating a bit is the same scale as the energy lost by all other means, by arbitrarily dividing by 1 nm so that the units can be compared. If this were the case, then we would have known about it for a hundred years. Instead, it is extremely difficult to measure the extremely tiny amounts of heat that are actually generated by deleting a bit, such that it’s only been done within the last decade.
This arbitrary choice leads to a dramatically overestimated heat cost of computation, and it ruins the rest of the analysis.
@Alexander Gietelink Oldenziel, for whatever it is worth, I, a physicist working in nanoelectronics, recommend @Steven Byrnes for the $250. (Although, EY’s “it’s wrong because it’s obviously physically wrong” is also correct. You don’t need to dig into details to show that a perpetual motion machine is wrong. You can assert it outright.)
For what it’s worth, I think both sides of this debate appear strangely overconfident in claims that seem quite nontrivial to me. When even properly interpreting the Landauer bound is challenging due to a lack of good understanding of the foundations of thermodynamics, it seems like you should be keeping a more open mind before seeing experimental results.
At this point, I think the remarkable agreement between the wire energies calculated by Jacob and the actual wire energies reported in the literature is too good to be a coincidence. However, I suspect the agreement might be the result of some dimensional analysis magic as opposed to his model actually being good. I’ve been suspicious of the de Broglie wavelength-sized tile model of a wire since the moment I first saw it, but it’s possible that there’s some other fundamental length scale that just so happens to be around 1 nm and therefore makes the formulas work out.
People have been sending binary information over wires since 1840, right? I don’t buy that there are important formulas related to electrical noise that are not captured by the textbook formulas. It’s an extremely mature field.
The Landauer limit was first proposed in 1961, so the fact that people have been sending binary information over wires since 1840 seems to be irrelevant in this context.
arbitrarily dividing by 1 nm so that the units can be compared
1 nm is somewhat arbitrary but around that scale is a sensible estimate for minimal single electron device spacing ala Cavin/Zhirnov. If you haven’t actually read those refs you should—as they justify that scale and the tile model.
This arbitrary choice leads to a dramatically overestimated heat cost of computation, and
This is just false, unless you are claiming you have found some error in the Cavin/Zhirnov papers. It's also false in the sense that the model makes reasonable predictions. I'll just finish my follow-up post, but using the mean free path as the approximate scale does make sense for larger wires and leads to fairly good predictions for a wide variety of wires, from on-chip interconnect to coax cable Ethernet to axon signal conduction.
1 nm is somewhat arbitrary but around that scale is a sensible estimate for minimal single electron device spacing ala Cavin/Zhirnov. If you haven’t actually read those refs you should—as they justify that scale and the tile model.
They use this model to figure out how to pack devices within a given area and estimate their heat loss. It is true that heating of a wire is best described with a resistivity (or parasitic capacitance) that scales as 1/L. If you want to build a model out of tiles, each of which is a few nm on a side (because the FETs are roughly that size), then you are perfectly allowed to do so. IMO the model is a little oversimplified to be particularly useful, but it’s physically reasonable at least.
This is just false, unless you are claiming you have found some error in the cavin/zhirnov papers.
No, the papers are fine. They don’t say what you think they say. They are describing ordinary resistive losses and such. In order to compare different types of interconnects running at different bitrates, they put these losses in units of energy/bit/nm. This has no relation to Landauer’s principle.
Resistive heat loss in a wire is fundamentally different than heat loss from Landauer’s principle. I can communicate 0 bits of information across a wire while losing tons of energy to resistive heat, by just flowing a large constant current through it.
It’s also false in the sense that the model makes reasonable predictions.
As pointed out by Steven Byrnes, your model predicts excess heat loss in a well-understood system. In my linked comment, I pointed out another way that it makes wrong predictions.
Resistive heat loss in a wire is fundamentally different than heat loss from Landauer’s principle. I can communicate 0 bits of information across a wire while losing tons of energy to resistive heat, by just flowing a large constant current through it.
As pointed out by Steven Byrnes, your model predicts excess heat loss in a well-understood system.
False. I never at any point modeled the resistive heat/power loss for flowing current through a wire sans communication. It was Byrnes who calculated the resistive loss for a coax cable, and got a somewhat wrong result (for wire communication bit energy cost), whereas the tile model (using mean free path for larger wires) somehow outputs the correct values for actual coax cable communication energy use as shown here.
Resistive heat loss is not the same as heat loss from Landauer’s principle. (you agree!)
The Landauer limit is an energy loss per bit flip, with units energy/bit. This is the thermodynamic minimum (with irreversible computing). It is extremely small and difficult to measure. It is unphysical to divide it by 1 nm to model an interconnect, because signals do not propagate through wires by hopping from electron to electron.
The Cavin/Zhirnov paper you cite does not concern the Landauer principle. It models ordinary dissipative interconnects. Due to a wide array of engineering optimizations, these elements tend to have similar energy loss per bit per mm, however this is not a fundamental constraint. This number can be basically arbitrarily changed by multiple orders of magnitude.
You claim that your modified Landauer energy matches the Cavin/Zhirnov numbers, but this is a nonsense comparison because they are different things. One can be varied by orders of magnitude while the other cannot. Because they are different heat sources, their heat losses add.
We have known how wires work for a very long time. There is a thorough and mature field of physics regarding heat and information transport in wires. If we were off by a factor of 2 in heat loss (what you are claiming, possibly without knowing so) then we would have known it long ago. The Landauer principle would not be a very esoteric idea at the fringes of computation and physics, it would be front and center necessary to understand heat dissipation in wires. It would have been measured a hundred years ago.
I’m not going to repeat this again. If you ignore the argument again then I will assume bad faith and quit the conversation.
I'm really not sure what your argument is if this is the meat, and moreover don't really feel morally obligated to respond given that you have not yet acknowledged that my model already made roughly correct predictions and that Byrnes's model of wire heating under passive current load is way off theoretically and practically. Interconnect wire energy comes from charging and discharging ½CV² capacitance energy, not resistive loss for passive constant (unmodulated) current flow.
The Landauer limit connects energy to probability of state transitions, and is more general than erasure. Reversible computations still require energies that are multiples of this bound for reliability. It is completely irrelevant how signals propagate through the medium—whether by charging wire capacitance as in RC interconnect, or through changes in drift velocity, or phonons, or whatever. As long as the medium has thermal noise, the Landauer/Boltzmann relationship applies.
Cavin/Zhirnov absolutely cite and use the Landauer principle for bit energy.
I make no such claim, as I'm not using a "modified Landauer energy".
I’m not making any claims of novel physics or anything that disagrees with known wire equations.
If we were off by a factor of 2 in heat loss (what you are claiming, possibly without knowing so)
Comments like this suggest you don’t have a good model of my model. The actual power usage of actual devices is a known hard fact and coax cable communication devices have actual power usage within the range my model predicted—that is a fact. You can obviously use the wire equations (correctly) to precisely model that power use (or heat loss)! But I am more concerned with the higher level general question of why both human engineering and biology—two very separate long running optimization processes—converged on essentially the same wire bit energy.
Ok, I will disengage. I don’t think there is a plausible way for me to convince you that your model is unphysical.
I know that you disagree with what I am saying, but from my perspective, yours is a crackpot theory. I typically avoid arguing with crackpots, because the arguments always proceed basically how this one did. However, because of apparent interest from others, as well as the fact that nanoelectronics is literally my field of study, I engaged. In this case, it was a mistake.
Things got heated here.
I and many others are grateful for your effort to share your expertise.
Is there a way in which you would feel comfortable continuing to engage?
Remember that for the purposes of the prize pool there is no need to convince Cannell that you are right. In fact I will not judge veracity at all just contribution to the debate (on which metric you’re doing great!)
Dear Jake,
This is the second person in this thread that has explicitly signalled the need to disengage. I also realize this is a charged topic and it's easy for it to get heated when you're just honestly trying to engage.
I would be happy to discuss the physics related to the topic with others. I don’t want to keep repeating the same argument endlessly, however.
Note that it appears that EY had a similar experience of repeatedly not having their point addressed:
I’m confused at how somebody ends up calculating that a brain—where each synaptic spike is transmitted by ~10,000 neurotransmitter molecules (according to a quick online check), which then get pumped back out of the membrane and taken back up by the synapse; and the impulse is then shepherded along cellular channels via thousands of ions flooding through a membrane to depolarize it and then getting pumped back out using ATP, all of which are thermodynamically irreversible operations individually—could possibly be within three orders of magnitude of max thermodynamic efficiency at 300 Kelvin. I have skimmed “Brain Efficiency” though not checked any numbers, and not seen anything inside it which seems to address this sanity check.
Then, after a reply:
This does not explain how thousands of neurotransmitter molecules impinging on a neuron and thousands of ions flooding into and out of cell membranes, all irreversible operations, in order to transmit one spike, could possibly be within one OOM of the thermodynamic limit on efficiency for a cognitive system (running at that temperature).
Then, after another reply:
Nothing about any of those claims explains why the 10,000-fold redundancy of neurotransmitter molecules and ions being pumped in and out of the system is necessary for doing the alleged complicated stuff.
Then, nothing more (that I saw, but I might have missed comments. This is a popular thread!).
If this is your field but also you don’t have the mood for pedagogy when someone from another field has strong opinions, which is emotionally understandable, I’m curious what learning material you’d recommend working through to find your claims obvious; is a whole degree needed? Are there individual textbooks or classes or even individual lectures?
For the theory of sending information across wires, I don’t think there is any better source than Shannon’s “A Mathematical Theory of Communication.”
I’m not aware of any self-contained sources that are enough to understand the physics of electronics. You need to have a very solid grasp of E&M, the basics of solid state, and at least a small amount of QM. These subjects can be pretty unintuitive. As an example of the nuance even in classical E&M, and an explanation of why I keep insisting that “signals do not propagate in wires by hopping from electron to electron,” see this youtube video.
You don’t actually need all of that in order to argue that the brain cannot be efficient from a thermodynamic perspective. EY does not understand the intricacies of nanoelectronics (probably), but he correctly stated that the final result from the original post cannot be correct, because obviously you can imagine a computation machine that is more thermodynamically efficient than pumping tens of thousands of ions across membranes and back. This intuition probably comes from some thermodynamics or statistical mechanics books.
What is the most insightful textbook about nanoelectronics you know of, regardless of how difficult it may be?
Or for another question trying to get at the same thing: if only one book about nanoelectronics were to be preserved (but standard physics books would all be fine still), which one would you want it to be? (I would be happy with a pair of books too, if that’s an easier question to answer.)
I come more from the physics side and less from the EE side, so for me it would be Datta’s “Electronic Transport in Mesoscopic Systems”, assuming the standard solid state books survive (Kittel, Ashcroft & Mermin, L&L stat mech, etc). For something closer to EE, I would say “Principles of Semiconductor Devices” by Zeghbroeck because it is what I have used and it was good, but I know less about that landscape.
I strongly disapprove of your attitude in this thread. You haven’t provided any convincing explanation of what’s wrong with Jacob’s model beyond saying “it’s unphysical”.
I agree that the model is very suspicious and in some sense doesn’t look like it should work, but at the same time, I think there’s obviously more to the agreement between his numbers and the numbers in the literature than you’re giving credit for. Your claim that there’s no fundamental bound on information transmission that relies on resistive materials of the form energy/bit/length (where the length scale could depend on the material in ways Jacob has already discussed) is unsupported and doesn’t seem like it rests on any serious analysis.
You can’t blame Jacob for not engaging with your arguments because you haven’t made any arguments. You’ve just said that his model is unphysical, which I agree with and presumably he would also agree with to some extent. However, by itself, that’s not enough to show that there is no bound on information transmission which roughly has the form Jacob is talking about, and perhaps for reasons that are not too dissimilar from the ones he’s conjectured.
I could be wrong here, but I think the “well-understood” physics principles that spxtr is getting at are the Shannon-Hartley Theorem and the Johnson-Nyquist noise. My best guess at how one would use these to derive a relationship between power consumption, bit rate, and temperature are as follows:
The power of the Johnson-Nyquist noise goes as kTΔf, where Δf is the bandwidth. So we’re interpreting the units of kT as W/Hz. Interestingly, for power output, the resistance in the circuit is irrelevant. Larger resistance means more voltage noise and less current noise, but the overall power multiplies out to be the same.
Next, the Shannon-Hartley theorem says that the channel capacity is:
C = Δf·log₂(1 + P_signal/P_noise)
Where C is the bitrate (units are bits per second), and P_signal, P_noise are the power levels of signal and noise. Then the energy cost to send a bit (we'll call it E_bit) is:
E_bit = P_signal / C
Based on Johnson-Nyquist, we have a noise level of kT·Δf, so overall the energy cost per bit should be:
E_bit = P_signal / (Δf·log₂(1 + P_signal/(kT·Δf)))
Define a dimensionless x = P_signal/(kT·Δf). Then we have:
E_bit = kT·x / log₂(1 + x)
Since x must be positive, the minimum value for the dimensionless part is ln 2 (approached as x → 0). So this gives a figure of kT·ln 2 per bit for the entire line, assuming resistance isn't too large. Interestingly, this is the same number as the Landauer limit itself, something I wasn't expecting when I started writing this.
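A quick numerical check (mine) that the dimensionless factor x/log₂(1+x) really does approach ln 2 as x → 0:

```python
import math

kT = 1.380649e-23 * 300   # J at room temperature

def energy_per_bit_over_kT(x):
    """E_bit / kT as a function of x = P_signal / (kT * bandwidth)."""
    return x / math.log2(1 + x)

for x in (10, 1, 0.1, 0.001):
    print(x, energy_per_bit_over_kT(x))
print("limit:", math.log(2))                       # ~0.693 = ln 2
print("kT * ln 2 =", kT * math.log(2), "J")        # ~2.9e-21 J, the Landauer figure
```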
I think one reason your capacitor charging/discharging argument didn’t stop this number from coming out so small is that information can travel as pulses along the line that don’t have to charge and discharge the entire thing at once. They just have to contain enough energy to charge the local area they happen to be currently occupying.
The problem with this model is that it would apply equally as well regardless of how you’re transmitting information on an electromagnetic field, or for that matter, any field to which the equipartition theorem applies.
If your field looks like lots of uncoupled harmonic oscillators joined together once you take Fourier transforms, then each harmonic oscillator is a quadratic degree of freedom, and each picks up thermal noise on the order of ~ kT because of the equipartition theorem. Adding these together gives you Johnson noise in units of power. Shannon-Hartley is a mathematical theorem that has nothing to do with electromagnetism in particular, so it will also apply in full generality here.
You getting the bitwise Landauer limit as the optimum is completely unsurprising if you look at the ingredients that are going into your argument. We already know that we can beat Jacob’s wire energy bounds by using optical transmission, for example. The part your calculation fails to address is what happens if we attempt to drive this transmission by moving electrons around inside a wire made of an ordinary resistive material such as copper.
It seems to me that in this case we should expect a bound that has dimensions energy/bit/length and not energy/bit, and such a bound basically has to look like Jacob’s bound by dimensional analysis, modulo the length scale of 1 nm being correct.
Yeah, I agree that once you take into account resistance, you also get a length scale. But that characteristic length is going to be dependent on the exact geometry and resistance of your transmission line. I don’t think it’s really possible to say that there’s a fundamental constant of ~1nm that’s universally implied by thermodynamics, even if we confine ourselves to talking about signal transmission by moving electrons in a conductive material.
There’s a wide spread of possible levels of attenuation for different cable types. Note the log scale.
A typical level of attenuation is 10dB over 100 ft. If the old power requirement per bit was about kT, this new power requirement is about 10kT. Then presumably to send the signal another 100ft, we’d have to pay another 10kT. Call it 100kT to account for inefficiencies in the signal repeater. So this gives us a cost of 1kT per foot rather than 1kT per nanometer!
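For what it's worth, a rough numerical version of that conversion (the 10 dB per 100 ft figure and the 10× repeater-inefficiency fudge factor are both taken from the paragraph above):

```python
# 10 dB of attenuation over 100 ft => 10x the power at the source to deliver the
# same per-bit energy 100 ft away; call it 100x (100 kT) with repeater overhead.
atten_factor = 10 ** (10 / 10)        # = 10
kT_per_100ft = atten_factor * 10      # ~100 kT including the repeater fudge

nm_per_ft = 3.048e8
print(kT_per_100ft / 100, "kT per foot")               # ~1 kT/ft
print(kT_per_100ft / (100 * nm_per_ft), "kT per nm")   # ~3e-9 kT/nm, far below 1 kT/nm
```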
That linked article and graph seems to be talking about optical communication (waveguides), not electrical.
There’s nothing fundamental about ~1nm, it’s just a reasonable rough guess of max tile density. For thicker interconnect it seems obviously suboptimal to communicate bits through maximally dense single electron tiles.
But you could imagine single-electron tile devices with anisotropic interconnect tiles, where a single electron moves between two precise slots separated by some greater distance, and then ask what the practical limit on that separation distance is; it ends up being the mean free path.
So anisotropic tiles with a length scale around the mean free path are about the best one could expect from irreversible communication over electronic wires, and actual electronic wire signaling in resistive wires comes close to that bound, such that it is an excellent fit for actual wire energies. This makes sense, as we shouldn't expect random electron motion in wires to beat single-electron cellular automata that use precise electron placement.
The equations you are using here seem to be a better fit for communication in superconducting wires where reversible communication is possible.
That linked article and graph seems to be talking about optical communication (waveguides), not electrical.
Terminology: A waveguide has a single conductor, example: a box waveguide. A transmission line has two conductors, example: a coaxial cable.
Yes most of that page is discussing waveguides, but that chart (“Figure 5. Attenuation vs Frequency for a Variety of Coaxial Cables”) is talking about transmission lines, specifically coaxial cables. In some sense even sending a signal through a transmission line is unavoidably optical, since it involves the creation and propagation of electromagnetic fields. But that’s also kind of true of all electrical circuits.
Anyways, given that this attenuation chart should account for all the real-world resistance effects and it says that I only need to pay an extra factor of 10 in energy to send a 1GHz signal 100ft, what’s the additional physical effect that needs to be added to the model in order to get a nanometer length scale rather than a centimeter length scale?
Using steady-state continuous power attenuation is incorrect for EM waves in a coax transmission line. It's the difference between the small power required to maintain drift velocity against frictional resistance vs the larger energy required to accelerate electrons up to the drift velocity from zero for each bit sent.
In some sense none of this matters because if you want to send a bit through a wire using minimal energy, and you aren’t constrained much by wire thickness or the requirement of a somewhat large encoder/decoder devices, you can just skip the electron middleman and use EM waves directly—ie optical.
I don’t have any strong fundemental reason why you couldn’t use reversible signaling through a wave propagating down a wire—it is just another form of wave as you point out.
The Landauer bound still applies of course; it just determines the energy involved rather than dissipated. If the signaling mechanism is irreversible, then the best that can be achieved is on order ~1e-21 J/bit/nm (10x the Landauer bound for minimal reliability over a long wire, but a distance scale of about 10 nm from the mean free path of metals). Actual coax cable wire energy is right around that level, which suggests to me that it is irreversible for whatever reason.
The part your calculation fails to address is what happens if we attempt to drive this transmission by moving electrons around inside a wire made of an ordinary resistive material such as copper.
I have a number floating around in my head. I’m not sure if it’s right, but I think that at GHz frequencies, electrons in typical wires are moving sub picometer distances (possibly even femtometers?) per clock cycle.
The underlying intuition is that electron charge is “high” in some sense, so that 1. adding or removing a small number of electrons corresponds to a huge amount of energy (remove 1% of electrons from an apple and it will destroy the Earth in its explosion!) and 2. moving the electrons in a metal by a tiny distance (sub picometer) can lead to large enough electric fields to transmit signals with high fidelity.
Feel free to check these numbers, as I’m just going by memory.
The end result is that we can transmit signals with high fidelity by moving electrons many orders of magnitude less distance than their mean free path, which means intuitively it can be done more or less loss-free. This is not a rigorous calculation, of course.
I have a number floating around in my head. I’m not sure if it’s right, but I think that at GHz frequencies, electrons in typical wires are moving sub picometer distances (possibly even femtometers?) per clock cycle.
The absolute speed of conduction band electrons inside a typical wire should be around 1e6 m/s at room temperature. At GHz frequencies, the electrons are therefore moving distances comparable to 1 mm per clock cycle.
If you look at the average velocity, i.e. the drift velocity, then that’s of course much slower and the electrons will be moving much more slowly in the wire—the distances you quote should be of the right order of magnitude in this case. But it’s not clear why the drift velocity of electrons is what matters here. By Maxwell, you only care about electron velocity on the average insofar as you’re concerned with the effects on the EM field, but actually, the electrons are moving much faster so could be colliding with a lot of random things and losing energy in the process. It’s this effect that has to be bounded, and I don’t think we can actually bound it by a naive calculation that assumes the classical Drude model or something like that.
If someone worked all of this out in a rigorous analysis I could be convinced, but your reasoning is too informal for me to really believe it.
Ah, I was definitely unclear in the previous comment. I’ll try to rephrase.
When you complete a circuit, say containing a battery, a wire, and a light bulb, a complicated dance has to happen for the light bulb to turn on. At near the speed of light, electric and magnetic fields around the wire carry energy to the light bulb. At the same time, the voltage throughout the wire establishes itself at the values you would expect from Ohm’s law and Kirchhoff’s rules and such. At the same time, electrons throughout the wire begin to feel a small force from an electric field pointing along the direction of the wire, even if the wire has bends and such. These fields and voltages, outside and inside the wire, are the result of a complicated, self-consistent arrangement of surface charges on the wire.
See this youtube video for a nice demonstration of a nonintuitive result of this process. The video cites this paper among others, which has a nice introduction and overview.
The key point is that establishing these surface charges and propagating the signal along the wire amounts to moving an extremely small amount of electric charge. In that youtube video he asserts without citation that the electrons move “the radius of a proton” (something like a femtometer) to set up these surface charges. I don’t think it’s always so little, but again I don’t remember where I got my number from. I can try to either look up numbers or calculate it myself if you’d like.
Signals (low vs high voltages, say) do not propagate through circuits by hopping from electron to electron within a wire. In a very real sense they do not even propagate through the wire, but through electric and magnetic fields around and within the wire. This broad statement is also true at high frequencies, although there the details become even more complicated.
To maybe belabor the point: to send a bit across a wire, we set the voltage at one side high or low. That voltage propagates across the wire via the song and dance I just described. It is the heat lost in propagating this voltage that we are interested in for computing the energy of sending the bit over, and this heat loss is typically extremely small, because the electrons barely have to move and so they lose very little energy to collisions.
I’m aware of all of this already, but as I said, there seems to be a fairly large gap between this kind of informal explanation of what happens and the actual wire energies that we seem to be able to achieve. Maybe I’m interpreting these energies in a wrong way and we could violate Jacob’s postulated bounds by taking an Ethernet cable and transmitting 40 Gbps of information at a long distance, but I doubt that would actually work.
I’m in a strange situation because while I agree with you that the tile model of a wire is unphysical and very strange, at the same time it seems to me intuitively that if you tried to violate Jacob’s bounds by many orders of magnitude, something would go wrong and you wouldn’t be able to do it. If someone presented a toy model which explained why in practice we can get wire energies down to a certain amount that is predicted by the model while in theory we could lower them by much more, I think that would be quite persuasive.
Maybe I’m interpreting these energies in a wrong way and we could violate Jacob’s postulated bounds by taking an Ethernet cable and transmitting 40 Gbps of information at a long distance, but I doubt that would actually work.
Ethernet cables are twisted pair and will probably never be able to go that fast. You can get above 10 GHz with rigid coax cables, although you still have significant attenuation.
Let’s compute heat loss in a 100 m LDF5-50A, which evidently has 10.9 dB/100 m attenuation at 5 GHz. This is very low in my experience, but it’s what they claim.
Say we put 1 W of signal power at 5 GHz in one side. Because of the 10.9 dB attenuation, we receive 94 mW out the other side, with 906 mW lost to heat.
The Shannon-Hartley theorem says that we can compute the capacity of the wire as C = B·log₂(1 + S/N), where B is the bandwidth, S is the received signal power, and N is the noise power.
Let’s assume Johnson noise. These cables are rated up to 100 C, so I’ll use that temperature, although it doesn’t make a big difference.
If I plug in 5 GHz for B, 94 mW for S, and k_B·(370 K)·(5 GHz) ≈ 2.5×10⁻¹¹ W for N, then I get a channel capacity of about 160 Gbit/s.
The heat lost is then (906 mW)/(160 Gbit/s)/(100 m) ≈ 0.05 fJ/bit/mm. Quite low compared to Jacob’s ~10 fJ/mm “theoretical lower bound.”
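For anyone who wants to rerun this, a short Python sketch with the same quoted figures (10.9 dB / 100 m at 5 GHz, 1 W in, Johnson noise at ~100 C); small differences from the numbers above are just rounding:

```python
import math

k_B, T = 1.38e-23, 373.0      # Johnson noise temperature ~100 C
B = 5e9                       # Hz of bandwidth
P_in = 1.0                    # W of signal power in
atten_dB = 10.9               # dB over the 100 m cable
L_mm = 100e3                  # 100 m in mm

P_out = P_in * 10 ** (-atten_dB / 10)   # ~0.08 W received
P_heat = P_in - P_out                   # ~0.9 W lost to heat along the cable
N = k_B * T * B                         # ~2.6e-11 W of Johnson noise
C = B * math.log2(1 + P_out / N)        # ~1.6e11 bit/s channel capacity

print(P_heat / C / L_mm)                # ~6e-17 J/bit/mm, i.e. ~0.05-0.06 fJ/bit/mm
```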
One free parameter is the signal power. The heat loss over the cable is linear in the signal power, while the channel capacity is sublinear, so lowering the signal power reduces the energy cost per bit. It is 10 fJ/bit/mm at about 300 W of input power, quite a lot!
Another is noise power. I assumed Johnson noise, which may be a reasonable assumption for an isolated coax cable, but not for an interconnect on a CPU. Adding an order of magnitude or two to the noise power does not substantially change the final energy cost per bit (0.05 goes to 0.07), however I doubt even that covers the amount of noise in a CPU interconnect.
Similarly, raising the cable attenuation to 50 dB/100 m does not even double the heat loss per bit. Shannon’s theorem still allows a significant capacity. It’s just a question of whether or not the receiver can read such small signals.
The reason that typical interconnects in CPUs and the like tend to be in the realm of 10-100 fJ/bit/mm is because of a wide range of engineering constraints, not because there is a theoretical minimum. Feel free to check my numbers of course. I did this pretty quickly.
The heat lost is then [..] 0.05 fJ/bit/mm. Quite low compared to Jacob’s ~10 fJ/mm “theoretical lower bound.”
In the original article I discuss interconnect wire energy, not a “theoretical lower bound” for any wire energy communication method—and immediately point out reversible communication methods (optical, superconducting) that do not dissipate the wire energy.
Coax cable devices seem to use around 1 to 5 fJ/bit/mm at a few W of power, or a few OOM more than your model predicts here—so I’m curious what you think that discrepancy is, without necessarily disagreeing with the model.
I describe a simple model of wire bit energy for EM wave transmission in coax cable here which seems physically correct but also predicts a bit energy distance range somewhat below observed.
Active copper cable at 0.5W for 40G over 15 meters is ~1e−21J/nm, assuming it actually hits 40G at the max length of 15m.
I can’t access the linked article, but an active cable is not simple to model because its listed power includes the active components. We are interested in the loss within the wire between the active components.
This source has specs for a passive copper wire capable of up to 40G @5m using <1W, which works out to ~5e−21J/nm, or a bit less.
They write <1 W for every length of wire, so all you can say is <5 fJ/mm. You don’t know how much less. They are likely writing <1 W for comparison to active wires that consume more than a W. Also, these cables seem to have a powered transceiver built-in on each end that multiplex out the signal to four twisted pair 10G lines.
Compare to 10G from here, which may use up to 5 W to hit up to 10G at 100 m, for ~5e-21 J/nm.
Again, these have a powered transceiver on each end.
So for all of these, all we know is that the sum of the losses of the powered components and the wire itself are of order 1 fJ/mm. Edit: I would guess that probably the powered components have very low power draw (I would guess 10s of mW) and the majority of the loss is attenuation in the wire.
The numbers I gave essentially are the theoretical minimum energy loss per bit per mm of that particular cable at that particular signal power. It’s not surprising that multiple twisted pair cables do worse. They’ll have higher attenuation, lower bandwidth, the standard transceivers on either side require larger signals because they have cheaper DAC/ADCs, etc. Also, their error correction is not perfect, and they don’t make full use of their channel capacity. In return, the cables are cheap, flexible, standard, etc.
I think this calculation is fairly convincing pending an answer from Jacob. You should have probably just put this calculation at the top of the thread, and then the back-and-forth would probably not have been necessary. The key parameter that is needed here is the estimate of a realistic attenuation rate for a coaxial cable, which was missing from DaemonicSigil’s original calculation that was purely information-theoretic.
As an additional note here, if we take the same setup you’re using and treat the input signal power x (in watts) as a free parameter, then the energy per bit per distance is given by
f(x) = 0.906·x / (5·10¹⁴ · log₂(1 + 0.094·x / (2.5·10⁻¹¹)))
in units of J/bit/mm. This does not have a global optimum for x>0 because it’s strictly increasing, but we can take a limit to get the theoretical lower bound
lim_{x→0} f(x) = 3.34·10⁻²⁵
which is much lower than what you calculated, though to achieve this you would be sending information very slowly—indeed, infinitely slowly in the limit of x→0.
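A numerical check of this limit, using the same constants as the parent comments (0.906/0.094 power split, 5 GHz bandwidth, 2.5e-11 W noise, 100 m of cable):

```python
import math

def f(x):  # J/bit/mm as a function of input signal power x in watts
    return 0.906 * x / (5e14 * math.log2(1 + 0.094 * x / 2.5e-11))

for x in (1.0, 1e-3, 1e-6, 1e-9, 1e-12):
    print(x, f(x))
# f(x) falls toward ~3.3e-25 J/bit/mm as x -> 0, which is just
# (0.906/0.094) * kT * ln(2) per bit, spread over the whole 100 m.
```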
I am skeptical that steady state direct current flow attenuation is the entirety of the story (and indeed it seems to underestimate actual coax cable wire energy of ~1e-21 to 5e-21 J/bit/nm by a few OOM).
For coax cable the transmission is through a transverse (AC) wave that must accelerate a quantity of electrons linearly proportional to the length of the cable. These electrons rather rapidly dissipate this additional drift velocity energy through collisions (resistance), and the entirety of the wave energy is ultimately dissipated.
This seems different than sending continuous DC power through the wire where the electrons have a steady state drift velocity and the only energy required is that to maintain the drift velocity against resistance. For wave propagation the electrons are instead accelerated up from a drift velocity of zero for each bit sent. It’s the difference between the energy required to accelerate a car up to cruising speed and the power required to maintain that speed against friction.
If we take the bit energy to be E_b, then there is a natural EM wavelength from E_b = hc/λ, so λ = hc/E_b, which works out to ~1 μm for ~1 eV. Notice that using a lower frequency / longer wavelength seems to allow one to arbitrarily decrease the bit energy distance scale, but it turns out this just increases the dissipative loss.
So an initial estimate of the characteristic bit energy distance scale here is ~1eV/bit/um or ~1e-22 J/bit/nm. But this is obviously an underestimate as it doesn’t yet include the effect of resistance (and skin effect) during wave propagation.
The bit energy of one wavelength is implemented through the electron peak drift velocity, on order E_b = ½·N_e·m_e·v_d², where N_e is the number of carrier electrons in one wavelength-long wire section. The relaxation time τ, the mean time between thermal collisions, given a room temp thermal velocity of around ~1e5 m/s and a mean free path of ~40 nm in copper, is τ ~ 4e-13 s. Meanwhile the inverse frequency or timespan of one wavelength is around 3e-14 s for an optical-frequency 1 eV wave, and ~1e-9 s for a more typical (much higher amplitude) gigahertz wave. So it would seem that resistance is quite significant on these timescales.
Very roughly, the gigahertz 1e-9 s period wave requires about 5 OOM more energy per wavelength due to dissipation, which cancels out the 5 OOM larger distance scale. Each wavelength section loses about half of the invested energy every τ ~ 4e-13 seconds, so maintaining the bit energy E_b requires an input power of roughly ~E_b/τ for f⁻¹ seconds, which cancels out the effect of the longer wavelength distance, resulting in a constant bit energy distance scale independent of wavelength/frequency (naturally there are many other complex effects that are wavelength/frequency dependent, but they can't improve the bit energy distance scale).
For a low frequency (long wavelength) wave with f ≪ 1/τ:
E_b/d ≈ E_b·(f⁻¹/τ)/λ = E_b/(τfλ)
λ = c/f
E_b/d ≈ E_b/(τfλ) = E_b/(τc) ~ 1 eV / 10 μm ~ 1e-23 J/bit/nm
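A rough numeric sketch of this estimate. The ~1e5 m/s figure is the one used above; the ~1.6e6 m/s Fermi velocity of copper is my addition (not from the comment) and is roughly what makes the quoted ~1 eV per 10 μm come out:

```python
mfp = 40e-9                 # m, conduction-electron mean free path in copper
E_b = 1.6e-19               # J, ~1 eV bit energy
c = 3e8                     # m/s, take the wave speed as ~c

for v in (1e5, 1.6e6):      # the comment's thermal velocity vs copper's Fermi velocity
    tau = mfp / v           # ~4e-13 s or ~2.5e-14 s between collisions
    print(v, tau, E_b / (tau * c) * 1e-9)   # E_b/(tau*c) in J/bit/nm: ~1e-24 or ~2e-23
```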
If you take the bit energy down to the minimal landauer limit of ~0.01 eV this ends up about equivalent to your lower limit, but I don’t think that would realistically propagate.
A real wave propagation probably can't perfectly transfer the bit energy over longer distances and has other losses (dielectric loss, skin effect, etc.), so vaguely guesstimating around 100x loss would result in ~1e-21 J/bit/nm. The skin effect alone perhaps increases resistance by roughly 10x at gigahertz frequencies. Coax devices also seem constrained to use specific lower gigahertz frequencies and then boost the bitrate through analog encoding, so for example 10-bit analog increases bitrate by 10x at the same frequency but requires about 1024x more power, so that is 2 OOM less efficient per bit.
Notice that the basic energy distance scale of E_b/(τc) is derived from the mean free path, via the relaxation time τ = ℓ/v_n, where ℓ is the mean free path and v_n is the thermal noise velocity (around ~1e5 m/s for room temp electrons).
Coax cable doesn't seem to have any fundamental advantage over waveguide optical, so I didn't consider it at all in brain efficiency. It requires wires of about the same width as optical waveguides (several OOM larger than minimal nanoscale RC interconnect) and largish sending/receiving devices, as in optics/photonics.
This is very different than sending continuous power through the wire where the electrons have a steady state drift velocity and the only energy required is that to maintain the drift velocity against resistance. For wave propagation the electrons are instead accelerated up from a drift velocity of zero for each bit sent. It’s the difference between the energy required to accelerate a car up to cruising speed and the power required to maintain that speed against friction.
Electrons are very light so the kinetic energy required to get them moving should not be significant in any non-contrived situation I think? The energy of the magnetic field produced by the current would tend to be much more of an important effect.
As for the rest of your comment, I’m not confident enough I understand the details of your argument be able to comment on it in detail. But from a high level view, any effect you’re talking about should be baked into the attenuation chart I linked in this comment. This is the advantage of empirically measured data. For example, the skin-effect (where high frequency AC current is conducted mostly in the surface of a conductor, so the effective resistance increases the higher the frequency of the signal) is already baked in. This effect is (one of the reasons) why there’s a positive slope in the attenuation chart. If your proposed effect is real, it might be contributing to that positive slope, but I don’t see how it could change the “1 kT per foot” calculation.
Electrons are very light so the kinetic energy required to get them moving should not be significant in any non-contrived situation I think? The energy of the magnetic field produced by the current would tend to be much more of an important effect.
My current understanding is that the electric current energy transmits through electron drift velocity (and I believe that is the standard textbook understanding, although I admit I have some questions concerning the details). The magnetic field is just a component of the EM waves which propagate changes in electron KE between electrons (the EM waves implement the connections between masses in the equivalent mass-spring system).
I'm not sure how you got "1 kT per foot", but that seems roughly similar to the model upthread I am replying to from spxtr that got 0.05 fJ/bit/mm or 5e-23 J/bit/nm. I attempted to derive an estimate from the lower level physics thinking it might be different, but it ended up in the same range—and also off by the same 2 OOM vs real data. But I mention that the skin effect could plausibly increase power by 10x in my lower level model, as I didn't model it nor use measured attenuation values at all. The other OOM probably comes from analog SNR inefficiency.
The part of this that is somewhat odd at first is the exponential attenuation. That does show up in my low level model, where any electron kinetic energy in the wire is dissipated by about 50% due to thermal collisions every τ ~ 4e-13 seconds (that is the important part from mean free path / relaxation time). But that doesn't naturally lead to a linear bit energy distance scale unless that dissipated energy is somehow replaced/driven by the preceding section of waveform.
So if you sent E as a single sharp pulse down a wire of length D, the energy you get on the other side is E·2^(−D/α) for some attenuation length α that works out to about 0.1 mm or so, since it's τc, not meters. I believe if your chart showed attenuation in the 100 THz regime, on the timescale of τ, it would be losing 50% per ~0.1 mm instead of per meter.
We know that resistance is linear, not exponential—which I think arises from long steady flow where every τ seconds half the electron kinetic energy is dissipated, but this total amount is linear with wire section length. The relaxation time τ then just determines what steady mean electron drift velocity (current flow) results from the dissipated energy.
So when the wave frequency f is much less than 1/τ (period much longer than τ), you still lose about half of the wave energy E every τ seconds, but that loss can be spread out over a much larger wavelength section (and indeed at gigahertz frequencies this model roughly predicts the correct 50% attenuation distance scale of ~10 m or so).
There’s two types of energy associated with a current we should distinguish. Firstly there’s the power flowing through the circuit, then there’s energy associated with having current flowing in a wire at all. So if we’re looking at a piece of extension cord that’s powering a lightbulb, the power flowing through the circuit is what’s making the lightbulb shine. This is governed by the equation P=IV. But there’s also some energy associated with having current flowing in a wire at all. For example, you can work out what the magnetic field should be around a wire with a given amount of current flowing through it and calculate the energy stored in the magnetic field. (This energy is associated with the inductance of the wire.) Similarly, the kinetic energy associated with the electron drift velocity is also there just because the wire has current flowing through it. (This is typically a very small amount of energy.)
To see that these types have to be distinct, think about what happens when we double the voltage going into the extension cord and also double the resistance of the lightbulb it’s powering. Current stays the same, but with twice the voltage we now have twice the power flowing to the light bulb. Because current hasn’t changed, neither has the magnetic field around the wire, nor the drift velocity. So the energy associated with having a current flowing in this wire is unchanged, even though the power provided to the light bulb has doubled. The important thing about the drift velocity in the context of P=IV is that it moves charge. We can calculate the potential energy associated with a charge in a wire as E=qV, and then taking the time derivative gives the power equation. It’s true that drift velocity is also a velocity, and thus the charge carriers have kinetic energy too, but this is not the energy that powers the light bulb.
In terms of exponential attenuation, even DC through resistors gives exponential attenuation if you have a "transmission line" configuration of resistors (a ladder of series and shunt resistances).
So exponential attenuation doesn’t seem too unusual or surprising to me.
Indeed, the theoretical lower bound is very, very low.
Do you think this is actually achievable with a good enough sensor if we used this exact cable for information transmission, but simply used very low input energies?
The minimum is set by the sensor resolution and noise. A nice oscilloscope, for instance, will have, say, 12 bits of voltage resolution and something like 10 V full scale, so ~2 mV minimum voltage. If you measure across a 50 Ω load then the minimum received power you can see is P = (2 mV)²/(50 Ω) ≈ 0.1 μW. This is an underestimate, but that's the idea.
This is the right idea, but in these circuits there are quite a few more noise sources than Johnson noise. So, it won’t be as straightforward to analyze, but you’ll still end up with essentially a relatively small (compared to L/nm) constant times kT.
I think one reason your capacitor charging/discharging argument didn’t stop this number from coming out so small is that information can travel as pulses along the line that don’t have to charge and discharge the entire thing at once.
Sure, information can travel that way in theory, but it doesn't work out in practice for dissipative resistive (i.e. non-superconducting) wires. Actual on-chip interconnect wires are 'RC wires' which do charge/discharge the entire wire to send a bit. They are like a pipe which allows electrons to flow from some source to a destination device, where that receiving device (transistor) is a capacitor which must be charged to a bit energy E_b ≫ k_B·T. The Johnson thermal noise on a capacitor is just the same Landauer/Boltzmann noise of E_n ≈ k_B·T. The wire geometry aspect ratio (width/length) determines the speed at which the destination capacitor can be charged up to the bit energy.
The only way for the RC wire to charge the distant receiver capacitor is by charging the entire wire, leading to the familiar RC wire capacitance energy, which is also very close to the Landauer tile model energy using mean free path as the tile size (for the reasons I've articulated in various previous comments).
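A minimal sketch of the comparison being made here. The capacitance per length is an assumed typical order of magnitude for on-chip wires (~0.2 fF/μm), not a figure from this thread:

```python
C_per_m = 0.2e-15 / 1e-6       # F/m, assumed ~0.2 fF per um of on-chip wire
V = 1.0                        # V, rough logic swing
E_rc = 0.5 * C_per_m * V**2    # J per bit per meter, charging the whole wire

eV, tile = 1.6e-19, 1e-9       # ~1 eV per ~1 nm "tile"
E_tile = eV / tile             # J per bit per meter

print(E_rc * 1e12, E_tile * 1e12)   # ~100 vs ~160 fJ/bit/mm: same ballpark
```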
Yeah, to be clear I do agree that your model gives good empirical results for on-chip interconnect. (I haven’t checked the numbers myself, but I believe you that they match up well.) (Though I don’t necessarily buy that the 1nm number is related to atom spacing in copper or anything like that. It probably has more to do with the fact that scaling down a transmission line while keeping the geometry the same means that the capacitance per unit length is constant. The idea you mention in your other comment about it somehow falling out of the mean free path also seems somewhat plausible.)
Anyway, I don’t think my argument would apply to chip interconnect. At 1GHz, the wavelength is going to be about a foot, which is still wider than any interconnect on the microchip will be long. And we’re trying to send a single bit along the line using a DC voltage level, rather than some kind of fancy signal wave. So your argument about charging and discharging the entire line should still apply in this case. My comment would mostly apply to Steven Byrnes’s ethernet cable example, rather than microchip interconnect.
Sure, I guess the “much less” was a guess; I should have just said “less” out of an abundance of caution.
Before writing that comment, I had actually looked for a dB/meter versus frequency plot for cat8 Ethernet cable and couldn’t find any. Do you have a ref? It’s not important for this conversation, I’m just curious. :)
The ‘tile’ or cellular automata wire model fits both on-chip copper interconnect wire energy and brain axon wire energy very well. It is more obvious why it fits axon signal conduction as that isn’t really a traditional voltage propagation in a wire, it’s a propagation of ion cellular automata state changes. I’m working on a better writeup and I’ll look into how the wire equations could relate. If you have some relevant link to physical limits of communication over standard electrical wires, that would of course be very interesting/relevant.
My expectation is… Well, I’m a bit concerned that I’m misunderstanding ethernet specs, but it seems that there are 4 twisted pairs with 75Ω characteristic impedance, and the voltage levels go up to ±1V. That would amount to a power flow of up to 4V²/Z=0.05W.
I’m guessing this is probably the correct equation for the resistive loss, but irreversible communication requires doing something dumb like charging and discharging/dissipating ½CV² (or equivalent) every clock cycle, which is OOM greater than the resistive loss (which would be appropriate for a steady current flow).
Do you have a link to the specs you were looking at? As I’m seeing a bunch of variation in 40G capable cables. Also 40Gb/s is only the maximum transmission rate, actual rate may fall off with distance from what I can tell.
The first reference I can find from this website is second-hand, but:
When the data rate required for interconnection is less than 5 Gbps, the passive copper cable is usually used for interconnection in data center. However, they can only support 40G transmission over really short distance.
Active copper cable can support 40G transmission over copper cable up to 15 meters with QSFP+ connector embedded with electronics. In the battle over transmission distance, optical active cable wins without doubt.
The connectors attached with AOC and active copper cable are the main reason why the two cables can support 40G transmission over longer distance than that of passive copper cable. AOC which can support the longest 40G transmission distance is with the highest power consumption—more than 2W. The power consumption for active copper cable is only 440mW. However, passive copper cable requires no power during the transmission.
Active copper cable at 0.5W for 40G over 15 meters is ~1e−21J/nm, assuming it actually hits 40G at the max length of 15m.
This source has specs for a passive copper wire capable of up to 40G @5m using <1W, which works out to ~5e−21J/nm, or a bit less.
Compare to 10G from here, which may use up to 5 W to hit up to 10G at 100 m, for ~5e-21 J/nm.
One of the weird things in this discussion from my perspective is that you’re OK with photons carrying information with less than 2e-21 J/bit/nm energy dissipation but you’re not OK with wires carrying information with less than 2e-21 J/bit/nm energy dissipation.
I do think I have a good explanation in the cellular automata model, and I'll just put my full response in there, but basically it's the difference between using fermions vs bosons to propagate bits through the system. Photons as bosons are more immune to EM noise perturbations and in typical use have much longer free path length (distance between collisions). One could of course use electrons ballistically to get some of those benefits, but they are obviously slower and 'noisier'.
So I predict in advance these approaches will fail or succeed only through using some reversible mechanism (with attended tradeoffs).
If you accept the Landuer analysis then the only question that remains for nano devices (where interconnect tiles are about the same size as your compute devices), is why you would ever use irreversible copy-tiles for interconnect instead of reversible move-tiles. It really doesn’t matter whether you are using ballistic electrons or electron waves or mechanical rods, you just get different variations of ways to represent a bit (which still mostly look like a 12CV2 relation but the form isn’t especially relevant )
A copy tile copy tile copies a bit from one side to the other. It has an internal memory M state (1 bit), and it takes an input bit from say the left and produces an output bit on the right. It’s logic table looks like:
O I M
1 1 0
1 1 1
0 0 0
0 0 1
In other words, every cycle it erases whatever leftover bit it was storing and copies the input bit to the output, so it always erases one bit. This exactly predicts nanowire energy correctly, there is a reason cavin et al use it, etc etc.
But why do that instead of just move a bit? That is the part which I think is less obvious.
I believe it has to do with the difficulties of noise buildup. The copy device doesn’t allow any error to accumulate at all. Your bits can be right on your reliability threshold (1 eV or whatever depending on the required reliability and speed tradeoffs), and error doesn’t accumulate regardless of wire length, because you are erasing at every step.
The reversible move device seems much better—and obviously is for energy efficiency—but it accumulates a bit of noise on the landuer scale at every cycle, because of various thermal/quantum noise sources as you are probably aware: your device is always coupled to a thermal bath, or still subject to cosmic rays even in outer space, and producing it’s own heat regardless at least for error correction. And if you aren’t erasing noise, then you are accumulating noise.
Edit: After writing this out I just stumbled on this paper by Siamak Taati[1] which makes the same argument about exponential noise accumulation much more formally. Looks like fully reversible computers are as challenging as scaling quantum computers. Quantum computers are naturally reversible and have all the same noise accumulation issues, resulting in quick decoherence—so you end up trying to decouple them from the environment as much as possible (absolute zero temp).
You can also have interconnect through free particle transmission as in lasers/optics, but that of course doesn’t completely avoid the noise accumulation issue. Optical interconnect also just greatly increases the device size which is obviously a huge downside but helps further reduce energy losses by just massively scaling up the interaction length or equivalent tile size.
[1] Reversible cellular automata in presence of noise rapidly forget everything
Your whole reply here just doesn’t compute for me. An interconnect is a wire. We know how wires work. They have resistance-per-length, and capacitance-per-length, and characteristic impedance, and Johnson noise, and all the other normal things about wires that we learned in EE 101. If the wire is very small—even down to nanometers—it’s still a wire, it’s just a wire with a higher resistance-per-length (both for the obvious reason of lower cross-sectional area, and because of surface scattering and grain-boundary scattering).
I don’t know why you’re talking about “tiles”. Wires are not made of tiles, right? I know it’s kinda rude of me to not engage with your effortful comment, but I just find it very confusing and foreign, right from the beginning.
If it helps, here is the first random paper I found about on-chip metal interconnects. It treats them exactly like normal (albeit small!) metal wires—it talks about resistance, resistivity, capacitance, current density, and so on. That’s the kind of analysis that I claim is appropriate.
None of those are fundamental—all those rules/laws are derived—or should be derivable—from simpler molecular/atomic level simulations.
A wire carries a current and can be used to power devices, and/or it can be used to transmit information—bits. In the latter usage, noise analysis is crucial.
Let me state a chain of propositions to see where you disagree:
1. The Landauer energy/bit/noise analysis is correct (so high speed reliable bits correspond to ~1 eV).
2. The analysis applies to computers of all scales, down to individual atoms/molecules.
3. For a minimal molecular nanowire, the natural tile size is the electron radius.
4. An interconnect (wire) tile can be reversible or irreversible.
5. Reversible tiles rapidly accumulate noise/error a la Taati et al. and so aren't used for nanoscale interconnect in brains or computers.
From 1-4 we can calculate the natural wire energy, as it's just 1 electron charge per 1 electron radius, and it reproduces the wire equation near exactly (recall from that other thread in brain efficiency).
Let’s consider a ≤1mm wire on a 1GHz processor. Given the transmission line propagation speed, we can basically assume that the whole wire is always at a single voltage. I want to treat the whole wire as a unit. We can net add charge from the wire, anywhere in the wire, and the voltage of the whole wire will go up. Or we can remove charge from the wire, anywhere in the wire, and the voltage of the whole wire will go down.
Thus we have a mechanism for communication. We can electrically isolate the wire, and I can stand at one end of the wire, and you can stand at the other. I pull charge off of the wire at my end, and you notice that the voltage of the whole wire has gone down. And then I add charge into the wire, and you notice that the voltage of the whole wire has gone up. So now we’re communicating. And this is how different transistors within a chip communicate with each other, right?
I don’t think electron radius is relevant in this story. And there are no “tiles”. And this is irreversible. (When we bring the whole wire from low voltage to high voltage or vice-versa, energy is irrecoverably dissipated.) And the length of the wire only matters insofar as that changes its capacitance, resistance, inductance, etc. There will be voltage fluctuations (that depend on the frequency band, characteristic impedance, and ohmic losses), but I believe that they’re negligibly small for our purposes (normal chips are sending maybe 600 mV signals through the interconnects, so based on ½CV² we should get 2 OOM lower interconnect losses by “merely” going to 60 mV, whereas the Johnson noise floor at 1GHz is <<1mV I think). The loss involved in switching the whole wire from high voltage to low voltage or vice versa is certainly going to be >>1eV.
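A small sketch of the scaling claim in this comment. The 20 fF wire capacitance is an assumed illustrative value (roughly 100 μm of on-chip wire), not a number from the thread:

```python
import math

C = 20e-15                    # F, assumed wire capacitance
for V in (0.6, 0.06):         # ~600 mV logic swing vs a hypothetical 60 mV swing
    print(V, 0.5 * C * V**2)  # 3.6e-15 J vs 3.6e-17 J: 2 OOM lower loss at 60 mV

k_B, T, Z0, f = 1.38e-23, 300.0, 50.0, 1e9
print(math.sqrt(k_B * T * Z0 * f))   # Johnson noise ~1.4e-5 V, far below even 60 mV
```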
I’m still not sure where you disagree with my points 1-5, but I’m guessing 3?
The relevance of 3 is that your wire is made of molecules with electron orbitals, each of which is a computer subject to the Landauer analysis. To send a bit very reliably across just one electron-radius length of wire requires about 1 eV (not exactly, but using the equations). So for a minimal nanowire of single electron width that corresponds to 1 V, but a wider wire can naturally represent a bit using more electrons and a lower voltage.
Either way, every individual molecule/electron-radius length of the wire is a computer tile which must either 1.) copy a bit and thus erase a bit at a cost on order 1 eV, or 2.) move a bit without erasure, but thus accumulate noise a la Taati et al.
So if we plug in those equations it near exactly agrees with the spherical cow wire model of nanowires, and you get about 81 fJ/mm.
The only way to greatly improve on this is to increase the interaction distance (and thus tile size), which requires the electrons to move a much larger distance before interacting in the relay chain. That doesn't seem very feasible for conventional wires made of a dense crystal lattice, but obviously is possible for non-relay-based interconnect like photonics (with its size disadvantage).
So in short, at the nanoscale it’s better to model interconnect as molecular computers, not macro wires. Do you believe Cavin/Zhirnov are incorrect?
Specifically the tile model[1], and also more generally the claim that adiabatic interconnect basically doesn’t work at the nanolevel for conventional computers due to noise accumulation[2], agreeing with Taati:
[1] Science and Engineering beyond Moore's Law
[2] Energy Barriers, Demons, and Minimum Energy Operation of Electronic Devices
Here’s a toy model. There’s a vacuum-gap coax of length L. The inside is a solid cylindrical wire of diameter D and resistivity ρ. The outside is grounded, and has diameter Dₒ=10×D. I stand at one end and you stand at the other end. The inside starts out at ground. Your end is electrically isolated (open-circuit). If I want to communicate the bit “1” to you, then I raise the voltage at my end to V=+10mV, otherwise I lower the voltage at my end to V=–10mV.
On my end, the energy I need to spend is:
½CV² = πε₀V²L/ln(Dₒ/D) = L × 0.0012 fJ/mm
On your end, you’re just measuring a voltage so the required energy is zero in principle.
The resistivity ρ and diameter D don’t enter this equation, as it turns out, although they do affect the timing. If D is as small as 1nm, that’s fine, as long as the wire continues to be electrically conductive (i.e. satisfy ohm’s law).
Anyway, I have now communicated 1 bit to you with 60,000× less energy expenditure than your supposed limit of 81 fJ/mm. But I don’t see anything going wrong here. Do you? Like, what law of physics or assumption am I violating here?
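A sketch of the toy model's arithmetic (same geometry and ±10 mV swing as above):

```python
import math

eps0 = 8.85e-12                                  # F/m
V = 10e-3                                        # V, the +/-10 mV swing
ratio = 10.0                                     # Do / D

C_per_m = 2 * math.pi * eps0 / math.log(ratio)   # coax capacitance per meter
E_per_mm = 0.5 * C_per_m * V**2 * 1e-3           # 1/2 C V^2 per mm of line

print(E_per_mm * 1e15)   # ~0.0012 fJ/mm, roughly 60,000x below the 81 fJ/mm figure
```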
I don’t think it’s relevant, but for what it’s worth, 1nm³ of copper contains 90 conduction electrons.
This may be obvious—but this fails to predict the actual wire energy, whereas my preferred model does. So if this model is correct—why does it completely fail to predict interconnect wire energy despite an entire engineering industry optimizing such parameters? Where do you believe the error is?
My first guess is perhaps you are failing to account for the complex error/noise buildup per unit length of wire. A bit is an approximation of a probability distribution. So you start out with a waveform on one end of the wire which minimally can represent 1 bit against noise (well maybe not even that—your starting voltage seems unrealistic), but then it quickly degrades to something which can not.
Actually, looking back at the old thread, I believe you are incorrect that 10 mV is realistic for anything near a nanowire. You need to increase your voltage by 100x or use an enormous number of charge carriers, which isn't possible for a nanowire (and is just a different way to arrive at 1 eV per computational relay bit).
And in terms of larger wires, my model from brain efficiency actually comes pretty close to predicting actual wire energy for large copper wires—see this comment.
Kwa estimates 5e-21 J/nm, which is only 2x the lower Landauer bound and corresponds to a ~75% bit probability (although the uncertainty in these estimates is probably around 2x itself). My explanation is that such very low bit energies approaching the lower Landauer limit are possible but only with complex error correction—which is exactly what Ethernet/InfiniBand cards are doing. But obviously not viable for nanoscale interconnect.
Or put another way—why do you believe that Cavin/Zhirnov are incorrect?
The easiest way to actuate an electronic switch is to use a voltage around 20·kT/q ≈ 500 mV (where 20 is to get way above the noise floor).
The most efficient way to send information down a wire is to use a voltage around 20·√(kT·Z₀·f) ≈ 0.3 mV (where 20 is to get way above the noise floor and Z₀ is the wire’s characteristic impedance, which is kinda-inevitably somewhat lower than the 377 Ω impedance of free space, typically 50-100 Ω in practice).
So there’s a giant (>3 OOM) mismatch.
The easy way to deal with that giant mismatch is to ignore it. Just use the same 500mV voltage for both the switches and the wires, even though that entails wasting tons and tons of power unnecessarily in the latter—specifically 6.5 orders of magnitude more interconnect losses than if the voltage were tailored to the wire properties.
The hard way to deal with that giant mismatch is to make billions of nano-sized weird stacks of piezoelectric blocks so that each transistor gate has its own little step-up voltage-converter, or other funny things like that as in my top comment.
But people aren’t doing it the “hard way”, they’re doing it the “easy way”, and always have been.
Given that this is in fact the strategy, we can start doing fermi estimates about interconnect losses. We have V ≈ 20·kT/q, C ≈ ε₀ × L (where L = typical device dimension), and if we ask how much loss there is in a “square tile” it would be ½CV²/L ≈ 200·(kT/q)²·ε₀ = 1.2e-21 J/nm, which isn’t wildly far from Kwa’s estimate that you cite.
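The same Fermi estimates as a quick script (room temperature, rough figures as above):

```python
import math

k_B, T, q = 1.38e-23, 300.0, 1.6e-19
eps0 = 8.85e-12
Z0, f = 50.0, 1e9                            # ohms, Hz

V_switch = 20 * k_B * T / q                  # ~0.5 V to actuate a switch
V_wire = 20 * math.sqrt(k_B * T * Z0 * f)    # ~0.3 mV to signal on the wire itself
E_per_nm = 200 * (k_B * T / q)**2 * eps0 * 1e-9   # "easy way" loss in J/nm:
                                                  # (1/2)CV^2/L with V = 20kT/q, C ~ eps0*L

print(V_switch, V_wire * 1e3, E_per_nm)      # ~0.52 V, ~0.29 mV, ~1.2e-21 J/nm
```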
So in summary, I claim that Kwa gets reasonable numbers (compared to actual devices) by implicitly / effectively assuming somewhere-or-other that wire voltage is high enough to also simultaneously be adequate for a transistor gate voltage, even though such a high voltage is not remotely necessary for the wire to function well as a wire. Maybe he thinks otherwise, and if he does, I think he’s wrong. ¯\_(ツ)_/¯
To be clear, Kwa did not provide a model—so estimate is not really the right word. He provided a link to the actual wire consumption of some current coaxial ethernet, did the math, and got 5e-21 J/nm, which is near the lower bound I predicted based on the Landauer analysis—which only works using sophisticated error correction codes (which require entire chips). You obviously can’t use a whole CPU or ASIC for error correction for every little nanowire interconnect, so interconnect wires need to be closer to the 1 eV/nm wire energy to have reliability. So your most efficient model could approach the 2e-21 J/nm level, but only using some big bulky mechanism—if not error correction coding then perhaps the billions of piezoelectric blocks.
Now you could believe that I had already looked up all those values and knew that, but actually I did not. I did of course test the Landauer model on a few examples, and then just wrote it in as it seemed to work.
So I predict that getting below the 2e−21J/nm limit at room temp is impossible for irreversible electronic relay based communication (systems that send signals relayed through electrons on dense crystal lattices).
If you want to know the noise in a wire, you pull out your EE 101 textbook and you get formulas like V_noise,rms ≈ √(kT·Z₀·f) = 0.015 mV (@ 1 GHz & 50 Ω), where Z₀ is the wire’s characteristic impedance and f is the frequency bandwidth. (Assuming the wire has a low-impedance [voltage source] termination on at least one side, as expected in this context.) Right? (I might be omitting a factor of 2 or 4? Hmm, actually I’m a bit unsure about various details here. Maybe in practice the noise would be similar to the voltage source noise, which could be even lower. But OTOH there are other noise sources like cross-talk.) The number of charge carriers is not part of this equation, and neither is the wire diameter. If we connect one end of the wire to a +10mV versus −10mV source, that’s 1000× higher than the wire’s voltage noise, even averaging over as short as a nanosecond, so error correction is unnecessary, right?
I feel like your appeal to “big bulky mechanism” is special pleading. I don’t think Landauer’s analysis concluded “…therefore there is an inevitable energy dissipation of kT per bit erasure, oh unless you have a big bulky mechanism involving lots and lots of electrons, in which case energy dissipation can be as low as you like”. Right? Or if there’s a formula describing how “Landauer’s limit for interconnects” gets progressively weaker as the wire gets bigger, then what’s that formula? And why isn’t a 1nm-diameter wire already enough to get to the supposed large-wire-limit, given that copper has 90 conduction electrons per nm³?
Hmm, I think I should get back to my actual job now. You’re welcome to reply, and maybe other people will jump in with opinions. Thanks for the interesting discussion! :)
This is frustrating for me as I have already laid out my core claims and you haven’t clarified which (if any) you disagree with. Perhaps you are uncertain—that’s fine, and I can kind of guess based on your arguments, but it still means we are talking past each more than I’d prefer.
It doesn’t matter whether you use 10 mV or 0.015 mV as in your example above, as the Landauer analysis bounds the energy of a bit, not the voltage. For high reliability interconnect you need ~1 eV, which could be achieved in theory by one electron at one volt naturally, but using 10 mV would require ~100 electron charges and 0.015 mV would require almost 1e5 electron charges, the latter of which doesn’t seem viable for nanowire interconnect, and doesn’t change the energy per bit requirements regardless.
The wire must use ~1eV to represent and transmit one bit (for high reliability interconnect) to the receiving device across the wire exit surface, regardless of the wire width.
Now we notice that we can divide the wire in half, and the first half is also a wire which must transmit to the 2nd half, so now we know it must use at least 2eV to transmit a bit across both sections, each of which we can subdivide again, resulting in 4eV .. and so on until you naturally bottom out at the minimal wire length of one electron radius.
Agreed—this site was designed to nerdsnipe us away from creating AGI ;)
Gah, against my better judgment I’m gonna carry on for at least one more reply.
I think it’s wrong to think of a wire as being divided into a bunch of tiles each of which should be treated like a separate bit.
Back to the basic Landauer analysis: Why does a bit-copy operation require kT of energy dissipation? Because we go from four configurations (00,01,10,11) to two (00,11). Thermodynamics says we can’t reduce the number of microstates overall, so if the number of possible chip states goes down, we need to make up for it by increasing the temperature (and hence number of occupied microstates) elsewhere in the environment, i.e. we need to dissipate energy / dump heat.
OK, now consider a situation where we’re transferring information by raising or lowering the voltage on a wire. Define V(X) = voltage of the wire at location X and V(X+1nm) = voltage of the wire at location X+1nm (or whatever the supposed “tile size” is). As it turns out, under practical conditions and at the level of accuracy that matters, V(X) = V(X+1nm) always. No surprise—wires are conductors, and conductors oppose voltage gradients. There was never a time when we went from more microstates to fewer microstates, because there was never a time when V(X) ≠ V(X+1nm) in the first place. They are yoked together, always equal to each other. They are one bit, not two. For example, we don’t need an energy barrier preventing V(X) from contaminating the state of V(X+1nm) or whatever; in fact, that’s exactly the opposite of what we want.
(Nitpicky side note: I’m assuming that, when we switch the wire voltage between low and high, we do so by ramping it very gradually compared to (1nm / speed of light). This will obviously be the case in practice. Then V(X) = V(X+1nm) even during the transient as the wire voltage switches.)
The thing you’re proposing is, to my ears, kinda like saying that the voltage of each individual atom within a single RAM capacitor plate is 1 bit, and it just so happens that all those “bits” within a single capacitor plate are equal to each other at any given time, and since there’s billions of atoms on the one capacitor plate it must take billions of dissipative copy operations to every time that we flip that one RAM bit.
I’m confident that I can walk through any of the steps to get from the standard model of particle physics, to Bloch waves and electron scattering, to the drift-diffusion equation and then ohm’s law, and to the telegrapher’s equations, and to Johnson noise and all the other textbook formulas for voltage noise on wires. (Note that I kinda mangled my discussions of voltage noise above, in various ways; I’m happy to elaborate but I don’t think that’s a crux here.)
Whereas “wires should be modeled as a series of discrete tiles with dissipative copy operations between them” is not derivable from fundamental physics, I claim. In particular, I don’t think there is any first-principles story behind your assertion that “the natural tile size is the electron radius”. I think it’s telling that “electron radius” is not a thing that I recall ever being mentioned in discussions of electrical conduction, including numerous courses that I’ve taken and textbooks that I’ve read in solid-state physics, semiconductor physics, nanofabrication, and electronics. Honestly I’m not even sure what you mean by “electron radius” in the first place.
Why? Does not each minimal length of wire need to represent and transmit a bit? Does the Landauer principle somehow not apply at the micro or nanoscale?
It is not the case that the wire represents a single bit, stretched out across the length of the wire, as I believe you will agree. Each individual section of wire stores and transmits different individual bits in the sequence chain at each moment in time, such that the number of bits on the wire is a function of length.
Only if the wire is perfectly insulated from the external environment—which crucially perhaps is our crux. If the wire is in a noisy conventional environment, it accumulates noise on the Landauer scale at each nanoscale transmission step, and at the minimal Landauer bit energy scale this noise rapidly collapses the bit representation (decays to noise) exponentially quickly, unless erased (because the Landauer energy scale is defined as the minimal bit energy reasonably distinguishable from noise, so it has no room for more error).
I don’t believe this is true in practice as again any conventional system is not perfectly reversible unless (unrealistically) there is no noise coupling.
I’m not sure how you got that? There are many ways to represent a bit, and for electronic relay systems the bit representation is distributed over some small fraction of the electrons moving between outer orbitals. The bit representation is a design constraint in terms of a conceptual partition of microstates, and as I already stated earlier you can represent a tiny landauer energy bit using partitions of almost unlimited number of atoms and their microstates (at least cross sectionally for an interconnect wire, but for density reasons the wires need be thin).
I sometimes use single electron examples, as those are relevant for nanoscale interconnect, and nanoscale computational models end up being molecule sized cellular automata where bits are represented by few electron gaps (but obviously not all electrons participate).
Do you not believe that wires can be modeled as smaller units, recursively down to the level of atoms?
And I clearly do not believe that wires are somehow only capable of dissipative copy operations in theory. In theory they are perfectly capable of non-dissipative reversible move operations, but in practice that has 1.) never been successfully achieved in any conventional practical use that I am aware of, and 2.) is probably impossible in practical use without exotic noise isolation, given the terminal rapid noise buildup problems I mentioned (I have some relevant refs in earlier comments).
The Landauer principle doesn't suddenly stop applying at the nanoscale; it bounds atoms and electrons at all sizes and scales. The wire equations are just abstractions; the reality at nanoscale should be better modeled by a detailed nanoscale cellular automata. By "electron radius" I meant the de Broglie wavelength, which I'm using as a reasonable but admittedly vagueish guess for the interaction distance (the smallest distance scale at which we can model it as a cellular automata switching between distinct bit states, which I admit is not a concept I can yet tightly define, but I derive that concept from studies of the absolute minimum spacing between compute elements due to QM electron de Broglie wavelength effects, and I expect it's close to the directional mean free path length but haven't checked), so for an interconnect wire I used ~1.23 nm at 1 volt, from this thread:
Naturally it's not a fixed quantity, as it depends on the electron energy and thus voltage, the thermal noise, etc., but it doesn't seem like that can make a huge difference for room temp conventional wires. (This page estimates a wavelength of 8 angstrom or 0.8 nm for typical metals, so fairly close.) I admit that my assertion that the natural interaction length (and thus cellular automata scale) is the electron de Broglie wavelength seems ad hoc, but I believe it is justifiable and very much seems to make the right predictions so far.
But in that sense I should reassert that my model applies most directly only to any device which conveys bits relayed through electrons exchanging orbitals, as that is the generalized electronic cellular automata model, and wires should not be able to beat that bound. But if there is some way to make the interaction distance much much larger—for example via electrons moving ballistically OOM greater than the ~1 nm atomic scale before interacting, then the model will break down.
So what would cause you to update?
For me, I will update immediately if someone can find a single example of a conventional wire communication device (room temp etc) which has been measured to transmit information using energy confidently less than 2e−21 J/bit/nm. In your model this doesn’t seem super hard to build.
The mean free path of conduction electrons in copper at room temperature is ~40 nm. Cold pure metals can have much greater mean free paths. Also, a copper atom is ~0.1 nm, not ~1 nm.
I guess we could buy a 30-meter cat8 ethernet cable, send 40Gbps of data through it, coil up the cable very far away from both the transmitter and the receiver, and put that coil into a thermally-insulated box (or ideally, a calorimeter), and see if the heat getting dumped off the cable is less than 2.4 watts, right? I think that 2.4 watts is enough to be pretty noticeable without special equipment.
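For reference, the 2.4 W figure is just the disputed bound applied to this setup:

```python
bitrate = 40e9           # bit/s
length_nm = 30 * 1e9     # 30 m expressed in nm
bound = 2e-21            # J/bit/nm, the limit under test
print(bitrate * length_nm * bound)   # = 2.4 W of heat if the cable sat right at the bound
```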
My expectation is… Well, I’m a bit concerned that I’m misunderstanding ethernet specs, but it seems that there are 4 twisted pairs with 75Ω characteristic impedance, and the voltage levels go up to ±1V. That would amount to a power flow of up to 4V²/Z=0.05W. The amount dissipated within the 30-meter cable is of course much less than that, or else there would be nothing left for the receiver to measure. So my prediction for the thermally-insulated box experiment above is “the heat getting dumped off the ethernet cable will be well under 0.05W (unless I’m misunderstanding the ethernet specs)”.
(Update: I struck through the intensifiers “much” and “well” in the previous paragraph. Maybe they’re justified, but I’m not 100% sure and they’re unnecessary for my point anyway. See bhauth’s reply below.)
I can easily imagine being convinced by a discussion that talks about wires in the way that I consider “normal”, like if we’re interested in voltage noise then we use the Johnson noise formula (or shot noise or crosstalk noise or whatever it is), or if we’re interested in the spatial profile of the waves then we use the telegrapher’s equations and talk about wavelength, etc.
For example, you wrote “it accumulates noise on the landauer scale at each nanoscale transmission step, and at the minimal landauer bit energy scale this noise rapidly collapses the bit representation (decays to noise) exponentially quickly”. I think if this were a real phenomenon, we should be able to equivalently describe that phenomenon using the formulas for electrical noise that I can find in the noise chapter of my electronics textbook. People have been sending binary information over wires since 1840, right? I don’t buy that there are important formulas related to electrical noise that are not captured by the textbook formulas. It’s an extremely mature field. I once read a whole textbook on transistor noise, it just went on and on about every imaginable effect.
As another example, you wrote:
Again, I want to use conventional wire formulas here. Let’s say:
It takes 0.1 nanosecond for the voltage to swing from low to high (thanks to the transistor’s own capacitance for example)
The interconnect has a transmission line signal velocity comparable to the speed of light
We’re talking about a 100μm-long interconnect.
Then you can do the math: the entire interconnect will be for all intents and purposes at a uniform voltage throughout the entire voltage-switching process. If you look at a graph of the voltage as a function of position, it will look like a flat horizontal line at each moment, and that horizontal line will smoothly move up or down over the course of the 0.1 ns swing. It won’t look like a propagating wave.
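Doing that math explicitly (a minimal sketch; the signal velocity is taken to be c for simplicity, per the assumption above):

```python
c = 3e8                    # assumed signal velocity, m/s
length = 100e-6            # 100 micrometer interconnect, m
swing_time = 0.1e-9        # 0.1 ns voltage swing, s

transit_time = length / c                 # time for a wavefront to cross the interconnect
print(transit_time)                       # ~3.3e-13 s, i.e. ~0.3 ps
print(swing_time / transit_time)          # ~300: the swing is ~300x slower than the transit
```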
As a meta-commentary, you can see what’s happening here—I don’t think the thermal de Broglie wavelength is at all relevant in this context, nor the mean free path, and instead I’m trying to shift discussion to “how wires work”.
One of the weird things in this discussion from my perspective is that you’re OK with photons carrying information with less than 2e-21 J/bit/nm energy dissipation but you’re not OK with wires carrying information with less than 2e-21 J/bit/nm energy dissipation. But they’re not so different in my perspective—both of those things are fundamentally electromagnetic waves traveling down transmission lines. Obviously the frequency is different and the electromagnetic mode profile is different, but I don’t see how those are relevant.
This is the crux of it. I made the same comment here before seeing this comment chain.
Also a valid point. @jacob_cannell is making a strong claim: that the energy lost by communicating a bit is the same scale as the energy lost by all other means, by arbitrarily dividing by 1 nm so that the units can be compared. If this were the case, then we would have known about it for a hundred years. Instead, it is extremely difficult to measure the extremely tiny amounts of heat that are actually generated by deleting a bit, such that it’s only been done within the last decade.
This arbitrary choice leads to a dramatically overestimated heat cost of computation, and it ruins the rest of the analysis.
@Alexander Gietelink Oldenziel, for whatever it is worth, I, a physicist working in nanoelectronics, recommend @Steven Byrnes for the $250. (Although, EY’s “it’s wrong because it’s obviously physically wrong” is also correct. You don’t need to dig into details to show that a perpetual motion machine is wrong. You can assert it outright.)
For what it’s worth, I think both sides of this debate appear strangely overconfident in claims that seem quite nontrivial to me. When even properly interpreting the Landauer bound is challenging due to a lack of good understanding of the foundations of thermodynamics, it seems like you should be keeping a more open mind before seeing experimental results.
At this point, I think the remarkable agreement between the wire energies calculated by Jacob and the actual wire energies reported in the literature is too good to be a coincidence. However, I suspect the agreement might be the result of some dimensional analysis magic as opposed to his model actually being good. I’ve been suspicious of the de Broglie wavelength-sized tile model of a wire since the moment I first saw it, but it’s possible that there’s some other fundamental length scale that just so happens to be around 1 nm and therefore makes the formulas work out.
The Landauer limit was first proposed in 1961, so the fact that people have been sending binary information over wires since 1840 seems to be irrelevant in this context.
1 nm is somewhat arbitrary, but around that scale is a sensible estimate for minimal single-electron device spacing a la Cavin/Zhirnov. If you haven't actually read those refs you should, as they justify that scale and the tile model.
This is just false, unless you are claiming you have found some error in the Cavin/Zhirnov papers. It's also false in the sense that the model makes reasonable predictions. I'll just finish my follow-up post, but using the mean free path as the approximate scale does make sense for larger wires and leads to fairly good predictions for a wide variety of wires, from on-chip interconnect to coax cable Ethernet to axon signal conduction.
They use this model to figure out how to pack devices within a given area and estimate their heat loss. It is true that heating of a wire is best described with a resistivity (or parasitic capacitance) that scales as 1/L. If you want to build a model out of tiles, each of which is a few nm on a side (because the FETs are roughly that size), then you are perfectly allowed to do so. IMO the model is a little oversimplified to be particularly useful, but it’s physically reasonable at least.
No, the papers are fine. They don’t say what you think they say. They are describing ordinary resistive losses and such. In order to compare different types of interconnects running at different bitrates, they put these losses in units of energy/bit/nm. This has no relation to Landauer’s principle.
Resistive heat loss in a wire is fundamentally different than heat loss from Landauer’s principle. I can communicate 0 bits of information across a wire while losing tons of energy to resistive heat, by just flowing a large constant current through it.
As pointed out by Steven Byrnes, your model predicts excess heat loss in a well-understood system. In my linked comment, I pointed out another way that it makes wrong predictions.
Of course—as I pointed out in my reply here.
False. I never at any point modeled the resistive heat/power loss for flowing current through a wire sans communication. It was Byrnes who calculated the resistive loss for a coax cable, and got a somewhat wrong result (for wire communication bit energy cost), whereas the tile model (using mean free path for larger wires) somehow outputs the correct values for actual coax cable communication energy use as shown here.
Please respond to the meat of the argument.
Resistive heat loss is not the same as heat loss from Landauer’s principle. (you agree!)
The Landauer limit is an energy loss per bit flip, with units energy/bit. This is the thermodynamic minimum (with irreversible computing). It is extremely small and difficult to measure. It is unphysical to divide it by 1 nm to model an interconnect, because signals do not propagate through wires by hopping from electron to electron.
The Cavin/Zhirnov paper you cite does not concern the Landauer principle. It models ordinary dissipative interconnects. Due to a wide array of engineering optimizations, these elements tend to have similar energy loss per bit per mm; however, this is not a fundamental constraint. This number can be basically arbitrarily changed by multiple orders of magnitude.
You claim that your modified Landauer energy matches the Cavin/Zhirnov numbers, but this is a nonsense comparison because they are different things. One can be varied by orders of magnitude while the other cannot. Because they are different heat sources, their heat losses add.
We have known how wires work for a very long time. There is a thorough and mature field of physics regarding heat and information transport in wires. If we were off by a factor of 2 in heat loss (what you are claiming, possibly without knowing so) then we would have known it long ago. The Landauer principle would not be a very esoteric idea at the fringes of computation and physics, it would be front and center necessary to understand heat dissipation in wires. It would have been measured a hundred years ago.
I’m not going to repeat this again. If you ignore the argument again then I will assume bad faith and quit the conversation.
I'm really not sure what your argument is if this is the meat, and moreover don't really feel morally obligated to respond given that you have not yet acknowledged that my model already made roughly correct predictions and that Byrnes's model of wire heating under passive current load is way off theoretically and practically. Interconnect wire energy comes from charging and discharging ½CV² capacitance energy, not resistive loss for passive constant (unmodulated) current flow.
The Landauer limit connects energy to the probability of state transitions, and is more general than erasure. Reversible computations still require energies that are multiples of this bound for reliability. It is completely irrelevant how signals propagate through the medium (whether by charging wire capacitance as in RC interconnect, through changes in drift velocity, phonons, or whatever). As long as the medium has thermal noise, the Landauer/Boltzmann relationship applies.
Cavin/Zhirnov absolutely cite and use the Landauer principle for bit energy.
I make no such claim, as I'm not using a "modified Landauer energy".
I’m not making any claims of novel physics or anything that disagrees with known wire equations.
Comments like this suggest you don't have a good model of my model. The actual power usage of real devices is a known hard fact, and coax cable communication devices have measured power usage within the range my model predicted; that is a fact. You can obviously use the wire equations (correctly) to precisely model that power use (or heat loss)! But I am more concerned with the higher-level question of why both human engineering and biology, two very separate long-running optimization processes, converged on essentially the same wire bit energy.
Ok, I will disengage. I don’t think there is a plausible way for me to convince you that your model is unphysical.
I know that you disagree with what I am saying, but from my perspective, yours is a crackpot theory. I typically avoid arguing with crackpots, because the arguments always proceed basically how this one did. However, because of apparent interest from others, as well as the fact that nanoelectronics is literally my field of study, I engaged. In this case, it was a mistake.
Sorry for wasting our time.
Dear spxtr,
Things got heated here. I and many others are grateful for your effort to share your expertise. Is there a way in which you would feel comfortable continuing to engage?
Remember that for the purposes of the prize pool there is no need to convince Cannell that you are right. In fact I will not judge veracity at all, just contribution to the debate (on which metric you're doing great!)
Dear Jake,
This is the second person in this thread who has explicitly signalled the need to disengage. I also realize this is a charged topic, and it's easy for it to get heated when you're just honestly trying to engage.
Best, Alexander
Hi Alexander,
I would be happy to discuss the physics related to the topic with others. I don’t want to keep repeating the same argument endlessly, however.
Note that it appears that EY had a similar experience of repeatedly not having their point addressed:
Then, after a reply:
Then, after another reply:
Then, nothing more (that I saw, but I might have missed comments. this is a popular thread!).
:), spxtr
If this is your field but you don't have the appetite for pedagogy when someone from another field has strong opinions, which is emotionally understandable, I'm curious what learning material you'd recommend working through to find your claims obvious. Is a whole degree needed? Are there individual textbooks or classes, or even individual lectures?
It depends on your background in physics.
For the theory of sending information across wires, I don’t think there is any better source than Shannon’s “A Mathematical Theory of Communication.”
I’m not aware of any self-contained sources that are enough to understand the physics of electronics. You need to have a very solid grasp of E&M, the basics of solid state, and at least a small amount of QM. These subjects can be pretty unintuitive. As an example of the nuance even in classical E&M, and an explanation of why I keep insisting that “signals do not propagate in wires by hopping from electron to electron,” see this youtube video.
You don’t actually need all of that in order to argue that the brain cannot be efficient from a thermodynamic perspective. EY does not understand the intricacies of nanoelectronics (probably), but he correctly stated that the final result from the original post cannot be correct, because obviously you can imagine a computation machine that is more thermodynamically efficient than pumping tens of thousands of ions across membranes and back. This intuition probably comes from some thermodynamics or statistical mechanics books.
What is the most insightful textbook about nanoelectronics you know of, regardless of how difficult it may be?
Or for another question trying to get at the same thing: if only one book about nanoelectronics were to be preserved (but standard physics books would all be fine still), which one would you want it to be? (I would be happy with a pair of books too, if that’s an easier question to answer.)
I come more from the physics side and less from the EE side, so for me it would be Datta’s “Electronic Transport in Mesoscopic Systems”, assuming the standard solid state books survive (Kittel, Ashcroft & Mermin, L&L stat mech, etc). For something closer to EE, I would say “Principles of Semiconductor Devices” by Zeghbroeck because it is what I have used and it was good, but I know less about that landscape.
I strongly disapprove of your attitude in this thread. You haven’t provided any convincing explanation of what’s wrong with Jacob’s model beyond saying “it’s unphysical”.
I agree that the model is very suspicious and in some sense doesn't look like it should work, but at the same time, I think there's obviously more to the agreement between his numbers and the numbers in the literature than you're giving credit for. Your claim that there's no fundamental bound of the form energy/bit/length on information transmission through resistive materials (where the length scale could depend on the material in ways Jacob has already discussed) is unsupported and doesn't seem to rest on any serious analysis.
You can’t blame Jacob for not engaging with your arguments because you haven’t made any arguments. You’ve just said that his model is unphysical, which I agree with and presumably he would also agree with to some extent. However, by itself, that’s not enough to show that there is no bound on information transmission which roughly has the form Jacob is talking about, and perhaps for reasons that are not too dissimilar from the ones he’s conjectured.
I could be wrong here, but I think the "well-understood" physics principles that spxtr is getting at are the Shannon-Hartley theorem and Johnson-Nyquist noise. My best guess at how one would use these to derive a relationship between power consumption, bit rate, and temperature is as follows:
The power of the Johnson-Nyquist noise goes as kTΔf, where Δf is the bandwidth. So we’re interpreting the units of kT as W/Hz. Interestingly, for power output, the resistance in the circuit is irrelevant. Larger resistance means more voltage noise and less current noise, but the overall power multiplies out to be the same.
Next, the Shannon-Hartley theorem says that the channel capacity is:
C = Δf · log₂(1 + P_signal / P_noise)

where C is the bitrate (units are bits per second), and P_signal, P_noise are the power levels of signal and noise. Then the energy cost to send a bit (we'll call it E_bit) is:

E_bit = P_signal / C

Based on Johnson-Nyquist, we have a noise level of kT·Δf, so overall the energy cost per bit should be:

E_bit = P_signal / (Δf · log₂(1 + P_signal / (kT·Δf)))

Define a dimensionless x = P_signal / (kT·Δf). Then we have:

E_bit = kT · x / log₂(1 + x)
Since x must be positive, the infimum of the dimensionless factor x/log₂(1 + x) is ln 2, approached as x → 0. So this gives a figure of kT·ln 2 per bit for the entire line, assuming resistance isn't too large. Interestingly, this is the same number as the Landauer limit itself, something I wasn't expecting when I started writing this.
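A quick numeric check of this (a minimal sketch; T = 300 K is assumed, everything else is just the formula above):

```python
import math

k_B, T = 1.381e-23, 300.0   # Boltzmann constant (J/K), room temperature (K)

def energy_per_bit(x):
    """E_bit = kT * x / log2(1 + x), with x = P_signal / (kT * bandwidth)."""
    return k_B * T * x / math.log2(1 + x)

for x in (10.0, 1.0, 0.01, 1e-6):
    print(x, energy_per_bit(x))

# As x -> 0 the energy per bit approaches kT * ln(2), the Landauer value:
print(k_B * T * math.log(2))   # ~2.87e-21 J
```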
I think one reason your capacitor charging/discharging argument didn’t stop this number from coming out so small is that information can travel as pulses along the line that don’t have to charge and discharge the entire thing at once. They just have to contain enough energy to charge the local area they happen to be currently occupying.
The problem with this model is that it would apply equally as well regardless of how you’re transmitting information on an electromagnetic field, or for that matter, any field to which the equipartition theorem applies.
If your field looks like lots of uncoupled harmonic oscillators joined together once you take Fourier transforms, then each harmonic oscillator is a quadratic degree of freedom, and each picks up thermal noise on the order of ~ kT because of the equipartition theorem. Adding these together gives you Johnson noise in units of power. Shannon-Hartley is a mathematical theorem that has nothing to do with electromagnetism in particular, so it will also apply in full generality here.
You getting the bitwise Landauer limit as the optimum is completely unsurprising if you look at the ingredients that are going into your argument. We already know that we can beat Jacob’s wire energy bounds by using optical transmission, for example. The part your calculation fails to address is what happens if we attempt to drive this transmission by moving electrons around inside a wire made of an ordinary resistive material such as copper.
It seems to me that in this case we should expect a bound that has dimensions energy/bit/length and not energy/bit, and such a bound basically has to look like Jacob’s bound by dimensional analysis, modulo the length scale of 1 nm being correct.
Yeah, I agree that once you take into account resistance, you also get a length scale. But that characteristic length is going to be dependent on the exact geometry and resistance of your transmission line. I don’t think it’s really possible to say that there’s a fundamental constant of ~1nm that’s universally implied by thermodynamics, even if we confine ourselves to talking about signal transmission by moving electrons in a conductive material.
For example, take a look at this chart:
(source) At 1GHz, we can see that:
There’s a wide spread of possible levels of attenuation for different cable types. Note the log scale.
A typical level of attenuation is 10dB over 100 ft. If the old power requirement per bit was about kT, this new power requirement is about 10kT. Then presumably to send the signal another 100ft, we’d have to pay another 10kT. Call it 100kT to account for inefficiencies in the signal repeater. So this gives us a cost of 1kT per foot rather than 1kT per nanometer!
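Spelling out that arithmetic (a minimal sketch; the 10 dB per 100 ft attenuation and the ~100 kT repeater budget are the assumptions above):

```python
k_B, T = 1.381e-23, 300.0
kT = k_B * T

attenuation_dB = 10.0                        # per 100 ft, read off the chart
power_factor = 10 ** (attenuation_dB / 10)   # = 10: input must be 10x the received power
repeater_overhead = 10                       # rough extra factor for repeater inefficiency

cost_per_100ft = power_factor * repeater_overhead * kT   # ~100 kT per 100 ft
print(cost_per_100ft / kT / 100)               # ~1 kT per foot
print(cost_per_100ft / (100 * 0.3048 * 1e9))   # ~1.4e-29 J/bit/nm (1 ft = 0.3048 m)
```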
That linked article and graph seem to be talking about optical communication (waveguides), not electrical.
There's nothing fundamental about ~1 nm; it's just a reasonable rough guess of max tile density. For thicker interconnect it seems obviously suboptimal to communicate bits through maximally dense single-electron tiles.
But you could imagine single-electron devices with anisotropic interconnect tiles, where a single electron moves between two precise slots separated by some greater distance, and then ask what the practical limit on that separation distance is; it ends up being the mean free path.
The MFP also naturally determines material resistivity/conductivity.
So anisotropic tiles with a length scale around the mean free path are about the best one could expect from irreversible communication over electronic wires, and actual electronic signaling in resistive wires comes close to that bound, such that it is an excellent fit for actual wire energies. This makes sense, as we shouldn't expect random electron motion in wires to beat single-electron cellular automata that use precise electron placement.
The equations you are using here seem to be a better fit for communication in superconducting wires where reversible communication is possible.
Terminology: a waveguide has a single conductor (example: a box waveguide); a transmission line has two conductors (example: a coaxial cable).
Yes most of that page is discussing waveguides, but that chart (“Figure 5. Attenuation vs Frequency for a Variety of Coaxial Cables”) is talking about transmission lines, specifically coaxial cables. In some sense even sending a signal through a transmission line is unavoidably optical, since it involves the creation and propagation of electromagnetic fields. But that’s also kind of true of all electrical circuits.
Anyways, given that this attenuation chart should account for all the real-world resistance effects and it says that I only need to pay an extra factor of 10 in energy to send a 1GHz signal 100ft, what’s the additional physical effect that needs to be added to the model in order to get a nanometer length scale rather than a centimeter length scale?
See my reply here.
Using steady-state continuous power attenuation is incorrect for EM waves in a coax transmission line. It's the difference between the small power required to maintain drift velocity against frictional resistance and the larger energy required to accelerate electrons up to the drift velocity from zero for each bit sent.
In some sense none of this matters, because if you want to send a bit through a wire using minimal energy, and you aren't constrained much by wire thickness or the requirement of somewhat large encoder/decoder devices, you can just skip the electron middleman and use EM waves directly, i.e. optical.
I don't have any strong fundamental reason why you couldn't use reversible signaling through a wave propagating down a wire; it is just another form of wave, as you point out.
The Landauer bound still applies of course; it just determines the energy involved rather than dissipated. If the signaling mechanism is irreversible, then the best that can be achieved is on the order of ~1e-21 J/bit/nm (10x the Landauer bound for minimal reliability over a long wire, with a distance scale of about 10 nm from the mean free path of metals). Actual coax cable wire energy is right around that level, which suggests to me that it is irreversible for whatever reason.
I have a number floating around in my head. I’m not sure if it’s right, but I think that at GHz frequencies, electrons in typical wires are moving sub picometer distances (possibly even femtometers?) per clock cycle.
The underlying intuition is that electron charge is “high” in some sense, so that 1. adding or removing a small number of electrons corresponds to a huge amount of energy (remove 1% of electrons from an apple and it will destroy the Earth in its explosion!) and 2. moving the electrons in a metal by a tiny distance (sub picometer) can lead to large enough electric fields to transmit signals with high fidelity.
Feel free to check these numbers, as I’m just going by memory.
The end result is that we can transmit signals with high fidelity by moving electrons many orders of magnitude less distance than their mean free path, which means intuitively it can be done more or less loss-free. This is not a rigorous calculation, of course.
The absolute speed of conduction band electrons inside a typical wire should be around 1e6 m/s at room temperature. At GHz frequencies, the electrons are therefore moving distances comparable to 1 mm per clock cycle.
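A one-line check of that distance (a minimal sketch; the ~1e6 m/s electron speed and 1 GHz clock are the figures above):

```python
electron_speed = 1e6    # m/s, typical conduction-electron speed in a metal
clock = 1e9             # Hz
print(electron_speed / clock)   # ~1e-3 m, i.e. ~1 mm travelled per clock cycle
```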
If you look at the average velocity, i.e. the drift velocity, then that’s of course much slower and the electrons will be moving much more slowly in the wire—the distances you quote should be of the right order of magnitude in this case. But it’s not clear why the drift velocity of electrons is what matters here. By Maxwell, you only care about electron velocity on the average insofar as you’re concerned with the effects on the EM field, but actually, the electrons are moving much faster so could be colliding with a lot of random things and losing energy in the process. It’s this effect that has to be bounded, and I don’t think we can actually bound it by a naive calculation that assumes the classical Drude model or something like that.
If someone worked all of this out in a rigorous analysis I could be convinced, but your reasoning is too informal for me to really believe it.
Ah, I was definitely unclear in the previous comment. I’ll try to rephrase.
When you complete a circuit, say containing a battery, a wire, and a light bulb, a complicated dance has to happen for the light bulb to turn on. At near the speed of light, electric and magnetic fields around the wire carry energy to the light bulb. At the same time, the voltage throughout the wire establishes itself at the values you would expect from Ohm's law and Kirchhoff's rules and such. At the same time, electrons throughout the wire begin to feel a small force from an electric field pointing along the direction of the wire, even if the wire has bends and such. These fields and voltages, outside and inside the wire, are the result of a complicated, self-consistent arrangement of surface charges on the wire.
See this youtube video for a nice demonstration of a nonintuitive result of this process. The video cites this paper among others, which has a nice introduction and overview.
The key point is that establishing these surface charges and propagating the signal along the wire amounts to moving an extremely small amount of electric charge. In that youtube video he asserts without citation that the electrons move “the radius of a proton” (something like a femtometer) to set up these surface charges. I don’t think it’s always so little, but again I don’t remember where I got my number from. I can try to either look up numbers or calculate it myself if you’d like.
Signals (low vs high voltages, say) do not propagate through circuits by hopping from electron to electron within a wire. In a very real sense they do not even propagate through the wire, but through electric and magnetic fields around and within the wire. This broad statement is also true at high frequencies, although there the details become even more complicated.
To maybe belabor the point: to send a bit across a wire, we set the voltage at one side high or low. That voltage propagates across the wire via the song and dance I just described. It is the heat lost in propagating this voltage that we are interested in for computing the energy of sending the bit over, and this heat loss is typically extremely small, because the electrons barely have to move and so they lose very little energy to collisions.
I’m aware of all of this already, but as I said, there seems to be a fairly large gap between this kind of informal explanation of what happens and the actual wire energies that we seem to be able to achieve. Maybe I’m interpreting these energies in a wrong way and we could violate Jacob’s postulated bounds by taking an Ethernet cable and transmitting 40 Gbps of information at a long distance, but I doubt that would actually work.
I’m in a strange situation because while I agree with you that the tile model of a wire is unphysical and very strange, at the same time it seems to me intuitively that if you tried to violate Jacob’s bounds by many orders of magnitude, something would go wrong and you wouldn’t be able to do it. If someone presented a toy model which explained why in practice we can get wire energies down to a certain amount that is predicted by the model while in theory we could lower them by much more, I think that would be quite persuasive.
Ethernet cables are twisted pair and will probably never be able to go that fast. You can get above 10 GHz with rigid coax cables, although you still have significant attenuation.
Let’s compute heat loss in a 100 m LDF5-50A, which evidently has 10.9 dB/100 m attenuation at 5 GHz. This is very low in my experience, but it’s what they claim.
Say we put 1 W of signal power at 5 GHz in one side. Because of the 10.9 dB attenuation, we receive 94 mW out the other side, with 906 mW lost to heat.
The Shannon-Hartley theorem says that we can compute the capacity of the wire as C = B · log₂(1 + S/N), where B is the bandwidth, S is received signal power, and N is noise power.
Let’s assume Johnson noise. These cables are rated up to 100 C, so I’ll use that temperature, although it doesn’t make a big difference.
If I plug in 5 GHz for B, 94 mW for S, and k_B·(370 K)·(5 GHz) ≈ 2.5×10⁻¹¹ W for N, then I get a channel capacity of about 160 Gbit/s.
The heat lost is then (906 mW)/(160 Gbit/s)/(100 m) ≈ 0.05 fJ/bit/mm. Quite low compared to Jacob's ~10 fJ/mm "theoretical lower bound."
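Reproducing those numbers as a short script (a minimal sketch; the 1 W input, 94 mW received, 370 K, and 5 GHz bandwidth are the figures quoted above):

```python
import math

k_B = 1.381e-23     # J/K
T = 370.0           # K (cable rated to 100 C)
B = 5e9             # bandwidth, Hz
P_in = 1.0          # W of signal power in
P_out = 0.094       # W received after 10.9 dB attenuation over 100 m
length_mm = 100e3   # 100 m expressed in mm

N = k_B * T * B                    # Johnson noise power, ~2.5e-11 W
C = B * math.log2(1 + P_out / N)   # channel capacity, ~1.6e11 bit/s
heat = P_in - P_out                # ~0.906 W dissipated in the cable
print(heat / C / length_mm)        # ~5.7e-17 J/bit/mm, i.e. ~0.05-0.06 fJ/bit/mm
```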
One free parameter is the signal power. The heat loss over the cable is linear in the signal power, while the channel capacity is sublinear, so lowering the signal power reduces the energy cost per bit. It is 10 fJ/bit/mm at about 300 W of input power, quite a lot!
Another is noise power. I assumed Johnson noise, which may be a reasonable assumption for an isolated coax cable, but not for an interconnect on a CPU. Adding an order of magnitude or two to the noise power does not substantially change the final energy cost per bit (0.05 goes to 0.07), however I doubt even that covers the amount of noise in a CPU interconnect.
Similarly, raising the cable attenuation to 50 dB/100 m does not even double the heat loss per bit. Shannon’s theorem still allows a significant capacity. It’s just a question of whether or not the receiver can read such small signals.
The reason that typical interconnects in CPUs and the like tend to be in the realm of 10-100 fJ/bit/mm is because of a wide range of engineering constraints, not because there is a theoretical minimum. Feel free to check my numbers of course. I did this pretty quickly.
In the original article I discuss interconnect wire energy, not a “theoretical lower bound” for any wire energy communication method—and immediately point out reversible communication methods (optical, superconducting) that do not dissipate the wire energy.
Coax cable devices seem to use around 1 to 5 fJ/bit/mm at a few W of power, or a few OOM more than your model predicts here—so I’m curious what you think that discrepancy is, without necessarily disagreeing with the model.
I describe a simple model of wire bit energy for EM wave transmission in coax cable here which seems physically correct but also predicts a bit energy distance range somewhat below observed.
I can’t access the linked article, but an active cable is not simple to model because its listed power includes the active components. We are interested in the loss within the wire between the active components.
They write <1 W for every length of wire, so all you can say is <5 fJ/mm. You don't know how much less. They are likely writing <1 W for comparison to active wires that consume more than a watt. Also, these cables seem to have a powered transceiver built in on each end that multiplexes out the signal to four twisted-pair 10G lines.
Again, these have a powered transceiver on each end.
So for all of these, all we know is that the sum of the losses of the powered components and the wire itself are of order 1 fJ/mm. Edit: I would guess that probably the powered components have very low power draw (I would guess 10s of mW) and the majority of the loss is attenuation in the wire.
The numbers I gave essentially are the theoretical minimum energy loss per bit per mm of that particular cable at that particular signal power. It’s not surprising that multiple twisted pair cables do worse. They’ll have higher attenuation, lower bandwidth, the standard transceivers on either side require larger signals because they have cheaper DAC/ADCs, etc. Also, their error correction is not perfect, and they don’t make full use of their channel capacity. In return, the cables are cheap, flexible, standard, etc.
There’s nothing special about kT/1 nm.
I think this calculation is fairly convincing pending an answer from Jacob. You should have probably just put this calculation at the top of the thread, and then the back-and-forth would probably not have been necessary. The key parameter that is needed here is the estimate of a realistic attenuation rate for a coaxial cable, which was missing from DaemonicSigil’s original calculation that was purely information-theoretic.
As an additional note here, if we take the same setup you're using and treat the input power x as a free parameter, then the energy per bit per distance is given by
f(x) = 0.906x / (5·10¹⁴ · log₂(1 + 0.094x / (2.5·10⁻¹¹)))
in units of J/bit/mm. This does not have a global optimum for x>0 because it’s strictly increasing, but we can take a limit to get the theoretical lower bound
lim_{x→0} f(x) = 3.34·10⁻²⁵
which is much lower than what you calculated, though to achieve this you would be sending information very slowly—indeed, infinitely slowly in the limit of x→0.
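This limit is easy to check numerically (a minimal sketch; f(x) is exactly the expression above, with x the input power in watts):

```python
import math

def f(x):
    """Energy per bit per mm (J/bit/mm) for this cable at input power x (watts)."""
    return 0.906 * x / (5e14 * math.log2(1 + 0.094 * x / 2.5e-11))

for x in (1.0, 1e-6, 1e-12):
    print(x, f(x))

# Analytic limit as x -> 0:
print(0.906 * 2.5e-11 * math.log(2) / (5e14 * 0.094))   # ~3.34e-25 J/bit/mm
```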
I am skeptical that steady-state direct current flow attenuation is the entirety of the story (and indeed it seems to underestimate actual coax cable wire energy of ~1e-21 to 5e-21 J/bit/nm by a few OOM).
For coax cable the transmission is through a transverse (AC) wave that must accelerate a quantity of electrons linearly proportional to the length of the cable. These electrons rather rapidly dissipate this additional drift velocity energy through collisions (resistance), and the entirety of the wave energy is ultimately dissipated.
This seems different than sending continuous DC power through the wire where the electrons have a steady state drift velocity and the only energy required is that to maintain the drift velocity against resistance. For wave propagation the electrons are instead accelerated up from a drift velocity of zero for each bit sent. It’s the difference between the energy required to accelerate a car up to cruising speed and the power required to maintain that speed against friction.
If we take the bit energy to be E_b, then there is a natural EM wavelength given by E_b = hc/λ, so λ = hc/E_b, which works out to ~1 μm for ~1 eV. Notice that using a lower frequency / longer wavelength seems to allow one to arbitrarily decrease the bit energy distance scale, but it turns out this just increases the dissipative loss.
So an initial estimate of the characteristic bit energy distance scale here is ~1eV/bit/um or ~1e-22 J/bit/nm. But this is obviously an underestimate as it doesn’t yet include the effect of resistance (and skin effect) during wave propagation.
The bit energy of one wavelength is implemented through an electron peak drift velocity, on the order of E_b = ½·N_e·m_e·v_d², where N_e is the number of carrier electrons in a one-wavelength section of wire. The relaxation time τ, i.e. the mean time between thermal collisions, is τ ~ 4e-13 s given a room-temp thermal velocity of around ~1e5 m/s and the ~40 nm mean free path in copper. Meanwhile the inverse frequency, or timespan of one wavelength, is around 3e-14 s for an optical-frequency 1 eV wave, and ~1e-9 s for a more typical (much higher amplitude) gigahertz-frequency wave. So it would seem that resistance is quite significant on these timescales.
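A quick check of the relaxation-time estimate (a minimal sketch; the ~40 nm mean free path and ~1e5 m/s thermal velocity are the figures used above):

```python
mean_free_path = 40e-9   # m, conduction electrons in copper at room temp
v_thermal = 1e5          # m/s, the thermal velocity figure used above

tau = mean_free_path / v_thermal
print(tau)               # ~4e-13 s, the relaxation time quoted above
print(1e-9 / tau)        # a 1 GHz period (~1e-9 s) spans ~2500 relaxation times
```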
Very roughly, the gigahertz (1e-9 s period) wave requires about 5 OOM more energy per wavelength due to dissipation, which cancels out the 5 OOM larger distance scale. Each wavelength section loses about half of the invested energy every τ ~ 4e-13 seconds, so maintaining the bit energy E_b requires a rough input power of ~E_b/τ for f⁻¹ seconds, which cancels out the effect of the longer wavelength distance, resulting in a constant bit energy distance scale independent of wavelength/frequency (naturally there are many other complex effects that are wavelength/frequency dependent, but they can't improve the bit energy distance scale).
For a low frequency (long wavelength) with f⁻¹ >> τ:

E_b/d ≈ E_b · (f⁻¹/τ) / λ = E_b / (τ·f·λ)

λ = c/f

E_b/d ≈ E_b / (τ·f·λ) = E_b / (τ·c) ~ 1 eV / 10 μm ~ 1e-23 J/bit/nm
If you take the bit energy down to the minimal Landauer limit of ~0.01 eV, this ends up about equivalent to your lower limit, but I don't think that would realistically propagate.
A real wave propagation probably can't perfectly transfer the bit energy over longer distances and has other losses (dielectric loss, skin effect, etc.), so vaguely guesstimating around 100x loss would result in ~1e-21 J/bit/nm. The skin effect alone perhaps increases resistance by roughly 10x at gigahertz frequencies. Coax devices also seem constrained to use specific lower gigahertz frequencies and then boost the bitrate through analog encoding; for example, 10-bit analog increases bitrate by 10x at the same frequency but requires about 1024x more power, so that is 2 OOM less efficient per bit.
Notice that the basic energy distance scale of E_b/(τ·c) is derived from the mean free path, via the relaxation time τ = ℓ/v_n, where ℓ is the mean free path and v_n is the thermal noise velocity (around ~1e5 m/s for room-temp electrons).
Coax cable doesn't seem to have any fundamental advantage over waveguide optical, so I didn't consider it at all in brain efficiency. It requires wires of about the same width (several OOM larger than minimal nanoscale RC interconnect) and largish sending/receiving devices, as in optics/photonics.
Electrons are very light so the kinetic energy required to get them moving should not be significant in any non-contrived situation I think? The energy of the magnetic field produced by the current would tend to be much more of an important effect.
As for the rest of your comment, I’m not confident enough I understand the details of your argument be able to comment on it in detail. But from a high level view, any effect you’re talking about should be baked into the attenuation chart I linked in this comment. This is the advantage of empirically measured data. For example, the skin-effect (where high frequency AC current is conducted mostly in the surface of a conductor, so the effective resistance increases the higher the frequency of the signal) is already baked in. This effect is (one of the reasons) why there’s a positive slope in the attenuation chart. If your proposed effect is real, it might be contributing to that positive slope, but I don’t see how it could change the “1 kT per foot” calculation.
My current understanding is that the electric current energy transmits through electron drift velocity (and I believe that is the standard textbook understanding, although I admit I have some questions concerning the details). The magnetic field is just a component of the EM waves which propagate changes in electron KE between electrons (the EM waves implement the connections between masses in the equivalent mass-spring system).
I'm not sure how you got "1 kT per foot", but that seems roughly similar to the model upthread from spxtr that I am replying to, which got 0.05 fJ/bit/mm, or 5e-23 J/bit/nm. I attempted to derive an estimate from the lower-level physics thinking it might be different, but it ended up in the same range, and also off by the same 2 OOM vs. real data. But I mention that the skin effect could plausibly increase power by 10x in my lower-level model, as I didn't model it nor use measured attenuation values at all. The other OOM probably comes from analog SNR inefficiency.
The part of this that is somewhat odd at first is the exponential attenuation. That does show up in my low-level model, where any electron kinetic energy in the wire is dissipated by about 50% due to thermal collisions every τ ~ 4e-13 seconds (that is the important part from the mean free path / relaxation time). But that doesn't naturally lead to a linear bit energy distance scale unless that dissipated energy is somehow replaced/driven by the preceding section of waveform.
So if you sent E as a single large but infinitesimally short pulse down a wire of length D, the energy you get on the other side is E·2^(−D/α) for some attenuation length α that works out to about 0.1 mm or so (since it's τ·c), not meters. I believe if your chart showed attenuation in the 100 THz regime, on the scale of τ, it would be losing 50% per ~0.1 mm instead of per meter.
We know that resistance is linear, not exponential—which I think arises from long steady flow where every τ seconds half the electron kinetic energy is dissipated, but this total amount is linear with wire section length. The relaxation time τ then just determines what steady mean electron drift velocity (current flow) results from the dissipated energy.
So when the wave period f⁻¹ is much greater than τ, you still lose about half of the wave energy E every τ seconds, but that loss can be spread out over a much larger wavelength section (and indeed at gigahertz frequencies this model roughly predicts the correct 50% attenuation distance scale of ~10 m or so).
There are two types of energy associated with a current that we should distinguish. First, there's the power flowing through the circuit; then there's the energy associated with having current flowing in a wire at all. So if we're looking at a piece of extension cord that's powering a lightbulb, the power flowing through the circuit is what's making the lightbulb shine. This is governed by the equation P = IV. But there's also some energy associated with having current flowing in a wire at all. For example, you can work out what the magnetic field should be around a wire with a given amount of current flowing through it and calculate the energy stored in the magnetic field. (This energy is associated with the inductance of the wire.) Similarly, the kinetic energy associated with the electron drift velocity is also there just because the wire has current flowing through it. (This is typically a very small amount of energy.)
To see that these types have to be distinct, think about what happens when we double the voltage going into the extension cord and also double the resistance of the lightbulb it’s powering. Current stays the same, but with twice the voltage we now have twice the power flowing to the light bulb. Because current hasn’t changed, neither has the magnetic field around the wire, nor the drift velocity. So the energy associated with having a current flowing in this wire is unchanged, even though the power provided to the light bulb has doubled. The important thing about the drift velocity in the context of P=IV is that it moves charge. We can calculate the potential energy associated with a charge in a wire as E=qV, and then taking the time derivative gives the power equation. It’s true that drift velocity is also a velocity, and thus the charge carriers have kinetic energy too, but this is not the energy that powers the light bulb.
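To put rough numbers on how small that drift kinetic energy is (a minimal sketch with illustrative assumptions: 1 A through a 1 mm² copper wire, and copper's ~8.5e28 carriers/m³; none of these figures come from this thread):

```python
e = 1.602e-19        # elementary charge, C
m_e = 9.109e-31      # electron mass, kg
n = 8.5e28           # free-electron density of copper, 1/m^3 (assumed)
area = 1e-6          # wire cross-section: 1 mm^2 in m^2 (assumed)
current = 1.0        # A (assumed, illustrative)

v_drift = current / (n * area * e)                  # drift velocity
ke_per_meter = 0.5 * (n * area) * m_e * v_drift**2  # drift kinetic energy stored per meter
print(v_drift)        # ~7.3e-5 m/s
print(ke_per_meter)   # ~2e-16 J per meter of wire: negligible next to the watts delivered
```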
In terms of exponential attenuation, even DC through resistors gives exponential attenuation if you have a “transmission line” configuration of resistors that look like this:
So exponential attenuation doesn’t seem too unusual or surprising to me.
Indeed, the theoretical lower bound is very, very low.
The minimum is set by the sensor resolution and noise. A nice oscilloscope, for instance, will have, say, 12 bits of voltage resolution and something like 10 V full scale, so ~2 mV minimum voltage. If you measure across a 50 Ω load, then the minimum received power you can see is P = (2 mV)²/(50 Ω) ≈ 0.1 μW. This is an underestimate, but that's the idea.
This is the right idea, but in these circuits there are quite a few more noise sources than Johnson noise. So, it won’t be as straightforward to analyze, but you’ll still end up with essentially a relatively small (compared to L/nm) constant times kT.
Sure, information can travel that way in theory, but it doesn't work out in practice for dissipative resistive (i.e. non-superconducting) wires. Actual on-chip interconnect wires are 'RC wires' which do charge/discharge the entire wire to send a bit. They are like a pipe which allows electrons to flow from some source to a destination device, where that receiving device (transistor) is a capacitor which must be charged to a bit energy E_b >> k_B·T. The Johnson thermal noise on a capacitor is just the same Landauer/Boltzmann noise of E_n ≈ k_B·T. The wire geometry aspect ratio (width/length) determines the speed at which the destination capacitor can be charged up to the bit energy.
The only way for the RC wire to charge the distant receiver capacitor is by charging the entire wire, leading to the familiar RC wire capacitance energy, which is also very close to the Landauer tile model energy using the mean free path as the tile size (for the reasons I've articulated in various previous comments).
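As a rough illustration of that ½CV² charging energy (a minimal sketch; the ~0.2 fF/μm capacitance per length and 1 V swing are generic ballpark assumptions, not figures from this thread):

```python
cap_per_um = 0.2e-15   # F per micrometer of wire (assumed ballpark for on-chip interconnect)
v_swing = 1.0          # logic voltage swing, V (assumed)

energy_per_um = 0.5 * cap_per_um * v_swing**2   # 1/2 C V^2 charged per micrometer per bit
print(energy_per_um * 1e3)    # ~1e-13 J/bit/mm, i.e. ~100 fJ/bit/mm
print(energy_per_um * 1e-3)   # ~1e-19 J/bit/nm
```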
Yeah, to be clear I do agree that your model gives good empirical results for on-chip interconnect. (I haven’t checked the numbers myself, but I believe you that they match up well.) (Though I don’t necessarily buy that the 1nm number is related to atom spacing in copper or anything like that. It probably has more to do with the fact that scaling down a transmission line while keeping the geometry the same means that the capacitance per unit length is constant. The idea you mention in your other comment about it somehow falling out of the mean free path also seems somewhat plausible.)
Anyway, I don't think my argument would apply to chip interconnect. At 1 GHz, the wavelength is going to be about a foot, which is still much longer than any interconnect on the microchip. And we're trying to send a single bit along the line using a DC voltage level, rather than some kind of fancy signal wave. So your argument about charging and discharging the entire line should still apply in this case. My comment would mostly apply to Steven Byrnes's ethernet cable example, rather than microchip interconnect.
Signals decay exponentially and dissipation with copper cables can be ~50dB. At high frequencies, most of the power is lost.
Sure, I guess the “much less” was a guess; I should have just said “less” out of an abundance of caution.
Before writing that comment, I had actually looked for a dB/meter versus frequency plot for cat8 Ethernet cable and couldn’t find any. Do you have a ref? It’s not important for this conversation, I’m just curious. :)
The 'tile' or cellular automata wire model fits both on-chip copper interconnect wire energy and brain axon wire energy very well. It is more obvious why it fits axon signal conduction, as that isn't really a traditional voltage propagation in a wire; it's a propagation of ion cellular automata state changes. I'm working on a better writeup and I'll look into how the wire equations could relate. If you have some relevant link on the physical limits of communication over standard electrical wires, that would of course be very interesting/relevant.
I'm guessing this is probably the correct equation for the resistive loss, but irreversible communication requires doing something dumb like charging and discharging/dissipating ½CV² (or equivalent) every clock cycle, which is OOM greater than the resistive loss (which would be appropriate for a steady current flow).
Do you have a link to the specs you were looking at? I'm seeing a bunch of variation in 40G-capable cables. Also, 40 Gb/s is only the maximum transmission rate; the actual rate may fall off with distance, from what I can tell.
The first reference I can find from this website is second-hand, but:
Active copper cable at 0.5 W for 40G over 15 meters is ~1e-21 J/bit/nm, assuming it actually hits 40G at the max length of 15 m.
This source has specs for a passive copper wire capable of up to 40G @ 5 m using <1 W, which works out to ~5e-21 J/bit/nm, or a bit less.
Compare to 10G from here, which may use up to 5 W to hit up to 10G at 100 m, for ~5e-21 J/bit/nm.
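Checking those three figures (a minimal sketch; the wattages, bitrates, and lengths are just the spec numbers quoted above, taken at face value):

```python
def j_per_bit_per_nm(power_w, bitrate_bps, length_m):
    """Wall power divided by bitrate and by cable length expressed in nm."""
    return power_w / (bitrate_bps * length_m * 1e9)

print(j_per_bit_per_nm(0.5, 40e9, 15))    # ~8e-22 J/bit/nm (active cable, 40G @ 15 m)
print(j_per_bit_per_nm(1.0, 40e9, 5))     # ~5e-21 J/bit/nm (passive cable, 40G @ 5 m)
print(j_per_bit_per_nm(5.0, 10e9, 100))   # ~5e-21 J/bit/nm (10G @ 100 m)
```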
I do think I have a good explanation in the cellular automata model, and I'll just put my full response in there, but basically it's the difference between using fermions vs. bosons to propagate bits through the system. Photons, as bosons, are more immune to EM noise perturbations and in typical use have much longer free path lengths (distance between collisions). One could of course use electrons ballistically to get some of those benefits, but they are obviously slower and 'noisier'.