…will be developed by reversible computation, since we will likely have hit the Landauer Limit for non-reversible computation by then. In principle there is basically no limit to how much you can optimize reversible computation, which leads to massive energy savings, so it would not need to consume as much energy as current AIs or brains do today.
With respect, I believe this to be overly optimistic about the benefits of reversible computation.
Reversible computation means you aren’t erasing information, so you don’t lose energy in the form of heat (per Landauer[1][2]). But if you don’t erase information, you are faced with the issue of where to store it.
If you are performing a series of computations and only have a finite memory to work with, you will eventually need to reinitialise your registers and empty your memory, at which point you incur the energy cost that you had been trying to avoid. [3]
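As a rough sense of the scale involved, here is a minimal Python sketch of that accounting, assuming an idealised model in which reversible operations are free and each erased bit costs exactly $k_B T \ln 2$, at an assumed temperature of 300 K (both modelling choices are mine, for illustration):

```python
import math

K_B = 1.380649e-23                  # Boltzmann constant, J/K
T = 300.0                           # assumed room temperature, K
J_PER_BIT = K_B * T * math.log(2)   # minimum heat per erased bit

def reset_cost_joules(memory_bits: int) -> float:
    """Minimum heat released when reinitialising `memory_bits` of state."""
    return memory_bits * J_PER_BIT

# Reversible gates are free in this idealised model; the bill arrives
# when a finite memory must be wiped back to a known state.
bits_in_gib = 8 * 2**30
print(f"Landauer cost per bit at 300 K: {J_PER_BIT:.2e} J")
print(f"Reinitialising 1 GiB of state:  {reset_cost_joules(bits_in_gib):.2e} J")
```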
Epistemics: I’m quite confident (95%+) that the above is true. (Edit: RogerDearnaley’s comment has convinced me I was overconfident.) Any substantial errors would surprise me. I’m less confident in the footnotes.
[1] The Landauer Limit: $E \geq k_B T \ln 2$.
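For scale, plugging in an assumed room temperature of $T \approx 300\,\mathrm{K}$ (a value chosen for illustration, not one given above): $E \geq (1.38\times 10^{-23}\,\mathrm{J/K})(300\,\mathrm{K})(\ln 2) \approx 2.9\times 10^{-21}\,\mathrm{J}$ per bit erased.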
[2] A cute, non-rigorous intuition for Landauer’s Principle: the process of losing track of (deleting) 1 bit of information means your uncertainty about the state of the environment has increased by 1 bit, so you must see the entropy of the environment increase by at least 1 bit’s worth.
Proof: Rearrange the Landauer Limit to $E/T \geq k_B \ln 2$.
Now, when you add a small amount of heat to a system, the change in entropy is given by:
$dS = dQ/T$
But the $E$ occurring in Landauer’s formula is not the total energy of a system; it is the small amount of energy required to delete the information. When it all ends up as heat, we can replace it with $dQ$, and we have:
$dQ/T = dS \geq k_B \ln 2$
Compare this expression with the physicist’s definition of entropy: the entropy of a system is a scaling factor, $k_B$, times the logarithm of the number of micro-states that the system might be in, $\Omega$.
$S := k_B \ln \Omega$
$\therefore\ S + dS \geq k_B \ln(2\Omega) = k_B \ln\Omega + k_B \ln 2$
The choice of units obscures the meaning of the final term: $\ln 2$, converted from nats to bits, is just 1 bit. In other words, deleting the bit has at least doubled the number of micro-states the environment might be in.
[3] Splitting hairs: some setups will allow you to delete information at a reduced or zero energy cost, but the process is essentially just “kicking the can down the road”. You will incur the full cost during the process of re-initialisation. For details, see equation (4) and figure 1 of Sagawa & Ueda (2009).
Generally, reversible computation allows you to avoid wasting energy by deleting the memory used for intermediate answers, paying that deletion cost only for final results. It does require that you have enough memory to store all those intermediate answers until you finish the calculation and then run it in reverse. If you don’t have that much memory, you can divide your calculation into steps, connected by the final results from each step being fed into the next step; you then save the energy cost of all the intermediate results within each step, and pay the cost only for data passed from one step to the next or output from the last step. Or, for a 4x slowdown rather than the usual 2x slowdown for reversible computation, you can have two sizes of step, with some intermediate results lasting only during a small step and others retained for a large step before being uncomputed.
Memory/energy loss/speed trade-off management for reversible computation is a little more complex than conventional memory management, but is still basically simple, and for many computational tasks you can achieve excellent tradeoffs.
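Here is a minimal Python sketch of the step-splitting scheme described above, under the same assumed idealised energy model (reversible operations are free; each erased bit costs exactly $k_B T \ln 2$). The update rule, step length, and 64-bit checkpoint size are made up for illustration:

```python
import math

K_B = 1.380649e-23                    # Boltzmann constant, J/K
T = 300.0                             # assumed temperature, K
J_PER_BIT = K_B * T * math.log(2)     # idealised Landauer cost per erased bit

def reversible_step(x: int) -> tuple[int, list[int]]:
    """One 'step': an arbitrary 64-bit iterated map. Recording every
    intermediate value is what makes the step logically reversible:
    with the trace in hand, the step can be run backwards (uncomputed)."""
    trace = []
    for _ in range(1000):
        trace.append(x)
        x = (x * x + 1) % (1 << 64)   # made-up update rule, for illustration
    return x, trace

def run(x: int, n_steps: int) -> tuple[int, float]:
    """Chain steps together. Each step's trace of intermediates is
    uncomputed for free; Landauer's price is paid only for the 64-bit
    checkpoint handed from one step to the next and then erased."""
    heat = 0.0
    for _ in range(n_steps):
        x, trace = reversible_step(x)
        del trace                     # stands in for uncomputing the trace: no erasure cost
        heat += 64 * J_PER_BIT        # erase the previous step's 64-bit checkpoint
    return x, heat

result, heat = run(7, n_steps=100)
print(f"result = {result}")
print(f"minimum heat ~ {heat:.2e} J for 100 erased 64-bit checkpoints")
```

The trade-off is visible in the knobs: longer steps mean fewer checkpoint erasures (less heat), but a longer trace held in memory until it is uncomputed.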
Yeah, I was thinking of uncomputing strategies that reverse the computation from an error-prone state back to an error-free state without consuming any energy or work, and it turns out that you can uncompute a result by running the computation in reverse rather than deleting it, which would release waste heat.