I support this and will match the $250 prize.
Here are the central background ideas/claims:
1.) Computers are built out of components which are themselves just simpler computers, a recursion that bottoms out at the limits of miniaturization in minimal molecular-scale (few-nm) computational elements (cellular automata/tiles). Further shrinkage is believed impossible in practice due to various constraints (overcoming these constraints, if even possible, would require very exotic far-future tech).
2.) At this scale the Landauer bound represents the ambient-temperature-dependent noise floor (which can also manifest as a noise voltage). Reliable computation at speed is only possible using non-trivial multiples of this base energy, for the simple reasons described by Landauer and elaborated on in the other refs in my article (see the numeric sketch after these claims).
3.) Components can be classified as computing tiles or interconnect tiles, where an interconnect tile is simply a computer that computes the identity function while moving its input to an output in some spatial direction. Interconnect tiles can be irreversible or reversible, but the latter has enormous tradeoffs in size (e.g. optical) and/or speed or other variables, and is thus not used by brains or GPUs/CPUs.
4.) Fully reversible computers are possible in theory but have enormous negative tradeoffs in size/speed due to 1.) the need to avoid erasing bits throughout intermediate computations, 2.) the lack of immediate error correction (achieved automatically in dissipative interconnect by erasing at each cycle), leading to error buildup which must be corrected/erased (costing energy), and 3.) high sensitivity to noise/disturbance as a consequence of 2.).
And the brain vs computer claims:
5.) The brain is near the Pareto frontier for practical 10W computers, and makes reasonably good tradeoffs between size, speed, heat, and energy as a computational platform for intelligence.
6.) Computers are approaching the same Pareto frontier (although currently in a different region of design space); shrinkage is nearing its end.
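To put claim 2 in concrete numbers, here is a minimal sketch of the Landauer limit at an assumed ambient temperature of 300 K; the reliability multiples (1x/10x/100x) and the 1e9 erasures-per-second rate are illustrative assumptions, not figures from the article.

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # assumed ambient temperature, K

# Landauer bound: minimum energy dissipated per irreversible bit erasure
E_landauer = k_B * T * math.log(2)   # ~2.9e-21 J at 300 K

# Reliable computation at speed requires a non-trivial multiple of this
# base energy; the multiples and the 1e9 ops/s rate below are illustrative.
for multiple in (1, 10, 100):
    e_bit = multiple * E_landauer
    watts = e_bit * 1e9   # power for one element erasing 1e9 bits/s
    print(f"{multiple:>3}x Landauer: {e_bit:.2e} J/bit, {watts:.2e} W at 1e9 erasures/s")
```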
FWIW, I basically buy all of these, but they are not at all sufficient to back up your claims about how superintelligence won’t foom (or whatever your actual intended claims are about takeoff). Insofar as all this is supposed to inform AI threat models, it’s the weakest subclaims necessary to support the foom-claims which are of interest, not the strongest subclaims.
Foom isn’t something that EY can prove beyond doubt or I can disprove beyond doubt, so this is a matter of subjective priors and posteriors.
If you were convinced of foom inevitability before, these claims are unlikely to convince you of the opposite, but they do undermine EY’s argument:
they support the conclusion that the brain is reasonably Pareto-efficient (greatly undermining EY’s argument that evolution and the brain are grossly inefficient, as well as the confidence of his analysis),
they undermine nanotech as a likely source of large FOOM gains, and
they weaken EY’s claim of huge software FOOM gains (because the same process which optimized the brain’s hardware platform also optimized the wiring/learning algorithms over the same time frame).
The four claims you listed as “central” at the top of this thread don’t even mention the word “brain”, let alone anything about it being Pareto-efficient.
It would make this whole discussion a lot less frustrating for me (and probably many others following it) if you would spell out what claims you actually intend to make about brains, nanotech, and FOOM gains, with the qualifiers included. And then I could either say “ok, let’s see how well the arguments back up those claims” or “even if true, those claims don’t actually say much about FOOM because...”, rather than this constant probably-well-intended-but-still-very-annoying jumping between stronger and weaker claims.
Ok, fair; those are more like background ideas/claims, so I reworded that and added two more.
Thanks!
Also, I recognize that I’m kinda grouchy about the whole thing and that’s probably coming through in my writing, and I appreciate a lot that you’re responding politely and helpfully on the other side of that. So thank you for that too.
Jacob, something really bothers me about your analysis.
Are you accounting for the brain’s high error rate? Efficiently getting the wrong answer a high percentage of the time isn’t useful; it slashes the number of bits of precision on every calculation and limits system performance.
If every synapse only has an effective 4 bits of precision, with the lower-order bits being random noise, it would limit throughput through the system and could prevent reliable human judgement on matters where the delta is smaller than 1/16. It would explain humans ignoring risks smaller than a few percent, or having trouble deciding between close alternatives.
(And this is true for any analog precision level, obviously.)
It would mean a digital system with a few more bits of precision and fewer redundant synapses could significantly outperform a human brain at the same power level.
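As a toy sketch of the precision point above (an illustration, not a model of real synapses): values quantized to 4 bits can no longer distinguish alternatives whose difference falls below one quantization step (1/16 of the range), while a few extra bits still separate them.

```python
def quantize(x, bits, lo=0.0, hi=1.0):
    """Snap x onto a uniform grid of 2**bits levels spanning [lo, hi]."""
    step = (hi - lo) / (2 ** bits)
    return lo + round((x - lo) / step) * step

# Two close alternatives whose difference (0.03) is below the 4-bit step (1/16)
a, b = 0.50, 0.53
print(quantize(a, 4), quantize(b, 4))   # 0.5 0.5      -> indistinguishable at 4 bits
print(quantize(a, 8), quantize(b, 8))   # 0.5 0.53125  -> still separated at 8 bits
```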
Note I also have a ton of skillpoints in this area: I have worked on analog data acquisition and control systems and filters for several years, and I work on inference accelerators now. (And a master’s in CS / bachelor’s in CE.)
Note that due to my high skillpoints here I also disagree with Yudkowsky on foom, but for a different set of reasons, also tied to the real world. Like you, I have noticed a shortage of inference compute: if an ASI existed today, there aren’t enough of the right kind of accelerators for it to outthink the bulk of humans. (I have some numbers on this I can edit into this post if you show interest.)
Remember, Wikipedia says Yudkowsky didn’t even go to high school, and I can find no reference to him building anything in the world of engineering in his life; just writing sci-fi and the Sequences. So it may be a case where he’s blind to certain domains and doesn’t know what he doesn’t know.
There is extensive work in DL on bit-precision reduction: the industry started at 32b, moved to 16b, is moving to 8b, and will probably end up at 4b or so, similar to the brain.
For my noob understanding: what is bit precision, exactly?
Just the number of bits used to represent a quantity. The complexity of multiplying numbers is nonlinear in bit width, so 32b multipliers are much more expensive than 4b multipliers. Analog multipliers are more efficient in various respects at low signal-to-noise ratios equivalent to low bit precision, but blow up quickly (exponentially) with a crossover near 8 bits or so, last I looked.
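To make that cost scaling concrete, here is a rough sketch assuming a naive array multiplier, whose cell count grows roughly as the square of the bit width; the quadratic model and constants are illustrative, not exact gate counts for any particular design.

```python
# Naive array multiplier: roughly n*n single-bit partial-product/adder cells.
# The quadratic model is an illustrative assumption, not an exact gate count.
def array_multiplier_cells(n_bits: int) -> int:
    return n_bits * n_bits

baseline = array_multiplier_cells(4)
for n in (4, 8, 16, 32):
    cells = array_multiplier_cells(n)
    print(f"{n:>2}-bit multiplier: ~{cells:>4} cells ({cells / baseline:.0f}x the 4-bit cost)")
```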