There is a good analogy between genetic brain evolution and technological AGI evolution. In both cases there is a clear bi-level optimization, with the inner optimizer in each case using a very similar UL/RL (unsupervised/reinforcement learning) intra-lifetime SGD or SGD-like algorithm.
The outer optimizer of genetic evolution is reasonably similar to the outer optimizer of technological evolution. The recipe which produces an organic brain is a highly compressed encoding, or low-frequency prior, on the brain architecture, along with a learning algorithm to update the detailed wiring during lifetime training. The genes which encode the brain’s architectural prior and learning algorithms are closely analogous to the ‘memes’ propagated/exchanged in ML papers, which encode an AI’s architectural prior and learning algorithms (i.e. the initial PyTorch code, etc.).
The key differences are mainly just that memetic evolution is much faster: it resembles an amplified artificial selection and genetic engineering process. In tech evolution, a large number of successful algorithm memes from many different past experiments can be flexibly recombined in a single new experiment, and the process guiding this recombination and selection itself runs on the inner optimizer of brains.
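To make the bi-level picture above concrete, here is a minimal toy sketch (my own illustration; all names and numbers are hypothetical, not anything from the post): an outer evolutionary loop searches over compact “genomes”/“memes” (here just a learning rate, standing in for an architecture prior plus learning algorithm), while an inner SGD-like loop does the actual “lifetime” learning and reports back a fitness score.

```python
# Toy bi-level optimization sketch (hypothetical, illustrative only):
# outer loop = selection/recombination of compressed "genomes"/"memes",
# inner loop = SGD-like lifetime learning that scores each genome.
import random

def inner_lifetime_learning(lr, steps=50):
    """Inner optimizer: plain gradient descent on a fixed toy loss w**2."""
    w = 5.0
    for _ in range(steps):
        w -= lr * 2 * w                 # SGD-like update on d/dw (w^2)
    return -w * w                       # fitness = negative final loss

def outer_evolution(pop_size=20, generations=30):
    """Outer optimizer: keep the fittest genomes, then recombine and mutate them,
    analogous to genetic evolution (or researchers recombining algorithm memes)."""
    population = [random.uniform(1e-4, 0.9) for _ in range(pop_size)]
    for _ in range(generations):
        survivors = sorted(population, key=inner_lifetime_learning, reverse=True)[: pop_size // 2]
        # children: average two parents (recombination) plus a small mutation, clamped to a safe range
        population = [
            min(0.9, max(1e-4, (random.choice(survivors) + random.choice(survivors)) / 2
                         + random.gauss(0, 0.02)))
            for _ in range(pop_size)
        ]
    return max(population, key=inner_lifetime_learning)

print(f"evolved learning rate: {outer_evolution():.3f}")   # drifts toward lr ≈ 0.5, which zeroes the toy loss fastest
```

The point of the sketch is only the structure: the outer search never touches the inner weights directly, it only proposes and recombines the compact recipe that the inner learner then unfolds.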
Humans individually are not robustly aligned to the outer genetic optimizer: roughly 50% of humans choose not to have children and pursue other things instead, which is likely non-trivially misaligned with inclusive genetic fitness (IGF)[1]. Nate uses that as a doom argument: if tech evolution proceeds like bio evolution, except that the first AGI to cross some threshold ends up taking over the world, then a 50% chance of non-trivial misalignment roughly translates to 50% doom.
Imagine if one historical person from thousands of years ago were given god-like magic power to arbitrarily rearrange the future. It seems roughly 50/50 whether that future would be reasonably aligned with IGF.
But of course that is not what happened with the evolution of Homo sapiens. Even if humans are not robustly aligned to fitness/IGF at the individual level, we are robustly aligned at the population/species level[2]. The enormous success of Homo sapiens, despite common misalignment at the individual level, is a clear illustration of how much more robust multi-polar scenarios can be.
As you argue in this post, it also seems likely that the same factors which make memetic evolution (i.e. human engineering) more efficient than genetic evolution can and will be applied to improve the capability-weighted expected alignment of AGI systems relative to that of brains.
Finally, one other hidden source of potential disagreement is the higher-level question of the degree of alignment between our individual utility functions and the utility function of global market tech evolution as a system. If you largely believe that the system itself is out of control and on the wrong track, you probably won’t be especially satisfied even if there is strong alignment between AGI and that system. That aspect is discussed less (and is explicitly not a pillar of EY/MIRI doomer views, AFAIK), but I do suspect it is a subtle but important factor in p(doom) for many.
Optimizing for IGF doesn’t actually require having children oneself, especially if one’s genotype is already widely replicated, but it doesn’t seem likely that this substantially shifts the conclusion.
The average/expectation, or more generally a linear combination, of many utility vectors/functions can be arbitrarily well aligned with some target even if nearly every individual utility function in the set is (nearly) orthogonal to the target, i.e. misaligned. Smoothing out noise (variance reduction) is crucial for optimization, whether using SGD or evolutionary search.
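A minimal numerical sketch of this claim (my own illustration, with made-up numbers): give each individual utility vector a tiny shared component toward the target plus large idiosyncratic noise, so each one is nearly orthogonal to the target, yet the population mean comes out strongly aligned because the noise cancels.

```python
# Illustrative only: nearly-orthogonal individuals, well-aligned population mean.
import numpy as np

rng = np.random.default_rng(0)
dim, n = 100, 100_000            # dimensionality of the toy "utility space", population size

target = np.zeros(dim)
target[0] = 1.0                  # unit target direction

bias = 0.1                       # tiny shared pull toward the target
utilities = bias * target + rng.normal(size=(n, dim))   # individual utility vectors

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

individual = np.mean([cosine(u, target) for u in utilities[:1000]])
population = cosine(utilities.mean(axis=0), target)

print(f"typical individual alignment: {individual:.3f}")   # ~0.01  (nearly orthogonal)
print(f"population-mean alignment:    {population:.3f}")   # ~0.95  (well aligned)
```

Averaging over n individuals shrinks the noise term by a factor of roughly sqrt(n) while leaving the shared bias untouched, which is exactly the variance-reduction effect the footnote points at.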
I view your final point as crucial. I would put an additional twist on it, though. During the approach to AGI, if takeoff is even a little bit slow, the effective goals of the system can change. For example, most corporations arguably don’t pursue profit exclusively even though they may be officially bound to. They favor executives, board members, and key employees in ways both subtle and obvious. But explicitly programming those goals into an SGD algorithm is probably too blatant to get away with.