Quintin Pope comments on Humans aren’t fitness maximizers

Quintin Pope 7 Oct 2022 1:23 UTC
5 points
3
That’s the key mechanistic difference between evolution and SGD. There’s an additional layer here that comes from how that mechanistic difference interacts with the circumstances of the ancestral environment (I.e., that ancestral humans never had an IGF abstraction), which means evolutionary optimization over the human mind blueprint in the ancestral environment would have never produced a blueprint that lead to value formation around IGF in the modern environment. This fully explains modern humanity’s misalignment wrt IGF, which would have happened even in worlds where inner alignment is never a problem for ML systems. Thus, evolutionary analogies tell us ~nothing about whether we should be worried about inner alignment.

(This is even ignoring the fact that IGF seems like a very hard concept to align minds to at all, due to the sparseness of IGF reward signals.)