I also don’t think this solution carries over very well to powerful AIs. A powerful AI has exceptionally little reason to treat its actions as correlated with ours, and will not have grown up with us in an evolutionary environment.
This seems correct, but I think that’s also somewhat orthogonal to the point that I read the OP to be making. I read it to be saying something like “some alignment discussions suggest that capabilities may generalize more than alignment, so that when an AI becomes drastically more capable, this will make it unaligned with its original goals; however, humans seem to remain pretty well aligned with their original goals despite a significant increase in their capabilities, so maybe we could use whatever-thing-keeps-humans-aligned-with-their-original-goals to build AIs in such a way that also keeps them aligned with their original goals when their capabilities increase”.
So I think the question that the post is asking is not “why did we originally evolve niceness” (the question that your comment answers) but “why have we retained our niceness despite the increase in our capabilities, and what would we need to do for an AI to similarly retain its original goals as it underwent an increase in capabilities”.
Sure. The issue is that we want to explain why we care about niceness, precisely because we currently care about niceness to a degree that seems surprising from an evolutionary perspective.
This is great from the perspective of humans who like niceness. But it’s not great from the perspective of evolution—to evolution, it looks like the mesa-optimizers’ values are drifting as their capabilities increase, because we’re privileging care/harm over purity/contamination ethics or what have you.
Basically, because genetic engineering and mind uploading weren't available before the 21st century, and genetically engineering people became socially unacceptable in the aftermath of World War II. We should remember how contingent that was: had WWII been avoided, genetic engineering would probably be more socially acceptable today. Only that contingency, and the ethical system that grew up after the war, has prevented our capabilities from eventually misaligning us with evolution in the genetic domain. For all our capabilities, we still haven't changed human nature.