Sure. The issue is that we want to explain why we care about niceness, precisely because we currently care about niceness to a degree that seems surprising from an evolutionary perspective.
This is great from the perspective of humans who like niceness. But it’s not great from the perspective of evolution—to evolution, it looks like the mesa-optimizers’ values are drifting as their capabilities increase, because we’re privileging care/harm over purity/contamination ethics or what have you.