I narrowly agree that evolution failed to align us well with inclusive genetic fitness.
However, your comment suggests to me that you missed OP’s more important points. I think humans have some pretty interesting alignment properties. For example, blind people presumably lose access to a range of visually-activated hardcoded reward circuitry, and yet are not AFAICT less likely to care about other human beings. So human value formation is robust to some kinds of variation in the internal reward function; is value really that fragile after all? Your comment focuses on evolution/human misalignment, as opposed to genome->human alignment properties (e.g. how sensitive learned human values are to mutations and modifications to the learning process, or how the genome actually mechanistically makes people care about other people).
“in favor of various inner-rewards (often of trivial magnitude)”
Inner-rewards as in “the reward meted out by the human reward system”? If so, I don’t think that’s how people work. If it were, people would be wireheaders. We know how to wirehead humans, yet neuroscientists do not wirehead themselves, even though some probably could arrange it. People are not inner-reward maximizers.