The principles from the post can still be applied. Some humans do end up aligned to animals—particularly vegans (such as myself!). How does that happen? There empirically are examples of general intelligences with at least some tendency to terminally value entities massively less powerful than themselves; we should be analyzing how this occurs.
Sure, if you’ve got some example of a mechanism for this that’s likely to scale, it may be worthwhile. I’m just pointing out that a lot of people have already thought about mechanisms and concluded that the mechanisms they could come up with would be unlikely to scale.
By the way: at least part of the explanation for why I personally am aligned to animals is that I have a strong tendency to be moved by the Care/Harm moral foundation—see this summary of The Righteous Mind for more details.
I’m not a big fan of moral foundations theory for explaining individual differences in moral views. I think it lacks evidence.
I’m just pointing out that a lot of people have already thought about mechanisms and concluded that the mechanisms they could come up with would be unlikely to scale.
In my experience, researchers tend to stop at “But humans are hacky kludges” (what do they think they know, and why do they think they know it?). Speaking for myself, I used to view humans as complicated hacks that didn’t offer substantial evidence about alignment questions. This “humans as alignment-magic” or “the selection pressure down the street did it” view seems quite common (but not universal).
AFAICT, most researchers do not appreciate the importance of asking questions with guaranteed answers (e.g. “by what mechanism do some humans come to terminally value animals?” must have an answer, because such humans exist).
AFAICT, most alignment-produced thinking about humans is about their superficial reliability (e.g. will they let an AI out of the box) or the range of situations in which their behavior will make sense (e.g. how hard is it to find adversarial examples for a perfect imitation of a human). I think these questions are relatively unimportant to alignment.