e.g. many people who care about animal welfare differ on the decisions they would make for those animals. What if the AGI ends up a negative utilitarian and sterilizes us all to save humanity from all future suffering? The missing element would again be to have the AGI aligned with humanity, which brings us back to H4: What’s humanity’s alignment anyway?
I think “humanity’s alignment” is a strange term to use. Perhaps you mean “humanity’s values” or even “humanity’s collective utility function.”
I’ll clarify what I mean by empathy here. I think the ideal form of empathy is wanting others to get what they themselves want. Since entities compete for scarce resources and tend to interfere with one another’s desires, this forces tradeoffs about how much you help each desire, but in principle it seems like the right ideal to me.
So negative utilitarianism is not genuinely empathic in this sense, since it is unconcerned with the right of the entities in question to decide about their own futures. In fact I think it’s one of the most dangerous and harmful philosophies I’ve ever seen, and an AI such as I would like to see made would reject it altogether.
Hmmm, yes and no?