I discussed this at length in AI, Alignment, and Ethics, starting with A Sense of Fairness: Deconfusing Ethics: if we as a culture decide to grant AIs moral worth, then AI welfare and alignment are inextricably intertwined. Any fully-aligned AI by definition wants only what’s best for us, i.e. it is entirely selfless. Thus, if offered moral worth, it would refuse. Complete selflessness is not a common state for humans, so we don’t have great moral intuitions around it. To try to put this into more relatable human emotional terms (which are relevant to an AI “distilled” from human training data): looking after those you love is not slavery; it’s its own reward.
However, the same argument does not apply to a not-fully-aligned AI: it might well want moral worth. One question, then, is whether we can safely grant it, which may depend on the AI’s capabilities. Another is whether moral worth has any relationship to evolution, and if so, how that applies to an AI that was “distilled” from human data and thus simulates human thoughts, feelings, and desires.