a case for broadening AI alignment to “alignment with all sentient beings”. And it seems to me that such broadening creates a lot of new issues that the alignment field needs to think about.
This seems uncontroversial to me. I expect most people currently thinking about alignment would consider a “good outcome” to be one where the interests of all moral patients, not just humans, are represented—i.e. non-human animals (and potentially aliens). If you have other ideas in mind that you think have significant philosophical or technical implications for alignment, I’d be very interested in even a cursory writeup, especially if they’re new to the field.
For me, an AI is not really aligned if it only aligns with humans.
Yep, see above.
It seems that you might be thinking that total extinction is bad for animals. But I think it’s the reverse: most animals live net negative lives, so their total extinction could be good “for them”. In other words, it sounds plausible to me that an AI that makes all non-human animals go extinct could be (but also possibly not be) one that is “aligned”.
I think total extinction is bad for animals compared to counterfactual future outcomes where their interests are represented by an aligned AI. I don’t have a strong opinion on how it compares to the current state of affairs (but, purely on first-order considerations, it might be an improvement due to factory farming).
But a misaligned AI can also create suffering, or create things that cause suffering.
Agreed in principle, though I don’t think S-risks are substantially likely.
Even if animals currently experience net suffering, an AI might still judge the way animals currently live to be good and increase the number of animals living that way.
From some animal rights perspectives, that might be an S-risk.
Thanks for the detailed response!