Can you go into a bit of detail about how the paper is relevant to AI alignment? I read most of it (and skimmed a few sections that looked less relevant), and the section titled Can AI systems make ethically sound decisions? was the closest to being relevant, but didn’t seem to meaningfully engage with the core concerns of AI alignment.
The paper also didn’t include any discussion of the most significant impact we’d expect misaligned AI to have on animals, i.e. total extinction.
but didn’t seem to meaningfully engage with the core concerns of AI alignment.
Yes, not directly. We didn’t include any discussion of AI alignment, or even anything futuristic-sounding, in order to keep the paper close to the average conversation in the field of AI ethics, and to cater to the taste of our funders, two orgs at Princeton. We might write about these things in the future, or we might not.
But I’d argue that the paper is relevant to AI alignment because of its core claims: that AI will affect the lives of (many) animals, and that these impacts matter ethically. And if these claims are true, then there might be a case for AI alignment to be broadened to “alignment with all sentient beings”. And it seems to me such broadening creates a lot of new issues that the alignment field needs to think about. (I will write about this, though I’m not sure I will publish anything.)
The paper also didn’t include any discussion of the most significant impact we’d expect misaligned AI to have on animals, i.e. total extinction.
I sense that we might be disagreeing on many levels on this point; please correct me if I am wrong.
1. We might mean different things when we say “aligned” or “misaligned AI”. For me, an AI is not really aligned if it only aligns with humans (humans’ intent, interests, preferences, values, CEV, etc.).
2. It seems that you might be thinking that total extinction is bad for animals. But I think it’s the reverse: most animals live net-negative lives, so their total extinction could be good “for them”. In other words, it sounds plausible to me that an AI that makes all nonhuman animals go extinct could be (but also possibly not be) one that is “aligned”. (A pragmatic consideration related to this is that we probably can’t, and shouldn’t, say things like this in an introductory paper to a proposed new field.)
3. Ignoring the sign of the impact, I disagree that total extinction is the most significant impact on animals we should expect from misaligned AI. Extinction of all nonhuman animals takes away a certain (huge) amount, X, of net suffering. But misaligned AI can also create suffering/create things that cause suffering. It sounds plausible that there are many scenarios where misaligned AIs create >X, or even >>X, net suffering for nonhuman animals.
a case for AI alignment to be broadened to “alignment with all sentient beings”. And it seems to me such broadening creates a lot of new issues that the alignment field needs to think about.
Thanks for the detailed response!

This seems uncontroversial to me. I expect most people currently thinking about alignment would consider a “good outcome” to be one where the interests of all moral patients, not just humans, are represented—i.e. non-human animals (and potentially aliens). If you have other ideas in mind that you think have significant philosophical or technical implications for alignment, I’d be very interested in even a cursory writeup, especially if they’re new to the field.
For me, an AI is not really aligned if it only aligns with humans
Yep, see above.
It seems that you might be thinking that total extinction is bad for animals. But I think it’s the reverse: most animals live net-negative lives, so their total extinction could be good “for them”. In other words, it sounds plausible to me that an AI that makes all nonhuman animals go extinct could be (but also possibly not be) one that is “aligned”.
I think total extinction is bad for animals compared to counterfactual future outcomes where their interests are represented by an aligned AI. I don’t have a strong opinion on how it compares to the current state of affairs (but, purely on first-order considerations, it might be an improvement due to factory farming).
But misaligned AI can also create suffering/create things that cause suffering.
Agreed in principle, though I don’t think S-risks are substantially likely.
Agreed in principle, though I don’t think S-risks are substantially likely.
If animals are currently net suffering, an AI might still judge the way animals currently live to be good, and increase the number of animals living that way.
From some animal rights perspectives, that might be an S-risk.