A thought experiment: the mildly xenophobic large alien civilization.
Imagine that at some future time we encounter an expanding grabby aliens civilization. The civilization is much older and larger than ours, but cooperates poorly. Its individual members tend to have a mild distaste for the existence of aliens (such as us). The distaste isn't severe, but there are very many of them, so their total suffering at our existence, and their wish for us to die, outweigh our own suffering if our AI killed us, and our own will to live.
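To make the arithmetic behind this explicit, here is a minimal toy sketch in Python. All of the numbers and variable names below are made-up placeholders chosen only to illustrate the effect of scale; they are not taken from the thought experiment itself, and the additive aggregation is just one simple way the sums could be compared:

```python
# Toy illustration of naive additive aggregation of preferences.
# All quantities are hypothetical placeholders, not claims about the scenario.

N_ALIENS = 10**15          # members of the old, expansive alien civilization
ALIEN_DISTASTE = 0.001     # mild per-individual disutility from our existence

N_HUMANS = 10**10          # humans (and human-aligned beings) affected
HUMAN_WILL_TO_LIVE = 1.0   # strong per-individual disutility from being killed

aliens_total = N_ALIENS * ALIEN_DISTASTE      # 1e12
humans_total = N_HUMANS * HUMAN_WILL_TO_LIVE  # 1e10

# Under a simple additive sum, the mild but numerous preference wins.
print(aliens_total > humans_total)  # True
```

Under this kind of naive additive aggregation, a mild preference held by enough individuals dominates a strong preference held by comparatively few, which is the engine of the scenario.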
They aren’t going to kill us directly, because they cooperate poorly, individually don’t care all that much, and defense has the advantage over offense.
But, in this case, the AI programmed as you proposed will kill us once it finds out about these mildly xenophobic aliens. How do you feel about that? And do you feel that, if I don’t want to be killed in this scenario, my opposition is unjustified?
I like this thought experiment, but I feel like this points out a flaw in the concept of CEV in general, not SCEV in particular.
If the entire future is determined by a singular set of values derived from an aggregation/extrapolation of the values of a group, then you would always run the risk of a “tyranny of the mob” kind of situation.
In CEV that group is specifically humans, so it seems like all the author is calling for is expanding the franchise/inclusion to non-humans as well.
Yes, and some other points may also be relevant:
(1) It is not clear whether there are possible scenarios like this in which the ASI cannot find a way to adequately satisfy the extrapolated volitions of all the included beings. There might not be any such scenarios.
(2) If these scenarios are possible, it is also not clear how likely they are.
(3) There is a subset of s-risks and undesirable outcomes (those coming from cooperation failures between powerful agents) that is a problem for all ambitious value-alignment proposals, including CEV and SCEV.
(4) Partly because of (3), the conclusion of the paper is not that, all things considered, we should implement SCEV if possible, but rather that we have some strong pro tanto reasons in favour of doing so. All things considered, it might still be best not to.
Regarding NicholasKees’ point about mob rule vs expansion, I wrote a reply that I moved to another comment.
In response to the points in the immediate parent comment:
You have to decide, at some point, what you are optimizing for. If you optimize for X, Y will potentially be sacrificed. Some conflicts might be resolvable, but ultimately you are making a trade-off somewhere.
And while you haven’t taken over yet, other people have a voice as to whether they want to be sacrificed for such a trade-off.