I like this thought experiment, but I feel like this points out a flaw in the concept of CEV in general, not SCEV in particular.
If the entire future is determined by a singular set of values derived from an aggregation/extrapolation of the values of a group, then you would always run the risk of a “tyranny of the mob” kind of situation.
If in CEV that group is specifically humans, it feels like all the author is calling for is expanding the franchise/inclusion to non-humans as well.
Yes, and—other points may also be relevant:
(1) It is not clear whether there are scenarios like these in which the ASI could not find a way to adequately satisfy the extrapolated volitions of all the included beings. There might not be any such scenarios.
(2) Even if such scenarios are possible, it is also not clear how likely they are.
(3) There is a subset of s-risks and other undesirable outcomes (those arising from cooperation failures between powerful agents) that pose a problem for all ambitious value-alignment proposals, including CEV and SCEV.
(4) Partly because of (3), the conclusion of the paper is not that, all things considered, we should implement SCEV if possible, but rather that we have some strong pro tanto reasons in favour of doing so. All things considered, it might still be best not to.
Regarding NicholasKees’ point about mob rule vs. expansion, I wrote a reply that I moved to another comment.
In response to the points in the immediate parent comment:
You have to decide, at some point, what you are optimizing for. If you optimize for X, Y may be sacrificed. Some conflicts might be resolvable, but ultimately you are making a trade-off somewhere.
And as long as you haven’t taken over yet, other people have a say in whether they want to be sacrificed for such a trade-off.