There seem to be two objections here. The first is that CEV does not uniquely identify a value system; starting with CEV, you don’t have actual values until you’ve identified the set of people/nonpeople you’re including, an extrapolation procedure, and a reconciliation procedure. But when this is phrased as “the set of minds included in CEV is totally arbitrary, and hence, so will be the output,” an essential truth is lost: while parts of CEV are left unspecified, other parts are specified, and so the output is not fully arbitrary. The set of CEV-compatible value systems is smaller than the set of possible value systems; and while the set of CEV-compatible value systems is not completely free of systems I find abhorrent, it is nevertheless systematically better than any other class of value systems I know of.
The second objection seems to be that there are humans out there with abhorrent values, and that extrapolation and reconciliation might not successfully eliminate those values. The best outcome, I think, is if they’re initially included but either extrapolated into oblivion or cancelled by other values (e.g., valuing others’ not having abortions loses to their valuing choice, but the AI arranges things so that most pregnancies are wanted and it doesn’t come up often; valuing the torture of sinful children loses to their desire to not be tortured, and also goes away with a slight increase in intelligence and wisdom).
But just excluding some values outright seems very problematic. On a philosophical level, it requires breaking the symmetry between humans. On a practical level, it would turn launching an AGI first into a competition, potentially replacing careful deliberation with a race to finish. And the risks of mistakes in a race to finish seem to far outweigh the importance of almost any slight differences in value systems.
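To make the first point concrete, here is a minimal sketch (Python; every name in it is hypothetical and chosen only for illustration, not part of any actual CEV proposal) of CEV as a schema: it does not produce a value system until the set of included minds, an extrapolation procedure, and a reconciliation procedure have all been supplied.

```python
from typing import Callable, FrozenSet, List

Mind = str               # stand-in for a model of a person
Values = FrozenSet[str]  # stand-in for a value system

def cev(minds: FrozenSet[Mind],
        extrapolate: Callable[[Mind], Values],
        reconcile: Callable[[List[Values]], Values]) -> Values:
    """Toy CEV schema: an output only exists once all three parameters are fixed."""
    return reconcile([extrapolate(m) for m in minds])
```

The claim above is that leaving these three parameters open makes the output underdetermined, not that it makes it arbitrary.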
valuing others’ not having abortions loses to their valuing choice, but the AI arranges things so that most pregnancies are wanted and it doesn’t come up often; valuing the torture of sinful children loses to their desire to not be tortured, and also goes away with a slight increase in intelligence and wisdom
How could you ever guarantee that? Do you think progress toward utilitarian values increases with intelligence/wisdom?
In the context of
But when this is phrased as “the set of minds included in CEV is totally arbitrary, and hence, so will be the output,” an essential truth is lost
I think it’s clear that with
valuing others’ not having abortions loses to their valuing choice
you have decided to exclude some (potential) minds from CEV. You could just as easily have decided to include them and said “valuing choice loses to others valuing their life”.
But, to be clear, I don’t think that even if you limit it to “existing, thinking human minds at the time of the calculation”, you will get some sort of unambiguous result.
an essential truth is lost: while parts of CEV are left unspecified, other parts are specified, and so the output is not fully arbitrary.
What parts are specified? If the set of people is unspecified, the extrapolation procedure is unspecified, and the reconciliation procedure is unspecified, then what is left?
The set of CEV-compatible value systems is smaller than the set of possible value systems;
No. For any value system X that is held by some people, you could always apply CEV to a set of people who all hold X. Unless the extrapolation procedure does something funny, the CEV of that set of people would be X.
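Concretely, the degenerate case described here can be written out in the same toy schema as in the sketch above (again, all names are hypothetical): pick the group to be exactly the holders of X and an extrapolation that changes nothing, and reconciliation of identical value sets simply returns X.

```python
from typing import Callable, FrozenSet, List

Values = FrozenSet[str]

def cev(minds: FrozenSet[str],
        extrapolate: Callable[[str], Values],
        reconcile: Callable[[List[Values]], Values]) -> Values:
    # Same toy schema as in the earlier sketch, repeated so this runs on its own.
    return reconcile([extrapolate(m) for m in minds])

X: Values = frozenset({"some value system, however abhorrent"})
holders_of_x = frozenset({"person_1", "person_2"})  # a group chosen so everyone in it holds X

identity_extrapolation: Callable[[str], Values] = lambda mind: X   # "does nothing funny"
unanimity: Callable[[List[Values]], Values] = lambda vs: frozenset.intersection(*vs)

assert cev(holders_of_x, identity_extrapolation, unanimity) == X   # the output is X again
```

This only restates the comment’s argument in toy form; it says nothing about what any realistic extrapolation procedure would do.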
On a practical level, it would turn launching an AGI first into a competition, potentially replacing careful deliberation with a race to finish. And the risks of mistakes in a race to finish seem to far outweigh the importance of almost any slight differences in value systems.
Unless the extrapolation and reconciliation procedures are trivial, computing the CEV of mankind would probably be beyond the capabilities of any physically plausible AGI, superintelligent or not.
People here seem to assume AGI = omniscient deity, but there are no compelling technical reasons for that assumption. Most likely that’s just a reflection of traditional religious beliefs.
Well, the set of coherent value systems is smaller than the set of incoherent ones.