A CEV optimizer is less likely to do horrific things while its ability to extrapolate volition is “weak”. If, with the resources it has, it can’t extrapolate far from the unwise preferences people hold now, it will notice that the extrapolated volition (EV) varies a lot across the population, and take no action. If the extrapolation system has a bug in it, that will hopefully show up as incoherence too. So coherence is a kind of “sanity test”.
That’s one reason that leaps to mind anyway.
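To make the “sanity test” idea concrete, here is a toy sketch. It assumes, very loosely, that each person’s extrapolated volition can be summarized as a numeric score per candidate action. The function name `choose_action`, the score representation, and the `max_spread` threshold are all made up for illustration and are not part of any CEV proposal; the only point is that high disagreement leads to inaction.

```python
from statistics import pstdev

def choose_action(extrapolated_scores, max_spread=0.1):
    """Toy coherence check (hypothetical, for illustration only).

    extrapolated_scores maps each candidate action to a list of
    per-person scores in [0, 1]. An action is eligible only if the
    scores cohere, i.e. their population standard deviation is at
    most max_spread. Returns the eligible action with the highest
    mean score, or None ("take no action") if nothing coheres.
    """
    best = None
    for action, scores in extrapolated_scores.items():
        spread = pstdev(scores)           # disagreement across people
        mean = sum(scores) / len(scores)  # average endorsement
        if spread <= max_spread and (best is None or mean > best[1]):
            best = (action, mean)
    return best[0] if best else None

# Weak or buggy extrapolation tends to show up as wide disagreement,
# in which case the optimizer does nothing.
coherent = {"a": [0.90, 0.92, 0.88], "b": [0.1, 0.9, 0.5]}
incoherent = {"b": [0.1, 0.9, 0.5]}
print(choose_action(coherent))
print(choose_action(incoherent))
```

Under these assumptions, an optimizer that extrapolates badly fails the check and defaults to doing nothing rather than acting on a garbled target.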
Of course, the other is that there is no evidence that any single human is Friendly, so AI researchers who each wanted to maximize one individual’s EV could never cooperate: nobody has reason to trust anyone else’s volition as the target. As such, an AI that maximizes a single person’s EV is already out of the question. CEV is the next best thing.