I’m not really sure what it would mean in practice to operate under the assumption that “there is no discoverably correct ethics or metaethics the AI can learn”. The AI still has to do something. The idea of CEV seems to be something like: have the AI act according to preference utilitarianism. It seems to me that this would result in a reasonably good outcome even if preference utilitarianism isn’t true (and in fact I think that preference utilitarianism isn’t true). So the only ethical assumption is that creating a preference utilitarian AI is a good idea. What would be an example of a weaker ethical assumption that could plausibly be sufficient to design an AI?
I’m not entirely sure either, and my best approach has been to change what we really mean by “ethics” so that the problem becomes tractable without forcing a move to making choices about what is normative. I’ll touch on this more when I describe the package of philosophical ideas I believe we should adopt in AI safety research, so for now I’ll leave it as an example of the kind of assumption that is affected by this line of thinking.