Morality is good because goals like joy and beauty are good. (For qualifications, see Appendices A through OmegaOne.) This seems like a tautology, meaning that if we figure out the definition of morality it will contain a list of “good” goals like those. We evolved to care about goodness because of events that could easily have turned out differently, in which case “we” would care about some other list. But, and here is where it gets tricky, our Good function says we shouldn’t care about that other list. The function does not recognize evolutionary causes as a reason to care. In fact, it does not contain any representation of itself. This is a feature. We want the future to contain joy, beauty, etc., not just ‘whatever humans want at the time,’ because an AI or similar genie could, and probably would, change what we want if we told it to produce the latter.
Okay, now this definitely sounds like standard moral relativism to me. It’s just got the caveat that obviously we endorse our own version of morality, and that’s the ground on which we make our moral judgements, which is known as appraiser relativism.
I must confess I do not understand what you just said at all. Specifically:
the second sentence: could you please expand on that?
I think I get that the function does not evaluate itself at all, and if you ask it, it just says “it’s just good ’cos it is, all right?”
Why is this a feature? (I suspect the password is “Löb’s theorem”, and I only almost understand why.)
The last bit appears to be what I meant by “therefore it deserves to control the entire future.” It strikes me as insufficient reason to conclude that this can in no way be improved, ever.
Does the sequence show a map of how to build metamorality from the ground up, much as writing the friendly AI will need to work from the ground up?
the second sentence: could you please expand on that?
I’ll try: any claim that a fundamental/terminal moral goal ‘is good’ reduces to a tautology on this view, because “good” doesn’t have anything to it besides these goals. The speaker’s definition of goodness makes every true claim of this kind true by definition. (Though the more practical statements involve inference. I started to say it must be all logical inference, realized EY could not possibly have said that, and confirmed that in fact he did not.)
I get that the function does not evaluate itself at all,
Though technically it may see the act of caring about goodness as good. So I have to qualify what I said before that way.
Why is this a feature?
Because if the function could look at the mechanical, causal steps it takes, and declare them perfectly reliable, it would lead to a flat self-contradiction by Löb’s theorem. The other way looks like a contradiction but isn’t. (We think.)
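To spell out the Löb step a little (this is my own rough gloss in provability-logic terms, so treat it as a sketch rather than the canonical argument): writing $\Box P$ for “$P$ is provable in theory $T$”, Löb’s theorem says

$$ T \vdash (\Box P \to P) \quad\Longrightarrow\quad T \vdash P. $$

In words: as soon as $T$ endorses its own proofs of $P$ as guaranteeing $P$, it can already prove $P$ outright. Now apply that with $P = \bot$: a system that certified its own machinery as perfectly reliable even for that case would prove $\bot$, i.e. be flatly inconsistent. That is the sense in which a Good function that passed judgement on its own causal steps would blow up, while one that simply declines to represent itself stays (we think) consistent.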
Though technically it may see the act of caring about goodness as good. So I have to qualify what I said before that way.
Ooh yeah, didn’t spot that one. (As someone who spent a lot of time when younger thinking about this and trying to be a good person, I certainly should have spotted this.)
Thank you, this helps a lot.