Look, I’m not pro-“Kill All Humans”, but I don’t think that last step is correct.
Bob can prefer that the human race die off and the earth spin uninhabited forever. It makes him evil, but there’s no “logic error” in that, any more than there is in Al’s preference that humanity spread throughout the stars. They both envision future states and take actions that they believe will cause those states.
I think it’s a logical error from the point of view of my theory of computational meta-ethics, not from a general absolute point of view. Indeed, by the VNM theorem, any self-consistent course of action can be said to have a guiding value. But if you see values as something that is computed inside an agent, as I do, and that exists only in the minds of those who execute that computation, then bringing about a state of the world that terminates your existence is a fallacy: whatever value you are maximizing, you cannot maximize it without anyone left who can compute it. Note that this formulation would still allow substituting all humans with computronium devoted to calculating that value, so it is still vulnerable to UFAI, but at least it rejects prima facie a simple extinction of all sentient life.
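To be explicit about the result I’m leaning on (the standard statement, nothing of my own): if an agent’s preference relation ≽ over lotteries satisfies completeness, transitivity, continuity, and independence, the VNM representation theorem gives a utility function u, unique up to positive affine transformation, such that

$$L_1 \succeq L_2 \iff \mathbb{E}[u(L_1)] \ge \mathbb{E}[u(L_2)],$$

so any behaviour consistent enough to satisfy those axioms, Bob’s included, can be read as maximizing some value.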
Ok, but that sounds like a problem with your theory, not someone else’s logic error.
Like, when you call something a “logic error”, my first instinct is to check its logic. When you then clarify that you only mean it doesn’t meet your classification system’s approval, I feel like you’re pulling a bait-and-switch. Maybe go with “sin” or “perversion”, to make clear that your meaning is just “Mr. Mind doesn’t like this”.
grumble grumble...