There’s a very serious chicken-and-egg problem when you talk about what a utility function SHOULD include, as opposed to what it does. You need a place OUTSIDE of the function to have preferences about what the function is.
If you just mean “I wish more humans shared my values on the topic of AGI x-risk”, that’s perfectly reasonable, but trivial. That’s about YOUR utility function, and the frustration you feel at being an outlier.
Ah, yeah, I didn’t mean to say that others’ utility functions should, by their own lights, be modified to care about alignment. I meant that instrumentally, their utility functions already value surviving AGI highly. I’d want to show this to them to get them to care about alignment, even if they and I disagree about a lot of other normative things.
If someone genuinely, reflectively doesn’t care about surviving AGI … then the above just doesn’t apply to them, and I won’t try to convince them of anything. In their case, we just have fundamental, reflectively robust value-disagreement.
I value not getting trampled by a hippo very highly too, but the likelihood that I find myself near a hippo is low. And my ability to do anything about it is also low.