A huge range of utility functions should care about alignment! It’s in the interest of just about everyone to survive AGI.
I’m going to worry less about hammering out value disagreement with people in the here and now, and push this argument on them instead. We’ll hammer out our value disagreements in our CEV, and in our future (should we save it).
There’s a very serious chicken-and-egg problem when you talk about what a utility function SHOULD include, as opposed to what it actually does include. You need a place OUTSIDE of the function to have preferences about what the function is.
If you just mean “I wish more humans shared my values on the topic of AGI x-risk”, that’s perfectly reasonable, but trivial. That’s about YOUR utility function, and the frustration you feel at being an outlier.
Ah, yeah, I didn’t mean to say that others’ utility functions should, by their own lights, be modified to care about alignment. I meant that instrumentally, their utility functions already value surviving AGI highly. I’d want to show this to them to get them to care about alignment, even if they and I disagree about a lot of other normative things.
If someone genuinely, reflectively doesn’t care about surviving AGI … then the above just doesn’t apply to them, and I won’t try to convince them of anything. In their case, we just have fundamental, reflectively robust value disagreement.
I value not getting trampled by a hippo very highly too, but the likelihood that I find myself near a hippo is low. And my ability to do anything about it is also low.