I don’t particularly see why an agent would want to have a terminal value it knows it can’t pursue. I don’t really see a point to having terminal values if you can guarantee you’ll never receive utility according to them.
I care about human pleasure, for instance, and assign utility to it over suffering, but if I knew I were going to be consigned to hell, where I and everyone I knew would be tortured for eternity without hope of reprieve, I’d rather be rid of that value.
Only if you were 100% certain a situation would never come up where you could satisfy that value.
Not if you can get negative utility according to that value.
What? By that logic, you should just self-modify into a things-as-they-are maximizer.
(The negative-utility events still happen, man, even if you replace yourself with something that doesn’t care.)
Well, as-is we don’t even have the option of doing that. But the situation isn’t really analogous to, say, offering Gandhi a murder pill, because that takes as a premise that by changing his values, Gandhi would be motivated to act differently.
If the utility function doesn’t have prospects for modifying the actions of the agent that carries it, it’s basically dead weight (see the sketch just after this comment).
As the maxim goes, there’s no point worrying about things you can’t do anything about. In real life, I think this is actually generally bad advice, because if you don’t take the time to worry about something at all, you’re liable to miss the things you could do about it. But if you could be assured in advance that there was almost certainly nothing you could do about it, and it were up to you whether or not to worry, I think it would be better to choose not to.
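A minimal sketch of the “dead weight” point, in Python. The action names, numbers, and the expected_utility helper are my own illustrative assumptions, not anything from the thread: if a term of the utility function takes the same value no matter which action the agent picks, it shifts every action’s score by the same constant and so can never change which action the agent chooses.

```python
# Toy expected-utility chooser; all names and numbers are illustrative.

ACTIONS = ["work", "rest"]

def expected_utility(action, include_unactionable_value):
    # Part of the utility the agent can actually influence (differs by action).
    controllable = {"work": 2.0, "rest": 1.0}[action]
    # A terminal value the agent can never act on: it adds the same constant
    # (here, a large negative number) whichever action is chosen.
    unactionable = -100.0 if include_unactionable_value else 0.0
    return controllable + unactionable

best_with = max(ACTIONS, key=lambda a: expected_utility(a, True))
best_without = max(ACTIONS, key=lambda a: expected_utility(a, False))

# The constant term lowers every score equally, so the chosen action is identical.
assert best_with == best_without == "work"
```

In that narrow sense an unactionable value really is dead weight for action selection; whether it still matters for the agent’s experience is exactly what the rest of the thread argues about.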
I’m not sure I’m parsing you correctly here. Are you talking about the negative utility he gets from … the sensation of getting negative utility from things? So, all things being equal (which they never are) …
Am I barking up the wrong tree here?
That would imply that it was some sort of meta-negative utility, if I’m understanding you correctly. But if you’re asking whether I endorse self-modifying to give up a value given near-certainty that it’s a lost cause, the answer is yes.
No, and that’s why I suspect I’m misunderstanding. The same sort of negative utility—if you see something that gives you negative utility, you get negative utility and that—the fact that you got negative utility from something—gives you even more negative utility!
(Presumably, ever-smaller amounts, to prevent this running to infinity; a quick convergence sketch follows below. Unless this value has an exception for its own negative utility, I suppose?)
I mean, since you’re a utility maximiser, that must be the reason you wanted to stop yourself from getting negative utility from things that would continue anyway: because you attach negative utility … to attaching negative utility!
This is confusing me just writing it … but I hope you see what I mean.
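For what it’s worth, the “ever-smaller amounts” move in the parenthetical above does keep the regress finite. As a toy illustration (the damping factor $r$ is my own assumption, not anything stated in the thread): if the original event carries disutility $u$ and each meta-level, disutility about having disutility, contributes only a fraction $0 < r < 1$ of the level below it, the total is a convergent geometric series:

$$u + ur + ur^{2} + \cdots \;=\; \sum_{k=0}^{\infty} u\,r^{k} \;=\; \frac{u}{1-r} \;<\; \infty.$$

Only if each meta-level contributed as much as, or more than, the level below it ($r \ge 1$) would the stack of meta-disutilities actually run to infinity.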
I think it might be useful here to draw on the distinction between trying to help and trying to obtain warm fuzzies. If something bad is happening and it’s impossible for me to do anything about it, I’d rather not get anti-warm fuzzies on top of that.
Ah, that does make things much clearer. Thanks!
Yup, warm fuzzies were the thing missing from my model. Gotta take them into account.