If a single agent has conflicting desires (each of which it values equally), then it should work to alter its desires, choosing consistent desires that are most likely to be fulfilled.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
Figuring out morality isn’t going to give you the power to talk Clippy down from killing you for more paperclips. You aren’t going to show that human ‘morality’, which actualises what humans prefer, is any more ‘preferable’ than ‘Clippy’ ethics. He is just going to kill you.
Well, if you could persuade him our morality is “better” by his standards—results in more paperclips—then it could work. But obviously arguing that Murder Is Wrong is about as smart as Clippy telling you that killing it would be Wrong because it results in fewer paperclips.
So, let’s now figure out exactly what we want most (as if we had our own CEV), and then go out and do it. Nobody else is gonna do it for us.
Indeed. (Although “us” here includes an FAI, obviously.)
If a single agent has conflicting desires (each of which it values equally), then it should work to alter its desires, choosing consistent desires that are most likely to be fulfilled.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
I don’t understand… I said it has two equally valued desires? So it doesn’t desire one over the other. Say it desires x, y and z equally, except that x → (¬y ∨ ¬z), while y or z (or both) implies ¬x. Then, even though it desires x, it would be optimal to alter its desires so as not to desire x. After that it can always be happy fulfilling y and z, rather than remaining dissatisfied.
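To make that concrete, here’s a toy sketch (the brute-force enumeration and the ‘count the unmet desires’ scoring are purely illustrative assumptions; the two implications above collapse to x → (¬y ∧ ¬z)):

```python
from itertools import product

# Toy model: three equally valued desires x, y, z.
# Illustrative constraint (the two implications above combined):
# x can only be fulfilled in worlds where neither y nor z is.
def consistent(x, y, z):
    return (not x) or (not y and not z)

def min_unmet(desires):
    """Fewest desires left unfulfilled in the best consistent world available."""
    best = sum(desires)  # worst case: nothing it wants gets fulfilled
    for world in product([False, True], repeat=3):
        if consistent(*world):
            unmet = sum(want and not have for want, have in zip(desires, world))
            best = min(best, unmet)
    return best

print(min_unmet((True, True, True)))   # still desires x -> 1: something always goes unmet
print(min_unmet((False, True, True)))  # desire for x dropped -> 0: fully satisfied
```

Under these toy assumptions, keeping the desire for x guarantees at least one unmet desire, while dropping it leaves a desire set that can be fully satisfied.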
I was saying this in response to dspeyer saying he had two axiomatisations of morality (I took that to mean two desires, or sets of desires) which were in conflict. I was saying that there is no universal maxim against which he could measure the two—he just needs to figure out which ones will be optimal in the long term, and (attempt to) discard the rest.
Edit: Oh, I now realise I originally added the word ‘one’ to the first sentence of the earlier post you were quoting. If this was somehow the cause of confusion, my apologies.
“I value both saving orphans from fires and eating chocolate. I’m a horrible person, so I can’t choose whether to abandon my chocolate and save the orphanage.”
Should I self-modify to ignore the orphans? Hell no. If future-me doesn’t want to save orphans then he never will, even if it would cost no chocolate.
That’s a very big counterfactual hypothesis, that there exists someone who assigns equal moral weight to the statements ‘I am saving orphans from fires’ and ‘I am eating chocolate’. It would certainly show a lack of empathy—or a near self-destructive need for chocolate! In fact, the best choice for someone with those qualities in our society (if it would still be ‘human’) would be to keep the desire to save orphans, so as to retain a modicum of humanity. The only reason I suggest it would want such a modicum is to survive in the human society it finds itself in (assuming it wishes to stay alive, so as to continue fulfilling desires).
Of course, this whole counter-example assumes that the two desires are equally desired and at odds, which is quite difficult even to imagine.
But I still think the earlier idea, that there would be no universal moral standard against which it could compare its decision, remains. From my point of view it is certainly wrong, and evil, to choose the chocolate, but I am, alas, only human.
And I will do everything in my power to encourage the sorts of behaviour that make agents prefer saving orphans from fires to eating chocolate!!!
Hey, it doesn’t have to be orphans. Or it could be two different kinds of orphan—boys and girls, say. The boys’ orphanage is on fire! So is the nearby girls’ orphanage! Which one do you save?
Protip: The correct response is not “I self-modify to only care about one sex.”
EDIT: Also, aren’t you kind of fighting the counterfactual?
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, you will act to fulfill the stronger of the two. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is to ‘save all humans’, then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will always mean disutility.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is to ‘save all humans’
...then you have a conflict. The best idea is not to cut off one of those desires, but to find out where the conflict comes from and what higher goals are giving rise to these as instrumental subgoals.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
Try the search bar. It’s a pretty common concept here, although I don’t recall where it originated.
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, you will act to fulfill the stronger of the two. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is to ‘save all humans’, then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will always mean disutility.
Well, that disutility is only lower according to my new preferences; my old ones remain sadly unfulfilled.
More specifically, if I value both freedom and safety (for everyone), should I self-modify not to hate reprogramming others? Or not to care that people will decide to kill each other sometimes?
Hmm… I don’t think my point necessarily helps here. I meant that you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
However, in the case you offered (and probably in most cases) it’s not a good idea to self-modify, as the desires don’t always clash in principle. As with the chocolate and saving-kids one, you just have to perform the utility calculation to see which way to go (in that one, saving the kids).
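For instance, a back-of-the-envelope version of that calculation, with utility numbers that are entirely made up for illustration:

```python
# Entirely made-up utilities, just to illustrate the comparison.
utilities = {
    "save the orphans": 1000.0,  # hypothetical value placed on the kids
    "eat the chocolate": 1.0,    # hypothetical value placed on the chocolate
}

# No self-modification needed: just take whichever action scores higher.
best_action = max(utilities, key=utilities.get)
print(best_action)  # -> "save the orphans"
```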
you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
Yup. And if you stop caring about one of those values, then modified!you will be happier. But you don’t care about what modified!you wants, you care about x and not-x.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
“Fighting the counterfactual” presumably means “fighting the hypo[thetical]”.
Thanks.
...then you have a conflict. The best idea is not to cut off one of those desires, but to find out where the conflict comes from and what higher goals are giving rise to these as instrumental subgoals.
If you can’t, then:
You have failed.
Sucks to be you.
If you’re screwed enough, you’re screwed.
(For the record, I meant terminal values.)
But how do you know something is a terminal value? They don’t come conveniently labelled. Someone else just claimed that not killing people is a terminal value for all “neurotypical” people, but unless they’re going to classify every soldier, everyone exonerated at an inquest by reason of self-defence, and every doctor who has acceded to a terminal patient’s desire for an easy exit as non-“neurotypical”, “not killing people” bears about as much resemblance to a terminal value as a D&D character sheet does to an actual person.
I was oversimplifying things. Updated now, thanks.