I’m trying to understand this, and I’m trying to do it by being a little more concrete.
Suppose I have a choice to make, and my moral intuition is throwing error codes. I have two axiomatizations of morality that are capable of examining the choice, but they give opposite answers. Does anything in this essay help? If not, is there a future essay planned that will?
In a universe that contains a neurotypical human and clippy, and they’re staring at each other, is there an asymmetry?
Haiti today is a situation that makes my moral intuition throw error codes. Population density is three times that of Cuba. Should we be sending aid? It would be kinder to send helicopter gunships and carry out a cull. Cut the population back to one tenth of its current level, then build paradise. My rival moral intuition is that culling humans is always wrong.
Trying to stay concrete and present, should I restrict my charitable giving to helping countries make the demographic transition? Within a fixed aid budget one can choose package A = (save one child, provide education, provide entry into global economy; 30 years later the child, now an adult, feeds his own family and has some money left over to help others)
package B = (save four children; that’s it, money all used up; thirty years later there are 16 children needing saving and it’s not going to happen). Concrete choice of A over B: ignore Haiti and send money to Karuna Trust to fund education for untouchables in India, preferring to raise a few children out of poverty by letting other children die.
It’s also about half that of Taiwan, significantly less than South Korea or the Netherlands, and just above Belgium, Israel, and Japan—as well as very nearly on par with India, the country you’re using as an alternative! I suspect your source may have overweighted population density as a factor in poor social outcomes.
I don’t see how these two frameworks are appealing to different terminal values. They seem to be arguments about which policies maximize consequential lives-saved over time, or maximize QALYs (Quality-Adjusted Life Years) over time. This seems like a surprisingly neat and lovely illustration of “disagreeing moral axioms” that turn out to be about instrumental policies without much in the way of differing terminal values, hence a dispute of fact with a true-or-false answer under a correspondence theory of truth for physical-universe hypotheses.
I think that is it: I’m trying to do utilitarianism. I’ve got some notion q of quality and quantity of life. It varies through time. How do I assess a long-term policy, with short-term sacrifices for better output in the long run? I integrate over time with a suitable weighting such as
$$\int q(t)\, e^{-\frac{t}{\tau}}\, dt$$
What is the significance of the time constant tau? I see it as mainly a humility factor, because I cannot actually see into the future and know how things will turn out in the long run. Accordingly I give reduced weight to the future, much beyond tau, for better or worse, because I do not trust my assessment of either.
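To see how much the choice of tau matters, here is a minimal numerical sketch of that weighted integral. The two trajectories are invented toy numbers loosely shaped like packages A and B (they are not data from anywhere), and the 100-year horizon and the function names are likewise assumptions made for illustration.

```python
import math

def discounted_value(q, tau, horizon, dt=0.01):
    """Approximate the integral of q(t) * exp(-t / tau) dt over [0, horizon]
    with the midpoint rule."""
    steps = int(round(horizon / dt))
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of the i-th slice
        total += q(t) * math.exp(-t / tau) * dt
    return total

# Hypothetical trajectories (toy numbers, assumed for illustration only).
def package_a(t):
    # Small benefit now, larger benefit once the educated child is an adult.
    return 1.0 if t < 30 else 5.0

def package_b(t):
    # Larger benefit now, nothing left after thirty years.
    return 4.0 if t < 30 else 0.0

if __name__ == "__main__":
    for tau in (10, 100):
        a = discounted_value(package_a, tau, horizon=100)
        b = discounted_value(package_b, tau, horizon=100)
        print(f"tau={tau}: A={a:.1f}, B={b:.1f}")
```

With these toy numbers a short tau favours B and a long tau favours A, so the ranking of the two packages is driven by how far ahead one is willing to trust one's own forecast.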
But is that an adequate response to human fallibility? My intuition is that one has to back it up with an extra rule: if my moral calculations suggest culling humans, it’s time to give up, go back to painting kitsch watercolours and leave politics to the sane. That’s my interpretation of dspeyer’s phrase “my moral intuition is throwing error codes.” Now I have two rules, so Sod’s Law tells me that some day they are going to conflict.
Eliezer’s post made an ontological claim, that a universe with only two kinds of things, physics and logic, has room for morality. It strikes me that I’ve made no dent in that claim. All I’ve managed to argue is that it all adds up to normality: we cannot see the future, so we do not know what to do for the best. Panic and tragic blunders ensue, as usual.
I interpreted Eliezer’s questions as a response to the evocative phrase “my moral intuition is throwing error codes.” What does it actually mean? Can it be grounded in an actual situation?
Grounding it in an actual situation introduces complications. Given a real life moral dilemma it is always a good idea to look for a third option. But exploring those additional options doesn’t help us understand the computer programming metaphor of moral intuitions throwing error codes.
My original draft contained a long ramble about permanent Malthusian immiseration. History is a bit of a race. Can society progress fast enough to reach the demographic transition? Or does population growth redistribute all the gains in GDP so that individuals get poorer, life gets harder, the demographic transition doesn’t happen,… If I were totally evil and wanted to fuck over as many people as I could, as hard as I could, my strategy for maximum holocaust is as follows.
Establish free mother-and-baby clinics
Provide free food for the under fives
Leverage the positive reputation from the first two to promote religions that oppose contraception
Leverage religious faith to get contraception legally prohibited
If I can get population growth to outrun technological gains in productivity I can engineer a Limits to Growth style crash. That will be vastly worse than any wickedness that I could work by directly harming people.
Unfortunately, I had been reading various articles discussing the 40th Anniversary of the publication of the Limits to Growth book. So I deleted the set up for the moral dilemma from my comment, thinking that my readers will be over-familiar with concerns about permanent Malthusian immiseration, and pick up immediately on “aid as sabotage”, and the creation of permanent traps.
My original comment was a disaster, but since I’m pig-headed I’m going to have another go at saying what it might mean for one’s moral intuitions to throw error codes:
Imagine that you (a good person) have volunteered to help out in sub-Saharan Africa, distributing free food to the under fives :-) One day you find out who is paying for the food. Dr Evil is paying; it is part of his plan for maximum holocaust...
Really? That’s your plan for “maximum holocaust”? You’ll do more good than harm in the short run, and if you run out of capital (not hard with such a wastefully expensive plan) then you’ll do nothing but good.
This sounds to me like a political applause light, especially
Leverage the positive reputation from the first two to promote religions that oppose contraception
Leverage religious faith to get contraception legally prohibited
In essence, your statement boils down to “if I wanted to do the most possible harm, I would do what the Enemy are doing!” which is clearly a mindkilling political appeal.
(For reference, here’s my plan for maximum holocaust: select the worst things going on in the world today. Multiply their evil by their likelihoods of success. Found a terrorist group attacking the winners. Be careful to kill lots of civilians without actually stopping your target.)
Imagine that you (a good person) have volunteered to help out in sub-Saharan Africa, distributing free food to the under fives :-) One day you find out who is paying for the food. Dr Evil is paying; it is part of his plan for maximum holocaust...
I’m afraid Franken Fran beat you to this story a while ago.
Hopefully this comment was intended as a non-obvious form of satire; otherwise it’s completely nonsensical.
You, Mr. AlanCrowe that is, are conflating aid that prevents temporary suffering with a lack of proper long-term solutions. As the saying goes:
“Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime.”
You’re forgetting the “teach a man to fish” part entirely. Which should be enough, given the context, to explain what’s wrong with your reasoning. I could go on explaining further, but I don’t want to talk about such heinous acts, the ones you mentioned, unnecessarily.
EDIT:
Alright, sorry: I slightly misjudged the type of your mistake, because I had an answer ready and recognized a pattern. Your mistake wasn’t quite that skin-deep.
In any case I think it’s extremely insensitive and rash to make poor excuses for atrocities like these:
It would be kinder to send helicopter gunships and carry out a cull. Cut the population back to one tenth of its current level, then build paradise.
In any case you created a false dichotomy between different attempts at optimizing charity here:
A = (save one child, provide education, provide entry into global economy; 30 years later the child, now an adult, feeds his own family and has some money left over to help others) package B = (save four children; that’s it, money all used up, thirty years later there are 16 children needing saving and it’s not going to happen).
And then, by means of trickery, you transformed it into “being unsympathetic now” + “sympathetic later” > “sympathetic now” > “more to be sympathetic about later”.
However in the really real world each unnecessary death prevented counts, each starving child counts, at least in my book. If someone suffers right now in exchange for someone else not suffering later—nothing is gained.
Which to me looks like you’re just eager to throw sympathy out the window in hopes of looking very rational in contrast. And with this trickery you’ve made it look like these suffering people deserve what they get and there’s nothing you can do about it. You could also accompany options A and B with option C, “Save as many children as possible and fight harder to raise money for schools and infrastructure as well”, not to mention that you can give food to people who are building those schools, and it’s not a zero-sum game.
Imagine that you (a good person) have volunteered to help out in sub-Saharan Africa, distributing free food to the under fives :-) One day you find out who is paying for the food. Dr Evil is paying; it is part of his plan for maximum holocaust...
I would be very happy that Dr. Evil appears to be maximally incompetent.
Seriously, why are you basing your analysis on a 40-year-old book whose predictions have failed to come true?
My actual situations are too complicated and I don’t feel comfortable discussing them on the internet. So here’s a fictional situation with real dilemmas.
Suppose I have a friend who is using drugs to self-destructive levels. This friend is no longer able to keep a job, and I’ve been giving him couch-space. With high probability, if I were to apply pressure, I could decrease his drug use. One axiomatization says I should consider how happy he will be with an outcome, and I believe he’ll be happier once he’s sober and capable of taking care of himself. Another axiomatization says I should consider how much he wants a course of action, and I believe he’ll be angry at my trying to run his life.
As a further twist, he consistently says different things depending on which drugs he’s on. One axiomatization defines a person such that each drug-cocktail-personality is a separate person whose desires have moral weight. Another axiomatization defines a person such that my friend is one person, but the drugs are making it difficult for him to express his desires. The desires with moral weight are the ones he would have if he were sober (and it’s up to me to deduce them from the evidence available).
My response to this situation depends on how he’s getting money for drugs given that he no longer has a job and also on how much of a hassle it is for you to give him couch-space. If you don’t have the right to run his life, he doesn’t have the right to interfere in yours (by taking up your couch, asking you for drug money, etc.).
I am deeply uncomfortable with the drug-cocktail-personalities-as-separate-people approach; it seems too easily hackable to be a good foundation for a moral theory. It’s susceptible to a variant of the utility monster, namely a person who takes a huge variety of drug cocktails and consequently has a huge collection of separate people in his head. A potentially more realistic variant of this strategy might be to start a cult and to claim moral weight for your cult’s preferences once it grows large enough…
(Not that I have any particular cult in mind while saying this. Hail Xenu.)
Edit: I suppose your actual question is how the content of this post is relevant to answering such questions. I don’t think it is, directly. Based on the subsequent post about nonstandard models of Peano arithmetic, I think Eliezer is suggesting an analogy between the question of what is true about the natural numbers and the question of what is moral. To address either question one first has to logically pinpoint “the natural numbers” and “morality” respectively, and this post is about doing the latter. Then one has to prove statements about the things that have been logically pointed to, which is a difficult and separate question, but at least an unambiguously meaningful one once the logical pinpointing has taken place.
The two contrasts you’ve set up (happiness vs. desire-satisfaction, and temporal-person-slices vs. unique-rationalized-person-idealization) aren’t completely independent. For instance, if you accept weighting all the temporal slices of the person equally, then you can weight all their desires or happinesses against each other; whereas if you take the ‘idealized rational transformation of my friend’ route, you can disregard essentially all of his empirical desires and pleasures, depending on just how you go about the idealization process. There are three criteria to keep in mind here:
Does your ethical system attend to how reality actually breaks down? Can we find a relatively natural and well-defined notion of ‘personal identity over time’ that solves this problem? If not, then that obviously strengthens the case for treating the fundamental locus of moral concern as a person-relativized-to-a-time, rather than as a person-extended-over-a-lifetime.
Does your ethical system admit of a satisfying reflective equilibrium? Do your values end up in tension with themselves, or underdetermining what the right choice is? If so, you may have taken a wrong turn.
Are these your core axiomatizations, or are they just heuristics for approximating the right utility-maximizing rule? If the latter, then the right question isn’t Which Is The One True Heuristic, but rather which heuristics have the most severe and frequent biases. For instance, the idealized-self approach has some advantages (e.g., it lets us disregard the preferences of brainwashed people in favor of their unbrainwashed selves), but it also has huge risks by virtue of its less empirical character. See Berlin’s discussion of the rational self.
Another axiomatization defines a person such that my friend is one person, but the drugs are making it difficult for him to express his desires
I think that is simply factually wrong, meaning, it’s a false statement about your friend’s brain.
One axiomatization says I should consider how happy he will be with an outcome, and I believe he’ll be happier once he’s sober and capable of taking care of himself. Another axiomatization says I should consider how much he wants a course of action, and I believe he’ll be angry at my trying to run his life.
I think it comes down to this: you want your friend sober and happy, but your friend’s preferences and actions work against those values. The question is what kind of influence on him is allowed.
Suppose I have a choice to make, and my moral intuition is throwing error codes. I have two axiomatizations of morality that are capable of examining the choice, but they give opposite answers.
If you’re not sure which of two options is better, the only thing that will help is to think about it for a long time. (Note: if you “have two axiomatizations of morality”, and they disagree, then at most one of them accurately describes what you were trying to get at when you attempted to axiomatize morality. To work out which one is wrong, you need to think about them for ages until you notice that one of them says something wrong.)
In a universe that contains a neurotypical human and clippy, and they’re staring at each other, is there an asymmetry?
Yes, the human is better. Why? Because the human cares about what is better. In contrast to clippy, who just cares about what is paperclippier.
Indeed. However, a) betterness is obviously better than clippiness, and b) if dspeyer is anything like a typical human being, the implicit question behind “is there an asymmetry?” was “is one of them better?”
What is your evidence for stating that human-betterness is “obviously better” than clippy-betterness? Your comment reads to me as if you’re either arguing that 3 > Potato or that there exists a universally compelling argument. I could, however, be wrong.
“Human-betterness” and “clippy-betterness” are confused terminology. There’s only betterness and clippiness. Clippiness is not a type of betterness. Humans generally care about betterness, paperclippers care about clippiness. You can’t argue a paperclipper into caring about betterness.
I said that betterness is better than clippiness. This should be obvious, since it’s a tautology.
I certainly agree with you that you can’t argue a paperclipper into caring about what you call betterness.
I do however think that “betterness is better than clippiness” is not a tautology, rather it is vacuous.
It has as much meaning as “3 is greater than potato” and invokes the same reaction in me as “comparing apples and oranges”.
At best, if you ranked UberClippy (the most Clippy of all Paperclippers) and UberHuman (the best possible human) on all of the criteria that are important to humans, then UberHuman would naturally rate higher; that is a tautology. And if you define better to mean that, then I would absolutely concede it (and I assume that you do). However, I would also say that it is just as valid to define better such that it applies to all of the criteria that are important to Paperclippers.
To state it a different way: to me, your first paragraph leads to the conclusion “Paperclippers cannot do better because clippiness is not a type of betterness”, which seems to me like you’re pulling a fast one on the meaning of “better”.
To me it seems that you are mixing together “better” in “morally better”, and “better” as “more efficient”. If we replace the second one with “more efficient”, we get:
Betterness (moral) is a more efficient measure of being better (morally).
Clippiness is a more efficient measure of being clippy.
I guess we (and Clippy) could agree about this. It is just confusing to write the latter sentence as “clippiness is better than betterness, with regards to being clippy”, because the two different meanings are expressed there by the same word “better”. (Why does this even happen? Because we use “better” as universal applause lights.)
EDIT: More precisely, the moral “better” also means more efficient, but at reaching some specific goals, such as human happiness, etc. So the difference is between “more efficient (without goals being specified)” and “more efficient (at this specific set of goals)”. Clippiness is more efficient at making paperclips, but is not more efficient at making humans happy.
This. “Good” can refer to either a two-place function ‘goodness(action, goal_system)’ (though the second argument can be implicit in the context) or to the one-place function you get when you curry the second argument of the former to something like ‘life, consciousness, etc., etc. etc.’. EY is talking specifically about the latter, but he isn’t terribly clear about that.
EDIT: BTW, the antonym of the former is usually “bad”, whereas the antonym of the latter is usually “evil”.
EDIT 2: A third meaning is the two-place function with the second argument curried to the speaker’s terminal values, so that I could say “good” to mean ‘good for life, consciousness, etc.’ and Clippy could say “good” to mean ‘good for making paperclips’, and this doesn’t mean one of us is mistaken about what “good” means, any more than the fact that we use “here” to refer to different places means one of us is mistaken about what “here” means.
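The three readings of “good” above can be sketched in code. This is only an illustration under loud assumptions: the feature names, the weights in HUMAN_VALUES and CLIPPY_VALUES, and the example actions are all made up, and real goal systems are obviously not two-entry dictionaries.

```python
from functools import partial

def goodness(action, goal_system):
    """Two-place 'good': score an action relative to an explicit goal system.
    Both arguments are dicts keyed by feature name (a toy representation)."""
    return sum(weight * action.get(feature, 0.0)
               for feature, weight in goal_system.items())

# Hypothetical goal systems, assumed purely for illustration.
HUMAN_VALUES = {"happiness": 1.0, "paperclips": 0.0}
CLIPPY_VALUES = {"happiness": 0.0, "paperclips": 1.0}

# One-place 'good': the second argument fixed to the human-ish goal system.
better = partial(goodness, goal_system=HUMAN_VALUES)
# The analogous one-place function for paperclips gets a different name.
clippier = partial(goodness, goal_system=CLIPPY_VALUES)

def good_as_said_by(speaker_values):
    """Indexical 'good': the second argument fixed to whoever is speaking."""
    return partial(goodness, goal_system=speaker_values)

cure_disease = {"happiness": 10.0, "paperclips": 0.0}
build_clip_factory = {"happiness": 1.0, "paperclips": 1000.0}

if __name__ == "__main__":
    print(better(cure_disease), better(build_clip_factory))
    print(clippier(cure_disease), clippier(build_clip_factory))
```

On this sketch, `better` and `clippier` are simply different functions: neither is a mistaken version of the other. The indexical reading, `good_as_said_by`, behaves like “here”, returning a different function depending on who says it.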
It could be valid to define “better” any way you like. But the definition most consistent with normal usage includes all and only criteria that matter to humans. This is why people say things like “but is it truly, really, fundamentally better?” Because people really care about whether A is better than B. If “better” meant something else (other than better), such as produces more paperclips, then people would find a different word to describe what they care about.
Hrrm ok. That is a different way of looking at it.
My take on the word is that the normal usage of better is by itself a context free comparator. The context of the comparison comes from the things around it (implicitly or explicitly) thus “UberClippy is better than Clippy” (implied: At being a Paperclipper), Manchester United is better than Leeds (implied: At playing football), or even “Betterness is better for humans than clippiness”. I have no problem with “Betterness is more humane than clippiness”.
Note that I don’t think I’m disagreeing with Eliezer here. Fundamentally you are processing the logical concept with a static context; I process it with a local context. Either way it’s highly unlikely that the context you hold or that I would derive would be the same as the paperclipper versions of ourselves (or indeed any given brain in potential brain space).
I am confused by what you mean by “better” here. Your statement makes sense to me if I replace better with “humanier” (more humanly? more human-like? Not humane… too much baggage). Is that what you mean?
Perhaps it would help to taboo “symmetry”, or at least to say what kind of… uhm, mapping… do we really expect here. Just some way to play with words, or something useful? How specifically useful?
Saying “humans : better = paperclips maximizers : more clippy” would be a correct answer in a test of verbal skills. Just be careful not to add a wrong connotation there.
Because saying ”...therefore ‘better’ and ‘more clippy’ are just two different ways of being better, for two different species” would be a nonsense, exactly like saying ”...therefore ‘more clippy’ and ‘better’ are just two different ways of being more clippy, for two different species”. No, being better is not a homo sapiens way to produce the most paperclips. And being more clippy is not a paperclip maximizer way to produce the most happiness (even for the paperclip maximizers).
If one a single agent has conflicting desires (each of which it values equally) then it should work to alter its desires, so it chooses consistent desires that are most likely to be fulfilled.
To your latter question though, I think that what you’re asking is “If two agents have utility functions that clash, which one is to be preferred?”
Is it that all we can say is “Whichever one has the most resources and most optimisation power/intelligence will be able to put its goals into action and prevent the other one from fully acting upon its”?
Well, I think that the point Eliezer has talked about a few times before is that there is no ultimate morality, written into the universe that will affect any agent so as to act it out. You can’t reason with an agent which has a totally different utility function. The only reason that we can argue with humans is that they’re only human, and thus we share many desires. Figuring out morality isn’t going to give you the powers to talk down Clippy from killing you for more paper clips. You aren’t going to show how human ‘morality’, which actualises what humans prefer, is any more preferable than ‘Clippy’ ethics. He is just going to kill you.
So, let’s now figure out exactly what we want most, (if we had our own CEV) and then go out and do it.
Nobody else is gonna do it for us.
EDIT: First sentence ‘conflicting desires’; I meant to say ‘in principle unresolvable’ like ‘x’ and ‘~x’. Of course, for most situations, you have multiple desires that clash, and you just have to perform utility calculations to figure out what to do.
You can’t reason with an agent which has a totally different utility function. The only reason that we can argue with humans is that they’re only human, and thus we share many desires.
If you know (or correctly guess) the agent’s utility function, and are able to communicate with it, then it may well be possible to reason with it.
Consider this situation; I am captured by a Paperclipper, which wishes to extract the iron from my blood and use it to make more paperclips (incidentally killing me in the process). I can attempt to escape by promising to send to the Paperclipper a quantity of iron—substantially more than can be found in my blood, and easier to extract—as soon as I am safe. As long as I can convince Clippy that I will follow through on my promise, I have a chance of living.
I can’t talk Clippy into adopting my own morality. But I can talk Clippy into performing individual actions that I would prefer Clippy to do (or into refraining from other actions) as long as I ensure that Clippy can get more paperclips by doing what I ask than by not doing what I ask.
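The bargaining condition can be written down as a one-line expected-value comparison. This is a deliberately crude sketch: the function name, the single `trust` probability, and the example quantities are assumptions for illustration, and it ignores anything Clippy could do to get both the blood and the promised iron.

```python
def clippy_accepts(iron_in_captive, iron_promised, trust):
    """Return True if releasing the captive yields more expected iron
    (hence more expected paperclips) than extracting it from the captive.

    trust is Clippy's probability that the promise will actually be kept.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must be a probability between 0 and 1")
    return trust * iron_promised > iron_in_captive

if __name__ == "__main__":
    # Units are arbitrary; the numbers are made up.
    print(clippy_accepts(iron_in_captive=4.0, iron_promised=1000.0, trust=0.5))
    print(clippy_accepts(iron_in_captive=4.0, iron_promised=1000.0, trust=0.001))
```

Nothing in the comparison mentions what the captive wants. The captive's preferences only matter through the size of the offer and its credibility, which is the point being made about appealing to Clippy's desires rather than one's own.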
Of course, my mistake. I meant that you can’t alter an agent’s desires by reason alone. You can’t appeal to desires you have. You can only appeal to its desires. So, when he’s going to turn your blood iron into paperclips, and you want to live, you can’t try “But I want to live a long and happy life!”. If Clippy hasn’t got empathy, and you have nothing to offer that will help fulfill his own desires, then there’s nothing to be done, other than try to physically stop or kill him.
Maybe you’d be happier if you put him on a planet of his own, where a machine constantly destroys paperclips, and he was happy making new ones. My point is just that, if you do decide to make him happy, it’s not the optimal decision relative to a universal preference, or morality. It’s just the optimal decision relative to your desires. Is that ‘right’? Yes. That’s what we refer to, when we say ‘right’.
If one a single agent has conflicting desires (each of which it values equally) then it should work to alter its desires, so it chooses consistent desires that are most likely to be fulfilled.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
Figuring out morality isn’t going to give you the powers to talk down Clippy from killing you for more paper clips. You aren’t going to show how human ‘morality’, which actualises what humans prefer, is any more preferable than ‘Clippy’ ethics. He is just going to kill you.
Well, if you could persuade him our morality is “better” by his standards (that is, results in more paperclips), then it could work. But obviously arguing that Murder Is Wrong is about as smart as him telling you that killing him would be Wrong because it results in fewer paperclips.
So, let’s now figure out exactly what we want most, (if we had our own CEV) and then go out and do it. Nobody else is gonna do it for us.
Indeed. (Although “us” here includes an FAI, obviously.)
If one a single agent has conflicting desires (each of which it values equally) then it should work to alter its desires, so it chooses consistent desires that are most likely to be fulfilled.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
I don’t understand… I said it has two equally valued desires? So, it doesn’t desire one over the other. So, if it desired x, y and z equally, except that x --> (~y v ~z), and y or z (or both) implied ~x, then even though it desires x, it would be optimal to alter its desires, so as to not desire x. Then, it will always be happy fulfilling y and z, and not continue to be dissatisfied.
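The x, y, z case can be checked by brute force. The sketch below assumes one particular reading of the constraints, namely “x implies not-y or not-z” and “y or z implies not-x”, and counts every fulfilled desire as equally valuable; both are assumptions made for illustration.

```python
from itertools import product

def consistent(x, y, z):
    """True if the assignment respects both assumed constraints:
    x -> (not y or not z), and (y or z) -> not x."""
    first = (not x) or (not y) or (not z)
    second = (not (y or z)) or (not x)
    return first and second

def best_worlds():
    """Consistent assignments (x, y, z) that fulfil the most desires."""
    worlds = [w for w in product([False, True], repeat=3) if consistent(*w)]
    top = max(sum(w) for w in worlds)
    return [w for w in worlds if sum(w) == top]

if __name__ == "__main__":
    print(best_worlds())  # [(False, True, True)]
```

Under that reading the only way to fulfil two of the three desires is to give up x. The enumeration says nothing about whether the agent should also stop wanting x, which is the part the replies dispute.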
I was saying this in response to dspeyer saying he had two axiomatizations of morality (I took that to mean two desires, or sets of) which were in conflict. I was saying that there is no universal maxim against which he could measure the two. He just needs to figure out which ones will be optimal in the long term, and (attempt to) discard the rest.
Edit: Oh, I now realise I originally added the word ‘one’ to the first sentence of the earlier post you were quoting. If this was somehow the cause of confusion, my apologies.
“I value both saving orphans from fires and eating chocolate. I’m a horrible person, so I can’t choose whether to abandon my chocolate and save the orphanage.”
Should I self-modify to ignore the orphans? Hell no. If future-me doesn’t want to save orphans then he never will, even if it would cost no chocolate.
That’s a very big counterfactual hypothesis, that there exists someone who holds equal moral weight to the statements ‘I am saving orphans from fires’ and ‘I am eating chocolate’. It would certainly show a lack of empathy, or a near self-destructive need for chocolate! In fact, the best choice for someone (if it would still be ‘human’) with those qualities in our society would be to keep the desire to save orphans, so as to retain a modicum of humanity. The only reason I suggest it would want such a modicum would be so as to survive in the human society it finds itself in (assuming it wishes to stay alive, so as to continue fulfilling desires).
Of course, this whole counter-example assumes that the two desires are equally desired, and at odds. Which is quite difficult even to imagine.
But I still think that the earlier idea, that there would be no universal moral standard against which it could compare its decision, remains. It is certainly wrong, and evil to choose the chocolate from my point of view, but I am, alas, only human.
And I will do everything in my power to encourage the sorts of behaviour that make agents prefer saving orphans from fires to eating chocolate!!!
Hey, it doesn’t have to be orphans. Or it could be two different kinds of orphan: boys and girls, say. The boys’ orphanage is on fire! So is the nearby girls’ orphanage! Which one do you save?
Protip: The correct response is not “I self-modify to only care about one sex.”
EDIT: Also, aren’t you kind of fighting the counterfactual?
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, then you will act to fulfill the stronger of your desires. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is ‘to save all humans’ then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will mean disutility always.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is ‘to save all humans’
...then you have a conflict. The best idea is not to cut off one of those desires, but to find out where the conflict comes from and what higher goals are giving rise to these as instrumental subgoals.
But how do you know something is a terminal value? They don’t come conveniently labelled. Someone else just claimed that not killing people is a terminal value for all “neurotypical” people, but unless they’re going to define every soldier, everyone exonerated at an inquest by reason of self defence, and every doctor who has acceded to a terminal patient’s desire for an easy exit, as non-”neurotypical”, “not killing people” bears about as much resemblance to a terminal value as a D&D character sheet does to an actual person.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
Try the search bar. It’s a pretty common concept here, although I don’t recall where it originated.
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, then you will act to fulfill the stronger of your desires. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is ‘to save all humans’ then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will mean disutility always.
Well, that disutility is only lower according to my new preferences; my old ones remain sadly unfulfilled.
More specifically, if I value both freedom and safety (for everyone), should I self-modify not to hate reprogramming others? Or not to care that people will decide to kill each other sometimes?
Hmm… I don’t think my point necessarily helps here. I meant that you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
However, in the case you offered (and probably most cases) it’s not a good idea to self-modify, as desires don’t clash in principle, always. Like with the chocolate and saving kids one, you just have to perform utility calculations to see which way to go (that one is saving kids).
you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
Yup. And if you stop caring about one of those values, then modified!you will be happier. But you don’t care about what modified!you wants, you care about x and not-x.
“And I quoted the above list because the feeling of rightness isn’t about implementing a particular logical function; it contains no mention of logical functions at all; in the environment of evolutionary ancestry nobody has heard of axiomatization; these feelings are about life, consciousness, etcetera”
In a universe that contains a neurotypical human and clippy, and they’re staring at eachother, is there an asymmetry?
I’m trying to understand this, and I’m trying to do it by being a little more concrete.
Suppose I have a choice to make, and my moral intuition is throwing error codes. I have two axiomatizations of morality that are capable of examining the choice, but they give opposite answers. Does anything in this essay help? If not, is there a future essay planned that will?
In a universe that contains a neurotypical human and clippy, and they’re staring at each other, is there an asymmetry?
Can you be more concrete? Some past or present actual situation?
Haiti today is a situation that makes my moral intuition throw error codes. Population density is three times that of Cuba. Should we be sending aid? It would be kinder to send helicopter gunships and carry out a cull. Cut the population back to one tenth of its current level, then build paradise. My rival moral intuition is that culling humans is always wrong.
Trying to stay concrete and present, should I restrict my charitable giving to helping countries make the demographic transition? Within a fixed aid budget one can choose package A = (save one child, provide education, provide entry into the global economy; 30 years later the child, now an adult, feeds his own family and has some money left over to help others) or package B = (save four children; that’s it, money all used up; thirty years later there are 16 children needing saving and it’s not going to happen). Concrete choice of A over B: ignore Haiti and send money to Karuna Trust to fund education for untouchables in India, preferring to raise a few children out of poverty by letting other children die.
It’s also about half that of Taiwan, significantly less than South Korea or the Netherlands, and just above Belgium, Israel, and Japan—as well as very nearly on par with India, the country you’re using as an alternative! I suspect your source may have overweighted population density as a factor in poor social outcomes.
I don’t see how these two frameworks are appealing to different terminal values; they seem to be arguments about which policies maximize consequential lives-saved over time, or maximize QALYs (Quality-Adjusted Life Years) over time. This seems like a surprisingly neat and lovely illustration of “disagreeing moral axioms” that turn out to be about instrumental policies without much in the way of differing terminal values, hence a dispute of fact with a true-or-false answer under a correspondence theory of truth for physical-universe hypotheses.
ISTM he’s not quite sure whether one QALY thirty years from now should be worth as much as one QALY now.
I think that is it, I’m trying to do utilitarianism. I’ve got some notion q of quality and quantity of life. It varies through time. How do I assess a long term policy, with short term sacrifices for better output in the long run? I integrate over time with a suitable weighting such as
$$\int q(t)\, e^{-\frac{t}{\tau}}\, dt$$
What is the significance of the time constant tau? I see it as mainly a humility factor, because I cannot actually see into the future and know how things will turn out in the long run. Accordingly I give reduced weight to the future, much beyond tau, for better or worse, because I do not trust my assessment of either.
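To make the role of tau concrete, here is a minimal numerical sketch of the discounted integral of q(t) weighted by exp(-t / tau). The integration limits (0 to a finite `horizon`), the function name `discounted_value`, and the two toy policies with their numbers are all assumptions made up for illustration; none of them come from the comment above.

```python
import math

def discounted_value(q, tau, horizon, steps=10_000):
    """Approximate the integral of q(t) * exp(-t / tau) over [0, horizon].

    Uses the midpoint rule. `q` is any function of time returning the
    quality-and-quantity-of-life measure; `tau` is the discounting time
    constant. The finite horizon is an assumption of this sketch.
    """
    dt = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of the i-th slice
        total += q(t) * math.exp(-t / tau) * dt
    return total

# Hypothetical policies: steady benefit now, versus a ten-year sacrifice
# followed by a doubled benefit afterwards.
def steady(t):
    return 1.0

def sacrifice_then_gain(t):
    return 0.0 if t < 10 else 2.0

for tau in (5, 50):
    now = discounted_value(steady, tau, horizon=1000)
    later = discounted_value(sacrifice_then_gain, tau, horizon=1000)
    print(f"tau={tau}: steady={now:.2f}, sacrifice_then_gain={later:.2f}")
```

With a short tau the steady policy scores higher, and with a long tau the sacrifice pays off, which is one way to see tau as a statement of how far ahead one trusts one's own forecasts.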
But is that an adequate response to human fallibility? My intuition is that one has to back it up with an extra rule: if my moral calculations suggest culling humans, it’s time to give up, go back to painting kitsch water colours and leave politics to the sane. That’s my interpretation of dspeyer’s phrase “my moral intuition is throwing error codes.” Now I have two rules, so Sod’s Law tells me that some day they are going to conflict.
Eliezer’s post made an ontological claim, that a universe with only two kinds of things, physics and logic, has room for morality. It strikes me that I’ve made no dent in that claim. All I’ve managed to argue is that it all adds up to normality: we cannot see the future, so we do not know what to do for the best. Panic and tragic blunders ensue, as usual.
Is permitting or perhaps even helping Haitians to emigrate to other countries anywhere in the moral calculus?
I interpreted Eliezer’s questions as a response to the evocative phrase “my moral intuition is throwing error codes.” What does it actually mean? Can it be grounded in an actual situation?
Grounding it in an actual situation introduces complications. Given a real life moral dilemma it is always a good idea to look for a third option. But exploring those additional options doesn’t help us understand the computer programming metaphor of moral intuitions throwing error codes.
So you’re facing a moral dilemma between giving to charity and murdering nine million people? I think I know what the problem might be.
My original draft contained a long ramble about permanent Malthusian immiseration. History is a bit of a race. Can society progress fast enough to reach the demographic transition? Or does population growth redistribute all the gains in GDP so that individuals get poorer, life gets harder, the demographic transition doesn’t happen,… If I were totally evil and wanted to fuck over as many people as I could, as hard as I could, my strategy for maximum holocaust would be as follows.
Establish free mother-and-baby clinics
Provide free food for the under fives
Leverage the positive reputation from the first two to promote religions that oppose contraception
Leverage religious faith to get contraception legally prohibited
If I can get population growth to outrun technological gains in productivity I can engineer a Limits to Growth-style crash. That will be vastly worse than any wickedness that I could work by directly harming people.
Unfortunately, I had been reading various articles discussing the 40th Anniversary of the publication of the Limits to Growth book. So I deleted the set up for the moral dilemma from my comment, thinking that my readers will be over-familiar with concerns about permanent Malthusian immiseration, and pick up immediately on “aid as sabotage”, and the creation of permanent traps.
My original comment was a disaster, but since I’m pig-headed I’m going to have another go at saying what it might mean for one’s moral intuitions to throw error codes:
Imagine that you (a good person) have volunteered to help out in sub-Saharan Africa, distributing free food to the under fives :-) One day you find out who is paying for the food. Dr Evil is paying; it is part of his plan for maximum holocaust...
Really? That’s your plan for “maximum holocaust”? You’ll do more good than harm in the short run, and if you run out of capital (not hard with such a wastefully expensive plan) then you’ll do nothing but good.
This sounds to me like a political applause light, especially
In essence, your statement boils down to “if I wanted to do the most possible harm, I would do what the Enemy are doing!” which is clearly a mindkilling political appeal.
(For reference, here’s my plan for maximum holocaust: select the worst things going on in the world today. Multiply their evil by their likelihoods of success. Found a terrorist group attacking the winners. Be careful to kill lots of civilians without actually stopping your target.)
I’m afraid Franken Fran beat you to this story a while ago.
Hopefully this comment was intended as non-obvious form of satire, otherwise it’s completely nonsensical.
Mr. AlanCrowe, you’re confusing aid that prevents temporary suffering with a lack of proper long-term solutions. As the saying goes:
“Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime.”
You’re forgetting the “teach a man to fish” part entirely. Which should be enough, given the context, to explain what’s wrong with your reasoning. I could go on explaining further, but I don’t want to talk about such heinous acts, the ones you mentioned, unnecessarily.
EDIT: Alright, sorry, I slightly misjudged the type of your mistake because I had an answer ready and recognized a pattern; your mistake wasn’t quite that skin-deep.
In any case, I think it’s extremely insensitive and rash to poorly excuse yourself of atrocities like these:
In any case, you falsely created a polarity between different attempts at optimizing charity here:
And then, by means of trickery, you transformed it into “being unsympathetic now” + “sympathetic later” > “sympathetic now” > “more to be sympathetic about later”
However in the really real world each unnecessary death prevented counts, each starving child counts, at least in my book. If someone suffers right now in exchange for someone else not suffering later—nothing is gained.
Which to me looks like you’re just eager to throw sympathy out the window in hopes of looking very rational in contrast. And with this false trickery you’ve made it look like these suffering people deserve what they get and there’s nothing you can do about it. You could also accompany options A and B with option C “Save as many children as possible and fight harder to raise money for schools and infrastructure as well” not to mention that you can give food to people who are building those schools and it’s not a zero-sum game.
I would be very happy that Dr. Evil appears to be maximally incompetent.
Seriously, why are you basing your analysis on a 40 year old book whose predictions have failed to come true?
(Are you sure you want this posted under what appears to be a real name?)
Don’t be absurd. How could advocating population control via shotgun harm one’s reputation?
When should I seek the protection of anonymity? Where do I draw the line? On which side do pro-bestiality comments fall?
My actual situations are too complicated and I don’t feel comfortable discussing them on the internet. So here’s a fictional situation with real dilemmas.
Suppose I have a friend who is using drugs to self-destructive levels. This friend is no longer able to keep a job, and I’ve been giving him couch-space. With high probability, if I were to apply pressure, I could decrease his drug use. One axiomatization says I should consider how happy he will be with an outcome, and I believe he’ll be happier once he’s sober and capable of taking care of himself. Another axiomatization says I should consider how much he wants a course of action, and I believe he’ll be angry at my trying to run his life.
As a further twist, he consistently says different things depending on which drugs he’s on. One axiomatization defines a person such that each drug-cocktail-personality is a separate person whose desires have moral weight. Another axiomatization defines a person such that my friend is one person, but the drugs are making it difficult for him to express his desires: the desires with moral weight are the ones he would have if he were sober (and it’s up to me to deduce them from the evidence available).
My response to this situation depends on how he’s getting money for drugs given that he no longer has a job and also on how much of a hassle it is for you to give him couch-space. If you don’t have the right to run his life, he doesn’t have the right to interfere in yours (by taking up your couch, asking you for drug money, etc.).
I am deeply uncomfortable with the drug-cocktail-personalities-as-separate-people approach; it seems too easily hackable to be a good foundation for a moral theory. It’s susceptible to a variant of the utility monster, namely a person who takes a huge variety of drug cocktails and consequently has a huge collection of separate people in his head. A potentially more realistic variant of this strategy might be to start a cult and to claim moral weight for your cult’s preferences once it grows large enough…
(Not that I have any particular cult in mind while saying this. Hail Xenu.)
Edit: I suppose your actual question is how the content of this post is relevant to answering such questions. I don’t think it is, directly. Based on the subsequent post about nonstandard models of Peano arithmetic, I think Eliezer is suggesting an analogy between the question of what is true about the natural numbers and the question of what is moral. To address either question one first has to logically pinpoint “the natural numbers” and “morality” respectively, and this post is about doing the latter. Then one has to prove statements about the things that have been logically pointed to, which is a difficult and separate question, but at least an unambiguously meaningful one once the logical pinpointing has taken place.
The two contrasts you’ve set up (happiness vs. desire-satisfaction, and temporal-person-slices vs. unique-rationalized-person-idealization) aren’t completely independent. For instance, if you accept weighting all the temporal slices of the person equally, then you can weight all their desires or happinesses against each other; whereas if you take the ‘idealized rational transformation of my friend’ route, you can disregard essentially all of his empirical desires and pleasures, depending on just how you go about the idealization process. There are three criteria to keep in mind here:
Does your ethical system attend to how reality actually breaks down? Can we find a relatively natural and well-defined notion of ‘personal identity over time’ that solves this problem? If not, then that obviously strengthens the case for treating the fundamental locus of moral concern as a person-relativized-to-a-time, rather than as a person-extended-over-a-lifetime.
Does your ethical system admit of a satisfying reflective equilibrium? Do your values end up in tension with themselves, or underdetermining what the right choice is? If so, you may have taken a wrong turn.
Are these your core axiomatizations, or are they just heuristics for approximating the right utility-maximizing rule? If the latter, then the right question isn’t Which Is The One True Heuristic, but rather which heuristics have the most severe and frequent biases. For instance, the idealized-self approach has some advantages (e.g., it lets us disregard the preferences of brainwashed people in favor of their unbrainwashed selves), but it also has huge risks by virtue of its less empirical character. See Berlin’s discussion of the rational self.
I think that is simply factually wrong, meaning, it’s a false statement about your friend’s brain.
I think it comes down to this: you want your friend sober and happy, but your friend’s preferences and actions work against those values. The question is what kind of influence on him is allowed.
If you’re not sure which of two options is better, the only thing that will help is to think about it for a long time. (Note: if you “have two axiomatizations of morality”, and they disagree, then at most one of them accurately describes what you were trying to get at when you attempted to axiomatize morality. To work out which one is wrong, you need to think about them for ages until you notice that one of them says something wrong.)
Yes, the human is better. Why? Because the human cares about what is better. In contrast to clippy, who just cares about what is paperclippier.
And the clippy is clippier. Why? Because the clippy cares about what is clippier. In contrast to the human, who just cares about what is better.
Indeed. However, a) betterness is obviously better than clippiness, and b) if dspeyer is anything like a typical human being, the implicit question behind “is there an asymmetry?” was “is one of them better?”
And clippiness is obviously more clipperific. That doesn’t actually answer the question.
What is your evidence for stating that human-betterness is “obviously better” than clippy-betterness? Your comment reads to me as though you’re either arguing that 3 > Potato or that there exists a universally compelling argument. I could however be wrong.
“Human-betterness” and “clippy-betterness” are confused terminology. There’s only betterness and clippiness. Clippiness is not a type of betterness. Humans generally care about betterness, paperclippers care about clippiness. You can’t argue a paperclipper into caring about betterness.
I said that betterness is better than clippiness. This should be obvious, since it’s a tautology.
I certainly agree with you that you can’t argue a paperclipper into caring about what you call betterness.
I do however think that “betterness is better than clippiness” is not a tautology, rather it is vacuous. It has as much meaning as “3 is greater than potato” and invokes the same reaction in me as “comparing apples and oranges”.
At best, if you ranked UberClippy (the most Clippy of all Paperclippers) and UberHuman (the best possible human) on all of the criteria that are important to humans then UberHuman would naturally rate higher; that is a tautology. And if you define better to mean that then I would absolutely concede that (and I assume that you do). However I would also say that it is just as valid to define better such that it applies to all of the criteria that are important to Paperclippers.
To state it a different way, to me your first paragraph leads to the conclusion “Paperclippers cannot do better because clippiness is not a type of betterness”, which seems to me like you’re pulling a fast one on the meaning of “better”.
To me it seems that you are mixing together “better” in “morally better”, and “better” as “more efficient”. If we replace the second one with “more efficient”, we get:
Betterness (moral) is more efficient measure of being better (morally).
Clippiness is more efficient measure of being clippy.
I guess we (and Clippy) could agree about this. It is just confusing to write the latter sentence as “clippiness is better than betterness, with regards to being clippy”, because the two different meanings are expressed there by the same word “better”. (Why does this even happen? Because we use “better” as universal applause lights.)
EDIT: More precisely, the moral “better” also means more efficient, but at reaching some specific goals, such as human happiness, etc. So the difference is between “more efficient (without goals being specified)” and “more efficient (at this specific set of goals)”. Clippiness is more efficient at making paperclips, but is not more efficient at making humans happy.
This. “Good” can refer to either a two-place function ‘goodness(action, goal_system)’ (though the second argument can be implicit in the context) or to the one-place function you get when you curry the second argument of the former to something like ‘life, consciousness, etc., etc. etc.’. EY is talking specifically about the latter, but he isn’t terribly clear about that.
EDIT: BTW, the antonym of the former is usually “bad”, whereas the antonym of the latter is usually “evil”.
EDIT 2: A third meaning is the two-place function with the second argument curried to the speaker’s terminal values, so that I could say “good” to mean ‘good for life, consciousness, etc.’ and Clippy could say “good” to mean ‘good for making paperclips’, and this doesn’t mean one of us is mistaken about what “good” means, any more than the fact that we use “here” to refer to different places means one of us is mistaken about what “here” means.
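The three readings of “good” described above can be sketched with `functools.partial`, which fixes one argument of a function. This is only an illustration under loud assumptions: the weighted-sum scoring rule, the feature names, and the `HUMAN_VALUES` / `CLIPPY_VALUES` dictionaries are hypothetical stand-ins, not anyone’s actual theory of value.

```python
from functools import partial

def goodness(action, goal_system):
    """Two-place 'good': score an action relative to an explicit goal system.

    Both arguments are plain dicts here; the weighted sum is a toy scoring
    rule chosen only to make the example runnable.
    """
    return sum(weight * action.get(feature, 0)
               for feature, weight in goal_system.items())

# Hypothetical goal systems standing in for 'life, consciousness, etc.'
# and for paperclip maximization.
HUMAN_VALUES = {"lives_saved": 1.0, "paperclips": 0.0}
CLIPPY_VALUES = {"lives_saved": 0.0, "paperclips": 1.0}

# One-place 'good': the second argument is fixed to human values once and
# for all, whoever is speaking.
good = partial(goodness, goal_system=HUMAN_VALUES)

# The analogous one-place function for paperclips is a different function,
# not a different opinion about what `good` computes.
clippy = partial(goodness, goal_system=CLIPPY_VALUES)

def good_as_said_by(speaker_values):
    """Indexical 'good': the second argument is fixed to the speaker's values."""
    return partial(goodness, goal_system=speaker_values)

rescue = {"lives_saved": 3}
factory = {"paperclips": 1000}

print(good(rescue) > good(factory))      # True
print(clippy(factory) > clippy(rescue))  # True
```

On the indexical reading, `good_as_said_by(CLIPPY_VALUES)` and `good_as_said_by(HUMAN_VALUES)` disagree without either speaker being wrong about the word, much as with “here”.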
It could be valid to define “better” any way you like. But the definition most consistent with normal usage includes all and only criteria that matter to humans. This is why people say things like “but is it truly, really, fundamentally better?” Because people really care about whether A is better than B. If “better” meant something else (other than better), such as produces more paperclips, then people would find a different word to describe what they care about.
Hrrm ok. That is a different way of looking at it.
My take on the word is that the normal usage of better is by itself a context-free comparator. The context of the comparison comes from the things around it (implicitly or explicitly), thus “UberClippy is better than Clippy” (implied: at being a Paperclipper), “Manchester United is better than Leeds” (implied: at playing football), or even “Betterness is better for humans than clippiness”. I have no problem with “Betterness is more humane than clippiness”.
Note that I don’t think I’m disagreeing with Eliezer here. Fundamentally you are processing the logical concept with a static context; I process it with a local context. Either way it’s highly unlikely that the context you hold or that I would derive would be the same as the paperclipper versions of ourselves (or indeed any given brain in potential brain space).
It is, in Eliezer’s sense of the word. So is “clippiness is clippier than betterness”, though.
He/she is using the built-in human betterness module to make a judgement between human-betterness and clippy-betterness.
There exist no universal compelling arguments about physical things either, but that doesn’t stop us from calling things true.
I am confused by what you mean by “better” here. Your statement makes sense to me if I replace better with “humanier” (more humanly? more human-like? Not humane… too much baggage). Is that what you mean?
Ah, but Clippy is far more clipperific, and so will do more clippy things. Better is not clippy, why should it matter?
Perhaps it would help to taboo “symmetry”, or at least to say what kind of… uhm, mapping… do we really expect here. Just some way to play with words, or something useful? How specifically useful?
Saying “humans : better = paperclips maximizers : more clippy” would be a correct answer in a test of verbal skills. Just be careful not to add a wrong connotation there.
Because saying ”...therefore ‘better’ and ‘more clippy’ are just two different ways of being better, for two different species” would be nonsense, exactly like saying ”...therefore ‘more clippy’ and ‘better’ are just two different ways of being more clippy, for two different species”. No, being better is not a homo sapiens way to produce the most paperclips. And being more clippy is not a paperclip maximizer way to produce the most happiness (even for the paperclip maximizers).
Why do you have two axiomatizations of morality? Where did they come from? Is there a reason to suspect one or both of their sources?
Because axiomatizations are hard. I tried twice. And probably messed up both times, but in different ways.
The axiomatizations are internally complete and consistent, so I understand two genuine logical objects, and I’m trying to understand which to apply.
(Note: my actual map of morality is more complicated and fuzzy—I’m simplifying for sake of discussion)
If one a single agent has conflicting desires (each of which it values equally) then it should work to alter its desires, so it chooses consistent desires that are most likely to be fulfilled.
To your latter question though, I think that what you’re asking is “If two agents have utility functions that clash, which one is to be preferred?” Is it that all we can say is “Whichever one has the most resources and most optimisation power/intelligence will be able to put its goals into action and prevent the other one from fully acting upon its”?
Well, I think that the point Eliezer has talked about a few times before is that there is no ultimate morality, written into the universe that will affect any agent so as to act it out. You can’t reason with an agent which has a totally different utility function. The only reason that we can argue with humans is that they’re only human, and thus we share many desires. Figuring out morality isn’t going to give you the powers to talk down Clippy from killing you for more paper clips. You aren’t going to show how human ‘morality’, which actualises what humans prefer, is any more preferable than ‘Clippy’ ethics. He is just going to kill you.
So, let’s now figure out exactly what we want most, (if we had our own CEV) and then go out and do it. Nobody else is gonna do it for us.
EDIT: First sentence ‘conflicting desires’; I meant to say ‘in principle unresolvable’ like ‘x’ and ‘~x’. Of course, for most situations, you have multiple desires that clash, and you just have to perform utility calculations to figure out what to do.
If you know (or correctly guess) the agent’s utility function, and are able to communicate with it, then it may well be possible to reason with it.
Consider this situation; I am captured by a Paperclipper, which wishes to extract the iron from my blood and use it to make more paperclips (incidentally killing me in the process). I can attempt to escape by promising to send to the Paperclipper a quantity of iron—substantially more than can be found in my blood, and easier to extract—as soon as I am safe. As long as I can convince Clippy that I will follow through on my promise, I have a chance of living.
I can’t talk Clippy into adopting my own morality. But I can talk Clippy into performing individual actions that I would prefer Clippy to do (or into refraining from other actions) as long as I ensure that Clippy can get more paperclips by doing what I ask than by not doing what I ask.
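The bargaining argument above amounts to an expected-value comparison from Clippy’s side. Here is a minimal sketch, assuming Clippy values iron linearly and assigns a single probability `trust` to the promise being kept; the function name and every number below are hypothetical.

```python
def clippy_accepts(offered_iron_g, trust, blood_iron_g):
    """Return True if releasing the prisoner yields more expected iron.

    offered_iron_g: iron promised after release (grams, hypothetical)
    trust:          Clippy's probability that the promise is kept (0..1)
    blood_iron_g:   iron Clippy can extract right now with certainty
    """
    expected_from_deal = trust * offered_iron_g
    return expected_from_deal > blood_iron_g

# A credible promise of much more iron beats the certain small amount...
print(clippy_accepts(offered_iron_g=1000, trust=0.5, blood_iron_g=4))    # True
# ...but the same promise fails if Clippy barely believes it.
print(clippy_accepts(offered_iron_g=1000, trust=0.001, blood_iron_g=4))  # False
```

Nothing in the comparison refers to the prisoner’s own preferences: the only levers are the size of the offer and how credible it is, which is the point being made about appealing to Clippy’s desires rather than one’s own.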
Of course, my mistake. I meant that you can’t alter an agent’s desires by reason alone. You can’t appeal to desires you have. You can only appeal to its desires. So, when he’s going to turn your blood iron into paperclips, and you want to live, you can’t try “But I want to live a long and happy life!”. If Clippy hasn’t got empathy, and you have nothing to offer that will help fulfill his own desires, then there’s nothing to be done, other than try to physically stop or kill him.
Maybe you’d be happier if you put him on a planet of his own, where a machine constantly destroyed paperclips, and he was happy making new ones. My point is just that, if you do decide to make him happy, it’s not the optimal decision relative to a universal preference, or morality. It’s just the optimal decision relative to your desires. Is that ‘right’? Yes. That’s what we refer to, when we say ‘right’.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
Well, if you could persuade him our morality is “better” by his standards (results in more paperclips) then it could work. But obviously arguing that Murder Is Wrong is about as smart as them telling you that killing it would be Wrong because it results in fewer paperclips.
Indeed. (Although “us” here includes an FAI, obviously.)
I don’t understand… I said it has two equally valued desires? So, it doesn’t desire one over the other. So, if it desired x, y and z, equally well, except that x --> <~y v ~z>, but y or z ( or both ) implied just ~x, then even though it desires x, it would be optimal to alter its desires, so as to not desire x. Then, it will always be happy fulfilling y and z, and not continue to be dissatisfied.
I was saying this in response to dspeyer saying he had two axiomatizations of morality (I took that to mean two desires, or sets of) which were in conflict. I was saying that there is no universal maxim against which he could measure the two; he just needs to figure out which ones will be optimal in the long term, and (attempt to) discard the rest.
Edit: Oh, I now realise I originally added the word ‘one’ to the first sentence of the earlier post you were quoting. If this was somehow the cause of confusion, my apologies.
“I value both saving orphans from fires and eating chocolate. I’m a horrible person, so I can’t choose whether to abandon my chocolate and save the orphanage.”
Should I self-modify to ignore the orphans? Hell no. If future-me doesn’t want to save orphans then he never will, even if it would cost no chocolate.
That’s a very big counterfactual hypothesis, that there exists someone who gives equal moral weight to the statements ‘I am saving orphans from fires’ and ‘I am eating chocolate’. It would certainly show a lack of empathy, or a near self-destructive need for chocolate! In fact, the best choice for someone (if it would still be ‘human’) with those qualities in our society would be to keep the desire to save orphans, so as to retain a modicum of humanity. The only reason I suggest it would want such a modicum would be so as to survive in the human society it finds itself in (assuming it wishes to stay alive, so as to continue fulfilling desires). Of course, this whole counter-example assumes that the two desires are equally desired, and at odds. Which is quite difficult even to imagine. But I still think that the earlier idea, that there would be no universal moral standard against which it could compare its decision, remains. It is certainly wrong and evil to choose the chocolate from my point of view, but I am, alas, only human. And I will do everything in my power to encourage the sorts of behaviour that make agents prefer to save orphans from fires rather than eat chocolate!!!
Hey, it doesn’t have to be orphans. Or it could be two different kinds of orphan: boys and girls, say. The boys’ orphanage is on fire! So is the nearby girls’ orphanage! Which one do you save?
Protip: The correct response is not “I self-modify to only care about one sex.”
EDIT: Also, aren’t you kind of fighting the counterfactual?
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, then you will act to fulfill the stronger of your desires. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is ‘to save all humans’ then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will mean disutility always.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
“Fighting the counterfactual” presumably means “fighting the hypo[thetical]”.
Thanks.
...then you have a conflict. The best idea is not to cut off one of those desires, but to find out where the conflict comes from and what higher goals are giving rise to these as instrumental subgoals.
If you can’t, then:
You have failed.
Sucks to be you.
If you’re screwed enough, you’re screwed.
(For the record, I meant terminal values.)
But how do you know something is a terminal value? They don’t come conveniently labelled. Someone else just claimed that not killing people is a terminal value for all “neurotypical” people, but unless they’re going to define every soldier, everyone exonerated at an inquest by reason of self defence, and every doctor who has acceded to a terminal patient’s desire for an easy exit, as non-”neurotypical”, “not killing people” bears about as much resemblance to a terminal value as a D&D character sheet does to an actual person.
I was oversimplifying things. Updated now, thanks.
Try the search bar. It’s a pretty common concept here, although I don’t recall where it originated.
Well, that disutility is only lower according to my new preferences; my old ones remain sadly unfulfilled.
More specifically, if I value both freedom and safety (for everyone), should I self-modify not to hate reprogramming others? Or not to care that people will decide to kill each other sometimes?
Hmm… I don’t think my point necessarily helps here. I meant that you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
However, in the case you offered (and probably most cases) it’s not a good idea to self-modify, as desires don’t clash in principle, always. Like with the chocolate and saving kids one, you just have to perform utility calculations to see which way to go (that one is saving kids).
Yup. And if you stop caring about one of those values, then modified!you will be happier. But you don’t care about what modified!you wants, you care about x and not-x.
Probably this could (not) help
“And I quoted the above list because the feeling of rightness isn’t about implementing a particular logical function; it contains no mention of logical functions at all; in the environment of evolutionary ancestry nobody has heard of axiomatization; these feelings are about life, consciousness, etcetera”
An asymmetry in what?
In a word: no. You just have to accept uncertainty about your utility function and hope that clippy isn’t able to turn you into paperclips yet.