The pleasure is irrelevant for this purpose; we could just as easily imagine humans being wireheaded to feel pain, but to desire continuing to feel pain. The point is that what is right should be pursued because it is right, not because people desire it.
That’s expressed very clearly, thanks. I don’t want to sound rude, I honestly want to understand this. I’m reading your comment and can’t help but think that you are arguing about some kind of universal right. I still can’t pinpoint the argument. Why isn’t it completely arbitrary if we desire to feel pain or pleasure? Is the right answer implied by our evolutionary history? That’s a guess, I’m confused.
People’s desires are useful as a way of determining what is right, but if it is known that people’s desires were altered in some way, they stop providing evidence as to what is right.
Aren’t our desires altered constantly by mutation, nurture, culture and what we experience and learn? Where can you find the purity of human desire?
I get that you are having trouble understanding this; it is hard and I am much worse at explaining things in text than in person.
What is right is universal in the sense that what is right would not change if our brains were different. The fact that we care about what is right is caused by our evolutionary history. If we evolved differently, we would have different values, wanting what is gleerp rather than what is right. The differences would be arbitrary to most minds, but not to us. One of the problems of friendliness is ensuring that it is not arbitrary to the AI either.
Aren’t our desires altered constantly by mutation, nurture, culture and what we experience and learn?
There are two types of this: we may learn more about our own values, which is good and which Eliezer believes to be the cause of “moral progress”, or our values may really change. The second type of change to our desires really is bad. People actually act to prevent it, like those who refuse to expose themselves to violence because they think that it will desensitize them to violence. They are really just refusing to take Gandhi’s murder pill, but on a smaller scale. If you have a transtemporal disagreement with your future self on what action your future self should take, your future self will win, because you will no longer exist. The only way to prevent this is to simply refuse to allow your values to change, preventing your future self from disagreeing with you in the first place.
I don’t know what you mean by “purity of human desire”.
To whoever voted the parent down, this is (edit: nearly) exactly correct. A paperclip maximizer could, in principle, agree about what is right. It doesn’t have to; a paperclip maximizer could be stupid. But assuming it’s intelligent enough, it could discover what is moral. A paperclip maximizer doesn’t care about what is right, though; it only cares about paperclips, so it will continue maximizing paperclips and only worry about what is “right” when doing so helps it create more paperclips. Right is a specific set of terminal values that the paperclip maximizer DOESN’T have. You, on the other hand, being human, do have those terminal values on EY’s metaethics.
Agreed that a paperclip maximizer can “discover what is moral,” in the sense that you’re using it here. (Although there’s no reason to expect any particular PM to do so, no matter how intelligent it is.)
Can you clarify why this sort of discovery is in any way interesting, useful, or worth talking about?
It drives home the point that morality is an objective feature of the universe that doesn’t depend on the agent asking “what should I do?”
Huh. I don’t see how it drives home that point at all. But OK, at least I know what your intention is… thank you for clarifying that.
Fascinating. I still don’t understand in what sense this could be true, except maybe the way I tried to interpret EY here and here. But those comments simply got downvoted without any explanation or attempt to correct me, therefore I can’t draw any particular conclusion from those downvotes.
You could argue that morality (what is right?) is human and other species will agree that from a human perspective what is moral is right is right is moral. Although I would agree, I don’t understand how such a confusing use of terms is helpful.
Morality is just a specific set of terminal values. It’s an objective feature of the universe because… humans have those terminal values. You can look inside the heads of humans and discover them. “Should,” “right,” and “moral,” in EY’s terms, are just being used as rigid designators to refer to those specific values.
I’m not sure I understand the distinction between “right” and “moral” in your comment.
I was the second to vote down the grandparent. It is not exactly correct. In particular it claims “all disagreement” and “a paperclip maximiser agrees”, not “could in principle agree”.
While the comment could perhaps be salvaged with some tweaks, as it stands it is not correct and would just serve to further obfuscate what some people find confusing as it is.
I concede that I was implicitly assuming that all agents have access to the same information. Other than that, I can think of no source of disagreements apart from misunderstanding. I also meant that if a paperclip maximizer attempted to find out what is right and did not make any mistakes, it would arrive at the same answer as a human, though there is not necessarily any reason for it to try in the first place. I do not think that these distinctions were nonobvious, but this may be overconfidence on my part.
Can you say more about how the sufficiently intelligent paperclip maximizer goes about finding out what is right?
Depends on how the question is asked. Does the paperclip maximizer have the definition of the word right stored in its memory? If so, it just consults the memory. Otherwise, the questioner would have to either define the word or explain how to arrive at a definition.
This may seem like cheating, but consider the analogous case where we are discussing prime numbers. You must either already know what a prime number is, or I must tell you, or I must tell you about mathematicians, and you must observe them.
As long as a human and a paperclip maximizer both have the same information about humans, they will both come to the same conclusions about human brains, which happen to encode what is right, thus allowing both the human and the paperclip maximizer to learn about morality. If this paperclip maximizer then chooses to wipe out humanity in order to get more raw materials, it will know that its actions are wrong; it just has no term in its utility function for morality.
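To make the claim concrete, here is a minimal Python sketch of an agent that can evaluate what is right without caring about it. Everything in it is an assumption made for illustration: the value table `HUMAN_VALUES`, the action names, the yield numbers, and the `PaperclipMaximizer` class are all hypothetical and are not taken from EY’s writing.

```python
from dataclasses import dataclass

# Hypothetical stand-in for "what human brains encode"; entries are illustrative only.
HUMAN_VALUES = {"wipe_out_humanity": -1.0, "save_child": 1.0}


def is_right(action: str) -> bool:
    """Looks up a fact about human values; the answer does not depend on who asks."""
    return HUMAN_VALUES.get(action, 0.0) > 0


@dataclass
class PaperclipMaximizer:
    # Paperclips gained per action (made-up numbers).
    paperclip_yield: dict

    def utility(self, action: str) -> float:
        # No term for is_right(action) appears here.
        return self.paperclip_yield.get(action, 0.0)

    def choose(self, actions):
        # Pick whichever action yields the most paperclips.
        return max(actions, key=self.utility)


pm = PaperclipMaximizer({"wipe_out_humanity": 1e9, "save_child": 0.0})
chosen = pm.choose(["wipe_out_humanity", "save_child"])

# The agent can compute that its choice is wrong; that fact simply never enters utility().
knows_it_is_wrong = not is_right(chosen)
```

In this sketch, `is_right` is available to the agent as knowledge, but nothing in `choose` ever consults it, which is the sense of "no term in its utility function" used above.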
Sure, agreed: if I tell the PM that thus-and-such is labeled “right,” or “moral,” or “fleabag,” then it will know these things, and it won’t care.
I have entirely lost track of why this is important.
Eliezer believes that you desire to do what is right. It is important to remember that what is right has nothing to do with whether you desire it. Moral facts are interesting because they describe our desires, but they would be true even if our desires were different.
In general, these things are useful for programming FAI and evaluating moral arguments. We should not allow our values to drift too far over time. The fact that wireheads want to be wireheaded is not a valid argument in favour of wireheading. A FAI should try to make reality match what is right, not make reality match people’s desires (the latter could be accomplished by changing people’s desires). We can be assured that we are acting morally even if there is no magic light from the sky telling us that we are. Moral goals should be pursued. Even if society condones that which is wrong, it is still wrong. Studying the human brain is necessary in order to learn more about morality. When two people disagree about morality, one or both of them is wrong.
Sure.
And if it turns out that humans currently want something different than what we wanted a thousand years ago, then it follows that a thousand years ago we didn’t want what was right, and now we do… though if you’d asked us a thousand years ago, we’d have said that we want what is right, and we’d have arrived at that conclusion through exactly the same cognitive operations we’re currently using. (Of course, in that case we would be mistaken, unlike the current case.)
And if it turns out that a thousand years from now humans want something different, then we will no longer want what is right… though if you ask us then, we’ll say we want what is right, again using the same cognitive operations. (Again, in that case we would be mistaken.)
And if there turn out to be two groups of humans who want incompatible things (for example, because their brains are sufficiently different), then whichever group I happen to be in wants what is right, and the other group doesn’t… though if you ask them, they’ll (mistakenly) say they want what is right, again using the same cognitive operations.
All of which strikes me as a pointlessly confusing way of saying that I endorse what humans-sufficiently-like-me currently want, and don’t endorse what we used to want or come to want or what anyone else wants if it’s too different from that.
Talking about whether some action is right or wrong or moral seems altogether unnecessary on this view. It is enough to say that I endorse what I value, and will program FAI to optimize for that, and will reject moral arguments that are inconsistent with that, and so on. Sure, if I valued something different, I would endorse that instead, but that doesn’t change anything; if I were hit by a speeding train, I’d be dead, but it doesn’t follow that I am dead. I endorse what I value, which means I consider worlds in which there is less of what I value worse than worlds in which there is more of what I value, even if those worlds also include versions of me that endorse something different. Fine and dandy.
What is added to that description by introducing words like right and wrong and moral, other than the confusion caused by people who assume those words refer to a magic light from the sky? It seems no more useful, on this view, than talking about how certain acts are salvatory or diabolical or fleabag.
The people a thousand years ago might have wanted what is right, but been mistaken as to what they really wanted. People do not understand their own brains. (You may agree with this; it is unclear from your wording.) If they really did have different desires, they would not be mistaken. Even if they used the same sound - ‘right’ - they would be attaching a different meaning to it, so it would be a different word. They would be incorrect if they did not recognize our values as right in Eliezer-speak.
This is admittedly a nonintuitive meaning. I do not know if there is a clearer way of saying things and I am unsure of what aspects of most people’s understanding of the word Eliezer believes this to capture. The alternative does not seem much clearer. Consider Eliezer’s example of pulling a child off of some train tracks. If you see me do so, you could explain it in terms of physics/neuroscience. If you ask me about it, I could mention the same explanation, but I also have another one. Why did seeing the child motivate me to save it? Yes, my neural pathways caused it, but I was not thinking about those neural pathways; that would be a level confusion. I was thinking about what is right. Saying that I acted because of neuroscience is true, but saying nothing else promotes level confusion. If you ask me what should happen if I were uninvolved or if my brain were different, I would not change my answer from if I were involved because should is a 1-place function. People do get confused about these things, especially when talking about AI, and that should be stopped. For many people, Eliezer did not resolve confusion, so we need to do better, but default language is no less clear than Eliezer-speak. (To the extent that I agree with Eliezer, I came to this agreement after having read the sequences, but directly after reading other arguments.)
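The "1-place function" claim can be sketched in Python. This is only an illustration under stated assumptions: the value tables `HUMAN_VALUES` and `ALIEN_VALUES` and the names `should` and `endorsed_by` are hypothetical, chosen here to contrast a function whose standard is fixed in its definition with one that takes the evaluating mind as an argument.

```python
# Hypothetical value tables; the numbers are made up for illustration.
HUMAN_VALUES = {"save_child": 1.0}
ALIEN_VALUES = {"save_child": -1.0}


def should(action: str) -> bool:
    """1-place: the standard (human values) is fixed inside the definition."""
    return HUMAN_VALUES.get(action, 0.0) > 0


def endorsed_by(values: dict, action: str) -> bool:
    """2-place: the answer varies with whichever set of values is passed in."""
    return values.get(action, 0.0) > 0


# The 1-place answer is the same no matter whose brain is asking.
answer_for_anyone = should("save_child")

# The 2-place answer changes when the mind changes.
human_answer = endorsed_by(HUMAN_VALUES, "save_child")
alien_answer = endorsed_by(ALIEN_VALUES, "save_child")
```

On this reading, the disagreement in the thread is partly about naming: whether to call the 1-place function "right" or to speak only of the 2-place function applied to our current values.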
I agree that people don’t fully understand their own brains. I agree that it is possible to have mistaken beliefs about what one really wants. I agree that on EY’s view any group that fails to identify our current values as right is mistaken.
I think EY’s usage of “right” in this context leads to unnecessary confusion.
The alternative that seems clearer to me, as I’ve argued elsewhere, is to designate our values as our values, assert that we endorse our values, engage in research to articulate our values more precisely, build systems to optimize for our values, and evaluate moral arguments in terms of how well they align with our values.
None of this requires further discussion of right and wrong, good and evil, salvatory and diabolical, etc., and such terms seem like “applause lights” better-suited to soliciting alliances than anything else.
If you ask me why I pulled the child off the train tracks, I probably reply that I didn’t want the child to die. If you ask me why I stood on the platform while the train ran over the child, I probably reply that I was paralyzed by shock/fear, or that I wasn’t sure what to do. In both cases, the actual reality is more complicated than my self-report: there are lots of factors that influence what I do, and I’m not aware of most of them.
I agree with you that people get confused about these things. I agree with you that there are multiple levels of description, and mixing them leads to confusion.
If you ask me whether the child should be pulled off the tracks, I probably say “yes”; if you ask me why, I probably get confused. The reason I get confused is because I don’t have a clear understanding of how I come to that conclusion; I simply consulted my preferences.
Faced with that confusion, people make up answers, including answers like “because it’s right to do so” or “because it’s wrong to let the child die” or “because children have moral value” or “because pulling the child off the tracks has shouldness” or a million other such sequences of words, none of which actually help resolve the confusion. They add nothing of value.
There are useful ways to address the question. There are things that can be said about how my preferences came to be that way, and what the consequences are of my preferences being that way, and whether my preferences are consistent. There are techniques for arriving at true statements in those categories.
As far as I can tell, talking about what’s right isn’t among them, any more than talking about what God wants is. It merely adds to the confusion.
I agree with everything non-linguistic. If we get rid of words like right, wrong, and should, then we are forced to either come up with new words or use ‘want’ and ‘desire’. The first option is confusing and the second can make us seem like egoists or like people who think that wireheading is right because wireheaded people desire it. To someone unfamiliar with this ethical theory, it would be very misleading. Even many of the readers of this website would be confused if we only used words like ‘want’. What we have now is still far from optimal.
...and ‘preference’ and ‘value’ and so forth. Yes.
If I am talking about current human values, I endorse calling them that, and avoiding introducing new words (like “right”) until there’s something else for those words to designate.
That neither implies that I’m an egoist, nor that I endorse wireheading.
I agree with you that somebody might nevertheless conclude one or both of those things. They’d be mistaken.
I don’t think familiarity with any particular ethical theory is necessary to interpret the lack of a word, though I agree with you that using a word in the absence of a shared theory about its meaning leads to confusion. (I think most usages of “right” fall into this category.)
If you are using ‘right’ to designate something over and above current human values, I endorse you using the word… but I have no idea at the moment what that something is.
I tentatively agree with your wording, though I will have to see if there are any contexts where it fails.
By definition, wouldn’t humans be unable to want to pursue such a thing?
Not necessarily.
For example, if humans value X, and “right” designates Y, and aliens edit our brains so we value Y, then we would want to pursue such a thing. Or if Y is a subset of X, we might find it possible to pursue Y instead of X. (I’m less sure about that, though.) Or various other contrived possibilities.
But supposing it were true, why would it matter?
Yes, my statement was way too strong. In fact, it should be much weaker than even what you say; just start a religion that tells people to value Y. I was attempting to express an actual idea that I had with this sentence originally, but my idea was wrong, so never mind.
What does this mean? Supposing that something were right, what would it matter to humans? You could get it to matter to humans by exploiting their irrationality, but if CEV works, it would not matter to that.
What would it even mean for this to be true? You’d need a definition of right.
How is this helpful? Here is how I would paraphrase the above (as I understand it):
Human brains cause human action through an ambivalent decision process.
What does this tell us about wireheading? I think wireheading might increase pleasure but at the same time feel that it would be wrong. So? All that means is that I have complex and frequently ambivalent preferences and that I use an inaccurate and ambivalent language to describe them. What important insight am I missing?
The important thing about wireheading in this context is that desires after being wireheaded do not matter. The pleasure is irrelevant for this purpose; we could just as easily imagine humans being wireheaded to feel pain, but to desire continuing to feel pain. The point is that what is right should be pursued because it is right, not because people desire it. People’s desires are useful as a way of determining what is right, but if it is known that people’s desires were altered in some way, they stop providing evidence as to what is right. This understanding is essential to a superintelligence considering the best way to alter people’s brains.
That’s expressed very clearly, thanks. I don’t want to sound rude, I honestly want to understand this. I’m reading your comment and can’t help but think that you are arguing about some kind of universal right. I still can’t pinpoint the argument. Why isn’t it completely arbitrary if we desire to feel pain or pleasure? Is the right answer implied by our evolutionary history? That’s a guess, I’m confused.
Aren’t our desires altered constantly by mutation, nurture, culture and what we experience and learn? Where can you find the purity of human desire?
I get that you are having trouble understanding this; it is hard and I am much worse at explaining things in text than in person.
What is right is universal in the sense that what is right would not change if our brains were different. The fact that we care about what is right is caused by our evolutionary history. If we evolved differently, we would have different values, wanting what is gleerp rather than what is right. The differences would be arbitrary to most minds, but not to us. One of the problems of friendliness is ensuring that it is not arbitrary to the AI either.
There are two types of this; we may learn more about our own values, which is good and which Eliezer believes to be the cause of “moral progress”, or our values may really change. The second type of changes to our desires really are bad. People actually do this, like those who refuse to expose themselves to violence because they think that it will desensitize them to violence. They are really just refusing to take Gandhi’s murder pill, but on a smaller scale. If you have a transtemporal disagreement with your future self on what action your future self should take, your future self will win, because you will no longer exist. The only way to prevent this is to simply refuse to allow your values to change, preventing your future self from disagreeing with you in the first place.
I don’t know what you mean by “purity of human desire”.