Yes. It is morally bankrupt. (Or would you not mind being turned into paperclips, if that’s what the Paperclip Maximizer wanted?)
Yes, but that is a matter of taste.
BTW, your current position is more-or-less what theists mean when they say atheists are amoral.
Why would I ever change my current position? If Yudkowsky told me there were moral laws written into the fabric of reality, what difference would that make? Either such laws are imperative, so that I am unable to escape them, or I simply ignore them if they oppose my preferences.
Assume all I wanted to do was to kill puppies. Now suppose Yudkowsky told me that this is prohibited and that I will suffer disutility because of it. The crucial question would be: does the disutility outweigh the utility I assign to killing puppies? If it doesn’t, why should I care?
Perhaps you assign net utility to killing puppies. If you do, you do. What EY tells you, what I tell you, what is prohibited, etc., has nothing to do with it. Nothing forces you to care about any of that.
If I understand EY’s position, it’s that it cuts both ways: whether killing puppies is right or wrong doesn’t force you to care, but whether or not you care doesn’t change whether it’s right or wrong.
If I understand your position, it’s that what’s right and wrong depends on the agent’s preferences: if you prefer killing puppies, then killing puppies is right; if you don’t, it isn’t.
My own response to EY’s claim is “How do you know that? What would you expect to observe if it weren’t true?” I’m not clear what his answer to that is.
My response to your claim is “If that’s true, so what? Why is right and wrong worth caring about, on that model… why not just say you feel like killing puppies?”
I don’t think those terms are useless, or that morality doesn’t exist. But you have to use those words with great care, because on their own they are meaningless. If I know what you want, I can approach the conditions that would be right for you. If I know how you define morality, I can act morally according to you. But I will do so only if I care about your preferences. If part of my preferences is to see other human beings happy, then I have to account for your preferences to some extent, which makes them a subset of my preferences. All those different values are then weighted accordingly. Do you disagree with that understanding?
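(To make the “weighted accordingly” part concrete, here is a minimal sketch of how I picture it—the function names and the particular weight are made up for illustration, not something anyone in this thread has proposed:)

    # Sketch: if part of my preferences is seeing you get what you prefer, then your
    # preference-satisfaction enters my utility function as one weighted term among others.
    def my_utility(world, my_terms, your_satisfaction, care_weight=0.3):
        # my_terms: list of (weight, evaluation) pairs for my own first-order values.
        # your_satisfaction: scores how well `world` satisfies *your* preferences.
        # care_weight: how much I happen to care about that; nothing forces it to be nonzero.
        score = sum(w * evaluate(world) for w, evaluate in my_terms)
        return score + care_weight * your_satisfaction(world)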
I agree with you that your preferences account for your actions, and that my preferences account for my actions, and that your preferences can include a preference for my preferences being satisfied.
But I think it’s a mistake to use the labels “morality” and “preferences” as though they are interchangeable.
If you have only one referent—which it sounds like you do—then I would recommend picking one label, using it consistently, and not using the other at all. If you have two referents, I would recommend getting clear about the difference and using one label per referent.
Otherwise, you introduce way too many unnecessary vectors for confusion.
It seems relatively clear to me that EY has two referents—he thinks there are two things being talked about. If I’m right, then you and he disagree on something, and by treating the language of morality as though it referred to preferences you obscure that disagreement.
More precisely: consider a system S comprising two agents A and B, each of which has a set of preferences Pa and Pb, and each of which has knowledge of their own and the other’s preferences. Suppose I commit an act X in S.
If I’ve understood correctly, you and EY agree that knowing all of that, you know enough in principle to determine whether X is right or wrong. That is, there isn’t anything left over, there’s no mysterious essence of rightness or external privileged judge or anything like that.
In this, both of you disagree with many other people, such as theists (who would say that you need to consult God’s will to make that determination) and really really strict consequentialists (who would say that you need to consult the whole future history of the results of X to make that determination).
If I’ve understood correctly, you and EY disagree on symmetry. That is, if A endorses X and B rejects X, you would say that whether X is right or not is undetermined… it’s right by reference to A, and wrong by reference to B, and there’s nothing more to be said. EY, if I understand what he’s written, would disagree—he would say that there is, or at least could be, additional computation to be performed on S that will tell you whether X is right or not.
For example, if A = pebblesorters and X = sorting four pebbles into a pile, A rejects X, and EY (I think) would say that A is wrong to do so… not “wrong with reference to humans,” but simply wrong. You would (I think) say that such a distinction is meaningless, “wrong” is always with reference to something. You consider “wrong” a two-place predicate, EY considers “wrong” a one-place predicate—at least sometimes. I think.
For example, if A = SHFP and B = humans and X = allowing people to experience any pain at all, A rejects X and B endorses X. You would say that X is “right_human” and “wrong_SHFP” and that whether X is right or not is an insufficiently specified question. EY would say that X is right and the SHFP are mistaken.
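(A toy way to picture the predicate distinction—this is just my illustration, not EY’s actual model, and the fixed “standard” below is a made-up stand-in:)

    # Two-place reading: wrongness is always evaluated relative to some agent's standard.
    def wrong_two_place(act, standard):
        return standard.get(act, 0) < 0

    # One-place reading: the standard is fixed inside the predicate rather than passed in,
    # so "is X wrong?" needs no reference to who is asking.
    FIXED_STANDARD = {"prevent_all_pain": -1, "allow_some_pain": +1}  # hypothetical values
    def wrong_one_place(act):
        return FIXED_STANDARD.get(act, 0) < 0

    # wrong_two_place("allow_some_pain", {"allow_some_pain": -1})  -> True   ("wrong_SHFP")
    # wrong_two_place("allow_some_pain", {"allow_some_pain": +1})  -> False  ("right_human")
    # wrong_one_place("allow_some_pain")                           -> False  (simply "right", on this made-up standard)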
So, I disagree with your understanding, or at least your labeling, insofar as it leads you to elide real disagreements. I endorse clarity about disagreement.
As for whether I agree with your position or EY’s, I certainly find yours easier to justify.
Thanks for this, very enlightening! A very good framing and analysis of my beliefs.
But maybe I misunderstand how he arrives at the belief that “wrong” can be a one-place predicate.
Yeah. While I’m reasonably confident that he holds the belief, I have no confidence in any theories about how he arrives at that belief.
What I have gotten from his writing on the subject is a combination of “Well, it sure seems that way to me,” and “Well, if that isn’t true, then I don’t see any way to build a superintelligence that does the right thing, and there has to be a way to build a superintelligence that does the right thing.” Neither of which I find compelling.
But there’s a lot of the metaethics sequence that doesn’t make much sense to me at all, so I have little confidence that what I’ve gotten out of it is a good representation of what’s there.
It’s also possible that I’m completely mistaken and he simply insists on “right” as a one-place predicate as a rhetorical trick; a way of drawing the reader’s attention away from the speaker’s role in that computation.
If that is the case, I don’t see how different agents could arrive at the same perception of right and wrong, if their preferences are fundamentally opposed, given additional computation.
I am fairly sure EY would say (and I agree) that there’s no reason to expect them to. Different agents with different preferences will have different beliefs about right and wrong, possibly incorrigibly different.
Humans and Babykillers as defined will simply never agree about how the universe would best be ordered, even if they come to agree (as a political exercise) on how to order the universe, without the exercise of force (as the SHFP purpose to do, for example).
(if right and wrong designate future world states).
Um.
Certainly, this model says that you can order world-states in terms of their rightness and wrongness, and there might therefore be a single possible world-state that’s most right within the set of possible world-states (though there might instead be several possible world-states that are equally right and better than all other possibilities).
If there’s only one such state, then I guess “right” could designate a future world state; if there are several, it could designate a set of world states.
But this depends on interpreting “right” to mean maximally right, in the same sense that “cold” could be understood to designate absolute zero. These aren’t the ways we actually use these words, though.
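(For what it’s worth, the “maximally right” reading is easy to state precisely—a small sketch, with `rightness` standing in for whatever scoring function this model assumes:)

    # "Right" as a limit case: the world-state(s) tied for the highest rightness score,
    # the way "cold" could be read as designating absolute zero.
    def maximally_right(world_states, rightness):
        world_states = list(world_states)
        best = max(rightness(w) for w in world_states)
        return [w for w in world_states if rightness(w) == best]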
If you just argue that we don’t have free will because what is right is logically implied by cause and effect,
I don’t see what the concept of free will contributes to this discussion.
I’m fairly certain that EY would reject the idea that what’s right is logically implied by cause and effect, if by that you mean that an intelligence that started out without the right values could somehow infer, by analyzing causality in the world, what the right values were.
My own jury is to some degree still out on this one. I’m enough of a consequentialist to believe that an adequate understanding of cause and effect lets you express all judgments about right and wrong action in terms of more and less preferable world-states, but I cannot imagine how you could derive “preferable” from such an understanding. That said, my failure of imagination does not constitute a fact about the world.
Humans and Babykillers are not talking about the same subject matter when they debate what-to-do-next, and their doing different things does not constitute disagreement.
There’s a baby in front of me, and I say “Humans and Babykillers disagree about what to do next with this baby.”
The one replies: “No, they don’t. They aren’t talking about the same subject when they debate what to do next; this is not a disagreement.”
“Let me rephrase,” I say. “Babykillers prefer that this baby be killed. Humans prefer that this baby have fun. Fun and babykilling can’t both be implemented on the same baby: if it’s killed, it’s not having fun; if it’s having fun, it hasn’t been killed.”
Have I left out anything of value in my restatement? If so, what have I left out?
More generally: given all the above, why should I care whether or not what humans and Babykillers have with respect to this baby is a disagreement? What difference does that make?
If you disagree with someone, and you’re both sufficiently rational, then you can expect to have a good shot at resolving your disagreement by arguing. That doesn’t work if you just have fundamentally different motivational frameworks.
I don’t know if I agree that a disagreement is necessarily resolvable by argument, but I certainly agree that many disagreements are so resolvable, whereas a complete difference of motivational framework is not.
If that’s what EY meant to convey by bringing up the question of whether Humans and Babykillers disagree, I agree completely.
As I said initially: “Humans and Babykillers as defined will simply never agree about how the universe would best be ordered.”
We previously debated the disagreements between those with different values here.
The dictionary apparently supports the idea that any conflict is a disagreement.
To understand the other side of the argument, I think it helps to look at this:
all disagreements are about facts. What else would you be talking about?
One side has redefined “disagreement” to mean “a difference of opinion over facts”!
I think that explains much of the sound and fury surrounding the issue.
A “difference of opinion over goals” is not a “difference of opinion over facts”.
However, note that different goals led to the cigarette companies denying the link between cigarettes and cancer—and also led to oil company AGW denialism—which caused many real-world disagreements.
All of which leaves me with the same question I started with. If I know what questions you and I give different answers to—be they questions about facts, values, goals, or whatever else—what is added to my understanding of the situation by asserting that we disagree, or don’t disagree?
ata’s reply was that “we disagree” additionally indicates that we can potentially converge on a common answer by arguing. That also seems to be what EY was getting at about hot air and rocks.
That makes sense to me, and sure, it’s additionally worth clarifying whether you and I can potentially converge on a common answer by arguing.
Anything else?
Because all of this dueling-definitions stuff strikes me as a pointless distraction. I use words to communicate concepts; if a word no longer clearly communicates concepts it’s no longer worth anything to me.
ata’s reply was that “we disagree” additionally indicates that we can potentially converge on a common answer by arguing
That doesn’t seem to be what the dictionary says “disagreement” means.
Maybe if both sides realise that the argument is pointless, they will not waste their time—but what if they don’t know what will happen? Or what if their disagreement is intended to sway not their debating partner, but a watching audience?
I agree with you about what the dictionary says, and that people might not know whether they can converge on a common answer, and that people might go through the motions of a disagreement for the benefit of observers.
We talk about what is good, and Babykillers talk about what is eat-babies, but both good and eat-babies perform analogous functions. For building a Friendly AI we may not give a damn about how to categorize such analogous functions, but I’ve got a feeling that by hijacking the word “moral” so that it suddenly doesn’t apply to such similar things—contrary to how I think it is usually used—you’ve successfully increased my confusion over the last year. Either that, or I’m back at square one. Probably the latter.
The fact that killing puppies is wrong follows from the definition of wrong. The fact that Eliezer does not want to do what is wrong is a fact about his brain, determined by introspection.