Moral Error and Moral Disagreement
Followup to: Inseparably Right, Sorting Pebbles Into Correct Heaps
Richard Chappell, a pro, writes:
“When Bob says ‘Abortion is wrong’, and Sally says, ‘No it isn’t’, they are disagreeing with each other.
I don’t see how Eliezer can accommodate this. On his account, what Bob asserted is true iff abortion is prohibited by the morality_Bob norms. How can Sally disagree? There’s no disputing (we may suppose) that abortion is indeed prohibited by morality_Bob...
Since there is moral disagreement, whatever Eliezer purports to be analysing here, it is not morality.”
The phenomena of moral disagreement, moral error, and moral progress, on terminal values, are the primary drivers behind my metaethics. Think of how simple Friendly AI would be if there were no moral disagreements, moral errors, or moral progress!
Richard claims, “There’s no disputing (we may suppose) that abortion is indeed prohibited by morality_Bob.”
We may not suppose, and there is disputing. Bob does not have direct, unmediated, veridical access to the output of his own morality.
I tried to describe morality as a “computation”. In retrospect, I don’t think this is functioning as the Word of Power that I thought I was emitting.
Let us read, for “computation”, “idealized abstract dynamic”—maybe that will be a more comfortable label to apply to morality.
Even so, I would have thought it obvious that computations may be the subjects of mystery and error. Maybe it’s not as obvious outside computer science?
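The point that a fully specified computation can still be a subject of mystery and error can be made concrete with a toy example (the function and inputs here are my illustration, not anything from the post):

```python
# A fully specified, deterministic computation whose outputs are still
# easy to be wrong about until you actually unfold it.

def collatz_steps(n: int) -> int:
    """Number of steps for n to reach 1 under the Collatz rule."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

# The rule is tiny, and nothing about it is hidden; yet few people can
# say in advance that 27 takes 111 steps. You can know the computation
# exactly and still hold mistaken beliefs about its output.
print(collatz_steps(27))  # → 111
```

Knowing a computation's specification and knowing its output are two different epistemic states, and only the second is what Bob would need for infallibility about morality_Bob.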
Disagreement has two prerequisites: the possibility of agreement and the possibility of error. For two people to agree on something, there must be something they are agreeing about, a referent held in common. And it must be possible for an “error” to take place, a conflict between “P” in the map and not-P in the territory. Where these two prerequisites are present, Sally can say to Bob: “That thing we were just both talking about—you are in error about it.”
Richard’s objection would seem in the first place to rule out the possibility of moral error, from which he derives the impossibility of moral agreement.
So: does my metaethics rule out moral error? Is there no disputing that abortion is indeed prohibited by morality_Bob?
This is such a strange idea that I find myself wondering what the heck Richard could be thinking. My best guess is that Richard, perhaps having not read all the posts in this sequence, is taking my notion of morality_Bob to refer to a flat, static list of valuations explicitly asserted by Bob. “Abortion is wrong” would be on Bob’s list, and there would be no disputing that.
But on the contrary, I conceive of morality_Bob as something that unfolds into Bob’s morality—like the way one can describe in 6 states and 2 symbols a Turing machine that will write 4.640 × 10^1439 ones to its tape before halting.
So morality_Bob refers to a compact folded specification, and not a flat list of outputs. But still, how could Bob be wrong about the output of his own morality?
In manifold obvious and non-obvious ways:
Bob could be empirically mistaken about the state of fetuses, perhaps believing fetuses to be aware of the outside world. (Correcting this might change Bob’s instrumental values but not terminal values.)
Bob could have formed his beliefs about what constituted “personhood” in the presence of confusion about the nature of consciousness, so that if Bob were fully informed about consciousness, Bob would not have been tempted to talk about “the beginning of life” or “the human kind” in order to define personhood. (This changes Bob’s expressed terminal values; afterward he will state different general rules about what sort of physical things are ends in themselves.)
So those are the obvious moral errors—instrumental errors driven by empirical mistakes; and erroneous generalizations about terminal values, driven by failure to consider moral arguments that are valid but hard to find in the search space.
Then there are less obvious sources of moral error: Bob could have a list of mind-influencing considerations that he considers morally valid, and a list of other mind-influencing considerations that Bob considers morally invalid. Maybe Bob was raised a Christian and now considers that cultural influence to be invalid. But, unknown to Bob, when he weighs up his values for and against abortion, the influence of his Christian upbringing comes in and distorts his summing of value-weights. So Bob believes that the output of his current validated moral beliefs is to prohibit abortion, but actually this is a leftover of his childhood and not the output of those beliefs at all.
(Note that Robin Hanson and I seem to disagree, in a case like this, as to exactly what degree we should take Bob’s word about what his morals are.)
Or Bob could believe that the word of God determines moral truth and that God has prohibited abortion in the Bible. Then Bob is making metaethical mistakes, causing his mind to malfunction in a highly general way and to add moral generalizations to his belief pool which he would not add if veridical knowledge of the universe destroyed his current, incoherent metaethics.
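The distorted summing of value-weights described a few paragraphs up can be sketched as a toy model (the weights and labels are my invention for illustration, not a claim about how moral cognition actually works):

```python
# Toy model (mine, not the post's): Bob's *endorsed* weighing excludes
# his Christian upbringing as an invalid influence, but the weighing he
# actually runs still includes it. Introspection returns the output of
# the contaminated process, which Bob mistakes for the output of his
# validated beliefs.

ENDORSED_WEIGHTS = {"bodily_autonomy": +2, "potential_personhood": -1}
ACTUAL_WEIGHTS = {**ENDORSED_WEIGHTS, "childhood_influence": -3}  # unnoticed leftover

def verdict(weights: dict) -> str:
    """Sum the value-weights and return the resulting judgment."""
    return "permitted" if sum(weights.values()) > 0 else "prohibited"

endorsed_output = verdict(ENDORSED_WEIGHTS)  # what his validated beliefs output
introspected    = verdict(ACTUAL_WEIGHTS)    # what Bob actually reports
print(endorsed_output, introspected)  # → permitted prohibited
```

The two answers differ, so Bob is in error about the output of his own morality by his own lights, which is exactly the possibility the objection assumes away.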
Now let us turn to the disagreement between Sally and Bob.
You could suggest that Sally is saying to Bob, “Abortion is allowed by morality_Bob”, but that seems a bit oversimplified; it is not psychologically or morally realistic.
If Sally and Bob were unrealistically sophisticated, they might describe their dispute as follows:
Bob: “Abortion is wrong.”
Sally: “Do you think that this is something of which most humans ought to be persuadable?”
Bob: “Yes, I do. Do you think abortion is right?”
Sally: “Yes, I do. And I don’t think that’s because I’m a psychopath by common human standards. I think most humans would come to agree with me, if they knew the facts I knew, and heard the same moral arguments I’ve heard.”
Bob: “I think, then, that we must have a moral disagreement: since we both believe ourselves to share a moral frame of reference on this issue, and yet our moral intuitions say different things to us.”
Sally: “Well, it is not logically necessary that we have a genuine disagreement. We might be mistaken in believing ourselves to mean the same thing by the words right and wrong, since neither of us can introspectively report our own moral reference frames or unfold them fully.”
Bob: “But if the meaning is similar up to the third decimal place, or sufficiently similar in some respects that it ought to be delivering similar answers on this particular issue, then, even if our moralities are not in-principle identical, I would not hesitate to invoke the intuitions for transpersonal morality.”
Sally: “I agree. Until proven otherwise, I am inclined to talk about this question as if it is the same question unto us.”
Bob: “So I say ‘Abortion is wrong’ without further qualification or specialization on what wrong means unto me.”
Sally: “And I think that abortion is right. We have a disagreement, then, and at least one of us must be mistaken.”
Bob: “Unless we’re actually choosing differently because of in-principle unresolvable differences in our moral frame of reference, as if one of us were a paperclip maximizer. In that case, we would be mutually mistaken in our belief that when we talk about doing what is right, we mean the same thing by right. We would agree that we have a disagreement, but we would both be wrong.”
Now, this is not exactly what most people are explicitly thinking when they engage in a moral dispute—but it is how I would cash out and naturalize their intuitions about transpersonal morality.
Richard also says, “Since there is moral disagreement...” This seems like a prime case of what I call naive philosophical realism—the belief that philosophical intuitions are direct unmediated veridical passports to philosophical truth.
It so happens that I agree that there is such a thing as moral disagreement. Tomorrow I will endeavor to justify, in fuller detail, how this statement can possibly make sense in a reductionistic natural universe. So I am not disputing this particular proposition. But I note, in passing, that Richard cannot justifiably assert the existence of moral disagreement as an irrefutable premise for discussion, though he could consider it as an apparent datum. You cannot take as irrefutable premises things that you have not explained exactly; for then what is it that is certain to be true?
I cannot help but note the resemblance to Richard’s assumption that “there’s no disputing” that abortion is indeed prohibited by morality_Bob—the assumption that Bob has direct veridical unmediated access to the final unfolded output of his own morality.
Perhaps Richard means that we could suppose that abortion is indeed prohibited by morality_Bob, and allowed by morality_Sally, there being at least two possible minds for whom this would be true. Then the two minds might be mistaken about believing themselves to disagree. Actually they would simply be directed by different algorithms.
You cannot have a disagreement about which algorithm should direct your actions, without first having the same meaning of should—and no matter how you try to phrase this in terms of “what ought to direct your actions” or “right actions” or “correct heaps of pebbles”, in the end you will be left with the empirical fact that it is possible to construct minds directed by any coherent utility function.
When a paperclip maximizer and a pencil maximizer do different things, they are not disagreeing about anything, they are just different optimization processes. You cannot detach should-ness from any specific criterion of should-ness and be left with a pure empty should-ness that the paperclip maximizer and pencil maximizer can be said to disagree about—unless you stretch “disagreement” to cover differences where two agents have nothing to say to each other.
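A minimal sketch of this point, with invented names: two optimizers evaluate the same world under different utility functions, and each choice is correct relative to its own criterion, so there is no shared question left over for them to disagree about.

```python
# Toy sketch (names mine): two optimization processes ranking the same
# actions by different criteria. Neither holds a belief the other
# denies; the difference lies in the criteria, not in any factual claim.

ACTIONS = {
    "build_paperclips": {"paperclips": 10, "pencils": 0},
    "build_pencils":    {"paperclips": 0,  "pencils": 10},
}

def best_action(utility_key: str) -> str:
    """Pick the action maximizing the given utility criterion."""
    return max(ACTIONS, key=lambda a: ACTIONS[a][utility_key])

clippy_choice = best_action("paperclips")  # → "build_paperclips"
pency_choice  = best_action("pencils")     # → "build_pencils"

# Both computations are correct by their own criterion; there is no
# common referent about which one of them is in error.
print(clippy_choice, pency_choice)
```

Calling this a "disagreement" would require exactly the stretch described above: counting as disagreement a difference about which the two agents have nothing to say to each other.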
But this would be an extreme position to take with respect to your fellow humans, and I recommend against doing so. Even a psychopath would still be in a common moral reference frame with you, if, fully informed, they would decide to take a pill that would make them non-psychopaths. If you told me that my ability to care about other people was neurologically damaged, and you offered me a pill to fix it, I would take it. Now, perhaps some psychopaths would not be persuadable, even in principle, to take the pill that would, by our standards, “fix” them. But I note the possibility to emphasize what an extreme statement it is to say of someone:
“We have nothing to argue about, we are only different optimization processes.”
That should be reserved for paperclip maximizers, not used against humans whose arguments you don’t like.
Part of The Metaethics Sequence
Next post: “Abstracted Idealized Dynamics”
Previous post: “Sorting Pebbles Into Correct Heaps”