That seems like a reasonable definition; my point is that not everyone uses the same equation.
That’s true. The question is, how often is this because people have totally different values, and how often is it that they have extremely similar “ideal equations” but different “approximations” of what they think that equation is? I think for sociopaths, and other people with harmful ego-syntonic mental disorders, it’s probably the former, but it’s more often the latter for normal people.
I’d say sometimes A, and sometimes B. But I think that’s true even in the absence of mental disorders; I don’t think that the “ideal equation” necessarily sits somewhere hidden in the human psyche.
It sounds to me like Fred and Marvin both care about achieving similar moral objectives, but have different ideas about how to go about it. I’d say that, again, which moral code is better can only be determined by trying to figure out which one actually does a better job of achieving moral goals. “Moral progress” can be regarded as finding better and better heuristics to achieve those moral goals, and finding a closer representation of the ideal equation.
That is valid, as long as both systems have the same goals. Marvin’s system includes the explicit goal “stay alive”, more heavily weighted than the goal “keep a stranger alive”; Fred’s system explicitly and entirely excludes the goal “stay alive”.
If two moral systems agree both on the goals to be achieved, and the weightings to give those goals, then they will be the same moral system, yes. But two people’s moral systems need not agree on the underlying goals.
Again, I think I agree with Eliezer that a truly alien code of behavior, like that exhibited by sociopaths, and by really inhuman aliens like the Pebblesorters or paperclippers, should maybe be referred to by some word other than morality. This is because the word “morality” usually refers to doing things like making the world a happier place and increasing the positive things in life.
Well, to be fair, in a Paperclipper’s mind, paperclips are the positive things in life, and they certainly make the paperclipper happier. I realise that’s probably not what you intended, but the phrasing may need work.
Which really feeds into the question of what goals a moral system should have. To the Babyeaters, a moral system should have the goal of eating babies, and they can provide a lot of argument to support that point—in terms of improved evolutionary fitness, for example.
I think that we can agree that a moral system’s goals should be the good things in life. I’m less certain that we can agree on what those good things necessarily are, or on how they should be combined relative to each other. (I expect that if we really go to the point of thoroughly dissecting what we consider to be the good things in life, then we’ll agree more than we disagree; I expect we’ll be over 95% in agreement, but not quite 100%. This is what I generally expect for any stranger).
For example, we might disagree on whether it is more important to be independent in our actions, or to follow the legitimate instructions of a suitably legitimate authority.
Sorry, the only Cherryh I’ve read is “The Scapegoat.” I thought it gave a good impression of how alien values would look to humans, but wish it had given some more ideas about what it was that made elves think so differently.
I’d say sometimes A, and sometimes B. But I think that’s true even in the absence of mental disorders; I don’t think that the “ideal equation” necessarily sits somewhere hidden in the human psyche.
It’s not that I think there’s literally a math equation locked in the human psyche that encodes morality. It’s more like there are multiple (sometimes conflicting) moral values and methods for resolving conflicts between them and that the sum of these can be modeled as a large and complicated equation.
That is valid, as long as both systems have the same goals. Marvin’s system includes the explicit goal “stay alive”, more heavily weighted than the goal “keep a stranger alive”; Fred’s system explicitly and entirely excludes the goal “stay alive”.
You gave me the impression that Marvin valued “staying alive” less as an end in itself, and more as a means to achieve the end of improving the world, in particular when you said this:
Marvin’s moral system considers the total benefit to the world of every action; but he tends to weight actions in favour of himself, because he knows that in the future, he will always choose to do the right thing (by his morality) and thus deserves ties broken in his favour.
This is actually something that bothers me in fiction when a character who is superhumanly good and powerful (e.g. Superman, the Doctor) risks their life to save a relatively small number of people. It seems short-sighted of them to do that: since they regularly save much larger groups of people and anticipate continuing to do so in the future, it seems like they should preserve their lives for those people’s sakes.
Well, to be fair, in a Paperclipper’s mind, paperclips are the positive things in life, and they certainly make the paperclipper happier.
I get the impression that the paperclipper doesn’t feel happiness, just a raw motivation to increase the amount of paperclips.
I think that we can agree that a moral system’s goals should be the good things in life. I’m less certain that we can agree on what those good things necessarily are, or on how they should be combined relative to each other.
If you define “the good things in life” as “whatever an entity wants the most,” then you can agree that whatever someone wants is “good,” be it paperclips or eudaemonia. On the other hand, I’m not sure we should do this; there are some hypothetical entities I can imagine where I can’t see it as ever being good that they get what they want. For instance, I can imagine a Human-Torture-Maximizer that wants to do nothing but torture human beings. It seems to me that even if there were a trillion Human-Torture-Maximizers and one human in the universe, it would be bad for them to get what they want.
For more neutral, but still alien preferences, I’m less sure. It seems to me that I have a right to stop Human-Torture-Maximizers from getting what they want. But would I have the right to stop paperclippers? Making the same paperclip over and over again seems like a pointless activity to me, but if the paperclippers are willing to share part of the universe with existing humans do I have a right to stop them? I don’t know, and I don’t think Eliezer does either.
(I expect that if we really go to the point of thoroughly dissecting what we consider to be the good things in life, then we’ll agree more than we disagree; I expect we’ll be over 95% in agreement, but not quite 100%. This is what I generally expect for any stranger).
I think that we, and most humans, have the same basic desires; where we differ is in the object of those desires, and in the priority of those desires.
For instance, most people desire romantic love. But those desires usually have different objects: I desire romantic love with my girlfriend; other people desire it with their significant others. Similarly, most people desire to consume stories, but the object of that desire differs: some people like Transformers, others The Notebook.
Similarly, people often desire the same things, but differ as to their priorities: how much of those things they want. Most people desire both socializing and quiet solitude, but some extroverts want lots of one and less of the other, while introverts are the opposite.
In the case of the paperclippers, my first instinct is to regard opposing paperclipping as no different from the many ways humans have persecuted each other for wanting different things in the past. But then it occurred to me that paperclip-maximizing might be different, because most persecutions in the past involved persecuting people who have different objects and priorities, not people who actually have different desires. For instance, homosexuality is the same kind of desire as heterosexuality, just with a different object (same sex instead of opposite).
For example, we might disagree on whether it is more important to be independent in our actions, or to follow the legitimate instructions of a suitably legitimate authority.
This seems like a difference in priority, rather than desire, as most people would prefer differing proportions of both. It’s still a legitimate disagreement, but I think it’s more about finding a compromise between conflicting priorities, rather than totally different values.
Compounding this problem is the fact that people value diversity to some extent. We don’t value all types of diversity, obviously; I think we’d all like to live in a world where people held unanimous views on the unacceptability of torturing innocent people. But we would like other people to be different from us in some ways. Most people, I think, would rather live in a world full of different people with different personalities than a world consisting entirely of exact duplicates (in both personality and memory) of one person. So it might be impossible to reach full agreement on those other values without screwing up the achievement of the Value of Diversity.
It’s not that I think there’s literally a math equation locked in the human psyche that encodes morality. It’s more like there are multiple (sometimes conflicting) moral values and methods for resolving conflicts between them and that the sum of these can be modeled as a large and complicated equation.
I’m sorry, there’s an ambiguity there—when you say “the sum of these”, are you summing across the moral values and imperatives of a single person, or of humanity as a whole?
You gave me the impression that Marvin valued “staying alive” less as an end in itself, and more as a means to achieve the end of improving the world, in particular when you said this:
You are quite correct. I apologise; I changed that example several times from where I started, and it seems that one of my last-minute changes actually made it a worse example (my aim was to try to show how the explicit aim of self-preservation could be a reasonable moral aim, but in the process I made it not a moral aim at all). I should watch out for that in the future.
This is actually something that bothers me in fiction when a character who is superhumanly good and powerful (e.g. Superman, the Doctor) risks their life to save a relatively small number of people. It seems short-sighted of them to do that: since they regularly save much larger groups of people and anticipate continuing to do so in the future, it seems like they should preserve their lives for those people’s sakes.
I’ve always felt that was because one of the effects of great power is that it’s so very easy to let everyone die. With great power, as Spiderman is told, comes great responsibility; one way to ensure that you’re not letting your own power go to your head is by refusing to not-rescue anyone. After all, if the average science hero lets everyone he thinks is an idiot die, then who would be left?
Sometimes there’s a different reason, though; Sherlock Holmes would ignore a straightforward and safe case to catch a serial killer in order to concentrate on a tricky and deadly case involving a stolen diamond. He wasn’t in the detective business to help people; he was in it to be challenged, and he would regularly refuse to take cases that did not challenge him.
(That’s probably a fair example as well, actually; for Holmes, only the challenge, the mental stimulation of a worthy foe, is important; for Superman, what is important is the saving of lives, whether from a mindless tsunami or Lex Luthor’s latest plot).
I think that we, and most humans, have the same basic desires; where we differ is in the object of those desires, and in the priority of those desires.
Hmmm. If you’re willing to accept zero, or near-zero, as a priority, then that statement can apply to any two sets of desires. Consider Sherlock Holmes and a paperclipper; Holmes’ desire for mental stimulation is high-priority, his desire for paperclips is zero-priority, while the paperclipper’s desire for paperclips is high-priority, and its desire for mental stimulation is zero-priority. (Some desires may have negative priority, which can then be interpreted as a priority to avoid that outcome—for example, my desire to immerse my hand in acid is negative, but a masochist may have a positive priority for that desire.)
This implies that, in order to meaningfully differentiate the above statement from “some people have different desires”, I may have to designate some very low priority, below which the desire is considered absent (I may, of course, place that line at exactly zero priority). Some desires, however, may have no priority on their own, but inherit priority from another desire that they feed into; for example, a paperclipper has zero desire for self-preservation on its own, but it will desire self-preservation so that it can better create more paperclips.
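(To make that concrete, here’s a minimal sketch of how I picture the priority framing; the agents, the numbers, and the “considered absent” cutoff are all invented for illustration, not a claim about how real minds actually weight things.)

```python
# Toy sketch of "same pool of desires, different priorities".
# All names, numbers, and the cutoff below are invented for illustration.

ABSENT_THRESHOLD = 0.01  # below this, a desire is "considered absent"

holmes = {
    "mental_stimulation": 1.0,   # high priority
    "paperclips": 0.0,           # zero priority
    "hand_in_acid": -0.8,        # negative priority: an outcome to avoid
}

paperclipper = {
    "paperclips": 1.0,
    "mental_stimulation": 0.0,
    # No intrinsic priority, but it inherits priority from the desire it
    # feeds into (a destroyed paperclipper makes fewer paperclips).
    "self_preservation": 0.9 * 1.0,
}

def effective_desires(agent):
    """Keep only the desires above the 'considered absent' line."""
    return {d: p for d, p in agent.items() if p > ABSENT_THRESHOLD}

print(effective_desires(holmes))        # {'mental_stimulation': 1.0}
print(effective_desires(paperclipper))  # {'paperclips': 1.0, 'self_preservation': 0.9}
```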
Now, given a pool of potential goals, most people will pick out several desires from that pool, and there will be a large overlap between any two people (for example, most humans desire to eat—most but not all; certain eating disorders can mess with that), and it is possible to pick out a set of desires that most people will have high priorities for.
It’s even probably possible to pick out a (smaller) set of desires such that those who do not have those desires at some positive priority are considered psychologically unhealthy. But such people nonetheless do exist.
Does this mean it isn’t bad to oppose paperclipping? I don’t know, maybe, but maybe not.
In my personal view, it is neutral to paperclip or to oppose paperclipping. It becomes bad to paperclip only when the paperclipping takes resources away from something more important.
And there are circumstances (somewhat forced circumstances) where it could be good to paperclip.
For example, we might disagree on whether it is more important to be independent in our actions, or to follow the legitimate instructions of a suitably legitimate authority.
This seems like a difference in priority, rather than desire, as most people would prefer differing proportions of both. It’s still a legitimate disagreement, but I think it’s more about finding a compromise between conflicting priorities, rather than totally different values.
There exist people who would place negative value on the idea of following the instructions of any legitimate authority. (They tend to remain a small and marginal group, because they cannot in turn form an authority for followers to follow without rampant hypocrisy.)
Compounding this problem is the fact that people value diversity to some extent. We don’t value all types of diversity, obviously; I think we’d all like to live in a world where people held unanimous views on the unacceptability of torturing innocent people. But we would like other people to be different from us in some ways. Most people, I think, would rather live in a world full of different people with different personalities than a world consisting entirely of exact duplicates (in both personality and memory) of one person. So it might be impossible to reach full agreement on those other values without screwing up the achievement of the Value of Diversity.
Yes, diversity has many benefits. The second-biggest benefit of diversity is that some people will be more correct than others, and this can be seen in the results they get; then everyone can re-diversify around the most correct group (a slow process, taking generations, as the most successful group slowly outcompetes the rest and thus passes their memes to a greater and/or more powerful proportion of the next generation). By a similar token, it means that when something happens that destroys one type of person, it doesn’t destroy everyone (bananas have a definite problem there, being a bit of a monoculture).
The biggest benefit is that it leads to social interaction. A completely non-diverse society would have to be a hive mind (or different experiences would slowly begin to introduce diversity), and it would be a very lonely hive mind, with no-one to talk to.
I’m sorry, there’s an ambiguity there—when you say “the sum of these”, are you summing across the moral values and imperatives of a single person, or of humanity as a whole?
Nearly all of humanity as a whole. There are obviously some humans who don’t really value morality (we call them sociopaths), but I think most humans care about very similar moral concepts. The fact that people have somewhat different personal preferences and desires might at first seem to challenge this idea, but I don’t really think it does. It just means that there are some desires that generate the same “value” of “good” when fed into the “equation.” In fact, if diversity is a good, as we discussed previously, then people having different personal preferences might in fact be morally desirable.
Hmmm. If you’re willing to accept zero, or near-zero, as a priority, then that statement can apply to any two sets of desires......This implies that, in order to meaningfully differentiate the above statement from “some people have different desires”, I may have to designate some very low priority, below which the desire is considered absent
That’s a good point. I was considering using the word “proportionality” instead of “priority” to better delineate that I don’t accept zero as a priority, but rejected it because it sounded clunky. Maybe I shouldn’t have.
In my personal view, it is neutral to paperclip or to oppose paperclipping. It becomes bad to paperclip only when the paperclipping takes resources away from something more important.
I agree with that. What I’m wondering is, would I have a moral duty to share resources with a paperclipper if it existed, or would pretty much any of the things I’d spend the resources on if I kept them for myself (i.e. eudaemonic things) count as “something more important”?
There exist people who would place negative value on the idea of following the instructions of any legitimate authority.
I think there might actually be lots of people like this, but most appear normal because they place even greater negative value on doing something stupid because they ignored good advice just because it came from an authority. In other words, following authority is a negative terminal value, but an extremely positive instrumental value.
The biggest benefit is that it leads to social interaction. A completely non-diverse society would have to be a hive mind (or different experiences would slowly begin to introduce diversity), and it would be a very lonely hive mind, with no-one to talk to.
Exactly. I would still want the world to be full of a diverse variety of people, even if I had a nonsentient AI that was right about everything and could serve my every bodily need.
I’m sorry, there’s an ambiguity there—when you say “the sum of these”, are you summing across the moral values and imperatives of a single person, or of humanity as a whole?
Nearly all of humanity as a whole. There are obviously some humans who don’t really value morality (we call them sociopaths), but I think most humans care about very similar moral concepts.
Okay then, next question; how do you decide which people to exclude? You say that you are excluding sociopaths, and I think that they should be excluded; but on exactly what basis? If you’re excluding them simply because they fail to have the same moral imperatives as the ones that you think are important, then that sounds very much like a No True Scotsman argument to me. (I exclude them mainly on an argument of appeal to authority, myself, but that also has logic problems; in either case, it’s a matter of first sketching out what the moral imperative should be, then throwing out the people who don’t match).
And for a follow-up question; is it necessary to limit it to humanity? Let us assume that, ten years from now, a flying saucer lands in the middle of Durban, and we meet a sentient alien form of life. Would it be necessary to include their moral preferences in the equation as well?
Even if they are Pebblesorters?
In fact, if diversity is a good, as we discussed previously, then people having different personal preferences might in fact be morally desirable.
It may be, but only within a limited range. A serial killer is well outside that range, even if he believes that he is doing good by only killing “evil” people (for some definition of “evil”).
What I’m wondering is, would I have a moral duty to share resources with a paperclipper if it existed, or would pretty much any of the things I’d spend the resources on if I kept them for myself (i.e. eudaemonic things) count as “something more important”?
Hmmm. I think I’d put “buying a packet of paperclips for the paperclipper” on the same moral footing, more or less, as “buying an ice cream for a small child”. It’s nice for the person (or paperclipper) receiving the gift, and that makes it a minor moral positive by increasing happiness by a tiny fraction. But if you could otherwise spend that money on something that would save a life, then that clearly takes priority.
I think there might actually be lots of people like this, but most appear normal because they place even greater negative value on doing something stupid because they ignored good advice just because it came from an authority. In other words, following authority is a negative terminal value, but an extremely positive instrumental value.
Hmmm. Good point; that is quite possible. (Given how many people seem to follow any reasonably persuasive authority, though, I suspect that most people have a positive priority for this goal—this is probably because, for a lot of human history, peasants who disagreed with the aristocracy tended to have fewer descendants unless they all disagreed and wiped out said aristocracy).
Exactly. I would still want the world to be full of a diverse variety of people, even if I had a nonsentient AI that was right about everything and could serve my every bodily need.
Here’s a tricky question—what exactly are the limits of “nonsentient”? Can a nonsentient AI fake it, with clever use of holograms and/or humanoid robots, by causing you to think that you are surrounded by a diverse variety of people even when you are not (thus supplying the non-bodily need of social interaction)? The robots would all be philosophical zombies, of course; but is there any way to tell?
Okay then, next question; how do you decide which people to exclude?
I don’t think I’m coming across right. I’m not saying that morality is some sort of collective agreement of people in regards to their various preferences. I’m saying that morality is a series of concepts such as fairness, happiness, freedom etc., that these concepts are objective in the sense that it can be objectively determined how much fairness, freedom, happiness etc. there is in the world, and that the sum of these concepts can be expressed as a large equation.
People vary in their preference for morality; most people care about fairness, freedom, happiness, etc., to some extent. But there are some people who don’t care about morality at all, such as sociopaths.
Morality isn’t a preference. It isn’t the part of a person’s brain that says “This society is fair and free and happy, therefore I prefer it.” Morality is those disembodied concepts of freedom, fairness, happiness, etc. So if a person doesn’t care about those things, it doesn’t mean that freedom, fairness, happiness, etc. aren’t part of their morality. It means that person doesn’t care about morality; they care about something else.
To use the Pebblesorter analogy again, the fact that you and I don’t care about sorting pebbles into prime-numbered heaps isn’t because we have our own concept of “primeness” that doesn’t include 2, 3, 5 and 7. It just means we don’t care about primeness.
To make another analogy, if most people preferred wearing wool clothes but one person preferred cotton, that wouldn’t mean that that person had their own version of wool, which was cotton. It means that that person doesn’t prefer wool.
Look inward, and consider why you think most people should be included. Presumably it’s because you really care a lot about being fair. But that necessarily means that you cared about fairness before you even considered what other people might think. Otherwise it wouldn’t have even occurred to you to think about what they preferred in the first place.
The fact that most humans care, to some extent, about the various facets of morality is a very lucky thing; a planet full of sociopaths would be most unpleasant. But it isn’t relevant to the truth of morality. You’d still think torturing people was bad if all the non-sociopaths on Earth except you were killed, wouldn’t you? If, in that devastated world, you came across a sociopath torturing another sociopath or an animal, and could stop them at no risk to yourself, you’d do it, wouldn’t you?
You say that you are excluding sociopaths, and I think that they should be excluded; but on exactly what basis?
I suspect that your intuition comes from the fact that a central part of morality is fairness, and sociopaths don’t care about fairness. Obviously being fair to the unfair is as unwise as tolerating the intolerant.
And for a follow-up question; is it necessary to limit it to humanity? Let us assume that, ten years from now, a flying saucer lands in the middle of Durban, and we meet a sentient alien form of life. Would it be necessary to include their moral preferences in the equation as well?
Again, I want to emphasize that morality isn’t the “preference” part, it’s the “concepts” part. But the question of the moral significance of aliens is relevant; I think it would depend on how many of the concepts that make up morality they cared about. I think that at a bare minimum they’d need fairness and sympathy.
So if the Pebblesorters that came out of that ship were horrified that we didn’t care about primality, but were willing to be fair and share the universe with us, they’d be a morally worthwhile species. But if they had no preference for fairness or any sympathy at all, and would gladly kill a billion humans to sort a few more pebbles, that would be a different story. In that case we should probably, after satisfying ourselves that all Pebblesorters were psychologically similar, start prepping a Relativistic Kill Vehicle to point at their planet if they try something.
Here’s a tricky question—what exactly are the limits of “nonsentient”? Can a nonsentient AI fake it, with clever use of holograms and/or humanoid robots, by causing you to think that you are surrounded by a diverse variety of people even when you are not (thus supplying the non-bodily need of social interaction)? The robots would all be philosophical zombies, of course; but is there any way to tell?
I don’t know if I could tell, but I’d very much prefer that the AI not do that, and would consider myself to have been massively harmed if it did, even if I never found out. My preference is to actually interact with a diverse variety of people, not to merely have a series of experiences that seem like I’m doing it.
So, OK. Suppose, on this account, that you and I both care about morality to the same degree… that is, you don’t care about morality more than I do, and I don’t care about morality more than you do. (I’m not sure how we could ever know that this was the case, but just suppose hypothetically that it’s true.)
Suppose we’re faced with a situation in which there are two choices we can make. Choice A causes a system to be more fair, but less free. Choice B leaves that system unchanged. Suppose, for simplicity’s sake, that those are the only two choices available, and we both have all relevant information about the system.
On your account, will we necessarily agree on which choice to make? Or is it possible, in that situation, that you might choose A and I choose B, or vice-versa?
I think it depends on the degree of the change. If the change is very lopsided (e.g. −100 freedom, +1 fairness), I think we’d both choose B.
If we assume that the degree of change is about the same (e.g. +1 fairness, −1 freedom), it would depend on how much freedom and fairness already exist. If the system is very fair but very unfree, we’d both choose B; but if it’s very free and very unfair, we’d both choose A.
However, if we are to assume that the gain in fairness and the loss in freedom are of approximately equivalent size and the current system has fairly large amounts of both freedom and fairness (which I think is what you meant) then it might be possible that we’d have a disagreement that couldn’t be resolved with pure reasoning.
This is called moral pluralism, the idea that there might be multiple moral values (such as freedom, fairness, and happiness) which are objectively correct, imperfectly commensurable with each other, and can be combined in different proportions that are of approximately equivalent objective moral value. If this is the case then your preference for one set of proportions over the other might be determined by arbitrary factors of your personality.
This is not the same as moral relativism, as these moral values are all objectively good, and any society that severely lacks one of them is objectively bad. It’s just that there are certain combinations with different proportions of values that might be both “equally good,” and personal preferences might be the “tiebreaker.” To put it in more concrete terms, a social democracy with low economic regulation and a small welfare state might be “just as good” as a social democracy with slightly higher economic regulation and a slightly larger welfare state, and people might honestly and irresolvably disagree over which one is better. However, both of those societies would definitely be objectively better than Cambodia under the Khmer Rouge, and any rational, fully informed person who cares about morality would be able to see that.
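(To make the “depends on how much already exists” point concrete, here is a toy numerical sketch; the logarithms and the numbers are invented purely for illustration, not a proposed moral theory.)

```python
import math

# Choice A: +1 fairness, -1 freedom.  Choice B: leave the system unchanged.
# Diminishing returns are modelled with logarithms, purely for illustration.

def combined_value(freedom, fairness):
    return math.log(freedom) + math.log(fairness)

# Very free but very unfair: the trade looks worthwhile (choose A).
print(combined_value(99, 3) > combined_value(100, 2))   # True

# Very fair but very unfree: the same trade looks bad (choose B).
print(combined_value(1, 101) > combined_value(2, 100))  # False
```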
Of course, if we are both highly rational and moral, and disagreed about A vs. B, we’d both agree that fighting over them excessively would be morally worse than choosing either of them, and find some way to resolve our disagreement, even if it meant flipping a coin.
I agree with you that in sufficiently extreme cases, we would both make the same choice. Call that set of cases S1.
I think you’re saying that if the case is not that extreme, we might not make the same choice, even though we both care equally about the thing you’re using “morality” to refer to. I agree with that as well. Call that set of cases S2.
I also agree that even in S2, there’s a vast class of options that we’d both agree are worse than either of our choices (as you illustrate with the Khmer Rouge), and a vast class of options that we’d both agree are better than either of our choices, supposing that we are as you suggest rational informed people who care about the thing you’re using “morality” to refer to.
If I’m understanding you, you’re saying in S2 we are making different decisions, but our decisions are equally good. Further, you’re saying that we might not know that our decisions are equally good. I might make choice A and think choice B is wrong, and you might make choice B and think choice A is wrong. Being rational and well-informed people we’d agree that both A and B are better than the Khmer Rouge, and we might even agree that they’re both better than fighting over which one to adopt, but it might still remain true that I think B is wrong and you think A is wrong, even though neither of us thinks the other choice is as wrong as the Khmer Rouge, or fighting about it, or setting fire to the building, or various other wrong things we might choose to evaluate.
It follows that if a choice can go one of three ways (c1, c2, c3) and if I think c1> c2 > c3 and therefore endorse c1, and if you think c2 > c1 > c3 and therefore endorse c2, and if we’re both rational informed people who are in possession of the same set of facts about that choice and its consequences, and if we each think that the other is wrong to endorse the choice we endorse (while still agreeing that it’s better than c3), that there are (at least) two possibilities.
One possibility is that c1 and c2 are, objectively, equally good choices, but we each think the other is wrong anyway. In this case we both care about morality, even though we disagree about right action.
Another possibility is that c1 and c2 are, objectively, not equally good. For example, perhaps c1 is objectively bad, violates morality, and I endorse it only because I don’t actually care about morality. Of course, in this case I may use the label “morality” to describe what I care about, but that’s at best confusing and at worst actively deceptive, because what I really care about isn’t morality at all, but some other thing, like prime-numbered heaps or whatever.
Yes?
So, given that, I think my question is: how might I go about figuring out which possibility is the case?
One possibility is that c1 and c2 are, objectively, equally good choices, but we each think the other is wrong anyway.
I’d say it’s misleading to say we thought the other person was “wrong,” since in this context “wrong” is a word usually used to describe a situation where someone is in objective moral error. It might be better to say: “c1 and c2 are, objectively, equally morally good, but we each prefer a different one for arbitrary, non-moral reasons.”
This doesn’t change your argument in any way, I just think it’s good to have the language clear to avoid accidentally letting in any connotations that don’t belong.
So, given that, I think my question is: how might I go about figuring out which possibility is the case?
This is not something I have done a lot of thinking on, since the odds of ever encountering such a situation are quite low at present. It seems to me, however, that if you are this fair to your opponent, and care this much about finding out the honest truth, then you probably care at least somewhat about morality.
(This brings up an interesting question, which is: might there be some “semi-sociopathic” humans who care about morality, but incrementally, not categorically? That is, if one of these people was rational, fully informed, lacking in self deception, and lacked akrasia, they would devote maybe 70% of their time and effort to morality and 30% to other things? Such a person, if compelled to be honest, might admit that c2 is morally worse than c1, but they don’t care because they’ve used up their 70% of moral effort for the day. It doesn’t seem totally implausible that such people might exist, but maybe I’m missing something about how moral psychology works, maybe it doesn’t work unless it’s all or nothing.)
As for determining whether your opponent cares about morality, you might look to see if they exhibit any of the signs of sociopathy. You might search their arguments for signs of anti-epistemology, or plain moral errors. If you don’t notice any of these things, you might assign a higher probability to the prediction that your disagreement is due to preferring different forms of pluralism.
Of course, in real life perfectly informed, rational humans who lack self deception, akrasia, and so on do not exist. So you should probably assign a much, much, much higher probability to one of those things causing your disagreement.
It might be better to say: “c1 and c2 are, objectively, equally morally good, but we each prefer a different one for arbitrary, non-moral reasons.”
OK. In which case I can also phrase my question as, when I choose c1 over c2, how can I tell whether I’m making that choice for objective moral reasons, as opposed to making that choice for arbitrary non-moral reasons?
You’re right that it doesn’t really change the argument, I’m just trying to establish some common language so we can communicate clearly.
For my own part, I agree with you that ignorance and akrasia are major influences, and I also believe that what you describe as “incremental caring about morality” is pretty common (though I would describe it as individual values differing).
Wikipedia’s page on internalism and externalism calls an entity that understands moral arguments, but is not motivated by them, an “amoralist.” We could say that a person who cares about morality incrementally has individual values that are part moralist and part amoralist.
It’s hard to tell how many people are like this due to the confounding factors of irrationality and akrasia. But I think it’s possible that there are some people who, if their irrationality and akrasia were cured, would still not act perfectly morally. These people would say “I know that the world would be a better place if I acted differently, but I only care about the world to a limited extent.”
However, considering that these people would be rational and lack akrasia, they would still probably do more moral good than the average person does today.
These people would say “I know that the world would be a better place if I acted differently, but I only care about the world to a limited extent.”
Would they necessarily say that? Or might they instead say “I know you think the world would be a better place if I acted differently, but actually it seems to me the world is better if I do what I’m doing?”
Would they necessarily say that? Or might they instead say “I know you think the world would be a better place if I acted differently, but actually it seems to me the world is better if I do what I’m doing?”
That depends on whether they are using the term “better” to mean “morally better” or “more effectively satisfies the sum total of all my values, both moral and non-moral.”
If the person is fully rational, lacking in self-deception, is being totally honest with you, and you and they had both agreed ahead of time that the word “better” means “morally better,” then yes, I think they would say that. If they were lying, or they thought you were using the term “better” to mean “more effectively satisfies the sum total of all my values, both moral and non-moral,” then they might not.
I agree with you that IF (some of) my values are not moral and I choose to maximally implement my values, THEN I’m choosing not to act so as to make the world a morally better place, and IF I somehow knew that those values were not moral, THEN I would say as much if asked (supposing I was aware of my values and I was honest and so forth).
But on your account I still don’t see any way for me to ever know which of my values are moral and which ones aren’t, no matter how self-aware, rational, or lacking in self-deception I might be.
Also, even if I did somehow know that, and were honest and so forth, I don’t think I would say “I only care about the world to a limited extent.” By maximizing my values as implemented in the world, I would be increasing the value of the world, which is one way to express caring about the world. Rather, I would say “I only care about moral betterness to a limited extent; there are more valuable things for the world to be than morally better.”
But on your account I still don’t see any way for me to ever know which of my values are moral and which ones aren’t, no matter how self-aware, rational, or lacking in self-deception I might be.
How does a Pebblesorter know its piles are prime? The less intelligent and rational probably use some sort of vague intuition. The more intelligent and rational probably try dividing the numbers of pebbles by numbers other than one.
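(For concreteness, here is that trial-division check written out as a quick sketch; the code is just an illustration of the procedure a Pebblesorter might run, nothing more.)

```python
def is_prime(n):
    """Trial division: a heap of n pebbles passes only if nothing
    between 2 and sqrt(n) divides it evenly."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime(7))    # True  -> a p-right heap
print(is_prime(9))    # False -> 9 = 3 * 3
print(is_prime(16))   # False -> 16 = 2 ** 4
```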
If you had full knowledge of the concept of “morality” and all the various sub-concepts it included, you could translate that concept into a mathematical equation (the one I’ve been discussing with CC lately), and see if the various values of yours that you feed into it return positive numbers.
If your knowledge is more crude (i.e. if you’re a real person who actually exists), then a possible way to do it would be to divide the nebulous super-concept of “morality” into a series of more concrete and clearly defined sub-concepts that compose it (e.g. freedom, happiness, fairness, etc.). It might also be helpful to make a list of sub-concepts that are definitely not part of morality (possible candidates include malice, sadism, anhedonia, and xenophobia).
After doing that you could, if you are not self-deceived, use introspection to figure out what you value. If you find that your values include the various moral sub-concepts, then it seems like you value morality. If you find yourself not valuing the moral sub-concepts, or valuing some non-moral concept, then you do not value morality, or value non-moral things.
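(Here’s a crude sketch of that introspection test; both concept lists and the example value-sets are invented placeholders, since drawing the real dividing line is exactly the hard part.)

```python
# Crude sketch of the "introspect, then check against the sub-concept lists" test.
# Both lists are invented placeholders, not a finished theory of morality.

MORAL_SUBCONCEPTS = {"freedom", "happiness", "fairness", "truth", "sentient_life"}
NON_MORAL_CONCEPTS = {"malice", "sadism", "anhedonia", "xenophobia", "paperclips"}

def classify(my_values):
    moral = my_values & MORAL_SUBCONCEPTS
    non_moral = my_values & NON_MORAL_CONCEPTS
    if moral and not non_moral:
        return "values morality"
    if moral and non_moral:
        return "values morality, plus other things"
    return "does not value morality"

print(classify({"freedom", "fairness", "happiness"}))  # values morality
print(classify({"paperclips"}))                        # does not value morality
```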
There is no pure ghostly essence of goodness apart from things like truth, happiness and sentient life.
The moral equation we are looking for isn’t something that will provide us with a ghostly essence. It is something that will allow us to sum up and aggregate all the separate good things like truth, happiness, and sentient life, so that we can effectively determine how good various combinations of these things are relative to each other, and reach an optimal combo.
Do you want people to be happy, free, treated fairly, etc.? Then you value morality to some extent. Do you love torturing people just for the hell of it, or want to convert all the matter in the universe into paperclips? Then you, at the very least, definitely value things other than morality.
Also, even if I did somehow know that, and were honest and so forth, I don’t think I would say “I only care about the world to a limited extent.” By maximizing my values as implemented in the world, I would be increasing the value of the world, which is one way to express caring about the world.
By “caring” I meant “caring about whether the world is a good and moral place.” If you instead use the word “caring” to mean “have values that assign different levels of desirability to various possible states that the world could be in” then you are indeed correct that you would not say you didn’t care about the world.
Rather, I would say “I only care about moral betterness to a limited extent; there are more valuable things for the world to be than morally better.”
If by “valuable” you mean “has more of the things that I care about,” then yes, you could say that. Remember, however, that in that case what is “valuable” is subjective; it changes from person to person depending on their individual utility functions. What is “morally valuable,” by contrast, is objective. Anyone, regardless of their utility function, can agree on whether or not the world has great quantities of things like truth, freedom, happiness, and sentient life. What determines the moral character of a person is how much they value those particular things.
Also, as an aside, when I mentioned concepts that probably aren’t part of morality earlier, I did not mean to say that pursuit of those concepts always necessarily leads to immoral results. For instance, imagine a malicious sadist who wants to break someone’s knees. This person assaults someone else out of pure malice and breaks their knees. The injured person turns out to be an escaped serial killer who was about to kill again, and the police are able to apprehend them in their injured state. In this case the malicious person has done good. However, this is not because they have intentionally increased the amount of malicious torture in the universe. It is because they accidentally decreased the amount of murders in the universe.
I 100% agree that there is no ghostly essence of goodness.
I agree that pursuing amoral, or even immoral, values can still lead to moral results. (And also vice-versa.)
I agree that if I somehow knew what was moral and what wasn’t, then I would have a basis for formally distinguishing my moral values from my non-moral values even when my intuitions failed. I could even, in principle, build an automated mechanism for judging things as moral or non-moral. (Similarly, if a Pebblesorter knew that primeness was what it valued and knew how to factor large numbers, it would have a basis for formally distinguishing piles it valued from piles it didn’t value even when its intuitive judgments failed, and it could build an automated mechanism for distinguishing such piles.)
I agree with you that real people who actually exist can’t do this, at least not in detail.
You suggest we can divide morality into subconcepts that comprise it (freedom, happiness, fairness, etc.) and that it excludes (anhedonia, etc.). What I continue to not get is, on your account, how I do that in such a way as to ensure that what I end up with is the objectively correct list of moral values, which on your account exists, rather than some different list of values.
That is, suppose Sam and George both go through this exercise, and one of them ends up with “freedom” on their list but not “cooperation”, and the other ends up with “cooperation” but not “freedom.” On your account it seems clear that at least one of them is wrong, because the correct list of moral values is objective.
So, OK… what would we expect to experience if Sam were right? How does that differ from what we would expect to experience if George were right, or if neither of them were?
Do you want people to be happy, free, be treated fairly, etc? Then you value morality to some extent.
Again: how do we know that? What would I expect to experience differently if, instead, happiness, freedom, fairness, etc. turned out not to be aspects of morality, just like maximizing paperclips does? What should I be looking for, to notice if this is true, or confirm that it isn’t? I would still want people to be happy, free, be treated fairly, etc. in either case, after all. What differences would I experience between the two cases?
If you instead use the word “caring” to mean “have values that assign different levels of desirability to various possible states that the world could be in”
Yes, that’s more or less what I mean by “caring”. More precisely I would say that caring about X consists of desiring states of the world with more X more than states of the world with less X, all else being equal, but that’s close enough to what you said.
If by “valuable” you mean “has more of the things that I care about,” then yes, you could say that. Remember, however, that in that case what is “valuable” is subjective, it changes from person to person depending on their individual utility functions.
Yes, that’s what I mean by “valuable.” And yes, absolutely, what is valuable changes from person to person. If I act to maximize my values and you act to maximize yours we might act in opposition (or we might not, depending, but it’s possible).
And I get that you want to say that if we both gave up maximizing our values and instead agreed to implement moral values, then we would be cooperating instead, and the world would be better (even if it turned out that both of us found it less valuable). What I’m asking you is how (even in principle) we could ever reach that point.
To say that a little differently: you value some things (Vg) and I value some things (Vd). Supposing we are both perfectly rational and honest and etc., we can both know what Vg and Vd are, and what events in the world would maximize each. We can agree to cooperate on maximizing the intersection of (Vg,Vd), and we can work out some pragmatic compromise about the non-overlapping stuff. So far so good; I see how we could in principle reach that point, even if in practice we aren’t rational or self-aware or honest enough to do it.
But I don’t see how we could ever say “There’s this other list, Vm, of moral values; let’s ignore Vg and Vd altogether and instead implement Vm!” because I don’t see how we could ever know what Vm was, even in principle. If we happened to agree on some list Vm, either by coincidence or due to social conditioning or for other reasons, we could agree to implement Vm… which might or might not make the world better, depending on whether Vm happened to be the objectively correct list of moral values. But I don’t see how we could ever, even in principle, confirm or deny this, or correct it if we somehow came to know we had the objectively wrong list.
And if we can’t know or confirm or deny or correct it, even in principle, then I don’t see what is added by discussing it. It seems to me I can just as usefully say, in this case, “I value happiness, freedom, fairness, etc. I will act to maximize those values, and I endorse acting this way,” and nothing is added by saying “Those values comprise morality” except that I’ve asserted a privileged social status for my values.
So, OK… what would we expect to experience if Sam were right? How does that differ from what we would expect to experience if George were right, or if neither of them were?......Again: how do we know that? What would I expect to experience differently if, instead, happiness, freedom, fairness, etc. turned out not to be aspects of morality, just like maximizing paperclips does?
Well, I am basically asserting that morality is some sort of objective equation, or “abstract idealized dynamic,” as Eliezer calls it, concerned with people’s wellbeing. And I am further asserting that most human beings care very much about this concept. I think this would make the following predictions:
In a situation where a given group of humans had similar levels of empirical knowledge and a similar sanity waterline there would be far more moral agreement among them than would be predicted by chance, and far less moral disagreement than is mentally possible.
It is physically possible to persuade people to change their moral values by reasoned argument.
Inhabitants of a society who are unusually rational and intelligent will be the first people in that society to make moral progress, as they will be better at extrapolating answers out of the “equation.”
If one attempted to convert the moral computations people make into an abstract, idealized process, and determine its results, many people would find those results at least somewhat persuasive, and may find their ethical views changed by observing them.
All of these predictions appear to be true:
Human societies tend to have a rather high level of moral agreement between their members. Conformity is not necessarily an indication of rightness; it seems fairly obvious that whole societies have held gravely mistaken moral views, such as those that believed slavery was good. However, it is interesting that all those people in those societies were mistaken in exactly the same way. That seems like evidence that they were all reasoning towards similar conclusions, and that the mistakes they made were caused by common environmental factors that impacted all of them. There are other theories that explain this data, of course (peer pressure, for instance), but I still find it striking.
I’ve had moral arguments made by other people change my mind, and changed the minds of other people by moral argument. I’m sure you have also had this experience.
It is well known that intellectuals tend to develop and adopt new moral theories before the general populace does. Common examples of intellectuals whose moral concepts have disseminated into the general populace include John Locke, Jeremy Bentham, and William Lloyd Garrison. Many of these people’s principles have since been adopted into the public consciousness.
Ethical theorists who have attempted to derive new ethical principles by working from an abstract, idealized form of ethics have often been very persuasive. To name just one example, Peter Singer ended up turning thousands of people into vegetarians with moral arguments that started on a fairly abstract level.
It seems to me I can just as usefully say, in this case, “I value happiness, freedom, fairness, etc. I will act to maximize those values, and I endorse acting this way,” and nothing is added by saying “Those values comprise morality”
Asserting that those values comprise morality seems to be effective because it seems to most people that those values are related in some way: they form the superconcept “morality.” Morality is a useful catchall term for certain types of values, and it would be a shame to lose it.
Still, I suppose that asserting “I value happiness, freedom, fairness, etc” is similar enough to saying “I care about morality” that I really can’t object terribly strongly if that’s what you’d prefer to do.
except that I’ve asserted a privileged social status for my values.
Why does doing that bother you? Presumably, because you care about the moral concept of fairness, and don’t want to claim an unfair level of status for you and your views. But does it really make sense to say “I care about fairness, but I want to be fair to other people who don’t care about it, so I’ll go ahead and let them treat people unfairly, in order to be fair”? That sounds silly, doesn’t it? It has the same problems that come with being tolerant of intolerant people.
I think this would make the following predictions:
All of those predictions seem equally likely to me whether Sam is right or George is, so they don’t really engage with my question at all. At this point, after several trips ’round the mulberry bush, I conclude that this is not because I’m being unclear with my question but rather because you’re choosing not to answer it, so I will stop trying to clarify the question further.
If I map your predictions and observations to the closest analogues that make any sense to me at all, I basically agree with them.
I suppose that asserting “I value happiness, freedom, fairness, etc” is similar enough to saying “I care about morality” that I really can’t object terribly strongly if that’s what you’d prefer to do.
It is.
Why does doing that [asserting a privileged social status for my values] bother you?
It doesn’t bother me; it’s a fine thing to do under some circumstances. If we can agree that that’s what we’re doing when we talk about “objective morality,” great. If not (which I find more likely), never mind.
Presumably, because you care about the moral concept of fairness, and don’t want to claim an unfair level of status for you and your views.
As above, I don’t see what the word “moral” is adding to this sentence. But sure, unfairly claiming status bothers me to the extent that I care about fairness. (That said, I don’t think claiming status by describing my values as “moral” is unfair; pretty much everybody has an equal ability to do it, and indeed they do. I just think it confuses any honest attempt at understanding what’s really going on when we decide on what to do.)
But does it really make sense to say “I care about fairness, but I want to be fair to other people who don’t care about it, so I’ll go ahead and let them treat people unfairly, in order to be fair.”
It depends on why and how I value (“care about”) fairness.
If I value it instrumentally (which I do), then it makes perfect sense to say that being fair to people who treat others unfairly is net-valuable, although it might be true or false in any given situation depending on what is achieved by the various kinds of fairness that exist in tension in that situation.
Similarly, if I value it in proportion to how much of it there is (which I do), then it makes sense to say that, although it might be true or false depending on how much fairness is gained or lost by doing so.
That sounds silly, doesn’t it?
(nods) Totally. And the ability to phrase ideas in silly-sounding ways is valuable for rhetorical purposes, although it isn’t worth much as an analytical tool.
All of those predictions seem equally likely to me whether Sam is right or George is, so they don’t really engage with my question at all.
I’m really sorry, I was trying to kill two birds with one stone and simultaneously engage that question and your later question [“What would I expect to experience differently if, instead, happiness, freedom, fairness, etc. turned out not to be aspects of morality, just like maximizing paperclips does?”] at the same time, and I ended up doing a crappy job of answering both of them. I’ll try to just answer the Sam and George question now.
I’ll start by examining the Pebblesorters P-George and P-Sam. P-George thinks 9 is p-right and 16 is p-wrong. P-Sam thinks 9 is p-wrong and 16 is p-right. They both think they are using the word “p-right” to refer to the same abstract, idealized process. What can they do to see which one is right?
They assume that most other Pebblesorters care about the same abstract process they do, so they can try to persuade them and see how successful they are. Of course, even if all the Pebblesorters agree with one of them, that doesn’t necessarily mean that one is p-correct; those sorters may be making the same mistake as P-George or P-Sam. But I think it’s non-zero Bayesian evidence of the p-rightness of their views.
They can try to control for environmentally caused error by seeing if they can also persuade Pebblesorters who live in different environments and cultures.
They can find the most rational and p-sane Pebblesorting societies and see if they have an easier time persuading them.
They can actually try to extrapolate what the abstract, idealized equation that the word “p-right” represents is and compare it to their views. They read up on Pebblesorter philosophers’ theories of p-rightness and see how they correlate with their views. Pebblesorting is much simpler than morality, so we know that the abstract, idealized dynamic that the concept “p-right” represents is “primality.” So we know that P-Sam and P-George are both partly right and partly wrong: neither 9 nor 16 is prime.
Now let’s translate that into human.
If Sam were right and George were wrong, we would expect the following:
Sam would have an easier time persuading non-sociopathic humans of the rightness of his views than George would, because his views are closer to the results of the equation those people have in their head.
If he went around to different societies with different moral views and attempted to persuade the people there of his views he should, on average, also have an easier time of it than George, again because his views are closer to the results of the equation those people have in their head.
Societies with higher levels of sanity and rationality should be especially easily persuaded, because they are better at determining what the results of that equation would be.
If Sam compared his and George’s views to views generated by various attempts by philosophers to create an abstract, idealized version of the equation (i.e., moral theories), his views should be a better match to many of them, and to the results they generate, than George’s are.
The problem is that the concept of morality is far more complex than the concept of primality, so finding the right abstract, idealized equation is harder for humans than it is for Pebblesorters. We haven’t managed to do it yet. But I think that by comparing Sam’s and George’s views to the best approximations we have so far (various forms of consequentialism, in my view) we can get some Bayesian evidence of the rightness of their views. (I sketch a toy version of that kind of Bayesian update just below.)
If George is right, he will achieve these results instead of Sam. If they are both wrong, they will both fail at doing these things.
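Here is that toy sketch, with completely made-up likelihoods (the 0.6 and 0.3 are illustrative assumptions, not numbers I could actually defend):

    # Toy odds update for "Sam is right" vs. "George is right"; all likelihoods are invented.
    # Observation E: Sam has a noticeably easier time persuading sane, rational audiences.
    prior_odds = 1.0            # start indifferent between the two hypotheses
    p_E_if_sam_right = 0.6      # assumed: easier persuasion is fairly likely if Sam is right
    p_E_if_george_right = 0.3   # assumed: still possible, but less likely, if George is right

    likelihood_ratio = p_E_if_sam_right / p_E_if_george_right
    posterior_odds = prior_odds * likelihood_ratio
    print(posterior_odds)       # 2.0: the odds shift toward Sam without proving anything

The point is only that the observation shifts the odds; with different assumed likelihoods the shift would be larger or smaller, and no single observation settles the question.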
If I value it instrumentally (which I do), then it makes perfect sense to say that being fair to people who treat others unfairly is net-valuable, although it might be true or false in any given situation depending on what is achieved by the various kinds of fairness that exist in tension in that situation.
Sorry, I was probably being unclear as to what I meant because I was trying to sound clever. When I said it was silly to be fair to unfair people what I meant was that you should not regard their advice on how to best treat other people with the same consideration you’d give to a fair-minded person’s advice.
For instance, you wouldn’t say “I think it’s wrong to enslave black people, but that guy over there thinks it’s right, so let’s compromise and believe it’s okay to enslave them 50% of the time.” I suppose you might pretend to believe that if the other guy had a gun and you didn’t, but you wouldn’t let his beliefs affect yours.
I did not mean that, for example, if you, two fair-minded people, and one unfair-minded person are lost in the woods and find a pie, that you shouldn’t give the unfair-minded person a quarter of the pie to eat. That is an instance where it does make sense to treat unfair people fairly.
OK. Thanks for engaging with the question; that was very helpful. I now have a much better understanding of what you believe the differences-in-practice between moral and non-moral values are.
Just to echo back what I’m hearing you say: to the extent that some set of values Vm is easier to convince humans to adopt than other sets of values and easier to convince sane, rational societies to adopt than less sane, less rational societies and better approximates the moral theories created by philosophers than other sets of values, to that extent we can be confident that Vm is the set of values that comprise morality.
Did I get that right?
Regarding fair-mindedness: I endorse giving someone’s advice consideration to the extent that I’m confident that considering their advice will implement my values. And, sure, it’s unlikely that the advice of an unfair-minded person would, if considered, implement the value of fairness.
This brings up an interesting question, which is: might there be some “semi-sociopathic” humans who care about morality, but incrementally, not categorically?
It seems very likely that a person who cares a certain amount about morality, and a certain amount about money, would be willing to compromise his morality if given sufficient money. Such a mental model would form the basis of bribery. (It doesn’t have to be money, either, but the principle remains the same).
So a semi-sociopathic person would be anyone who could be bribed into completely disregarding morality.
a semi-sociopathic person would be anyone who could be bribed into completely disregarding morality.
On this account, we could presumably also categorize a semi-semi-sociopathic person as one who could be bribed into partially disregarding the thing we’re labeling “morality”. And of course bribes needn’t be money… people can be bribed by all kinds of things. Social status. Sex. Pleasant experiences. The promise of any or all of those things in the future.
Which is to say, we could categorize a semi-semi-sociopath as someone who cares about some stuff, and makes choices consistent with maximizing the stuff they care about, where some of that stuff is what we’re labeling “morality” and some of it isn’t.
We could also replace the term “semi-semi-sociopath” with the easier to pronounce and roughly equivalent term “person”.
It’s also worth noting that there probably exists stuff that we would label “morality” in one context and “bribe” in another, were we inclined to use such labels.
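To restate that as a toy model (the weights, scores, and bribe size below are all invented for illustration, not a claim about real psychology): treat a person as choosing whichever option maximizes a weighted sum of the stuff they care about, and “bribability” just falls out of the weights.

    # Toy model of a "person" as a weighted mixture of values; every number here is invented.
    def total_value(option, weights):
        return sum(weights[v] * option.get(v, 0.0) for v in weights)

    weights = {"morality": 0.7, "money": 0.3}   # assumed split; a pure sociopath would have morality: 0.0
    honest = {"morality": 1.0, "money": 0.0}    # the morally better option
    bribed = {"morality": -1.0, "money": 5.0}   # morally worse, but it comes with a payment of 5 units

    print(total_value(honest, weights))  # 0.7
    print(total_value(bribed, weights))  # 0.8 -- a large enough bribe flips the choice

On this sketch, the difference between a sociopath, a “semi-sociopath,” and a garden-variety person is just where the morality weight sits between zero and one.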
I don’t think I’m coming across right. I’m not saying that morality is some sort of collective agreement of people in regards to their various preferences. I’m saying that morality is a series of concepts such as fairness, happiness, freedom etc., that these concepts are objective in the sense that it can be objectively determined how much fairness, freedom, happiness etc. there is in the world, and that the sum of these concepts can be expressed as a large equation.
Ah, I think I see your point. What you’re saying—and correct me if I’m wrong—is that there is some objective True Morality, some complex equation that, if applied to any possible situation, will tell you how moral a given act is.
This is probably true.
This equation isn’t written into the human psyche; it exists independently of what people think about morality. It just is. And even if we don’t know exactly what the equation is, even if we can’t work out the morality of a given act down to the tenth decimal place, we can still apply basic heuristics and arrive at a usable estimate in most situations.
My question is, then—assuming the above is true, how do we find that equation? Does there exist some objective method whereby you, I, a Pebblesorter, and a Paperclipper can all independently arrive at the same definition for what is moral (given that the Pebblesorter and Paperclipper will almost certainly promptly ignore the result)?
(I had thought that you were proposing that we find that equation by summing across the moral values and imperatives of humanity as a whole—excluding the psychopaths. This is why I asked about the exclusion, because it sounded a lot like writing down what you wanted at the end of the page and then going back and discarding the steps that wouldn’t lead there; that is also why I asked about the aliens).
I don’t know if I could tell, but I’d very much prefer that the AI not do that, and would consider myself to have been massively harmed if it did, even if I never found out. My preference is to actually interact with a diverse variety of people, not to merely have a series of experiences that seem like I’m doing it.
Yes, I think we’re in agreement on that. (Though this does suggest that ‘sentient’ may need a proper definition at some point).
What you’re saying—and correct me if I’m wrong—is that there is some objective True Morality, some complex equation that, if applied to any possible situation, will tell you how moral a given act is.
In the same way as there exists a True Set of Prime Numbers, and True Measure of How Many Paperclips There Are...
My question is, then—assuming the above is true, how do we find that equation?
Even though the equation exists independently of our thoughts (the same way primality exists independently from Pebblesorter thoughts), the fact that we are capable of caring about the results given by the equation means we must have some parts of it “written” in our heads, the same way Pebblesorters must have some concept of primality “written” in their heads. Otherwise, how would we be capable of caring about its results?
I think that probably evolution metaphorically “wrote” a desire to care about the equation in our heads because if humans care about what is good and right it makes it easier for them to cooperate and trust each other, which has obvious fitness advantages. Of course, the fact that evolution did a good thing by causing us to care about morality doesn’t mean that evolution is always good, or that evolutionary fitness is a moral justification for anything. Evolution is an amoral force that causes many horrible things to happen. It just happened that in this particular instance, evolution’s amoral metaphorical “desires” happened to coincide with what was morally good. That coincidence is far from the norm; in fact, evolution probably deleted morality from the brains of sociopaths because double-crossing morally good people also sometimes confers a fitness advantage.
So how do we learn more about this moral equation that we care about? One common form of attempting to get approximations of it in philosophy is called reflective equilibrium, where you take your moral imperatives and heuristics and attempt to find the commonalities and consistencies they have with each other. It’s far from perfect, but I think that this method has produced useful results in the past.
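If it helps to see the shape of the procedure, here is a very rough sketch; the particular judgment, principle, and confidence numbers are invented, and real reflective equilibrium is nowhere near this mechanical:

    # Toy "reflective equilibrium": when a general principle conflicts with a particular
    # judgment, revise whichever one is held with less confidence. All entries are invented.
    principle = ("lying is always wrong", 0.5)
    judgment = ("lying to spare someone's feelings is OK", 0.8)

    def conflict(p, j):
        # Stand-in for actual moral reasoning; here the conflict is simply stipulated.
        return True

    if conflict(principle[0], judgment[0]):
        if principle[1] < judgment[1]:
            principle = ("lying is wrong unless it prevents a greater harm", principle[1])
        else:
            judgment = ("lying to spare someone's feelings is wrong after all", judgment[1])

    print(principle)  # the over-general principle gets revised to fit the firmer judgment

Iterating that kind of adjustment across many principles and judgments, until nothing clashes, is the “equilibrium” part.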
Eliezer has proposed what is essentially a souped up version of reflective equilibrium called Coherent Extrapolated Volition. He has argued, however, that the primary use of CEV is in designing AIs that won’t want to kill us, and that attempting to extrapolate other people’s volition is open to corruption, as we could easily fall to the temptation to extrapolate it to something that personally benefits us.
Does there exist some objective method whereby you, I, a Pebblesorter, and a Paperclipper can all independently arrive at the same definition for what is moral (given that the Pebblesorter and Paperclipper will almost certainly promptly ignore the result)?
Again, we could probably get closer through reflective equilibrium, and by critiquing the methods and results of each other’s reflections. If you somehow managed to get a Pebblesorter or a Paperclipper to do it too, they might generate similar results, although since they don’t intrinsically care about the equation you would probably have to give them some basic instructions before they started working on the problem.
I had thought that you were proposing that we find that equation by summing across the moral values and imperatives of humanity as a whole—excluding the psychopaths.
If we assume that most humans care about acting morally, doing research about what people’s moral imperatives are might be somewhat helpful, since it would allow us to harvest the fruits of other people’s moral reflections and compare them with our own. We can exclude sociopaths because there is ample evidence that they care nothing for morality.
Although I suppose that a super-genius sociopath who had the basic concept explained to them might be able to do some useful work in the same fashion that a Pebblesorter or Paperclipper might be able to. Of course, the genius sociopath wouldn’t care about the results, and probably would have to be paid a large sum to even agree to work on the problem.
I think that probably evolution metaphorically “wrote” a desire to care about the equation in our heads because if humans care about what is good and right it makes it easier for them to cooperate and trust each other, which has obvious fitness advantages.
Hmmm. That which evolution has “written” into the human psyche could, in theory, and given sufficient research, be read out again (and will almost certainly not be constant across most of humanity, but will rather exist with variations). But I doubt that morality is all in our genetic nature; I suspect that most of it is learned, from our parents, aunts, uncles, grandparents and other older relatives; I think, in short, that morality is memetic rather than genetic. Though evolution still happens in memetic systems just as well as in genetic systems.
So how do we learn more about this moral equation that we care about? One common form of attempting to get approximations of it in philosophy is called reflective equilibrium, where you take your moral imperatives and heuristics and attempt to find the commonalities and consistencies they have with each other. It’s far from perfect, but I think that this method has produced useful results in the past.
Hmmm. Looking at the Wikipedia article, I can expect reflective equilibrium to produce a consistent moral framework. I also expect a correct moral framework to be consistent; but not all consistent moral frameworks are correct. (A paperclipper does not have what I’d consider a correct moral framework, but it does have a consistent one).
If you start out close to a correct moral framework, then reflective equilibrium can move you closer, but it doesn’t necessarily do so.
Eliezer has proposed what is essentially a souped up version of reflective equilibrium called Coherent Extrapolated Volition. He has argued, however, that the primary use of CEV is in designing AIs that won’t want to kill us, and that attempting to extrapolate other people’s volition is open to corruption, as we could easily fall to the temptation to extrapolate it to something that personally benefits us.
Hmmm. The primary use of trying to find the True Morality Equation, to my mind, is to work it into a future AI. If we can find such an equation, prove it correct, and make an AI that maximises its output value, then that would be an optimally moral AI. This may or may not count as Friendly, but it’s certainly a potential contender for the title of Friendly.
Again, we could probably get closer through reflective equilibrium, and by critiquing the methods and results of each other’s reflections. If you somehow managed to get a Pebblesorter or a Paperclipper to do it too, they might generate similar results, although since they don’t intrinsically care about the equation you would probably have to give them some basic instructions before they started working on the problem.
Carrying through this method to completion could give us—or anyone else—an equation. But is there any way to be sure that it necessarily gives us the correct equation? (A Pebblesorter may actually be a very good help in resolving this question; he does not care about morality, and therefore does not have any emotional investment in the research).
The first thought that comes to my mind, is to have a very large group of researchers, divide them into N groups, and have each of these groups attempt, independently, to find an equation; if all of the groups find the same equation, this would be evidence that the equation found is correct (with stronger evidence at larger values of N). However, I anticipate that the acquired results would be N subtly different, but similar, equations.
But I doubt that morality is all in our genetic nature; I suspect that most of it is learned, from our parents, aunts, uncles, grandparents and other older relatives; I think, in short, that morality is memetic rather than genetic.
That’s possible. But memetics can’t build morality out of nothing. At the very least, evolved genetics has to provide a “foundation,” a part of the brain that moral memes can latch onto. Sociopaths lack that foundation, although the research is inconclusive as to what extent this is caused by genetics, and what extent it is caused by later developmental factors (it appears to be a mix of some sort).
Hmmm. Looking at the Wikipedia article, I can expect reflective equilibrium to produce a consistent moral framework. I also expect a correct moral framework to be consistent; but not all consistent moral frameworks are correct.
Yes, that’s why I consider reflective equilibrium to be far from perfect. Depending on how many errors you latch onto, it might worsen your moral state.
Carrying through this method to completion could give us—or anyone else—an equation. But is there any way to be sure that it necessarily gives us the correct equation?
Considering how morally messed up the world is now, even an imperfect equation would likely be better (closer to being correct) than our current slapdash moral heuristics. At this point we haven’t even achieved “good enough,” so I don’t think we should worry too much about being “perfect.”
However, I anticipate that the acquired results would be N subtly different, but similar, equations.
That’s not inconceivable. But I think that each of the subtly different equations would likely be morally better than pretty much every approximation we currently have.
But memetics can’t build morality out of nothing. At the very least, evolved genetics has to provide a “foundation,” a part of the brain that moral memes can latch onto. Sociopaths lack that foundation, although the research is inconclusive as to what extent this is caused by genetics, and what extent it is caused by later developmental factors
That sounds plausible, yes.
Considering how morally messed up the world is now, even an imperfect equation would likely be better (closer to being correct) than our current slapdash moral heuristics. At this point we haven’t even achieved “good enough,” so I don’t think we should worry too much about being “perfect.”
Hmmm. Finding an approximation to the equation will probably be easier than step two: encouraging people worldwide to accept the approximation. (Especially since many people who do accept it will then promptly begin looking for loopholes, either to use them or to patch them).
However, if the correct equation cannot be found, then this means that the Morality Maximiser AI cannot be designed.
However, if the correct equation cannot be found, then this means that the Morality Maximiser AI cannot be designed.
That’s true; what I was trying to say is that a world ruled by a 99.99% Approximation of Morality Maximizer AI might well be far, far better than our current one, even if it is imperfect.
Of course, it might be a problem if we put the 99.99% Approximation of Morality Maximizer AI in power, then find the correct equation, only to discover that the 99AMMAI is unwilling to step down in favor of the Morality Maximizer AI. On the other hand, putting the 99AMMAI in power might be the only way to ensure a Paperclipper doesn’t ascend to power before we find the correct equation and design the MMAI. I’m not sure whether we should risk it or not.
I’d say sometimes A, and sometimes B. But I think that’s true even in the absence of mental disorders; I don’t think that the “ideal equation” necessarily sits somewhere hidden in the human psyche.
That is valid, as long as both systems have the same goals. Marvin’s system includes the explicit goal “stay alive”, more heavily weighted then the goal “keep a stranger alive”; Fred’s system explicitly entirely excludes the goal “stay alive”.
If two moral systems agree both on the goals to be achieved, and the weightings to give those goals, then they will be the same moral system, yes. But two people’s moral systems need not agree on the underlying goals.
Well, to be fair, in a Paperclipper’s mind, paperclips are the positive things in life, and they certainly make the paperclipper happier. I realise that’s probably not what you intended, but the phrasing may need work.
Which really feeds into the question of what goals a moral system should have. To the Babyeaters, a moral system should have the goal of eating babies, and they can provide a lot of argument to support that point—in terms of improved evolutionary fitness, for example.
I think that we can agree that a moral system’s goals should be the good things in life. I’m less certain that we can agree on what those good things necessarily are, or on how they should be combined relative to each other. (I expect that if we really go to the point of thoroughly dissecting what we consider to be the good things in life, then we’ll agree more than we disagree; I expect we’ll be over 95% in agreement, but not quite 100%. This is what I generally expect for any stranger).
For example, we might disagree on whether it is more important to be independant in our actions, or to follow the legitimate instructions of a suitably legitimate authority.
Hmmm. I haven’t read that one.
It’s not that I think there’s literally a math equation locked in the human psyche that encodes morality. It’s more that there are multiple (sometimes conflicting) moral values, along with methods for resolving conflicts between them, and that the sum of these can be modeled as a large and complicated equation.
You gave me the impression that Marvin valued “staying alive” less as an end in itself, and more as a means to achieve the end of improving the world, in particular when you said this:
This is actually something that bothers me in fiction when a character who is superhumanly good and powerful (e.g. Superman, the Doctor) risks their life to save a relatively small number of people. It seems short-sighted of them to do that, since they regularly save much larger groups of people and anticipate continuing to do so in the future, so it seems like they should preserve their lives for those people’s sakes.
If you define “the good things in life” as “whatever an entity wants the most,” then you can agree that whatever someone wants is “good,” be it paperclips or eudaemonia. On the other hand, I’m not sure we should do this; there are some hypothetical entities I can imagine where I can’t see it as ever being good that they get what they want. For instance, I can imagine a Human-Torture-Maximizer that wants to do nothing but torture human beings. It seems to me that even if there were a trillion Human-Torture-Maximizers and one human in the universe it would be bad for them to get what they want.
For more neutral, but still alien preferences, I’m less sure. It seems to me that I have a right to stop Human-Torture-Maximizers from getting what they want. But would I have the right to stop paperclippers? Making the same paperclip over and over again seems like a pointless activity to me, but if the paperclippers are willing to share part of the universe with existing humans do I have a right to stop them? I don’t know, and I don’t think Eliezer does either.
I think that we, and most humans, have the same basic desires; where we differ is in the object of those desires, and the priority of those desires.
For instance, most people desire romantic love. But those desires usually have different objects: I desire romantic love with my girlfriend; other people desire it with their significant others. Similarly, most people desire to consume stories, but the object of that desire differs: some people like Transformers, others The Notebook.
Similarly, people often desire the same things, but differ as to their priorities, how much of those things they want. Most people desire both socializing, and quiet solitude, but some extroverts want lots of one and less of the other, while introverts are the opposite.
In the case of the paperclippers, my first instinct is to regard opposing paperclipping as no different from the many ways humans have persecuted each other for wanting different things in the past. But then it occurred to me that paperclip-maximizing might be different, because most persecutions in the past involved persecuting people who have different objects and priorities, not people who actually have different desires. For instance, homosexuality is the same kind of desire as heterosexuality, just with a different object (same sex instead of opposite).
Does this mean it isn’t bad to oppose paperclipping? I don’t know, maybe, but maybe not. Maybe we should just try to avoid creating paperclippers or similar creatures so we don’t have to deal with it.
This seems like a difference in priority, rather than desire, as most people would prefer differing proportions of both. It’s still a legitimate disagreement, but I think it’s more about finding a compromise between conflicting priorities, rather than totally different values.
Compounding this problem is the fact that people value diversity to some extent. We don’t value all types of diversity, obviously; I think we’d all like to live in a world where people held unanimous views on the unacceptability of torturing innocent people. But we would like other people to be different from us in some ways. Most people, I think, would rather live in a world full of different people with different personalities than a world consisting entirely of exact duplicates (in both personality and memory) of one person. So it might be impossible to reach full agreement on those other values without screwing up the achievement of the Value of Diversity.
I’m sorry, there’s an ambiguity there—when you say “the sum of these”, are you summing across the moral values and imperatives of a single person, or of humanity as a whole?
You are quite correct. I apologise; I changed that example several times from where I started, and it seems that one of my last-minute changes actually made it a worse example (my aim was to try to show how the explicit aim of self-preservation could be a reasonable moral aim, but in the process I made it not a moral aim at all). I should watch out for that in the future.
I’ve always felt that was because one of the effects of great power, is that it’s so very easy to let everyone die. With great power, as Spiderman is told, comes great responsibility; one way to ensure that you’re not letting your own power go to your head, is by refusing to not-rescue anyone. After all, if the average science hero lets everyone he thinks is an idiot die, then who would be left?
Sometimes there’s a different reason, though; Sherlock Holmes would ignore a straightforward and safe case to catch a serial killer in order to concentrate on a tricky and deadly case involving a stolen diamond; he wasn’t in the detective business to help people, he was in it to be challenged, and he would regularly refuse to take cases that did not challenge him.
(That’s probably a fair example as well, actually; for Holmes, only the challenge, the mental stimulation of a worthy foe, is important; for Superman, what is important is the saving of lives, whether from a mindless tsunami or Lex Luthor’s latest plot).
Hmmm. If you’re willing to accept zero, or near-zero, as a priority, then that statement can apply to any two sets of desires. Consider Sherlock Holmes and a paperclipper; Holmes’ desire for mental stimulation is high-priority, his desire for paperclips is zero-priority, while the paperclipper’s desire for paperclips is high-priority, and its desire for mental stimulation is zero-priority. (Some desires may have negative priority, which can then be interpreted as a priority to avoid that outcome—for example, my desire to immerse my hand in acid is negative, but a masochist may have a positive priority for that desire.)
This implies that, in order to meaningfully differentiate the above statement from “some people have different desires”, I may have to designate some very low priority, below which the desire is considered absent (I may, of course, place that line at exactly zero priority). Some desires, however, may have no priority on their own, but inherit priority from another desire that they feed into; for example, a paperclipper has zero desire for self-preservation on its own, but it will desire self-preservation so that it can better create more paperclips.
Now, given a pool of potential goals, most people will pick out several desires from that pool, and there will be a large overlap between any two people (for example, most humans desire to eat—most but not all, there are certain eating disorders that can mess with that), and it is possible to pick out a set of desires that most people will have high priorities for.
It’s even probably possible to pick out a (smaller) set of desires such that those who do not have those desires at some positive priority are considered psychologically unhealthy. But such people nonetheless do exist.
In my personal view, it is neutral to paperclip or to oppose paperclipping. It becomes bad to paperclip only when the paperclipping takes resources away from something more important.
And there are circumstances (somewhat forced circumstances) where it could be good to paperclip.
There exist people who would place negative value on the idea of following the instructions of any legitimate authority. (They tend to remain a small and marginal group, because they cannot in turn form an authority for followers to follow without rampant hypocrisy).
Yes, diversity has many benefits. The second-biggest benefit of diversity is that some people will be more correct than others, and this can be seen in the results they get; then everyone can re-diversify around the most correct group (a slow process, taking generations, as the most successful group slowly outcompetes the rest and thus passes their memes to a greater and/or more powerful proportion of the next generation). By a similar token, it means that when something happens that destroys one type of person, it doesn’t destroy everyone (bananas have a definite problem there, being a bit of a monoculture).
The biggest benefit is that it leads to social interaction. A completely non-diverse society would have to be a hive mind (or different experiences would slowly begin to introduce diversity), and it would be a very lonely hive mind, with no-one to talk to.
Nearly all of humanity as a whole. There are obviously some humans who don’t really value morality, we call them sociopaths, but I think most humans care about very similar moral concepts. The fact that people have somewhat different personal preferences and desires at first might seem to challenge this idea, but I don’t really think it does. It just means that there are some desires that generate the same “value” of “good” when fed into the “equation.” In fact, if diversity is a good, as we discussed previously, then people having different personal preferences might in fact be morally desirable.
That’s a good point. I was considering using the word “proportionality” instead of “priority” to better delineate that I don’t accept zero as a priority, but rejected it because it sounded clunky. Maybe I shouldn’t have.
I agree with that. What I’m wondering is, would I have a moral duty to share resources with a paperclipper if it existed, or would pretty much any of the things I’d spend the resources on if I kept them for myself (i.e. eudaemonic things) count as “something more important”?
I think there might actually be lots of people like this, but most appear normal because they place even greater negative value on doing something stupid because they ignored good advice just because it came from an authority. In other words, following authority is a negative terminal value, but an extremely positive instrumental value.
Exactly. I would still want the world to be full of a diverse variety of people, even if I had a nonsentient AI that was right about everything and could serve my every bodily need.
Okay then, next question; how do you decide which people to exclude? You say that you are excluding sociopaths, and I think that they should be excluded; but on exactly what basis? If you’re excluding them simply because they fail to have the same moral imperatives as the ones that you think are important, then that sounds very much like a No True Scotsman argument to me. (I exclude them mainly on an argument of appeal to authority, myself, but that also has logic problems; in either case, it’s a matter of first sketching out what the moral imperative should be, then throwing out the people who don’t match).
And for a follow-up question; is it necessary to limit it to humanity? Let us assume that, ten years from now, a flying saucer lands in the middle of Durban, and we meet a sentient alien form of life. Would it be necessary to include their moral preferences in the equation as well?
Even if they are Pebblesorters?
It may be, but only within a limited range. A serial killer is well outside that range, even if he believes that he is doing good by only killing “evil” people (for some definition of “evil”).
Hmmm. I think I’d put “buying a packet of paperclips for the paperclipper” as on the same moral footing, more or less, as “buying an ice cream for a small child”. It’s nice for the person (or paperclipper) receiving the gift, and that makes it a minor moral positive by increasing happiness by a tiny fraction. But if you could otherwise spend that money on something that would save a life, then that clearly takes priority.
Hmmm. Good point; that is quite possible. (Given how many people seem to follow any reasonably persuasive authority, though, I suspect that most people have a positive priority for this goal—this is probably because, for a lot of human history, peasants who disagreed with the aristocracy tended to have fewer descendants unless they all disagreed and wiped out said aristocracy).
Here’s a tricky question—what exactly are the limits of “nonsentient”? Could a nonsentient AI fake it by, with clever use of holograms and/or humanoid robots, causing you to think that you are surrounded by a diverse variety of people even when you are not (thus supplying the non-bodily need of social interaction)? The robots would all be philosophical zombies, of course; but is there any way to tell?
I don’t think I’m coming across right. I’m not saying that morality is some sort of collective agreement of people in regards to their various preferences. I’m saying that morality is a series of concepts such as fairness, happiness, freedom etc., that these concepts are objective in the sense that it can be objectively determined how much fairness, freedom, happiness etc. there is in the world, and that the sum of these concepts can be expressed as a large equation.
People vary in their preference for morality; most people care about fairness, freedom, happiness, etc., to some extent. But there are some people who don’t care about morality at all, such as sociopaths.
Morality isn’t a preference. It isn’t the part of a person’s brain that says “This society is fair and free and happy, therefore I prefer it.” Morality is those disembodied concepts of freedom, fairness, happiness, etc. So if a person doesn’t care about those things, it doesn’t mean that freedom, fairness, happiness, etc. aren’t part of their morality. It means that person doesn’t care about morality; they care about something else.
To use the Pebblesorter analogy again, the fact that you and I don’t care about sorting pebbles into prime-numbered heaps isn’t because we have our own concept of “primeness” that doesn’t include 2, 3, 5 and 7. It just means we don’t care about primeness.
To make another analogy, if most people preferred wearing wool clothes but one person preferred cotton, that wouldn’t mean that that person had their own version of wool, which was cotton. It means that that person doesn’t prefer wool.
Look inward, and consider why you think most people should be included. Presumably it’s because you really care a lot about being fair. But that necessarily means that you cared about fairness before you even considered what other people might think. Otherwise it wouldn’t have even occurred to you to think about what they preferred in the first place.
The fact that most humans care, to some extent, about the various facets of morality, is a very lucky thing, a planet full of sociopaths would be most unpleasant. But it isn’t relevant to the truth of morality. You’d still think torturing people was bad if all the non-sociopaths on Earth except you were killed, wouldn’t you? If, in that devastated world, you came across a sociopath torturing another sociopath or an animal, and could stop them at no risk to yourself, you’d do it, wouldn’t you?
I suspect that your intuition comes from the fact that a central part of morality is fairness, and sociopaths don’t care about fairness. Obviously being fair to the unfair is as unwise as tolerating the intolerant.
Again, I want to emphasize that morality isn’t the “preference” part, it’s the “concepts” part. But the question of the moral significance of aliens is relevant, I think it would depend on how many of the concepts that make up morality they cared about. I think that at a bare minimum they’d need fairness and sympathy.
So if the Pebblesorters that came out of that ship were horrified that we didn’t care about primality, but were willing to be fair and share the universe with us, they’d be a morally worthwhile species. But if they had no preference for fairness or any sympathy at all, and would gladly kill a billion humans to sort a few more pebbles, that would be a different story. In that case we should probably, after satisfying ourselves that all Pebblesorters were psychologically similar, start prepping a Relativistic Kill Vehicle to point at their planet if they try something.
I don’t know if I could tell, but I’d very much prefer that the AI not do that, and would consider myself to have been massively harmed if it did, even if I never found out. My preference is to actually interact with a diverse variety of people, not to merely have a series of experiences that seem like I’m doing it.
So, OK. Suppose, on this account, that you and I both care about morality to the same degree… that is, you don’t care about morality more than I do, and I don’t care about morality more than you do. (I’m not sure how we could ever know that this was the case, but just suppose hypothetically that it’s true.)
Suppose we’re faced with a situation in which there are two choices we can make. Choice A causes a system to be more fair, but less free. Choice B leaves that system unchanged. Suppose, for simplicity’s sake, that those are the only two choices available, and we both have all relevant information about the system.
On your account, will we necessarily agree on which choice to make? Or is it possible, in that situation, that you might choose A and I choose B, or vice-versa?
I think it depends on the degree of the change. If the change is very lopsided (e.g. −100 freedom, +1 fairness) I think we’d both choose B.
If we assume that the degree of change is about the same (e.g. +1 fairness, −1 freedom), it would depend on how much freedom and fairness already exist. If the system is very fair, but very unfree, we’d both choose B, but if it’s very free and very unfair we’d both choose A. (I put some toy numbers on this just below.)
However, if we are to assume that the gain in fairness and the loss in freedom are of approximately equivalent size and the current system has fairly large amounts of both freedom and fairness (which I think is what you meant) then it might be possible that we’d have a disagreement that couldn’t be resolved with pure reasoning.
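Here are the toy numbers I promised; the square-root value curves are an arbitrary stand-in for diminishing returns, not my actual estimate of how freedom and fairness trade off:

    import math

    # Toy illustration of diminishing returns: an extra unit of a value matters more
    # when the system has little of it. The sqrt curves and the amounts are arbitrary.
    def goodness(fairness, freedom):
        return math.sqrt(fairness) + math.sqrt(freedom)

    # Very free but very unfair: trading 1 freedom for 1 fairness is an improvement (choice A).
    print(goodness(2, 100) < goodness(3, 99))   # True

    # Very fair but very unfree: the same trade makes things worse, so leave it alone (choice B).
    print(goodness(100, 2) < goodness(101, 1))  # False

When both amounts are already large and roughly equal, the two options come out nearly identical on this toy curve, which is exactly the sort of near-tie I have in mind in what follows.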
This is called moral pluralism, the idea that there might be multiple moral values (such as freedom, fairness, and happiness) which are objectively correct, imperfectly commensurable with each other, and can be combined in different proportions that are of approximately equivalent objective moral value. If this is the case then your preference for one set of proportions over the other might be determined by arbitrary factors of your personality.
This is not the same as moral relativism, as these moral values are all objectively good, and any society that severely lacks one of them is objectively bad. It’s just that there are certain combinations with different proportions of values that might be both “equally good,” and personal preferences might be the “tiebreaker.” To put it in more concrete terms, a social democracy with low economic regulation and a small welfare state might be “just as good” as a social democracy with slightly higher economic regulation and a slightly larger welfare state, and people might honestly and irresolvably disagree over which one is better. However, both of those societies would definitely be objectively better than Cambodia under the Khmer Rouge, and any rational, fully informed person who cares about morality would be able to see that.
Of course, if we are both highly rational and moral, and disagreed about A vs. B, we’d both agree that fighting over them excessively would be morally worse than choosing either of them, and find some way to resolve our disagreement, even if it meant flipping a coin.
I agree with you that in sufficiently extreme cases, we would both make the same choice. Call that set of cases S1.
I think you’re saying that if the case is not that extreme, we might not make the same choice, even though we both care equally about the thing you’re using “morality” to refer to. I agree with that as well. Call that set of cases S2.
I also agree that even in S2, there’s a vast class of options that we’d both agree are worse than either of our choices (as you illustrate with the Khmer Rouge), and a vast class of options that we’d both agree are better than either of our choices, supposing that we are as you suggest rational informed people who care about the thing you’re using “morality” to refer to.
If I’m understanding you, you’re saying in S2 we are making different decisions, but our decisions are equally good. Further, you’re saying that we might not know that our decisions are equally good. I might make choice A and think choice B is wrong, and you might make choice B and think choice A is wrong. Being rational and well-informed people we’d agree that both A and B are better than the Khmer Rouge, and we might even agree that they’re both better than fighting over which one to adopt, but it might still remain true that I think B is wrong and you think A is wrong, even though neither of us thinks the other choice is as wrong as the Khmer Rouge, or fighting about it, or setting fire to the building, or various other wrong things we might choose to evaluate.
Have I followed your position so far?
Yes, I think so.
OK, good.
It follows that if a choice can go one of three ways (c1, c2, c3), and if I think c1 > c2 > c3 and therefore endorse c1, and if you think c2 > c1 > c3 and therefore endorse c2, and if we’re both rational informed people who are in possession of the same set of facts about that choice and its consequences, and if we each think that the other is wrong to endorse the choice they endorse (while still agreeing that it’s better than c3), then there are (at least) two possibilities.
One possibility is that c1 and c2 are, objectively, equally good choices, but we each think the other is wrong anyway. In this case we both care about morality, even though we disagree about right action.
Another possibility is that c1 and c2 are, objectively, not equally good. For example, perhaps c1 is objectively bad, violates morality, and I endorse it only because I don’t actually care about morality. Of course, in this case I may use the label “morality” to describe what I care about, but that’s at best confusing and at worst actively deceptive, because what I really care about isn’t morality at all, but some other thing, like prime-numbered heaps or whatever.
Yes?
So, given that, I think my question is: how might I go about figuring out which possibility is the case?
I’d say it’s misleading to say we thought the other person was “wrong,” since in this context “wrong” is a word usually used to describe a situation where someone is in objective moral error. It might be better to say: “c1 and c2 are, objectively, equally morally good, but we each prefer a different one for arbitrary, non-moral reasons.”
This doesn’t change your argument in any way, I just think it’s good to have the language clear to avoid accidentally letting in any connotations that don’t belong.
This is not something I have done a lot of thinking on, since the odds of ever encountering such a situation are quite low at the present. It seems to me, however, that if you are this fair to your opponent, and care this much about finding out the honest truth, then you probably care at least somewhat about morality.
(This brings up an interesting question, which is: might there be some “semi-sociopathic” humans who care about morality, but incrementally, not categorically? That is, if one of these people was rational, fully informed, lacking in self-deception, and free of akrasia, they would devote maybe 70% of their time and effort to morality and 30% to other things? Such a person, if compelled to be honest, might admit that c2 is morally worse than c1, but they don’t care because they’ve used up their 70% of moral effort for the day. It doesn’t seem totally implausible that such people might exist, but maybe I’m missing something about how moral psychology works; maybe it doesn’t work unless it’s all or nothing.)
As for determining whether your opponent cares about morality, you might look to see if they exhibit any of the signs of sociopathy. You might search their arguments for signs of anti-epistemology, or plain moral errors. If you don’t notice any of these things, you might assign a higher probability to the prediction that your disagreement is due to preferring different forms of pluralism.
Of course, in real life perfectly informed, rational humans who lack self deception, akrasia, and so on do not exist. So you should probably assign a much, much, much higher probability to one of those things causing your disagreement.
OK. In which case I can also phrase my question as, when I choose c1 over c2, how can I tell whether I’m making that choice for objective moral reasons, as opposed to making that choice for arbitrary non-moral reasons?
You’re right that it doesn’t really change the argument, I’m just trying to establish some common language so we can communicate clearly.
For my own part, I agree with you that ignorance and akrasia are major influences, and I also believe that what you describe as “incremental caring about morality” is pretty common (though I would describe it as individual values differing).
Wikipedia’s page on internalism and externalism calls an entity that understands moral arguments, but is not motivated by them, an “amoralist.” We could say that a person who cares about morality incrementally has individual values that are part moralist and part amoralist.
It’s hard to tell how many people are like this due to the confounding factors of irrationality and akrasia. But I think it’s possible that there are some people who, if their irrationality and akrasia were cured, would not act perfectly moral. These people would say “I know that the world would be a better place if I acted differently, but I only care about the world to a limited extent.”
However, considering that these people would be rational and lack akrasia, they would still probably do more moral good than the average person does today.
Would they necessarily say that? Or might they instead say “I know you think the world would be a better place if I acted differently, but actually it seems to me the world is better if I do what I’m doing?”
That depends on whether they are using the term “better” to mean “morally better” or “more effectively satisfies the sum total of all my values, both moral and non-moral.”
If the person is fully rational, lacking in self-deception, is being totally honest with you, and you and they had both agreed ahead of time that the word “better” means “morally better,” then yes, I think they would say that. If they were lying, or they thought you were using the term “better” to mean “more effectively satisfies the sum total of all my values, both moral and non-moral,” then they might not.
I agree with you that IF (some of) my values are not moral and I choose to maximally implement my values, THEN I’m choosing not to act so as to make the world a morally better place, and IF I somehow knew that those values were not moral, THEN I would say as much if asked (supposing I was aware of my values and I was honest and so forth).
But on your account I still don’t see any way for me to ever know which of my values are moral and which ones aren’t, no matter how self-aware, rational, or lacking in self-deception I might be.
Also, even if I did somehow know that, and were honest and so forth, I don’t think I would say “I only care about the world to a limited extent.” By maximizing my values as implemented in the world, I would be increasing the value of the world, which is one way to express caring about the world. Rather, I would say “I only care about moral betterness to a limited extent; there are more valuable things for the world to be than morally better.”
How does a Pebblesorter know its piles are prime? The less intelligent and rational probably use some sort of vague intuition. The more intelligent and rational probably try dividing the number of pebbles by every number other than one and itself.
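In code, the “more rational” Pebblesorter’s check is just trial division, which is also enough to settle the P-Sam/P-George dispute from earlier:

    def is_p_right(heap_size):
        """Trial division: a heap is p-right (prime) if nothing from 2 up to heap_size - 1 divides it."""
        if heap_size < 2:
            return False
        return all(heap_size % divisor != 0 for divisor in range(2, heap_size))

    print(is_p_right(9))    # False: 9 = 3 * 3, so P-George was wrong
    print(is_p_right(16))   # False: 16 = 2 * 2 * 2 * 2, so P-Sam was wrong too
    print(is_p_right(7))    # True: a heap of 7 pebbles really is p-right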
If you had full knowledge of the concept of “morality” and all the various sub-concepts it included, you could translate that concept into a mathematical equation (the one I’ve been discussing with CC lately), and see if the various values of yours that you feed into it return positive numbers.
If your knowledge is more crude (i.e. if you’re a real person who actually exists), then a possible way to do it would be to divide the nebulous super-concept of “morality” into a series of more concrete and clearly defined sub-concepts that compose it (i.e. freedom, happiness, fairness, etc.). It might also be helpful to make a list of sub-concepts that are definitely not part of morality (possible candidates include malice, sadism, anhedonia, and xenophobia).
After doing that you could, if you are not self-deceived, use introspection to figure out what you value. If you find that your values include the various moral sub-concepts, then it seems like you value morality. If you find yourself not valuing the moral sub-concepts, or valuing some non-moral concept, then you do not value morality, or value non-moral things.
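Just to make the shape of the thing clearer, here is the sort of structure I have in mind when I say “equation”; the sub-concepts, weights, and world descriptions are all placeholders I made up, not the actual contents of morality:

    # Sketch of the "moral equation" as a weighted aggregate over measurable sub-concepts.
    # The sub-concepts, weights, and world descriptions are invented placeholders.
    moral_subconcepts = {"happiness": 1.0, "freedom": 1.0, "fairness": 1.0}
    non_moral_concepts = {"sadism": -1.0, "paperclip count": 0.0}

    def moral_score(world_state):
        """world_state maps concept names to how much of each the world contains."""
        weights = {**moral_subconcepts, **non_moral_concepts}
        return sum(weights.get(concept, 0.0) * amount for concept, amount in world_state.items())

    print(moral_score({"happiness": 5, "freedom": 4, "fairness": 3}))   # 12.0
    print(moral_score({"paperclip count": 1000000, "happiness": 1}))    # 1.0 -- the paperclips add nothing

The introspection step then amounts to asking which of these concepts actually show up, with positive weight, among the things you find yourself pursuing.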
As Eliezer puts it:
The moral equation we are looking for isn’t something that will provide us with a ghostly essence. It is something that will allow us to sum up and aggregate all the separate good things like truth, happiness, and sentient life, so that we can effectively determine how good various combinations of these things are relative to each other, and reach an optimal combo.
Do you want people to be happy, free, be treated fairly, etc? Then you value morality to some extent. Do you love torturing people just for the hell of it, or want to convert all the matter in the universe into paperclips? Then you, at the very least, definitely value other things than morality.
By “caring” I meant “caring about whether the world is a good and moral place.” If you instead use the word “caring” to mean “have values that assign different levels of desirability to various possible states that the world could be in” then you are indeed correct that you would not say you didn’t care about the world.
If by “valuable” you mean “has more of the things that I care about,” then yes, you could say that. Remember, however, that in that case what is “valuable” is subjective; it changes from person to person depending on their individual utility functions. What is “morally valuable,” by contrast, is objective. Anyone, regardless of their utility function, can agree on whether or not the world has great quantities of things like truth, freedom, happiness, and sentient life. What determines the moral character of a person is how much they value those particular things.
Also, as an aside, when I mentioned concepts that probably aren’t part of morality earlier, I did not mean to say that pursuit of those concepts always necessarily leads to immoral results. For instance, imagine a malicious sadist who wants to break someone’s knees. This person assaults someone else out of pure malice and breaks their knees. The injured person turns out to be an escaped serial killer who was about to kill again, and the police are able to apprehend them in their injured state. In this case the malicious person has done good. However, this is not because they have intentionally increased the amount of malicious torture in the universe. It is because they accidentally decreased the number of murders in the universe.
I 100% agree that there is no ghostly essence of goodness.
I agree that pursuing amoral, or even immoral, values can still lead to moral results. (And also vice-versa.)
I agree that if I somehow knew what was moral and what wasn’t, then I would have a basis for formally distinguishing my moral values from my non-moral values even when my intuitions failed. I could even, in principle, build an automated mechanism for judging things as moral or non-moral. (Similarly, if a Pebblesorter knew that primeness was what it valued and knew how to factor large numbers, it would have a basis for formally distinguishing piles it valued from piles it didn’t value even when its intuitive judgments failed, and it could build an automated mechanism for distinguishing such piles.)
I agree with you that real people who actually exist can’t do this, at least not in detail.
You suggest we can divide morality into subconcepts that comprise it (freedom, happiness, fairness, etc.) and that it excludes (anhedonia, etc.). What I continue to not get is, on your account, how I do that in such a way as to ensure that what I end up with is the objectively correct list of moral values, which on your account exists, rather than some different list of values.
That is, suppose Sam and George both go through this exercise, and one of them ends up with “freedom” on their list but not “cooperation”, and the other ends up with “cooperation” but not “freedom.” On your account it seems clear that at least one of them is wrong, because the correct list of moral values is objective.
So, OK… what would we expect to experience if Sam were right? How does that differ from what we would expect to experience if George were right, or if neither of them were?
Again: how do we know that? What would I expect to experience differently if, instead, happiness, freedom, fairness, etc. turned out not to be aspects of morality, just like maximizing paperclips does? What should I be looking for, to notice if this is true, or confirm that it isn’t? I would still want people to be happy, free, be treated fairly, etc. in either case, after all. What differences would I experience between the two cases?
Yes, that’s more or less what I mean by “caring”. More precisely I would say that caring about X consists of desiring states of the world with more X more than states of the world with less X, all else being equal, but that’s close enough to what you said.
Yes, that’s what I mean by “valuable.” And yes, absolutely, what is valuable changes from person to person. If I act to maximize my values and you act to maximize yours we might act in opposition (or we might not, depending, but it’s possible).
And I get that you want to say that if we both gave up maximizing our values and instead agreed to implement moral values, then we would be cooperating instead, and the world would be better (even if it turned out that both of us found it less valuable). What I’m asking you is how (even in principle) we could ever reach that point.
To say that a little differently: you value some things (Vg) and I value some things (Vd). Supposing we are both perfectly rational and honest and etc., we can both know what Vg and Vd are, and what events in the world would maximize each. We can agree to cooperate on maximizing the intersection of (Vg,Vd), and we can work out some pragmatic compromise about the non-overlapping stuff. So far so good; I see how we could in principle reach that point, even if in practice we aren’t rational or self-aware or honest enough to do it.
But I don’t see how we could ever say “There’s this other list, Vm, of moral values; let’s ignore Vg and Vd altogether and instead implement Vm!” because I don’t see how we could ever know what Vm was, even in principle. If we happened to agree on some list Vm, either by coincidence or due to social conditioning or for other reasons, we could agree to implement Vm… which might or might not make the world better, depending on whether Vm happened to be the objectively correct list of moral values. But I don’t see how we could ever, even in principle, confirm or deny this, or correct it if we somehow came to know we had the objectively wrong list.
And if we can’t know or confirm or deny or correct it, even in principle, then I don’t see what is added by discussing it. It seems to me I can just as usefully say, in this case, “I value happiness, freedom, fairness, etc. I will act to maximize those values, and I endorse acting this way,” and nothing is added by saying “Those values comprise morality” except that I’ve asserted a privileged social status for my values.
Well, I am basically asserting that morality is some sort of objective equation, or “abstract idealized dynamic,” as Eliezer calls it, concerned with people’s wellbeing. And I am further asserting that most human beings care very much about this concept. I think this would make the following predictions:
In a situation where a given group of humans had similar levels of empirical knowledge and a similar sanity waterline, there would be far more moral agreement among them than would be predicted by chance, and far less moral disagreement than is mentally possible.
It is physically possible to persuade people to change their moral values by reasoned argument.
Inhabitants of a society who are unusually rational and intelligent will be the first people in that society to make moral progress, as they will be better at extrapolating answers out of the “equation.”
If one attempted to convert the moral computations people make into an abstract, idealized process, and determine its results, many people would find those results at least somewhat persuasive, and might find their ethical views changed by observing them.
All of these predictions appear to be true:
Human societies tend to have a rather high level of moral agreement between their members. Conformity is not necessarily an indication of rightness; it seems fairly obvious that whole societies have held gravely mistaken moral views, such as those that believed slavery was good. However, it is interesting that all those people in those societies were mistaken in exactly the same way. That seems like evidence that they were all reasoning towards similar conclusions, and that the mistakes they made were caused by common environmental factors that impacted all of them. There are other theories that explain this data, of course (peer pressure, for instance), but I still find it striking.
I’ve had moral arguments made by other people change my mind, and changed the minds of other people by moral argument. I’m sure you have also had this experience.
It is well known that intellectuals tend to develop and adopt new moral theories before the general populace does. Common examples of intellectuals whose moral concepts have disseminated into the general populace include John Locke, Jeremy Bentham, and William Lloyd Garrison; their principles have since been adopted into the public consciousness.
Ethical theorists who have attempted to derive new ethical principles by working from an abstract, idealized form of ethics have often been very persuasive. To name just one example, Peter Singer ended up turning thousands of people into vegetarians with moral arguments that started on a fairly abstract level.
Asserting that those values comprise morality seems to be effective because it seems to most people that those values are related in some way, because they form the superconcept “morality.” Morality is a useful catchall term for certain types of values, and it would be a shame to lose it.
Still, I suppose that asserting “I value happiness, freedom, fairness, etc” is similar enough to saying “I care about morality” that I really can’t object terribly strongly if that’s what you’d prefer to do.
Why does doing that bother you? Presumably, because you care about the moral concept of fairness, and don’t want to claim an unfair level of status for yourself and your views. But does it really make sense to say “I care about fairness, but I want to be fair to other people who don’t care about it, so I’ll go ahead and let them treat people unfairly, in order to be fair”? That sounds silly, doesn’t it? It has the same problems that come with being tolerant of intolerant people.
All of those predictions seem equally likely to me whether Sam is right or George is, so they don’t really engage with my question at all. At this point, after several trips ’round the mulberry bush, I conclude that this is not because I’m being unclear with my question but rather because you’re choosing not to answer it, so I will stop trying to clarify the question further.
If I map your predictions and observations to the closest analogues that make any sense to me at all, I basically agree with them.
It is.
It doesn’t bother me; it’s a fine thing to do under some circumstances. If we can agree that that’s what we’re doing when we talk about “objective morality,” great. If not (which I find more likely), never mind.
As above, I don’t see what the word “moral” is adding to this sentence. But sure, unfairly claiming status bothers me to the extent that I care about fairness. (That said, I don’t think claiming status by describing my values as “moral” is unfair; pretty much everybody has an equal ability to do it, and indeed they do. I just think it confuses any honest attempt at understanding what’s really going on when we decide on what to do.)
It depends on why and how I value (“care about”) fairness.
If I value it instrumentally (which I do), then it makes perfect sense to say that being fair to people who treat others unfairly is net-valuable, although it might be true or false in any given situation depending on what is achieved by the various kinds of fairness that exist in tension in that situation.
Similarly, if I value it in proportion to how much of it there is (which I do), then it makes sense to say that, although it might be true or false depending on how much fairness is gained or lost by doing so.
(nods) Totally. And the ability to phrase ideas in silly-sounding ways is valuable for rhetorical purposes, although it isn’t worth much as an analytical tool.
I’m really sorry, I was trying to kill two birds with one stone by engaging both that question and your later question [“What would I expect to experience differently if, instead, happiness, freedom, fairness, etc. turned out not to be aspects of morality, just like maximizing paperclips does?”], and I ended up doing a crappy job of answering both of them. I’ll try to just answer the Sam and George question now.
I’ll start by examining the Pebblesorters P-George and P-Sam. P-George thinks 9 is p-right and 16 is p-wrong. P-Sam thinks 9 is p-wrong and 16 is p-right. They both think they are using the word “p-right” to refer to the same abstract, idealized process. What can they do to see which one is right?
They assume that most other Pebblesorters care about the same abstract process they do, so they can try to persuade them and see how successful they are. Of course, even if all the Pebblesorters agree with one of them, that doesn’t necessarily mean that one is p-correct; those sorters may be making the same mistake as P-George or P-Sam. But I think it’s non-zero Bayesian evidence of the p-rightness of their views.
They can try to control for environmentally caused error by seeing if they can also persuade Pebblesorters who live in different environments and cultures.
They can find the most rational and p-sane Pebblesorting societies and see if they have an easier time persuading them.
They can actually try to extrapolate what the abstract, idealized equation that the word “p-right” represents is, and compare it to their views. They can read up on Pebblesorter philosophers’ theories of p-rightness and see how well those correlate with their views. Pebblesorting is much simpler than morality, so we know that the abstract, idealized dynamic that the concept “p-right” represents is “primality.” So we know that P-Sam and P-George are both partly right and partly wrong: neither 9 nor 16 is prime.
Now let’s translate that into human.
If Sam were right and George wrong, we would expect the following:
He would have an easier time persuading non-sociopathic humans of the rightness of his views than George, because his views are closer to the results of the equation those people have in their head.
If he went around to different societies with different moral views and attempted to persuade the people there of his views he should, on average, also have an easier time of it than George, again because his views are closer to the results of the equation those people have in their head.
Societies with higher levels of sanity and rationality should be especially easily persuaded, because they are better at determining what the results of that equation would be.
If Sam compared his and George’s views to views generated by various attempts by philosophers to create an abstract, idealized version of the equation (i.e., moral theories), his view should be a better match to many of them, and to the results they generate, than George’s is.
The problem is that the concept of morality is far more complex than the concept of primality, so finding the right abstract idealized equation is harder for humans than it is for Pebblesorters. We haven’t managed to do it yet. But I think that by comparing Sam’s and George’s views to the best approximations we have so far (various forms of consequentialism, in my view), we can get some Bayesian evidence of the rightness of their views.
If George is right, he will achieve these results instead of Sam. If they are both wrong, they will both fail at doing these things.
Sorry, I was probably being unclear about what I meant because I was trying to sound clever. When I said it was silly to be fair to unfair people, what I meant was that you should not regard their advice on how best to treat other people with the same consideration you’d give to a fair-minded person’s advice.
For instance, you wouldn’t say “I think it’s wrong to enslave black people, but that guy over there thinks it’s right, so let’s compromise and believe it’s okay to enslave them 50% of the time.” I suppose you might pretend to believe that if the other guy had a gun and you didn’t, but you wouldn’t let his beliefs affect yours.
I did not mean that, for example, if you, two fair-minded people, and one unfair-minded person are lost in the woods and find a pie, that you shouldn’t give the unfair-minded person a quarter of the pie to eat. That is an instance where it does make sense to treat unfair people fairly.
OK. Thanks for engaging with the question; that was very helpful. I now have a much better understanding of what you believe the differences-in-practice between moral and non-moral values are.
Just to echo back what I’m hearing you say: to the extent that some set of values Vm is easier to convince humans to adopt than other sets of values, easier to convince sane, rational societies to adopt than less sane, less rational ones, and a better approximation of the moral theories created by philosophers than other sets of values, to that extent we can be confident that Vm is the set of values that comprises morality.
Did I get that right?
Regarding fairmindedness: I endorse giving someone’s advice consideration to the extent that I’m confident that considering their advice will implement my values. And, sure, it’s unlikely that the advice of an unfairminded person would, if considered, implement the value of fairness.
Yes, all those things provide small bits of Bayesian evidence that Vm is closer to morality than some other set of values.
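To illustrate what I mean by combining small bits of evidence: here is a toy calculation in which each observation (Vm is easier to persuade people of, saner societies adopt it more readily, it matches philosophers’ theories better, and so on) contributes a small likelihood ratio. The specific numbers are invented purely for illustration; the point is only that weak evidence adds up in log-odds:

```python
import math

prior_odds = 1.0                            # 1:1 before looking at any evidence
likelihood_ratios = [1.5, 1.3, 1.2, 1.4]    # invented strengths for each weak observation

log_odds = math.log(prior_odds) + sum(math.log(r) for r in likelihood_ratios)
posterior = 1 / (1 + math.exp(-log_odds))   # convert log-odds back to a probability
print(f"posterior probability that Vm is closer to morality: {posterior:.2f}")
```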
It seems very likely that a person who cares a certain amount about morality, and a certain amount about money, would be willing to compromise his morality if given sufficient money. Such a mental model would form the basis of bribery. (It doesn’t have to be money, either, but the principle remains the same).
So a semi-sociopathic person would be anyone who could be bribed into completely disregarding morality.
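In toy form, that model is just a weighted sum of a “morality” term and a “money” term; the weights and amounts below are made up, and the only point is that any finite moral weight implies there is some bribe large enough to flip the decision:

```python
def utility(moral_value: float, money: float,
            w_moral: float = 10.0, w_money: float = 1.0) -> float:
    """Hypothetical agent that weighs morality against money."""
    return w_moral * moral_value + w_money * money

honest = utility(moral_value=1.0, money=0.0)   # refuse the bribe, keep the moral value
bribed = utility(moral_value=0.0, money=25.0)  # take the bribe, abandon the moral value
print("takes the bribe:", bribed > honest)     # True once the bribe exceeds 10 units
```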
On this account, we could presumably also categorize a semi-semi-sociopathic person as one who could be bribed into partially disregarding the thing we’re labeling “morality”. And of course bribes needn’t be money… people can be bribed by all kinds of things. Social status. Sex. Pleasant experiences. The promise of any or all of those things in the future.
Which is to say, we could categorize a semi-semi-sociopath as someone who cares about some stuff, and makes choices consistent with maximizing the stuff they care about, where some of that stuff is what we’re labeling “morality” and some of it isn’t.
We could also replace the term “semi-semi-sociopath” with the easier to pronounce and roughly equivalent term “person”.
It’s also worth noting that there probably exists stuff that we would label “morality” in one context and “bribe” in another, were we inclined to use such labels.
Ah, I think I see your point. What you’re saying—and correct me if I’m wrong—is that there is some objective True Morality, some complex equation that, if applied to any possible situation, will tell you how moral a given act is.
This is probably true.
This equation isn’t written into the human psyche; it exists independently of what people think about morality. It just is. And even if we don’t know exactly what the equation is, even if we can’t work out the morality of a given act down to the tenth decimal place, we can still apply basic heuristics and arrive at a usable estimate in most situations.
My question is, then—assuming the above is true, how do we find that equation? Does there exist some objective method whereby you, I, a Pebblesorter, and a Paperclipper can all independently arrive at the same definition for what is moral (given that the Pebblesorter and Paperclipper will almost certainly promptly ignore the result)?
(I had thought that you were proposing that we find that equation by summing across the moral values and imperatives of humanity as a whole—excluding the psychopaths. This is why I asked about the exclusion, because it sounded a lot like writing down what you wanted at the end of the page and then going back and discarding the steps that wouldn’t lead there; that is also why I asked about the aliens).
Yes, I think we’re in agreement on that. (Though this does suggest that ‘sentient’ may need a proper definition at some point).
In the same way as there exists a True Set of Prime Numbers, and a True Measure of How Many Paperclips There Are...
Even though the equation exists independently of our thoughts (the same way primality exists independently of Pebblesorter thoughts), the fact that we are capable of caring about the results given by the equation means we must have some parts of it “written” in our heads, the same way Pebblesorters must have some concept of primality “written” in their heads. Otherwise, how would we be capable of caring about its results?
I think that evolution probably, metaphorically speaking, “wrote” a desire to care about the equation into our heads, because humans who care about what is good and right find it easier to cooperate and trust each other, which has obvious fitness advantages. Of course, the fact that evolution did a good thing by causing us to care about morality doesn’t mean that evolution is always good, or that evolutionary fitness is a moral justification for anything. Evolution is an amoral force that causes many horrible things to happen. It just happened that in this particular instance, evolution’s amoral metaphorical “desires” coincided with what was morally good. That coincidence is far from the norm; in fact, evolution probably deleted morality from the brains of sociopaths because double-crossing morally good people also sometimes confers a fitness advantage.
So how do we learn more about this moral equation that we care about? One common form of attempting to get approximations of it in philosophy is called reflective equilibrium, where you take your moral imperatives and heuristics and attempt to find the commonalities and consistencies they have with each other. It’s far from perfect, but I think that this method has produced useful results in the past.
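As a crude caricature of that procedure (real reflective equilibrium revises principles as well as particular judgments, and the “judgments” and “conflicts” below are abstract placeholders rather than anyone’s actual moral views), the shape of it looks something like this: repeatedly drop whichever judgment conflicts with the most others until the remaining set is consistent.

```python
judgments = {"A", "B", "C", "D"}
conflicts = {("A", "D"), ("B", "D"), ("C", "D")}   # hypothetical tensions between judgments

def conflict_count(j, remaining):
    """How many live conflicts does judgment j participate in?"""
    return sum(1 for x, y in conflicts
               if j in (x, y) and x in remaining and y in remaining)

while any(x in judgments and y in judgments for x, y in conflicts):
    worst = max(judgments, key=lambda j: conflict_count(j, judgments))
    judgments.discard(worst)

print("coherent subset:", sorted(judgments))       # here: A, B, and C survive
```

Needless to say, nothing this simple captures the real method; it is only meant to show why the result depends so heavily on which judgments you start with and which ones you treat as revisable.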
Eliezer has proposed what is essentially a souped up version of reflective equilibrium called Coherent Extrapolated Volition. He has argued, however, that the primary use of CEV is in designing AIs that won’t want to kill us, and that attempting to extrapolate other people’s volition is open to corruption, as we could easily fall to the temptation to extrapolate it to something that personally benefits us.
Again, we could probably get closer through reflective equilibrium, and by critiquing the methods and results of each other’s reflections. If you somehow managed to get a Pebblesorter or a Paperclipper to do it too, they might generate similar results, although since they don’t intrinsically care about the equation you would probably have to give them some basic instructions before they started working on the problem.
If we assume that most humans care about acting morally, doing research about what people’s moral imperatives are might be somewhat helpful, since it would allow us to harvest the fruits of other people’s moral reflections and compare them with our own. We can exclude sociopaths because there is ample evidence that they care nothing for morality.
Although I suppose that a super-genius sociopath who had the basic concept explained to them might be able to do some useful work, in the same fashion that a Pebblesorter or Paperclipper might. Of course, the genius sociopath wouldn’t care about the results, and would probably have to be paid a large sum even to agree to work on the problem.
Hmmm. That which evolution has “written” into the human psyche could, in theory, and given sufficient research, be read out again (and will almost certainly not be constant across most of humanity, but will rather exist with variations). But I doubt that morality is all in our genetic nature; I suspect that most of it is learned from our parents, aunts, uncles, grandparents, and other older relatives. I think, in short, that morality is memetic rather than genetic, though evolution still happens in memetic systems just as well as in genetic ones.
Hmmm. Looking at the Wikipedia article, I can expect reflective equilibrium to produce a consistent moral framework. I also expect a correct moral framework to be consistent; but not all consistent moral frameworks are correct. (A Paperclipper does not have what I’d consider a correct moral framework, but it does have a consistent one.)
If you start out close to a correct moral framework, then reflective equilibrium can move you closer, but it doesn’t necessarily do so.
Hmmm. The primary use of trying to find the True Morality Equation, to my mind, is to work it into a future AI. If we can find such an equation, prove it correct, and make an AI that maximises its output value, then that would be an optimally moral AI. This may or may not count as Friendly, but it’s certainly a potential contender for the title of Friendly.
Carrying through this method to completion could give us—or anyone else—an equation. But is there any way to be sure that it necessarily gives us the correct equation? (A pebblesorter may actually be a very good help in resolving this question; he does not care about morality, and therefore does not have any emotional investment in the research).
The first thought that comes to my mind is to have a very large group of researchers, divide them into N groups, and have each of these groups attempt, independently, to find an equation; if all of the groups find the same equation, this would be evidence that the equation found is correct (with stronger evidence at larger values of N). However, I anticipate that the results would actually be N subtly different, but similar, equations.
That’s possible. But memetics can’t build morality out of nothing. At the very least, evolved genetics has to provide a “foundation,” a part of the brain that moral memes can latch onto. Sociopaths lack that foundation, although the research is inconclusive as to what extent this is caused by genetics and to what extent by later developmental factors (it appears to be a mix of some sort).
Yes, that’s why I consider reflective equilibrium to be far from perfect. Depending on how many errors you latch onto, it might worsen your moral state.
Considering how morally messed up the world is now, even an imperfect equation would likely be better (closer to being correct) than our current slapdash moral heuristics. At this point we haven’t even achieved “good enough,” so I don’t think we should worry too much about being “perfect.”
That’s not inconceivable. But I think that each of the subtly different equations would likely be morally better than pretty much every approximation we currently have.
That sounds plausible, yes.
Hmmm. Finding an approximation to the equation will probably be easier than step two: encouraging people worldwide to accept the approximation. (Especially since many people who do accept it will then promptly begin looking for loopholes, either to use them or to patch them.)
However, if the correct equation cannot be found, then this means that the Morality Maximiser AI cannot be designed.
That’s true; what I was trying to say is that a world ruled by a 99.99% Approximation of Morality Maximizer AI might well be far, far better than our current one, even if it is imperfect.
Of course, it might be a problem if we put the 99.99% Approximation of Morality Maximizer AI in power, then find the correct equation, only to discover that the 99AMMAI is unwilling to step down in favor of the Morality Maximizer AI. On the other hand, putting the 99AMMAI in power might be the only way to ensure that a Paperclipper doesn’t ascend to power before we find the correct equation and design the MMAI. I’m not sure whether we should risk it or not.