Correct. I’m a moral cognitivist; “should” statements have truth-conditions. It’s just that very few possible minds care whether should-statements are true or not; most possible minds care about whether alien statements (like “leads-to-maximum-paperclips”) are true or not. They would agree with us on what should be done; they just wouldn’t care, because they aren’t built to do what they should. They would similarly agree with us that their morals are pointless, but would be concerned with whether their morals are justified-by-paperclip-production, not whether their morals are pointless. And under ordinary circumstances, of course, they would never formulate—let alone bother to compute—the function we name “should” (or the closely related functions “justifiable” or “arbitrary”).
Yes, we’re disputing definitions, which tends to become pointless. However, we can’t seem to get away from using the word “should”, so we might as well get it pinned down to something we can agree upon.
Am I right in interpreting the bulk of the thread following this comment (excepting perhaps the FAI derail) as a dispute on the definition of “should”?
I think you are right.
The dispute also serves as a signal of what some parts of the disputants’ personal moralities probably include. This is fitting with the practical purpose that the concept ‘should’ has in general. Given what Eliezer has chosen as his mission, this kind of signalling is a matter of life and death in the same way that it would have been in our environment of evolutionary adaptation. That is, if people had sufficient objection to Eliezer’s values they would kill him rather than let him complete an AI.
The other way around is also of some concern:
An intelligent machine might make one of its first acts the assassination of other machine intelligence researchers—unless it is explicitly told not to do that. I figure we are going to want machines that will obey the law. That should be part of any sensible machine morality proposal.
As you can see, RobinZ, I’m trying to cure a particular kind of confusion here. The way people deploy their mental categories has consequences. The problem here is that “should” is already bound fairly tightly to certain concepts, no matter what sort of verbal definitions people think they’re deploying, and if they expand the verbal label beyond that, it has consequences for e.g. how they think aliens and AIs will work, and consequences for how they emotionally experience their own moralities.
It is odd that you apparently think you are using the conventional definition of “should”—when you have several people telling you that your use of “should” and “ought” is counter-intuitive.
Most people are familiar with the idea that there are different human cultures, with somewhat different notions of right and wrong—and that “should” is often used in the context of the local moral climate.
For example:
If the owner of the restaurant serves you himself, you should still tip him;
You should not put your elbows on the table while you are eating;
Women should curtsey—“a little bob is quite sufficient”.
It is odd that you apparently think you are using the conventional definition of “should”—when you have several people telling you that your use of “should” and “ought” is counter-intuitive.
To be fair, there are several quite distinct ways in which ‘should’ is typically used. Eliezer’s usage is one of them. It is used more or less universally by children and tends, as people mature, to be supplanted or supplemented by the ‘local context’ definition you mention and/or the ‘best action for the agent given his preferences’ definition. In Eliezer’s case he seems to have instead evolved and philosophically refined the child version. (I hasten to add that I mean only that he matured his moral outlook in ways other than by changing his usage of those particular words in the most common manner.)
I can understand such usage. However, we have things like: “I’m trying to cure a particular kind of confusion here”. The confusion he is apparently talking about is the conventional view of “ought” and “should”—and it doesn’t need “curing”.
In fact, it helps us to understand the moral customs of other cultures—rather than labeling them as being full of “bad” heathens—who need to be brought into the light.
My use is not counterintuitive. The fact that it is the intuitive use—that only humans ever think of what they should do in the ordinary sense, while aliens do what is babyeating; that looking at a paperclipper’s actions conveys no more information about what we should do than looking at evolution or a rockslide—is counterintuitive.
If you tell me that “should” has a usage which is unrelated to “right”, “good”, and “ought”, then that usage could be adapted for aliens.
If you tell me that “should” has a usage which is unrelated to “right”, “good”, and “ought”, then that usage could be adapted for aliens.
One of the standard usages is “doing this will most enhance your utility”. As in “you should kill that motherf@#$%”. This is distinct from ‘right’ and ‘good’ although ‘ought’ is used in the same way, albeit less frequently. It is advice, rather than exhortation.
Indeed. “The Pebblesorters should avoid making piles of 1,001 stones” makes perfect sense.
“Should” and “ought” actually have strong connotations of societal morality.
Should you rob the bank? Should you have sex with the minor? Should you confess to the crime?
Your personal utility is one thing—but “should” and “ought” often have more to do with what society thinks of your actions.
Probably not.
Probably not here.
Hell no. “The Fifth” is the only significant law-item that I’m explicitly familiar with. And I’m not even American.
More often what you want society to think of people’s actions (either as a signal or as persuasion. I wonder which category my answers above fit into?).
It’s counterintuitive to me—and I’m not the only one—if you look at the other comments here.
Aliens could have the “right”, “good”, “ought” and “should” concept cluster—just as some other social animals can, or other tribes, or humans at other times.
Basically, there are a whole bunch of possible and actual moral frameworks—and these words normally operate relative to the framework under consideration.
There are some people who think that “right” and “wrong” have some kind of universal moral meaning. However most of those people are religious, and think morality comes straight from god—or some such nonsense.
They have a universal meaning. They are fixed concepts. If you are talking about a different concept, you should use a different word.
Not how natural language works.
Do you mean it would be right and good for him to use a different word, or that it would be more effective communication if he did so?
To clarify, people agree that the moral “right” and “wrong” categories contain things that are moral and immoral respectively—but they disagree with each other about which actions are moral and which are immoral.
For example, some people think abortion is immoral. Other people think eating meat is immoral. Other people think homosexual union is immoral—and so on.
These opinions are not widely agreed upon—yet many of those who hold them defend them passionately.
Different people seem to find different parts of this counterintuitive.
And some people simply disagree with you. Some people say, for example, that ‘they don’t have a universal meaning’. They assert that ‘should’ claims are not claims that have truth value, and allow that the value depends on the person speaking. They find this quite intuitive and even go so far as to create words such as ‘normative’ and ‘subjective’ to describe these concepts when talking to each other.
It is not likely that aliens, for example, have the concept ‘should’ at all and so it is likely that other words will be needed. The Babyeaters, as described, seem to be using a concept sufficiently similar as to be within the variability of use within humans. ‘Should’ and ‘good’ would not be particularly poor translations. About the same as using, say, ‘tribe’ or ‘herd’ for example.
Okay, then these are the people I’m arguing against, as a view of morality. I’m arguing that, say, dragging 6-year-olds off the train tracks, as opposed to eating them for lunch, is every bit as much uniquely the right answer as it looks; and that the Space Cannibals are every bit as awful as they look; and that the aliens do not have a different view of the subject, but simply a view of a different subject.
As an intuition pump, it might help to imagine someone saying that “truth” has different values in different places and that we want to parameterize it by true_1 and true_2. If Islam has a sufficiently different criterion for using the word “true”, i.e. “recorded in the Koran”, then we just want to say “recorded in the Koran”, not use the word “true”.
Another way of looking at it is that if we are not allowed to use the word “right” or any of its synonyms, at all, a la Empty Labels and Rationalist Taboo and Replace the Symbol with the Substance, then the new language that we are forced to use will no longer create the illusion that we and the aliens are talking about the same thing. (Like forcing people from two different spiritual traditions to say what they think exists without using the word “God”, thus eliminating the illusion of agreement.) And once you realize that we and the aliens are not talking about the same thing at all, and have no disagreement over the same subject, you are no longer tempted to try to relativize morality.
It’s all very well to tell me that I should stop arguing over definitions, but I seem to be at a loss to make people understand what I am trying to say here. You are, of course, welcome to tell me that this is my fault; but it is somewhat disconcerting to find everyone saying that they agree with me, while continuing to disagree with each other.
I disagree with you about what “should” means, and I’m not even a Space Cannibal. Or do I? Are you committed to saying that I, too, am talking past you if I type “should” to sincerely refer to things?
Are you basically declaring yourself impossible to disagree with?
Do you think we’re asking sufficiently different questions such that they would be expected to have different answers in the first place? How could you know?
Humans, especially humans from an Enlightenment tradition, I presume by default to be talking about the same thing as me—we share a lot of motivations and might share even more in the limit of perfect knowledge and perfect reflection. So when we appear to disagree, I assume by default and as a matter of courtesy that we are disagreeing about the answer to the same question or to questions sufficiently similar that they could normally be expected to have almost the same answer. And so we argue, and try to share thoughts.
With aliens, there might be some overlap—or might not; a starfish is pretty different from a mammal, and that’s just on Earth. With paperclip maximizers, they are simply not asking our question or anything like that question. And so there is no point in arguing, for there is no disagreement to argue about. It would be like arguing with natural selection. Evolution does not work like you do, and it does not choose actions the way you do, and it was not disagreeing with you about anything when it sentenced you to die of old age. It’s not that evolution is a less authoritative source, but that it is not saying anything at all about the morality of aging. Consider how many bioconservatives cannot understand the last sentence; it may help convey why this point is both metaethically important and intuitively difficult.
Do you think we’re asking sufficiently different questions such that they would be expected to have different answers in the first place? How could you know?
I really do not know. Our disagreements on ethics are definitely nontrivial—the structure of consequentialism inspires you to look at a completely different set of sub-questions than the ones I’d use to determine the nature of morality. That might mean that (at least) one of us is taking the wrong tack on a shared question, or that we’re asking different basic questions. We will arrive at superficially similar answers much of the time because “appeal to intuition” is considered a legitimate move in ethics and we have some similar intuitions about the kinds of answers we want to arrive at.
I think you are right that paperclip maximizers would not care at all about ethics. Babyeaters, though, seem like they do, and it’s not even completely obvious to me that the gulf between me and a babyeater (in methodology, not in result) is larger than the gulf between me and you. It looks to me a bit like you and I get to different parts of city A via bicycle and dirigible respectively, and then the babyeaters get to city B via kayak—yes, we humans have more similar destinations to each other than to the Space Cannibals, but the kind of journey undertaken seems at least as significant, and trying to compare a bike and a blimp and a boat is not a task obviously approachable.
Do you also find it suspicious that we could both arrive in the same city using different vehicles? Or that the answer to “how many socks is Alicorn wearing?” and the answer to “what is 6 − 4?” are the same? Or that one could correctly answer “yes” to the question “is there cheese in the fridge?” and the question “is it 4:30?” without meaning to use a completely different, non-yes word in either case?
Do you also find it suspicious that we could both arrive in the same city using different vehicles?
Not at all, if we started out by wanting to arrive in the same city.
And not at all, if I selected you as a point of comparison by looking around the city I was in at the time.
Otherwise, yes, very suspicious. Usually, when two randomly selected people in Earth’s population get into a car and drive somewhere, they arrive in different cities.
Or that the answer to “how many socks is Alicorn wearing?” and the answer to “what is 6 − 4?” are the same?
No, because you selected those two questions to have the same answer.
Or that one could correctly answer “yes” to the question “is there cheese in the fridge?” and the question “is it 4:30?” without meaning to use a completely different, non-yes word in either case?
Yes-or-no questions have a very small answer space so even if you hadn’t selected them to correlate, it would only be 1 bit of coincidence.
The examples in the grandparent do seem to miss the point that Alicorn was originally describing.
I find it a suspicious coincidence that we should arrive at similar answers by asking dissimilar questions.
It is still surprising, but somewhat less so if our question answering is about finding descriptions for our hardwired intuitions. In that case people with similar personalities can be expected to formulate question-answer pairs that differ mainly in their respective areas of awkwardness as descriptions of the territory.
Not at all, if we started out by wanting to arrive in the same city.
And we did exactly that (metaphorically speaking). I said:
We will arrive at superficially similar answers much of the time because “appeal to intuition” is considered a legitimate move in ethics and we have some similar intuitions about the kinds of answers we want to arrive at.
It seems to me that you and I ask dissimilar questions and arrive at superficially similar answers. (I say “superficially similar” because I consider the “because” clause in an ethical statement to be important—if you think you should pull the six-year-old off the train tracks because that maximizes your utility function and I think you should do it because the six-year-old is entitled to your protection on account of being a person, those are different answers, even if the six-year-old lives either way.) The babyeaters get more non-matching results in the “does the six-year-old live” department, but their questions—just about as important in comparing theories—are not (it seems to me) so much more different than yours and mine.
Everybody, in seeking a principled ethical theory, has to bite some bullets (or go on an endless Easter-epicycle hunt).
To me, this doesn’t seem like superficial similarity at all. I should sooner call the differences of verbal “because” superficial, and focus on that which actually produces the answer.
I think you should do it because the six-year-old is valuable and precious and irreplaceable, and if I had a utility function it would describe that. I’m not sure how this differs from what you’re doing, but I think it differs from what you think I’m doing.
I think you are right that paperclip maximizers would not care at all about ethics.
Correct. But neither would they ‘care’ about paperclips, under the way Eliezer’s pushing this idea. They would flarb about paperclips, and caring would be as alien to them as flarbing is to you.
Babyeaters, though, seem like they do, and it’s not even completely obvious to me that the gulf between me and a babyeater (in methodology, not in result) is larger than the gulf between me and you.
It’s all very well to tell me that I should stop arguing over definitions
You are arguing over definitions but it is useful. You make many posts that rely on these concepts so the definitions are relevant. That ‘you are just arguing semantics’ call is sometimes an irritating cached response.
but I seem to be at a loss to make people understand what I am trying to say here. You are, of course, welcome to tell me that this is my fault; but
You are making more than one claim here. The different-concept-alien stuff you have explained quite clearly (e.g. from the first semicolon onwards in the parent). This seems to be obviously true. The part before the semicolon is a different concept (probably two). Your posts have not given me the impression that you consider the truth issue distinct from the normativity issue and from subjectivity. You also included ‘objective morality’ in with ‘true’, ‘transcendental’ and ‘ultimate’ as things that have no meaning. I believe you are confused and that your choice of definition for ‘should’ contributes to this.
it is somewhat disconcerting to find everyone saying that they agree with me
I say I disagree with a significant part of your position, although not the most important part.
, while continuing to disagree with each other.
I definitely disagree with Tim. I may agree with some of the others.
I agree with the claim you imply with the intuition pump. I disagree with the claim you imply when you are talking about ‘uniquely the right answer’. Your intuition pump does not describe the same concept that your description does.
; and that the aliens do not have a different view of the subject, but simply a view of a different subject.
This part does match the intuition pump, but you are consistently conflating this concept with another in your posts in this thread (see the ‘uniquely the right answer’ truth-value treatment of the 6-year-old). You are confused.
The fact that it is the intuitive use—that only humans ever think of what they should do in the ordinary sense, while aliens do what is babyeating; that looking at a paperclipper’s actions conveys no more information about what we should do than looking at evolution or a rockslide—is counterintuitive.
It is the claims along the lines of ‘truth value’ that are most counterintuitive. The universality that you attribute to ‘Right’ also requires some translation.
The problem here is that “should” is already bound fairly tightly to certain concepts, no matter what sort of verbal definitions people think they’re deploying, and if they expand the verbal label beyond that, it has consequences for e.g. how they think aliens and AIs will work, and consequences for how they emotionally experience their own moralities.
I see, and that is an excellent point. Daniel Dennett has taken a similar attitude towards qualia, if I interpret you correctly—he argues that the idea of qualia is so inextricably bound with its standard properties (his list goes ineffable, intrinsic, private, and directly or immediately apprehensible by the consciousness) that to describe a phenomenon lacking those properties by that term is as wrongheaded as using the term elan vital to refer to DNA.
An intelligent machine might make one of its first acts the assassination of other machine intelligence researchers—unless it is explicitly told not to do that. I figure we are going to want machines that will obey the law. That should be part of any sensible machine morality proposal.
I absolutely do not want my FAI to be constrained by the law. If the FAI allows machine intelligence researchers to create an uFAI we will all die. An AI that values the law above the existence of me and my species is evil, not Friendly. I wouldn’t want the FAI to kill such researchers unless it was unable to find a more appealing way to ensure future safety but I wouldn’t dream of constraining it to either laws or politics. But come to think of it I don’t want it to be sensible either.
The Three Laws of Robotics may be a naive conception but that Zeroth law was a step in the right direction.
Re: If the FAI allows machine intelligence researchers to create an uFAI we will all die
Yes, that’s probably just the kind of paranoid delusional thinking that a psychopathic superintelligence with no respect for the law would use to justify its murder of academic researchers.
Hopefully, we won’t let it get that far. Constructing an autonomous tool that will kill people is conspiracy to murder—so hopefully the legal system will allow us to lock up researchers who lack respect for the law before they do some real damage.
Assassinating your competitors is not an acceptable business practice.
Hopefully, the researchers will learn the error of their ways before then. The first big and successful machine intelligence project may well be a collaboration. Help build my tool, or be killed by it—is a rather aggressive proposition—and I expect most researchers will reject it, and expend their energies elsewhere—hopefully on more law-abiding projects.
Yes, that’s probably just the kind of paranoid delusional thinking that a psychopathic superintelligence with no respect for the law would use to justify its murder of academic researchers.
You seem confused (or, perhaps, hysterical). A psychopathic superintelligence would have no need to justify anything it does to anyone.
By including ‘delusional’ you appear to be claiming that an unfriendly super-intelligence would not likely cause the extinction of humanity. Was that your intent? If so, why do you suggest that the first actions of a FAI would be to kill AI researchers? Do you believe that a superintelligence will disagree with you about whether uFAI is a threat and that it will be wrong while you are right? That is a bizarre prediction.
and I expect most researchers will reject it, and expend their energies elsewhere—hopefully on more law-abiding projects.
You seem to have a lot of faith in the law. I find this odd. Has it escaped your notice that a GAI is not constrained by country borders? I’m afraid most of the universe, even most of the planet, is out of your jurisdiction.
A powerful corporate agent not bound by the law might well choose to assassinate its potential competitors—if it thought it could get away with it. Its competitors are likely to be among those best placed to prevent it from meeting its goals.
Its competitors don’t have to want to destroy all humankind for it to want to eliminate them! The tiniest divergence between its goals and theirs could potentially be enough.
It is a misconception to think of the law as a set of rules. Even more so to understand it as a set of rules that apply to non-humans today. In addition, rules won’t be very effective constraints on superintelligences.
“Should” means “is such as to fulfill the desires in question.” For example, “If you want to avoid being identified while robbing a convenience store, you should wear a mask.”
In the context of morality, the desires in question are all desires that exist. “You shouldn’t rob convenience stores” means, roughly, “People in general have many and strong reasons to ensure that individuals don’t want to rob convenience stores.”
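As a minimal sketch of this reading, one could treat “should” as a function of an action and the desires in question; the names best_action and fulfillment below are illustrative assumptions, not anything from the comment:

```python
# A rough sketch only, not the commenter's exact theory: "you should do X" is read as
# "X, among the available actions, best fulfills the desires in question".
# `fulfillment(action, desire)` is a hypothetical score for how well `action` satisfies `desire`.

def best_action(actions, desires_in_question, fulfillment):
    return max(actions, key=lambda a: sum(fulfillment(a, d) for d in desires_in_question))

# Prudential sense: pass in only the agent's own desires.
# Moral sense (as in the comment above): pass in all desires that exist.
desires = ["keep my money safe", "feel safe in shops"]
score = lambda action, desire: 1.0 if action == "don't rob the store" else 0.0
print(best_action(["rob the store", "don't rob the store"], desires, score))
# -> "don't rob the store"
```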
I’m a moral cognitivist too but I’m becoming quite puzzled as to what truth-conditions you think “should” statements have. Maybe it would help if you said which of these you think are true statements.
1) Eliezer Yudkowsky should not kill babies.
2) Babyeating aliens should not kill babies.
3) Sharks should not kill babies.
4) Volcanoes should not kill babies.
5) Should not kill babies. (sic)
The meaning of “should not” in 2 through 5 are intended to be the same as the common usage of the words in 1.
Technically, you would need to include a caveat in all of those like, “unless to do so would advance paperclip production” but I assume that’s what you meant.
The meaning of “should not” in 2 through 5 are intended to be the same as the common usage of the words in 1.
I don’t think there is one common usage of the word “should”.
(ETA: I asked the nearest three people if “volcanoes shouldn’t kill people” is true, false, or neither, assuming that “people shouldn’t kill people” is true or false so moral non-realism wasn’t an issue. One said true, two said neither.)
Here’s my guess at one type of situation Eliezer might be thinking of when calling proposition B false: It is rational (let us stipulate) for a paperclip maximizer to turn all the matter in the solar system into computronium in order to compute ways to maximize paperclips, but “should” does not apply to paperclip maximizers.
EDIT: If I were picking nits, I would say, “‘Should’ does apply to paperclip maximizers—it is rational for X to make paperclips but it should not do so—however, paperclip maximizers don’t care and so it is pointless to talk about what they should do.” But the overall intent of the statement is correct—I disagree with its intent in neither anticipation nor morals—and in such cases I usually just say “Correct”. In this case I suppose that wasn’t the best policy, but it is my usual policy.
There is a function Should(human) (or Should(Eliezer)) which computes the human consensus (or Eliezer’s opinion) on what the morally correct course of action is.
And some alien beings have their own Should function, which would be, in form if not in content, similar to our own. So a paperclip maximiser doesn’t get a Should, as it simply follows a “figure out how to maximise paperclips—then do it” format. However, a complex alien society that has many values and feels it must kill everyone else for the artistic cohesion of the universe, but often fails to act on this feeling because of akrasia, will get a Should(Krikkit) function.
However, until such time as we meet this alien civilization, we should just use Should as a shorthand for Should(human).
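A toy sketch of this parameterized Should, assuming a made-up interface in which each value system supplies its own endorsement ranking (Should, HumanValues, and the option dictionaries are all illustrative, not from the thread):

```python
# Toy illustration of Should(X): the same selection procedure applied to X's values.
# A pure paperclip maximizer is left out on purpose; per the comment above it just
# runs "figure out how to maximise paperclips, then do it" and gets no Should.

def Should(value_system, options):
    """Return the option that value_system most endorses (hypothetical interface)."""
    return max(options, key=value_system.endorsement)

class HumanValues:
    def endorsement(self, option):
        # Stand-in for the complex, many-valued human evaluation.
        return option.get("human_value", 0)

options = [{"name": "drag the child off the tracks", "human_value": 10},
           {"name": "eat the child for lunch", "human_value": -100}]
print(Should(HumanValues(), options)["name"])  # -> "drag the child off the tracks"
```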
There could be a word defined that way, but for purposes of staying unconfused about morality, I prefer to use “would-want” so that “should” is reserved specifically for things that, you know, actually ought to be done.
Fair enough. But are you saying that there is an objective standard of ought, or do you just mean a shared subjective standard? Or maybe a single subjective standard?
The word “ought” means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values. There’s just nothing which says that other minds necessarily care about them. It is also possible that different humans care about different things, but there’s enough overlap that it makes sense (I believe, Greene does not) to use words like “ought” in daily communication.
What would the universe look like if there were such a thing as an “objective standard”? If you can’t tell me what the universe looks like in this case, then the statement “there is an objective morality” is not false—it’s not that there’s a closet which is supposed to contain an objective morality, and we looked inside it, and the closet is empty—but rather the statement fails to have a truth-condition. Sort of like opening a suitcase that actually does contain a million dollars, and you say “But I want an objective million dollars”, and you can’t say what the universe would look like if the million dollars were objective or not.
I should write a post at some point about how we should learn to be content with happiness instead of “true happiness”, truth instead of “ultimate truth”, purpose instead of “transcendental purpose”, and morality instead of “objective morality”. It’s not that we can’t obtain these other things and so must be satisfied with what we have, but rather that tacking on an impressive adjective results in an impressive phrase that fails to mean anything. It is not that there is no ultimate truth, but rather, that there is no closet which might contain or fail to contain “ultimate truth”, it’s just the word “truth” with the sonorous-sounding adjective “ultimate” tacked on in front. Truth is all there is or coherently could be.
I should write a post at some point about how we should learn to be content with happiness instead of “true happiness”, truth instead of “ultimate truth”, purpose instead of “transcendental purpose”, and morality instead of “objective morality”.
When you put those together like that it occurs to me that they all share the feature of being provably final. I.e., when you have true happiness you can stop working on happiness; when you have ultimate truth you can stop looking for truth; when you know an objective morality you can stop thinking about morality. So humans are always striving to end striving.
(Of course whether they’d be happy if they actually ended striving is a different question, and one you’ve written eloquently about in the “fun theory” series.)
The word “ought” means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values. There’s just nothing which says that other minds necessarily care about them. It is also possible that different humans care about different things, but there’s enough overlap that it makes sense (I believe, Greene does not) to use words like “ought” in daily communication.
Just a minor thought: there is a great deal of overlap on human “ought”s, but not so much on formal philosophical “ought”s. Dealing with philosophers often, I prefer to see ought as a function, so I can talk of “ought(Kantian)” and “ought(utilitarian)”.
Maybe Greene has more encounters with formal philosophers than you, and thus cannot see much overlap?
Re: “The word “ought” means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values.”
A revealing and amazing comment—from my point of view. I had no idea you believed that.
What about alien “ought”s? Presumably you can hack the idea that aliens might see morality rather differently from us. So, presumably you are talking about ought_human—glossing over our differences from one another.
There’s a human morality in about the same sense as there’s a human height.
There are no alien oughts, though there are alien desires and alien would-wants. They don’t see morality differently from us; the criterion by which they choose is simply not that which we name morality.
There’s a human morality in about the same sense as there’s a human height.
This is a wonderful epigram, though it might be too optimistic. The far more pessimistic version would be “There’s a human morality in about the same sense as there’s a human language.” (This is what Greene seems to believe and it’s a dispute of fact.)
Eliezer, I think your proposed semantics of “ought” is confusing, and doesn’t match up very well with ordinary usage. May I suggest the following alternative?
ought_X refers to X’s would-wants if X is an individual. If X is a group, then ought_X is the overlap between the oughts of its members.
In ordinary conversation, when people use “ought” without an explicit subscript or possessive, the implicit X is the speaker plus the intended audience (not humanity as a whole).
ETA: The reason we use “ought” is to convince the audience to do or not do something, right? Why would we want to refer to ought_humanity, when ought_{speaker+audience} would work just fine for that purpose, and ought_{speaker+audience} covers a lot more ground than ought_humanity?
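A minimal sketch of this proposal, with would-wants modeled as sets and the group ought as their overlap; the set representation and the Person class are my simplifications, not part of the original comment:

```python
# Sketch: ought_X is X's would-wants for an individual, and the overlap (here, set
# intersection) of the members' oughts for a group.  Smaller groups generally get a
# larger overlap, which is why ought_{speaker+audience} covers more ground than
# ought_humanity.

def ought(X):
    if hasattr(X, "would_wants"):                       # individual
        return set(X.would_wants)
    members = list(X)                                   # group
    return set.intersection(*(ought(m) for m in members)) if members else set()

class Person:
    def __init__(self, would_wants):
        self.would_wants = would_wants

speaker = Person({"no murder", "keep promises", "tip waiters"})
audience = Person({"no murder", "keep promises"})
print(ought([speaker, audience]))                       # {'no murder', 'keep promises'}
```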
“There’s a human morality in about the same sense as there’s a human language.” (This is what Greene seems to believe and it’s a dispute of fact.)
That seems to hit close to the mark. Human language contains all sorts of features that are more or less universal to humans due to their hardware while also being significantly determined by cultural influences. It also shares the feature that certain types of language (and ‘ought’ systems) are more useful in different cultures or subcultures.
This is a wonderful epigram, though it might be too optimistic. The far more pessimistic version would be
I’m not sure I follow this. Neither seem particularly pessimistic to me and I’m not sure how one could be worse than the other.
Jumping recklessly in at the middle: even granting your premises regarding the scope of ‘ought’, it is not wholly clear that an alien “ought” is impossible. As timtyler pointed out, the Babyeaters in “Three Worlds Collide” probably had a would-want structure within the “ought” cluster in thingspace, and systems of behaviors have been observed in some nonhuman animals which resemble human morality.
I’m not saying it’s likely, though, so this probably constitutes nitpicking.
“There are no alien oughts” and “They don’t see morality differently from us”—these seem like more bizarre-sounding views on the subject of morality—and it seems especially curious to hear them from the author of the “Baby-Eating Aliens” story.
Look, it’s not very complicated: When you see Eliezer write “morality” or “oughts”, read it as “human morality” and “human oughts”.
It isn’t that simple either. Human morality contains a significant component of trying to coerce other humans into doing things that benefit you. Even on a genetic level humans come with significantly different ways of processing moral thoughts. What is often called ‘personality’, particularly in the context of ‘personality type’.
The translation I find useful is to read it as “Eliezer-would-want”. By the definitions Eliezer has given us the two must be identical. (Except, perhaps if Eliezer has for some reason decided to make himself immoral a priori.)
Well then, I don’t understand why you would find statements like “There are no alien [human oughts]” and “They don’t see [human morality] differently from us” bizarre-sounding.
It is not that there is no ultimate truth, but rather, that there is no closet which might contain or fail to contain “ultimate truth”, it’s just the word “truth” with the sonorous-sounding adjective “ultimate” tacked on in front. Truth is all there is or coherently could be.
It is [...] possible that different humans care about different things, but there’s enough overlap that it makes sense (I believe, Greene does not) to use words like “ought” in daily communication.
Fair enough. But are you saying that there is an objective standard of ought, or do you just mean a shared subjective standard? Or maybe a single subjective standard?
A single subjective standard. But he uses different terminology, with that difference having implications about how morality should (full Eliezer meaning) be thought about.
It can be superficially considered to be a shared subjective standard, in as much as many other humans have morality that overlaps with his in some ways, and also in the sense that his morality includes (if I recall correctly) the preferences of others somewhere within it. I find it curious that the final result leaves language and positions reminiscent of those begotten by a belief in an objective standard of ought, but without requiring totally insane beliefs like, say, theism, or predicting that a uFAI will learn ‘compassion’ and become a FAI just because ‘should’ is embedded in the universe as an inevitable force or something.
Still, if I am to translate the Eliezer word into the language of Stuart_Armstrong, it matches “a single subjective standard, but I’m really serious about it”. (Part of me wonders if Eliezer’s position on this particular branch of semantics would be any different if there were fewer non-sequitur rejections of Bayesian statistics with that pesky ‘subjective’ word in it.)
On your analysis of should, paperclip maximizers should not maximize paperclips. Do you think this is a more useful characterization of ‘should’ than one in which we should be moral and rational, etc., and paperclip maximizers should maximize paperclips?
A paperclip maximizer will maximize paperclips. I am unable to distinguish any sense in which this is a good thing. Why should I use the word “should” to describe this, when “will” serves exactly as well?
Please amplify on that. I can sorta guess what you mean, but can’t be sure.
We make a distinction between the concepts of what people will do and what they should do. Is there an analogous pair of concepts applicable to paperclip maximizers? Why or why not? If not, what is the difference between people and paperclip maximizers that justifies there being this difference for people but not for paperclip maximizers?
A paperclip maximizer will maximize paperclips.
Will paperclip maximizers, when talking about themselves, distinguish between what they will do, and what will maximize paperclips? (While wishing they were better paperclip maximizers than they are.) What they will actually do is distinct from what will maximize paperclips: it’s predictable that actual performance is always less than optimal, given that the problem is open-ended enough.
Let there be a mildly insane (after the fashion of a human) paperclipper named Clippy.
Clippy does A. Clippy would do B if a sane but bounded rationalist, C if an unbounded rationalist, and D if it had perfect veridical knowledge. That is, D is the actual paperclip-maximizing action, C is theoretically optimal given all of Clippy’s knowledge, B is as optimal as C can realistically get under perfect conditions.
Is B, C, or D what Clippy Should(Clippy) do? This is a reason to prefer “would-want”. Though I suppose a similar question applies to humans. Still, what Clippy should do is give up paperclips and become an FAI. There’s no chance of arguing Clippy into that, because Clippy doesn’t respond to what we consider a moral argument. So what’s the point of talking about what Clippy should do, since Clippy’s not going to do it? (Nor is it going to do B, C, or D, just A.)
PS: I’m also happy to talk about what it is rational for Clippy to do, referring to B.
Your usage of ‘should’ is more of a redefinition than a clarification. B, C and D work as clarifications for the usual sense of the word: “should” has a feel ‘meta’ enough to transfer over to more kinds of agents.
If you can equally well talk of Should(Clippy) and Should(Humanity), then for the purposes of FAI it’s Should that needs to be understood, not one particular sense should=Should(Humanity). If one can’t explicitly write out Should(Humanity), one should probably write out Should(-), which is featureless enough for there to be no problem with the load of detailed human values, and in some sense pass Humanity as a parameter to its implementation. Do you see this framing as adequate or do you know of some problem with it?
This is a good framing for explaining the problem—you would not, in fact, try to build the same FAI for Clippies and humans, and then pass it humans as a parameter.
E.g. structural complications of human “should” that only the human FAI would have to be structurally capable of learning. (No, you cannot have complete structural freedom because then you cannot do induction.)
This is a good framing for explaining the problem—you would not, in fact, try to build the same FAI for Clippies and humans, and then pass it humans as a parameter.
I expect you would build the same FAI for paperclipping (although we don’t have any Clippies to pass it as parameter), so I’d appreciate it if you did explain the problem given you believe there is one, since it’s a direction that I’m currently working.
Humans are stuff, just like any other feature of the world, that FAI would optimize, and on stuff-level it makes no difference that people prefer to be “free to optimize”. You are “free to optimize” in a deterministic universe, it’s the way this stuff is (being) arranged that makes the difference, and it’s the content of human preference that says it shouldn’t have some features like undeserved million-dollar bags falling from the sky, where undeserved is another function of stuff. An important subtlety of preference is that it makes different features of perhaps mutually exclusive possible scenarios depend on each other, so the fact that one should care about what could be and how it’s related to what could be otherwise and even to how it’s chosen what to actually realize is about scope of what preference describes, not about specific instance of preference. That is, in a manner of speaking, it’s saying that you need an Int32, not a Bool to hold this variable, but that Int32 seems big enough.
Furthermore, considering the kind of dependence you described in that post you linked seems fundamental from a certain logical standpoint, for any system (not even “AI”). If you build the ontology for FAI on its epistemology, that is you don’t consider it as already knowing anything but only as having its program that could interact with anything, then the possible futures and its own decision-making are already there (and it’s all there is, from its point of view). All it can do, on this conceptual level, is to craft proofs (plans, designs of actions) that have the property of having certain internal dependencies in them, with the AI itself being the “current snapshot” of what it’s planning. That’s enough to handle the “free to optimize” requirement, given the right program.
Hmm, I’m essentially arguing that universal-enough FAI is “computable”, that there is a program that computes a FAI for any given “creature”, within a certain class of “creatures”. I guess this problem is void as stated: for a small enough class it is in principle solvable, and for a big enough class it’ll hit problems, if not conceptual then practical; the difficulty is on the too-big-class side.
So the real question is about the characteristics of such class of systems for which it’s easier to build an abstract FAI, that is a tool that takes a specimen of this class as a parameter and becomes a custom-made FAI for that specimen. This class needs to at least include humanity, and given the size of humanity’s values, it needs to also include a lot of other stuff, for itself to be small enough to program explicitly. I currently expect a class of parameters of a manageable abstract FAI implementation to include even rocks and trees, since I don’t see how to rigorously define and use in FAI theory the difference between these systems and us.
This also takes care of human values/humanity’s values divide: these are just different systems to parameterize the FAI with, so there is no need for a theory of “value overlaps” distinct from a theory of “systems values”. Another question is that “humanity” will probably be a bit harder to specify as parameter than some specific human or group of people.
Re: I suppose a similar question applies to humans.
Indeed—this objection is the same for any agent, including humans.
It doesn’t seem to follow that the “should” term is inappropriate. If this is a reason for objecting to the “should” term, then the same argument concludes that it should not be used in a human context either.
Why should I use the word “should” to describe this, when “will” serves exactly as well?
‘Will’ does not serve exactly as well when considering agents with limited optimisation power (that is, any actual agent). Consider, for example, a Paperclip Maximiser that happens to be less intelligent than I am. I may be able to predict that Clippy will colonize Mars before he invades Earth but also be quite sure that more paperclips would be formed if Clippy invaded Earth first. In this case I will likely want a word that means “would better serve to maximise the agent’s expected utility even if the agent does not end up doing it”.
One option is to take ‘should’ and make it the generic ‘should(agent)’. I’m not saying you should use ‘should’ (implicitly, ‘should(Clippy)’) to describe the action that Clippy would take if he had sufficient optimisation power. But I am saying that ‘will’ does not serve exactly as well.
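A small sketch of that distinction, with predict_action and expected_paperclips as hypothetical stand-ins for the real prediction and evaluation steps (none of these names come from the thread):

```python
# Sketch: "will" is a prediction about the limited agent; the generic should(agent)
# picks whatever would best serve the agent's own goal, whether or not it does it.

def will(agent, options, predict_action):
    return predict_action(agent, options)          # what Clippy is actually going to do

def should_for(agent, options, expected_paperclips):
    return max(options, key=expected_paperclips)   # what would yield the most paperclips

options = ["colonize Mars first", "invade Earth first"]
predict = lambda agent, opts: opts[0]              # a weak Clippy picks the worse plan
value = lambda opt: {"colonize Mars first": 1e6, "invade Earth first": 1e9}[opt]
print(will("Clippy", options, predict))            # -> colonize Mars first
print(should_for("Clippy", options, value))        # -> invade Earth first
```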
I use “would-want” to indicate extrapolation. I.e., A wants X but would-want Y. This helps to indicate the implicit sensitivity to the exact extrapolation method, and that A does not actually represent a desire for Y at the current moment, etc. Similarly, A does X but would-do Y, A chooses X but would-choose Y, etc.
It’s a good thing—from their point of view. They probably think that there should be more paperclips. The term “should” makes sense in the context of a set of preferences.
No, it’s a paperclip-maximizing thing. From their point of view, and ours. No disagreement. They just care about what’s paperclip-maximizing, not what’s good.
IMO, in this context, “good” just means “favoured by this moral system”. An action that “should” be performed is just one that would be morally obligatory—according to the specified moral system. Both terms are relative to a set of moral standards.
I was talking as though a paperclip maximiser would have morals that reflected their values. You were apparently assuming the opposite. Which perspective is better would depend on which particular paperclip maximiser was being examined.
Personally, I think there are often good reasons for morals and values being in tune with one another.
I think you’re just using different words to say the same thing that Greene is saying, you in particular use “should” and “morally right” in a nonstandard way—but I don’t really care about the particular way you formulate the correct position, just as I wouldn’t care if you used the variable “x” where Greene used “y” in an integral.
You do agree that you and Greene are actually saying the same thing, yes?
Whose version of “should” are you using in that sentence? If you’re using the EY version of “should” then it is not possible for you and Greene to think people should do different things unless you and Greene anticipate different experimental results...
… since the EY version of “should” is (correct me if I am wrong) a long list of specific constraints and valuators that together define one specific utility function U_human-morality-according-to-EY. You can’t disagree with Greene over what the concrete result of maximizing U_human-morality-according-to-EY is unless one of you is factually wrong.
Oh well in that case, we disagree about what reply we would hear if we asked a friendly AI how to talk and think about morality in order to maximize human welfare as construed in most traditional utilitarian senses.
This is phrased as a different observable, but it represents more of a disagreement about impossible possible worlds than possible worlds—we disagree about statements with truth conditions of the type of mathematical truth, i.e. which conclusions are implied by which premises. Though we may also have some degree of empirical disagreement about what sort of talk and thought leads to which personal-hedonic results and which interpersonal-political results.
we disagree about what reply we would hear if we asked a friendly AI how to talk and think about morality in order to maximize human welfare as construed in most traditional utilitarian senses.
Surely you should both have large error bars around the answer to that question in the form of fairly wide probability distributions over the set of possible answers. If you’re both well-calibrated rationalists those distributions should overlap a lot. Perhaps you should go talk to Greene? I vote for a bloggingheads.
Yes, it’s possible that Greene is correct about what humanity ought to do at this point, but I think I know a bit more about his arguments than he does about mine...
I would be surprised if Eliezer would cite Joshua Greene’s moral anti-realist view with approval.
Not at all, if we started out by wanting to arrive in the same city.
And not at all, if I selected you as a point of comparison by looking around the city I was in at the time.
Otherwise, yes, very suspicious. Usually, when two randomly selected people in Earth’s population get into a car and drive somewhere, they arrive in different cities.
No, because you selected those two questions to have the same answer.
Yes-or-no questions have a very small answer space so even if you hadn’t selected them to correlate, it would only be 1 bit of coincidence.
The examples in the grandparent do seem to miss the point that Alicorn was originally describing.
It is still surprising, but somewhat less so if our question answering is about finding descriptions for our hardwired intuitions. In that case people with similar personalities can be expected to formulate question-answer pairs that differ mainly in their respective areas of awkwardness as descriptions of the territory.
And we did exactly that (metaphorically speaking). I said:
It seems to me that you and I ask dissimilar questions and arrive at superficially similar answers. (I say “superficially similar” because I consider the “because” clause in an ethical statement to be important—if you think you should pull the six-year-old off the train tracks because that maximizes your utility function and I think you should do it because the six-year-old is entitled to your protection on account of being a person, those are different answers, even if the six-year-old lives either way.) The babyeaters get more non-matching results in the “does the six-year-old live” department, but their questions—just about as important in comparing theories—are not (it seems to me) so much more different than yours and mine.
Everybody, in seeking a principled ethical theory, has to bite some bullets (or go on an endless Easter-epicycle hunt).
To me, this doesn’t seem like superficial similarity at all. I should sooner call the differences of verbal “because” superficial, and focus on that which actually produces the answer.
I think you should do it because the six-year-old is valuable and precious and irreplaceable, and if I had a utility function it would describe that. I’m not sure how this differs from what you’re doing, but I think it differs from what you think I’m doing.
Correct. But neither would they ‘care’ about paperclips, under the way Eliezer’s pushing this idea. They would flarb about paperclips, and caring would be as alien to them as flarbing is to you.
I think some subset of paperclip maximizers might be said to care about paperclips. Not, most likely, all possible instances of them.
I had the same thought.
You are arguing over definitions but it is useful. You make many posts that rely on these concepts so the definitions are relevant. That ‘you are just arguing semantics’ call is sometimes an irritating cached response.
You are making more than one claim here. The different-concept-alien stuff you have explained quite clearly (e.g. from the first semicolon onwards in the parent). This seems to be obviously true. The part before the semicolon is a different concept (probably two). Your posts have not given me the impression that you treat the truth issue as distinct from the normativity and subjectivity issues. You also included ‘objective morality’ in with ‘True’, ‘transcendental’ and ‘ultimate’ as things that have no meaning. I believe you are confused and that your choice of definition for ‘should’ contributes to this.
I say I disagree with a significant part of your position, although not the most important part.
I definitely disagree with Tim. I may agree with some of the others.
I agree with the claim you imply with the intuition pump. I disagree with the claim you imply when you are talking about ‘uniquely the right answer’. Your intuition pump does not describe the same concept that your description does.
This part does match the intuition pump but you are consistently conflating this concept with another (see uniquely-right true-value of girl treatment) in your posts in this thread. You are confused.
It is the claims along the lines of ‘truth value’ that are most counterintuitive. The universality that you attribute to ‘Right’ also requires some translation.
I see, and that is an excellent point. Daniel Dennett has taken a similar attitude towards qualia, if I interpret you correctly—he argues that the idea of qualia is so inextricably bound with its standard properties (his list goes ineffable, intrinsic, private, and directly or immediately apprehensible by the consciousness) that to describe a phenomenon lacking those properties by that term is as wrongheaded as using the term elan vital to refer to DNA.
I withdraw my implied criticism.
I absolutely do not want my FAI to be constrained by the law. If the FAI allows machine intelligence researchers to create an uFAI we will all die. An AI that values the law above the existence of me and my species is evil, not Friendly. I wouldn’t want the FAI to kill such researchers unless it was unable to find a more appealing way to ensure future safety but I wouldn’t dream of constraining it to either laws or politics. But come to think of it I don’t want it to be sensible either.
The Three Laws of Robotics may be a naive conception but that Zeroth law was a step in the right direction.
Re: If the FAI allows machine intelligence researchers to create an uFAI we will all die
Yes, that’s probably just the kind of paranoid delusional thinking that a psychopathic superintelligence with no respect for the law would use to justify its murder of academic researchers.
Hopefully, we won’t let it get that far. Constructing an autonomous tool that will kill people is conspiracy to murder—so hopefully the legal system will allow us to lock up researchers who lack respect for the law before they do some real damage. Assassinating your competitors is not an acceptable business practice.
Hopefully, the researchers will learn the error of their ways before then. The first big and successful machine intelligence project may well be a collaboration. “Help build my tool, or be killed by it” is a rather aggressive proposition—and I expect most researchers will reject it, and expend their energies elsewhere—hopefully on more law-abiding projects.
You seem confused (or, perhaps, hysterical). A psychopathic superintelligence would have no need to justify anything it does to anyone.
By including ‘delusional’ you appear to be claiming that an unfriendly super-intelligence would not likely cause the extinction of humanity. Was that your intent? If so, why do you suggest that the first actions of a FAI would be to kill AI researchers? Do you believe that a superintelligence will disagree with you about whether uFAI is a threat and that it will be wrong while you are right? That is a bizarre prediction.
You seem to have a lot of faith in the law. I find this odd. Has it escaped your notice that a GAI is not constrained by country borders? I’m afraid most of the universe, even most of the planet, is out of your jurisdiction.
Re: You seem confused (or, perhaps, hysterical).
Uh, thanks :-(
A powerful corporate agent not bound by the law might well choose to assassinate its potential competitors—if it thought it could get away with it. Its competitors are likely to be among those best placed to prevent it from meeting its goals.
Its competitors don’t have to want to destroy all humankind for it to want to eliminate them! The tiniest divergence between its goals and theirs could potentially be enough.
It is a misconception to think of law as a set of rules. Even more so to understand it as a set of rules that applies to non-humans today. In addition, rules won’t be very effective constraints on superintelligences.
Here’s a very short unraveling of “should”:
“Should” means “is such as to fulfill the desires in question.” For example, “If you want to avoid being identified while robbing a convenience store, you should wear a mask.”
In the context of morality, the desires in question are all desires that exist. “You shouldn’t rob convenience stores” means, roughly, “People in general have many and strong reasons to ensure that individuals don’t want to rob convenience stores.”
For the long version, see http://atheistethicist.blogspot.com/2005/12/meaning-of-ought.html .
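A minimal sketch of that reading, with an invented `fulfills` predicate and invented desires (this only illustrates the gloss above, not the linked author’s own formalization):

```python
# Toy rendering of the proposal above: "X should do A" means doing A is such
# as to fulfill the desires in question. The desires and the `fulfills` test
# here are placeholders.

def fulfills(action, desire):
    # A desire is modelled as a predicate over actions.
    return desire(action)

def should(action, desires_in_question):
    # Simplification: require every desire in question to be fulfilled.
    # The longer version linked above weighs "many and strong" desires
    # rather than demanding unanimity.
    return all(fulfills(action, d) for d in desires_in_question)

# Prudential use: only the robber's desire to avoid identification is in question.
avoid_identification = lambda action: action == "wear a mask"
print(should("wear a mask", [avoid_identification]))       # True
print(should("go in bare-faced", [avoid_identification]))  # False
```

The moral use is the same function with a much larger argument: the desires in question become all the desires that exist.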
I’m a moral cognitivist too but I’m becoming quite puzzled as to what truth-conditions you think “should” statements have. Maybe it would help if you said which of these you think are true statements.
1) Eliezer Yudkowsky should not kill babies.
2) Babyeating aliens should not kill babies.
3) Sharks should not kill babies.
4) Volcanoes should not kill babies.
5) Should not kill babies. (sic)
The meaning of “should not” in 2 through 5 is intended to be the same as the common usage of the words in 1.
Technically, you would need to include a caveat in all of those like, “unless to do so would advance paperclip production” but I assume that’s what you meant.
I don’t think there is one common usage of the word “should”.
(ETA: I asked the nearest three people if “volcanoes shouldn’t kill people” is true, false, or neither, assuming that “people shouldn’t kill people” is true or false so moral non-realism wasn’t an issue. One said true, two said neither.)
I don’t think there’s one canonical common usage of the word “should”.
(I’m not sure whether to say that 2-5 are true, or that 2-4 are type errors and 5 is a syntax error.)
They all sound true to me.
Interesting, what about either of the following:
A) If X should do A, then it is rational for X to do A.
B) If it is rational for X to do A, then X should do A.
From what I understand of Eliezer’s position:
False.
False.
(If this isn’t the case then Eliezer’s ‘should’ is even more annoying than how I now understand it.)
Yep, both false.
So, just to dwell on this for a moment, there exist X and A such that (1) it is rational for X to do A and (2) X should not do A.
How do you reconcile this with “rationalists should win”? (I think I know what your response will be, but I want to make sure.)
Here’s my guess at one type of situation Eliezer might be thinking of when calling proposition B false: It is rational (let us stipulate) for a paperclip maximizer to turn all the matter in the solar system into computronium in order to compute ways to maximize paperclips, but “should” does not apply to paperclip maximizers.
Correct.
EDIT: If I were picking nits, I would say, “‘Should’ does apply to paperclip maximizers—it is rational for X to make paperclips but it should not do so—however, paperclip maximizers don’t care and so it is pointless to talk about what they should do.” But the overall intent of the statement is correct—I disagree with its intent in neither anticipation nor morals—and in such cases I usually just say “Correct”. In this case I suppose that wasn’t the best policy, but it is my usual policy.
Of course, Kant distinguished between two different meanings of “should”: the hypothetical and the categorical.
If you want to be a better Go player, you should study the games of Honinbo Shusaku.
You should pull the baby off the rail track.
This seems useful here...
False. Be consistent.
What I think you mean is:
There is a function Should(human) (or Should(Eliezer)) which computes the human consensus (or Eliezer’s opinion) on what the morally correct course of action is.
And some aliens have their own Should function, which would be, in form if not in content, similar to our own. So a paperclip maximiser doesn’t get a should, as it simply follows a “figure out how to maximise paper clips—then do it” format. However a complex alien society that has many values and feels it must kill everyone else for the artistic cohesion of the universe, but often fails to act on this feeling because of akrasia, will get a Should(Krikkit) function.
However, until such time as we meet this alien civilization, we should just use Should as a shorthand for Should(human).
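As a sketch of this framing (the verdicts below are placeholders, not anyone’s actual morality):

```python
# Toy version of the Should(X) framing: a lookup from an agent or society to
# its own verdict function. All entries are placeholders.

SHOULD = {
    "human":   lambda action: action != "eat the six-year-old",
    "Eliezer": lambda action: action != "eat the six-year-old",
    "Krikkit": lambda action: action == "kill everyone else",  # for artistic cohesion, per the example
}
# A pure paperclip maximiser gets no entry: it just runs
# "figure out how to maximise paperclips, then do it".

def Should(x, action):
    return SHOULD[x](action)

# The proposed shorthand until we meet the aliens: plain Should = Should("human", ...).
print(Should("human", "eat the six-year-old"))  # False
```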
Is my understanding correct?
There could be a word defined that way, but for purposes of staying unconfused about morality, I prefer to use “would-want” so that “should” is reserved specifically for things that, you know, actually ought to be done.
“would-want”—under what circumstances? Superficially, it seems like pointless jargon. Is there a description somewhere of what it is supposed to mean?
Hmm. I guess not.
Fair enough. But are you saying that there is an objective standard of ought, or do you just mean a shared subjective standard? Or maybe a single subjective standard?
The word “ought” means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values. There’s just nothing which says that other minds necessarily care about them. It is also possible that different humans care about different things, but there’s enough overlap that it makes sense (I believe, Greene does not) to use words like “ought” in daily communication.
What would the universe look like if there were such a thing as an “objective standard”? If you can’t tell me what the universe looks like in this case, then the statement “there is an objective morality” is not false—it’s not that there’s a closet which is supposed to contain an objective morality, and we looked inside it, and the closet is empty—but rather the statement fails to have a truth-condition. Sort of like opening a suitcase that actually does contain a million dollars, and you say “But I want an objective million dollars”, and you can’t say what the universe would look like if the million dollars were objective or not.
I should write a post at some point about how we should learn to be content with happiness instead of “true happiness”, truth instead of “ultimate truth”, purpose instead of “transcendental purpose”, and morality instead of “objective morality”. It’s not that we can’t obtain these other things and so must be satisfied with what we have, but rather that tacking on an impressive adjective results in an impressive phrase that fails to mean anything. It is not that there is no ultimate truth, but rather, that there is no closet which might contain or fail to contain “ultimate truth”, it’s just the word “truth” with the sonorous-sounding adjective “ultimate” tacked on in front. Truth is all there is or coherently could be.
When you put those together like that it occurs to me that they all share the feature of being provably final. I.e., when you have true happiness you can stop working on happiness; when you have ultimate truth you can stop looking for truth; when you know an objective morality you can stop thinking about morality. So humans are always striving to end striving.
(Of course whether they’d be happy if they actually ended striving is a different question, and one you’ve written eloquently about in the “fun theory” series.)
That’s actually an excellent way of thinking about it—perhaps the terms are not as meaningless as I thought.
Just a minor thought: there is a great deal of overlap on human “ought”s, but not so much on formal philosophical “ought”s. Dealing with philosophers often, I prefer to see ought as a function, so I can talk of “ought(Kantian)” and “ought(utilitarian)”.
Maybe Greene has more encounters with formal philosophers than you, and thus cannot see much overlap?
Re: “The word “ought” means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values.”
A revealing and amazing comment—from my point of view. I had no idea you believed that.
What about alien “ought”s? Presumably you can hack the idea that aliens might see morality rather differently from us. So, presumably you are talking about ought_human—glossing over our differences from one another.
There’s a human morality in about the same sense as there’s a human height.
There are no alien oughts, though there are alien desires and alien would-wants. They don’t see morality differently from us; the criterion by which they choose is simply not that which we name morality.
This is a wonderful epigram, though it might be too optimistic. The far more pessimistic version would be “There’s a human morality in about the same sense as there’s a human language.” (This is what Greene seems to believe and it’s a dispute of fact.)
Eliezer, I think your proposed semantics of “ought” is confusing, and doesn’t match up very well with ordinary usage. May I suggest the following alternative?
ought_X refers to X’s would-wants if X is an individual. If X is a group, then ought_X is the overlap between the oughts of its members.
In ordinary conversation, when people use “ought” without an explicit subscript or possessive, the implicit X is the speaker plus the intended audience (not humanity as a whole).
ETA: The reason we use “ought” is to convince the audience to do or not do something, right? Why would we want to refer to ought_humanity, when the narrower ought of the speaker plus audience would work just fine for that purpose, and covers a lot more ground than ought_humanity?
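Rendered as a sketch (the would-want sets below are invented placeholders):

```python
# Sketch of the proposal: ought_X is X's would-wants if X is an individual,
# and the overlap of members' oughts if X is a group. The sets are placeholders.

would_wants = {
    "speaker":  {"pull the child off the tracks", "tell the truth"},
    "audience": {"pull the child off the tracks", "tip the owner"},
}

def ought(x):
    if isinstance(x, str):                                   # an individual
        return would_wants[x]
    return set.intersection(*(would_wants[m] for m in x))    # a group: the overlap

# The implicit subscript in ordinary conversation: speaker plus intended audience.
print(ought(["speaker", "audience"]))   # {'pull the child off the tracks'}
# ought_humanity would be the (much smaller) overlap taken across everyone.
```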
That seems to hit close to the mark. Human language contains all sorts of features that are more or less universal to humans due to their hardware while also being significantly determined by cultural influences. It also shares the feature that certain types of language (and ‘ought’ systems) are more useful in different cultures or subcultures.
I’m not sure I follow this. Neither seem particularly pessimistic to me and I’m not sure how one could be worse than the other.
Jumping recklessly in at the middle: even granting your premises regarding the scope of ‘ought’, it is not wholly clear that an alien “ought” is impossible. As timtyler pointed out, the Babyeaters in “Three Worlds Collide” probably had a would-want structure within the “ought” cluster in thingspace, and systems of behaviors have been observed in some nonhuman animals which resemble human morality.
I’m not saying it’s likely, though, so this probably constitutes nitpicking.
“There are no alien oughts” and “They don’t see morality differently from us”—these seem like more bizarre-sounding views on the subject of morality—and it seems especially curious to hear them from the author of the “Baby-Eating Aliens” story.
Look, it’s not very complicated: When you see Eliezer write “morality” or “oughts”, read it as “human morality” and “human oughts”.
It isn’t that simple either. Human morality contains a significant component of trying to coerce other humans into doing things that benefit you. Even on a genetic level humans come with significantly different ways of processing moral thoughts. What is often called ‘personality’, particularly in the context of ‘personality type’.
The translation I find useful is to read it as “Eliezer-would-want”. By the definitions Eliezer has given us the two must be identical. (Except, perhaps if Eliezer has for some reason decided to make himself immoral a priori.)
Um, that’s what I just said: “presumably you are talking about ought_human”.
We were then talking about the meaning of ought_human.
There’s also the issue of whether to discuss ought_then and ought_now—which are evidently quite different—due to the shifting moral zeitgeist.
Well then, I don’t understand why you would find statements like “There are no alien [human oughts]” and “They don’t see [human morality] differently from us” bizarre-sounding.
Having established EY meant ought_human, I was asking about ought_alien.
Maybe you are right—and EY misinterpreted me—and genuinely thought I was asking about ought_human.
If so, that seems like a rather ridiculous question for me to be asking—and I’m surprised it made it through his sanity checker.
Even if “morality” means “criterion for choosing…”? Their criterion might have a different referent, but that does not imply a different sense. cf. “This planet”. Out of the two, sense has more to do with meaning, since it doesn’t change with changes of place and time.
Then we need a better way of distinguishing between what we’re doing and what we would be doing if we were better at it.
You’ve written about the difference between rationality and believing that one’s bad arguments are rational.
For the person who is in the latter state, something that might be called “true rationality” is unimaginable, but it exists.
Thanks, this has made your position clear. And—apart from tiny differences in vocabulary—it is exactly the same as mine.
So what about the Ultimate Showdown of Ultimate Destiny?
...sorry, couldn’t resist.
But there is a truth-condition for whether a showdown is “ultimate” or not.
This sentence is much clearer than the sort of thing you usually say.
A single subjective standard. But he uses different terminology, with that difference having implications about how morality should (full Eliezer meaning) be thought about.
It can be superficially considered to be a shared subjective standard in as much as many other humans have morality that overlaps with his in some ways, and also in the sense that his morality includes (if I recall correctly) the preferences of others somewhere within it. I find it curious that the final result leaves language and positions reminiscent of those begotten by a belief in an objective standard of ought, but without requiring totally insane beliefs like, say, theism, or predicting that a uFAI will learn ‘compassion’ and become a FAI just because ‘should’ is embedded in the universe as an inevitable force or something.
Still, if I am to translate the Eliezer word into the language of Stuart_Armstrong it matches “a single subjective standard but I’m really serious about it”. (Part of me wonders if Eliezer’s position on this particular branch of semantics would be any different if there were fewer non-sequitur rejections of Bayesian statistics with that pesky ‘subjective’ word in it.)
On your analysis of should, paperclip maximizers should not maximize paperclips. Do you think this is a more useful characterization of ‘should’ than one in which we should be moral and rational, etc., and paperclip maximizers should maximize paperclips?
A paperclip maximizer will maximize paperclips. I am unable to distinguish any sense in which this is a good thing. Why should I use the word “should” to describe this, when “will” serves exactly as well?
Please amplify on that. I can sorta guess what you mean, but can’t be sure.
We make a distinction between the concepts of what people will do and what they should do. Is there an analogous pair of concepts applicable to paperclip maximizers? Why or why not? If not, what is the difference between people and paperclip maximizers that justifies there being this difference for people but not for paperclip maximizers?
Will paperclip maximizers, when talking about themselves, distinguish between what they will do, and what will maximize paperclips? (While wishing they’d be more like the paperclip maximizers they wish they were.) What they will actually do is distinct from what will maximize paperclips: it’s predictable that actual performance is always less than optimal, given that the problem is open-ended enough.
Let there be a mildly insane (after the fashion of a human) paperclipper named Clippy.
Clippy does A. Clippy would do B if it were a sane but bounded rationalist, C if it were an unbounded rationalist, and D if it had perfect veridical knowledge. That is, D is the actual paperclip-maximizing action, C is theoretically optimal given all of Clippy’s knowledge, and B is as close to C as a bounded rationalist can realistically get.
Is B, C, or D what Clippy Should(Clippy) do? This is a reason to prefer “would-want”. Though I suppose a similar question applies to humans. Still, what Clippy should do is give up paperclips and become an FAI. There’s no chance of arguing Clippy into that, because Clippy doesn’t respond to what we consider a moral argument. So what’s the point of talking about what Clippy should do, since Clippy’s not going to do it? (Nor is it going to do B, C, or D, just A.)
PS: I’m also happy to talk about what it is rational for Clippy to do, referring to B.
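For reference, the four levels being distinguished, as bare labels (nothing below is a definition anyone has committed to):

```python
# The four actions distinguished above for a mildly insane paperclipper.
CLIPPY_LEVELS = {
    "A": "what Clippy actually does",
    "B": "what a sane but bounded rationalist in Clippy's place would do",
    "C": "what an unbounded rationalist would do, given all of Clippy's knowledge",
    "D": "the actually paperclip-maximizing action, given perfect veridical knowledge",
}
# Usage in this thread: "rational for Clippy" picks out B; "would-want" points
# at the extrapolated levels; "should" is being reserved for none of them.
```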
Your usage of ‘should’ is more of a redefinition than a clarification. B, C and D work as clarifications of the usual sense of the word: “should” has a feel ‘meta’ enough to transfer over to more kinds of agents.
If you can equally well talk of Should(Clippy) and Should(Humanity), then for the purposes of FAI it’s Should that needs to be understood, not one particular sense should=Should(Humanity). If one can’t explicitly write out Should(Humanity), one should probably write out Should(-), which is featureless enough for there to be no problem with the load of detailed human values, and in some sense pass Humanity as a parameter to its implementation. Do you see this framing as adequate or do you know of some problem with it?
This is a good framing for explaining the problem—you would not, in fact, try to build the same FAI for Clippies and humans, and then pass it humans as a parameter.
E.g. structural complications of human “should” that only the human FAI would have to be structurally capable of learning. (No, you cannot have complete structural freedom because then you cannot do induction.)
I expect you would build the same FAI for paperclipping (although we don’t have any Clippies to pass it as parameter), so I’d appreciate it if you did explain the problem given you believe there is one, since it’s a direction that I’m currently working.
Humans are stuff, just like any other feature of the world that FAI would optimize, and on the stuff level it makes no difference that people prefer to be “free to optimize”. You are “free to optimize” in a deterministic universe; it’s the way this stuff is (being) arranged that makes the difference, and it’s the content of human preference that says the world shouldn’t have certain features, like undeserved million-dollar bags falling from the sky, where “undeserved” is another function of stuff. An important subtlety of preference is that it makes features of different, perhaps mutually exclusive, possible scenarios depend on each other; so the fact that one should care about what could be, how it relates to what could be otherwise, and even how it’s chosen what to actually realize, is about the scope of what preference describes, not about a specific instance of preference. That is, in a manner of speaking, it’s saying that you need an Int32, not a Bool, to hold this variable, but that an Int32 seems big enough.
Furthermore, the kind of dependence you described in the post you linked seems fundamental from a certain logical standpoint, for any system (not just an “AI”). If you build the ontology for FAI on its epistemology, that is, you don’t consider it as already knowing anything but only as having a program that could interact with anything, then the possible futures and its own decision-making are already there (and that’s all there is, from its point of view). All it can do, on this conceptual level, is craft proofs (plans, designs of actions) that have certain internal dependencies in them, with the AI itself being the “current snapshot” of what it’s planning. That’s enough to handle the “free to optimize” requirement, given the right program.
Hmm, I’m essentially arguing that universal-enough FAI is “computable”: that there is a program that computes a FAI for any given “creature”, within a certain class of “creatures”. I guess this problem is void as stated: for a small enough class it is solvable in principle, and for a big enough class it will hit problems, if not conceptual then practical.
So the real question is about the characteristics of the class of systems for which it’s easier to build an abstract FAI, that is, a tool that takes a specimen of this class as a parameter and becomes a custom-made FAI for that specimen. This class needs to at least include humanity, and given the size of humanity’s values, it needs to also include a lot of other stuff, for the tool itself to be small enough to program explicitly. I currently expect a class of parameters of a manageable abstract-FAI implementation to include even rocks and trees, since I don’t see how to rigorously define and use in FAI theory the difference between these systems and us.
This also takes care of the human values/humanity’s values divide: these are just different systems to parameterize the FAI with, so there is no need for a theory of “value overlaps” distinct from a theory of “system values”. Another question is that “humanity” will probably be a bit harder to specify as a parameter than some specific human or group of people.
Re: I suppose a similar question applies to humans.
Indeed—this objection is the same for any agent, including humans.
It doesn’t seem to follow that the “should” term is inappropriate. If this is a reason for objecting to the “should” term, then the same argument concludes that it should not be used in a human context either.
‘Will’ does not serve exactly as well when considering agents with limited optimisation power (that is, any actual agent). Consider, for example, a Paperclip Maximiser that happens to be less intelligent than I am. I may be able to predict that Clippy will colonize Mars before he invades Earth, but also be quite sure that more paperclips would be formed if Clippy invaded Earth first. In this case I will likely want a word that means “would better serve to maximise the agent’s expected utility even if the agent does not end up doing it”.
One option is to take ‘should’ and make it the generic ‘should_X’. I’m not saying you should use ‘should’ (implicitly, ‘should_Clippy’) to describe the action that Clippy would take if he had sufficient optimisation power. But I am saying that ‘will’ does not serve exactly as well.
I use “would-want” to indicate extrapolation. I.e., A wants X but would-want Y. This helps to indicate the implicit sensitivity to the exact extrapolation method, and that A does not actually represent a desire for Y at the current moment, etc. Similarly, A does X but would-do Y, A chooses X but would-choose Y, etc.
“Should” is a standard word for indicating moral obligation—it seems only sensible to use it in the context of other moral systems.
It’s a good thing—from their point of view. They probably think that there should be more paperclips. The term “should” makes sense in the context of a set of preferences.
No, it’s a paperclip-maximizing thing. From their point of view, and ours. No disagreement. They just care about what’s paperclip-maximizing, not what’s good.
This is not a real point of disagreement.
IMO, in this context, “good” just means “favoured by this moral system”. An action that “should” be performed is just one that would be morally obligatory—according to the specified moral system. Both terms are relative to a set of moral standards.
I was talking as though a paperclip maximiser would have morals that reflected their values. You were apparently assuming the opposite. Which perspective is better would depend on which particular paperclip maximiser was being examined.
Personally, I think there are often good reasons for morals and values being in tune with one another.
I think you’re just using different words to say the same thing that Greene is saying, you in particular use “should” and “morally right” in a nonstandard way—but I don’t really care about the particular way you formulate the correct position, just as I wouldn’t care if you used the variable “x” where Greene used “y” in an integral.
You do agree that you and Greene are actually saying the same thing, yes?
I don’t think we anticipate different experimental results. We do, however, seem to think that people should do different things.
Whose version of “should” are you using in that sentence? If you’re using the EY version of “should” then it is not possible for you and Greene to think people should do different things unless you and Greene anticipate different experimental results...
… since the EY version of “should” is (correct me if I am wrong) a long list of specific constraints and valuators that together define one specific utility function, U_human-morality-according-to-EY. You can’t disagree with Greene over what the concrete result of maximizing U_human-morality-according-to-EY is unless one of you is factually wrong.
Oh well in that case, we disagree about what reply we would hear if we asked a friendly AI how to talk and think about morality in order to maximize human welfare as construed in most traditional utilitarian senses.
This is phrased as a different observable, but it represents more of a disagreement about impossible possible worlds than possible worlds—we disagree about statements with truth conditions of the type of mathematical truth, i.e. which conclusions are implied by which premises. Though we may also have some degree of empirical disagreement about what sort of talk and thought leads to which personal-hedonic results and which interpersonal-political results.
(It’s a good and clever question, though!)
Surely you should both have large error bars around the answer to that question in the form of fairly wide probability distributions over the set of possible answers. If you’re both well-calibrated rationalists those distributions should overlap a lot. Perhaps you should go talk to Greene? I vote for a bloggingheads.
Asked Greene, he was busy.
Yes, it’s possible that Greene is correct about what humanity ought to do at this point, but I think I know a bit more about his arguments than he does about mine...
That is plausible.
Wouldn’t that be ‘advocate’, ‘propose’ or ‘suggest’?
I vote no, it wouldn’t be
I find that quite surprising to hear. Wouldn’t disagreements about meaning generally cash out in some sort of difference in experimental results?