1-4 yes.

5 is questionable. When you say “Nothing is fundamentally moral” can you explain what it would be like if something was fundamentally moral? If not, the term “fundamentally moral” is confused rather than untrue; it’s not that we looked in the closet of fundamental morality and found it empty, but that we were confused and looking in the wrong closet.
Indeed my utility function is generally indifferent to the exact state of universes that have no observers, but this is a contingent fact about me rather than a necessary truth of metaethics, for indifference is also a value. A paperclip maximizer would very much care that these uninhabited universes contained as many paperclips as possible—even if the paperclip maximizer were outside that universe and powerless to affect its state, in which case it might not bother to cognitively process the preference.
You seem to be angling for a theory of metaethics in which objects pick up a charge of value when some valuer values them, but this is not what I think, because I don’t think it makes any moral difference whether a paperclip maximizer likes paperclips. What makes moral differences are things like, y’know, life, consciousness, activity, blah blah.
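To make “indifference is also a value” concrete, here is a minimal sketch (the utility functions below are illustrative stand-ins, not anything proposed in the thread): an agent indifferent between observerless universes still has a utility function over them, it just assigns them all the same value, while a paperclip maximizer’s utility function goes on ranking them.

```python
# Illustrative sketch only: "indifference is also a value."
# A utility function that is constant over observerless universes still
# expresses a preference; a paperclip maximizer's does not flatten out there.

def human_like_utility(universe: dict) -> float:
    # Indifferent between universes with no observers: constant on all of them.
    if universe.get("observers", 0) == 0:
        return 0.0
    return float(universe.get("fun", 0))

def clippy_utility(universe: dict) -> float:
    # Cares about paperclip count even where there are no observers to notice.
    return float(universe.get("paperclips", 0))

empty_a = {"observers": 0, "paperclips": 10}
empty_b = {"observers": 0, "paperclips": 10**6}

print(human_like_utility(empty_a) == human_like_utility(empty_b))  # True: indifference
print(clippy_utility(empty_a) < clippy_utility(empty_b))           # True: Clippy still ranks them
```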
Eliezer,

In Setting Up Metaethics, you wrote:

And if you’ve been reading along this whole time, you know the answer isn’t going to be, “Look at this fundamentally moral stuff!”
I didn’t know what “fundamentally moral” meant, so I translated it to the nearest term with which I’m more familiar, what Mackie called “intrinsic prescriptivity.” Or, perhaps more clearly, “intrinsic goodness,” following Korsgaard:
Objects, activities, or whatever have an instrumental value if they are valued for the sake of something else—tools, money, and chores would be standard examples. A common explanation of the supposedly contrasting kind, intrinsic goodness, is to say that a thing is intrinsically good if it is valued for its own sake, that being the obvious alternative to a thing’s being valued for the sake of something else. This is not, however, what the words “intrinsic value” mean. To say that something is intrinsically good is not by definition to say that it is valued for its own sake: it is to say that it has goodness in itself. It refers, one might say, to the location or source of the goodness rather than the way we value the thing. The contrast between instrumental and intrinsic value is therefore misleading, a false contrast. The natural contrast to intrinsic goodness—the value a thing has “in itself”—is extrinsic goodness—the value a thing gets from some other source. The natural contrast to a thing that is valued instrumentally or as a means is a thing that is valued for its own sake or as an end.
So what I mean to say in (5) is that nothing is intrinsically good (in Korsgaard’s sense). That is, nothing has value in itself. Things only have value in relation to something else.
I’m not sure whether this notion of intrinsic value is genuinely confused or merely not-understood-by-Luke-Muehlhauser, but I’m betting it is either confused or false. (“Untrue” is the term usually used to capture a statement’s being either incoherent or meaningful-and-false: see for example Richard Joyce on error theory.)
But now, I’m not sure you agree with (5) as I intended it. Do you think life, consciousness, activity, and some other things have value-in-themselves? Do these things have intrinsic value?
Thanks again for your reply. I’m going to read Chappell’s comment on this thread, too.
Do you think a heap of five pebbles is intrinsically prime, or does it get its primeness from some extrinsic thing that attaches a tag with the five English letters “PRIME” and could in principle be made to attach the same tag to composite heaps instead? If you consider “beauty” as the logical function your brain’s beauty-detectors compute, then is a screensaver intrinsically beautiful?
Does the word “intrinsic” even help, considering that it invokes bad metaphysics all by itself? In the physical universe there are only quantum amplitudes. Moral facts are logical facts, but not all minds are compelled by that-subject-matter-which-we-name-“morality”; one could as easily build a mind to be compelled by the primality of a heap of pebbles.
So the short answer is that different functions use the same labels to designate different relations, while we assume that the same labels designate the same functions?

I wonder if Max Tegmark would have written a similar comment. I’m not sure whether, for Luke’s question, there is a meaningful difference between saying that there are only quantum amplitudes and saying that there are only relations.
What I’m saying is that in the physical world there are only causes and effects, and the primeness of a heap of pebbles is not an ontologically basic fact operating as a separate and additional element of physical reality, but it is nonetheless about as “intrinsic” to the heap of pebbles as anything.
Once morality stops being mysterious and you start cashing it out as a logical function, the moral awfulness of a murder is exactly as intrinsic as the primeness of a heap of pebbles. Just as we don’t care whether pebble heaps are prime or experience any affect associated with their primeness, the Pebblesorters don’t care or compute whether a murder is morally awful; and this doesn’t mean that a heap of five pebbles isn’t really prime or that primeness is arbitrary, nor yet that on the “moral Twin Earth” murder could be a good thing. And there are no little physical primons associated with the pebble-heap that could be replaced by compositons to make it composite without changing the number of pebbles; and no physical stone tablet on which morality is written that could be rechiseled to make murder good without changing the circumstances of the murder; but if you’re looking for those you’re looking in the wrong closet.
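A minimal sketch of the pebble-heap point, assuming nothing beyond an ordinary primality test (the agent functions are made up for illustration): the primality of the heap is fixed by its count alone, and which minds compute or care about it is a separate fact.

```python
# Illustrative sketch: primality as a logical function of the heap's count.
# Which minds compute it, or care, changes nothing about the output.

def is_prime(n: int) -> bool:
    """Fixed by n alone, not by who evaluates it."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def pebblesorter_approves(heap_size: int) -> bool:
    # A Pebblesorter is moved by primality...
    return is_prime(heap_size)

def indifferent_agent_approves(heap_size: int) -> bool:
    # ...another mind never even evaluates it.
    return False

heap = 5
print(is_prime(heap))                    # True, whether or not anyone cares
print(pebblesorter_approves(heap))       # True
print(indifferent_agent_approves(heap))  # False: indifference changes the agent, not is_prime(5)
```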
Are you arguing that the world is basically a cellular automaton, and that beauty is therefore logically implied to be a property of some instance of the universe? If some agent does perceive beauty, then that is a logically implied fact about the circumstances. Could asking whether another agent would perceive the same beauty then be rephrased as asking whether two expressions of an equation are equal?

I think a lot of people are arguing about the ambiguity of the string “beauty” as it is multiply realized.

Good answer!
But now, I’m not sure you agree with (5) as I intended it. Do you think life, consciousness, activity, and some other things have value-in-themselves? Do these things have intrinsic value?
It is rather difficult to ask that question in the way you intend it. Particularly if the semantics have “because I say so” embedded rather than supplemented.
When you say “Nothing is fundamentally moral” can you explain what it would be like if something was fundamentally moral? If not, the term “fundamentally moral” is confused rather than untrue; it’s not that we looked in the closet of fundamental morality and found it empty, but that we were confused and looking in the wrong closet.

BTW, in your post Are Your Enemies Innately Evil?, I think you are making a similar mistake about the concept of evil.
“Innately” is being used in that post in the sense of being a fundamental personality trait or a strong predisposition (as in “Correspondence Bias”, to which that post is a followup). And fundamental personality traits and predispositions do exist — including some that actually do predispose people toward being evil (e.g. sociopathy) — so, although the phrase “innately evil” is a bit dramatic, I find its meaning clear enough in that post’s context that I don’t think it’s a mistake similar to “fundamentally moral”. It’s not arguing about whether there’s a ghostly detachable property called “evil” that’s independent of any normal facts about a person’s mind and history.
When you say “Nothing is fundamentally moral” can you explain what it would be like if something was fundamentally moral?
He did, by implication, in describing what it’s like if nothing is:
There is nothing that would have value if it existed in an isolated universe all by itself that contained no valuers.
Clearly, many of the items on EY’s list, such as fun, humor, and justice, require the existence of valuers. The question above then amounts to whether all items of moral goodness require the existence of valuers. I think the question merits an answer, even if (see below) it might not be the one lukeprog is most curious about.
Or, perhaps more clearly, “intrinsic goodness,” following Korsgaard [...]
Unfortunately, lukeprog changed the terms in the middle of the discussion. Not that there is anything wrong with the new question (and I like EY’s answer).
I don’t think it makes any moral difference whether a paperclip maximizer likes paperclips. What makes moral differences are things like, y’know, life, consciousness, activity, blah blah.
How would a universe shaped by CEV differ from one in which a paperclip maximizer had equipped everyone with the desire to maximize paperclips? And how does a universe with as many discrete conscious entities as possible differ from one with a single universe-spanning consciousness?

If it doesn’t make any difference, then how can we be sure that the SIAI won’t just implement the first fooming AI with whatever terminal goal it pleases?

I don’t see how you can argue that the question “What is right?” is about the state of affairs that will help people to have more fun, and yet claim that you don’t think that “it makes any moral difference whether a paperclip maximizer likes paperclips”.
How would a universe shaped by CEV differ from one in which a paperclip maximizer had equipped everyone with the desire to maximize paperclips? And how does a universe with as many discrete conscious entities as possible differ from one with a single universe-spanning consciousness?
If a paperclip maximizer modified everyone such that we really only valued paperclips and nothing else, and we then ran CEV, then CEV would produce a powerful paperclip maximizer. This is… I’m not going to say it’s a feature, but it’s not a bug, at least. You can’t expect CEV to generate accurate information about morality if you erase morality from the minds it’s looking at. (You could recover some information about morality by looking at history, or human DNA (if the paperclip maximizer didn’t modify that), etc., but then you’d need a strategy other than CEV.)
I don’t think I understand your second question.
I don’t see how you can argue that the question “What is right?” is about the state of affairs that will help people to have more fun, and yet claim that you don’t think that “it makes any moral difference whether a paperclip maximizer likes paperclips”.
That depends on whether the paperclip maximizer is sentient, whether it just makes paperclips or it actually enjoys making paperclips, etc. If those are the case, then its preferences matter… a little. (So let’s not make one of those.)
That depends on whether the paperclip maximizer is sentient, whether it just makes paperclips or it actually enjoys making paperclips, etc.
All those concepts seem to be vague. To be sentient, to enjoy. Do you need to figure out how to define those concepts mathematically before you’ll be able to implement CEV? Or are you just going to let extrapolated human volition decide about that? If so, how can you possibly make claims about how valuable the preferences of a paperclip maximizer are, or how much they matter? Maybe it will all turn out to be wireheading in the end...

What is really weird is that Yudkowsky is using the word “right” in reference to actions affecting other agents, yet doesn’t think that it would be reasonable to assign moral weight to the preferences of a paperclip maximizer.
CEV will decide. In general, it seems unlikely that the preferences of nonsentient objects will have moral value.
Edit: Looking back, this comment doesn’t really address the parent. Extrapolated human volition will be used to determine which things are morally significant. I think it is relatively probable that wireheading might turn out to be morally necessary. Eliezer does think that the preferences of a paperclip maximizer would have moral value if one existed. (If a nonexistent paperclip maximizer had moral worth, so would a nonexistent paperclip minimizer. This isn’t completely certain, because paperclip maximizers might gain moral significance from a property other than existence that is not shared with paperclip minimizers, but at this point, this is just speculation and we can do little better without CEV.) A nonsentient paperclip maximizer probably has no more moral value than a rock with “make paperclips” written on the side.
The reason that CEV is based only on human preferences is that, as humans, we want to create an algorithm that does what is right, and humans are the only things we have that know what is right. If other species have moral value, then humans, if we knew more, would care about them. If there is nothing in human minds that could motivate us to care about some specific thing, then what reason could we possibly have for designing an AI to care about that thing?

Near future: “You are a paperclip maximizer! Kill him!”

What is this supposed to mean?
Paperclips aren’t part of fun, on EY’s account as I understand it, and therefore not relevant to morality or right. If paperclip maximizers believe otherwise they are simply wrong (perhaps incorrigibly so, but wrong nonetheless)… right and wrong don’t depend on the beliefs of agents, on this account.
So those claims seem consistent to me.
Similarly, a universe in which a PM equipped everyone with the desire to maximize paperclips would therefore be a universe with less desire for fun in it. (This would presumably in turn cause it to be a universe with less fun in it, and therefore a less valuable universe.)
I should add that I don’t endorse this view, but it does seem to be pretty clearly articulated/presented. If I’m wrong about this, then I am deeply confused.
If paperclip maximizers believe otherwise they are simply wrong (perhaps incorrigibly so, but wrong nonetheless)… right and wrong don’t depend on the beliefs of agents, on this account.
I don’t understand how someone can arrive at “right and wrong don’t depend on the beliefs of agents”.
I conclude that you use “I don’t understand” here to indicate that you don’t find the reasoning compelling. I don’t find it compelling, either—hence, my not endorsing it—so I don’t have anything more to add on that front.
If those people propose that utility functions are timeless (e.g. the Mathematical Universe), or simply an intrinsic part of the quantum amplitudes that make up physical reality (is there a meaningful difference?), then under that assumption I agree. If beauty can be captured as a logical function, then women.beautiful is right independent of any agent that might endorse that function. The problem of differing tastes and differing aesthetic values, which leads to sentences like “beauty is in the eye of the beholder”, is a result of trying to derive functions from the labeling of relations. There can be different functions that assign the same label to different relations: “x is R-related to y” can be labeled “beautiful”, but so can “xSy”. So while some people talk about the ambiguity of the label “beauty” and conclude that what is beautiful is agent-dependent, other people talk about the set of functions that are labeled as beauty-functions, or that assign the label “beautiful” to certain relations, and conclude that their output is agent-independent.
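A rough sketch of that last distinction (the two predicates below are made-up stand-ins, not anyone’s actual theory of beauty): each function’s output is fixed by its input, and the apparent agent-dependence enters only through the shared, ambiguous label.

```python
# Illustrative sketch: two distinct predicates that different speakers
# both happen to label "beautiful".

def beauty_R(x: str) -> bool:
    # One speaker's beauty-function: symmetry, say.
    return x == x[::-1]

def beauty_S(x: str) -> bool:
    # Another speaker's beauty-function: brevity, say.
    return len(x) <= 3

for item in ["abc", "abba"]:
    # Each function's output is fixed by the item alone; no agent appears anywhere.
    print(item, beauty_R(item), beauty_S(item))
# abc  False True   -> the "disagreement" is about which function the label picks out
# abba True  False
```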
(nods) Yes, I think EY believes that rightness can be computed as a property of physical reality, without explicit reference to other agents.
That said, I think he also believes that the specifics of that computation cannot be determined without reference to humans. I’m not 100% clear on whether he considers that a mere practical limitation or something more fundamental.