But importantly, when Eliezer says something is “good” he doesn’t mean quite the same thing I mean when I say something is “good.” We actually speak slightly different languages in which the word “good” has slightly different meanings.
In http://lesswrong.com/lw/t0/abstracted_idealized_dynamics/mgr, user steven wrote: “When X (an agent) judges that Y (another agent) should Z (take some action, make some decision), X is judging that Z is the solution to the problem W (perhaps increasing a world’s measure under some optimization criterion), where W is a rigid designator for the problem structure implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments. (Or at least X is asserting that it’s shared.) Due to the nature of W, becoming informed will cause X and Y to get closer to the solution of W, but wanting-it-when-informed is not what makes that solution moral.” Eliezer agreed with this formulation.
This means that, even though people might presently have different things in mind when they say something is “good”, Eliezer does not regard their/our/his present ideas as either the meaning of their-form-of-good or his-form-of-good. The meaning of good is not “the things someone/anyone personally, presently finds morally compelling”, but something like “the fixed facts that are found but not defined by clarifying the result of applying the shared human evaluative cognitive machinery to a wide variety of situations under reflectively ideal conditions of information.” That is to say, Eliezer thinks, not only that moral questions are well defined, “objective”, in a realist or cognitivist way, but that our present explicit-moralities all have a single, fixed, external referent which is constructively revealed via the moral computations that weigh our many criteria.
I haven’t finished reading CEV, but here’s a quote from Levels of Organization that seems relevant: “The target matter of Artificial Intelligence is not the surface variation that makes one human slightly smarter than another human, but rather the vast store of complexity that separates a human from an amoeba”. Similarly, the target matter of inferences that figure out the content of morality is not the surface variation of moral intuitions and beliefs under partial information which result in moral disagreements, but the vast store of neural complexity that allows humans to disagree at all, rather than merely be asking different questions.
So the meaning of presently-acted-upon-and-explicitly-stated-rightness in your language, and the meaning of it in my language might be different, but one of the many points of the meta-ethics sequence is that the expanded-enlightened-mature-unfolding of those present usages gives us a single, shared, expanded-meaning in both our languages.
If you still think that moral relativism is a good way to convey that in daily language, fine. It seems the most charitable way in which he could be interpreted as a relativist is if “good” is always in quotes, to denote the present meaning a person attaches to the word. He is a “moral” relativist, and a moral realist/cognitivist/constructivist.
Hm, that sounds plausible, especially your last paragraph. I think my problem is that I don’t see any reason to suspect that the expanded-enlightened-mature-unfolding of our present usages will converge in the way Eliezer wants to use as a definition. See for instance the “repugnant conclusion” debate; people like Peter Singer and Robin Hanson think the repugnant conclusion actually sounds pretty awesome, while Derek Parfit thinks it’s basically a reductio on aggregate utilitarianism as a philosophy and I’m pretty sure Eliezer agrees with him, and has more or less explicitly identified it as a failure mode of AI development. I doubt these are beliefs that really converge with more information and reflection.
Or in steven’s formulation, I suspect that relatively few agents actually have Ws in common; his definition presupposes that there’s a problem structure “implicitly defined by the machinery shared by X and Y which they both use to make desirability judgments”. I’m arguing that many agents have sufficiently different implicit problem structures that, for instance, by that definition Eliezer and Robin Hanson can’t really make “should” statements to each other.
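Steven’s formulation is almost pseudocode already. Here is a toy sketch of it (everything in this snippet, from `make_should` to the `W_shared` welfare criterion, is my own illustrative invention, not anything from the thread), which also shows where my objection bites: when two agents do not share a W, the “should” predicate simply is not defined between them.

```python
# Toy model of steven's formulation: a "should" claim from agent X to
# agent Y is only well-defined when both agents share the same problem
# structure W (the machinery they use to make desirability judgments).

def make_should(W):
    """Return a 'should' predicate rigidly bound to problem structure W."""
    def should(agent_x, agent_y, action, options):
        # X's claim "Y should do `action`" asserts that `action` is the
        # best solution to W among the options -- and it presupposes
        # that X and Y actually share W in the first place.
        if agent_x["W"] is not W or agent_y["W"] is not W:
            raise ValueError("no shared W: 'should' is not defined here")
        return action == max(options, key=W)
    return should

# A shared problem structure: prefer outcomes with more total welfare.
W_shared = lambda outcome: outcome["welfare"]

x = {"name": "X", "W": W_shared}
y = {"name": "Y", "W": W_shared}
z = {"name": "Z", "W": lambda outcome: -outcome["welfare"]}  # different W

should = make_should(W_shared)
options = [{"welfare": 1}, {"welfare": 5}]

print(should(x, y, {"welfare": 5}, options))  # shared W: well-defined
try:
    should(x, z, {"welfare": 5}, options)
except ValueError as e:
    print(e)  # my objection in miniature: no shared W between X and Z
```

The design choice worth noticing is that W is closed over (rigidly designated) rather than looked up from either agent at call time, matching steven’s insistence that wanting-it-when-informed is not what makes the solution moral.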
Just getting citations out of the way, Eliezer talked about the repugnant conclusion here and here. He argues for shared W in Psychological Unity and Moral Disagreement. Kaj Sotala wrote a notable reply to Psychological Unity, Psychological Diversity. Finally Coherent Extrapolated Volition is all about finding a way to unfold present-explicit-moralities into that shared-should that he believes in, so I’d expect to see some arguments there.
Now, doesn’t the state of the world today suggest that human explicit-moralities are close enough that we can live together in a Hubble volume without too many wars, without a thousand broken coalitions of support over sides of irreconcilable differences, without blowing ourselves up because the universe would be better with no life than with the evil monsters in that tribe on the other side of the river?
Human concepts are similar enough that we can talk to each other. Human aesthetics are similar enough that there’s a billion-dollar video game industry. Human emotions are similar enough that Macbeth is still being produced four hundred years later on the other side of the globe. We have the same anatomical and functional regions in our brains. Parents everywhere use baby talk. On all six populated continents there are countries in which more than half of the population identifies with the Christian religions.
For all those similarities, is humanity really going to be split over the Repugnant Conclusion? Even if the Repugnant Conclusion is more of a challenge than muscling past a few cognitive biases (scope insensitivity and the attribute substitution heuristic are also universal), I think we have some decent prospect for a future in which you don’t have to kill me. Whatever will help us to get to that future, that’s what I’m looking for when I say “right”. No matter how small our shared values are once we’ve felt the weight of relevant moral arguments, that’s what we need to find.
This comment may be a little scattered; I apologize. (In particular, much of this discussion is beside the point of my original claim that Eliezer really is a meta-ethical relativist, about which see my last paragraph).
I certainly don’t think we have to escalate to violence. But I do think there are subjects on which we might never come to agreement even given arbitrary time and self-improvement and processing power. Some of these are minor judgments; some are more important. But they’re very real.
In a number of places Eliezer commented that he’s not too worried about, say, two systems morality_1 and morality_2 that differ in the third decimal place. I think it’s actually really interesting when they differ in the third decimal place; it’s probably not important to the project of designing an AI but I don’t find that project terribly interesting so that doesn’t bother me.
But I’m also more willing to say to someone, “We have nothing to argue about [on this subject]; we are only different optimization processes.” With most of my friends I really do have to say this, as far as I can tell, on at least one subject.
However, I really truly don’t think this is as all-or-nothing as you or Eliezer seem to paint it. First, because while morality may be a compact algorithm relative to its output, it can still be pretty big, and disagreeing seriously about one component doesn’t mean you don’t agree about the other several hundred. (A big sticking point between me and my friends is that I think getting angry is in general deeply morally blameworthy, whereas many of them believe that failing to get angry at outrageous things is morally blameworthy; and as far as I can tell this is more or less irreducible in the specification for all of us). But I can still talk to these people and have rewarding conversations on other subjects.
Second, because I realize there are other means of persuasion than argument. You can’t argue someone into changing their terminal values, but you can often persuade them to do so through literature and emotional appeal, largely due to psychological unity. I claim that this is one of the important roles that story-telling plays: it focuses and unifies our moralities through more-or-less arational means. But this isn’t an argument per se and has no particular reason one would expect it to converge to a particular outcome—among other things, the result is highly contingent on what talented artists happen to believe. (See Rorty’s Contingency, Irony, and Solidarity for discussion of this).
Humans have a lot of psychological similarity. They also have some very interesting and deep psychological variation (see e.g. Haidt’s work on the five moral foundations). And it’s actually useful to a lot of societies to have variation in moral systems—it’s really useful to have some altruistic punishers, but not really for everyone to be an altruistic punisher.
But really, this is beside the point of the original question, whether Eliezer is really a meta-ethical relativist, because the limit of this sequence which he claims converges isn’t what anyone else is talking about when they say “morality”. Because generally, “morality” is defined more or less to be a consideration that would/should be compelling to all sufficiently complex optimization processes. Eliezer clearly doesn’t believe any such thing exists. And he’s right.
A big sticking point between me and my friends is that I think getting angry is in general deeply morally blameworthy, whereas many of them believe that failing to get angry at outrageous things is morally blameworthy
Your friends can understand why humans have positive personality descriptors for people who don’t get angry in various situations: descriptors like reflective, charming, polite, solemn, respectful, humble, tranquil, agreeable, open-minded, approachable, cooperative, curious, hospitable, sensitive, sympathetic, trusting, merciful, gracious.
You can understand why we have positive personality descriptors for people who do get angry in various situations: descriptors like impartial, loyal, decent, passionate, courageous, bold, commanding, strong, resilient, candid, vigilant, independent, reputable, and dignified.
Both you and your friends can see how either group could pattern match their behavioral bias as being friendly, supportive, mature, disciplined, or prudent.
These are not deep variations, they are relative strengths of reliance on the exact same intuitions.
You can’t argue someone into changing their terminal values, but you can often persuade them to do so through literature and emotional appeal, largely due to psychological unity. I claim that this is one of the important roles that story-telling plays: it focuses and unifies our moralities through more-or-less arational means. But this isn’t an argument per se and has no particular reason one would expect it to converge to a particular outcome—among other things, the result is highly contingent on what talented artists happen to believe.
Stories strengthen our associations of different emotions in response to analogous situations, which doesn’t have much of a converging effect (Edit: unless, you know, it’s something like the Bible, which a billion people read. That certainly pushes humanity in some direction), but they can also create associations to moral evaluative machinery that previously wasn’t doing its job. There’s nothing arational about this: neurons firing in the inferior frontal gyrus are evidence relevant to a certain useful categorizing inference, “things which are sentient”.
Because generally, “morality” is defined more or less to be a consideration that would/should be compelling to all sufficiently complex optimization processes
I’m not in a mood to argue definitions, but “optimization process” is a very new concept, so I’d lean toward “less”.
You’re...very certain of what I understand. And of the implications of that understanding.
More generally, you’re correct that people don’t have a lot of direct access to their moral intuitions. But I don’t actually see any evidence for the proposition that they should converge sufficiently, other than a lot of handwaving about the fundamental psychological similarity of humankind, which is more-or-less true but probably not true enough. In contrast, I’ve seen lots of people with deeply, radically separated moral beliefs, enough so that it seems implausible that these are all attributable to computational error.
I’m not disputing that we share a lot of mental circuitry, or that we can basically understand each other. But we can understand without agreeing, and be similar without being the same.
As for the last bit—I don’t want to argue definitions either. It’s a stupid pastime. But to the extent Eliezer claims not to be a meta-ethical relativist he’s doing it purely through a definitional argument.
He does intend to convey something real and nontrivial (well, some people might find it trivial, but enough people don’t that it is important to be explicit) by saying that he is not a meta-ethical relativist. The basic idea is that, while his brain is the causal reason for him wanting to do certain things, it is not referenced in the abstract computation that defines what is right. To use a metaphor from the meta-ethics sequence, it is a fact about a calculator that it is computing 1234 * 5678, but the fact that 1234 * 5678 = 7,006,652 is not a fact about that calculator.
This distinguishes him from some types of relativism, which I would guess to be the most common types. I am unsure whether people understand that he is trying to draw this distinction and still think it is misleading to say that he is not a moral relativist, or whether they are confused or have a different explanation for why he does not identify as a relativist.
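The calculator metaphor can be made concrete with a toy sketch (the two functions below are my own illustrative stand-ins for physically different calculators, not anything from the sequence):

```python
# Two physically different "calculators" that compute the same product.
def calculator_a(x, y):
    return x * y                       # native multiply

def calculator_b(x, y):
    return sum(x for _ in range(y))    # repeated addition

# That calculator_a happens to be running the query 1234 * 5678 is a
# fact about calculator_a. The answer both devices arrive at is a fact
# about multiplication itself, referenced by but not about either one:
assert calculator_a(1234, 5678) == calculator_b(1234, 5678) == 7006652
```

The analogy: deleting either calculator changes nothing about what 1234 * 5678 equals, just as (on this view) deleting any particular brain changes nothing about what the abstract computation named by “right” outputs.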
In contrast, I’ve seen lots of people with deeply, radically separated moral beliefs, enough so that it seems implausible that these all are attributable to computational error.
The claim wasn’t that it happens too often to attribute to computation error, but that the types of differences seem unlikely to stem from computational errors.
Calling something a terminal value is the default behavior when humans look for a justification and don’t find anything. This happens because we perceive little of our own mental processes and in the absence of that information we form post-hoc rationalizations. In short, we know very little about our own values. But that lack of retrieved / constructed justification doesn’t mean it’s impossible to unpack moral intuitions into algorithms so that we can more fully debate which factors we recognize and find relevant.
Do you know anyone who never makes computational errors? If ‘mistakes’ happen at all, we would expect to see them in cases involving tribal loyalties. See von Neumann and those who trusted him on hidden variables.