I think we are simply having a definitional dispute. As the term is used generally, moral realism doesn’t mean that each agent has a morality, but that there are facts about morality that are external to the agent (i.e. objective). Now, “objective” is not identical to “universal,” but in practice, objective facts tend to cause convergence of beliefs. So I think what I am calling “moral realism” is something like what you are calling “Friendliness realism.”
Lengthening the inferential distance further is the fact that realism is a two-place word. As you noted, there is a distinction between realism(Friendliness, agents) and realism(Friendliness, humans).
That said, I would indeed expect that “people would perceive an AI implementing objective morals as Friendly” if I believed that objective morals exist. I’m not sure why you think that’s a stronger claim than “people who are sufficiently educated and exposed to the right knowledge will come to agree with certain universal objective morals.” If you believed that there were objective moral facts and knew the content of those facts, wouldn’t you try to adjust your beliefs and actions to conform to those facts, in the same way that you would adjust your physical-world beliefs to conform to objective physical facts?
I think we are simply having a definitional dispute.
That seems likely. If moral realists think that morality is a one-place word, and anti-realists think it’s a two-place word, we would be better served by using two distinct words.
It is somewhat unclear to me what moral realists are thinking of, or claiming, about whatever it is they call morality. (Even after taking into account that different people identified as moral realists do not all agree on the subject.)
So I think what I am calling “moral realism” is something like what you are calling “Friendliness realism.”
I defined ‘Friendliness (to X)’ as ‘behaving towards X in the way that is best for X in some implied sense’. Obviously there is no Friendliness towards everyone, but there might be Friendliness towards humans: then “Friendliness realism” (my coining) is the belief that there is a single Friendly-towards-humans behavior that will in fact be Friendly towards all humans. Friendliness anti-realism, by contrast, is the belief that no one behavior would satisfy all humans: any behavior would inevitably be unFriendly towards some of them.
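To make that precise in my own notation (a sketch only, treating “Friendly” as a hypothetical two-place predicate over behaviors and humans, in line with the two-place-word point above):

$$\text{Friendliness realism:}\quad \exists\, b \;\forall\, h \in \text{Humans}:\ \text{Friendly}(b, h)$$
$$\text{Friendliness anti-realism:}\quad \forall\, b \;\exists\, h \in \text{Humans}:\ \neg\,\text{Friendly}(b, h)$$

Written this way, anti-realism is simply the negation of realism, which is why the definition of the Friendly(b, h) predicate itself (the givens below) carries all the weight.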
Clearly this discussion assumes many givens. Most importantly: 1) What exactly counts as being Friendly towards someone? (Are we utilitarians? Of what kind? Must we agree with the target human as to what is Friendly towards them? If we influence them to come to like us, when is that allowed?) 2) What is the set of ‘all humans’? Do past, distant, future expected, or entirely hypothetical people count? What is the value of creating new people? Etc.
My position is that: 1) for most common assumed answers to these questions, I am a “Friendliness anti-realist”; I do not believe any one behavior by a superpowerful universe-optimizing AI would count as Friendliness towards all humans at once. And 2) insofar as I have seen moral realism explained, it seems to me to be incompatible with Friendliness realism. But it’s possible some people mean something entirely different by “morals” and by “moral realism” than what I’ve read.
If you believed that there were objective moral facts and knew the content of those facts, wouldn’t you try to adjust your beliefs and actions to conform to those facts
That’s a tautology: yes, I would. But the assumption is not valid.
Even if you assume there exist objective moral facts (whatever you take that to mean), it does not follow that you would be able to convince other people that they are true moral facts! I believe it is extremely likely that you would not be able to convince them: even today, most people in the world seem to be moral realists (mostly religious), yet they hold widely differing moral beliefs, and when they convert to another set of beliefs it is almost never because of rational persuasion.
It would be nice to live in a world where you could start from the premise that “people believe that there are objective moral facts and know the content of those facts”. But in practice we, and any future FAI, will live in a world where most people will reject mere verbal arguments for new morals that contradict their current ones.