I do wonder what would constitute “good moral consequences” in this context. If it’s being defined as the practical extension of goodwill, or of its tangible signs, then the argument seems very nearly tautological.
Not to put too fine a point on it, but part of Rorty’s argument seems to be that if you don’t already have a reasonably good sense for what “good moral consequences” would be, then you’re part of the problem. Rorty claims that philosophical ethics has been largely concerned with explaining to “psychopaths” like Thrasymachus and Callicles (the sophists in Plato’s dialogues who argue that might makes right) why they would do better to be moral; but that the only way for morality to win out in the real world is to avoid bringing agents into existence that lack moral sentiment:
It would have been better if Plato had decided, as Aristotle was to decide, that there was nothing much to be done with people like Thrasymachus and Callicles, and that the problem was how to avoid having children who would be like Thrasymachus and Callicles.
As far as I can tell, this fits perfectly into the FAI project, which is concerned with bringing into existence superhuman AI that does have a sense of “good moral consequences” before someone else creates one that doesn’t.
You can’t write an algorithm based on “if you don’t get it, you’re part of the problem”. You can get away with telling that to your children, sort of, but only because children are very good at synthesizing behavioral rules from contextual cues. Rorty’s advice might be useful as a practical guide to making moral humans, but it only masks the underlying issue: if the only way for morality to win in the real world is to avoid bringing amoral agents into existence, then there must already exist a well-bounded set of moral utility functions for agents to follow. It doesn’t tell us much about what such a set might contain, giving only a loose suggestion that good morality functions tend to be relatively subject-independent.
Now, to encode a member of such a set into an AI (which may or may not end up being Friendly depending on how well those functions generalize outside the human problem domain), you need a formalization of it. To teach one implicitly, you need a formalization of something analogous (but not necessarily identical) to the social intuitions that human children use to derive their morals, which is most likely a harder problem. And if you have such a formalization, explaining an instance of moral behavior to a rational sociopath is as easy as running it on particular inputs.
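To make that last point concrete, here is a purely hypothetical sketch (the utility function, weights, and option names are invented for illustration, not drawn from any actual FAI proposal): once a moral utility function is formalized, "explaining" a moral choice reduces to evaluating the function over the available options and reporting the ranking, with no appeal to moral sentiment required.

```python
# Hypothetical sketch: a toy, subject-independent moral utility function.
# Everything here is an illustrative assumption, not a real formalization.

from dataclasses import dataclass

@dataclass
class Outcome:
    harm_done: float        # expected harm to others
    benefit_to_self: float  # expected benefit to the acting agent

def moral_utility(outcome: Outcome) -> float:
    # Deliberately crude stand-in: weight harm to others far more heavily
    # than benefit to the agent, regardless of who the agent is.
    return outcome.benefit_to_self - 10.0 * outcome.harm_done

def explain_choice(options: dict[str, Outcome]) -> str:
    # "Running the formalization on particular inputs": score each option
    # and report the ranking rather than appealing to moral intuition.
    scored = sorted(options.items(),
                    key=lambda kv: moral_utility(kv[1]),
                    reverse=True)
    report = [f"{name}: utility = {moral_utility(outcome):.1f}"
              for name, outcome in scored]
    return "Chosen: " + scored[0][0] + "\n" + "\n".join(report)

if __name__ == "__main__":
    print(explain_choice({
        "keep the promise": Outcome(harm_done=0.0, benefit_to_self=1.0),
        "break the promise": Outcome(harm_done=2.0, benefit_to_self=3.0),
    }))
```

The point of the sketch is only that the explanation is mechanical once the formalization exists; all of the actual difficulty lives in writing `moral_utility` so that it generalizes outside the human problem domain.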
Presented with an irrational sociopath, you're out of luck, but I can't think of any ethical systems that don't have that problem.