This may have some value, but probably not towards actually making AI more moral/friendly on average. Conversing about morality can demonstrate knowledge of morality, but provides little evidence of actually being moral/friendly. Example: a psychopath would not necessarily have any difficulty passing this Moral Turing Test.
On the other hand, a machine could fail a morality test simply by saying something controversial, or just by failing to signal properly. For example, atheism could be considered immoral by religious people; they could conclude that the machine is missing a part of the human utility function. Or if some nice and correct belief has bad consequences, but humans compartmentalize it away and the machine would point it out explicitly, that could be perceived as a moral failure.
If the machine is allowed to lie, passing this test could just mean the machine is a skilled psychopath. If the machine is not allowed to lie, failing this test could just mean humans confuse signalling with the real thing.
I agree, the goal is to get humans to think about programming some forms of moral reasoning, even if it’s far from sufficient (and it’s far from being the hardest part of FAI).