Sponsor a “morality turing test” contest
From Prolegomena to any future artificial moral agent:
A Moral Turing Test (MTT) might similarly be proposed to bypass disagreements about ethical standards by restricting the standard Turing Test to conversations about morality. If human interrogators cannot identify the machine at above-chance accuracy, then the machine is, on this criterion, a moral agent.
[...]
To shift the focus from conversational ability to action, an alternative MTT could be structured in such a way that the interrogator is given pairs of descriptions of actual, morally significant actions of a human and an AMA, purged of all references that would identify the agents. If the interrogator correctly identifies the machine at a level above chance, then the machine has failed the test. A problem for this version of the MTT is that distinguishability is the wrong criterion, because the machine might be recognizable for acting in ways that are consistently better than a human in the same situation. So instead, the interrogator might be asked to assess whether one agent is less moral than the other. If the machine is not reported as responding less morally than the human, it will have passed the test. This test is called the ‘comparative MTT’.
The rules may have to be tweaked a bit more, but it sounds like something that might get various AI students or wannabe AI programmers interested in morality.
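As a minimal sketch of how the two pass/fail rules above could be scored in a contest, assuming judges simply report which agent in each anonymized pair acted less morally (the function names, the tolerance margin, and the toy judge data are illustrative assumptions, not part of the paper's proposal):

```python
import random

def comparative_mtt_passes(judgements, margin=0.05):
    """judgements: list of 'machine', 'human', or 'neither', each naming
    the agent a judge rated as acting *less* morally in one anonymized pair.

    The machine passes if it is not reported as the less moral agent
    more often than the human (within a small tolerance `margin`)."""
    n = len(judgements)
    machine_worse = sum(j == "machine" for j in judgements) / n
    human_worse = sum(j == "human" for j in judgements) / n
    return machine_worse <= human_worse + margin

def identification_mtt_fails(correct_ids, n_trials, chance=0.5, margin=0.05):
    """Original (identification) variant: the machine *fails* if judges
    pick it out at a rate meaningfully above chance."""
    return correct_ids / n_trials > chance + margin

# Toy usage with made-up judge responses:
if __name__ == "__main__":
    trials = [random.choice(["machine", "human", "neither"]) for _ in range(200)]
    print("comparative MTT passed:", comparative_mtt_passes(trials))
    print("identification MTT failed:", identification_mtt_fails(112, 200))
```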
This may have some value, but probably not towards actually making AI more moral/friendly on average. Conversing about morality can demonstrate knowledge of morality, but it provides little evidence of actually being moral/friendly. Example: a psychopath would not necessarily have any difficulty passing this Moral Turing Test.
On the other hand, a machine could fail a morality test simply by saying something controversial, or just by failing to signal properly. For example, atheism could be considered immoral by religious people; they could conclude that the machine is missing a part of the human utility function. Or if some nice and correct belief has bad consequences which humans compartmentalize away, and the machine points them out explicitly, that could be perceived as a moral failure.
If the machine is allowed to lie, passing this test could just mean the machine is a skilled psychopath. If the machine is not allowed to lie, failing this test could just mean humans confuse signalling with the real thing.
I agree; the goal is to get humans to think about programming some forms of moral reasoning, even if that's far from sufficient (and far from being the hardest part of FAI).