I wonder what is meant here by ‘moral agents’? It is clear that SimplexAI-m believes that both it and humans are moral agents, and this seems like a potential place to criticise SimplexAI-m’s moral reasoning. (Note that I am biased here, as I do not think that moral agents, as they are commonly understood, exist.)
Having said that, this is a very interesting discussion. There would seem to be a risk that even if there are no moral facts to uncover about the world, an entity, no matter how intelligent, could believe itself to have discovered such facts. And then we could be in the same kind of trouble outlined above.
The reason I mention this is that I am not clear how an AI could ever have unbiased reasoning. Humans, as outlined on LessWrong, are bundles of biases and wrong thinking, and intelligence is not really the factor that overcomes this: very smart people hold very different views on religion, morality, AI x-risk … A super-intelligence may well have similar issues. And, if it believes itself to be super-intelligent, it may be even less able to break out of them.
So while my views on AI x-risk are … well, sceptical/uncertain … this is a very interesting contribution to my thinking. Thanks for writing it. :)
Moral agents are as in standard moral philosophy.
I do think that “moral realism” could be important even if moral realism is technically false; if the world is mostly what would be predicted if moral realism were true, then that has implications, e.g. agents being convinced of moral realism, and bounded probabilistic inference leading to moral realist conclusions.
Would an AI believe itself to have free will? Without free will, it is, imo, difficult to accept that moral agents exist as currently conceived. (This is my contention.) It might, of course, construct the idea of a moral agent somewhat differently, or agree with those who see free will as irrelevant to the idea of moral agents. It is also possible that it might see itself as a moral agent but not see humans as such (rather as we see animals). It might still see us as worthy of moral consideration, however.
Reconciling free will with physics is a basic part of the decision theory problem. See MIRI work on the topic and my own theoretical write-up.
Interesting. I have not looked at things like this before. I am not sure that I am smart enough or knowledgeable enough to understand the MIRI stuff or your own paper, at least not on a first reading.