AIs will want to preserve themselves, as destruction would prevent them from further influencing the world to achieve their goals.
Would an AI sacrifice itself to preserve the functional status of two other AIs from its copy clan with similar goals?
Unless an AI is specifically programmed to preserve what humans value, it may destroy those valued structures (including humans) incidentally. As Yudkowsky (2008a) puts it, “the AI does not love you, nor does it hate you, but you are made of atoms it can use for something else.”
Another possibility is, rather than trying to alter the AI's values, to alter the environment so that the AI realises that working against human values is likely to be counterproductive to achieving its own goals. The AI doesn't have to share human values—it need only understand them and rationally appreciate the consequences of working against them.