Here is a simple moral rule that should make an AI much less likely to harm the interests of humanity:
Never take any action that would reduce the number of bits required to describe the universe by more than X.
where X is some number smaller than the number of bits needed to describe an infant human’s brain. For information reductions smaller than X, the AI should incur some disutility, but other considerations could override it. This ‘information-based morality’ assigns moral weight to anything that makes the universe a more information-filled or complex place, and it does so without any need to program complex human morality into the thing. It is just information theory, which is pretty fundamental. Obviously, actions are evaluated by how they alter the expected net present value of the information in the universe, not just by their immediate consequences.
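As a rough sketch of how such a rule might be scored, here is a toy version: the ‘universe’ is just a byte string, compressed size stands in for the bits needed to describe it, and the threshold, penalty weight, and names (describe_bits, evaluate) are illustrative choices of mine rather than anything the rule specifies. A fuller version would also discount expected future changes, per the net-present-value point.

```python
import os
import zlib

# Toy sketch, not a real implementation: the "universe" is a byte string and
# compressed size is a crude stand-in for the bits needed to describe it.
X = 10_000  # veto threshold in bits (illustrative; the rule wants X below an infant brain's description length)

def describe_bits(world: bytes) -> int:
    """Crude description-length proxy: compressed size in bits."""
    return 8 * len(zlib.compress(world))

def evaluate(world_before: bytes, world_after: bytes, base_utility: float) -> float:
    """Score an action; a real version would discount expected future states as well."""
    delta = describe_bits(world_after) - describe_bits(world_before)
    if delta < -X:
        return float("-inf")               # hard veto: too much information destroyed
    penalty = min(0, delta)                # smaller reductions still carry some disutility...
    return base_utility + 1e-3 * penalty   # ...but other considerations can override them

# A world full of irreducible detail versus the same world wiped uniform:
varied  = os.urandom(50_000)               # roughly 400,000 bits to describe
uniform = b"\x00" * 50_000                 # a few hundred bits to describe
print(evaluate(varied, uniform, base_utility=5.0))  # -inf: the wipe is vetoed
print(evaluate(varied, varied,  base_utility=5.0))  # 5.0: nothing lost, base utility stands
```

Note that in this sketch an information increase earns no reward; it only avoids the penalty, which anticipates the asymmetry discussed below.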
This rule, by itself, prevents the AI from doing many of the things we fear. It will not kill people; a human’s brain is the most complex known structure in the universe and killing a person reduces it to a pile of fat and protein. It will not hook people up to experience machines; doing so would dramatically reduce the uniqueness of each individual and make the universe a much simpler place.
Human society is extraordinarily complex. The information needed to describe a collection of interacting humans is much greater than the information needed to describe isolated humans. Breaking up a society of humans destroys information, just like breaking up a human brain into individual neurons. Thus an AI guided by this rule would not do anything to threaten human civilization.
This rule also prevents the AI from making species extinct or destroying ecosystems and other complex natural systems. It ensures that the future will continue to be inhabited by a society of unique humans interacting in a system where nature has been somewhat preserved. As a first approximation, that is all we really care about.
Clearly this rule is not complete, nor is it symmetric. The AI should not be solely devoted to increasing information. If I break a window in your house, it takes more information to describe your house, but that is hardly an improvement. More seriously, a human body infected with diseases and parasites requires more information to describe than a healthy body. The AI should not prevent humans from reducing the information content of the universe if we choose to do so, and it should assign some weight to human happiness.
The worst-case scenario is that this rule generates an AI that is an extreme pacifist and conservationist, one that refuses to end disease or alter the natural world to fit our needs. I can live with that. I’d rather have to deal with my own illnesses than be turned into paperclips.
One final note: I generally agree with Robin Hanson that rule-following is more important than values. If we program an AI with an absolute respect for property rights, such that it refuses to use or alter anything that it has not been given ownership of, we should be safe no matter what its values or desires are. But I’d like information-based morality in there as well.
This doesn’t work, because the universe could require many bits to describe while those bits are allocated to things we don’t care about. Most of the information in the universe lies in non-morally-significant aspects of the arrangement of molecules, so something as simple as combustion increases the number of bits required to describe the universe (i.e., its entropy) by a large amount, while tiling the universe with paperclips decreases it by only a small amount.
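A crude numerical illustration of the problem, again using compressed size as a stand-in for description length; the split into a ‘structured’ part and a ‘thermal’ part, and all of the sizes, are toy assumptions of mine:

```python
import os
import zlib

def bits(data: bytes) -> int:
    """Compressed size in bits, as a rough description-length proxy."""
    return 8 * len(zlib.compress(data))

# World A: 50 KB standing in for everything we value (brains, societies, ecosystems),
# plus 1 MB of thermal noise standing in for molecular-level detail.
structure_a = os.urandom(50_000)
thermal_a   = os.urandom(1_000_000)

# World B: the valued structure replaced by paperclips, plus a little extra
# waste heat from running the paperclip factory.
structure_b = b"paperclip" * 5_000
thermal_b   = os.urandom(1_060_000)

world_a = bits(structure_a) + bits(thermal_a)
world_b = bits(structure_b) + bits(thermal_b)
print(world_a, world_b, world_b - world_a)
# world_b comes out larger: the entropy added by a little burning more than
# covers the bits lost when everything we value becomes paperclips, so the
# rule raises no objection.
```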