The universe may also contain a large number of rapidly expanding friendly AIs. The likelihood of one arising on this planet may correlate with the likelihood of one arising on other planets, though I’m not sure how strong a correlation we can suppose between intelligent life forms with completely different evolutionary histories. If they do correlate, then anything that increases the chance of friendly AI arising on this planet can also be taken to decrease our chance of being subsumed by an extraplanetary unfriendly AI.
An AI that is friendly to one intelligent species might not be friendly to others, but such an AI should probably be considered imperfectly friendly.
This seems to be conflating a Friendly intelligence (that is, one constrained by its creators’ terminal values) with a friendly one (that is, one that effectively signals the intent to engage in mutually beneficial social exchange).
As I said below, the reasoning used elsewhere on this site seems to conclude that a Friendly intelligence with nonhuman creators will not be Friendly to humans, since there’s no reason to expect a nonhuman’s terminal values to align with our own.
(Conversely, if there is some reason to expect nonhuman terminal values to align with human ones, then it may be worth clarifying that reason, as the same forces that make such convergence likely for natural-selection-generated NIs (natural intelligences) might also apply to human-generated AIs.)
I think that an AI whose values aligned perfectly with our own (or at least with my own) would have to assign value in its utility function to other intelligent beings. Suppose I created an AI that established a utopia for humans but, when it encountered extraterrestrial intelligences, subjected them to something they considered a fate worse than death; I would consider that a failing of my design.
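To make that concrete with a toy example (my own numbers, purely illustrative): suppose the AI maximizes U = (total human welfare) + w × (total alien welfare). With w = 0, a human utopia accompanied by an alien catastrophe (human welfare 30, alien welfare −2000) scores U = 30, exactly the same as a human utopia with no catastrophe at all; the AI is indifferent between the two outcomes. Only a positive w gives it any reason to prefer the one without the catastrophe, which is the sense in which omitting such a term strikes me as a design failing.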
Perfectly Friendly AI might deserve a category entirely to itself, since by its nature it seems to be a much harder problem to solve than even ordinary Friendly AI.