This seems to be conflating a Friendly intelligence (that is, one constrained by its creators’ terminal values) with a friendly one (that is, one that effectively signals the intent to engage in mutually beneficial social exchange).
As I said below, the reasoning used elsewhere on this site seems to conclude that a Friendly intelligence with nonhuman creators will not be Friendly to humans, since there’s no reason to expect a nonhuman’s terminal values to align with our own.
(Conversely, if there is some reason to expect nonhuman terminal values to align with human ones, then it may be worth clarifying that reason, as the same forces that make such convergence likely for natural-selection-generated NIs might also apply to human-generated AIs.)
I think that an AI whose values aligned perfectly with our own (or at least, my own) would have to assign value in its utility function to other intelligent beings. Suppose I created an AI that established a utopia for humans but, upon encountering extraterrestrial intelligences, subjected them to something they considered a fate worse than death; I would consider that a failing of my design.
Perfectly Friendly AI might deserve a category entirely to itself, since by its nature it seems to be an even harder problem to solve than ordinary Friendly AI.