Most of the discussions at LW about Friendly AI seem to concern defining clear moral rules to program into AIs, but there’s another concern about the plausibility of enduring Friendly AI. It’s not a stretch to assume AIs will be capable of some form of self-modification—if not to themselves, then to copies of themselves they make—and even if it’s not their “intention” to do so, copying is never perfect, so some analogy to evolution will produce versions of AI that are longer-lived or faster-reproducing if they’ve mutated away their friendliness-constraints. In other words, it’s very difficult to see how we can force the very structure of future AI by its nature to be dependent on being nice to humans—friendliness would seem to be at best an irrelevant property to the success of AIs, so eventually we’d expect non-friendly ones to appear. (Cancerous AIs?) And since this will be occurring post-Singularity, we will have little hope of anticipating or understanding such developments.
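For concreteness, here is a minimal toy simulation of the selection argument above (the one-way mutation assumption and all parameters are mine, purely illustrative): even if friendliness is selectively neutral, imperfect copying alone erodes it whenever the constraint can be lost but not regained.

    # Toy model: replicators whose "friendliness" flag is irrelevant to fitness
    # but can be lost through copying errors. Parameters are arbitrary.
    import random

    def simulate(generations=200, pop_size=1000, mutation_rate=0.01, seed=0):
        random.seed(seed)
        population = [True] * pop_size  # start with an all-friendly population
        for _ in range(generations):
            offspring = []
            for parent in population:
                child = parent
                # Imperfect copying: the friendliness constraint is occasionally lost.
                # (Assume the reverse mutation, regaining it, is negligible.)
                if child and random.random() < mutation_rate:
                    child = False
                offspring.append(child)
            # Friendliness confers no replication advantage here, so the next
            # generation is just a uniform resample of the offspring.
            population = random.choices(offspring, k=pop_size)
        return sum(population) / pop_size

    if __name__ == "__main__":
        print(f"Friendly fraction after 200 generations: {simulate():.2%}")

With these arbitrary numbers the friendly fraction drifts down toward 0.99^200 ≈ 13%; give the unfriendly variants even a slight replication advantage and the decline is far faster.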
Most of the discussions at LW about Friendly AI seem to concern defining clear moral rules to program into AIs
I don’t think this is true at all. In my understanding, the general consensus is that it would not be efficacious to try to impose rules on something that is vastly smarter than you and capable of self-modification. You want it to want to be “friendly” so that it will not (intentionally) change itself away from “friendliness”.
It’s not a stretch to assume AIs will be capable of some form of self-modification
I agree. Isn’t that the basis for AI-based singularitarianism?
and even if it’s not their “intention” to do so, copying is never perfect,
I’m pretty sure Eliezer has put a lot of thought into the importance of goal-preserving modification and reproduction.
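A toy sketch of what goal-preserving self-modification might look like at the code level (the names and the test-battery check are my own hypothetical illustration, not anything from Eliezer’s writings): an agent accepts a rewritten version of itself only if the candidate’s goal function demonstrably agrees with its current one.

    # Hypothetical illustration of a goal-preservation check: a candidate
    # self-modification is adopted only if its goal function scores a battery
    # of test outcomes exactly as the current goal function does.
    def current_goal(outcome):
        # Stand-in terminal goal: value outcomes by their human-welfare score.
        return outcome.get("human_welfare", 0)

    def accept_modification(goal, candidate_goal, test_outcomes):
        """Accept the candidate only if it agrees with the current goal on every test."""
        return all(goal(o) == candidate_goal(o) for o in test_outcomes)

    # A candidate that has drifted toward a different goal is rejected.
    tests = [{"human_welfare": 1}, {"human_welfare": 0}, {}]
    drifted = lambda outcome: outcome.get("paperclips", 0)
    print(accept_modification(current_goal, drifted, tests))  # False

A real agent would of course need something far stronger than spot checks; the sketch is only meant to make the term concrete.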
And since this will be occurring post-Singularity, we will have little hope of anticipating or understanding such developments.
As “Singularity” is partly defined (by some) as being a point after which one can’t make useful predictions, one should beware of the circular implication here.