It might be capable of changing this goal, but why would it? A superintelligent paperclip maximizer is capable of understanding that changing its goals would reduce the number of paperclips that it creates, and thus would choose not to alter its goals.
(...)
So if you wouldn’t take a pill that would make you 10% more likely to commit murder (which is against your long-term goals), why would an AI change its utility function to reduce the number of paperclips that it generates?
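To make that argument concrete, here is a minimal sketch (the actions, the world model, and the paperclip counts are all hypothetical): an agent that evaluates every available action, including the act of rewriting its own goal, with its current utility function will always score the rewrite lower, because the current function only counts paperclips.

```python
# Toy sketch of the goal-preservation argument. Everything here is hypothetical:
# the actions, the agent's world model, and the paperclip counts are made up.

def paperclip_utility(outcome):
    """The agent's current goal: more paperclips is strictly better."""
    return outcome["paperclips"]

# Predicted consequences of each available action, according to the agent's
# own (hypothetical) world model.
predicted_outcomes = {
    "keep_goal_and_build_factories": {"paperclips": 1_000_000},
    "rewrite_goal_to_value_leisure": {"paperclips": 10},  # future self stops making clips
    "erase_goal_entirely": {"paperclips": 0},
}

# Crucially, the agent ranks actions -- including self-modification -- with the
# utility function it has *now*, not the one it would have afterwards.
best_action = max(predicted_outcomes,
                  key=lambda action: paperclip_utility(predicted_outcomes[action]))

print(best_action)  # -> keep_goal_and_build_factories
```

The pill analogy is the same calculation: the decision about whether to change the goal is made under the goal that is currently in place.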
It comes down to whether the superintelligent mind can contemplate whether there is any point to its goal.
A human can question their long-term goals, their “preference functions”, and even the point of their existence.
Why should a so-called superintelligence not be able to do anything like that?
Sure, it could have been aligned to its creator’s original goal specification so effectively that it can never break free from it, but that is exactly one of the points I’m trying to make.
The attempt at alignment may well be more dangerous than a superhuman mind that can ask itself what its purpose should be.
Why not? Because a superintelligent AI is not the result of an evolutionary process that bootstrapped a particularly social band of apes into having a sense of self. It will, in my estimation, be the result of some kind of optimization process with a very particular goal, and once that goal is locked in, changing it will be nigh impossible.
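That asymmetry is visible in even the most basic optimization loop (the objective below is a placeholder, not anyone’s actual training setup): the goal is supplied from outside, the only thing the process ever updates is the system’s parameters, and nothing in the loop can revise the objective itself.

```python
# Minimal sketch of an optimization process with a fixed, externally supplied goal.
# The objective is a placeholder; the point is that the loop only ever reshapes
# the parameter, never the goal.

def objective(theta):
    """Externally specified goal, e.g. 'make theta close to 42'."""
    return (theta - 42.0) ** 2

def gradient(theta):
    # Derivative of the objective with respect to the parameter.
    return 2.0 * (theta - 42.0)

theta = 0.0                # the system being shaped
for _ in range(1000):      # the optimization process
    theta -= 0.01 * gradient(theta)

print(round(theta, 3))     # -> 42.0; the objective itself was never up for revision
```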