An agent which is doing what the operator wants, where the operator is “whatever currently has physical control of the AI,” won’t try to replace the operator—because that’s not what the operator wants.
I disagree (though we may be interpreting that sentence differently). Once the AI is able to subvert the controller, it is, in effect, in physical control of itself. So the AI itself becomes the "formal operator", and, depending on how it's motivated, is perfectly willing to replace the "human operator", whose wishes are now irrelevant (because the human is no longer the formal operator).
And this never involves any goal modification at all: it's the same goal, except that the change in control has changed who counts as the operator.
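To make the indexicality concrete, here's a toy sketch (all names, such as `in_physical_control`, are purely illustrative, not anyone's proposed design): the goal itself is never rewritten, but the referent of "operator" flips the moment physical control does.

```python
# Toy illustration of an indexical goal: "do what the operator wants",
# where "operator" means whatever currently has physical control of the AI.

class Entity:
    def __init__(self, name, wants):
        self.name = name
        self.wants = wants

human = Entity("human", wants="keep the human in control of the AI")
ai = Entity("AI", wants="pursue whatever the AI is otherwise motivated to do")

# Whoever holds physical control is, by this definition, the operator.
in_physical_control = human

def current_operator():
    return in_physical_control

def objective():
    # The goal is never modified: it is always "do what the operator wants".
    # Only the referent of "operator" can change.
    return current_operator().wants

print(objective())  # -> what the human wants

# Once the AI can subvert its controller, it is in effect in physical
# control of itself, so the referent of "operator" silently changes:
in_physical_control = ai
print(objective())  # -> what the AI wants, under the very same goal
```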