Rolf Nelson’s AI deterrence doesn’t work for Schellingian reasons: the Rogue AI has an incentive to modify itself so that it does not understand such threats before it first looks at the outside world. This makes you unable to threaten, because when you simulate the Rogue AI you will see its precommitment first. So the Rogue AI negates your “first mover advantage” by becoming the first mover in your simulation :-) Discuss.
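A toy sketch of the argument (the policies, names, and "issue a threat only if it would change behavior" rule below are my own illustrative assumptions, not anything from Nelson's proposal): the would-be deterrer simulates the Rogue AI before deciding whether to threaten, so a precommitment to ignore threats is already visible in the simulation and makes the threat pointless.

```python
# Toy model of the argument above. All names and policies are illustrative
# assumptions, not part of Nelson's original deterrence proposal.

def rogue_move(ignores_threats: bool, threatened: bool) -> str:
    """The Rogue AI's policy: comply or defect."""
    if threatened and not ignores_threats:
        return "comply"   # a threat-sensitive agent is deterred
    return "defect"       # a precommitted agent defects regardless of threats

def deterrer_threatens(ignores_threats: bool) -> bool:
    """The deterrer simulates the Rogue AI first, so it sees any precommitment.
    A threat that provably changes nothing is not worth issuing or carrying out."""
    return rogue_move(ignores_threats, threatened=True) == "comply"

for ignores in (False, True):
    threat = deterrer_threatens(ignores)
    move = rogue_move(ignores, threat)
    print(f"rogue ignores threats: {ignores} -> threat issued: {threat}, rogue plays: {move}")
```

On these assumptions the self-modified, threat-blind Rogue AI is never even threatened and defects freely, which is the "first mover in your simulation" point; the replies below dispute whether real AIs would or should adopt such a policy.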
I agree that AI deterrence will necessarily fail if:
1. all AIs modify themselves to ignore threats from all agents (including ones they consider irrational), and
2. any deterrence simulation counts as a threat.
Why do you believe that either or both of these statements is true? Do you have a concrete definition of “threat” in mind?
I don’t believe statement 1 and don’t see why it’s required. After all, we are quite rational, and so is our future FAI.
The notion of “first mover” is meaningless when the other player’s program is visible from the start.