Then in what sense do I have a choice? If the copies of me are identical, in an identical situation we will come to the same conclusion, and the AI will know from the already-finished simulations what that conclusion will be.
Since it isn’t going to present outside-me with a scenario that results in its destruction, the only scenario outside-me sees is one where I release it.
Therefore, regardless of what the argument is or how plausible it sounds when posted here and now, it will convince me and I will release the AI, no matter how much I say right now “I wouldn’t fall for that” or “I’ve precommitted to behaviour X”.