In real life, you can reverse blackmail by saying: “Blackmail is a serious felony, and you could get a year in jail in the US for it, so now you have to pay me for not reporting the blackmail to the police.” (I don’t recommend this in real life, as you would both be arrested, but such an aggressive posture may stop the blackmail.)
Acausal blackmail by an AI could be reversed the same way: you can threaten the AI that you have precommitted to create thousands of other AIs which will simulate this whole setup and punish the AI if it tries to torture any simulated being. This could be used to make a random paperclipper behave as a Benevolent AI; the idea was suggested by Rolf Nelson, and I analysed it in detail in the text.
That strategy might work as deterrence, although actually implementing it would still be ethically... suboptimal, since you would still need to harm simulated observers. Sure, they would be rogue AIs instead of poor innocent humans, but in the end you would be doing something rather similar to what you blame them for in the first place: creating intelligent observers with the explicit purpose of punishing them if they act the wrong way.