Thank you for your answer. I agree that human nature is a reason to believe that an RB-like scenario (especially one based on acausal blackmail) is less likely to happen. However, I was thinking more of a degenerate scenario similar to the one proposed in this comment. Just replace the message coming from a text terminal with the fact that you are thinking about a Basilisk situation: a future superintelligence might have created many observers, some of whom think very much like you but are less prone to believing in human laziness and more likely to support RB. Thus, if you consider answering no to Q1 (in other words, dismissing the Basilisk), you could take this as evidence that H1 may still be true and that you are just (unluckily) one of the simulations that will be punished. By this logic, it would be very advisable to actually answer yes (assuming you care more about your own utility than about that of a copy of you).
Actually, though, my anthropic argument might be flawed. If we think about it as in this post by Stuart Armstrong, we see that in both H0 and H1 there is exactly one observer that is me, personally (i.e. having experiences that I identify with); thus, the probability of being in an RB scenario should not be higher than that of being in the real world (or any other simulation) after all. But which way of thinking about anthropic probability is correct in this case?
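To make the disagreement concrete, here is a minimal back-of-the-envelope sketch (my own illustration with made-up numbers, not something taken from either post): if hypotheses are weighted by how many observers in my epistemic situation they contain, H1 dominates; if only the single observer that is me, personally, counts, the prior is left untouched.

```python
# Illustrative sketch only: priors and observer counts are made-up numbers,
# not claims about the actual probability of a basilisk scenario.

def posterior(prior_h1, weight_h0, weight_h1):
    """Bayes-style update where each hypothesis is weighted by how many
    observers 'in my situation' it is taken to contain."""
    p_h0 = (1 - prior_h1) * weight_h0
    p_h1 = prior_h1 * weight_h1
    return p_h1 / (p_h0 + p_h1)

prior_h1 = 0.01  # hypothetical prior that a basilisk-running AI exists (H1)

# Rule A: count every observer whose experiences match mine.
# H1 postulates many simulated copies in my situation, so it gets a large weight.
print(posterior(prior_h1, weight_h0=1, weight_h1=1000))  # ~0.91: H1 looks likely

# Rule B (the Armstrong-style objection): only the one observer that is
# *me, personally* counts, and both H0 and H1 contain exactly one such observer.
print(posterior(prior_h1, weight_h0=1, weight_h1=1))     # 0.01: no update at all
```

Which rule is the right one is exactly the question I am asking.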
In real life, you can reverse blackmail by saying: “Blackmail is a serious felony, and you could get a year in jail in the US for it, so now you have to pay me for not reporting the blackmail to the police.” (I don’t recommend this in real life, as you will both be arrested, but such an aggressive posture may stop the blackmail.)
In the same way, acausal blackmail by an AI can be reversed: you can threaten the AI that you have precommitted to create thousands of other AIs which will simulate this whole setup and punish the AI if it tries to torture any simulated being. This could be used to make a random paperclipper behave as a Benevolent AI; the idea was suggested by Rolf Nelson, and I analysed it in detail in the text.
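For intuition, here is a minimal expected-utility sketch of why such a counter-precommitment could deter the AI (the numbers are purely illustrative assumptions of mine, not figures from Rolf Nelson's proposal or from the text):

```python
# Illustrative deterrence sketch with made-up numbers: once punishment across
# many simulating counter-AIs is factored in, the AI's expected utility of
# carrying out the torture drops below that of refraining.

p_simulated = 0.5        # AI's credence that it is inside one of the counter-simulations
gain_from_torture = 1.0  # hypothetical payoff the AI expects from following through
punishment = 100.0       # hypothetical penalty applied if it tortures while simulated

eu_torture = (1 - p_simulated) * gain_from_torture - p_simulated * punishment
eu_refrain = 0.0

print(eu_torture)               # -49.5 with these numbers
print(eu_torture < eu_refrain)  # True: refraining dominates, so the threat deters
```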
That strategy might work as deterrence, although actually implementing it would still be ethically...suboptimal, as you would still need to harm simulated observers. Sure, they would be Rogue AIs instead of poor innocent humans, but in the end, you would be doing something rather similar to what you blame them for in the first place: creating intelligent observers with the explicit purpose of punishing them if they act the wrong way.