I’ll say it again in different words: I did not understand the paper (and consequently, the blog) to be talking about actual blackmail in a big messy physical world. I understood them to be talking about a specific, formalized blackmail scenario in which the blackmailer’s decision to blackmail is entirely contingent on the victim’s counterfactual behaviour. In that case, resolving to never pay and still being blackmailed isn’t possible; in full context, it’s logically inconsistent.
Different formalisations are possible, but I’d guess the strict one is what was used. In the softer ones you still generally won’t pay.
The paper discusses two specific blackmail scenarios. One (“XOR Blackmail”) is a weirdly contrived situation and I don’t think any of what wo says is talking about it. The other (“Mechanical Blackmail”) is a sort of stylized version of real blackmail scenarios, and does assume that the blackmailer is a perfect predictor. The paper’s discussion of Mechanical Blackmail then considers the case where the blackmailer is an imperfect (but still very reliable) predictor, and says that there too an FDT agent should refuse to pay.
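To make the imperfect-predictor case concrete, here is a toy calculation. It is only a sketch: the model (the blackmailer blackmails exactly when she predicts payment, and her prediction matches the victim's policy with probability `accuracy`), the function names, and all the numbers are illustrative assumptions, not taken from the paper or from wo's post.

```python
def expected_loss(policy_pays: bool, accuracy: float,
                  payment: float, damage: float) -> float:
    """Expected loss for a victim who commits to one policy in advance.

    Assumed toy model: the blackmailer sends the threat only if she
    predicts the victim will pay, and that prediction is correct with
    probability `accuracy`.
    """
    if policy_pays:
        # Correctly predicted as a payer: blackmailed, then pays.
        # Mispredicted as a refuser: never blackmailed, loses nothing.
        return accuracy * payment
    # Mispredicted as a payer: blackmailed, refuses, suffers the damage.
    # Correctly predicted as a refuser: never blackmailed.
    return (1 - accuracy) * damage


def refusing_is_better(accuracy: float, payment: float, damage: float) -> bool:
    """True when the 'never pay' policy has the lower expected loss."""
    return (expected_loss(False, accuracy, payment, damage)
            < expected_loss(True, accuracy, payment, damage))


# Hypothetical stakes: paying costs 1,000; exposure costs 10,000.
for acc in (1.0, 0.95, 0.90):
    print(acc, refusing_is_better(acc, payment=1_000, damage=10_000))
```

In this toy model a perfect predictor means the refuser is never blackmailed at all, and a merely reliable one still favours refusing as long as the accuracy is high enough relative to the stakes. With the numbers above, refusing wins at 0.95 but not at 0.90, so how reliable "very reliable" needs to be depends on the ratio of damage to payment.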
wo’s discussion of blackmail doesn’t directly address either of the specific scenarios discussed in the FDT paper. The first blackmail scenario wo discusses (before saying much about FDT) is a generic case of blackmail (the participants being labelled “Donald” and “Stormy”, perhaps suggesting that we are not supposed to imagine either of them as any sort of perfectly rational perfect predictor...). Then, when specifically discussing FDT, wo considers a slightly different scenario, which again is clearly not meant to involve perfect prediction, simulation, etc., because he says things like “FDT says you should not pay because, if you were the kind of person who doesn’t pay, you likely wouldn’t have been blackmailed” and “FDT agents who are known as FDT agents have a lower chance of getting blackmailed” (emphasis mine, in both cases).
So. The paper considers a formalized scenario in which the blackmailer’s decision is made on the basis of a perfectly accurate prediction of the victim, but then is at pains to say that it all works just the same if the prediction is imperfect. The blog considers only imperfect-prediction scenarios. Real-world blackmail, of course, never involves anything close to the sort of perfect prediction under which an FDT agent getting blackmailed would be a logical inconsistency.
So taking the paper and the blogpost to be talking only about provably-perfect-prediction scenarios seems to me to require (1) reading the paper oddly selectively, (2) interpreting the blogpost very differently from me, and (3) not caring about situations that could ever occur in the actual world, even though wo is clearly concerned with real-world applicability and the paper makes at least some gestures towards such applicability.
For the avoidance of doubt, I don’t think there’s necessarily anything wrong with being interested primarily in such scenarios: the best path to understanding how a theory works in practice might well go via highly simplified scenarios. But it seems to me simply untrue that the paper, still more the blogpost, considers (or should have considered) only such scenarios when discussing blackmail.