What makes the bomb dilemma seem unfair to me is the fact that it’s conditioning on an extremely unlikely event.
Why is this unfair?
Look, I keep saying this, but it doesn’t seem to me like anyone’s really engaged with it, so I’ll try again:
If the scenario were “pick Left or Right; after you pick, then the boxes are opened and the contents revealed; due to [insert relevant causal mechanisms involving a predictor or whatever else here], the Left box should be empty; unfortunately, one time in a trillion trillion, there’ll be some chance mistake, and Left will turn out (after you’ve chosen it) to have a bomb, and you’ll blow up”…
… then FDT telling you to take Left would be perfectly reasonable. I mean, it’s a gamble, right? A gamble with an unambiguously positive expected outcome; a gamble you’ll end up winning in the utterly overwhelming majority of cases. Once in a trillion trillion times, you suffer a painful death—but hey, that’s better odds than each of us take every day when we cross the street on our way to the corner store. In that case, it would surely be unfair to say “hey, but in this extremely unlikely outcome, you end up burning to death!”.
But that’s not the scenario!
In the given scenario, we already know what the boxes have in them. They’re open; the contents are visible. We already know that Left has a bomb. We know, to a certainty, that choosing Left means we burn to death. It’s not a gamble with an overwhelming, astronomical likelihood of a good outcome, and only a microscopically tiny chance of painful death—instead, it’s knowingly choosing a certain death!
Yes, the predictor is near-perfect. But so what? In the given scenario, that’s no longer relevant! The predictor has already predicted, and its prediction has already been evaluated, and has already been observed to have erred! There’s no longer any reason at all to choose Left, and every reason not to choose Left.
And yet FDT still tells us to choose Left. This is a catastrophic failure; and what’s more, it’s an obvious failure, and a totally preventable one.
Now, again: it would be reasonable to say: “Fine, yes, FDT fails horribly in this very, very rare circumstance; this is clearly a terrible mistake. Yet other decision theories fail, at least this badly, or in far more common situations, or both, so FDT still comes out ahead, on net.”
But that’s not the claim in the OP; the claim is that, somehow, knowingly choosing a guaranteed painful death (when it would be trivial to avoid it) is the correct choice, in this scenario.
And that’s just crazy.
My updated defense of FDT, should you be interested.
Like I’ve said before, it’s not about which action to take, it’s about which strategy to have. It’s obvious right-boxing gives the most utility in this specific scenario only, but that’s not what it’s about.
Why? Why is it not about which action to take?
I reject this. If Right-boxing gives the most utility in this specific scenario, then you should Right-box in this specific scenario. Because that’s the scenario that—by construction—is actually happening to you.
In other scenarios, perhaps you should do other things. But in this scenario, Right is the right answer.
And this is the key point. It seems to me impossible to have a decision theory that right-boxes in Bomb but still does as well as FDT does in all other scenarios.
It’s about which strategy you should adhere to. The strategy of right-boxing loses you $100 virtually all the time.
If it’s about utility, then specify it in terms of utility, not death or dollars.
Utility is often measured in dollars. If I had created the Bomb scenario, I would have specified life/death in terms of dollars as well. Like, “Life is worth $1,000,000 to you.” That way, you can easily compare the loss of your life to the $100 cost of Right-boxing.
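For concreteness, a minimal sketch of that comparison, assuming the figures used in this thread (life worth $1,000,000 to you, a one-in-a-trillion-trillion predictor error rate, and a $100 cost for taking Right):

```python
# Ex-ante expected cost of committing to each box, under the assumed figures:
# life worth $1,000,000, predictor error rate 1e-24, $100 cost for taking Right.
LIFE_VALUE = 1_000_000   # dollars (assumed valuation, as suggested above)
ERROR_RATE = 1e-24       # "one time in a trillion trillion"
RIGHT_COST = 100         # dollars

# Committing to Left: you only burn to death if the predictor has erred.
expected_cost_left = ERROR_RATE * LIFE_VALUE   # $1e-18

# Committing to Right: the predictor almost surely foresees this, so you pay the fee.
expected_cost_right = RIGHT_COST               # $100

print(f"Left:  ${expected_cost_left:.0e}")
print(f"Right: ${expected_cost_right}")
```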
Yes, you keep saying this, and I still think you’re wrong. Our candidate decision theory has to recommend something for this scenario—and that recommendation gets picked up by the predictor beforehand. You have to take that into account. You seem to be extremely focused on this extremely unlikely scenario, which is odd to me.
How exactly is it preventable? I’m honestly asking. If you have a strategy that, if the agent commits to it before the predictor makes her prediction, does better than FDT, I’m all ears.
It’s preventable by taking the Right box. If you take Left, you burn to death. If you take Right, you don’t burn to death.
Totally, here it is:
FDT, except that if the predictor makes a mistake and there’s a bomb in the Left, take Right instead.
You seem to have misunderstood the problem statement [1]. If you commit to doing “FDT, except that if the predictor makes a mistake and there’s a bomb in the Left, take Right instead”, then you will almost surely have to pay $100 (since the predictor predicts that you will take Right), whereas if you commit to using pure FDT, then you will almost surely have to pay nothing (with a small chance of death). There really is no “strategy that, if the agent commits to it before the predictor makes her prediction, does better than FDT”.
[1] Which is fair enough, as it wasn’t actually specified correctly: the predictor is actually trying to predict whether you will take Left or Right if it leaves its helpful note, not in the general case. But this assumption has to be added, since otherwise FDT says to take Right.
It sounds like you’re saying that I correctly understood the problem statement as it was written (but it was written incorrectly); but that the post erroneously claims that in the scenario as (incorrectly) written, FDT says to take Left, when in fact FDT in that scenario-as-written says to take Right. Do I understand you?
Yes.
Why? FDT isn’t influenced in its decision by the note, so there is no loss of subjunctive dependence when this assumption isn’t added. (Or so it seems to me: I am operating at the limits of my FDT-knowledge here.)
How would this work? Your strategy seems to be “Left-box unless the note says there’s a bomb in Left”. This ensures the predictor is right whether she puts a bomb in Left or not, and doesn’t optimize expected utility.
It doesn’t kill you in a case when you can choose not to be killed, though, and that’s the important thing.
It costs you p * $100 for 0 ≤ p ≤ 1, where p depends on how “mean” you believe the predictor is. Left-boxing costs 10^-24 * $1,000,000 = $10^-18 if you value life at a million dollars. Then if p > 10^-20, Left-boxing beats your strategy.
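The same comparison, spelled out as a quick sketch (same assumed figures; p is the chance the predictor maneuvers this strategy into taking Right):

```python
# Sketch of the threshold computation above, under the same assumed figures.
LIFE_VALUE = 1_000_000   # dollars (assumed valuation of your life)
ERROR_RATE = 1e-24       # predictor error rate ("one time in a trillion trillion")
RIGHT_COST = 100         # dollars

# Always Left-boxing: you pay with your life only when the predictor errs.
cost_left = ERROR_RATE * LIFE_VALUE            # $1e-18

def cost_left_unless_bomb_shown(p: float) -> float:
    """Expected cost of 'Left-box unless the note says Left has a bomb',
    where p is the chance the predictor steers you into taking Right."""
    return p * RIGHT_COST

# Left-boxing is cheaper whenever p * $100 > $1e-18, i.e. whenever p > 1e-20.
threshold = cost_left / RIGHT_COST
print(threshold)                                       # 1e-20
print(cost_left_unless_bomb_shown(1e-6) > cost_left)   # True
```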
Why would I value my life finitely in this case? (Well, ever, really, but especially in this scenario…)
Also, were you operating under the life-has-infinite-value assumption all along? If so, then:
You were incorrect about FDT’s decision in this specific problem
You should probably have mentioned you had this unusual assumption, so we could have resolved this discussion way earlier
Note that FDT Right-boxes when you give life infinite value.
What’s special in this scenario with regards to valuing life finitely?
If you always value life infinitely, it seems to me all actions you can ever take get infinite values, as there is always a chance you die, which makes decision-making on the basis of utility pointless.
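A toy illustration of that last point, assuming death is assigned literally infinite disvalue: every option carrying any nonzero chance of death gets the same negatively infinite expected value, so expected utility can no longer rank them.

```python
import math

# Toy illustration (assumed numbers): with death valued at negative infinity,
# every option with a nonzero death probability has expected value -inf,
# so expected utility can no longer rank such options.
DEATH_UTILITY = -math.inf

def expected_utility(p_death: float, payoff_if_alive: float) -> float:
    return p_death * DEATH_UTILITY + (1 - p_death) * payoff_if_alive

print(expected_utility(1e-9, 1000.0))   # -inf  (e.g. crossing the street for $1000)
print(expected_utility(1e-24, 0.0))     # -inf  (e.g. Left-boxing and paying nothing)
```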
Unfortunately, that doesn’t work. The predictor, if malevolent, could then easily make you choose Right and pay $100.
Left-boxing is the best strategy possible as far as I can tell. As in, yes, that extremely unlikely scenario where you burn to death sucks big time, but there is no better strategy possible (unless there is a superior strategy that I, and it appears everybody else, haven’t thought of).
If you commit to taking Left, then the predictor, if malevolent, can “mistakenly” “predict” that you’ll take Right, making you burn to death. Just like in the given scenario: “Whoops, a mistaken prediction! How unfortunate and improbable! Guess you have no choice but to kill yourself now, how sad…”
There absolutely is a better strategy: don’t knowingly choose to burn to death.
We know the error rate of the predictor, so this point is moot.
I have yet to see a strategy incorporating this that doesn’t lose overall by giving up utility in other scenarios.
How do we know it? If the predictor is malevolent, then it can “err” as much as it wants.
For the record, I read Nate’s comments again, and I now think of it like this:
To the extent that the predictor was accurate in her line of reasoning, you left-boxing does NOT result in you slowly burning to death. It results in, well, the problem statement being wrong, because the following can’t all be true:
The predictor is accurate
The predictor predicts you right-box, and places the bomb in left
You left-box
And yes, apparently the predictor can be wrong, but I’d say, who even cares? The probability of the predictor being wrong is supposed to be virtually zero anyway (although as Nate notes, the problem description isn’t complete in that regard).
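A tiny sketch of that inconsistency, under the assumption that “accurate” just means the prediction matches your actual choice:

```python
# Tiny consistency check of the three claims above. "Accurate" here is taken to mean
# that the prediction matches what you actually do.
prediction = "Right"   # claim 2: she predicts you Right-box (and places the bomb in Left)
action = "Left"        # claim 3: you Left-box
accurate = (prediction == action)   # claim 1 requires this to be True
print(accurate)        # False, so the three claims can't all hold at once
```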
We know it because it is given in the problem description, which you violate if the predictor ‘can “err” as much as it wants’.