This just seems like a variant of Newcomb’s problem, and EDT is naturally optimal here (as it is everywhere).
Assume the predictor is never wrong and never lies. Then upon receiving the letter we know that in worlds where the house is not infested we pay, and in worlds where the house is infested we do not. So we pay and win $999,000, which is optimal.
Perfect predictors are roughly equivalent to time travel. Receiving the letter is equivalent to filtering out all universes where the house is not infested and we don’t pay, and all those where the house is infested and we pay.
To compare decision algorithms we need a formal utility measure.
Given any such formal utility measure, we could then easily define the optimal decision algorithm: it is whatever argmaxes that measure! EDT is simply that, for the very reasonable expected utility metric.
Given that you receive the letter, paying is indeed evidence for not having termites and winning $999,000. EDT is elegant, but still can’t be correct in my view. I wish it were, and have attempted to “fix” it.
My take is this. Either you have the termite infestation, or you don’t.
Say you do. Then
being a “payer” means you never receive the letter, as both conditions are false. As you don’t receive the letter, you don’t actually pay, and lose the $1,000,000 in damages.
being a “non-payer” means you get the letter, and you don’t pay. You lose $1,000,000.
Say you don’t. Then
payer: you get the letter, pay $1,000. You lose $1,000.
non-payer: you don’t get the letter, and don’t pay $1,000. You lose nothing.
Being a payer has the same result when you do have the termites, but is worse when you don’t. So overall, it’s worse. Being a payer or a non-payer only influences whether or not you get the letter, and this view is more coherent with the intuition that you can’t possibly influence whether or not you have a termite infestation.
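The four cases above can be tabulated in a few lines. This is only a sketch: the names `letter_sent`, `loss`, and `table` are made up for illustration, and the letter rule (sent exactly when one of "uninfested and pays" or "infested and doesn't pay" holds) is inferred from the cases as described in this thread.

```python
# Hypothetical model of the four (infested, policy) cases listed above.
DAMAGE = 1_000_000  # cost of the infestation, as quoted in the thread
DEMAND = 1_000      # amount the letter asks for


def letter_sent(infested: bool, pays: bool) -> bool:
    # Assumed rule: the letter goes out iff exactly one condition holds,
    # i.e. (not infested and pays) or (infested and not pays).
    return infested != pays


def loss(infested: bool, pays: bool) -> int:
    # A "payer" only hands over money if the letter actually arrives.
    paid = DEMAND if (pays and letter_sent(infested, pays)) else 0
    return paid + (DAMAGE if infested else 0)


table = {
    (infested, pays): loss(infested, pays)
    for infested in (True, False)
    for pays in (True, False)
}

for (infested, pays), value in table.items():
    print(f"infested={infested!s:5} payer={pays!s:5} loss=${value:,}")
```

Under these assumptions the payer and non-payer rows match when the house is infested and differ by the $1,000 demand when it is not.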
In your problem description you said you receive the letter:
“Thus, the claim made by the letter is true. Assume the agent receives the letter. Should she pay up?”
Given that you did receive the letter, that eliminates 2 of the 4 possible worlds, and we are left with only (infested, dont_pay) and (uninfested, pay). Then the choice is obvious. EDT is correct here.
Obviously, if you don’t receive the letter you have more options, but then it’s not much of an interesting problem.
“you can’t possibly influence whether or not you have a termite infestation.”
This intuition is actually false for perfect predictors. A perfect predictor could simulate your mind (along with everything else) perfectly, which is somewhat equivalent to time travel. It’s not actual time travel, of course, but in these ‘perfect prediction’ scenarios your future (perfectly predicted) decisions have already affected your past.
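The "eliminate 2 of the 4 worlds" step can be sketched as a filter over the possible worlds, followed by a comparison of what each action implies. The helper names are hypothetical, and the letter rule is an assumption read off the case analysis earlier in the thread, not a quote from the problem statement.

```python
from itertools import product

DAMAGE = 1_000_000
DEMAND = 1_000


def letter_sent(infested, pays):
    # Assumed rule: letter is sent iff exactly one of the two conditions holds.
    return infested != pays


# All four (infested, pays) worlds, then keep those consistent with the letter.
worlds = list(product((True, False), repeat=2))
with_letter = [w for w in worlds if letter_sent(*w)]


def loss_given_letter(pays):
    # Worlds that remain once we also condition on the action taken.
    remaining = [infested for infested, p in with_letter if p == pays]
    assert len(remaining) == 1  # with a perfect predictor, exactly one is left
    infested = remaining[0]
    return (DEMAND if pays else 0) + (DAMAGE if infested else 0)


# The action with the smaller conditional loss, given the letter.
best_action = min((True, False), key=loss_given_letter)
print(with_letter, "pay" if best_action else "don't pay")
```

Because each action leaves exactly one world once the letter is given, no prior over infestation is needed for this conditional comparison. That is the EDT-style calculation; whether conditioning on the letter is the right thing to do is exactly what the rest of the thread disputes.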
“In your problem description you said you receive the letter”
True, but the problem description also specifies subjunctive dependence between the agent and the predictor. When the predictor makes her prediction, the letter hasn’t yet been sent. So the agent’s decision influences whether or not she gets the letter.
“This intuition is actually false for perfect predictors.”
I agree (and have written extensively on the subject). But it’s the prediction the agent influences, not the presence of the termite infestation.
The payoff and optimal move naturally depend on the exact time of measurement. Before receiving any letter you can save $1,000 by precommitting to not paying, but that is a move both FDT and EDT will make. After receiving the letter (which you assumed), the optimal move is to pay the $1,000 to save $1M. FDT, from my understanding, fails here as it retroactively precommits to not paying and thus loses $1M. So this is a good example of where EDT > FDT.
The only example I’ve seen so far where the retroactive precommitment of FDT actually could make sense is the specific variant 5 from here, where we measure utility before the agent knows the rules or has observed anything. And even in that scenario FDT only has a net advantage if it is optimal to make the universal precommitment everywhere. EDT can decide to do that: EDT->FDT is allowed, but FDT can never switch back. So in that sense EDT is ‘dominant’, or the question reduces to: is the universal precommitment of FDT a win on net across the multiverse? Which is far from clear.
The trick with FDT is that FDT agents never receive the letter and never pay.
FDT payoff is p*(-1000000), where p is the probability of infestation. EDT payoff is p*(-1000000) + (1-p)*(-1000), which seems to me to speak for itself.
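To put a number on it, here is a small sketch of the two formulas. The function names and the value p = 1/10 are purely illustrative assumptions; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

DAMAGE = 1_000_000
DEMAND = 1_000


def never_pay_payoff(p):
    # Policy attributed to FDT above: never receives the letter, never pays.
    return p * -DAMAGE


def pay_on_letter_payoff(p):
    # Policy attributed to EDT above: pays whenever the letter arrives,
    # which happens exactly in the uninfested worlds.
    return p * -DAMAGE + (1 - p) * -DEMAND


p = Fraction(1, 10)  # illustrative probability of infestation
gap = never_pay_payoff(p) - pay_on_letter_payoff(p)
print(never_pay_payoff(p), pay_on_letter_payoff(p), gap)
```

For any p below 1 the gap is (1-p)*1000 in favour of the never-pay policy, when payoffs are measured before any letter is observed.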
The problem clearly states:
So that is baked into the environment; it is a fact. The EDT payoff is maximal.