I don’t even know a fraction of the math you people know, but this problem seems obvious to me: one-box while fully expecting the box to be empty (or expecting something impossible to happen).
More generally: if expecting A implies B, expecting B implies A or C, expecting C implies C, and U(B) > U(C) > U(A), then expect A unless the cost of “being wrong” is larger than the difference. (Here A is one-boxing the empty box, B is one-boxing a full box, and C is two-boxing with one box empty. In this case B → C would go via D, expecting to two-box with both boxes full, which is not implied by any expectation and is therefore unreachable.)
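To make the rule concrete, here is a rough Python sketch. The dollar amounts are the usual Newcomb payoffs and are my assumption, as are the names `UTILITY`, `REALIZED`, `worst_case` and `best_expectation`. It only encodes the three implications above and picks the expectation whose worst realized outcome is best; it does not model the cost of “being wrong”.

```python
# Assumed payoffs (standard Newcomb amounts, for illustration only).
# A: one-box the empty box, C: two-box with one box empty, B: one-box a full box.
UTILITY = {"A": 0, "C": 1_000, "B": 1_000_000}

# Which outcomes can actually occur given what the agent expects,
# taken from the implications stated above.
REALIZED = {
    "A": {"B"},       # expecting A implies B
    "B": {"A", "C"},  # expecting B implies A or C
    "C": {"C"},       # expecting C implies C
}


def worst_case(expectation):
    """Lowest utility among the outcomes that expectation can lead to."""
    return min(UTILITY[outcome] for outcome in REALIZED[expectation])


def best_expectation():
    """Expectation whose worst realized outcome has the highest utility."""
    return max(REALIZED, key=worst_case)


if __name__ == "__main__":
    for expectation in sorted(REALIZED):
        print(expectation, worst_case(expectation))
    print("expect:", best_expectation())
```

Under these assumed numbers, expecting A is the only expectation that ends up at B, which is why the rule says to expect A even though A itself has the lowest utility.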
In the model, I suppose the agent would prove that the utility of the various actions depends on whether the agent always chooses the action for which the greatest utility was proven, and would then go into the branch that makes an exception here, with this being correctly predicted by the predictor.
Is there something important I’m missing?
Another way to phrase it is that this problem is isomorphic to the transparent box Newcomb problem, and you are trying to find a formalized decision theory that will one-box the empty box “knowing” it is empty. (The only difference is that instead of “knowing” through updating on seeing the empty box, which UDT refuses to do, there is an equivalent trick with the dependence.) The only way you can do that is if you either don’t actually try to maximize the money at that point or expect a contradiction. Not trying to maximize the money in such situations is probably easier to deal with.