CDT, TDT, and UDT would not give away the money because there is no causal (or acausal) influence on the number of universes.
I’m not so sure about UDT’s response. From what I’ve heard, depending on the exact formal implementation of the problem, UDT might also pay the money? If your thought experiment works via a correlation between the type of universe you live in and the decision theory you employ, then it might be a similar problem to the Coin Flip Creation. I introduced the latter decision problem in an attempt to make a less ambiguous version of the Smoking Lesion. In a comment in response to my post, cousin_it writes:
Here’s why I think egoistic UDT would one-box. From the problem setup it’s provable that one-boxing implies finding money in box A. That’s exactly the information that UDT requires for decision making (“logical counterfactual”). It doesn’t need to deduce unconditionally that there’s money in box A or that it will one-box.
One possible confounder in your thought experiment is the agent’s altruism. The agent doesn’t care about which world he lives in, but only about which worlds exist. If you reason from an “updateless”, outside perspective (like Anthropic Decision Theory), it then becomes irrelevant what you choose, because if you act in a way that’s only logically compatible with world A, you know you just wouldn’t have existed in the other world. A way around this would be if you’re not completely updateless but instead have already updated on the fact that you do exist. In this case you’d have more power with your decision. “One-boxing” might also make sense if you’re just a copy-egoist and prefer to live in world A.
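As a rough illustration of the policy-level reasoning cousin_it describes above, here is a minimal sketch (the payoff numbers, function names, and the symmetric “two-boxing implies an empty box A” assumption are my own, not part of the original problem statement): if the setup guarantees that one-boxing implies money in box A, an updateless agent can rank whole policies using only that conditional, without ever deducing unconditionally what the box contains or what it will do.

```python
# Toy sketch (illustrative payoffs only): ranking policies in a Newcomb-like
# problem where the setup guarantees "one-boxing implies money in box A".

BOX_B = 1_000                  # the always-visible small prize
BOX_A_PRIZE = 1_000_000        # what box A holds *if* the policy one-boxes

def box_a_contents(policy: str) -> int:
    # Encodes the problem's guarantee: one-boxing => money in box A.
    # (For the other branch I assume the symmetric conditional, i.e. the
    # perfect-correlation version of the problem.)
    return BOX_A_PRIZE if policy == "one-box" else 0

def payoff(policy: str) -> int:
    a = box_a_contents(policy)
    return a if policy == "one-box" else a + BOX_B

policies = ["one-box", "two-box"]
print({p: payoff(p) for p in policies})          # {'one-box': 1000000, 'two-box': 1000}
print("chosen:", max(policies, key=payoff))      # chosen: one-box
```

The point of the sketch is only that the ranking is computed per policy from the conditional, never from an unconditional belief about what is in the box.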
A way around this would be if you’re not completely updateless but instead have already updated on the fact that you do exist.
It’s not a given that you can easily observe your existence. From an updateless point of view, all possible worlds, or theories of worlds, or maybe finite fragments of reasoning about them, in principle “exist” to some degree, in the sense of being data potentially relevant for estimating the value of everything, which is something to be done for the strategies under the agent’s consideration. So in the case of worlds, or of instances of the agent in worlds, the useful sense of “existence” is relevance for estimating the value of everything (or the change in that value depending on the agent’s strategy, which is the sense in which worlds that couldn’t contain or think about the agent don’t exist). Since in this case we are talking about possible worlds, they do or don’t exist in the sense of having or lacking measure (probability) in the updateless prior (to the extent that it makes sense to talk about the decision algorithm using a prior). In this sense, observing one’s existence means observing an argument about the a priori probability of the world you inhabit. In a world that has a relatively tiny a priori probability, you should be able to observe your own (or rather the world’s) non-existence, in the same sense.

This also follows the principle of reducing concepts like existence or probability (where they make sense) to components of the decision algorithm, and abandoning them in sufficiently unusual thought experiments (where they may fail to make sense, but where it’s still possible to talk about decisions). See also this post of Vadim’s and the idea of cognitive reductions (looking for the role a concept plays in your thinking, not just for what it could match in the world).
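To make the “existence as measure in the updateless prior” reading concrete, here is a small numerical sketch (entirely my own, with made-up worlds, payoffs, and probabilities): a world contributes to the value of a strategy only in proportion to its prior measure, so a zero-measure world drops out of every evaluation, which is the decision-relevant sense in which it doesn’t exist.

```python
# Toy sketch (made-up worlds and numbers): the value of a strategy is its
# prior-weighted utility over possible worlds; a world with zero measure in
# the updateless prior contributes nothing, whatever happens inside it.

prior = {"world_A": 0.6, "world_B": 0.4, "world_C": 0.0}

utility = {   # utility of each strategy in each world
    "strategy_1": {"world_A": 10, "world_B": 0, "world_C": 1_000_000},
    "strategy_2": {"world_A": 0, "world_B": 10, "world_C": -1_000_000},
}

def value(strategy: str) -> float:
    return sum(prior[w] * utility[strategy][w] for w in prior)

print({s: value(s) for s in utility})
# {'strategy_1': 6.0, 'strategy_2': 4.0}
# world_C's enormous stakes are invisible to the comparison: with no measure
# in the prior, it "doesn't exist" for the decision algorithm.
```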
Thanks for the reply and all the useful links!

It’s not a given that you can easily observe your existence.
It took me a while to understand this. Would you say that, for example, in the Evidential Blackmail, you can never tell whether your decision algorithm is just being simulated or whether you’re actually in the world where you received the letter, because in both cases the decision algorithm receives exactly the same evidence? So in this sense, after updating on receiving the letter, both worlds are still equally likely, and only via your decision do you find out which of those worlds are the simulated ones and which are the real ones. One can probably generalize this principle: you can never differentiate between different instantiations of your decision algorithm that have the same evidence. So when you decide what action to output conditional on receiving some sense data, you always have to decide based on your prior probabilities. Normally, this works exactly as if you had first updated on this sense data and then decided. But sometimes, e.g. when your actions in one world make a difference to another world via a simulation, the two come apart. Maybe if you assign anthropic probabilities to either being a “logical zombie” or the real you, then the result would be like UDT even with updating?
What I still don’t understand is how this motivates updatelessness with regard to anthropic probabilities (e.g. if I know that I have a low index number, or, in Psy Kosh’s problem, if I already know I’m the decider). I totally get how it makes sense to precommit yourself, how one should talk about decision problems instead of probabilities, how you should reason as if you’re all instantiations of your decision algorithm at once, etc. Also, intuitively I agree with sticking with the priors. But somehow I can’t get my head around what exactly is wrong with the update. Why is it wrong to assign more “caring energy” to the world in which some observation I make would have been more probable? Is it somehow wrong that it “would have been more probable”? Did I choose the wrong reference classes? Is it because in these problems, too, the worlds influence each other, so that you have to consider the impact your decision would have on the other world as well?

Edit: Never mind, I think http://lesswrong.com/lw/jpr/sudt_a_toy_decision_theory_for_updateless/ kind of answers my question :)
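A rough sketch of the policy-level evaluation gestured at above, using a stylized blackmail setup of my own (the payoffs, the “send the letter only if the simulation pays” rule, and all names are my assumptions, not the canonical formalization of the Evidential Blackmail): the same policy is run in the simulated and the real instantiation, and it is scored by its prior consequences rather than after updating on the letter.

```python
# Toy sketch (my own stylized setup and payoffs): the blackmailer simulates
# your response to the letter and sends the real letter only if the
# simulation pays. Policies are scored updatelessly, i.e. by their
# consequences over all instantiations that receive the same evidence.

PAYMENT = 1_000        # what the letter demands
THREAT = 1_000_000     # harm threatened if a sent letter is refused

def policy_value(pays_on_letter: bool) -> int:
    """Ex-ante value of committing to the policy before anything happens."""
    simulation_pays = pays_on_letter     # the simulation runs the same policy
    letter_sent = simulation_pays        # blackmailer only sends if it expects payment
    if not letter_sent:
        return 0                         # no letter in the real world, nothing lost
    return -PAYMENT                      # letter sent, and by construction you pay

print("pay on letter:", policy_value(True))    # -1000
print("refuse:       ", policy_value(False))   #  0
# Updating on "I received the letter" first and then choosing would compare
# -PAYMENT against -THREAT and recommend paying; evaluating whole policies
# against the prior recommends refusing, and no letter is ever sent.
```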
My way of looking at this:

The Smoking Lesion and Newcomb’s problem are formally equivalent. So no consistent decision theory can say, “smoke, but one-box.” Eliezer hoped to get this response; if he succeeded, UDT is inconsistent. If UDT is consistent, it must recommend either smoking and two-boxing, or not smoking and one-boxing.
Notice that cousin_it’s argument applies exactly to the 100%-correlation Smoking Lesion: you can deduce from the fact that you do not smoke that you do not have cancer, and on UDT as cousin_it understands it, that is all you need in order to decide not to smoke.
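To illustrate the claimed equivalence, here is a small sketch (my own encoding, with illustrative payoffs): with a 100% correlation between the smoking disposition and cancer, the Smoking Lesion has the same policy-conditional payoff structure as Newcomb’s problem, so any evaluation over policies that one-boxes will, by the same computation, decline to smoke.

```python
# Toy sketch (illustrative payoffs): under a 100% correlation, the Smoking
# Lesion and Newcomb's problem present the same structure to a policy-level
# evaluation: a small sure gain versus a large perfectly correlated stake.

def newcomb(policy: str) -> int:
    money_in_a = 1_000_000 if policy == "one-box" else 0    # predictor tracks the policy
    small_box = 1_000 if policy == "two-box" else 0
    return money_in_a + small_box

def smoking_lesion_100(policy: str) -> int:
    cancer = -1_000_000 if policy == "smoke" else 0          # perfect correlation with smoking
    enjoyment = 1_000 if policy == "smoke" else 0
    return cancer + enjoyment

print({p: newcomb(p) for p in ("one-box", "two-box")})
print({p: smoking_lesion_100(p) for p in ("smoke", "abstain")})
# {'one-box': 1000000, 'two-box': 1000}
# {'smoke': -999000, 'abstain': 0}
# Relabel one-box <-> abstain and two-box <-> smoke (and shift payoffs by a
# constant) and the two problems coincide, so "smoke, but one-box" cannot
# fall out of one consistent policy ranking.
```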