So if I understand this correctly, in this variant of Newcomb's Problem (NP), which I'll call Proxy Newcomb's Problem (PNP), you get $1000 if you two-box in NP and also two-box in PNP; otherwise you get $0.
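To make sure I'm reading the payoff rule right, here's a minimal sketch of it; the function name and the string labels are mine, not from the post.

```python
# Minimal sketch of the PNP payoff rule as restated above (names are mine).
def pnp_payout(action_in_np: str, action_in_pnp: str) -> int:
    """$1000 only if you two-box in NP and also two-box in PNP; $0 otherwise."""
    if action_in_np == "two-box" and action_in_pnp == "two-box":
        return 1000
    return 0

assert pnp_payout("two-box", "two-box") == 1000
assert pnp_payout("one-box", "two-box") == 0
assert pnp_payout("two-box", "one-box") == 0
```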
With UDT, you don't need to precommit in advance; you just act according to the precommitment you should've made, from a state of knowledge that hasn't updated on the actuality of the current situation. The usual convention is to give the thought experiment (together with any relevant counterfactuals) a lot of probability, so that its implausibility doesn't distract from the problem. But this convention doesn't simultaneously extend to other related thought experiments that are not part of this one.
More carefully, a thought experiment X (which could be PNP or NP) usually has a prior state of knowledge X1, where the players know the rules of the thought experiment and that it's happening, but without yet specifying which of the possibilities within it take place. It also has another, possibly narrower state of knowledge X2 that describes the specific situation within the thought experiment taken as the point of view for its statement: what is being observed. To apply UDT to the thought experiment is to decide on a strategy from the state of knowledge X1, even as you currently occupy a different state of knowledge X2, and then enact the part of that strategy that pertains to the situation X2.
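A rough sketch of that recipe, purely as I understand it (the names Strategy, udt_act, expected_utility_at_X1, and so on are illustrative, not standard notation):

```python
# Hedged sketch of the UDT recipe described above; all names are illustrative.
from typing import Callable, Dict, List

Situation = str                     # a possible situation within the thought experiment
Action = str
Strategy = Dict[Situation, Action]  # maps each possible situation to an action

def udt_act(candidate_strategies: List[Strategy],
            expected_utility_at_X1: Callable[[Strategy], float],
            observed_situation_X2: Situation) -> Action:
    """Choose the strategy that is best from the prior state of knowledge X1,
    then enact the part of it that pertains to the observed situation X2."""
    best = max(candidate_strategies, key=expected_utility_at_X1)
    return best[observed_situation_X2]
```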
Here, we have two thought experiments, PNP and NP, but PNP references the strategy for NP. Usually, the UDT strategy for NP would be the strategy chosen from the state of knowledge NP1 (which is essentially the same as NP2; the distinction becomes important in cases like Transparent Newcomb's Problem and for things like CDT). But in PNP, the UDT strategy is to be chosen from the state of knowledge PNP1, so it becomes unclear what "strategy in NP" means, because it's unclear what state of knowledge the strategy in NP is to be chosen from. It can't really be chosen from the state of knowledge PNP1, because then NP is not expected in reality. In the prior state of knowledge where neither NP1 nor PNP1 is assumed, it becomes a competition between the tiny probabilities of NP and PNP. And if the strategy in NP is chosen from the state of knowledge NP1, it's not under the control of the strategy for PNP chosen from the state PNP1.
In other words, there doesn't seem to be a way of communicating to the hypothetical in which NP happens that PNP is actual and NP isn't, and thereby influencing what you do in counterfactual NP in order to do well in actual PNP. Absent that, you do in NP what you would do if it were actual (from the state of knowledge NP1), ignoring the possibility of PNP (which NP1 doesn't expect). If knowledge of the actuality of PNP is somehow allowed to be added to NP1, and you control actions within NP from the state of knowledge PNP1, then the correct strategy is obviously to two-box in both. But the problem statement is very confusing on this point.
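Under that second reading, where a single joint strategy fixes both the (counterfactual) action in NP and the action in PNP, a quick enumeration of the payoff rule restated at the top of this comment shows why two-boxing in both is the obvious choice; again, the names here are only illustrative.

```python
# Illustrative enumeration, assuming only PNP is actual and a single joint
# strategy fixes both the action in (counterfactual) NP and the action in PNP.
def pnp_payout(action_in_np: str, action_in_pnp: str) -> int:
    return 1000 if (action_in_np, action_in_pnp) == ("two-box", "two-box") else 0

for a_np in ("one-box", "two-box"):
    for a_pnp in ("one-box", "two-box"):
        print(f"NP: {a_np:7}  PNP: {a_pnp:7}  PNP payout: ${pnp_payout(a_np, a_pnp)}")
# Only the (two-box, two-box) joint strategy pays $1000; every other
# combination pays $0, which matches "two-box in both" under that reading.
```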
Upon reflection, it was probably a mistake for me to phrase this as a story/problem/thought experiment. I should probably have just written a shorter post titled something like "Newcomb's problem provides no (interesting, non-trivial) evidence against using causal decision theory." I had some fun writing this, though, and (mistakenly?) hoped that people would have fun reading it.
I think I disagree somewhat that “PNP references the strategy for NP”. I think many (most?) LW people have decided they are “the type of person who one-boxes in NP”, and believe that says something positive about them in their actual life. This post is an attempt to push back on that.
It seems from your comment that you think of "what I, Vladimir Nesov, would do in a thought experiment" as different from what you would actually do in real life (e.g., when you say "the problem statement is very confusing on this point"). I think of the two as much more closely tied.
Possibly the confusion comes from the difference between what you-VN-would-actually-do and what you think is correct/optimal/rational behavior? Like, in a thought experiment, you don't actually try to imagine or predict what real-you would do; you just wonder what the optimal behavior/strategy is? In that case, I agree that this is a confusing problem statement.
The point of UDT, as I understand it, is that you should be the sort of person who predictably one-boxes in NP. This seems incorrect to me. I think that if you are the sort of person who one-boxes in a surprise NP, you will have worse outcomes in general, and that if you face a surprise NP, you should two-box. If you know you will be confronted with NP tomorrow, then sure, you should decide to one-box ahead of time. But I think deciding now to "be the sort of person who would one-box in NP" (or, equivalently, deciding now to commit to a decision theory which will result in that) is a mistake.
Eliezer Yudkowsky and the whole UDT crowd seem to think that you should commit to a decision theory which seems like a bad one to me, on the basis that it would be rational to have precommitted if you end up in this situation. They seem to have convinced most LW people of this. I think they are wrong. I think CDT is a better, more intuitive decision theory. I agree that CDT gives a suboptimal outcome in a surprise NP, but any decision theory can be made to give a good or bad outcome in corner cases, along the lines of "You meet a superintelligent agent which will punish people who use (good decision theory) and reward those who use (bad decision theory)." Thus, NP shouldn't count as a strike against CDT.