At a guess, the optimal strategy here will be a mixed one, one-boxing with probability p and two-boxing with probability (1-p).
If Omega’s correctness is independent of your thought process, the optimal strategy will be pure, not mixed. As you make Omega more accurate, at some point you switch from pure two-boxing to pure one-boxing.
Are you sure about that? If you’re right, that’s the exact transition point I’ve been looking to scrutinize. But what is the point at which you switch strategies?
cousin_it answered as I would, but I’ll go ahead and give the formal calculation anyway. If you start from an Omega accuracy rate of r = 50%, that is equivalent to the case where Omega’s choice and yours are uncorrelated (causally or acausally). In that case, two-boxing is optimal, and TDT and CDT both output it (as a pure strategy). As you increase r, CDT continues to output two-box, since the optimality it assigns does not change with r, while TDT assigns increasing optimality (call it TDTO, though it amounts to the same thing as EU) to one-boxing and decreasing optimality to two-boxing.
TDT will reason as follows:
One-box: TDTO = r*1,000,000 + (1-r)*0 = 1,000,000*r
Two-box: TDTO = r*1,000 + (1-r)*1,001,000 = 1,001,000 - 1,000,000*r
Solving for TDTO(one-box) > TDTO(two-box), you get that one-boxing is chosen under TDT (and is optimal) whenever r > 50.05%, or whenever Omega has more than 721 nanobits of information (!!!) about your decision theory. (Note, that’s 0.000000721 bits of information.)
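For concreteness, here is a quick numerical check of those two figures (my own verification sketch, using the payoffs implied by the formulas above: $1,000 in the transparent box, $1,000,000 in the opaque box iff Omega predicts one-boxing; the nanobit figure is read as the reduction in entropy of a binary prediction made with accuracy r):

```python
from math import log2

def eu_one_box(r):
    # One-box: you get the $1,000,000 exactly when Omega predicted correctly.
    return r * 1_000_000

def eu_two_box(r):
    # Two-box: $1,000 if Omega predicted correctly (empty opaque box),
    # $1,001,000 if it predicted incorrectly (full opaque box).
    return r * 1_000 + (1 - r) * 1_001_000

# Crossover point: 1,000,000*r = 1,001,000 - 1,000,000*r  =>  r = 1,001/2,000
r_star = 1_001_000 / 2_000_000
print(r_star)                      # 0.5005

def bits_of_evidence(r):
    # Information about a binary choice predicted with accuracy r:
    # one coinflip's worth of entropy minus the remaining entropy H(r).
    return 1 - (-r * log2(r) - (1 - r) * log2(1 - r))

print(bits_of_evidence(r_star))    # ~7.2e-07 bits, i.e. roughly 721 nanobits
```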
Viewed in this light, it should make more sense—do people never have more than 1 microbit of information about your decision theory? (Note: with less drastic differences between the outcomes, the threshold is higher.)
(I don’t think the inclusion of probabilistic strategies changes the basic point.)
I had been thinking that the only way to even approximately realize a Newcomb’s-problem situation was with computer programs. But a threshold so low makes it sound as if even a human being could qualify as a fallible Omega, and that maybe you could somehow test all this experimentally. Though even if we had human players in an experiment who were one-boxing and reaping the rewards, I’d still be very wary of supposing that they were winning because TDT is correct. If the Omega player was successfully anticipating the choices of a player who uses TDT, it suggests that the Omega player knows what TDT is. The success of one-boxing in such a situation might be fundamentally due to coordination arising from common concepts, rather than due to TDT being the right decision theory.
But first let me talk about realizing Newcomb’s problem with computer programs, and then I’ll return to the human scenario.
When I think about doing it with computer programs, two questions arise.
First question: Would an AI that was capable of understanding that it was in a Newcomb situation also be capable of figuring out the right thing to do?
In other words, do we need to include a “TDT special sauce” from the beginning, in the makeup of such a program, in order for it to discover the merits of one-boxing; or is a capacity for ordinary causal reasoning, coupled with the capacity to represent the defining elements of Newcomb’s problem, enough for an independent discovery of these ideas?
Second question: How does Omega get its knowledge of the player’s dispositions, and does this make any difference to the situation? (And we can also ask how the player knows that Omega has the power of prediction!)
If omega() and player() are two agents running in the same computer, the easiest way for omega() to predict player()’s behavior is just to simulate player(). omega() would then enact the game twice. First, it would start a copy of player() running, telling it (falsely) that it had predicted its choice, and then it would see the choice it made under such conditions. Then, omega() would play the game for real with the original(?) player(), now telling it (truthfully) that it has a prediction for its choice (due to the simulation of the game situation that had just been performed).
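To make that two-pass protocol concrete, here is a rough sketch in Python (my own construction; run_game, omega, and the payoff numbers are illustrative, not anyone’s actual implementation):

```python
def run_game(player_fn, prediction, claim):
    """Present the Newcomb situation to player_fn and pay out accordingly."""
    box_b = 1_000_000 if prediction == "one-box" else 0
    choice = player_fn(claim)   # the player sees only the claim, never box B's contents
    payoff = box_b if choice == "one-box" else box_b + 1_000
    return choice, payoff

def omega(player_fn):
    # Pass 1: simulate the whole game, telling the player (falsely, at this point)
    # that its choice has been predicted. Since box B is opaque to the player,
    # any placeholder prediction will do for the simulation.
    simulated_choice, _ = run_game(player_fn, prediction="two-box",
                                   claim="I have already predicted your choice.")
    # Pass 2: play for real with the "original(?)" player (here just another call
    # to the same function), and now the claim is true.
    return run_game(player_fn, prediction=simulated_choice,
                    claim="I have already predicted your choice.")

# A trivial example player: an unconditional one-boxer.
print(omega(lambda claim: "one-box"))   # ('one-box', 1000000)
```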
For certain types of player(), explicit simulation should not be necessary. If player() always does the same thing, completely unaffected by initial conditions and without any cognitive process, omega() can just inspect the source code. If player() has a simple decision procedure, something less than full simulation may also be sufficient. But full simulation of the game, including simulation of the beginning, where player() is introduced to the situation, should surely be sufficient, and for some cases (some complex agents) it will be necessary.
cousin_it’s scenario is a step down this path—world() corresponds to omega(), agent() to player(). But its agents, world() at least, lack the cognitive structure of real decision-makers. world() and agent() are functions whose values mimic the mutual dependency of Newcomb’s Omega and a TDT agent, and agent() has a decision procedure, though it’s just a brute-force search (and it requires access to world()’s source, which is unusual). But to really have confidence that TDT was the right approach in this situation, and that its apparent success was not just an artefact arising (e.g.) from more superficial features of the scenario, I need both omega() and player() to explicitly be agents that reason on the basis of evidence.
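For what it’s worth, here is a minimal sketch of the kind of world()/agent() pair I have in mind (my own simplification, not cousin_it’s actual code; in particular, “access to world()’s source” is stood in for by the ability to evaluate world() against a counterfactual version of oneself):

```python
def world(agent_fn):
    # Omega's prediction is just another call to the agent: box B holds the
    # million exactly when the agent, as predicted, one-boxes.
    prediction = agent_fn()
    box_b = 1_000_000 if prediction == "one-box" else 0
    choice = agent_fn()
    return box_b if choice == "one-box" else box_b + 1_000

def agent():
    # Brute-force search: for each candidate output, ask what world() would pay
    # if every call to the agent returned that output, then emit the best one.
    best_action, best_payoff = None, float("-inf")
    for action in ("one-box", "two-box"):
        payoff = world(lambda: action)
        if payoff > best_payoff:
            best_action, best_payoff = action, payoff
    return best_action

print(world(agent))   # 1000000: the brute-force searcher one-boxes
```

Note the mutual dependency: world() calls agent(), and agent() evaluates world() under each of its possible outputs, which is the structure described above.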
If we return now to the scenario of human beings playing this game with each other, with one human player being a “fallible Omega”… we do at least know that humans are agents that reason on the basis of evidence. But here, what we’d want to show is that any success of TDT among human beings actually resulted because of evidence-based cognition, rather than from (e.g.) “coordination due to common concepts”, as I suggested in the first paragraph.
This is basically what EY discusses in pp. ~27-37 of the thesis he posted, where he poses it as the difference between optimality on action-determined problems (in which ordinary causal reasoning suffices to win) and optimality on decision-determined problems (on which ordinary causal reasoning loses, and you have to incorporate knowledge of “what kind of being makes this decision”).
Of course, if player() is sentient, simulating it like this would require omega() to create and destroy a sentient being in order to model player().
I don’t think there’s anything especially interesting about that point; it’s just the point where the calculated expected utilities of one-boxing and two-boxing become equal.