I had been thinking that the only way to even approximately realize a Newcomb’s-problem situation was with computer programs. But a threshold so low makes it sound as if even a human being could qualify as a fallible Omega, and that maybe you could somehow test all this experimentally. Though even if we had human players in an experiment who were one-boxing and reaping the rewards, I’d still be very wary of supposing that they were winning because TDT is correct. If the Omega player successfully anticipates the choices of a player who uses TDT, that suggests the Omega player knows what TDT is. The success of one-boxing in such a situation might be fundamentally due to coordination arising from common concepts, rather than due to TDT being the right decision theory.
But first let me talk about realizing Newcomb’s problem with computer programs, and then I’ll return to the human scenario.
When I think about doing it with computer programs, two questions arise.
First question: Would an AI that was capable of understanding that it was in a Newcomb situation also be capable of figuring out the right thing to do?
In other words, do we need to include a “TDT special sauce” from the beginning, in the makeup of such a program, in order for it to discover the merits of one-boxing; or is a capacity for ordinary causal reasoning, coupled with the capacity to represent the defining elements of Newcomb’s problem, enough for an independent discovery of these ideas?
Second question: How does Omega get its knowledge of the player’s dispositions, and does this make any difference to the situation? (And we can also ask how the player knows that Omega has the power of prediction!)
If omega() and player() are two agents running in the same computer, the easiest way for omega() to predict player()’s behavior is just to simulate player(). omega() would then enact the game twice. First, it would start a copy of player() running, telling it (falsely) that it had predicted its choice, and then it would see the choice it made under such conditions. Then, omega() would play the game for real with the original(?) player(), now telling it (truthfully) that it has a prediction for its choice (due to the simulation of the game situation that had just been performed).
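The two-pass procedure just described can be sketched roughly as follows. This is only an illustration under stated assumptions: the payoff amounts are the conventional Newcomb figures (not given above), the player is modeled as a plain function of the briefing it receives, and the names BRIEFING, one_boxer and two_boxer are hypothetical.

```python
from typing import Callable

# Assumed payoffs: the conventional Newcomb amounts, not taken from the text.
BIG, SMALL = 1_000_000, 1_000

# A player is modeled as a function from the briefing it hears to a choice.
Player = Callable[[str], str]

BRIEFING = "Omega has predicted your choice and filled the boxes accordingly."


def one_boxer(briefing: str) -> str:
    """Hypothetical player that always takes only the opaque box."""
    return "one-box"


def two_boxer(briefing: str) -> str:
    """Hypothetical player that always takes both boxes."""
    return "two-box"


def omega(player: Player) -> int:
    """Enact the game twice and return the payoff of the real game."""
    # Pass 1: run a copy of the player, telling it (falsely, at this point)
    # that its choice has already been predicted.
    predicted = player(BRIEFING)

    # Fill the opaque box according to what the simulated copy did.
    opaque = BIG if predicted == "one-box" else 0

    # Pass 2: play for real; the same briefing is now true.
    actual = player(BRIEFING)
    return opaque if actual == "one-box" else opaque + SMALL
```

In this sketch the copy and the "original" are the same deterministic function, so the two passes necessarily agree; a player with hidden state or randomness would need an actual copy of that state for the first pass.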
For certain types of player(), explicit simulation should not be necessary. If player() always does the same thing, completely unaffected by initial conditions and without any cognitive process, omega() can just inspect the source code. If player() has a simple decision procedure, something less than full simulation may also be sufficient. But full simulation of the game, including simulation of the beginning, where player() is introduced to the situation, should surely be sufficient, and for some cases (some complex agents) it will be necessary.
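For the simplest case above, a player that always does the same thing, source inspection might look something like the sketch below. It assumes the player's source is available as a Python string, and the helper name constant_choice is hypothetical. It recognizes only the most trivial pattern (a single function whose body is one `return` of a string literal) and gives up on anything else, which is where simulation would have to take over.

```python
import ast
from typing import Optional


def constant_choice(source: str) -> Optional[str]:
    """Return the player's choice if it can be read directly off the source.

    Only handles a module containing one function whose entire body is
    `return "<literal>"`; returns None for anything more complicated.
    """
    tree = ast.parse(source)
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.FunctionDef):
        return None
    body = tree.body[0].body
    if len(body) == 1 and isinstance(body[0], ast.Return):
        value = body[0].value
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            return value.value
    return None


# Hypothetical player sources, for illustration only.
CONSTANT_PLAYER = 'def player(briefing):\n    return "one-box"\n'
DELIBERATING_PLAYER = (
    'def player(briefing):\n'
    '    if "predicted" in briefing:\n'
    '        return "one-box"\n'
    '    return "two-box"\n'
)
```

Here constant_choice(CONSTANT_PLAYER) yields the choice without running anything, while DELIBERATING_PLAYER falls outside the pattern and would need a smarter analysis or a full simulation.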
cousin_it’s scenario is a step down this path: world() corresponds to omega(), and agent() to player(). But its agents, world() at least, lack the cognitive structure of real decision-makers. world() and agent() are functions whose values mimic the mutual dependency of Newcomb’s Omega and a TDT agent. agent() does have a decision procedure, though it’s just a brute-force search (and it requires access to world()’s source, which is unusual). But to really have confidence that TDT is the right approach in this situation, and that its apparent success is not just an artefact arising (e.g.) from more superficial features of the scenario, I need both omega() and player() to explicitly be agents that reason on the basis of evidence.
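To make the shape of that setup concrete, here is a much-simplified sketch. It is not cousin_it’s actual formulation: the payoffs are the conventional Newcomb amounts (an assumption), and the "brute-force search" is reduced to trying each action in turn as the agent's output inside world() and keeping the best result.

```python
from typing import Callable

ACTIONS = ("one-box", "two-box")


def world(agent_output: Callable[[], str]) -> int:
    """Stand-in for omega(): predicts by calling the agent, then pays out."""
    prediction = agent_output()
    opaque = 1_000_000 if prediction == "one-box" else 0  # assumed payoffs
    choice = agent_output()
    return opaque if choice == "one-box" else opaque + 1_000


def agent(world_fn: Callable[[Callable[[], str]], int]) -> str:
    """Brute-force search: needs access to world() itself, as noted above."""
    # Substitute each candidate action for the agent's output everywhere
    # it appears in the world, and pick the action with the highest payoff.
    return max(ACTIONS, key=lambda action: world_fn(lambda: action))
```

Because the candidate action is substituted into both the prediction and the choice, the search settles on one-boxing. Nothing in it reasons from evidence, though, which is exactly the limitation raised above.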
If we return now to the scenario of human beings playing this game with each other, with one human player being a “fallible Omega”… we do at least know that humans are agents that reason on the basis of evidence. But here, what we’d want to show is that any success of TDT among human beings actually resulted from evidence-based cognition, rather than from (e.g.) “coordination due to common concepts”, as I suggested in the first paragraph.
In other words, do we need to include a “TDT special sauce” from the beginning, in the makeup of such a program, in order for it to discover the merits of one-boxing; or is a capacity for ordinary causal reasoning, coupled with the capacity to represent the defining elements of Newcomb’s problem, enough for an independent discovery of these ideas?
This is basically what EY discusses in pp. ~27-37 of the thesis he posted, where he poses it as the difference between optimality on action-determined problems (in which ordinary causal reasoning suffices to win) and optimality on decision-determined problems (on which ordinary causal reasoning loses, and you have to incorporate knowledge of “what kind of being makes this decision”).
Of course, if player() is sentient, simulating it would require omega() to create and destroy a sentient being in order to model player().