On this topic, I’d like to suggest a variant of Newcomb’s problem that I don’t recall seeing anywhere on LessWrong (or anywhere else). As usual, Omega presents you with two boxes, box A and box B. She says: “You may take either box A alone or both boxes. Box B contains $1,000. Box A either contains $1,000,000 or is empty. Here is how I decided what to put in box A: I considered a perfectly rational agent placed in a situation identical to yours. If I predicted she would take one box, I put the money in box A; otherwise I put nothing.” Suppose further that Omega has put many other people into this exact situation, and in all those cases the amount of money in box A was identical.
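To make the structure concrete, here is a minimal sketch of the payoff rule as I intend it (the function and argument names are mine, just for illustration): box A’s contents depend only on Omega’s prediction about the hypothetical perfectly rational agent, never on your own choice.

```python
# Sketch of the variant's payoff structure (my own framing, not Omega's exact words).
# Key assumption: box A is filled based on a prediction about a hypothetical
# "perfectly rational agent", not based on a prediction about you.

def payoff(your_choice: str, rational_agent_one_boxes: bool) -> int:
    """your_choice is 'one-box' or 'two-box'."""
    box_a = 1_000_000 if rational_agent_one_boxes else 0
    box_b = 1_000
    return box_a if your_choice == "one-box" else box_a + box_b

# Whatever the rational agent is predicted to do, your own choice only ever
# adds or forgoes box B's $1,000.
for predicted in (True, False):
    print(predicted, payoff("one-box", predicted), payoff("two-box", predicted))
```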
The reason I mention the problem is that while the original Newcomb’s problem is analogous to the Prisoner’s Dilemma with clones that you described, this problem is more directly analogous to the ordinary one-shot Prisoner’s Dilemma. In the Prisoner’s Dilemma with clones and in Newcomb’s problem, your outcome is determined by a factor that you don’t directly control but that is nonetheless influenced by your strategy. In the ordinary Prisoner’s Dilemma and in my Newcomb-like problem, this factor is controlled by a rational agent distinct from yourself (though note that in the Prisoner’s Dilemma this agent’s outcome is directly influenced by what you do, whereas in my dilemma it is not).
People have argued that you should cooperate in the one-shot Prisoner’s Dilemma for essentially the same reason you should one-box. I disagree, and I think my hypothetical shows that the two problems are disanalogous by supplying a more faithful analogue. While there is a strong argument for one-boxing in Newcomb’s problem, which I accept, the case is much less clear here. I think the argument that a TDT agent would cooperate in the Prisoner’s Dilemma is flawed; TDT in its current form is not precise enough to give a clear answer to this question. After all, both the CDT argument in terms of dominated strategies and the superrational argument in terms of the underlying symmetry of the situation can be phrased in TDT, depending on how you draw the causal graph over computations.
I think even TDT says that you should 2-box in a Newcomb’s problem where the box is full if and only if false.
But more seriously, presumably in your scenario a “perfectly rational agent” means an agent whose behavior is specified by some fixed, known program. In that case, the participant can determine whether or not the box is full: either the box is always full or the box is always empty, and the participant knows which. If you are playing Newcomb’s problem with the box always full, you 2-box. If you are playing Newcomb’s problem with the box always empty, you 2-box. Therefore you 2-box. Therefore the perfectly rational agent 2-boxes. Therefore the box is always empty.
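Here is the same case analysis as a small sketch, under the assumption that “perfectly rational agent” just means some fixed, known program and that Omega fills box A exactly when that program 1-boxes (again, the names are mine). The only self-consistent state of affairs is the empty box.

```python
# Case analysis, assuming the "rational agent" is a fixed, known program that
# best-responds to a box state it can determine in advance.

def best_response(box_a_full: bool) -> str:
    # With box A's contents already fixed, taking both boxes is worth $1,000
    # more in either case.
    one_box = 1_000_000 if box_a_full else 0
    return "two-box" if one_box + 1_000 > one_box else "one-box"

# Look for the assignment where Omega's rule and the agent's choice agree.
for box_a_full in (True, False):
    agent_choice = best_response(box_a_full)      # what the known program does
    omega_fills = (agent_choice == "one-box")     # Omega's stated rule
    if omega_fills == box_a_full:
        print("consistent state:", box_a_full, agent_choice)
# Only box_a_full = False is consistent: the box is always empty.
```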
OK. OK. OK. You TDT people will say something like “but I am a perfectly rational agent, and therefore my actions are non-causally related to whether or not the box is full, so I should 1-box since doing so (acausally) makes the box full.” But if I modify your code to 2-box in this type of Newcomb’s problem, you do better, and thus you were never perfectly rational to begin with.
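In numbers (a sketch, taking the TDT agent’s own claim at face value: the “perfectly rational agent” 1-boxes, so Omega fills box A regardless of which program you actually run):

```python
# Assumption: Omega's prediction tracks the original, allegedly rational
# program, so box A stays full even if your code is patched.
box_a = 1_000_000          # filled, because the original program 1-boxes

original = box_a            # the original program 1-boxes: $1,000,000
patched  = box_a + 1_000    # same program, edited to 2-box here: $1,001,000

print(patched > original)   # True: the patched program strictly outperforms
                            # the original, so the original wasn't "perfectly
                            # rational" after all.
```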
On the other hand, if the universe can punish you directly (i.e., not simply via your behavior) for running the wrong program, then the program that does best depends heavily on which universe you are in, and so there cannot be a “perfectly rational agent” unless you assume a fixed prior over possible universes.