The Calculus of Newcomb’s Problem
In the previous post, we applied some calculus to game-theoretic problems. Let’s now look at the famous Newcomb’s problem, and how we can use calculus to find a solution.
Newcomb’s problem
From Functional Decision Theory: A New Theory of Instrumental Rationality:
An agent finds herself standing in front of a transparent box labeled “A” that contains $1,000, and an opaque box labeled “B” that contains either $1,000,000 or $0. A reliable predictor, who has made similar predictions in the past and been correct 99% of the time, claims to have placed $1,000,000 in box B iff she predicted that the agent would leave box A behind. The predictor has already made her prediction and left. Box B is now empty or full. Should the agent take both boxes (“two-boxing”), or only box B, leaving the transparent box containing $1,000 behind (“one-boxing”)?
First, we need to come up with a function modeling the amount of money the agent gets. There’s only one action to take: call it a. Then V(a) is the function giving the amount of money earned for each action (one-boxing or two-boxing). However, as you might know, thinkers have historically diverged on what V(a) should be.
Causal Decision Theory
Causal Decision Theory (CDT) states that an agent should look only at the causal effects of her actions. In Newcomb’s problem, this means acknowledging that the predictor has already made her prediction (and either put $1,000,000 in box B or not), and that the agent’s action now can’t causally influence what’s in box B. So either there’s $1,000,000 in box B or there isn’t. In both cases, two-boxing earns $1,000 more (the contents of box A) than one-boxing. Then V(a) = B + 1,000a, where B is a constant representing the amount of money in box B and a ∈ {0, 1}, with a = 0 for one-boxing and a = 1 for two-boxing. V′(a) = 1,000 > 0, so of course V(a) is maximized at the highest value of a: a = 1 (two-boxing).
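To make the CDT calculation concrete, here is a minimal sketch in Python (the function name and the way box B’s contents are passed in are my own choices for illustration, not part of the problem statement): it evaluates V(a) = B + 1,000a for both actions, under either possible content of box B, and shows that two-boxing comes out exactly $1,000 ahead in both cases.

```python
def cdt_value(a: int, B: int) -> int:
    """CDT value function V(a) = B + 1,000a.

    a: 0 for one-boxing, 1 for two-boxing.
    B: the (already fixed) contents of box B, in dollars.
    """
    return B + 1_000 * a

# Whatever the predictor did, two-boxing earns exactly $1,000 more.
for B in (0, 1_000_000):
    print(f"B = {B:>9,}: one-box -> {cdt_value(0, B):>9,}, "
          f"two-box -> {cdt_value(1, B):>9,}")
```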
This is all pretty straightforward, but the problem is that almost all agents using CDT end up with only $1,000: the predictor predicts with 0.99 accuracy that they will two-box, and puts nothing in box B. It’s one-boxing that gets you the $1,000,000. Enter logical decision theories, e.g. Functional Decision Theory.
Functional Decision Theory
Functional Decision Theory (FDT) reasons about the effects of the agent’s decision procedure (which produces an action) instead of the effects of her actions. The point is that a decision procedure can be implemented multiple times. In Newcomb’s problem, the predictor appears to implement the agent’s decision procedure: she can run this model of the agent’s decision procedure to see what action it produces, and use that to predict what the actual agent will do. In V(a) = B + 1,000a, this means that B and a depend on the same decision procedure! The decision procedure’s implementation in the agent produces a, and its implementation in the predictor determines B. Let’s write d for the decision procedure: d ∈ {0, 1}, where 0 means a one-boxing decision and 1 means a two-boxing decision. Then a = d, and B = 0.99 ∗ 1,000,000 − 0.98 ∗ 1,000,000d = 990,000 − 980,000d. After all: if d = 0, the agent decides on one-boxing and the predictor will have predicted that with 0.99 accuracy, giving an expected value of 0.99 ∗ $1,000,000 = $990,000 in box B. Should the agent decide to two-box, the predictor will have predicted that with probability 0.99 and only put $1,000,000 in box B if she mistakenly predicted a one-box action. Then the expected value of box B is 0.01 ∗ $1,000,000 = $10,000, which is exactly what B = 990,000 − 980,000d gives for d = 1. Great! We now have V(d) = (990,000 − 980,000d) + 1,000d = 990,000 − 979,000d. V′(d) = −979,000 < 0, so the lowest possible decision wins: d = 0, which gives V(0) = 990,000, whereas V(1) = 990,000 − 979,000 = 11,000.
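A similarly minimal sketch of the FDT calculation (again, the function names and structure are mine, chosen for illustration): box B’s expected contents are computed from the decision d, and V(d) = E[B | d] + 1,000d reproduces the numbers above.

```python
def expected_box_b(d: int, accuracy: float = 0.99) -> float:
    """Expected contents of box B given decision d (0 = one-box, 1 = two-box).

    The predictor puts $1,000,000 in box B only when she predicts one-boxing,
    which happens with probability `accuracy` if d = 0
    and with probability 1 - accuracy if d = 1.
    """
    p_predicts_one_box = accuracy if d == 0 else 1 - accuracy
    return p_predicts_one_box * 1_000_000

def fdt_value(d: int) -> float:
    """FDT value function V(d) = E[B | d] + 1,000d."""
    return expected_box_b(d) + 1_000 * d

for d in (0, 1):
    print(f"d = {d}: V(d) = {fdt_value(d):,.0f}")
# d = 0: V(d) = 990,000
# d = 1: V(d) = 11,000
```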
This outcome reflects the fact that it’s one-boxers who almost always win $1,000,000, whereas two-boxers rarely do. When they do, they get the $1,000,000 plus the $1,000 of box A, for a total of $1,001,000, but the probability of getting the $1,000,000 is far too low for this to matter.
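As a sanity check on these numbers, here is a small simulation (assuming, as throughout, that the predictor is right 99% of the time, independently for each agent): one-boxers average roughly $990,000 and two-boxers roughly $11,000.

```python
import random

def play(one_boxes: bool, accuracy: float = 0.99) -> int:
    """Simulate one round of Newcomb's problem against a 99%-accurate predictor."""
    correct = random.random() < accuracy
    predicts_one_box = one_boxes if correct else not one_boxes
    box_b = 1_000_000 if predicts_one_box else 0
    return box_b if one_boxes else box_b + 1_000

trials = 100_000
for one_boxes in (True, False):
    avg = sum(play(one_boxes) for _ in range(trials)) / trials
    label = "one-boxers" if one_boxes else "two-boxers"
    print(f"{label}: average winnings ≈ ${avg:,.0f}")
```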