Correlated decision making: a complete theory

The title of this post most probably deserves a cautious question mark at the end, but I’ll go out on a limb and start sawing it off behind me: I think I’ve got a framework that consistently solves correlated decision problems. That is, those situations where different agents (a forgetful you at different times, your duplicates, or Omega’s prediction of you) will come to the same decision.
After my first post on the subject, Wei Dai asked whether my ideas could be formalised enough that they could be applied mechanically. There were further challenges: introducing positional information, and dealing with the difference between simulations and predictions. Since I claimed this sort of approach could apply to Newcomb’s problem, it is also useful to see it work in cases where the two decisions are only partially correlated: where Omega is good, but not perfect.
The theory
In standard decision making, it is easy to estimate your own contribution to your own utility; the contribution of others to your own utility is then estimated separately. In correlated decision-making, both steps are trickier; estimating your contribution is non-obvious, and the contribution from others is not independent. In fact, the question to ask is not “if I decide this, how much return will I make”, but rather “in a world in which I decide this, how much return will I make”.
You first estimate the contribution of each decision made to your own utility, using a simplified version of the CDP: if N correlated decisions are needed to gain some utility, then each decision maker is estimated to have contributed 1/N of the effort towards the gain of that utility.
Then the procedure under correlated decision making is:
1) Estimate the contribution of each correlated decision towards your utility, using CDP.
2) Estimate the probability that each decision actually happens (this is an implicit use of the SIA).
3) Use 1) and 2) to estimate the total utility that emerges from the decision.
To illustrate, apply it to the generalised absent-minded driver problem, where the returns for turning off at the first and second intersections are x and y respectively, while driving straight through grants a return of z. The expected return for going straight with probability p is R = (1-p)x + p(1-p)y + p²z.
Then the expected return for the driver at the first intersection is (1-p)x + [p(1-p)y + p²z]/2, since the y and z returns require two decisions before being claimed. The expected return for the second driver is [(1-p)y + pz]/2. The first driver exists with probability one, while the second driver exists with probability p, giving the correct return of R.
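To make this bookkeeping concrete, here is a minimal illustrative sketch in Python that rebuilds R from the two drivers’ CDP contributions and checks it against the direct formula; the payoff values are arbitrary placeholders.

```python
def direct_return(p, x, y, z):
    # R computed directly: turn off at the first intersection with
    # probability 1-p, turn off at the second with probability p(1-p),
    # drive straight through with probability p^2.
    return (1 - p) * x + p * (1 - p) * y + p**2 * z

def cdp_return(p, x, y, z):
    # First driver: full credit for x (one decision), half credit for
    # the y and z payoffs, which need two correlated decisions.
    first = (1 - p) * x + (p * (1 - p) * y + p**2 * z) / 2
    # Second driver: half credit for the y and z payoffs.
    second = ((1 - p) * y + p * z) / 2
    # Weight each driver by the probability that they exist at all.
    return 1.0 * first + p * second

x, y, z = 0.0, 4.0, 1.0  # placeholder payoffs
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert abs(direct_return(p, x, y, z) - cdp_return(p, x, y, z)) < 1e-12
```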
In the example given in Outlawing Anthropics, there are twenty correlated decision makers, all existing with probability 1/2. Two of them contribute towards a decision which has utility −52, hence each generates a utility of −52/2. Eighteen of them contribute towards a decision which has utility 12, hence each one generates a utility of 12/18. Summing this up, the total utility generated is [2*(−52)/2 + 18*(12)/18]/2 = −20, which is correct.
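As a quick numeric check of those figures, a sketch along the same lines:

```python
# Twenty correlated deciders, each existing with probability 1/2; two of
# them feed a decision worth -52, eighteen feed a decision worth +12.
# Each decider gets a 1/N share of the utility their decision produces.
existence_prob = 0.5
shares = [-52 / 2] * 2 + [12 / 18] * 18
total = existence_prob * sum(shares)
print(total)  # -20.0
```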
Simulation versus prediction
In the Newcomb problem, there are two correlated decisions: your choice of one- or two-boxing, and Omega’s decision on whether to put the money in the box. The return to you from each of these decisions is X/2 if you one-box, and 1000/2 if you two-box.
If Omega simulates you, you could be either decision maker, with probability 1/2; if he predicts without simulating, you are certainly the box-chooser. But it makes no difference: who you are is not the issue, since you are simply looking at the probability of each decision maker existing, which is 1 in both cases. So adding up the two utilities gives you the correct estimate.
Consequently, predictions and simulations can be treated similarly in this setup.
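A sketch of that bookkeeping, with X set to a conventional placeholder of 1,000,000 (the amount itself is left unspecified above):

```python
X = 1_000_000            # placeholder for the amount Omega may put in the box
existence = [1.0, 1.0]   # both correlated decisions certainly happen

one_box_shares = [X / 2, X / 2]          # CDP split of the X payoff
two_box_shares = [1000 / 2, 1000 / 2]    # CDP split of the 1000 payoff

one_box = sum(p * u for p, u in zip(existence, one_box_shares))  # X
two_box = sum(p * u for p, u in zip(existence, two_box_shares))  # 1000
print(one_box, two_box)
```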
Partial correlation
If two decisions are only partially correlated (say, Newcomb’s problem where Omega has a probability p of correctly guessing your decision), then the way to model it is to split it into several perfectly correlated pieces.
For instance, the partially correlated Newcomb’s problem can be split into one model which is perfectly correlated (with probability p), and one model which is perfectly anti-correlated (with probability 1-p). The return from two-boxing is 1000 in the first case and X+1000 in the second case. One-boxing gives a return of X in the first case and 0 in the second case. Hence the expected return from one-boxing is pX, and from two-boxing is 1000 + (1-p)X, which are the correct values.
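The same split as a short sketch, again treating p and X as placeholder values:

```python
def one_box(p, X):
    # Correlated branch (probability p): Omega predicted one-boxing, return X.
    # Anti-correlated branch (probability 1-p): Omega predicted two-boxing, return 0.
    return p * X + (1 - p) * 0

def two_box(p, X):
    # Correlated branch: Omega predicted two-boxing, return 1000.
    # Anti-correlated branch: Omega predicted one-boxing, return X + 1000.
    return p * 1000 + (1 - p) * (X + 1000)

p, X = 0.9, 1_000_000
print(one_box(p, X), two_box(p, X))
# One-boxing is better whenever p*X > 1000 + (1-p)*X, i.e. (2p - 1)*X > 1000.
```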
Adding positional information
SilasBarta asked whether my old model could deal with the absent-minded driver if there were some positional information. For instance, imagine there were a light at each crossing that could be red or green, and it was green 1/2 the time at the first crossing and 2/3 of the time at the second crossing. Then if your probability of continuing on a green light is g, and on a red light is r, your initial expected return is R = (2-r-g)x/2 + (r+g)D/2, where D = ((1-r)y + rz)/3 + 2((1-g)y + gz)/3.
Then if you are at the first intersection, your expected return must be (2-r-g)x/2 + (r+g)D/4 (CDP on y and z, which require two decisions), while if you are at the second intersection, your expected return is D/2. The first driver exists with certainty, while the second one exists with probability (r+g)/2, giving us the correct return R.
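A sketch checking that this decomposition reproduces R (payoffs are placeholders, as before):

```python
def D(r, g, y, z):
    # Expected payoff from the second intersection onwards: red light with
    # probability 1/3, green with probability 2/3.
    return ((1 - r) * y + r * z) / 3 + 2 * ((1 - g) * y + g * z) / 3

def direct_return(r, g, x, y, z):
    return (2 - r - g) * x / 2 + (r + g) * D(r, g, y, z) / 2

def cdp_return(r, g, x, y, z):
    first = (2 - r - g) * x / 2 + (r + g) * D(r, g, y, z) / 4  # CDP halves y and z
    second = D(r, g, y, z) / 2                                 # CDP halves y and z
    return first + (r + g) / 2 * second  # second driver exists with probability (r+g)/2

x, y, z = 0.0, 4.0, 1.0  # placeholder payoffs
for r, g in [(0.1, 0.9), (0.5, 0.5), (0.8, 0.2)]:
    assert abs(direct_return(r, g, x, y, z) - cdp_return(r, g, x, y, z)) < 1e-12
```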