The point of Newcomb’s Problem isn’t that it’s difficult to write a program that one-boxes; it’s that it’s difficult to write a program that comes out ahead both on Newcomblike problems with Predictors and on basic problems without Predictors. Yours would grab the 1,000 over the 1,001,000 in general, which would be the wrong move if the boxes of utility were natural features of a landscape.
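To make that trade-off concrete, here’s a minimal sketch (my own illustration, not part of the original exchange) scoring two fixed policies, always-one-box and always-two-box, both against a perfect Predictor and in a “natural landscape” variant where both boxes simply happen to be full. Neither fixed policy comes out ahead in both cases:

```python
# Hypothetical payoffs for Newcomb's Problem: box A always holds 1,000;
# box B holds 1,000,000 either because a perfect Predictor expects one-boxing,
# or (in the "natural landscape" variant) simply because it does.

def newcomb_payoff(policy, predictor_present):
    """Payoff for `policy` ('one-box' or 'two-box') in the chosen variant."""
    if predictor_present:
        box_b = 1_000_000 if policy == "one-box" else 0  # Predictor reacts to the policy
    else:
        box_b = 1_000_000  # natural feature of the landscape, full regardless
    box_a = 1_000
    return box_b if policy == "one-box" else box_a + box_b

for policy in ("one-box", "two-box"):
    print(policy,
          newcomb_payoff(policy, predictor_present=True),
          newcomb_payoff(policy, predictor_present=False))
# one-box 1000000 1000000
# two-box 1000 1001000
```

A program that always two-boxes wins the natural-landscape case and loses badly with the Predictor; a program that always one-boxes does the reverse, leaving the 1,000 on the table when no Predictor is involved.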
I don’t have a better decision theory than TDT, but I also don’t believe that what you do in the present affects the past. However, the nature of the situation is such as to reinforce, in a TDT-like agent, the illusion that decisions made in the present affect decisions simulated in the past. (That is, assuming it is an agent self-aware enough to have such beliefs.)
One conception of the relationship between CDT and TDT is that it is like the relationship between classical and relativistic mechanics: relativistic mechanics is truer, but it reduces to classical mechanics in a certain limit. But I think TDT is more like alphabetical decision theory (though useful in a far wider variety of scenarios): it is not a decision theory you would want to have outside of certain peculiar situations that offer an incentive to deviate from CDT.
I need to study UDT more, because sometimes it sounds like it’s just CDT in a multiverse context, and yet it’s supposed to favor one-boxing.
Would you cooperate in the Prisoner’s Dilemma against an almost-copy of yourself (with only trivial differences, so that your experiences would be distinguishable)? It can be set up so that neither of you decides within the light-cone of the other’s decision, so there’s no way your cooperation can physically ensure the other’s cooperation.
If you’re quite convinced that the reasonable thing is to defect, then pretty obviously you’ll get (D,D).
If you’re quite convinced that the reasonable thing is to cooperate, then pretty obviously you’ll get (C,C).
(OK, you could decide randomly, but then you’re just as likely to get (C,D) as (D,C).)
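Here’s a toy sketch (my own, assuming the standard Prisoner’s Dilemma payoffs) of why the outcomes collapse this way: your near-copy runs the same decision procedure you do, so deterministic reasoning can only produce matching moves, and randomizing makes the mixed outcomes equally likely in both directions.

```python
import random

# Standard PD payoffs (assumed for illustration): (my payoff, copy's payoff).
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def decide(randomize=False):
    """Whatever reasoning goes here, the near-copy runs the same reasoning."""
    if randomize:
        return random.choice(["C", "D"])  # independent coin flips can diverge
    return "C"  # change this to "D" and the copy's answer changes with it

me, copy = decide(), decide()
print((me, copy), PAYOFFS[(me, copy)])  # deterministic case: (C, C) or (D, D)
```

With `randomize=True`, the two runs flip independent coins, so (C,D) and (D,C) show up equally often, which is the parenthetical point above.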
This is another sort of problem that TDT and UDT get right without any need for ad-hoc add-ons. The point is that advanced decision theories can be reasonably simple (where applications of Löb’s Theorem are counted as simple), get the right answer in all the cases where CDT gets the right answer (grabbing the highest utility when you’re the only agent around, finding the Nash equilibrium in a zero-sum game, etc.), and also get the right answer when other agents are basing their decisions in a knowable way off of their predictions of what you’ll do in various hypotheticals. Newcomb’s Problem may sound artificial, but that’s because we’ve made that dependence as simple and deterministic as possible in order to have a good test problem.
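As a hedged illustration of that “best of both” property (my own toy construction, not TDT or UDT themselves), here is an agent that folds the Predictor’s knowable dependence on its policy into its evaluation, next to a plain maximizer that treats the box contents as already fixed; the former matches the latter when no Predictor is present and beats it when one is:

```python
def payoff(choice, box_b_full):
    """Box A holds 1,000; box B holds 1,000,000 when full."""
    return (1_000_000 if box_b_full else 0) + (1_000 if choice == "two-box" else 0)

def box_b_state(choice, predictor_present):
    """The Predictor fills box B iff it predicts one-boxing; otherwise B is full."""
    return (choice == "one-box") if predictor_present else True

def plain_maximizer(predictor_present):
    # Treats the contents as already fixed, so grabbing both always looks better.
    return "two-box"

def dependence_aware(predictor_present):
    # Evaluates each policy with the Predictor's response to that policy included.
    return max(("one-box", "two-box"),
               key=lambda c: payoff(c, box_b_state(c, predictor_present)))

for predictor_present in (True, False):
    for agent in (plain_maximizer, dependence_aware):
        choice = agent(predictor_present)
        print(agent.__name__, predictor_present,
              payoff(choice, box_b_state(choice, predictor_present)))
# plain_maximizer True 1000
# dependence_aware True 1000000
# plain_maximizer False 1001000
# dependence_aware False 1001000
```

This toy agent is handed `predictor_present` directly; the real work in TDT/UDT is deriving that kind of dependence from the agent’s own reasoning, which is where the applications of Löb’s Theorem mentioned above come in.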