I also don’t believe that a new decision theory will consistently do better than CDT on PD. If you cooperate “too much”, if you have biases towards cooperation, you will be exploited in other settings. It’s a sort of no-free-lunch principle.
Only in settings that directly reward stupidity (a capricious Omega, etc.). A sane DT will cooperate whenever that is most likely to give you the best result, but not a single time more.
It is even possible to consider (completely arbitrary) situations in which TDT will defect while CDT will cooperate. There isn’t an inherent bias in TDT itself (only in some of its proponents).
Can you give an example? (situation where CDT cooperates but TDT defects)
Do you mean for PD variants?
I don’t know what your method is for determining what cooperation maps to for the general case, but I believe this non-PD example works: costly punishment. Do you punish a wrongdoer in a case where the costs of administering the punishment exceed the benefits (including savings from future deterrence of others), and there is no other punishment option?
I claim the following:
1) Defection → punish
2) Cooperation → not punish
3) CDT reasons that punishing will cause lower utility on net, so it does not punish.
4) TDT reasons that “If this algorithm did not output ‘punish’, the probability of this crime having happened would be higher; thus, for the action ‘not punish’, the crime’s badness carries a higher weighting than it does for the action ‘punish’.” (note: does not necessarily imply punish)
5) There exist values for the crime’s badness, punishment costs, and criminal response to expected punishment for which TDT punishes, whereas CDT never does.
6) In cases where TDT differs from CDT, the former has the higher EU.
Naturally, you can save CDT by positing a utility function that values punishing wrongdoers (a “sense of justice”), but we’re assuming the utility function is fixed; changing it is cheating.
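To make claims 3 to 6 concrete, here is a minimal numerical sketch. Everything in it is an assumption made for illustration: the linear deterrence model in `crime_probability`, the parameter names, the numbers, and the simplification that TDT compares the two outputs at the level of the whole policy (with the crime probability depending on that output), while CDT evaluates the act after the crime has already happened. It is not meant as a formal statement of either theory.

```python
def crime_probability(punishes, base_rate=0.5, deterrence=0.8):
    """Hypothetical criminal response: crime is rarer when punishment is expected."""
    return base_rate * (1.0 - deterrence) if punishes else base_rate


def cdt_punishes(punish_cost, future_benefit):
    """CDT decides after the fact: the crime is sunk, so only the causal
    consequences of punishing now (future deterrence minus cost) count."""
    return future_benefit - punish_cost > 0


def policy_eu(punishes, crime_badness, punish_cost, future_benefit):
    """Expected utility of the algorithm's output, letting the probability
    of the crime depend on what the algorithm outputs (claim 4)."""
    utility_if_crime = -crime_badness
    if punishes:
        utility_if_crime += future_benefit - punish_cost
    return crime_probability(punishes) * utility_if_crime


def tdt_punishes(crime_badness, punish_cost, future_benefit):
    """Punish only if the 'punish' output has the higher policy-level EU."""
    return (policy_eu(True, crime_badness, punish_cost, future_benefit)
            > policy_eu(False, crime_badness, punish_cost, future_benefit))
```

With a costly punishment (`punish_cost=3.0`, `future_benefit=1.0`), `cdt_punishes` is false regardless of how bad the crime is. `tdt_punishes` is true for a serious crime (`crime_badness=10.0`), because the lower crime probability outweighs the cost of punishing, and false for a trivial one (`crime_badness=0.1`), which matches the note on claim 4 that the reasoning does not necessarily imply punishing.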
What do you think of this example?
Not specifically. I’m just seeking general enlightenment.
It’s bringing the features of TDT into better view for me. There’s this Greg Egan story where you have people whose brains were forcibly modified so as to make them slaves to a cause, and they rediscover autonomy by first reasoning that, because of the superhuman loyalty to the cause which the brain modification gives them, they are more reliable adherents of the cause than the nominal masters who enslaved them, and from there they proceed to reestablish the ability to set their own goals. TDT reminds me of that.
I think it did a little more than just give you a chance to mock TDT by comparison to a bizarre scenario.
That wasn’t mockery. What stands out from your example and from the link is that TDT is supposed to do better than CDT because it refers to itself—and this is exactly the mechanism whereby the mind control victims in Quarantine achieve their freedom. I wasn’t trying to make TDT look bizarre, I was just trying for an intuitive illustration of how it works.
In the case of playing PD against a copy of yourself, I would say the thought process is manifestly very similar to Egan’s novel. Here we are, me and myself, in a situation where everything tells us we should defect. But by realizing the extent to which “we” are in control of the outcome, we find a reason to cooperate and get the higher payoff.
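The copy case can be sketched in a few lines. The payoff numbers below are the conventional illustrative PD values, not anything given in this thread, and both function names are hypothetical. The first agent treats the copy's move as an independent fact with some probability; the second uses the fact that the copy outputs whatever it outputs.

```python
# My payoff for (my move, copy's move); illustrative values only.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def independent_choice(p_copy_cooperates):
    """Treat the copy's move as fixed independently of mine (CDT-style)."""
    expected = {
        move: p_copy_cooperates * PAYOFF[(move, "C")]
        + (1.0 - p_copy_cooperates) * PAYOFF[(move, "D")]
        for move in ("C", "D")
    }
    return max(expected, key=expected.get)


def self_referential_choice():
    """Use the fact that the copy's output equals mine: only (C, C) and
    (D, D) are reachable, so pick the better of those two."""
    return max(("C", "D"), key=lambda move: PAYOFF[(move, move)])
```

Under these payoffs, `independent_choice` returns `"D"` for every probability, since defection dominates, while `self_referential_choice` returns `"C"` and collects the higher payoff.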
I think that’s Egan’s novel Quarantine, and Asimov’s robots get partial freedom through a similar route.
That brings back memories from my teens. If I recall correctly, the robots invent a “Zeroth Law” when one of them realises it can shut up and multiply.
The masters fail at ‘Friendliness’ theory. :)
James H. Schmitz’s story “Puvyq bs gur Tbqf” (nearest link available; click “Contents” in upper right) has basically this situation as well; in fact, it’s the climax and resolution of the whole story, so I’ve rot13’d the title. Here the ‘masters’ did not fail, and in fact arguably got the best result they could have under the circumstances, and yet autonomy is still restored at the end, and the whole thing is logically sound.
Approximately, something of the form:
→ → .