My other comment might have come across as unnecessarily hostile. For the sake of a productive discussion, can you please point out (or provide a more specific reference for) how TDT can possibly succeed at achieving mutual cooperation in the standard one-shot, no-communication prisoner’s dilemma? Because, in general, that doesn’t seem to be possible.
I mean, if you are certain (or highly confident) that you are playing against a mental clone (a stronger condition than just using the same decision theory), then you can safely cooperate. But in most scenarios it’s foolish to have a prior like that: other agents that aren’t mental clones of yours will shamelessly exploit you. Even if you start with a large population of mental clones playing anonymous PD against each other, if there are mutations (random or designed), then as soon as defectors appear they will start exploiting the clones that blindly cooperate, at least until the clones have updated their beliefs and switched to defecting, which yields the Nash equilibrium. Blindly trusting that you are playing against mental clones is a very unstable strategy.
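To make this concrete, here is a minimal simulation sketch of that invasion dynamic (the payoff values and population sizes are illustrative assumptions of mine, not from any particular source):

```python
import random

# Standard PD payoffs to (row, column); illustrative values with T > R > P > S.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def average_scores(population, rounds=10000, seed=0):
    """Average per-game payoff by strategy under anonymous random pairings."""
    rng = random.Random(seed)
    totals, counts = {"C": 0, "D": 0}, {"C": 0, "D": 0}
    for _ in range(rounds):
        a, b = rng.sample(range(len(population)), 2)
        pa, pb = PAYOFF[(population[a], population[b])]
        totals[population[a]] += pa
        totals[population[b]] += pb
        counts[population[a]] += 1
        counts[population[b]] += 1
    return {s: totals[s] / counts[s] for s in totals if counts[s]}

# 95 blind cooperators, 5 mutant defectors.
scores = average_scores(["C"] * 95 + ["D"] * 5)
# The mutant defectors strictly out-earn the blind cooperators they exploit.
assert scores["D"] > scores["C"]
```

The defectors' advantage only grows as they reproduce, which is the instability described above.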
In the program-swap version of the one-shot prisoner’s dilemma, strategies like the CliqueBots (and generalizations) can achieve stable mutual cooperation because they can actually check, for each game, whether the other party is a mental clone (there is the problem of coordinating on a single clique, but once the clique is chosen, there is no incentive to deviate from it). But achieving mutual cooperation in the standard one-shot PD seems impossible without making unrealistic assumptions about the other players. I don’t think that even Yudkowsky or other MIRI people argued that TDT can achieve that.
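A toy sketch of the CliqueBot idea, assuming each player is handed the opponent's literal source code before moving (this "compare source strings" encoding is my own simplification of the program-swap setup):

```python
def cliquebot(my_source: str, opponent_source: str) -> str:
    """Program-swap PD strategy: cooperate iff the opponent's source code
    is an exact copy of mine, i.e. iff it is a fellow clique member."""
    return "C" if opponent_source == my_source else "D"

# The agreed-upon clique source (hypothetical identifier for illustration).
CLIQUE = "cliquebot-v1"

# Two clique members recognize each other and lock in mutual cooperation.
assert cliquebot(CLIQUE, CLIQUE) == "C"
# Against anything else -- a defector, or even a differently-coded
# cooperator -- a CliqueBot plays the safe Nash move and defects.
assert cliquebot(CLIQUE, "defect-bot") == "D"
```

The syntactic-equality check is what makes the strategy unexploitable by mutants, and also what creates the coordination problem: two cliques with different source strings defect against each other.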
That’s the infamous 120-page draft (or has it been updated?). I started reading it some time ago, found some basic errors and a lot of rambling, and decided it wasn’t worth my time.

Anyway, I found a good critique of superrationality: http://www.science20.com/hammock_physicist/whats_wrong_superrationality-100813
The linked article says something like: superrationality is not necessary; all you need is to realize that in real life there are no single-shot PDs, and therefore you should use the optimal strategy for the iterated PD, which cooperates on the first move.
That’s simply refusing to deal with the original question and answering something different instead.
No, it says that one-shot PD is rare, and when it actually happens defecting is indeed the correct choice, even if it is counterintuitive because we are much more accustomed to scenarios that resemble the iterated PD.
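Why defecting is correct in the genuine one-shot case can be checked directly from the payoff matrix (the numbers below are the usual illustrative values): defection strictly dominates cooperation, so (D, D) is the unique Nash equilibrium.

```python
# One-shot PD payoffs to the row player (illustrative values, T > R > P > S).
payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Whatever the opponent does, defecting pays strictly more than cooperating.
for their_move in ("C", "D"):
    assert payoff[("D", their_move)] > payoff[("C", their_move)]
```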
Timeless Decision Theory paper.
Not particularly. I recommend reading the paper.