The usual Nash equilibrium argument doesn’t depend on players knowing when the end is… there just has to be some upper bound on the number of iterations. This is sufficient to show that any true equilibrium for the game must involve both players always defecting. It is mathematically rigorous.
This seems like it would depend on the assumptions. Suppose that you are playing an iterated PD where, after each round, there is a 10% chance that the game stops and a 90% chance that it continues. Does the proof still apply?
Edit: The more general point is that this proof by backwards induction, which is central to this post, sounds extremely fragile. If I’m reconstructing it correctly in my head, it depends on there being some round where, if you reached it, you would be absolutely certain that it was the last round. It also depends on you being absolutely certain that the other person would defect in that round.* If either probability shifts from zero to epsilon, the proof breaks. And with any agent that was programmed as haphazardly as humans were, you have to deal in probabilities rather than logical certainties.
* It seems that it would also depend on them knowing (with certainty) that you know that they’ll defect, and so on up to as many meta-levels as there are rounds. And does it also depend on you and the other person having the same belief about the maximum possible number of rounds? If you think “there’s no way that we’ll play a trillion rounds” and I think “there’s no way that we’ll play 2 trillion rounds”, would the proof still go through?
If after each round, there is a 90% chance of continuation, then this is an infinite iterated prisoner’s dilemma, and you are right, the backwards induction doesn’t apply in that case. But my point was that this never applies in the real world… there are always finite upper bounds, and in that case a strategy like TFT is vulnerable to invasion by a variant (or “mutant”) strategy which defects on the last round. There is no need to assume a biological mutation, by the way… It could be a cultural mutation, a learned response which others copy, etc.
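To make the invasion claim concrete, here is a minimal sketch of a fixed-length iterated PD. The payoff values (T=5, R=3, P=1, S=0) and the function names are my own illustrative assumptions, not anything specified in the thread; the mutant is simply TFT that defects on the known final round.

```python
# Conventional illustrative payoffs (assumed): temptation, reward, punishment, sucker.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}

def tft(their_hist, round_no, n_rounds):
    """Tit-for-tat: cooperate first, then copy the opponent's previous move."""
    return "C" if not their_hist else their_hist[-1]

def tft_last_defect(their_hist, round_no, n_rounds):
    """Hypothetical mutant: plays TFT but defects on the known final round."""
    if round_no == n_rounds - 1:
        return "D"
    return tft(their_hist, round_no, n_rounds)

def play(strat_a, strat_b, n_rounds):
    """Play a game of exactly n_rounds and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for r in range(n_rounds):
        a = strat_a(hist_b, r, n_rounds)
        b = strat_b(hist_a, r, n_rounds)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

if __name__ == "__main__":
    n = 10
    print("TFT vs TFT:      ", play(tft, tft, n))
    print("mutant vs TFT:   ", play(tft_last_defect, tft, n))
    print("mutant vs mutant:", play(tft_last_defect, tft_last_defect, n))
```

Under these assumed payoffs, the mutant scores more against TFT than TFT scores against itself, so it can invade, and TFT scores less against the mutant than the mutant scores against itself, so TFT cannot invade back. This only holds when the final round is known with certainty, which is exactly the assumption under dispute.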
By the way, the “common knowledge” objection to backwards induction doesn’t work, though I’ve seen it before. This is because the demonstration that an ESS must be a Nash equilibrium doesn’t require any strong common knowledge assumptions about the problem statement, the rationality of the other players etc. You just need to consider what happens if a population adopts a strategy which is not a Nash equilibrium, and then mutants following a superior strategy show up…
If there is a 90% chance of continuation and a maximum of 10,000 rounds, then a variant of TFT that always defects on the 10,000th round has less than a 10^-400 chance of behaving differently from TFT. In practice, TFT would be evolutionarily stable against this variant, since our universe hasn’t lasted long enough for these two strategies to be distinguishable.
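A quick check of that figure, as a sketch: the function name is hypothetical, and I am assuming round 1 is always played, so reaching round 10,000 takes 9,999 successful 90% continuation draws. The work is done in log space because the raw probability is too small for a double-precision float.

```python
import math

def log10_prob_reaching(round_no, continue_p):
    """log10 of the probability that round `round_no` is played at all."""
    # Round 1 is always played; each later round needs one more "continue" draw.
    return (round_no - 1) * math.log10(continue_p)

log10_p = log10_prob_reaching(10_000, 0.9)
print(f"P(reach round 10,000) is about 10^{log10_p:.1f}")

# The direct computation underflows to zero in floating point,
# which is why the logarithm is used above.
print(0.9 ** 9999)
```

The exponent comes out below -450, so the "less than 10^-400" bound holds with room to spare.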
If there is a shortcut that makes it possible to skip this infeasible evolutionary process, I suspect that it would need to involve strong assumptions about common knowledge.
In the real world, I think it is relatively rare to reach a last round and know that it is the last round (and rarer still to know that a certain round is the next-to-last round, or the third-from-last), which limits the advantages of strategies based on backwards induction. Our ancestors lived in smallish gossipy groups, which meant that few of their interactions were isolated prisoners’ dilemmas (with no indirect costs of defection).
If there is a shortcut that makes it possible to skip this infeasible evolutionary process, I suspect that it would need to involve strong assumptions about common knowledge.
I’ve been thinking about this, and here are some possible shortcuts. Consider a strategy which I will call “Grumpy-Old-Man” or GOM-n. This behaves like TFT for the first n rounds, and then defects afterwards.
In your model, GOM-n for n = 150 would be able to drift into a population of TFT (since it has a very small fitness penalty of about 1 in 10 million, which is small enough to allow drift). If it did drift in and stabilize, there would then be selection pressure to slowly reduce the n, to GOM-149, then GOM-148 and so on.
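Here is a rough sketch of where the "1 in 10 million" comes from. The payoff values (T=5, R=3, P=1) and function names are illustrative assumptions of mine, and the closed-form tails ignore the 10,000-round cap, which is negligible at this scale.

```python
# Assumed illustrative payoffs: temptation, mutual cooperation, mutual defection.
T, R, P = 5, 3, 1

def prob_beyond(n, delta):
    """Probability that round n+1 is played, given continuation chance delta."""
    return delta ** n

def gom_penalty_vs_tft(n, delta):
    """Expected payoff GOM-n loses, relative to TFT, when paired with TFT.

    Conditional on reaching round n+1, TFT would have earned R every round,
    while GOM-n earns T once and then P every round after TFT retaliates.
    """
    tft_tail = R / (1 - delta)
    gom_tail = T + delta * P / (1 - delta)
    return prob_beyond(n, delta) * (tft_tail - gom_tail)

reach = prob_beyond(150, 0.9)
penalty = gom_penalty_vs_tft(150, 0.9)
print(f"P(play continues past round 150) = {reach:.3g}")
print(f"expected payoff penalty of GOM-150 = {penalty:.3g}")
```

The chance of ever getting past round 150 is on the order of 10^-7, and the expected payoff penalty under these assumed payoffs is on the order of 10^-6, which is tiny next to the roughly 30 points TFT expects from a whole interaction.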
Worse still, consider a mutant that produces GOM-150 but has selective advantages in earlier life, at a cost of crippling the TFT machinery in later life. (Not implausible because mutations often do have several effects). Then it could enter a population with a positive selective advantage, and clear the way for n to slowly reduce.
The particular example I was thinking of was a variant which is very good at detecting sneaky defection, disguised to look like co-operation. But then, as a side-effect in later life, everything starts to look like defection, so it becomes grumpy and stops co-operating itself. I know one or two folks like this...
I see your point here about dependency on assumptions. You describe a finite iterated model with consistently high probability of future interaction, but which then suddenly drops to zero at a high upper bound. (There is no “senescence” effect, with the probability of future interaction declining down to zero). Agreed that in that model, there is a minuscule chance of anyone ever reaching the upper bound, so a strategy which isn’t strictly an ESS won’t be replaced. Thanks for the proof of concept; upvoted.
That model (and ones like it with slow senescence) would match a case described in my original article, and which I mentioned below in my reply to Randaly. Basically, TFT-1 never gets a hold, because it is too unlikely that “this time is the last”.
But my skepticism about this as an explanation remains: we do find real occasions where we know that “this time is the last” (or the only) interaction, and while our behaviour changes somewhat, we don’t automatically defect in those cases. This suggests to me that a variant strategy of “Detect if this time is the last, and if so defect” would indeed be possible, but for some (other) reason is not favoured.
But do you actually know that? I mean, in your evolutionary past, there were people alive who committed betrayals that they thought they had gotten away with. They didn’t, and those people are less likely to be your ancestors.
So people’s brains, presumably with circuitry tuned to follow TFT because their ancestors used it, warn them not to commit acts of betrayal that in reality they could get away with. So they commit fewer acts of perfect betrayal than they otherwise would.
Like most other instincts, it can be overcome with learned behavior, which is why people exist who do betray others every chance they get and get away with it.
The fact that people will sometimes get it wrong (predict they can get away with betrayal, but can’t) is not a problem. It’s really a balance-of-fitness question in cases where there is a high probability of getting away with a defection gain versus a small probability of getting caught. (Consider that the waiter you don’t tip might just chase you out of the restaurant with a gun. Probably won’t, though.) Evolution would still favour last-round defections in such cases.
I’m saying that in the past, if you committed a major betrayal against your tribe, they killed you. It wouldn’t even matter what you stole or who you raped, etc.; it’s the fact that you were willing to do it against the tribe. So, even in last-round cases where you might think you got away with it, the times that you fail to get away with it erase your gains.
Look what happens to powerful people in today’s society if they get caught with some relatively minor transgression. So what if a Congressman sends naked pictures of himself to potential mates, or coaxes a female intern to give him a BJ? But, in both cases, the politician was betraying implicit promises and social norms for behavior that the voters want to see in a person in elected office.
This might be true… If the punishment for defection is always very severe after getting caught, then even with a very low probability of getting caught, but a low gain from defecting, evolution would favour co-operating on the last round (or single round) rather than defection. But this means that others’ commitments to vengeance have transformed the prisoner’s dilemma to a non-PD, which is my explanation 2 in the original article. (Or explanation 3 if vengeance is exacted by the whole tribe, including members who weren’t directly injured by the original defection.)
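The trade-off can be written down as a one-line expected-value check. This is only a sketch: the function names and the example numbers (a gain of 2 against a penalty of 1000) are made up for illustration, and it treats fitness as a simple expected payoff.

```python
def expected_net_gain(p_caught, gain, penalty):
    """Expected payoff change from defecting once instead of cooperating.

    With probability (1 - p_caught) the defector keeps `gain`;
    with probability p_caught they pay `penalty` instead.
    """
    return (1 - p_caught) * gain - p_caught * penalty

def break_even_catch_probability(gain, penalty):
    """Catch probability above which defection no longer pays on average."""
    return gain / (gain + penalty)

# Hypothetical numbers: small gain from defecting, very severe punishment.
gain, penalty = 2, 1000
print(break_even_catch_probability(gain, penalty))
print(expected_net_gain(0.01, gain, penalty))   # caught 1% of the time
print(expected_net_gain(0.001, gain, penalty))  # caught 0.1% of the time
```

With these made-up numbers, a 1% chance of being caught already makes defection a net loss, while a 0.1% chance does not, so how severe the punishment is relative to the gain does most of the work.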
However, this just shifts the burden of explanation to accounting for why we (or whole tribes) are vengeful to such an extreme extent. After all, vengeance is enormously costly, and risks injury (the condemned can fight back) or counter-vengeance (whoever kills the original defector risks being killed in turn by the defector’s surviving family, and then the whole tribe splits apart in a cycle of killing). And notice that at that point, the original defection has already happened, so can’t be deterred any more, and the injury-risking, potentially-tribe-splitting vengeance has negative fitness. The tribe’s already in trouble because of the betrayal, and the vengeance cycle could now destroy it. So why does it happen? What selection pressure maintains such severe punishment when it is fitness-destroying?
Isolated prisoners’ dilemmas were rare in the ancestral environment—most PD-like interactions took place in a social environment where they would (at least in expectation) have indirect effects either within a particular relationship or on one’s broader reputation (and thus shared features with an indefinitely iterated PD). So the advantages of being good at PD-in-a-social-context far outweighed the possible benefits of consistently defecting in the rare truly one-shot PD. That means most of evolution’s optimization power went towards building in adaptations that were good at PD-in-a-social-context, even if the adaptation made one less likely to defect in a truly one-shot PD.
For example, people tend to internalize their reputation, and to feel bad when others disapprove of them (either a particular close other or one’s broader reputation). Having a model of how others will react to your behavior, which is readily accessible and closely tied to your motivations, is very useful for PD-in-a-social-context, but it will make it harder to defect in a one-shot PD.
Another adaptation is the capability of feeling close to another individual, in such a way that you like & trust them and feel motivated to do things that help them. This adaptation probably involved repurposing the machinery that makes parents love their offspring (it involves the hormone oxytocin), and it makes it harder to defect on the last turn. For actions towards one’s offspring, evolution didn’t want us to defect on the last turn. Adding last-turn defection in non-kin relationships seems like a lot of complexity for a low return, with a potentially high cost if the adaptation isn’t narrowly targeted and has collateral damage towards kin or earlier turns.
There are also various specific emotions which encourage TFT-like-behavior, like gratitude and vindictiveness. Someone who spends their last words on their deathbed praising the person who helped them, or cursing the person who cheated them, is cooperating in a PD. They are spreading accurate reputational information, and probably also strengthening the social rewards system by increasing people’s expectations that good behavior will be socially rewarded or bad behavior will be socially punished. Even if these deathbed acts don’t benefit the individual, they arise from emotions that did benefit the individual (making others more likely to help them, or less likely to cheat them). And again, when these emotions were in development by natural selection, deathbed turn-off was probably a relatively low-priority feature to add (although there does seem to be some tendency for vindictiveness to get turned off when someone is dying—I’m not sure if that’s related).
Short answer: we’re adaptation executors, not fitness maximizers.
I fully get the point, but this doesn’t by itself explain why superior adaptations haven’t come along. Basically, we need to consider a “constraint on perfection” argument here, and ask what may be the constraints concerned in this case. It is generally possible to test the proposals.
Some obvious (standard) proposals are:
1) Mutations can’t arise to turn TFT into TFT-1
This is a bit unlikely for the reasons I already discussed. We do seem to have slightly different behaviour in the one-shot (or last-round) cases, so it is not implausible that some “mutant” would knock out co-operation completely on the last round (or on all rounds after a certain age—see my Grumpy-Old-Man idea above). There is a special concern when we allow for “cultural” mutations (or learned responses which can be imitated) as well as “biological” mutations.
2) Additional costs
The argument here is that TFT-1 has an additional cost penalty, because of the complexity overhead of successfully detecting the last round (or the only round), and the large negative cost of getting it wrong. Again it faces the objection that we do appear to behave slightly differently in last (or only) rounds, whereas if it were truly too difficult to discriminate, we’d have the same behaviour as on regular rounds.
3) Time-lags
This is the argument that we are adapted for an environment which has recently shifted, so cases of single-round (or known last-round) Prisoner’s Dilemma are much more common than before and evolution hasn’t caught up.
This might be testable by directly comparing behaviours of people living in conditions closer to Paleolithic versus industrialized conditions. Are there any differences in reactions when they are presented with one-shot prisoner’s dilemmas? If one-shot PD is a new phenomenon, then we might expect “Paleo-people” to instinctively co-operate, whereas westerners think a bit then defect (indicating that a learned response is overriding an instinctive response). This strikes me as somewhat unlikely (I think it’s more likely that the instinct is to defect, because there is a pattern-match to “not a member of my tribe”, whereas industrialized westerners have been conditioned to co-operate, at least some of the time). But it’s testable.
A variant of this is Randaly’s suggestion that true last-rounds are indeed new, because of the effect of retaliation against family (which has only recently been prohibited). This has a nice feature, that in cases where the last round truly was the last (because there was no family left), the mutant wouldn’t spread.
4) Side effects
Perhaps the mutations that would turn TFT into TFT-1 have other undesirable side effects? This is the counter-argument to Grumpy-Old-Man mutants invading because they have other positive side effects. Difficult to test this one until we know what range of mutations are possible (and whether we are considering biological or cultural ones).
I don’t think it was particularly central; while he did give it as an argument, drnickbone also gave examples of people cooperating on one-shot PDs, both in formal experiments and in practice (e.g. choosing to tip a waiter at a foreign restaurant who will never be seen again).