Theorem 1: the game G has a Nash equilibrium which leads to the outcome (C,C).
I assume you know the result that any game has a Nash equilibrium for any (expected) outcome in which each player receives at least their minmax value?
Also it might be slightly confusing to reuse the symbol G twice. Looking forward to the bounded Loeb theorem proof!
“I assume you know the result that any game has a Nash equilibrium for any (expected) outcome in which each player receives at least their minmax value?”
Sorry, why do you mention that? The minmax values in the PD are achieved by the outcome (D,D), which is worse than (C,C).
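A minimal numerical check of that claim, assuming the usual illustrative PD payoffs T=5, R=3, P=1, S=0 (my numbers, not the post's): each player's pure-strategy minmax value comes out equal to the mutual-defection payoff.

```python
# One-shot Prisoner's Dilemma with illustrative payoffs T=5, R=3, P=1, S=0.
# payoff[(a, b)] = (row player's payoff, column player's payoff).
C, D = "C", "D"
payoff = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

# Row player's minmax (security) value over pure strategies: the column player
# picks the column that minimizes the row player's best-response payoff.
minmax_row = min(
    max(payoff[(a, b)][0] for a in (C, D))  # row's best reply to column b
    for b in (C, D)
)
print(minmax_row)          # 1
print(payoff[(D, D)][0])   # 1 -- the same as the (D,D) payoff
```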
“Also it might be slightly confusing to reuse the symbol G twice.”
Thanks, nice catch! Fixed.
“Looking forward to the bounded Loeb theorem proof!”
Will try. It took me a while to even arrive at a formulation that looks correct; maybe it will change again.
“Sorry, why do you mention that? The minmax values in the PD are achieved by the outcome (D,D), which is worse than (C,C).”
Precisely. Since (C,C) is better for both players than the minmax outcome (D,D), there has to be a Nash equilibrium that achieves it (and your construction is one example of such).
After reading the thread below, I still don’t understand what your original point was. Could you elaborate?
cousin_it’s example seemed to be a special case of a more general type of theorem (the folk theorem for repeated games). That theorem (in various forms) says that there exist Nash equilibria of repeated games for every situation where everyone gets more than their absolute guaranteed minimum. The equilibrium goes like “everyone commits to this strategy, and if anyone disobeys, we punish them by acting so as to keep their gains to the strict minimum”. Then nobody has any incentive to deviate from that.
Sorry, I must’ve gone insane for a minute there. Are you saying that (C,C) is a Nash equilibrium of the classical Prisoner’s Dilemma?
It is a Nash equilibrium of the infinitely repeated PD (take two tit-for-tat opponents: neither has any incentive to deviate from their strategy).
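A small numerical sketch of the tit-for-tat claim, under assumptions of my own (the usual illustrative payoffs T=5, R=3, P=1, S=0 and a discount factor of 0.9, neither of which comes from the thread): against a tit-for-tat opponent, always cooperating earns more discounted utility than a couple of natural deviations.

```python
# Discounted, repeated Prisoner's Dilemma against tit-for-tat.
# Illustrative payoffs (T=5, R=3, P=1, S=0) and discount factor 0.9 are my
# own choices for the sketch, not numbers taken from the thread.
C, D = "C", "D"
payoff = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}  # row player's payoff
DELTA = 0.9
ROUNDS = 500  # long enough that the discounted tail is negligible

def discounted_payoff(my_moves):
    """Row player's discounted payoff when the column player plays tit-for-tat."""
    total, opp = 0.0, C  # tit-for-tat starts by cooperating
    for t, move in enumerate(my_moves):
        total += (DELTA ** t) * payoff[(move, opp)]
        opp = move  # tit-for-tat repeats our last move
    return total

strategies = [
    ("always cooperate", [C] * ROUNDS),
    ("always defect", [D] * ROUNDS),
    ("defect once, then cooperate", [D] + [C] * (ROUNDS - 1)),
]
for name, moves in strategies:
    print(name, round(discounted_payoff(moves), 2))
# always cooperate             ~30.0  (best of the three at this discount factor)
# always defect                ~14.0
# defect once, then cooperate  ~29.3
```

This only spot-checks a few deviations rather than proving equilibrium, and, as the next comment points out, the comparison depends on the discount factor.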
I’m not sure that’s completely right. Infinitely repeated games need a discount factor to keep utilities finite, and the result doesn’t seem to hold if the discount factor is too low.
I believe the same result holds for one-shot games where you have your opponent’s code.
Yeah, that’s actually another result of mine called freaky fairness ;-) It relies on quining cooperation described here. Maybe I’ll present it in the paper too. LW user benelliott has shown that it’s wrong for multiplayer games, but I believe it still holds for 2-player ones.
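For readers who haven't seen it, here is a toy stand-in for cooperation based on reading the opponent's code. This is only my simplified illustration of the flavour of quining cooperation, not the actual construction from the linked post or the freaky fairness result: each program cooperates exactly when the opponent's source is identical to its own.

```python
# Toy illustration of source-based cooperation in a one-shot PD where each
# program sees the other's code. A simplified stand-in, not the quining
# construction from the linked post.
CLIQUE_BOT = "lambda my_src, their_src: 'C' if their_src == my_src else 'D'"
DEFECT_BOT = "lambda my_src, their_src: 'D'"

def play(src_a, src_b):
    a, b = eval(src_a), eval(src_b)
    return a(src_a, src_b), b(src_b, src_a)

print(play(CLIQUE_BOT, CLIQUE_BOT))  # ('C', 'C'): two copies cooperate
print(play(CLIQUE_BOT, DEFECT_BOT))  # ('D', 'D'): neither copy is exploited
```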
Pages 151-152 of Multiagent Systems (http://www.masfoundations.org/) have the proper formulation. But they don’t seem to mention the need for sufficiently high discount factors...
Your linked result seems to talk about average utilities in the long run, which corresponds to a discount factor of 1. In the general case it seems to me that discount factors can change the outcome. For example, if the benefit of unilaterally defecting instead of cooperating on the first move outweighs the entire discounted future revenue stream, then cooperating on the first move cannot be part of any Nash equilibrium. I found some results saying indefinite cooperation is sustainable if the discount factor is above a certain threshold.
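A sketch of where such a threshold comes from under the grim-trigger strategy (punish any defection with defection forever), again with the illustrative payoffs T=5, R=3, P=1, S=0 rather than anything from the thread: a one-shot deviation gains T-R now and loses R-P in every later period, so cooperation is sustainable exactly when the discount factor is at least (T-R)/(T-P).

```python
# Critical discount factor for sustaining (C,C) with grim trigger, using the
# illustrative payoffs T=5, R=3, P=1, S=0 (my choice for the sketch).
T, R, P, S = 5, 3, 1, 0

# One-shot gain from defecting: T - R.  Per-period future loss once the
# punishment (mutual defection forever) starts: R - P.
# Deviating is unprofitable iff  T - R <= delta * (R - P) / (1 - delta),
# i.e. iff  delta >= (T - R) / (T - P).
delta_star = (T - R) / (T - P)
print(delta_star)  # 0.5

def deviation_profitable(delta):
    return (T - R) > delta * (R - P) / (1 - delta)

print(deviation_profitable(0.4))  # True: too impatient, cooperation breaks down
print(deviation_profitable(0.9))  # False: patient players sustain (C,C)
```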
That sounds reasonable. If v is the expected discounted utility at minmax, w the expected discounted utility according to the cooperative strategy, then whenever the gain to defection is less than w-v, we’re fine.
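One way to make that condition precise (my formalization of the comment above, with g the largest one-period gain from deviating and v, w read as continuation values from the next period on, discounted once by the factor δ): a single deviation followed by minmax punishment is unprofitable whenever

```latex
g + \delta v \;\le\; \delta w
\quad\Longleftrightarrow\quad
g \;\le\; \delta\,(w - v)
```

which matches the comment's "gain to defection less than w - v" up to the single discounting of the continuation values.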
Cool.