I don’t understand the motivation to preserve the min-max value, and perhaps that’s why it’s a folk theorem rather than an actual theorem. Each participant knows that they can’t unilaterally do better than 99.3, which they get by choosing 30 while the other players all choose 100. But a player’s maxing (of utility; min temperature) doesn’t oblige them to correct or reduce their utility (by raising the temperature) just because the opponents fail to minimize the player’s utility (by raising the temperature).
There is no debt created anywhere in the model or description of the players. Everyone min-maxes as a strategy, picking the value that maximizes each player’s utility assuming all opponents minimize that player’s utility. But the other players aren’t REQUIRED to play maximum cruelty—they’re doing the same min-max strategy, but for their own utility, leading everyone to set their dial to 30.
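For concreteness, the 99.3 figure and the everyone-sets-30 outcome can be checked numerically. This is only a sketch under the setup the numbers above seem to imply: 100 players, dials between 30 and 100, room temperature equal to the average dial, and utility equal to the negative temperature. The function names are made up for illustration.

```python
# Assumed setup (inferred from the 99.3 figure, not stated in this thread):
# 100 players, dials in 30..100, temperature = average dial, utility = -temperature.
N_PLAYERS = 100
DIALS = range(30, 101)

def temperature(own_dial: int, others_dial: int, n: int = N_PLAYERS) -> float:
    """Average temperature when one player picks own_dial and the rest pick others_dial."""
    return (own_dial + (n - 1) * others_dial) / n

def best_reply(others_dial: int) -> int:
    """Dial that minimizes the temperature (maximizes utility) against a fixed others_dial."""
    return min(DIALS, key=lambda own: temperature(own, others_dial))

# Worst case for a player: everyone else plays 100. The best reply is then 30.
minmax_temperature = temperature(best_reply(100), 100)

# In the one-shot game, 30 is the best reply whatever the others do.
always_thirty = all(best_reply(d) == 30 for d in DIALS)

print(minmax_temperature, always_thirty)
```

The second check is the point being made above: with no history to condition on, each player's own best move is 30 regardless of what the others do, so anything hotter has to be sustained by threats about future rounds.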
I believe many of the theorems have known proofs (e.g. this paper). Here’s an explanation of the debt mechanic:
Debt is initially 0. Equilibrium temperature is 99 if debt is 0, otherwise 100. For everyone who sets the temperature less than equilibrium in a round, debt increases by 1. Debt decreases by 1 per round naturally, unless it was already 0.
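A minimal sketch of this debt rule as a small state machine, to make the bookkeeping explicit. The order of operations within a round (natural decay first, then one unit added per deviator) is an assumption, since the description does not pin it down, and the function names are hypothetical.

```python
def equilibrium_temperature(debt: int) -> int:
    """Dial setting everyone is expected to choose, given the current debt."""
    return 99 if debt == 0 else 100

def step_debt(debt: int, dials: list[int]) -> int:
    """Debt at the start of the next round, given this round's dial settings."""
    target = equilibrium_temperature(debt)
    # Each player who sets the dial below the expected setting adds 1 to the debt.
    deviators = sum(1 for dial in dials if dial < target)
    # Debt decays by 1 per round unless it was already 0 (assumed to happen first).
    decayed = debt - 1 if debt > 0 else 0
    return decayed + deviators

# One player deviates to 30 in the first round, then everyone complies.
debt = 0
history = []
rounds = [[99] * 99 + [30], [100] * 100, [99] * 100]
for dials in rounds:
    history.append((debt, equilibrium_temperature(debt)))
    debt = step_debt(debt, dials)

print(history, debt)
```

In this run a single deviation buys exactly one round at 100 before the expected setting drops back to 99.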
Hmm. A quick reading of that paper talks about punishment for defection, not punishment for unexpected cooperation. Can you point to the section that discusses the reason for the “debt” concept as applied to deviations that benefit the player in question?
Note that I’m going to have to spend a bit more time on the paper, because I’m fascinated by the introduction of a discount rate to make the punishment non-infinite. I do not expect to find that it mandates punishment for unexpected gifts of utility.
I only skimmed the paper; it was linked from Wikipedia as a citation for one of the folk theorems. But it’s easy to explain the debt-based equilibrium assuming no discount rate. If you’re considering setting the temperature lower than the equilibrium, you will immediately get some utility from the lower temperature, but you will increase the debt by 1. That means the equilibrium temperature will be 1 degree higher in one future round, more than compensating for the lower temperature in this round. So there is no incentive to set the temperature lower than the equilibrium (which is itself determined by debt).
The problem with using this as an equilibrium in an iterated game with a discount rate is that if the debt is high enough, the higher temperature due to higher debt might come very late, so the agents care less about it. I’m not sure how to fix this but I strongly believe it’s possible.
I understand the debt calculation as group-enforced punishment for defection. It’s a measure of how much punishment is due to bring the average utility of an opponent back down to expectation, after that opponent “steals” utility by defecting when they “should” cooperate. It’s not an actual debt, and not symmetrical around that average. In fact, in the temperature example, it should be considered NEGATIVE debt for someone to unilaterally set their temperature lower.
Ah, here’s a short proof of a folk theorem: *
But it doesn’t show that it’s a subgame perfect equilibrium. This paper claims to prove it for subgame perfect equilibria, although I haven’t checked it in detail.
I rewrote part of the post to give an equilibrium that works with a discount rate as well.
“The way it works is that, in each round, there’s an equilibrium temperature, which starts out at 99. If anyone puts the dial less than the equilibrium temperature in a round, the equilibrium temperature in the next round is 100. Otherwise, the equilibrium temperature in the next round is 99 again. This is a Nash equilibrium because it is never worth deviating from. In the Nash equilibrium, everyone else selects the equilibrium temperature, so by selecting a lower temperature, you cause an increase of the equilibrium temperature in the next round. While you decrease the temperature in this round, it’s never worth it, since the higher equilibrium temperature in the next round more than compensates for this decrease.”
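Whether the deviation is “never worth it” depends on how much players value the next round. A rough check, assuming 100 players, utility equal to the negative average temperature, a lowest dial setting of 30, and a per-round discount factor `delta` (none of these are stated in the quoted paragraph; the function names are illustrative). It only tests deviating once and then complying, which is the usual one-shot deviation test for discounted repeated games.

```python
def deviation_gain(delta: float, target: int, own_dial: int, n: int = 100) -> float:
    """Net change in discounted utility from deviating once, then complying.

    Assumes n players, utility = -average temperature, discount factor delta.
    A positive result means the deviation pays off.
    """
    # This round: the average temperature drops by (target - own_dial) / n.
    gain_now = (target - own_dial) / n
    # Next round: everyone sets 100 instead of 99, so it is one degree hotter.
    loss_next = delta * (100 - 99)
    return gain_now - loss_next

def is_equilibrium(delta: float, lowest_dial: int = 30) -> bool:
    """True if no one-shot deviation is profitable in either state (target 99 or 100)."""
    # The gain is largest at the lowest dial, so checking that dial is enough.
    return all(deviation_gain(delta, target, lowest_dial) <= 0 for target in (99, 100))

for delta in (1.0, 0.9, 0.5):
    print(delta, is_equilibrium(delta))
```

Under these assumptions the largest one-round gain is 0.7 degrees (dropping from 100 to 30 during a punishment round), so the rule holds up when `delta` is at least about 0.7 and fails for less patient players.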
I am confused. Why does everyone else select the equilibrium temperature? Why would they push it to 100 in the next round? You never explain this.
I understand you may be starting from a theorem that I don’t know. To me the obvious course of action would be something like: the temperature is way too high, so I’ll lower the temperature. Wouldn’t others appreciate that the temperature is dropping and getting closer to their own preference of 30 degrees?
Are you saying what you’re describing makes sense, or are you saying that what you’re describing is a weird (and meaningless?) consequence of Nash’s theorem?
I’m saying it’s a Nash equilibrium, not that it’s particularly realistic.
They push it to 100 because they expect everyone else to do so, and they expect that if anyone sets it to less than 100, the equilibrium temperature in the round after that will be 100 instead of 99. If everyone else is going to select 100, it’s futile to individually deviate and set the temperature to 30, because that means in the next round everyone but you will set it to 100 again, and that’s not worth being able to individually set it to 30 in this round.
Gotcha. Thanks for clarifying!
After giving it some thought, I do see a lot of real-life situations where you get to such a place.
For instance, I was recently watching The Vow, the documentary about the NXIVM cult (“nexium”). In very broad strokes, one of the core fucked up things the leader does is gaslight the members into thinking that pain is good. If you resist him, don’t like what he says, etc., there is something deficient in you. After a while, even when he’s not in the picture, so that it would make sense for everyone to suffer less and get some slack, people punish each other for being deficient or weak.
And now that I’ve written this about NXIVM, I imagine those dynamics are actually commonplace in everyday society too.
I don’t really know about the theorem, but I think there’s something real here. I think the theorem is in spirit something like: “bad for everyone” equilibria can be enforced, as long as there’s a worse possible history. As long as there’s a worse possible history that can be enforced by everyone but you, regardless of what you do, then everyone but you can incentivize you to do any particular thing.
Like suppose you wake up and everyone tells you: we’ve all decided we’re going to torture everyone forever if you don’t pinch everyone you meet; but if you pinch everyone you meet, we won’t do that. So then your individually Nash-rational response is to pinch everyone you meet.
Ok so that gets one person. But this could apply to everyone. Suppose everyone woke up one day with some weird brain damage such that they have all the same values as before, except that they have a very specific and strong intention that if any single person fails to pinch everyone they meet, then everyone else will coordinate to torture everyone forever. Then everyone’s Nash-rational response is to pinch everyone.
But how can this be an equilibrium? Why wouldn’t everyone just decide to not do this torture thing? Isn’t that strictly better? If we’re just talking about Nash equilibria, the issue is that it counts as an equilibrium as long as what each player actually does is responding correctly given that everyone else’s policy is whatever it is. So even though it’s weird for players to harm everyone, it still counts as a Nash equilibrium, as long as everyone actually goes around pinching everyone in response. Pinching everyone is Nash-correct if everyone else would punish you, and indeed everyone else would punish you. Since everyone pinches everyone, no one has to actually torture everyone (which would be Nash-irrational, but that doesn’t matter).
But why would people have all this Nash-irrational behavior outside of what actually happens? It’s not actually necessary. https://en.wikipedia.org/wiki/Folk_theorem_(game_theory)#Subgame_perfection You can apparently still get the result by using “torture everyone for a year” as the threat instead of “torture everyone forever”. I’m not sure exactly how this works; it seems to rely heavily on indifference?
Anyway, it’s also intuitively weird that people would have chosen some weird equilibrium like this in the first place. How do they suddenly leap to that equilibrium?
In real life a coalition doesn’t just punish defectors but also punishes people who don’t punish defectors, and so on. So to me it’s far from implausible that this would happen in real life.
Per Wikipedia, it’s called a “folk theorem” because there was a substantial period of time when most people in the field were aware of it but it hadn’t been formally published. It’s still an “actual” theorem.
It doesn’t say this outcome WILL or SHOULD happen; it just says there exists some Nash equilibrium where it happens.