Seriously, what? I’m missing something critical. Under the stated rules as I understand them, I don’t see why anyone would punish another player for reducing their dial.
You state that 99 is a Nash equilibrium, but this just makes no sense to me. Is the key that you’re stipulating that everyone must play as though everyone else is out to make it as bad as possible for them? That sounds like an incredibly irrational strategy.
I think it’s not that 99 is a Nash equilibrium, it’s that everyone doing “Play 99 and, if anyone deviates, play 100 to punish them until they give in” is a Nash equilibrium. (Those who think they understand the post: am I correct?)
Start by playing 99. If someone played less than they were supposed to last round, you’re now supposed to play 100. Otherwise, you’re now supposed to play 99.
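If it helps to see where the “supposed to” lives, here is a minimal sketch of that prescription as a function (my paraphrase in Python, not anything from the post; the 99/100 values are the ones used in this thread):

```python
def prescribed_dial(someone_played_below_prescription_last_round: bool) -> int:
    """What everyone is 'supposed to' play this round: 99 by default,
    100 (punishment) in the round after any downward deviation."""
    return 100 if someone_played_below_prescription_last_round else 99
```

The “supposed to” is just this shared rule plus one bit of state carried over from the previous round.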
I think what people are missing (I know I am) is where the “supposed to” comes from. I totally understand the debt calculation to get altruistic punishment for people who deviate in ways that hurt you—that’s just maximizing long-term expectation through short-term loss. I don’t understand WHY a rational agent would punish someone who is BENEFITING you with their deviant play.
I’d totally get it if you reacted to someone playing MORE than they were supposed to. But if someone plays less than that, there’s no debt or harm to punish.
Formally, it’s an arbitrary strategy profile that happens to be a Nash equilibrium, since if everyone else plays it, they’ll punish if you deviate from it unilaterally.
In terms of more realistic scenarios, there are some examples of bad “punishing non-punishers” equilibria that people have difficulty escaping, e.g. an equilibrium with honor killings, where parents kill their own children partly because they expect to be punished if they don’t. Robert Trivers, an evolutionary biologist, has studied these equilibria, as they are anomalous from an evolutionary perspective.
This doesn’t really answer the question. If some prisoner turns the dial to 30, everyone gets higher utility the next round, with no downside. In order to have some reason to not put it to 30, they need some incentive (e.g. that if anyone puts it strictly below average they also get an electric shock or whatever).
In the round after the one where the 30 applies, the Schelling temperature increases to 100, and it’s a Nash equilibrium for everyone to always select the Schelling temperature.
You can claim this is an unrealistic Nash equilibrium but I am pretty sure that unilateral deviation from the Schelling temperature, assuming everyone else always plays the Schelling temperature, never works out in anyone’s favor.
If a mathematical model doesn’t reflect at all the thing it’s supposed to represent, it’s not a good model. Saying “this is what the model predicts” isn’t helpful.
There is absolutely zero incentive for anyone to put the temperature to 100 at any time. Even as deterrence, there is no reason for the equilibrium temperature to be an unsurvivable 99. It makes no sense; no one gains anything from it, especially if we assume communication between the parties (which is required for there to be deterrence and other such mechanisms in place). There is no reason to punish someone putting the thermostat lower than the equilibrium temperature either, since the lowest possible temperature is still comfortable. The model is honestly just wrong as a description of any actual situation of interest.
At the very least, the utility function is wrong: it’s not linear in temperature, obviously. The disutility skyrockets around where temperatures exceed the survivable limit and then plateaus. There’s essentially no difference between 99 and 99.3, but there’s a much stronger incentive to go back below 40 as quickly as possible.
I think the mention here of “unsurvivable” temperature misses this point from the simulation description:
their bodies repair themselves automatically, so there is no release from their suffering
I agree that the incentives are different if high temperatures are not survivable and/or there is a release from suffering. In particular, the best alternative to a negotiated agreement is probably for me to experience a short period of excruciating pain and then die, which means that no outcome can be worse for me than that.
Ah, true. But I still wouldn’t expect that the difference between 99 and 99.3 would matter much compared to the possibility of breaking the deadlock and going back to a non-torturous temperature. Essentially, if the equilibrium is 99, the worst that the others can do to you is raise it up to 99.3. Conversely, keeping your temperature at 30 sends a signal that someone is trying to lower it, and if even just one other prisoner joins you, you get 98.6. At which point the temperature might go even lower if others pick up on the trend. Essentially, as things are presented here, there is no reason why the equilibrium ought to be stable.
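For concreteness (my arithmetic, assuming 100 players, a 30–100 dial, and punishers who go to 100, as elsewhere in this thread): a lone defector at 30 faces an average of

\[
\frac{99 \times 100 + 30}{100} = 99.3,
\]

while two defectors at 30 bring the average down to

\[
\frac{98 \times 100 + 2 \times 30}{100} = 98.6,
\]

which is already below the 99 equilibrium.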
“Stable Nash equilibrium” is a term-of-art that I don’t think you meant to evoke, but it’s true that you can reach better states if multiple people act in concert. Saying this is a Nash equilibrium only means that no single player can do better, if you assume that everyone else is a robot that is guaranteed to keep following their current strategy no matter what.
This equilibrium is a local maximum surrounded by a tiny moat of even-worse outcomes. The moat is very thin, and almost everything beyond it is better than this, but you need to pass through the even-worse moat in order to get to anything better. (And you can’t cross the moat unilaterally.)
Of course, it’s easy to vary the parameters of this thought-experiment to make the moat wider. If you set the equilibrium at 98 instead of 99, then you’d need 3 defectors to do better, instead of only 2; etc.
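Spelling that variant out under the same assumptions (equilibrium dial 98, defectors at 30, punishers at 100):

\[
\frac{98 \times 100 + 2 \times 30}{100} = 98.6 > 98,
\qquad
\frac{97 \times 100 + 3 \times 30}{100} = 97.9 < 98,
\]

so two defectors still end up worse off than the 98 equilibrium, while three do better.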
So you can say “this is such an extreme example that I don’t expect real humans to actually follow it”, but that’s only a difference in degree, not a difference in kind. It’s pretty easy to find real-life examples where actual humans are actually stuck in an equilibrium that is strictly worse than some other equilibrium they theoretically could have, because switching would require coordination between a bunch of people at once (not just 2 or 3).
It is, in theory, but I feel like this underrates the real reason for most such situations: actual asymmetries in values, information, or both. A few things that may hold an otherwise pointless taboo or rule in place:
it serving as a shibboleth that identifies the in-group. This is a tangible benefit in certain situations. It’s true that a different marker could be chosen, even one that is more inherently worthy rather than just conventionally picked, but that requires time and adjustment and may create confusion;
it being tied to some religious or ideological worldview, such that at least some people genuinely believe it’s beneficial and not just a convention. That makes them a lot more resistant to dropping it even if there were an attempt at coordination;
it having become something that is genuinely unpleasant to drop even at an individual level simply because force of habit has led some individuals to internalize it.
In general, I think the game-theoretic model honestly doesn’t represent anything like a real-world situation well, because it creates a situation so abstract and extreme that it’s impossible to imagine any of these dynamics at work. Even the worst, most dystopian totalitarianism, in which everyone spies on everyone else and everyone’s life is miserable, will at least have been started by a group of true believers who think this is genuinely a good thing.
I contend examples are easy to find even after you account for all of those things you listed. If you’d like a more in-depth exploration of this topic, you might be interested in the book Inadequate Equilibria.
I’ve read Inadequate Equilibria, but that’s exactly the thing: this specific example doesn’t really convey that sort of situation. At the very least, some social interaction, as well as the path to the pathological equilibrium, is crucial to it; they’re an integral part of why such things happen. By stripping all of that away, the 99 °C example makes no sense.
That’s correct, but that just makes this a worse (less intuitive) version of the stag hunt.
I’m in the same boat.
“...everyone’s utility in a given round … is the negative of the average temperature.” Why would we assume that?
“Clearly, this is feasible, because it’s happening.” Is this rational? Isn’t this synonymous with saying “clearly my scenario makes sense because my scenario says so”?
“Each prisoner’s min-max payoff is −99.3” If everyone else is min-maxing against any given individual, you would have a higher payoff if you set your dial to 0, no? The worst total payoff would be −99.
What am I missing? Can anyone bridge this specific gap for me?
“Feasible” is being used as a technical term-of-art, not a value judgment. It basically translates to “physically possible”. You can’t have an equilibrium of 101 because the dials only go up to 100, so 101 is not “feasible”.
The min-max payoff is −99.3 because the dials only go down to 30, not to 0.
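Spelled out (again assuming 100 players): if the other 99 all set their dials to 100, the best you can do is turn yours down to the minimum of 30, giving a payoff of

\[
-\frac{99 \times 100 + 30}{100} = -99.3.
\]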
We’re assuming that utility function because it’s a simple thought-experiment meant to illustrate a general principle, and assuming something more complicated would just make the illustration more complicated. It’s part of the premise of the thought-experiment, just like assuming that people are in cages with dials that go from 30 to 100 is part of the premise.
The problem is that the model is so stripped down it doesn’t illustrate the principle any more. The principle, as I understand it, is that there are certain “everyone does X” equilibria in which X doesn’t have to be useful or even good per se; it’s just something everyone’s agreed upon. That’s true, but only up to a point. Past a certain degree of utter insanity and masochism, people start solving the coordination problem by reasonably assuming that no one else can actually want X, and may try rebellion. In the thermostat example, a turn in which just two prisoners rebelled would be enough to get a lower temperature even if the others tried to punish them. At that point the process would snowball. It’s only “stable” to the minimum possible perturbation of a single person turning the knob to 30, then deciding it’s not worth it any more after one turn at a mere 0.3 °C above the already torturous temperature of 99 °C.
I’m confused. Are you saying that the example is bad because the utility function of “everyone wants to minimize the average temperature” is too simplified? If not, why is this being posted as a reply to this chain?
I think the claim is that, while it may be irrational, it can be a Nash equilibrium. (And sometimes agents are more Nash-rational than really rational.)
Are they, though?
This strikes me as so far from any real world scenario as to be useless.
The only point I can draw from this is that if everyone acts crazy then everyone is acting crazy together. The game theory is irrelevant.
Everyone is running a policy that’s very much against their own interests. Is the point that their policy to punish makes them vulnerable to a very bad equilibrium? Because it seems like they are punishing good behavior, and it seems clear why that would have terrible results.
We see plenty of crazy, self-harming behavior in the real world. And plenty of people following local incentives to their own long-term detriment. And people giving in to threats. And people punishing others, to their own detriment, including punishing what seems like prosocial behavior. And we see plenty of coalitions that punish defectors from the coalition, and punish people who fail to punish defectors. I would hope that the exact scenario in the OP wouldn’t literally happen. But “so far from any real world scenario as to be useless” seems very incorrect. (Either way, the game-theoretic point might be conceptually useful.)
We see plenty of crazy, self-harming behavior in the real world
Yes, but it’s usually because people believe that it does some good, or are locked in an actual prisoner’s dilemma in which being the first to cooperate makes you the sucker. Not situations in which defecting produces immediate (if small) benefits to you with no downsides.
I can see how that would apply in principle. I’m just saying: wouldn’t you want a dramatically more real-world relevant scenario?
If you punish good behavior, of course you’ll get bad equilibria. Does punishing bad behavior also give bad equilibria? It would be fascinating if it did, but this scenario has nothing to say about that.
What do you mean by “bad” behavior?
This has an obvious natural definition in this particular thought-experiment, because every action affects all players in the same way, and the effect of every action is independent of every other action (e.g. changing your dial from 70 to 71 will always raise the average temperature by 0.01, no matter what any other dial is set to). But that’s a very special case.
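The 0.01 figure is just the averaging at work, assuming the 100 players this thread has been using:

\[
\frac{71 - 70}{100} = 0.01.
\]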
I don’t know, but I’d settle for moving to an example of bad effects from punishing behavior that sounds bad in any way at all.
The given example involves punishing behavior that is predicted to lower utility for all players, given the current strategies of all players. Does that sound bad in any way at all?
I guess it doesn’t, when you put it that way. I’d just like an example that has more real-world connections. It’s hard to see how actual intelligent agents would adopt that particular set of strategies. I suspect there are some real world similarities but this seems like an extreme case that’s pretty implausible on the face of it.
It is punishing good behavior in the sense that they’re punishing players for making things better for everyone on the next turn.
Two comments on this:
1) The scenario described here is a Nash equilibrium but not a subgame-perfect Nash equilibrium. (IE, there are counterfactual parts of the game tree where the players behave “irrationally”.) Note that subgame-perfection is orthogonal to “reasonable policy to have”, so the argument “yeah, clearly the solution is to always require subgame-perfection” does not work. (Why is it orthogonal? First, the example from the post shows a “stupid” policy that isn’t subgame-perfect. However, there are cases where subgame-imperfection seems smart, because it ensures that those counterfactual situations don’t become reality. EG, humans are somewhat transparent to each other, so having the policy of refusing unfair splits in the Final Offer / Ultimatum game can lead to not being offered unfair splits in the first place.)
2) You could modify the scenario such that the “99 equilibrium” becomes more robust. (EG, suppose the players have a way of paying a bit to punish a specific player a lot. Then you add the norm of turning the temperature to 99, the meta-norm of punishing defectors, the meta-meta-norm of punishing those who don’t punish defectors, etc. And tadaaaa, you have a pretty robust hell. This is a part of how society actually works, except that those norms usually enforce pro-social behaviour.)
You may be confusing the questions “starting from a blank slate, would you expect players to go here?” and “given that players are (somehow) already here, would they stay here?” Saying that something is a Nash equilibrium only implies the second thing.
You’d punish a player for setting their dial lower because you expect that this will actually make the temperature higher (on average, in the long-run). And you expect that it will make the temperature higher because you expect everyone to punish them for it. This is self-referential, but it’s internally-consistent. It’s probably not what a new player would come up with on their own if you suddenly dropped them into this game with no explanation, but if everyone already believes it then it’s true.
(If you can’t immediately see how it’s true given that belief, try imagining that you are the only human player in this game and the other 99 players are robots who are programmed to follow this strategy and cannot do otherwise. Then, what is your best strategy?)
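To make that concrete, here is a minimal simulation sketch (mine, not from the post; it assumes 100 players, dials from 30 to 100, and the punish-on-deviation rule described upthread) comparing a conforming human against 99 such robots, with two kinds of deviation:

```python
# Minimal sketch: 99 robots play "99 normally, 100 in the round after anyone
# played below the prescribed dial"; one human player tries different policies.

def robot_dial(punishing: bool) -> int:
    return 100 if punishing else 99

def play(human_policy, rounds: int = 20, n_players: int = 100) -> float:
    """Total payoff for the human: minus the average temperature, summed over rounds."""
    total = 0.0
    punishing = False  # was anyone below the prescribed dial last round?
    for t in range(rounds):
        prescribed = 100 if punishing else 99
        human = human_policy(t, prescribed)
        dials = [robot_dial(punishing)] * (n_players - 1) + [human]
        total += -sum(dials) / n_players
        punishing = human < prescribed  # only the human can deviate here
    return total

def always_conform(t, prescribed): return prescribed
def defect_once(t, prescribed):    return 30 if t == 0 else prescribed
def always_defect(t, prescribed):  return 30

print(round(play(always_conform), 2))  # -1980.0  : 99 every round
print(round(play(defect_once), 2))     # -1980.31 : gains 0.69 in round 0, loses 1.0 to the punishment round
print(round(play(always_defect), 2))   # -1985.01 : an extra 0.3 lost in every punished round
```

Under those assumptions, any unilateral deviation comes out behind just playing 99, which is all the Nash-equilibrium claim amounts to.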
You’re correct, I was confused in exactly that way.
Once that confusion was cleared up by replies, I became confused as to why the hell (ha?) we were talking about this example at all. I am currently of the belief that it’s just a bad example and we should talk about a different one, since there have got to be better examples of counterintuitive bad outcomes from more reasonable-sounding punishment strategies.