I first saw this problem in a manga chapter. I will describe the canon form (the form in which it appeared in the manga, with a few adjustments) and a general form of the problem. I couldn’t solve it in the little time I spent thinking about it (before continuing with the manga), so I flagged it as “interesting”.
Canon Form
There are five players, $(a_1, a_2, …, a_5)$.
Every round, each of these players is given an initial fund of 5 coins.
There are 5 rounds in the game.
On each round, each player has two choices: to make a secret deposit in the tax fund, or in their personal account. The players have to deposit (in secrecy) all their coins on hand across the two accounts (they may split the coins between both). The players cannot show “hard” evidence of their deposits (no receipts, pictures, etc.).
Money deposited in a player’s personal account becomes the player’s fund for the next round.
Money deposited in the tax fund is multiplied by two and shared equally among all five players (including players who deposited nothing, hence the name).
The deposits are made sequentially, with the order randomly determined each round.
Players do not know how much is inside the tax fund when they make their deposit.
The coins are worthless outside the game.
After the five rounds, all players who have accumulated at least 40 coins receive rewards as follows:
1st: $100,000,000
2nd: $30,000,000
3rd: $3,000,000
4th/5th: $0
The utility of money grows linearly for the players.
The total amount of money is $133,000,000. If multiple people get a position, the money assigned to that position is split equally among the people.
All players who amassed below the required forty coins incur a debt of $100,000,000.
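The round mechanics above can be sketched in Python. This is a minimal sketch, not the manga's exact procedure: the names `play_round` and `play_game` are hypothetical, and `play_game` assumes every player deposits the same fraction of their coins in every round, purely to show the two extremes.

```python
from typing import List

MULTIPLIER = 2  # the tax fund is doubled before being shared


def play_round(on_hand: List[float], tax_deposits: List[float]) -> List[float]:
    """Return each player's coins after one round, before the next allowance."""
    for coins, deposit in zip(on_hand, tax_deposits):
        if not 0 <= deposit <= coins:
            raise ValueError("a deposit must be between 0 and the coins on hand")
    # Every player gets the same share, including those who deposited nothing.
    share = MULTIPLIER * sum(tax_deposits) / len(on_hand)
    # Whatever is not paid as tax goes to the personal account and carries over.
    return [coins - deposit + share for coins, deposit in zip(on_hand, tax_deposits)]


def play_game(fraction_taxed: float, players: int = 5, rounds: int = 5,
              allowance: float = 5) -> List[float]:
    """Assumption: every player deposits the same fraction of their coins each round."""
    carried = [0.0] * players
    for _ in range(rounds):
        on_hand = [coins + allowance for coins in carried]
        deposits = [fraction_taxed * coins for coins in on_hand]
        carried = play_round(on_hand, deposits)
    return carried


print(play_game(0.0))  # nobody ever pays tax
print(play_game(1.0))  # everybody always pays everything
```

Under these assumptions, universal hoarding leaves every player with 25 coins, while universal full deposits leave every player with 310.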
I use “rational” in a very specific sense, so saying that all players are “perfectly rational” would make the problem useless. Nevertheless, in canon the participants of this game were all extraordinarily competent, and it was common knowledge that they were all extraordinarily competent. To use jargon that I personally adopt:
The participants are “sensible” (they act solely to maximise their utility).
It is unknown whether the participants are “reasonable” (their decision algorithm satisfies some a priori reasonableness standard of decision making).
The participants know (solely by reputation) that they are all of (exceedingly) above-average “competence” (they win).
---
General Form
There are m (m >= 5 and m is odd) players, $(a_1, a_2, …, a_m)$.
Every round, each of these players is given an initial fund of c coins.
There are n (n >= 5, n >= m) rounds in the game.
On each round, each player has two choices: to make a secret deposit in the tax fund, or in their personal account. The players have to deposit (in secrecy) all their coins on hand across the two accounts (they may split the coins between both). The players cannot show “hard” evidence of their deposits (no receipts, pictures, etc.).
Money deposited in a player’s personal account becomes the player’s fund for the next round.
Money deposited in the tax fund is multiplied by two and shared equally among all m players (including players who deposited nothing, hence the name).
The deposits are made sequentially, with the order randomly determined each round.
Players do not know how much is inside the tax fund when they make their deposit.
The coins are worthless outside the game.
After the n rounds, all players who have accumulated at least k (k >= 1.6 * n * c) coins receive rewards as follows:
1st: $100,000,000
2nd: $30,000,000
3rd: $3,000,000
4th and below: $0
The utility of money grows linearly for the participants.
The total amount of money is $133,000,000. If multiple people get a position, the money assigned to that position is split evenly among the people.
All players who amassed below the required k coins incur a debt of $100,000,000.
The participants are “sensible” (they act solely to maximise their utility).
It is unknown whether the participants are “reasonable” (their decision algorithm satisfies some a priori reasonableness standard of decision making).
The participants know (solely by reputation) that they are all of (exceedingly) above-average “competence” (they win).
---
Scenario 1
The players cannot communicate at all with each other.
Scenario 2
The players can communicate freely with each other, and even exclusively with selected players, during the course of the game, except when at least one of the prospective conversation partners is making their deposit(s).
I think Scenario 2 is more practical, but I’m interested in the results of Scenario 1 analysis.
I think the problem is interesting because:
Not depositing money in the tax fund dominates depositing money in it.
If no player deposits any money in the tax fund in any of the five rounds, then no player accumulates the 40 coins required to receive post-game benefits.
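A minimal worked version of these two points, using two symbols that are not part of the post’s own notation: $T$ for the total the other players put in the tax fund in a given round, and $d$ for your own tax deposit in that round. Your coins for the round change by

$$\frac{2(T + d)}{m} - d = \frac{2T}{m} - \left(1 - \frac{2}{m}\right) d$$

so with $m = 5$ every coin you deposit costs you $3/5$ of a coin, whatever $T$ is. If nobody ever deposits, each player ends with only the $5 \times 5 = 25$ coins handed out, which is short of the 40 required.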
Nota Bene
In the original game, a conference took place at the beginning of each round. During this conference, if more than half of the players agreed, they could exile one player. At most one player could be exiled per round. I left this out, as I’m not sure how we could usefully analyse this factor of expulsion. Furthermore, the players in canon were not randomly selected, and there were in fact two groups of players that knew each other very well (plus one wild card who was familiar with both groups, but unpredictability was sort of her thing).
Post Script
I loved the manga (totally recommend it), and it made me rethink how I define rational (the post on that is coming out this weekend). I introduced “competent” to my vocabulary to refer to people who actually make utility-maximising decisions (i.e. win). “Competent” is different from “rational”, because competency can only be evaluated after the decision has been made, and not before. I decided that however I define rationality, it should be possible to evaluate it a priori, or it wouldn’t be a very useful definition.
I introduce competency so that rationality can be compared against it. The competent decision is the action that maximises payoff under the state of the world that ended up occurring. I don’t want to be rational; I want to be competent. I care about maximising expected utility only insofar as it helps me maximise utility, and I care about being more rational only insofar as it makes me more competent. I think this should be true for others as well, and this is the only moral prescription I make:
Maximise your utility.
---
Standardised Jargon
“cooperation” means acting according to a strategy agreed upon with the other players.
“defection” means acting differently from a strategy agreed upon with the other players.
$r_x$ denotes round $x$.
$t_x$ denotes the amount in the tax fund at $r_x$.
$j_x$ denotes the number of coins a player saves (deposits into their personal account) in $r_x$.
$w_x$ denotes a player’s accumulated fund at $r_x$.
$d_x$ denotes how many coins a player decides to donate (deposit in the tax fund) at $r_x$.
$p_x$ denotes a player’s pseudo pay off (“pseudo” as coins are worthless outside the game) at $r_x$.
$w_{n+1}$ denotes a player’s accumulated fund at $r_{n+1}$ (after the last round). $w_{n + 1}$ is what determines a player’s actual payoff.
Some relationships
$d_x = w_x - j_x = w_{x-1} + p_{x-1} + c - j_x$
$p_x = \frac{2 t_x}{m} - d_x$
$w_x = w_{x-1} + p_{x-1} + c$ for $1 \leq x \leq n$
$w_1 = w_0 + p_0 + c$
$w_1 = 0 + 0 + c = c$
$w_{n+1} = w_n + p_n = nc + \sum_{x=1}^{n} p_x$ (no new coins are handed out after the last round)
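As a sketch of what these relationships imply for the group as a whole, write $p_x^{(i)}$, $d_x^{(i)}$ and $w_x^{(i)}$ for player $a_i$’s pseudo payoff, donation and fund (the superscript is notation introduced here, not the post’s), and assume a player’s final fund is the $n$ allowances of $c$ coins plus their pseudo payoffs. Since $t_x = \sum_i d_x^{(i)}$:

$$\sum_{i=1}^{m} p_x^{(i)} = m \cdot \frac{2 t_x}{m} - \sum_{i=1}^{m} d_x^{(i)} = t_x$$

$$\sum_{i=1}^{m} w_{n+1}^{(i)} = mnc + \sum_{x=1}^{n} t_x$$

So for every player to reach $k$, total donations over the game must satisfy $\sum_x t_x \geq m(k - nc)$. In the canon form that is $5 \times (40 - 25) = 75$ coins.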
Accumulated knowledge so far
Regardless of the strategy the other players are using, and assuming your decision does not influence theirs, refusing to donate in a particular round always leaves you with more coins than donating would have, and gains you coins relative to every player who does donate.
Any defection prior to the last round can (usually) be fixed at the last round (it can’t be “fixed” if there is simply not enough money for example).
Everybody donating 0 coins all 5 rounds is a Nash equilibrium.
This is a sequential decision problem, which means that the players’ preferences are over terminal environment states (in this case $r_{n+1}$) and not over intermediate environment states ($r_x$, $x \in [1, n]$). All players have the following preferences:
-$\{w_{n+1} \geq k\} > \{w_{n+1} < k\}$
-{$w_{n+1}$ is first and fewer than four other people are first (the fewer draws the better) } > {$w_{n+1}$ is second and no other people are second} > {$w_{n+1}$ is first and four other people are first} > {$w_{n+1}$ is second and at least one other person is second (the fewer draws the better)} > {$w_{n+1}$ is third (the fewer draws the better)} > {$w_{n+1}$ is fourth or fifth}.
These are two hierarchies of preference. The first preference is on a higher level than the second preference. If the first preference is not satisfied, then the second is not even considered at all. I don’t know/can’t recall the name for this kind of preference.
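This kind of level-by-level comparison is usually called a lexicographic ordering, and Python tuples happen to compare in exactly that way. The sketch below is only an illustration: `preference_key` and its `prize` argument (the money the player’s final rank would pay) are hypothetical names, and the threshold of 40 is the canon value.

```python
from typing import Tuple


def preference_key(final_coins: float, prize: float,
                   threshold: float = 40) -> Tuple[bool, float]:
    """Sort key for a terminal outcome: threshold first, prize second."""
    qualifies = final_coins >= threshold
    # Below the threshold the second level is not considered at all.
    return (qualifies, prize if qualifies else 0.0)


outcomes = {
    "rich rank, 39 coins": preference_key(39, 100_000_000),
    "last place, 40 coins": preference_key(40, 0),
    "third place, 45 coins": preference_key(45, 3_000_000),
}
# max() picks the outcome that wins on the first level, then on the second.
print(max(outcomes, key=outcomes.get))
```

Tuples are compared element by element, so any outcome that meets the threshold beats any outcome that misses it, and the prize only breaks ties among outcomes on the same side.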
The hierarchical nature of the players’ preferences means that it is not irrational for a player to adopt a strategy $s_i$ at $r_x$ that the player believes will definitely lead to a lower $w_{x+1}$ than another strategy $s_j$, provided the player believes $s_i$ will lead to a higher $w_{n+1}$ than $s_j$.
To illustrate: suppose everyone agrees to donate 8 of their 20 coins in the fourth round and, if that goes to plan, 7 coins in the fifth round. Suppose one player has the option to defect maximally and donate nothing while the other four players donate the agreed 8 coins. Suppose he also believes that if he donates fewer than 8 coins, none of the other players will donate anything for the rest of the game; i.e. he believes the remaining four players are all irrationally retributive and keep their promises.
Let $s_1$ represent donating 8 coins.
Let $s_0$ represent donating 0 coins.
Assuming they adopt $s_1$:
$p_4 = (8 * 5 * 2)/5 − 8 = 8$
$w_5 = 20 + 8 + 5 = 33$
Assuming they adopt $s_0$:
$p_4 = (8 * 4 * 2)/5 − 0 = 12.8$
$w_5 = 20 + 12.8 + 5 = 37.8$
$w_6 < 40$, as none of the other players donate any money on the fifth round.
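Thus, even though $s_0$ gives the player a better pseudo payoff in $r_4$, its actual payoff is worse than that of $s_1$: $s_0$ leaves him below the 40-coin threshold, whereas keeping to the plan in both rounds gives $p_5 = (7 * 5 * 2)/5 - 7 = 7$ and $w_6 = 33 + 7 = 40$. If the player wants to defect, the best time to do so is at $r_5$, not $r_4$.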
This illustrates the fact that the players have preferences over only $w_{n+1}$ and don’t have preferences over $(w_x (x \in [1, n]))$.
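The arithmetic of this illustration can be checked with a short Python sketch. The function names are hypothetical, and the code assumes that no further allowance is paid after the fifth round, which is the reading under which $w_6 < 40$ holds for the defector.

```python
def pseudo_payoff(tax_fund: float, donated: float, players: int = 5) -> float:
    """p_x = 2 * t_x / m - d_x, with t_x the pre-doubling tax fund."""
    return 2 * tax_fund / players - donated


def final_fund(own_r4: float, others_r4: float, own_r5: float, others_r5: float,
               w4: float = 20, allowance: float = 5, players: int = 5) -> float:
    """Return w_6 given one player's and each other player's donations in r_4 and r_5."""
    others = players - 1
    p4 = pseudo_payoff(own_r4 + others * others_r4, own_r4, players)
    w5 = w4 + p4 + allowance
    p5 = pseudo_payoff(own_r5 + others * others_r5, own_r5, players)
    return w5 + p5  # assumption: no allowance is paid after the last round


# s_1: keep to the plan (8 coins, then 7 coins) alongside everyone else.
cooperate = final_fund(own_r4=8, others_r4=8, own_r5=7, others_r5=7)
# s_0: donate nothing in r_4; the others retaliate by donating nothing in r_5.
defect = final_fund(own_r4=0, others_r4=8, own_r5=0, others_r5=0)
print(cooperate, defect)
```

Keeping to the plan lands exactly on the 40-coin threshold, while the early defection ends at about 37.8 coins even though it paid more in $r_4$.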
Tragedy of the Commons