You might want to analyze this game according to standard game theory before trying to reinvent the wheel (and risking to make it square). In particular, when your proposed strategy involves guessing what the other players do, and this doesn’t converge when the other players do the same to you, it is usually a bad sign (sometimes it’s not possible to do anything better, but sometime it is).
Let’s start with the standard version:
There are two pure strategy Nash equilibria: (A, B) and (B, A). Both are Pareto-optimal, but they benefit the player who played A more than the other, thus they are “unfair”, in the sense that they yield asymmetric payoff in a symmetric game. There is also a mixed strategy Nash equilibrium which is symmetric but sub-optimal.
The game is a bargaining problem and a coordination game as well. In particular, it is a variation of the battle of the sexes game. Unfortunately, there doesn’t seem to be any truly satisfactory solution to this game. There is, however, a correlated equilibrium solution. If both players have access to a public random bit generator, they can obtain a Pareto-optimal symmetric payoff equilibrium, assuming that they can communicate to coordinate a strategy (they don’t have to make a contract).
Now, let’s consider the program-swap variant of the game.
First of all, a remark on the wording:
what if we allow players access to each others’ source code?
Players don’t have access to each others’ source code. Players submit ‘bots’: programs which have access to each others’ source code. The submitted programs are (pure) strategies, not players. I think it is important to always make the distinction clear, otherwise there is a risk of confusion between the decision makers and the decisions themselves, which may lead to flawed proposed solutions.
In the program-swap variant of this game we (players) can achieve something eqiivalent to the correlated equilibrium by
using clique-like mixed strategies:
Each player flips a coin and submits either CliqueBot_0 or CliqueBot_1, which are defined as: def CliqueBot_X(otherBot): if otherBot is not a CliqueBot_X then output ‘A’ Y = extract ‘X’ from otherBot ID = player id (either 0 or 1) if X xor Y xor ID then output ‘A’ otherwise output ‘B’
This achieves a Pareto-optimal symmetric equilibrium against itself and, being a clique, it also achieves stability: defectors from the clique are immediately punished. It also achieves maximum payoff against B-bots and PredictorBots, though it performs poorly against A-bots.
The main issue with this strategy is, of course, that there are infinitely many cliques, each of them doesn’t cooperate with the others. This becomes to a “choosing sides” coordination game, which is easier than the original game: you just need a single way communication channel between the players to achieve coordination. Even if there is no communicaton channel, it might possible to achieve coordination using a prior, without lack of convergence.
You might want to analyze this game according to standard game theory before trying to reinvent the wheel (and risking to make it square).
Fair enough,
Players don’t have access to each others’ source code.
Well… This article is tagged decision_theory, not game_theory. The goal is not merely to explore solutions to this game, the goal is to explore solution-finding strategies to this game, or at least lay the foundations for that.
I think your CliqueBots need the insight behind Emile’s hashMe proposal—they each need a unique ID. Otherwise, when CliqueBot_0 plays a copy, “X xor Y xor ID” is 0, so both play B, so they both lose.
Well… This article is tagged decision_theory, not game_theory. The goal is not merely to explore solutions to this game, the goal is to explore solution-finding strategies to this game, or at least lay the foundations for that.
Isn’t that exactly the topic of game theory?
I think your CliqueBots need the insight behind Emile’s hashMe proposal—they each need a unique ID. Otherwise, when CliqueBot_0 plays a copy, “X xor Y xor ID” is 0, so both play B, so they both lose.
Yes, I assumed that the execution environment provides the bots with a binary role ID as an additional (implicit) parameter. If that’s a problem you can use a more sophisticated symmetry-breaking scheme, on the lines of Emile’s proposal, but I think it is better to maintain cliquing, which is missing from Emile’s proposal.
Well, you seem to be more expert than I am, so I can just ask you :D Given a small, two-player payoff matrix, what general process outputs the correct move for player 1?
I’m not really that expert, but I can try to answer the question anyway.
The short answer is to implement game theory.
Of course, this is easier said than done. Under the assumption that the other player is at least as rational as you are, you generally want a decision procedure that chooses a strategy which is part of a Nash equilibrium (under mild assumptions, at least one Nash equilibrium always exists). If there are multiple Nash equilibria, as it is often the case, you want your decision procedure to seek for a refinement and hope that the other player does the same. Ultimately, for some games there is no single satisfactory solution. It could be that this happens because game theory is incomplete or that these games are intrinsically unsolvable dilemmas. In these cases, you might want to fall back to an heuristic (such as randomization or trying to guess the other player decision)
Hm. So I think I reinvented the wheel to moderate success, then. In the blind, no-communication case, you play the bargaining equilibrium. With communication via source code, though, I didn’t make my wheel—following the bargaining equilibrium among strategies—very round. This is because you can use your source codes to send two random numbers, which lets you cooperate with the other player. But there are many different possible ways of doing this, each corresponding to a different equilibrium.
To have any chance of cooperating, and thus beating my slightly non-round wheel, you need a prior over opponents to let you narrow down the search for cooperation schemes (probably “pick the simplest”), and you need to randomize over the multiple remaining equilibria. And no cliqueishness here, since we’re trying to play blind.
You might want to analyze this game according to standard game theory before trying to reinvent the wheel (and risking to make it square).
In particular, when your proposed strategy involves guessing what the other players do, and this doesn’t converge when the other players do the same to you, it is usually a bad sign (sometimes it’s not possible to do anything better, but sometime it is).
Let’s start with the standard version:
There are two pure strategy Nash equilibria: (A, B) and (B, A). Both are Pareto-optimal, but they benefit the player who played A more than the other, thus they are “unfair”, in the sense that they yield asymmetric payoff in a symmetric game.
There is also a mixed strategy Nash equilibrium which is symmetric but sub-optimal. The game is a bargaining problem and a coordination game as well. In particular, it is a variation of the battle of the sexes game.
Unfortunately, there doesn’t seem to be any truly satisfactory solution to this game. There is, however, a correlated equilibrium solution. If both players have access to a public random bit generator, they can obtain a Pareto-optimal symmetric payoff equilibrium, assuming that they can communicate to coordinate a strategy (they don’t have to make a contract).
Now, let’s consider the program-swap variant of the game.
First of all, a remark on the wording:
Players don’t have access to each others’ source code. Players submit ‘bots’: programs which have access to each others’ source code. The submitted programs are (pure) strategies, not players. I think it is important to always make the distinction clear, otherwise there is a risk of confusion between the decision makers and the decisions themselves, which may lead to flawed proposed solutions.
In the program-swap variant of this game we (players) can achieve something eqiivalent to the correlated equilibrium by using clique-like mixed strategies:
Each player flips a coin and submits either CliqueBot_0 or CliqueBot_1, which are defined as:
def CliqueBot_X(otherBot):
if otherBot is not a CliqueBot_X then output ‘A’
Y = extract ‘X’ from otherBot
ID = player id (either 0 or 1)
if X xor Y xor ID then output ‘A’
otherwise output ‘B’
This achieves a Pareto-optimal symmetric equilibrium against itself and, being a clique, it also achieves stability: defectors from the clique are immediately punished. It also achieves maximum payoff against B-bots and PredictorBots, though it performs poorly against A-bots.
The main issue with this strategy is, of course, that there are infinitely many cliques, each of them doesn’t cooperate with the others. This becomes to a “choosing sides” coordination game, which is easier than the original game: you just need a single way communication channel between the players to achieve coordination. Even if there is no communicaton channel, it might possible to achieve coordination using a prior, without lack of convergence.
Fair enough,
Well… This article is tagged decision_theory, not game_theory. The goal is not merely to explore solutions to this game, the goal is to explore solution-finding strategies to this game, or at least lay the foundations for that.
I think your CliqueBots need the insight behind Emile’s hashMe proposal—they each need a unique ID. Otherwise, when CliqueBot_0 plays a copy, “X xor Y xor ID” is 0, so both play B, so they both lose.
Isn’t that exactly the topic of game theory?
Yes, I assumed that the execution environment provides the bots with a binary role ID as an additional (implicit) parameter. If that’s a problem you can use a more sophisticated symmetry-breaking scheme, on the lines of Emile’s proposal, but I think it is better to maintain cliquing, which is missing from Emile’s proposal.
Well, you seem to be more expert than I am, so I can just ask you :D Given a small, two-player payoff matrix, what general process outputs the correct move for player 1?
I’m not really that expert, but I can try to answer the question anyway.
The short answer is to implement game theory.
Of course, this is easier said than done.
Under the assumption that the other player is at least as rational as you are, you generally want a decision procedure that chooses a strategy which is part of a Nash equilibrium (under mild assumptions, at least one Nash equilibrium always exists). If there are multiple Nash equilibria, as it is often the case, you want your decision procedure to seek for a refinement and hope that the other player does the same.
Ultimately, for some games there is no single satisfactory solution. It could be that this happens because game theory is incomplete or that these games are intrinsically unsolvable dilemmas. In these cases, you might want to fall back to an heuristic (such as randomization or trying to guess the other player decision)
Hm. So I think I reinvented the wheel to moderate success, then. In the blind, no-communication case, you play the bargaining equilibrium. With communication via source code, though, I didn’t make my wheel—following the bargaining equilibrium among strategies—very round. This is because you can use your source codes to send two random numbers, which lets you cooperate with the other player. But there are many different possible ways of doing this, each corresponding to a different equilibrium.
To have any chance of cooperating, and thus beating my slightly non-round wheel, you need a prior over opponents to let you narrow down the search for cooperation schemes (probably “pick the simplest”), and you need to randomize over the multiple remaining equilibria. And no cliqueishness here, since we’re trying to play blind.