Alright, here comes a pretty detailed proposal! The idea is to find out if the sum of expected utility for both players is “small” or “large” using the appropriate normalizers.
First, let’s define some quantities. (I’m not overly familiar with game theory, and my notation and terminology are probably non-standard. Please correct me if that’s the case!)
A. The payoff matrix for player 1.
B. The payoff matrix for player 2.
s, r. The mixed strategies for players 1 and 2. These are probability vectors, i.e., vectors of non-negative numbers summing to 1.
Then the expected payoff for player 1 is the bilinear form $s^T A r = \sum_{i,j} s_i a_{ij} r_j$, and the expected payoff for player 2 is $s^T B r = \sum_{i,j} s_i b_{ij} r_j$. The sum of payoffs is $s^T (A+B) r$.
But we’re not done defining stuff yet. I interpret alignment to be about welfare, i.e., how large the sum of utilities is compared to the best-case and the worst-case scenario. To make an alignment coefficient out of this idea, we will need:
$l(A,B)$. This is the lower bound on the sum of payoffs, $l(A,B) = \min_{u,v} u^T (A+B) v$, where $u, v$ range over probability vectors. Evidently, $l(A,B) = \min(A+B)$, the smallest entry of $A+B$.
$u(A,B)$. The upper bound on the sum of payoffs in the counterfactual situation where the payoff to player 1 is not affected by the actions of player 2, and vice versa. Then $u(A,B) = \max_{u,v} u^T A v + \max_{u,v} u^T B v$, and we find that $u(A,B) = \max A + \max B$, the sum of the largest entries of $A$ and $B$.
Now define the alignment coefficient of the strategies (s,r) in the game defined by the payoff matrices (A,B) as
$$a = \frac{s^T (A+B) r - l(A,B)}{u(A,B) - l(A,B)}.$$
The intuition is that alignment quantifies how the expected payoff sum $s^T(A+B)r$ compares to the best possible payoff sum $u(A,B)$ attainable when the payoffs are independent. If they are equal, we have perfect alignment ($a=1$). On the other hand, if $s^T(A+B)r = l(A,B)$, the expected payoff sum is as bad as it could possibly be, and we have minimal alignment ($a=0$).
The only problem is that $u(A,B) = l(A,B)$ makes the denominator equal to 0; but in this case $u(A,B) = s^T(A+B)r$ as well, which I believe means that defining $a=1$ is correct. (It’s also true that $l(A,B) = s^T(A+B)r$, but I don’t think this matters too much. The players get the best possible outcome no matter how they play, which deserves $a=1$.) This is an extreme edge case, as it only holds for the special payoff matrices where $A$ contains the same element $a$ in every cell and $B$ contains the same element $b$ in every cell.
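To make the definition concrete, here is a minimal sketch in Python/numpy (the function name, argument layout, and the numerical tolerance are my own choices for illustration, not part of the proposal):

```python
import numpy as np

def alignment(s, r, A, B):
    """Alignment coefficient a(s, r) for payoff matrices A (player 1) and B (player 2).

    s and r are mixed strategies (probability vectors) for players 1 and 2;
    A and B have shape (len(s), len(r)).
    """
    s, r, A, B = map(np.asarray, (s, r, A, B))
    payoff_sum = s @ (A + B) @ r      # expected payoff sum  s^T (A + B) r
    l = (A + B).min()                 # l(A, B): worst possible payoff sum
    u = A.max() + B.max()             # u(A, B): best sum if the payoffs were independent
    if np.isclose(u, l):              # edge case: constant payoff matrices
        return 1.0                    # the convention argued for above
    return (payoff_sum - l) / (u - l)
```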
Let’s look at some properties:
A pure coordination game has at least one maximal alignment equilibrium, i.e., $a(s,r) = 1$ for some $(s,r)$. All of these are necessarily Nash equilibria.
A zero-sum game (that isn’t game-theoretically equivalent to the 0 matrix) has $a=0$ for every pair of strategies $(s,r)$. This is because $s^T A r + s^T B r = l(A,B) = 0$ for every $(s,r)$. The total payoff is always the worst possible.
The alignment coefficient is invariant in a specific sense, i.e., $a(A,B) = a(aJ + dA,\, bJ + dB)$ for any constants $a, b$ and any $d > 0$, where $J$ is the matrix consisting of only 1s. (Both this and the zero-sum property are checked numerically in the sketch right after this list.)
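Here is that numerical check, reusing the `alignment` sketch from earlier (matching pennies stands in for the zero-sum game, and the constants in the invariance check are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero-sum property: for matching pennies A + B = 0, so a(s, r) = 0 for all (s, r).
A_mp = np.array([[ 1, -1],
                 [-1,  1]])
B_mp = -A_mp
for _ in range(5):
    s, r = rng.dirichlet(np.ones(2)), rng.dirichlet(np.ones(2))  # random mixed strategies
    assert np.isclose(alignment(s, r, A_mp, B_mp), 0.0)

# Invariance property: a(A, B) = a(aJ + dA, bJ + dB) for constants a, b and d > 0.
A, B = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
s, r = rng.dirichlet(np.ones(3)), rng.dirichlet(np.ones(3))
J = np.ones((3, 3))
a_const, b_const, d = 2.0, -1.5, 3.0
assert np.isclose(alignment(s, r, A, B),
                  alignment(s, r, a_const * J + d * A, b_const * J + d * B))
```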
Now let’s take a look at a variant of the Prisoner’s dilemma with joint payoff matrix
$$P = \begin{pmatrix} (2,2) & (0,3) \\ (3,0) & (1,1) \end{pmatrix}.$$
Then
$$A = \begin{pmatrix} 2 & 0 \\ 3 & 1 \end{pmatrix}, \quad B = A^T, \quad A + B = A + A^T = \begin{pmatrix} 4 & 3 \\ 3 & 2 \end{pmatrix}.$$
The alignment coefficient at $(s,r)$ is
$$\frac{s^T (A + A^T) r - 2}{6 - 2} = \frac{1}{4}\Big(4 s_1 r_1 + 3 s_1 (1 - r_1) + 3 r_1 (1 - s_1) + 2 (1 - s_1)(1 - r_1) - 2\Big) = \frac{s_1 + r_1}{4}.$$
Assuming pure strategies, we find the following matrix of alignment, where $a_{ij}$ is the alignment when player 1 plays $i$ with certainty and player 2 plays $j$ with certainty:
$$a = \begin{pmatrix} 1/2 & 1/4 \\ 1/4 & 0 \end{pmatrix}.$$
Since $s = r = (0,1)$ is the only Nash equilibrium, the “alignment at rationality” is 0. By taking convex combinations, the range of alignment coefficients is $[0, 1/2]$.
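Continuing the sketch from above, the pure-strategy alignment matrix and the closed form $(s_1 + r_1)/4$ can be checked directly:

```python
import numpy as np

# Prisoner's dilemma variant from the joint payoff matrix P above.
A_pd = np.array([[2, 0],
                 [3, 1]])
B_pd = A_pd.T

pure = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # index 0 = cooperate, index 1 = defect
align_matrix = np.array([[alignment(s, r, A_pd, B_pd) for r in pure] for s in pure])
print(align_matrix)   # [[0.5  0.25]
                      #  [0.25 0.  ]]

# The closed form (s1 + r1) / 4 also holds for mixed strategies:
s, r = np.array([0.6, 0.4]), np.array([0.3, 0.7])
assert np.isclose(alignment(s, r, A_pd, B_pd), (s[0] + r[0]) / 4)
```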
Some further comments:
Any general alignment coefficient probably has to be a function of (s,r), as we need to allow them to vary when doing game theory.
Specialized coefficients would report the alignment only at Nash equilibria, perhaps only at the Nash equilibrium with maximal alignment.
One may report the maximal alignment without caring about equilibrium points, but then the strategies do not have to be in equilibrium, which I am uneasy with. The maximal alignment for the Prisoner’s dilemma is 1/2, but does this matter? Not if we want to quantify the tendency for rational actors to maximize their total utility, at least.
Using e.g. the correlation between the payoffs is not a good idea, as it implicitly assumes the uniform distribution on s,r. And why would you do that?
I like how this proposal makes explicit the player strategies, and how they are incorporated into the calculation. I also think that the edge case where the agents’ actions have no effect on the result is handled sensibly.
I think that this proposal making alignment symmetric might be undesirable. Taking the prisoner’s dilemma as an example, if s = always cooperate and r = always defect, then I would say s is perfectly aligned with r, and r is not at all aligned with s.
The result of 0 alignment for the Nash equilibrium of PD seems correct.
I think this should be the alignment matrix for pure-strategy, single-shot PD:
$$a = \begin{pmatrix} (1,1) & (1,0) \\ (0,1) & (0,0) \end{pmatrix}$$
Here the first of each ordered pair represents A’s alignment with B. (assuming we use the [0,1] interval)
I think in this case the alignments are simple, because A can choose to either maximize or to minimize B’s utility.
I believe the upper right-hand corner of a shouldn’t be 1; even if both players are acting in each other’s best interest, they are not acting in their own best interest. And alignment is about having both at the same time. The configuration of Prisoner’s dilemma makes it impossible to make both players maximally satisfied at the same time, so I believe it cannot have maximal alignment for any strategy.
Anyhow, your concept of alignment might involve altruism only, which is fair enough. In that case, Vanessa Kosoy has a similar proposal to mine, but not working with sums, which probably does exactly what you are looking for.
Getting alignment in the upper right-hand corner in the Prisoner’s dilemma matrix to be 1 may be possible if we redefine $u(A,B)$ to $u(A,B) = \max_{u,v} u^T (A+B) v$, the best attainable payoff sum. But then zero-sum games will have maximal instead of minimal alignment! (This is one reason why I defined $u(A,B) = \max_{u,v} u^T A v + \max_{u,v} u^T B v$.)
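For what it’s worth, here is a quick sketch of what that redefinition would do; the function below is my own illustration of the alternative normalizer, reusing the same edge-case convention as before:

```python
import numpy as np

def alt_alignment(s, r, A, B):
    """Alignment with the alternative normalizer u'(A, B) = max_{u,v} u^T (A + B) v."""
    s, r, A, B = map(np.asarray, (s, r, A, B))
    payoff_sum = s @ (A + B) @ r
    l = (A + B).min()
    u = (A + B).max()                 # best attainable payoff *sum*
    if np.isclose(u, l):              # every zero-sum game lands here: u' = l = 0
        return 1.0                    # so the a = 1 convention calls it maximally aligned
    return (payoff_sum - l) / (u - l)

# Matching pennies (zero-sum): every strategy pair now comes out maximally aligned.
A_mp = np.array([[1, -1], [-1, 1]])
print(alt_alignment([0.5, 0.5], [0.5, 0.5], A_mp, -A_mp))   # 1.0
```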
(Btw, the coefficient isn’t symmetric; it’s only symmetric for symmetric games. No alignment coefficient depending on the strategies can be symmetric, as the vectors can have different lengths.)