Sorry, I didn’t mean to be accusatory in that, only descriptive in a way that I hope will let me understand what you’re trying to model/measure as “alignment”, with the prerequisite understanding of what the payout matrix indicates. http://cs.brown.edu/courses/cs1951k/lectures/2020/chapters1and2.pdf is one reference, but I’ll admit it’s baked in to my understanding to the point that I don’t know where I first saw it. I can’t find any references to the other interpretation (that the payouts are something other than a ranking of preferences by each player).
So the questions are “what DO these payout numbers represent?” and “what other factors go into an agent’s decision of which row/column to choose?”
Right, thanks!
I think I agree that payout represents player utility.
The agent’s decision can be made in any way. Best response, worst response, random response, etc.
I just don’t want to assume the players are making decisions via best response to each strategy profile (which is just some joint strategy of all the game’s players). Like, in rock-paper-scissors, if we consider the strategy profile (P1: rock, P2: scissors), I’m not assuming that P2 would respond to this by playing paper.
And when I talk about ‘responses’, I do mean ‘response’ in the ‘best response’ sense; the same way one can reason about Nash equilibria in non-iterated games, we can imagine asking “how would the player respond to this outcome?”.
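To make the rock-paper-scissors example concrete, here is a rough sketch of what “best response to a strategy profile” means computationally. The payoff numbers (win = 1, tie = 0, loss = -1) and the helper names `payoff` and `best_responses` are my own assumptions for illustration, not anything fixed by the discussion.

```python
# Illustrative sketch only: payoffs (win=1, tie=0, loss=-1) are assumed.
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Payoff to the player choosing `mine` against an opponent choosing `theirs`."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_responses(opponent_move):
    """All moves that maximize payoff against a fixed opponent move."""
    best = max(payoff(m, opponent_move) for m in MOVES)
    return [m for m in MOVES if payoff(m, opponent_move) == best]


# The strategy profile from the example above.
profile = {"P1": "rock", "P2": "scissors"}

# P2's best response to P1's move in this profile.
print(best_responses(profile["P1"]))
```

This only says what a best-responding P2 would play against rock; it does not say that P2 actually chooses that way, which is exactly the assumption I want to avoid building in.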
Another point for triangulating my thoughts here is Vanessa’s answer, which I think resolves the open question.
I like Vanessa’s answer for the fact that it’s clearly NOT utility that is in the given payoff matrix. It’s not specified what it actually is, but the inclusion of a utility function that transforms the given outcomes into desirability (utility) for the players separates the concepts enough to make sense. Then defining alignment as how well player A’s utility function supports player B’s game-outcome works. Not sure it’s useful, but it’s sensible.
How is it clearly not about utility being specified in the payoff matrix? Vanessa’s definition itself relies on utility, and both of us interchanged ‘payoff’ and ‘utility’ in the ensuing comments.