I think I agree that payout represents player utility.
The agent’s decision can be made in any way: best response, worst response, random response, etc.
I just don’t want to assume the players are making decisions via best response to each strategy profile (which is just some joint strategy of all the game’s players). Like, in rock-paper-scissors, if we consider the strategy profile P1: rock, P2: scissors, I’m not assuming that P2 would respond to this by playing paper.
And when I talk about ‘responses’, I do mean ‘response’ in the ‘best response’ sense; the same way one can reason about Nash equilibria in non-iterated games, we can imagine asking “how would the player respond to this outcome?”.
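To make the distinction concrete, here is a minimal sketch of the rock-paper-scissors example. The +1/0/-1 payoff values and all function names are my own assumptions for illustration, not part of anyone's definition. It shows that "how P2 responds to a strategy profile" is just some function of the profile, and best response is only one candidate for that function.

```python
import random

MOVES = ["rock", "paper", "scissors"]
# BEATS[x] is the move that x defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(p1, p2):
    """Return (P1 payoff, P2 payoff); assumed values: +1 win, 0 tie, -1 loss."""
    if p1 == p2:
        return (0, 0)
    return (1, -1) if BEATS[p1] == p2 else (-1, 1)

# Three hypothetical response rules for P2. Each maps a strategy profile
# (P1's move, P2's move) to the move P2 would switch to.
def best_response_p2(profile):
    p1, _ = profile
    return max(MOVES, key=lambda m: payoff(p1, m)[1])

def worst_response_p2(profile):
    p1, _ = profile
    return min(MOVES, key=lambda m: payoff(p1, m)[1])

def random_response_p2(profile, rng):
    # Ignores the profile entirely.
    return rng.choice(MOVES)

profile = ("rock", "scissors")
print(best_response_p2(profile))   # paper
print(worst_response_p2(profile))  # scissors
print(random_response_p2(profile, random.Random(0)))
```

Only the first rule has P2 answering rock with paper; the other two are equally valid "responses" in the sense I mean.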
Another point for triangulating my thoughts here is Vanessa’s answer, which I think resolves the open question.
I like Vanessa’s answer for the fact that it’s clearly NOT utility that is in the given payoff matrix. It’s not specified what it actually is, but the inclusion of a utility function that transforms the given outcomes into desirability (utility) for the players separates the concepts enough to make sense. Defining alignment as how well player A’s utility function supports player B’s game outcome then works. Not sure it’s useful, but it’s sensible.
How is it clearly not about utility being specified in the payoff matrix? Vanessa’s definition itself relies on utility, and both of us interchanged ‘payoff’ and ‘utility’ in the ensuing comments.
Right, thanks!