Nash equilibria are generated by both players using the argmax strategy, which tries to respond as well as possible to what the foe is doing.
Which makes sense (or seems to) when it’s zero-sum.
And it has the teensy little problem that it’s assuming that you can make your decision completely independently of the foe without them catching on and changing what they do in response.
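To make the mutual argmax idea concrete, here is a minimal sketch that finds pure-strategy Nash equilibria in a two-player matrix game by checking whether each action is a best response to the other. The function name and the example payoff tables (a prisoner's dilemma and matching pennies) are illustrative assumptions, not taken from the discussion above.

```python
def pure_nash_equilibria(payoff_row, payoff_col):
    """Return all (i, j) cells where each action is an argmax reply to the other.

    payoff_row[i][j] and payoff_col[i][j] are the payoffs to the row and
    column player when row plays i and column plays j.
    """
    n_rows = len(payoff_row)
    n_cols = len(payoff_row[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # Best payoff the row player could get against column action j.
            best_row = max(payoff_row[k][j] for k in range(n_rows))
            # Best payoff the column player could get against row action i.
            best_col = max(payoff_col[i][k] for k in range(n_cols))
            if payoff_row[i][j] == best_row and payoff_col[i][j] == best_col:
                equilibria.append((i, j))
    return equilibria


# Hypothetical prisoner's dilemma: action 0 = cooperate, 1 = defect.
pd_row = [[3, 0], [5, 1]]
pd_col = [[3, 5], [0, 1]]

# Hypothetical matching pennies (zero-sum): payoffs sum to zero in every cell.
mp_row = [[1, -1], [-1, 1]]
mp_col = [[-1, 1], [1, -1]]

pd_equilibria = pure_nash_equilibria(pd_row, pd_col)
mp_equilibria = pure_nash_equilibria(mp_row, mp_col)
```

In this sketch the prisoner's dilemma has mutual defection as its only pure equilibrium, while matching pennies has none, since some player always wants to switch. Note that each best response is computed holding the foe's action fixed, which is exactly the independence assumption being questioned.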
One component is kind of like asking ‘what moves are good whatever the other player does’, though the formalization seems to drift from this. (And the continuous/probabilistic version I’m slightly less clear on—especially because these agents don’t choose equilibria.)
Which doesn’t look like much of a stretch, especially given that knowing someone’s decision procedure seems easier than knowing their planned output for a game where you can’t peek at the foe. The former only requires source code access while the latter requires actually running the computation that is the foe.
Or having observed them actually take an action (whatever the procedure), but that’s fair; that’s a different game.
Basically, it’s assuming that the two players can observe shared random bits.
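One way to picture what shared random bits allow is the rough sketch below. It assumes a single fair public bit and a hypothetical game of chicken; the payoff numbers, the `PLAN` mapping, and both function names are made up for illustration. Both players see the same bit and play the action it recommends, and the check confirms that neither gains by deviating alone.

```python
from fractions import Fraction

# Hypothetical game of chicken: action 0 = dare, 1 = swerve.
ROW = [[0, 7], [2, 6]]  # payoffs to the row player
COL = [[0, 2], [7, 6]]  # payoffs to the column player

# Shared bit -> recommended (row action, column action).
# Assumption: both players observe the same fair bit.
PLAN = {0: (0, 1), 1: (1, 0)}


def obeying_is_best_response(plan, row, col):
    """True if no player gains by unilaterally ignoring any recommendation."""
    for i, j in plan.values():
        # Row player considers every alternative k against column action j.
        if any(row[k][j] > row[i][j] for k in range(len(row))):
            return False
        # Column player considers every alternative k against row action i.
        if any(col[i][k] > col[i][j] for k in range(len(col[0]))):
            return False
    return True


def expected_payoffs(plan, row, col):
    """Expected payoffs when each value of the shared bit is equally likely."""
    p = Fraction(1, len(plan))
    row_value = sum(p * row[i][j] for i, j in plan.values())
    col_value = sum(p * col[i][j] for i, j in plan.values())
    return row_value, col_value


stable = obeying_is_best_response(PLAN, ROW, COL)
values = expected_payoffs(PLAN, ROW, COL)
```

Under these assumed payoffs, following the bit is stable and each player expects 9/2. Without the shared bit, no pair of independent strategies gives that symmetric outcome, because they cannot reliably avoid the crash cell.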
I didn’t know that. (Not surprising, given that “fairly little is known about it.”)