Lukas_Gloor comments on Let’s Read: Superhuman AI for multiplayer poker

Lukas_Gloor 14 Jul 2019 20:26 UTC
12 points
Thanks for this summary!
In 2017 I commented on the two-player version here.
… if the player bets in [a winning] situation only when holding the best possible hand, then the opponents would know to always fold in response. To cope with this, Pluribus keeps track of the probability it would have reached the current situation with each possible hand according to its strategy. Regardless of which hand Pluribus is actually holding, it will first calculate how it would act with every possible hand, being careful to balance its strategy across all the hands so as to remain unpredictable to the opponent. Once this balanced strategy across all hands is computed, Pluribus then executes an action for the hand it is actually holding.
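To make the bookkeeping in that quote concrete, here is a toy sketch of how a range gets reweighted after an action. It is not Pluribus's actual code: the three hand buckets, the prior, and the `update_range` helper are made-up assumptions for illustration, and the update is plain Bayes' rule.

```python
from typing import Dict


def update_range(prior: Dict[str, float],
                 action_prob: Dict[str, float]) -> Dict[str, float]:
    """Bayes update: P(hand | action) is proportional to P(hand) * P(action | hand).

    `prior` is the range before the action; `action_prob` gives, for each
    hand bucket, how often the strategy takes the observed action with it.
    Both are hypothetical toy inputs, not anything taken from Pluribus.
    """
    unnormalized = {h: prior[h] * action_prob.get(h, 0.0) for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the strategy never takes this action")
    return {h: w / total for h, w in unnormalized.items()}


# Hypothetical range before the bet, in three coarse buckets.
prior = {"nuts": 0.2, "medium": 0.5, "air": 0.3}

# Unbalanced strategy: bet only with the best possible hand.
only_nuts = {"nuts": 1.0, "medium": 0.0, "air": 0.0}

# More balanced strategy: also bet the worst hands one third of the time.
mixed = {"nuts": 1.0, "medium": 0.0, "air": 1 / 3}

range_if_unbalanced = update_range(prior, only_nuts)
range_if_mixed = update_range(prior, mixed)
```

With the first strategy, the betting range is 100% nuts, so opponents can always fold. With the second, one third of the betting range is air, and folding everything is no longer safe.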
Human professional players try to approximate this level of balance as well, using computer programs (“solvers”). See this YouTube video for an example of a hand with solver analysis. To get the solver analysis started, one needs to specify the hand ranges one expects people to have in the specific situations, as well as the bet sizes for the solver to consider (more than just 2-3 bet sizes would be too much for the solver to handle). To specify those parameters, professionals make guesses (sometimes based on data) about how other players play. Because the input parameters depend on human-learned wisdom rather than worked-out game theory, solvers can’t quite be said to have solved poker.
So, like the computer, human players try to simplify the game tree in order to approximate balanced play. However, this is much easier for computers. Pluribus knows its own counterfactuals perfectly, and it can make sure that its range always covers all the possible holdings (so that it is represented on different board textures) and that it has the right number of bluffs paired with good hands for every state of the game given past actions.
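As a rough illustration of what “the right number of bluffs” means, here is the standard textbook toy model of a single river bet, where the bettor holds either a value hand or a bluff and the caller holds a pure bluff-catcher. This is a simplifying assumption on my part, not how Pluribus computes anything; the function names are hypothetical. The caller risks the bet to win pot plus bet, so they are indifferent when bluffs make up bet / (pot + 2 * bet) of the betting range.

```python
from fractions import Fraction


def bluff_fraction(pot: int, bet: int) -> Fraction:
    """Share of bluffs in the betting range that makes a bluff-catcher
    indifferent between calling and folding.

    Calling wins (pot + bet) against a bluff and loses bet against value:
    f * (pot + bet) - (1 - f) * bet = 0  =>  f = bet / (pot + 2 * bet).
    """
    return Fraction(bet, pot + 2 * bet)


def min_defense_frequency(pot: int, bet: int) -> Fraction:
    """How often the defender must continue so that a pure bluff
    (risking bet to win pot) does not profit automatically."""
    return Fraction(pot, pot + bet)


pot_sized = bluff_fraction(pot=100, bet=100)   # bluffs per betting hand
half_pot = bluff_fraction(pot=100, bet=50)
defend_vs_pot = min_defense_frequency(pot=100, bet=100)
```

In this toy model a pot-sized bet supports one bluff for every two value hands, and a smaller bet supports fewer bluffs. Real spots have more than two hand classes and more streets to play, which is part of why balancing them by hand is hard.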
It almost seems kind of easy to beat humans in this way, except that knowing how to simplify and then model the situations in the first place seems to have been the bottleneck up until 2017.
Donk betting: some kind of uncommon play that’s usually considered dumb (like a donkey). I didn’t figure out what it actually means.
“Donk betting” has a bad reputation because it’s a typical mistake amateur players make, doing it in the wrong types of situations with the wrong types of hands. You can only donk bet in a betting round if you’re first to act, and a general weakness of amateur players is that they don’t understand the value of being last to act (having more information). To at least somewhat mitigate the awfulness of being first to act, good players try to give out as little information as possible. If you played the previous street passively and your opponent displayed strength, you generally want to check because your opponent already expects you to be weaker, and so will do the betting for you often enough because they’re still telling their story of having a stronger hand. If you donk bet when a new card improved you, you telegraph information and your opponent can play perfectly against that, folding their weak hands and continuing only with strong hands. If you check instead, you get more value from your opponent’s bluffs, and you almost always still get to put in your raise after they bet for you, since their bet reopens the betting round for you.
However, there are instances where donk betting is clearly good: when a new card is much more likely to improve your range of hands than your opponent’s. In certain situations a new card is terrible for one player and good for the other player. In those instances, you can expect thinking opponents to check after you even with most of their strong hands, because they have become apprehensive of your range of hands having improved a lot. In that case, you sometimes want to bet out right away (both in some of the cases where you hit, as well as with bluffs).
However, Pluribus disagrees with the folk wisdom that “donk betting” (starting a round by betting when one ended the previous betting round with a call) is a mistake; Pluribus does this far more often than professional humans do.
It might just be that professional humans decide to keep the game tree simple by not developing donk bet strategies for situations where this is complicated to balance and only produces small benefits if done perfectly. But it could be that Pluribus found a more interesting reason to occasionally use donk bets in situations where professional players would struggle to see the immediate use. Unfortunately, I couldn’t find any discussion of hand histories illustrating the concept.