Just to be clear, are you saying that “in the typical prisoner’s dilemma”, each player has access to the other player’s source code?

It’s usually not stated in computational terms like that, but to my understanding yes. The prisoner’s dilemma is usually posed as a game of complete information, with nothing hidden.
Oh wow, that’s a completely outlandish thing to believe, from my perspective. I’ll try to explain why I feel that way:
I haven’t read very much of the open-source game theory literature, but I have read a bit, and everybody presents it as “this is a weird and underexplored field”, not “this is how game theory works and has always worked and everybody knows that”.
Here’s an example: an open-source game theory domain expert writes “I sometimes forget that not everyone realizes how poorly understood open-source game theory is … open-source game theory can be very counterintuitive, and we could really use a lot more research to understand how it works before we start building a lot of code-based agents that are smart enough to read and write code themselves”
Couple more random papers: this one & this one. Both make it clear from their abstracts that they are exploring a different problem than the normal problems of game theory.
You can look up any normal game theory paper / study ever written, and you’ll notice that the agents are not sharing source code
Game theory is purported to be at least slightly relevant to humans and human affairs and human institutions, but none of those things can share source code. Granted, everybody knows that game theory is idealized compared to the real world, but its limitations in that respect are frequently discussed, and I have never seen “humans can’t share source code / read each other’s minds” listed as one of those limitations.
The prisoner’s dilemma is usually posed as a game of complete information, with nothing hidden.
Wikipedia includes “strategies” as being common knowledge by definition of “complete information”, but I think that’s just an error (or a poor choice of words—see next paragraph). EconPort says complete information means each player is “informed of all other players payoffs for all possible action profiles”; Policonomics says “each agent knows the other agent’s utility function and the rules of the game”; Game Theory: An Introduction says complete information is “the situation in which each player i knows the action set and payoff function of each and every player j, and this itself is common knowledge”; Game Theory for Applied Economists says “a game has incomplete information if one player does not know another player’s payoffs”. I also found several claims that chess is a complete-information game, despite the fact that chess players obviously don’t share source code or explain to their opponent what exactly they were planning when they sacrificed their bishop.
I haven’t yet found any source besides Wikipedia that says anything about the players reading each other’s minds (not just utility functions, but also immediate plans, tactics, strategies, etc.) as being part of “complete information games” universally and by definition. Actually, upon closer reading, I think even Wikipedia is unclear here, and probably not saying that the players can read each other’s minds (beyond utility functions). The intro just says “strategies”, which is unclear, but later in the article it says “potential strategies”, suggesting to me that the authors really mean something like “the opponents’ space of possible action sequences” (which entails no extra information beyond the rules of the game), and not “the opponents’ actual current strategies” (which would entail mind-reading).
Yep, a game of complete information is just one in which the structure of the game is known to all players. When Wikipedia says
The utility functions (including risk aversion), payoffs, strategies and “types” of players are thus common knowledge.
it’s an unfortunately ambiguous phrasing but it means
The specific utility function each player has, the specific payoffs each player would get from each possible outcome, the set of possible strategies available to each player, and the set of possible types each player can have (e.g. the set of hands they might be dealt in cards) are common knowledge.
It certainly does not mean that the actual strategies or source code of all players are known to each other player.
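To make that concrete, here’s a minimal toy sketch (my own illustration, with standard-looking prisoner’s-dilemma payoffs assumed): the payoff table is the common-knowledge part; each player’s actual policy function is private and simply isn’t part of the game’s description.

```python
# Toy one-shot prisoner's dilemma. "Complete information" = the payoff table
# below is common knowledge. It does NOT mean either player can see the
# other's policy function.

ACTIONS = ["C", "D"]  # Cooperate / Defect

# Common knowledge: PAYOFFS[(my_action, their_action)] = (my_payoff, their_payoff)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def alice_policy() -> str:
    # Private to Alice: nothing in the game's definition reveals this to Bob.
    return "C"

def bob_policy() -> str:
    # Private to Bob.
    return "D"

a, b = alice_policy(), bob_policy()
print(PAYOFFS[(a, b)])  # (0, 5) -- the outcome depends on the private policies
```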
Well in that case classical game theory doesn’t seem up to the task, since in order to make optimal decisions you’d need a probability distribution over the opponent’s strategies, no?
Right, vanilla game theory is mostly not a tool for making decisions.
It’s about studying the structure of strategic interactions, with the idea that some kind of equilibrium concept should have predictive power about what you’ll see in practice. On the one hand, if you get two humans together and tell them the rules of a matrix game, Nash equilibrium has relatively little predictive power. But there are many situations across biology, computer science, economics and more where various equilibrium concepts have plenty of predictive power.
But doesn’t the calculation of those equilibria require making an assumption about the opponent’s strategy?

The situation is slightly complicated, in the following way. You’re broadly right; source code sharing is new. But the old concept of Nash equilibrium is, I think, sometimes justified like this: We assume that not only do the agents know the game, but they also know each other. They know each other’s beliefs, each other’s beliefs about the other’s beliefs, and so on ad infinitum. Since they know everything, they will know what their opponent will do (which is allowed to be a stochastic policy). Since they know what their opponent will do, they’ll of course (lol) do a causal EU-maxxing best response. Therefore the final pair of strategies must be a Nash equilibrium, i.e. a mutual best-response.
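To spell out the “mutual best-response” bit with the same toy payoffs as the sketch above (my own numbers, nothing canonical): a brute-force check confirms that mutual defection is the only pure-strategy pair in which each action is a best response to the other.

```python
# Nash equilibrium as mutual best response, checked by brute force on the
# one-shot prisoner's dilemma (same toy payoff table as above).

ACTIONS = ["C", "D"]
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def br_row(col_action: str) -> str:
    # Row player's causal EU-maximizing reply to a known column action.
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, col_action)][0])

def br_col(row_action: str) -> str:
    # Column player's best reply to a known row action.
    return max(ACTIONS, key=lambda b: PAYOFFS[(row_action, b)][1])

nash = [(a, b) for a in ACTIONS for b in ACTIONS
        if a == br_row(b) and b == br_col(a)]
print(nash)  # [('D', 'D')] -- the unique pure-strategy mutual best response
```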
This may be what Isaac was thinking of when referring to “common knowledge of everything”.
OSGT then shows that there are code-reading players who play non-Nash strategies and do better than Nashers.
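For anyone who hasn’t seen it, the classic toy example (roughly the Tennenholtz-style program equilibrium; this is my own minimal sketch of it) is a bot that cooperates exactly when the opponent’s source code matches its own. Paired with a copy of itself it gets the cooperative payoff, strictly better than the defect/defect Nash outcome, and a plain defector can’t exploit it:

```python
# Minimal sketch of source-code-reading players in a one-shot prisoner's
# dilemma. Each player is a function that is handed the other player's source
# before choosing "C" or "D". inspect.getsource stands in for "sharing code".
import inspect

def clique_bot(opponent_source: str) -> str:
    # Cooperate exactly when the opponent is running my code; otherwise defect.
    my_source = inspect.getsource(clique_bot)
    return "C" if opponent_source == my_source else "D"

def defect_bot(opponent_source: str) -> str:
    # The "Nasher": defect no matter what it reads.
    return "D"

def play(p1, p2):
    s1, s2 = inspect.getsource(p1), inspect.getsource(p2)
    return p1(s2), p2(s1)

print(play(clique_bot, clique_bot))  # ('C', 'C'): 3 each, beats the (1, 1) Nash outcome
print(play(defect_bot, defect_bot))  # ('D', 'D'): 1 each
print(play(clique_bot, defect_bot))  # ('D', 'D'): the cooperator isn't exploited
```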
This only needs knowledge of each other’s policy, not knowledge of each other’s knowledge, yes?

Yes, but the idea (I think!) is that you can recover the policy from just the beliefs (on the presumption of CDT EU-maxxing). Saying that A does xyz because B is going to do abc is one thing; it builds in some of the fixpoint finding. The common knowledge of beliefs instead says: A does xyz because he believes “B believes that A will do xyz, and therefore B will do abc as the best response”; so A chooses xyz because it’s the best response to abc.
But that’s just one step. Instead you could keep going:
--> A believes that
----> B believes that
------> A believes that
--------> B believes that A will do xyz,
--------> and therefore B will do abc as the best response
------> and therefore A will do xyz as the best response
----> and therefore B will do abc as the best response
so A does xyz as the best response.
And then you go to infinityyyy.
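If it helps, the way I picture that tower is as level-k reasoning: level 0 is some arbitrary conjecture about the opponent, and each level best-responds to the level below. Here’s a quick sketch (same toy payoffs as before); in the prisoner’s dilemma the sequence settles immediately, and whenever it settles, the action it settles on is a best response to itself, which in this symmetric game gives a Nash equilibrium.

```python
# The belief tower above, rendered as level-k best responses on the toy
# prisoner's-dilemma payoffs. Level 0 is a free conjecture; level k
# best-responds to level k-1. A fixed point of this map gives a symmetric
# Nash equilibrium.

ACTIONS = ["C", "D"]
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def best_response(their_action: str) -> str:
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, their_action)][0])

def level_k(k: int, level0_conjecture: str = "C") -> str:
    """Action chosen after unrolling k layers of 'I think that you think that ...'."""
    action = level0_conjecture
    for _ in range(k):
        action = best_response(action)
    return action

print([level_k(k) for k in range(5)])  # ['C', 'D', 'D', 'D', 'D'] -- settles on D
```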
Being able to deduce a policy from beliefs doesn’t mean that common knowledge of beliefs is required.
The common knowledge of policy thing is true but is external to the game. We don’t assume that players in the prisoner’s dilemma know each other’s policies. As part of our analysis of the structure of the game, we might imagine that in practice some sort of iterative responding-to-each-other’s-policy thing will go on, perhaps because players face off regularly (but myopically), and so the policies selected will be optimal wrt each other. But this isn’t really a part of the game, it’s just part of our analysis. And we can analyse games in various different ways, e.g. by considering different equilibrium concepts.
In any case it doesn’t mean that an agent in reality in a prisoner’s dilemma has a crystal ball telling them the other’s policy.
Certainly it’s natural to consider the case where the agents are used to playing against each other, so they have the chance to learn and react to each other’s policies. But a case where they each learn each other’s beliefs doesn’t feel that natural to me—might as well go full OSGT at that point.
Being able to deduce a policy from beliefs doesn’t mean that common knowledge of beliefs is required.
Sure, I didn’t say it was. I’m saying it’s sufficient (given some assumptions), which is interesting.
In any case it doesn’t mean that an agent in reality in a prisoner’s dilemma has a crystal ball telling them the other’s policy.
Sure, who’s saying so?
But a case where they each learn each other’s beliefs doesn’t feel that natural to me
It’s analyzed this way in the literature, and I think it’s kind of natural; how else would you make the game be genuinely perfect information (in the intuitive sense), including the other agent, without just picking a policy?