This seems related to my speculations about multi-agent alignment. In short, for embedded agents, keeping the complexity of modeling other decision processes tractable requires either a reflexively consistent view of their reactions to modeling my reactions to their reactions, and so on, or a simplification that clearly precludes ideal Bayesian agents. I made the argument much less formally, and haven’t followed the math in the post above. (I hope to have time to go through it more slowly at some point.)
To lay it out here, the basic argument in the paper is that even assuming complete algorithmic transparency, in any reasonably rich action space, even games as simple as poker become completely intractable to solve. Each agent needs to simulate a huge space of possibilities for the decisions of all other agents in order to estimate the probability that each of them is in each potential position. For instance, what is the probability that they are holding a hand much better than mine and betting this way, versus that they are bluffing, versus that they have a hand of roughly comparable strength and are probing for my reaction, and so on? But evaluating this requires evaluating the probability that they assign to my reacting in a given way in each condition, and so on. The regress may not be infinite, because the space of states is finite, as is the computation time, but even in such a simple world it grows too quickly to allow fully Bayesian agents within the computational capacity of, say, the physical universe.
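To give a rough feel for how fast the regress grows, here is a minimal sketch. It is not the model from the paper. It assumes, purely for illustration, that each level of "what they think I think" has to consider every combination of the other agent's hidden hand and available action, so the count multiplies by a fixed branching factor per level. The function names, the three-action set, and the budget parameter are all hypothetical; the only hard number is the 1326 possible two-card starting hands from a 52-card deck.

```python
def nested_evaluations(hidden_states: int, actions: int, depth: int) -> int:
    """Count leaf evaluations for `depth` levels of nested modeling.

    Illustrative assumption: every level considers each combination of
    the other agent's hidden state and action.
    """
    return (hidden_states * actions) ** depth


def max_depth_within_budget(hidden_states: int, actions: int, budget: int) -> int:
    """Largest regress depth whose evaluation count still fits in `budget`."""
    if hidden_states * actions < 2:
        # A branching factor below 2 never outgrows any budget.
        raise ValueError("branching factor must be at least 2")
    depth = 0
    while nested_evaluations(hidden_states, actions, depth + 1) <= budget:
        depth += 1
    return depth


if __name__ == "__main__":
    HANDS = 1326  # two-card starting hands from a 52-card deck: C(52, 2)
    ACTIONS = 3   # assumed coarse action set: fold, call, raise
    for depth in (1, 2, 5, 10):
        print(depth, nested_evaluations(HANDS, ACTIONS, depth))
```

Even this toy count ignores multiple betting rounds, bet sizing, and more than two players, all of which would enlarge the per-level branching factor.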