Nicolas Macé comments on Implementing Decision Theory

Nicolas Macé 9 Nov 2023 8:58 UTC
4 points
Looks very interesting, I’ll make sure to check out the git repo! Thanks for developing that!
As you’re perhaps already aware, Everitt, Leike & Hutter (2015) comes with a Jupyter notebook that implements EDT and CDT in sequential decision problems. It may be useful as a comparison or a source of inspiration.
> My view of decision theory now is that it’s all about fixpoints. You solve some big equation, and inside of it is the same equation, and there are multiple fixpoint solutions, and you pick the (hopefully unique) best one.
Would you say that this is similar to the connection that exists between fixed points and Nash equilibria?
- justinpombrio 10 Nov 2023 23:12 UTC
  2 points
  I was not aware of Everitt, Leike & Hutter 2015, thank you for the reference! I only delved into decision theory a few weeks ago, so I haven’t read that much yet.
  
  > Would you say that this is similar to the connection that exists between fixed points and Nash equilibria?
  
  Nash equilibria come from the fact that your action depends on your opponent’s action, which depends on your action. When you assume that each player will greedily change their action if it improves their utility, the Nash equilibria are the fixpoints at which no player changes their action.
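
A minimal sketch of that fixpoint view, assuming a hypothetical 2x2 coordination game (the payoffs and the names `PAYOFFS`, `best_response`, `greedy_step`, and `pure_nash_equilibria` are made up for illustration and are not from the repo under discussion). It only looks for pure-strategy equilibria:

```python
from itertools import product

# Hypothetical payoffs for a 2x2 coordination game (an assumption, not from the thread).
# PAYOFFS[(a, b)] = (utility of player 0, utility of player 1).
ACTIONS = ["left", "right"]
PAYOFFS = {
    ("left", "left"): (2, 2),
    ("left", "right"): (0, 0),
    ("right", "left"): (0, 0),
    ("right", "right"): (1, 1),
}


def best_response(player, other_action):
    """Action maximizing `player`'s utility while the other player's action is held fixed."""
    def utility(action):
        profile = (action, other_action) if player == 0 else (other_action, action)
        return PAYOFFS[profile][player]
    return max(ACTIONS, key=utility)


def greedy_step(profile):
    """Both players simultaneously switch to their best response."""
    a, b = profile
    return (best_response(0, b), best_response(1, a))


def pure_nash_equilibria():
    """Action profiles that the greedy update leaves unchanged."""
    return [p for p in product(ACTIONS, ACTIONS) if greedy_step(p) == p]


print(pure_nash_equilibria())
```

Because `max` breaks ties by picking the first action, this check could miss equilibria in games where a player is indifferent between actions; the payoffs above have no ties, so the issue does not arise here.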
  
  In single-agent decision theory problems, your (best) action depends on the situation you’re in, which depends on what someone predicted your action would be, which (effectively) depends on your action.
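
As a rough illustration of this loop, here is a sketch that assumes a Newcomb-style problem with a perfect predictor. The thread does not name a specific problem, so the payoffs and the functions `payoff`, `predict`, `consistent_outcomes`, and `best_action` are assumptions chosen for the example:

```python
ACTIONS = ["one-box", "two-box"]


def payoff(action, prediction):
    """Hypothetical Newcomb payoffs: the opaque box is filled only if one-boxing was predicted."""
    opaque = 1_000_000 if prediction == "one-box" else 0
    transparent = 1_000 if action == "two-box" else 0
    return opaque + transparent


def predict(action):
    """Assumed perfect predictor: the prediction equals the action actually taken."""
    return action


def consistent_outcomes():
    """Payoff of each action in the self-consistent situation where the prediction matches it."""
    return {action: payoff(action, predict(action)) for action in ACTIONS}


def best_action():
    """Pick the self-consistent (action, prediction) pair with the highest payoff."""
    outcomes = consistent_outcomes()
    return max(outcomes, key=outcomes.get)


print(consistent_outcomes())
print(best_action())
```

Only the pairs where the prediction agrees with the action count as consistent situations, so the agent compares those two outcomes rather than all four cells of the payoff table.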
  
  If there’s a deeper connection than this, I don’t know it. There’s a fundamental difference between the two cases, I think, because a Nash equilibrium involves multiple agents that don’t know each other’s decision processes (problem statement: maximize the outputs of two functions independently), while single-agent decision theory involves just one agent (problem statement: maximize the output of one function).
  - Nicolas Macé 11 Nov 2023 17:38 UTC
    2 points
    I’d say that the connection is this: single-agent problems with predictors can be interpreted as sequential two-player games in which the (perfect) predictor is a player who observes the action of the decision-maker and best-responds to it. In game-theoretic jargon, the predictor is a Stackelberg follower, and the decision-maker is the Stackelberg leader. (Related: Kovarik, Oesterheld & Conitzer 2023.)
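
One way to sketch this reading, under a strong assumption: the predictor is given a hypothetical utility of 1 when its prediction matches the action and 0 otherwise. Neither that utility nor the Newcomb-style payoffs below come from the thread or the cited paper; they are placeholders for illustration.

```python
ACTIONS = ["one-box", "two-box"]


def leader_utility(action, prediction):
    """Hypothetical Newcomb-style payoff for the decision-maker (the Stackelberg leader)."""
    opaque = 1_000_000 if prediction == "one-box" else 0
    transparent = 1_000 if action == "two-box" else 0
    return opaque + transparent


def follower_utility(action, prediction):
    """Assumed predictor utility: rewarded only for predicting correctly."""
    return 1 if prediction == action else 0


def follower_best_response(action):
    """The follower observes the leader's action and picks its utility-maximizing prediction."""
    return max(ACTIONS, key=lambda prediction: follower_utility(action, prediction))


def stackelberg_solution():
    """The leader picks the action that is best given that the follower will best-respond."""
    leader_action = max(
        ACTIONS,
        key=lambda action: leader_utility(action, follower_best_response(action)),
    )
    return leader_action, follower_best_response(leader_action)


print(stackelberg_solution())
```

With this particular follower utility, best-responding coincides with predicting perfectly, so the leader's problem reduces to comparing the two outcomes in which the prediction is correct.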
    - justinpombrio 12 Nov 2023 0:22 UTC
      2 points
      What’s the utility function of the predictor? Is there necessarily a utility function for the predictor such that the predictor’s behavior (which is arbitrary) corresponds to maximizing its own utility? (Perhaps this is mentioned in the paper, which I’ll look at.)
      
      EDIT: do you mean to reduce a 2-player game to a single-agent decision problem, instead of vice-versa?
      - Nicolas Macé 20 Nov 2023 13:31 UTC
        1 point
        [Apologies for the delay]
        > Is there necessarily a utility function for the predictor such that the predictor’s behavior (which is arbitrary) corresponds to maximizing its own utility?
        You’re right, the predictor’s behavior might not be compatible with utility maximization against any beliefs. I guess we’re often interested in cases where we can think of the predictor as an agent. The predictor’s behavior might be irrational in the restrictive sense above,^[1] but to the extent that we think of it as an agent, my guess is that we can still get away with using a game-theoretic approach.
        ^[1] For instance, if the predictor is unaware of some crucial hypothesis, or applies mild optimization rather than expected value maximization.