I think some people may have their pet theories which they call CDT and which require randomization. But CDT as it is usually/traditionally described doesn’t ever insist on randomizing (unless randomizing has a positive causal effect). In this particular case, even if a randomization device were made available, CDT would either uniquely favor one of the boxes or be indifferent between all distributions over {buy box B1, buy box B2}. Compare Section IV.1 of the paper.
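To make this concrete, here’s a quick sketch with illustrative numbers that aren’t specified in this thread (assume each box costs $1 and contains $3 if the seller filled it). It just shows that once CDT’s credences about the contents are held fixed, its expected utility is linear in the mixture, so it either uniquely favors one purchase or is indifferent between all mixtures:

```python
# Toy sketch: how vanilla CDT ranks mixtures over {buy B1, buy B2} once its
# credences about the box contents are held fixed. Payoffs ($1 price, $3 prize)
# are illustrative assumptions, not taken from the paper.

def cdt_eu(p_buy_b1, cred_b1_filled, cred_b2_filled, price=1.0, prize=3.0):
    """Expected causal utility of the mixture (p, 1-p) over {buy B1, buy B2},
    holding the credences about the contents fixed."""
    eu_b1 = cred_b1_filled * prize - price
    eu_b2 = cred_b2_filled * prize - price
    return p_buy_b1 * eu_b1 + (1 - p_buy_b1) * eu_b2

# Linear in p: unless the two credences coincide, a pure purchase is uniquely
# optimal; if they coincide, every mixture is tied.
for cred in [(0.8, 0.2), (0.5, 0.5)]:
    print([round(cdt_eu(p, *cred), 2) for p in (0.0, 0.5, 1.0)])
# -> [-0.4, 0.5, 1.4]   (uniquely favors buying B1)
# -> [0.5, 0.5, 0.5]    (indifferent between all mixtures)
```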
What you’re referring to are probably so-called ratificationist variants of CDT. These would indeed require randomizing 50-50 between the two boxes. But one can easily construct scenarios which trip these theories up. For example, the seller could put no money in any box if she predicts that the buyer will randomize. Then no distribution is ratifiable. See Section IV.4 for a discussion of Ratificationism.
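To see why nothing is ratifiable in that modified scenario, here’s a small sketch. The specifics are my assumptions rather than anything stated in this thread: each box costs $1 and contains $3 if filled, the buyer may also decline to buy, and the seller fills exactly the boxes she predicts will not be bought (filling none if she predicts randomization). Checking a grid of distributions, none is a best response to the contents it induces:

```python
# Sketch of the modified Adversarial Offer described above. The payoffs and the
# exact seller rule are illustrative assumptions, not quoted from the paper.
import itertools

PRICE, PRIZE = 1.0, 3.0

def contents(p):
    """Box contents ($ in B1, $ in B2) given the seller's accurate prediction of p."""
    if sum(q > 0 for q in p) > 1:      # predicted to randomize -> no box is filled
        return (0.0, 0.0)
    if p[0] == 1.0:                    # predicted to buy B1 -> only B2 is filled
        return (0.0, PRIZE)
    if p[1] == 1.0:                    # predicted to buy B2 -> only B1 is filled
        return (PRIZE, 0.0)
    return (PRIZE, PRIZE)              # predicted to buy nothing -> both are filled

def utilities(p):
    c1, c2 = contents(p)
    return [c1 - PRICE, c2 - PRICE, 0.0]   # causal expected utility of each pure action

def ratifiable(p, tol=1e-9):
    """p is ratifiable iff every action in its support is optimal given the contents p induces."""
    u = utilities(p)
    return all(u[i] >= max(u) - tol for i in range(3) if p[i] > 0)

# Grid over distributions on {buy B1, buy B2, buy nothing}: none is ratifiable.
grid = [k / 10 for k in range(11)]
dists = [(a, b, round(1 - a - b, 10)) for a, b in itertools.product(grid, grid) if a + b <= 1]
print(any(ratifiable(p) for p in dists))   # -> False
```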
>For example, the seller could put no money in any box if she predicts that the buyer will randomize.
This is a bit unsatisfying, because in my view of decision theory you don’t get to predict things like “the agent will randomize” or “the agent will take one box but feel a little wistful about it” and so on. This is unfair in the same way that predicting “the agent will use UDT” and punishing the agent for it is unfair. No, you just predict the agent’s output. Or, if the agent can randomize, you can sample (as many times as you like, but finitely many) from the distribution of the agent’s output. A bit more on this here, though the post got little attention.
Can your argument be extended to this case?
>in my view of decision theory you don’t get to predict things like “the agent will randomize”
Why not? You surely agree that sometimes people can in fact predict such things. So your objection must be that it’s unfair when they do, and that it’s not a strike against a decision theory if it causes you to get money-pumped in those situations. Well… why? Seems pretty bad to me, especially since some extremely high-stakes real-world situations our AIs might face will be of this type.
Sure, and sometimes people can predict things like “the agent will use UDT” and use that to punish the agent. But this kind of prediction is “unfair” because it doesn’t lead to an interesting decision theory—you can punish any decision theory that way. So to me the boundaries of “fair” and “unfair” are also partly about mathematical taste and promising-ness, not just what will lead to a better tank and such.
Right, that kind of prediction is unfair because it doesn’t lead to an interesting decision theory… but I asked why you don’t get to predict things like “the agent will randomize.” All sorts of interesting decision theory come out of considering situations where you do get to predict such things. (Besides, such situations are important in real life.)
I might suggest “not interesting” rather than “not fair” as the complaint. One can imagine an Omega that leaves the box empty if the player is unpredictable, or if the player doesn’t rigorously follow CDT, or one that just always leaves it empty regardless. But there’s no intuition pump that such a setup drives, and no analysis of why a formalization would or wouldn’t get the right answer.
When I’m in challenge-the-hypothetical mode, I defend CDT by making the agent believe Omega cheats: it’s a trick box that changes its contents AFTER the agent chooses but BEFORE the contents are revealed. To any rational agent, this is much more probable than mind-reading or extreme predictability.
On the more philosophical points. My position is perhaps similar to Daniel K’s. But anyway...
Of course, I agree that problems that punish the agent for using a particular theory (or for using float multiplication, or for feeling a little wistful, or stuff like that) are “unfair”/“don’t lead to interesting theory”. (Perhaps more precisely, I don’t think our theory needs to give algorithms that perform optimally in such problems in the way I want my theory to “perform optimally” in Newcomb’s problem. Maybe we should still expect our theory to say something about them, in the way that causal decision theorists feel that CDT has interesting/important/correct things to say about Newcomb’s problem, despite Newcomb’s problem being designed to (unfairly, as they allege) reward non-CDT agents.)
But I don’t think these are particularly similar to problems with predictions of the agent’s distribution over actions. The distribution over actions is behavioral, whereas performing floating-point operations or whatever is not. When randomization is allowed, the subject of your choice is which distribution over actions you play. So to me, the question of which distribution over actions you choose when randomization is allowed is just like the question of which action you take when randomization is not allowed. (Of course, if you randomize to determine which action’s expected utility to calculate first, but this doesn’t affect what you do in the end, then I’m fine with not allowing this to affect your utility, because it isn’t behavioral.)
I also don’t think this leads to uninteresting decision theory. But I don’t know how to argue for this here, other than by saying that CDT, EDT, UDT, etc. don’t really care whether they choose from/rank a set of distributions or a set of three discrete actions. I think ratificationism-type concepts are the only ones that break when allowing discontinuous dependence on the chosen distribution and I don’t find these very plausible anyway.
To be honest, I don’t understand the arguments against predicting distributions and predicting actions that you give in that post. I’ll write a comment about this on that post.
Let’s start with the technical question:
>Can your argument be extended to this case?
No, I don’t think so. Consider the following class of problems: the agent can pick any distribution over actions, and the final payoff is determined only as a function of the implemented action and some finite number of samples generated by Omega from that distribution. Note that the expected payoff is continuous in the chosen distribution. It can therefore be shown (using, e.g., Kakutani’s fixed-point theorem) that there is always at least one ratifiable distribution. See Theorem 3 at https://users.cs.duke.edu/~ocaspar/NDPRL.pdf .
(Note that the above assumes the agent maximizes expected vNM utility. If, e.g., the agent maximizes some lexical utility function, then the predictor can just take, say, two samples and, if they differ, impose a punishment of higher lexical priority than the other rewards in the problem.)
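For a concrete instance of this class (a toy example of my own, not taken from the paper or the linked notes): Omega draws a single sample from the declared distribution, and the implemented action earns +1 if it differs from the sample and -1 if it matches. The expected payoff is continuous in the declared distribution, and a grid search confirms that a ratifiable distribution exists, namely the 50-50 mixture:

```python
# Toy instance of the sampling class above: Omega predicts only by drawing one
# sample from the declared distribution; payoff is +1 if the implemented action
# differs from the sample, -1 if it matches. The setup is illustrative, not from
# the paper.

def eu(q0, p0):
    """Expected payoff of implementing Pr(action 0) = q0 while Omega samples from
    the declared Pr(action 0) = p0."""
    match = q0 * p0 + (1 - q0) * (1 - p0)   # probability the implemented action equals the sample
    return (1 - match) - match

def ratifiable(p0, tol=1e-9):
    """p0 is ratifiable iff no alternative distribution does strictly better against it."""
    best = max(eu(k / 100, p0) for k in range(101))
    return eu(p0, p0) >= best - tol

print([k / 100 for k in range(101) if ratifiable(k / 100)])   # -> [0.5]
```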
Thanks! That’s what I wanted to know. Will reply to the philosophical stuff in the comments to the other post.