Just taking the maximum each time saves you from enumerating 2^16 strategies.
It’s not clear to me that’s the case. If your bot and my bot both receive the same source code for Y, we both determine the same set of potential substrategies Y can use, and each of us has to evaluate every one of them against each of our A_i’s. I make the maximization over all of Y’s substrategies explicit by storing all of the values I obtain, but in order to take the maximum you also have to calculate all of the possible values. (I suppose you could get a bit of computational savings by exploiting the structure of the problem, but that may not generalize to arbitrary games.)
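For concreteness, here is a minimal Python sketch of that point, with made-up strategy names and a stand-in payoff function (none of this is either bot’s actual code). Keeping a running maximum and storing the whole table make exactly the same payoff evaluations; the difference is only bookkeeping.

```python
from itertools import product

# Hypothetical strategy labels; in the real game these would come from
# enumerating Y's substrategies out of its source code.
my_strategies = ["A_1", "A_2", "A_3"]
y_substrategies = ["Y_1", "Y_2", "Y_3", "Y_4"]

def payoff(a, y):
    # Arbitrary deterministic stand-in for the game's payoff table.
    return (7 * ord(a[-1]) + 3 * ord(y[-1])) % 10

# Running-max version: one pass, constant extra memory per A_i.
running_max = {a: max(payoff(a, y) for y in y_substrategies) for a in my_strategies}

# Store-everything version: the same payoff() calls, plus a full table to inspect later.
table = {(a, y): payoff(a, y) for a, y in product(my_strategies, y_substrategies)}
stored_max = {a: max(table[(a, y)] for y in y_substrategies) for a in my_strategies}

assert running_max == stored_max  # identical maxima either way
```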
To expand on this, CDT’s way of avoiding harmful self-reference is to treat its decision as a causally separate node and try out different values for it while changing nothing else on the graph, including things that are copies of its source code.
The basic decision here is what to write as the source code, not the action that our bot outputs, and so CDT is fine: if it modifies the source code for X, that can impact the outputs of both X and Y. There’s no way to modify the output of X without potentially modifying the output of Y in this game, and I don’t see a reason for CDT to mistakenly hallucinate one.
Put another way, I don’t think I would use “causally separate”; I think I would use “unprecedented.” The influence diagram I’m drawing for this has three decision nodes (one for each of the three agents), all unprecedented, whose outputs are the source code for X, Y, and G. All three point to the calculation node for the inference module, and then all three codes and the inference module point to separate calculation nodes for X’s output and Y’s output, both of which point to the value node of Game Outcome. (You could have uncertainty nodes pointing to X’s output and Y’s output separately, but I’m ignoring mixed strategies for now.) A rough encoding of that graph is sketched below.
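This is just node and edge bookkeeping in Python, with names taken from the prose above (it is not any standard influence-diagram library, and the node names are mine):

```python
# Three unprecedented decision nodes: the source code each agent writes.
decision_nodes = ["X_code", "Y_code", "G_code"]
calculation_nodes = ["inference_module", "X_output", "Y_output"]
value_nodes = ["game_outcome"]

edges = [
    # All three source codes feed the inference module.
    ("X_code", "inference_module"),
    ("Y_code", "inference_module"),
    ("G_code", "inference_module"),
    # The three codes plus the inference module determine each bot's output.
    ("X_code", "X_output"), ("Y_code", "X_output"), ("G_code", "X_output"),
    ("inference_module", "X_output"),
    ("X_code", "Y_output"), ("Y_code", "Y_output"), ("G_code", "Y_output"),
    ("inference_module", "Y_output"),
    # Both outputs feed the single value node.
    ("X_output", "game_outcome"),
    ("Y_output", "game_outcome"),
]

all_nodes = set(decision_nodes + calculation_nodes + value_nodes)
assert all(src in all_nodes and dst in all_nodes for src, dst in edges)

# "Unprecedented" here means no edges point into the decision nodes.
assert all(dst not in decision_nodes for _, dst in edges)
```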
So it considers it legitimate to figure out the impact of its present decision on any agent who can see the effects of the action, but not on any agent who can predict the decision.
To the best of my knowledge, this isn’t a feature of CDT: it’s a feature of the embedded physics module used by most CDTers. If we altered Newcomb’s problem so that Omega filled the boxes after the agent made their choice, then CDTers would one-box. So if you have a CDTer who believes that perfect prediction is equivalent to information moving backwards in time (and that that’s possible), then you have a one-boxing CDTer.
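A toy expected-value check for that altered problem, using the usual illustrative $1,000 / $1,000,000 payoffs (the numbers and the function name are mine, not part of any standard formalization):

```python
BOX_A = 1_000       # transparent box, always filled
BOX_B = 1_000_000   # opaque box, filled only if Omega puts money in

def causal_expected_value(action, p_box_b_filled_given_action):
    # With box-filling causally downstream of the choice, even a CDTer
    # conditions the box-B probability on the action itself.
    ev = p_box_b_filled_given_action * BOX_B
    if action == "two-box":
        ev += BOX_A
    return ev

# Omega fills box B after observing the choice, so its contents track the action exactly.
print(causal_expected_value("one-box", 1.0))   # 1000000.0
print(causal_expected_value("two-box", 0.0))   # 1000.0
```

Under that setup one-boxing wins on the purely causal calculation, which is the sense in which the altered problem moves the disagreement off of CDT itself and onto the physics of prediction.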