What does a more sophisticated version of EDT, taking the above observations into account, look like? I don’t know. I suspect that it looks like some version of TDT / UDT.
When I suggested this in the post of mine that you referenced, benelloitt pointed out that it fails the transparent-box variant of Newcomb’s problem, where you can see the contents of the boxes, and Omega makes his decision based on what he predicts you would do if you saw $1 million in box A. I don’t see an obvious way to rescue EDT in that scenario.
Again, I think it’s difficult to claim that EDT does a particular thing in a particular scenario. An EDT agent who has a prior over causal networks with logical nodes describing the environment (including itself) and who updates this prior by acquiring information may approximate a TDT agent as it collects more information about the environment and its posterior becomes concentrated at the “true” causal network.
I’m not sure what you mean. Can you give an example of a probability distribution over causal networks that could be believed by an EDT agent in the transparent Newcomb’s problem, such that the agent would one-box? Or at least give a plausibility argument for the existence of such a probability distribution?
Maybe it’s better not to talk about causal networks. Let’s use an AIXI-like setup instead. The EDT agent starts with a Solomonoff prior over all computable functions that Omega could be. Part of the setup of Newcomb’s problem is that Omega convinces you that it’s a very good predictor, so some series of trials takes place in which the EDT agent updates its prior over what Omega is. The posterior will be concentrated at computable functions that are very good predictors. The EDT agent then reasons that if it two-boxes then Omega will predict this and it won’t get a good payoff, so it one-boxes.
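A toy sketch of that last step, with numbers I’m making up for illustration (the $1,000 in box B, the 0.99 predictor accuracy, and the function name are my own assumptions, not part of the problem as you stated it):

```python
# Toy sketch: EDT's expected-value calculation in standard Newcomb's
# problem, assuming the agent's posterior says Omega predicts its
# action correctly with probability p_correct.

def edt_expected_value(action, p_correct=0.99,
                       box_a=1_000_000, box_b=1_000):
    """Expected payoff of `action` under EDT-style conditioning:
    P(box A is full | action) follows from Omega's predictive accuracy."""
    if action == "one-box":
        # Omega most likely predicted one-boxing and filled box A.
        return p_correct * box_a
    else:  # "two-box"
        # Omega most likely predicted two-boxing and left box A empty.
        return p_correct * box_b + (1 - p_correct) * (box_a + box_b)

for a in ("one-box", "two-box"):
    print(a, edt_expected_value(a))
# one-box ~ 990,000 > two-box ~ 11,000, so the EDT agent one-boxes.
```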
But in the transparent-box variant, the EDT agent knows exactly how much money is in box A before making its decision, so its beliefs about the contents of box A do not change when it updates on its counterfactual decision.
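To make the contrast explicit, here is the same kind of sketch for the transparent variant (again with an assumed $1,000 in box B and a made-up function name): once box A’s contents are observed, conditioning on the action no longer moves the agent’s beliefs about box A, so two-boxing comes out ahead by exactly the value of box B.

```python
# Toy sketch: in the transparent variant the agent already sees box A's
# contents, so P(box A full | action) is 0 or 1 regardless of the action;
# the action only determines whether box B is added.

def edt_expected_value_transparent(action, box_a_observed, box_b=1_000):
    """Box A's contents are known, so conditioning on the action
    changes nothing about box A; it only adds or omits box B."""
    if action == "one-box":
        return box_a_observed
    else:  # "two-box"
        return box_a_observed + box_b

for seen in (1_000_000, 0):
    for a in ("one-box", "two-box"):
        print(seen, a, edt_expected_value_transparent(a, seen))
# Whatever it sees, two-boxing is worth exactly box_b more,
# which is why a naive EDT agent two-boxes here.
```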
Ah. I guess we’re not allowing EDT to make precommitments?
If you want to change what you want, then you’ve decided that your first-order preferences were bad. EDT recognizing that it can replace itself with a better decision theory is not the same as EDT getting the answer right; the thing that makes the decision is not EDT anymore.
We don’t usually let decision theories make precommitments. That’s why CDT fails Newcomb’s problem. I think CDT and EDT both converge to something like TDT/UDT when allowed to precommit as far in advance as desirable.