In INP, any reinforcement learning (RL) algorithm will converge to one-boxing, simply because one-boxing gives it the money. This is despite RL naively looking like CDT.
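To make that concrete, here is a minimal sketch (the bandit-style Q-learner, the 99%-accurate predictor, and the exact payoffs are all illustrative assumptions, not anyone's actual setup): a model-free learner that just averages the rewards it sees ends up one-boxing, because the predictor rewards whatever the agent habitually does.

```python
import random

# Illustrative iterated Newcomb setup: a bandit-style Q-learner against a
# 99%-accurate predictor (both assumptions). Payoffs are the usual $1M / $1k.
ONE_BOX, TWO_BOX = 0, 1
q = [0.0, 0.0]                      # action-value estimates
alpha, eps, accuracy = 0.05, 0.1, 0.99

for t in range(50_000):
    # Epsilon-greedy choice between one-boxing and two-boxing.
    if random.random() < eps:
        action = random.choice([ONE_BOX, TWO_BOX])
    else:
        action = ONE_BOX if q[ONE_BOX] >= q[TWO_BOX] else TWO_BOX

    # The predictor (logically) moves first and matches the agent's
    # actual choice with probability `accuracy`.
    predicted = action if random.random() < accuracy else 1 - action

    # The opaque box holds $1M iff one-boxing was predicted; two-boxing
    # always adds the transparent $1k.
    reward = (1_000_000 if predicted == ONE_BOX else 0) \
             + (1_000 if action == TWO_BOX else 0)

    # Model-free update: pure reward feedback, no causal model of the
    # predictor at all.
    q[action] += alpha * (reward - q[action])

print(q)  # roughly [990_000, 11_000]: the greedy policy one-boxes
```

The learner never represents the predictor causally; it just notices that rounds where it one-boxes pay about $990k on average while rounds where it two-boxes pay about $11k.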
Yup, like Caspar, I think that model-free RL learns the EDT policy in most/all situations. I’m not sure what you mean by it looking like CDT.
In Newcomb’s paradox CDT succeeds but EDT fails. Let’s consider an example where EDT succeeds and CDT fails: the XOR blackmail.
Isn’t it the other way around? The one-boxer gets more money, but gives in to blackmail, and therefore gets blackmailed in the first place.
RL is CDT in the sense that your model of the world consists of actions and observations, with causal links from past actions and observations to current observations, but with no causal origin for the actions themselves. The actions are just set by the agent to whatever it wants.
And, yes, I got CDT and EDT flipped there, good catch!
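For what it’s worth, the CDT/EDT contrast we’re both gesturing at can be written out in a few lines. This is just a sketch; the 0.99 predictor accuracy and the fixed 50/50 prior over the prediction are assumptions made for illustration:

```python
# Sketch of the two decision rules on Newcomb's problem; the 0.99 accuracy
# and the fixed prior over the prediction are illustrative assumptions.
ACC = 0.99
PAYOFF = {  # (action, predicted) -> dollars
    ("one", "one"): 1_000_000, ("one", "two"): 0,
    ("two", "one"): 1_001_000, ("two", "two"): 1_000,
}

def edt_value(action):
    # EDT conditions on the action: an accurate predictor most likely
    # predicted whatever the agent actually does.
    other = "two" if action == "one" else "one"
    return ACC * PAYOFF[(action, action)] + (1 - ACC) * PAYOFF[(action, other)]

def cdt_value(action, p_predicted_one):
    # CDT intervenes on the action node: the prediction's distribution
    # stays fixed, because the action has no causal effect on it.
    return (p_predicted_one * PAYOFF[(action, "one")]
            + (1 - p_predicted_one) * PAYOFF[(action, "two")])

print(edt_value("one"), edt_value("two"))            # 990000.0 11000.0 -> one-box
print(cdt_value("one", 0.5), cdt_value("two", 0.5))  # 500000.0 501000.0 -> two-box
```

The Q-values in the earlier sketch are exactly these EDT conditional expectations, which is the sense in which model-free RL “learns the EDT policy.”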