To make decisions, an agent needs to understand the problem: it needs to know what is real and valuable, and therefore what to optimize. Suppose the agent thinks it’s solving one problem while you fool it in a way it can’t perceive, so that its decisions lead to consequences it can’t (and shouldn’t) take into account. Then in a certain sense the agent is acting in a different world (situation): the world it anticipates (values), not the world you are considering it in.
This is also the issue with CDT in Newcomb’s problem: a CDT agent can’t understand the problem, so when we test it, it acts according to its own understanding of the world, which doesn’t match the problem. Similarly, if you explain a reverse Newcomb’s problem to an FDT agent (ensuring that the problem is represented in the agent), so that it knows it needs to act to win in reverse Newcomb’s and not in regular Newcomb’s, then the FDT agent will two-box in the regular Newcomb’s problem, because it values winning in reverse Newcomb’s and doesn’t value winning in regular Newcomb’s.
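To make the last point concrete, here is a rough sketch, not a definitive formalization. It assumes a perfect predictor, the conventional $1,000 / $1,000,000 payoffs, and that "reverse Newcomb’s" means the opaque box is filled exactly when two-boxing is predicted; none of that is spelled out above. The names `payoff` and `best_policy` are made up for illustration, and the agent is modeled simply as picking the policy that does best in the problem it values.

```python
# Conventional Newcomb amounts (an assumption; not stated in the text above).
SMALL, BIG = 1_000, 1_000_000
POLICIES = ("one-box", "two-box")


def payoff(policy: str, problem: str) -> int:
    """Payoff of a fixed policy against a predictor assumed to be perfect."""
    if problem == "regular":
        # Opaque box is filled iff one-boxing is predicted.
        opaque_full = policy == "one-box"
    elif problem == "reverse":
        # Assumed reading of "reverse Newcomb's": filled iff two-boxing is predicted.
        opaque_full = policy == "two-box"
    else:
        raise ValueError(f"unknown problem: {problem}")
    opaque = BIG if opaque_full else 0
    transparent = SMALL if policy == "two-box" else 0
    return opaque + transparent


def best_policy(valued_problem: str) -> str:
    """Policy chosen by an agent that only values winning in `valued_problem`."""
    return max(POLICIES, key=lambda p: payoff(p, valued_problem))


# An agent that values winning in reverse Newcomb's picks its policy for that problem...
policy = best_policy("reverse")
# ...and then carries the same policy into the regular problem we test it in.
result_in_regular = payoff(policy, "regular")
```

Under these assumptions the agent that values reverse Newcomb’s settles on two-boxing, and so collects only the small payoff when we test it in the regular problem. From the outside that looks like losing, but it is the agent winning at the problem it actually cares about.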