The CDT agents win here because they do not believe that altering their strategy will change how their opponents behave. In this case that belief is actually correct, and it is correct even for the UDT agents, depending on how you choose to construct the counterfactuals: if a UDT agent suffered a malfunction and defected, it too would do better. In any case, the claimed theorem that UDT agents perform optimally in any universe that can read their minds only by knowing what they would do in hypothetical situations is false, as this example shows.
UDT bots win in scenarios whose initial conditions favor agents that behave sub-optimally (where "sub-optimally" is judged by counterfactuals constructed the way CDT implicitly constructs them). The example above shows that sometimes they are instead punished for that sub-optimal behavior.
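To make the shape of that objection concrete, here is a minimal toy sketch in Python. Everything in it, the agent definitions, the "twin_pd" hypothetical, and the payoff numbers, is my own illustrative assumption rather than the exact example above: a universe that interacts with agents only by querying their hypothetical behavior can still be rigged to punish the UDT-style answer.

```python
# Toy sketch: a universe that "reads minds" only by asking agents what
# they would do in a hypothetical situation, yet rewards the CDT-style
# answer. All names and payoffs here are hypothetical illustrations.

def udt_agent(hypothetical):
    """UDT-flavored agent: cooperates in the twin prisoner's dilemma,
    since from its perspective its policy also controls its twin."""
    if hypothetical == "twin_pd":
        return "cooperate"
    return "defect"

def cdt_agent(hypothetical):
    """CDT-flavored agent: always defects, since changing only its own
    action (holding everyone else fixed) never makes cooperation pay."""
    return "defect"

def universe(agent):
    """Queries the agent's hypothetical behavior and assigns a payoff.
    This particular universe happens to exploit would-be cooperators."""
    if agent("twin_pd") == "defect":
        return 10  # would-be defectors are left alone
    return 0       # would-be cooperators get exploited

for name, agent in [("UDT-style", udt_agent), ("CDT-style", cdt_agent)]:
    print(f"{name} agent payoff: {universe(agent)}")
# UDT-style agent payoff: 0
# CDT-style agent payoff: 10
```

The point is only that the mind-reading interface ("what would you do in situation X?") does not by itself guarantee that the UDT policy comes out ahead; the universe can condition on the answer adversarially.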