The agent I’m talking about is separate from your physics-based world; it’s the kind of agent from toy setups like Robust Cooperation in the Prisoner’s Dilemma. If it can reason about statements like “If my algorithm returns that I cooperate, then I get 3 utility”, then there may be some statement p for which it can prove “If my algorithm returns that I cooperate, then this strange hypothetical physics-based world has property p” but not “This strange hypothetical physics-based world has property p”. That gap would indicate that the strange world contains agents about which the premise “my algorithm returns that I cooperate” is useful, so we can use modal combatants as agent detectors.
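As a rough formalization (notation mine, not taken from the paper): write A() = C for “my algorithm returns that I cooperate” and ⊢ for provability in whatever proof system the agent reasons with. The detection condition on the hypothetical world is then:

```latex
\exists p:\quad \vdash \bigl(A() = C \;\rightarrow\; p\bigr)
\quad\text{and}\quad \nvdash p
```

If p were provable unconditionally, the premise about my output would be doing no work; it’s the asymmetry between the two provability facts that flags something in the world as logically entangled with my decision.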