My preferred interpretation of that particular method is not “the agent has false beliefs,” but instead “the agent cares about both the factual and the counterfactual worlds, and is trying to maximize utility in both at once.” That is, suppose you were to cry:
“But if the humans press the button, the press signal will occur! So why are you acting such that you still get utility in the counterfactual world where humans press the button and the signal fails to occur?”
It will look at you funny and say, “Because I care about that counterfactual world. See? It says so right here in my utility function.” The agent knows the world is counterfactual; it just cares about “what would have happened” anyway. (Causal decision nodes are used to formalize “what would have happened” in the agent’s preferences; this says nothing about whether the agent uses causal reasoning when making decisions.)
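A toy sketch of the distinction, under loudly stated assumptions: the `Action` fields, the payoff numbers, and the equal weighting of the two worlds are all made up for illustration and are not taken from the method under discussion. The point is only that the agent's belief about whether the humans press (`p_press`) stays accurate, while its score adds the payoff in the world where the signal occurs to the payoff in the counterfactual world where it does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    p_press: float            # accurate belief: chance the humans press the button
    value_if_running: float   # hypothetical payoff when no press signal arrives
    value_if_shutdown: float  # hypothetical payoff when the press signal arrives


# Made-up numbers, chosen only to make the contrast visible.
ACTIONS = [
    Action("work_normally", p_press=0.9, value_if_running=10.0, value_if_shutdown=1.0),
    Action("sabotage_button", p_press=0.0, value_if_running=8.0, value_if_shutdown=1.0),
]


def factual_expected_utility(a: Action) -> float:
    # Ordinary expected utility: the signal occurs exactly when the humans press.
    return (1 - a.p_press) * a.value_if_running + a.p_press * a.value_if_shutdown


def two_world_utility(a: Action) -> float:
    # Assumed form: payoff in the world where the signal is forced off, plus
    # payoff in the world where it is forced on. p_press does not enter the
    # score, even though the agent still knows its true value.
    return a.value_if_running + a.value_if_shutdown


def best(score) -> Action:
    return max(ACTIONS, key=score)


naive_choice = best(factual_expected_utility)
two_world_choice = best(two_world_utility)
```

With these numbers, the agent scoring only the factual world prefers to sabotage the button, because lowering `p_press` raises its expected payoff. The agent scoring both worlds gains nothing from changing `p_press`, so it works normally, and if asked it would still report a 0.9 chance of a press.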
This greatly clarified the distinction for me. Well done.
Makes sense.