“why is a causal connection privileged?”
I agree with everything here. What follows is merely history.
Historically, I think that CDT was meant to address the obvious shortcomings of choosing to bring about states that were merely correlated with good outcomes (as in the case of whitening one’s teeth to reduce lung cancer risk). When Pearl advocates CDT, he is mainly advocating acting based on robust connections that will survive the perturbation of the system caused by the action itself. (e.g. Don’t think you’ll cure lung cancer by making your population brush their teeth, because that is a non-robust correlation that will be eliminated once you change the system). The centrality of causality in decision making was obvious intuitively but wasn’t reflected in formal Bayesian decision theory. This was because of the lack of a good formalism linking probability and causality (and some erroneous positivistic scruples against the very idea of causality). Pearl and SGS’s work on causality has done much to address this, but I think there is much to be done.
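The teeth-whitening point can be made concrete with a toy calculation. The sketch below is only illustrative: it assumes a hidden common cause (smoking) that yellows teeth and raises cancer risk, and every number and name in it is hypothetical rather than taken from Pearl or from data. It compares the observed conditional P(cancer | white teeth) with the interventional P(cancer | do(white teeth)), where the intervention cuts the link from smoking to tooth color.

```python
from itertools import product

# Hypothetical parameters for a three-variable model:
# smoker -> yellow teeth, smoker -> cancer, and no arrow from teeth to cancer.
P_SMOKER = 0.3
P_YELLOW_GIVEN_SMOKER = {True: 0.8, False: 0.2}
P_CANCER_GIVEN_SMOKER = {True: 0.2, False: 0.02}


def joint(smoker, yellow, cancer, do_yellow=None):
    """Joint probability of one world; do_yellow forces tooth color if set."""
    p = P_SMOKER if smoker else 1 - P_SMOKER
    if do_yellow is None:
        # Observational regime: tooth color depends on smoking.
        p_y = P_YELLOW_GIVEN_SMOKER[smoker]
        p *= p_y if yellow else 1 - p_y
    else:
        # Interventional regime: tooth color is set from outside.
        p *= 1.0 if yellow == do_yellow else 0.0
    p_c = P_CANCER_GIVEN_SMOKER[smoker]
    p *= p_c if cancer else 1 - p_c
    return p


def p_cancer(yellow, do=False):
    """P(cancer | yellow) if do is False, else P(cancer | do(yellow))."""
    do_yellow = yellow if do else None
    num = sum(joint(s, yellow, True, do_yellow) for s in (True, False))
    den = sum(
        joint(s, yellow, c, do_yellow)
        for s, c in product((True, False), repeat=2)
    )
    return num / den


if __name__ == "__main__":
    print("P(cancer | white)      =", round(p_cancer(False), 4))
    print("P(cancer | yellow)     =", round(p_cancer(True), 4))
    print("P(cancer | do(white))  =", round(p_cancer(False, do=True), 4))
    print("P(cancer | do(yellow)) =", round(p_cancer(True, do=True), 4))
```

Under these assumed numbers, white teeth are correlated with lower cancer risk in the observational regime, but forcing teeth white leaves the risk at its population baseline. That is the sense in which the correlation is non-robust to the action.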
There is a very annoying historical accident where EDT was taken to be the ‘one-boxing’ decision theory. First, any use of probability theory in the NP with an infallible predictor is suspicious, because the problem can be specified in a logically complete way with no room for empirical uncertainty. (This is why dominance reasoning is brought in for CDT. What should the probabilities be?) Second, EDT is not easy to make coherent for an agent who knows they follow EDT. (The action that EDT disfavors will have probability zero, and so the agent cannot condition on it in traditional probability theory.) Third, EDT just barely one-boxes. It doesn’t one-box on Double Transparent Newcomb, nor does it pay on Counterfactual Mugging. It’s also obscure what it does on the PD. (Again, I can play the PD against a selfish clone of myself, with both agents having each other’s source code. There is no empirical uncertainty here, and so applying probability theory immediately raises deep foundational problems.)
If TDT/UDT had come first (including the logical models and deep connections to Gödel’s theorem), the philosophical discussion of NP would have been very different. EDT (which brings very dubious empirical probability distributions into the NP) would not have been considered at all for NP. I don’t see that CDT would have held much interest if its alternative had not been as feeble as EDT.
It is important to understand why economists have done so much work with Nash Equilibria (e.g. on the PD) rather than invent UDT. The explanation is that the assumption of logical correlation and perfect empirical knowledge between agents in the PD does not match practical reality. This doesn’t mean that UDT is irrelevant to practical situations, only that these situations involve many additional elements that may be complex to deal with in UDT. Causally based theories would have been interesting independently, for the reasons noted above concerning robust correlations.
EDIT: I realize that the comment by Paul Christiano sometimes describes UDT as a variant of EDT. When I use the term “EDT” I mean the theory discussed in the philosophy literature, which involves choosing the action that maximizes P(outcomes | action). This is a theory which essentially makes use of vanilla conditional probability. In what I say, I assume that UDT/TDT, despite some similarity to EDT in spirit, are not limited to regular conditioning and do not fail on smoking lesion.
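To make "vanilla conditional probability" concrete, here is a minimal sketch of EDT in that sense, applied to Newcomb's problem. It rests on assumptions that are not in the comment above: a fallible predictor with a stated accuracy, an empirical prior over the agent's own action, and the usual $1,000,000 / $1,000 payoffs. All function names are hypothetical. It also shows the coherence problem mentioned earlier: once an action has probability zero, ordinary conditioning on it is undefined.

```python
from fractions import Fraction

ACTIONS = ("one_box", "two_box")

# Payoffs in dollars, indexed by (action, prediction).
UTILITY = {
    ("one_box", "one_box"): 1_000_000,
    ("one_box", "two_box"): 0,
    ("two_box", "one_box"): 1_001_000,
    ("two_box", "two_box"): 1_000,
}


def newcomb_joint(p_one_box, accuracy):
    """Joint distribution over (action, prediction) for an assumed predictor."""
    joint = {}
    for action, p_a in (("one_box", p_one_box), ("two_box", 1 - p_one_box)):
        for pred in ACTIONS:
            p_pred = accuracy if pred == action else 1 - accuracy
            joint[(action, pred)] = p_a * p_pred
    return joint


def conditional_expected_utility(joint, action):
    """E[U | action] by ordinary conditioning; None when P(action) = 0."""
    p_action = sum(p for (a, _), p in joint.items() if a == action)
    if p_action == 0:
        return None  # conditioning on a zero-probability event is undefined
    return sum(
        p / p_action * UTILITY[(a, pred)]
        for (a, pred), p in joint.items()
        if a == action
    )


def edt_choice(joint):
    """Pick the action with the highest defined conditional expected utility."""
    values = {a: conditional_expected_utility(joint, a) for a in ACTIONS}
    defined = {a: v for a, v in values.items() if v is not None}
    return max(defined, key=defined.get), values


if __name__ == "__main__":
    # Agent uncertain about its own action, predictor assumed 99% accurate.
    print(edt_choice(newcomb_joint(Fraction(1, 2), Fraction(99, 100))))
    # Agent certain it one-boxes: the two-box value cannot be computed.
    print(edt_choice(newcomb_joint(Fraction(1), Fraction(99, 100))))
```

With the assumed 99% accuracy the conditional expected utilities come out to 990,000 for one-boxing and 11,000 for two-boxing, so this version of EDT one-boxes. The second call returns `None` for two-boxing, which is the zero-probability conditioning problem in miniature. The sketch says nothing about the infallible-predictor case, where the probabilities themselves are in question.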