They do not. Causal diagrams represent causal relationships between variables. Given certain assumptions (like the Markov and Faithfulness properties), a given causal diagram may be consistent or not with various structures of dependence among the variables. Those structures may be representable by DAGs (which in this role are called Bayesian networks), but without the causal interpretation, which is something separate from the statistics, a Bayesian network is not a causal DAG. Neither type of DAG is a representation of the other.
The faithfulness property is necessary for the causal graph to also capture the dependence relationships, but not for it to capture independence relationships.
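To make the faithfulness point concrete, here is a minimal sketch of a toy linear model (the variable names, coefficients, and ±1 noise terms are all made up for illustration). The causal graph has edges X → Y, Y → Z and X → Z, but the direct and indirect effects of X on Z cancel exactly, so X and Z are statistically independent despite being causally connected. The graph is still Markov with respect to the distribution; it is just not faithful to it.

```python
from fractions import Fraction as F
from itertools import product

def cov_x_z(c, a, b):
    """Exact Cov(X, Z) in a toy model (all numbers are illustrative assumptions):
         X, noise_y, noise_z are independent and uniform on {-1, +1}
         Y = c*X + noise_y
         Z = a*Y + b*X + noise_z
    The covariance is computed by enumerating the eight equally likely worlds.
    """
    worlds = list(product((-1, 1), repeat=3))
    weight = F(1, len(worlds))
    mean_x = mean_z = mean_xz = F(0)
    for x, noise_y, noise_z in worlds:
        y = c * x + noise_y
        z = a * y + b * x + noise_z
        mean_x += weight * x
        mean_z += weight * z
        mean_xz += weight * x * z
    return mean_xz - mean_x * mean_z

# Direct effect b = -a*c cancels the path through Y: an unfaithful distribution.
unfaithful = cov_x_z(c=1, a=2, b=-2)
# Drop the direct edge and the dependence shows up again.
faithful = cov_x_z(c=1, a=2, b=0)
```

With the cancelling coefficients, Z reduces to `2*noise_y + noise_z`, which does not involve X at all, so a Bayesian network fitted to the distribution alone would have no reason to draw an edge from X to Z.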
I’m confused about what you mean by the Markov property. I was under the impression that this property is generally required for causal diagrams to be considered true. Looking it up, perhaps you’re talking about correlated error terms? Or is it something else?
I meant the same Markov property as you refer to. Yes, it is generally assumed. You can’t do much with causal diagrams without it. Faithfulness is less assumed than Markov, but both, when made, are explicit assumptions/requirements/axioms/hypotheses/whatever.
I think the post is using “causal graph” to refer to Bayesian networks, since the comments about conditional independence etc. don’t make sense otherwise. Your original point about “where does the EDT-ist get these causal diagrams” is beside the point. Paul’s point is that the EDT-ist has no causal diagrams, but if you imagine that there’s a CDT-ist armed with a Bayes net model which includes the determinants of his own decision, then the EDT-ist will make the same decisions as him using only the conditional dependence structure implied by the Bayes net.
If you’re using a different terminology that’s fine, but the point made in the post is still valid with the Bayesian network interpretation and doesn’t depend on this terminological objection.
I think the post is using “causal graph” to refer to Bayesian networks
That is, the post displays exactly the confusion that Ilya mentioned.
a CDT-ist armed with a Bayes net model which includes the determinants of his own decision
That is stepping outside of what CDT is.
There is a deplorable tendency in these discussions for people to redefine basic terms to mean things different from what the terms were originally coined to mean. “EDT” and “CDT” already mean things. EDT knows nothing of causation and CDT knows nothing of including the deciding agent in the causal graph. This is why EDT fails on the Smoking Lesion and CDT fails on Newcomb. Redefining the two terms to mean the same thing does not change the fact that the decision theories they originally named are not the same thing, any more than writing “London” and “Paris” next to Berlin on a map will make London and Paris the same city in Germany. All it does is degrade the usefulness of the map.
Instead of redefining the terms to mean whatever improved decision theory one comes up with, it would be better to come up with that improved theory and give it a new name. See, for example, TDT, UDT, etc.
That is, the post displays exactly the confusion that Ilya mentioned.
As Paul has pointed out in the comments, the “confusion” in the post amounts to nothing more than a terminological dispute as far as I can see. It’s not a dispute over what CDT or EDT mean; it’s a dispute over what “causal network” means, and as far as I can see it’s irrelevant to the thrust of Paul’s argument.
That is stepping outside of what CDT is.
How? CDT is totally consistent with a situation in which you include yourself in your model. I can have a model (which I can’t compute explicitly) in which my actions are all caused by some inputs, but the algorithm I use to make decisions is “which action gets me the highest expected utility if I condition on that action’s do operator?”
This means I effectively ignore the causal determinants of my own decision when making the decision, but that doesn’t mean my model of the world must be ignorant of them.
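As a rough illustration of the difference between conditioning on an action and conditioning on that action’s do operator, here is a sketch of a Smoking Lesion model in which the lesion causes both smoking and cancer, and smoking has no causal effect on cancer. All the probabilities are invented for the example; only the structure comes from the problem.

```python
from fractions import Fraction as F

# Illustrative numbers only (assumptions, not part of the original problem).
P_LESION = F(3, 10)
P_SMOKE_GIVEN_LESION = {1: F(8, 10), 0: F(2, 10)}
P_CANCER_GIVEN_LESION = {1: F(9, 10), 0: F(1, 10)}  # smoking plays no role

def p_lesion(l):
    return P_LESION if l == 1 else 1 - P_LESION

def p_smoke(s, l):
    p = P_SMOKE_GIVEN_LESION[l]
    return p if s == 1 else 1 - p

def p_cancer_given_smoke(s):
    """P(cancer | smoke = s): ordinary conditioning, so smoking is evidence
    about the lesion and the lesion posterior shifts."""
    joint = {l: p_lesion(l) * p_smoke(s, l) for l in (0, 1)}
    total = sum(joint.values())
    return sum(joint[l] / total * P_CANCER_GIVEN_LESION[l] for l in (0, 1))

def p_cancer_do_smoke(s):
    """P(cancer | do(smoke = s)): the lesion -> smoke arrow is cut, so the
    lesion keeps its prior and the chosen action s carries no information."""
    return sum(p_lesion(l) * P_CANCER_GIVEN_LESION[l] for l in (0, 1))

observed = (p_cancer_given_smoke(1), p_cancer_given_smoke(0))
intervened = (p_cancer_do_smoke(1), p_cancer_do_smoke(0))
```

Note that the model contains the cause of the smoking decision (the lesion) the whole time; the do-calculation simply ignores it when evaluating the action, which is the sense of “ignore” in the comment above.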
EDT knows nothing of causation...
This is Paul’s whole point.
...and CDT knows nothing of including the deciding agent in the causal graph.
I’ve already responded to this above.
This is why EDT fails on the Smoking Lesion...
Paul’s point is that EDT fails on the smoking lesion problem if the EDT-ist neglects to condition on all the facts that they know about the situation. If the EDT-ist correctly conditions on their utility function, they’ll notice that there’s actually no correlation between smoking and lesions among people with that utility function, so they’ll correctly decide to smoke if they think it’s positive expected utility.
Since Paul’s argument re: equivalence between CDT and EDT under his conditions is sound, it really has to be like this. The apparent failure of EDT has to go away once the problem is sufficiently formalized such that the EDT-ist can condition on all inputs to their decision process. However, Paul also says that CDT fails more gracefully than EDT, in the sense that if the EDT-ist neglects to condition on some relevant facts then they can fall into the trap of not smoking in the smoking lesion problem. CDT is more robust to this kind of failure.
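Here is a sketch of that claim under one particular formalization, which is an assumption of this example rather than anything stated in the post: the lesion influences smoking only through a “desire” variable standing in for the agent’s utility function, and the agent can observe that variable. The probabilities are made up. An EDT-ist who conditions on the desire sees no remaining correlation between smoking and cancer; one who neglects to condition on it does.

```python
from fractions import Fraction as F
from itertools import product

def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x."""
    return p if x == 1 else 1 - p

def joint(l, d, s, c):
    """Toy joint distribution (all numbers are illustrative assumptions).
    Structure: lesion -> desire -> smoke, and lesion -> cancer."""
    return (bern(F(3, 10), l)
            * bern(F(9, 10) if l else F(2, 10), d)   # desire depends on lesion
            * bern(F(8, 10) if d else F(1, 10), s)   # smoking depends on desire only
            * bern(F(9, 10) if l else F(1, 10), c))  # cancer depends on lesion only

def p_cancer(**given):
    """P(cancer = 1 | given), by brute-force enumeration of all worlds.
    `given` may fix any of l, d, s, e.g. p_cancer(s=1, d=0)."""
    num = den = F(0)
    for l, d, s, c in product((0, 1), repeat=4):
        world = {"l": l, "d": d, "s": s, "c": c}
        if all(world[k] == v for k, v in given.items()):
            p = joint(l, d, s, c)
            den += p
            if c == 1:
                num += p
    return num / den

# Neglecting the desire: smoking looks like bad news about cancer.
naive_gap = p_cancer(s=1) - p_cancer(s=0)
# Conditioning on the desire: smoking carries no further news.
informed_gaps = [p_cancer(s=1, d=d) - p_cancer(s=0, d=d) for d in (0, 1)]
```

If the lesion could influence the action through some channel the agent cannot observe, the gaps would not vanish, which is one way of reading the remark that EDT fails less gracefully than CDT here.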
Redefining the two terms to mean the same thing does not change the fact that the decision theories they originally named are not the same thing, any more than writing “London” and “Paris” next to Berlin on a map will make London and Paris the same city in Germany. All it does is degrade the usefulness of the map.
Paul doesn’t redefine either EDT or CDT, so I don’t know what you’re talking about here.
Instead of redefining the terms to mean whatever improved decision theory one comes up with, it would be better to come up with that improved theory and give it a new name. See, for example, TDT, UDT, etc.
I agree, but Paul hasn’t come up with an improved decision theory, so I don’t see why he should invent a new label for a new theory that doesn’t exist.
They do. I didn’t say the correspondence is bijective, and I don’t think the post says this either.