OK, I managed to find the comment by Eliezer that you’re probably referring to, here. But what Eliezer says in that comment is that do(.)-based causality cannot be physically fundamental, which sounds right to me. And Pearl agrees with this, insofar as he states (in Causality) that the correspondence between physical causation (Pearl cites the requirement that causes lie in the past light cone of their effects; presumably we should also include the principle of locality, i.e. “no action at a distance”) and statistical causal analysis is a bit of a mystery, and may say more about the way people build models of the world and talk about them than about anything more fundamental.
As for the confusion between Bayesian networks and causal graphs, Pearl deals with that in his book. Even before causal graphs were formally described, a lot of the interest in Bayesian networks (which are represented as directed graphs) was due to folks wanting to do causal analysis on them, if only informally. And indeed, if all we’re interested in is the correlation structure, then we’re not limited to Bayesian networks: we can use other kinds of graphical models, some of which have better properties (such as Markov graphs).
I am suspending judgement about the feedback issue for now, even though I still think it’s important. The point is that you’d need to make the case that causal diagrams can account in a reasonably straightforward way for all relevant uses of SEM (including not just explicit feedback but also equilibrium relationships more generally). Unless this is clearly shown, I don’t think it’s right to call do(.)-based methods a generalization of SEM.
Structural equation models (SEMs) are a special (linear/Gaussian) case of the non-parametric structural model (which uses do(.), or potential outcomes). This is not even an argument we can have; it’s standard math in the field. I don’t know where you learned that this is not the case, but whatever that source, it is wrong.
It’s fairly easy to verify: all non-parametric structural models do is replace the linear mechanism function by an arbitrary function, and the Gaussian noise term by an arbitrary noise term. It’s also easy to derive that causal regression coefficients in an SEM are simply interventional expected value contrasts on the difference scale.
So if we have a linear SEM
y = ax + epsilon, with E[epsilon] = 0,
then E[y | do(x)] = ax, and therefore
a = E[y | do(x = 1)] - E[y | do(x = 0)]
One can also think of the regression coefficient as the partial derivative of the interventional mean with respect to the intervened variable:
a = dE[y | do(x)]/dx
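For concreteness, here is a minimal simulation sketch of that identity (the coefficient value, noise distributions, and sample size are arbitrary choices of mine, just for illustration):

```python
# Linear SEM y = a*x + epsilon; check that the regression coefficient
# coincides with the interventional contrast E[y | do(x=1)] - E[y | do(x=0)].
# x is exogenous here (no confounding), so the observational OLS slope
# recovers a as well.
import numpy as np

rng = np.random.default_rng(0)
a, n = 2.0, 1_000_000

# Observational regime: x comes from its own mechanism, y from y = a*x + epsilon.
x = rng.normal(size=n)
y = a * x + rng.normal(size=n)
slope, _ = np.polyfit(x, y, 1)  # OLS estimate of a

# Interventional regime: x is set by fiat, only the equation for y is used.
def mean_y_do_x(x_value):
    return (a * x_value + rng.normal(size=n)).mean()

contrast = mean_y_do_x(1.0) - mean_y_do_x(0.0)
print(slope, contrast)  # both are approximately 2.0
```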
Cyclic causal models do not require either linearity or Gaussianity, although these assumptions make certain things easier.
Part of the reason I post here is that I love talking about this stuff, and while I think I can learn much from the lesswrong community, I can also contribute my expertise where appropriate. What is disheartening is arguing with non-experts about settled issues. This reminds me of an episode where Judea asked me to change something in the Wikipedia article on Bayesian networks, and I got into an edit war with a resident Wikipedia edit camper. I am sure he was not an expert, because he kept reverting back to a wrong statement (and had more time than me...). I adjusted my overall opinion of Wikipedia’s quality based on that :(.
Arguing with experts on settled issues is a symptom of sloppiness which would be particularly prominent in non-settled issues, though.
You would think so, but I don’t think that’s true. Think about the legions of cranks trying to create perpetual motion machines, settle the P vs. NP question, and so on. Thermodynamics is fairly settled; the difficulty of the P vs. NP question is fairly settled. Crankery is an easy attractor, apparently.
Note: I am not calling anyone in this thread a crank, merely responding to the general point that argument is evidence of an unsettled area. It’s true, but the evidence is surprisingly weak.
No, I meant that if someone gets settled stuff wrong, that’s usually due to sloppiness, and said sloppiness is an utter horror in any less settled area. It’s like repeatedly falling off a bicycle head first with the training wheels on. Without training wheels it’s only worse.
I agree that this is true of structural equation models, taken in a fairly narrow sense. However, econometricians commonly generalize these to simultaneous equation models, which include equations that simply assert an algebraic relationship among variables, with no one variable having the privileged status of being “determined” by, or an “outcome” of, the others. This means that do(.) cannot carry over to such models in a straightforward way. And yes, this is standard practice in econometrics when modeling equilibrium, feasibility constraints, and the like.
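A toy example of what I mean (my own illustration, with made-up coefficients): linear supply and demand curves together with the equilibrium constraint that the two quantities are equal. Price and quantity are determined jointly by solving the system; neither equation singles out one variable as the “outcome” of the other.

```python
# Toy simultaneous equation model: supply and demand plus an equilibrium
# constraint, solved as a system rather than as a chain of
# "outcome = f(causes)" assignments.
import numpy as np

# demand: q = 10 - 1.5*p   ->  q + 1.5*p = 10
# supply: q =  2 + 0.5*p   ->  q - 0.5*p = 2
# equilibrium: the same (p, q) must satisfy both equations at once.
A = np.array([[1.0,  1.5],
              [1.0, -0.5]])
b = np.array([10.0, 2.0])

q, p = np.linalg.solve(A, b)
print(p, q)  # equilibrium price and quantity (here p = 4, q = 4)
```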
This is probably a good read also:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.1408
To the extent that constraints are simply constraints and not a result of causal structure, the model representing them is partly non-causal (so do(.) or some other representation of causation is irrelevant for such constraints). To the extent that constraints represent some consequence of graphical causal structure, I am not aware of a single example where a potential outcome model is not appropriate. Do you have an example in mind?
In some sense, if you have constraints that represent a consequence of causality, such as feedback, and there is no story relating them to interventions/generative mechanisms, then I am not sure in what sense the model is causal. I am not saying it is not possible, but the burden of proof is on whoever proposed the model to explain clearly how causality works in it. There is a lot of confusion in economics, and sometimes even in stats, about causality (Judea is fairly unhappy with the incoherence that many economics textbooks display when discussing causation, actually).