I mostly liked the post. In Pearl’s book, the example of whether smoking causes cancer worked pretty well for me despite being potentially controversial, and was more engaging for being on a controversial topic. Part of that is that he kept his example cleanly hypothetical. Eliezer’s “I didn’t really start believing that the virtue theory of metabolism was wrong” in a footnote, and “as common sense would have it” in the main text, both suggested it was about the real world. I think in Pearl’s example, he may even have made his hypothetical data give the opposite result to the real world.
This post I also thought was more engaging due to the controversial topic, so if you can keep that while reducing the “mind-killer politics” potential I’d encourage that.
I was fine with the model he was falsifying being simple and easily disproved—that’s great for an example.
I’m kind of confused and skeptical about the bit at the end where we’ve ruled out all the models except one. From Pearl’s book I’d somehow picked up that we need to make some causal assumption, that statistical data alone isn’t enough to get all the way from ignorance to knowing the causal model.
Is assuming “causation would imply correlation” and “the model will have only these three variables” enough in this case?
I think in Pearl’s example, he may have even made his hypothetical data give the opposite result to the real world.
He introduces a “hypothetical data set,” works through the math, then follows the conclusion that tar deposits protect against cancer with this paragraph:
The data in Table 3.1 are obviously unrealistic and were deliberately crafted so as to support the genotype theory. However, the purpose of this exercise was to demonstrate how reasonable qualitative assumptions about the workings of mechanisms, coupled with nonexperimental data, can produce precise quantitative assessments of causal effects. In reality, we would expect observational studies involving mediating variables to refute the genotype theory by showing, for example, that the mediating consequences of smoking (such as tar deposits) tend to increase, not decrease, the risk of cancer in smokers and nonsmokers alike. The estimand of (3.29) could then be used for quantifying the causal effect of smoking on cancer.
When I read it, I remember being mildly bothered by the example (why not have a clearly fictional example to match clearly fictional data, or find an actual study and use the real data as an example?) but mostly mollified by his extended disclaimer.
(I feel like pointing out, as another example, the decision analysis class that I took, which had a central example that was repeated and extended throughout the semester. The professor was an active consultant, and could have drawn on a wealth of examples in, say, petroleum exploration. But the example was a girl choosing a location for a party, subject to uncertain weather. Why that? Because it was obviously a toy example. If they tried to use a petroleum example for petroleum engineers, the engineers would be rightly suspicious of any simplified model put in front of them (“you mean this procedure only takes into account two things!?”), and any accurate model would be far too complicated to teach the methodology. An obviously toy example taught the process, and once they understood the process, they were willing to apply it to more complicated situations, which, of course, needed much more complicated models.)
There may also be the assumption that the graph is acyclic.
Some causal models, while not flat-out falsified by the data, are rendered less probable by the fact that the data happens to fit more precise (less connected) causal graphs. A fully connected graph is impossible to falsify, for instance, since it can explain any data.
Among all graphs that explain the fictional data here, there is only one that has only two edges. That’s the most probable one.
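This claim is easy to check by brute force. Here is a minimal sketch (the variable names A, B, C and the independence pattern “only A ⊥ B holds unconditionally” are stand-ins for the post’s three variables and its fictional data, not taken from the original): enumerate every DAG on three nodes, compute the conditional independencies each implies via d-separation, and keep the graphs that exactly match the observed pattern.

```python
from itertools import combinations, product

NODES = ("A", "B", "C")

def is_acyclic(edges):
    # DFS reachability check: a cycle exists iff for some edge
    # (u, v), v can reach u back through the graph.
    def reaches(u, v, seen=()):
        if u == v:
            return True
        return any(reaches(w, v, seen + (u,))
                   for (s, w) in edges if s == u and w not in seen)
    return not any(reaches(v, u) for (u, v) in edges)

def d_separated(x, y, given, edges):
    # On three nodes the only routes from x to y are the direct
    # edge and the single two-step path through the third node.
    if (x, y) in edges or (y, x) in edges:
        return False
    (z,) = set(NODES) - {x, y}
    if not ((x, z) in edges or (z, x) in edges) or \
       not ((y, z) in edges or (z, y) in edges):
        return True  # no path at all
    if (x, z) in edges and (y, z) in edges:
        # z is a collider; it blocks unless conditioned on
        # (on three nodes a collider has no other descendants).
        return z not in given
    return z in given  # chain or fork: blocked by conditioning on z

def implied_independencies(edges):
    stmts = set()
    for x, y in combinations(NODES, 2):
        (z,) = set(NODES) - {x, y}
        for given in ((), (z,)):
            if d_separated(x, y, given, edges):
                stmts.add((x, y, given))
    return stmts

def all_dags():
    # Each unordered pair carries no edge or one of two orientations.
    options = [((), ((u, v),), ((v, u),))
               for (u, v) in combinations(NODES, 2)]
    for choice in product(*options):
        edges = tuple(e for c in choice for e in c)
        if is_acyclic(edges):
            yield edges

# The fictional data's pattern: A and B are independent, and no
# other (conditional) independence holds.
observed = {("A", "B", ())}

fits = [e for e in all_dags() if implied_independencies(e) == observed]
print(fits)  # only the collider A -> C <- B survives
```

The exact-match test is the faithfulness assumption: every independence the graph implies shows up in the data and vice versa. The fully connected graphs survive falsification by any data (they imply no independencies at all), but under faithfulness they don’t match this pattern, and among the graphs that do, the two-edge collider is unique.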