So I’ve read Pearl’s The Book of Why and although it is really well written I don’t understand some things.
Say we have two variables, and variable X could ‘listen’ to variable Y, Y--->X. But we don’t know whether the effect is qualitative or quantitative. I would have appreciated it if the book included a case study or two on how people plan their studies around this thing.
For example, we want to know what features of an experimental system can influence the readout of our measuring equipment. Say, Y (feature) is the variety of fungi species inhabiting the root system of a plant, and X is the % of cases in which we register specific mycorrhizal structures on slides we view through a microscope (readout). And our ‘measuring equipment’ is a staining/viewing procedure.
Conceivably, if there are several species of fungi present, the mycorrhizal one(s) might form fewer (or more numerous) specific structures. This would be what I mean by a quantitative effect. Also conceivably, only some species or combinations of them have this effect on X. This would be qualitative.
Measuring both Y and X on the same root is more or less impossible, since you either stain a root or try to obtain a mycorrhizal culture from it (which is expensive).· Even if we do try out some number of combinations of fungal inoculum, who knows how it compares against the diversity in the wild.
So… does this mean that we should split Y into Y-->Y1-->X and Y-->Y2-->X… or what?
· We are not considering the stains that might allow both.
The first way to treat this in the DAG paradigm that comes to mind is to read the “quantitative” question as a question about the size of a causal effect, given a hypothesized diagram species→mycorrhizal structure prevalence.
On the other hand, the “qualitative” question can be framed in two ways, I think. In the first, the question is about which DAG best describes reality, given a choice of different DAGs that represent different sets of species having an effect. But in principle, we could also just construct a larger graph with all possible species as Ys having arrows pointing to $ X $ and try to infer all the different effects jointly, translating the qualitative question into a quantitative one. (The species that don’t affect $ X $ will just have a causal effect of $ 0 $ on $ X $.)
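To make the “larger graph” idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data are simulated, the four species and their effect sizes are made up, each species is inoculated by an independent coin flip (so the regression coefficients can be read causally), and the effects are taken to be additive with no interactions between species. The function names `simulate_inoculation` and `estimate_effects` are hypothetical.

```python
import numpy as np

def simulate_inoculation(n_roots, true_effects, baseline=0.4, noise_sd=0.05, seed=0):
    """Simulate a randomized inoculation experiment (hypothetical data).

    Each column of `presence` is one species (1 = inoculated, 0 = not).
    `readout` plays the role of X: the prevalence of mycorrhizal structures.
    """
    rng = np.random.default_rng(seed)
    n_species = len(true_effects)
    # Randomization: presence of each species is an independent coin flip.
    presence = rng.integers(0, 2, size=(n_roots, n_species))
    noise = rng.normal(0.0, noise_sd, n_roots)
    readout = baseline + presence @ np.asarray(true_effects) + noise
    return presence, readout

def estimate_effects(presence, readout):
    """Jointly estimate one additive effect per species by least squares."""
    design = np.column_stack([np.ones(len(readout)), presence])
    coef, *_ = np.linalg.lstsq(design, readout, rcond=None)
    return coef[0], coef[1:]  # (baseline, per-species effects)

# Made-up ground truth: species 2 and 4 have no effect on X at all.
true_effects = [0.15, 0.0, -0.10, 0.0]
presence, readout = simulate_inoculation(2000, true_effects)
baseline_hat, effects_hat = estimate_effects(presence, readout)

for k, (truth, est) in enumerate(zip(true_effects, effects_hat), start=1):
    print(f"species {k}: true effect {truth:+.2f}, estimated {est:+.3f}")
```

The species with no arrow into $ X $ simply come out with estimates near $ 0 $, which is the sense in which the qualitative question turns into a quantitative one. If only certain combinations of species matter, the additive model would miss that, and interaction columns (products of presence indicators) would have to be added to the design matrix.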
To your point about diversity in the wild, in theoretical causality, our ability to generalize depends on 1) the structure of the DAG and 2) our level of knowledge of the underlying mechanisms. If we only have a black-box understanding of the graph structure and the size of the average effects (that is, $ P(X \mid \text{do}(\mathbf{Y})) $ in your notation, where Y is the cause), then there exist [certain situations](https://ftp.cs.ucla.edu/pub/stat_ser/r372-a.pdf) in which we can “transport” our results from the lab to other situations. If we actually know the underlying mechanisms (the structural causal model equations in causal DAG terminology), then we can potentially apply our results even outside of the situations in which our graph structure and known quantities are “transportable”.
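As a rough illustration of the simplest “transportable” case: suppose the lab and the wild differ only in how common some measured background condition $ Z $ is, while the effect of the fungal feature on the readout within each level of $ Z $ is the same in both places. Then the lab results can be re-weighted by the wild distribution of $ Z $, $ P^*(x \mid \text{do}(y)) = \sum_z P(x \mid \text{do}(y), z)\, P^*(z) $. The sketch below assumes exactly that; the covariate (soil type), all numbers, and the function name `transport_mean` are hypothetical.

```python
def transport_mean(stratum_means, target_weights):
    """Re-weight stratum-specific interventional means by a target population.

    stratum_means:  E[X | do(y), z] estimated in the lab, keyed by z.
    target_weights: P*(z) in the target (wild) population, keyed by z.
    """
    if abs(sum(target_weights.values()) - 1.0) > 1e-9:
        raise ValueError("target weights must sum to 1")
    return sum(stratum_means[z] * w for z, w in target_weights.items())

# Hypothetical lab estimates of the readout X under each intervention on Y,
# split by a made-up background condition Z (soil type).
lab_means_inoculated = {"sandy": 0.60, "clay": 0.30}
lab_means_control = {"sandy": 0.40, "clay": 0.25}

# Hypothetical distributions of Z in the lab and in the wild.
lab_weights = {"sandy": 0.5, "clay": 0.5}
wild_weights = {"sandy": 0.2, "clay": 0.8}

lab_effect = (transport_mean(lab_means_inoculated, lab_weights)
              - transport_mean(lab_means_control, lab_weights))
wild_effect = (transport_mean(lab_means_inoculated, wild_weights)
               - transport_mean(lab_means_control, wild_weights))

print(f"average effect in the lab:          {lab_effect:.3f}")
print(f"average effect transported to wild: {wild_effect:.3f}")
```

The two numbers differ because the wild population is weighted toward the stratum where the effect is smaller. If the wild differs from the lab in ways that are not captured by a measured $ Z $, this re-weighting is not justified, and that is where knowledge of the mechanisms would have to come in.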
Thank you. It looks even more unfeasible than I thought (given the number of species of mycorrhizal and other root-inhabiting fungi); I’ll have to just explicitly assume that Y does not have an effect on X, in a given root system from the wild. At least things seem much cheaper to do now)))