A few minor comments. Regarding I, it’s known that the direction (or absence) of an arrow in a generic two-node causal graph is unidentifiable from observational data alone, although there’s some recent work solving this in restricted cases.
Regarding II, if I understand correctly, the second sub-scenario is one in which we’d have a graph that looks like the following DAG.
What I’m confused about is that if we condition on a level of tar in a big population, we’ll still see a correlation between smoking and cancer via the trait, assuming there’s independent noise feeding into each of these nodes. More concretely, presumably people will smoke different amounts based on other unobserved factors outside this trait. So at least at certain levels of tar in the lungs, we’ll have both people who do and people who don’t have the trait, meaning there’ll be a correlation between smoking and cancer even within individual tar-level sub-populations. That said, in the purely deterministic simplified scenario, I see your point.
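To make this concrete, here’s a rough simulation sketch. I’m assuming the graph is trait → smoking → tar plus trait → cancer (my reading of the sub-scenario, not necessarily yours), with everything binary and every probability made up for illustration. Within each tar stratum, smokers should still show a higher cancer rate than non-smokers, purely through the trait.

```python
import random

def simulate(n, seed=0):
    """Draw n people from the ASSUMED graph: trait -> smoking -> tar, trait -> cancer.
    All probabilities are invented for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        trait = rng.random() < 0.5
        smokes = rng.random() < (0.8 if trait else 0.2)  # noisy: not fixed by the trait
        tar = rng.random() < (0.9 if smokes else 0.1)    # noisy: not fixed by smoking
        cancer = rng.random() < (0.6 if trait else 0.1)  # depends on the trait only
        rows.append((smokes, tar, cancer))
    return rows

def cancer_rate(rows, smokes, tar):
    """Observed cancer rate among people with the given smoking and tar values."""
    hits = [c for s, t, c in rows if s == smokes and t == tar]
    return sum(hits) / len(hits)

def stratum_gaps(rows):
    """Smoker minus non-smoker cancer rate, computed separately per tar level."""
    return {
        tar: cancer_rate(rows, True, tar) - cancer_rate(rows, False, tar)
        for tar in (False, True)
    }

if __name__ == "__main__":
    for tar, gap in stratum_gaps(simulate(200_000)).items():
        print(f"high tar = {tar}: cancer-rate gap = {gap:.3f}")
```

With these made-up numbers the gap stays clearly positive in both strata, even though neither smoking nor tar causes cancer in the simulated world.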
Alternatively, I’m pretty sure applying the front-door criterion (explanation) would properly identify the zero causal effect of smoking on cancer in this scenario (again assuming all the relationships aren’t purely deterministic).
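Here’s a hedged sketch of what I mean, again assuming the graph trait → smoking → tar plus trait → cancer, with binary variables and invented probabilities. The `front_door` function is my own toy implementation of the standard front-door formula, P(cancer | do(s)) = Σ_t P(t | s) Σ_s′ P(cancer | s′, t) P(s′), using tar as the mediator and never looking at the trait.

```python
import random
from collections import Counter

def simulate(n, seed=0):
    """ASSUMED graph: trait -> smoking -> tar, trait -> cancer (numbers invented)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        trait = rng.random() < 0.5
        smokes = rng.random() < (0.8 if trait else 0.2)
        tar = rng.random() < (0.9 if smokes else 0.1)
        cancer = rng.random() < (0.6 if trait else 0.1)  # no arrow from smoking or tar
        rows.append((smokes, tar, cancer))               # the trait stays unobserved
    return rows

def front_door(rows, s):
    """Estimate P(cancer | do(smoking = s)) with tar as the front-door mediator."""
    n = len(rows)
    counts = Counter((sm, t) for sm, t, _ in rows)
    cancers = Counter((sm, t) for sm, t, c in rows if c)
    p_s = {v: (counts[(v, False)] + counts[(v, True)]) / n for v in (False, True)}
    total = 0.0
    for t in (False, True):
        p_t_given_s = counts[(s, t)] / (counts[(s, False)] + counts[(s, True)])
        # Average P(cancer | s', t) over the marginal distribution of smoking.
        inner = sum(cancers[(s2, t)] / counts[(s2, t)] * p_s[s2] for s2 in (False, True))
        total += p_t_given_s * inner
    return total

def naive_gap(rows):
    """Plain observational difference in cancer rates, smokers minus non-smokers."""
    smokers = [c for s, _, c in rows if s]
    others = [c for s, _, c in rows if not s]
    return sum(smokers) / len(smokers) - sum(others) / len(others)

if __name__ == "__main__":
    rows = simulate(200_000)
    print(f"naive gap:      {naive_gap(rows):.3f}")
    print(f"front-door gap: {front_door(rows, True) - front_door(rows, False):.3f}")
```

The naive gap should come out large while the front-door estimate of the effect should sit near zero. Note that the formula divides by the count in every (smoking, tar) cell, which is where the non-determinism assumption bites: with purely deterministic relationships some cells would be empty.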
Yeah. Thanks for the front door link, I’ll take some time learning this!
Maybe to reformulate a bit, in the second sub-scenario my idea was that each person has a kind of “tar thermostat”, which sets the desired level of tar and continually adjusts your desire to smoke. If some other factor makes you smoke more or less, it will compensate until your level of tar again matches the “thermostat setting”. And the trait that determines someone’s “thermostat setting” would also determine their cancer risk. Basically the system would counteract any external noise, making the statistician’s job harder (though not impossible, you’re right).
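A toy sketch of the thermostat, in case it helps. Everything here is an assumption for illustration: tar tracks smoking one-to-one, the external factor is a constant `push`, and desire is corrected by a fixed fraction `gain` of the gap each round.

```python
def settle(setpoint, push, gain=0.5, steps=60):
    """Return the tar level after the feedback loop has run for `steps` rounds.

    setpoint: tar level the person's trait "wants" (hypothetical quantity)
    push:     constant external factor making them smoke more (+) or less (-)
    """
    desire = setpoint  # start at the undisturbed equilibrium
    tar = setpoint
    for _ in range(steps):
        smoking = desire + push            # the outside factor shifts actual smoking
        tar = smoking                      # toy assumption: tar equals smoking
        desire += gain * (setpoint - tar)  # thermostat nudges desire toward the setpoint
    return tar

if __name__ == "__main__":
    for push in (-4.0, 0.0, 3.0):
        print(f"push = {push:+.1f}: first round tar = {settle(10.0, push, steps=1):.2f}, "
              f"settled tar = {settle(10.0, push):.2f}")
```

In the first round the push shows up fully in tar, but the gap to the setpoint halves every round, so the settled tar level ends up carrying information about the setpoint (the trait) and almost none about the push.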
The third scenario, about skydiving, hints at a similar idea. The “thermostat” there is the person’s desire for thrill, so if you take away skydiving, it will try to find something else.
Oh I see, yeah this sounds hard. The causal graph wouldn’t be a DAG because it’s cyclic, in which case there may be something you can do, but the “standard” toolkit (read: what you’d find in Pearl’s Causality) won’t help you unless I’m forgetting something.
An apparently real hypothesis that fits this pattern is that people take more risks / do more unhealthy things the more they know healthcare can heal them / keep them alive.
The thermostat pattern is everywhere, from biology to econ to climate etc. I learned about it years ago from this article and it affected me a lot.