How was the direction of causality established? Maybe smart people are less likely to want to smoke marijuana, or nerdy people are less likely to develop connections that make marijuana available to them even if it’s illegal where they are. IQ also negatively correlates with number of sexual partners, but I haven’t seen anyone concluding that getting laid a lot makes you dumber.
They didn’t just measure the IQ of marijuana users. They measured the change in IQ over a long time of people who used marijuana during that time (and of people who didn’t, as a control group, of course).
To establish causation you’d have to randomly assign people into groups, not let them self-select into marijuana users and control.
Longitudinal comparisons are much better than a simple cross-section (‘the marijuana smokers tend to be stupider, huh’), but you’re still getting only a correlation. It’s perfectly plausible—indeed, inevitable—that there are uncontrolled factors: the Big Five personality factor Conscientiousness comes to mind as a plausible trait which might lead to non-smoking and higher IQ.
(That said, I have not used marijuana and have no intention of doing so.)
Depending on what was measured, there are “well known ways” to correct for confounding in longitudinal observational studies.
How do you correct for an unidentified common cause?
If there is a mediating variable that captures all of the causal flow from “treatment” (smoking marijuana) to “outcome” (IQ), and moreover, this variable is not an effect of the unidentified common cause, you can use “the front door functional” (see Pearl’s book) to get the causal effect.
If there is a variable that is a “strong cause” of the “treatment”, but not of the “outcome” (except through treatment) then this variable is instrumental, and there are methods that will give you the causal effect using this variable.
If there is an observed effect of an unobserved common cause, and you know something about how this effect arose, there are methods for “reconstructing” the unobserved common cause, and then using the standard covariate adjustment formula.
In general (e.g. complex longitudinal cases), there is a neat algorithm due to Jin Tian that can handle all sorts of unobserved confounding. Not everything, of course. In general, unobserved confounders doom you.
If you want the full story, you can read for instance this paper:
http://ftp.cs.ucla.edu/pub/stat_ser/r336-published.pdf
There is also the issue of how to do this in practice with data and smart statistical methods, which is a long separate discussion.
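As a small illustration of the instrumental-variable case mentioned above, here is a minimal sketch on simulated data. Everything in it is an assumption made for the example: the linear model, the effect size of 2.0, and the names `simulate`, `naive_slope` and `iv_estimate` are hypothetical and have nothing to do with the marijuana study. It uses the simple ratio (Wald) estimator cov(Z, Y) / cov(Z, A), which is only one of the available methods.

```python
import random


def simulate(n, true_effect=2.0, seed=0):
    """Draw (z, a, y) rows from a hypothetical linear model with a hidden confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.randint(0, 1)                 # instrument: affects Y only through A
        u = rng.gauss(0.0, 1.0)               # unobserved common cause of A and Y
        a = z + u + rng.gauss(0.0, 1.0)       # treatment
        y = true_effect * a + u + rng.gauss(0.0, 1.0)  # outcome
        rows.append((z, a, y))                # u is deliberately not recorded
    return rows


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def naive_slope(rows):
    """Regression slope of Y on A, which absorbs the confounding through U."""
    _, a, y = zip(*rows)
    return cov(a, y) / cov(a, a)


def iv_estimate(rows):
    """Wald / ratio estimator: cov(Z, Y) / cov(Z, A)."""
    z, a, y = zip(*rows)
    return cov(z, y) / cov(z, a)


rows = simulate(50_000)
print("naive slope:", round(naive_slope(rows), 3))
print("IV estimate:", round(iv_estimate(rows), 3))
```

Under these assumptions the naive slope should come out noticeably above 2.0, while the IV estimate should land close to it. The sketch says nothing about whether a valid instrument exists for marijuana use; that is the hard part.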
Did they use this method in the statistical analysis of the study? It is behind a paywall for me.
Doesn’t look like it from the wording (I will know for sure once I find the pdf).
I don’t think that correcting for the effects of various factors is on the same scale as controlling for them, and after going over your reference I am more sure of that.
Granted, correcting is often very much easier than controlling for complex factors, and allows for sample sizes to alter the scale again.
I don’t know what you mean when you say “correcting” vs “controlling.” Can you give some examples? I don’t understand your last sentence at all.
In both cases the goal is to measure the effect of one choice, including effects through intermediate causes, without including in that measurement any other factors.
Assuming a complex system and a fairly large sample, you can correct by gathering lots of data about as many things which might be factors as possible. If, however, there is an unmeasured trait “Does not enjoy mind-clouding events” which correlates with both an increase in IQ and not smoking dope, it cannot be discovered by correction.
To control, you take your population and divide it into groups as evenly as possible along every axis that you can measure except the independent variable, and then force the independent variable of each group to be the same.
Maybe the definitions I’m using are different from the jargon, and if so I am ‘wrong’ in a real sense; what is the jargon for distinguishing between those two types of differentiation?
Ok, when you say “correct” you mean you try to discover as many hidden variables in your DAG as possible and try to collect data on them such that they become observed. When you say “control” you mean a particular implementation of the adjustment formula: $p(y \mid do(a)) = \sum_{c} p(y \mid a, c)\, p(c)$, where a is the treatment, y is the outcome, and c is the measured covariates. (Note: using “independent/dependent” variable is not correct because those variables are not guaranteed to have the causal relationship you want: an effect can be independent and a cause can be dependent.)
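To make the adjustment formula concrete, here is a minimal sketch that evaluates it exactly on a tiny discrete model. The probabilities are made up for illustration, and the names `joint`, `prob`, `naive` and `adjusted` are hypothetical. C is a measured binary covariate that causes both A and Y.

```python
from itertools import product

# Hypothetical model: C -> A, C -> Y, A -> Y, all binary.
P_C = {0: 0.6, 1: 0.4}                       # p(c)
P_A1_GIVEN_C = {0: 0.2, 1: 0.7}              # p(A=1 | c)
P_Y1_GIVEN_AC = {(0, 0): 0.8, (1, 0): 0.7,   # p(Y=1 | a, c), keyed by (a, c)
                 (0, 1): 0.5, (1, 1): 0.4}


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Observed joint distribution p(c, a, y).
joint = {
    (c, a, y): P_C[c] * bern(P_A1_GIVEN_C[c], a) * bern(P_Y1_GIVEN_AC[(a, c)], y)
    for c, a, y in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Marginal probability of the event given by keyword arguments c=, a=, y=."""
    total = 0.0
    for (c, a, y), p in joint.items():
        point = {"c": c, "a": a, "y": y}
        if all(point[name] == value for name, value in fixed.items()):
            total += p
    return total


def naive(y, a):
    """Plain conditional p(y | a), which mixes in the effect of C."""
    return prob(a=a, y=y) / prob(a=a)


def adjusted(y, a):
    """Adjustment formula: sum over c of p(y | a, c) * p(c)."""
    return sum(
        prob(c=c, a=a, y=y) / prob(c=c, a=a) * prob(c=c)
        for c in (0, 1)
    )


print("naive difference:   ", naive(1, 1) - naive(1, 0))
print("adjusted difference:", adjusted(1, 1) - adjusted(1, 0))
```

With these made-up numbers the naive comparison gives a difference of about -0.25, while the adjusted one gives about -0.10, so ignoring C overstates the harm of A. This only works because C is measured; it does nothing about an unmeasured common cause.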
The point of some of the work in causal inference, including the paper I linked, is that in some cases you don’t need to either “correct” or “control” in the senses of the words you are using. For example, if your graph is:
A → W → Y, and there is an unobserved common cause U of A and Y, then you don’t need to “correct” for the presence of this U by trying to measure it, nor can you “control” for U since you cannot measure it. What you can do is use the following formula: $p(y \mid do(a)) = \sum_{w} p(w \mid a) \sum_{a'} p(y \mid w, a')\, p(a')$.
There are more complex versions of the same trick discussed in great detail in the paper I linked.
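Here is a minimal sketch checking the front-door formula for the A → W → Y graph on a toy model. All numbers are invented for the example, and the helper names (`observed`, `p_obs`, `front_door`, `truth`) are hypothetical. The code builds the full distribution including U, throws U away, applies the formula to the observed (a, w, y) distribution only, and compares the result with the answer computed using U.

```python
from itertools import product


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Hypothetical parameters: U -> A, U -> Y, A -> W, W -> Y (no direct A -> Y).
P_U1 = 0.5
P_A1_GIVEN_U = {0: 0.2, 1: 0.8}
P_W1_GIVEN_A = {0: 0.1, 1: 0.9}
P_Y1_GIVEN_WU = {(0, 0): 0.1, (1, 0): 0.5,   # keyed by (w, u)
                 (0, 1): 0.4, (1, 1): 0.9}

# Observed distribution p(a, w, y): U is summed out and never used again below.
observed = {}
for u, a, w, y in product((0, 1), repeat=4):
    p = (bern(P_U1, u) * bern(P_A1_GIVEN_U[u], a)
         * bern(P_W1_GIVEN_A[a], w) * bern(P_Y1_GIVEN_WU[(w, u)], y))
    observed[(a, w, y)] = observed.get((a, w, y), 0.0) + p


def p_obs(a=None, w=None, y=None):
    """Marginal of the observed distribution; None means 'sum over this variable'."""
    return sum(
        p for (a_, w_, y_), p in observed.items()
        if (a is None or a_ == a) and (w is None or w_ == w) and (y is None or y_ == y)
    )


def front_door(y, a):
    """sum_w p(w | a) * sum_a' p(y | w, a') p(a'), using observed data only."""
    total = 0.0
    for w in (0, 1):
        p_w_given_a = p_obs(a=a, w=w) / p_obs(a=a)
        inner = sum(
            p_obs(a=a2, w=w, y=y) / p_obs(a=a2, w=w) * p_obs(a=a2)
            for a2 in (0, 1)
        )
        total += p_w_given_a * inner
    return total


def truth(y, a):
    """Ground truth p(y | do(a)), computed with access to U."""
    return sum(
        bern(P_U1, u) * bern(P_W1_GIVEN_A[a], w) * bern(P_Y1_GIVEN_WU[(w, u)], y)
        for u in (0, 1) for w in (0, 1)
    )


print("front-door:", front_door(1, 1), " truth:", truth(1, 1))
print("naive p(Y=1 | A=1):", p_obs(a=1, y=1) / p_obs(a=1))
```

In this toy model the front-door value matches the ground truth (0.655 for a = 1), while the naive conditional comes out at 0.772. The agreement depends entirely on the graph being the one assumed.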
It is the independent variable in a controlled study because the study makes that variable independent of all other variables. It doesn’t matter if normally U->A; in the controlled study, A is determined by sorting into groups. Instead of observing A, A is decided by fiat.
The formulae only work if you have a graph of what you believe the causal chain might be, and gather data for each step that you have. If, for example, you think the chain is A->W->Y, with a potential U->A and U->Y, but the actual chain is U->Not-W; U->Y; A->W and A->Y, you provide bad advice to people who wish Y or Not-Y and are deciding on A.
“Independent/dependent” variables are used when talking about functions and regression models, even when those functions and regression models are not causal. For this reason, I believe it is confusing usage. Ordinary statistical regressions are invertible; causal regressions are not.
The formulae are correct iff the graph is correct, that is true. I am not sure what you are trying to say. If your assumptions are wrong, your entire analysis is garbage. This is true of any analysis. Are you saying anything beyond this? Please clarify what you mean.
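To see what a wrong graph does, here is a minimal sketch that applies the front-door formula to data generated from the other graph described above (U->Not-W, U->Y, A->W, A->Y, with no W->Y). The probabilities and helper names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from itertools import product


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Hypothetical "actual" model: A and U are independent causes of both W and Y.
P_U1 = 0.5
P_A1 = 0.5


def p_w1(a, u):
    return 0.5 + 0.4 * a - 0.4 * u   # A raises W, U lowers W


def p_y1(a, u):
    return 0.2 + 0.5 * a + 0.2 * u   # A and U both raise Y; W plays no role


# Observed distribution p(a, w, y) with U summed out.
observed = {}
for u, a, w, y in product((0, 1), repeat=4):
    p = bern(P_U1, u) * bern(P_A1, a) * bern(p_w1(a, u), w) * bern(p_y1(a, u), y)
    observed[(a, w, y)] = observed.get((a, w, y), 0.0) + p


def p_obs(a=None, w=None, y=None):
    """Marginal of the observed distribution; None means 'sum over this variable'."""
    return sum(
        p for (a_, w_, y_), p in observed.items()
        if (a is None or a_ == a) and (w is None or w_ == w) and (y is None or y_ == y)
    )


def front_door(y, a):
    """Front-door formula, which assumes the graph A -> W -> Y (wrong here)."""
    total = 0.0
    for w in (0, 1):
        inner = sum(
            p_obs(a=a2, w=w, y=y) / p_obs(a=a2, w=w) * p_obs(a=a2)
            for a2 in (0, 1)
        )
        total += p_obs(a=a, w=w) / p_obs(a=a) * inner
    return total


def truth(y, a):
    """Ground truth p(y | do(a)) under the actual model."""
    return sum(bern(P_U1, u) * bern(p_y1(a, u), y) for u in (0, 1))


fd_effect = front_door(1, 1) - front_door(1, 0)
true_effect = truth(1, 1) - truth(1, 0)
print("front-door effect:", fd_effect)
print("true effect:      ", true_effect)
```

With these numbers the true effect of A on Y is +0.5, but the misapplied formula reports a small negative effect, so the advice would point the wrong way. In this particular model A is unconfounded, so the plain conditional p(y | a) would have been correct.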
With controlled experimentation, one can be almost certain that the effect measured is due to the variable modified. It doesn’t matter if you have a correct graph of the confounding factors, because you balance them against each other.
What you are doing is measuring the combined strength of all chains of the type A->?->Y
Even in randomized trials you need to worry about assumptions. For example, you have to worry that your samples represent the general population. You have to worry that the actual random assignment with the people you have in your study well approximated the ideal random assignment in an infinite population. You then have to worry about modeling assumptions if you are doing statistical modeling on top of that. It is true you don’t need assumptions that link observational and interventional quantities if you randomize.
“What you are doing is measuring the combined strength of all chains of the type A->?->Y”
If the graph is as I described, that’s what you want (i.e. the causal effect, the variation in Y under randomizing A).
I don’t do random assignment. I divide the sample set into two or more groups that are as close to identical as possible, including their prior variation along A. Figuring out if one split is closer than a different one is nontrivial.
The only random decision is which group gets which A.
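A minimal sketch of that procedure, assuming a single measured covariate (a made-up baseline score) stands in for "every axis you can measure". The function names `split_balanced` and `assign_treatment` are hypothetical. The split deals sorted subjects out in an A, B, B, A pattern, which is one simple heuristic and not a general answer to which split is closest.

```python
import random


def split_balanced(units, key):
    """Split units into two groups with similar values of `key`.

    Sorted positions 0, 3, 4, 7, ... go to group A and 1, 2, 5, 6, ... to group B.
    """
    ordered = sorted(units, key=key)
    group_a, group_b = [], []
    for i, unit in enumerate(ordered):
        (group_a if i % 4 in (0, 3) else group_b).append(unit)
    return group_a, group_b


def assign_treatment(group_a, group_b, rng):
    """The only random step: a single coin flip decides which group gets treated."""
    if rng.random() < 0.5:
        return {"treated": group_a, "control": group_b}
    return {"treated": group_b, "control": group_a}


# Hypothetical subjects with a baseline score measured before the study.
subjects = [{"id": i, "baseline": score}
            for i, score in enumerate([100, 85, 120, 95, 110, 90, 105, 115])]

group_a, group_b = split_balanced(subjects, key=lambda s: s["baseline"])
arms = assign_treatment(group_a, group_b, random.Random(0))

for name, members in arms.items():
    scores = [s["baseline"] for s in members]
    print(name, sorted(scores), "mean:", sum(scores) / len(scores))
```

With several covariates the balancing step becomes a real optimisation problem, and balance on measured axes still says nothing about unmeasured ones.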
Also, people who make heavy use of marijuana may have other intelligence-lowering behaviors, like a heavy use of alcohol.
Also, marijuana severely decreases motivation, and level of motivation has been correlated with IQ. With frequent usage of any drug, I would say it would modify behavior. That is to say, if you get high a lot, it probably would modify your normal behavior just through habit formation. Am I correct to assume this?
There could also be self-fulfilling prophecies in the taking of the exam. Telling a guy, “You’re a stoner, so we want you to take an IQ test” probably does something to the test taker’s perception of himself/herself.