I think I might be confused by the concept of testability. But with that out of the way:
no, we really mean “untestable.” SUTVA (stable unit treatment value assumption) is a two-part assumption:
first, it assumes that if we give the treatment to one person, this does not affect other people in the study (unclear how to check for this...)
second, it assumes that if we observed the exposure A equal to a, then there is no difference between the observed response for any person and the response that same person would have had under a hypothetical randomized study where we assigned A to a for that person (unclear how to check for this either… it talks about hypothetical worlds).
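The no-interference clause can at least be stated precisely: unit i’s potential outcome depends only on its own treatment, not on the whole assignment vector. A toy Python sketch (both outcome functions are invented for illustration) of what a violation looks like:

```python
n = 6

# Under SUTVA part 1, unit i's outcome is a function of a[i] alone:
def outcome_sutva(i, a):              # a is the full assignment vector
    return a[i]                       # depends only on own treatment

# Under interference (e.g. an infectious disease), unit i's outcome
# may also depend on a neighbor's treatment:
def outcome_interference(i, a):
    return max(a[i], a[(i + 1) % n])  # picks up effect from neighbor i+1

a1 = [1, 0, 0, 0, 0, 0]
a2 = [1, 0, 0, 0, 1, 0]

# Unit 3's own treatment is 0 under both assignments; SUTVA says its
# outcome must therefore agree across the two worlds:
print(outcome_sutva(3, a1) == outcome_sutva(3, a2))                # True
print(outcome_interference(3, a1) == outcome_interference(3, a2))  # False
```

The checkable-looking comparison above is between two different hypothetical assignments to the same units, which is exactly what observation of a single study does not give us.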
Causal inference from observational data has to rely on untestable assumptions to link what we see with what would have happened under a hypothetical experiment. If you don’t like this, you should give up on causal inference from observational data (and you would be in good company if you do—Ronald Fisher and lots of other statisticians were extremely skeptical).
It’s not clear to me how large a class of statements you’re considering untestable. Are all counterfactual statements untestable (because they are about non-existent worlds)?
To take an example I just googled up, page 7 of this gives an example of a violation of the first of the SUTVA conditions. Is that violating circumstance, or its absence, untestable even outside of the particular study?
Another hypothetical example would be treatment of patients having a dangerous and infectious disease. One would presumably be keeping each one in isolation; is the belief that physical transmission of microorganisms from one person to another may result in interference between patient outcomes untestable? Surely not.
Such a general concept of untestability amounts to throwing up one’s hands and saying “what can we ever know?”, while looking around at the world shows that in fact we know a great deal. I cannot believe that this is what you are describing as untestable, but then it is not clear to me what the narrower bounds of the class are.
At the opposite extreme, some assumptions called untestable are described as “domain knowledge”, in which case they are as testable as any other piece of knowledge (where else does “domain knowledge” come from?), but merely fail to be testable by the data under present consideration.
It’s not clear to me how large a class of statements you’re considering untestable.
As I said, I am confused about the concept of testability. While I work out a general account I am happy with (or perhaps abandon ship in a Bayeswardly direction or something) I am relying on a folk conception to get statements that, regardless of what the ultimate account of testability might be, are definitely untestable. That is, we cannot imagine an effective procedure that would, even in principle, check if the statement is true.
The standard example is Smoking → Tar → Cancer
The statement “the random variables I have cancer given that I was _assigned_ to smoke and I have tar in my lungs given that I was _assigned_ not to smoke are independent” is untestable.
That’s because to test this independence, I have to simultaneously consider a world where I was assigned to smoke, and another world where I was assigned not to smoke, and consider a joint distribution over these two worlds. But we only can access one such world at a time, unless we can roll back time, or jump across Everett branches.
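A small simulation (Python, with invented distributions) makes the difficulty concrete: the two potential outcomes can be generated jointly inside the simulator, but any dataset reveals only one of them per unit, so their cross-world dependence leaves no trace in the observables:

```python
import random

random.seed(0)
n = 100_000

# Each unit has TWO potential outcomes: Y(1), the response if
# assigned to smoke, and Y(0), the response if assigned not to.
# As simulators we get to choose their joint distribution -- here
# they share a latent cause u, making them strongly dependent.
y1, y0 = [], []
for _ in range(n):
    u = random.gauss(0, 1)
    y1.append(int(u + random.gauss(0, 1) > 0))
    y0.append(int(u + random.gauss(0, 1) > 1))

# A randomized experiment reveals only ONE potential outcome per unit.
a = [random.randint(0, 1) for _ in range(n)]
y_obs = [y1[i] if a[i] == 1 else y0[i] for i in range(n)]

# The marginal P(Y(1)=1) is identified from the observed data:
p1_obs = sum(y for y, t in zip(y_obs, a) if t == 1) / sum(a)
p1_true = sum(y1) / n
print(p1_obs, p1_true)  # agree up to sampling error

# ...but the cross-world joint P(Y(1)=1, Y(0)=1) is not: the observed
# data (a, y_obs) would look the same whether y1 and y0 were dependent
# or independent, so no statistic of the data distinguishes the two.
```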
Pearl does not concern himself with testability very much, because Pearl is a computer scientist, and to Pearl the world of Newtonian physics is like a computer circuit, where it is obvious that everything stays invariant to arbitrary counterfactual alterations of wires, and in particular sources of noise stay independent. But causal inference is applied not to circuits but to much mushier problems, like psychology or medicine. In such domains it is not clear why assumptions like my example should intuitively hold, nor how to test them.
Such a general concept of untestability amounts to throwing up one’s hands and saying “what can we ever know?”
This is not naive skepticism, this is a careful account of assumptions, a very good habit among statisticians, in my opinion. We need more of this in statistical and causal analysis, not less.
The statement “the random variables I have cancer given that I was assigned to smoke and I have tar in my lungs given that I was assigned not to smoke are independent” is untestable.
Can you give me a larger context for that example? A pointer to a paper that uses it would be enough.
At the moment I’m not clear what the independence of these means, if they’re understood as statements about non-interacting world branches. What is the mathematical formulation of the assertion that they are independent? How, in mathematical terms, would that assumption play a role in the study of whether smoking causes cancer?
From another point of view, suppose that we knew the exact mechanisms whereby smoke, tar, and everything else have effects on the body leading to cancer. Would we then be able to calculate the truth or falsity of the assumption?
Since you asked for a paper, I have to cite myself: http://arxiv.org/pdf/1205.0241v2.pdf (there are lots of refs in there as well, for more reading).
The “branches” are interacting because they share the past, although I was being imprecise when I was talking about Everett branches—these hypothetical worlds are mathematical abstractions, and do not correspond directly to a part of the wave function at all. There is no developed extension of interventionist causality to quantum theory (nor is it clear whether this is a good idea—the intervention abstraction might not make sense in that setting).
Thanks, I now have a clearer idea of what these expressions mean and why they matter. You write on page 15:
Defining the influence of A on Y for a particular unit u as Y(1,M(0,u),u) involved a seemingly impossible hypothetical situation, where the treatment given to u was 0 for the purposes of the mediator M, and 1 for the purposes of the outcome Y.
For the A/M/Y = smoking/tar/cancer situation I can imagine a simple way of creating this situation: have someone smoke cigarettes with filters that remove all of the tar but nothing else. There may be practical engineering problems in creating such a filter, and ethical considerations in having experimental subjects smoke, but it does not seem impossible in principle. This intervention sets A to 1 and M to M(0,u), allowing the measurement of Y(1,M(0,u),u).
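Under the assumptions being debated in this thread (no unmeasured confounding plus the cross-world independence), E[Y(1, M(0))] is identified by what Pearl calls the mediation formula. A short Python sketch, with all conditional probabilities invented purely for illustration:

```python
# Mediation formula sketch (binary A, M, Y; no confounders assumed):
#   E[Y(a, M(a'))] = sum_m P(M=m | A=a') * E[Y | A=a, M=m]
# Every number below is made up for the example.

p_m_given_a = {0: 0.2, 1: 0.8}               # P(M=1 | A=a)
e_y_given_am = {(0, 0): 0.05, (0, 1): 0.15,
                (1, 0): 0.10, (1, 1): 0.30}  # E[Y | A=a, M=m]

def e_y_nested(a_for_y, a_for_m):
    """E[Y(a_for_y, M(a_for_m))] via the mediation formula."""
    p_m1 = p_m_given_a[a_for_m]
    return ((1 - p_m1) * e_y_given_am[(a_for_y, 0)]
            + p_m1 * e_y_given_am[(a_for_y, 1)])

# Natural direct effect: set A=1 for Y, but hold M at its A=0 behavior
# (the "tar filter" counterfactual), minus the baseline:
nde = e_y_nested(1, 0) - e_y_nested(0, 0)
print(nde)  # 0.14 - 0.07 = 0.07
```

Note that the formula itself only uses ordinary observed conditionals; the untestable cross-world assumption is what licenses reading the result as E[Y(1, M(0))].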
As with the case of the word “untestable”, I am wondering if “impossible” is here being understood to mean, not impossible in an absolute sense, but “impossible within some context of available means, assumed as part of the background”. For example, “impossible without specific domain knowledge”, or “impossible given only the causal diagram and some limited repertoire of feasible interventions and observations”. The tar filter scenario goes outside those bounds by using domain knowledge to devise a way of physically erasing the arrow from A to M.
I have the same question about page 18, where you say that equation (15):
Y(1,m) _||_ M(0)
is untestable (this is the example you expressed in words upthread), even though you have shown that it mathematically follows from any SEM of a certain form relating the variables, and could be violated if it has certain different forms. The true causal relationships, whatever they are, are observable physical processes. If we could observe them all, we would observe whether Y(1,m) _||_ M(0).
Again, by “untestable” do you here mean untestable within certain limits on what experiments can be done?
Richard, thanks for your message, and for reading my paper. At the risk of giving you more homework, I thought I would point you to the following paper, which you might find interesting: http://www.hsph.harvard.edu/james-robins/files/2013/03/wp100.pdf
This paper is about an argument the authors are having with Judea Pearl about whether assumptions like the one we are talking about are sensible to make. Of particular relevance for us is section 5.1. If I understood the point the authors are making, whenever Judea justifies such an assumption, he tells a story that is effectively interventional (very similar to your story about a filter). That is, what is really happening is that we are replacing the graph:
A → M → Y, A → Y
by another graph:
A → A1 → Y, A → A2 → M → Y
where A1 is the “non-tar-producing part” of smoking, and A2 is the “tar-producing part” of smoking (the example in 5.1 talks about nicotine instead). As long as we can tell such a story, the relevant counterfactual is implemented via interventions, and all is well. That is, Y(A=1,M(A=0)) in graph 1 is the same thing as Y(A1=1,A2=0) in graph 2.
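When such a story holds, the nested counterfactual becomes an ordinary intervention in the expanded graph, and so is simulable (and in principle experimentally realizable). A structural-equation sketch in Python, with all equations and probabilities invented for illustration:

```python
import random

random.seed(1)

# Structural-equation sketch of the expanded graph
#   A -> A1 -> Y,  A -> A2 -> M -> Y
# where A1 is the "non-tar-producing" component of smoking and A2 the
# "tar-producing" component. Equations below are hypothetical.

def simulate(a1, a2, n=200_000):
    """Mean outcome when we intervene on A1 and A2 separately."""
    total = 0
    for _ in range(n):
        m = int(random.random() < (0.8 if a2 else 0.2))  # M depends only on A2
        p_y = 0.05 + 0.10 * a1 + 0.20 * m                # Y depends on A1 and M
        total += int(random.random() < p_y)
    return total / n

# Y(A=1, M(A=0)) in graph 1 is, per the story, the ordinary
# intervention Y(A1=1, A2=0) in graph 2:
y_nested = simulate(a1=1, a2=0)
print(y_nested)  # roughly 0.05 + 0.10 + 0.20 * 0.2 = 0.19
```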
The true causal relationships, whatever they are, are observable physical processes. If we could observe
them all, we would observe whether Y(1,m) || M(0).
We do mediation analysis in the first place because we are being empiricists: using data for scientific discovery. In particular, we are trying to learn a fairly crude fact about the cause-effect relationships of A, M and Y. If, as you say, we were able to observe the entire relevant DAG, and all biochemical events involved in the A → M → Y chain, then we would already be done, and would not need to do our analysis in the first place.
“Testability” (the concept I am confused about) comes up in the process of scientific work, which is crudely about expanding a lit circle of the known via sensible procedures. So intuitively, “testability” has to involve the resources of the lit circle itself, not of things in the darkness. This is because there is a danger of circularity otherwise.