Looking it over, I could have been much clearer (sorry).
Specifically, I want to know the following.
Given a DAG of the form:
A → C ← B
Is it true that, in all prior joint distributions where A is independent of B, but A is evidence of C and B is evidence of C, A is non-independent of B given that C is held constant?
I proved that this is so when A & B is evidence against C, and also when A & B is independent of C. The only case I am missing is when A & B is evidence for C.
It's clear enough to me that when there is one non-colliding path between two variables, they must not be independent, and that if we hold any of the variables along that path constant, those two variables become independent. This can all be shown using standard probability theory and correlation alone. It can also be shown that if there are only colliding paths between two variables, those two variables are independent. If I have understood the theory of d-separation correctly, holding the collision variable on one of these paths constant (assuming there is only one) should make the two variables non-independent (either evidence for or against one another). I have proven that this is so in two of the (at least) three cases that fit the given DAG, using standard probability theory.
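To make the collider case concrete, here is a minimal sketch of one instance of it, not a proof of the general claim. It assumes a hypothetical setup that is not from the question itself: A and B are independent fair coins and C is their logical OR, so A & B together is evidence for C. The helper names `collider_joint` and `prob` are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

def collider_joint(p_a, p_b, p_c_given_ab):
    """Joint p(a, b, c) = p(a) p(b) p(c | a, b) for binary A, B, C."""
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        pa = p_a if a else 1 - p_a
        pb = p_b if b else 1 - p_b
        pc1 = p_c_given_ab[(a, b)]          # P(C=1 | A=a, B=b)
        pc = pc1 if c else 1 - pc1
        joint[(a, b, c)] = pa * pb * pc
    return joint

def prob(joint, **fixed):
    """Probability of the event that fixes some of a, b, c to given values."""
    names = ("a", "b", "c")
    return sum(p for key, p in joint.items()
               if all(key[names.index(n)] == v for n, v in fixed.items()))

half = Fraction(1, 2)
# Assumption for illustration: C is the logical OR of two fair coins.
or_table = {(a, b): Fraction(int(a or b)) for a, b in product((0, 1), repeat=2)}
joint = collider_joint(half, half, or_table)

# A and B are independent before conditioning on C ...
assert prob(joint, a=1, b=1) == prob(joint, a=1) * prob(joint, b=1)

# ... but once C=1 is held constant, learning B=1 lowers the probability of A=1.
p_a_given_c = prob(joint, a=1, c=1) / prob(joint, c=1)
p_a_given_bc = prob(joint, a=1, b=1, c=1) / prob(joint, b=1, c=1)
print(p_a_given_c, p_a_given_bc)  # 2/3 1/2
```

In this particular distribution, B becomes evidence against A once C is fixed at 1, which is the "explaining away" pattern. Exact fractions are used so the comparison does not depend on floating-point rounding.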
Is it true that, in all prior joint distributions where A is independent of B, but A is evidence of C and B is evidence of C, A is non-independent of B given that C is held constant?
No, but I think it's true if A, B, and C are binary. In general, if a distribution p is Markov relative to a graph G, then every d-separation in G corresponds to an independence in p. Importantly, the implication does not always go the other way: an independence in p need not correspond to a d-separation in G. Distributions in which the implication always goes the other way are very special and are called faithful.
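One way to see why the answer can be "no" outside the binary case is the following sketch. It is an illustrative, somewhat degenerate counterexample chosen for this note, not one given in the thread: A and B are independent fair coins and C = 2*A + B takes four values, so C records both parents exactly. The distribution is Markov relative to A → C ← B, each of A and B is evidence about C, and yet A and B stay independent once C is held constant, because both are already determined by C.

```python
from fractions import Fraction
from itertools import product

# Assumed counterexample: independent fair coins A, B and C = 2*A + B.
joint = {(a, b, 2 * a + b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}

def prob(pred):
    """Probability of the set of outcomes (a, b, c) satisfying pred."""
    return sum(p for abc, p in joint.items() if pred(*abc))

def cond_independent_given_c():
    """Check p(a, b | c) == p(a | c) p(b | c) for every value of c."""
    for c in range(4):
        pc = prob(lambda a, b, cc: cc == c)
        for a0, b0 in product((0, 1), repeat=2):
            p_ab = prob(lambda a, b, cc: (a, b, cc) == (a0, b0, c)) / pc
            p_a = prob(lambda a, b, cc: (a, cc) == (a0, c)) / pc
            p_b = prob(lambda a, b, cc: (b, cc) == (b0, c)) / pc
            if p_ab != p_a * p_b:
                return False
    return True

# A is evidence about C: P(C=3 | A=1) differs from P(C=3).
p_c3 = prob(lambda a, b, c: c == 3)
p_c3_given_a1 = prob(lambda a, b, c: a == 1 and c == 3) / prob(lambda a, b, c: a == 1)
print(p_c3, p_c3_given_a1)         # 1/4 1/2

# Yet conditioning on the collider does not make A and B dependent here.
print(cond_independent_given_c())  # True
```

So this distribution is Markov relative to the graph but not faithful to it: the conditional independence of A and B given C holds in p even though A and B are not d-separated given C.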
Those are the proofs I gave above.
What is Markov relative?
“Markov” is used in the standard memoryless sense. By definition, the graph G represents any distribution p where each variable on the graph is independent of its past (its non-descendants in G) given its parents. This is the Markov property.
Ilya is discussing probability distributions p that may or may not be represented by graph G. If every variable in p is independent of its past given its parents in G, then you can use d-separation in G to reason about independences in p.
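For the three-node graph A → C ← B specifically, the Markov property can be checked directly. The sketch below assumes the joint is given as a table keyed by (a, b, c). In this graph C's parents are A and B, which are all of its non-descendants, so the only constraint left is that A and B are marginally independent, which is the same as saying p(a, b, c) factorizes as p(a) p(b) p(c | a, b). The function name `is_markov_to_collider` and both example tables are hypothetical.

```python
from fractions import Fraction
from itertools import product

def is_markov_to_collider(joint):
    """True if p(a, b) = p(a) p(b), i.e. p factorizes as p(a) p(b) p(c | a, b)."""
    def marg(idx, val):
        return sum(p for key, p in joint.items() if key[idx] == val)
    a_vals = {k[0] for k in joint}
    b_vals = {k[1] for k in joint}
    for a, b in product(a_vals, b_vals):
        p_ab = sum(p for k, p in joint.items() if k[0] == a and k[1] == b)
        if p_ab != marg(0, a) * marg(1, b):
            return False
    return True

quarter = Fraction(1, 4)
# Independent fair coins with C = A OR B: Markov relative to the collider graph.
or_joint = {(a, b, a | b): quarter for a, b in product((0, 1), repeat=2)}
# B and C are copies of A: A and B are dependent, so the property fails.
copy_joint = {(a, a, a): Fraction(1, 2) for a in (0, 1)}

print(is_markov_to_collider(or_joint), is_markov_to_collider(copy_joint))  # True False
```

Only when a check like this passes does d-separation in G license conclusions about independences in p. It says nothing about the converse direction, which is what faithfulness adds.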