Causal networks do not make an iid assumption. Consider one of the simplest examples: experimental data. Some of the variables are chosen by the experimenter, and they can be chosen any way the experimenter pleases, so long as they vary. The process is the same, but that does not imply iid observations; it just means that time dependence must enter through the variables. As you say, it is not built into the framework.
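For concreteness, here is a minimal sketch (the model and numbers are invented) of how time can enter through the variables: a causal process t → x → y in which t is just another parent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented structural model t -> x -> y: time is an ordinary parent variable.
n = 100
t = np.arange(n)                    # chosen by the experimenter; not iid
x = 0.1 * t + rng.normal(size=n)    # x depends on t
y = 2.0 * x + rng.normal(size=n)    # y depends on x

# Each row (t_i, x_i, y_i) is one sample from the same causal process;
# the dependence between rows is carried entirely by the variable t.
data = np.column_stack([t, x, y])
print(data[:3])
```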
The problem is to make precise the phrase “the different measurements of each variable are associated because they come from the same sample of the causal process.” What is a sample? How do we know two numbers (or other strings) came from the same sample? Since the association carries information separate from the values themselves, how can we incorporate that information into the framework explicitly? How can we handle uncertainty in the association apart from uncertainty in the values of the variables?
Yeah, I guess that’s way too strong; there are a lot of alternative assumptions that also justify using them.
What is a sample? How do we know two numbers (or other strings) came from the same sample?
I think we just have to assume this problem is solved. Whenever we use causal networks in practice, we know what a sample is. You can try to weaken this assumption and see if you still get anything useful, but that is very different from ‘conditioning on time’ as you present it in the post.
Since the association contains information separate from the values themselves, how can we incorporate that information into the framework explicitly?
Bayes’ theorem? If we have a strong enough prior and enough information to reverse-engineer the association reasonably well, then we might be able to learn something. If you’re running a clinical trial and you recorded which drugs were given out, but not to which patients, then you need other information, such as a prior about which side effects each drug causes and measurements of side effects that are associated with specific patients. Otherwise you just don’t have the data necessary to construct the model.
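As a toy illustration of that kind of inference (all names and numbers here are invented): put a uniform prior over the possible patient-to-drug associations, score each association by how well it explains the per-patient side-effect measurements, and normalize to get a posterior.

```python
from itertools import permutations

import numpy as np

# Toy version of the clinical-trial scenario: we know which drugs were
# dispensed, but not which patient got which.
drugs = ["A", "A", "B", "B"]   # dispensed drugs, patient unknown
side_effect = [1, 1, 0, 1]     # per-patient side-effect observations

# Prior knowledge: assumed side-effect rate for each drug.
p_effect = {"A": 0.8, "B": 0.1}

def likelihood(assignment):
    """P(observed side effects | this patient-to-drug association)."""
    l = 1.0
    for drug, s in zip(assignment, side_effect):
        p = p_effect[drug]
        l *= p if s else 1.0 - p
    return l

# Zero information about the association = uniform prior over the
# distinct permutations of the dispensed drugs.
assignments = sorted(set(permutations(drugs)))
posterior = np.array([likelihood(a) for a in assignments])
posterior /= posterior.sum()

for a, p in zip(assignments, posterior):
    print(a, round(float(p), 3))
```

With sharp priors and informative side-effect measurements, the posterior concentrates on a few associations; with weak priors it stays spread out, which is exactly the “not enough data to construct the model” case.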
Exactly! We want to incorporate the association information using Bayes’ theorem. If you have zero information about the mapping, then your knowledge is invariant under permutations of the data sets (e.g., swapping T0 with T1). That implies that your prior over the associations is uniform over the possible permutations (note that a permutation uniquely specifies an association and vice versa). So, when calculating the correlation, you have to average over all permutations, and since each value of one variable is then equally likely to be paired with each value of the other, the correlation turns out to be identically zero for all possible data. No association means no correlation.
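This is easy to check numerically. A quick sketch with made-up data; since the denominator of Pearson’s r is the same for every pairing, averaging the covariance is enough:

```python
from itertools import permutations

import numpy as np

# Toy data (invented): two measurement sets whose pairing is unknown.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.3, 2.9, 4.2, 4.8])  # strongly correlated under the true pairing

def cov(a, b):
    """Covariance numerator of Pearson's r; the denominator is
    permutation-invariant, so averaging this suffices."""
    return np.mean((a - a.mean()) * (b - b.mean()))

# Average over every possible association, i.e., every permutation of y.
avg = np.mean([cov(x, np.array(p)) for p in permutations(y)])

print(cov(x, y))  # large under the true pairing
print(avg)        # zero up to float rounding: no association, no correlation
```

Analytically, under a uniform prior over permutations each y value is equally likely to be paired with each x value, so every cross term in the covariance averages to zero.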
So in the zero-information case, we get this weird behavior that isn’t what we expect. If the zero-information case doesn’t work, then we can’t expect to get correct answers with only partial information about the associations. We can expect similar strangeness when trying to deal with partial information based on priors about the side effects caused by our hypothetical drug.
If we don’t have enough information to construct the model, then our analysis should yield inconclusive results, not weird or backward results. So the problem is to figure out the right way to handle association information.
Yes, but this is a completely different matter from your original post. Obviously this is how we should handle the weird state of information that you’re constructing, but it doesn’t have the causal interpretation you give it. You are doing something, but it isn’t causal analysis. Also, in the scenario you describe, you do have the association information, so you should be using it.