In fact, in order to truly ignore time data, we cannot even order the points according to time! But that means that we no longer have any way to line up the points T0 with e0, T1 with e1, etc.
What? This makes no sense.
I guess you haven’t seen this stated explicitly, but the framework of causal networks makes an iid assumption. The idea is that the causal network represents some process that occurs a lot, and we can watch it occur until we get a reasonably good understanding of the joint distribution of variables. Part of this is that it is the same process occurring each time, so there is no time dependence built into the framework.
For some purposes, we can model time by simply including it as an observed variable, which is what you do in this post. However, the different measurements of each variable are associated because they come from the same sample of the (iid) causal process, whether or not we are conditioning on time. The way you are trying to condition on time isn’t correct, and the correlation does exist in both cases. (Really, we care about dependence rather than correlation, but it doesn’t make a difference here.)
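For concreteness, here’s a minimal sketch of what I mean (Python/NumPy, with made-up numbers): each iid sample of the process is a (time, value) pair, and the association I’m talking about is the within-sample pairing, which persists whether or not you condition on time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each iid "sample" of the causal process is a (time, value) pair. The
# value depends on time, but the samples are iid draws of the *pair*,
# so the iid assumption says nothing about the values being
# time-independent.
n = 1000
t = rng.uniform(0.0, 10.0, size=n)          # time, as an observed variable
e = 2.0 * t + rng.normal(0.0, 1.0, size=n)  # a measurement that depends on time

# The within-sample pairing shows up as dependence (here, correlation)
# between the paired measurements.
print(np.corrcoef(t, e)[0, 1])  # ~0.99
```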
I do think that this is a useful general direction of analysis. If the question is meaningful at all, then the answer is probably that given by Armok_GoB in the original thread, but it would be useful to clarify what exactly the question means. There is probably a lot of work to be done before we really understand such things, but I would advise you to better understand the ideas behind causal networks before trying to contribute.
Causal networks do not make an iid assumption. Consider one of the simplest examples, in which we examine experimental data. Some of the variables are chosen by the experimenter. They can be chosen any way the experimenter pleases, so long as they vary. The process is the same, but that does not imply iid observations. It just means that time dependence must enter through the variables. As you say, it is not built into the framework.
The problem is to reduce the phrase “the different measurements of each variable are associated because they come from the same sample of the causal process.” What is a sample? How do we know two numbers (or other strings) came from the same sample? Since the association contains information separate from the values themselves, how can we incorporate that information into the framework explicitly? How can we handle uncertainty in the association apart from uncertainty in the values of the variables?
Yeah, I guess that’s way too strong; there are also a lot of alternative assumptions that justify using them.
What is a sample? How do we know two numbers (or other strings) came from the same sample?
I think we just have to assume this problem solved. Whenever we use causal networks in practice, we know what a sample is. You can try to weaken this and see if you still get anything useful, but that is very different from ‘conditioning on time’ as you present it in the post.
Since the association contains information separate from the values themselves, how can we incorporate that information into the framework explicitly?
Bayes’ theorem? If we have a strong enough prior and enough information to reverse-engineer the association reasonably well, then we might be able to learn something. If you’re running a clinical trial and you recorded which drugs were given out, but not to which patients, then you need other information, such as a prior about which side effects they cause and measurements of side effects that are associated with specific patients. Otherwise you just don’t have the data necessary to construct the model.
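A toy sketch of that scenario (Python; the drugs, observations, and side-effect rates are all made up): with a uniform prior over the possible patient-drug assignments, the per-patient side-effect measurements let you partially reverse-engineer the association.

```python
import itertools

# Toy version of the trial: three drugs were handed out, one per patient,
# but we don't know which patient got which. We have a per-patient
# side-effect observation and an assumed prior side-effect rate for each
# drug (all numbers here are hypothetical).
drugs = ("A", "B", "C")
side_effect = (True, False, True)      # observed, per patient
rate = {"A": 0.9, "B": 0.1, "C": 0.5}  # assumed P(side effect | drug)

# Uniform prior over assignments, so the posterior is proportional to the
# likelihood of the observations under each assignment.
weights = {}
for assignment in itertools.permutations(drugs):  # patient i gets assignment[i]
    lik = 1.0
    for drug, observed in zip(assignment, side_effect):
        p = rate[drug]
        lik *= p if observed else 1.0 - p
    weights[assignment] = lik

total = sum(weights.values())
for assignment, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(assignment, round(w / total, 3))
```

Note that in this toy case two assignments end up tied for most probable, so the data narrow down the association without pinning it down completely.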
Exactly! We want to incorporate the association information using Bayes’ theorem. If you have zero information about the mapping, then your knowledge is invariant under permutations of the data sets (e.g., swapping T0 with T1). That implies that your prior over the associations is uniform over the possible permutations (note that a permutation uniquely specifies an association and vice versa). So, when calculating the correlation, you have to average over all permutations, and the correlation turns out to be identically zero for all possible data. No association means no correlation.
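Here’s a minimal numerical check of that claim (Python, arbitrary data): averaging the sample correlation uniformly over every possible pairing gives zero, even though the “true” pairing is strongly correlated.

```python
import itertools
import numpy as np

# Arbitrary illustrative data: strongly correlated under the "true" pairing.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 1.9, 3.1, 4.2])

# With zero information about the pairing, the prior over associations is
# uniform over permutations, so we average the correlation over all of them.
corrs = [np.corrcoef(x, y[list(perm)])[0, 1]
         for perm in itertools.permutations(range(len(y)))]

print(np.corrcoef(x, y)[0, 1])  # ~0.99 under the true pairing
print(np.mean(corrs))           # 0, up to floating-point error
```

(The standard deviations are invariant under permutation, so averaging the correlation is the same as averaging the covariance, and each value of y averages to the mean of y in every slot; that is why the result is exactly zero rather than merely small.)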
So in the zero-information case, we get this weird behavior that isn’t what we expect. If the zero-information case doesn’t work, then we can’t expect to get correct answers with only partial information about the associations. We can expect similar strangeness when trying to deal with partial information based on priors about the side effects caused by our hypothetical drug.
If we don’t have enough information to construct the model, then our analysis should yield inconclusive results, not weird or backward results. So the problem is to figure out the right way to handle association information.
Yes, but this is a completely different matter from the one in your original post. Obviously this is how we should handle the weird state of information that you’re constructing, but it doesn’t have the causal interpretation you give it. You are doing something, but it isn’t causal analysis. Also, in the scenario you describe, you have the association information, so you should be using it.