So are there some facts about conditional independencies that would privilege the intended mapping? Here is one option.
We believe that A’ and C’ should be independent conditioned on B’. One problem is that this isn’t even true, because B’ is a coarse-graining, and so there are in fact correlations between A’ and C’ (even conditional on B’) that the human doesn’t understand. That said, I think that the bad map introduces further conditional correlations, even assuming B=B’. For example, if you imagine Y preserving some facts about A’ and C’, and if the human is sometimes mistaken about B’=B, then we will introduce extra correlations between the human’s beliefs about A’ and C’.
I think it’s pretty plausible that there are necessarily some “new” correlations in any case where the human’s inference is imperfect, but I’d like to understand that better.
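As a sanity check on that intuition, here is a toy numerical sketch (my own construction, not part of the original argument, with made-up noise rates): in an exact chain A --> B --> C, A and C are independent given B, but conditioning on a noisy proxy B’ of B, standing in for the human’s imperfect reconstruction, leaves residual correlation between A and C.

```python
# Toy check (not from the post): exact conditional independence in A --> B --> C,
# versus residual correlation once we condition on a noisy proxy B' instead of B.
# The 0.5 prior and the noise rates eps_chain / eps_belief are made-up choices.
import itertools
import math

def p_bit(value, parent, eps):
    """Probability that a noisy copy of `parent` (flip prob eps) equals `value`."""
    return (1 - eps) if value == parent else eps

eps_chain, eps_belief = 0.1, 0.2
joint = {}  # (a, b, c, bp) -> probability, where bp is the believed/proxy B'
for a, b, c, bp in itertools.product([0, 1], repeat=4):
    joint[(a, b, c, bp)] = (0.5
                            * p_bit(b, a, eps_chain)      # B is a noisy copy of A
                            * p_bit(c, b, eps_chain)      # C is a noisy copy of B
                            * p_bit(bp, b, eps_belief))   # B' is a noisy copy of B

def marginal(dist, idxs):
    out = {}
    for key, p in dist.items():
        k = tuple(key[i] for i in idxs)
        out[k] = out.get(k, 0.0) + p
    return out

def cond_mutual_info(dist, ia, ic, iz):
    """I(A; C | Z) in nats, computed exactly from the joint table."""
    p_acz = marginal(dist, [ia, ic, iz])
    p_az, p_cz, p_z = marginal(dist, [ia, iz]), marginal(dist, [ic, iz]), marginal(dist, [iz])
    return sum(p * math.log(p * p_z[(z,)] / (p_az[(a, z)] * p_cz[(c, z)]))
               for (a, c, z), p in p_acz.items() if p > 0)

print("I(A; C | B)  =", round(cond_mutual_info(joint, 0, 2, 1), 6))  # ~0: holds exactly
print("I(A; C | B') =", round(cond_mutual_info(joint, 0, 2, 3), 6))  # > 0: extra correlation
```

The second quantity being strictly positive is the kind of “extra” conditional correlation described above; it shows up as soon as the conditioning variable is an imperfect stand-in for B.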
So I think the biggest problem is that none of the human’s believed conditional independencies actually hold—they are both imprecise, and (more problematically) they may themselves only hold “on distribution” in some appropriate sense.
This problem seems pretty approachable though and so I’m excited to spend some time thinking about it.
Actually, if A --> B --> C and I observe some function of (A, B, C), it’s just not generally the case that my beliefs about A and C are conditionally independent given my beliefs about B (e.g. suppose I observe A+C). This just makes it even easier to avoid the bad function in this case, but means I want to be more careful about the definition of the case to ensure that it’s actually difficult before concluding that this kind of conditional independence structure is potentially useful.
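To make the A+C example concrete, here is a small self-contained check (the noise rate is a made-up toy choice): enumerate a binary chain A --> B --> C, condition on observing S = A + C, and compare the joint posterior over (A, C) with the product of its marginals.

```python
# Self-contained toy example: observe S = A + C in a binary chain A --> B --> C,
# then compare the joint posterior over (A, C) with the product of its marginals.
# They differ, so the posterior beliefs about A and C are not independent, even
# though the belief about B is itself pinned down by the same observation S.
import itertools

eps = 0.1  # made-up noise rate for each link of the chain
joint = {}  # (a, b, c) -> probability
for a, b, c in itertools.product([0, 1], repeat=3):
    p_b = (1 - eps) if b == a else eps   # B is a noisy copy of A
    p_c = (1 - eps) if c == b else eps   # C is a noisy copy of B
    joint[(a, b, c)] = 0.5 * p_b * p_c

s_obs = 1  # the observation: A + C = 1
posterior = {k: p for k, p in joint.items() if k[0] + k[2] == s_obs}
total = sum(posterior.values())
posterior = {k: p / total for k, p in posterior.items()}

def marginal(dist, idxs):
    out = {}
    for key, p in dist.items():
        k = tuple(key[i] for i in idxs)
        out[k] = out.get(k, 0.0) + p
    return out

p_ac = marginal(posterior, [0, 2])
p_a, p_c = marginal(posterior, [0]), marginal(posterior, [2])
for (a, c), p in sorted(p_ac.items()):
    print(f"P(A={a}, C={c} | S=1) = {p:.3f}   vs   "
          f"P(A={a} | S=1) * P(C={c} | S=1) = {p_a[(a,)] * p_c[(c,)]:.3f}")
```

Here the dependence comes purely from the observation channel, which is exactly why the case needs to be defined carefully before this kind of conditional independence structure can do any work.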
Sometimes we figure out the conditional in/dependence by looking at the data. It may not match common-sense intuition, but if your model takes it into account and gives better results, then you just keep that conditional independence in the model. You can only work with what you have: a lack of attributes may force you to rely on other dependencies for better predictions.
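For what it’s worth, here is a rough sketch of what “looking at the data” can mean here (my own toy setup, assuming binary variables and scipy; sample size, noise rate, and seed are arbitrary): stratify on B and run a chi-square test of A against C inside each stratum.

```python
# Rough sketch of testing "A independent of C given B" from samples: stratify on
# B and run a chi-square test of A vs C within each stratum.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n, eps = 50_000, 0.1
a = rng.integers(0, 2, n)
b = np.where(rng.random(n) < eps, 1 - a, a)   # B is a noisy copy of A
c = np.where(rng.random(n) < eps, 1 - b, b)   # C is a noisy copy of B

for b_val in (0, 1):
    mask = b == b_val
    # 2x2 contingency table of (A, C) within this stratum of B
    table = np.bincount(2 * a[mask] + c[mask], minlength=4).reshape(2, 2)
    chi2, p_value, _, _ = chi2_contingency(table)
    print(f"stratum B={b_val}: chi2 = {chi2:.2f}, p = {p_value:.3f}")
# The data really do satisfy the conditional independence, so the tests should
# show no systematic evidence of dependence; rerunning them with a noisy proxy
# of B in place of B would change that.
```

With a non-binary or continuous B the same idea goes through with a different test (partial correlation, or a kernel-based conditional independence test), but the point is the same: test A against C inside slices of B.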
Conditional probabilities should be reflected in the data if given enough data points. When you introduce human labeling into the equation, you are adding another uncertainty: the accuracy of the human doing the labeling, regardless of whether the inaccuracy comes from their own false sense of conditional independence. Usually human labeling doesn’t directly take conditional probabilities into account, so as not to mess with the conditionals that already exist within the data set. That’s why the more data the better, which also means the more labelers you have, the less dependent you are on the inaccuracy of any individual human.
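That last point can be illustrated with a quick simulation (toy error rate and counts of my own choosing): if labelers err independently, majority voting over more of them recovers the true label more reliably than any single labeler does.

```python
# Quick simulation of the "more labelers" point: each labeler independently
# mislabels with probability p_error; majority voting over more labelers gives
# a more accurate aggregate label.
import numpy as np

rng = np.random.default_rng(0)
n_items, p_error = 10_000, 0.2
truth = rng.integers(0, 2, n_items)

for n_labelers in (1, 3, 9, 25):
    flips = rng.random((n_labelers, n_items)) < p_error   # each labeler's mistakes
    labels = np.where(flips, 1 - truth, truth)
    majority = (labels.mean(axis=0) > 0.5).astype(int)    # odd counts, so no ties
    accuracy = (majority == truth).mean()
    print(f"{n_labelers:>2} labelers: majority-vote accuracy = {accuracy:.3f}")
```

The independence of the labelers’ errors is doing all the work here; if every labeler shares the same false sense of conditional independence, adding more of them won’t wash that particular error out.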