Basic statistics question: if we find that 99% of all people are irrational, but “only” 90% of millionaires are irrational, is that evidence that rationality does lead to (an increased probability of) winning, or is it only evidence that rationality is correlated with winning? For instance, how do I know that millionaires aren’t more rational simply because they can afford to go to CFAR workshops and have more free time to read LessWrong?
I.e. knowing only that 99% of all people are A but “only” 90% of millionaires are A, how do I adjust my respective probabilities that:
A → millionaires
Millionaires → A
Unknown factor C causes both A and millionaires
It feels like I ought to assign some additional likelihood to each of these 3 cases, but I’m not sure how to split it up. Maybe the answer is simply, “gather more evidence to attempt to tease out the proper causal relationship”.
This is a causal question, not a statistical question. You answer it by implementing the relevant intervention, usually by randomization, or maybe by finding a natural experiment, or maybe [lots of other ways people thought of].
You can’t in general use observational data (i.e. what you call “evidence”) to figure out causal relationships. You need causal assumptions somewhere.
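What the observational numbers do pin down, with no causal assumptions at all, is the strength of the bare association. A quick Bayes’-rule check, a sketch using only the 99% and 90% figures from the question:

```python
# Bayes' rule on the stated numbers alone -- no causal claim involved:
# how much likelier is a rational person to be a millionaire than average?
p_rational = 1 - 0.99                 # 1% of all people are rational
p_rational_given_mill = 1 - 0.90      # 10% of millionaires are rational

# P(millionaire | rational) / P(millionaire)
#   = P(rational | millionaire) / P(rational)
print(p_rational_given_mill / p_rational)   # 10.0 -- a tenfold enrichment
```

So rational people are ten times likelier than average to be millionaires, but this by itself says nothing about which of the three arrows produces the association.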
What do you think of this challenge of detecting causality from nothing but a set of pairs of values of unnamed variables?
You can do it with enough causal assumptions (i.e. not “from nothing”). There is a series of magical papers, e.g. this:
http://www.cs.helsinki.fi/u/phoyer/papers/pdf/hoyer2008nips.pdf
which show you can use additive noise assumptions to orient edges.
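To make the additive-noise idea concrete, here is a minimal sketch (my own illustration under simplifying assumptions, not the algorithm from the paper): fit a regression in each direction and check whether the residuals look independent of the input. Under a model y = f(x) + n with n independent of x, only the causal direction should pass the check; the binned dependence score below is a crude stand-in for a proper independence test such as HSIC.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 5000)
y = x ** 3 + rng.uniform(-1, 1, 5000)   # true model: x -> y, additive noise

def dependence_score(inp, out, degree=5, bins=10):
    """Regress out on inp, then measure how much the residual
    distribution shifts across quantile bins of inp (~0 if independent)."""
    resid = out - np.polyval(np.polyfit(inp, out, degree), inp)
    edges = np.quantile(inp, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(inp, edges[1:-1]), 0, bins - 1)
    means = np.array([resid[idx == b].mean() for b in range(bins)])
    stds = np.array([resid[idx == b].std() for b in range(bins)])
    return means.std() + stds.std()

print("x -> y score:", dependence_score(x, y))   # small: residuals independent of x
print("y -> x score:", dependence_score(y, x))   # noticeably larger: wrong direction
```

The asymmetry is the whole trick: the additive-noise assumption makes the two directions distinguishable from purely observational data.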
I have a series of papers:
http://www.auai.org/uai2012/papers/248.pdf
http://arxiv.org/abs/1207.5058
which show you don’t even need conditional independences to orient edges. For example, if the true DAG is this:
1 → 2 → 3 → 4, 1 ← u1 → 3, 1 ← u2 → 4,
and we observe p(1, 2, 3, 4) (no conditional independences in this marginal), I can recover the graph exactly with enough data. (The graph would be causal if we assume the underlying true graph is, otherwise it’s just a statistical model).
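A quick simulation of this example (a hedged sketch; I’ve made it linear-Gaussian with arbitrary coefficients) lets you verify the “no conditional independences” claim directly: no pair of observed variables has a vanishing partial correlation given any subset of the others.

```python
from itertools import combinations
import numpy as np

# The graph: 1 -> 2 -> 3 -> 4, with hidden confounders u1 -> {1, 3}
# and u2 -> {1, 4}. Only x1..x4 are observed.
rng = np.random.default_rng(0)
n = 200_000
u1, u2 = rng.normal(size=n), rng.normal(size=n)
x1 = u1 + u2 + rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = 0.8 * x2 + u1 + rng.normal(size=n)
x4 = 0.8 * x3 + u2 + rng.normal(size=n)
data = {1: x1, 2: x2, 3: x3, 4: x4}

def partial_corr(a, b, cond):
    """Correlation of a and b after regressing out the conditioning set."""
    if cond:
        Z = np.column_stack([data[c] for c in cond] + [np.ones(n)])
        a = a - Z @ np.linalg.lstsq(Z, a, rcond=None)[0]
        b = b - Z @ np.linalg.lstsq(Z, b, rcond=None)[0]
    return np.corrcoef(a, b)[0, 1]

# Every pair, given every subset of the remaining observed variables:
for i, j in combinations(data, 2):
    rest = [k for k in data if k not in (i, j)]
    for r in range(len(rest) + 1):
        for cond in combinations(rest, r):
            rho = partial_corr(data[i], data[j], cond)
            print(f"corr({i},{j} | {cond}) = {rho:+.3f}")
```

Every printed value is clearly nonzero, so independence-based methods get no traction here, which is what makes the exact-recovery result surprising.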
People’s intuitions about what’s possible in causal discovery aren’t very good.
It would be good if statisticians and machine learning / comp. sci. people came together to hash out their differences regarding causal inference.
Gelman seems skeptical.
I saw that, but I didn’t see much substance to his remarks, nor in the comments.
Here is a paper surveying methods of causal analysis for such non-interventional data, and summarising the causal assumptions that they make:
“New methods for separating causes from effects in genomics data”, Alexander Statnikov, Mikael Henaff, Nikita I. Lytkin, Constantin F. Aliferis
It feels like I ought to assign some additional likelihood to each of these 3 cases, but I’m not sure how to split it up.
Two things:
1) Your prior probabilities. If before getting your evidence you expect that hypothesis H1 is twice as likely as H2, and the new evidence is equally likely under both H1 and H2, you should update so that the new H1 remains twice as likely as H2.
2) Conditional probabilities of the evidence under different hypotheses. Let’s suppose that hypothesis H1 predicts a specific piece of evidence E with probability 10%, and hypothesis H2 predicts E with probability 30%. After seeing E, the ratio between H1 and H2 should be multiplied by 1:3.
The first part means simply: before the (fictional) research about rationality among millionaires was done, what probability would you assign to each of your hypotheses?
The second part means: if we know that 99% of all people are irrational, what percentage of irrational millionaires would you expect, assuming that e.g. the first hypothesis, “rationality causes millionaires”, is true? Would you expect to see 95% or 90% or 80% or 50% or 10% or 1% of irrational millionaires? Make your probability distribution. Now do the same thing for each of the remaining hypotheses. -- Ta-da, the research is over and we know that the percentage of irrational millionaires is 90%, not more, not less. How good was each hypothesis at predicting this specific outcome?
(I don’t mean to imply that doing either of these estimates is easy. It is just the way it should be done.)
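For concreteness, here is a toy version of that procedure (every number below is invented purely for illustration; the real work is in choosing defensible priors and likelihoods):

```python
# Bayesian update over the three causal hypotheses, given the observed
# "90% of millionaires are irrational" result. All numbers are made up.
priors = {
    "rationality -> millionaires": 0.3,
    "millionaires -> rationality": 0.3,
    "common cause C":              0.4,
}
likelihoods = {   # P(observing the 90% figure | hypothesis) -- invented
    "rationality -> millionaires": 0.20,
    "millionaires -> rationality": 0.15,
    "common cause C":              0.10,
}

# Multiply prior by likelihood, then renormalise over the hypotheses.
joint = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(joint.values())
for h, p in joint.items():
    print(f"P({h} | data) = {p / total:.3f}")
```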
Maybe the answer is simply, “gather more evidence”
Gathering more evidence is always good (ignoring the costs of gathering it), but sometimes we need to make an estimate based on the data we already have.