Frequentists (or just about anybody involved in experimental work) report p-values, which are their main quantitative measure of evidence.
Evidence, as measured in log odds, has the nice property that evidence from independent sources can be combined by adding. Is there any way at all to combine p-values from independent sources? As I understand them, p-values are used to make a single binary decision to declare a theory supported or not, not to track cumulative strength of belief in a theory. They are not a measure of evidence.
Log odds of independent events do not add up, just as the odds of independent events do not multiply. The odds of flipping heads are 1:1, but the odds of flipping heads twice are not 1:1 (you have to multiply odds by likelihood ratios, not odds by odds, and likewise you don’t add log odds and log odds, but log odds and log likelihood-ratios). So calling log odds themselves “evidence” doesn’t fit the way people use the word “evidence” as something that “adds up”.
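A quick numerical sketch of both halves of that point (the biased-versus-fair coin setup is made up purely for illustration):

```python
from math import log

# Part 1: odds of independent events do not multiply, so their logs do
# not add. The odds of one head are 1:1; "multiplying odds by odds"
# gives 1:1 again, but the true odds of two heads are 1:3.
p_two_heads = 0.5 * 0.5
true_odds = p_two_heads / (1 - p_two_heads)      # 1/3, i.e. 1:3
odds_times_odds = (0.5 / 0.5) * (0.5 / 0.5)      # 1, i.e. 1:1 -- wrong
print(true_odds, odds_times_odds)

# Part 2: what does add is the log prior odds plus one log
# likelihood-ratio per independent observation. Hypothetical setup:
# H1 = the coin is fair, H2 = it lands heads 75% of the time,
# prior odds H1:H2 = 1:1, and we observe two heads.
prior = {"H1": 0.5, "H2": 0.5}
p_heads = {"H1": 0.5, "H2": 0.75}

# Exact posterior odds from Bayes' theorem:
exact_log_odds = log((prior["H1"] * p_heads["H1"] ** 2) /
                     (prior["H2"] * p_heads["H2"] ** 2))

# Log prior odds plus a log likelihood-ratio for each head:
llr_per_head = log(p_heads["H1"] / p_heads["H2"])
summed_log_odds = log(prior["H1"] / prior["H2"]) + 2 * llr_per_head

assert abs(exact_log_odds - summed_log_odds) < 1e-12
```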
This terminology may have originated here:
http://causalityrelay.wordpress.com/2008/06/23/odds-and-intuitive-bayes/
I’m voting your comment up, because I think it’s a great example of how terminology should be chosen and used carefully. If you decide to edit it, I think it would be most helpful if you left your original words as a warning to others :)
By “evidence”, I refer to events that change an agent’s strength of belief in a theory, and the measure of evidence is the measure of this change in belief, that is, the likelihood-ratio and log likelihood-ratio you refer to.
I never meant for “evidence” to refer to the posterior strength of belief. “Log odds” was only meant to specify a particular measurement of strength of belief.
Can you be clearer? Log likelihood ratios do add up, so long as the independence criterion is satisfied (i.e. so long as P(E_2|H_x) = P(E_2|E_1,H_x) for each H_x).
Sure, just edited in the clarification: “you have to multiply odds by likelihood ratios, not odds by odds, and likewise you don’t add log odds and log odds, but log odds and log likelihood-ratios”.
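Written out in the same notation (and assuming E_1 and E_2 are conditionally independent given each hypothesis), the rule being quoted is:

P(H_1|E_1,E_2) / P(H_2|E_1,E_2) = [P(H_1)/P(H_2)] × [P(E_1|H_1)/P(E_1|H_2)] × [P(E_2|H_1)/P(E_2|H_2)]

Taking logs, the posterior log odds are the prior log odds plus one log likelihood-ratio per piece of evidence, which is the sense in which the likelihood-ratio terms, not the odds, are what add.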
As long as there are only two H_x, mind you. They no longer add up when you have three hypotheses or more.
Indeed—though I find it very hard to hang on to my intuitive grasp of this!
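One way to make the three-hypothesis point concrete, if I’m reading it right (a rough sketch; the coin-bias hypotheses and priors are invented for the example): with only two hypotheses the likelihood ratio against “not H_1” is a fixed number per observation, but with three or more, “not H_1” is a mixture of the remaining hypotheses whose weights shift as evidence arrives, so the per-observation log likelihood-ratios against it are no longer constants you can simply add.

```python
from math import log

# Hypothetical example: three hypotheses about a coin's heads-probability,
# equal priors, and we then observe two heads in a row.
p_heads = {"H1": 0.5, "H2": 0.2, "H3": 0.8}
prior = {h: 1 / 3 for h in p_heads}

def posterior(n_heads):
    """Exact posterior over the three hypotheses after n_heads heads."""
    unnorm = {h: prior[h] * p_heads[h] ** n_heads for h in p_heads}
    z = sum(unnorm.values())
    return {h: w / z for h, w in unnorm.items()}

def log_odds_h1(dist):
    return log(dist["H1"] / (1 - dist["H1"]))

# Likelihood ratio of one head for H1 against the *prior-weighted*
# mixture of H2 and H3:
p_head_not_h1 = (prior["H2"] * p_heads["H2"] +
                 prior["H3"] * p_heads["H3"]) / (1 - prior["H1"])
llr_per_head = log(p_heads["H1"] / p_head_not_h1)

naive = log_odds_h1(prior) + 2 * llr_per_head   # "just add the log LRs"
exact = log_odds_h1(posterior(2))               # true log odds of H1

print(naive, exact)   # about -0.69 vs about -1.00: they differ, because the
                      # second head's likelihood ratio against "not H1" should
                      # use the post-first-head mixture of H2 and H3.
```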
Here is the post on information theory I said I would write:
http://lesswrong.com/lw/1y9/information_theory_and_the_symmetry_of_updating/
It explains “mutual information”, i.e. “informational evidence”, which can be added up over as many independent events as you like. Hopefully this will have restorative effects for your intuition!
Don’t worry, I have an information theory post coming up that will fix all of this :)
There are lots of papers on combining p-values.
Well, just looking at the first result, it gives a formula for combining n p-values that, as near as I can tell, lacks the property that C(p1,p2,p3) = C(C(p1,p2),p3). I suspect this is a result of unspoken assumptions that the combined p-values were obtained in a similar fashion (an assumption I violate by trying to combine a p-value already combined from two experiments with one obtained from a third experiment), which would be information not contained in the p-value itself. I am not sure of this because I did not completely follow the derivation.
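For what it’s worth, here is the kind of check I mean, assuming the formula in question is Fisher’s method (one standard way of combining p-values: refer -2·Σ ln p_i to a chi-squared distribution with 2n degrees of freedom); the p-values here are made up:

```python
from math import log
from scipy.stats import chi2

def fisher_combine(pvals):
    """Fisher's method: -2 * sum(ln p_i) is chi-squared with 2n degrees of freedom."""
    stat = -2 * sum(log(p) for p in pvals)
    return chi2.sf(stat, 2 * len(pvals))

# Three p-values of 0.05 from (hypothetically) independent experiments:
p1, p2, p3 = 0.05, 0.05, 0.05
all_at_once = fisher_combine([p1, p2, p3])                     # ~0.0063
two_then_one = fisher_combine([fisher_combine([p1, p2]), p3])  # ~0.0070

print(all_at_once, two_then_one)   # different, so C(C(p1,p2),p3) != C(p1,p2,p3)
```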
But is there a particular paper I should look at that gives a good answer?
I haven’t actually read any of that literature—Cox’s theorem suggests it would not be a wise investment of time. I was just Googling it for you.
Fair enough, though it probably isn’t worth my time either.
Unless someone claims that they have a good general method for combining p-values, such that it does not matter where the p-values come from, or in what order they are combined, and can point me at one specific method that does all that.