That’s probably a better way of putting it. I’m trying to intuitively capture the idea of “no expected evidence”, and you can frame that in multiple ways.
Huh? E[X] = 1 and E[log(X)] = 0 are two very different claims; which one are you actually claiming?
Also, what is the expectation with respect to? Your prior or the data distribution or something else?
I’m claiming the second. I was framing it in my mind as “on average, the factor will be 1”, but on further thought the kind of “average” required is the average of the log. I should probably use log in the future for statements like that.
The prior.
This seems wrong then. Imagine you have two hypotheses on which you place equal probability, and you will then see an observation that definitively selects one of the two as correct. E[p(x)] = 1⁄2 both before and after the observation, but E[log p(x)] is −1 before vs. −infinity after (using base-2 logs).
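To make the two expectations concrete, here is a minimal sketch of that example. It assumes base-2 logs and takes p to be the probability assigned to one fixed hypothesis; the helper names `safe_log2` and `expect` are made up for illustration.

```python
import math

def safe_log2(p: float) -> float:
    """log2 that returns -inf at p == 0 (math.log2(0) would raise ValueError)."""
    return math.log2(p) if p > 0 else float("-inf")

def expect(dist, f=lambda p: p):
    """Expectation of f(p) over a list of (weight, p) pairs."""
    return sum(w * f(p) for w, p in dist)

# Before the observation: the hypothesis has probability 1/2 for certain.
before = [(1.0, 0.5)]
# After: it is either confirmed (p = 1) or ruled out (p = 0), equally likely.
after = [(0.5, 1.0), (0.5, 0.0)]

print("E[p]     before/after:", expect(before), expect(after))
print("E[log p] before/after:", expect(before, safe_log2), expect(after, safe_log2))
```

The expected probability is unchanged by the observation, while the expected log is dragged to minus infinity by the ruled-out branch.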
In that case, your Bayes Factor will be either 2⁄0, or 0⁄2.
Log of the first is infinity, log of the second is negative infinity.
The average of those two numbers is (insert handwave here) 0.
(If you use the formula for the log of a quotient, log(a/b) = log a − log b, the infinite terms cancel and this actually works.)
Replace 1⁄2 and 1⁄2 in the prior with 1⁄3 and 2⁄3, and I don’t think you can make them cancel anymore.
I think we need to use actual limits then, instead of handwaving infinities. So let’s say the priors are 1⁄3 and 2⁄3, and the posterior for the unfavored hypothesis is e → 0 (and is the same for both sides). The Bayes factor for the first hypothesis being confirmed is then (1-e)*3/(3e/2), which simplifies (http://www.wolframalpha.com/input/?i=%281-e%29*3%2F%283e%2F2%29) to 2/e − 2. The Bayes factor for the second being confirmed is 3e/((1-e)3/2), which again simplifies (http://www.wolframalpha.com/input/?i=3e%2F%28%281-e%293%2F2%29) to (2e)/(1-e).
Now, let me digress and derive the probability of finding evidence for each hypothesis; it’s almost but not quite 1/3:2/3. There’s a prior of 1⁄3 of the first hypothesis being true; this must equal the weighted expectation of the posteriors, by conservation of evidence. So if we call x the chance of finding evidence for hypothesis one, then x*(1-e)+(1-x)*e must equal 1⁄3. Solving for x (http://www.wolframalpha.com/input/?i=x*%281-e%29%2B%281-x%29*e%3D1%2F3+solve+for+x) gives
x = (1-3 e)/(3-6 e)
which, as a sanity check, does in fact head towards 1⁄3 as e goes towards 0. The corresponding probability of finding evidence for the second hypothesis is 1-x = (2-3 e)/(3-6 e).
Getting back to expected logs of Bayes factors, the chance of getting a Bayes factor of 2/e − 2 is (1-3 e)/(3-6 e), while the chance of getting (2e)/(1-e) is (2-3 e)/(3-6 e).
Log of the first, times its probability, plus log of the second, times its probability, is not zero (evaluated at e = .01: http://www.wolframalpha.com/input/?i=log+%282%2Fx+-+2%29*+%281-3+x%29%2F%283-6+x%29%2Blog%28%282x%29%2F%281-x%29%29*+%282-3+x%29%2F%283-6+x%29%2Cx%3D.01).
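For anyone who wants to check this without WolframAlpha, here is a rough sketch that plugs the closed forms above into Python. It assumes natural logs and the 1⁄3 : 2⁄3 prior; the function names are just for illustration.

```python
import math

def evidence_prob(e: float) -> float:
    """Chance x of finding evidence for hypothesis one, from x*(1-e) + (1-x)*e = 1/3."""
    return (1 - 3 * e) / (3 - 6 * e)

def expected_log_bayes_factor(e: float) -> float:
    """x * log(2/e - 2) + (1 - x) * log(2e / (1 - e)), in natural-log units."""
    x = evidence_prob(e)
    k1 = 2 / e - 2            # Bayes factor when hypothesis one is confirmed
    k2 = (2 * e) / (1 - e)    # Bayes factor when hypothesis two is confirmed
    return x * math.log(k1) + (1 - x) * math.log(k2)

for e in (1e-2, 1e-4, 1e-6):
    x = evidence_prob(e)
    expected_posterior = x * (1 - e) + (1 - x) * e  # should stay at 1/3
    print(e, expected_posterior, expected_log_bayes_factor(e))
```

At e = 0.01 the expected log comes out at roughly −0.87 rather than 0, and it keeps falling as e shrinks, while the expected posterior stays pinned at the prior.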
Hm. I’ll need to think this over, this wasn’t what I expected. Either I made some mistake, or am misunderstanding something here. Let me think on this for a bit.
Hopefully I’ll update this soon with an answer.
I think it’s not going to work out. The expected posterior is equal to the prior, but the expected log Bayes factor will have the form p log(K1) + (1-p) log(K2), which for general p is just a mess. Only when p=1/2 does it simplify to (1/2) log(K1 K2), and when p=1/2, K2=1/K1, so the whole thing is zero.
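A quick way to see how the prior matters is to let it vary. This is a sketch under the same setup as the limit argument in this thread (two hypotheses, posterior e for whichever one loses); the `prior` parameter and the function name are hypothetical, and it needs e < prior < 1 − e.

```python
import math

def expected_log_bf(prior: float, e: float) -> float:
    """p*log(K1) + (1-p)*log(K2) for a two-hypothesis test with loser posterior e."""
    # Chance of seeing evidence for hypothesis one, from p*(1-e) + (1-p)*e = prior.
    p = (prior - e) / (1 - 2 * e)
    # Bayes factor = (posterior/prior for hypothesis one) / (same for hypothesis two).
    k1 = ((1 - e) / prior) / (e / (1 - prior))
    k2 = (e / prior) / ((1 - e) / (1 - prior))
    return p * math.log(k1) + (1 - p) * math.log(k2)

for prior in (1 / 3, 1 / 2, 2 / 3):
    print(prior, expected_log_bf(prior, 0.01))
```

Only the prior = 1/2 case comes out at zero (up to rounding); 1/3 and 2/3 give values of equal size and opposite sign.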
Okay, so I think I worked out where my failed intuition got it from. The Bayes factor is the ratio of posterior/prior for hypothesis A, divided by the same ratio for hypothesis B. The top of that is expected to be 1 (because the expected posterior over the prior is one; factoring out the prior in each case keeps that fraction constant), and so is the bottom (same argument), but the expected ratio of two numbers that are each expected to be one is not always one. So my brain turned “denominator and numerator one” into “ratio one”.
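That last step can be checked numerically. A small sketch, assuming the same 1⁄3 : 2⁄3 prior and e = 0.01 as in the limit argument; all variable names are illustrative.

```python
prior_a, prior_b = 1 / 3, 2 / 3
e = 0.01
# Chance of finding evidence for A, from conservation of expected evidence.
x = (prior_a - e) / (1 - 2 * e)

# Each outcome: (probability, posterior of A, posterior of B).
outcomes = [
    (x, 1 - e, e),        # evidence favors A
    (1 - x, e, 1 - e),    # evidence favors B
]

expected_top = sum(w * (pa / prior_a) for w, pa, pb in outcomes)
expected_bottom = sum(w * (pb / prior_b) for w, pa, pb in outcomes)
expected_ratio = sum(w * (pa / prior_a) / (pb / prior_b) for w, pa, pb in outcomes)

print(expected_top, expected_bottom, expected_ratio)
```

The first two numbers come out at 1 up to rounding, while the expected ratio is far above 1, because the rare outcome confirming A carries a huge Bayes factor.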