If you apply this to your prior probability as well as the evidence, this should generally move your probabilities towards 1⁄2.
This looks wrong to me. You can write your priors as log-odds, and your pieces of evidence as several log-likelihood ratios, but while it’s fairly obvious to me that your meta-uncertainty over the log-likelihoods sends the extra evidence toward 0, and thus the overall probability toward the prior, I don’t see at all why it makes sense to do something analogous to the log-odds prior, sending that to 0 and thus the overall probability to 0.5.
What’s going on? Is the argument something like “well I have one possibility and then not-that-possibility, so if I look purely at the structure I should say ‘two possibilities, symmetric, 50/50!’”? I think that works if you generate all possibilities in estimations like this uniformly (esp. a possibility and its complement)? Anyway, IMO it’s a much stricter “outside view” to send your priors to 0.5 than it is to send your evidence to 0.
It might help to work an example.
Suppose we are interested in an event B with prior probability P(B) = 1⁄2, which is prior log-odds L(B) = 0 (measuring log-odds in bits), and we have observed evidence E worth 1 bit, so L(B|E) = 1 and P(B|E) = 2⁄3 ~= .67. But if we are meta-uncertain about the strength of the evidence E, assigning probability 1⁄2 that it is worth 0 bits and probability 1⁄2 that it is worth 2 bits, then the expected log-odds is still EL(B|E) = 1, but the expected probability is EP(B|E) = (1/2)*(1/2) + (1/2)*(4/5) = (.5 + .8)/2 = .65, decreasing towards 1⁄2 from P(B|E) ~= .67.
But what if instead the prior probability were P(B) = 1⁄5, or L(B) = −2? Then, with the same evidence and the same meta-uncertainty, EL(B|E) = L(B|E) = −1, P(B|E) = 1⁄3 ~= .33, and EP(B|E) = .35, this time increasing towards 1⁄2.
Note that this did not even require meta-uncertainty over the prior; only the uncertainty over the total posterior log-odds matters. Also note that even though the uncertainty moves the expected probability towards 1⁄2, it does not move the expected log-odds towards 0.
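In case it helps to check the arithmetic, here is a minimal Python sketch of the example above (the function names are mine, just for illustration; log-odds are in bits):

```python
def prob_from_logodds_bits(L):
    # Probability corresponding to log-odds L measured in bits: P = 2^L / (1 + 2^L).
    return 2**L / (1 + 2**L)

def expected_posterior(prior_bits, evidence_dist):
    # evidence_dist: list of (probability, bits_of_evidence) pairs describing
    # the meta-uncertainty over how strong the evidence E is.
    return sum(p * prob_from_logodds_bits(prior_bits + bits) for p, bits in evidence_dist)

# Evidence worth 0 bits or 2 bits with equal probability, so 1 bit in expectation.
meta_uncertain_E = [(0.5, 0.0), (0.5, 2.0)]

# Prior P(B) = 1/2, i.e. prior log-odds of 0 bits.
print(prob_from_logodds_bits(0 + 1))             # 0.666... = P(B|E) with exactly 1 bit
print(expected_posterior(0, meta_uncertain_E))   # 0.65, pulled down toward 1/2

# Prior P(B) = 1/5, i.e. prior log-odds of -2 bits.
print(prob_from_logodds_bits(-2 + 1))            # 0.333... = P(B|E) with exactly 1 bit
print(expected_posterior(-2, meta_uncertain_E))  # 0.35, pulled up toward 1/2
```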
Note that your observation does not generalize to more complex log-odds distributions. Here is a simple counterexample:
Let’s say that L(B|E)=1+x with chance 2⁄3, and L(B|E)=1-2x with chance 1⁄3. It still holds that EL(B|E)=1. But the expected probability EP(B|E) is now not a monotone function of x. It has a global minimum at x=2.
x    EP(B|E)
0    0.666667
1    0.644444
2    0.629630
3    0.637552
4    0.649049
5    0.657060
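The table is easy to reproduce; here is a small Python sketch (again, the function names are just illustrative):

```python
def prob_from_logodds_bits(L):
    return 2**L / (1 + 2**L)   # same helper as in the earlier sketch

def expected_posterior_prob(x):
    # L(B|E) = 1 + x with probability 2/3 and 1 - 2x with probability 1/3,
    # so the expected log-odds EL(B|E) = 1 for every x.
    return (2/3) * prob_from_logodds_bits(1 + x) + (1/3) * prob_from_logodds_bits(1 - 2*x)

for x in range(6):
    print(x, expected_posterior_prob(x))
# roughly 0.6667, 0.6444, 0.6296 (the minimum), 0.6376, 0.6490, 0.6571
```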
Indeed. It looks like the effect I described occurs when the meta-uncertainty is over a small range of log-odds values relative to the posterior log-odds, and there is another effect that can produce arbitrary expected probabilities given the right distribution over an arbitrarily large range of values. For any probability p, let L(B|E) = average + (1-p)*x with probability p and L(B|E) = average - p*x with probability (1-p); then the limit of the expected probability as x approaches infinity is p.
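A quick numerical check of that limit, as a sketch (p = 0.9 and avg = 1 below are arbitrary choices to illustrate it):

```python
def prob_from_logodds_bits(L):
    return 2**L / (1 + 2**L)

def expected_posterior_prob(p, avg, x):
    # L(B|E) = avg + (1-p)*x with probability p, and avg - p*x with probability (1-p);
    # the expected log-odds is avg for every x.
    return (p * prob_from_logodds_bits(avg + (1 - p) * x)
            + (1 - p) * prob_from_logodds_bits(avg - p * x))

for x in (1, 10, 100, 1000):
    print(x, expected_posterior_prob(p=0.9, avg=1, x=x))
# approaches p = 0.9 as x grows, regardless of avg
```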
It has a global minimum at x=2.
I notice that this is where |1 + x| = |1 − 2x|. That might be interesting to look into.
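One way to see why (my derivation, not from the thread), with a quick numerical check: the derivative of EP(B|E) with respect to x vanishes exactly where the two branches sit at log-odds of equal magnitude.

```python
def prob_from_logodds_bits(L):
    return 2**L / (1 + 2**L)

def EP(x):
    # The mixture from the counterexample above.
    return (2/3) * prob_from_logodds_bits(1 + x) + (1/3) * prob_from_logodds_bits(1 - 2*x)

# Setting dEP/dx = 0 gives (2/3)*sigma'(1+x) = (2/3)*sigma'(1-2x), and sigma'(L)
# depends only on |L|, so the stationary points sit exactly where |1+x| = |1-2x|,
# i.e. at x = 0 (a local maximum) and x = 2 (the minimum).
eps = 1e-6
for x0 in (0.0, 2.0):
    print(x0, (EP(x0 + eps) - EP(x0 - eps)) / (2 * eps))  # numerical derivative ~ 0
```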
(Possibly more rigorous and explicit math to follow when I can focus on it more.)
I let L(B|E) be uniform from x − s/2 to x + s/2 and got that P(B|E) = (1/s) * ln((1 + A*e^(s/2)) / (1 + A*e^(-s/2))),
where A is the odds if L(B|E) = x.

In the limit as s goes to infinity, it looks like the interesting pieces are a term that’s the log of the odds A divided by s, which drops off as s grows, plus a term that eventually looks like (1/s)*ln(e^(s/2)) = 1/2, which means we approach 1⁄2.

Oh I see, I thought you were saying something completely different. :D Yes, it looks like, keeping the expectation of the evidence constant, the final probability will be closer to 0.5 the larger the variance of the evidence. I thought you were talking about what our priors should be on how much evidence we will tend to receive for propositions in general from things we intuit as one source, or something.
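For completeness, a short sketch of the uniform-log-odds calculation a couple of paragraphs up, assuming natural log-odds to match the e^(s/2) there (x = -3 below is an arbitrary central value); it evaluates the closed form and shows the drift toward 0.5 as s grows:

```python
from math import exp, log

def expected_prob_uniform_logodds(x, s):
    # L(B|E) uniform on [x - s/2, x + s/2], treated as natural log-odds, gives
    # E[P(B|E)] = (1/s) * ln((1 + e^(x + s/2)) / (1 + e^(x - s/2))).
    return (log(1 + exp(x + s / 2)) - log(1 + exp(x - s / 2))) / s

for s in (1, 10, 100, 1000):
    print(s, expected_prob_uniform_logodds(x=-3, s=s))
# creeps toward 0.5 as s grows, whatever the central log-odds x is
```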