average log odds could make sense in the context in which there is a uniform prior
This is something I have heard from other people too, and I still cannot make sense of it. Why would questions where uninformed forecasters produce uniform priors make log-odds averaging work better?
A tendency for the questions asked to have priors near 50%, according to the typical unknowledgeable person, would explain why more knowledgeable forecasters would assign more extreme probabilities on average: it takes more expertise to justifiably bring their probabilities further from 50%.
I don’t understand your point. Why would forecasters care about what other people would do? They only want to maximize their own score.
If A, B, and C are mutually exclusive, then they can’t all have 50% prior probability, so a pooling method that implicitly assumes that they do will not give coherent results.
This also doesn’t make much sense to me, though it might be because I still don’t understand the point about needing uniform priors for log-odds pooling.
Different implicit priors don’t appear to be ruining anything.
Neat!
I conclude that the incoherent results in my ABC example cannot be blamed on switching between the uniform prior on {A,B,C} and the uniform prior on {A,¬A}; instead, they should be blamed entirely on the experts having different beliefs conditional on ¬A, which is taken into account in the calculation using {A,B,C} but not in the calculation using {A,¬A}.
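This is easy to check numerically. Here is a minimal sketch in Python (the expert probabilities are made up for illustration, and the renormalized geometric mean of probability vectors is used as the multi-outcome analogue of averaging log odds): two experts who agree on P(A) but disagree conditional on ¬A pool incoherently across the two framings, while experts who also agree conditional on ¬A pool coherently.

```python
import numpy as np

def pool_binary_log_odds(ps):
    """Average the experts' log odds for A vs. not-A, then convert back."""
    ps = np.asarray(ps)
    mean_log_odds = np.mean(np.log(ps / (1 - ps)))
    return 1 / (1 + np.exp(-mean_log_odds))

def pool_multi_geometric(prob_vectors):
    """Renormalized geometric mean of probability vectors
    (the multi-outcome analogue of averaging log odds)."""
    g = np.exp(np.mean(np.log(np.asarray(prob_vectors)), axis=0))
    return g / g.sum()

# Hypothetical experts over {A, B, C}: same P(A), opposite beliefs given not-A.
e1 = [0.4, 0.5, 0.1]
e2 = [0.4, 0.1, 0.5]
print(pool_binary_log_odds([0.4, 0.4]))  # 0.4
print(pool_multi_geometric([e1, e2]))    # P(A) ~ 0.472, incoherent with the 0.4 above

# Same P(A), but now they also agree conditional on not-A.
print(pool_multi_geometric([[0.4, 0.3, 0.3], [0.4, 0.3, 0.3]]))  # [0.4, 0.3, 0.3], coherent
```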
Why would questions where uninformed forecasters produce uniform priors make log-odds averaging work better?
Because it produces situations where more extreme probability estimates correlate with more expertise (assuming all forecasters are well-calibrated).
I don’t understand your point. Why would forecasters care about what other people would do? They only want to maximize their own score.
They wouldn’t. But if both forecasters would have started with priors around 50% before they acquired any of their expertise, and it’s their expertise that updates them away from 50%, then more expertise is required to reach more extreme odds. If the probability is a martingale that starts at 50%, and the time axis is taken to be expertise, then more extreme probabilities will, on average, be sampled from later in the martingale, i.e. with more expertise.
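The martingale picture is straightforward to simulate. In the rough sketch below (the signal model and its parameters are illustrative assumptions, not anything from this thread), a calibrated Bayesian starts at the 50% prior and updates on noisy signals; the posterior is a martingale, and its expected distance from 50% grows with the number of signals seen, i.e. with expertise.

```python
import numpy as np

rng = np.random.default_rng(0)

def extremeness_by_expertise(n_paths=20_000, n_steps=50, acc=0.6):
    """Posterior P(H) under Bayesian updating, starting from P(H) = 1/2.
    Each signal independently points to the truth with probability `acc`,
    so each observation shifts the log odds by +/- log(acc / (1 - acc)).
    Returns the mean |P(H) - 0.5| after each number of signals seen."""
    llr = np.log(acc / (1 - acc))
    truth = rng.random(n_paths) < 0.5     # hidden truth on each sample path
    p_up = np.where(truth, acc, 1 - acc)  # probability the signal favours H
    log_odds = np.zeros(n_paths)
    out = []
    for _ in range(n_steps):
        up = rng.random(n_paths) < p_up
        log_odds += np.where(up, llr, -llr)
        p = 1 / (1 + np.exp(-log_odds))
        out.append(np.abs(p - 0.5).mean())
    return out

ext = extremeness_by_expertise()
print(ext[4], ext[19], ext[49])  # average extremeness rises with expertise
```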
This also doesn’t make much sense to me, though it might be because I still don’t understand the point about needing uniform priors for log-odds pooling.
If log-odds pooling implicitly assumes a uniform prior, then log-odds pooling on A vs ¬A assumes A has a prior probability of 1/2, and log-odds pooling on A vs B vs C assumes A has a prior of 1/3, which, if the implicit prior actually mattered, could explain the different results.
I agree with this.
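One way to see why the implicit prior ends up not mattering: equal-weight geometric pooling (which is what averaging log odds amounts to) is externally Bayesian, meaning it commutes with Bayesian updating on a shared prior, so any common prior factors out of the pooled result. A quick numerical check, with made-up likelihood vectors:

```python
import numpy as np

def geo_pool(vs):
    """Renormalized geometric mean (equal-weight log-odds pooling)."""
    g = np.exp(np.mean(np.log(np.asarray(vs)), axis=0))
    return g / g.sum()

def bayes(prior, likelihood):
    """Posterior proportional to prior times likelihood."""
    post = np.asarray(prior) * np.asarray(likelihood)
    return post / post.sum()

# Hypothetical likelihood vectors for two experts over {A, B, C}.
l1 = np.array([2.0, 1.0, 0.5])
l2 = np.array([1.5, 0.5, 2.0])

for prior in ([1/3, 1/3, 1/3], [0.6, 0.3, 0.1]):
    pooled_posteriors = geo_pool([bayes(prior, l1), bayes(prior, l2)])
    updated_pooled    = bayes(prior, geo_pool([l1, l2]))
    print(np.allclose(pooled_posteriors, updated_pooled))  # True for both priors
```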