That doesn’t work, even when you are aggregating only a single probability estimate. The geometric mean of a set containing one number is just that number, so the claim that averaging log odds is the appropriate way to handle this situation implies that, given one probability estimate from this procedure, you should take it literally. But this is not the case. Instead, you should try to adjust out the expected effect of the Gaussian noise. The correct way to do this depends on your prior, but for simplicity, and to avoid privileging any particular prior, let’s use the improper prior under which seeing the probability estimate gives you no information about what the Gaussian noise term was. Then your posterior distribution over the “true log odds” is the observed log odds estimate plus a Gaussian. The expected value of the true log odds is, of course, the observed log odds estimate, but the expected value of the true probability is not the observed probability estimate, because taking the expected value does not commute with applying nonlinear functions such as the conversion between log odds and probabilities.
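To make the non-commutation point concrete, here is a minimal Monte Carlo sketch. It assumes the setup described above: the posterior over the true log odds is the observed log odds plus zero-mean Gaussian noise. The noise standard deviation of 1.0, the observed estimate of 0.9, and the function names are illustrative assumptions, not values from the original discussion.

```python
import math
import random


def sigmoid(x):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_true_probability(observed_log_odds, noise_sd, n_samples=200_000, seed=0):
    """Estimate E[sigmoid(observed_log_odds + noise)] by sampling Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sigmoid(observed_log_odds + rng.gauss(0.0, noise_sd))
    return total / n_samples


# Hypothetical single forecast of 90%, with an assumed noise s.d. of 1.0 in log odds.
observed_p = 0.9
observed_log_odds = math.log(observed_p / (1.0 - observed_p))
posterior_mean_p = expected_true_probability(observed_log_odds, noise_sd=1.0)

print(f"observed probability:        {observed_p:.3f}")
print(f"expected true probability:   {posterior_mean_p:.3f}")
```

Under these assumptions the expected true probability comes out noticeably below 0.9, pulled toward 0.5, even though the expected true log odds equals the observed log odds.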
Oof, rookie mistake. I retract the claim that averaging log odds is ‘the correct thing to do’ in this case.
Still, unless I’m wrong again, the average log odds would converge to the correct result in the limit of many forecasters, and the average probabilities wouldn’t? That would make the post title bad advice in such a case.
(Though the median forecast would do just fine.)
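The many-forecasters claim can be checked with a small simulation. This is only a sketch under assumed conditions: every forecaster reports the same true log odds plus independent zero-mean Gaussian noise, and "correct result" is taken to mean the underlying true probability. The true probability of 0.9, the noise standard deviation of 1.0, and the function names are hypothetical choices for illustration.

```python
import math
import random
import statistics


def sigmoid(x):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Convert a probability to log odds."""
    return math.log(p / (1.0 - p))


def simulate_aggregates(true_p, noise_sd, n_forecasters, seed=0):
    """Compare three ways of pooling forecasts that are noisy in log-odds space."""
    rng = random.Random(seed)
    true_log_odds = logit(true_p)
    forecasts = [
        sigmoid(true_log_odds + rng.gauss(0.0, noise_sd))
        for _ in range(n_forecasters)
    ]
    log_odds = [logit(p) for p in forecasts]
    return {
        "mean_prob": statistics.fmean(forecasts),             # average of probabilities
        "mean_log_odds": sigmoid(statistics.fmean(log_odds)),  # average of log odds, mapped back
        "median": statistics.median(forecasts),               # median forecast
    }


result = simulate_aggregates(true_p=0.9, noise_sd=1.0, n_forecasters=100_000)
for name, value in result.items():
    print(f"{name:>14}: {value:.4f}")
```

With these assumed parameters, the averaged log odds and the median both land very close to 0.9, while the averaged probabilities settle at a visibly lower value. That matches the claim only for this particular noise model; it says nothing about which pooling method is better when the noise has a different structure.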