The trouble with assuming they’re uncorrelated is that it can give you pretty extreme probability estimates.
No. The trouble with assuming they’re uncorrelated is that they probably aren’t. If they were, the extreme probability estimates would be warranted.
I suppose more accurately, the problem is that if there is a significant correlation, assuming they’re uncorrelated will give an equally significant error, and they usually are significantly correlated.
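The thread doesn’t spell out which aggregation rule is at issue; as a minimal sketch, assuming independence-style log-odds (naive-Bayes) pooling, here is how double-counting shared evidence inflates the posterior:

```python
def naive_pool(prior, estimates):
    """Combine expert probability estimates under the assumption that
    each expert's evidence is independent given the prior
    (log-odds / naive-Bayes pooling)."""
    odds = lambda p: p / (1 - p)
    prior_odds = odds(prior)
    posterior_odds = prior_odds
    for p in estimates:
        # Each expert contributes the likelihood ratio implied by
        # moving the prior to their stated probability.
        posterior_odds *= odds(p) / prior_odds
    return posterior_odds / (1 + posterior_odds)

prior = 0.3
# Two experts who saw the SAME evidence both report 0.9.
pooled = naive_pool(prior, [0.9, 0.9])
print(round(pooled, 3))  # 0.995 -- well past the 0.9 either expert believes
```

If the experts’ evidence really were independent, 0.995 would be the warranted update; if they are perfectly correlated (same evidence), the right answer is just 0.9, and the extra extremeness is pure error.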
No. The trouble with assuming they’re uncorrelated is that they probably aren’t. If they were, the extreme probability estimates would be warranted.
This is what I meant by extreme: further than warranted.
The subtler point was that the penalty for being extreme, in a decision-making context, depends on your threshold. Suppose you just want to know whether or not your posterior should be higher than your prior. Then, the experts saying “A>P(Q)” and “B>P(Q)” means that you vote “higher,” regardless of your aggregation technique, and if the experts disagree, you go with the one that feels more strongly (if you have no data on which one is more credible).
Again, if the threshold is higher than the prior, but not significantly higher, it may be that both aggregation techniques give the same results. One of the benefits of graphing them is that it will make the regions where the techniques disagree obvious: if A says .9 and B says .4 (with a prior of .3), then what do the real-world experts think this means? Choosing between the methods should be done by focusing on the differences caused by that choice (though first-principles arguments about correlation can be useful too).
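The comment doesn’t name the two aggregation techniques being compared; as an illustrative sketch, take independence-assuming log-odds pooling versus simple averaging (my choices, not necessarily the original ones), applied to the A = .9, B = .4, prior = .3 case:

```python
def naive_pool(prior, estimates):
    """Independence-assuming (log-odds / naive-Bayes) pooling."""
    odds = lambda p: p / (1 - p)
    prior_odds = odds(prior)
    posterior_odds = prior_odds
    for p in estimates:
        posterior_odds *= odds(p) / prior_odds
    return posterior_odds / (1 + posterior_odds)

prior, a, b = 0.3, 0.9, 0.4
pooled = naive_pool(prior, [a, b])   # ~0.933
averaged = (a + b) / 2               # 0.65

for threshold in (prior, 0.8):
    # The decision each method recommends at this threshold.
    print(threshold, pooled > threshold, averaged > threshold)
```

With a “higher than the prior?” threshold the two methods agree (both say yes); a threshold of 0.8 falls between 0.65 and 0.93, so there the choice of method decides the vote. These in-between regions are exactly what graphing makes visible.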