Lumifer comments on Open Thread, Jul. 27 - Aug 02, 2015

Lumifer 29 Jul 2015 2:26 UTC
0 points

If one wants to measure my correctness across multiple confidence levels, then what aggregation procedure to use is unclear

Yes, that is precisely the issue for me here. Essentially, you have to specify a loss function and then aggregate it. It’s unclear what kind will work best here and what that “best” even means.

You may find the Wikipedia page on scoring rules interesting.

Yes, thank you, that’s useful.

Notably, Philip Tetlock in his Expert Political Judgment project uses Brier scoring.
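
To make the loss-function point concrete, here is a rough sketch in Python of two standard scoring rules, the Brier score and the logarithmic score, each aggregated by a plain mean over all predictions. The forecasts are made up for illustration, the function names are my own, and averaging with equal weights is just one possible aggregation choice, not something either rule requires.

```python
import math

# Hypothetical forecasts: (stated probability of the event, outcome as 1 or 0).
forecasts = [(0.9, 1), (0.9, 1), (0.9, 0), (0.6, 1), (0.6, 0)]

def brier_score(pairs):
    """Mean squared difference between stated probability and outcome (lower is better)."""
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

def log_score(pairs):
    """Mean negative log of the probability assigned to what actually happened (lower is better)."""
    return -sum(math.log(p if o == 1 else 1 - p) for p, o in pairs) / len(pairs)

print(f"Brier score: {brier_score(forecasts):.3f}")
print(f"Log score:   {log_score(forecasts):.3f}")
```

Both rules are proper, but they weigh the same mistakes differently: the one confident miss (0.9 on something that did not happen) accounts for well over half of each total here, and the log score would go to infinity on a miss at probability 1 while the Brier penalty is capped at 1 per prediction. So which forecaster looks "best" can depend on which rule you pick.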