“This desiderata is often difficult to reconcile with clear scoring, since complexity in forecasts generally requires complexity in scoring.”
Can you elaborate on this? In some sense, log-scoring is simple and can be applied to very complex distributions. Are you saying that this would still be “complex scoring” because the complex forecast needs to be evaluated, or is your point about something different?
Yeah, that’s true. I don’t recall exactly what I was thinking.
Perhaps it was regarding time-weighting, and the difficulty of seeing what your score will be based on what you predict. The Metaculus interface handles this well, modulo early closings, which screw lots of things up. Also, log-scoring is tricky when you have both continuous and binary outcomes, since they don’t give similar measures: being well calibrated for binary events isn’t “worth” as much, which seems perverse in many ways.
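One way to see the binary/continuous mismatch is the rough sketch below. It is not Metaculus's actual scoring rule, and the function names are made up for illustration. It assumes the binary score is the log of the probability given to the realized outcome, and the continuous score is the log of a normal density at the realized value. Under those assumptions the binary score can never exceed 0, while the continuous score can be positive and shifts when the units of the question change.

```python
import math

def binary_log_score(p, outcome):
    """Hypothetical helper: ln of the probability assigned to what actually happened."""
    return math.log(p if outcome else 1.0 - p)

def normal_log_density_score(mu, sigma, x):
    """Hypothetical helper: ln of a normal(mu, sigma) density at the realized value x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

# A confident, correct binary forecast: the score approaches 0 from below.
best_binary = binary_log_score(0.999, True)

# A sharp, correct continuous forecast: the density exceeds 1, so the score is positive.
sharp_continuous = normal_log_density_score(0.0, 0.01, 0.0)

# The same forecast expressed in units 1000x smaller (sigma scales with the units).
rescaled_continuous = normal_log_density_score(0.0, 10.0, 0.0)

# Changing units shifts the continuous score by ln(1000); the binary score has no such term.
unit_shift = sharp_continuous - rescaled_continuous

print(best_binary, sharp_continuous, rescaled_continuous, unit_shift)
```

So averaging the two kinds of score puts them on different scales unless the continuous one is normalized somehow, for example against a baseline density.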