Not exactly.
(1) What is the family of calibration curves you’re updating on? These are functions from stated probabilities to ‘true’ probabilities, so the class of possible functions is quite large. Do we want a parametric family? A non-parametric family? We would like something which is mathematically convenient, looks as much like typical calibration curves as possible, but which has a good ability to fit anomalous curves as well when those come up.
(2) What is the prior over this family of curves? It may not matter too much if we plan on using a lot of data, but if we want to estimate people’s calibration quickly, it would be nice to have a decent prior. This suggests a hierarchical Bayesian approach (where we estimate a good prior distribution via a higher-order prior).
(3) As mentioned by cousin_it, we would actually want to estimate different calibration curves for different topics. This suggests adding at least one more level to the hierarchical Bayesian model, so that we can simultaneously estimate the general distribution of calibration curves in the population, the all-subject calibration curve for an individual, and the single-subject calibration curve for an individual. At this point, one might prefer to shut one’s eyes and ignore the complexity of the problem.
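To make questions (1) and (2) concrete, here is a minimal sketch of one possible answer, not a recommendation. It assumes a two-parameter family that is linear in log-odds, logit(true) = a + b * logit(stated), and computes a posterior on a coarse grid. The family, the function names, and the grid approach are all illustrative assumptions; nothing in the discussion above fixes them.

```python
import math

def logit(p, eps=1e-6):
    # Clamp so stated probabilities of exactly 0 or 1 do not blow up.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def calibrate(stated, a, b):
    """Map a stated probability to an estimated 'true' probability.

    Assumed family: logit(true) = a + b * logit(stated).
    a = 0, b = 1 is perfect calibration; b < 1 models overconfidence.
    """
    return sigmoid(a + b * logit(stated))

def log_likelihood(a, b, data):
    """data: list of (stated_probability, outcome) pairs, outcome 0 or 1."""
    total = 0.0
    for stated, outcome in data:
        q = calibrate(stated, a, b)
        total += math.log(q) if outcome else math.log(1.0 - q)
    return total

def grid_posterior(data, a_values, b_values, log_prior):
    """Normalized posterior over a grid of (a, b) pairs."""
    log_post = {
        (a, b): log_prior(a, b) + log_likelihood(a, b, data)
        for a in a_values
        for b in b_values
    }
    peak = max(log_post.values())  # subtract the peak for numerical stability
    weights = {k: math.exp(v - peak) for k, v in log_post.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}
```

A two-parameter family like this is convenient but cannot fit strongly anomalous curves, which is exactly the trade-off raised in (1). The `log_prior` argument is where the answer to (2) would go, for example a Gaussian centred on a = 0, b = 1.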
Sure. Any practical implementation will have to work out the details, including the ones you mention. But those are implementation issues within something that is still straightforward Bayes, at least for a single individual. If you have a history of predictions and know the actual outcomes, you can even just plot the empirical calibration curve without any estimation involved.
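As a rough illustration of the "just plot the empirical curve" option, the sketch below bins stated probabilities and reports the observed frequency in each bin. The bin count, function names, and the optional Beta(1, 1) smoothing are assumptions made for the example.

```python
from collections import defaultdict

def empirical_calibration(data, n_bins=10):
    """Bin stated probabilities and return observed frequencies per bin.

    data: iterable of (stated_probability, outcome) pairs, outcome 0 or 1.
    Returns (bin_midpoint, observed_frequency, count) for each non-empty bin.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for stated, outcome in data:
        # A stated probability of exactly 1.0 falls in the last bin.
        idx = min(int(stated * n_bins), n_bins - 1)
        counts[idx] += 1
        hits[idx] += outcome
    return [
        ((i + 0.5) / n_bins, hits[i] / counts[i], counts[i])
        for i in sorted(counts)
    ]

def beta_posterior_mean(hits, count, prior_a=1.0, prior_b=1.0):
    """Posterior mean of a bin's true frequency under a Beta(prior_a, prior_b) prior."""
    return (hits + prior_a) / (count + prior_a + prior_b)
```

Plotting the midpoints against the observed frequencies gives the raw curve. With few predictions per bin the raw frequencies are noisy, which is where a per-bin Beta posterior (or a fitted curve family) starts to earn its keep.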
Now, if you have multiple people involved, things become more interesting and probably call for something like Gelman’s favourite multilevel/hierarchical models. But that’s beyond what OP asked for—he wanted a “rigorously mathematically defined system” and that’s plain-vanilla Bayes.
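For reference, the multilevel structure being discussed (population, individual, individual-by-topic) could be written down roughly as follows. This is only a sketch: the curve family f_θ, the Gaussian levels, and the covariance symbols are assumptions chosen for illustration, not something specified in the thread.

```latex
\begin{align*}
\mu &\sim p(\mu)
  && \text{population-level mean of curve parameters} \\
\theta_i \mid \mu &\sim \mathcal{N}(\mu, \Sigma_{\text{person}})
  && \text{all-topic curve for individual } i \\
\theta_{ij} \mid \theta_i &\sim \mathcal{N}(\theta_i, \Sigma_{\text{topic}})
  && \text{curve for individual } i \text{ on topic } j \\
y_{ijk} \mid \theta_{ij} &\sim \mathrm{Bernoulli}\big(f_{\theta_{ij}}(p_{ijk})\big)
  && \text{outcome of prediction } k \text{ with stated probability } p_{ijk}
\end{align*}
```

For a single individual and a single topic only the last line remains, with a fixed prior on θ, which is the plain-vanilla case.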