Does “chance relative to null is x%” mean “an observer, given my results, would assign an x% probability to my being calibrated”?
No! P(Test results | Perfect calibration) / P(Test results | Whatever the null is) ≠ P(Perfect calibration | Test results)!
You can also lodge this as a complaint against null hypothesis testing: I would’ve thought that perfect calibration would be the null. Perhaps the null is a model where you just say a uniformly random probability from 0 to 100.
I’m assuming that they really calculated a likelihood ratio P(Data|Perfect) / P(Data|Null) instead of the posterior odds P(Perfect|Data) / P(Null|Data), which is what the words they used would mean if taken literally. But maybe they have some prior odds P(Perfect) / P(Null) that they used. (The thing they should do is just report the likelihood ratio, rather than their posterior.)
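To spell out the connection: by Bayes’ rule in odds form,

P(Perfect|Data) / P(Null|Data) = [P(Data|Perfect) / P(Data|Null)] * [P(Perfect) / P(Null)]

i.e., posterior odds = likelihood ratio × prior odds, so the two quantities only agree when the priors on the two hypotheses are equal.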
If you have your data and want to compute P(Data|Perfect), you take the product over all your predictions: Π_i (p_i if event i happened, 1 − p_i if it didn’t).
So, for example, if I predicted 20%, 70%, 30% and the actual results were No, Yes, Yes, then P(Data|Perfect) = .8 * .7 * .3. If you have some other hypothesis (e.g., whatever their null is), you can compute P(Data|Other hypothesis) using the predictions that hypothesis makes about how your reported probabilities relate to the propensities of events. A hypothesis here should be a function f(reported) = P(Event happens | reported).
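Here’s a minimal sketch of that computation in Python. The coin-flip null at the end is my guess at what a null might look like, not necessarily the one any particular test uses:

```python
import math

def log_likelihood(preds, outcomes, f=lambda p: p):
    """Log of P(Data | Hypothesis), where a hypothesis is a function f
    mapping a reported probability to P(event happens | reported).
    Perfect calibration is the identity: f(p) = p."""
    total = 0.0
    for p, happened in zip(preds, outcomes):
        q = f(p)  # the hypothesis's propensity for this event
        total += math.log(q if happened else 1.0 - q)
    return total

preds = [0.20, 0.70, 0.30]
outcomes = [False, True, True]  # No, Yes, Yes

ll_perfect = log_likelihood(preds, outcomes)
# One possible null: reported numbers carry no information, and every
# event is a coin flip. (An assumption; the test's actual null may differ.)
ll_null = log_likelihood(preds, outcomes, f=lambda p: 0.5)

print(math.exp(ll_perfect))            # .8 * .7 * .3 ≈ 0.168
print(math.exp(ll_perfect - ll_null))  # likelihood ratio vs. this null ≈ 1.344
```

Working in log space is just to avoid underflow when you have many predictions; with three it makes no difference.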
It seems like you could do better with a logit model:

p = logistic(\sum_i w_i c_i), i.e., logit(p) = log-odds(p) = \sum_i w_i c_i
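A sketch of fitting such a model with scikit-learn, assuming the c_i are binary cues; the cue matrix and outcomes below are made-up illustration data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: each row is one case, each column one cue c_i.
X = np.array([
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
])
y = np.array([1, 1, 1, 0, 1, 0])  # 1 = event happened, 0 = it didn't

# fit_intercept=False so the model is exactly logit(p) = sum_i w_i c_i
model = LogisticRegression(fit_intercept=False)
model.fit(X, y)

print(model.coef_)                   # the fitted weights w_i
print(model.predict_proba(X)[:, 1])  # p = logistic(sum_i w_i c_i)
```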
Are these also called SPRs (statistical prediction rules)?