It’s not obvious to me that your method improves upon the Wilson score. Certainly, the traditional Bayesian approach (Jeffreys interval) is rarely that different from the Wilson score. Have you played with values to see what the largest differences would look like?
Given that a and b are arbitrary, I think the differences can be large. Whether they actually are large for typical datasets I can’t readily answer.
In any case the advantages are:
Simplicity. Tuning the parameters is a bit involved, but once you do, the formula to apply for each item is very simple. In many (not all) cases, a complicated formula reflects insufficient understanding of the problem.
Motivation. Taking the lower bound of a confidence/credible interval makes some sense, but it’s not that obvious. The need for it arises because we don’t model the prior mean, so we don’t want to take a risk on unproven items. A posterior mean of the quality is more natural, and won’t cause many problems because items default to the true population mean.
Parametrization. The interval methods have a parameter for the coverage probability that sets the size of the interval, but it’s not at all clear how to choose it. My method has parameters for mean and variance, which are based on the data.
Generalization. This framework makes it easier to think clearly about what we want, and to replace the posterior mean of p with a posterior mean of some other quantity of interest. For example, the suggested “explore vs. exploit” approach tends to give something closer to an interval upper bound than a lower bound, and other methods have been suggested.
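To make the "Simplicity" and "Parametrization" points concrete, here is a minimal sketch, assuming the prior is a Beta(a, b) distribution and that a and b are obtained from a population mean and variance by the method of moments. The function names and the moment-matching step are my own illustration, not necessarily the exact tuning procedure meant above.

```python
def beta_from_moments(mean, var):
    """Hypothetical helper: Beta(a, b) parameters matching a given mean and variance.

    Assumes 0 < mean < 1 and 0 < var < mean * (1 - mean), which is the
    condition for a Beta distribution with those moments to exist.
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("var must lie strictly between 0 and mean * (1 - mean)")
    k = mean * (1.0 - mean) / var - 1.0  # equals a + b
    return mean * k, (1.0 - mean) * k


def posterior_mean(ups, downs, a, b):
    """Posterior mean of p under a Beta(a, b) prior after ups/downs votes."""
    return (a + ups) / (a + b + ups + downs)


# Illustrative numbers only: a population with mean 0.5 and variance 0.05.
a, b = beta_from_moments(0.5, 0.05)
print(posterior_mean(0, 0, a, b))  # an item with no votes sits at the prior mean
print(posterior_mean(8, 2, a, b))  # votes pull the score toward 8 / 10
```

Once a and b are fixed, the per-item score is a single division, and an item with no votes scores exactly the prior mean.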
Yes, a and b are arbitrary, but if they aren’t chosen well your model could be hugely inferior. I’d suggest making a few toy data sets and actually comparing the standard methods (Wilson score, Jeffreys interval) to yours before suggesting everyone embrace it.
Edit for clarity: the Jeffreys interval (which is usually very close to the Wilson score) is essentially the same as your model, but with the initial parameters 1/2, 1/2.
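A rough sketch of the kind of toy comparison suggested here, using only the standard library. It puts the Wilson score lower bound next to the posterior mean under the Jeffreys prior Beta(1/2, 1/2) and under a second, made-up prior Beta(2, 2). The vote counts, the choice z = 1.96, and the Beta(2, 2) prior are all assumptions for illustration. The Jeffreys interval's own lower bound is left out because it needs a Beta quantile function, which the standard library does not provide.

```python
import math


def wilson_lower(ups, downs, z=1.96):
    """Lower bound of the Wilson score interval (z = 1.96 is roughly 95%)."""
    n = ups + downs
    if n == 0:
        return 0.0  # convention chosen here for items with no votes
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


def posterior_mean(ups, downs, a, b):
    """Posterior mean of p under a Beta(a, b) prior."""
    return (a + ups) / (a + b + ups + downs)


# Hypothetical (ups, downs) pairs, not real data.
toy_items = [(1, 0), (9, 1), (90, 10), (3, 7), (0, 0)]

print("ups downs  wilson  mean(1/2,1/2)  mean(2,2)")
for ups, downs in toy_items:
    print(
        f"{ups:3d} {downs:5d}  "
        f"{wilson_lower(ups, downs):6.3f}  "
        f"{posterior_mean(ups, downs, 0.5, 0.5):13.3f}  "
        f"{posterior_mean(ups, downs, 2.0, 2.0):9.3f}"
    )
```

The rows worth looking at are the low-count ones such as (1, 0), where a lower bound and a posterior mean can disagree the most. With many votes, all three columns move toward the raw proportion.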