Are you aware of the existing work on ignorance priors? For instance, there is the maximum entropy prior (if I remember correctly this is the Jeffreys prior, which gives rise to the KT estimator), and also the improper prior that effectively places almost all of the weight on 0 and 1.
Interestingly, the universal distribution does not include continuous parameters but does end up dominating any computable rule for assigning probabilities, including these families of conjugate priors.
If I understand correctly, the maximum entropy prior will be the uniform prior, which gives rise to Laplace’s law of succession, at least if we’re using the standard definition of entropy below:
$$H[p] := -\int_{x=0}^{1} P(x) \ln P(x)\, dx$$
But this definition is somewhat arbitrary because the “P(x)dx” term assumes that there’s something special about parameterising the distribution by its probability, as opposed to some other parameterisation (e.g. its odds, its log-odds, etc.). The Jeffreys prior is supposed to be invariant under reparameterisation, which is why people like it.
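To make the difference between the two priors concrete, here is a minimal sketch of the predictive rules they induce for a coin. It assumes the standard Beta-Bernoulli setup: the uniform prior is Beta(1, 1) and yields Laplace's rule (k+1)/(n+2), while the Jeffreys prior for a Bernoulli parameter is Beta(1/2, 1/2) and yields the KT estimator (k+1/2)/(n+1). The function names are my own, chosen purely for illustration.

```python
from fractions import Fraction


def beta_predictive(heads, tails, a, b):
    """P(next flip is heads) after the observed counts, under a Beta(a, b) prior."""
    return (Fraction(heads) + a) / (Fraction(heads + tails) + a + b)


def laplace(heads, tails):
    # Uniform prior on the probability, i.e. Beta(1, 1).
    return beta_predictive(heads, tails, Fraction(1), Fraction(1))


def kt(heads, tails):
    # Jeffreys prior for a Bernoulli parameter, i.e. Beta(1/2, 1/2).
    return beta_predictive(heads, tails, Fraction(1, 2), Fraction(1, 2))


if __name__ == "__main__":
    # Compare the two rules on a few (heads, tails) counts.
    for heads, tails in [(0, 0), (3, 0), (7, 3)]:
        print(heads, tails, laplace(heads, tails), kt(heads, tails))
```

After three heads and no tails, Laplace's rule gives 4/5 and KT gives 7/8: the Jeffreys prior puts more mass near 0 and 1, so it moves towards the extremes faster.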
But my complaint is more Solomonoff-ish. The prior should put more weight on simple distributions, i.e. probability distributions that are described by short probabilistic programs. Such a prior would better match our intuitions about what probabilities arise in real-life stochastic processes. The best prior is the Solomonoff prior, but that’s intractable. I think my prior is the most tractable prior that resolves the most egregious anti-Solomonoff problems with the Laplace/Jeffreys priors.