and one has chosen a prior for those parameters that places non-zero weight on the true values, then the Bernstein-von Mises theorem guarantees that the Bayesian posterior distribution and the maximum likelihood estimate converge to the same probability distribution (although you may need to use improper priors).
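A minimal sketch of the convergence being described, using a conjugate Bernoulli–Beta model (the true parameter value and the Beta(1, 1) prior are assumptions chosen just for illustration): the prior has non-zero density everywhere on [0, 1], and the posterior mean and the MLE draw together as the sample size grows.

```python
import random

random.seed(0)
true_p = 0.3                # assumed "true value" for the simulation
alpha, beta = 1.0, 1.0      # Beta(1, 1) prior: positive weight on every theta in [0, 1]

for n in (10, 1000, 100000):
    k = sum(random.random() < true_p for _ in range(n))  # simulate n Bernoulli trials
    mle = k / n                                          # maximum likelihood estimate
    post_mean = (alpha + k) / (alpha + beta + n)         # conjugate Beta posterior mean
    print(n, round(mle, 4), round(post_mean, 4), round(abs(mle - post_mean), 6))
```

The gap between the two estimates is at most 1/(n + 2) here, so it shrinks to zero regardless of the data, while both estimates concentrate around the true value.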
What do “non-zero weight” and “improper priors” mean?
EDIT: Improper priors mean priors that don't sum to one. I would guess "non-zero weight" means "non-zero probability". But then I would wonder why anyone would introduce the term "weight". Perhaps "weight" is the term you use for a value of a probability density function that is not itself a probability.
Improper priors are generally only considered in the case of continuous distributions, so 'sum' is probably not the right term; 'integrate' is what's usually used.
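To make the distinction concrete, here's a small sketch (the flat prior and the standard normal are just illustrative choices): a "flat" prior pi(theta) = 1 on the whole real line is improper because its integral diverges, which you can see by truncating the integral to [-T, T] and letting T grow, whereas a proper density integrates to 1.

```python
import math

def trapezoid(f, a, b, n=10000):
    # simple trapezoid-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

flat = lambda x: 1.0  # improper "prior": constant density on all of R
normal = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)  # proper density

for T in (10, 100, 1000):
    # the flat prior's mass on [-T, T] is 2T and grows without bound;
    # the normal's mass is already essentially 1 by T = 10
    print(T, trapezoid(flat, -T, T), round(trapezoid(normal, -T, T), 6))
```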
I used the term 'weight' to signify an integral because of how I usually intuit probability measures. Say you have a random variable X that takes values in the real line; the probability that it takes a value in some subset S of the real line would be the integral over S with respect to the given probability measure.
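A quick sketch of that "weight" intuition: for a real-valued random variable X with density f, P(X in S) is the integral of f over S. Taking X to be a standard normal (an assumption chosen for illustration), the integral over an interval [a, b] has a closed form via the error function.

```python
import math

def prob_in_interval(a, b):
    # P(X in [a, b]) for a standard normal X: the integral of its density
    # over [a, b], expressed through the error function
    return 0.5 * (math.erf(b / math.sqrt(2)) - math.erf(a / math.sqrt(2)))

print(round(prob_in_interval(-1.0, 1.0), 4))  # the familiar one-sigma mass, ~0.6827
```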
Thanks much!
No problem.
There’s a good discussion of this way of viewing probability distributions in the Wikipedia article. There’s also a fantastic textbook on the subject that really has made a world of difference for me mathematically.