Maybe the difference lies in the format of answers?
We know: a set of n outputs from a normally distributed random number generator. Say {3.2, 4.5, 8.1}.
We don’t know: mean m and variance v.
Your proposed answer: m = 5.26, v = 6.44.
A Bayesian’s answer: a probability distribution P(m) of the mean and another distribution Q(v) of the variance.
How does a frequentist get them? If he doesn’t have them, what is his confidence in m = 5.26 and v = 6.44? And if the set contains only one number, what is the frequentist’s estimate for v? Note that a Bayesian has no problem even if the data set is empty: he simply falls back on his priors. If the data set is large, the Bayesian’s answer will inevitably converge to a delta function around the frequentist’s estimate, no matter what the priors are (so long as they give that region nonzero probability).
http://www.xuru.org/st/DS.asp
50% confidence interval for mean: 4.07 to 6.46, stddev: 2.15 to 4.74
90% confidence interval for mean: 0.98 to 9.55, stddev: 1.46 to 11.20
If there’s only one sample, the calculation fails due to division by n-1, so the frequentist says “no answer”. The Bayesian says the same if he used the improper prior Cyan mentioned.
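For anyone who wants to check the intervals quoted above, here is a minimal sketch of the standard calculation (Student's t for the mean, chi-square for the standard deviation). It leans on an assumption that holds only for this example: with n = 3 there are 2 degrees of freedom, and both distributions then have closed-form quantiles, so no statistics library is needed. The function names are my own, not anything from the linked calculator.

```python
import math
import statistics


def t_quantile_df2(p):
    """Quantile of Student's t with exactly 2 degrees of freedom.

    Inverts the closed-form CDF F(t) = 1/2 + t / (2 * sqrt(2 + t^2)).
    """
    a = 2.0 * p - 1.0
    return a * math.sqrt(2.0 / (1.0 - a * a))


def chi2_quantile_df2(p):
    """Quantile of chi-square with 2 degrees of freedom.

    With 2 df the chi-square is an exponential with mean 2.
    """
    return -2.0 * math.log(1.0 - p)


def intervals_n3(data, level):
    """Two-sided intervals for the mean and stddev of 3 normal samples."""
    if len(data) != 3:
        # Assumption of this sketch: the df = 2 shortcuts above need n = 3.
        raise ValueError("this sketch only handles exactly 3 samples")
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # uses the n-1 denominator
    tail = (1.0 - level) / 2.0

    # Mean: mean +/- t * s / sqrt(n)
    half_width = t_quantile_df2(1.0 - tail) * s / math.sqrt(n)

    # Stddev: (n-1) * s^2 / sigma^2 follows chi-square with n-1 df
    sum_sq = (n - 1) * s * s
    sd_low = math.sqrt(sum_sq / chi2_quantile_df2(1.0 - tail))
    sd_high = math.sqrt(sum_sq / chi2_quantile_df2(tail))
    return (mean - half_width, mean + half_width), (sd_low, sd_high)


data = [3.2, 4.5, 8.1]
for level in (0.5, 0.9):
    (m_lo, m_hi), (s_lo, s_hi) = intervals_n3(data, level)
    print(f"{level:.0%} interval for mean: {m_lo:.2f} to {m_hi:.2f}, "
          f"stddev: {s_lo:.2f} to {s_hi:.2f}")
```

The results should agree with the quoted figures to within about 0.01; the small differences look like rounding versus truncation. With a single sample, statistics.stdev raises an error, which is the "division by n-1" failure in code form.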
Hm, should I take that to mean that the frequentist assumes a normal distribution for the mean, peaked at the estimated 5.26?
If so, then frequentism = Bayes + flat prior.
Improper priors are quite tricky, however; they can lead to paradoxes such as the two-envelope paradox.
The prior for variance that matches the frequentist conclusion isn’t flat. And even if it were, a flat prior for variance implies a non-flat prior for standard deviation and vice versa. :-)
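The point about flatness not surviving a change of variable can be spelled out with the usual change-of-variables rule for densities. This is just a sketch; the symbols p_v and p_σ (the prior densities over the variance v and the standard deviation σ, with v = σ²) are my own notation.

```latex
% Change of variables v = \sigma^2, so dv/d\sigma = 2\sigma
p_\sigma(\sigma) = p_v(\sigma^2)\,\left|\frac{dv}{d\sigma}\right| = 2\sigma\,p_v(\sigma^2)

% Flat in the variance is not flat in the standard deviation:
p_v(v) \propto 1 \;\Longrightarrow\; p_\sigma(\sigma) \propto \sigma

% and flat in the standard deviation is not flat in the variance:
p_\sigma(\sigma) \propto 1 \;\Longrightarrow\; p_v(v) \propto \frac{1}{2\sqrt{v}} \propto v^{-1/2}
```

By contrast, a prior proportional to 1/σ keeps its form under this change of variable: it corresponds to a prior proportional to 1/v on the variance.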
Of course, I meant a flat distribution for the mean. The variance, at least, cannot be negative.
In this problem, yes. In the general case, no one knows exactly what the flat prior should be, e.g. when there are constraints on the model parameters.
Using the flat improper prior I was talking about before, when there’s only one data point the posterior distribution is improper, so the Bayesian answer is the same as the frequentist’s.
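To see why a single data point leaves the posterior improper, here is a rough derivation. It rests on an assumption: this excerpt does not say which improper prior is meant, so I take the common choice of a prior flat in m and in log σ, i.e. p(m, σ) ∝ 1/σ. Here x is the one observed value.

```latex
% Assumed prior (not stated in the thread): p(m, \sigma) \propto 1/\sigma
% Likelihood of a single observation x:
p(x \mid m, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,
    \exp\!\left(-\frac{(x - m)^2}{2\sigma^2}\right)

% Unnormalized posterior:
p(m, \sigma \mid x) \propto \frac{1}{\sigma^{2}}\,
    \exp\!\left(-\frac{(x - m)^2}{2\sigma^2}\right)

% Integrating out m contributes a factor \sigma\sqrt{2\pi}:
\int_{-\infty}^{\infty} p(m, \sigma \mid x)\,dm \propto \frac{1}{\sigma}

% which cannot be normalized:
\int_{0}^{\infty} \frac{d\sigma}{\sigma} = \infty
```

Under this assumed prior, one observation says nothing about the spread, so the posterior over σ is as improper as the prior was. A prior flat in σ itself fails the same way, since the marginal over σ then comes out constant.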