I’m obviously new to this whole thing, but is this a largely undebated, widely accepted view on probabilities? That there are NO situations in which you can’t meaningfully state a probability?
It does seem to be widely accepted and largely undebated. However, it is also widely rejected and largely undebated, for example by Andrew Gelman, Cosma Shalizi, Ken Binmore, and Leonard Savage (to name just the people I happen to have seen rejecting it; I am not a statistician, so I do not know how representative these are of the field in general, or whether there has actually been a substantial debate anywhere). None of them except Ken Binmore actually presents arguments against it in the material I have read; they merely dismiss the idea of a universal prior as absurd. But in mathematics only one thing is absurd: a contradiction. By that standard, only Ken Binmore has offered any mathematical arguments. He gives two in his book “Rational Decisions”: one based on Gödel-style self-reference, and the other based on a formalisation of the concept of “knowing that” as the box operator of S5 modal logic. I haven’t studied the first but am not convinced by the second, which fails at the outset by defining “I know that” as an extensional predicate. (He identifies a proposition P with the set of worlds in which it is true, and assumes that “I know that P” is a function of the set representing P, not of the syntactic form of P. Therefore by that definition of knowing, since I know that 2+2=4, I know every true statement of mathematics, since they are all true in all possible worlds and so are all represented by the same set.)
(ETA: Binmore’s S5 argument can also be found online here.)
(ETA2: For those who don’t have a copy of “Rational Decisions” to hand, here’s a lengthy and informative review of it.)
These people distinguish “small-world” Bayesianism from “large-world” Bayesianism, they themselves being small-worlders. Large-worlders would include Eliezer, Marcus Hutter, and everyone else who believes in the possibility of a universal prior.
A typical small-world Bayesian argument would be: I hypothesise that a certain variable has a Gaussian distribution with unknown parameters over which I have a prior distribution; I observe some samples; I obtain a posterior distribution for the parameters. A large-world Bayesian also makes arguments of this sort and they both make the same calculations.
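To make that calculation concrete, here is a minimal sketch of the simplest version of it. I am assuming, purely for illustration, that the variance is known and that the prior on the mean is itself Gaussian, so the posterior has a closed form; the function name and the numbers are mine, not anything from Gelman or Binmore.

```python
import random

def posterior_for_mean(samples, prior_mean, prior_var, noise_var):
    """Conjugate update for the mean of a Gaussian with known variance.

    Assumed model (illustrative only):
      prior:      mu  ~ Normal(prior_mean, prior_var)
      likelihood: x_i ~ Normal(mu, noise_var), independent
    Returns the posterior mean and posterior variance of mu.
    """
    n = len(samples)
    # Precisions (inverse variances) add under a Gaussian update.
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    # Posterior mean is the precision-weighted blend of prior mean and data.
    post_mean = post_var * (prior_mean / prior_var + sum(samples) / noise_var)
    return post_mean, post_var

# Hypothetical data: 200 draws from a Gaussian centred at 3.
rng = random.Random(0)
samples = [rng.gauss(3.0, 1.0) for _ in range(200)]
post_mean, post_var = posterior_for_mean(
    samples, prior_mean=0.0, prior_var=100.0, noise_var=1.0
)
print(post_mean, post_var)
```

Nothing in this depends on which camp you belong to. The disagreement is only about what the numbers mean when the Gaussian assumption is false.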
Where they part company is when the variable in fact does not have a Gaussian distribution. For example, suppose it is a mixture of two widely separated Gaussians. According to small-worlders, the large-world Bayesian is stuck with his prior hypothesis of a single Gaussian, which no quantity of observations will force him to relinquish, since it is his prior. His estimate of the mean of the Gaussian will drift aimlessly up and down like the Flying Dutchman between the two modes of the real distribution, unable to see the world beyond his prior. According to large-worlders, that prior was not the real prior which one started from. That whole calculation was really conditional on the assumption of a Gaussian, and this assumption itself has a certain prior probability less than 1, and was chosen from a space of all possible hypothetical distributions. The small-worlders reply that this is absurd, declare victory, and walk away without listening to the large-worlders explain how to choose universal priors. Instead, small-worlders insist that to rectify the fault of having hypothesised the wrong model, one must engage in a completely different non-Bayesian activity called model-checking. Chapter 6 of Gelman’s book “Bayesian Data Analysis” is all about that, but I haven’t read it. There is some material in this paper by Gelman and Shalizi.
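The large-worlders' reply can be written out as ordinary conditioning. This is only a sketch in notation I am introducing here: D for the observed samples, M_G for the Gaussian hypothesis, θ for its parameters, and M_k ranging over a countable space of alternative hypotheses, which is itself an assumption.

```latex
\begin{align}
  p(\theta \mid D, M_G) &= \frac{p(D \mid \theta, M_G)\, p(\theta \mid M_G)}{p(D \mid M_G)}, \\
  P(M_G \mid D) &= \frac{p(D \mid M_G)\, P(M_G)}{\sum_k p(D \mid M_k)\, P(M_k)},
  \qquad P(M_G) < 1 .
\end{align}
```

The first line is the calculation both camps perform. The second is the step only the large-worlder admits: if some other M_k predicts the data much better than M_G does, P(M_G | D) shrinks as samples accumulate, so he is not stuck with the Gaussian. Whether the sum in the denominator can actually be specified is exactly what the small-worlders dispute.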
(ETA: I have now read Gelman ch.6. Model-checking is performed by various means, such as (1) eyeballing visualisations of the real data and simulated data generated by the model, (2) comparing statistics evaluated for both real and simulated data, or (3) seeing if the model predicts things that conflict with whatever other knowledge you have of the phenomenon being studied.)
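Method (2) can be sketched in a few lines. This is my own simplified illustration, not Gelman's procedure: I fit a single Gaussian by plugging in the sample mean and standard deviation (rather than drawing parameters from a full posterior), simulate replicate data sets from it, and compare one statistic of my own choosing, the fraction of points lying within 1 unit of the data set's mean. All names and numbers are hypothetical.

```python
import random
import statistics

def central_fraction(data, half_width=1.0):
    """Fraction of points within half_width of the data set's own mean."""
    centre = statistics.fmean(data)
    return sum(abs(x - centre) < half_width for x in data) / len(data)

def predictive_check(data, n_reps=500, seed=1):
    """Compare a statistic on real data with the same statistic on
    data simulated from a single fitted Gaussian (plug-in fit)."""
    rng = random.Random(seed)
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    observed = central_fraction(data)
    simulated = []
    for _ in range(n_reps):
        replicate = [rng.gauss(mu, sigma) for _ in data]
        simulated.append(central_fraction(replicate))
    # Share of replicates whose statistic is at least as small as the real one.
    tail = sum(s <= observed for s in simulated) / n_reps
    return observed, statistics.fmean(simulated), tail

# Hypothetical "real" data: two widely separated Gaussians, at -5 and +5.
rng = random.Random(0)
bimodal = [rng.gauss(-5.0 if rng.random() < 0.5 else 5.0, 1.0) for _ in range(400)]
observed, typical, tail = predictive_check(bimodal)
print(observed, typical, tail)
```

The single-Gaussian model expects a fair share of points near the middle, while the real data have almost none there, so hardly any replicate looks like the real data on this statistic. That mismatch is the signal to go and revise the model.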
And that’s as far as I’ve read on the subject. Have the small-worlders ever responded to large-worlders’ construction of universal priors? Have the large-worlders ever demonstrated that universal priors are more than a theoretical construction without practical application? Has “model checking” ever been analysed in large-world Bayesian terms?