This actually gets even worse. Consider, for example, a hypothetical Bayesian version of Isaac Newton trying to estimate the exponent k in F = GMm/R^k. There's an intuition that mathematically simple values, such as 2, should be more likely. A while ago jimrandomh and benelliott discussed this with me. Ben suggested that in this sort of context you might use a composite distribution, where part of the probability mass comes from a continuous density and the rest from discrete point masses on simple numbers. This seems to do a decent job of capturing our intuition, but it seems very hard to actually work with that sort of distribution.
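Here is a minimal numerical sketch of the kind of mixed distribution Ben was suggesting. Everything concrete in it (the choice of "simple" candidate exponents, the prior weights, the Gaussian noise model, the grid) is an illustrative assumption of mine, not anything he specified:

```python
import numpy as np

# Sketch: a prior over the exponent k that mixes discrete point masses on
# "simple" numbers with a continuous density, then updates on noisy data.

simple_values = np.array([1.0, 1.5, 2.0, 3.0])   # "simple" candidate exponents
simple_prior = np.array([0.1, 0.05, 0.3, 0.05])  # point masses (sum < 1)
cont_weight = 1.0 - simple_prior.sum()           # rest goes to the continuous part

# Discretize the continuous component (uniform on [0, 4]) on a fine grid so
# the whole mixture can be updated numerically.
grid = np.linspace(0.0, 4.0, 4001)
cont_prior = cont_weight * np.full(grid.size, 1.0 / 4.0) * (grid[1] - grid[0])

def likelihood(k, data, sigma=0.01):
    # Gaussian measurement noise around the hypothesized exponent k.
    return np.exp(-0.5 * ((data - k) / sigma) ** 2).prod()

# Simulated noisy measurements of the exponent (true value exactly 2).
rng = np.random.default_rng(0)
data = 2.0 + rng.normal(0.0, 0.01, size=50)

post_simple = simple_prior * np.array([likelihood(k, data) for k in simple_values])
post_cont = cont_prior * np.array([likelihood(k, data) for k in grid])
Z = post_simple.sum() + post_cont.sum()

# Unlike a purely continuous posterior, this one can put finite probability
# on the single point k = 2 exactly.
print("P(k = 2 exactly | data) =", post_simple[2] / Z)
```

The awkwardness shows up even in this toy version: you have to track the discrete and continuous pieces separately, and the answer depends on how much mass you chose to put on "simple" values in the first place.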
If Newton tried to derive his law purely from empirical measurements, then yes, he would never be exactly sure (ignoring general relativity for a moment) that the exponent is exactly 2. For all he would know, it could actually be 2.00000145...
But that would be like trying to derive the value of pi or the exponents in the Pythagorean theorem by measuring physical circles and triangles. If the law of gravity is derived from more general axioms, then its form can be computed exactly provided that these axioms are correct.
Do you think that the Dirichlet process (DP) models that machine learning people use might be relevant here? As I understand it, a DP prior says that the true probability distribution is a discrete distribution over some countable set of points, but you don't know which set in advance. So the posterior can consistently assign nonzero probability to a single point; in fact, if you do the math, the posterior is very simple: it's a mixture of the original DP and finite point masses on the values you actually observed.
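Concretely, the posterior predictive of a DP is the Blackwell-MacQueen urn scheme: the next draw repeats a previously observed value with probability proportional to its count, or is a fresh draw from the base measure with probability proportional to the concentration parameter. A minimal sketch (the concentration value, base measure, and simulated data are illustrative assumptions):

```python
import random

# Blackwell-MacQueen urn: the posterior predictive of a Dirichlet process
# with concentration alpha and base measure G0.

def dp_predictive_draw(observed, alpha, base_sampler):
    n = len(observed)
    if random.random() < alpha / (alpha + n):
        return base_sampler()          # fresh value from the base measure
    return random.choice(observed)     # repeat an old value: a point mass

# Example: after repeatedly observing the exponent 2.0, the posterior puts
# substantial probability on the single point 2.0, even though the base
# measure is continuous.
random.seed(0)
observed = [2.0] * 50
alpha = 1.0
draws = [dp_predictive_draw(observed, alpha, lambda: random.uniform(0.0, 4.0))
         for _ in range(10000)]
print("P(next draw is exactly 2.0) ~", sum(d == 2.0 for d in draws) / len(draws))
```

With 50 repeated observations and alpha = 1, roughly 50/51 of the predictive mass sits on the single point 2.0, which is exactly the kind of nonzero probability on an individual value that the mixed-distribution intuition above seems to want.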
My minimal knowledge base says that sounds potentially relevant. Unfortunately, I don’t know nearly enough about this sort of thing other than to make very vague, non-committal remarks.
In summary, Newton should assign probability 0 to the statement that his theory of gravitation is exactly correct. This turns out to be the right thing to do.
Huh? No. The probability that he's correct shouldn't be zero. Even now there's some very tiny probability that Newton's laws are exactly correct; the chance is vanishingly small but non-zero. Moreover, your argument proves too much, because one could run the exact same argument against general relativity.
And it would be equally correct.
Ok. But even if you had a theory of quantum gravity that seemed to explain all observed data, your argument would still go through. If your argument is accepted, then any theory of everything would have to be assigned zero probability of being correct, no matter how well it predicted things. This seems wrong.
“Should”? I would much rather be logically inconsistent, or bet that the axioms of probability are meaningless or irrelevant (which, in the relevant decision-theoretic problems, they tend to be), than give odds of infinity to one.