Sometimes theorems have “fine print” that rules out weird cases where the theorem would be false or meaningless, and it can be quite hard to understand why it’s there.
e.g. in probability theory, the looming problem is that if I choose a point uniformly at random in [0,1] and ask what is the probability that it falls in a set S, there might be no such probability. At which point, a whole load of stuff about Borel sigma algebras appears in the statement of the theorem to make the problem go away.
Imo if you could really choose a point uniformly at random in [0,1], then things like Vitali sets philosophically shouldn’t exist (but I’ve gotten attacked on reddit for this reasoning, and I kinda don’t want to get into it). But this is why probability theory is phrased in terms of sigma algebras and whatnot to model what might happen if we really could choose uniformly at random in [0,1] instead of directly referring to such a platonic process. One could get away with being informal in probability theory by referring to such a process (and imo one should for the sake of grasping theorems), but then you have issues with the axiom of choice, as you mentioned. (I don’t think any results in probability theory invoke a version of the axiom of choice strong enough to construct non-measurable sets anyway, but I could be wrong.)
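(For anyone who hasn’t seen the construction behind those “weird cases”, here is a rough sketch of the Vitali set — this is just the standard argument, spelled out, not anything specific to this thread:)

```latex
% Sketch of the Vitali construction (the standard "weird case"):
% declare x ~ y iff x - y is rational, and use the axiom of choice to pick one
% point from each equivalence class in [0,1), forming a set V. The rational
% translates V + q (mod 1), for q in Q ∩ [0,1), are disjoint and cover [0,1),
% so if V had a (translation-invariant, countably additive) measure m we would need
\[
  1 \;=\; \lambda\big([0,1)\big) \;=\; \sum_{q \,\in\, \mathbb{Q} \cap [0,1)} m ,
\]
% which fails both for m = 0 (the sum is 0) and for m > 0 (the sum is infinite).
```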
There is a little crackpot voice in my head that says something like, “the real numbers are dumb and bad and we don’t need them!” I don’t give it a lot of time, but I do let that voice exist in the back of my mind trying to work out other possible foundations. A related issue here is that it seems to me that one should be able to have a uniform probability distribution over a countable set of numbers. Perhaps one could do that by introducing infinitesimals.
How do you sample uniformly from the integers?
The reason you can’t sample uniformly from the integers is more like “because they are not compact” or “because they are not bounded” than “because they are infinite and countable”. You also can’t sample uniformly at random from the reals. (If you could, then composing with floor would give you a uniformly random sample from the integers.)
If you want to build a uniform probability distribution over a countable set of numbers, aim for all the rationals in [0, 1].
I guess you could view that random number in [0,1] as a choice sequence (cf. intuitionism) and you’re allowed to see any finite number of bits of it by flipping coins to see what those bits are, but you don’t know the answer to any question that would require seeing infinitely many bits...
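A toy version of that picture (purely illustrative — the class and method names here are made up for this sketch): you flip coins for exactly as many bits as a question needs, so prefix-determined questions like “is it below 0.3?” settle after finitely many flips, while questions like “is it rational?” never do.

```python
import random

class LazyUniform:
    """A point "chosen uniformly in [0,1]", revealed one binary digit at a time.

    Any finite number of bits can be inspected (each is a fresh coin flip),
    but nothing that depends on infinitely many bits ever gets settled.
    """
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.bits = []                          # digits revealed so far

    def bit(self, i):
        """Flip coins until the i-th binary digit is known, then return it."""
        while len(self.bits) <= i:
            self.bits.append(self.rng.randint(0, 1))
        return self.bits[i]

    def less_than(self, q):
        """Decide whether the sample is below q (for 0 < q < 1).

        Compares binary digits one at a time; with probability 1 the digits
        eventually differ, so only finitely many coin flips are needed.
        """
        i = 0
        while True:
            q_bit = int(q * 2 ** (i + 1)) % 2   # i-th binary digit of q
            b = self.bit(i)
            if b != q_bit:
                return b < q_bit
            i += 1

x = LazyUniform()
print(x.less_than(0.3))   # settled after finitely many flips
print(x.bits)             # only the digits that were actually needed
```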
I think the problem to grapple with is that I can cover the rationals in [0,1] with countably many intervals of total length only 1⁄2 (e.g. enumerate the rationals in [0,1], and place an interval of length 1⁄4 around the first rational, an interval of length 1⁄8 around the second, etc.). This is not possible with the reals—that’s the insight that makes measure theory work!
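Spelled out, the covering’s total length is just a geometric series:

```latex
% Total length of the intervals around the 1st, 2nd, 3rd, ... rational:
\[
  \sum_{n=1}^{\infty} \frac{1}{2^{\,n+1}}
  \;=\; \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots
  \;=\; \frac{1}{2} ,
\]
% and shrinking the intervals (length eps / 2^n around the n-th point) covers
% any countable set with total length eps, for eps as small as you like.
```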
The covering means that the rationals in an interval cannot have a well defined length or measure which behaves reasonably under countable unions. This is a big barrier to doing probability theory. The same problem happens with ANY countable set—the reals only avoid it by being uncountable.
I’d be surprised if it could be salvaged using infinitesimals (imo the problem is deeper than the argument from countable additivity), but maybe it would help your intuition to think about how some Bayesian methods intersect with frequentist methods when working with a (degenerate) uniform prior over all the real numbers. I have a draft of such a post that I’ll make at some point, but you can think about univariate linear regression, the confidence regions that arise, and what prior would make those confidence regions credible regions.
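Not the planned post, but here is a minimal numerical sketch of the kind of intersection being gestured at (my illustration, assuming a no-intercept model with known noise variance and an improper flat prior on the slope): the usual 95% confidence interval and the flat-prior 95% credible interval come out as the same interval.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                                  # known noise standard deviation
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=sigma, size=50)

# OLS estimate of the slope and its sampling standard error
beta_hat = (x @ y) / (x @ x)
se = sigma / np.sqrt(x @ x)

# Frequentist 95% confidence interval for the slope
ci = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)

# Under a Gaussian likelihood with an improper flat prior on the slope,
# the posterior is Normal(beta_hat, se**2), so the 95% credible interval
# is numerically identical to the confidence interval above.
credible = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)

print(ci)
print(credible)
```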
In “Proofs and Refutations”, Imre Lakatos talks about “monster barring”.