While you’re quite right about numbers on the scale of billions or trillions, I don’t think it makes sense in the limit for the prior probability of X people existing in the world to fall faster than X grows in size.
Certain series of large numbers grow larger much faster than they grow in complexity. A program that returns 10^(10^(10^10)) takes fewer bits to specify (relative to most reasonable systems of specifying programs) than a program that returns 32758932523657923658936180532035892630581608956901628906849561908236520958326051861018956109328631298061259863298326379326013327851098368965026592086190862390125670192358031278018273063587236832763053870032004364702101004310417647840155719238569120561329853619283561298215693286953190539832693826325980569123856910536312892639082369382562039635910965389032698312569023865938615338298392306583192365981036198536932862390326919328369856390218365991836501590931685390659103658916392090356835906398269120625190856983206532903618936398561980569325698312650389253839527983752938579283589237325987329382571092301928* - even though 10^(10^(10^10)) is by far the larger number. And it only takes a linear increase in complexity to make it 10^(10^(10^(10^(10^(10^10))))) instead.
*I produced this number via keyboard-mashing; it’s not anything special.
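To make the description-length point concrete, here is a rough sketch. It rests on assumptions that aren't in the comment above: source text length (8 bits per character) stands in for program complexity, Python expression syntax stands in for the "system of specifying programs", and a seeded 600-digit random string stands in for the keyboard-mashed number. The tower expression is only ever handled as text and never evaluated.

```python
import random

# Source text of a "program" returning the tower. It is kept as a string and
# never evaluated, because the value itself is far too large to compute.
tower_source = "10**(10**(10**10))"

# Hypothetical stand-in for a keyboard-mashed number: 600 arbitrary digits.
_rng = random.Random(0)
mashed_source = "1" + "".join(_rng.choices("0123456789", k=599))


def source_bits(source: str) -> int:
    """Crude description length: 8 bits per character of source text."""
    return 8 * len(source)


def extend_tower(source: str) -> str:
    """Wrap the expression in one more level of exponentiation.

    The added text is always the same six characters, so the cost per
    extra level is constant.
    """
    return f"10**({source})"


if __name__ == "__main__":
    print("tower bits: ", source_bits(tower_source))
    print("mashed bits:", source_bits(mashed_source))
    taller = extend_tower(tower_source)
    print("bits added per extra level:", source_bits(taller) - source_bits(tower_source))
```

Character count is only a loose proxy for complexity, but the shape of the comparison is the same under any reasonable encoding: the structureless number costs roughly its own length, while each extra level of the tower costs a fixed amount.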
Consider the proposition “A superpowered entity capable of creating unlimited numbers of people ran a program that output the result of a random program out of all possible programs (with their outputs rendered as integers), weighted by the complexity of those programs, and then created that many people.”
If this happened, the probability that their program outputs at least X would fall much more slowly than X rises, in the limit. The sum of X times the probability of outputting X doesn’t converge at all; the expected number of people created would be literally infinite.
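One way to sketch why the expectation diverges, under assumptions that are mine rather than stated above: N is the program's output, a program of length ℓ bits gets prior weight proportional to 2^{-ℓ}, X_k is a power tower of k tens, and some program of at most ak + b bits outputs X_k for constants a and b (the "linear increase in complexity", with the normalizing constant absorbed into b).

```latex
\begin{align*}
X_1 &= 10, \qquad X_{k+1} = 10^{X_k}, \\
\Pr(N = X_k) &\ge 2^{-(ak+b)}, \\
\mathbb{E}[N] &\ge \sum_{k \ge 1} X_k \, 2^{-(ak+b)} = \infty,
  \quad \text{since } X_k \, 2^{-ak} \to \infty \text{ as } k \to \infty, \\
X_k \cdot \Pr(N \ge X_k) &\ge X_k \, 2^{-(ak+b)} \to \infty .
\end{align*}
```

The last line is the "falls much slower than X rises" claim restricted to the tower values: along that sequence, the tail probability shrinks at most exponentially in k while X_k grows as a tower in k.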
So as long as you assign greater than literally zero probability to that proposition—and there’s no such thing as zero probability—there must exist some number X such that you assign greater than 1/X probability to X people existing. In fact, there must exist some number X such that you assign greater than 1/X probability to X million people existing, or X billion, or so on.
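A small numeric sketch of that existence claim, done in log space so nothing huge is ever computed. The assumptions are hypothetical: p is your probability for the proposition, and a tower of k tens (X_1 = 10, X_{k+1} = 10**X_k) is output by a program of at most a*k + b bits, so the proposition contributes at least p * 2**-(a*k + b) to the probability of X_k people existing. The sketch only reaches k = 3, because log2 of the next tower no longer fits in a float.

```python
import math

LOG2_10 = math.log2(10)


def log2_tower(k: int) -> float:
    """Return log2(X_k) for the tower X_1 = 10, X_{k+1} = 10**X_k (k <= 3 only)."""
    if k == 1:
        return LOG2_10            # log2(10)
    if k == 2:
        return 10 * LOG2_10       # log2(10**10)
    if k == 3:
        return 10**10 * LOG2_10   # log2(10**(10**10))
    raise ValueError("log2(X_k) does not fit in a float for k >= 4")


def log2_margin(log2_p: float, a: float, b: float, k: int) -> float:
    """log2 of X_k * p * 2**-(a*k + b); positive means the bound exceeds 1/X_k."""
    return log2_p - (a * k + b) + log2_tower(k)


def first_tower_height(log2_p: float, a: float, b: float):
    """Smallest k in 1..3 whose bound exceeds 1/X_k, or None if this sketch
    would need a taller tower than it can represent."""
    for k in (1, 2, 3):
        if log2_margin(log2_p, a, b, k) > 0:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical numbers: p = 2**-1_000_000, 100 bits per level, 100 bits overhead.
    print(first_tower_height(-1_000_000, 100, 100))
```

With those made-up numbers, a tower of height 3 already suffices. A smaller p just pushes the answer to a taller tower, which this sketch reports as None rather than trying to represent.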
(btw, I don’t think the sort of SIA-based reasoning here is actually valid, but if it were, then yeah, it would imply that there are infinitely many people.)
I think when you get to any class of hypotheses like “capable of creating unlimited numbers of people” with nonzero probability, you run into multiple paradoxes of infinity.
For example, there is no uniform distribution over any countably infinite set, which includes the set of all halting programs. Every non-uniform distribution this hypothetical superbeing might have used over such programs is a different prior hypothesis. The set of these has no suitable uniform distribution either, since they can be partitioned into countably many equivalence classes under natural transformations.
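The first claim has a short standard argument, sketched here assuming countable additivity. The symbols s_1, s_2, ... (an enumeration of the set) and c (the common probability a uniform distribution would give each element) are introduced just for this illustration.

```latex
\[
\sum_{i=1}^{\infty} \Pr(s_i) \;=\; \sum_{i=1}^{\infty} c \;=\;
\begin{cases}
0 & \text{if } c = 0, \\
\infty & \text{if } c > 0,
\end{cases}
\]
```

Neither case gives a total of 1, so no choice of c yields a probability distribution.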
It doesn’t take much study of this before you’re digging into pathologies of measure theory such as Vitali sets and similar.
You can of course arbitrarily pick any of these weightings to be your “chosen” prior, but that’s just equivalent to choosing a prior over population directly, so it doesn’t help at all.
Probability theory can’t adequately deal with such hypothesis families, and so if you’re considering Bayesian reasoning you must discard them from your prior distribution. Perhaps there is some extension or replacement for probability that can handle them, but we don’t have one.