If you try to count primes IN ANY WAY, then you will end up with the answer that YOU HAVE AN EQUAL CHANCE OF GETTING ANY COPRIME RESIDUE MODULO 10; this is a THEOREM. It is a proven fact of nature, or platonic ideals, or whatever it is theorems are facts of. Measuring mod 3 will just tell you that the residues are also equidistributed modulo 3; it reduces to computing residues modulo 30, which will not give you any information whatsoever, because there will be an equal number that reduce to 1, 3, 7, and 9 modulo 10.
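A quick empirical check of that equidistribution claim (my own sketch; the bound and the sieve are arbitrary choices, purely for illustration): tally the last digit of every prime below a million.

```python
from collections import Counter

def primes_below(n):
    """Simple sieve of Eratosthenes returning all primes < n."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(sieve[p * p::p]))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

# Tally last digits of all primes below one million.
counts = Counter(p % 10 for p in primes_below(1_000_000))
print(counts)
# Apart from the special primes 2 and 5, the counts for last digits
# 1, 3, 7, and 9 come out nearly equal, as the theorem predicts.
```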
If you have an upper and a lower bound then obviously primes aren’t EXACTLY equidistributed, but in this case your upper and lower bounds will be millions apart, with primes near that size showing up on average only every 30 numbers or so. The probabilities will work out if you think of this as a random question selected from many possible and indistinguishable questions.
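A back-of-the-envelope sketch of those numbers (mine, using the prime number theorem; the rough n·ln n figure is presumably where the “27 trillion” ballpark comes from):

```python
import math

n = 10 ** 12  # we want the trillionth prime

# Rough estimate: p_n ~ n * ln(n); refined: p_n ~ n * (ln n + ln ln n - 1).
rough = n * math.log(n)                                   # ~2.8e13, the "27 trillion" ballpark
refined = n * (math.log(n) + math.log(math.log(n)) - 1)   # ~3.0e13, much closer to the true value
avg_gap = math.log(refined)                               # average spacing between primes of that size

print(f"rough estimate:   {rough:.3e}")
print(f"refined estimate: {refined:.3e}")
print(f"average gap:      {avg_gap:.1f}")                 # about 31
```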
In real life, if I ask you “how tall is the tallest Sequoia tree?” versus “how tall is the tallest Scots pine?”, then even though both questions have specific answers, if you don’t have Wikipedia handy you’ll need to default to “how tall do evergreens get?” and the answer becomes a probability distribution. The same is true of prime numbers, once you blur out the details as much as they have already been blurred out. So the “correct answer” in a Bayesian sense is the well-calibrated answer, where you have a distribution and confidence intervals of known accuracy over many similar questions, even if you get the wrong answer sometimes. This sort of reasoning is totally transparent with respect to one specific mathematical phenomenon: the distribution of prime numbers in a range.
It may not be transparent with respect to other mathematical problems.
Equidistribution, of course, doesn’t imply “equal chances” for the last digit of the Nth prime. It’s 3. If you don’t have time to calculate that it’s 3, there’s no theorem saying you can’t conclude something anyway (trivially, in edge cases where you have almost enough time, you might be able to exclude 1 somehow). Another interesting thing is that numbers of the form 2^n-1 are rarely prime (n has to be prime, and even then it usually isn’t), and we know that 2^n-1 never ends in 9; the impact of this decreases as n grows, though. All sorts of small, subtle things going on. (And yes, the resulting difference in “probabilities” is very small.)
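A tiny check of that last-digit claim (my own sketch): the last digit of 2^n cycles through 2, 4, 8, 6, so 2^n-1 can only end in 1, 3, 7, or 5, never 9.

```python
# Last digits of 2**n - 1 for n = 1..999: the digit 9 never appears,
# because 2**n mod 10 cycles through 2, 4, 8, 6.
last_digits = {(pow(2, n, 10) - 1) % 10 for n in range(1, 1000)}
print(sorted(last_digits))   # [1, 3, 5, 7]
```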
If you were able to conclude additional information about the one trillionth prime OTHER THAN that it is a large prime number roughly 27 trillion in size, then that information MAY contain information about that prime modulo 10, I agree.
I would be very surprised (p = .01) if there actually are results of the form “the nth prime has such and such characteristics modulo whatever”, because of the strong equidistribution results that do exist, and because obviously once someone computes the answer it is known. If you look at prime numbers and apply techniques from probability theory, it does work, and it works beautifully well. Adding information beyond “it’s a large prime” would allow you to apply techniques from probability theory to deal with logical uncertainty, and it would work well. The examples you give seem to be examples of this.
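One illustration of how well the naive probabilistic model of primes works (again my own sketch, not anything from the thread): treat each integer n as “prime with probability 1/ln n” and compare the expected count below a million with the true count.

```python
import math

def prime_count(x):
    """Exact count of primes <= x via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(sieve[p * p::p]))
    return sum(sieve)

x = 1_000_000
expected = sum(1 / math.log(n) for n in range(2, x + 1))  # "each n is prime with probability 1/ln n"
actual = prime_count(x)
print(f"probabilistic estimate: {expected:,.0f}")
print(f"actual count:           {actual:,}")
# The two agree to within a fraction of a percent.
```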
My core point is that other problems may be less susceptible to approaches from probability theory. Even though looking at the digits of prime numbers is not really a probability theory problem, using techniques from probability theory there taps into information from theorems you may not have proven. Using techniques from probability theory elsewhere does not do this, because there the theorems you haven’t proven aren’t even true.
So I am concerned, when someone purports to solve the problem of logical uncertainty, that their example problem is one whose solution looks like normal uncertainty.
I don’t think we disagree on any matters of fact; I think we may disagree about the definition of the word “probability.”
Yes, I agree. In other problems, your “probabilities” are not going to be statistically independent from basic facts of mathematics. I myself posted a top-level comment about the ultimate futility of the ‘probabilistic’ approach. Probability is not just like logic. If you have a graph with loops or cycles, it is incredibly expensive. It isn’t just reals flowing through a network of tubes; the sides of a loop are not statistically independent. It doesn’t cut down your time at all, except in extreme examples (I once implemented a cryptographic algorithm that depends on the Miller–Rabin primality test, which is “probabilistic”; my understanding is that this is common in cryptography and is used by your browser any time you establish an SSL connection).
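Since the Miller–Rabin test comes up here, a minimal sketch of the standard algorithm (my own illustration of the textbook version, not any particular library’s implementation):

```python
import random

def miller_rabin(n, rounds=40):
    """Return False if n is definitely composite, True if it is probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a witnesses that n is composite
    # For composite n, the chance of surviving all rounds is at most 4**(-rounds).
    return True

# 29,996,224,275,833 is the value usually cited for the trillionth prime.
print(miller_rabin(29_996_224_275_833))   # True
```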
Okay yes we are in agreement.