Could logical inductors be used as a partial solution to Hilbert’s Second Problem (of putting mathematics on a sure footing)? Thanks to Gödel we know that there are lots of things that any given theory can’t prove. But by running a logical inductor we could at least say that these things are true with some probability. Of course a result proved in the “Logical Induction” paper is that the probability of an undecidable statement tends to a value that is neither 0 or 1, so we can’t use this approach to justify belief in a stronger theory. But I noticed a weaker result that does hold. There’s a certain class of statements such that (assuming ZF is consistent) an inductor over PA will think that they’re very likely as soon as it finds a proof for them in ZF.
This class of statements is those with only bounded quantifiers; those where every “∀” and “∃” are restricted to a predefined range. This class of statements is decidable, meaning that there’s a Turing machine that will take a bounded sentence and will always halt and tell you whether or not it holds in ℕ. Because of this every bounded sentence has a proof (or a proof of its negation) in both PA and ZF (and PA and ZF agree which it is).
But the proofs of a bounded sentence in PA and ZF can have very different lengths. Consider the self-referential bounded sentence “PA cannot prove this sentence in fewer than 1000000 symbols”. This must have a proof in PA, since we can just check all sentence with fewer than 1000000 symbols by brute force, but its proof must be longer than 1000000 symbols, or else we would get a contradiction. But the preceding sentences constitute a proof in ZF with much fewer than 1000000 symbols. So the sentence is provable in both PA and ZF, but the ZF proof is much shorter.
It might seem like the bounded sentences can’t express many interesting concepts. But in fact I’d contend that they can express most (if not all) things that you might actually need to know. For example, it seems like the fact “For all x and y, x + y = y + x” is a useful unbounded sentence. But whenever you face a situation where you would want to use it there are always some particular x and y that apply in that situation. So then we can use the bounded sentence “x + y = y + x” instead, where x and y stand for whichever values actually occurred.
Now I’ll show that logical inductors over PA eventually trust proofs in ZF of bounded sentences (assuming ZF is consistent). Consider the Turing machine that takes as input a number n and searches through all strings in length order, keeping track of any that are a ZF proof of a bounded sentence. When it’s been searching for n timesteps it stops and outputs whichever bounded sentence it found a proof for last. Call this sentence ϕ_n. Now let P be a logical inductor over PA. Assuming that ZF is consistent, the sequence ϕ_n are all theorems of PA, and by construction there’s a polynomial time Turning machine that outputs them. So by a theorem in the logical inductor paper, we have that P_n(ϕ_n) tends to 1 as n goes to infinity, meaning that for large n the logical inductor becomes confident in ϕ_n sometime around day n. If a bounded statement ϕ has a ZF proof in m symbols then it’s equal to ϕ_n for n ~ exp(m). So P begins to think that ϕ is very likely from day exp(m) onward.
Assuming that the logical inductor is working with a deductive process that searches through PA proofs in length order, this can occur long before the deductive process actually proves that ϕ is true. The exponential doesn’t really make a difference here, since we don’t know exactly how fast the deductive process is working. But it hardly matters, because the length of ZF proofs can be arbitrarily better than those in PA. For example the shortest proof of the sentence “PA cannot prove this sentence in fewer than exp(exp(exp(n))) symbols” in PA is longer than exp(exp(exp(n))) symbols, whereas the length of the shortest proof in ZF is about log(n).
So in generality what we have proved is that weak systems will accept as very good evidence proofs given in stronger theories, so long as the target of the proof is a bounded sentence, and so long as the stronger theory is in fact consistent. This is an interesting partial answer to Hilbert’s question, since it explains why we would care about proofs in ZF, even if we only believe in PA.
Sure, the inductor doesn’t know which systems are consistent, but nevertheless it eventually starts believing the proofs given by any system which is consistent.
Could logical inductors be used as a partial solution to Hilbert’s Second Problem (of putting mathematics on a sure footing)? Thanks to Gödel we know that there are lots of things that any given theory can’t prove. But by running a logical inductor we could at least say that these things are true with some probability. Of course a result proved in the “Logical Induction” paper is that the probability of an undecidable statement tends to a value that is neither 0 or 1, so we can’t use this approach to justify belief in a stronger theory. But I noticed a weaker result that does hold. There’s a certain class of statements such that (assuming ZF is consistent) an inductor over PA will think that they’re very likely as soon as it finds a proof for them in ZF.
This class of statements is those with only bounded quantifiers; those where every “∀” and “∃” are restricted to a predefined range. This class of statements is decidable, meaning that there’s a Turing machine that will take a bounded sentence and will always halt and tell you whether or not it holds in ℕ. Because of this every bounded sentence has a proof (or a proof of its negation) in both PA and ZF (and PA and ZF agree which it is).
But the proofs of a bounded sentence in PA and ZF can have very different lengths. Consider the self-referential bounded sentence “PA cannot prove this sentence in fewer than 1000000 symbols”. This must have a proof in PA, since we can just check all sentence with fewer than 1000000 symbols by brute force, but its proof must be longer than 1000000 symbols, or else we would get a contradiction. But the preceding sentences constitute a proof in ZF with much fewer than 1000000 symbols. So the sentence is provable in both PA and ZF, but the ZF proof is much shorter.
It might seem like the bounded sentences can’t express many interesting concepts. But in fact I’d contend that they can express most (if not all) things that you might actually need to know. For example, it seems like the fact “For all x and y, x + y = y + x” is a useful unbounded sentence. But whenever you face a situation where you would want to use it there are always some particular x and y that apply in that situation. So then we can use the bounded sentence “x + y = y + x” instead, where x and y stand for whichever values actually occurred.
Now I’ll show that logical inductors over PA eventually trust proofs in ZF of bounded sentences (assuming ZF is consistent). Consider the Turing machine that takes as input a number n and searches through all strings in length order, keeping track of any that are a ZF proof of a bounded sentence. When it’s been searching for n timesteps it stops and outputs whichever bounded sentence it found a proof for last. Call this sentence ϕ_n. Now let P be a logical inductor over PA. Assuming that ZF is consistent, the sequence ϕ_n are all theorems of PA, and by construction there’s a polynomial time Turning machine that outputs them. So by a theorem in the logical inductor paper, we have that P_n(ϕ_n) tends to 1 as n goes to infinity, meaning that for large n the logical inductor becomes confident in ϕ_n sometime around day n. If a bounded statement ϕ has a ZF proof in m symbols then it’s equal to ϕ_n for n ~ exp(m). So P begins to think that ϕ is very likely from day exp(m) onward.
Assuming that the logical inductor is working with a deductive process that searches through PA proofs in length order, this can occur long before the deductive process actually proves that ϕ is true. The exponential doesn’t really make a difference here, since we don’t know exactly how fast the deductive process is working. But it hardly matters, because the length of ZF proofs can be arbitrarily better than those in PA. For example the shortest proof of the sentence “PA cannot prove this sentence in fewer than exp(exp(exp(n))) symbols” in PA is longer than exp(exp(exp(n))) symbols, whereas the length of the shortest proof in ZF is about log(n).
So in generality what we have proved is that weak systems will accept as very good evidence proofs given in stronger theories, so long as the target of the proof is a bounded sentence, and so long as the stronger theory is in fact consistent. This is an interesting partial answer to Hilbert’s question, since it explains why we would care about proofs in ZF, even if we only believe in PA.
Well, your argument assumes that ZF is consistent, so it wouldn’t convince someone who only believes in PA.
Sure, the inductor doesn’t know which systems are consistent, but nevertheless it eventually starts believing the proofs given by any system which is consistent.