Donald Hobson answers How much is known about the “inference rules” of logical induction?

Donald Hobson 9 Aug 2020 14:46 UTC
13 points
Updating on the observations “ϕ(n) for all 0≤n≤N”, the probability of ∀nϕ(n) goes to 1 as N→∞
Logical induction doesn’t have this property. No alternatives can either.
To make sense of this answer, I recommend reading
https://www.lesswrong.com/posts/3FoMuCLqZggTxoC3S/logical-pinpointing
and
https://www.lesswrong.com/posts/i7oNcHR3ZSnEAM29X/standard-and-nonstandard-numbers
In short, logical induction can’t be sure if its operating in a nonstandard context or not. So suppose that $ϕ$ is true for all standard numbers, but false for some nonstandard number. Logical Induction has no way to know whether it is operating in the standard or non-standard numbers, so assigns a probability strictly between 0 and 1 to it.
Suppose you had an alternate algorithm that didn’t do that. A Turing machine can be formally specified in PA, so we can talk about the output of the algorithm. The concept of limits can be discussed in PA using $\forall ϵ \exists δ$ of foundational calculus. (The $ϵ$ and $δ$ actually need to be Natural numbers, but you can encode rational numbers in them with prime factors)
So for every formula $ϕ$ in PA there exists another formula $ϕ^{'}$ that says that your algorithm assigns probability 1 to $ϕ$ in the limit. (And $ϕ^{'}$ is computable from $ϕ$ )
Let $K$ be a PA formula with one free variable. $K$ takes in a number $a$ . If $a$ does not encode a PA formula with one free variable, then $K$ is false. If $a$ encodes $A$ , then $K = \neg (A (a))^{'}$ . Now consider $k$ an integer encoding $K$ . And consider $K (k)$ .
This is a straightforward modification of godels incompleteness theorem. If you have any procedure that can perfectly distinguish between true and false statements within the structure of PA, then you can make a “this statement is false” and get a contradiction.
Technical Note. I am also assuming that your proposed alternative to logical induction respects negation and implication in the limit. (Ie if ${lim}_{n \to \infty} P (ψ) = 1 ⟺ {lim}_{n \to \infty} P (\neg ψ) = 0$ and $({lim}_{n \to \infty} P (ψ) = 1 \land {lim}_{n \to \infty} P (ψ ⟹ ϕ) = 1) ⟹ {lim}_{n \to \infty} P (ϕ) = 1$ .)
Some expressions in PA have multiple qualifiers $\forall a \exists b \forall c . . .$ Suppose we know that $P (ϕ (a)) = 1$ for every particular $a$ . Then by finite combinations of probabilities, $P (\forall a < N : ϕ (a)) = 1$ and so $P (\forall a : ϕ (a)) = 1$ . As $\exists a = \neg \forall a \neg$ , then it follows that any attempt at logical induction with this property must fail.
EDIT:
I have realized that I was confusing two separate concepts. You can’t have any function from formulas to booleans that has the property $\forall x : f (^{'} ϕ (x)^{'}) ⟹ f (^{'} \forall x : ϕ (x)^{'})$ . and the trivial preservation of logical operations. (definable within a formal first order theory with the same symbols)
However, if you have a process that returns True, False or Maybe, and is never wrong, then you can extend it to another process like this. Take $f (ϕ)$ the first function. If $f (ϕ)$ is True or False, define $g (ϕ) = f (ϕ)$ otherwise, if $ϕ = \forall x : ψ (x)$ and $\forall x : f (ψ (x))$ then $g (ϕ)$ =True. Otherwise Maybe.
You can’t always combine infinite sequences of your own beliefs, but you can combine infinite sequences of beliefs from some weaker system, and each time you do that you move further up the arithmetic hierarchy.
https://en.wikipedia.org/wiki/Arithmetical_hierarchy
As totally computable functions are in $Δ_{0}$ , the limit of a sequence of boolean functions is in $Σ_{2}$ . So as taking a forall over a function makes it a $Π$ the highest you can get while still having
Updating on the observations “ϕ(n) for all 0≤n≤N”, the probability of ∀nϕ(n) goes to 1 as N→∞
Is $Π_{1}$ . This means that you can have your asymtotic property compared to any $Π_{1}$ function.
Limits in the rationals add a $\forall ϵ > 0$ to the front, puting them in $Π_{3}$ . Meaning that your asymtotic property holds for all $Π_{3}$ sets.
What this means is that if you have a computable function $p_{x} (n) \in [0, 1]$ which tends to 1 as $n \to \infty$ if and only if $ϕ (x)$ then you can construct $q (n) = \sum_{x = 1}^{n} 2^{- x} p_{x} (n)$ which tends to 1 if and only if $\forall x : ϕ (x)$ .
The downside of this approach is that it assigns probability almost 1 to false statements.
The boolean function case is to take a function that searches for an explicit counterexample, and return a probability that tends to one if no counter example has been found yet.
- Steven Byrnes 10 Aug 2020 12:10 UTC
  3 points
  Parent
  Sorry if these are stupid questions...
  logical induction can’t be sure if its operating in a nonstandard context or not.
  The question specified “all my variables are implicitly natural numbers”. Why can’t there be traders that specialize on questions specifically about standard numbers and ignore others? (I assume that the natural numbers are standard numbers, correct?) Also, what’s the connection between nonstandard numbers and your Godel-like proof?
  If you have any procedure that can perfectly distinguish between true and false statements within the structure of PA, then you can make a “this statement is false” and get a contradiction.
  I think it would be fun to find a concrete example… here’s my attempt...
  $P$ =”the logical inductor will assign probability <0.5 to $P$ in the limit of infinitely many steps”
  $ϕ (n)$ = “the logical inductor will assign probability <0.5 to $P$ after $n$ inference steps”
  so $P = \forall n ϕ (n)$ .
  Then maybe any individual $ϕ (n)$ will be true, but the algorithm will never assign >0.5 probability to $P = \forall n ϕ (n)$ .
  Did I get that right? I’m very out of practice with my self-referential math so I have low confidence. :-P
  - Donald Hobson 10 Aug 2020 15:35 UTC
    4 points
    Parent
    Your self referential maths is nearly right. Technically the limit is defined as $\exists N \forall n > N : ϕ (n)$ as your $\forall n : ϕ (n)$ could fail because $ϕ (1)$ is false, and logical induction only behaves well in the limit.
    
    The paper says that $P_{n} ="$ the logical inductor will assign probability <0.5 to $P_{n}$ after $n$ inference steps”
    tend to 0.5 as $n \to \infty$ . But I havn’t yet worked out the infinite case.
    
    The question specified “all my variables are implicitly natural numbers”. Why can’t there be traders that specialize on questions specifically about standard numbers and ignore others? (I assume that the natural numbers are standard numbers, correct?)
    You can’t do that because non-standard numbers look really similar to standard numbers from the inside. There is no formula $ϕ (x)$ that is true on all standard numbers, and false on nonstandard numbers.
    Suppose you had a logical induction procedure that had the property that after updating on $ϕ (1), ϕ (2) . . .$ it believed $\forall x : ϕ (x)$ .
    So suppose you had $p_{1}, p_{2} . . . \in [0, 1]$ with each $p_{n}$ depending only on $ϕ (1) . . ϕ (n)$ . and $p_{n} \to 1 if \forall x : ϕ (x) else 0$ as $n \to \infty$ . (Produced by your alternative to logical induction, these represent the probability assigned to $\forall x : ϕ (x)$ )
    Suppose you prove all that in PA. As PA has nonstandard models, the proof also holds in those. But we can pick a $ϕ$ such that $\forall x : ϕ (x)$ is true in the standard model, but false for some nonstandard x. There must exist nonstandard $n$ such that $1 - p_{n}$ is infinitesimal.
    Otherwise the formula $\exists m : 1 - p_{m} < 1 / k$ would distinguish standard k from nonstandard k. Assigning infinitesimal probability to something that actually happend is a form of bad behaviour that I think can be transported back to the standard domain.
    I think that the resulting behaviour is consistent, but results in poor behaviour.
    Will post more as I think of it.
- Eigil Rischel 10 Aug 2020 20:20 UTC
  1 point
  Parent
  Thank you very much!
  
  I guess an argument of this type rules out a lot of reasonable-seeming inference rules—if a computable process can infer “too much” about universal statements from finite bits of evidence, you do this sort of Gödel argument and derive a contradiction. This makes a lot of sense, now that I think about it.