Fix a theory T over a language L. A coherent probability function is one that satisfies laws of probability theory, each coherent probability function represents a probability distribution on complete logical extensions of T.
One of many equivalent definitions of coherence is that P is coherent if P(s1)+P(s2)+…+P(sk)=1 whenever T can prove that exactly one of s1,…,sk is true.
Another very basic desirable property is that P(s)=1 only when s is provable. There have been many proposals of specific coherent probability assignments that all satisfy this basic requirement. Many satisfy stronger requirements that give bounds on how far P(s) is from 1 when s is not provable.
In this post, I modify the framework slightly to instead talk about conditional probability. Consider a function P which takes in a consistent theory T and a sentence s, and outputs a number P(s|T)∈[0,1], which represents the conditional probability of s given T.
We say that P is coherent if:
P(s1|T)+P(s2|T)+…+P(sk|T)=1 whenever T can prove that exactly one of s1,…,sk is true, and
P(s∧r|T)=P(r|T)⋅P(s|T∪{r}).
If s proves every sentence in T, then P(s|R∪T)≥P(s|R).
Theorem: There is no coherent conditional probability function P such that P(s|T)=1 only when T proves s.
Proof:
This proof will use the notation of log odds ℓ(p)=log2(p1−p) to make things simpler.
Let P be a coherent conditional probability function. Fix a sentence s which is neither provable nor disprovable from the empty theory. Construct an infinite sequences of theories as follows:
T0 is the empty theory.
To construct Tn+1, choose a sentence rn such that neither s→rn nor s→¬rn are provable in Tn. If P(s∧rn|Tn)≤P(s∧¬rn|Tn), then let Tn+1=Tn∪{s→rn}. Otherwise, let Tn+1=Tn∪{s→¬rn}.
Fix an n, and without loss of generality, assume P(s∧rn|Tn)≤P(s∧¬rn|Tn). Since P is coherent we have P(s∧r|Tn)+P(s∧¬r|Tn)=P(s|Tn). In particular, this means that P(s∧r|Tn)≤12P(s|Tn).
Observe that P(s∧(s→r)|Tn)=P(s|Tn+1)P(s→r|Tn), and P(¬s∧(s→r)|Tn)=P(¬s|Tn+1)P(s→r|Tn). Therefore,
P(s∧r|Tn)/P(¬s|Tn)=P(s|Tn+1)/P(¬s|Tn+1), so
12P(s|Tn)/P(¬s|Tn)≥P(s|Tn+1)/P(¬s|Tn+1).
In the language of log odds, this means that ℓ(P(s|Tn))−1≥ℓ(P(s|Tn+1)).
Let T∞ be the union of all the Tn. Note that by the third condition of coherence, ℓ(P(¬s|T∞))≥ℓ(P(¬s|Tn)) for all n, so ℓ(P(s|T∞))≤ℓ(P(s|Tn)) for all n
Consider ℓ(P(s|T0)) and ℓ(P(s|T∞)). These numbers cannot both be finite, since ℓ(P(s|T∞))≤ℓ(P(s|Tn))≤ℓ(P(s|T0))−n. Therefore, at least one of P(s|T0) and P(s|T∞) must be 0 or 1. However neither T0 nor T∞ prove or disprove s, so this means that P assigns conditional probability 1 to some statement that cannot be proven.
Open Problem: Does this theorem still hold if we leave condition 3 out of the definition of coherence?
No Good Logical Conditional Probability
Fix a theory T over a language L. A coherent probability function is one that satisfies laws of probability theory, each coherent probability function represents a probability distribution on complete logical extensions of T.
One of many equivalent definitions of coherence is that P is coherent if P(s1)+P(s2)+…+P(sk)=1 whenever T can prove that exactly one of s1,…,sk is true.
Another very basic desirable property is that P(s)=1 only when s is provable. There have been many proposals of specific coherent probability assignments that all satisfy this basic requirement. Many satisfy stronger requirements that give bounds on how far P(s) is from 1 when s is not provable.
In this post, I modify the framework slightly to instead talk about conditional probability. Consider a function P which takes in a consistent theory T and a sentence s, and outputs a number P(s|T)∈[0,1], which represents the conditional probability of s given T.
We say that P is coherent if:
P(s1|T)+P(s2|T)+…+P(sk|T)=1 whenever T can prove that exactly one of s1,…,sk is true, and
P(s∧r|T)=P(r|T)⋅P(s|T∪{r}).
If s proves every sentence in T, then P(s|R∪T)≥P(s|R).
Theorem: There is no coherent conditional probability function P such that P(s|T)=1 only when T proves s.
Proof:
This proof will use the notation of log odds ℓ(p)=log2(p1−p) to make things simpler.
Let P be a coherent conditional probability function. Fix a sentence s which is neither provable nor disprovable from the empty theory. Construct an infinite sequences of theories as follows:
T0 is the empty theory.
To construct Tn+1, choose a sentence rn such that neither s→rn nor s→¬rn are provable in Tn. If P(s∧rn|Tn)≤P(s∧¬rn|Tn), then let Tn+1=Tn∪{s→rn}. Otherwise, let Tn+1=Tn∪{s→¬rn}.
Fix an n, and without loss of generality, assume P(s∧rn|Tn)≤P(s∧¬rn|Tn). Since P is coherent we have P(s∧r|Tn)+P(s∧¬r|Tn)=P(s|Tn). In particular, this means that P(s∧r|Tn)≤12P(s|Tn).
Observe that P(s∧(s→r)|Tn)=P(s|Tn+1)P(s→r|Tn), and P(¬s∧(s→r)|Tn)=P(¬s|Tn+1)P(s→r|Tn). Therefore, P(s∧r|Tn)/P(¬s|Tn)=P(s|Tn+1)/P(¬s|Tn+1), so 12P(s|Tn)/P(¬s|Tn)≥P(s|Tn+1)/P(¬s|Tn+1).
In the language of log odds, this means that ℓ(P(s|Tn))−1≥ℓ(P(s|Tn+1)).
Let T∞ be the union of all the Tn. Note that by the third condition of coherence, ℓ(P(¬s|T∞))≥ℓ(P(¬s|Tn)) for all n, so ℓ(P(s|T∞))≤ℓ(P(s|Tn)) for all n
Consider ℓ(P(s|T0)) and ℓ(P(s|T∞)). These numbers cannot both be finite, since ℓ(P(s|T∞))≤ℓ(P(s|Tn))≤ℓ(P(s|T0))−n. Therefore, at least one of P(s|T0) and P(s|T∞) must be 0 or 1. However neither T0 nor T∞ prove or disprove s, so this means that P assigns conditional probability 1 to some statement that cannot be proven.
Open Problem: Does this theorem still hold if we leave condition 3 out of the definition of coherence?