No Good Logical Conditional Probability

Fix a theory $T$ over a language $L$. A coherent probability function is one that satisfies the laws of probability theory; each coherent probability function represents a probability distribution on the complete logical extensions of $T$.

One of many equivalent definitions of coherence is that $P$ is coherent if $P (s_{1}) + P (s_{2}) + \dots + P (s_{k}) = 1$ whenever $T$ can prove that exactly one of $s_{1}, \dots, s_{k}$ is true.
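As a rough illustration of this definition, here is a minimal sketch in a toy propositional setting, where each truth assignment stands in for a complete extension of $T$. The atoms `a` and `b`, the weights, and the helper `prob` are all hypothetical choices made for the example, not part of the original framework.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("a", "b")  # hypothetical atoms of a toy propositional language

# Each "world" is a complete truth assignment, standing in for a complete extension.
WORLDS = [dict(zip(ATOMS, vals)) for vals in product([False, True], repeat=len(ATOMS))]

# An arbitrary illustrative distribution over the four worlds (weights sum to 1).
WEIGHTS = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]


def prob(sentence):
    """P(sentence): total weight of the worlds in which the sentence holds."""
    return sum(w for world, w in zip(WORLDS, WEIGHTS) if sentence(world))


# Three sentences of which exactly one is true in every world.
s1 = lambda w: w["a"] and w["b"]
s2 = lambda w: w["a"] and not w["b"]
s3 = lambda w: not w["a"]

# Coherence requires these probabilities to sum to exactly 1.
total = prob(s1) + prob(s2) + prob(s3)
```

Any function obtained this way, by summing a distribution over worlds, passes the check; the sketch only shows the direction from distributions to coherent assignments.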

Another very basic desirable property is that $P (s) = 1$ only when $s$ is provable. There have been many proposals of specific coherent probability assignments that all satisfy this basic requirement. Many satisfy stronger requirements that give bounds on how far $P (s)$ is from 1 when $s$ is not provable.

In this post, I modify the framework slightly to instead talk about conditional probability. Consider a function $P$ which takes in a consistent theory $T$ and a sentence $s$, and outputs a number $P (s | T) \in [0, 1]$, which represents the conditional probability of $s$ given $T$.

We say that $P$ is coherent if:

1. $P (s_{1} | T) + P (s_{2} | T) + \dots + P (s_{k} | T) = 1$ whenever $T$ can prove that exactly one of $s_{1}, \dots, s_{k}$ is true,
2. $P (s \land r | T) = P (r | T) \cdot P (s | T \cup \{r\})$, and
3. if $s$ proves every sentence in $T$, then $P (s | R \cup T) \geq P (s | R)$.
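To make the three conditions concrete, the following sketch checks each of them once for ordinary conditioning on a full-support prior over a toy propositional language. The atoms `s`, `r`, `q`, the prior, and the helper `cond` are assumptions made for illustration. This finite setting does not contradict the theorem below, since the proof needs an unbounded supply of independent sentences, which a finite language cannot provide.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("s", "r", "q")  # hypothetical atoms
WORLDS = [dict(zip(ATOMS, v)) for v in product([False, True], repeat=len(ATOMS))]
# Arbitrary full-support prior: weights 1..8, normalised by their sum (36).
PRIOR = [Fraction(i + 1, 36) for i in range(len(WORLDS))]


def cond(sentence, theory):
    """P(sentence | theory) by ordinary conditioning; theory is a list of sentences."""
    models = [(w, p) for w, p in zip(WORLDS, PRIOR) if all(t(w) for t in theory)]
    mass = sum(p for _, p in models)
    if mass == 0:
        raise ValueError("theory is inconsistent")
    return sum(p for w, p in models if sentence(w)) / mass


s = lambda w: w["s"]
r = lambda w: w["r"]
q = lambda w: w["q"]
s_and_r = lambda w: w["s"] and w["r"]
s_and_not_r = lambda w: w["s"] and not w["r"]
not_s = lambda w: not w["s"]
s_or_r = lambda w: w["s"] or w["r"]  # a sentence that s proves

T = [q]

# Condition 1: exactly one of the three sentences holds in every world.
cond1_ok = cond(s_and_r, T) + cond(s_and_not_r, T) + cond(not_s, T) == 1

# Condition 2: the product rule P(s & r | T) = P(r | T) * P(s | T + {r}).
cond2_ok = cond(s_and_r, T) == cond(r, T) * cond(s, T + [r])

# Condition 3: s proves every sentence in [s_or_r], so adding it cannot lower P(s | .).
cond3_ok = cond(s, T + [s_or_r]) >= cond(s, T)
```

Exact rational arithmetic is used so that the equalities in conditions 1 and 2 can be tested without floating-point tolerance.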

Theorem: There is no coherent conditional probability function $P$ such that $P (s | T) = 1$ only when $T$ proves $s$.

Proof:

This proof will use the notation of log odds, $ℓ (p) = \log_{2} (\frac{p}{1 - p})$, to make things simpler.
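For readers who want to experiment with this notation, here is a small sketch of $ℓ$ and its inverse. The function names `log_odds` and `from_log_odds` are hypothetical. The point to notice is that halving the odds $\frac{p}{1 - p}$ lowers the log odds by exactly 1.

```python
import math


def log_odds(p):
    """l(p) = log2(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("log odds are finite only for 0 < p < 1")
    return math.log2(p / (1.0 - p))


def from_log_odds(x):
    """Inverse of log_odds: the probability whose log odds equal x."""
    return 1.0 / (1.0 + 2.0 ** (-x))


p = 0.8                    # odds 4 : 1
halved = from_log_odds(log_odds(p) - 1.0)  # odds 2 : 1, i.e. probability 2/3
```

Probabilities 0 and 1 correspond to log odds of $-\infty$ and $+\infty$, which is why the sketch rejects them rather than returning a number.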

Let $P$ be a coherent conditional probability function. Fix a sentence $s$ which is neither provable nor disprovable from the empty theory. Construct an infinite sequence of theories as follows:

$T_{0}$ is the empty theory.
To construct $T_{n + 1}$, choose a sentence $r_{n}$ such that neither $s \to r_{n}$ nor $s \to \neg r_{n}$ is provable in $T_{n}$. If $P (s \land r_{n} | T_{n}) \leq P (s \land \neg r_{n} | T_{n})$, then let $T_{n + 1} = T_{n} \cup \{s \to r_{n}\}$. Otherwise, let $T_{n + 1} = T_{n} \cup \{s \to \neg r_{n}\}$.

Fix an $n$, write $r$ for $r_{n}$, and without loss of generality assume $P (s \land r | T_{n}) \leq P (s \land \neg r | T_{n})$. Since $P$ is coherent, we have $P (s \land r | T_{n}) + P (s \land \neg r | T_{n}) = P (s | T_{n})$. In particular, this means that $P (s \land r | T_{n}) \leq \frac{1}{2} P (s | T_{n})$.

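Since $T_{n + 1} = T_{n} \cup \{s \to r\}$, the second condition of coherence gives $P (s \land (s \to r) | T_{n}) = P (s | T_{n + 1}) P (s \to r | T_{n})$ and $P (\neg s \land (s \to r) | T_{n}) = P (\neg s | T_{n + 1}) P (s \to r | T_{n})$. Now $s \land (s \to r)$ is logically equivalent to $s \land r$, and $\neg s \land (s \to r)$ is logically equivalent to $\neg s$, so dividing the first equation by the second gives $P (s \land r | T_{n}) / P (\neg s | T_{n}) = P (s | T_{n + 1}) / P (\neg s | T_{n + 1})$. Therefore $\frac{1}{2} P (s | T_{n}) / P (\neg s | T_{n}) \geq P (s | T_{n + 1}) / P (\neg s | T_{n + 1})$.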

In the language of log odds, this means that $ℓ (P (s | T_{n})) - 1 \geq ℓ (P (s | T_{n + 1}))$ .
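The construction and the per-step bound can be tried out numerically. The sketch below is only a finite toy: it assumes ordinary conditioning on an arbitrary full-support prior, takes each $r_{n}$ to be a fresh atom, and can therefore run for only `K` steps, whereas the proof needs infinitely many. All names (`K`, `cond`, `log_odds`, `history`) are hypothetical.

```python
import math
from fractions import Fraction
from itertools import product

K = 4  # number of construction steps the toy language can support
ATOMS = ("s",) + tuple(f"r{n}" for n in range(K))
WORLDS = [dict(zip(ATOMS, v)) for v in product([False, True], repeat=len(ATOMS))]
RAW = [(i % 5) + 1 for i in range(len(WORLDS))]  # arbitrary positive weights
PRIOR = [Fraction(x, sum(RAW)) for x in RAW]


def cond(sentence, theory):
    """P(sentence | theory) by ordinary conditioning on the prior."""
    models = [(w, p) for w, p in zip(WORLDS, PRIOR) if all(t(w) for t in theory)]
    mass = sum(p for _, p in models)
    return sum(p for w, p in models if sentence(w)) / mass


def log_odds(p):
    """log2(p / (1 - p)) for a probability strictly between 0 and 1."""
    return math.log2(float(p) / float(1 - p))


s = lambda w: w["s"]
theory = []  # T_0 is the empty theory
history = [log_odds(cond(s, theory))]

for n in range(K):
    name = f"r{n}"
    s_and_r = lambda w, name=name: w["s"] and w[name]
    s_and_not_r = lambda w, name=name: w["s"] and not w[name]
    if cond(s_and_r, theory) <= cond(s_and_not_r, theory):
        new_axiom = lambda w, name=name: (not w["s"]) or w[name]        # s -> r_n
    else:
        new_axiom = lambda w, name=name: (not w["s"]) or not w[name]    # s -> not r_n
    theory = theory + [new_axiom]  # T_{n+1}
    history.append(log_odds(cond(s, theory)))

# Amount by which the log odds of s fall at each step.
drops = [history[i] - history[i + 1] for i in range(K)]
```

In this toy model every entry of `drops` is at least 1, matching the inequality above; with a uniform prior each drop would be exactly 1.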

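Let $T_{\infty}$ be the union of all the $T_{n}$. Every sentence added after stage $n$ has the form $s \to r_{k}$ or $s \to \neg r_{k}$, and $\neg s$ proves all of these. So by the third condition of coherence, $ℓ (P (\neg s | T_{\infty})) \geq ℓ (P (\neg s | T_{n}))$ for all $n$, and hence $ℓ (P (s | T_{\infty})) \leq ℓ (P (s | T_{n}))$ for all $n$.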

Consider $ℓ (P (s | T_{0}))$ and $ℓ (P (s | T_{\infty}))$. These numbers cannot both be finite, since $ℓ (P (s | T_{\infty})) \leq ℓ (P (s | T_{n})) \leq ℓ (P (s | T_{0})) - n$ for every $n$. Therefore, at least one of $P (s | T_{0})$ and $P (s | T_{\infty})$ must be 0 or 1. However, neither $T_{0}$ nor $T_{\infty}$ proves or disproves $s$, so $P$ assigns conditional probability 1 to some statement (either $s$ or $\neg s$) that cannot be proven.

Open Problem: Does this theorem still hold if we leave condition 3 out of the definition of coherence?