Löb’s Lemma: an easier approach to Löb’s Theorem
Related to: Löb’s Theorem for implicit reasoning in natural language: Löbian party invitations
tl;dr: Löb’s Theorem is much easier to grok if you separate the parts of the proof that use the assumption ⊢□p→p from the parts that don’t. The parts that don’t use ⊢□p→p can be extracted as a stand-alone result, which I hereby dub “Löb’s Lemma”.
Here’s how it works.
Key properties of ⊢ and □
Here is some standard notation. In short, ⊢ is our symbol for talking about what PA can prove, and □ is shorthand for PA’s symbols for talking about what (a copy of) PA can prove.
“⊢ 1+1=2” means “Peano Arithmetic (PA) can prove that 1+1=2”. No parentheses are needed; the “⊢” applies to the whole line that follows it. Also, ⊢ does not stand for an expression in PA; it’s a symbol we use to talk about what PA can prove.
“□(1+1=2)” basically means the same thing. More precisely, it stands for a numerical expression within PA that can be translated as saying “⊢ 1+1=2”. This translation is possible because of something called a Gödel numbering, which allows PA to talk about a (numerically encoded version of) itself.
“□A” is short for “□(A)” in cases where “A” is just a single character of text.
“X⊢Y” means “PA, along with X as an additional axiom/assumption, can prove Y”. In this post we don’t have any analogous notation for □.
The proofs here will use the following standard properties of ⊢ and □ (source: Wikipedia), which respectively stand for provability and provability encoded within arithmetic.
1. (necessitation) From ⊢A, conclude ⊢□A. Informally, this says that if A can be proven, then it can be proven that it can be proven (by just writing out and checking the proof within arithmetic). Note that from X⊢A we cannot conclude X⊢□A, because □ still means “PA can prove”, and not “PA+X can prove”.
2. (internal necessitation) ⊢□A→□□A. If A is provable, then it is provable that it is provable (basically the same as the previous point).
3. (box distributivity) ⊢□(A→B)→(□A→□B). This rule allows one to apply modus ponens inside the provability operator: if it is provable that A implies B, and A is provable, then B is provable.
4. (deduction theorem) From X⊢Y, conclude ⊢X→Y: if assuming X is enough to prove Y, then it’s possible to prove under no assumptions that X→Y.
Point 4 is helpful and pretty intuitive, but for whatever reason isn’t used in the main Wikipedia article on Löb’s Theorem.
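To make these rules concrete, here’s a minimal LCF-style sketch in Haskell (a toy encoding of my own, not anything from the post or a library): formulas of provability logic, with rules 1–3 plus modus ponens as the only ways to build a value of type Theorem. The deduction theorem doesn’t fit this encoding, since it moves between the judgments ⊢ and X⊢, so the mechanized derivations later in the post take the deduction-theorem-free route instead.

```haskell
-- A toy LCF-style encoding (illustrative only): every Theorem value
-- must be built from the rules below, mirroring rules 1-3 above.
data Formula
  = Var String           -- an atomic sentence, e.g. p
  | Imp Formula Formula  -- A → B
  | Box Formula          -- □A
  deriving (Eq, Show)

newtype Theorem = Theorem Formula deriving Show  -- represents ⊢A

-- 1. (necessitation) from ⊢A, conclude ⊢□A
nec :: Theorem -> Theorem
nec (Theorem a) = Theorem (Box a)

-- 2. (internal necessitation) ⊢□A→□□A, for any formula A
internalNec :: Formula -> Theorem
internalNec a = Theorem (Imp (Box a) (Box (Box a)))

-- 3. (box distributivity) ⊢□(A→B)→(□A→□B), for any formulas A and B
boxDistrib :: Formula -> Formula -> Theorem
boxDistrib a b = Theorem (Imp (Box (Imp a b)) (Imp (Box a) (Box b)))

-- modus ponens on theorems: from ⊢A→B and ⊢A, conclude ⊢B
mp :: Theorem -> Theorem -> Theorem
mp (Theorem (Imp a b)) (Theorem a') | a == a' = Theorem b
mp _ _ = error "mp: shapes don't match"
```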
Löb’s Lemma
Claim: Assume Ψ and p are any statements satisfying ⊢Ψ↔(□Ψ→p). Then ⊢□Ψ↔□p.
Intuition: By assumption, the sentence Ψ is equivalent to saying “If this sentence is provable, then p”. Intuitively, Ψ has very little content except for the p part at the end, so it makes sense that □Ψ boils down to nothing more than □p in terms of logical equivalence.
Reminder: this does not use the assumption ⊢□p→p from Löb’s Theorem at all.
Proof:
Let’s do the forward implication first:
1. □Ψ⊢□□Ψ by internal necessitation (⊢□Ψ→□□Ψ).
2. □Ψ⊢□(□Ψ→p) by applying necessitation and then box distributivity to the assumption, with A=Ψ and B=(□Ψ→p).
3. □Ψ⊢□p from 1 and 2 by box distributivity.
4. ⊢□Ψ→□p from 3 by the deduction theorem.
Now for the backwards implication, which isn’t needed for Löb’s Theorem, but is handy anyway:
1. ⊢p→(□Ψ→p) is a tautology.
2. ⊢□p→□(□Ψ→p) by necessitation and box distributivity on 1.
3. ⊢□(□Ψ→p)→□Ψ by necessitation and box distributivity on the backward half of the assumption.
4. ⊢□p→□Ψ by 2 and 3.
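As a sanity check, here’s the forward direction transcribed into the toy Haskell encoding from earlier, following the deduction-theorem-free route (essentially the same chain a commenter spells out below). The two “glue” helpers stand in for ordinary propositional reasoning inside PA; both shapes are propositionally valid, so PA proves them, and assuming them here seems safe:

```haskell
-- Propositional glue (assumed, not derived): from ⊢A→B and ⊢B→C,
-- conclude ⊢A→C; and from ⊢A→(B→C) and ⊢A→B, conclude ⊢A→C.
impTrans :: Theorem -> Theorem -> Theorem
impTrans (Theorem (Imp a b)) (Theorem (Imp b' c))
  | b == b' = Theorem (Imp a c)
impTrans _ _ = error "impTrans: shapes don't match"

impS :: Theorem -> Theorem -> Theorem
impS (Theorem (Imp a (Imp b c))) (Theorem (Imp a' b'))
  | a == a' && b == b' = Theorem (Imp a c)
impS _ _ = error "impS: shapes don't match"

-- Forward direction of Löb's Lemma: given the forward half of the
-- assumption as axFwd = ⊢Ψ→(□Ψ→p), produce ⊢□Ψ→□p.
lemmaFwd :: Formula -> Formula -> Theorem -> Theorem
lemmaFwd psi p axFwd =
  let body = Imp (Box psi) p                       -- □Ψ→p
      s1   = nec axFwd                             -- ⊢□(Ψ→(□Ψ→p))
      s2   = mp (boxDistrib psi body) s1           -- ⊢□Ψ→□(□Ψ→p)
      s3   = impTrans s2 (boxDistrib (Box psi) p)  -- ⊢□Ψ→(□□Ψ→□p)
  in impS s3 (internalNec psi)                     -- ⊢□Ψ→□p
```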
I like this result because both directions of the proof are fairly short, it doesn’t use the assumption ⊢□p→p at all, and the conclusion itself is also fairly intuitive. The statement Ψ just turns out to have no content except for p itself, from the perspective of writing proofs.
Löb’s Theorem, now in just 6 lines
If you can remember Löb’s Lemma, you can write a very straightforward proof of Löb’s Theorem in just 6 lines:
Claim: If p is any sentence such that ⊢□p→p, then ⊢p.
Proof:
Let Ψ be any sentence satisfying ⊢Ψ↔(□Ψ→p), which exists by the existence of modal fixed points (or by the Diagonal Lemma).
1. ⊢□Ψ↔□p by Löb’s Lemma.
2. ⊢□p→p by assumption.
3. ⊢□Ψ→p by 1 and 2 combined.
4. ⊢Ψ by 3 and the defining property of Ψ.
5. ⊢□Ψ from 4 by necessitation.
6. ⊢p by 3 and 5.
«mic drop»
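And here’s the six-line proof run through the same toy encoding. The fixpoint Ψ and both halves of ⊢Ψ↔(□Ψ→p) are taken as inputs, since the Diagonal Lemma construction is exactly the part this sketch can’t reproduce:

```haskell
-- Löb's Theorem in the toy encoding: given assump = ⊢□p→p and both
-- halves of the fixpoint property (axFwd = ⊢Ψ→(□Ψ→p) and
-- axBak = ⊢(□Ψ→p)→Ψ), produce ⊢p.
loebThm :: Formula -> Formula -> Theorem -> Theorem -> Theorem -> Theorem
loebThm psi p axFwd axBak assump =
  let s1 = lemmaFwd psi p axFwd  -- ⊢□Ψ→□p  (Löb's Lemma)
      s3 = impTrans s1 assump    -- ⊢□Ψ→p   (lemma and assumption combined)
      s4 = mp axBak s3           -- ⊢Ψ      (defining property of Ψ)
      s5 = nec s4                -- ⊢□Ψ     (necessitation)
  in mp s3 s5                    -- ⊢p
```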
Why is it OK to use the deduction theorem, though? In standard modal logics like K and S5 the deduction theorem doesn’t hold (otherwise you could assume P, use necessitation to get □P, and then use the deduction theorem to get P→□P as a theorem).
Well, the deduction theorem is a fact about PA (and propositional logic), so it’s okay to use as long as ⊢ means “PA can prove”.
But you’re right that it doesn’t mix seamlessly with the (outer) necessitation rule. Necessitation is a property of “⊢”, but not generally a property of “X⊢”. When PA can prove something, it can prove that it can prove it. By contrast, if PA+X can prove Y, that does mean that PA can prove that PA+X can prove Y (because PA alone can work through proofs in a Gödel encoding), but it doesn’t mean that PA+X can prove that PA can prove Y. This can be seen by example, by setting X=Y=¬□(1=0).
As for the case where you want ⊢ to refer to K or S5 instead of PA provability, those logics are still built on propositional logic, for which the deduction theorem does hold. So if you do the deduction only using propositional logic from theorems in ⊢ along with an additional assumption X, then the deduction theorem applies. In particular, inner necessitation and box distributivity are both theorems of ⊢ for every A and B you stick into them (rather than meta theorems about ⊢, which is what necessitation is). So the application of the deduction theorem here is still valid.
Still, the deduction theorem isn’t safe to use willy-nilly along with the (outer) necessitation rule, so I’ve just added a caveat about that to the necessitation rule above.
Thanks for calling this out.
It might be worth being explicit that ⊢ has lower precedence than the other operators (which in some sense are part of a different language). I, like maybe Gurkenglas, spent some time wondering why (⊢□A)→(□□A) wasn’t just a special case of necessitation.
I’m confused by your use of the deduction theorem. It’s only used in the forward implication argument, and seems unnecessary to me. (The linked wiki article doesn’t mention it.) More precisely, it only seems necessary to move things left-to-right across a turnstile, because you’ve previously moved them right-to-left in a way that isn’t obviously-to-me valid.
(I mean, it’s obvious to me that if we can prove that A implies B, then assuming A lets us prove B. But it would also be obvious to me that if assuming A lets us prove B, we can prove that A implies B, yet you dedicated a theorem to allowing that. And I don’t trust things that are obvious to me to be actually “true”, let alone valid moves here.)
We can just make the whole argument without doing that and it seems fine to me? Looking at it in more depth:
(This does require that we can do logic with the things we’ve proved. E.g. if we have ⊢x and ⊢x→y, we can conclude ⊢y. I think that’s okay, but seems worth being explicit about.)
This is just doing the thing I question.
The assumption is ⊢Ψ↔(□Ψ→p), i.e. ⊢A↔B. Weaken it and use necessitation to get ⊢□(A→B). Box distributivity is ⊢□(A→B)→(□A→□B). From those two, we can get ⊢□A→□B. Expanding again, this is ⊢□Ψ→□(□Ψ→p).
Box distributivity lets us go from ⊢□(□Ψ→p) to ⊢□□Ψ→□p. So using 2, we have ⊢□Ψ→(□□Ψ→□p). And internal necessitation gives us ⊢□Ψ→□□Ψ, so combining those, ⊢□Ψ→□p.
No need for this, we already finished.
It’s true that the deduction theorem is not needed, as in the Wikipedia proof. I just like using the deduction theorem because I find it intuitive (assume A, prove B, then drop the assumption and conclude A→B) and it removes the need for lots of parentheses everywhere.
I’ll add a note about the meaning of ⊢ so folks don’t need to look it up, thanks for the feedback!
Thanks, that makes sense. And the added explanation helps a lot, I can see the argument for going from ⊢A→B to A⊢B now:
PA proves A→B;
So PA + A certainly proves A→B, since it can prove everything PA can;
Also, PA + A proves A;
So we have A⊢A∧(¬A∨B);
Which in turn gives us A⊢B.
Can you explain more formally what the difference is between ⊢ and □? I’ve looked in Wikipedia and in the Cartoon Guide to Löb’s Theorem, but I still can’t get it.
I’ve now fleshed out the notation section to elaborate on this a bit. Is it better now?
Thank you, it’s much more clear now.
A set of axioms Σ within a vanilla first-order-logic language does not necessarily allow for the ability to speak about what statements are provable from Σ within that language. Consider the axiom scheme ∀x∀y(x=y). There’s not enough strength in Σ to create encodings of statements, which is necessary to make formal statements in the language which could be interpreted in some standard model to mean “Statement X is provable”. Seriously, try doing it. It should clear up why you need to build some notion of provability into the syntax and semantics of a logic if you want to talk about provability within a language. Then read up to section 3 in this SEP article.
EDIT: That is assuming you know stuff about provability predicates already, and things like arithmetization. If you don’t, then the above exercise is way too hard. In that case, I’d just say that Σ⊢ψ and Σ⊢□(ψ) represent “the axioms Σ prove the statement ψ” and “the axioms Σ prove that ψ is provable from some set of axioms T”. Here T is a “sufficiently strong” set of axioms, like Peano Arithmetic, which can do interesting things.
It seems important to clarify the difference between
1. From ⊢A, conclude ⊢□A
and
2. ⊢□A→□□A.
I don’t feel like I get why we don’t just set “conclude” := → and ⊢ := □.

Well, A→B is just short for ¬A∨B, i.e., “(not A) or B”. By contrast, A⊢B means that there exists a sequence of (very mechanical) applications of modus ponens, starting from the axioms of Peano Arithmetic (PA) with A appended, ending in B. We tried hard to design the rules of ⊢ so that it would agree with → in a lot of cases (i.e., we tried to design ⊢ to make the deduction theorem true), but it took a lot of work in the design of Peano Arithmetic and can’t be taken for granted.
For instance, consider the statement ¬□(1=0). If you believe Peano Arithmetic is consistent, then you believe that ¬□(1=0), and therefore you also believe that □(1=0)→(2=3). But PA cannot prove that ¬□(1=0) (by Gödel’s Theorem, or Löb’s theorem with p=(1=0)), so we don’t have ⊢□(1=0)→(2=3).
To check, are you reading the second as (⊢□A)→(□□A)? It’s meant to be ⊢(□A→□□A).
Nah.
How would the semantics for that work out?
⊢ already is to □ as □ is to quoted □. If we redo the post one □ further in, we never need external necessitation. Not sure if that answers your question.
I think this factoring hides the computational content of Löb’s theorem (or at least doesn’t make it obvious). Namely, that if you have f:□p→p, then Löb’s theorem is just the fixpoint of this function.
Here’s a one-line proof of Löb’s theorem, which is basically the same as the construction of the Y combinator (h/t Neel Krishnaswami’s blogpost from 2016):
löb(f) := let ψ := (λψ:Ψ. f(ψ.fwd(‘‘ψ''))) in ψ.bak(ψ), where ‘‘ψ'' is applying internal necessitation to ψ, and .fwd (resp. .bak) is the forward (resp. backward) direction of the point-surjection Ψ↔(□Ψ→p).
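If you want to see that fixpoint flavor actually run, Haskell’s laziness supports the well-known loeb function (popularized, as far as I know, by Dan Piponi’s “spreadsheet evaluation” post). The type is only a loose functorial analogue of the theorem, so treat this as an illustration, not a proof:

```haskell
-- Each "cell" is a function of the whole evaluated structure;
-- loeb ties the knot, like the modal fixpoint Ψ above.
loeb :: Functor f => f (f a -> a) -> f a
loeb x = fmap (\f -> f (loeb x)) x

-- A tiny spreadsheet: cells defined in terms of the finished sheet.
cells :: [[Int] -> Int]
cells =
  [ const 1             -- cell 0: the constant 1
  , \xs -> xs !! 0 + 1  -- cell 1: cell 0 plus 1
  , \xs -> xs !! 1 * 2  -- cell 2: cell 1 times 2
  ]

main :: IO ()
main = print (loeb cells)  -- prints [1,2,4]
```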