an attempt to rescue what seems to me like the intuition in the OP:
(note that the result is underwhelming, but perhaps informative.)
in the lambda calculus we might say “given (f : A → A), we can get an element (a : A) by the equation (a := f a)”.
recursive definitions such as this one work perfectly well, modulo the caveat that you need to have developed the Y combinator first so that you can ground out this new recursive syntax (assuming you don’t want to add any new basic combinators).
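to make the grounding concrete: in a strict language like python the relevant combinator is Z, the call-by-value variant of Y (the names Z, F, fact below are mine, not anything from the OP):

```python
# Z combinator: the call-by-value variant of Y. the eta-expansion
# (lambda x: s(s)(x)) delays the self-application so evaluation terminates.
Z = lambda f: (lambda s: f(lambda x: s(s)(x)))(lambda s: f(lambda x: s(s)(x)))

# a recursive definition "fact := F fact", grounded without ever naming fact:
F = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)
fact = Z(F)

print(fact(5))  # → 120
```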
by a directly analogous argument, we might wish to define (löb f := f “löb f”). very direct! very quine-like!
and, this proof works perfectly well. …modulo the caveat that you need to have developed a modified Y combinator first, so that you can ground out this recursive syntax (assuming you don’t want to add any new axioms).
so if you prove the existence of a modified Y-combinator (that works w/ quotations), then you can add recursive syntax (modulo quotation) to your theory, and get a very “direct/quine-like” proof of löb’s theorem.
the only hiccup is that the modified-Y-combinator is itself löb’s theorem, so the above proof amounts to “once you have proven löb’s theorem, there is a very direct and intuitive proof of löb’s theorem”, which… is perhaps less exciting :-p
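(for reference, the standard statements being run in parallel here, in the usual notation rather than the OP’s:)

```latex
% löb's theorem: if the theory proves (□C → C), then it proves C
\vdash \Box C \to C \quad\Longrightarrow\quad \vdash C
% internalized form (the modal axiom GL):
\Box(\Box C \to C) \to \Box C
% and the Y combinator's defining equation, for comparison:
Y\,f = f\,(Y\,f)
```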
(bonus fun fact: in type theory, there’s an operator that forms quines. and the proof of löb’s theorem is extremely simple, given this operator; you just take a (syntactic) term of type □ A → A, and feed it the quotation of the result of this whole operation, using the quining operator. the hiccup is, ofc, that the quining operator is itself just löb’s theorem!)
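(incidentally, this operator has a well-known functional-programming shadow, piponi’s loeb x = fmap ($ loeb x) x; a python transcription, offered purely as a toy analogy and not as the proof-theoretic statement, might look like:)

```python
# toy analogue of the haskell "loeb" function: each entry of fs has type
# (view of the final results -> result), i.e. it may refer to the eventual
# answers of the other entries; loeb ties the knot via lazy memoized lookup.
def loeb(fs):
    memo = {}

    class Results:
        # lazy, memoized view of the eventual result list
        def __getitem__(self, i):
            if i not in memo:
                memo[i] = fs[i](self)
            return memo[i]

    view = Results()
    return [view[i] for i in range(len(fs))]

# a tiny "spreadsheet": cell 0 is 1, cell 1 is cell 0 plus 1,
# cell 2 is the sum of cells 0 and 1.
cells = [lambda s: 1, lambda s: s[0] + 1, lambda s: s[0] + s[1]]
print(loeb(cells))  # → [1, 2, 3]
```

(genuinely circular dependencies recurse forever here, which is the usual caveat.)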
various attempts to distill my objection:
the details of the omitted “zip” operation are going to be no simpler than the standard proof of löb’s theorem, and will probably turn out to be just a variation on the standard proof of löb’s theorem (unless you can find a way of building self-reference that’s shorter than the Y combinator (b/c the standard proof of löb’s theorem is already just the Y combinator plus the minimal modifications to interplay with gödel-codes))
even more compressed: the normal proof of löb is contained in the thing labeled “zip”. the argument in the OP is operating under the assumption that a proof can make reference to its own gödel-number, and that assumption is essentially just löb’s theorem.
a thing that might help is asking what specific proof-steps the step “1. the gödel-code P of this very proof proves C” cashes out into. obviously step 1 can’t cash out into verifying (under quotation) every step of the entire proof (by checking for the appropriate prime-factors), because then the proof would require at least one step for every step in the proof plus three extra steps at the end, which would make the proof infinite (and infinite proofs admit contradictions).
so step 1 has to involve some indirection somehow. (cf: we can’t get a fixpoint in λ-calculus by f (f (f ..., we need some indirection, as occurs in (Y f) aka (λ s. f (s s)) (λ s. f (s s)), to make the term finite.)
this works fine, as long as you are ok w/ the gödel-code in step 1 being not P, but some different proof of the same conclusion (analogous to the difference between f (Y f) and Y f), and so long as you have the analog of the Y combinator (modified to interplay with gödel-codes).
but the analog of the Y combinator is precisely löb’s theorem :-p
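(to see why the indirection in step 1 is forced, note that the “f (f (f (...” version really does regress forever when transcribed directly; a python sketch:)

```python
import sys

# "löb f := f ⌜löb f⌝" transcribed with no indirection: to build the
# fixpoint we must already have built the fixpoint, so it never bottoms out.
def Y_naive(f):
    return f(Y_naive(f))  # Y_naive(f) is evaluated before f ever runs

sys.setrecursionlimit(500)  # keep the inevitable failure quick
try:
    Y_naive(lambda rec: lambda n: 1)
    print("terminated")
except RecursionError:
    print("infinite regress")  # → this branch is the one taken
```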
if i understand your OP correctly, we can factor it into three claims:
1. when proving C, it’s valid to assume a reference «this» to the gödel-code of a proof of C
2. given (1) and a proof f of □ C → C, there’s a proof of C by simply applying f to «this»
3. you can arrange things such that if you desugar «this» into a particular numeral in “f «this»”, you’ll find that that numeral is the gödel-code of “f «this»”.
(1) is löb’s theorem.
(2) is an indirect proof of löb’s theorem, from the assumption of löb’s theorem (in direct analogy with how, once we’ve found syntax Y f for the fixpoint of f, we can wrap it in f to find an alternate syntax, namely f (Y f)).
(3) is false (i’m fairly confident)--you can get real close, but you can’t quite get there (in direct analogy to how you can get λ-term Y f that means the same thing as f (Y f), but you can’t get a λ-term fix f that is syntactically exactly f (fix f), on pain of infinite syntax).
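(the λ-calculus half of that analogy can be checked mechanically; in python, with Z standing in for Y since python is strict:)

```python
# Z: the call-by-value Y combinator (self-contained for this snippet)
Z = lambda f: (lambda s: f(lambda x: s(s)(x)))(lambda s: f(lambda x: s(s)(x)))

F = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

a = Z(F)     # "Y f": the fixpoint
b = F(Z(F))  # "f (Y f)": one unfolding, a syntactically different term...
print([a(n) for n in range(6)] == [b(n) for n in range(6)])  # → True
# ...that computes the same function. no finite term is literally its own
# unfolding, just as no finite proof can literally contain its own code.
```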
afaict, there are two ways to read this proposal (if we port across the analogy to λ-calculus), according to whether you cash “zip” out to a recursive instance of the whole proposal, vs to some different proof of löb’s theorem:
1. “i hear that people find the Y combinator confusing. instead of using (Y f) to name the fixpoint of f, why not use f (f (f (f (... ad infinitum?”
2. “i hear that people find the Y combinator confusing. instead of using (Y f) to name the fixpoint of f, why not use f (Y f)?”
the answer to (1) is that it doesn’t make sense unless you allow infinite syntax trees (which aren’t permitted in normal math b/c infinite proofs admit contradictions).
the answer to (2) is that, while (Y f) may be somewhat tricky to grasp, f (Y f) is surely no simpler :-p
I was interpreting things somewhat differently:
1. “this” should be a reference to the Gödel code of the current proof
2. the OP would like to have what is essentially an application of the diagonal lemma to the whole proof instead of a proposition
3. “zip”/“unzip” is simply taking the Gödel numbering, and doesn’t hide any extra complexity
In this case the problem is making precise what one hopes for in step 2 (I don’t really see how one would do it), and then proving that one can make it work.
part of my point is that the standard proof of löb does essentially apply the main idea of the diagonal lemma to the whole proof instead of to a proposition. when you look at the standard proof of löb from the right angle, it looks very similar to the standard proof of the diagonal lemma (namely, they both look like lawvere’s fixpoint theorem, as does the Y combinator).
löb lets you define a proof by reference to its own gödel-code, just like how the diagonal lemma lets you define a proposition by reference to its own gödel-code. (the former just happens to recurse into the latter at one point along the way, as an implementation detail.)
(“but wait, if the mechanisms are so similar, then why doesn’t the diagonal lemma have to recurse into the diagonal lemma at one point along the way, as an implementation detail?” it almost does! and this almost sinks the ship! the diagonal lemma requires an extra hacky insight, that works only in the special case of the diagonal lemma, and that is required to get off the ground. i discuss this in footnote 2 of my original comment, iirc.)
another part of my point is that the hard part of zip/unzip isn’t the part where you take the gödel-numbering, it’s the part where you make the gödel-numbering of the proof you’re currently naming available to the proof that you haven’t finished defining yet. that’s the tricky part of löb’s theorem. if we ignore that part, it’s no surprise that we find the remainder is simpler: we skipped the hard part!
(another thing that might help: when you’re proving an implication □ C → C, the gödel-number that you’re given doesn’t code for the proof of the implication you’re currently writing; that would be ill-typed. you asked for a □ C, not a □ (□ C → C). so the gödel-number you’re given isn’t a code for the thing you’re currently writing, it’s a code for löb’s theorem applied to the thing you’re currently writing.
it is for this reason that the proof you’re fed might not be exactly the proof you were hoping for. you started out your implication being like “step 1: this proof is a proof of C”, but then the proof that you’re fed starts out saying “step 1: apply löb’s theorem to the following implication”, and only at step 2 is it like “this proof is a proof of C”.
you might have really wanted the proof you’re fed to start with “step 1: this proof is a proof of C”. and you might say “well, inline the application of löb’s theorem there, so that the code i’m fed codes for a proof that actually starts that way”. this is like trying to get the fixpoint of f in the λ-calculus by way of f (f (f (...: you can do that for as many steps as you like, but you have to end at (Y f) at some point, on pain of infinite terms (which are contradictory in the mathematical setting).
alternatively, you might accept that the proof you’re fed codes for a proof that starts by saying “apply löb’s theorem to the following implication: …”; this corresponds to being fed the term (Y f) and producing the term f (Y f).
in neither case does it seem to me like there’s an idea for some alternative method of producing the self-reference, so as to sidestep the Y-combinator-esque machinery that makes up the bulk of the proof of löb’s theorem (and that is the source of the recursive call into the diagonal lemma).)
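(to summarize the typing in the usual notation, with ⌜·⌝ for gödel-quotation; this is just restating the above, not new machinery:)

```latex
% hypothesis and conclusion of löb's theorem:
f : \Box C \to C \qquad \text{löb}\,f : C
% the self-referential equation the proof realizes:
\text{löb}\,f \;=\; f\,\ulcorner \text{löb}\,f \urcorner
% the code fed to f quotes "löb applied to f", not the body itself;
% exactly as fix f = f (fix f) feeds f the term "fix f", not "f (fix f)".
```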