(These were some comments I had on a slightly earlier draft than this, so the page numbers and such might be slightly off.)
Page 4, footnote 8: I don’t think it’s true that only stronger systems can prove weaker systems consistent. It can happen that system A can prove system B consistent and A and B are incomparable, with neither stronger than the other. For example, Gentzen’s proof of the consistency of PA uses a system which is neither stronger nor weaker than PA.
Page 6: the hypotheses of the second incompleteness theorem are a little more restrictive than this (though not much, I think).
Page 11, problem c: I don’t understand the sentence containing “highly regular and compact formula.” Looks like there’s a typo somewhere.
I think there are more trivial counterexamples to the statement in footnote 8 also. Take Robinson arithmetic and throw in an axiom asserting the consistency of PA. This system can trivially prove that PA is consistent, and is much weaker than PA.
Your post confused me for a moment, because Robinson + Con(PA) is of course not weaker than PA. It proves Con(PA), and PA doesn’t.
I see now that your point is that Robinson arithmetic is sufficiently weak compared to PA that PA should not be weaker than Robinson + Con(PA). Is there an obvious proof of this?
(For example, if Robinson + Con(PA) proved all theorems of PA, would this contradict the fact that PA is not finitely axiomatizable?)
Yes, finite axiomatizability is the obvious way of seeing this. You are correct that strictly speaking Robinson + Con(PA) is not weaker than PA, but rather is another incomparable example (which was the intended point). Note that there are other ways of seeing that Robinson + Con(PA) does not prove everything PA does, without appealing to the fact that PA is not finitely axiomatizable, if one is willing to be slightly non-rigorous. For example, one can note that Robinson arithmetic has Z[x]+ as a model, so any theorem of Robinson + Con(PA) should be a theorem of Z[x]+ + Con(PA) (this step requires some details).
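(For concreteness, a standard reading of the notation Z[x]+, which isn't spelled out above:

  domain:  {0} union { p in Z[x] : the leading coefficient of p is positive }
  0 = the zero polynomial,  S(p) = p + 1,  + and * = ordinary polynomial operations.

One can check that each axiom of Robinson arithmetic holds in this structure, so it is a nonstandard model of Q: it contains elements, such as the polynomial x, that are not obtained from 0 by repeatedly applying S.)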
Ah, so my question was more along the line: does finite axiomatizability of a stronger (consistent) theory imply finite axiomatizability of the weaker theory? (This would of course imply Q + Con(PA) is not stronger than PA, Q being the usual symbol for Robinson arithmetic.)
On the model theoretic side, I think I can make something work, but it depends on distorting the specific definition of Con(PA) in a way that I’m not really happy about. In any case, I agree that your example is trivial to state and trivial to believe correct, but maybe it’s less trivial to prove correct.
Here’s what I was thinking:
Consider the predicate P(x) which says “if x != Sx, then x does not encode a PA-proof of 0=1”, and let ConMinus(PA) say for all x, P(x). Now, I think one could argue that ConMinus is a fair definition of (or substitute for?) Con, in that qualifying a formula with “if x != Sx” does not change its meaning in the standard model. Alternately, you could push this “if x != Sx” clause deeper, into basically every formula you would use to define the primitive recursive functions needed to talk about consistency in the first place, and you would not change the meanings of these formulas in the standard model. (I guess what I’m saying is that “the” formula asserting the consistency of PA is poorly specified.)
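(Spelled out in symbols, with Proof_PA(x, '0=1') as shorthand, introduced only here, for the Delta_0 formula "x encodes a PA-proof of 0=1":

  Con(PA)       :=  forall x, not Proof_PA(x, '0=1')
  ConMinus(PA)  :=  forall x, ( x != Sx  ->  not Proof_PA(x, '0=1') )

In the standard model every x satisfies x != Sx, so the extra guard is vacuous there, which is the sense in which ConMinus is a fair substitute for Con.)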
Also, PA is smart enough to prove that numbers are not their own successors, so PA believes in the equivalence of Con and ConMinus. In particular, since PA does not prove Con(PA) (second incompleteness theorem), PA does not prove ConMinus(PA), so PA is not stronger than Q + ConMinus(PA).
On the other hand, let M be the non-negative integers, together with one additional point omega. Put S(omega) = omega, put omega + anything = omega = anything + omega, and similarly for multiplication (except 0 * omega = omega * 0 = 0). I am pretty sure this is a model of Q.
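(A quick sanity check against the usual axiom list for Q, quoted from the standard presentation rather than from the draft: the axioms are Sx != 0, Sx = Sy -> x = y, x != 0 -> exists y (x = Sy), x + 0 = x, x + Sy = S(x + y), x*0 = 0, and x*Sy = x*y + x. The only new cases involve omega, and they all come out right, e.g.

  S(omega) = omega != 0
  omega + Sy = omega = S(omega + y)
  omega * Sy = omega = omega*y + omega   (the y = 0 case uses 0 + omega = omega)

and similarly for the remaining combinations, so M is indeed a model of Q.)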
Q is smart enough about its standard integers that it knows none of them encode PA-proofs of 0=1 (the “proves” predicate being Delta_0). Thus the model M satisfies Q + ConMinus(PA). But now we can see that Q + ConMinus(PA) is not stronger than PA, because PA proves “for all x, x is not equal to Sx”, yet this statement fails in a model of Q + ConMinus(PA).
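(Putting the two directions together in one place:

  PA proves forall x (x != Sx), but M |= omega = S(omega), so Q + ConMinus(PA) does not prove forall x (x != Sx);
  Q + ConMinus(PA) proves ConMinus(PA), which PA does not prove.

So neither theory proves everything the other does: they are incomparable, which is the kind of example the footnote discussion was after.)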
If I’m not mistaken, NBG and ZFC are a counterexample to this (finite axiomatizability of the stronger theory forcing finite axiomatizability of the weaker one): NBG is a conservative extension of ZFC (and therefore stronger than ZFC), but NBG is finitely axiomatizable while ZFC is not.
Yeah, the details of actually proving this are looking like they contain more subtleties than I expected, but I tentatively agree with your analysis. Here's what may be another proof. Not only is PA not finitely axiomatizable, but neither is any consistent extension of PA (I think this is true; the same proof that works for PA should go through here, but I haven't checked the details). So PA + ConMinus(PA) still isn't finitely axiomatizable, and hence Q + ConMinus(PA) is strictly weaker than PA + ConMinus(PA). So now, look at the instances of the induction schema (the axioms of PA beyond Q): Q + ConMinus(PA) can't prove all of them (if it did, it would prove everything PA + ConMinus(PA) does, contradicting the strict weakness just noted), but each of them is a theorem of PA (in fact an axiom). Does this work?
Overall, this is requiring a lot more subtlety than I initially thought was involved, which may make Qiaochu Yuan's example a better one.
Well, if we had this (that no consistent extension of PA is finitely axiomatizable), we would know immediately that Q + Con(PA) is not an extension of PA (which is what we originally wanted), because Q + Con(PA) certainly is finitely axiomatizable. I know there are several proofs that PA is not finitely axiomatizable, but I have not read any of them, so I can't comment on the strengthened statement, though it sounds true.
Page 4 footnote 8 in the version you saw looks like footnote 9 in mine.
I don’t see how ‘proof-of-bottom → bottom’ makes a system inconsistent. This kind of formula appears all the time in Type Theory, and is interpreted as “not(proof-of-bottom)”.
The ‘principle of explosion’ says ‘forall A, bottom → A’. We can instantiate A to get ‘bottom → not(proof-of-bottom)’, then compose this with “proof-of-bottom → bottom” to get “proof-of-bottom → not(proof-of-bottom)”. This is an inconsistency iff we can show proof-of-bottom. If our system is consistent, we can't construct a proof of bottom, so it remains consistent. If our system is inconsistent, then we can construct a proof of bottom and derive bottom, so our system remains inconsistent.
Have I misunderstood this footnote?
[EDIT: Ignore me for now; this is of course Löb's theorem for bottom. I haven't convinced myself of the existence of modal fixed points yet, though]
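(For reference, the Löb step the edit alludes to, in its standard form, writing []P for "P is provable in the system":

  Löb's theorem:  if the system proves []P -> P, then it proves P.
  Taking P := bottom:  a system that proves proof-of-bottom -> bottom already proves bottom, i.e. it is inconsistent.

So a consistent system can never prove proof-of-bottom -> bottom, even though that formula, read as not(proof-of-bottom), is true of it. This is presumably the point of the footnote.)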
That point (that the consistency-proving system need not be stronger) is strictly correct, but not relevant for self-improving AI. You don't want a parent AI that cannot prove everything that the child AI can prove. Maybe the footnote should be edited to reflect this.
Well, if A can prove everything B can, except for Con(A), and B can prove everything A can, except for Con(B), then you're relatively happy.
ETA: retracted (thanks to Joshua Z for pointing out the error).
I don't think this can happen: if A has proven Con(B), then it can reason using system B for consistency purposes and get, from the fact that B proves Con(A), that A proves Con(A), which is bad.
Thanks for pointing this out. My mathematical logic is rusty.