Every so often I hear seemingly mathematical statements involving the concept of being provable. For example:
I’ve seen Gödel’s Incompleteness Theorem stated as “if a mathematical system is powerful enough to express arithmetic, then either it contains a contradiction or there are true statements that it cannot prove.”
On the AI alignment forum, one of the pinned sequences describes Löb’s Theorem as “If Peano Arithmetic can prove that a proof of P would imply the truth of P, then it can also prove P is true”.
I find each of these statements baffling for a different reason:
Gödel: What could it mean for a statement to be “true but not provable”? Is this just because there are some statements such that neither P nor not-P can be proven, yet one of them must be true? If so, I would (stubbornly) contest that perhaps P and not-P really are both non-true.
Löb: How can a system of arithmetic prove anything? Much less prove things about proofs?
And I also have one more general confusion. What systems of reasoning could these kinds of theorems be set in? For example, I’ve heard that there are proofs that PA is consistent. Let’s say one of those proofs is set in Proof System X. Now how do we know that Proof System X is consistent? Perhaps it can be proven consistent by using Proof System Y? Do we just end up making an infinite chain of appeals up along a tower of proof systems? Or do we eventually drive ourselves into the ground by reaching system that nobody could deny is valid? If so, why don’t we just stop and PA or ZFC?
Oh, speaking of ZFC. There seems to be a debate about whether we should accept the Axiom of Choice. Isn’t it...obviously true? I don’t really understand this topic well enough to have any particular question about the AC debate, but my confusion definitely extends to that region of concept space.
So here’s my question: Where can I learn about “provability” and/or what clarifying insights could you share about it?
First, the question of “provability” is whether something can be proved using a given collection of axioms and rules.
That is, we are not concerned about whether a smart mathematician with great communication skills could convince us about the veracity of a statement. Instead, we are concerned about whether we can construct a sequence of statements culminating with the given statement, such that each step can be verified by a rather simple computer program, by matching it to the few predetermined patterns.
So it is perfectly okay to have a statement that is obviously true, but still cannot be proved using some set of axioms and rules. Think about it as a weakness of the given set of axioms and rules, rather than a property of the statement itself. In other words, we never talk about “provability of X” (except as a shortcut), but always about “provability of X in system S”. Provability is a relation between X and S.
Therefore, Gödel’s Theorem is not about statements that are unprovable per se (i.e. some more complicated version of “math will never be able to prove that 2 + 2 equals 4”), but rather it means “if you give me a non-contradictory system S of axioms and rules, I can create a statement X such that its veracity will be beyond the reach of S”. Plus there is an algorithm for creating a specific X for given S. The generated X will be tailored to the given S. (For different systems S1, S2, S3 you would get different X1, X2, X3; and it’s always S1 having a problem with X1, S2 with X2, and S3 with X3; but maybe X1 and X2 are easy to prove or disprove in S3.)
More specifically, the statement X generated by Gödel for system S is cleverly designed to be equivalent to “X cannot be proved by S” without being self-referential. To discuss the clever construction is beyond the scope of this comment.
.
Second, once we start talking about things like sets, we are now far outside the realm of reality. It doesn’t make sense to ask whether “sets really have a property P”, if the fact is that “in the first place, set’s don’t really exist”.
It would be like trying to use experimental physics to verify whether fairies are heavier than 10 grams. You simply can’t, because there are no fairies to measure. So if you have two competing theories of physics-including-supernatural, one saying that fairies are heavier than 10 grams, and the other saying they are not, well, you have a problem that cannot be resolved experimentally. A set theory with the Axiom of Choice, and a set theory with some incompatible axiom, are two such theories: they agree about everything that really exists, and they disagree about the properties of fairies… ahem, infinite sets. You can’t resolve this conflict by talking about what really exists; and the arguments about what doesn’t exist are by their nature arbitrary. (You can’t prove internal inconsistency, because both theories are internally consistent. They just disagree with each other.)
So how can we study sets if they are not real? By defining axioms and rules, and examining which statements can be proved using given axioms and rules. “ZF” is such a set of axioms; there are statements you can prove using it, there are statements you can disprove, and there are statements where the system provides no answer.
The underlying reason is that if you imagine a Platonic realm where all abstractions allegedly exist, the problem is that there are actually multiple abstractions [“models”] compatible with ZF, but different from each other in many important ways. In some of them, Axiom of Choice is true. In others, it is false. This is what it means that Axiom of Choice can be neither proved nor disproved in ZF. The problem is that “sets” have never been unambiguously defined in the first place!
However, adopting the Axiom of Choice (or any of its competitors) won’t actually solve the problem. There will still remain many abstractions compatible with ZFC (or whatever), but different from each other. So you will sooner or later find other exciting properties of the fairies… ahem, infinite sets, that can be neither proved nor disproved by ZFC (or whatever). The problem, again, is that we still don’t have an unambiguous definition of “sets”.
And… but I am way out of my depth here… maybe we can’t have an unambiguous definition of “sets”, ever, because of the Gödel Theorem. So we will have to keep adding new axioms, but there is no territory to guide us in their choice, so different people will prefer different choices, mutually contradictory.
At the end the set theory research will be fractured into hundreds of competing definitions, all underspecified. The statements will have to be prefaced by ever longer “according to this collection of axioms” which will make things difficult and error-prone. (Things you will learn under one collection of axioms may be false or even nonsensical under other collections. So you will have to re-learn everything over and over again if you switch to a different system.) And the preferred collection of axioms, which will most likely provide you opportunity in the peer-reviewed journals and research grants, will be decided “politically” (as the most popular among the currently established researchers); which according to some people has already happened with ZF(C).
Thanks for this great answer. I have a couple of follow-up questions that anybody should feel free to jump in and answer.
There are models where the AC is specifically false? I’ve been told that AC can be formulated as “the cartesian product of any collection of sets (even an infinite collection) is non-empty”, but I’m having trouble picturing something a collection of things I call “sets” failing to satisfy this property. Are the models referred to here ones “sets” are replaced by totally unrelated mathematical objects that just happen to satisfy the ZF axioms?
If I understand it correctly, (you believe) you have an intuition of “sets”, and you are going to judge any axiom depending on whether it is compatible with your intuition or not. ZFC is compatible with your intuition, negation of AC is not.
And maybe your intuition is also underspecified—that is, there can be multiple models that are different only in things too abstract for you to have an opinion on them—but still, AC has to be true in any of those models that are acceptable to you.
Fair enough. I believe this is not different in principle from what professional mathematicians do. Except they carefully separate what they believe intuitively to be true, from what they can prove from the axioms.
The question is, whether your intuition is consistent. That is, do you believe in AC only because the proponents of AC got to you first, and gave you the best arguments in their favor? (Like the one about the Cartesian product of infinitely many nonempty sets.) What would have happened if you heard about Banach-Tarski paradox first? Would you be like “yeah, that’s cool for me; you take an orange, cut it and rotate the pieces, and you get two oranges of the same size; my intuition is okay with that”? What is your intuitive opinion on whether the set of real numbers can be well-ordered?
If your intuition is consistent with these things, good for you! Then yes, the other models are the ones where “sets” refers to mathematical objects that are not sets according to your intuition, and only happen to satisfy the ZF axioms. (It still might be useful to talk about such objects, similarly how talking about geometries that do not satisfy Euclid’s fifth postulate got us some interesting results.) But some other people’s intuitions object against things that are equivalent to the axiom of choice, therefore they find it easier to reject it. And it’s not like you can prove them wrong.
I think this is a good assessment of my current thoughts. It’s funny you ask about the Banach-Tarski paradox, because I think I actually did hear about that first, although the place I heard it (VSauce) didn’t mentioned that it was a consequence of the AC.
I think I’m okay with the BT paradox, because the so-called “pieces” end up being insane collections of individual points. If it were doable with something more akin to “solid chunks”, the way an actual orange would be cut up, I might feel differently.
The well-ordering of the reals is less intuitive to me, and I think I remember being very surprised when I first read that it was true. This was in a context that linked it to the AC, and I think that was the first I heard of the controversy.
I think you’ve given me a good foundation to work on as I think this topic through even further. Thanks again!
What kind of reference you are using for your reference to sets if not the axioms? That reads to me as if “Are they just totally unrelated objects to red busses that just happen to be a bus and red?”
Some times I have seen people argue for example that the word “yellow” is grounded by the set of all yellow things. But usually that kind of definition suffers from the list being ambigious/insufficient. Like if a give a thing it either is or is not a member of that set. But listing all the members or otherwise giving some procedure to give out all the members seems like is not the most natural thing to do. Thus if you tried to take the cartesian product of yellow things and red things because you can’t exemplify a sample just from the concept you can’t build up the product from members. The collection of yellow things propbably is not a set but it has many set-like properties. By having a close inventory of sets properties they can be distinguished from confused or nearby notions.
Another possible imagination prompt would be a person faced with coordinates. Is there a real number that you can spesify that the human can point out on the x axis? No, they are always going to be off. In the same way if you present the axis and ask the human to point out their “favourite number” (that is supposed to keep stable) they will point out a slightly different real number each time they supposedly point that point out. Such a person can’t provide a choice function. It might still make sense to treat the person as being able to specify intervals, or refer to all of the points or being able to reference crossing points and others that have geometrical spesification. But in general a line is not guaranteed to have any referancale points falling within it.
So, when you say Godel’s sentence is obviously true, in which “abstraction” is it true?
Arguably, the numbers we care about. Set theory helpfully adds that second-order arithmetic (arithmetic using the language of sets) has only a single model (up to what is called ‘isomorphism’, meaning a precise analogy) and that Godel’s original sentence is ‘true’ within this abstraction.
Proofs, Implications, and Models introduces some of these ideas more slowly. Other stuff from the Highly Advanced Epistemology 101 for Beginners is relevant too, and includes more realism-flavored concerns about choosing between systems.
I know enough about this to not have these questions, but not enough to explain the answers to anyone else. So I’ll recommend a book by Torkel Franzén, who was definitely able to both understand and teach this, “Gödel’s Theorem: An Incomplete Guide to Its Use and Abuse”. The book costs money, but as a preview, here’s a review of it.
Douglas Hofstadter has written a lot on the subject for a popular audience, but is better avoided until you understand the subject yourself well enough to recognise the unstated technical underpinnings of his exposition, and to see where he glosses over things a step too far. But when you are at that point, there is no need to read him.
One source that can help you learn more about the concept of provability is Gödel, Escher, Bach: An Eternal Golden Braid by Douglas Hofstadter. It talks about Gödel’s Theorem and helps to get a deeper understanding of it. Plus, it’s just a very interesting book.
In answer to the question of how can something be true, but not provable I want to point to the Goldbach conjecture which says “every even number > 2 is the sum of two primes”. If the Goldbach conjecture is false then there’s a counterexample which can be checked in finite time (eg. just try adding all pairs of primes less than that number although there are faster ways). If there isn’t a counterexample then the Goldbach conjecture is true. To be provable however, there would have to exist a proof of the Goldbach conjecture. No such proof is known to exist. Here “truth” is exactly what it intuitively means. Either a counterexample exists or it doesn’t. The “truth” of the Goldbach conjecture won’t depend on your choice of axioms. All you need are the primitives necessary to define the natural numbers, primality, and addition (in particular, it won’t depend on AC).
Note that you can prove “P or not P”.
In what sense? In Classical Logic it’s an axiom, and in the Calculus of Indutive Constructions it’s unprovable.
(Interestingly, you can prove in Coq that the negation of “forall P, P or not P” is unprovable. So it’s safe to assume it: https://stackoverflow.com/questions/32812834/how-to-prove-excluded-middle-is-irrefutable-in-coq )
Ike is responding to this:
“P and not-P really are both non-true” is classically false, and Gödel holds in classical mathematics, so Sunny’s response isn’t available in that case.
Sunny’s sense that “perhaps P and not-P really are both non-true” might be a reason for him to endorse intuitionism as “more correct” than classical math in some sense.
All provable statements follow from the axioms, including “P or not P” for any particular P. It’s the same sense as any other statement can be provable.
Yes, in any formal system, all provable statements follow from the axioms. However, there are many formal systems. Two of the most commonly used ones are Classical Logic and the Calculus of Inductive Constructions.
In Classical Logic, “forall P, P or not P” is an axiom. So, it’s technically provable, but it would be misleading to say that you can prove it without further comment.
In the Calculus of Inductive Constructions (which is an extension of Intuitionistic Logic, if I understand correctly), “forall P, P or not P” is not provable.
So if there’s a non-trival proof of “forall P, P or not P”, it isn’t in either of these formal systems. If you do have one in mind, what formal system (logic) is it in, and what does the proof look like?
There’s a non-trivial proof of “P or not P” for a specific P.