If you could say that, you could rule out numbers with infinite numbers of predecessors, meaning that you could rule out all infinite-in-both-directions chains, and hence rule out all nonstandard numbers. And then the only remaining model would be the standard numbers. And then Godel’s Statement would be a semantic implication of those axioms; there would exist no number encoding a proof of Godel’s Statement in any model which obeyed the axioms of first-order arithmetic. And then, by Godel’s Completeness Theorem, we could prove Godel’s Statement from those axioms using first-order syntax. Because every genuinely valid implication of any collection of first-order axioms—every first-order statement that actually does follow, in every possible model where the premises are true—can always be proven, from those axioms, in first-order logic. Thus, by the combination of Godel’s Incompleteness Theorem and Godel’s Completeness Theorem, we see that there’s no way to uniquely pin down the natural numbers using first-order logic. QED.”
This is a tortured and unfortunate phrasing of the following direct and clearer argument:
Godel’s Incompleteness Theorem says that there’s a sentence G that our first-order theory PA neither proves nor disproves. And Godel’s Completeness theorem says that every consistent theory has a model.
(this is also a better, more intuitive way of presenting the Completeness theorem than “every statement valid in every model has a proof”, because it hints at why it’s true. It’s not intuitively clear why every valid statement must have a proof, but if a theory is consistent, if you cannot puzzle out a contradiction within it, then it makes sense that something out there looks like what it says. If it could happen (by virtue of being consistent), it does happen. And the usual proof of the Completeness theorem in fact shows how it happens).
“PA doesn’t prove G” means “PA+not-G is consistent” and “PA doesn’t disprove G” means “PA+G is consistent”. Thus by the Completeness theorem both PA+G and PA+not-G have a model. And only one of them could possibly be the standard model, so the other is certainly nonstandard.
The Compactness Theorem says that if a collection of first-order statements has no model, there is a finite syntactical proof of an inconsistency.
No. No. No.
You keep conflating syntax and semantics, and the result is a muddled description in which different kinds of nonstandard models existing for different reasons merge and run together in a confused whole.
Properly phrased, the Compactness theorem is wholly semantic, not syntactic. It says that if a collection of statements has no model, than already some finite subset of it has no model. There is no reference to proof anywhere. This is important!
First-order logic presents us with two kinds of limitations on the power of our reasoning. There are semantic limitations on what we can capture in a statement or a set of statements. In other words, how expressive our formal language is, how well it’s able to describe the mathematical reality. And then there are syntactic limitations on what we can hope to prove, given an appropriate axiomatic system and a set of axioms.
It’s vital to keep syntax and semantics sharply apart in your mind and keep track of what is what. But both symantic and syntactic limitations may manifest as existence of unwanted “nonstandard models”, and your account muddles them together. And more generally your account of “trying to pinpoint the standard numbers” again and again fails to distinguish between the limits on what we can express and the limits on what we can prove, and the result is confusion.
(Here’s a thought experiment. Imagine we deleted rules of inference from our first-order logic. Now we have a logic with an incredibly weak proof system: nothing can be proved except what is taken as an axiom. Proofs are one statement long. There is no Completeness theorem: the Completeness theorem says that first-order logic’s ability to prove things is as strong as it ought to be,
and our logic is weak. But there is still the Compactness theorem, it didn’t go anywhere. And there are still nonstandard models due to it, nothing happened to them. The Compactness theorem may be proved via the Completeness theorem, but that’s just a possible proof. It can also be proved via a completely semantic argument. It’s about what can be expressed in a statement, not what can be proved, and making our logic ridiculously weak doesn’t affect it in the least.)
So, among semantic limitations on what we can express we can count the Compactness theorem, that shows us, among other things, we cannot hope to capture with first-order statements the behavior of any particular infinite model, like the natural numbers. There will always be nonstandard models, and in many cases of interest they will even be of the same size as the standard model.
(another example of a semantic limitation is Tarski’s undefinability of truth, which really deserves much of the reputation thoughtlessly bestowed upon Godel’s Incompleteness Theorem).
And in matters syntactic we have, as a great boon, the Completeness Theorem telling us we can prove universally true things, but on the other hand the Incompleteness Theorem is a check on our ability to capture all arithmetical truths in a manageable set of axioms. We can combine this inability with the Completeness Theorem (acting in a semi-evil capacity here) to demonstrate nonstandard models that show off our failure in the flesh, so to speak. But that is merely an echo of the real thing, the real failure that is our inability to form a manageable complete theory of arithmetics. Imagine for a second that our logic were a little weaker so that we didn’t have the Completeness theorem in full generality, but would still have a relatively powerful proof system to formalize the usual mathematical proofs and in particular to have the Incompleteness theorem. Then perhaps we wouldn’t be able to prove that there exists a nonstandard model of PA+not-G. But that wouldn’t really matter, the real failure would remain the fact that PA neither proves nor disproves G. That would remain an essential limitation on the powers of our formal reasoning.
The nonstandard models due to the syntactic limitations of the Incompletess Theorem are thus more of a shiny exhibition than the real failure. And at any rate, they are so different in nature from the nonstandard models due to the semantic limitations that it’s confusing to run them together. The former kind, the Incompleteness-based nonstandard models, necessarily disagree on some statements of the standard language of arithmetic. They are instantiations of different first-order claims about natural numbers. There are perfectly ordinary statements about natural numbers, like G, that are true in one of them and false in the other.
But in the other kind of nonstandard models, the ones resulting from the Compactness theorem, that doesn’t have to be the case. Even T(N), the full theory of true arithmetic, which takes for axioms all true statements about natural numbers, has nonstandard models due to the Compactness theorem. These nonstandard models do not disagree with the standard one about any first-order claim in the standard language. They are manifestations of limitations inherent in the language itself, not in what truths we are able to capture. Even if we could capture all first-order truths about natural numbers—even if there was no Incompleteness Theorem, and PA were a complete theory of arithmetic—they would still be there, expressing the same truths in a different-looking model, as a sign that our language is not an all-powerful means of expressing the mathematical reality.
I realized after reading this that I’d stated the Compactness Theorem much more strongly than I needed, and that I only needed the fact that infinite semantic inconsistency implies finite semantic inconsistency, never mind syntactic proofs of inconsistency, so I did a quick rewrite accordingly. Hopefully this addresses your worries about “muddled description”, although initially I was confused about what you meant by “muddled” since I’d always carefully distinguished semantics from syntax at each point in the post.
I was also confused by what you meant by “nonstandard models resulting from the Compactness Theorem” versus “nonstandard models resulting from the Incompleteness Theorem”—the nonstandard models are just there, after all, they don’t poof into existence as a result of one Theorem or the other being proved. But yes, the Compactness Theorem shows that even adjoining all first-order stateable truths about the natural numbers to PA (resulting in a theory not describable within PA) would still give a theory with nonstandard models.
I think “semantic consistency” is not a very good phrase, and you should consider replacing it with “satisfiability” or, if that seems too technical, “realizability”. The word “inconsistent” tells us that there’s some sort of contradiction hidden within. But there could be statements without contradiction that are yet not realizable—not in our logic, thanks to the Completeness theorem, but in some other, perhaps less useful one. Imagine for example that you tried to develop mathematical logic from scratch, and defined “models” in such a way that only finite sets can serve as their domains (perhaps because you’re a hardcore finitist or something). Then your class of models is too poor and doesn’t sustain the Completeness theorem. There are consistent finite sets of statements, from which no contradiction may be syntactically derived, that are only realizable in infinite models and so are not realizable at all in this hypothetical logic. It feels wrong to call them “semantically inconsistent” even though you can technically do that of course, it’s just a definition. “Realizable” seems better.
I feel that this example is part of a larger trend. Think of first-order logic as a perfect fit between syntactic notions (how formulas are built up, what is a proof) and semantic notions (how to assign a truth value to a statement in a model, what models exist and how they’re defined). Apriori it’s not clear or preordained that these two fit together like a hand in a glove, but thanks to Soundness and Completeness theorems they actually do. You keep using that fit to jump seamlessly between semantic notions and syntactic notions, and although you’re not committing any error, I think the result is confusing; in “alternative universes”—other logics—the fit doesn’t exist or it’s very different, and to gain understanding and appreciation of how logic works the two must be kept sharply separate in one’s mind. I’m not saying that you don’t appreciate the difference—I’m saying that pedagogically your posts in this sequence fail in making the reader understand it.
I was confused about what you meant by “muddled” since I’d always carefully distinguished semantics from syntax at each point in the post.
Here’s an example from an earlier post:
...Gosh. I think I see the idea now. It’s not that ‘axioms’ are mathematicians asking for you to just assume some things about numbers that seem obvious but can’t be proven. Rather, axioms pin down that we’re talking about numbers as opposed to something else.
“Exactly. That’s why the mathematical study of numbers is equivalent to the logical study of which conclusions follow inevitably from the number-axioms. The way that theorems like 2 + 2 = 4 are syntactically provable from those axioms reflects the way that 2 + 2 = 4 is semantically implied within this unique mathematical universe that the axioms pin down.
Up until this point in the post you were only talking about how particular sentence are able or not able semantically to “pin down” a particular model. But now suddenly you’re talking about “follow inevitably” and “syntactically provable”, and the reader who’s not trained in logic has no clues to tell them that you suddenly switched tracks and are talking about a completely different thing. The “that’s why” is incoherent, because the set of syntactic conclusions from number-axioms is smaller than the set of semantic truths about standard numbers. Here the way you leap between syntactic and semantic notions leads you astray. Your “the way that...” sentence alludes to a completeness theorem which second-order logic, which you’re talking about here, doesn’t have! Think about it: second-order PA does have only one model, but the fact that in this model 2+2=4 does not give you license to deduce that 2+2=4 is syntactically provable from second-order PA axioms. The “reflects” in your sentence is wrong! (read the answer to this post for a useful summary, which also clarifies how limitations established by Godel’s Incompleteness Theorems still apply in a certain way to second-order arithmetic)
I’m convinced that a reader of this sequence who is not educated in logic simply won’t notice the leaps between syntax and semantics that you habitually make in an undisciplined fashion, and will not get a clear picture of why and which nonstandard models exist and how proof systems, Completeness and Incompleteness interact with their existence.
I was also confused by what you meant by “nonstandard models resulting from the Compactness Theorem” versus “nonstandard models resulting from the Incompleteness Theorem”—the nonstandard models are just there, after all, they don’t poof into existence as a result of one Theorem or the other being proved.
Well, I thought I made that clear in my comment. Yes, they are there, but you can imagine other logics where one of the theorems holds but not the other, and see that one kind of the nonstandard models remains and the other disappears. The difference between them to working mathematicians and logicians is vast. Lowenhelm-Skolem was proved, I think in 1917 or thereabouts, but until Godel’s Incompleteness Theorem appeared nobody thought that it demonstrated that first-order logic is not suitable to formalize mathematics because it doesn’t “pin down standard numbers”.
That’s why I think that your emphasis on “pins down a unique model” is wrongheaded and doesn’t reflect the real concerns mathematicians and logicians had and continue to have about how well axiomatic systems formalize mathematics. It’s because this property is too coarse to distinguish e.g. between incomplete and complete theories—even complete theories have nonstandard models in first-order logic. In an alternate universe where Godel’s Incompleteness doesn’t exist and PA is a complete first-order theory that proves or disproves any statement about natural numbers, approximately nobody cares that it has nonstandard models due to Compactness, and everybody’s very happy that they have a set of axioms that are able in principle to answer all questions you can ask about natural numbers. (if you protest that PA is incomplete in any alternate universe, consider that there are complete first-order theories, e.g. of real closed fields). You’re free to disagree with that and to insist that categoricity—having only one model up to isomorphism—must be the all-important property of the logic we want to use, but your readers are ill-equipped to agree or disagree in an informed way because your description muddles the hugely important difference between those two kinds of nonstandard models. To mathematicians interested in foundations, the fact that PA is incomplete is a blow to Hilbert’s program and more generally to the project of formalizing mathematical meaning, while nonstandard models existing due to compactness are at worst an annoying technical distraction, and at best a source of interesting constructions like nonstandard analysis.
I’ll try a couple more edits, but keep in mind that this isn’t aimed at logicians concerned about Hilbert’s program, it’s aimed at improving gibberish-detection skills (sentences that can’t mean things) and avoiding logic abuse (trying to get empirical facts from first principles) and improving people’s metaethics and so on.
Just as an aside, I really enjoyed Scott Aaronson’s explanation of nonstandard models in this writeup:
...The Completeness Theorem is confusing for two reasons: on the one hand, it sounds like a tautology (“that which is consistent, is consistent”) — what could it possibly mean to prove such a thing? And on the other hand, it seems to contradict the Incompleteness Theorem.
We’re going to clear up this mess, and as a bonus, answer our question about whether all models of ZF are uncountable. The best way to understand the Completeness Theorem is to make up a consistent axiom set that you’d guess doesn’t have a model. Given a theory T, let Con(T) be the assertion that T is consistent. We know from Gödel’s Incompleteness Theorem that Con(ZF) can be expressed in ZF, and also that Con(ZF) can’t be proved in ZF, assuming ZF is consistent. It follows that assuming ZF is consistent, the “self-hating theory” ZF+¬Con(ZF), or ZF plus the assertion of its own inconsistency, must also be consistent. So by the Completeness Theorem, ZF+¬Con(ZF) has a model. What on earth could it be? We’ll answer this question via a fictional dialogue between you and the axioms of ZF+¬Con(ZF).
You: Look, you say ZF is inconsistent, from which it follows that there’s a proof in ZF that 1+1=3. May I see that proof?
Axioms of ZF+¬Con(ZF): I prefer to talk about integers that encode proofs. (Actually sets that encode integers that encode proofs. But I’ll cut you a break — you’re only human, after all.)
You: Then show me the integer.
Axioms: OK, here it is: X.
You: What the hell is X?
Axioms: It’s just X, the integer encoded by a set in the universe that I describe.
You: But what is X, as an ordinary integer?
Axioms: No, no, no! Talk to the axioms.
You: Alright, let me ask you about X. Is greater or smaller than a billion?
Axioms: Greater.
You: The 10^10^1,000,000,000th Ackermann number?
Axioms: Greater than that too.
You: What’s X^2+100?
Axioms: Hmm, let me see… Y.
You: Why can’t I just add an axiom to rule out these weird ‘nonstandard integers?’ Let me try: for all integers X, X belongs to the set obtained by starting from 0 and...
Axioms: Ha ha! This is first-order logic. You’re not allowed to talk about sets of objects — even if the objects are themselves sets.
You: Argh! I know you’re lying about this proof that 1+1=3, but I’ll never catch you.
Axioms: That right! What Gödel showed is that we can keep playing this game forever. What’s more, the infinite sequence of bizarre entities you’d force me to make up — X, Y, and so on — would then constitute a model for the preposterous theory ZF+¬Con(ZF).
You: But how do you know I’ll never trap you in an inconsistency?
Axioms: Because if you did, the Completeness Theorem says that we could convert that into an inconsistency in the original axioms, which contradicts the obvious fact that ZF is consis—no, wait! I’m not supposed to know that! Aaahh! [The axioms melt in a puddle of inconsistency.]
This is a tortured and unfortunate phrasing of the following direct and clearer argument:
Godel’s Incompleteness Theorem says that there’s a sentence G that our first-order theory PA neither proves nor disproves. And Godel’s Completeness theorem says that every consistent theory has a model.
(this is also a better, more intuitive way of presenting the Completeness theorem than “every statement valid in every model has a proof”, because it hints at why it’s true. It’s not intuitively clear why every valid statement must have a proof, but if a theory is consistent, if you cannot puzzle out a contradiction within it, then it makes sense that something out there looks like what it says. If it could happen (by virtue of being consistent), it does happen. And the usual proof of the Completeness theorem in fact shows how it happens).
“PA doesn’t prove G” means “PA+not-G is consistent” and “PA doesn’t disprove G” means “PA+G is consistent”. Thus by the Completeness theorem both PA+G and PA+not-G have a model. And only one of them could possibly be the standard model, so the other is certainly nonstandard.
No. No. No.
You keep conflating syntax and semantics, and the result is a muddled description in which different kinds of nonstandard models existing for different reasons merge and run together in a confused whole.
Properly phrased, the Compactness theorem is wholly semantic, not syntactic. It says that if a collection of statements has no model, than already some finite subset of it has no model. There is no reference to proof anywhere. This is important!
First-order logic presents us with two kinds of limitations on the power of our reasoning. There are semantic limitations on what we can capture in a statement or a set of statements. In other words, how expressive our formal language is, how well it’s able to describe the mathematical reality. And then there are syntactic limitations on what we can hope to prove, given an appropriate axiomatic system and a set of axioms.
It’s vital to keep syntax and semantics sharply apart in your mind and keep track of what is what. But both symantic and syntactic limitations may manifest as existence of unwanted “nonstandard models”, and your account muddles them together. And more generally your account of “trying to pinpoint the standard numbers” again and again fails to distinguish between the limits on what we can express and the limits on what we can prove, and the result is confusion.
(Here’s a thought experiment. Imagine we deleted rules of inference from our first-order logic. Now we have a logic with an incredibly weak proof system: nothing can be proved except what is taken as an axiom. Proofs are one statement long. There is no Completeness theorem: the Completeness theorem says that first-order logic’s ability to prove things is as strong as it ought to be, and our logic is weak. But there is still the Compactness theorem, it didn’t go anywhere. And there are still nonstandard models due to it, nothing happened to them. The Compactness theorem may be proved via the Completeness theorem, but that’s just a possible proof. It can also be proved via a completely semantic argument. It’s about what can be expressed in a statement, not what can be proved, and making our logic ridiculously weak doesn’t affect it in the least.)
So, among semantic limitations on what we can express we can count the Compactness theorem, that shows us, among other things, we cannot hope to capture with first-order statements the behavior of any particular infinite model, like the natural numbers. There will always be nonstandard models, and in many cases of interest they will even be of the same size as the standard model.
(another example of a semantic limitation is Tarski’s undefinability of truth, which really deserves much of the reputation thoughtlessly bestowed upon Godel’s Incompleteness Theorem).
And in matters syntactic we have, as a great boon, the Completeness Theorem telling us we can prove universally true things, but on the other hand the Incompleteness Theorem is a check on our ability to capture all arithmetical truths in a manageable set of axioms. We can combine this inability with the Completeness Theorem (acting in a semi-evil capacity here) to demonstrate nonstandard models that show off our failure in the flesh, so to speak. But that is merely an echo of the real thing, the real failure that is our inability to form a manageable complete theory of arithmetics. Imagine for a second that our logic were a little weaker so that we didn’t have the Completeness theorem in full generality, but would still have a relatively powerful proof system to formalize the usual mathematical proofs and in particular to have the Incompleteness theorem. Then perhaps we wouldn’t be able to prove that there exists a nonstandard model of PA+not-G. But that wouldn’t really matter, the real failure would remain the fact that PA neither proves nor disproves G. That would remain an essential limitation on the powers of our formal reasoning.
The nonstandard models due to the syntactic limitations of the Incompletess Theorem are thus more of a shiny exhibition than the real failure. And at any rate, they are so different in nature from the nonstandard models due to the semantic limitations that it’s confusing to run them together. The former kind, the Incompleteness-based nonstandard models, necessarily disagree on some statements of the standard language of arithmetic. They are instantiations of different first-order claims about natural numbers. There are perfectly ordinary statements about natural numbers, like G, that are true in one of them and false in the other.
But in the other kind of nonstandard models, the ones resulting from the Compactness theorem, that doesn’t have to be the case. Even T(N), the full theory of true arithmetic, which takes for axioms all true statements about natural numbers, has nonstandard models due to the Compactness theorem. These nonstandard models do not disagree with the standard one about any first-order claim in the standard language. They are manifestations of limitations inherent in the language itself, not in what truths we are able to capture. Even if we could capture all first-order truths about natural numbers—even if there was no Incompleteness Theorem, and PA were a complete theory of arithmetic—they would still be there, expressing the same truths in a different-looking model, as a sign that our language is not an all-powerful means of expressing the mathematical reality.
I realized after reading this that I’d stated the Compactness Theorem much more strongly than I needed, and that I only needed the fact that infinite semantic inconsistency implies finite semantic inconsistency, never mind syntactic proofs of inconsistency, so I did a quick rewrite accordingly. Hopefully this addresses your worries about “muddled description”, although initially I was confused about what you meant by “muddled” since I’d always carefully distinguished semantics from syntax at each point in the post.
I was also confused by what you meant by “nonstandard models resulting from the Compactness Theorem” versus “nonstandard models resulting from the Incompleteness Theorem”—the nonstandard models are just there, after all, they don’t poof into existence as a result of one Theorem or the other being proved. But yes, the Compactness Theorem shows that even adjoining all first-order stateable truths about the natural numbers to PA (resulting in a theory not describable within PA) would still give a theory with nonstandard models.
I think “semantic consistency” is not a very good phrase, and you should consider replacing it with “satisfiability” or, if that seems too technical, “realizability”. The word “inconsistent” tells us that there’s some sort of contradiction hidden within. But there could be statements without contradiction that are yet not realizable—not in our logic, thanks to the Completeness theorem, but in some other, perhaps less useful one. Imagine for example that you tried to develop mathematical logic from scratch, and defined “models” in such a way that only finite sets can serve as their domains (perhaps because you’re a hardcore finitist or something). Then your class of models is too poor and doesn’t sustain the Completeness theorem. There are consistent finite sets of statements, from which no contradiction may be syntactically derived, that are only realizable in infinite models and so are not realizable at all in this hypothetical logic. It feels wrong to call them “semantically inconsistent” even though you can technically do that of course, it’s just a definition. “Realizable” seems better.
I feel that this example is part of a larger trend. Think of first-order logic as a perfect fit between syntactic notions (how formulas are built up, what is a proof) and semantic notions (how to assign a truth value to a statement in a model, what models exist and how they’re defined). Apriori it’s not clear or preordained that these two fit together like a hand in a glove, but thanks to Soundness and Completeness theorems they actually do. You keep using that fit to jump seamlessly between semantic notions and syntactic notions, and although you’re not committing any error, I think the result is confusing; in “alternative universes”—other logics—the fit doesn’t exist or it’s very different, and to gain understanding and appreciation of how logic works the two must be kept sharply separate in one’s mind. I’m not saying that you don’t appreciate the difference—I’m saying that pedagogically your posts in this sequence fail in making the reader understand it.
Here’s an example from an earlier post:
Up until this point in the post you were only talking about how particular sentence are able or not able semantically to “pin down” a particular model. But now suddenly you’re talking about “follow inevitably” and “syntactically provable”, and the reader who’s not trained in logic has no clues to tell them that you suddenly switched tracks and are talking about a completely different thing. The “that’s why” is incoherent, because the set of syntactic conclusions from number-axioms is smaller than the set of semantic truths about standard numbers. Here the way you leap between syntactic and semantic notions leads you astray. Your “the way that...” sentence alludes to a completeness theorem which second-order logic, which you’re talking about here, doesn’t have! Think about it: second-order PA does have only one model, but the fact that in this model 2+2=4 does not give you license to deduce that 2+2=4 is syntactically provable from second-order PA axioms. The “reflects” in your sentence is wrong! (read the answer to this post for a useful summary, which also clarifies how limitations established by Godel’s Incompleteness Theorems still apply in a certain way to second-order arithmetic)
I’m convinced that a reader of this sequence who is not educated in logic simply won’t notice the leaps between syntax and semantics that you habitually make in an undisciplined fashion, and will not get a clear picture of why and which nonstandard models exist and how proof systems, Completeness and Incompleteness interact with their existence.
Well, I thought I made that clear in my comment. Yes, they are there, but you can imagine other logics where one of the theorems holds but not the other, and see that one kind of the nonstandard models remains and the other disappears. The difference between them to working mathematicians and logicians is vast. Lowenhelm-Skolem was proved, I think in 1917 or thereabouts, but until Godel’s Incompleteness Theorem appeared nobody thought that it demonstrated that first-order logic is not suitable to formalize mathematics because it doesn’t “pin down standard numbers”.
That’s why I think that your emphasis on “pins down a unique model” is wrongheaded and doesn’t reflect the real concerns mathematicians and logicians had and continue to have about how well axiomatic systems formalize mathematics. It’s because this property is too coarse to distinguish e.g. between incomplete and complete theories—even complete theories have nonstandard models in first-order logic. In an alternate universe where Godel’s Incompleteness doesn’t exist and PA is a complete first-order theory that proves or disproves any statement about natural numbers, approximately nobody cares that it has nonstandard models due to Compactness, and everybody’s very happy that they have a set of axioms that are able in principle to answer all questions you can ask about natural numbers. (if you protest that PA is incomplete in any alternate universe, consider that there are complete first-order theories, e.g. of real closed fields). You’re free to disagree with that and to insist that categoricity—having only one model up to isomorphism—must be the all-important property of the logic we want to use, but your readers are ill-equipped to agree or disagree in an informed way because your description muddles the hugely important difference between those two kinds of nonstandard models. To mathematicians interested in foundations, the fact that PA is incomplete is a blow to Hilbert’s program and more generally to the project of formalizing mathematical meaning, while nonstandard models existing due to compactness are at worst an annoying technical distraction, and at best a source of interesting constructions like nonstandard analysis.
I’ll try a couple more edits, but keep in mind that this isn’t aimed at logicians concerned about Hilbert’s program, it’s aimed at improving gibberish-detection skills (sentences that can’t mean things) and avoiding logic abuse (trying to get empirical facts from first principles) and improving people’s metaethics and so on.
Just as an aside, I really enjoyed Scott Aaronson’s explanation of nonstandard models in this writeup: