Very Basic Model Theory

In this post I’ll discuss some basic results of model theory. It may be helpful to read through my previous post if you haven’t yet. Model theory is an implicit context for the Highly Advanced Epistemology sequence and for a few of the recent MIRI papers, so casual readers may find this brief introduction useful. And who knows, maybe it will pique your interest:
A tale of two logics
Propositional logic is the “easy logic”, built from basic symbols and the connectives “and” and “not”. Remember that all other connectives can be built from these two: With Enough NAND Gates You Can Rule The World and all that. Propositional logic is sometimes called the “sentential logic”, because it’s not like any other logics are “of or relating to sentences” (/sarcasm).
First-order logic is the “nice logic”. It has quantifiers (“there exists”, “for all”) and an internal notion of equality. Its sentences contain constants, functions, and relations. This lets you say lots of cool stuff that you can’t say in propositional logic. First-order logic turns out to be quite friendly (as we’ll see below). However, it’s not strong enough to talk about certain crazy/contrived ideas that humans cook up (such as “the numbers”).
There are many other logics available (second order logic AKA “the heavy guns”, ω-logic AKA “please just please can I talk about numbers”, and many more). In this post we’ll focus on propositional and first-order logics.
Training Wheels
As a warm-up, let’s define “models” for propositional logic.
Remember, logics together with languages produce sentences. Languages of propositional logic are sets of basic symbols. From these symbols and the two connectives of propositional logic (∧ and ¬) we construct our sentences.
For example, from the language {x, y} we can build the sentences x, ¬x, x∧y, and many more. We cannot build sentences like hello (which doesn’t use the right symbols) or ¬xy (which breaks the rules of the logic).
We want to study interpretations for such sentences, where an “interpretation” is some object which assigns a truth value to each sentence. This is denoted by the ⊧ operator: A⊧B expresses some form of “A models B as true”. This operator is used frequently and in many contexts throughout model theory.
What we’re looking for is some object (set) such that there is a relation (⊧) between the object and all sentences generated by the language under consideration. There are many object/relation pairs that assign truth values in a stupid way. For example, an interpretation that assigns “true” to every sentence is quite useless.
Many other interpretations are somewhat useful but still “wrong”. For example, there are interpretations of sentences which treat ∧ like implication instead of conjunction. Somebody may have use for such things, but we certainly don’t.
We want to narrow our consideration to interpretations where ∧ means “and” and ¬ means “not”, so we explicitly require object/relation pairs such that whenever the object models both φ and ψ, it also models φ∧ψ. (And X⊧φ iff it is not the case that X⊧¬φ, where X is the object under consideration.)
If we can find any objects that work like this, we’ll be justified in calling them “models”. However, we haven’t defined any such objects yet — we’ve only constrained their potential behavior.
Any interpretation of the sentences generated from a language L of propositional logic must assign truth values to all basic sentence symbols in L. It turns out, once we select which sentence symbols are “true”, we’re done. The rest of the behavior is completely specified by the rules about ∧ and ¬.
In other words, any object/relation which assigns truth values to sentences according to the above rules is isomorphic to a set S of sentence symbols with the operator ⊧ defined in the obvious way (starting with S⊧x iff x∈S).
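To make the “a model is just a set of sentence symbols” idea concrete, here’s a minimal sketch in Python (my own illustration, not anything from the textbook). Sentences are nested tuples, and the model is nothing but the set S of symbols it holds true:

```python
# A model of propositional logic, represented as the set S of sentence symbols it holds true.
# Sentences are nested tuples: a symbol name, ("not", p), or ("and", p, q).

def models(S, sentence):
    """Return True iff S ⊧ sentence."""
    if isinstance(sentence, str):                 # a basic sentence symbol
        return sentence in S
    op = sentence[0]
    if op == "not":                               # S ⊧ ¬p  iff  not (S ⊧ p)
        return not models(S, sentence[1])
    if op == "and":                               # S ⊧ p∧q  iff  S ⊧ p and S ⊧ q
        return models(S, sentence[1]) and models(S, sentence[2])
    raise ValueError(f"not a sentence: {sentence!r}")

S = {"x"}                                         # the model that holds x true and y false
print(models(S, ("and", "x", "y")))               # False
print(models(S, ("not", ("and", "x", "y"))))      # True
```

Once you fix which symbols are in S, the truth value of every other sentence falls out of the recursion.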
Exploring this idea gives you a feel for how to define models of a logic.
More power
Propositional logic is pretty boring. If you want to say anything interesting, you’ve got to be able to say more than just “and” and “not”. Let’s move on to a stronger logic.
First-order logic uses the symbols ( ) ∀ ∃ ∧ ¬ ν ' ≡ or equivalent (where ν, ν′, ν″, etc. are variables) with the familiar syntactic rules. A language of first-order logic has three different types of symbols: relation symbols, function symbols, and constant symbols. The rules of logic are designed such that symbols act as you expect, given their names.
An interpretation of the sentences generated by some language in first-order logic is (as before) an object that assigns each sentence a truth value (via a relation). We narrow these objects down to the ones that treat all the symbols in ways that justify their names.
More specifically, we consider interpretations that are built around some “universe” (set) of items. Constant symbols are interpreted as specific items in the universe. Relation/function symbols are interpreted as relations/functions on the universe.
We further restrict consideration to interpretations where the logical symbols act in the intended fashion. For example, we require that the interpretation hold c≡d true (where c and d are constant symbols) if and only if the interpretation of c in the universe is equal to the interpretation of d in the universe.
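As a concrete (and entirely unofficial) illustration, here is what such an interpretation might look like in Python for a tiny made-up language with constants c and d, a function symbol f, and a relation symbol R; the names and the structure are mine, the point is just to show the moving parts:

```python
# A toy first-order structure: a universe plus interpretations for the
# constant, function, and relation symbols of a small made-up language.
structure = {
    "universe": {0, 1, 2},
    "constants": {"c": 1, "d": 1},                 # c and d name the same element
    "functions": {"f": lambda x: (x + 1) % 3},     # a unary function on the universe
    "relations": {"R": {(0, 1), (1, 2)}},          # a binary relation on the universe
}

def eval_term(structure, term):
    """A term is a constant symbol or ("f", subterm); evaluate it to an element."""
    if isinstance(term, str):
        return structure["constants"][term]
    fn, arg = term
    return structure["functions"][fn](eval_term(structure, arg))

def holds_atomic(structure, sentence):
    """Check an atomic sentence: ("eq", t1, t2) for ≡, or ("R", t1, t2)."""
    op, t1, t2 = sentence
    a, b = eval_term(structure, t1), eval_term(structure, t2)
    if op == "eq":                                 # ≡ must mean actual equality in the universe
        return a == b
    return (a, b) in structure["relations"][op]

print(holds_atomic(structure, ("eq", "c", "d")))        # True: c and d denote the same element
print(holds_atomic(structure, ("R", "c", ("f", "c"))))  # True: (1, 2) is in R
```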
All this is pretty mechanical. The resulting objects are “models”. Some of the early results in model theory show that these mathematical objects have indeed earned the name.
Completeness
Completeness is the poster-child of model theory, and it warrants quite a bit of exploration. It’s actually saying a few different things. I’ll break it down:
1. Theorems of first-order logic are true in every model of first-order logic.
A theorem of first-order logic is something we can prove from the logic alone, such as (∀ν)(ν≡ν). Contrast this with a sentence like (∃ν)(F(ν)≡c), where F is a function symbol and c is a constant symbol — the truth of this sentence depends entirely upon interpretation.
So what (1) says is that there aren’t any models that deny tautological truths of first-order logic. This result should not be surprising: this claim merely states that we picked a good definition for “models”. If instead we found that there are models which deny theorems, this would not be some big grand proof about the behavior of first-order logic. Rather, it would mean that we put the “model” label on the wrong group of thingies, and that we should go back and try again.
2. Any sentence true in every model of first-order logic is a theorem of first-order logic.
This is the converse to (1), and it’s quite a bit more powerful. This states that the theorems of a language are only the things true in every model of that language. There’s no mystery sentence that is true in every model but not provable from the syntax of the logic.
From one point of view, this says that there are no “missing” models that “would have” said the mystery sentence was false (hence “completeness”). From another, this says that all sentences are “well behaved”: no sentence “outwits” all models.
It’s worth noting that this is where completeness fails for models of second order logic. From one perspective, there must be some “missing models” in second-order logic. From the other perspective, there must be certain sentences which can “outwit” their models (presumably Gödel sentences). I haven’t spent any time formally examining these intuitive claims yet; I have much to learn about second-order logic.
I should also note that I’ve been implicitly assuming completeness for the last two posts by ignoring the syntactic rules of the logics under consideration. The completeness theorem lets me do this, because it states that the results interpreted by first-order models are exactly the same as the results derived from the syntactic rules of first-order logic. Because the completeness theorem holds, I’m allowed to treat the logical sentences as dead symbols and consider the binding M⊧φ∧ψ iff M⊧φ and M⊧ψ to be the means by which ∧ obtains its meaning.
When the completeness theorem fails (as it does in stronger logics) then there is a gap between what the rules of the logic allow and what the models of the logic can say. In that case I must separately consider what the rules say and what the models model. This is the edge of my comfort zone; more exploration with second-order logic is necessary.
Fortunately, everything is well-behaved in first order logic. In fact, (1) and (2) above can be made stronger:
3. A set Σ of sentences is consistent if it has a model.
This is another sanity check that we’ve put the “model” label on the right thing. A set of sentences is inconsistent if it can derive both φ and ¬φ for some sentence φ. (More specifically, Σ is inconsistent if it can derive everything. If there is any sentence that Σ cannot derive, Σ is consistent. Remember that from a contradiction, anything follows.)
Our models are defined such that whenever they model φ, they do not model ¬φ (and vice versa). Thus, if a set of sentences Σ has a model (Σ “has a model” when there is some model that holds true every sentence σ in Σ), then, since anything Σ derives must be true in that model, there are sentences which Σ cannot deduce (¬σ for each σ in Σ, for instance). Thus, Σ is consistent.
4. A set Σ of sentences has a model if it is consistent.
This is where things get interesting again. This is a stronger version of (2) above: every consistent theory has a model.
This makes a lot of sense after you understand (2) and (3), but don’t underestimate its importance. This tells us that for every consistent theory there is some interpretation following the rules which we laid out. Again, this says something positive about our models (they are strong enough to handle any consistent theory) and something negative about the logic (it’s not strong enough to permit a consistent theory stating “I cannot be modeled”).
Note: I’m not sure what it would mean for a theory to state “I cannot be modeled”, nor am I convinced that the idea is meaningful. Again, we’re nearing the edge of my comfort zone.
The point is, the completeness theorem for first order logic says “if you hand me a consistent theory, I can build a model of it”. This is very useful. We don’t have to waste any time worrying whether or not there’s an interpretation available that satisfies all our stringent rules about how ≡ actually means “equals” and so on: if the theory is consistent, a model exists.
Witnesses
There’s been a lot of hand-waving going on for the past four points. Let’s get back to the models.
In model theory, you’re manipulating interpretations for theories. Due to the generality of such work, there are few tools available to manipulate any abstract model. One of the most useful tools in model theory is the ability to extend a model.
Ideally, when you extend a model, you want to make it more manageable without changing its behavior. Our first example of such a technique involves extending a model to add “witnesses”.
A “witness” is a constant symbol that witnesses the truth of an existentially quantified sentence. For example, in the language {S, +, ✕, 0} of arithmetic, the constant symbol 0 is a witness to the sentence (∃ν)(Sν≡S0) (because 0 makes (Sν≡S0) true). However, there is no witness to the sentence (∃ν)(ν≡S0), because 1 is not a constant symbol of the language.
However, we can extend a language to add new constant symbols. Specifically, given any model, we can extend the language to contain one constant symbol ā for each element a in the universe. Then we can extend the model to interpret each ā by a. Such extension does not change the behavior of the model, but it does give the model a number of nice properties.
A model is said to “have witnesses” if for every existentially quantified sentence shaped like (∃ν)ψ that it models, there is some constant c such that the model models ψ(ν\c) (ψ with ν replaced by c).
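For instance, here is a quick sanity check of the example above (my own sketch, not from the textbook), interpreting S as successor and 0 as zero in the standard natural numbers and asking whether the constant 0 witnesses each sentence:

```python
# Check witnesses against the standard interpretation of {S, 0}: S is successor, 0 is zero.
def eval_term(term, v=None):
    """Terms: the constant "0", the variable "v", or ("S", subterm)."""
    if term == "0":
        return 0
    if term == "v":
        return v
    return eval_term(term[1], v) + 1               # S means "successor"

def witnesses(constant, body):
    """Does the constant witness (∃v) body, where body is ("eq", t1, t2)?"""
    value = eval_term(constant)                    # interpret v as the constant's value
    _, t1, t2 = body
    return eval_term(t1, value) == eval_term(t2, value)

print(witnesses("0", ("eq", ("S", "v"), ("S", "0"))))   # True: 0 witnesses (∃ν)(Sν≡S0)
print(witnesses("0", ("eq", "v", ("S", "0"))))          # False: 0 is the only constant,
                                                        # so (∃ν)(ν≡S0) has no witness
```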
Before we discuss them, note a few things:
We can also reduce a model & language extended in this way by eliminating the added constants.
We can talk about sets of sentences with witnesses if, whenever a sentence shaped like (∃ν)ψ is in the set of sentences Σ, there is some constant c such that ψ(ν\c) is also in Σ.
Just as we can extend a model & language so that the model has witnesses, we can extend a set of sentences and language so that the set of sentences has witnesses (by adding a new constant symbol for each existentially quantified sentence). Such extension does not threaten the consistency of a set of sentences.
It’s easy to construct a model from a consistent set of sentences that has witnesses. The universe is the set of all constant symbols (technically, one element for each equivalent set of constant symbols), and the behavior of function/relation symbols is forced by their behavior on the constant symbols.
Formalize these four points and put them together, and you’ve just proved the completeness theorem. (Given any consistent set Σ of sentences: add witnesses, build the model, then reduce it back to the original language. You now have a model of Σ.)
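Here is a very rough sketch (my own, and only of the “universe of constants” step, not the whole proof) of how the universe of that model gets built: the elements are equivalence classes of constant symbols, where two constants land in the same class whenever the theory proves them equal.

```python
# Build the universe of the term model: one element per equivalence class of constants.
def term_model_universe(constants, proved_equations):
    """proved_equations: pairs (c, d) such that the theory proves c ≡ d."""
    parent = {c: c for c in constants}

    def find(c):                                   # union-find over constant symbols
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for c, d in proved_equations:                  # merge constants the theory proves equal
        parent[find(c)] = find(d)

    classes = {}
    for c in constants:
        classes.setdefault(find(c), set()).add(c)
    return list(classes.values())                  # each class is one element of the universe

print(term_model_universe({"a", "b", "c", "d"}, [("a", "b"), ("c", "d")]))
# e.g. [{'a', 'b'}, {'c', 'd'}]: a two-element universe
```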
Compactness
The compactness theorem states that a set Σ of sentences has a model iff every finite subset of Σ has a model.
This leads to a pretty surprising result. Let’s break it down.
1. If Σ has a model then every finite subset of Σ has a model.
This direction is obvious: any model of Σ is also a model of every subset of Σ, since a model that holds true all sentences σ in Σ obviously holds true all sentences σ in any subset of Σ.
2. If every finite subset of Σ has a model then Σ has a model.
This is where things get interesting. Note that we measure the size of a model by measuring the size of its universe. So when we say “a countably infinite model” we mean a model with a countably infinite universe.
The proof of the above is actually quite easy: in first-order logic, all proofs are finite, so any derivation of a contradiction from Σ uses only finitely many sentences of Σ. Since every finite subset of Σ has a model, no finite subset of Σ is inconsistent, and because any inconsistency would show up inside some finite subset, Σ is not inconsistent. Then, by completeness, because Σ is consistent it has a model.
Or, in other words, the compactness theorem says
If Σ wants to be inconsistent, it can do it in a finite subset.
When we know that no finite subset of Σ is inconsistent, we know that Σ has a model.
The surprising result that this leads to is as follows:
If a theory T allows arbitrarily large finite models then it has an infinite model.
If you hand me a theory T that allows arbitrarily large finite models, I can extend the language by adding countably many new constants. Then I can consider the set Σ of sentences built from T joined with sentences of the form “there are at least n distinct constants” for all finite n. Every finite subset of Σ mentions only finitely many of those sentences, so it is satisfied by one of the finite models of T, which come in arbitrarily large sizes: just pick one that fits. Then, by compactness, Σ has a model, and that model must interpret the new constants as infinitely many distinct elements, so it is infinite.
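If you want to see what those extra sentences actually look like, here’s a small sketch (the constant names c1, c2, … are mine): “there are at least n distinct constants” is just the conjunction of all the pairwise inequalities among the first n new constants.

```python
from itertools import combinations

def at_least_n_distinct(n):
    """The sentence asserting that the new constants c1, ..., cn are pairwise distinct."""
    consts = [f"c{i}" for i in range(1, n + 1)]
    return " ∧ ".join(f"¬({a}≡{b})" for a, b in combinations(consts, 2))

for n in (2, 3, 4):
    print(at_least_n_distinct(n))
# ¬(c1≡c2)
# ¬(c1≡c2) ∧ ¬(c1≡c3) ∧ ¬(c2≡c3)
# ¬(c1≡c2) ∧ ¬(c1≡c3) ∧ ¬(c1≡c4) ∧ ¬(c2≡c3) ∧ ¬(c2≡c4) ∧ ¬(c3≡c4)
```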
I have just given a fully general recipe for building an infinite model given a theory T that admits arbitrarily large finite models. What does this mean? It means that no theory in first-order logic is capable of saying “I admit only finite models”. Sentences can say “models must have no more than 10 elements” or “models are allowed to be infinite”, but sentences cannot say “My model is finite”. This follows directly from the fact that all proofs of contradiction are finite. You either have a specific limit, or you don’t get a limit at all.
MORE POWER
If you hand me a theory that allows arbitrarily sized finite models, I can hand you back an infinite model.
It gets better.
If you hand me a theory with an infinite model, I can hand you back a model of any infinite size you like.
Using a method similar to the one above, I can pick any infinite cardinal α, add α new constant symbols, and expand T with sentences saying that these constants are pairwise distinct. Every finite subset of this theory mentions only finitely many of the new constants, so the infinite model of T satisfies it; by compactness the whole theory has a model, and that model must have power at least α.
In other words, a theory in first-order logic can either say “I am at most this big”, where this is a specific finite number, or “I am very big”, where very can be any infinite cardinal that you like.
Example: The theory of arithmetic doesn’t have a finite cutoff point. There’s no maximum number. Countably infinite models are allowed.
Therefore, arbitrarily large infinite models are allowed.
There are countable models of arithmetic (at least one of which you’ll find familiar), and there are also models of arithmetic where there are as many “numbers” as there are reals. Then there are bigger models of arithmetic that are saturated, no, dripping with numbers.
Now you understand why first order logic cannot discuss the “standard” model of number theory. No first-order theory can specifically pinpoint “countable” models! A first order theory must either specify a finite maximum size explicitly, or allow models of unfathomable size.
Even more
I was hoping to get farther than this, but this is a good stopping point. Hopefully you’ve learned something about model theory. There are many more interesting results yet.
The book Model Theory by Chang and Keisler primarily focuses on introducing new ways to extend arbitrary models (giving them nice properties in the process) and then discusses results that can be gleaned from models extended thus. Focus is also given to special cases where models are well behaved, such as atomic models (which are a sort of minimal model for a given theory) and saturated models (which are a sort of maximal model for a given theory, within a given cardinality. There Are Always Bigger Models).
There’s also quite a bit of exploration into manipulating languages (Eliminate Quantifiers To Win Big Prizes!!!) and even manipulating logics (Q. How far beyond first-order logic can we go before sacrificing things like completeness and compactness? A. Not very.)
If you’re interested in learning more, then… well, I’m probably not the person to talk to. I’m not even halfway through this textbook. But if you’re feeling gutsy, consider picking up Model Theory and getting started. Some of the harder concepts would be much easier to work through with other people rather than alone.
In fact, I have a number of notes about trying to learn something difficult on my own (what worked, what didn’t) that I plan to share. But that’s a story for another day.
Finally, please note that my entire exposure to model theory is half a textbook and some internetting. I guarantee I’ve misunderstood some things. If you see errors in my explanations, don’t hesitate to let me know! My feedback loop isn’t very strong right now.