Extensionality and the univalence axiom of type theory
This is the final part of my introduction to dependent type theory. The big theme of this article is equality though that may not be immediately obvious.
Let’s start by finally discussing function extensionality and propositional extensionality.
What’s the deal with extensionality?
Whenever we define concepts, there are basically two ways to do it: extensionally and intensionally. An extensional definition simply lists all the objects that fall under the concept to be defined. So, if I was defining what a country is, I would simply give an exhaustive list of all countries. An intensional definition, on the other hand, gives you the rules for identifying objects falling under the concept. So, an intensional definition of a country might say thing like, “it’s a distinct part of the world that has its own government.”[1]
(As an aside, if you give a definition to a fellow human, it’s probably best to mix both kinds of definitions: “a country is a distinct part of the world that has its own government, like for example, France or Japan.”)
We can naturally extend this distinction in definition styles to the question of equality. For example, consider how the equality of two sets is defined in ZFC set theory:
In words: two sets are equal if they have exactly the same elements. We could imagine an machine that checks all elements one by one and determines whether they’re the same in both sets, and finally tells you whether the two sets are equal.
We say that this kind of equality is extensional, because all that matters here are the contents of the sets, and we don’t care which rules were used to generate the sets. Whether the set was generated from the first three prime numbers, or the three Fibonacci numbers after , doesn’t matter for the question of equality. This makes ZFC set theory a purely extensional theory.
However, as you may remember, this is not how we defined equality in our type theory. With the way we defined equality, two functions can be non-equal even if they return exactly the same values for all inputs. I appreciate that this sounds very strange, but it’s basically due to how equality works in lambda calculus. Lambda calculus gives us these judgmental equalities (also called alpha-, beta- and eta-equivalence[2], respectively):
where the notation in the second equation means that all occurrences of in are replaced with .
But with only these equalities, the way something is computed matters to the equality of functions. So, for example, a bubble-sort function and a quick-sort function both give the same result (a sorted list), but the function body of one of them can’t be transformed into the function body of the other, just with the given three equivalences, so these functions are not judgmentally equal in lambda calculus (and by default also not propositionally equal).
This is again quite different from set theory where only the output matters.
But as you may recall, we have several times used the idea that two functions are (propositionally) equal if all their outputs are equal. Quite a bit of our theory depends on this. So what can we do? We can add an axiom which makes equality on functions behave extensionally. We’ll call this axiom function extensionality. It roughly says that:
Similarly, propositional extensionality says that if two propositions (subsingletons) and are logically equivalent:
then they are equal: (though not judgmentally equal, ). So, propositional extensionality implies that and are (propositionally) equal, because they’re both “truthy”. Again, this is about ignoring the way these objects came to be and just looking at their (relevant) behavior, which in the case of propositions is to be either “truthy” or “falsy”.
So, by axiomatically declaring things to be equal, we can make our theory more extensional.
However, it should be noted that we can’t just declare things to be equal willy-nilly. For example, say we declare the two elements of to be equal: . You might say, “what’s the harm?” But let’s define the type-valued function with and . Then by the induction principle of equality (“indiscernibility of identicals”), we have that if is true then is also true. But is always true! (It’s inhabited by .) So, is now also inhabited? That seems bad. In fact, this will make our whole theory inconsistent!
So, as we can see, some care must be taken when declaring things to be equal.
However, as far as anyone can tell, propositional and function extensionality are perfectly safe to assume and will not risk the consistency of our theory. Generations of type theorist have adopted these axioms and no incidents were reported. And if you feel the loss of a stricter notion of equality, then don’t despair, you’ll still have judgmental equality, which we won’t change.
Function extensionality is probably not that controversial (we’ve already used it several times), but you might be less convinced of propositional extensionality.
One pragmatic reason to adopt it is that we need it in order to prove LEM from AC. And then, together with LEM, this axiom will make (the index type of all truth values) collapse to , the type with exactly 2 elements; which makes sense because all the subsingletons will be indistinguishable from and , so just two possibilities.
A more philosophical argument in favor of propositional extensionality is that we don’t really care about what is inside the type that encodes a truth value; we just want to distinguish true from false, so any internal structure is uninteresting for our purposes of classical logic. The fact that is technically inhabited by some pair when encoded as a type (assuming both and and are true) is just an implementation detail and shouldn’t concern us. With propositional extensionality, we can say and and conclude (remember that we’re using the symbol for the unit type when it’s in a logic formula).
If you’re still not convinced, I think propositional extensionality will make more sense after you’ve seen the univalence axiom. But before we get to that, we want to state the two extensionality axioms formally. This requires the definition of an equivalence.
Equivalences
The general idea is that the existence of an equivalence implies that two types are equivalent.
Note: I’m following the terminology from the HoTT book here, which uses “equivalence” and “equivalent” to describe these concepts, but you could also say “bijection”, or “isomorphism” and “isomorphic”. The HoTT book reserves these terms for 0-types (see below).
In order to state what an equivalence is, we first need to introduce the uniqueness quantification: . It can be seen as an alternative way of turning a sum type into a subsingleton type:
So, is true if and only if there is a unique such that .
An equivalence is then a function with the following property:
In other words, is cleanly invertible.
If an equivalence exists for and , we write and say that and are equivalent. Formally:
For any type , we have because of the identify function: .
For subsingletons, is logically equivalent to . This is because “” corresponds to the following pair of function types: , and since and have at most one element, these functions are easy to invert, so we can prove that implies and vice versa.
If, on the other hand, we have for two subsingletons, then is also easy to prove: start with which holds for all types and then substitute one with , which we may do because . Thus we get , which means the following function type is inhabited:
Or, equivalently: . Can we also prove the opposite direction? No! The opposite direction is exactly what the axiom of propositional extensionality corresponds to. The axiom says that the function described above is an equivalence:
This is the formal statement of the axiom. However, as everything here is subsingletons, we can also state it this way:
Similarly, if and are two functions with , then with substitution we can easily prove that , but we can’t prove it in the opposite direction. For that, we need the axiom of function extensionality which asserts the following equivalence:
Here, we cannot just replace “” with “” because… of a thing that I haven’t explained yet. Basically: might not be a subsingleton. We will discuss it below!
So, these are the two classic extensionality axioms. We could stop here and we’d have a pretty nice theory. Functions are equal when we expect them to be and we can completely ignore the inner structure of propositions. However, we can go even further…
The univalence axiom
Often in mathematics, when two structures are the same up to isomorphism, we “abuse notation” and write formulas that seem to refer to one structure but we actually mean the other. For example, when we discussed subtypes, , we noted that the elements of this object are technically pairs where the second entry is the witness for (under propositional extensionality, the witness can always be written as ) but we would pretend that the elements of this object are just the ’s from that satisfy . We argued that the two structures are isomorphic, so it doesn’t really matter.
Motivated by such considerations, we can introduce the univalence axiom, of which propositional and function extensionality are mere special cases. The axiom was invented by Fields-medalist Vladimir Voevodsky between 2006 and 2009. Informally it says: if two types are equivalent (“the same up to isomorphism”) we’ll just declare they are actually equal. In other words: “isomorphic structures are equal.” This would mean would actually be equal to a type which just contains all for which .
As with the extensionality axioms, the univalence axiom asserts an equivalence:
That is, we assert that the equality type for and is equivalent to the type of all equivalence functions between and , for any and in . This in turn implies the following statement, where is propositional truncation, as before:
Meaning, if and are equivalent, they’re equal. However, this is a slightly weaker statement because it only implies that the type is inhabited if and are equivalent, but it doesn’t say by how many elements.
The obvious weird thing about the above is that we’re asserting that an equality type for two types and can now have more than one element. Why would we do this? Well, the main reason is arguably that the univalence axiom was originally intended for constructive mathematics, where we don’t use propositional truncation and the elements of propositions (witnesses) are actually meaningful. But it’s also true that allowing multiple witnesses for equality potentially makes our theory a bit more powerful. We will discuss this more in the next section.
The univalence axiom pretty much immediately implies propositional extensionality as a special case. Proving function extensionality from univalence is also possible but relatively involved, so we won’t do it here.
The name “univalence” seems to have been a mistranslation from Russian, by the way, so don’t read too much into it.
Should we adopt the univalence axiom? I’m not sure myself. A lot of people seem to be quite excited about it. Admittedly, it does seem more elegant than separately assuming propositional and function extensionality. And it is true that mathematicians informally use the univalence axiom all the time when they abuse notation. But there may be other ramifications that I’m not aware of.
Equality revisited
When we defined equality, we said that for any in any type, is equal to itself; meaning that the type is inhabited by something. We could call this canonical member because it witnesses the fact that equality is reflexive:
where . However, we never said that this must be the only element of .
Of course, if we adopt the univalence axiom, then it is obvious that some equalities will have multiple elements. For example, the equality has two members under the univalence axiom, because there are two ways to construct an equivalence between and : mapping and or mapping and .
But independently of the univalence axiom, the fact that is not guaranteed to be a subsingleton is of course a problem for us because our encoding of first-order logic relies on the fact that we’re exclusively working with subsingletons. Can we just use propositional truncation, , and be done with it?
Unfortunately that doesn’t solve all the problems. For example, the axiom of choice, in the form that we stated it in before, is not compatible with the univalence axiom because the index type may contain elements that are equal in more than one way (i.e., their equality type has more than one element) and this causes an inconsistency with the univalence axiom (which I will not get into here). We can fix this problem relatively easily though by restricting the index type in the axiom of choice to types whose elements are only equal in at most one way. Those types will be called 0-types (according to the naming scheme that calls subsingletons -types).
Formally, is a 0-type if for all and all , we have :
Put differently: the equality types in a 0-type have to be -types (aka subsingletons). So, within 0-types, there is no need to use propositional truncation for equality. (The “” that appears in the above definition can also be proven to be a subsingleton, by the way, so no propositional truncation needed there either.)
This property is called UIP: uniqueness of identity proofs (where “proofs” should be understood as “witnesses”; so: “uniqueness of identity witnesses”). Most of the types we work with are 0-types (except the universe type ). 0-types are characterized by the fact that determining the equality of two elements can be reduced to a single, uniquely-determinable property.
For example, the type of the natural numbers, , is a 0-type, and a very rough argument for this is that since we define natural numbers to be constructed by repeated application of the successor function to the unique element , the equality of two numbers only depends on how many applications they have. No other factor comes into play, and so “the proof for their identity is unique.” (If that didn’t sound very convincing, check out the proof in section 2.13 of HoTT.)
In general, if the elements of a type don’t have much inner structure themselves, chances are, the type is a 0-type. All the number types are 0-types.
0-types are also called sets or h-sets, because I guess to type theorists these types look like sets from set theory? I sort of get why they’re called that, but I still think it’s confusing, because there are a lot of ways in which 0-types do not behave like sets in set theory; for example, 0-types don’t have a global membership relation.
As you might have guessed, if there are -types and 0-types, then this probably keeps going. Here is the definition of 1-types: Type is a 1-type if for all and and , we have . So, in a -type (aka a subsingleton) all elements are equal to each other. In a 0-type, all equality types between elements are -types. And in a 1-type, all equality types between elements are 0-types. You can guess how this continues.[3]
With the definition of 0-types we can amend the axiom of choice to say that the index type has to be a 0-type. This will make it compatible with univalence and will not make it weaker, to my knowledge.
This brings up the question: should we require all types to be 0-types? As discussed, this is not compatible with the univalence axiom as stated[4], but maybe it’s more important to us to not have weird elements in our equality types? We could definitely just adopt UIP as an axiom for all types in all universes and forgo univalence; it would be a consistent theory, though probably less powerful in some ways. We could still adopt propositional and function extensionality separately. Dependent type theory (i.e., the type theory we’ve been talking about) with UIP is an example of a so-called set-level type theory where all types behave like “sets” (0-types).[5]
I’m not sure myself how to answer this question. On the one hand, in my ordinary mathematics I don’t expect to ever need types that aren’t 0-types – all the types I would be using, like , , , , are 0-types – so, I don’t feel I would lose much by adopting UIP; on the other hand, the univalence axiom is pretty cool. Though, going with univalence technically means that you annoyingly have to truncate equalities if the elements are not from a 0-type: (which, to be fair, probably won’t happen often).
The subuniverse of 0-types
If you go with the univalence axiom, you’ll often need to state that a given theorem only holds for 0-types. For example, in order for groups to behave correctly, the base type that we called in the definition of the group type actually needs to be a 0-type. For stating this requirement, it would be nice if we could specify that comes from the subuniverse of all 0-types. And indeed we can do that. This subuniverse is usually called and we can define it with the UIP property from above:
The subscript is technically necessary because there isn’t a universe which contains all sets. But we usually drop it.
With univalence, the set of all subsingletons, , is a 0-type, by the way – even without LEM (in which case it collapses to and is obviously a 0-type). The informal argument for this is that there can only exist one equivalence between two subsingletons (under function extensionality, that is; which is implied by univalence).
Function types like “” are 0-types as long as the codomain is a 0-type; the domain can be any type. The informal argument is that if and are functions of type and , then for a given , there is only one witness for in the codomain (because is a 0-type) and so there is only one choice available for constructing the function and thus only one witness for . Families of types, which have the type where is some index type, are not 0-types though, because is certainly not a 0-type (under univalence).
The type itself is also not a 0-type, because there can be lots of equivalences between h-sets. (I will say “h-sets” instead of “0-types” whenever we’re talking about them as elements of .) Rather, is a 1-type. Quick proof sketch: if , then we need to show that is a 0-type. By the univalence axiom, is equivalent to , which is a subtype of . And is a 0-type as long as is a 0-type, which is the case here. Thus, we (informally) conclude that is also a 0-type.
Cardinal numbers
Let’s assume the univalence axiom for a bit and see what kind of consequences we can derive.
The fact that the h-sets in are equal to one another if there is an equivalence between them, means that, according to how the size of a set is usually defined in set theory, an h-set in is equal to all other h-sets of the same size (because in set theory, two sets have the same size if there is an isomorphism between them). This means the structure of is kind of like the structure of the cardinal numbers. For every possible set size, there is “one element” (well, depending on how you look at it, there can be lots of elements but they’re all indistinguishable) which we could take to be the cardinal number for that size. So, if we had some transformation that could turn an arbitrary type into a 0-type – we could call it 0-truncation and write it as for any type – then we could define the type of the cardinal numbers as
because already has the right structure.
If the 0-truncation alluded to above also comes with a (surjective) function, which maps elements from to , then the operations of addition and multiplication on the cardinal numbers reduce to the operations of binary sum types and binary product types on the types in :
Here, and are from , and and are from . The operations on the cardinal numbers are subscripted with a .
Now, is an element of as well? In ZFC set theory, the collection of all cardinal numbers is not a set because otherwise you get Cantor’s paradox. Instead, the cardinal numbers form a proper class in set theory. However, the way we constructed , it clearly is a 0-type, so it should be in . Do we have a paradox? The solution to the problem is that if you consider it carefully, you’ll realize that 0-truncation must be “raising the universe.” That is, if you have from the universe , then must be from a higher universe: (which makes the subscript admittedly a bit confusing; just means that it corresponds to from universe ). We have therefore .
I think this is a nice solution to Cantor’s paradox: we can still talk about the collection of all cardinals as an h-set without being at risk of paradox, because is never an element of . The hierarchy of universes solves all these problems of “sets” being too large to still be “sets”.
It’s an important point that being an h-set/0-type has nothing to do with “smallness”. is a “set” in the sense of being a 0-type, but it is not small in the sense of being an element of in the same universe – we do not have . Likewise, is a 0-type but is not small in the absence of LEM – we do not have . So, the concept of smallness still exists in type theory.
Nevertheless, it is easier to talk about things that are not small in type theory, which is one reason why category theory can be defined more naturally within type theory: categories are usually so large that they can’t be sets in set theory (and so are usually proper classes), but this is no problem in type theory; there will always be a type that’s large enough[6] – potentially in a higher universe.
The 0-truncation that we used to define can be defined as an inductive type, which I haven’t talked about. The definition of inductive types is a generalization of how the natural numbers are defined (namely, inductively). But it’s a very complex topic that I won’t go into here. Suffice to say that there is a rigorous way to define things like 0-truncation, including the mapping of the elements.
Summary of type forming rules and axioms
For easier reference, I’ll try to list here all the axioms we assumed for our type theory.
Remember that in set theory (specifically ZFC set theory), we first have to specify all the rules of inference for first-order logic, and then we specify the 9 ZFC axioms.
The situation in type theory is similar, except that we have more rules and fewer axioms (can be as few as 2 axioms). The basis of our theory is Martin-Löf dependent type theory (MLTT), which specifies all the fundamental type forming rules, like how to form a binary product type if you have two types and , and what the elements of that new type look like. I only specified those rules very informally; see for example here for the formal type forming rules. MLTT covers:
the empty type
the unit type
binary product types
dependent product types (aka dependent function types)
function types (really just a special case of dependent function types)
binary sum types
dependent sum types
the infinite hierarchy of universes
Then we also need to establish the natural numbers. I didn’t really explicitly do that, but basically, we introduce a type with an element and a successor function , and then we also need to establish how recursion/induction works on . This, too, is done via type forming rules (as opposed to axioms).
The finite types like can be generated from repeated binary sums on the unit type:
and so on. According to this definition, the elements of are and ; and for , , , and .
As the first non-type-rule axiom, we have the axiom of choice (AC). Next, we have either the two axioms propositional extensionality and function extensionality or just the univalence axiom. Together with AC, either choice implies the law of excluded middle.[7] If we don’t pick the univalence axiom, we can also consider uniqueness of identity proofs (UIP) as an axiom which ensures that all types are 0-types/h-sets.
Less ambiguous notation
Given that type theory re-uses symbols like and (and also ), that are quite commonly used when working with numbers, and have a different meaning there (namely addition), you might sometimes wish for less ambiguous symbols.
For example, with the univalence axiom we have , but the two “” symbols refer to different operations; the one on the left is a binary sum type while the one on the right is addition on the natural numbers.
As sum types are basically just disjoint unions, I think it would be okay to use symbols usually used for disjoint unions instead. Unfortunately, there isn’t really a single accepted notation for disjoint unions, but the symbol seems somewhat popular. It’s like the symbol for set union, , but square (\sqcup
in latex). Thus, a binary sum type could be written like this:
for two types and . And for the full sum type, we can write
The example above then becomes .
For product types, the analogue in set theory is the Cartesian product, so we could look there for alternative symbols for and , but, in fact, and are the established symbols there, too. So we have to live with the ambiguity.
This concludes my series on type theory. If you want to know more about the inner workings of it, definitely go read the HoTT book. However, if you only want to use types to annotate your normal, classical math, then I don’t expect you’ll get much out of HoTT.
If you want to actually use type theory in your work, you probably want to practice using it. I unfortunately don’t have ready-made exercises for this. I imagine an exercise for this could look like taking a known result from some branch of mathematics that is usually built on top of set theory and trying to annotate all the terms with types, which might require some modifications to the terms themselves in some cases.
- ^
Though obviously it’s not that simple: https://www.youtube.com/watch?v=4AivEQmfPpk
- ^
I think some variants of lambda calculus don’t accept eta-equivalence, but we do.
- ^
You can also play this game further in the other direction and define -types, by saying that in a -type, all equality types are -types. What do -types look like then? In a -type, all equality types are inhabited, so -types are all the “truthy” types – types that have exactly one element. The predicate that I defined above identifies (-2)-types.
- ^
Could we just use the weaker version of the univalence axiom which only asserted that the equality type is inhabited but not by how many elements? It seems to me that this could be compatible with UIP and not cause other inconsistencies, but I couldn’t find anything about this when searching on the internet.
- ^
Another way to get a set-level type theory is modifying judgmental equality such that it agrees with propositional equality. Such a type theory is called extensional type theory which might be a bit confusing because it has nothing to do with propositional or function extensionality.
- ^
Incidentally, the base type for a category would usually be a 1-type (the question of 0-type vs 1-type is unrelated to the question of “size”, to be clear), so that the isomorphisms between objects and the equality witnesses for objects can be the same thing. This is a natural extrapolation of the principle of univalence: each isomorphism in a category serves as a witness for the fact that two objects are equal. But you don’t have to do it this way; you can also keep categorical isomorphisms and equality separate.
- ^
One nice thing about explicit axioms is that you don’t necessarily have to use them. So, for example, you could first try proving something without the law of excluded middle (and also without the axiom of choice, of course) and see whether it’s possible. If it isn’t, you can always adopt the necessary axiom and list it as an explicit requirement. For example, in set theory, it is usually known which theorems require the axiom of choice, because it was once controversial. So, too, can you mark theorems in type theory with whether or not they require AC.
Thank you for the wonderful posts!