I spent a lot of the last two years getting really into categorical logic (as in, using category theory to study logic), because I’m really into logic, and category theory seemed to be able to provide cool alternate foundations of mathematics.
Turns out it doesn’t really.
Don’t get me wrong, I still think it’s interesting and useful, and it did provide me with a very cosmopolitan view of logical systems (more on that later). But category theory is not suitable for foundations or even meant to be foundational. Most category theorists use an extended version of set theory as foundations!
In fact, its purpose is best seen as exactly dual to that of foundations: while set theory allows you to build things from the ground up, category theory allows you to organize things from high above. A category by itself is not so interesting; one often studies a category in terms of how it maps from and into other categories (including itself!), with functors, and, most usefully, adjunctions.
Ahem. This wasn’t even on topic.
I want to talk about a particular subject in categorical logic, perhaps the most well-studied one, which is topos theory, and why I believe it to be useless for rationality, so that others may avoid retreading my path. The thesis of this post is that probabilities aren’t (intuitionistic) truth values.
Topoï and toposes
A topos is perhaps best seen not even as a category, but as an alternate mathematical universe. Toposes are, essentially, “weird set theories”. Case in point: Set itself is a topos, and other toposes are often constructed as categories of functors F:C→Set, for C an arbitrary category.
(Functors assemble into categories if you take natural transformations between them. That basically means that you have maps F(c)→G(c), such that if you compare the images of a path under F and G, all the little squares commute.)
Consider that natural numbers, with their usual ordering like 4≤5, can form a category if you take instead 4→5. So one simple example is to consider the category of all functors N→Set, which are really just sequences of sets, like
X₁ → X₂ → X₃ → ⋯
where the arrows are ordinary set-theoretic functions. You can do practically any kind of mathematical reasoning using sequences of sets (as long as it is constructive)! For example, you have
an “empty set”, which is just a sequence of empty sets;
a “point” given by a sequence of points;
“products” of sequences given by {Xᵢ×Yᵢ};
and so on. Most interestingly, you have truth values given by subobjects of the point; accordingly, in Set those are the empty set and the point itself, since P(∗)={∅,∗}, corresponding to false and true respectively. Notice that ∅⊆∗; in fact the truth values in general will have the structure of a partially ordered set.
What are our truth values here? What is a subobject of a sequence of points? For one, each Xᵢ has to be a subset of ∗. And there are no maps ∗→∅; so each “truth value” will look like
∅→∅→⋯→∅→∗→∗→⋯
a bunch of empty sets and then, from some position n onward, all points, meaning that we have as many truth values as natural numbers (plus the all-empty sequence, which plays the role of “false”). This is our first glimpse into the cosmopolitan nature of topos theory: weird truth values! Notice, however, that if n≤m, their corresponding subobjects will have this ordering reversed (an exercise left for the already knowledgeable reader); so in the end it might have been better to use functors on N^op, the natural numbers with their order reversed.
To sum up, we made the category Set^N of sequences of sets, and realized that it was a topos with truth values N^op. Isn’t that interesting...
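To make the counting concrete, here is a minimal sketch in Python that cuts the chain off after k stages (a simplification of mine, since N itself is infinite) and enumerates the subobjects of the point by brute force. The names `truth_values`, `included` and `start` are made up for this illustration.

```python
from itertools import product

def truth_values(k):
    """Subobjects of the point for functors from the chain 0 -> 1 -> ... -> k-1 to Set.

    Each stage is either the empty set (False) or the point (True). There is no
    function from the point to the empty set, so once a stage is True, every
    later stage must be True as well.
    """
    values = []
    for bits in product([False, True], repeat=k):
        if all((not a) or b for a, b in zip(bits, bits[1:])):
            values.append(bits)
    return values

def included(s, t):
    """s is a subobject of t when each stage of s is contained in the same stage of t."""
    return all((not a) or b for a, b in zip(s, t))

def start(s):
    """First stage at which the sequence is inhabited (len(s) if it never is)."""
    return s.index(True) if True in s else len(s)

vals = truth_values(4)
for v in vals:
    print(start(v), v)
```

On a chain of length k this finds k+1 truth values, one per starting position plus the all-empty sequence, and a sequence starting later sits inside one starting earlier: that is the order reversal mentioned above.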
Topoï-logical
Turns out there’s a big connection between toposes and topological spaces.
The open sets of a topological space have the structure of a partially ordered set, if you set U≤V whenever U⊆V. Moreover, in that poset, you can describe U∩V as the greatest lower bound of U and V, and U∪V as their least upper bound.
This is in fact (almost) exactly the structure we want of our topos-theoretic poset of truth values: the greatest lower bound corresponds to the conjunction p∧q and the least upper bound to the disjunction p∨q. So we can use topological spaces as spaces of truth values, and this is in fact the approach used in Heyting semantics of intuitionistic logic.
(so each open set, and not point, corresponds to a truth value; you take the AND of two open sets to be their intersection and the OR to be their union)
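Here is a small sketch of that idea in Python, on a toy space of my own choosing: the two-point Sierpiński space, whose open sets are ∅, {1} and {0,1}. The function names are invented for the example, and implication is computed as the largest open set W with W∩U⊆V, which is the standard way to get “implies” out of a topology.

```python
X = frozenset({0, 1})
# The Sierpinski topology on {0, 1}: an assumed toy example, not from the post.
OPENS = [frozenset(), frozenset({1}), X]

def meet(u, v):
    """AND: intersection of open sets."""
    return u & v

def join(u, v):
    """OR: union of open sets."""
    return u | v

def implies(u, v):
    """Largest open set w such that (w AND u) is contained in v."""
    result = frozenset()
    for w in OPENS:
        if (w & u) <= v:
            result |= w
    return result

def neg(u):
    """NOT u, defined as (u implies false), where false is the empty set."""
    return implies(u, frozenset())

U = frozenset({1})
print(neg(U))           # the empty set: no nonempty open set avoids the point 1
print(join(U, neg(U)))  # {1}, which is not all of X
```

So in this tiny universe “U or not U” fails to be true for U={1}, which is the intuitionistic flavor showing up.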
Alright, so as with N, the poset of open sets can define a category if you set U→V whenever U≤V. So take X a topological space and O(X) its category of open sets. We’ll define a new category Sh(X) of functors O(X)op→Set, except that they won’t be all of the functors, only those that preserve the topo/logical structure (sheaves).
Guess what? Not only is Sh(X) a topos, turns out that its truth values are isomorphic to O(X)! And since the truth values are the subobjects of the point, that means that the points of the topos are in fact shaped like X…
Now we have a fuller picture of the logical cosmopolitanism that topos theory can bring us. You can create, just like that, a whole parallel mathematical universe of bizarro sets where everything is made up of, say, donuts. Or coffee cups. It is as you wish.
Where’s my Bayesian topos?
Since I am at heart a LessWronger, and since I care deeply about problems of logical induction and logical counterfactuals and whatnot, I spent a while trying to design a topos that would behave like manipulating probabilities. Or distributions. Or something. With the objective of making something that would represent beliefs.
Well, I’m sorry, but it doesn’t work.
At minimum, we would expect that the truth values of this topos be probabilities, yeah? And with the cosmopolitan principle above, we could then just take the sheaves on this poset of probabilistic truth values.
So these truth values would be order-isomorphic to [0,1]. But for them to actually represent probabilities, we’d want p∧q=pq (at least for independent p and q), and yet the order on [0,1] already prescribes that p∧q=min(p,q), and we are doomed from the start.
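A quick numeric sanity check of the mismatch, as a sketch: the helper names and the grid of tenths are arbitrary choices of mine. It uses exact fractions so that no floating-point noise gets in the way.

```python
from fractions import Fraction

def meet(p, q):
    """Conjunction as dictated by the order on [0, 1]: the greatest lower bound."""
    return min(p, q)

def times(p, q):
    """Conjunction as probability would want it for independent events."""
    return p * q

half = Fraction(1, 2)
grid = [Fraction(i, 10) for i in range(11)]

# The product is always a lower bound of p and q, just not the greatest one.
product_is_lower_bound = all(times(p, q) <= meet(p, q) for p in grid for q in grid)

# In any poset of truth values p AND p = p; the product only manages that at 0 and 1.
idempotent_points = [p for p in grid if times(p, p) == p]

print(meet(half, half), times(half, half))
print(product_is_lower_bound, idempotent_points)
```

The order forces p∧p=p, while multiplication gives p·p<p for every p strictly between 0 and 1, so no relabeling of [0,1] that preserves the order can turn min into the product.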
Furthermore, even in an intuitionistic logic, the provable statements all have the maximal truth value (which here would be 1); but we all know that 0 and 1 are not probabilities, and so nothing should be provable… which seems like it wouldn’t be very useful.
All in all, I’m truly sorry you had to sit through all of the math above just for this conclusion. It’s still pretty cool, though, right?
(Geometric) topoï aren’t reflective
In order to legitimately use topos theory for rationality, we should have a way for the topos to “think about itself”. Analogously to the situation in Peano arithmetic, for a topos ℰ, we’d want some object E∈ℰ (specifically, an internal category) to be isomorphic to ℰ in some sense.
We can define an “element” of an object E∈ℰ as being an arrow ∗→E. So the objects of the internal category are given by the set Hom(∗,E), and in fact the functor Hom(∗,−):ℰ→Set respects the structure of the internal category enough that it becomes a category internal to Set, which is just a small category.
But wait. The toposes generated by sheaves on a topological space are at least as big as Set, but the collection of all sets is too big to be a set, and thus we run into size issues.
It should in principle be possible to do so in small toposes, such as the free (as in syntactic) topos, but I am not sure and will refrain from claiming so. It is however certainly possible to do so in list-arithmetic pretoposes (yes, it’s a mouthful), as shown by André Joyal in his as-yet-unpublished categorical proof of Gödel’s incompleteness theorems, which I studied with him last year.
What now?
It now seems to me that linear logic might be the “right” weakening of classical logic into something probabilistic. I still need to figure out some of the details, but let’s say that the work has already been done, and one need only piece it together into something relevant to rationality and agent foundations. Particularly promising is that some claim that linear logic is a good setting for “paraconsistent” logic (logic that deals gracefully with contradictions), which could make it work for logical counterfactuals.
All this and more in my next post, pretentiously monikered “Probability Monads”.
Why Rationalists Shouldn’t be Interested in Topos Theory