The sentence structure of mathematics
“Alice pushes Bob.”
“Cat drinks milk.”
“Comment hurts feelings.”
These are all different sentences that describe wildly different things. People are very different from cats, and cats are very different from comments. Bob, milk, and feelings don’t have much to do with each other. Pushing, drinking, and (emotionally) hurting are also really different things.
But I bet these sentences all feel really similar to you.
They should feel similar. They all have the same structure. Specifically, that structure is noun, verb, noun.
Because these sentences all share the same fundamental underlying structure, they all feel quite similar even though they are very different on the surface. (The mathematical term for “fundamentally the same but different on the surface” is isomorphic.)
When you studied sentence structure back in grammar school (it wasn’t just me, right?) you learned to break down sentences into their parts of speech. You learned that nouns are persons, places, or things, and verbs are the activities that nouns do. Adjectives describe nouns, and adverbs describe pretty much anything. Prepositions tell you where nouns go. Etc.
Parts of speech are really abstract and really general. When you look at the surface, the sentence
the ant crawls on the ground
and the sentence
the spaceship flies through space
could not possibly be more different. But when you look at the sentence structure, they’re nearly identical.
The concept of “parts of speech” emerges when we notice certain general patterns arising in the way we speak. We notice that whether we’re talking about ants or spaceships, we’re always talking about things. And whether we’re talking about crawling or flying, we’re always talking about actions.
And so on for adjectives, adverbs, conjunctions, etc., which always seem to relate back to nouns and verbs—adjectives modify nouns, for example.
Next we simply give things and actions, descriptors and relational terms some confusing names to make sure the peons can’t catch on—nouns and verbs, adjectives and prepositions—and we have a way of breaking down any English sentence into its fundamental parts.
That is to say, if you know the abstract rules governing sentence structure—the types of pieces and their connections—you can come up with structures that any English sentence is but a particular example of.
Like how “Alice pushes Bob” is but a particular example of “Noun verb noun.”
At the most basic level, category theory breaks down mathematics into its parts of speech. It turns out that mathematics is pretty much just nouns and verbs at its simplest—just like how, if you read between the lines a bit, any English sentence can be boiled down to its nouns and verbs. Those are the “main players” which everything else just modifies in some fashion.
In mathematics, a noun is called an object.
A verb is called a morphism or arrow. We’ll explore the terminology of morphism a bit more next time. As to why they can also be called arrows, that’s because verbs appear to have directions: One noun does the verb, and another noun (potentially the same noun, like pinching yourself) receives the verb. So you could draw that as an arrow like so:
This is exactly how we diagram objects and morphisms in category theory, with one difference: we typically use single letters in place of full names. (I’d explain the value of concision here, but it seems hypocritical.) So if Alice and Bob are objects in our category, and Alice’s push of Bob is the morphism, then we might write it this way:
Equally legitimate is to highlight the morphism up front. (We’ll see they’re the real stars of the show):
So now you understand objects and morphisms, the basic pieces of any category, just like how nouns and verbs are the basic pieces of any sentence.
Of course, making a sentence isn’t as simple as mashing nouns and verbs together. We need to make sure that the sentence makes sense. To paraphrase Harrison Ford, you can write “colorless green ideas sleep furiously”, but you sure can’t think it.
We’ll explore the rules that define a category in the next post.
I was tentatively excited about this series, but I have to be honest: I am dismayed by this post.
… I really, really don’t.
You are (unless I’m grossly misunderstanding) analogizing category theory to grammar. Your analogy starts with some examples of sentences; then provides an intuitive, common-sense explanation of the common parts of speech, in non-technical terms; and also provides the technical terms. This is perfectly sensible, and is easy to follow.
Then, to make use of the analogy, you introduce the mathematical analogues… but this time, you don’t provide any examples, nor any intuitive generalizations of the examples (because there are none to generalize)… you simply introduce the technical terms, assert that they analogize, and declare that understanding has been conveyed. But that doesn’t work at all!
To elaborate:
What are some examples of this? What are some things in mathematics which are “nouns” and “verbs”? I don’t have any intuition for this (as I certainly do for English sentences, which clearly deal with things, and actions that people take, etc.).
So we’re talking about… what, exactly? Numbers? Digits? Variables? Functions? Expressions? Equations? Operators? Symbols? All of the above? None of the above? Some of the above?
Again… what are examples of “morphisms” or “arrows”? Like, actual examples, not “examples” by analogy to English sentences?
Ditto. If this is the generalization, what are some specific examples?
I very much hope that you can address these troubles… otherwise, if I can’t understand even the very basic first concepts, there doesn’t seem to be much hope of understanding anything else!
Thank you very much for your reaction to this post. As it happens, I find myself in agreement with you. I leaned too hard in the direction of avoiding any discussion of mathematics. The next post is already written to clarify that sentences are all about nouns and verbs because we use sentences to model reality, and reality seems to consist of nouns and verbs. (Cats, drinking, milk, etc., are all part of reality. Even adjectives like “blue” are broken down by our physics into nouns and verbs.) We use various specific kinds of mathematics to model various specific parts of reality, and so various specific kinds of mathematics themselves boil down to nouns and verbs. So when you do a “mathematics of math” it ends up being a mathematics that is analogous to a mathematics of nouns and verbs, which get called objects and morphisms respectively. (We probably can’t carry this analogy forever—I don’t know that there’s a real-world language analogy to n-categories. But that won’t come up anyway.) I’ll very much look forward to your reaction to the next post, which motivates category theory as a general description of how you’d want to model pretty much anything in a universe of cause-and-effect, which correspondingly generalizes, almost as a byproduct, the mathematics any human is likely to invent.
There are many options for being clearer about objects and morphisms in this post, and I will consider them. Thanks to you, I will also take pains to ensure that future posts don’t need to be reconsidered for this particular mistake.
Do you know the monads are like burritos problem? Do you have a plan for how this sequence isn’t going to end up being “mathematics is like burritos”?
Delighted that someone wants to give a detailed explanation of this area. I tried to read the start of the introduction for programmers and it wasn’t as self-evident as I would have thought.
I would have broken up the parts of speech as subject predicate object, s p o, which produces a pattern like a b c, while the post wants to introduce a pattern like a b a. The verb also gets inflected in the examples. A starkly literal application of the noun verb noun pattern would spell “cat drink milk” rather than “cat drinks milk”. It is also ambiguous whether it should carry over that the As are drawn from the same kind of entities (Alice and Bob are persons).
There is also the difference between a verb as it relates to its place in a sentence and a verb as describing an action. For example, I do not think that addition is a verb but more of a relation. Part of the shakiness and insecurity in taking on odd concept areas can be the undefinedness of the basic concepts. To that effect, the post seems to think that I have an understanding of “objects” and “morphisms”, but it really just says “translate these as nouns and verbs”. Okay, it is something I can hang conceptual stuff on, but referring to established concepts elsewhere seems like a lot of unwanted baggage might be imported in the same go.
If this is covered in a future step, point me there, but I got myself confused over what is the same or different between programming functions, mathematical functions and morphisms. As for how it relates to this post: based on reading it, if I have p: A → B and q: A → B, these seem to define two separate morphisms. I get that if I have p: A → B and p: A → C there is a naming conflict and the second p is necessarily different. But on the language level it would seem that “Alice punches Bob” and “Alice hugs Bob” are two separate entities.
I have previous baggage, since the difference between a programming function and a mathematical function gives me a theoretical headache. In particular, I can imagine two programming functions that have the same input and output behaviour but work differently and are thus clearly separate. Yet mathematical functions are identified by their input/output behaviour. (Then there is the problem that some of the extensions are in fact drawn from intensions, which makes one wonder whether the extension definitions are just a front for the real thought processes. If you have a thing like “f(x)=x+x^2” it seems to be a different kind of entity than an infinite listing of value pairs.) Then there is the thing that programming functions are genuinely verbs, in that you can execute functions and it corresponds to physical events happening on a computer. However, there is no “time progression” for mathematical functions. The analogy of morphisms to verbs suggests to me that they also have time progression, but that seems to be somewhat in conflict with the other source.
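To make the extension/intension distinction above concrete, here is a minimal Python sketch. The names polynomial_form, factored_form and graph are made up for illustration, and the finite range stands in for an infinite domain, so this only checks agreement on a sample rather than proving anything.

```python
def polynomial_form(x: int) -> int:
    # One "intension": compute x + x^2 directly.
    return x + x ** 2


def factored_form(x: int) -> int:
    # A different procedure with the same input/output behaviour: x * (x + 1).
    return x * (x + 1)


def graph(f, domain):
    # The "extension" of f restricted to a finite domain: a set of (input, output) pairs.
    return {(x, f(x)) for x in domain}


sample_domain = range(-10, 11)

# As mathematical functions on this domain, the two are the same set of pairs.
same_extension = graph(polynomial_form, sample_domain) == graph(factored_form, sample_domain)

# As programs, they are distinct objects with different bodies.
same_program = polynomial_form is factored_form
```

Here same_extension comes out true and same_program comes out false, which is roughly the headache described: the mathematics only sees the set of pairs, while the programmer also sees the recipe.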
You have thought about the language analogy much harder than I did. I will think about how to avoid this issue better in the future, so thank you. In any case, don’t stress it too much—all that this post seeks to establish is that category theory is a mathematics of “stuff taking action on stuff”—moreover, it does so in a logical, intuitive way that you are already familiar with, even if you don’t know higher maths. Judging by Said’s comment, I also should have clarified that specific branches of mathematics fill in particular things for “stuff” and “taking action.” E.g., you get set theory when you fill in “sets” for stuff and “functions” for taking action.
It might get weird for me, as part of my past progress is how functions are actually objects, i.e. non-verblike. You can, for example, encode a function as ordered pairs, which can be represented as a set. Do you mean it more in the sense that a function by itself is missing something, has a “hole” in it? For example, “It rains” can seem like a language construction where “rain” appears without holes (and in my native language you express that kind of thought without any formal subject; “rains” is a perfectly fine sentence that describes a common weather condition/activity).
The baggage that comes with the words noun and verb is only for guiding the search for intuition and is to be discarded when it leads to confusion.
In all your interpretations of math/programming functions, there can be different arrows between the same objects. The input/output behavior is seen as part of the arrow. The objects are merely there to establish what kinds of arrows can be strung together because one produces, say, real numbers, and the other consumes them.
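A rough Python sketch of that point, under some loud assumptions: the Arrow class, the compose helper and the object names "Real" and "Text" are invented here for illustration and are not standard category-theory library code. It shows two different arrows between the same objects, and objects acting only as a compatibility check for stringing arrows together.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Arrow:
    name: str
    source: str          # hypothetical object label, e.g. "Real"
    target: str
    apply: Callable      # the input/output behaviour is part of the arrow


def compose(g: Arrow, f: Arrow) -> Arrow:
    """Return 'g after f'; only defined when f produces what g consumes."""
    if f.target != g.source:
        raise TypeError(f"cannot compose {g.name} after {f.name}")
    return Arrow(f"{g.name} after {f.name}", f.source, g.target,
                 lambda x: g.apply(f.apply(x)))


# Two different arrows between the same pair of objects.
square = Arrow("square", "Real", "Real", lambda x: x * x)
negate = Arrow("negate", "Real", "Real", lambda x: -x)

# An arrow that consumes reals and produces text.
describe = Arrow("describe", "Real", "Text", lambda x: f"value {x}")

label_of_square = compose(describe, square)   # fine: Real -> Real -> Text
```

Calling label_of_square.apply(3) gives "value 9", while compose(square, describe) raises TypeError, because describe produces "Text" and square consumes "Real".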
Well if I have a mapping (function, morphism?) that has some “rows” of
1 to 5, A to 3, B to cat, cow to france
it doesn’t seem that descriptive to say that this is a “B->5” mapping. Now, usually programming functions are sensible in the sense that the inputs and outputs are of similar types. But if I start and form the concept of morphism from the ground up, how do I know whether such “mixed” types are allowed or not? Or rather, given that I do not know of types, how do I get a mapping over multiple inputs?
If your mapping contains those four pairs, then the arrow’s source object contains 1, A, B and cow, and the target object contains 5, 3, cat and france. Allowing or disallowing mixed types gives two different categories. Whether an arrow mixes types is, as far as I can tell what you mean, uniquely determined by whether its source or target object mixes types. In either case, to compose two arrows they must have a common middle object.
I don’t know whether it is a relevant fear, but I am unsure how much of the details other than type compatibility are preserved.
Say you have a mapping O: A->1, B->3 and a mapping P: 4->france, 5->england. You could then say that O is letters to numbers and P is numbers to countries, so a mapping from letters to countries should exist, but if you start at A or B you don’t end up at any country. Or is it the case that {1,3} is a different category than {4,5}, rather than letters being equal to letters?
You mean object.
Every category containing O and P must address this question. In the usual category of math functions, if P has only those two pairs then the source object of P is exactly {4,5}, so O and P can’t be composed. In the category of relations, that is arbitrary sets of pairs between the source and target sets, O and P would compose to the empty relation between letters and countries.
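Here is a small Python sketch of both answers, using the O and P from the question. The Fn and Rel classes, the helper names, and the choice of {1, 3, 4, 5} as the "numbers" object are all assumptions made for illustration; other choices of objects would give other categories.

```python
from typing import NamedTuple


class Fn(NamedTuple):
    source: frozenset
    target: frozenset
    table: dict          # must be defined on every element of source


class Rel(NamedTuple):
    source: frozenset
    target: frozenset
    pairs: frozenset     # any subset of source x target


def compose_fn(g: Fn, f: Fn) -> Fn:
    # Functions compose only when the middle objects are the same set.
    if f.target != g.source:
        raise ValueError("target of first does not match source of second")
    return Fn(f.source, g.target, {x: g.table[f.table[x]] for x in f.source})


def compose_rel(s: Rel, r: Rel) -> Rel:
    # Relations compose by chaining pairs that share a middle element.
    if r.target != s.source:
        raise ValueError("target of first does not match source of second")
    chained = frozenset((a, c) for (a, b) in r.pairs
                        for (b2, c) in s.pairs if b == b2)
    return Rel(r.source, s.target, chained)


letters = frozenset({"A", "B"})
numbers = frozenset({1, 3, 4, 5})            # assumed shared middle object
countries = frozenset({"france", "england"})

# As functions: P's source is exactly {4, 5}, O's target is {1, 3}.
O_fn = Fn(letters, frozenset({1, 3}), {"A": 1, "B": 3})
P_fn = Fn(frozenset({4, 5}), countries, {4: "france", 5: "england"})

# As relations: both are relations through the same set of numbers.
O_rel = Rel(letters, numbers, frozenset({("A", 1), ("B", 3)}))
P_rel = Rel(numbers, countries, frozenset({(4, "france"), (5, "england")}))

letters_to_countries = compose_rel(P_rel, O_rel)
```

In this sketch compose_fn(P_fn, O_fn) raises ValueError, while letters_to_countries is a relation from letters to countries whose set of pairs is empty.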
I would suspect there are rules for how it works that way, but right now it is not intuitive to me why that would be the result. Why would it not produce the empty function? And if you have an empty relation, isn’t it a relation of any type to any type at the same time? Would it be, or why would it not be, an empty relation between letter-shapes and country-dances? But apparently you can have different kinds of empty morphisms based on what their source and target objects are.
I also didn’t realise that composing is relative to how you view the objects.
Categories are what we call it when each arrow remembers its source and target. When they don’t, and you can compose anything, it’s called a monoid. The difference is the same as between static and dynamic type systems. The more powerful your system is, the less you can prove about it, so whenever we can, we express that particular arrows can’t be composed, using definitions of source and target.
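One way to picture the difference, as a hedged Python sketch: the Arrow triple, the "*" object label and the try_compose helper are invented for illustration. With a single object every composition is allowed (the monoid case); with several objects the source and target labels rule some compositions out, much as a static type checker would.

```python
from typing import NamedTuple, Optional


class Arrow(NamedTuple):
    source: str
    target: str
    steps: tuple         # labels of the basic arrows, in the order they run


def try_compose(g: Arrow, f: Arrow) -> Optional[Arrow]:
    """Return 'g after f', or None when f's target is not g's source."""
    if f.target != g.source:
        return None
    return Arrow(f.source, g.target, f.steps + g.steps)


# Monoid case: a single object "*", so the check can never fail.
a = Arrow("*", "*", ("a",))
b = Arrow("*", "*", ("b",))
ab = try_compose(b, a)
ba = try_compose(a, b)

# Category case: several objects, so only some arrows can be strung together.
parse = Arrow("Text", "Number", ("parse",))
negate = Arrow("Number", "Number", ("negate",))
parse_then_negate = try_compose(negate, parse)   # Text -> Number -> Number
negate_then_parse = try_compose(parse, negate)   # Number is not Text
```

In this sketch ab and ba are both arrows on "*", parse_then_negate is an arrow from "Text" to "Number", and negate_then_parse is None because the middle objects do not match.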
There are some pretty big differences. A verb typically defines an action that takes place at a time, so that Bob was not pushed by Alice until time T, when he was. Morphisms are more like static relationships, which would typically be expressed by the copula: “is the father of” and so on.
Does category theory have a meaning component?