Explanations of mathematical explanation

I recently read Mathematical Explanation [gated], by Mark Steiner (1978). My summary follows, and my commentary follows that. I am aware that others have written things since 1978 on this topic, but I don’t have time to read them right now.

***

We seem to think there is a distinction between explaining a mathematical fact and merely demonstrating it to be the case. We have proofs that do both things, and perhaps a sliding scale of explanatoriness between them. One big question then is what makes a proof actually explain the thing it proves? Or at least what makes it seem that way to us?

One suggestion has been the level of generality or abstractness. Perhaps if we show a particular fact follows from some much bigger theory, the fact feels more explained. But then consider this fact:

1 + 2 + 3 + … + n = n(n+1)/2

There is an inductive proof of this, writing S(n) for the sum on the left:

S(1) = 1(1+1)/2 = 1

S(n+1) = S(n) + (n+1) = n(n+1)/2 + 2(n+1)/2 = (n+1)(n+2)/2

This is not taken to be very explanatory. Whereas this is:

● ○ ○ ○ ○
● ● ○ ○ ○
● ● ● ○ ○
● ● ● ● ○

[the black circles make a triangle of 1+2+3+4. Any such triangle can be made into a rectangle of area n x (n+1) with another identical triangle. So the triangle is half of n(n+1).]

It seems the latter is if anything less general, yet it seems a much better explanation (I remember learning it this way as a preteen in a book about fun math magic). There are other examples.
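As a quick sanity check of the picture argument, here is a minimal Python sketch. It is not from Steiner's paper, and the function names `rectangle` and `count` are made up for illustration. It builds the n by (n+1) rectangle row by row, with k black circles in row k, and confirms that the black and white triangles each hold half of the circles.

```python
def rectangle(n):
    """Return n rows of n+1 cells: row k has k black cells, then n+1-k white ones."""
    return [["●"] * k + ["○"] * (n + 1 - k) for k in range(1, n + 1)]


def count(grid, symbol):
    """Count how many cells in the grid hold the given symbol."""
    return sum(row.count(symbol) for row in grid)


if __name__ == "__main__":
    n = 4
    grid = rectangle(n)
    for row in grid:
        print(" ".join(row))
    # The black triangle is 1 + 2 + ... + n, and so is the white one,
    # so each must be half of the n * (n + 1) cells in the rectangle.
    print(count(grid, "●"), count(grid, "○"), n * (n + 1) // 2)
```

Of course, running this only verifies the identity for particular values of n. Whatever explanatory force the picture has comes from seeing why the two triangles must match, not from the count.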

This case and others suggest that being able to visualize a proof is key to its seeming to be an explanation. Steiner discards this immediately as being too subjective, and claims there are also counterexamples.

He also quickly dismisses a third hypothesis that others have put forward: that a proof is explanatory if it could have been used to discover the fact, rather than just to verify it. His counterexample is the Eulerian identity, which I shan’t go into here. I take it this hypothesis isn’t very plausible anyway, since often we discover a fact first then hope to explain it better.

Steiner offers his own theory: that a proof is explanatory if it makes use of a ‘characterizing property’ of an entity that is mentioned in the theorem. ‘Characterizing properties’ characterize an entity relative to other entities in some similar family. For instance, 18 might be characterized as 2*3*3, since other numbers don’t have that property. 18 might also be characterized as being one more than 17, or in a huge number of other ways.

If I understand, the idea is that if we are clear on how a result depends on a particular characterizing property, we will feel that the result has been explained. If we don’t see how something unique about the entities in question ‘caused’ the outcome, the outcome seems arbitrary. He explains further that this means we can see that if we change the properties of the entity, perhaps swapping out 18 for 20, we would get a different result.

Steiner explains how the many proofs he has presented that we have considered explanatory do in fact depend on characterizing properties, and thus considers his theory to be well supported.

Perhaps I misunderstand this notion of ‘characterizing properties’. It seems to me that of course all proofs depend on properties specific to the entities they are about (relative to whatever entities the proof is not about). So to distinguish the explanatory proofs, Steiner needs a narrower notion of a characterizing property. For instance, a property that is particularly saliently related to the entity in question. Or he needs to claim that explanatoriness requires the observer to actually notice or understand the connection between the explanatory property and the outcome. In which case the explanatoriness of a proof would be a function of the observer’s psychology as well as the proof. Any proof would be perfectly explanatory if the reader followed it carefully enough.

At any rate, he doesn’t seem to be thinking of either of those things (though again I may be misunderstanding just what he is claiming at the end here). He rather claims that the various proofs he examines do in fact rely on properties that characterize the entities involved. The class seemed to agree with me here.

My tentative theory of when we feel something has been explained, which goes for scientific explanations as well as mathematical ones, is as follows. We feel like we understand a bunch of things that we are very familiar with: chunks of matter moving through space and knocking into each other, liquids, shapes, basic agenthood, that sort of thing.

Anything that happens that only involves these things acting in their usual ways doesn’t feel like it needs any extra explanation. It is obvious. To ‘explain’ less familiar things, we can do one of two things. We can frame them in terms of something we already intuitively grasp in the above way. This is what is usually called an explanation. For instance we can think of electricity as being like water, or of the first n integers as being like bits of a triangle. Or of the mysterious murder being like a waitress putting poison in the soup. Alternatively we can just keep interacting with the entity in question until we become familiar with its properties, and then we think them obvious and not requiring explanation. For instance I no longer feel like I need an explanation for x^2 making a parabola shape, because I’m so familiar with it.

This arguably fits with many of the characteristics we have noted are associated with explanatoriness. Instances of generalizations that we understand feel explanatory. Pictures tend to be explanatory, especially diagrams with simple shapes. We feel like we could have discovered a thing ourselves if it follows from behavior of entities we can manipulate intuitively.

While this seems to me a decent characterization of what feels explanatory, I can’t see that it is a particularly useful category outside of psychology, for instance for use in saying what it is that science is meant to be doing. Something like unification seems more apt there, but that’s a topic for another time.

