Kaarel comments on Why I’m not a Bayesian

Kaarel 14 Oct 2024 17:10 UTC
18 points
I find it surprising/confusing/confused/jarring that you speak of models-in-the-sense-of-mathematical-logic =: L-models as the same thing as (or as a precise version of) models-as-conceptions-of-situations =: C-models. To explain why these look to me like two pretty much entirely distinct meanings of the word ‘model’, let me start by giving some first brushes of a picture of C-models. When one employs a C-model, one likens a situation/object/etc. of interest to a situation/object/etc. that is already understood (perhaps a mathematical/abstract one) and that one expects to be better able to work/play with. For example, when one has data about sun angles at a location throughout the day and one is tasked with figuring out the distance from that location to the north pole, one translates the question into a question about 3D space with a stationary point sun and a rotating sphere and an unknown point on the sphere and so on. (I’m not claiming a thinker is aware of making such a translation when they make it.) Employing a C-model $\approx$ making an analogy. From inside a thinker, the objects/situations on each side of the analogy look like… well, things/situations; from outside a thinker, both sides are thinking-elements.^[1] (I think there’s a large GOFAI subliterature trying to make this kind of picture precise, but I’m not that familiar with it; here are two papers that I’ve only skimmed: https://www.qrg.northwestern.edu/papers/Files/smeff2(searchable).pdf , https://api.lib.kyushu-u.ac.jp/opac_download_md/3070/76.ps.tar.pdf .)
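To make the sun-angle example a bit more concrete, here is a minimal Python sketch of the translated geometric question. Everything specific in it is my assumption rather than part of the original example: a perfectly spherical Earth of mean radius 6371 km, a sun that culminates to the south of the observer, a known solar declination (0 at an equinox), and the function name itself. Under those assumptions the noon elevation minus the declination equals the observer's angular distance from the pole.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a perfectly spherical Earth


def distance_to_north_pole_km(elevations_deg, solar_declination_deg=0.0):
    """Estimate the surface distance to the north pole from sun elevations.

    elevations_deg: sun elevation angles (degrees) sampled through one day.
    solar_declination_deg: assumed known; 0.0 corresponds to an equinox.

    Assumes the sun culminates to the south of the observer, so that
    noon_elevation = 90 - latitude + declination.
    """
    noon_elevation_deg = max(elevations_deg)  # highest sun of the day
    colatitude_deg = noon_elevation_deg - solar_declination_deg
    if colatitude_deg < 0.0:
        raise ValueError("inputs are inconsistent with the stated assumptions")
    # Arc length on the sphere = radius * angle in radians.
    return EARTH_RADIUS_KM * math.radians(colatitude_deg)


if __name__ == "__main__":
    samples = [10.0, 25.0, 30.0, 24.0, 9.0]  # made-up illustrative readings
    print(round(distance_to_north_pole_km(samples), 1), "km")
```

The point of the sketch is only that, once the situation has been likened to a rotating sphere with a distant point sun, the question becomes a short piece of spherical geometry.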
I’m not that happy with the above picture of C-models, but I think that its seeming like an even sorta reasonable candidate picture might be enough to show how different C-models and L-models are, so I’ll continue in that hope. I’ll assume we’re already on the same page about what an L-model is ( https://en.wikipedia.org/wiki/Model_theory ). Here are some ways in which C-models and L-models differ that imo together make them very different things:
- An L-model is an assignment of meaning to a language, a ‘mathematical universe’ together with a mapping from symbols in a language to stuff in that universe — it’s a semantic thing one attaches to a syntax. The two sides of a C-modeling-act are both things/situations which are roughly equally syntactic/semantic (more precisely: each side is more like a syntactic thing when we try to look at a thinker from the outside, and just not well-placed on this axis from the thinker’s internal point of view, but if anything, the already-understood side of the analogy might look more like a mechanical/syntactic game than the less-understood side, eg when you are aware that you are taking something as a C-model).
- Both sides of a C-model are things/situations one can reason about/with/in. An L-model takes one from a kind of reasoning (proving, stating) system to an external universe which that system could talk about.
- An L-model is an assignment of a static world to a dynamic thing; the two sides of a C-model are roughly equally dynamic.
- A C-model might ‘allow you to make certain moves without necessarily explicitly concerning itself much with any coherent mathematical object that these might be tracking’. Of course, if you are employing a C-model and you ask yourself whether you are thinking about some thing, you will probably answer that you are, but in general it won’t be anywhere close to ‘fully developed’ in your mind, and even if it were (whatever that means), that wouldn’t be all there is to the C-model. For an extreme example, we could maybe even imagine a case where a C-model is given with some ‘axioms and inference rules’ such that if one tried to construct a mathematical object ‘wrt which all these axioms and inference rules would be valid’, one would not be able to construct anything — one would find that one has been ‘talking about a logically impossible object’. Maybe physicists handling infinities gracefully when calculating integrals in QFT is a fun example of this? This is in contrast with an L-model which doesn’t involve anything like axioms or inference rules at all and which is ‘fully developed’ — all terms in the syntax have been given fixed referents and so on.
- (this point and the ones after are in some tension with the main picture of C-models provided above but:) A C-model could be like a mental context/arena where certain moves are made available/salient, like a game. It seems difficult to see an L-model this way.
- A C-model could also be like a program that can be run with inputs from a given situation. It seems difficult to think of an L-model this way.
- A C-model can provide a way to talk about a situation, a conceptual lens through which to see a situation, without which one wouldn’t really be able to [talk about]/see the situation at all. It seems difficult to see an L-model as ever doing this. (Relatedly, I also find it surprising/confusing/confused/jarring that you speak of reasoning using C-models as a semantic kind of reasoning.)
(But maybe I’m grouping like a thousand different things together unnaturally under C-models and you have some single thing or a subset in mind that is in fact closer to L-models?)
All this said, I don’t want to claim that no helpful analogy could be made between C-models and L-models. Indeed, I think there is the following important analogy between C-models and L-models:
- When we look for a C-model to apply to a situation of interest, perhaps we often look for a mathematical object/situation that satisfies certain key properties satisfied by the situation. Likewise, an L-model of a set of sentences is (roughly speaking) a mathematical object which satisfies those sentences.
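For readers who want the L-model side of this analogy spelled out, here is a small Python sketch of a finite structure being checked against a set of sentences. The choice of the group axioms as the sentences, the symbol names `"e"` and `"*"`, and the function name are all illustrative assumptions of mine; the checker also hardcodes those three axioms rather than evaluating arbitrary first-order sentences.

```python
from itertools import product


def satisfies_group_axioms(domain, interpretation):
    """Check whether a finite structure satisfies the three group axioms.

    domain: the finite 'mathematical universe'.
    interpretation: maps the symbols "e" and "*" to an element of the
    domain and a binary function on it, respectively.
    """
    e = interpretation["e"]
    op = interpretation["*"]
    elements = list(domain)
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    has_identity = all(op(e, a) == a and op(a, e) == a for a in elements)
    has_inverses = all(
        any(op(a, b) == e and op(b, a) == e for b in elements)
        for a in elements
    )
    return associative and has_identity and has_inverses


# Same language, same domain, two different assignments of meaning.
addition_mod_3 = {"e": 0, "*": lambda a, b: (a + b) % 3}
maximum = {"e": 0, "*": lambda a, b: max(a, b)}

if __name__ == "__main__":
    print(satisfies_group_axioms(range(3), addition_mod_3))
    print(satisfies_group_axioms(range(3), maximum))
```

The first interpretation is a model of the axioms; the second satisfies associativity and identity but has no inverses for nonzero elements, so it is not.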
(Acknowledgments. I’d like to thank Dmitry Vaintrob and Sam Eisenstat for related conversations.)
1. ^
  This is complicated a bit by a thinker also commonly looking at the C-model partly as if from the outside; in particular, this happens when a thinker critiques the C-model to come up with a better one. For example, you might notice that the situation of interest has some property that the toy situation you are analogizing it to lacks, and then try to fix that. For instance, to guess the density of twin primes, you might start from a naive analogy to a probabilistic situation where each ‘prime’ p has probability (p-1)/p of not dividing each ‘number’ independently at random, but then realize that your analogy is lacking because really, for an odd prime p, p not dividing n makes it a bit less likely that p doesn’t divide n+2, and adjust your analogy. This involves a mental move that also looks at the analogy ‘from the outside’ a bit.
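  The correction in the twin-prime example can be checked by direct counting over one period of residues mod p. The following is a small Python sketch with function names of my own choosing; it compares the naive independent estimate with the exact fraction of residues n for which p divides neither n nor n+2.

```python
from fractions import Fraction


def naive_joint(p):
    """Naive analogy: treat 'p does not divide n' and 'p does not divide n+2'
    as independent events, each with probability (p-1)/p."""
    return Fraction(p - 1, p) ** 2


def exact_joint(p):
    """Exact fraction of residues n mod p with p dividing neither n nor n+2."""
    hits = sum(1 for n in range(p) if n % p != 0 and (n + 2) % p != 0)
    return Fraction(hits, p)


def conditional(p):
    """P(p does not divide n+2 | p does not divide n), by counting."""
    return exact_joint(p) / Fraction(p - 1, p)


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, naive_joint(p), exact_joint(p), conditional(p))
```

  For odd p the exact joint fraction is (p-2)/p, which is below the naive ((p-1)/p)^2, matching the adjustment described above. The prime 2 goes the other way: if n is odd then n+2 is certainly odd.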