I’m also confused by a couple of minor points here:
The paper asks for a “probability distribution over models of L”. In fact, for many languages L, models of L form a proper class. Does this cause measure-theoretic difficulties? It seems like this might force mu to be zero on all sufficiently large models (otherwise you can do some sort of transfinite induction to get sets of unbounded measure) but I’m not very good at crazy set theory stuff.
At one point the authors state “We would like P(forall phi in L’ )”. I thought we were in a first-order language and therefore couldn’t quantify over propositions?
It’s not immediately clear to me that this actually constructs a measure on the set of theories: that is, if S is the set of all complete consistent theories, it’s not clear to me that for the mu we construct by martingale, mu(S) = 1 (or even that mu(S) != 0). Mightn’t additivity break when we take the limit and get a whole theory rather than just an incomplete bag of axioms?
Can we instead do “probability distribution over equivalence classes of models of L”, where equivalence is determined by agreement on the truth-values of all first order sentences? There are at most 2^ℵ₀ of those, and the paper never depends on any distinction within such an equivalence class.
Yes, though we should just call it a “probability distribution over complete consistent theories” in that case (it’s exactly the same).
There are definitely some probability distributions over proper classes that are useful (for example, a measure that assigns .5 to one model, .5 to another, and zero to the rest). No model would ever be forced to have measure 0, as you can always construct the measure that assigns 1 to that particular model and 0 to all the others. But as to whether or not there are other difficulties with defining a probability measure over a proper class, I have no idea. I, too, lack skill with crazy set theory.
You’re referring to page 7? I believe that it means to say “we would like a P that obeys the axiom schema
P(forall a, b in Q ... phi ...)
for all phi in L”. You’re right, though, this is somewhat ambiguous.
I don’t completely understand your question. Are you questioning whether T = ∪ T_i is actually complete and consistent? Compactness guarantees that it is consistent, and the enumeration of sentences guarantees it is complete.
I meant that (conjecturally) for every measure, there exists a cardinal kappa such that mu({M: |M| > kappa}) = 0. Anyway, I guess as you’ve demonstrated the set/class thing isn’t a big problem, but it is something to watch out for.
Okay, that makes sense.
No, I was observing the following: mu is countably additive, and the set of theories is countable. Hence the measure of the total space is the sum of the measures of the theories, so the measures of the theories must sum to 1. Now it’s clear that at every step i of the process, the sum of the measures of the (incomplete) theories so obtained is 1. But it’s not immediately clear to me that this holds in the limit.
However, I just realized my mistake, which is that the set of theories isn’t always countable (there are countably many sentences, but a theory is a subset of the sentences; for instance, consider the language with countably many unary relations and a constant symbol). In particular, I believe it’s countable if and only if the sum is preserved in the limit, so we’re fine.
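To make the uncountable case concrete, here is a minimal sketch. It assumes the toy language mentioned above (unary relations P_0, P_1, ... and one constant c) and, purely for illustration, a fair split of the mass at every step. The function name `stage_measures` and the parameter `p` are my own and do not come from the paper. At each finite stage the masses of the partial theories sum to 1, while the mass of any single branch shrinks, which is why summing over individual complete theories is the wrong check when there are uncountably many of them.

```python
from fractions import Fraction


def stage_measures(n, p=Fraction(1, 2)):
    """Return {partial theory: mass} after deciding the first n atoms.

    A partial theory is a tuple of booleans; entry i records whether the
    (hypothetical) atomic sentence P_i(c) is asserted (True) or negated (False).
    The split probability p is an illustrative assumption.
    """
    measures = {(): Fraction(1)}  # stage 0: the empty theory carries all the mass
    for _ in range(n):
        refined = {}
        for theory, mass in measures.items():
            # Split each partial theory's mass between its two extensions.
            refined[theory + (True,)] = mass * p
            refined[theory + (False,)] = mass * (1 - p)
        measures = refined
    return measures


for n in range(5):
    m = stage_measures(n)
    # Columns: stage, number of partial theories, total mass, largest single mass.
    print(n, len(m), sum(m.values()), max(m.values()))
```

With the fair split, the total stays at 1 at every stage while the largest single mass halves each time, so in the limit every individual complete theory gets measure 0 even though the whole space still has measure 1.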
For a countable language L and theory T (say, with no finite models), I believe the standard interpretation of “space of all models” is “space of all models with the natural numbers as the underlying set”. The latter is a set with the cardinality of the continuum (it clearly can’t be larger, but it also can’t be smaller, as any non-identity permutation of the naturals gives a non-identity isomorphism between different models).
Moreover this space of models has a natural topology, with basic open sets {M: M models phi} for L-sentences phi, so it makes sense to talk about (Borel) probability measures on this space, and the measures of such sets. (I believe this topology is Polish, actually making the space Borel isomorphic to the real numbers.)
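In symbols, the setup looks roughly like this. The names X_T and [phi] are my own shorthand, not notation from the paper:

```latex
\[
X_T = \{\, M : M \models T,\ \text{the underlying set of } M \text{ is } \mathbb{N} \,\},
\qquad
[\varphi] = \{\, M \in X_T : M \models \varphi \,\}.
\]
\[
[\varphi \wedge \psi] = [\varphi] \cap [\psi],
\qquad
[\neg\varphi] = X_T \setminus [\varphi],
\qquad
\mu([\varphi]) + \mu([\neg\varphi]) = \mu(X_T) = 1 .
\]
```

The first identity says the sets [phi] are closed under finite intersection, so they really do form a base. The second says each one is also closed, and the third is what any probability measure defined on these sets has to satisfy.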
Note that by Löwenheim-Skolem, any model of T admits a countable elementary substructure, so to the extent that we only care about models up to some reasonable equivalence, countable models (hence ones isomorphic to models on the naturals) are enough to capture the relevant behavior. (In particular, as pengvado points out, the Christiano et al. paper only really cares about the complete theories realized by models, so models on the naturals suffice.)