You spend a few paragraphs puzzling about how a probabilistic theory could be falsified. As you say, observing an event in a null set or a meagre set does not do the trick. But observing an event which is disjoint from the support of the theory’s measure does falsify it. Support is a very deep concept; see this category-theoretic treatise that builds up to it.
You can add that as an additional axiom to some theory, sure. It’s not clear to me why that is the correct notion to have, especially since you’re adding extra information about the topology of your probability space when interpreting the measure-theoretic structure, which seems “illegitimate” and difficult to generalize to other situations.
My point with meagre sets was that they are orthogonal to sets of null measure, so you really need to explain why you “break the symmetry” between two classes of small sets by privileging one over the other.
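To make the orthogonality concrete: cover an enumeration of the rationals in [0,1] by intervals of total length at most ε. The union is a dense open set of measure at most ε, so its complement is nowhere dense (hence meagre) yet has measure at least 1−ε. A minimal numerical sketch of this, with the enumeration and all names my own:

```python
from fractions import Fraction

def rationals():
    # enumerate the rationals in [0,1] (with repetitions, which is harmless)
    q = 1
    while True:
        for p in range(q + 1):
            yield Fraction(p, q)
        q += 1

def dense_open_cover(eps, n):
    # an interval of radius eps/2^(k+2) around the k-th rational; the total
    # length is at most eps, but the union becomes dense as n grows
    return [(q - eps / 2 ** (k + 2), q + eps / 2 ** (k + 2))
            for k, q in zip(range(n), rationals())]

def union_length(intervals):
    # measure of a finite union of intervals, via a sorted merge sweep
    ivs = sorted(intervals)
    total, (lo, hi) = 0, ivs[0]
    for a, b in ivs[1:]:
        if a > hi:
            total, (lo, hi) = total + (hi - lo), (a, b)
        else:
            hi = max(hi, b)
    return total + (hi - lo)

eps = Fraction(1, 100)
print(float(union_length(dense_open_cover(eps, 500))))  # <= 0.01, yet dense in the limit
```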
However, I think the more fundamental question here isn’t “how can I discard most of the information in a probabilistic theory so that it fits into Popperian falsificationism?”, but rather “why should I accept Bayesian epistemology when it doesn’t seem to fit into Popperian falsificationism?” For that, I refer you to Andrew Gelman’s nuanced views and Sprenger and Hartmann’s Bayesian Philosophy of Science.
I don’t think the question is about Popperian falsificationism, though Popperians are usually better able to notice the philosophical problem I’m talking about in the question. I simply don’t know what the relationship of probability to anything “real” is when a theory says “here is a space of outcomes and here is a probability measure on it”. The probability measure doesn’t seem to “tell you” anything.
Thanks for the references, I’ll take a look. I’m not very hopeful since if there is a good answer to my question I think it should fit in the space of an answer—most of what’s in these sources seems to be irrelevant to what I’m asking.
Okay, I now think both of my guesses about what’s really being asked were misses. Maybe I will try again with a new answer; meanwhile, I’ll respond to your points here.
You’re right that I’m sneaking something in when invoking support because it depends on the sample space having a topological structure, which cannot typically be extracted from just a measurable structure. What I’m sneaking in is that both the σ-algebra structure and the topological structure on a scientifically meaningful space ought to be generated by the (finitely) observable predicates. In my experience, this prescription doesn’t conflict with standard examples, and situations to which it’s “difficult to generalize” feel confused and/or pathological until this is sorted out. So, in a sense I’m saying, you’re right that a probability space (X,Σ,P) by itself doesn’t connect to reality—because it lacks the information about which events in Σ are opens.
As to why I privilege null sets over meagre sets: null sets are those to which the probability measure assigns zero value, while meagre sets are independent of the probability measure—which sets are meagre is determined entirely by the topology. If the space is Polish (or more generally, any Baire space), then no meagre set is an inhabited open set, so a meagre set can never conceivably be an observation, and therefore can’t be used to falsify a theory.
But, given that I endorse sneaking in a topology, I feel obligated to examine meagre sets from the same point of view, i.e. treating the topology as a statement about which predicates are finitely observable, and see what role meagre sets then have in philosophy of science. Meagre sets are not the simplest concept; the best way I’ve found to make sense of them is via their characterization in terms of the Banach–Mazur game:
Suppose Alice is trying to claim a predicate X is true about the world, and Bob is trying to claim it isn’t true.
They engage in armchair “yes-but” reasoning, taking turns saying “suppose we observe Eᵢ”.
The rules are the same for Alice and Bob: they can choose any finitely observable event, as long as it is consistent with all the previous suppositions (i.e. has nonempty intersection with them).
After countably infinitely many rounds, Alice wins if all the suppositions remain consistent with X; Bob wins if they rule X out.
Alice gets to move first.
Of course, for any claim X that is finitely observable, the first-move advantage is decisive: Alice can simply say “suppose we observe X,” and now Bob is doomed. But there are some sets X for which Bob has a guaranteed winning strategy, and those are exactly the meagre sets.
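To get a feel for why Bob wins exactly on the meagre sets, here is a toy sketch of his strategy in the simplest meagre case, X = ℚ ∩ [0,1] (a countable union of singletons): whatever interval is currently consistent with all suppositions, Bob shrinks it to dodge the next rational. The specific dodge and all names are my own:

```python
from fractions import Fraction

def rationals():
    # enumerate the rationals in [0,1]
    q = 1
    while True:
        for p in range(q + 1):
            yield Fraction(p, q)
        q += 1

def bob_dodge(lo, hi, q):
    # pick a nonempty open subinterval of (lo, hi) that excludes the point q
    if lo < q < hi:
        return (lo + q) / 2, q   # the interval is open, so q itself is excluded
    return lo, hi

lo, hi = Fraction(0), Fraction(1)
for q, _ in zip(rationals(), range(200)):
    # Alice's move can be anything consistent; even if she just repeats the
    # current interval (her best case), Bob's dodge costs him nothing
    lo, hi = bob_dodge(lo, hi, q)
# after round k the suppositions exclude the first k rationals; in the limit
# they exclude all of Q, so Bob has ruled X out
print(float(lo), float(hi))
```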
From a philosophy-of-science perspective, meagre sets are propositions internal to a scientific ontology which, even if the ontology is assumed true, could always be falsified by a stream of experimental outcomes from an adversarial Nature (Bob’s moves), even if each such outcome must be consistent with the best possible outcome for the proposition (Alice’s moves). That’s the sense in which meagre sets are negligible. Very loosely, they are hypotheses that, in a specific way, it doesn’t make sense to argue for. For example, the proposition that the fine-structure constant is a Martin-Löf random number has probability 1, but it doesn’t make sense to argue that this is “in fact” the case, essentially because the proposition is meagre.
A related perspective on meagre sets as propositions (mostly writing down for my own interest):
The interior operator Int A can be thought of as “rounding A down to the nearest observable proposition”, since it is the union (the least upper bound) of all opens that imply A.
The condition for A to be nowhere dense is equivalent to Int ¬ Int ¬ A = ∅; since ¬ Int ¬ = Cl, this is just the familiar Int Cl A = ∅.
If we are working with a logic of observables, where every proposition must be an observable, the closest we can get to a negation operator is a pseudo-negation ∼ := Int ¬.
So a nowhere dense set is a predicate whose double pseudo-negation ∼∼A is false, or equivalently whose triple pseudo-negation ∼∼∼A is true.
Another slogan, derived from this, is “a nowhere dense hypothesis is one we cannot rule out ruling out”.
The meagre propositions are the σ-ideal generated by nowhere dense propositions.
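These identities are easy to sanity-check by brute force on a small topological space, since ∼∼A = Int ¬ Int ¬ A = Int Cl A. A sketch, where the toy space and all names are mine:

```python
from itertools import chain, combinations

# a toy space and a (chain) topology on it; both are my own choices
X = frozenset({0, 1, 2, 3})
OPENS = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), X]

def interior(A):
    # union of all opens contained in A
    return frozenset().union(*(U for U in OPENS if U <= A))

def closure(A):
    # intersection of all closed sets containing A
    return frozenset(X).intersection(*(X - U for U in OPENS if A <= X - U))

def pseudo_neg(A):           # ~A := Int(not A)
    return interior(X - A)

def nowhere_dense(A):        # Int Cl A = empty
    return interior(closure(A)) == frozenset()

subsets = [frozenset(s) for s in
           chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]
for A in subsets:
    assert nowhere_dense(A) == (pseudo_neg(pseudo_neg(A)) == frozenset())
print("checked", len(subsets), "subsets")
```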
What I’m sneaking in is that both the σ-algebra structure and the topological structure on a scientifically meaningful space ought to be generated by the (finitely) observable predicates. In my experience, this prescription doesn’t conflict with standard examples, and situations to which it’s “difficult to generalize” feel confused and/or pathological until this is sorted out.
It’s not clear to me how finitely observable predicates would generate a topology. For a σ-algebra the generation is straightforward because σ-algebras are closed under complements, but for a topology, if you allow both a predicate and its negation to be “legitimate”, then you’ll end up with all your basis elements being clopen. This would then give you something that looks more like a Cantor set than a space like [0,1].
I agree “morally” that the topology should have something to do with finitely observable predicates, but just taking it to be generated by them seems to exclude a lot of connected spaces which you might want to be “scientifically meaningful”, starting from [0,1].
From a philosophy-of-science perspective, meagre sets are propositions internal to a scientific ontology which, even if the ontology is assumed true, could always be falsified by a stream of experimental outcomes from an adversarial Nature (Bob’s moves), even if each such outcome must be consistent with the best possible outcome for the proposition (Alice’s moves). That’s the sense in which meagre sets are negligible. Very loosely, they are hypotheses that, in a specific way, it doesn’t make sense to argue for. For example, the proposition that the fine-structure constant is a Martin-Löf random number has probability 1, but it doesn’t make sense to argue that this is “in fact” the case, essentially because the proposition is meagre.
Your point is taken, though as a side remark I think it’s ludicrous to claim that something like the fine-structure constant has any property like this with probability 1, given that it is most likely very far from being a number chosen at random from some range.
I think putting meagre sets in context using the Banach–Mazur game makes sense, but to me this only makes the issue worse, since the existence of events that are both comeagre and null would mean that there are some hypotheses that
it doesn’t make sense to argue against, and yet
you should give arbitrarily large odds in a bet to anyone who claims they are correct.
You’re saved from a contradiction because in this setup neither the comeagre event nor its complement contains any nonempty open, so the event can be proven neither true nor false if we assume the opens are exactly the finitely observable predicates. In that sense it “doesn’t matter” what odds you give on the event being true or false, but it still seems to me that on a probability space which is also a Polish space you have two structures living together that give contradictory signals about which events are “small”, and it’s difficult to reconcile these two ways of looking at things.
I also think that this discussion, though interesting, is somewhat beside the point: even if we deal with some probability space which is also a Polish space, I’m still not sure what information the probability measure adds beyond its support. Any two measures that are absolutely continuous with respect to each other have the same support, but obviously we would treat them very differently in practice.
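A minimal concrete instance of this point, with names mine: Lebesgue measure on [0,1] and the measure with density 2x are mutually absolutely continuous, so both have support [0,1], yet they would price the same bets very differently.

```python
def P(a, b):
    # uniform (Lebesgue) probability of [a, b] within [0, 1]
    return b - a

def Q(a, b):
    # probability of [a, b] under the density 2x: the integral of 2x is x^2
    return b ** 2 - a ** 2

# same support, same null sets, very different numbers:
print(P(0.0, 0.5), Q(0.0, 0.5))   # 0.5 vs 0.25
```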
(I agree with your last paragraph—this thread is interesting but unfortunately beside the point since probabilistic theories are obviously trying to “say more” than just their merely nondeterministic shadows.)
Negations of finitely observable predicates are typically not finitely observable. [0,0.5) is finitely observable as a subset of [0,1], because if the true value is in [0,0.5) then there necessarily exists a finite precision with which we can know that. But its negation, [0.5,1], is not finitely observable, because if the true value is exactly 0.5, no finite-precision measurement can establish with certainty that the value is in [0.5,1], even though it is.
The general case of why observables form a topology is more interesting. Finite intersections of finitely observable predicates are finitely observable because I can check each one in series and still need only finite observation in total. Countable unions are finitely observable because I can check them in parallel, and if any one of them is true then its check will succeed after only finite observation in total.
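One way to see the series/parallel point in code: model a finitely observable predicate as a semidecider that, given the true value bracketed to precision 2⁻ⁿ, either answers True or says “don’t know yet” (and may say that forever when the predicate fails). A sketch under that modeling assumption, all names mine:

```python
def below(c):
    # the open predicate "x < c" on [0,1]; approx(n) brackets x within 2**-n
    def obs(approx, n):
        lo, hi = approx(n)
        return True if hi < c else None   # None = "don't know yet"
    return obs

def both(a, b):
    # finite intersection: check both in series at the same precision budget
    return lambda approx, n: (a(approx, n) and b(approx, n)) or None

def any_of(obs_list):
    # countable union: dovetail, consulting the first n+1 members at budget n
    def obs(approx, n):
        return True if any(o(approx, n) for o in obs_list[:n + 1]) else None
    return obs

# example: the true value is 1/3, and we observe "x < 0.1 or x < 0.4"
approx = lambda n: (1 / 3 - 2 ** -(n + 1), 1 / 3 + 2 ** -(n + 1))
o = any_of([below(0.1), below(0.4)])
for n in range(64):
    if o(approx, n):
        print("observed True at precision level", n)   # happens at a finite n
        break
```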
Uncountable unions are thornier, but arguably unnecessary (they’re redundant with countable unions if the space is hereditarily Lindelöf, for which being Polish is sufficient, or more generally second-countable), and can be accommodated by allowing the observer to hypercompute. This is very much beside the point, but if you are still interested anyway, check out Escardó’s monograph on the topic.
Negations of finitely observable predicates are typically not finitely observable. [0,0.5) is finitely observable as a subset of [0,1], because if the true value is in [0,0.5) then there necessarily exists a finite precision with which we can know that. But its negation, [0.5,1], is not finitely observable, because if the true value is exactly 0.5, no finite-precision measurement can establish with certainty that the value is in [0.5,1], even though it is.
Ah, I didn’t realize that’s what you mean by “finitely observable”—something like “if the proposition is true then there is a finite precision measurement which will show that it’s true”. That does correspond to the opens of a metric space if that’s how you formalize “precision”, but it seems like a concept that’s not too useful in practice because you actually can’t measure things to arbitrary precision in the real world. [0, 0.5) is not going to actually be observable as long as your apparatus of observation has some small but nonzero lower bound on its precision.
What’s the logic behind not making this concept symmetric, though? Why don’t we ask also for “if the proposition is false then there is a finite precision measurement which will show that it’s false”, i.e. why don’t we ask for observables to be clopens? I’m guessing it’s because this concept is too restrictive, but perhaps there’s some kind of intuitionist/constructivist justification for why you’d not want to make it symmetric like this.
Uncountable unions are thornier, but arguably unnecessary, and can be accommodated by allowing the observer to hypercompute. This is very much beside the point, but if you are still interested anyway, check out Escardó’s monograph on the topic.
I’ll check it out, thanks.
What’s the logic behind not making this concept symmetric, though?
It’s nice if the opens of X can be internalized as the continuous functions X → TV for some space of truth values TV with a distinguished point ⊤ such that x ∈ O ⇔ O(x) = ⊤. For this, it is necessary (and sufficient) for the open sets of TV to be generated by {⊤}. I could instead ask for a distinguished point ⊥ such that x ∉ O ⇔ O(x) = ⊥, and for this it is necessary and sufficient for the open sets of TV to be generated by TV ∖ {⊥}. Put them together, and you get that TV must be the Sierpiński space: a “true” result (⊤ ∈ TV) is finitely observable ({⊤} is open), but a “false” result is not ({⊥} is not open).
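This is easy to verify by brute force on a finite example. A sketch, where the toy space and all names are mine: taking TV to be the Sierpiński space, the continuous maps X → TV are exactly the indicator functions of the opens of X.

```python
from itertools import product

# Sierpinski space: points 'T', 'F'; opens {}, {'T'}, {'T','F'}
SIERP_OPENS = [frozenset(), frozenset({'T'}), frozenset({'T', 'F'})]

# a toy space X with a (non-discrete) topology; both are my own choices
X = (0, 1, 2)
X_OPENS = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)}

def is_continuous(f):
    # preimage of every open of TV must be open in X
    return all(frozenset(x for x in X if f[x] in U) in X_OPENS
               for U in SIERP_OPENS)

maps = [dict(zip(X, vals)) for vals in product('TF', repeat=len(X))]
opens_from_maps = {frozenset(x for x in X if f[x] == 'T')
                   for f in maps if is_continuous(f)}
assert opens_from_maps == X_OPENS   # continuous maps X -> TV <-> opens of X
print(len([f for f in maps if is_continuous(f)]), "continuous maps")
```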
perhaps there’s some kind of intuitionist/constructivist justification
Yes, constructively we do not know a proposition until we find a proof. If we find a proof, it is definitely true. If we do not find a proof, maybe it is false, or maybe we have not searched hard enough—we don’t know.
Also related is that the Sierpiński space is the smallest model of intuitionistic propositional logic (with its topological semantics) that rejects LEM, and any classical tautology rejected by Sierpiński space is intuitionistically equivalent to LEM. There’s a sense in which the difference between classical logic and intuitionistic logic is precisely the assumption that all open sets of possibility-space are clopen (which, if we further assume T0, leads to an ontology where possibility-space is necessarily discrete). (Of course it’s not literally a theorem of classical logic that all open sets are clopen; this is a metatheoretic claim about semantic models, not about objects internal to either logic.) See A Semantic Hierarchy for Intuitionistic Logic.
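As a tiny concrete check of the LEM claim (the representation is mine): in the opens of the Sierpiński space, with pseudo-negation ∼U = Int ¬U, the open {⊤} already violates excluded middle.

```python
# opens of the Sierpinski space; 'T' is the open point
OPENS = [frozenset(), frozenset({'T'}), frozenset({'T', 'F'})]
TOP = frozenset({'T', 'F'})

def pseudo_neg(U):
    # ~U := Int(complement of U) = union of all opens disjoint from U
    return frozenset().union(*(V for V in OPENS if not (V & U)))

U = frozenset({'T'})
assert pseudo_neg(U) == frozenset()   # ~{T} is bottom
assert (U | pseudo_neg(U)) != TOP     # so U v ~U != top: LEM fails here
```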