If I have a set of axioms and derive theorems from them, then every one of those theorems is true of anything the axioms are true of. For example, suppose we took Euclid’s first four postulates and derived a bunch of theorems from them. Those postulates are true of figures on a plane, so the theorems are true of those figures as well. The same works on a sphere, if you read “point” as a pair of opposite spots. It’s not that a “point” means a spot on a plane, or two opposite spots on a sphere; it’s just that the reasoning about abstract points applies to physical models.
Statistics isn’t just the probability axioms. You might be able to find something else those axioms apply to, and if you do, every statistical theorem will apply to it as well. It still wouldn’t be statistics. Statistics is a specific application, and P(A|B) represents something in that application. P(A|B) always equals P(A∩B)/P(B) (whenever P(B) is nonzero), and we can find this out the same way we figured out that P(∅) always equals zero. It’s just that the latter is more obvious than the former, and we may be able to derive the former from something else equally obvious.
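To make that concrete, here is a minimal sketch on one small model. The two-dice setup, the events A and B, and the helper names prob and cond_by_counting are all made up for illustration. It reads P(A|B) as "the fraction of outcomes in B that are also in A" and checks that this matches P(A∩B)/P(B), alongside the more obvious P(∅) = 0. It only checks one finite, equally-likely case; it is not a proof.

```python
from fractions import Fraction
from itertools import product

# Hypothetical model: two fair six-sided dice, all 36 outcomes equally likely.
OMEGA = frozenset(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_by_counting(a, b):
    """P(A|B) read as 'the fraction of outcomes in B that are also in A'."""
    in_b = [w for w in OMEGA if w in b]
    return Fraction(sum(1 for w in in_b if w in a), len(in_b))

# A: the dice sum to 8. B: the first die is even.
A = frozenset(w for w in OMEGA if sum(w) == 8)
B = frozenset(w for w in OMEGA if w[0] % 2 == 0)

p_empty = prob(frozenset())          # the "obvious" fact
p_counted = cond_by_counting(A, B)   # what P(A|B) represents in this model
p_ratio = prob(A & B) / prob(B)      # the formula

print(p_empty, p_counted, p_ratio)   # prints: 0 1/6 1/6
```

Both routes give 1/6 here because 3 of the 18 outcomes in B also lie in A, which is the same thing as (3/36)/(18/36).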