Their definition is equivalent to having an axiom that states that P(A|B) = P(A∩B)/P(B). That’s not that difficult a concept, but it’s still more advanced than axioms tend to be. Compare it to the other three axioms. It’s like Euclid’s fifth postulate.
But it’s not an axiom; it’s a definition. It bothers me that you seem to be under the impression that the equation represents some kind of substantive claim. It doesn’t; it’s just the establishment of a shorthand notation. (It bothers me even more that other commenters don’t seem to be noticing that you’re suffering from a confusion about this.)
A reasonable question to ask might be: “why is the quantity P(A∩B)/P(B) interesting enough to be worth having a shorthand notation for?” But that isn’t what you asked, and the answer wouldn’t consist of a “proof”, so despite its being the closest non-confused question to yours, I’m not yet sure whether an attempt to answer it would be helpful to you.
If you simply view P(A|B) = P(A∩B)/P(B) as a shorthand, with “P(A|B)” as just an arbitrary symbol, then you’re right—it needs no more explanation. But we don’t consider P(A|B) to be just an arbitrary symbol—we think it has a specific meaning, which is “the probability of A given B”. And we think that “P(A∩B)/P(B)” has been chosen to equal “P(A|B)” because it has the properties we feel “the probability of A given B” should have.
I think DanielLC is asking why it is specifically P(A∩B)/P(B), and not some other formula, that has been chosen to correspond with the intuitive notion of “the probability of A given B”.
In that case, it’s no wonder that I’m having trouble relating, because I didn’t understand what “the probability of A given B” meant until somebody told me it was P(A∩B)/P(B).
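On a frequency reading, “the probability of A given B” is the long-run fraction of the trials in which B happens where A also happens, and that fraction is exactly what P(A∩B)/P(B) computes. Here is a minimal simulation sketch of that correspondence (the fair-die example, with A = “the die shows 2” and B = “the die shows an even number”, is my own choice of illustration, not something fixed by the discussion above):

```python
import random

# Monte Carlo sketch: among the trials where B occurs, the fraction in which
# A also occurs should settle near P(A∩B)/P(B) = (1/6)/(1/2) = 1/3.
N = 1_000_000
b_count = 0    # trials where B occurred (even roll)
ab_count = 0   # trials where A∩B occurred (the roll was 2)

for _ in range(N):
    roll = random.randint(1, 6)
    if roll % 2 == 0:       # event B
        b_count += 1
        if roll == 2:       # event A (here A implies B, so this is A∩B)
            ab_count += 1

print(ab_count / b_count)   # empirically close to 1/3
```

Restricting attention to B and renormalizing is all the formula does: of the three equally likely even outcomes, exactly one is a 2, so the conditional probability comes out to 1/3, in agreement with (1/6)/(1/2).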
There is a larger point here:
But we don’t consider P(A|B) to be just an arbitrary symbol—we think it has a specific meaning, which is “the probability of A given B”. And we think that “P(A∩B)/P(B)” has been chosen to equal “P(A|B)” because it has the properties we feel “the probability of A given B” should have.
In my opinion, an important part of learning to think mathematically is learning not to think like this. That is, not to think of symbols as having a mysterious “meaning” apart from their formal definitions.
This is what causes some people to have trouble accepting that 0.999… = 1: they don’t understand that the question of what 0.999… “is” is simply a matter of definition, and not some mysterious empirical fact.
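To spell out the definitional reading (a sketch of the standard calculation): “0.999…” names the limit of the partial sums 0.9, 0.99, 0.999, …, so

0.999… = lim(n→∞) (9/10 + 9/100 + … + 9/10^n) = lim(n→∞) (1 − 1/10^n) = 1.

Once the definition is fixed, the equality is a routine calculation rather than an empirical discovery.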
Paradoxically, this is a way in which my lack of “mathematical ability” is a kind of mathematical ability in its own right, because I often don’t have these mysterious “intuitions” that other people seem to have, and thus for me it tends to be second nature that the formal definition of something is what the thing is. For other people, I suppose, thinking this way is a kind of skill they have to consciously learn.
If I have a set of axioms, and I derive theorems from them, then anything the axioms are true of, the theorems are also true of. For example, suppose we took Euclid’s first four postulates and derived a bunch of theorems from them. Those postulates are true if you use them to describe figures on a plane, so the theorems are also true of those figures. The same holds if the figures are on a sphere. It’s not that a “point” means a spot on a plane, or two opposite spots on a sphere; it’s just that the reasoning about abstract points applies to either physical model.
Statistics isn’t just those axioms. You might be able to find something else that those axioms apply to. If you do, every statistical theorem will also apply. It still wouldn’t be statistics. Statistics is a specific application. P(A|B) represents something in this application. P(A|B) always equals P(A∩B)/P(B). We can find this out the same way we figured out that P(∅) always equals zero. It’s just that the latter is more obvious than the former, and we may be able to derive the former from something else equally obvious.
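For concreteness, here is the standard argument that P(∅) = 0, sketched from the axioms: Ω and ∅ are disjoint and Ω ∪ ∅ = Ω, so additivity gives

P(Ω) = P(Ω ∪ ∅) = P(Ω) + P(∅),

which forces P(∅) = 0. In the same spirit, one way to make the conditional-probability formula look “equally obvious” is to start from the product rule P(A∩B) = P(A|B)·P(B) and, whenever P(B) > 0, divide through by P(B) to get P(A|B) = P(A∩B)/P(B).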
That is, not to think of symbols as having a mysterious “meaning” apart from their formal definitions.
Pure formalism is useful for developing new math, but math cannot be applied to real problems without the assignment of meaning to the variables and equations. Most people are more interested in using math than in what amounts to intellectual play, as enjoyable and potentially useful as that can be. Note that I tend to be more of a formalist myself, which is why I mentioned in an old comment on HN that I tend to learn math concepts fairly easily, but have trouble applying them.