My main takeaway from Bayes’ theorem is that “A implies B with probability P” and “B implies A with probability P” are not the same thing, but many people think they are.
For example, this explains how a scientific journal can publish hundreds of studies with “p < 0.05” and yet have half of them fail to replicate. The 95% is the probability of “we don’t get a result like this, if the hypothesis is false” (what the journal requires), not “the hypothesis is true, given that we got this result” (what we care about).
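To see how the arithmetic can work out that badly, here is a minimal sketch in Python; the base rate and statistical power below are made-up illustrative numbers, not measured ones.

```python
# How can a journal full of "p < 0.05" results be half wrong?
# A minimal Bayes-theorem calculation with illustrative numbers
# (the base rate and power below are assumptions, not measured values).

def prob_true_given_significant(base_rate, power, alpha=0.05):
    """P(hypothesis is true | result is significant), via Bayes' theorem."""
    true_positives = base_rate * power          # true hypotheses that reach p < alpha
    false_positives = (1 - base_rate) * alpha   # false hypotheses that pass by chance
    return true_positives / (true_positives + false_positives)

# Suppose only 9% of tested hypotheses are actually true,
# and studies detect a true effect 50% of the time (power = 0.5).
print(prob_true_given_significant(base_rate=0.09, power=0.5))
# -> ~0.50: about half of the "significant" results are false,
#    even though every one of them legitimately passed p < 0.05.
```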
Another takeaway is that it is not enough to ask “how likely am I to see this if X is true”; you also have to ask “how likely am I to see this if X is false”. If both answers are the same, then the observation is not evidence for X at all.
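A minimal sketch of that point, with made-up numbers: the posterior only moves when the two likelihoods differ.

```python
# The posterior moves only when the two likelihoods differ.
# All numbers here are made up for illustration.

def posterior(prior, p_obs_if_true, p_obs_if_false):
    """P(X | observation), via Bayes' theorem."""
    numerator = prior * p_obs_if_true
    return numerator / (numerator + (1 - prior) * p_obs_if_false)

prior = 0.30
print(posterior(prior, p_obs_if_true=0.8, p_obs_if_false=0.8))  # -> 0.30: no update, not evidence
print(posterior(prior, p_obs_if_true=0.8, p_obs_if_false=0.2))  # -> ~0.63: a real update
```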
It also gives an alternative to the “simplified Popperism” popular on the internet, which says that theories “cannot be proved, only falsified”. (First problem: what about negations? What would it mean for “X is Y” and “X is not Y” to be simultaneously unprovable but falsifiable? Doesn’t falsifying one of them kinda automatically prove the other? Second problem: this is not how actual science works. Whenever someone yet again experimentally “falsifies” the theory of relativity, most actual scientists calmly wait until someone finds the error in the experiment. Third problem: if “hasn’t been falsified yet” is the highest compliment anyone can pay a theory, then any crackpot theory invented literally yesterday, which no one has had time to disprove yet, deserves the same respect as a theory proposed decades ago and supported by thousands of experiments.) Similarly, it helps answer the question of whether seeing a non-black object, and observing that it is not a raven, should count as evidence for “all ravens are black” (see the sketch below).
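On the ravens: the Bayesian answer is that the observation is evidence, just absurdly weak evidence. Here is a toy calculation under one common sampling model (sample a random non-black object and check whether it is a raven); the population count and the alternative hypothesis are assumptions chosen for illustration.

```python
# A toy Bayesian model of the raven paradox (all numbers are illustrative).
# Sampling model: pick a random NON-BLACK object and check whether it is a raven.
# H: all ravens are black.  ~H: exactly one raven is non-black.

NON_BLACK_NON_RAVENS = 1_000_000  # shoes, leaves, sunsets... (assumed count)

# Under H, no raven is non-black, so a sampled non-black object is surely a non-raven.
p_obs_if_h = 1.0
# Under ~H, one of the non-black objects out there is a raven.
p_obs_if_not_h = NON_BLACK_NON_RAVENS / (NON_BLACK_NON_RAVENS + 1)

prior = 0.5
posterior = prior * p_obs_if_h / (prior * p_obs_if_h + (1 - prior) * p_obs_if_not_h)
print(posterior)  # -> 0.50000025: evidence for H, but utterly negligible
```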
With regard to journal results, it is even worse than that.
A published result with p < 0.05 means the following: if the given hypothesis is false, but the underlying model and experimental design are otherwise correct, then there is at least a 95% chance that we would not see results like this.
There are enough negations and qualifiers in there that even highly competent scientists get confused on occasion.
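To see why the “otherwise correct” qualifier matters, here is a small simulation sketch: a z-test that assumes unit variance keeps its 5% false-positive guarantee when that assumption holds, and loses it when the data’s real variance is larger. The setup is entirely hypothetical, not taken from any particular study.

```python
# The 5% false-positive guarantee holds only if the model is right.
# Here a z-test assumes variance 1; when the data actually has variance 4,
# the error rate blows up. (Toy setup for illustration.)
import math
import random

def z_test_p_value(sample):
    """Two-sided p-value for 'mean = 0', *assuming* the data is N(mean, 1)."""
    z = sum(sample) / math.sqrt(len(sample))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def false_positive_rate(true_sd, n=50, trials=20_000):
    random.seed(0)
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(0, true_sd) for _ in range(n)]  # null is true
        if z_test_p_value(sample) < 0.05:
            hits += 1
    return hits / trials

print(false_positive_rate(true_sd=1))  # ~0.05: model correct, guarantee holds
print(false_positive_rate(true_sd=2))  # ~0.33: model wrong, "p < 0.05" means little
```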