We have hypothesis H and evidence E, and we dutifully compute
P(H) * P(E | H) / P(E)
It sounds like your advice is: don’t update yet! Especially if this number is very small. We might have made a mistake. But then how should we update? “Round up” seems problematic.
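For concreteness, here is the dutiful computation described above as a tiny Python snippet; all numbers are invented for illustration and come from neither the post nor the thread.

```python
# The straightforward Bayes update, with made-up numbers.

p_h = 0.5            # prior P(H)
p_e_given_h = 1e-9   # P(E | H)
p_e = 1e-3           # P(E)

posterior = p_h * p_e_given_h / p_e
print(posterior)     # 5e-07 -- the kind of very small number the comment worries about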
I read it to mean “update again” based on the probability that E is flawed. This will tend to adjust back toward your prior.
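One way to make that reading concrete, as a sketch with made-up numbers: treat “E might be flawed” as a second update that mixes the naive posterior back toward the prior.

```python
# Sketch of "update again": with probability p_flawed the evidence (or our
# model of it) is broken and we learn nothing, so we keep the prior;
# otherwise we keep the naive posterior. All numbers are assumed.

prior = 0.5
naive_posterior = 0.999999999   # what the dutiful Bayes computation produced
p_flawed = 0.05                 # assumed chance the calculation involving E is flawed

adjusted = p_flawed * prior + (1 - p_flawed) * naive_posterior
print(adjusted)                 # ~0.975 -- pulled back toward the prior
```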
While you do that, the probability estimate will be dynamically unstable: it should go up and then come back down again. Otherwise, you might make some strange decisions in between, where the tradeoff between waiting for new information and deciding right now is treated as it would be for an honest estimate, rather than as an intermediate step in a multi-step updating procedure with knowably incorrect intermediate results.
I’m not saying not to use Bayes’ theorem; I’m saying to consider very carefully what to plug into “E”. In the election example, your evidence is “A guy on a website said that there was a 999,999,999 in a billion chance that the incumbent would win.” You need to compute the probability of the incumbent winning given this actual evidence (the evidence that a guy on a website said something), not given the evidence that there really is a 999,999,999/billion chance. In the cosmic ray example, your evidence would be “There’s an argument that looks like it should imply a less than 10^-20 chance of apocalypse”, which may have different evidential value depending on how well your brain judges the way arguments look.
EDIT: Or what nerzhin said.
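A toy version of the point above, with invented numbers (the prior and the report probabilities are assumptions, not anything from the thread): condition on the fact that the claim was posted, not on the claimed probability being correct.

```python
# Update on the report itself ("a guy on a website said X"), not on X being true.

p_h = 0.5                     # prior that the incumbent wins (assumed)
p_claim_given_h = 0.010       # chance such a post appears if the incumbent will win (assumed)
p_claim_given_not_h = 0.005   # chance such a post appears anyway if he won't (assumed)

p_claim = p_claim_given_h * p_h + p_claim_given_not_h * (1 - p_h)
posterior = p_h * p_claim_given_h / p_claim
print(posterior)  # ~0.67 -- nowhere near 999,999,999 in a billion,
                  # because the post itself is only weak evidence
```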
I think this amounts to saying: real-world considerations force an upper bound on abs(log(P(E | H) / P(E))). I’m on board with that, but can we think about how to compute and increase this bound?
Yes.
P(E) can be broken down into P(E|A)P(A) + P(E|~A)P(~A). Our temptation, when looking at a model, is to treat P(E|~A)*P(~A) as smaller than it really is—the question is, “Is the number of worlds in which the hypothesis is false but the evidence exists anyway large or small?” Yvain is noting that, because we are crazy, we tend to forget about many (or most) of these worlds when looking at evidence. We should expect the number of these worlds to be much larger than the number of worlds in which our probability calculations are everywhere and always correct.
The math doesn’t work out to “round up” exactly. It’s situation-dependent. It’s entirely possible that the model is so ill-specified that every variable has the wrong sign. The math will usually work out to deviation towards priors, even if only slightly.
Here’s a post on the same problem in social sciences.
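Putting the decomposition above together with the bound mentioned upthread: if you can put a floor eps on P(E|~A), on the grounds that even when the hypothesis is false a flawed argument would still produce evidence like E at least that often, then the update factor is capped. A rough sketch under assumed numbers (reading A as the hypothesis being tested):

```python
import math

# How the decomposition caps the size of an update. A is the hypothesis;
# eps is an assumed floor on P(E | ~A); the prior is also assumed.

p_a = 0.5     # prior P(A)
eps = 1e-6    # assumed floor on P(E | ~A)

# P(E) = P(E|A)P(A) + P(E|~A)P(~A) >= eps * (1 - p_a), and P(E|A) <= 1, so:
max_update_factor = 1.0 / (eps * (1 - p_a))
print(math.log10(max_update_factor))  # ~6.3 -- log10 of the largest possible
                                      # factor P(E|A)/P(E) under these assumptions
```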
What’s A?
“Deviation towards priors” sounds again like we are positing a bound on log(P(E|H)/P(E)). How can I estimate this bound?