I disagree with the article for the following reason: if I have two hypotheses that both explain an “absence of evidence” occurrence equally well, then that occurrence does not give me reason to favor either hypothesis and is not “evidence of absence.”
Example: Vibrams are a brand of toe-shoes that recently settled a big suit because they couldn’t justify their claims of health benefits. We have two hypotheses (1) Vibrams work, (2) Vibrams don’t work. Now, if a well-executed experiment had been done and failed to show an effect, that would be evidence against a significant benefit from Vibrams. However, if the effect were small or nobody had completed a well-executed experiment, I see no reason that (2) would fit the evidence better than (1), so we are justified in saying this absence of evidence is not evidence of absence.
Although the original saying, I think, was meant in the absolute sense (evidence meaning proof), it is still fitting in the probabilistic sense. Absence of evidence is only evidence of absence when combined with one hypothesis explaining an occurrence better than the other, so the saying holds.
In the situation you describe, the settlement is weak evidence for the product not working. Weak evidence is still evidence. The flaw in “Absence of evidence is evidence of absence” is that the saying omits the detailed description of how to correctly weight the evidence, but this omission does not make the simple statement untrue.
if I have two hypotheses that both explain an “absence of evidence” occurrence equally well, then that occurrence does not give me reason to favor either hypothesis and is not “evidence of absence.”
This statement is technically true, but not in the way you’re using it.
Suppose Vibrams had been around for a thousand years. For a thousand years, people had been challenging their claims to health benefits in court. For a thousand years, time and again, Vibrams had been unable to credibly defend their claims. Would that make you any more skeptical of the claims in question, at least a little bit?
If the answer is “yes”, you are agreeing that some very large number of such events constitutes evidence against Vibrams. I don’t see any way around concluding, from there, that at least one individual instance provides some nonzero amount of evidence—perhaps very small, but not zero.
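To make the accumulation point concrete, here is a rough sketch with made-up numbers. It assumes each failed defense is independent of the others (almost certainly false for real court cases) and that each one is only 1% more likely if Vibrams don't work than if they do. The function name and the likelihood ratio of 1.01 are my own illustrative assumptions, not anything from the case.

```python
import math

def posterior_after_n(prior, likelihood_ratio, n):
    """Posterior P(H) after n independent observations, each with the
    same likelihood ratio P(obs | H) / P(obs | not H)."""
    # Work in log-odds so that each observation simply adds a constant.
    log_odds = math.log(prior / (1 - prior)) + n * math.log(likelihood_ratio)
    return 1 / (1 + math.exp(-log_odds))

# H = "Vibrams don't work"; start at 50/50 (an assumed prior).
one_failure = posterior_after_n(0.5, 1.01, 1)
thousand_failures = posterior_after_n(0.5, 1.01, 1000)

print(f"after 1 failed defense:     {one_failure:.4f}")
print(f"after 1000 failed defenses: {thousand_failures:.4f}")
```

A single failure barely moves the estimate, but the only way a thousand of them can add up to a large shift is if each one contributes something greater than zero.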
“Vibrams work, but the effect is small and/or the experiment was shoddy” and “Vibrams don’t work” explain the outcome nearly equally well. They cannot explain it precisely equally well: the first hypothesis would assign a higher P(claims defended) than the second, because even small effects are sometimes correctly detected, and even shoddy experiments sometimes aren’t fatally flawed. So the second necessarily has a higher P(~claims defended) than the first. This difference is precisely the thing that makes (~claims defended) evidence for the second hypothesis.
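Putting numbers on this may help. The probabilities below are invented purely for illustration: suppose the claims would be successfully defended 10% of the time if Vibrams work (small effect, shoddy experiments) and 2% of the time if they don't (a fluke result). The sketch treats the two hypotheses as exhaustive, which is also an assumption.

```python
def posterior_h2(prior_h2, p_e_given_h1, p_e_given_h2):
    """P(H2 | E) by Bayes' rule, treating H1 and H2 as the only options."""
    prior_h1 = 1.0 - prior_h2
    joint_h1 = prior_h1 * p_e_given_h1
    joint_h2 = prior_h2 * p_e_given_h2
    return joint_h2 / (joint_h1 + joint_h2)

# Assumed, made-up likelihoods of the claims being defended.
p_defended_if_works = 0.10        # H1: Vibrams work (weakly)
p_defended_if_doesnt_work = 0.02  # H2: Vibrams don't work

prior_doesnt_work = 0.5

# The observation E is "claims NOT defended", so use the complements.
post = posterior_h2(
    prior_doesnt_work,
    1 - p_defended_if_works,
    1 - p_defended_if_doesnt_work,
)
print(f"P(doesn't work | claims not defended) = {post:.4f}")
```

With these numbers the posterior moves from 0.50 to roughly 0.52: a small shift, but in the direction of "doesn't work", exactly because H2 assigns the higher probability to the claims going undefended.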
Evidence is not proof. Depending on the ratios involved, it may constitute very weak evidence, sometimes weak enough that it’s not even worth tracking for mere humans: a 0.0001% shift is lost in the noise when people aren’t even calibrated to the nearest 10%.
If you have two hypotheses that both explain an “absence of evidence” precisely equally well, then you’re looking at something completely uncorrelated: trying to deduce the existence of a Fifth Column from the result of a coin flip. And if they explain it only nearly, but not exactly equally well, then you have evidence of absence—although maybe not very much, and maybe not enough to actually push you into the other camp.
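The two cases can be checked directly. In this sketch the prior of 0.3 and the likelihoods are arbitrary numbers I picked for illustration, and `update` is just a hypothetical helper applying Bayes' rule to a hypothesis and its negation.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | E) for a hypothesis H and its negation."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)

prior = 0.3

# Coin-flip case: both hypotheses predict the observation equally well.
unchanged = update(prior, 0.5, 0.5)

# Nearly-equal case: H predicts the observation very slightly better.
nudged = update(prior, 0.51, 0.5)

print(f"equal likelihoods:        {unchanged:.4f}")
print(f"nearly equal likelihoods: {nudged:.4f}")
```

When the likelihoods match exactly the posterior equals the prior, and any gap at all, however small, produces a correspondingly small update.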
Alternatively, you might have an alternative hypothesis that explains the absence equally well, but at a much higher complexity cost.