Question 1:
The problem you are interested in concerns the validity of a single data point, which is not the kind of question statistical inference is meant to answer. Statistical hypotheses are about population-level (distribution-level) parameters, not about individual observations.
Imagine I sample 100 individuals. Using a diagnostic test with 100% sensitivity and specificity, I find that 20 of them have cancer, including one individual named Joe. If I claim that 20% of the general population have cancer, you can meaningfully ask me how certain I am about this claim; this is what statistics allows me to formalize. However, if you ask me how certain I am that Joe has cancer, I will tell you that I am 100% certain (because the test was foolproof). That is no longer a statistical question; the problem is not sampling variability.
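To make the distinction concrete, here is a minimal Python sketch using the numbers from the hypothetical above; the Wald interval is just one conventional way to summarize the sampling uncertainty:

```python
# Sketch of the two kinds of uncertainty, using the hypothetical
# numbers above (100 sampled, 20 positive, a foolproof test).
from statistics import NormalDist

n, k = 100, 20
p_hat = k / n  # estimated population prevalence

# Population-level claim: uncertainty comes from sampling variability,
# summarized here by a 95% Wald confidence interval.
z = NormalDist().inv_cdf(0.975)
se = (p_hat * (1 - p_hat) / n) ** 0.5
print(f"prevalence estimate {p_hat:.2f}, 95% CI "
      f"({p_hat - z * se:.3f}, {p_hat + z * se:.3f})")

# Claim about Joe: with 100% sensitivity and specificity, the posterior
# probability that Joe has cancer given a positive test is 1, and no
# amount of sampling variability changes that.
sens, spec = 1.0, 1.0
prior = p_hat  # any nonzero prior gives the same answer with a foolproof test
posterior = sens * prior / (sens * prior + (1 - spec) * (1 - prior))
print(f"P(Joe has cancer | positive test) = {posterior:.0%}")
```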
The same holds for the case of my keys. What you are interested in is a single data point, i.e., whether I accurately concluded that the object found in location X was indeed my keys. To answer that, you need to reason about measurement error (sensitivity/specificity), not sampling variability. In this case, there is no reason to suspect measurement error, so I will believe I have found my keys.
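If there were reason to suspect measurement error, the reasoning would look something like the sketch below; the sensitivity/specificity values for recognizing one's own keys are made up purely for illustration:

```python
# How belief about a single observation depends on measurement error
# alone. The sensitivity/specificity values are invented for illustration.
def posterior_given_positive_id(prior, sensitivity, specificity):
    """P(it really is my keys | I identified it as my keys)."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Visually recognizing your own keys is nearly error-free, so even a
# lukewarm prior turns into near-certainty after one identification.
print(posterior_given_positive_id(prior=0.5,
                                  sensitivity=0.999,
                                  specificity=0.999))  # ~0.999
```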
Question 2:
This is just a case where, once you have collected your winnings, you have overwhelming evidence that you actually won the lottery: the probability of collecting your winnings if you did not win approaches zero, so the likelihood ratio easily overpowers the prior odds of 1 in 100 million.
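As a rough sketch of that update (only the 1-in-100-million prior comes from the example; both collection probabilities are assumptions):

```python
# Posterior odds = prior odds * likelihood ratio.
# Only the 1-in-100-million prior is from the example; the two
# collection probabilities are assumptions for illustration.
prior_odds = 1 / 100_000_000
p_collect_if_won = 0.99       # winners almost always manage to collect
p_collect_if_not = 1e-12      # collecting without winning is near-impossible

likelihood_ratio = p_collect_if_won / p_collect_if_not
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)
print(f"P(actually won | collected winnings) = {posterior_prob:.4f}")  # ~0.9999
```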
That’s an unnecessarily narrow (and entirely frequentist) approach.
Statistics is a toolbox for dealing with uncertainty.
I was responding to the original post, which said:

You repeated some failed experiments in the hope of getting a different result. Multiple hypotheses, file drawer effect, motivated cognition, motivated stopping, researcher degrees of freedom, remining of old data: there is hardly a methodological sin you have not committed.
I realize my wording may have been suboptimal, but some of these biases (such as multiple comparisons) only make sense in a frequentist framework.
I was trying to explain why some of these methodological problems do not even apply in this example. It is not that the other evidence is strong enough to outweigh the methodological flaws; rather, these flaws are irrelevant to questions about the individual data points.
For example, bias from the stopping rule would matter if you were trying to estimate the proportion of all locations containing keys that open your door. However, a biased stopping rule makes absolutely no difference to the integrity of the individual data points, as the simulation below illustrates.
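Here is a toy simulation of that point, with all numbers assumed: searching until the keys turn up inflates the per-search estimate of the proportion, yet each recorded observation remains accurate:

```python
# Toy simulation: a stop-on-success rule biases the population-level
# estimate but leaves every individual observation intact.
import random

random.seed(0)
P_KEYS = 0.1  # assumed true fraction of locations containing the keys

def search_until_found(max_checks=200):
    """Check locations until the keys are found (a biased stopping rule)."""
    checks = []
    while len(checks) < max_checks:
        checks.append(random.random() < P_KEYS)
        if checks[-1]:
            break
    return checks

runs = [search_until_found() for _ in range(10_000)]

# The per-search estimate of the proportion is badly biased upward...
per_run = [sum(r) / len(r) for r in runs]
print(sum(per_run) / len(per_run))   # ~0.26, not 0.10

# ...but each individual data point is still an honest record; e.g. the
# first check of every search succeeds at the true rate.
first = [r[0] for r in runs]
print(sum(first) / len(first))       # ~0.10
```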