0.05 is a practical tradeoff; for supposed Bayesians, it is still much too strict, not too lax.
No, it isn’t. In an environment where the incentive to find a positive result is huge and there are all sorts of flexibilities in what particular results to report and which studies to abandon entirely, 0.05 leaves far too many false positives. It really does begin to look like this. I don’t advocate using the standards from physics, but p=0.01 would be preferable.
Mind you, there is no particularly good reason why there is an arbitrary p value to equate with ‘significance’ anyhow.
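To put rough numbers on the false-positive worry, here is a minimal sketch. The prior (10% of tested hypotheses are true) and the power (50%) are made-up illustrative assumptions, not figures from any study, and power is held fixed while alpha changes, which a real study of fixed size would not manage. It also ignores selective reporting, which would only make things worse.

```python
def false_positive_share(alpha, power, prior):
    """Share of 'significant' results that are false positives.

    alpha: significance threshold (chance a null effect passes)
    power: chance a real effect passes
    prior: assumed fraction of tested hypotheses that are real
    """
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return false_pos / (true_pos + false_pos)


# Illustrative assumptions only: prior=0.1, power=0.5.
for alpha in (0.05, 0.01):
    share = false_positive_share(alpha, power=0.5, prior=0.1)
    print(f"alpha={alpha}: {share:.0%} of significant results are false")
```

Under these assumptions the share of false positives among significant results drops sharply when the threshold moves from 0.05 to 0.01; with a more favourable prior the gap narrows.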
Well, I would find it really awkward for a Bayesian to condone a modus operandi such as “The p-value of 0.15 indicates it is much more likely that there is a correlation than that the result is due to chance, however for all intents and purposes the scientific community will treat the correlation as non-existent, since we’re not sufficiently certain of it (even though it likely exists)”.
It is similar to having a choice of two roads to go down, one of which leads into the forbidden forest, and then saying “while I have decent evidence which way goes where, because I’m not yet really certain, I’ll just toss a coin.” How many false choices would you make in life, using an approach like that? Neglecting your duty to update, so to speak. A p-value of 0.15 is important evidence. A p-value of 0.05 is even more important evidence. It should not be disregarded, regardless of the perverse incentives in publishing and the false binary choice (if (p<=0.05) correlation=true, else correlation=false). However, for the medical community, a p-value of 0.15 might as well be 0.45, for practical purposes. Not published = not published.
This is especially pertinent given that many important chance discoveries may only barely reach significance initially, not because their effect size is so small, but because in medicine sample sizes often are, with the accompanying low power to detect new effects. When you’re just a grad student with samples from e.g. 10 patients (no economic incentive yet, not yet a large trial), unless you’ve found magical ambrosia, p-values will tend to be “insignificant”, even for potential breakthrough drugs.
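The low-power point can be sketched numerically. The function below is a rough normal approximation for a two-sided, two-sample comparison, assuming 10 patients per group and standardized effect sizes (Cohen's d) chosen purely for illustration. A proper t-test at this sample size would have somewhat lower power still.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    d: assumed standardized effect size (difference in means / SD)
    n_per_group: patients in each arm
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = d * sqrt(n_per_group / 2)  # expected z statistic under the effect
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Hypothetical effect sizes: 0.5 ("medium") and 0.8 ("large").
for d in (0.5, 0.8):
    print(f"d={d}, n=10 per group: power ~ {approx_power(d, 10):.2f}")
```

Even the "large" effect is missed more often than not at this sample size, which is the sense in which small early studies return "insignificant" p-values for real effects.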
Better to check out a few false candidates too many than to falsely dismiss important new discoveries. Falsely claiming a promising new substance to have no significant effect due to p-value shenanigans is much worse than not having tested it in the first place, since the “this avenue was fruitless” conclusion can steer research in the wrong direction (information spreads around somewhat even when unpublished, “group abc had no luck with testing substances xyz”).
IOW, I’m more concerned with false negatives (may never get discovered as such, lost chance) than with false positives (get discovered later on—in larger follow-up trials—as being false positives). A sliding p-value scale may make sense, with initial screening tests having a lax barrier signifying a “should be investigated further”, with a stricter standard for the follow-up investigations.
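A sliding scale like this can be sketched as a two-stage filter. All numbers here (prior, thresholds, power at each stage) are hypothetical, and the two stages are assumed to use independent samples.

```python
def two_stage(prior, screen_alpha, screen_power, confirm_alpha, confirm_power):
    """Outcome rates for a lax screening test followed by a strict follow-up.

    Assumes the screening and follow-up studies use independent data.
    """
    # Fraction of all candidates that pass screening and need a follow-up.
    followups = prior * screen_power + (1 - prior) * screen_alpha
    # Candidates that pass both stages, split by whether the effect is real.
    true_pass = prior * screen_power * confirm_power
    false_pass = (1 - prior) * screen_alpha * confirm_alpha
    return {
        "followups_per_candidate": followups,
        "missed_share_of_real_effects": 1 - screen_power * confirm_power,
        "false_share_of_confirmed": false_pass / (true_pass + false_pass),
    }


# Hypothetical inputs: lax screen at 0.15, strict confirmation at 0.01.
result = two_stage(prior=0.1, screen_alpha=0.15, screen_power=0.5,
                   confirm_alpha=0.01, confirm_power=0.9)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The `followups_per_candidate` figure is the cost side of the tradeoff: a laxer screen lets more real effects through, but every extra false candidate it admits still has to be paid for in the follow-up stage.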
Well, I would find it really awkward for a Bayesian to condone a modus operandi such as “The p-value of 0.15 indicates it is much more likely that there is a correlation than that the result is due to chance, however for all intents and purposes the scientific community will treat the correlation as non-existent, since we’re not sufficiently certain of it (even though it likely exists)”.
And this is a really, really great reason not to identify yourself as “Bayesian”. You end up not using effective methods when you can’t derive them from Bayes’ theorem (which is to be expected absent very serious training in deriving things).
Better to check out a few false candidates too many than to falsely dismiss important new discoveries
Where do you think the funds for testing false candidates are going to come from? If you are checking too many false candidates, you are dismissing important new discoveries. You are also taking time away from any exploration of the unexplored space.
edit: also I think you overestimate the extent to which promising avenues of research are “closed” by a failure to confirm. It is understood that a failure can result from a multitude of causes. Keep in mind also that with a strong effect, you have a quadratically better p-value for the same sample size. You are at much less risk of dismissing strong results.
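One way to read the "quadratically" remark is through the standard sample-size approximation for a two-sample comparison, where the required n per group scales as 1/d² in the standardized effect size d. The sketch below uses that normal approximation; the 80% power target and the effect sizes are illustrative assumptions.

```python
from statistics import NormalDist


def required_n_per_group(d, alpha=0.05, power=0.8):
    """Approximate n per arm for a two-sided two-sample z-test.

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2, not rounded.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return 2 * ((z_alpha + z_power) / d) ** 2


# Doubling the effect size cuts the required sample to a quarter.
for d in (0.25, 0.5, 1.0):
    print(f"d={d}: about {required_n_per_group(d):.0f} per group")
```

So a genuinely strong effect needs far fewer patients to clear a given threshold, while weak effects are the ones most exposed to being dismissed in small studies.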
Well, I would find it really awkward for a Bayesian to condone a modus operandi such as “The p-value of 0.15 indicates it is much more likely that there is a correlation than that the result is due to chance, however for all intents and purposes the scientific community will treat the correlation as non-existent, since we’re not sufficiently certain of it (even though it likely exists)”.
The way statistically significant scientific studies are currently used is not like this. The meaning conveyed and the practical effect of officials declaring statistically significant findings is not a simple declaration of the Bayesian evidence implied by the particular statistical test returning less than 0.05. Because of this, I have no qualms with saying that I would prefer thresholds lower than p<0.05 to be used in the place where that standard is currently used. No rejection of Bayesian epistemology is implied.