I’m here to say this is not some property specific to p-values; it is just about the credibility of the communicator.
If scientists make a bunch of errors all the time, especially errors that change their conclusions, then indeed you can’t trust them. But it turns out (BW11) that scientists published in better journals are more credible than scientists published in worse journals, and that the errors they make tend not to change the conclusion of the test: the chance of drawing a wrong conclusion from their data (“gross error” in BW11) was much lower than the headline rate. And (admittedly I’m going out on a limb here) it is very possible that the errors that do change the conclusion of a particular test do not change the overall conclusion about the general theory. For example, if theory says X, Y, and Z should happen, and you find support for X and Y, and the marginal support for Z is now no longer significant, the theory is still pretty intact unless you really care about using p-values in a binary fashion. If theory says X, Y, and Z should happen, and you find support for X and Y, and the support for Z is now no longer significant, that’s more of an issue. But given how many tests are in a paper, it’s also possible that theory says X, Y, and Z should happen, you find support for X, Y, and Z, and it turns out your conclusion about W reverses, which may or may not really have something to say about your theory.
I don’t think it is wise to throw the baby out with the bathwater.