The idea is good, but I’m afraid it may be interpreted as meaning that we need to raise our publication standards from a 95% confidence threshold to a 98% one. I think scientists already have a dangerously strong bias to reject anything that fails to meet 95% confidence. If someone has a good idea, with good theoretical reasoning behind it, and they run some experiments but don’t hit 95%, it’s still worth considering.
There are also all sorts of data-collection results that are routinely thrown out if they fall below 95% confidence, when they shouldn’t be. People doing any sort of genomics work routinely fail to report gene associations found at less than 95% confidence. The fact is that, when we’re feeding millions of pieces of data into a computer program to compute reliability scores, ALL data should be saved and used. Most of the information scientists produce is in the large mass of low-confidence predictions. There is much more information in 100,000 50%-confidence predictions than in a dozen 95%-confidence predictions.
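To put a rough number on that claim (a back-of-the-envelope sketch of my own, not from the thread): if we assume, say, a 1% base rate of true gene associations, the information a prediction carries can be scored as the KL divergence between the confidence it asserts and that prior.

```python
from math import log2

def bits_gained(posterior: float, prior: float) -> float:
    """Expected information gain (in bits) of a prediction that moves
    the probability of an association from `prior` to `posterior`,
    measured as the KL divergence D(posterior || prior)."""
    p, q = posterior, prior
    return p * log2(p / q) + (1 - p) * log2((1 - p) / (1 - q))

PRIOR = 0.01  # assumed base rate of true associations (illustrative only)

low  = 100_000 * bits_gained(0.50, PRIOR)  # the mass of low-confidence calls
high = 12      * bits_gained(0.95, PRIOR)  # a dozen high-confidence calls

print(f"100,000 x 50%-confidence predictions: {low:,.0f} bits")
print(f"     12 x 95%-confidence predictions: {high:,.0f} bits")
```

Under those toy assumptions the low-confidence mass carries roughly 233,000 bits against about 72. The exact figures move with the assumed base rate, but the ordering doesn’t.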
I agree that all data should be saved, and that there’s much more information in 100,000 50%-confidence predictions than in a dozen 95%-confidence predictions. But ask a biologist which they’d prefer (ETA: I have actually done this, more or less) and they’ll take the dozen 95%-confidence predictions, because they’re just going to turn around and use bog-standard low-throughput experimental techniques to dig deeper. From the biologist’s decision-theory perspective, false positives are a lot more costly than false negatives.
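Here’s an equally crude sketch of that decision problem (illustrative numbers of my own, not from the comment): if each low-throughput follow-up costs a fixed amount and a false positive consumes the whole cost with nothing to show for it, the quantity the lab actually optimizes is cost per confirmed hit.

```python
def cost_per_hit(confidence: float, followup_cost: float) -> float:
    """Expected cost of one confirmed discovery when each follow-up
    experiment costs `followup_cost` and pays off with probability
    `confidence` (a false positive is a pure loss)."""
    return followup_cost / confidence

COST = 1_000.0  # assumed price of one low-throughput follow-up (arbitrary units)

print(f"95%-confidence list: {cost_per_hit(0.95, COST):,.0f} per confirmed hit")
print(f"50%-confidence list: {cost_per_hit(0.50, COST):,.0f} per confirmed hit")
```

Each confirmed discovery from the 50% list costs nearly twice the bench work, so the biologist, who pays per follow-up rather than per bit, is being perfectly rational in taking the short high-confidence list even though it carries far less total information.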
That’s why we need to replace biologists with robots. Like this one.
That approach only works because yeast has been subjected to intense investigation by low-throughput techniques, providing a huge knowledge base that constrains and guides the automated investigation. (It also helps that yeast doesn’t do alternative splicing.) So it’s not so much “replacing” as “building upon”.