gwern comments on Too good to be true

gwern 21 Jul 2014 14:40 UTC
10 points

whenever Phils of this world encounter the example of results not being slightly too good to be true

Boy, it’s a real pity that there’s no research into excess significance in which various authors do systematic samples of large numbers of papers to get field-wide generalizations and observations about whether this is a common phenomenon or not. As it stands, we have no idea whether Phil has cherry-picked a rare phenomenon or not.

Such a pity.
- private_messaging 21 Jul 2014 21:28 UTC
  3 points
  Well, I don’t see anyone writing about e.g. physics results not being too good to be true, or government-sponsored pharmaceutical studies not being too good to be true etc. Nor would it be particularly rare to obtain that sort of result anyway.
  - gwern 22 Jul 2014 1:13 UTC
    6 points
    
    physics results not being too good to be true
    
    Well, more generally, people do apply that sort of reasoning when they are skeptical of improbable results: most people’s reaction (especially on LW) to the neutrino FTL result was that the result was simply wrong, regardless of how many measurements had been taken.
    
    I’m not really familiar with how significance-testing is used in physics, but at least under the six-sigma level of alpha, it would take an enormous number of studies of a null hypothesis before the lack of statistical-significance would become ‘too good to be true’.
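    
    To put a rough number on "enormous", here is a minimal sketch. The assumptions are mine and not from the thread: the studies are independent, the null hypothesis is true in each one, the sigma level is converted to a two-sided alpha, and a run of non-significant results counts as "too good to be true" once its probability falls to 5% or less. Both function names are hypothetical.

```python
import math

def two_sided_alpha(sigmas):
    # Two-sided tail probability of a standard normal beyond +/- sigmas.
    return math.erfc(sigmas / math.sqrt(2))

def studies_until_suspicious(alpha, threshold=0.05):
    # Smallest n for which (1 - alpha)**n <= threshold, i.e. the number of
    # independent true-null studies after which "every single one came up
    # non-significant" is itself an improbable outcome.
    # Assumption: the 5% threshold is an arbitrary illustrative cutoff.
    return math.ceil(math.log(threshold) / math.log1p(-alpha))

alpha_six_sigma = two_sided_alpha(6)
print(studies_until_suspicious(0.05))             # conventional alpha = 0.05
print(studies_until_suspicious(alpha_six_sigma))  # six-sigma alpha
```

    Under these assumptions, a few dozen studies suffice at the conventional alpha of 0.05, while at six sigma the count runs to over a billion.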
    
    government-sponsored pharmaceutical studies not being too good to be true
    
    Then maybe you should look instead of talking out of your ass. People talk about problems with clinical trials all the time, and pharmaceuticals and medicine in general are the home stomping grounds for a lot of meta approaches like excess significance.
    - private_messaging 22 Jul 2014 10:22 UTC
      1 point
      
      I’m not really familiar with how significance-testing is used in physics, but at least under the six-sigma level of alpha, it would take an enormous number of studies of a null hypothesis before the lack of statistical-significance would become ‘too good to be true’.
      
      Physics is very diverse. There’s those neutrino detectors which detect and fail to detect rare events, for example.
      
      People talk about problems with clinical trials all the time,
      
      Yes, and they don’t seem to talk much about non problems.
      - gwern 24 Jul 2014 20:38 UTC
        6 points
        
        There’s those neutrino detectors which detect and fail to detect rare events, for example.
        
        OK, so? Do they impose six-sigmas on the total result, subdivisions, or what?
        
        Yes, and they don’t seem to talk much about non problems.
        
        Yes, because almost all clinical trials stink. Publication bias is pervasive, and the methodological problems are almost universal. When you read through, say, Cochrane meta-analyses or reviews, it’s normal to find that something like 90%+ of studies had to be discarded because they lacked such basic desiderata as ‘blinding’ or ‘randomization’ or simply didn’t specify important things like sample sizes or intent-to-treat. That people are willing to cite studies at all is ‘talking about non problems’.