benelliott comments on [SEQ RERUN] Beautiful Probability

benelliott 25 Dec 2011 22:47 UTC
0 points
The difference is that in your example we got different sets of data and simply discarded some of the data from one of them to make them look the same. In the original, we got the same set of data by the same method: everything that happened in the real world was the same, and the only difference was in counterfactual scenarios.
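To make the "only the counterfactuals differ" point concrete, here is a rough sketch rather than a proof. The numbers (70 cures, 30 failures, cure rates 0.7 vs. 0.5) and the two designs are assumptions chosen purely for illustration: one researcher fixes the number of patients in advance, the other keeps going until a fixed number of failures has been seen. The function names are made up for this example.

```python
from math import comb


def fixed_n_likelihood(cures, failures, p):
    """Binomial design: the total number of patients was fixed in advance."""
    n = cures + failures
    return comb(n, cures) * p**cures * (1 - p)**failures


def stop_at_failures_likelihood(cures, failures, p):
    """Negative binomial design: keep testing until `failures` failures are seen,
    so the last patient is necessarily a failure."""
    n = cures + failures
    return comb(n - 1, failures - 1) * p**cures * (1 - p)**failures


def likelihood_ratio(likelihood, cures, failures, p1, p2):
    """How strongly the data favour cure rate p1 over cure rate p2."""
    return likelihood(cures, failures, p1) / likelihood(cures, failures, p2)


# Hypothetical data: both researchers end up with exactly the same results.
cures, failures = 70, 30

ratio_fixed = likelihood_ratio(fixed_n_likelihood, cures, failures, 0.7, 0.5)
ratio_stop = likelihood_ratio(stop_at_failures_likelihood, cures, failures, 0.7, 0.5)

print(ratio_fixed, ratio_stop)
```

The two designs differ only in the combinatorial factor in front, which does not depend on p and so cancels in the ratio. The evidence for one cure rate over the other comes out the same (up to floating-point rounding) whichever stopping rule was in the researcher's head.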
- FeepingCreature 25 Dec 2011 22:52 UTC
  0 points
  Yeah, but our perception is the same, no? Besides, in a sense, the original researcher also discards bits of data: he discards all possible stopping points that do not confirm his hypothesis, and all those after his hypothesis has been “confirmed”.
  - benelliott 25 Dec 2011 23:06 UTC
    0 points
    He does not discard anything that actually happened.
    
    This is the key difference. We are evaluating the effectiveness of the drug by looking at what the drug actually did, not what it could have done.
    
    I can give a much more precise mathematical proof if you want.
    - FeepingCreature 26 Dec 2011 1:08 UTC
      0 points
      Let’s imagine a scientist did 500 tests. Then he started discarding tests, from the end, until the remaining data supported some hypothesis (or he ran out of tests). Is this to be treated as evidence of the same strength as it would be if he had precommitted to only doing that many tests?
      - benelliott 26 Dec 2011 1:18 UTC
        0 points
        I may be wrong here because I’m tired, but I think the way the maths comes out is that this would be as strong if he only removed tests from the end, whereas if he removed them from anywhere he chose depending on how they came out it would not be as strong.