Let’s do a toy problem. Suppose that Statistician 2 is extra-lazy, and will only flip the coin three times, again stopping if they ever have more heads than tails. And suppose that, again, they end up using up all the flips and have more tails than heads—in this case, 2 tails and 1 heads. Every time, they must get tails first, or else they would immediately stop, and then they either get the next two flips Heads-Tails or Tails-Heads—they can only get the sequences THT or TTH.
So P(THT+TTH | Coin A) = 4⁄27, while P(THT+TTH | Coin B) = 8⁄27. So statistician 2 will record this result twice as often with coin B as with coin A. Thus statistician 2 claims that the probability of it being coin B is 2⁄3.
Compare this to Statistician 1: P(1 heads and 2 tails | Coin A) = P(HTT+THT+TTH | Coin A) = 6⁄27, while P(HTT+THT+TTH | Coin B) = 12⁄27. Thus statistician 1 thinks the probability of it being coin B is 2⁄3. The two statisticians get the same results!
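If it helps to see this mechanically, here's a quick sketch that just enumerates the possibilities. It assumes the usual setup for this example, namely that coin A lands heads 2⁄3 of the time and coin B lands heads 1⁄3 of the time; the function names are mine, not anything canonical.

```python
from itertools import product

# Assumed setup (not stated explicitly above): coin A lands heads with
# probability 2/3, coin B with probability 1/3.
P_HEADS = {"A": 2 / 3, "B": 1 / 3}

def seq_prob(seq, coin):
    """Probability of one exact flip sequence, e.g. 'THT', given a coin."""
    p = P_HEADS[coin]
    prob = 1.0
    for flip in seq:
        prob *= p if flip == "H" else 1 - p
    return prob

def stops_early(seq):
    """True if Statistician 2 would have stopped before finishing this sequence."""
    heads = tails = 0
    for flip in seq[:-1]:           # check every point short of the last flip
        heads += flip == "H"
        tails += flip == "T"
        if heads > tails:
            return True
    return False

all_seqs = ["".join(s) for s in product("HT", repeat=3)]

# Statistician 1 counts every 3-flip sequence with exactly 1 head;
# Statistician 2 only ever sees the ones that don't trigger an early stop.
stat1 = [s for s in all_seqs if s.count("H") == 1]
stat2 = [s for s in stat1 if not stops_early(s)]

for name, seqs in [("Statistician 1", stat1), ("Statistician 2", stat2)]:
    p_a = sum(seq_prob(s, "A") for s in seqs)
    p_b = sum(seq_prob(s, "B") for s in seqs)
    print(f"{name}: {seqs}  P(data|A)={p_a:.3f}  P(data|B)={p_b:.3f}  ratio={p_b/p_a:.1f}")
```

Both print a ratio of 2, even though Statistician 2's sequences are a strict subset of Statistician 1's.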
This is a general pattern: because the flips are independent, when statistician 2 compares how often they get a given result with coin A versus coin B, the ratio (and thus the likelihood ratio) is the same as for statistician 1. Statistician 2 just accepts a smaller set of possible flip sequences, but every one of those sequences has the same ratio of probabilities under the two coins.
Remember Bayes’ theorem: P(hypothesis | data) = P(hypothesis) * P(data | hypothesis) / P(data). Here P(data | hypothesis) is the probability of getting some number of heads and tails given a specific coin, and P(data) is the probability of getting that result averaged over both coins. The size of P(data | hypothesis) by itself doesn’t matter, only the ratio P(data | hypothesis) / P(data).
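To make the update concrete, here’s a minimal sketch, assuming a 50⁄50 prior over the coins and plugging in the likelihoods from above:

```python
# A minimal Bayes update over the two coins, assuming a 50/50 prior.
def posterior_B(likelihood_A, likelihood_B, prior_B=0.5):
    """P(coin B | data): only the ratio of the likelihoods ends up mattering."""
    prior_A = 1 - prior_B
    evidence = likelihood_A * prior_A + likelihood_B * prior_B   # P(data)
    return likelihood_B * prior_B / evidence

print(posterior_B(4 / 27, 8 / 27))    # Statistician 2's likelihoods -> 0.666...
print(posterior_B(6 / 27, 12 / 27))   # Statistician 1's likelihoods -> 0.666...
```

Scaling both likelihoods by the same factor leaves the posterior untouched, which is exactly why the two statisticians agree.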
This is not to say that statistician 2 can’t cheat. All they have to do is to not publish results with more tails than heads. Now if you update straightforwardly on the published results, on average statistician 2 has biased you towards coin A. The only way to counteract this is if you know that this is what they’re doing, and can update on your observations, not just their published observations.
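Here’s a rough Monte Carlo sketch of that cheating scheme, under the same assumed coin biases (2⁄3 heads for A, 1⁄3 for B). A reader who updates naively on whatever gets published, and keeps their 50⁄50 prior when nothing appears, ends up leaning toward coin A on average:

```python
import random

P_HEADS = {"A": 2 / 3, "B": 1 / 3}    # assumed coin biases, as above

def run_statistician_2(coin, max_flips=3):
    """Flip until heads outnumber tails or max_flips is reached."""
    heads = tails = 0
    for _ in range(max_flips):
        if random.random() < P_HEADS[coin]:
            heads += 1
        else:
            tails += 1
        if heads > tails:
            break
    return heads, tails

def naive_posterior_A(heads, tails):
    """Reader's P(coin A | counts), ignoring stopping and publication rules."""
    like_A = P_HEADS["A"] ** heads * (1 - P_HEADS["A"]) ** tails
    like_B = P_HEADS["B"] ** heads * (1 - P_HEADS["B"]) ** tails
    return like_A / (like_A + like_B)

N = 100_000
total = 0.0
for _ in range(N):
    coin = random.choice("AB")            # the true coin really is 50/50
    heads, tails = run_statistician_2(coin)
    if heads > tails:                     # heads-heavy: gets published
        total += naive_posterior_A(heads, tails)
    else:                                 # tails-heavy: quietly suppressed
        total += 0.5                      # reader sees nothing, keeps the prior
print(total / N)                          # about 0.60, i.e. pushed toward coin A
```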
There’s another toy example that might help too. Suppose Statistician 2 is willing to flip the coin 3 times, but gets heads on the first flip and stops there. Surely you can’t accept this data, or else you’re practically guaranteed to let Statistician 2 manipulate you, right?
Well, P(H | coin A) = 2⁄3, and P(H | coin B) = 1⁄3, so clearly “first flip heads” is an event that happens twice as often when it’s coin A. What kind of scientist would you be if you couldn’t derive evidence from an event that happens twice as often under some conditions?
The weird thing is that even though you can see the event “first flip heads,” you’ll never see the event “first flip tails.” How come these individual data points are still “good bets,” even though you’ll never see the event of first flip tails? It seems like Statistician 2 has a sure-fire system for “beating the house” and convincing you no matter what.
Why am I suddenly making gambling analogies? Because Statistician 2 is trying to use a Martingale betting system. And at the end of the day, the house always wins—Statistician 2 has a large chance to submit a “biased towards heads” sample, but only at the cost of having their other samples be even more biased towards tails. On average, they are still accurate, just like how on average, you can’t win money with a Martingale betting strategy.
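You can check the “house always wins” claim exactly rather than by analogy. Under the same assumed biases and a 50⁄50 prior, Statistician 2’s rule (at most 3 flips, stop once heads outnumber tails) can only ever produce five outcomes, and averaging the reader’s posterior over all of them gives back exactly the prior. This is just a sketch of the bookkeeping:

```python
from fractions import Fraction

# Exact check, assuming coin A lands heads 2/3 of the time, coin B 1/3,
# and a 50/50 prior over the coins.
p_heads = {"A": Fraction(2, 3), "B": Fraction(1, 3)}
outcomes = ["H", "THH", "THT", "TTH", "TTT"]   # every run the stopping rule allows

def prob(seq, coin):
    """Probability of one exact flip sequence given a coin."""
    p = p_heads[coin]
    result = Fraction(1)
    for flip in seq:
        result *= p if flip == "H" else 1 - p
    return result

expected_posterior_A = Fraction(0)
for seq in outcomes:
    p_seq = Fraction(1, 2) * prob(seq, "A") + Fraction(1, 2) * prob(seq, "B")
    posterior_A = Fraction(1, 2) * prob(seq, "A") / p_seq
    expected_posterior_A += p_seq * posterior_A

print(expected_posterior_A)   # exactly 1/2: on average, no manipulation at all
```

The early-stop outcome H pushes you toward coin A, but the tails-heavy full runs push you back toward coin B by just enough to cancel it; only suppressing some of the outcomes breaks the balance.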
In this analogy, publication bias is like running away without paying your gambling debts.