I feel like this result should have rung significant alarm bells. Bayes theorem is not a rule someone has come up with that has empirically worked out well. It’s a theorem. It just tells you a true equation by which to compute probabilities. Maybe if we include limits of probability (logical uncertainty/infinities/anthropics) there would be room for error, but the setting you have here doesn’t include any of these. So Bayesians can’t commit a fallacy. There is either an error in your reasoning, or you’ve found an inconsistency in ZFC.
So where’s the mistake? Well, as far as I’ve understood (and I might be wrong), all you’ve shown is that if we restrict ourselves to three priors (uniform, streaky, switchy) and observe a distribution that’s uniform, then we’ll accumulate evidence against streaky more quickly than against switchy. Which is a cool result since the two do appear symmetrical, as you said. But it’s not a fallacy. If we set up a game where we randomize (uniform, streaky, switchy) with 1⁄3 probability each (so that the priors are justified), then generate a sequence, and then make people assign probabilities to the three options after seeing 10 samples, then the Bayesians are going to play precisely optimally here. It just happens to be the case that, whenever the steady (uniform) distribution is the one drawn, the probability for streaky goes down more quickly than that for switchy. So what? Where’s the fallacy?
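The game above is easy to make concrete. Below is a minimal sketch of the Bayesian player’s side, assuming the paper’s Markov-style hypotheses: each one repeats the previous flip with a fixed probability, 50% for steady, 50+c% for sticky/streaky, 50−c% for switchy. The value c = 0.2 is illustrative, not taken from the paper.

```python
# Three hypotheses about a binary process, parameterized by the
# probability that each flip repeats the previous one. c = 0.2 is a
# made-up illustrative value.
C = 0.2
HYPOTHESES = {"steady": 0.5, "sticky": 0.5 + C, "switchy": 0.5 - C}

def likelihood(seq, p_repeat):
    """P(seq | hypothesis): the first flip is 50/50 under every
    hypothesis; each later flip repeats the previous one with
    probability p_repeat."""
    p = 0.5
    for prev, cur in zip(seq, seq[1:]):
        p *= p_repeat if cur == prev else 1 - p_repeat
    return p

def posterior(seq):
    """Bayes' theorem with the uniform 1/3 prior over the hypotheses."""
    joint = {h: likelihood(seq, p) / 3 for h, p in HYPOTHESES.items()}
    z = sum(joint.values())
    return {h: j / z for h, j in joint.items()}
```

After a streak like [1, 1, 1, 1] the posterior peaks on "sticky"; after the alternating [1, 0, 1, 0] it peaks on "switchy" by exactly the same amount, which is the symmetry the two hypotheses do have under Bayes’ rule. Nothing in this computation can be "fallacious": it is just the theorem.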
First upshot: whenever she’s more confident of Switchy than Sticky, this weighted average will put more weight on the Switchy (50-c%) term than the Sticky (50+c%) term. This will lead her to be less than 50%-confident the streak will continue—i.e. will lead her to commit the gambler’s fallacy.
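The quoted weighted average is plain arithmetic. With concrete numbers (all of them illustrative, not from the paper): take c = 20 percentage points and suppose she is at 40% Steady, 35% Switchy, 25% Sticky, so Switchy outweighs Sticky:

```python
# The quoted weighted average, with made-up illustrative credences.
c = 0.20
credence = {"steady": 0.40, "switchy": 0.35, "sticky": 0.25}

# Probability the streak continues: each hypothesis's repeat
# probability, weighted by her credence in that hypothesis.
p_continue = (credence["steady"] * 0.5
              + credence["switchy"] * (0.5 - c)
              + credence["sticky"] * (0.5 + c))
print(round(p_continue, 2))  # 0.48, below 50%
```

Because the 35% on Switchy multiplies the low 30% repeat rate while only 25% multiplies the high 70% rate, the average lands below 50%, which is exactly the quoted "first upshot".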
In other words, if a Bayesian agent has a prior (1⁄3, 1⁄3, 1⁄3) across three distributions, then their probability estimate for the next sampled element will be systematically off if the data is in fact always generated by the first (uniform) distribution. This is not a fallacy; it happens because you’ve given the agent the wrong prior! You made her equally uncertain between three hypotheses and then assumed that only one of them is true.
And yeah, there are probably fewer than 1⁄3 switchy and streaky distributions each out there, so the prior is probably wrong. But this isn’t the Bayesian’s fault. The fair way to set up the game would be to randomize which distribution is used to generate the data, which again would lead to optimal predictions.
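The "systematically off" point can be checked directly in a sketch: start the agent at the uniform 1⁄3 prior, feed her data that (unknown to her) comes only from the steady process, and her predictive probability for the next flip no longer matches the true 50% (the Markov-style model and c = 0.2 here are illustrative assumptions, not values from the paper):

```python
# An agent with a uniform 1/3 prior over three repeat-probability
# hypotheses. c = 0.2 is a made-up illustrative value.
C = 0.2
REPEAT_PROB = {"steady": 0.5, "sticky": 0.5 + C, "switchy": 0.5 - C}

def predictive_repeat(seq):
    """The agent's probability that the next flip repeats the last one:
    the posterior-weighted average of the repeat probabilities."""
    joint = {}
    for h, p_rep in REPEAT_PROB.items():
        like = 0.5  # the first flip is 50/50 under every hypothesis
        for prev, cur in zip(seq, seq[1:]):
            like *= p_rep if cur == prev else 1 - p_rep
        joint[h] = like / 3  # multiply in the uniform prior
    z = sum(joint.values())
    return sum(joint[h] / z * REPEAT_PROB[h] for h in joint)

# After a short streak the agent predicts a repeat with probability
# ~0.60, but the steady process actually repeats with probability
# exactly 0.5, so against steady-only data she is systematically off.
print(predictive_repeat([1, 1, 1]))
```

With a single observed flip there are no transitions yet, so the prediction is exactly 0.5; the deviation is driven entirely by the prior weight on sticky and switchy interacting with whatever patterns the steady process happens to produce.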
I don’t want to be too negative since it is still a cool result, but it’s just not a fallacy.
Baylee is a rational Bayesian. As I’ll show: when either data or memory are limited, Bayesians who begin with causal uncertainty about an (in fact independent) process—and then learn from unbiased data—will, on average, commit the gambler’s fallacy.
Same as above. I mean the data isn’t “unbiased”, it’s uniform, which means it is very much biased relative to the prior that you’ve given the agent.
I have the impression the OP is using “gambler’s fallacy” as a conventional term for a strategy, while you are taking “fallacy” to mean “something’s wrong”. The OP does write about this contrast, e.g., in the conclusion:
Maybe the gambler’s fallacy doesn’t reveal statistical incompetence at all. After all, it’s exactly what we’d expect from a rational sensitivity to both causal uncertainty and subtle statistical cues.
So I think the adversative backbone of your comment is misdirected.
I agree with that characterization, but I think it’s still warranted to make the argument because (a) OP isn’t exactly clear about it, and (b) saying “maybe the title of my post isn’t exactly true” near the end doesn’t remove the impact of the title. I mean this isn’t some kind of exotic effect; it’s the most central way in which people come to believe silly things about science: someone writes about a study in a way that’s maybe sort of true but misleading, and people come away believing something false. Even on LW, the number of people who read just the headline and fill in the rest is probably larger than the number of people who read the post.
This seems like a difficult situation because they need to refer to the particular way-of-betting that they are talking about, and the common name for that way-of-betting is “the gambler’s fallacy”, and so they can’t avoid the implication that this way-of-betting is based on fallacious reasoning except by identifying the way-of-betting in some less-recognizable way, which trades off against other principles of good communication.
I suppose they could insert the phrase “so-called”. i.e. “Bayesians commit the so-called Gambler’s Fallacy”. (That still funges against the virtue of brevity, though not exorbitantly.)
But the point of the post is to use that as a simplified model of a more general phenomenon, one that is meant to hook into your notions connected to the “gambler’s fallacy”.
A title like yours is more technically defensible and closer to the math, but it gives up an important part: the bolder claim is actually there and intentional.
It reminds me of a lot of academic papers where it’s very difficult to see what all that math is there for.
To be clear, I second making the title less confident. I think your suggestion errs in the other direction: it omits content.
Are “switchy” and “streaky” accepted terms-of-art? I wasn’t previously familiar with them and my attempts to Google them mostly lead back to this exact paper, which makes me think this paper probably coined them.
Yeah, I definitely did not think they’re standard terms, but they’re pretty expressive. I mean, you can use terms-that-you-define-in-the-post in the title.
I see the point, though I don’t see why we should be too worried about the semantics here. As someone mentioned below, I think the “gambler’s fallacy” is a folk term for a pattern of beliefs, and the claim is that Bayesians (with reasonable priors) exhibit the same pattern of beliefs. Some relevant discussion in the full paper (p. 3), which I (perhaps misguidedly) cut for the sake of brevity:
I strong downvote any post in which the title is significantly more clickbaity than warranted by the evidence in the post. Including this one.
What would you have titled this result?
With ~2 min of thought, “Uniform distributions provide asymmetrical evidence against switchy and streaky priors”