I’m confused about something. In reality there are no perfect dice; all dice are biased in some way, intentionally or not. Thus wouldn’t a more realistic approach be something like “Given the dataset, construct the (multidimensional) probability distribution of biases”? Why privilege the “unbiased” hypothesis?
Right. But we’d also want to use a prior that favours near-fair biases, since we know that Wolf at least thought they were a normal pair of dice.
First answer: this question is a perfect lead-in to the next post, in which we try to figure out which physical asymmetries the die had. Definitely read that.
Second answer: In physics, to talk about the force applied by a baseball bat on a baseball, we use a delta function. We don’t actually think that the force is infinite and applied over an infinitesimal time span, but that’s a great approximation for simple calculations. Same in probability: we do actually think that most dice & coins are very close to unbiased. Even if we think there’s some small spread, the delta function distribution (i.e. a delta function right at the unbiased probability) is a great approximation for an “unbiased” real-world die or coin. That’s what the unbiased model is.
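To see how good that approximation is, here’s a quick numerical check in Python. The roll counts are hypothetical (not Wolf’s actual tallies): as a Dirichlet prior tightens around the uniform distribution, its marginal likelihood converges to the likelihood at exactly p = 1/6, which is all the delta-function model asserts.

```python
# Quick numerical check of the delta-function approximation, using
# made-up counts (not Wolf's data).
import numpy as np
from scipy.special import gammaln

counts = np.array([180, 175, 160, 150, 170, 165])  # hypothetical die rolls
n = counts.sum()

def log_marginal(alpha):
    """Log Dirichlet-multinomial marginal likelihood of the counts.
    The multinomial coefficient is omitted; it's identical for every
    prior, so it drops out of any comparison."""
    A = alpha.sum()
    return gammaln(A) - gammaln(A + n) + np.sum(gammaln(alpha + counts) - gammaln(alpha))

print("delta function at uniform:", n * np.log(1 / 6))
# Dirichlet(c,...,c) concentrates at uniform as c grows; by c ~ 1e6 its
# marginal likelihood is numerically indistinguishable from the delta's.
for c in [10.0, 1e3, 1e6]:
    print(f"Dirichlet(c={c:g}):", log_marginal(np.full(6, c)))
```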
Third answer: “Given the dataset, construct the (multidimensional) probability distribution of biases” translates to “calculate P[p|data]”. That is absolutely a valid question to ask. Our models then enter into the prior for p: each model implies a different prior distribution over p, so to get the overall prior we combine them via the law of total probability: P[p] = P[p|model1] P[model1] + P[p|model2] P[model2]. In English: we think the world has some “unbiased” dice, whose outcome frequencies are very close to uniform, and some “biased” dice, which could have any frequencies at all. Thus our prior for p looks like a delta function plus some flatter distribution: a mixture of “unbiased” and “biased” dice.
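For concreteness, here’s a minimal sketch of that mixture computation. The counts are again hypothetical, and the flat Dirichlet(1,…,1) for the “biased” model and the 50/50 prior odds are illustrative choices, not anything from the post:

```python
# Minimal sketch of the mixture-prior computation: hypothetical counts,
# a flat Dirichlet(1,...,1) for the "biased" model, 50/50 prior odds.
import numpy as np
from scipy.special import gammaln

counts = np.array([180, 175, 160, 150, 170, 165])  # hypothetical die rolls
n = counts.sum()

# P[data|model1] ("unbiased"): multinomial likelihood at p = (1/6,...,1/6).
# The multinomial coefficient is shared by both models and omitted.
log_like_unbiased = n * np.log(1 / 6)

# P[data|model2] ("biased"): Dirichlet-multinomial marginal likelihood.
alpha = np.ones(6)
A = alpha.sum()
log_like_biased = (gammaln(A) - gammaln(A + n)
                   + np.sum(gammaln(alpha + counts) - gammaln(alpha)))

# Posterior model probabilities: P[model|data] is proportional to
# P[data|model] P[model].
log_odds = log_like_unbiased - log_like_biased  # prior odds = 1
p_unbiased = 1 / (1 + np.exp(-log_odds))
print(f"P[unbiased|data] = {p_unbiased:.4f}")

# The posterior P[p|data] is the same mixture, reweighted: a point mass
# at uniform with weight p_unbiased, plus Dirichlet(alpha + counts)
# with weight 1 - p_unbiased.
```

With counts this close to uniform, the point-mass component wins the Occam factor and takes nearly all the posterior weight; skew the counts enough and the mixture shifts toward the Dirichlet component instead.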
Sure, you use a delta function when you want to make a simplifying assumption. But this post is about questioning the assumption. That’s exactly when you wouldn’t use a delta function. Your third answer flatly contradicts Shminux. No, he does not believe that there are any perfect dice. Sometimes it’s right to contradict people, but if you don’t notice you’re doing it, it’s a sign that you’re the one who is confused.
The third answer was meant to be used in conjunction with the second; that’s what the scare quotes around “unbiased” were meant to convey, along with the phrase “frequencies very close to uniform”. Sorry if that was insufficiently clear.
Also, if we’re questioning (i.e. testing) the assumption, then we still need the assumption around as a hypothesis against which to test. That’s exactly how it’s used in the post.
No, really, it was perfectly clear. The problem is that it was wrong.