In both the boy-or-girl puzzle and the Monty Hall problem, the main point is “how” the new information is obtained. Is the mathematician randomly picking a child and mentioning its gender, or is he purposely checking for a boy among his children? Does the host know what’s behind the doors and always reveal a goat, or does he simply open a random door that happens to hide a goat? Or, in statistical terms: how is the sample drawn? Once that is clear, Bayesian and frequentist methods give the same result. Of course, if one starts from a wrong assumption about the sampling process, the conclusion will be wrong. No argument there.
But SIA itself is a statement regarding how the sample is drawn. Why must we check its merit only with Bayesian reasoning and not with statistics? And if you are certain the statistical reasoning is wrong, then instead of pointing to different probability puzzles, why not point out the mistake?
In all these posts you haven’t even mentioned whether you believe the thirder should estimate R=27 or not. While I have been explicit about my position, dissecting my arguments step by step, I feel you are being very vague about yours. This puts me in a harder and more laborious position when counter-arguing. That’s why I feel this discussion is no longer about the Sleeping Beauty problem but about who’s right and who’s better at arguing. That’s not productive, and I am leaving it.
If by “estimate” you mean “highest credence”, the short answer is that Bayesians usually don’t use such tools (maximum likelihood, unbiased estimates, etc.); they use plain old expected values instead.
After waking up in a red room and then opening 2 red and 6 blue rooms, a Bayesian thirder will believe the expected value of R to be 321⁄11, which is a bit over 29. I calculated it directly and then checked with a numerical simulation.
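Here is a minimal sketch of both checks in Python, assuming a uniform prior on R over 1..81 (the loop bounds and variable names are illustrative, not the exact code used):

```python
from math import comb
import random

# Direct calculation: under SIA, an awakening in a red room gets weight r,
# and the 8 opened rooms (2 red, 6 blue) contribute a hypergeometric
# factor C(r-1, 2) * C(81-r, 6).
num = den = 0
for r in range(3, 76):                       # need at least 3 red and 6 blue rooms
    w = r * comb(r - 1, 2) * comb(81 - r, 6)
    num += r * w
    den += w
print(num / den)                             # 29.1818... = 321/11

# Numerical check by rejection sampling: draw R uniformly, put Beauty in a
# uniformly random room, and keep only trials matching her observations.
acc = kept = 0
for _ in range(500_000):
    r = random.randint(1, 81)
    rooms = [True] * r + [False] * (81 - r)
    random.shuffle(rooms)
    if rooms[0] and sum(rooms[1:9]) == 2:    # her room is red; 2 red among 8 opened
        acc += r
        kept += 1
print(acc / kept)                            # ~29.2
```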
It’s easy to explain why the expected value isn’t 27 (proportional to the fraction of red in the sample). Consider the case where all 9 rooms seen are red. Should a Bayesian then believe that the expected value of R is 81? No way! That would imply believing R=81 with probability 100%, because any nonzero credence for R<81 would lead to a lower expected value. That’s way overconfident after seeing only 9 rooms, so the right expected value must be lower. You can try calculating it; it’s a nice exercise.
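For the exercise, the same computation with the observation changed to all 9 rooms red (a sketch under the same assumptions as above) gives about 74.5, comfortably below 81:

```python
from math import comb

# All 9 observed rooms red: Beauty's room (SIA weight r) plus 8 more red
# rooms among the remaining 80, i.e. a factor of C(r-1, 8).
num = den = 0
for r in range(9, 82):
    w = r * comb(r - 1, 8)
    num += r * w
    den += w
print(num / den)   # 74.4545... = 819/11
```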
Appreciate the effort, especially the calculation part. I am no expert on coding, but from my limited knowledge of Python the calculation looks correct to me. I want to point out that a direct formulation like sum(r · C(r,3) · C(81−r,6), r = 3 to 75) / sum(C(r,3) · C(81−r,6), r = 3 to 75) gives the same answer. I would say it reflects the SIA reasoning more directly and resembles your code better as well. Basically it shows that under SIA, Beauty should treat her own room the same way as the other 8 rooms.
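One way to see why the two formulations must agree, as a side note: the SIA weight in the direct calculation satisfies the binomial identity

$$ r\binom{r-1}{2} = 3\binom{r}{3}, $$

so it is just 3 times the term used here, and the constant factor cancels between numerator and denominator. Treating Beauty’s own room symmetrically with the other 8 is built into the algebra.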
The part explaining the relationship between expected value and unbiased estimation (maximum likelihood) is obviously correct, though I wouldn’t say it is relevant to the argument.
You claim Bayesians don’t usually use maximum likelihood or unbiased estimates. I would say that is a mistake: they are important in decision making. However, “usually” is a subjective term, and arguing about how often counts as “usual” is pointless. The bottom line is that these are valid questions to ask, and Bayesians should have an answer. How thirders should answer them, that is the question.
Mathematically, maximum likelihood and unbiased estimates are well defined, but Bayesians don’t expect them to always agree with intuition.
For example, imagine you have a coin whose parameter is known to be between 1⁄3 and 2⁄3. After seeing one tails, an unbiased estimate of the coin’s parameter is 0 (lower than all possible parameter values) and the maximum likelihood estimate is 1⁄3 (jumping to extremes after seeing a tiny bit of information). Bayesian expected values don’t have such problems.
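To spell out where the 0 comes from (a short derivation from the single-toss setup above): with one toss, an estimator is just a pair of numbers T(H) and T(T), and unbiasedness demands

$$ \mathbb{E}_p[T] = p\,T(H) + (1-p)\,T(T) = p \quad \text{for all } p \in [\tfrac{1}{3}, \tfrac{2}{3}], $$

and since this must hold for every such p, matching the constant term and the coefficient of p forces T(T) = 0 and T(H) = 1. So after a single tails, the only unbiased estimate is 0, below every admissible parameter value.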
You can stop kicking the sand castle of frequentism+SIA; it never had strong defenders anyway. Bayes+SIA is the strong, inconvenient position you should engage with.
That’s an unfair comparison, since you assume a good prior. Screw up the prior and Bayes can be made to look as silly as you like.
Doing frequentist estimation on the basis of one data point is stupid, of course.
Maximum likelihood is indeed 0, i.e. Tails, assuming we start from a uniform prior; 1⁄3 is the expected value. Ask yourself this: after seeing a tail, what should you guess for the next toss to have the maximum likelihood of being correct?
If halfer reasoning applies to both the Bayesian and the frequentist frameworks, while SIA is only good in the Bayesian one, isn’t that quite alarming, to say the least?
The 0 isn’t a prediction of the next coin toss; it’s an unbiased estimate of the coin parameter, which is guaranteed to lie between 1⁄3 and 2⁄3. That’s the problem! Depending on the randomness in the sample, an unbiased estimate of an unknown parameter X could be smaller or larger than literally all possible values of X. Since in the post you use unbiased estimates and expect them to behave reasonably, I thought this example would be relevant.
Hopefully that makes it clearer why Bayesians wouldn’t agree that frequentism+halfism is coherent. They think frequentism is incoherent enough on its own :-)
OK, I misunderstood. I interpreted it as the coin being biased 1⁄3 to 2⁄3, but we don’t know which side it favours. If we start from a uniform prior (1⁄2 to H and 1⁄2 to T), then the maximum likelihood estimate is Tails.
Unless I misunderstood again, you mean there is a coin whose natural chance we want to guess (forgive me if I’m misusing terms here), and we do know its chance is bounded between 1⁄3 and 2⁄3. In this case, yes, the unbiased estimate is 0 while the maximum likelihood estimate is 1⁄3. However, that is obviously due to the use of an informed prior (that we know the chance is between 1⁄3 and 2⁄3). Hardly a surprise.
Also, I want to point out that you said earlier that SIA+frequentism never had any strong defenders. That is not true. In the literature to date, thirding is generally considered a better fit for frequentism than halving, because the long-run frequency of Tails awakenings is twice that of Heads awakenings. Such arguments are used by published academics, including Elga. Therefore I would consider my attack from the frequentist angle to have some value.
Interesting. I guess the right question is: if you insist on a frequentist argument, how simple can you make it? Like I said, I don’t expect things like unbiased estimates to behave intuitively. Can you make the argument about long-run frequencies only? That would go a long way toward convincing me that you found a genuine contradiction.
Yes, I have given a long-run frequency argument for halving in Part I. Sadly, that part has not gotten any attention. My entire argument is about the importance of perspective disagreement in the SBP; this counterargument is actually the less important part.
Sorry, slightly confused here: bias (although a frequentist concept, since it relies on a “true parameter value”) is sort of orthogonal to Bayesian vs. frequentist.
Estimates based on either Bayesian or frequentist techniques could be biased or unbiased.
Quoth famous Bayesian Andrew Gelman:
“I can’t keep track of what all those Bayesians are doing nowadays—unfortunately, all sorts of people are being seduced by the promises of automatic inference through the “magic of MCMC”—but I wish they would all just stop already and get back to doing statistics the way it should be done, back in the old days when a p-value stood for something, when a confidence interval meant what it said, and statistical bias was something to eliminate, not something to embrace.”
(http://www.stat.columbia.edu/~gelman/research/published/badbayesmain.pdf)
Heh. I’m not a strong advocate of Bayesianism, but when someone says their estimator is unbiased, that doesn’t fill me with trust. There are many problems where the unique unbiased estimator is ridiculous (e.g. negative with high probability when the true parameter is always positive, etc.)
Sure, unbiasedness is a weak property:
If you throw a dart either one foot to the left or one foot to the right of the bullseye, you are unbiased wrt the bullseye, but this is stupid.
Consistency is a better property.
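A toy numerical illustration of that contrast (a hypothetical setup, not from this thread): estimating θ from n draws of Uniform(0, θ), the estimator 2·X₁ is unbiased but its error never shrinks with n, while the sample maximum is biased low yet consistent:

```python
import random

# Hypothetical demo: estimate theta from n draws of Uniform(0, theta).
#   2 * xs[0] : unbiased, but inconsistent (error stays near theta/2)
#   max(xs)   : biased low, but consistent (error shrinks like theta/n)
theta = 10.0
for n in (10, 100, 10_000):
    trials = 300
    err_first = err_max = 0.0
    for _ in range(trials):
        xs = [random.uniform(0, theta) for _ in range(n)]
        err_first += abs(2 * xs[0] - theta)
        err_max += abs(max(xs) - theta)
    print(n, err_first / trials, err_max / trials)
# The unbiased estimator's average error hovers near 5 for every n;
# the biased maximum's error tends to 0.
```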