Regarding the technical side of your post, if a Bayesian computer program assigns probability 0.87 to proposition X, then obviously it ought to assign probability 1 to the fact that it assigns probability 0.87 to proposition X.
But it’s hard to think of a situation where the program will need to make use of the latter probability.
It’s a measure of how much confidence there is in the estimate, so it could be used when updating in response to evidence. High confidence there would mean that it takes a lot of new evidence to shift the 0.87 estimate.
Your last paragraph is wrong. Here’s an excruciatingly detailed explanation.
Let’s say I am a perfect Bayesian flipping a possibly biased coin. At the outset I have a uniform prior over all possible biases of the coin between 0 and 1. Marginalizing (integrating) that prior, I assign 50% probability to the event of seeing heads on the first throw. Knowing my own neurons perfectly, I believe all the above statements with probability 100%.
The first flip of the coin will still make me update the prior to a posterior, which will have a different mean. Perfect knowledge of myself doesn’t stop me from that.
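A minimal sketch of that step, assuming the uniform prior is represented as a Beta(1, 1) distribution so the predictive probability of heads is just the posterior mean (Python; the variable names are illustrative):

    # Uniform prior over the coin's bias, represented as Beta(1, 1).
    # The predictive P(heads) under a Beta(a, b) belief is the mean a / (a + b).
    a, b = 1, 1
    print(a / (a + b))   # 0.5 before any flips

    # Observe one head: the posterior is Beta(2, 1), whose mean has moved.
    a += 1
    print(a / (a + b))   # ~0.667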
Now skip forward. I have flipped the coin a million times, and about half the results were heads. My current probability assignment for the next throw (obtained by integrating my current prior) is 50% heads and 50% tails. I have monitored my neurons diligently throughout the process, and am 100% confident of their current state.
But it will take much more evidence now to change the 50% assignment to something like 51%, because my prior is very concentrated after seeing a million throws.
The statement “I have perfect knowledge of the current state of my prior” (and its integral, etc.) does not in any way imply that “my current prior is very concentrated around a certain value”. It is the latter, not the former, that controls my sensitivity to evidence.
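A minimal numerical sketch of that contrast, under the same Beta-Bernoulli model as above (the counts and the batch of 100 extra heads are illustrative choices, not anything from the original comment):

    # Predictive P(heads) after observing `heads` and `tails` flips,
    # starting from the uniform Beta(1, 1) prior.
    def predictive(heads, tails):
        a, b = 1 + heads, 1 + tails
        return a / (a + b)

    print(predictive(0, 0))              # 0.5 with a fresh, flat prior
    print(predictive(100, 0))            # ~0.99: 100 heads move a fresh prior a lot

    print(predictive(500_000, 500_000))  # ~0.5 after a million mixed flips
    print(predictive(500_100, 500_000))  # ~0.50005: the same 100 heads barely move it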
That does clarify what you originally meant. However, this still seems “rather suspicious”—due to the 1.0:
if a Bayesian computer program assigns probability 0.87 to proposition X, then obviously it ought to assign probability 1 to the fact that it assigns probability 0.87 to proposition X.
I’m willing to bite the bullet here because all hell breaks loose if I don’t. We don’t know how a Bayesian agent can ever function if it’s allowed (and therefore required) to doubt arbitrary mathematical statements, including statements about its own algorithm, current contents of memory, arithmetic, etc. It seems easier to just say 1.0 as a stopgap. Wei Dai, paulfchristiano and I have been thinking about this issue for some time, with no results.
I am pretty sure that is wrong. For one thing, it would be overconfident; for another, 0 and 1 are not probabilities.
Upvoted for giving a good explanation where I failed earlier...