In your example, before we have any information we’d assume P(A) = 0.5, and after we have information about the alphabet and how X is constructed from it we can just calculate the exact value of P(A|B). So the “update” here just consists of replacing the initial estimate with the correct answer. I think this is also what you’re saying, so I agree that in situations like these using P(A) = 0.5 as a starting point does not affect the final answer (but I’d still start out with a prior of 0.5).
I’ll propose a different example. It’s a bit contrived (well, really contrived, but OK).
Frank and his buddies (of which you are one) decide to rob a bank.
Frank goes: “Alright men, in order for us to pull this off 4 things have to go perfectly according to plan.”
(you think: a conjunction of 4 things, so 0.5^4 = 0.0625 prior probability of success)
Frank continues: the first thing we need to do is beat the security system (… long explanation follows).
(you think: that plan is genius and almost certain to work; a Bayesian estimate gives it a 0.9 probability of success. I’m updating my confidence to 0.9 × 0.5^3 = 0.1125)
Frank continues: the second thing we need to do is break into the safe (… again a long explanation follows).
(you think: wow, that’s a clever solution; 0.7 probability of success. Total probability of success: 0.9 × 0.7 × 0.5^2 = 0.1575)
Frank continues: So! Are you in or are you out?
At this point you have to decide immediately. You don’t have the time to work out the plausibility of the remaining two factors; you just have to make a decision. But just by knowing that there are two more things that have to go right, you can confidently say “Sorry Frank, but I’m out.”
If you had more time to think you could come up with a better estimate of success, but you don’t. You have to go with the prior of total ignorance (0.5) for each of the last two factors of your estimate.
If we were to plot the confidence over time I think it should start at 0.5, then drop to 0.0625 the moment we understand that an estimate of a conjunction of 4 parts is to be calculated, and after that the more nuanced Bayesian reasoning follows. So if I were to build an AI I would make it start out with the universal prior of total ignorance and go from there. So I don’t think the prior is a purely mathematical trick that has no bearing on the way we reason.
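For concreteness, the running estimate above can be sketched in a few lines of Python (the names and structure are my own, just to make the arithmetic explicit):

```python
from math import prod

# Each of the 4 factors starts at the ignorance prior of 0.5 and is
# replaced by an evidence-based estimate as Frank explains the plan.
factors = [0.5, 0.5, 0.5, 0.5]

trajectory = [0.5]                # before we know the plan has 4 parts
trajectory.append(prod(factors))  # 0.5^4 = 0.0625
factors[0] = 0.9                  # the security-system plan
trajectory.append(prod(factors))  # 0.9 * 0.5^3 = 0.1125
factors[1] = 0.7                  # the safe-cracking plan
trajectory.append(prod(factors))  # 0.9 * 0.7 * 0.5^2 = 0.1575

print([round(p, 4) for p in trajectory])  # [0.5, 0.0625, 0.1125, 0.1575]
```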
(At the risk of stating the obvious: strictly speaking you never adjust based on the prior of 0.5. The moment you have evidence you replace the prior with an estimate based on that evidence, and when more evidence comes in you update on that. The prior of 0.5 completely vaporizes the moment evidence enters the picture; otherwise you would be doing an update on non-evidence.)