I would be interested in ways to estimate how many hypotheses need to be listed before the correct one is in the list. If the number is very large, say in the thousands, then listing hypotheses is not productive.
I think the better question is how many hypotheses should be listed before the value of information is too low to be worth continuing.
If you think (exactly) one of the hypotheses is correct, then the prior probability that you have already included the correct one in your list is exactly the sum of the prior probabilities of all hypotheses so far. The posterior probability that your list contains the correct hypothesis cannot be computed, though, since it requires knowledge of the prior probability of the observed evidence (which requires summing over all the hypotheses, including the ones you didn’t list yet).
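To make the dependence concrete, here is a minimal Python sketch. The numbers and the function names are made up for illustration, and `unlisted_likelihood` stands in for the quantity we do not actually know: the probability of the evidence given that the truth is not on the list. The prior coverage is a plain sum, while the posterior coverage moves a lot depending on what you guess for that unknown.

```python
def prior_coverage(listed_priors):
    """Prior probability that the true hypothesis is already on the list."""
    return sum(listed_priors)


def posterior_coverage(listed_priors, listed_likelihoods, unlisted_likelihood):
    """Posterior probability that the list contains the true hypothesis.

    unlisted_likelihood is a guess at P(evidence | truth is not on the list).
    That is exactly the quantity that cannot be computed without the
    unlisted hypotheses, so here it is an assumed input.
    """
    # Joint probability of the evidence and "truth is on the list".
    listed_mass = sum(p * l for p, l in zip(listed_priors, listed_likelihoods))
    # Joint probability of the evidence and "truth is not on the list".
    unlisted_mass = (1.0 - sum(listed_priors)) * unlisted_likelihood
    return listed_mass / (listed_mass + unlisted_mass)


priors = [0.4, 0.2, 0.1]       # hypothetical priors of the listed hypotheses
likelihoods = [0.5, 0.1, 0.3]  # hypothetical P(evidence | hypothesis)

print("prior coverage:", prior_coverage(priors))
for guess in (0.0, 0.25, 1.0):
    print("guess", guess, "->", posterior_coverage(priors, likelihoods, guess))
```

Sweeping the guess from 0 to 1 at least gives a range for the posterior, but picking a single value inside that range still requires knowing something about the hypotheses you have not listed.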
(If more than one hypothesis can be correct due to several hypotheses being equivalent, the probability is higher.)
If there is a chance reality is not any hypothesis you would ever list, then you could multiply the above calculation by the probability reality is one of the hypotheses you would ever list.
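As a rough sketch of that adjustment (the name `p_listable` and the numbers are assumptions for illustration, and the listed priors are taken to be conditional on reality being listable at all):

```python
def coverage_with_unlistable(listed_priors, p_listable):
    """Prior coverage, discounted for the chance that reality is not
    any hypothesis you would ever list.

    listed_priors: priors of the listed hypotheses, conditional on the
                   truth being something you would eventually list.
    p_listable:    assumed probability that the truth is listable at all.
    """
    return p_listable * sum(listed_priors)


# Hypothetical numbers: 70% conditional coverage, 80% chance the truth is listable.
print(coverage_with_unlistable([0.4, 0.2, 0.1], 0.8))
```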
All this seems rather artificial, since it assumes the probabilities in the prior are meaningful. But if we’re asking for the probability that we’ve already listed the correct hypothesis, we presumably don’t want to trust the prior. What else can you do, though?
However, getting the correct hypothesis in your list is much less important than getting hypotheses which are good enough to help you make accurate decisions later. That’s why I said value of information seems like the more relevant measurement. It seems like this can’t be estimated without knowing anything about the hypotheses you haven’t listed yet, though.