P(I will win the lottery) = 0.6 is a garbage prediction.

Why? Are there no conceivable lotteries with that probability of winning? (There are: if, say, I bought 3 of the 5 tickets sold, my probability of winning is exactly 0.6.) Is there no evidence that we could see in order to update this prediction? (There is, e.g., the number of tickets sold, the outcomes of past lotteries, etc.) I still don’t understand what standard of “garbage” you’re using.
So, I guess it depends on exactly how far back you want to go when erasing your background knowledge to try to form the concept of a prior. I was assuming you still knew something about the structure of the problem, i.e. that there would be a bunch of tickets sold, that you have only bought one, etc. But you’re right that you could recategorize those as evidence, in which case the proper prior wouldn’t depend on them.
If you take this to the extreme you could say that the prior for every sentence should be the same, because the minimum amount of knowledge you could have about a sentence is just “There is a sentence”. You could then treat all facts about the number of words in the sentence, the instances in which you have observed people using those words, etc. as observations to be updated on.
It is tempting to say that the prior for every sentence should be 0.5 in this case (in which case a “garbage prediction” would just be one that is sufficiently far away from 0.5 on a log-odds scale), but it is not so clear that a “randomly chosen” sentence (whatever that means) has a 0.5 probability of being true. If by a “randomly chosen” sentence we mean the kinds of sentences that people are likely to say, then estimating the probability of such a sentence requires all of the background knowledge that we have, and we are left with the same problem.
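To make “far away from 0.5 on a log-odds scale” concrete, here is a minimal sketch (the logit helper and the example probabilities are made up for illustration):

```python
import math

def logit(p):
    """Map a probability to log-odds: log(p / (1 - p))."""
    return math.log(p / (1 - p))

# logit(0.5) == 0, so distance from 0.5 on a log-odds scale is just
# |logit(p)|; a "garbage prediction" under this proposal would be one
# where |logit(p)| exceeds some threshold.
for p in [0.5, 0.6, 0.99, 1e-9]:
    print(f"p = {p}: |log-odds| = {abs(logit(p)):.2f}")
```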
Maybe all of this is an irrelevant digression. After rereading your previous comments, it occurs to me that maybe I should put it this way: After updating, you have a bunch of people who all have a small probability for “the earth is flat”, but they may have slightly different probabilities due to different genetic predispositions. Are you saying that you don’t think averaging makes sense here? There is no issue with the predictions being garbage; we both agree that they are not garbage. The question is just whether to average them.
I was assuming you still knew something about the structure of the problem, i.e. that there would be a bunch of tickets sold, that you have only bought one, etc.
If you’ve already observed all the possible evidence, then your prediction is not a “prior” any more, in any sense of the word. Also, both the total number of tickets sold and the number of tickets someone bought are variables: if I know that there is a lottery in the real world, I don’t usually know how many tickets they really sold (or will sell), and I’m usually allowed to buy more than one (although it’s hard for me not to know how many I have).
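As a sketch of what treating those quantities as variables looks like (the prior over tickets sold and all of the numbers below are invented for illustration):

```python
import random

random.seed(0)

# Hypothetical lottery model: one winning ticket is drawn uniformly
# from the n tickets sold, and I hold k of them.
def p_win_given_n(k, n):
    return min(k, n) / n

# Invented prior over the unknown total number of tickets sold.
def sample_n():
    return random.randint(1_000, 1_000_000)

# Marginalize over the unknown n: P(win) = E_n[P(win | n)],
# estimated here by Monte Carlo.
samples = [sample_n() for _ in range(100_000)]
p_win = sum(p_win_given_n(1, n) for n in samples) / len(samples)
print(f"P(win | one ticket) ~= {p_win:.2e}")
```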
After updating, you have a bunch of people who all have a small probability for “the earth is flat”, but they may have slightly different probabilities due to different genetic predispositions. Are you saying that you don’t think averaging makes sense here?
I think that Hanson wants to average before updating, although if everyone is a perfect Bayesian and saw the same evidence, then maybe there isn’t a huge difference between averaging before or after the update.
Either way, my position is that averaging is not justified without additional assumptions. Though I’m not saying that averaging is necessarily harmful either.
If you are doing a log-odds average, then it doesn’t matter whether you do it before or after updating: shared evidence adds the same log likelihood ratio to everyone’s log-odds, and averaging is linear, so the two orders give the same answer.
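A quick numeric check of that claim (the helper functions, the individual priors, and the likelihood ratio below are all made up for illustration):

```python
import math

def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1 - p))

def expit(x):
    """Log-odds -> probability."""
    return 1 / (1 + math.exp(-x))

def update(p, likelihood_ratio):
    """Bayes in odds form: posterior odds = prior odds * likelihood ratio."""
    return expit(logit(p) + math.log(likelihood_ratio))

def logodds_average(probs):
    """Average probabilities on the log-odds scale."""
    return expit(sum(logit(p) for p in probs) / len(probs))

# Invented individual priors and a shared likelihood ratio for the evidence.
priors = [0.010, 0.020, 0.005]
lr = 30.0

average_then_update = update(logodds_average(priors), lr)
update_then_average = logodds_average([update(p, lr) for p in priors])

# The two orders agree (up to floating point), because updating adds the
# same constant log(lr) to every log-odds, and averaging is linear.
print(average_then_update, update_then_average)
```

Note that this exact commutation is specific to the log-odds average: an arithmetic average of the probabilities themselves does not, in general, commute with updating, which is why the before-vs-after question matters for other pooling rules.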
Like I pointed out in my previous comment, the question “how much evidence have I observed / taken into account?” is a continuous question with no obvious “minimum” answer. The answer “I know that a bunch of tickets will be sold, and that I will only buy a few” doesn’t seem to me to be a “maximum” answer either, so beliefs based on it seem reasonable to call a “prior”, even if under some framings they are a posterior. Though really it is pointless to talk about what is a prior unless we have some specific set of observations in mind that we want our prior to be prior to.