Besides, isn’t your first point contradicted by the two following ones?
You assign, let’s say, p=0.6 that Theory A is basically right
How do I express my uncertainty about that 0.6 number?
So you have uncertainties about things other than the next frobulator reading, but it would be misleading to describe them as uncertainties about your probability distribution
I don’t know about that. I am uncertain about the next frobulator reading. I’m treating this reading as a random variable arising out of an unobserved process (=some unobserved distribution). This unobserved process/distribution has a set of parameters theta. I am uncertain about these parameters. Would you describe the uncertainty about these parameters as “uncertainties about [my] probability distribution”?
I don’t really believe in “Knightian uncertainty” as a fundamental notion, but in so far as you have it I’m not sure you can properly be said to have a prior at all.
Your “uncertainty about that 0.6 number” is a meaningful notion only when there’s something in (your model of) the world for it to be about. For instance, perhaps your opinion that Theory A is a bit more likely than not is the result of your having read a speculative paper by someone you think is an expert; but if you think there’s a 10% chance she’s a charlatan, maybe it would be useful to represent that as p=0.9 of (65% Theory A, 32% Theory B, 3% neither) plus p=0.1 of some other probability distribution over theories. (If that’s the only impact of learning that the author is or isn’t a charlatan, this doesn’t buy you anything relative to just figuring out the overall probabilities for A, B, and Neither; but e.g. perhaps if the author is a charlatan then your ideas about how things might look if A and B are both wrong will change.)
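To make that mixture concrete, here is a minimal numeric sketch; the distribution for the charlatan branch is an invented placeholder, since nothing above pins it down:

```python
# The two-level belief from the paragraph above, with a made-up distribution
# for the "she's a charlatan" branch (the comment leaves it unspecified).
p_honest = 0.9
honest    = {"A": 0.65, "B": 0.32, "neither": 0.03}
charlatan = {"A": 0.20, "B": 0.20, "neither": 0.60}  # hypothetical placeholder

# Collapsing the mixture yields one ordinary distribution over theories.
marginal = {t: p_honest * honest[t] + (1 - p_honest) * charlatan[t] for t in honest}
print(marginal)  # {'A': 0.605, 'B': 0.308, 'neither': 0.087}

# The structure still earns its keep when you update: learning that the author
# is a charlatan moves you to the second component, not back to the marginal.
print(charlatan)
```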
But your estimate of p=0.6 as such—I think asking for your uncertainty about it is a type error.
(It might be fruitful in practice to put probability distributions on such things—it might be easier and almost as accurate as figuring out all the intricate evidential structures that I’m suggesting are the “real” underpinnings of the kind of uncertainty that makes it feel like a good thing to do. But I think that’s a heuristic technique and I’m not convinced that there’s a way to make it rigorous that doesn’t cash it out in terms of the kind of thing I’ve been describing.)
Would you describe the uncertainty about these parameters as “uncertainties about [my] probability distribution”?
No. I think you’re making a type error again. The unobserved process is, or describes, some physical thing within the world; its parameters, whatever they may be, are facts about the world. You are (of course) uncertain about them; that uncertainty is part of your current probability distribution over ways-the-world-could-be. (You may also be uncertain about whether the actual process is the sort you think it is; again, that’s represented by your probability distribution over how the world is.)
None of that involves making your probability assignment apply to itself.
Now, having said all that: you are part of the world, and you may in fact be uncertain about various aspects of your mind, including your probability assignments. So if you are trying to predict your own future behaviour or something, then for that purpose you may want to introduce something like uncertainty about your probability distribution. But I think you shouldn’t identify your model of your probability distribution, as here, with the probability distribution you’re using for calculation, as in the previous paragraphs. (In particular, I suspect that assuming they match may lead you into inconsistencies.)
Let me express my approach in a slightly different way.
Let’s say I have a sample of some numbers and I’m interested in the properties of future numbers coming out of the same underlying process.
The simplest approach (say, Level 1) is to have a point estimate. Here is my expected value for the future numbers.
But wait! There is uncertainty. At Level 2 I specify a distribution, say, a Gaussian with a particular mean and standard deviation (note that it implies e.g. very specific “hard” probabilities of seeing particular future numbers).
But wait! There is more uncertainty! At Level 3 I specify that the mean of that Gaussian is actually uncertain, too, and has a standard error—in effect it is a distribution (meaning your “hard” probabilities from the previous level just became “soft”). And the variance is uncertain, too, and has parameters of its own.
But wait! You can dive deeper and find yet more turtles down there.
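Here is a minimal sketch of those three levels in code; the sample is made up, and the Student-t predictive is the standard result you get once both the mean and the variance are treated as uncertain:

```python
import numpy as np
from scipy import stats

data = np.array([9.8, 10.4, 10.1, 9.5, 10.7, 10.2])  # a made-up sample
n, mean, sd = len(data), data.mean(), data.std(ddof=1)

# Level 1: a single point estimate for future numbers.
print("point estimate:", mean)

# Level 2: a Gaussian with that mean and sd, giving "hard" probabilities.
print("P(next > 11), Level 2:", 1 - stats.norm.cdf(11, loc=mean, scale=sd))

# Level 3: the mean is itself uncertain (standard error sd/sqrt(n)), and the
# variance is uncertain too; integrating both out gives a Student-t predictive
# with fatter tails, so the Level 2 "hard" probabilities go "soft".
predictive = stats.t(df=n - 1, loc=mean, scale=sd * np.sqrt(1 + 1 / n))
print("P(next > 11), Level 3:", 1 - predictive.cdf(11))
```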
but in so far as you have it I’m not sure you can properly be said to have a prior at all.
I have an uncertain prior. I find that notion intuitive; it seems that you don’t.
Your “uncertainty about that 0.6 number” is a meaningful notion only when there’s something in (your model of) the world for it to be about.
It is uncertainty about the probability that Theory A is correct. I find the idea of “uncertainty about the probability” meaningful and useful.
I think that in a large number of cases you just do not have enough data for “figuring out all the intricate evidential structures” and the “heuristic technique” is all you can do. As for being rigorous, I’ll be happy if in the limit it converges to the right values.
that’s represented by your probability distribution over how the world is
But I don’t have one. I’m not Omega—the world is too large for me to have a probability distribution over it. I’m building models all of which are wrong but some of which are useful (hat tip to George Box). Is it useful to me to have multilayered models which involve probabilities of probabilities?
I think we are basically talking about whether to collapse all the meta-levels into one (your and Anders_H’s position) or not collapse them (my position).
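A coin-flipping toy case (my own stand-in, with invented Beta numbers, not anything either of us proposed) shows what keeping versus collapsing the extra level amounts to:

```python
from scipy import stats

# Two different "uncertain priors" about a coin's bias that collapse to the
# same single number for the next flip.
vague, confident = stats.beta(2, 2), stats.beta(20, 20)
print(vague.mean(), confident.mean())  # both 0.5 once collapsed

# Yet the same evidence (8 heads, 2 tails) moves them very differently,
# which is exactly the information the collapsed version discards.
heads, tails = 8, 2
print(stats.beta(2 + heads, 2 + tails).mean())    # ~0.71
print(stats.beta(20 + heads, 20 + tails).mean())  # ~0.56
```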
Including my Knightian uncertainty?