[Question] Infinite tower of meta-probability

Suppose that I have a coin with probability of heads $p$. I know for certain that $p$ is fixed and does not change as I toss the coin. I would like to express my degree of belief about $p$ and then update it as I toss the coin.
Using a constant pdf to model my initial belief, the problem becomes a classic one, and it turns out that after observing $h$ heads out of $n$ tosses my belief about $p$ should be expressed with the Beta$(h+1,\,n-h+1)$ pdf $f(x) = (n+1)\binom{n}{h} x^h (1-x)^{n-h}$. That's fine.
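As a sanity check on that formula, here is a short plain-Python sketch (the particular values of $n$, $h$, and the grid size are my own choices): the density should integrate to $1$, and its mean should be $(h+1)/(n+2)$, Laplace's rule of succession.

```python
from math import comb

def posterior_pdf(x, n, h):
    # Beta(h + 1, n - h + 1) density: the posterior over p after observing
    # h heads in n tosses, starting from a uniform (Beta(1, 1)) prior.
    return (n + 1) * comb(n, h) * x**h * (1 - x) ** (n - h)

n, h, k = 10, 7, 100_000
xs = [(i + 0.5) / k for i in range(k)]  # midpoint grid on [0, 1]

# The density integrates to 1 (midpoint rule).
total = sum(posterior_pdf(x, n, h) for x in xs) / k  # ≈ 1.0

# The posterior mean is (h + 1) / (n + 2), Laplace's rule of succession.
mean = sum(x * posterior_pdf(x, n, h) for x in xs) / k  # ≈ 8/12
```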
But let's say I'm a super-skeptic who avoids accepting any statement with certainty, and I am aware of the issue of parametrization dependence as well. So I dislike this solution and instead choose to attach beliefs to statements of the form $S(f) =$ "my initial degree of belief is represented by the probability density function $f$."
Strictly speaking this is not possible, since the set of all such $f$ is uncountable. However, something similar to the probability-density trick we use for continuous variables should do the job here as well. After observing some heads and tails, each initial belief function is updated just as before, which creates a new, uneven "density" over the statements $S(f)$. When I want to express my belief that $p$ lies between two numbers $a$ and $b$, I now have a probability density function instead of a single definite number: a collection of the definite numbers produced by each (updated) prior. I can use the mean of this density to express my guess, and I can even be skeptical about my own belief!
This first meta level is still somewhat manageable: I computed $\operatorname{Var}(\mu) = 1/12$ for the initial uniform density over $S(f)$, where $\mu$ is the mean of a particular $f$. I am not sure whether my approach is correct, though. Since the domain of each $f$ is a bounded interval, I discretize it and represent the uniform density over $S(f)$ as a finite collection of continuous random variables whose joint density is constant, and then take the limit as the discretization becomes infinitely fine.
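To probe whether the $1/12$ figure survives one particular formalization, here is a small Monte Carlo sketch. Note the assumption baked in: I take "uniform over densities on a $k$-point grid" to mean Dirichlet$(1,\dots,1)$, i.e. a flat density on the probability simplex, which need not coincide with the joint-constant construction described above. Under this choice $\operatorname{Var}(\mu)$ comes out near $1/(12(k+1))$ rather than $1/12$, which suggests the answer is sensitive to exactly how the limit is taken.

```python
import random

random.seed(0)

def random_density(k):
    # Dirichlet(1, ..., 1): normalizing iid Exponential(1) weights gives a
    # uniform draw from the probability simplex over k grid points.
    # (This is one possible reading of "uniform over densities", not the
    # only one.)
    w = [random.expovariate(1.0) for _ in range(k)]
    s = sum(w)
    return [v / s for v in w]

k, trials = 10, 100_000
xs = [(i + 0.5) / k for i in range(k)]  # midpoints of the k bins

mus = []
for _ in range(trials):
    w = random_density(k)
    mus.append(sum(x * wi for x, wi in zip(xs, w)))  # mean of this density

m = sum(mus) / trials
var = sum((mu - m) ** 2 for mu in mus) / trials
# By symmetry E[mu] = 1/2, but under this formalization Var(mu) is roughly
# 1 / (12 * (k + 1)) -- it shrinks with the discretization level k.
```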
The whole thing may not make sense at all. I'm just curious what would happen if we used even deeper meta levels, with the outermost level being the uniform "thing". Does anyone know of mathematical literature that has already explored something like this idea, perhaps the use of probability theory in higher-order logics?
Edit 1:
Let me rephrase my question more formally so that everything becomes clearer.
Let $S_1 = (\Omega_1, E_1, P_1)$ be our first probability space, where $\Omega_1$ is the sample space coming from our original problem, $E_1$ is a $\sigma$-algebra of events over $\Omega_1$, and $P_1$ is the probability measure.
First of all, for full generality let us choose $E_i = 2^{\Omega_i}$ for all $i$; that is, the set of all subsets of the sample space is our event set. Such an $E_i$ is always a $\sigma$-algebra for any $\Omega_i$.
Now let me define $\Omega_{i+1}$ to be the set of all possible probability measures $P \colon 2^{\Omega_i} \to [0,1]$, for all $i$. Note that $\Omega_{i+1}$ depends only on $\Omega_i$.
Let $S_n = (\Omega_n, 2^{\Omega_n}, P_n)$ be the $n$th probability space, where $\Omega_n$ is constructed ultimately from $\Omega_1$. The final missing ingredient is $P_n$; we would like it to be a "uniform" probability measure in some sense.
After we invent some nice "uniform" $P_n$, I plan to use the construct $\{S_i\}_{i=1}^{n}$ as follows. An event $e_n \in 2^{\Omega_n}$ occurs with probability $P_n(e_n)$; such an event is itself a set of probability measures, all belonging to the $(n-1)$st level. We then use each of these measures to create a set of probability spaces: $\{(\Omega_{n-1}, 2^{\Omega_{n-1}}, P) \mid P \in e_n\}$.
Then, for each of these spaces, an event $e_{n-1}$ occurs with probability determined by the probability measure of that space, and so on. This creates a tree whose leaves are elements of $2^{\Omega_1}$, the events of our original problem.
Now, the same element of $2^{\Omega_1}$ can appear more than once among the leaves of this tree, so to compute the total probability that an event $e_1 \in 2^{\Omega_1}$ occurs, we should add up the probabilities of all paths leading to it. The depth of the tree is finite, but the number of branches spawned at each level may be uncountable, which seems to be a dead end for our journey.
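Once everything is discretized to finite sets, the path-summing computation above collapses, by the law of total probability, into a single ordinary mixture measure. A minimal two-level sketch (the coin sample space and the finite grid of measures are my own toy stand-ins for the uncountable $\Omega_2$):

```python
# Omega_1 = {H, T}; a probability measure on it is determined by p = P(H).
k = 1000
grid = [(i + 0.5) / k for i in range(k)]  # finite stand-in for Omega_2

# "Uniform" P_2: each level-2 measure gets weight 1/k.  The total
# probability of the leaf event e_1 = {H} adds up the probabilities of
# all root-to-leaf paths, i.e. the law of total probability:
p_heads = sum((1 / k) * p for p in grid)
# -> 0.5: the two-level tower is equivalent to one mixture measure on Omega_1.
```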
Additional constraints may mitigate this problem, which I plan to explore in a later edit.