There’s definitely some literature about “probability of probability” (I remember one bit from Jaynes’ book). Usually when people try to go turbo-meta with this, they do something a little different from what you’re doing, and just ask for “probability of probability of probability”, i.e. only for the meta-meta-distribution of the value of the meta-distribution (or density function) at its object-level value.
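In symbols (my own notation, just to pin down what I mean; I’m not claiming this is how Jaynes or anyone else writes it): let p be the object-level probability of heads and f the meta-density over p. Then the turbo-meta question is something like

```latex
% Illustrative notation only: p is the object-level probability of heads,
% f is the meta-density over p, and p_0 is the particular value of interest.
% The "turbo-meta" move asks for a distribution over the single number f(p_0),
% not for a distribution over the whole function f.
\Pr\big( f(p_0) \in A \big) \quad \text{for sets } A \subseteq [0, \infty),
\qquad \text{rather than a law over the entire function } f.
```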
Unsure if that’s in Jaynes too.
The connection to logic seems questionable, because it’s hard to make logic and probability play nicely together formally (the intro to the Logical Inductors paper may have good references for complaints about this).
Philosophically I think that there’s something fishy going on here, and that calling something a “distribution over probabilities” is misleading. You have probability distributions when you’re ignorant of something. But you’re not actually ignorant about what probability you’d assign to the next flip being heads (or at least, not under Bayesian assumptions of infinite computational power).
Instead, the thing you’re putting a meta-probability distribution over has to be something else that looks like your Bayesian probability but can be made distinct, like “long-run frequency if I flip the coin 10,000 times” or “correct value of some parameter in my physical model of the coin.” It’s very common for us to want to put probability distributions over these kinds of things, and so “meta-probabilities” are common.
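To make that concrete, here’s a toy sketch (my own illustrative example, with made-up numbers): a Beta meta-distribution over the coin’s long-run frequency, with the ordinary Bayesian probability of heads recovered as the mean of that meta-distribution.

```python
# Toy sketch: a Beta "meta-distribution" over theta, the coin's long-run
# frequency, with the Bayesian probability of heads on the next flip
# recovered as the mean of that meta-distribution.
from scipy import stats

# Say we started from a uniform Beta(1, 1) prior and then saw 3 heads, 1 tail.
heads, tails = 3, 1
meta = stats.beta(1 + heads, 1 + tails)  # meta-distribution over theta

# The object-level probability of heads on the next flip is just the mean of theta.
p_heads = meta.mean()  # (1 + heads) / (2 + heads + tails) = 2/3

# The meta-distribution carries information the single number p_heads doesn't,
# e.g. how spread out our beliefs about the long-run frequency are.
print(f"P(next flip is heads) = {p_heads:.3f}")
print(f"90% interval for theta: {meta.interval(0.9)}")
```

The point being that the thing under the Beta isn’t “my probability”; it’s the long-run frequency, and the probability of heads is what you get after integrating it out.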
And then your meta-meta-probability has to be about something distinct from the meta-probability! But now I’m sort of scratching my head about what that something is. Maybe “correct value of some parameter in a model of my reasoning about a physical model of the coin?”