Anyhow, the question is, why is throwing out non-90% models and trying again going to make [the probability that we assign true to P(x) using this random process] behave like we want the probability that P(x) is true to behave?
We can answer this with an analogy to updating on new information. If we have a probability distribution over models, and we learn that the correct model says that 90% of P(x) are true in some domain, what we do is zero out the probability of all models where that's false, and normalize the remaining probabilities to get our new distribution. All this "output of the random process" stuff is really just describing a process that has some probability of outputting different models (that is, Abram's process outputs a model drawn from some distribution, and then we take the probability that P(x) is true to be the probability that the output of Abram's process assigns true to P(x)).
So the way you do updating is you zero out the probability that this process outputs a model where the conditioned-upon information is false, and then you normalize the outputs so that the process outputs one of the remaining models with the same relative frequencies. This is the same behavior as updating a probability distribution.
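To make the equivalence concrete, here's a minimal Python sketch. The models, their names, and the uniform prior are all made up for illustration (this is not Abram's actual construction); the point is just that zeroing-out-and-renormalizing gives the same relative frequencies as drawing a model and retrying on rejection:

```python
import random
from collections import Counter

# Toy "models": each one is summarized here by the fraction of P(x)
# instances it makes true. (Illustrative only.)
models = {
    "model_A": 0.90,   # a 90% model
    "model_B": 0.50,
    "model_C": 0.90,   # another 90% model
    "model_D": 0.10,
}
prior = {m: 0.25 for m in models}  # uniform prior over models

# Explicit update: zero out models inconsistent with "90% of P(x) are
# true", then renormalize the survivors.
posterior = {m: p for m, p in prior.items() if models[m] == 0.90}
total = sum(posterior.values())
posterior = {m: p / total for m, p in posterior.items()}
print(posterior)  # {'model_A': 0.5, 'model_C': 0.5}

# Rejection-sampling version of the same update: draw a model from the
# prior; if it isn't a 90% model, throw it out and try again.
def sample_conditioned():
    while True:
        m = random.choices(list(prior), weights=list(prior.values()))[0]
        if models[m] == 0.90:
            return m

counts = Counter(sample_conditioned() for _ in range(10_000))
print(counts)  # surviving models come out with the same relative frequencies
```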
--
One thing I think you might mean by “we want conditioning to be conditioning” is that you don’t want to store a (literal or effective) distribution over models, and then condition by updating that distribution and recalculating the probability of a statement. You want to store the probability of statements, and condition by doing something to that probability. Like, P(A|B) = P(AB)/P(B).
I like the aesthetics of that too—my first suggestion for logical probability was based on storing probabilities of statements, after all. But to make things behave at all correctly, you need more than just that; you also need to be able to talk about correlations between probabilities. The easiest way to represent that? Truth tables.
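For example, here's a small sketch (with made-up numbers) of what a weighted truth table buys you: the joint entries carry the correlation between A and B, and conditioning is just P(A|B) = P(AB)/P(B) computed from them:

```python
# A joint distribution over truth assignments to two statements A and B,
# i.e. a weighted truth table. The weights are invented for illustration;
# note A and B are correlated (A is much more likely when B is true).
joint = {
    (True,  True):  0.40,
    (True,  False): 0.10,
    (False, True):  0.10,
    (False, False): 0.40,
}

def prob(event):
    """Total weight of the truth-table rows where `event` holds."""
    return sum(p for row, p in joint.items() if event(row))

p_b  = prob(lambda row: row[1])             # P(B)  = 0.5
p_ab = prob(lambda row: row[0] and row[1])  # P(AB) = 0.4
print(p_ab / p_b)  # P(A|B) = 0.8, versus P(A) = 0.5 unconditionally
```

Storing only P(A) and P(B) separately would lose exactly the information that makes P(A|B) differ from P(A).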
Yeah, updating probability distributions over models is believed to be good. The problem is, sometimes our probability distributions over models are wrong, as demonstrated by bad behavior when we update on certain information.
The kind of data that would make you want to zero out non-90% models is when you observe a bunch of random data points and 90% of them are true, but there are no other patterns you can detect.
The other problem is that updates can be hard to compute.