For the first part about “A_p being a formal maneuver”—I don’t disagree with the comment as stated, nor with what Jaynes did in a technical sense. But I’m trying to imbue the proposition with a “physical interpretation” when I identify it with an infinite collection of evidences. There is a subtlety in my original statement that I didn’t expand on but have been thinking about ever since I read the post: “infinitude” is probably best understood as a relative term. Maybe the simplest way to think about this is that, as I understand it, if you condition on two A_p propositions at the same time, you get a “do not compute”—not zero, but “do not compute”. So the A_p proposition only seems to make sense with respect to some subset {E} of all possible propositions. I interpret this subset as containing the propositions of “finite” evidence, while the A_p’s (and other propositions) somehow stand outside this finite-evidence class. There is also the matter that, in day-to-day life, it doesn’t really seem possible to encounter what looks to me like a “single” piece of evidence with the dramatic effect of rendering our beliefs “deterministically indeterministic”. Can we really learn something that tells us there is no more to learn?
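One way to make the “do not compute” precise (this is my gloss, assuming Jaynes’s defining property $P(A \mid A_p, E) = p$ for any admissible $E$): for $p \neq q$, the propositions $A_p$ and $A_q$ are mutually exclusive, so their conjunction is a contradiction, and

$$P(A \mid A_p A_q)$$

comes out not as zero but as undefined, since conditioning on a false proposition falls outside the rules of probability theory.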
Yes, I suspect that there is a typo there, though I’m a bit too lazy to check against the original text. It should be that the probability density over the A_p is normalized and that its expectation is the probability of A.
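Spelling that out (my reconstruction of the intended equations, if I’m remembering Jaynes’s setup right): writing the density as $f(p) \equiv p(A_p \mid E)$, the two conditions are

$$\int_0^1 f(p)\,dp = 1, \qquad P(A \mid E) = \int_0^1 p\, f(p)\,dp.$$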
This idea of compressing all the information in E that is relevant to A into the object p(A_p|E) is interesting, and indeed it’s perhaps a better articulation of what I find interesting about the A_p distribution than what is conveyed in the main body of the original post. One thread I want(ed) to tug on a little further is that the A_p distribution seems to lend itself well to the first steps toward something like a dynamical model of probability theory: when you encounter a piece of evidence E, its first-order effect is to change your probability of A, but its second- and nth-order effects are to change your distribution over what future evidence you expect to encounter and how you “interpret” those pieces of evidence—where by “interpret” I mean in what way encountering a given piece of evidence shifts your probability of A. This dynamical theory of probability would have folk theorems like “the variance of your A_p distribution must monotonically decrease over time”. These are shower thoughts. A minimal numerical sketch of this dynamics follows below.
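Here is that sketch, assuming the simplest possible setting where evidence arrives as conditionally independent Bernoulli-like observations of A with known likelihoods; the grid discretization and every name in it are mine, purely for illustration, not anything from Jaynes:

```python
import numpy as np

# Discretize the A_p density p(A_p | E) on a grid over [0, 1].
grid = np.linspace(0.005, 0.995, 199)
density = np.ones_like(grid)            # start from a flat prior
density /= np.trapz(density, grid)      # normalize: integral of f(p) dp = 1

def prob_A(density):
    """P(A | E) is the mean of the A_p density: integral of p * f(p) dp."""
    return np.trapz(grid * density, grid)

def variance(density):
    """Spread of the A_p density: how 'settled' the belief about A is."""
    m = prob_A(density)
    return np.trapz((grid - m) ** 2 * density, grid)

def update(density, A_observed: bool):
    """Bayes update of the A_p density on one Bernoulli-like observation:
    the likelihood of the observation is p if A occurred, (1 - p) otherwise."""
    likelihood = grid if A_observed else (1.0 - grid)
    posterior = density * likelihood
    return posterior / np.trapz(posterior, grid)

# Each observation moves P(A) (the first-order effect) and reshapes the
# whole density, which changes how future evidence will be interpreted
# (the higher-order effect). In this i.i.d. setting the variance tends to
# shrink, but the folk theorem fails in general: a single surprising
# observation can widen the density again.
for obs in [True, True, False, True, True]:
    density = update(density, obs)
    print(f"P(A) = {prob_A(density):.3f}, var = {variance(density):.4f}")
```

Even in this toy setting, the “monotone variance” folk theorem only holds on average, which maybe says something about what form such theorems would have to take.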
And it’s also interesting, perhaps in a more applied/agentic sense, in that we often casually talk about “updating” our beliefs, but what does that actually look like in practice? Empirically, we see that we can have evidence in our heads that we fail to process (lack of logical omniscience). Maybe something like the A_p distribution could be helpful for understanding this better.