I do not understand how Jeffrey updates lead to path dependence. Is the trick that my probabilities can change without evidence, so I can just update B without observing anything that also updates A, and then use that for hocus pocus? Writing that out, I think that’s probably it, but as I was reading the essay I wasn’t sure where the key step was happening.
hmmmm. My attempt at an English translation of my example:
A and B are correlated, so moving B to 60% (up from 50%) makes A more probable as well. But then moving A up to 60% is less of a move for A. This means that (A&¬B) ends up smaller than (B&¬A): both get dragged up and then down, but (B&¬A) was dragged up by the larger update and down by the smaller.
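A minimal sketch of this in Python, under assumptions not stated above: I picked a concrete correlated prior (P(A)=P(B)=50%, P(A&B)=30%) and applied the two 60% updates in both orders to show the asymmetry.

```python
# Jeffrey updates on a joint distribution over two correlated propositions
# A and B, applied in both orders, to illustrate path dependence.
# Keys: "AB" = A&B, "Ab" = A&¬B, "aB" = ¬A&B, "ab" = ¬A&¬B.

def jeffrey_update_B(joint, new_pB):
    """Rescale B-worlds and ¬B-worlds so P(B) = new_pB, keeping the
    conditional distributions given B and given ¬B fixed."""
    pB = joint["AB"] + joint["aB"]
    return {
        "AB": joint["AB"] * new_pB / pB,
        "aB": joint["aB"] * new_pB / pB,
        "Ab": joint["Ab"] * (1 - new_pB) / (1 - pB),
        "ab": joint["ab"] * (1 - new_pB) / (1 - pB),
    }

def jeffrey_update_A(joint, new_pA):
    """Same operation, but shifting P(A) to new_pA."""
    pA = joint["AB"] + joint["Ab"]
    return {
        "AB": joint["AB"] * new_pA / pA,
        "Ab": joint["Ab"] * new_pA / pA,
        "aB": joint["aB"] * (1 - new_pA) / (1 - pA),
        "ab": joint["ab"] * (1 - new_pA) / (1 - pA),
    }

# Assumed correlated prior: P(A) = P(B) = 0.5, but P(A&B) = 0.3 > 0.25.
prior = {"AB": 0.3, "Ab": 0.2, "aB": 0.2, "ab": 0.3}

# Order 1: move B to 60% first, then A to 60%.
b_first = jeffrey_update_A(jeffrey_update_B(prior, 0.6), 0.6)
# Order 2: move A to 60% first, then B to 60%.
a_first = jeffrey_update_B(jeffrey_update_A(prior, 0.6), 0.6)

print(b_first["Ab"], b_first["aB"])  # ≈ 0.185 vs 0.200: A&¬B smaller
print(a_first["Ab"], a_first["aB"])  # ≈ 0.200 vs 0.185: mirror image
```

Updating B first raises P(A) to 52%, so the later move to 60% is the smaller rescaling, and (A&¬B) ends up below (B&¬A); reversing the order mirrors the result, so the final joint depends on the path.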
Okay, I got tired and skipped most of the virtual evidence section (it got tough for me). You say “Exchange Virtual Evidence” and I would be interested in a concrete example of what that kind of conversation would look like.
It would be nice to write a whole post on this, but the first thing you need to do is distinguish between likelihoods and probabilities.
likelihood(A|B)=probability(B|A)
The notation may look pointless at first. The main usage has to do with the way we usually regard the first argument as variable and the second as fixed. I.e., “a probability function sums to one” can be understood as P(A|B)+P(¬A|B)=1; we more readily think of A as variable here. In a Bayesian update, we vary the hypothesis, not the evidence, so it’s more natural to think in terms of a likelihood function, L(H|E).
In a Bayesian network, you propagate probability functions down links, and likelihood functions up links. Hence Pearl distinguished between the two strongly.
Likelihood functions don’t sum to 1. Think of them as fragments of belief which aren’t meaningful on their own until they’re combined with a probability.
Base-rate neglect can be thought of as confusion of likelihood for probability. The conjunction fallacy could also be explained in this way.
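A small numerical sketch of both points, with invented numbers (a rare condition and a moderately good test) chosen purely for illustration:

```python
# Likelihoods vs probabilities: the likelihood of each hypothesis given a
# positive test is P(positive | hypothesis). Across hypotheses these need
# not sum to 1; they only become probabilities again once combined with
# a prior. (All numbers are made up for illustration.)

prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.9, "healthy": 0.05}  # P(positive | hypothesis)

print(sum(likelihood.values()))  # 0.95: not a probability distribution

# Combine with the prior (Bayes), then normalize:
unnormalized = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnormalized.values())
posterior = {h: unnormalized[h] / total for h in unnormalized}

# Base-rate neglect = reading the likelihood 0.9 as if it were the
# posterior probability of disease; the actual posterior is ~0.15,
# because the disease is rare.
print(posterior["disease"])
```

The likelihoods here are the “fragments of belief”: 0.9 for disease means nothing on its own until it is weighed against the 1% base rate.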
I wish it were feasible to get people to use “likely” vs “probable” in this way. Sadly, that’s unprobable to work.
I’m imagining it’s something like “I thought for ages and changed my mind, let me tell you why”.
What I’m pointing at is really much more outside-view than that. Standard warnings about outside view apply. ;p
An example of exchanging probabilities is: I assert X, and another person agrees. I now know that they assign a high probability to X. But that does not tell me very much about how to update.
Exchanging likelihoods instead: I assert X, and the other person tells me they already thought that for unrelated reasons. This tells me that their agreement is further evidence for X, and I should update up.
Or, a different possibility: I assert X, and the other person updates to X, and tells me so. This doesn’t provide me with further evidence in favor of X, except insofar as they acted as a proof-checker for my argument.
“Exchange virtual evidence” just means “communicate likelihoods” (or just likelihood ratios!)
Exchanging likelihoods is better than exchanging probabilities, because likelihoods are much easier to update on.
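A sketch of why they’re easier to update on, under a strong independence assumption I’m adding for illustration (each person’s evidence is independent given X): reported likelihood ratios just multiply onto my prior odds, with no need to guess how much of their evidence overlaps mine. The numbers are invented.

```python
# Combining reported likelihood ratios for X. If each person's evidence
# is independent given X (a strong assumption), posterior odds are just
# prior odds times each reported likelihood ratio.

def update_odds(prior_odds, likelihood_ratios):
    """Multiply each reported likelihood ratio onto the prior odds."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds

prior_odds = 1.0       # I start at 50/50 on X.
reports = [3.0, 2.0]   # two people each report evidence favoring X

posterior_odds = update_odds(prior_odds, reports)          # 6:1
posterior_prob = posterior_odds / (1 + posterior_odds)     # ≈ 0.857
print(posterior_prob)
```

Compare exchanging probabilities: if both people just told me “I’m at 75% on X”, I couldn’t multiply anything, because their 75% already bakes in priors and possibly shared evidence.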
Granted, exchanging models is much better than either of those two ;3 However, it’s not always feasible. There are the quick conversational examples like I gave, where someone may just want to express their epistemic state wrt what you just said without significantly interrupting the flow of conversation. But we could also be in a position where we’re trying to integrate many expert opinions in a forecasting-like setting. If we can’t build a coherent model to fit all the information together, virtual evidence is probable to be one of the more practical and effective ways to go.
Thank you, they were all helpful. I’ll write more if I have more questions.
(“sadly that’s unprobable to work” lol)