Person A and B hold a belief about proposition X.
Person A has purposively sought out, and updated on, evidence related to X since childhood.
Person B has sat on her couch and played video games.
Yet both A and B have arrived at the same degree-of-belief in proposition X.
Does the Bayesian framework equip its adherents with an adequate account of how Person A should be more confident in her conclusion than Person B?
The only viable answer I can think of is that every reasoner should multiply every conclusion by some measure of epistemic confidence, and re-normalize. But I have not yet encountered such a pervasive account of confidence-measurement from leading Bayesian theorists.
If X is just a binary proposition that can be true or false once and for all, and A and B have arrived at the same degree-of-belief, they are equally confident. A has updated on evidence related to X since childhood, and found that it’s perfectly balanced in either direction. The only way A can be said to be “more confident” than B is that A has seen a lot of evidence already, so she won’t update her conclusion upon seeing the same evidence again; on the other hand, all evidence is new to B.
Things get more interesting if X is some sort of random variable. Let’s say we have a bag of black and white marbles. A has seen people draw from the bag 100 times, and 50 of them ended up with white marbles. B only knows the general idea. Now, both of them expect a white marble to come up with 50% probability. But actually, they each have a probability distribution on the fraction of white marbles in the bag. The mean is 1⁄2 for both of them, but the distribution is flat for B, and has a sharp peak at 1⁄2 for A. This is what determines how confident they are. If C comes along and says “well, I drew a white marble”, then B will update to a new distribution, with mean 2⁄3, but A’s distribution will barely shift at all.
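The marble example can be checked numerically. The following is only a sketch under assumptions the comment above does not state: both A and B start from a flat Beta(1, 1) prior over the fraction of white marbles, and draws are treated as independent (as if made with replacement). The helper names `beta_mean` and `observe` are made up for this illustration.

```python
from fractions import Fraction

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) belief about the fraction of white marbles."""
    return Fraction(alpha, alpha + beta)

def observe(alpha, beta, white=0, black=0):
    """Conjugate update: each white draw adds 1 to alpha, each black draw adds 1 to beta."""
    return alpha + white, beta + black

# Assumption: everyone starts from the flat prior Beta(1, 1).
prior = (1, 1)
b_belief = prior                                 # B has seen no draws
a_belief = observe(*prior, white=50, black=50)   # A has seen 50 white, 50 black

# Both predict "white" with the same probability.
print(beta_mean(*a_belief), beta_mean(*b_belief))   # 1/2 1/2

# C reports one more white marble; the two beliefs move very differently.
b_after = observe(*b_belief, white=1)
a_after = observe(*a_belief, white=1)
print(beta_mean(*b_after))   # 2/3
print(beta_mean(*a_after))   # 52/103
```

Under these assumptions B's mean jumps from 1/2 to 2/3, while A's moves only from 1/2 to 52/103 (about 0.505), which is the "barely shifts" behavior described above.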
The example of stochastic evidence is indeed interesting. But I find myself stuck on the first example.
If a new reasoner C were to update Pc(X) based on the testimony of A, and had an extremely high degree of confidence in her ability to generate correct opinions, he would presumably strongly gravitate towards Pa(X).
Alternatively, suppose C is going to update Pc(X) based on the testimony of B. Further, C has evidence outlining B’s apathetic proclivities. Therefore, he would presumably only weakly gravitate towards Pb(X).
The above account may be shown to be confused. But if it is not, why can C update based on evidence of informed belief, while A and B are precluded from similarly reflecting on their own testimony? Or, if such introspection is permissible, should they not strive to perform it consistently?
They essentially have already updated on their own testimony.
Okay. I’m assuming everyone has the same prior. I’m going to start by comparing the case where C talks to A and learns everything A knows, to the case where C talks to B and learns everything B knows; that is, when C ends up conditioning on all the same things. If you already see why those two cases are very different, you can skip down to the second section, where I talk about what this implies about how C updates when just hearing that A knows a lot and what Pa(X) is, compared to how he updates when learning what B thinks. It’s the same scenario as you described: knowledgeable A, ignorant B, Pa(X) = Pb(X).
What happens when C learns everything B knows depends on what evidence C already has. If C knows nothing, then after talking to B, Pc(X) = Pb(X), because he’ll be conditioning on exactly the same things.
In other words, if C knows nothing, then C is even more ignorant than B is. When he talks to B, he becomes exactly as ignorant as B is, and assigns the probability that you have in that state of ignorance.
It’s only if C already has some evidence that talking to A and talking to B become different. As Kindly said, Pa(X) is very stable. So once C learns everything that A knows, C ends up with the probability Pa(X|whatever C knew), which is probably a lot like Pa(X). To take an extreme case, if A is well-informed enough, then she already knows everything C knows, and Pa(X|whatever C knew) is equal to Pa(X), and C comes out with exactly the same probability as A. But if C’s info is new to A, then it’s probably a lot like telling your biochemistry professor about a study that you read weighing in on one side of a debate: she’s seen plenty of evidence for both sides, and unless this new study is particularly conclusive, it’s not going to change her mind a whole lot.
However, B’s probability is not stable. That biochemistry study might change B’s mind a lot, because for all she knows, there isn’t even a debate, and she has this pretty good evidence for one side of it. So, once C talks to B and learns everything B knows, C will be using the probability that incorporates all of B’s knowledge, plus his own: Pb(X|whatever C knew). This is probably farther from Pb(X) aka Pa(X) than Pa(X|whatever C knew).
This is just how it would typically go. I say A’s probability is more “stable”, but there’s actually some evidence that A would recognize as extremely significant that would mean nothing to B. In this case, once C has learned everything A knows, he would also recognize the significance of the little bit of knowledge that he came in with, and end up with a probability far different from Pa(X).
So that’s how it would probably go if C actually sits down and learns everything they know. Now, what if C just knows that A is knowledgeable, and what Pa(X) is? Well, suppose that C is convinced by my reasoning that, if he sat down with A and learned everything she knew, then his probability of X would end up pretty close to Pa(X).
Here’s the key thing: If C expects that, then his probability is already pretty close to Pa(X). All C knows is that A is knowledgeable and what Pa(X) is, but if he expects to be convinced after learning everything A knows, then he already is convinced.
For any event Q, P(X) is equal to the expected value of P(X|the outcome of Q). That is, you don’t know the outcome of Q, but if there are N mutually exclusive and exhaustive possible outcomes O_1… O_N, then P(X) = P(X|O_1)P(O_1) + … + P(X|O_N)P(O_N). This is one way of stating Conservation of Probability. If the expected value of Pc(X|the outcome of learning everything A knows) is pretty close to Pa(X), then, well, Pc(X) must be pretty close too, because the expected value of Pc(X|the outcome of learning everything A knows) is equal to Pc(X).
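The identity above (sometimes called conservation of expected evidence) is easy to verify on a toy case. This is only a sketch: the numbers are hypothetical, Q is assumed to have just two outcomes (O and not-O), and the function name `expected_posterior` is invented for the illustration.

```python
from fractions import Fraction

def expected_posterior(prior_x, p_o_given_x, p_o_given_not_x):
    """For a two-outcome event, return (P(O), P(X|O), P(X|not O), expected posterior).

    Assumes 0 < P(O) < 1 so that both conditional probabilities are defined.
    """
    p_o = p_o_given_x * prior_x + p_o_given_not_x * (1 - prior_x)
    post_if_o = p_o_given_x * prior_x / p_o                  # Bayes' rule given O
    post_if_not_o = (1 - p_o_given_x) * prior_x / (1 - p_o)  # Bayes' rule given not-O
    # Weight each possible posterior by how likely that outcome is.
    expected = post_if_o * p_o + post_if_not_o * (1 - p_o)
    return p_o, post_if_o, post_if_not_o, expected

# Hypothetical numbers: P(X) = 3/10, P(O|X) = 4/5, P(O|not X) = 1/5.
prior_x = Fraction(3, 10)
p_o, post_o, post_not_o, expected = expected_posterior(
    prior_x, Fraction(4, 5), Fraction(1, 5)
)
print(post_o, post_not_o)   # 12/19 3/31
print(expected)             # 3/10, equal to the prior
```

Each individual outcome moves the probability (up to 12/19 or down to 3/31), but the probability-weighted average of the posteriors lands exactly back on the prior. That is why C cannot expect to end up near Pa(X) after learning everything A knows without already being near Pa(X).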
Likewise, if C learns about B’s knowledge and Pb(X), and he doesn’t think that learning everything B knows would make much of a difference, then he also doesn’t end up matching Pb(X) unless he started out matching before he even learned B’s testimony.
I’ve been assuming that A’s knowledge makes her probability more “stable”; Pa(X|one more piece of evidence) is close to Pa(X). What if A is knowledgeable but unstable? I think it still works out the same way but I haven’t worked it out and I have to go.
PS: This is a first attempt on my part. Hopefully it’s overcomplicated and overspecific, so we can work out/receive a more general/simple answer. But I saw that nobody else had replied so here ya go.