Nope; it’s the limit of I(J(I(J(I(J(I(J(...(w)...), where I(S) for a set S is the union of the elements of I that have nonempty intersections with S, i.e. the union of I(x) over all x in S, and J(S) is defined the same way.
Alternatively, if instead of I and J you think about the sigma-algebras they generate (let’s call them sigma(I) and sigma(J)), then sigma(I meet J) is the intersection of sigma(I) and sigma(J). I prefer this somewhat because the machinery for conditional expectation is usually defined in terms of sigma-algebras, not partitions.
Then… I’m having trouble seeing why I∧J wouldn’t very often converge on the entire space.
I.e., suppose a super simplification in which agent 1 and agent 2 each partition the space into only two parts: agent 1 into I = {A1, B1}, and agent 2 into J = {A2, B2}.
Suppose I(w) = A1 and J(w) = A2.
Then, unless the two partitions are identical, wouldn’t (I∧J)(w) = the entire space? Or am I completely misreading? And thanks for taking the time to explain.
That simplification is a situation in which there is no common knowledge. In world-state w, agent 1 knows A1 (meaning knows that the correct world is in A1), and agent 2 knows A2. They both know A1 union A2, but that’s still not common knowledge, because agent 1 doesn’t know that agent 2 knows A1 union A2.
I(w) is what agent 1 knows, if w is correct. If all you know is S, then the only thing you know agent 1 knows is I(S), and the only thing that you know agent 1 knows agent 2 knows is J(I(S)), and so forth. This is why the usual “everyone knows that everyone knows that … ” definition of common knowledge translates to I(J(I(J(I(J(...(w)...).
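To make the iterated definition concrete, here is a minimal Python sketch. The function names (`cell`, `spread`, `meet_cell`) and the four-state space are made up for illustration; the only thing taken from the thread is the rule "I(S) is the union of the blocks of I that intersect S", applied alternately until nothing new gets added.

```python
def cell(partition, w):
    """Return the block of `partition` that contains state w."""
    for block in partition:
        if w in block:
            return block
    raise ValueError(f"{w!r} is not covered by the partition")

def spread(partition, S):
    """I(S): the union of the blocks of `partition` that intersect S."""
    out = set()
    for block in partition:
        if block & S:
            out |= block
    return frozenset(out)

def meet_cell(I, J, w):
    """Limit of I(J(I(J(...(w)...)))): grow the set until it stops changing."""
    S = frozenset(cell(I, w))
    while True:
        T = spread(I, spread(J, S))
        if T == S:          # closed under both partitions: this is (I meet J)(w)
            return S
        S = T

# Hypothetical 4-state space with two different two-block partitions.
I = [frozenset({1, 2}), frozenset({3, 4})]   # A1, B1
J = [frozenset({1, 3}), frozenset({2, 4})]   # A2, B2
print(sorted(meet_cell(I, J, 1)))            # [1, 2, 3, 4]
```

With these two crossed partitions the set grows to the whole space after one pass, which is the "no common knowledge" situation described above. If both agents had the same partition, `meet_cell(I, I, 1)` would just return the block `{1, 2}`.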
Well, how is it not the intersection then?
I.e., agent 1 knows A1 and knows that agent 2 knows A2.
If they trust each other’s rationality, then they both know that w must be in A1 and in A2.
So they both conclude it must be in the intersection of A1 and A2, and they both know that they both know this, etc.
Or am I missing the point?
As far as I understand, agent 1 doesn’t know that agent 2 knows A2, and agent 2 doesn’t know that agent 1 knows A1. Instead, agent 1 knows that agent 2’s state of knowledge is in J and agent 2 knows that agent 1’s state of knowledge is in I. I’m a bit confused now about how this matches up with the meaning of Aumann’s Theorem. Why are I and J common knowledge, and {P(A|I)=q} and {P(A|J)=q} common knowledge, but I(w) and J(w) are not common knowledge? Perhaps that’s what the theorem requires, but currently I’m finding it hard to see how I and J being common knowledge is reasonable.
Edit: I’m silly. I and J don’t need to be common knowledge at all. It’s not agent 1 and agent 2 who perform the reasoning about I meet J, it’s us. We know that the true common knowledge is a set from I meet J, and that therefore if it’s common knowledge that agent 1’s posterior for the event A is q1 and agent 2’s posterior for A is q2, then q1=q2. And it’s not unreasonable for these posteriors to become common knowledge without I(w) and J(w) becoming common knowledge. The theorem says that if you’re both perfect Bayesians and you have the same priors then you don’t have to communicate your evidence.
But if I and J are not common knowledge then I’m confused about why any event that is common knowledge must be built from the meet of I and J.
Then agent 1 knows that agent 2 knows one of the members of J that have nonempty intersection with I(w), and similarly for agent 2.
Presumably they have to tell each other which element of their own partition w is in, right? I.e., presumably SOME sort of information sharing happens about each other’s conclusions.
And, once that happens, it seems like the intersection of I(w) and J(w) would be their resultant common knowledge.
I’m still confused, though, about what the “meet” operation is.
Unless… the idea is something like this: they exchange probabilities. Then agent 1 reasons, “J(w) is a member of J that both intersects I(w) and would assign that particular probability, so I can determine the subset of I(w) that intersects those members and determine a probability from there.” And similarly for agent 2. Then they exchange probabilities again, and go through an equivalent reasoning process to tighten the spaces a bit more… and the theorem ensures that they’d end up converging on the same probabilities? (Each time they state unequal probabilities, they each learn more information, and each one then comes up with a set that’s a strict subset of the one they were previously considering, but each of their sets always contains the intersection of I(w) and J(w)?)
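A back-and-forth like this can be simulated on a small finite example. The sketch below is not Aumann's theorem itself; it is an announcement process of the kind studied by Geanakoplos and Polemarchakis. It rests on several assumptions of mine: a uniform common prior, partitions that both agents know, and strictly alternating announcements. The nine-state example and all names are illustrative.

```python
from fractions import Fraction

def posterior(block, A):
    """P(A | block) under a uniform prior (an assumption of this sketch)."""
    return Fraction(len(block & A), len(block))

def refine(partition, announce):
    """Split each block by the value the other agent would announce in each state."""
    out = []
    for block in partition:
        groups = {}
        for s in block:
            groups.setdefault(announce[s], set()).add(s)
        out.extend(frozenset(g) for g in groups.values())
    return out

def exchange(I, J, A, w):
    """Alternate announcements until neither partition changes; return what was said at w."""
    history = []
    while True:
        say1 = {s: posterior(b, A) for b in I for s in b}      # agent 1's announcement per state
        new_J = refine(J, say1)                                # agent 2 learns from it
        say2 = {s: posterior(b, A) for b in new_J for s in b}  # agent 2's announcement per state
        new_I = refine(I, say2)                                # agent 1 learns from it
        history.append((say1[w], say2[w]))
        if set(new_I) == set(I) and set(new_J) == set(J):
            return history
        I, J = new_I, new_J

# Illustrative example: nine equally likely states, true state w = 1.
I = [frozenset({1, 2, 3}), frozenset({4, 5, 6}), frozenset({7, 8, 9})]
J = [frozenset({1, 2, 3, 4}), frozenset({5, 6, 7, 8}), frozenset({9})]
A = frozenset({3, 4})
history = exchange(I, J, A, w=1)
for q1, q2 in history:
    print(q1, q2)
# 1/3 1/2
# 1/3 1/3
# 1/3 1/3
```

In this run the agents disagree in the first round (1/3 vs 1/2) and agree from the second round on; the last round only confirms that nobody learns anything more. Neither agent ever tells the other which block they are actually in.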
Try a concrete example: Two dice are thrown, and each agent learns one die’s value. In addition, each learns whether the other die is in the range 1-3 vs 4-6. Now what can we say about the sum of the dice?
Suppose player 1 sees a 2 and learns that player 2’s die is in 1-3. Then he knows that player 2 knows that player 1’s die is in 1-3. It is common knowledge that the sum is in 2-6.
You could graph it by drawing a 6x6 grid and circling the information partition of player 1 in one color, and player 2 in another color. You will find that the meet is a partition of 4 elements, each a 3x3 grid in one of the corners.
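If you'd rather check the grid by machine than by coloured pens, here is a rough Python sketch of the two-dice setup. The helper names (`is_low`, `partition`, `meet`) are mine, and `meet` simply merges states that are linked through some block of either player until nothing changes.

```python
from itertools import product

states = list(product(range(1, 7), repeat=2))   # (die 1, die 2)

def is_low(v):
    """True if a die shows 1-3, False if it shows 4-6."""
    return v <= 3

def partition(key):
    """Group the states by what a player observes."""
    blocks = {}
    for s in states:
        blocks.setdefault(key(s), set()).add(s)
    return [frozenset(b) for b in blocks.values()]

# Player 1 sees die 1 exactly and the half of die 2; player 2 the reverse.
I = partition(lambda s: (s[0], is_low(s[1])))
J = partition(lambda s: (s[1], is_low(s[0])))

def meet(I, J):
    """Merge states linked through any block of I or J."""
    remaining, cells = set(states), []
    while remaining:
        S = {next(iter(remaining))}
        while True:
            T = set().union(*(b for b in I + J if b & S))
            if T == S:
                break
            S = T
        cells.append(frozenset(S))
        remaining -= S
    return cells

cells = meet(I, J)
print(len(cells), sorted(len(c) for c in cells))    # 4 [9, 9, 9, 9]

here = next(c for c in cells if (2, 1) in c)        # player 1 sees a 2, die 2 is low
sums = [a + b for a, b in here]
print(min(sums), max(sums))                         # 2 6
```

The four cells are the four 3x3 corners, and in the corner containing a 2 on die 1 with die 2 in 1-3, the sum ranges over 2-6.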
In general, anything which is common knowledge will limit the meet; that is, the element of the meet partition that contains the actual world will not extend to include world-states which contradict what is common knowledge. If 2 people disagree about global warming, it is probably common knowledge what the current CO2 level is and what the historical record of that level is. They agree on this data and each knows that the other agrees, etc.
The thrust of the theorem though is not what is common knowledge before, but what is common knowledge after. The claim is that it cannot be common knowledge that the two parties disagree.
What I don’t like about the example you provide is: what player 1 and player 2 know needs to be common knowledge. For instance if player 1 doesn’t know whether player 2 knows whether die 1 is in 1-3, then it may not be common knowledge at all that the sum is in 2-6, even if player 1 and player 2 are given the info you said they’re given.
This is what I was confused about in the grandparent comment: do we really need I and J to be common knowledge? It seems so to me. But that seems to be another assumption limiting the applicability of the result.
Not sure… what happens when the ranges are different sizes, or otherwise the type of information learnable by each player is different in non symmetric ways?
Anyways, thanks, upon another reading of your comment, I think I’m starting to get it a bit.
Different size ranges in Hal’s example? Nothing in particular happens. It’s ok for different random variables to have different ranges.
Otoh, if the players get different ranges about a single random variable, then they could have problems. Suppose there is one d6. Player A learns whether it is in 1-2, 3-4, or 5-6. Player B learns whether it is in 1-3 or 4-6.
And suppose the actual value is 1.
Then A knows it’s 1-2. So A knows B knows it’s 1-3. But A reasons that B reasons that if it were 3 then A would know it’s 3-4, so A knows B knows A knows it’s 1-4. But A reasons that B reasons that A reasons that if it were 4 then B would know it’s 4-6, so A knows B knows A knows B knows it’s 1-6. So there is no common knowledge, i.e. I∧J=Ω. (Omitting the argument w, since if this is true then it’s true for all w.)
And if it were a d12, with ranges still size 2 and 3, then the partitions line up at one point, so the meet stops at {1-6, 7-12}.
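For this special case, where both partitions cut 1..n into consecutive ranges, the meet can be read off from the boundaries the two partitions share: two neighbouring values end up in different meet cells only if both players have a cut between them. A small sketch, assuming n is divisible by both range sizes (function names are made up):

```python
def cuts(n, size):
    """Right endpoints of the consecutive ranges of length `size` covering 1..n."""
    return set(range(size, n + 1, size))

def interval_meet(n, size_a, size_b):
    """Meet of two range partitions of 1..n, as a list of (low, high) pairs."""
    common = sorted(cuts(n, size_a) & cuts(n, size_b))   # boundaries both players share
    low, cells = 1, []
    for high in common:
        cells.append((low, high))
        low = high + 1
    return cells

print(interval_meet(6, 2, 3))    # [(1, 6)]
print(interval_meet(12, 2, 3))   # [(1, 6), (7, 12)]
```

On the d6 the only shared boundary is the end of the die, so the meet is the whole space; on the d12 the ranges also line up at 6, which gives the two cells 1-6 and 7-12.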