Sorry, I think I got a bit confused about the “meet” operation, mind clarifying?
Is (I∧J)(w) equal to the intersection of I(w) and J(w) (which seems to be the implied way it works, based on the overall description here), or something else? (The definition of meet you gave involved unions rather than intersections, plus some sort of merging operation.)
Thanks.
EDIT: whoops, I’m being dense today. I meant to say intersection, not disjunction.
The meet of two partitions (in the context of this post) is the finest common coarsening of those partitions.
Consider the coarsening relation on the set of all partitions of the given set: partition A is a coarsening of partition B if A can be obtained by “lumping together” some of the elements of B. Now, for this order, a “meet” of two partitions X and Y is a partition Z such that:
Z is a coarsening of X, and it is a coarsening of Y
Z is the finest such partition; that is, for any other Z′ that is a coarsening of both X and Y, Z′ is also a coarsening of Z. (A small computational sketch of this definition follows.)
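If it helps, here is a tiny brute-force sketch in Python (mine, not from any comment here; the example partitions are arbitrary) that applies the two conditions above literally on a four-element set:

```python
def is_coarsening(A, B):
    """True if partition A can be obtained by lumping together
    blocks of partition B, i.e. every block of B sits inside
    some block of A."""
    return all(any(b <= a for a in A) for b in B)

def partitions(elems):
    """Generate all partitions of a list of elements."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [p[i] | {first}] + p[i + 1:]
        yield p + [{first}]

X = [{1, 2}, {3}, {4}]
Y = [{1}, {2, 3}, {4}]
common = [Z for Z in partitions([1, 2, 3, 4])
          if is_coarsening(Z, X) and is_coarsening(Z, Y)]
# The meet is the finest common coarsening: every other common
# coarsening is also a coarsening of it.
meet_XY = next(Z for Z in common
               if all(is_coarsening(W, Z) for W in common))
print(meet_XY)  # [{1, 2, 3}, {4}] (up to ordering of blocks)
```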
Under the usages familiar to me, the common coarsening is the join, not the meet. That’s how “join” is used on the Wikipedia page for set partitions. Using “meet” to mean “common refinement” is the usage that makes sense to me in the context of the proof in the OP. [ETA: I’ve been corrected on this point; see below.]
Of course, what you call “meet” or “join” depends on which way you decide to direct the partial order on partitions. Unfortunately, it looks like both possibilities are floating around as conventions.
See for example on Wikipedia: Common knowledge (logic):
“It is not difficult to see that the common knowledge accessibility function [...] corresponds to the finest common coarsening of the partitions [...], which is the finitary characterization of common knowledge also given by Aumann in the 1976 article.”
The idea is that the partitions define what each agent is able to discern, so no refinement of what a given agent can discern is possible (unless you perform additional communication). Aumann’s agreement theorem is about a condition for when the agents already agree, without any additional discussion between them.
Hmm. Then I am in a state of confusion much like Psy-Kosh’s. These opposing conventions aren’t helping, but, at any rate, I evidently need to study this more closely.
It was confusing for me too, which is why I gave an imperative definition: first form the union of I and J, then merge any overlapping elements. Did that not help?
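For what it’s worth, that imperative definition translates directly into code. A minimal Python sketch (the naming is mine; I and J are taken to be lists of sets of world-states):

```python
def meet(I, J):
    """Form the union of the blocks of I and J, then merge any
    overlapping blocks, repeating until no two blocks overlap."""
    blocks = [set(b) for b in list(I) + list(J)]
    changed = True
    while changed:
        changed = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if blocks[i] & blocks[j]:       # overlapping blocks:
                    blocks[i] |= blocks.pop(j)  # lump them together
                    changed = True
                    break
            if changed:
                break
    return blocks

print(meet([{1, 2}, {3}, {4}], [{1}, {2, 3}, {4}]))
# [{1, 2, 3}, {4}], matching the brute-force answer above
```

Note that identical blocks appearing in both I and J overlap with each other, so the duplicate copies are absorbed by the same merging step.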
It should have. The fault is certainly mine. I skimmed your definition too lightly because you were defining a technical term (“meet”) in a context (partitions) where I was already familiar with the term, but I hadn’t suspected that it had any other usages than the one I knew.
The term “meet” corresponds to considering a coarser partition as “less than” a finer partition, which is natural enough if you see partitions as representing precision of knowledge: the coarser partition is able to discern less. The greatest lower bound is usually called the “meet”.
It’s always called that, but the greatest lower bound and the least upper bound switch places if you switch the direction of the partial order. And there’s a lot of literature on set partitions in which finer partitions are lower in the poset. (That’s the convention used in the Wikipedia page on set partitions.)
The justification for taking the meet to be a refinement is that refinements correspond to intersections of partition elements, and intersections are meets in the poset of sets. So the terminology carries over from the poset of sets to the poset of set partitions in a way that appeals to the mathematician’s aesthetic.
But I can see the justification for the opposite convention when you’re talking about precision of knowledge.
Ah, thanks. In that case… wouldn’t the meet of A and B often end up being the entire space?
For that matter, why this coarsening operation rather than the set of all the possible pairwise intersections between members of I and members of J?
I.e., why coarsening instead of “fining”? (What’s the appropriate word there, anyway?)
When two rationalists exchange information, shouldn’t their conclusions then sometimes be finer rather than coarser, since they have, well, each gained information they didn’t have previously?
If I’ve got this right...
When two rationalists exchange all information, their new partition is the ‘join’ of the two old partitions, where the join is the “coarsest common fining” (i.e. the coarsest common refinement). If you plot Ω as the rectangle with corners at (-1,-1) and (1,1), and the initial partitions are the x axis for agent A and the y axis for agent B, then when they share information and ‘join’, their common partition separates all four quadrants.
“Common knowledge” is the set of questions that they can both answer before sharing information. This is the ‘meet’, which is the finest common coarsening. In the previous example, there is no information that they both share, so the meet becomes the whole space.
If you extend Ω down to y = −2 and modify the original partitions so that both fence off this new piece on its own, then the join would be the original four squares plus this lower rectangle, while the meet would be the square with corners at (-1,-1) and (1,1) plus this lower rectangle (since they now have this as common knowledge).
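To make that picture concrete, here is a small sketch (the quadrant labels are mine; it reuses the meet() function sketched earlier in the thread, with the join built from pairwise intersections):

```python
def join(I, J):
    """Coarsest common refinement: nonempty pairwise intersections."""
    return [a & b for a in I for b in J if a & b]

# Label the four quadrants of the rectangle, plus the later strip.
Q1, Q2, Q3, Q4, STRIP = "Q1", "Q2", "Q3", "Q4", "strip"

A = [{Q1, Q2}, {Q3, Q4}]  # agent A discerns the sign of y
B = [{Q1, Q4}, {Q2, Q3}]  # agent B discerns the sign of x
print(join(A, B))  # four singleton blocks: pooled information
print(meet(A, B))  # one block, the whole space: no common knowledge

# Extend the space downward and fence off the strip in both partitions:
A2 = A + [{STRIP}]
B2 = B + [{STRIP}]
print(join(A2, B2))  # the four quadrants plus the strip
print(meet(A2, B2))  # [{Q1, Q2, Q3, Q4}, {'strip'}]
```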
Does this help?
Wait, what? Is it the coarsest common fining or the finest common coarsening that we’re interested in here?
And isn’t common knowledge the set of questions that not only they can both answer, but that they both know that both can answer, and both know that both know, etc etc etc?
Actually, maybe I need to reread this a bit more, but now I’m more confused.
Actually, on rereading, I think I’m starting to get the idea about meet and common knowledge (given that, before exchanging info, they do know each other’s partitioning, but not which particular cell of the other’s partition the actual world has been observed to lie in).
Thanks!
Nope; it’s the limit of I(J(I(J(⋯(w)⋯)))), where I(S) for a set S is the union of the elements of I that have nonempty intersections with S, i.e. the union of I(x) over all x in S, and J(S) is defined the same way.
Alternately if instead of I and J you think about the sigma-algebras they generate (let’s call them sigma(I) and sigma(J)), then sigma(I meet J) is the intersection of sigma(I) and sigma(J). I prefer this somewhat because the machinery for conditional expectation is usually defined in terms of sigma-algebras, not partitions.
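Here is a sketch of that fixpoint computation (the naming is mine); it also anticipates the two-partition example discussed just below:

```python
def spread(P, S):
    """P(S): the union of the blocks of partition P that meet S."""
    return set().union(*(b for b in P if b & S))

def ck_cell(I, J, w):
    """(I meet J)(w): iterate S -> I(J(S)) from {w} to a fixpoint."""
    S = {w}
    while True:
        T = spread(I, spread(J, S))
        if T == S:
            return S
        S = T

I = [{1, 2}, {3, 4}]  # A1 = {1, 2}, B1 = {3, 4}
J = [{1, 3}, {2, 4}]  # A2 = {1, 3}, B2 = {2, 4}
print(ck_cell(I, J, 1))  # {1, 2, 3, 4}: the whole space
```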
Then… I’m having trouble seeing why I∧J wouldn’t very often converge on the entire space.
I.e., suppose a drastic simplification in which both agent 1 and agent 2 partition the space into only two parts: agent 1 partitions it into I = {A1, B1}, and agent 2 into J = {A2, B2}.
Suppose I(w) = A1 and J(w) = A2.
Then, unless the two partitions are identical, wouldn’t (I∧J)(w) = the entire space? Or am I completely misreading? And thanks for taking the time to explain.
That simplification is a situation in which there is no common knowledge. In world-state w, agent 1 knows A1 (meaning knows that the correct world is in A1), and agent 2 knows A2. They both know A1 union A2, but that’s still not common knowledge, because agent 1 doesn’t know that agent 2 knows A1 union A2.
I(w) is what agent 1 knows, if w is correct. If all you know is S, then the only thing you know agent 1 knows is I(S), and the only thing that you know agent 1 knows agent 2 knows is J(I(S)), and so forth. This is why the usual “everyone knows that everyone knows that…” definition of common knowledge translates to the limit of I(J(I(J(⋯(w)⋯)))).
Well, how is it not the intersection then?
I.e., agent 1 knows A1 and knows that agent 2 knows A2.
If they trust each other’s rationality, then they both know that w must be in A1 and in A2.
So they both conclude it must be in the intersection of A1 and A2, and they both know that they both know this, etc., etc.
Or am I missing the point?
As far as I understand, agent 1 doesn’t know that agent 2 knows A2, and agent 2 doesn’t know that agent 1 knows A1. Instead, agent 1 knows that agent 2’s state of knowledge is in J, and agent 2 knows that agent 1’s state of knowledge is in I. I’m a bit confused now about how this matches up with the meaning of Aumann’s theorem. Why are I and J common knowledge, and {P(A|I)=q} and {P(A|J)=q} common knowledge, but I(w) and J(w) not common knowledge? Perhaps that’s what the theorem requires, but currently I’m finding it hard to see how I and J being common knowledge is reasonable.
Edit: I’m silly. I and J don’t need to be common knowledge at all. It’s not agent 1 and agent 2 who perform the reasoning about I meet J, it’s us. We know that the true common knowledge is a set from I meet J, and that therefore, if it’s common knowledge that agent 1’s posterior for the event A is q1 and agent 2’s posterior for A is q2, then q1=q2. And it’s not unreasonable for these posteriors to become common knowledge without I(w) and J(w) becoming common knowledge. The theorem says that if you’re both perfect Bayesians and you have the same priors, then you don’t have to communicate your evidence.
But if I and J are not common knowledge then I’m confused about why any event that is common knowledge must be built from the meet of I and J.
Then agent 1 knows that agent 2 knows one of the members of J that have nonempty intersection with I(w), and similarly for agent 2.
Presumably they have to tell each other which element of their own partition w is in, right? I.e., presumably SOME sort of information sharing happens about each other’s conclusions.
And, once that happens, it seems like the intersection of I(w) and J(w) would be their resulting common knowledge.
I’m still confused, though, about what the “meet” operation is.
Unless… the idea is something like this: they exchange probabilities. Then agent 1 reasons, “J(w) is a member of J that both intersects I(w) AND would assign that particular probability, so I can determine the subset of I(w) that intersects those, and determine a probability from there.” And similarly for agent 2. Then they exchange probabilities again and go through an equivalent reasoning process to tighten the spaces a bit more… and the theorem ensures that they’d end up converging on the same probabilities? (Each time they state unequal probabilities, they each learn more information, and each one then comes up with a set that’s a strict subset of the one they were previously considering, but each of their sets always contains the intersection of I(w) and J(w).)
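That back-and-forth is essentially the dialogue process studied by Geanakoplos and Polemarchakis (1982). A toy simulation, under the assumptions of a finite space and a common prior (the example numbers below are my own construction):

```python
from fractions import Fraction

def refine(P, S):
    """Split each block of partition P by the announced event S."""
    return [piece for c in P for piece in (c & S, c - S) if piece]

def posterior(prior, A, cell):
    """Exact conditional probability of event A given the cell."""
    return Fraction(sum(prior[x] for x in cell & A),
                    sum(prior[x] for x in cell))

def dialogue(prior, A, P1, P2, w, max_rounds=20):
    """Agents alternately announce their posterior for A; each
    announcement is public, so it refines the other's partition.
    Returns the list of (q1, q2) announcements."""
    history = []
    for _ in range(max_rounds):
        q1 = posterior(prior, A, next(c for c in P1 if w in c))
        S1 = set().union(*(c for c in P1
                           if posterior(prior, A, c) == q1))
        P2 = refine(P2, S1)
        q2 = posterior(prior, A, next(c for c in P2 if w in c))
        S2 = set().union(*(c for c in P2
                           if posterior(prior, A, c) == q2))
        P1 = refine(P1, S2)
        history.append((q1, q2))
        if q1 == q2:
            break
    return history

prior = {1: 1, 2: 1, 3: 1, 4: 1}  # uniform over four worlds
A = {2, 3}                         # the event of interest
print(dialogue(prior, A, [{1, 2}, {3, 4}], [{1, 2, 3}, {4}], w=1))
# Announcements go (1/2, 2/3), then (1/2, 1/2): one disagreement,
# then agreement, just as the convergence argument suggests.
```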
Try a concrete example: Two dice are thrown, and each agent learns one die’s value. In addition, each learns whether the other die is in the range 1-3 vs 4-6. Now what can we say about the sum of the dice?
Suppose player 1 sees a 2 and learns that player 2’s die is in 1-3. Then he knows that player 2 knows that player 1’s die is in 1-3. It is common knowledge that the sum is in 2-6.
You could graph it by drawing a 6x6 grid and circling the information partition of player 1 in one color, and player 2 in another color. You will find that the meet is a partition of 4 elements, each a 3x3 grid in one of the corners.
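You can check that claim mechanically. A sketch (reusing the meet() function from earlier in the thread; the encoding of the information partitions is mine):

```python
outcomes = {(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)}
low = lambda d: d <= 3  # which half (1-3 vs 4-6) a die is in

# Player 1 sees die 1 exactly plus die 2's half; player 2 symmetric.
P1 = [{(a, b) for (a, b) in outcomes if a == d and low(b) == h}
      for d in range(1, 7) for h in (True, False)]
P2 = [{(a, b) for (a, b) in outcomes if b == d and low(a) == h}
      for d in range(1, 7) for h in (True, False)]

M = meet(P1, P2)
print(len(M), sorted(map(len, M)))  # 4 [9, 9, 9, 9]: four 3x3 corners
```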
In general, anything which is common knowledge will limit the meet; that is, the element of the meet partition that the world is in will not extend to include world-states which contradict what is common knowledge. If two people disagree about global warming, it is probably common knowledge what the current CO2 level is and what the historical record of that level is. They agree on this data, and each knows that the other agrees, etc.
The thrust of the theorem, though, is not what is common knowledge before, but what is common knowledge after. The claim is that it cannot be common knowledge that the two parties disagree.
What I don’t like about the example you provide is that what player 1 and player 2 know needs to be common knowledge. For instance, if player 1 doesn’t know whether player 2 knows whether die 1 is in 1-3, then it may not be common knowledge at all that the sum is in 2-6, even if player 1 and player 2 are given the info you said they’re given.
This is what I was confused about in the grandparent comment: do we really need I and J to be common knowledge? It seems so to me. But that seems to be another assumption limiting the applicability of the result.
Not sure… what happens when the ranges are different sizes, or the type of information learnable by each player otherwise differs in asymmetric ways?
Anyway, thanks; upon another reading of your comment, I think I’m starting to get it a bit.
Different size ranges in Hal’s example? Nothing in particular happens. It’s ok for different random variables to have different ranges.
Otoh, if the players get different ranges for a single random variable, then they could have problems. Suppose there is one d6. Player A learns whether it is in 1-2, 3-4, or 5-6. Player B learns whether it is in 1-3 or 4-6.
And suppose the actual value is 1.
Then A knows it’s 1-2. So A knows B knows it’s 1-3. But A reasons that B reasons that if it were 3 then A would know it’s 3-4, so A knows B knows A knows it’s 1-4. But A reasons that B reasons that A reasons that if it were 4 then B would know it’s 4-6, so A knows B knows A knows B knows it’s 1-6. So there is no common knowledge, i.e. I∧J=Ω. (Omitting the argument w, since if this is true then it’s true for all w.)
And if it were a d12, with ranges still of sizes 2 and 3, then the partitions line up at one point, so the meet stops at {1-6, 7-12}.
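Both claims are quick to verify with the meet() sketch from earlier in the thread (the small helper below is mine):

```python
def ranges(n, size):
    """Partition {1, ..., n} into consecutive blocks of equal size."""
    return [set(range(i, i + size)) for i in range(1, n + 1, size)]

# d6 with ranges of sizes 2 and 3: the meet is the whole space.
print(meet(ranges(6, 2), ranges(6, 3)))   # [{1, 2, 3, 4, 5, 6}]

# d12: the partitions line up after 6, so the meet stops there.
print(meet(ranges(12, 2), ranges(12, 3)))
# [{1, 2, 3, 4, 5, 6}, {7, 8, 9, 10, 11, 12}]
```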