The framing of this issue that makes the most sense to me is "P(E|B∪C) is a function of P(B):P(C)".
When I look at it this way, I disagree with the claim (in “Mennen’s ABC example”) that “[Bayesian updating] is not invariant when we aggregate outcomes”—I think it’s clearer to say that Bayesian updating is not well-defined when we aggregate outcomes.
Additionally, in “Interpreting Bayesian Networks”, this framing makes it clearer that the problem is that you used e1,2+e1,3 for P(E|B∪C), but they’re not the same thing! In essence, you’re taking a sum where you should be taking an average...
With this focus on (mis)calculating P(E|B∪C), the issue seems to me more like “a common error in applying Bayesian updates”, rather than a fundamental paradox in Bayesian updating itself. I agree with the takeaway “be careful when grouping together outcomes of a variable”—because grouping exposes one to committing this error—but I’m not sure I’m seeing the thing that makes you describe it as unintuitive?
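To make the "average, not sum" point concrete, here is a minimal sketch with made-up numbers (B and C disjoint): P(E|B∪C) is the prior-weighted average of P(E|B) and P(E|C), so it moves when the ratio P(B):P(C) moves.

```python
# Illustrative numbers only; the identity is just the law of total
# probability restricted to the event B ∪ C.
p_B, p_C = 0.3, 0.1          # priors P(B), P(C)
e_B, e_C = 0.8, 0.2          # likelihoods P(E|B), P(E|C)

# Correct: prior-weighted average, NOT the sum e_B + e_C.
p_E_given_BuC = (e_B * p_B + e_C * p_C) / (p_B + p_C)
print(p_E_given_BuC)         # 0.65, between e_B and e_C

# Flip the ratio P(B):P(C) and the same likelihoods give a different answer:
p_B, p_C = 0.1, 0.3
print((e_B * p_B + e_C * p_C) / (p_B + p_C))  # 0.35
```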
I like this framing.
This seems to imply that summarizing beliefs and summarizing updates are two distinct operations.
For summarizing beliefs we can still resort to summing:
$$\underbrace{\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}}_{\text{Belief}} \;\to\; \underbrace{\begin{pmatrix} p_1 \\ p_2 + p_3 \end{pmatrix}}_{\text{Summarized belief}}$$
But for summarizing updates we need to use an average—which in the absence of prior information will be a simple average:
$$\underbrace{\begin{pmatrix} e_1 \\ e_2 \\ e_3 \end{pmatrix}}_{\text{Update}} \;\to\; \underbrace{\begin{pmatrix} e_1 \\ \frac{e_2 + e_3}{2} \end{pmatrix}}_{\text{Summarized update}}$$
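A quick sketch of both operations, with illustrative numbers: when the aggregated outcomes have equal priors (p2 = p3), summarizing first and then updating with the simple average (e2+e3)/2 reproduces the exact aggregated posterior; when the priors are unequal, it doesn't.

```python
def posterior(prior, like):
    """Normalize prior[i] * like[i] into a posterior distribution."""
    unnorm = [p * l for p, l in zip(prior, like)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

like = [0.9, 0.1, 0.5]                    # illustrative likelihoods

# Case 1: p2 == p3, so the simple average loses nothing.
prior = [0.5, 0.25, 0.25]
exact = posterior(prior, like)
exact_agg = [exact[0], exact[1] + exact[2]]          # update, then aggregate
summarized = posterior([prior[0], prior[1] + prior[2]],
                       [like[0], (like[1] + like[2]) / 2])  # aggregate, then update
print(exact_agg, summarized)              # identical: [0.75, 0.25] both times

# Case 2: p2 != p3 (and e2 != e3), so the simple average is lossy.
prior2 = [0.5, 0.4, 0.1]
exact2 = posterior(prior2, like)
exact_agg2 = [exact2[0], exact2[1] + exact2[2]]
summarized2 = posterior([prior2[0], prior2[1] + prior2[2]],
                        [like[0], (like[1] + like[2]) / 2])
print(exact_agg2, summarized2)            # now they disagree
```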
Annoyingly, and as you point out, this is not a perfect summary—we are definitely losing information here, and subsequent updates will not be as exact as if we were working with the disaggregated odds.
I still find it quite disturbing that the update after summarizing depends on prior information—but I can’t see how to do better than this, pragmatically speaking.
Right, I agree that for the update aggregation (e2+e3)/2 is better than e2+e3 (but still lossy). And the thing that p2:p3 affects is the weighting in the average—so if e2=e3 then the ps don’t matter! (which is a possible answer to your question of “how much aggregation/disaggregation can you do?”)
But yeah if e2 is very different from e3 then I don’t think there’s any way around it, because the effective ei could be one or the other depending on what the pi are.
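Numerically (illustrative values, deliberately chosen far apart): the effective likelihood of the merged outcome is the prior-weighted average of e2 and e3, so as the ratio p2:p3 flips it swings essentially all the way from one to the other.

```python
e2, e3 = 0.99, 0.01   # very different likelihoods for the two merged outcomes

def effective_likelihood(p2, p3):
    # Prior-weighted average: what the merged outcome "really" predicts.
    return (p2 * e2 + p3 * e3) / (p2 + p3)

print(effective_likelihood(0.99, 0.01))  # ~0.98, close to e2
print(effective_likelihood(0.01, 0.99))  # ~0.02, close to e3
print(effective_likelihood(0.5, 0.5))    # 0.5, the simple average
```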