CallumMcDougall comments on Six (and a half) intuitions for KL divergence

CallumMcDougall 10 Oct 2022 14:11 UTC
1 point
0
Oh yeah, I hadn’t considered that one. I think it’s interesting, but the intuitions are better in the opposite direction, i.e. you can build on good intuitions for $D_{KL}$ to better understand MI. I’m not sure if you can easily get intuitions to point in the other direction (i.e. from MI to $D_{KL}$), because this particular expression has MI as an expectation over $D_{KL}$, rather than the other way around. E.g. I don’t think this expression illuminates the asymmetry of $D_{KL}$.
The way it’s written here seems more illuminating (not sure if that’s the one that you meant). This gets across the idea that:
$P_{(X,Y)}$ is the true reality, and $P_X \otimes P_Y$ is our (possibly incorrect) model which assumes independence. The mutual information between $X$ and $Y$ equals $D_{KL}(P_{(X,Y)} \| P_X \otimes P_Y)$, i.e. the extent to which modelling $X$ and $Y$ as independent (sharing no information) is a poor way of modelling the true state of affairs (where they do share information).
But again I think this intuition works better in the other direction, since it builds on intuitions for $D_{KL}$ to better explain MI. The arguments in the $D_{KL}$ expression aren’t arbitrary (i.e. we aren’t working with $D_{KL}(P \| Q)$ for general $P$ and $Q$), which restricts the amount this can tell us about $D_{KL}$ in general.
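For anyone who wants to check the two expressions numerically, here is a rough sketch. The joint distribution and the helper name `kl` are made up for illustration and not taken from the post. It computes MI once as $D_{KL}(P_{(X,Y)} \| P_X \otimes P_Y)$ and once as an expectation over $y$ of $D_{KL}(P_{X|Y=y} \| P_X)$, and also evaluates the reversed divergence to show that swapping the arguments gives a different number.

```python
import numpy as np

def kl(p, q):
    """D_KL(p || q) in nats for discrete distributions of matching shape.

    Hypothetical helper; assumes q > 0 wherever p > 0.
    """
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0  # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Illustrative joint distribution over (X, Y): rows index X, columns index Y.
joint = np.array([[0.30, 0.10],
                  [0.05, 0.55]])

p_x = joint.sum(axis=1)        # marginal of X
p_y = joint.sum(axis=0)        # marginal of Y
product = np.outer(p_x, p_y)   # the independence model P_X ⊗ P_Y

# MI as the divergence of the independence model from the true joint.
mi_as_kl = kl(joint, product)

# MI as an expectation over y of D_KL(P_{X|Y=y} || P_X).
mi_as_expectation = sum(
    p_y[j] * kl(joint[:, j] / p_y[j], p_x) for j in range(len(p_y))
)

# Swapping the arguments gives a different quantity, which is not MI.
reversed_kl = kl(product, joint)

print(mi_as_kl, mi_as_expectation, reversed_kl)
```

The first two numbers should agree up to floating-point error, while the third differs. That fits the point above: the asymmetry only shows up once you step outside the specific pair of arguments that MI uses.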
- TekhneMakre 10 Oct 2022 21:08 UTC
  2 points
  0
  The arguments in the $D_{KL}$ expression aren’t arbitrary (i.e. we aren’t working with $D_{KL}(P \| Q)$), which restricts the amount this can tell us about $D_{KL}$ in general.
  Yeah, I was vaguely hoping one could phrase $P$ and $Q$ so they’re in that form, but I don’t see it.