Oh yeah, I hadn’t considered that one. I think it’s interesting, but the intuitions are better in the opposite direction, i.e. you can build on good intuitions for $D_{KL}$ to better understand MI. I’m not sure if you can easily get intuitions to point in the other direction (i.e. from MI to $D_{KL}$), because this particular expression has MI as an expectation over $D_{KL}$, rather than the other way around. E.g. I don’t think this expression illuminates the nonsymmetry of $D_{KL}$.
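For concreteness, the expectation form being referred to can be written roughly as follows (my notation, assuming discrete $X$ and $Y$):

$$I(X;Y) = \mathbb{E}_{y \sim P_Y}\left[ D_{KL}\left( P_{X \mid Y=y} \,\|\, P_X \right) \right]$$

Both arguments of the divergence are tied together here (a conditional and its own marginal), which is part of why it's hard to read off properties of $D_{KL}$ for an arbitrary pair of distributions.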
The way it’s written here seems more illuminating (not sure if that’s the one that you meant). This gets across the idea that:
$P_{(X,Y)}$ is the true reality, and $P_X \otimes P_Y$ is our (possibly incorrect) model which assumes independence. The mutual information between $X$ and $Y$ equals $D_{KL}(P_{(X,Y)} \,\|\, P_X \otimes P_Y)$, i.e. the extent to which modelling $X$ and $Y$ as independent (sharing no information) is a poor way of modelling the true state of affairs (where they do share information).
But again I think this intuition works better in the other direction, since it builds on intuitions for $D_{KL}$ to better explain MI. The arguments in the $D_{KL}$ expression aren’t arbitrary (i.e. we aren’t working with $D_{KL}(P \,\|\, Q)$), which restricts the amount this can tell us about $D_{KL}$ in general.
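A quick numerical sanity check of the above, as a minimal sketch: the 2x2 joint distribution is made up purely for illustration, and `kl` is a hypothetical helper for discrete distributions (in nats). It computes MI as the KL divergence from the joint to the independence model, recomputes it as an expectation of KL divergences, and then swaps the arguments to show that the reversed divergence is a different quantity.

```python
import math

def kl(p, q):
    """D_KL(p || q) in nats for two discrete distributions given as lists."""
    # Terms with p_i = 0 contribute 0 by convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical joint distribution over (X, Y): rows index X, columns index Y.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
n_x, n_y = len(joint), len(joint[0])

p_x = [sum(row) for row in joint]        # marginal of X
p_y = [sum(col) for col in zip(*joint)]  # marginal of Y

# Flatten the joint and the independence model (product of marginals).
flat_joint = [joint[i][j] for i in range(n_x) for j in range(n_y)]
flat_product = [p_x[i] * p_y[j] for i in range(n_x) for j in range(n_y)]

# MI as the KL divergence from the true joint to the independence model.
mi_joint = kl(flat_joint, flat_product)

# MI as an expectation over y of D_KL(P(X | Y=y) || P(X)).
mi_expect = sum(
    p_y[j] * kl([joint[i][j] / p_y[j] for i in range(n_x)], p_x)
    for j in range(n_y)
)

# Swapping the arguments gives a different number (and it is not MI).
reverse = kl(flat_product, flat_joint)

print(mi_joint, mi_expect, reverse)  # roughly 0.193, 0.193, 0.223
```

The first two numbers agree, which matches the two ways of writing MI. The third shows the asymmetry, but only by stepping outside the MI expression, which is the point being made above: the MI form itself never puts the independence model in the first slot.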
The arguments in the $D_{KL}$ expression aren’t arbitrary (i.e. we aren’t working with $D_{KL}(P \,\|\, Q)$), which restricts the amount this can tell us about $D_{KL}$ in general.
Yeah, I was vaguely hoping one could phrase $P$ and $Q$ so they’re in that form, but I don’t see it.
Nice. I didn’t know about the hypothesis testing one (or Bregman, but I don’t get that one). I wonder if one can back out another description of KL divergence in terms of mutual information from the expression of mutual information in terms of KL divergence: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Mutual_information
And yeah, despite a whole 16 lecture course on convex opti I still don’t really get Bregman either, I skipped the exam questions on it 😆