Non-Shannon-type Inequalities
The first new qualitative thing in Information Theory when you move from two variables to three variables is the presence of negative values: information measures (entropy, conditional entropy, mutual information) are always nonnegative for two variables, but there can be negative triple mutual information I(X;Y;Z).
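To see this concretely, here is a minimal sketch in plain Python of the standard example: X and Y are independent fair bits and Z = X XOR Y, so I(X;Y;Z) = I(X;Y) − I(X;Y∣Z) = 0 − 1 = −1 bit. (The distribution and helper function are just for illustration.)

```python
from itertools import product
from math import log2

# Classic XOR example: X, Y independent fair bits, Z = X XOR Y.
# Every pair is independent, yet I(X;Y;Z) = I(X;Y) - I(X;Y|Z) = 0 - 1 = -1 bit.

# Joint distribution p(x, y, z): four equally likely outcomes.
joint = {(x, y, x ^ y): 0.25 for x, y in product([0, 1], repeat=2)}

def H(indices):
    """Joint entropy (in bits) of the coordinates listed in `indices`."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

# Triple mutual information via inclusion-exclusion over joint entropies:
# I(X;Y;Z) = H(X)+H(Y)+H(Z) - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z)
I_xyz = (H([0]) + H([1]) + H([2])
         - H([0, 1]) - H([0, 2]) - H([1, 2])
         + H([0, 1, 2]))
print(I_xyz)  # -1.0
```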
This so far is a relatively well-known fact. But what is the first new qualitative thing when moving from three to four variables? Non-Shannon-type Inequalities.
A fundamental result in Information Theory is that I(X;Y∣Z)≥0 always holds.
Given n random variables X_1,…,X_n and α,β,γ ⊆ [n], from now on we write I(α;β∣γ), where a set of indices stands for the corresponding joint random variable; for example, I({1,2};{3}∣{4}) means I(X_1,X_2;X_3∣X_4).
Since I(α;β|γ)≥0 always holds, a nonnegative linear combination of a bunch of these is always a valid inequality, which we call a Shannon-type Inequality.
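For instance (a standard pair of illustrations, just to make "nonnegative linear combination" concrete): subadditivity H(X,Y) ≤ H(X) + H(Y) is the single term I(X;Y∣∅) ≥ 0, and the three-variable inequality H(X,Y) + H(Y,Z) + H(X,Z) ≥ 2H(X,Y,Z) is exactly the two-term combination I(X;Y∣Z) + I(X,Y;Z) ≥ 0.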
Then the question is whether Shannon-type Inequalities capture all valid information inequalities in n variables. It turns out: yes for n=2, (approximately) yes for n=3, and no for n≥4.
Behold, the glorious Zhang-Yeung inequality, a Non-Shannon-type Inequality for n=4:
I(A;B) ≤ 2I(A;B∣C) + I(A;C∣B) + I(B;C∣A) + I(A;B∣D) + I(C;D)
Explanation of the math, for anyone curious.
Given n random variables and α,β,γ ⊆ [n], it turns out that the full family of inequalities I(α;β∣γ) ≥ 0 is equivalent to the following three conditions on joint entropy: H(α∪β) + H(α∩β) ≤ H(α) + H(β) (submodularity), H(α) ≤ H(β) whenever α ⊆ β (monotonicity), and H(∅) = 0.
This lets us write the inequality involving conditional mutual information in terms of joint entropy instead.
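To spell out the translation (the choice of sets below is mine, but the identity is standard): expanding conditional mutual information in joint entropies gives

I(α;β∣γ) = H(α∪γ) + H(β∪γ) − H(α∪β∪γ) − H(γ),

so submodularity is exactly I(α∖β ; β∖α ∣ α∩β) ≥ 0, and, for α ⊆ β, monotonicity H(α) ≤ H(β) is exactly I(β∖α ; β∖α ∣ α) ≥ 0.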
Let Γ*_n then be the subset of R^{2^n} whose elements are the tuples of joint entropies, one coordinate per subset of [n], induced by some random variables X_1,…,X_n. For example, an element of Γ*_2 would be (H(∅), H(X_1), H(X_2), H(X_1,X_2)) ∈ R^{2^2} for some random variables X_1 and X_2, with a different element being a different tuple induced by a different pair of random variables (X'_1, X'_2).
Now let Γ_n be the set of elements of R^{2^n} satisfying the three aforementioned conditions on joint entropy. For example, an element of Γ_2 would be (h_∅, h_1, h_2, h_{12}) ∈ R^{2^2} satisfying, e.g., h_1 ≤ h_{12} (monotonicity). Γ_n is a polyhedral convex cone cut out by the Shannon-type inequalities, and by duality a linear inequality holds on all of Γ_n exactly when it is a nonnegative linear combination of them.
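To make this concrete, here is a minimal sketch in Python (the helper names are mine, not from any library): it computes the 2^n-tuple of joint entropies induced by a joint distribution, i.e. a point of Γ*_n, and then checks the three Γ_n conditions on it.

```python
from itertools import product, chain, combinations
from math import log2

def powerset(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    idx = range(n)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(idx, r) for r in range(n + 1))]

def entropic_vector(joint, n):
    """Map a joint pmf {outcome tuple: prob} to its 2^n-tuple of joint entropies,
    i.e. the point of Gamma*_n it induces (indexed by subsets of {0,...,n-1})."""
    h = {}
    for alpha in powerset(n):
        marginal = {}
        for outcome, p in joint.items():
            key = tuple(outcome[i] for i in sorted(alpha))
            marginal[key] = marginal.get(key, 0.0) + p
        h[alpha] = -sum(p * log2(p) for p in marginal.values() if p > 0)
    return h

def in_shannon_cone(h, n, tol=1e-9):
    """Check the three Gamma_n conditions: H(empty)=0, monotonicity, submodularity."""
    subsets = powerset(n)
    if abs(h[frozenset()]) > tol:
        return False
    for a in subsets:
        for b in subsets:
            if a <= b and h[a] > h[b] + tol:             # monotonicity
                return False
            if h[a | b] + h[a & b] > h[a] + h[b] + tol:  # submodularity
                return False
    return True

# Example: the XOR triple from before gives a point of Gamma*_3,
# and that point also lies in Gamma_3.
joint = {(x, y, x ^ y): 0.25 for x, y in product([0, 1], repeat=2)}
h = entropic_vector(joint, 3)
print(sorted((tuple(sorted(a)), round(v, 3)) for a, v in h.items()))
print(in_shannon_cone(h, 3))  # True
```

Since every entropic vector satisfies the Shannon-type inequalities, the check always succeeds on vectors built this way (Γ*_n ⊆ Γ_n); the question below is about the converse inclusion.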
Then, the claim that "nonnegative linear combinations of Shannon-type inequalities span all inequalities on the possible Shannon measures" would correspond to the claim that Γ_n = Γ*_n for all n.
The content of the papers linked above is to show that:
Γ_2 = Γ*_2
Γ_3 ≠ Γ*_3, but Γ_3 = cl(Γ*_3), the closure[1] of Γ*_3
Γ_4 ≠ Γ*_4 and Γ_4 ≠ cl(Γ*_4), and likewise for every n ≥ 4
This implies that, while there exists a 2^3-tuple satisfying all Shannon-type inequalities that can't be constructed or realized by any random variables X_1, X_2, X_3, there does exist a sequence of random variables (X_1^{(k)}, X_2^{(k)}, X_3^{(k)})_{k=1}^∞ whose induced 2^3-tuples of joint entropies converge to that tuple in the limit.
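As a numerical sanity check on the Zhang-Yeung inequality above, here is a small sketch (again plain Python; the sampling scheme and helper names are mine) that evaluates both sides on randomly drawn four-variable joint distributions:

```python
import random
from itertools import product
from math import log2

def random_joint(n_vars=4, alphabet=2):
    """A random joint pmf over {0,...,alphabet-1}^n_vars."""
    outcomes = list(product(range(alphabet), repeat=n_vars))
    weights = [random.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}

def H(joint, indices):
    """Joint entropy (bits) of the coordinates in `indices`."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def I(joint, a, b, c=()):
    """Conditional mutual information I(a;b|c), with index tuples a, b, c."""
    ac = tuple(sorted(set(a) | set(c)))
    bc = tuple(sorted(set(b) | set(c)))
    abc = tuple(sorted(set(a) | set(b) | set(c)))
    c = tuple(sorted(set(c)))
    return H(joint, ac) + H(joint, bc) - H(joint, abc) - H(joint, c)

# A, B, C, D are coordinates 0, 1, 2, 3.
A, B, C, D = (0,), (1,), (2,), (3,)
for _ in range(1000):
    p = random_joint()
    lhs = I(p, A, B)
    rhs = (2 * I(p, A, B, C) + I(p, A, C, B) + I(p, B, C, A)
           + I(p, A, B, D) + I(p, C, D))
    assert lhs <= rhs + 1e-9, "Zhang-Yeung violated (it shouldn't be!)"
print("Zhang-Yeung inequality held on all sampled distributions.")
```

Passing on random samples of course proves nothing by itself; the content of the theorem is that the inequality provably holds for every distribution while not being a nonnegative combination of Shannon-type inequalities.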
@Fernando Rosas