Vanessa Kosoy comments on Vanessa Kosoy’s Shortform

Vanessa Kosoy 8 Oct 2024 17:20 UTC
Ambidistributions
I believe that all or most of the claims here are true, but I haven’t written all the proofs in detail, so take it with a grain of salt.
Ambidistributions are mathematical objects that simultaneously generalize infradistributions and ultradistributions. They are useful for representing how much power an agent has over a particular system: which degrees of freedom it can control, which degrees of freedom obey a known probability distribution, and which are completely unpredictable.
Definition 1: Let $X$ be a compact Polish space. A (crisp) ambidistribution on $X$ is a function $Q : C(X) \to \mathbb{R}$ s.t.
1. (Monotonicity) For any $f, g \in C(X)$, if $f \leq g$ then $Q(f) \leq Q(g)$.
2. (Homogeneity) For any $f \in C(X)$ and $λ \geq 0$, $Q(λ f) = λ Q(f)$.
3. (Constant-additivity) For any $f \in C(X)$ and $c \in \mathbb{R}$, $Q(f + c) = Q(f) + c$.
Conditions 1+3 imply that $Q$ is 1-Lipschitz. We could introduce non-crisp ambidistributions by dropping conditions 2 and/or 3 (and e.g. requiring 1-Lipschitz instead), but we will stick to crisp ambidistributions in this post.
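The 1-Lipschitz claim follows in two lines. Here is a short derivation, where $\| \cdot \|_\infty$ denotes the sup norm on $C(X)$ (notation I am introducing just for this step):

```latex
% Let f, g \in C(X) and c := \| f - g \|_\infty, so that f \leq g + c pointwise.
Q(f) \leq Q(g + c) = Q(g) + c
% (monotonicity, then constant-additivity). Swapping f and g gives Q(g) \leq Q(f) + c, hence
|Q(f) - Q(g)| \leq \| f - g \|_\infty .
```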
The space of all ambidistributions on $X$ will be denoted $♡ X$ .^[1] Obviously, $□ X \subseteq ♡ X$ (where $□ X$ stands for (crisp) infradistributions), and likewise for ultradistributions.
Examples
Example 1: Consider compact Polish spaces $X, Y, Z$ and a continuous mapping $F : X \times Y \to Z$ . We can then define $F^{♡} \in ♡ Z$ by
$F^{♡}(u) := \max_{θ \in Δ X} \min_{η \in Δ Y} E_{θ \times η}[u \circ F]$
That is, $F^{♡} (u)$ is the value of the zero-sum two-player game with strategy spaces $X$ and $Y$ and utility function $u \circ F$ .
Notice that $F$ in Example 1 can be regarded as a Cartesian frame: this seems like a natural connection to explore further.
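To make Example 1 concrete, here is a minimal sketch for the special case $X = Y = \{0, 1\}$ with integer payoffs. It uses the standard closed form for the value of a 2x2 zero-sum game, which is not something from the post itself; the names `game_value_2x2`, `F_heart`, and the particular `F` and `u` are hypothetical choices for illustration.

```python
from fractions import Fraction

def game_value_2x2(payoff):
    """Value of a 2x2 zero-sum game for the row (maximizing) player.

    Assumes integer (or Fraction) payoffs so the result is exact.
    """
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))  # best pure guarantee for the row player
    minimax = min(max(a, c), max(b, d))  # best pure guarantee for the column player
    if maximin == minimax:               # a pure saddle point exists
        return Fraction(maximin)
    # No saddle point: both players mix, and the standard closed form applies.
    return Fraction(a * d - b * c, a + d - b - c)

def F_heart(F, u):
    """F^heart(u) for X = Y = {0, 1}: value of the game with payoff u(F(x, y))."""
    payoff = [[u(F(x, y)) for y in (0, 1)] for x in (0, 1)]
    return game_value_2x2(payoff)

# Hypothetical example: Z = {0, 1} and F(x, y) = x XOR y.
F = lambda x, y: x ^ y
u = lambda z: 3 * z

value = F_heart(F, u)
shifted = F_heart(F, lambda z: 3 * z + 1)  # constant-additivity: value + 1
scaled = F_heart(F, lambda z: 6 * z)       # homogeneity: 2 * value
```

In this example neither player has a pure optimal strategy, so the value sits strictly between the pure maximin and the pure minimax, which is one way to see that $F^{♡}$ is neither an infradistribution nor an ultradistribution in general.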
Example 2: Let $A$ and $O$ be finite sets representing actions and observations respectively, and $Λ : \{O^{*} \to A\} \to □ (A \times O)^{*}$ be an infra-Bayesian law. Then, we can define $Λ^{♡} \in ♡ (A \times O)^{*}$ by
$Λ^{♡}(u) := \max_{π : O^{*} \to A} E_{Λ(π)}[u]$
In fact, this is a faithful representation: $Λ$ can be recovered from $Λ^{♡}$ .
Example 3: Consider an infra-MDP with finite state set $S$ , initial state $s_{0} \in S$ and transition infrakernel $T : S \times A \to □ S$ . We can then define the “ambikernel” $T^{♡} : S \to ♡ S$ by
$T^{♡}(s; u) := \max_{a \in A} E_{T(s, a)}[u]$
Thus, every infra-MDP induces an “ambichain”. Moreover:
Claim 1: $♡$ is a monad. In particular, ambikernels can be composed.
This allows us to define
$ϕ(γ) := (1 - γ) \sum_{n=0}^{\infty} γ^{n} (T^{♡})^{n}(s_{0})$
This object is the infra-Bayesian analogue of the convex polytope of accessible state occupancy measures in an MDP.
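Here is a rough sketch of the ambikernel and the discounted sum for a tiny finite infra-MDP. It rests on several assumptions of mine: each credal set $T(s, a)$ is given by a finite list of vertices, the infra-expectation is the worst case over those vertices, ambikernels compose as $(K \circ L)(s; u) = K(s; s' \mapsto L(s'; u))$ (my reading of Claim 1, not spelled out in the post), and the infinite sum is truncated at `n_terms`. All names and the example transition table are hypothetical.

```python
def expect_infra(vertices, u):
    """Worst-case expectation of u over a credal set given by its vertices."""
    return min(sum(p * x for p, x in zip(vertex, u)) for vertex in vertices)

def ambikernel(T, s, u):
    """T^heart(s; u): maximum over actions of the infra-expectation of u."""
    return max(expect_infra(vertices, u) for vertices in T[s])

def power(T, n, u):
    """Vector s -> (T^heart)^n(s; u), computed by backward induction on n."""
    v = list(u)
    for _ in range(n):
        v = [ambikernel(T, s, v) for s in range(len(T))]
    return v

def phi(T, gamma, s0, u, n_terms=200):
    """Truncated version of (1 - gamma) * sum_n gamma^n (T^heart)^n(s0; u)."""
    v = list(u)
    total, weight = 0.0, 1.0
    for _ in range(n_terms):
        total += weight * v[s0]
        weight *= gamma
        v = [ambikernel(T, s, v) for s in range(len(T))]
    return (1 - gamma) * total

# Hypothetical infra-MDP: T[s][a] is a list of vertices (distributions over S = {0, 1}).
T = [
    [[(1.0, 0.0)], [(0.5, 0.5), (0.0, 1.0)]],  # state 0: stay, or an uncertain jump
    [[(0.0, 1.0)], [(1.0, 0.0)]],              # state 1: stay, or go back to 0
]
u = [0, 1]  # indicator of state 1

two_step = power(T, 2, u)
occupancy = phi(T, 0.5, 0, u)
```

In this toy example $(T^{♡})^{n}(0; u) = 1 - 2^{-n}$, so the untruncated sum at $γ = 1/2$ works out to $1/3$.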
Claim 2: The following limit always exists:
$ϕ^{*} := \lim_{γ \to 1} ϕ(γ)$
Legendre-Fenchel Duality
Definition 3: Let $D$ be a convex space and $A_{1}, A_{2} \dots A_{n}, B \subseteq D$ . We say that $B$ occludes $(A_{1} \dots A_{n})$ when for any $(a_{1} \dots a_{n}) \in A_{1} \times \dots \times A_{n}$ , we have
$\mathrm{CH}(a_{1} \dots a_{n}) \cap B \neq \emptyset$
Here, $\mathrm{CH}$ stands for convex hull.
We denote this relation $A_{1} \dots A_{n} ⊢ B$ . The reason we call this “occlusion” is apparent for the $n = 2$ case.
Here are some properties of occlusion:
1. For any $1 \leq i \leq n$ , $A_{1} \dots A_{n} ⊢ A_{i}$ .
2. More generally, if $c \in Δ \{1 \dots n\}$ then $A_{1} \dots A_{n} ⊢ \sum_{i} c_{i} A_{i}$ .
3. If $Φ ⊢ A$ and $Φ \subseteq Ψ$ then $Ψ ⊢ A$ .
4. If $Φ ⊢ A$ and $A \subseteq B$ then $Φ ⊢ B$ .
5. If $A_{1} \dots A_{n} ⊢ B$ and $A_{i}^{'} \subseteq A_{i}$ for all $1 \leq i \leq n$ , then $A_{1}^{'} \dots A_{n}^{'} ⊢ B$ .
6. If $Φ ⊢ A_{i}$ for all $1 \leq i \leq n$ , and also $A_{1} \dots A_{n} ⊢ B$ , then $Φ ⊢ B$ .
Notice that occlusion has similar algebraic properties to logical entailment, if we think of $A \subseteq B$ as “$B$ is a weaker proposition than $A$”.
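As a toy illustration of Definition 3, here is a sketch of an occlusion check in the one-dimensional convex space $D = \mathbb{R}$. The representation is my own simplification: each $A_{i}$ is a finite list of points and $B$ is a finite union of closed intervals. The function name `occludes` is hypothetical.

```python
from itertools import product

def occludes(As, B):
    """Check A_1 ... A_n |- B in D = R.

    As: non-empty list of finite lists of reals (the sets A_i).
    B:  list of closed intervals (lo, hi) whose union is the set B.
    """
    for points in product(*As):
        lo, hi = min(points), max(points)  # convex hull of the chosen points
        # The hull [lo, hi] must meet at least one interval of B.
        if not any(b_lo <= hi and lo <= b_hi for b_lo, b_hi in B):
            return False
    return True

A1 = [0, 1]
A2 = [3, 4]

blocks = occludes([A1, A2], [(2, 2)])             # the point 2 sits between A1 and A2
misses = occludes([A1, A2], [(5, 6)])             # an interval off to the side
property_1 = occludes([A1, A2], [(0, 0), (1, 1)])  # B = A1 itself
```

For $n = 2$ the picture is that $B$ blocks every line of sight from $A_{1}$ to $A_{2}$, which is where the name comes from.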
Definition 4: Let $X$ be a compact Polish space. A cramble set^[2] over $X$ is $Φ \subseteq □ X$ s.t.
1. $Φ$ is non-empty.
2. $Φ$ is topologically closed.
3. For any finite $Φ_{0} \subseteq Φ$ and $Θ \in □ X$ , if $Φ_{0} ⊢ Θ$ then $Θ \in Φ$ . (Here, we interpret elements of $□ X$ as credal sets.)
Question: If instead of condition 3, we only consider binary occlusion (i.e. require $| Φ_{0} | \leq 2$), do we get the same concept?
Given a cramble set $Φ$ , its Legendre-Fenchel dual ambidistribution is
$\hat{Φ}(f) := \max_{Θ \in Φ} E_{Θ}[f]$
Claim 3: Legendre-Fenchel duality is a bijection between cramble sets and ambidistributions.
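Here is a small sketch of the dual functional on a finite $X$. It assumes each credal set is given by finitely many vertices and that $E_{Θ}[f]$ is the worst-case expectation over $Θ$. A genuine cramble set is closed under occlusion and so is typically infinite, so the code takes the maximum over a finite generating family instead. As far as I can tell, this gives the same value, since any $Θ$ occluded by the generators contains a mixture of their worst-case points. The names below are hypothetical.

```python
def infra_expectation(credal_vertices, f):
    """E_Theta[f]: worst-case expectation over the vertices of a credal set."""
    return min(sum(p * v for p, v in zip(theta, f)) for theta in credal_vertices)

def dual(generators):
    """Functional f -> max over generating credal sets of E_Theta[f]."""
    def Q(f):
        return max(infra_expectation(theta_set, f) for theta_set in generators)
    return Q

# Hypothetical generators on X = {0, 1, 2}:
# the Dirac measure at 0, and the credal set of all distributions on {1, 2}.
generators = [
    [(1, 0, 0)],
    [(0, 1, 0), (0, 0, 1)],
]
Q = dual(generators)  # Q(f) = max(f[0], min(f[1], f[2]))

f = [1, 4, 2]
base = Q(f)
scaled = Q([2 * v for v in f])   # homogeneity
shifted = Q([v + 1 for v in f])  # constant-additivity
```

The resulting functional `max(f[0], min(f[1], f[2]))` is neither convex nor concave, so it is an ambidistribution that is neither an infradistribution nor an ultradistribution.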
Lattice Structure
Functionals
The space $♡ X$ is equipped with the obvious partial order: $Q \leq P$ when for all $f \in C(X)$, $Q(f) \leq P(f)$. This makes $♡ X$ into a distributive lattice, with
$(P \land Q)(f) = \min(P(f), Q(f))$
$(P \lor Q)(f) = \max(P(f), Q(f))$
This is in contrast to $□ X$ which is a non-distributive lattice.
The bottom and top elements are given by
$⊥(f) = \min_{x \in X} f(x)$
$⊤(f) = \max_{x \in X} f(x)$
Ambidistributions are closed under pointwise suprema and infima, and hence $♡ X$ is complete and satisfies both infinite distributive laws, making it a complete Heyting and co-Heyting algebra.
$♡ X$ is also a De Morgan algebra with the involution
$\bar{Q}(f) := -Q(-f)$
For $X \neq \emptyset$ , $♡ X$ is not a Boolean algebra: $Δ X \subseteq ♡ X$ and for any $θ \in Δ X$ we have $\bar{θ} = θ$ .
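The lattice operations and the involution are easy to play with on a finite $X$. Here is a minimal sketch, representing functions on $X = \{0, 1, 2\}$ as lists and ambidistributions as Python callables. The names `meet`, `join`, `bar` and the example distribution are my own choices.

```python
def meet(P, Q):
    """Pointwise minimum of two functionals."""
    return lambda f: min(P(f), Q(f))

def join(P, Q):
    """Pointwise maximum of two functionals."""
    return lambda f: max(P(f), Q(f))

def bar(Q):
    """De Morgan involution: f -> -Q(-f)."""
    return lambda f: -Q([-v for v in f])

bottom = min  # the bottom element: f -> min_x f(x)
top = max     # the top element: f -> max_x f(x)

def theta(f):
    """Expectation under a hypothetical distribution (1/2, 1/4, 1/4)."""
    return 0.5 * f[0] + 0.25 * f[1] + 0.25 * f[2]

f = [4, 8, -4]

swap = bar(bottom)(f)                # equals top(f)
fixed = bar(theta)(f)                # equals theta(f): distributions are self-dual
lhs = bar(meet(theta, bottom))(f)    # De Morgan law, left-hand side
rhs = join(bar(theta), bar(bottom))(f)
not_complement = join(theta, bar(theta))(f)  # stays at theta(f), not top(f)
```

The last line is the failure of the Boolean law: since $\bar{θ} = θ$, the join of $θ$ with its involution is $θ$ again rather than $⊤$.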
One application of this partial order is formalizing the “no traps” condition for infra-MDPs:
Definition 2: A finite infra-MDP is quasicommunicating when for any $s \in S$ we have
$\lim_{γ \to 1} (1 - γ) \sum_{n=0}^{\infty} γ^{n} (T^{♡})^{n}(s_{0}) \leq \lim_{γ \to 1} (1 - γ) \sum_{n=0}^{\infty} γ^{n} (T^{♡})^{n}(s)$
Claim 4: The set of quasicommunicating finite infra-MDPs (or even infra-RDPs) is learnable.
Cramble Sets
Going to the cramble set representation, $\hat{Φ} \leq \hat{Ψ}$ iff $Φ \subseteq Ψ$ .
$Φ \land Ψ$ is just $Φ \cap Ψ$ , whereas $Φ \lor Ψ$ is the “occlusion hull” of $Φ$ and $Ψ$ .
The bottom and the top cramble sets are
$⊥ = \{⊤_{□}\}$
$⊤ = □ X$
Here, $⊤_{□}$ is the top element of $□ X$ (corresponding to the credal set $Δ X$).
The De Morgan involution is
$\bar{Φ} = \{Θ \in □ X \mid \forall Ξ \in Φ : Θ \cap Ξ \neq \emptyset\}$
Operations
Definition 5: Given $X, Y$ compact Polish spaces and a continuous mapping $h : X \to Y$ , we define the pushforward $h_{*} : ♡ X \to ♡ Y$ by
$h_{*} (Q; f) := Q (f \circ h)$
When $h$ is surjective, there are both a left adjoint and a right adjoint to $h_{*}$ , yielding two pullback operators $h_{min}^{*}, h_{max}^{*} : ♡ Y \to ♡ X$ :
$h_{min}^{*}(Q; f) := \min_{g \in C(Y) : g \circ h \geq f} Q(g)$
$h_{max}^{*}(Q; f) := \max_{g \in C(Y) : g \circ h \leq f} Q(g)$
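For finite $X$ and $Y$ these operators can be computed directly. Here is a sketch, using the observation that, by monotonicity of $Q$, the minimum in $h_{min}^{*}$ is attained at the least admissible $g$, namely $g(y) = \max_{x \in h^{-1}(y)} f(x)$, and dually for $h_{max}^{*}$. Functions are lists, `h[x]` is the image of `x`, and all names are hypothetical.

```python
def pushforward(h, Q):
    """h_*(Q; f) = Q(f o h), for Q on X and f on Y."""
    return lambda f: Q([f[h[x]] for x in range(len(h))])

def pullback_min(h, n_y, Q):
    """h*_min(Q; f): evaluate Q at the least g with g o h >= f (fiberwise max)."""
    def R(f):
        g = [max(f[x] for x in range(len(h)) if h[x] == y) for y in range(n_y)]
        return Q(g)
    return R

def pullback_max(h, n_y, Q):
    """h*_max(Q; f): evaluate Q at the greatest g with g o h <= f (fiberwise min)."""
    def R(f):
        g = [min(f[x] for x in range(len(h)) if h[x] == y) for y in range(n_y)]
        return Q(g)
    return R

# Hypothetical example: X = {0, 1, 2}, Y = {0, 1}, h surjective.
h = [0, 0, 1]
Q = lambda g: 0.5 * g[0] + 0.5 * g[1]  # uniform distribution on Y

f = [1, 3, 5]
upper = pullback_min(h, 2, Q)(f)  # Q([3, 5])
lower = pullback_max(h, 2, Q)(f)  # Q([1, 5])

# Pushing a pullback forward again recovers Q, since g o h is constant on fibers.
g = [2, 6]
roundtrip_min = pushforward(h, pullback_min(h, 2, Q))(g)
roundtrip_max = pushforward(h, pullback_max(h, 2, Q))(g)
```

Surjectivity of `h` matters here: if some fiber were empty, the fiberwise `max` and `min` would be taken over an empty set and the code would raise an error.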
Given $Q \in ♡ X$ and $P \in ♡ Y$ we can define the semidirect product $Q ⋉ P \in ♡ (X \times Y)$ by
$(Q ⋉ P) (f) := Q (λ x . P (λ y . f (x, y)))$
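On finite sets the semidirect product is just a nested evaluation. Here is a minimal sketch, with $f$ given as a nested list `f[x][y]` and the hypothetical helper name `semidirect`:

```python
def semidirect(Q, P, n_x, n_y):
    """(Q x| P)(f) = Q(x -> P(y -> f(x, y))) for f given as f[x][y]."""
    def R(f):
        inner = [P([f[x][y] for y in range(n_y)]) for x in range(n_x)]
        return Q(inner)
    return R

f = [[0, 1],
     [1, 0]]

# Take Q = top (x fully controlled) and P = bottom (y fully adversarial).
agent_first = semidirect(max, min, 2, 2)(f)      # max over x of min over y
# Swapping the roles gives min over x of max over y.
adversary_first = semidirect(min, max, 2, 2)(f)
```

In this example the two orders give different values (0 and 1), which illustrates that the product is sensitive to which factor is evaluated on the outside.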
There are probably more natural products, but I’ll stop here for now.
Polytopic Ambidistributions
Definition 6: The polytopic ambidistributions $♡_{pol} X$ are the (incomplete) sublattice of $♡ X$ generated by $Δ X$ .
Some conjectures about this:
- For finite $X$ , an ambidistribution $Q$ is polytopic iff there is a finite polytope complex $C$ on $\mathbb{R}^{X}$ s.t. for any cell $A$ of $C$ , $Q |_{A}$ is affine.
- For finite $X$ , a cramble set $Φ$ is polytopic iff it is the occlusion hull of a finite set of polytopes in $Δ X$ .
- $ϕ (γ)$ and $ϕ^{*}$ from Example 3 are polytopic.
1. The non-convex shape $♡$ reminds us that ambidistributions need not be convex or concave.
2. The expression “cramble set” is meant to suggest a combination of “credal set” with “ambi”.
