Vanessa Kosoy comments on Vanessa Kosoy’s Shortform

Vanessa Kosoy 8 Nov 2019 18:14 UTC
It seems useful to consider agents that reason in terms of an unobservable ontology, and may have uncertainty over what this ontology is. In particular, in Dialogic RL, the user’s preferences are probably defined w.r.t. an ontology that is unobservable by the AI (and probably unobservable by the user too) which the AI has to learn (and the user is probably uncertain about emself). However, ontologies are more naturally thought of as objects in a category than as elements in a set. The formalization of an “ontology” should probably be a POMDP or a suitable Bayesian network. A POMDP involves an arbitrary set of states, so it’s not an element in a set, and the class of POMDPs can be naturally made into a category. Therefore, there is a need to define the notion of a probability measure over a category. Of course we can avoid this by enumerating the states, considering the set of all possible POMDPs w.r.t. this enumeration and then requiring the probability measure to be invariant w.r.t. state relabeling. However, the category-theoretic point of view seems more natural, so it might be worth fleshing out.
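The enumeration workaround mentioned above can be sketched for finite POMDPs. The following is a minimal illustration only, not part of the proposal: the encoding of a POMDP as a pair of nested tuples and the names `relabel`, `canonical_form` and `invariant_weight` are all hypothetical. It shows that any weight depending only on a canonical representative of the relabeling orbit is automatically invariant under state relabeling.

```python
from itertools import permutations

def relabel(pomdp, perm):
    """Rename state s to perm[s] in a finite POMDP.

    Assumed (hypothetical) encoding: pomdp = (T, O) with
    T[a][s][t] = P(next state t | state s, action a) and
    O[s][o]    = P(observation o | state s).
    """
    T, O = pomdp
    n = len(O)
    inv = [0] * n
    for old, new in enumerate(perm):
        inv[new] = old
    new_T = tuple(
        tuple(tuple(Ta[inv[s]][inv[t]] for t in range(n)) for s in range(n))
        for Ta in T
    )
    new_O = tuple(tuple(O[inv[s]]) for s in range(n))
    return (new_T, new_O)

def canonical_form(pomdp):
    """Lexicographically smallest relabeling; constant on each relabeling orbit.

    Brute force over all n! permutations, so only usable for tiny examples.
    """
    n = len(pomdp[1])
    return min(relabel(pomdp, perm) for perm in permutations(range(n)))

def invariant_weight(pomdp, base_weight):
    """A weight that is invariant under relabeling by construction."""
    return base_weight(canonical_form(pomdp))

# Toy example: one action, two states, two observations.
T = (((0.9, 0.1), (0.2, 0.8)),)
O = ((1.0, 0.0), (0.3, 0.7))
pomdp = (T, O)
swapped = relabel(pomdp, (1, 0))  # swap the two state labels
```

Here `pomdp` and `swapped` are different elements of the set of enumerated POMDPs, but they have the same canonical form, so `invariant_weight` assigns them the same value for any choice of `base_weight`.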

Ordinary probability measures are defined on measurable spaces. So, first we need to define the analogue of “measurable structure” ( $σ$ -algebra) for categories. Fix a category $C$ . Denote by $M e a s$ the category of measurable spaces. A measurable structure on $C$ is then specified by providing a Grothendieck fibration $B : {M F}_{C} \to M e a s$ and an equivalence $E : B^{- 1} (p t) \to C$ . Here, $B^{- 1} (p t)$ stands for the essential fiber of $B$ over the one-point space $p t \in M e a s$ . The intended interpretation of ${M F}_{C}$ is the category of families of objects in $C$ indexed by measurable spaces. The functor $B$ is supposed to extract the base (index space) of the family. We impose the following conditions on ${M F}_{C}$ and $B$ :

Given $A \in M e a s$ , $Y \in {M F}_{C}$ and $f : A \to B (Y)$ , we denote the corresponding base change by $f^{Y} : f^{- 1} (Y) \to Y$ ( $f^{- 1} (Y) \in {M F}_{C}$ and $B (f^{- 1} (Y))$ is canonically isomorphic to $A$ ).
- Consider $X, Y \in {M F}_{C}$ and $g, g^{'} : X \to Y$ . Consider also a point $q \in B (X)$ . We can think of $q$ as a morphism $q : p t \to B (X)$ . This allows us to consider the base changes $X_{q} := q^{- 1} (X)$ and $Y_{f (q)}$ (the “fibers” of $X$ at $q$ and $Y$ at $f (q)$ respectively), where $f := B (g)$ (implicitly, $B (g^{'}) = f$ as well, so that $g_{q}^{'}$ also lands in $Y_{f (q)}$ ). Applying the universal property of $Y_{f (q)}$ to $g \circ q^{X}$ and $g^{'} \circ q^{X}$ , we get morphisms $g_{q}, g_{q}^{'} : X_{q} \to Y_{f (q)}$ . We now require that if $g_{q} = g_{q}^{'}$ for every $q \in B (X)$ , then $g = g^{'}$ (morphisms between families that are pointwise equal are just equal).
- Consider $X, Y \in {M F}_{C}$ and $g : X \to Y$ . Suppose that (i) $B (g)$ is an isomorphism and (ii) for any $q \in B (X)$ , $g_{q}$ is an isomorphism. Then, $g$ is an isomorphism (families with a common base that are pointwise isomorphic are just isomorphic).
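
For reference, the fiber morphism $g_{q}$ used in both conditions can be written out explicitly. This only unpacks the universal property invoked above, with $f := B (g)$ and the base-change notation already introduced; nothing new is assumed.

$$
g \circ q^{X} = f (q)^{Y} \circ g_{q}
$$

Here $q^{X} : X_{q} \to X$ and $f (q)^{Y} : Y_{f (q)} \to Y$ are the base-change morphisms, and $g_{q} : X_{q} \to Y_{f (q)}$ is the unique morphism lying over the identity of $p t$ (up to the canonical isomorphisms of the bases with $p t$ ) that satisfies this equation. It exists because $g \circ q^{X}$ lies over $f \circ q$ , which is the point $f (q)$ of $B (Y)$ .
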
I’m not entirely sure how sufficient or necessary these conditions are for proving useful results, but they seem natural to me at first glance. Note that this definition can be regarded as motivated by the Yoneda lemma: a measurable space $A \in M e a s$ is defined by the measurable mappings to $A$ from other measurable spaces, so a “measurable category” should be defined by the measurable “mappings” to it from measurable spaces, and ${M F}_{C}$ is precisely the category of such measurable “mappings”. Compare this with the definition of geometric stacks (fn1).
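
As a sanity check on the Yoneda-style motivation, one can sketch how an ordinary measurable space should fit the definition. This is an illustration only, and it assumes that a measurable space $S$ is regarded as a category whose only morphisms are identities; the notation ${M F}_{S}$ and $φ$ is introduced here just for this example.

$$
\mathrm{Ob} ({M F}_{S}) = \{ (A, φ) : A \in M e a s, \ φ : A \to S \text{ measurable} \}, \qquad \mathrm{Hom} ((A, φ), (A^{'}, φ^{'})) = \{ f : A \to A^{'} \text{ measurable} : φ^{'} \circ f = φ \}
$$

Take $B (A, φ) = A$ and $B (f) = f$ , with base change $f^{- 1} (A^{'}, φ^{'}) = (A, φ^{'} \circ f)$ . The fiber over $p t$ is then the set of points of $S$ , and both conditions reduce to statements about the underlying measurable maps: a morphism is determined by its image under $B$ , and if that image is an isomorphism of measurable spaces then its inverse is again a morphism of families.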

Next, we define probability measures. Specifically, for any “measurable category” $C$ (a category equipped with structure as above), we construct the category $Δ C$ of “probability measures on $C$ ”. First, we define the auxiliary category $\tilde{Δ} C$ . An object in $\tilde{Δ} C$ is a pair $(X, μ)$ where $X$ is an object in ${M F}_{C}$ and $μ$ is a probability measure on $B (X)$ . We interpret this as sampling $q \in B (X)$ from $μ$ and then taking $X_{q}$ (using $E$ , the latter can be considered to be an object in $C$ ). We define the morphisms from $(X, μ)$ to $(Y, ν)$ as those morphisms $g : X \to Y$ for which $B (g)_{*} μ = ν$ (the notation stands for pushforward). Given $g : X \to Y$ , we call it a “quasi-isomorphism” when, for any $q \in B (X)$ , $g_{q}$ is an isomorphism. Claim: quasi-isomorphisms admit a calculus of right fractions (fn2). We now define $Δ C$ as the localization of $\tilde{Δ} C$ by quasi-isomorphisms.
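
The construction above can be illustrated in a toy finite setting. The sketch below is a hedged illustration rather than the general definition: it assumes $C$ is the category of finite sets, bases are finite sets with all subsets measurable, a family is a dict from base points to fibers, and a morphism is a base map together with one function per fiber. All names (`pushforward`, `is_morphism`, `is_quasi_isomorphism`) are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(f, mu):
    """Push a finitely supported measure mu (dict: point -> mass) along f (dict)."""
    nu = defaultdict(Fraction)
    for q, mass in mu.items():
        nu[f[q]] += mass
    return dict(nu)

def is_morphism(f, mu, nu):
    """Check the condition B(g)_* mu = nu, where f plays the role of B(g)."""
    pushed = pushforward(f, mu)
    return all(pushed.get(p, 0) == nu.get(p, 0) for p in set(pushed) | set(nu))

def is_quasi_isomorphism(f, fiber_maps, X, Y):
    """Check that every fiber map g_q : X_q -> Y_{f(q)} is a bijection.

    In this toy setting, isomorphisms of finite sets are exactly bijections.
    """
    for q, fiber in X.items():
        g_q = fiber_maps[q]
        if set(g_q) != set(fiber):
            return False  # g_q must be defined on all of X_q
        images = list(g_q.values())
        if len(set(images)) != len(images) or set(images) != set(Y[f[q]]):
            return False
    return True

# Toy families: X has base {0, 1, 2}, Y has base {"a", "b"}.
X = {0: {1, 2}, 1: {3, 4}, 2: {5}}
Y = {"a": {"x", "y"}, "b": {"z"}}
f = {0: "a", 1: "a", 2: "b"}  # base map, standing in for B(g)
fiber_maps = {0: {1: "x", 2: "y"}, 1: {3: "y", 4: "x"}, 2: {5: "z"}}

mu = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
nu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
```

In this example the base map `f` is not a bijection, so the morphism is not an isomorphism of families, yet it satisfies the pushforward condition and is a quasi-isomorphism. After localizing, $(X, μ)$ and $(Y, ν)$ become isomorphic, which matches the intuition that both describe the same distribution over objects of $C$ up to isomorphism.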

(fn1) Maybe the analogy with stacks should be made more formal? Not sure, stacks are motivated by topology and measurable spaces are not topological...

(fn2) This should clearly be right, and this is right for natural examples, but I haven’t written down the proof. If it turns out to be false it would mean that my conditions on ${M F}_{C}$ are too weak.