Scott presented Cartesian frames/Chu spaces as follows:
Let W be a set of possible worlds. A Cartesian frame C over W is a triple C=(A,D,⋅), where A represents a set of possible ways the agent can be, D represents a set of possible ways the environment can be, and ⋅:A×D→W is an evaluation function that returns a possible world given an element of A and an element of D.
In a previous post, I defined GM, the category of generalised models.
In this post, I’ll try and see how these two formalisms relate to each other.
Let C0=(A0,D0,⋆0) and C1=(A1,D1,⋆1) be Cartesian frames over W: thus there are relations ⋆0:A0×D0→W (written as a0⋆0d0=w) and ⋆1:A1×D1→W (written as a1⋆1d1=w′).
A morphism between them is a pair of maps (g0:A0→A1,h1:D1→D0), such that, for all a0∈A0 and d1∈D1, g0(a0)⋆1d1=a0⋆0h1(d1).
How can we express this in the generalised model formalism?
First, let Ei=Ai×Di×W. In terms of features, this can be defined by setting ¯¯¯fAi=Ai, ¯¯¯fDi=D and ¯¯¯fW=W. Then Fi={fAi,fDi,fW}, and Mi=(Fi,Ei,Qi) is the feature-split generalised model with Ai⊂2¯¯¯fAi=2Ai, Di⊂2¯¯¯fDi=2Di, and W⊂2¯¯¯fW=2W.
As we’ll see in the bears example, there can be more interesting ways of defining the feature split Mi.
Then the map pair (g0,h1) is equivalent to the (feature-split) relation r, defined such that (a0,d0,w)∼r(a1,d1,w′) iff:
g0(a0)=a1,
h1(d1)=d0,
and w=w′.
Without loss of clarity, we can thus write r as the feature-split relation (g0,h0,IdW).
Composing (g0,h1) and (g1,h2) generates (g1∘g0,h1∘h2). Take r as the relation defined by (g0,h1) and q as the relation defined by (g1,h2). Then if (a0,d0,w)∼pr(a2,d2,w′′), there must exist an (a1,d1,w′) with (a0,d0,w)∼r(a1,d1,w′)∼p(a2,d2,w′′). Then:
g1g0(a0)=g1(a1)=a2,
h1h2(d2)=h1(d1)=d0,
w=w′=w′′.
So composition of morphisms (g,h) for Cartesian frames is the same as the composition of corresponding relations (g,h,IdW).
The extra structure
We have two structures to add: Cartesian frames have the ⋆ map, while generalised models have the probability measures Q; we need to relate them.
One natural way to relate them is to consider that if a⋆d=w, then we should get Q(w∣a,d)=1 and have Q(w′∣a,d)=0 for w′≠w. This reflects the fact that action a and environment d lead inevitably to world w.
Now Q(w∣a,d)=Q(a,d,w)/Q(a,d,W), where Q(a,d,W) denotes Q on the set {a}×{d}×W; this is ∑w′∈WQ(a,d,w′).
Hence the desired condition on Q(w∣a,d) is equivalent with Q(a,d,w)=0 iff a⋆d≠w. There are, of course, multiple possible Qs with that property for any given ⋆.
The categorical equivalence
Now let’s tie these together, and define C(W), a subcategoryC(W) of the GM, the category of generalised models. This C(W) will map surjectively to the category of Cartesian frames.
The objects of C(W) are those (feature-split) generalised models which have E=A×D×W for some sets A and D, and have Q(a,d,w)=0 iff a⋆d≠w for some evaluation function ⋆:A×D→W.
The morphisms of C(W) are those morphisms of GM that map C(W) to itself, and that are of the form r=(g,h,IdW) for (g,h) a morphism of Cartesian frames.
Thus morphisms of C(W) are derived from morphisms of Chu(W), and are also compatible with the Q structures (since they are also morphisms of GM). Also included are the identity morphisms (IdA,IdD,IdW), which trivially preserve the Q structures.
To demonstrate that C(W) is a category, we need to show that pr is a morphism of it whenever r=(g0,h1,IdW) and p=(g1,h2,IdW) are. We know that pr must respect the Q structures (since r and p are morphisms of GM), while pr=(g1∘g0,h1∘h2,IdW), which derives from (g1∘g0,h1∘h2), a morphism of Chu(W).
Thus C(W) is a category. Let Φ:C(W)→Chu(W) be the map that sends (F,A×D×W,Q) to (A,D,⋆), and sends r=(g,h,IdW) to (g,h).
This Φ is clearly a functor of categories, and it is surjective on the objects of Chu(W) (the Cartesian frames). Now we need to show that it’s also surjective on the morphisms, which comes from the following result:
Let (g0,h1) be a morphism between C0=(A0,D0,⋆0) and C1=(A1,D1,⋆1). Then there exists M0,M1∈C(W) and a morphism r=(g0,h1,IdW) between them such that Φ(Mi)=Ci.
To show that, we need to choose Q0 and Q1 that are compatible with ⋆0 and ⋆1, and are compatible with r.
In fact, we’ll show a slightly stronger result: that for any M0 with Φ(M0)=C0, we can pick an M1 (ie pick a Q1) with the required properties.
To show this, note that r=(g0,h1,IdW) will relate every element of (g−1(a1),d0,w) with every element of (a1,h−11(d0),w). In fact, r is defined by such relations, for any a1∈A1, d0∈D1 and w∈W. No other elements are related by r.
For compatibility of r with the Qs, it suffices that Q0(g−1(a1),d0,w) be equal to Q1(a1,h−11(d0),w).
For any d1∈D1, define #d1 as the size of h−11(h1(d1)); since d1∈h−11(h1(d1)), #d1≥1.
Then define Q1(a1,d1,w) as Q0(g−1(a1),h1(d1),w)/#d1. This will give the compatibility that we want.
Hence Φ:C(W)→Chu(W) is a surjective functor of categories, from a subcategory of GM, the category of generalised models.
More functors
Given two sets W and V, and a function p:W→V, there is an induced functor p:Chu(W)→Chu(V), sending (a,d,w) to (a,d,p(w)) and sending the morphism (g,h) to the morphism with the same underlying functions, (g,h).
Then by the above, we have C(W) and C(V) as distinct subcategories of GM, with functors ΦW and ΦV sending these subcategories to Chu(W) and Chu(V).
Then p also induces a functor C(W)→C(V), by sending (a,d,w)∈E=A×D×W to (a,d,p(w)). The induced Q is given by Q(a,d,v)=∑w∈p−1(v)Q(a,d,v).
Note that p is not only a functor C(W)→C(V), it also acts as a morphism between any M0∈C(W) to p(M0)∈C(V), when both are seen as elements of GM. We’ll designate these various morphisms by p as well. As a functor, p maps the relation r=(g,h,IdW) to p(r)(g,h,IdV); seeing p as a morphism, p(r) is precisely prp−1, where p−1 is the opposite morphism to p: (a,d,v)∼p−1(a,d,w) iff p(w)=v.
We can see that the morphism p commutes with the Φs:
ΦV∘p=p∘ΦW.
This is probably enough exploration of the functorial properties of these spaces for one post.
An example: colours and bears
To illustrate, let’s use the Cartesian frame from this post; this construction will also show how features can figure non-trivially in this construction.
Here the agent has two unrelated choices: which colour to think about (green, G or red R) and whether to go for a walk or stay home (W or H). So A={GH,GW,RH,RW}. The environment is either safe or has bears: D={S,B}.
This gives the following frame C0:
SBC0=GHGWRHRW⎛⎜
⎜
⎜⎝w0w1w2w3w4w5w6w7⎞⎟
⎟
⎟⎠
Of course, w0 and w4 only differ in the colour that the agent is thinking about (similarly for w1 and w5, etc...). We could choose a C1 frame that doesn’t distinguish between these thoughts:
SBC1=GHGWRHRW⎛⎜
⎜
⎜⎝w0w1w2w3w0w1w2w3⎞⎟
⎟
⎟⎠
Let V={w0,w1,w2,w3}. Then we can define the various sets through features; specifically, in this example, FA={fG/R,fW/H}. Similarly FD={fS/B}.
Adding a definition of FV={fV} and FW={fV,fG/R}, we can construct the feature split generalised models:
M0={FA⊔FD⊔FW,A×D×W,Q0}.
M1={FA⊔FD⊔FV,A×D×V,Q1}.
The Qi are defined by the matrix above; if we want them to make sense as traditional probability distributions, we might require that Qi(a,d,w)=1/8 whenever it is non-zero, with 8=||A×D|| the size of the matrix. In that case, Qi(Ei)=1, as required.
Notes on non-synonyms
Some of the terminology is repeated between the two formalisms, but doesn’t mean the same things. Specifically:
Environments: for Cartesian frames, this is D, the different columns of the matrix. For generalised models, this is the larger set E=A×D×W.
Worlds: for Cartesian frames, this is W, the possible values of the elements of the matrix. For generalised models, this is W=2¯¯¯¯F, the set of all possible values all the features could take. At the very least, W contains E=A×D×W, but it could be much larger.
Cartesian frames as generalised models
Scott presented Cartesian frames/Chu spaces as follows:
Let W be a set of possible worlds. A Cartesian frame C over W is a triple C=(A,D,⋅), where A represents a set of possible ways the agent can be, D represents a set of possible ways the environment can be, and ⋅:A×D→W is an evaluation function that returns a possible world given an element of A and an element of D.
In a previous post, I defined GM, the category of generalised models.
In this post, I’ll try and see how these two formalisms relate to each other.
Equivalence with Cartesian frames
We’ll now demonstrate the equivalence of Cartesian frames morphisms with the morphisms of generalised models. To do so, and avoid a collision of symbols, I’ve slightly tweaked the notation for Cartesian frames.
Equivalence of morphisms
Let C0=(A0,D0,⋆0) and C1=(A1,D1,⋆1) be Cartesian frames over W: thus there are relations ⋆0:A0×D0→W (written as a0⋆0d0=w) and ⋆1:A1×D1→W (written as a1⋆1d1=w′).
A morphism between them is a pair of maps (g0:A0→A1,h1:D1→D0), such that, for all a0∈A0 and d1∈D1, g0(a0)⋆1d1=a0⋆0h1(d1).
How can we express this in the generalised model formalism?
First, let Ei=Ai×Di×W. In terms of features, this can be defined by setting ¯¯¯fAi=Ai, ¯¯¯fDi=D and ¯¯¯fW=W. Then Fi={fAi,fDi,fW}, and Mi=(Fi,Ei,Qi) is the feature-split generalised model with Ai⊂2¯¯¯fAi=2Ai, Di⊂2¯¯¯fDi=2Di, and W⊂2¯¯¯fW=2W.
As we’ll see in the bears example, there can be more interesting ways of defining the feature split Mi.
Then the map pair (g0,h1) is equivalent to the (feature-split) relation r, defined such that (a0,d0,w)∼r(a1,d1,w′) iff:
g0(a0)=a1,
h1(d1)=d0,
and w=w′.
Without loss of clarity, we can thus write r as the feature-split relation (g0,h0,IdW).
Composing (g0,h1) and (g1,h2) generates (g1∘g0,h1∘h2). Take r as the relation defined by (g0,h1) and q as the relation defined by (g1,h2). Then if (a0,d0,w)∼pr(a2,d2,w′′), there must exist an (a1,d1,w′) with (a0,d0,w)∼r(a1,d1,w′)∼p(a2,d2,w′′). Then:
g1g0(a0)=g1(a1)=a2,
h1h2(d2)=h1(d1)=d0,
w=w′=w′′.
So composition of morphisms (g,h) for Cartesian frames is the same as the composition of corresponding relations (g,h,IdW).
The extra structure
We have two structures to add: Cartesian frames have the ⋆ map, while generalised models have the probability measures Q; we need to relate them.
One natural way to relate them is to consider that if a⋆d=w, then we should get Q(w∣a,d)=1 and have Q(w′∣a,d)=0 for w′≠w. This reflects the fact that action a and environment d lead inevitably to world w.
Now Q(w∣a,d)=Q(a,d,w)/Q(a,d,W), where Q(a,d,W) denotes Q on the set {a}×{d}×W; this is ∑w′∈WQ(a,d,w′).
Hence the desired condition on Q(w∣a,d) is equivalent with Q(a,d,w)=0 iff a⋆d≠w. There are, of course, multiple possible Qs with that property for any given ⋆.
The categorical equivalence
Now let’s tie these together, and define C(W), a subcategory C(W) of the GM, the category of generalised models. This C(W) will map surjectively to the category of Cartesian frames.
The objects of C(W) are those (feature-split) generalised models which have E=A×D×W for some sets A and D, and have Q(a,d,w)=0 iff a⋆d≠w for some evaluation function ⋆:A×D→W.
The morphisms of C(W) are those morphisms of GM that map C(W) to itself, and that are of the form r=(g,h,IdW) for (g,h) a morphism of Cartesian frames.
Thus morphisms of C(W) are derived from morphisms of Chu(W), and are also compatible with the Q structures (since they are also morphisms of GM). Also included are the identity morphisms (IdA,IdD,IdW), which trivially preserve the Q structures.
To demonstrate that C(W) is a category, we need to show that pr is a morphism of it whenever r=(g0,h1,IdW) and p=(g1,h2,IdW) are. We know that pr must respect the Q structures (since r and p are morphisms of GM), while pr=(g1∘g0,h1∘h2,IdW), which derives from (g1∘g0,h1∘h2), a morphism of Chu(W).
Thus C(W) is a category. Let Φ:C(W)→Chu(W) be the map that sends (F,A×D×W,Q) to (A,D,⋆), and sends r=(g,h,IdW) to (g,h).
This Φ is clearly a functor of categories, and it is surjective on the objects of Chu(W) (the Cartesian frames). Now we need to show that it’s also surjective on the morphisms, which comes from the following result:
Let (g0,h1) be a morphism between C0=(A0,D0,⋆0) and C1=(A1,D1,⋆1). Then there exists M0,M1∈C(W) and a morphism r=(g0,h1,IdW) between them such that Φ(Mi)=Ci.
To show that, we need to choose Q0 and Q1 that are compatible with ⋆0 and ⋆1, and are compatible with r.
In fact, we’ll show a slightly stronger result: that for any M0 with Φ(M0)=C0, we can pick an M1 (ie pick a Q1) with the required properties.
To show this, note that r=(g0,h1,IdW) will relate every element of (g−1(a1),d0,w) with every element of (a1,h−11(d0),w). In fact, r is defined by such relations, for any a1∈A1, d0∈D1 and w∈W. No other elements are related by r.
For compatibility of r with the Qs, it suffices that Q0(g−1(a1),d0,w) be equal to Q1(a1,h−11(d0),w).
For any d1∈D1, define #d1 as the size of h−11(h1(d1)); since d1∈h−11(h1(d1)), #d1≥1.
Then define Q1(a1,d1,w) as Q0(g−1(a1),h1(d1),w)/#d1. This will give the compatibility that we want.
Hence Φ:C(W)→Chu(W) is a surjective functor of categories, from a subcategory of GM, the category of generalised models.
More functors
Given two sets W and V, and a function p:W→V, there is an induced functor p:Chu(W)→Chu(V), sending (a,d,w) to (a,d,p(w)) and sending the morphism (g,h) to the morphism with the same underlying functions, (g,h).
Then by the above, we have C(W) and C(V) as distinct subcategories of GM, with functors ΦW and ΦV sending these subcategories to Chu(W) and Chu(V).
Then p also induces a functor C(W)→C(V), by sending (a,d,w)∈E=A×D×W to (a,d,p(w)). The induced Q is given by Q(a,d,v)=∑w∈p−1(v)Q(a,d,v).
Note that p is not only a functor C(W)→C(V), it also acts as a morphism between any M0∈C(W) to p(M0)∈C(V), when both are seen as elements of GM. We’ll designate these various morphisms by p as well. As a functor, p maps the relation r=(g,h,IdW) to p(r)(g,h,IdV); seeing p as a morphism, p(r) is precisely prp−1, where p−1 is the opposite morphism to p: (a,d,v)∼p−1(a,d,w) iff p(w)=v.
We can see that the morphism p commutes with the Φs:
ΦV∘p=p∘ΦW.
This is probably enough exploration of the functorial properties of these spaces for one post.
An example: colours and bears
To illustrate, let’s use the Cartesian frame from this post; this construction will also show how features can figure non-trivially in this construction.
Here the agent has two unrelated choices: which colour to think about (green, G or red R) and whether to go for a walk or stay home (W or H). So A={GH,GW,RH,RW}. The environment is either safe or has bears: D={S,B}.
This gives the following frame C0:
SBC0=GHGWRHRW⎛⎜ ⎜ ⎜⎝w0w1w2w3w4w5w6w7⎞⎟ ⎟ ⎟⎠
Of course, w0 and w4 only differ in the colour that the agent is thinking about (similarly for w1 and w5, etc...). We could choose a C1 frame that doesn’t distinguish between these thoughts:
SBC1=GHGWRHRW⎛⎜ ⎜ ⎜⎝w0w1w2w3w0w1w2w3⎞⎟ ⎟ ⎟⎠
Let V={w0,w1,w2,w3}. Then we can define the various sets through features; specifically, in this example, FA={fG/R,fW/H}. Similarly FD={fS/B}.
Adding a definition of FV={fV} and FW={fV,fG/R}, we can construct the feature split generalised models:
M0={FA⊔FD⊔FW,A×D×W,Q0}.
M1={FA⊔FD⊔FV,A×D×V,Q1}.
The Qi are defined by the matrix above; if we want them to make sense as traditional probability distributions, we might require that Qi(a,d,w)=1/8 whenever it is non-zero, with 8=||A×D|| the size of the matrix. In that case, Qi(Ei)=1, as required.
Notes on non-synonyms
Some of the terminology is repeated between the two formalisms, but doesn’t mean the same things. Specifically:
Environments: for Cartesian frames, this is D, the different columns of the matrix. For generalised models, this is the larger set E=A×D×W.
Worlds: for Cartesian frames, this is W, the possible values of the elements of the matrix. For generalised models, this is W=2¯¯¯¯F, the set of all possible values all the features could take. At the very least, W contains E=A×D×W, but it could be much larger.