In this post, given a finite factored setF=(S,B), we will show how to associate each E⊆S with a characteristic polynomial, QFE. We will discuss how to factor these characteristic polynomials, and use these characteristic polynomials to build up to the fundamental theorem of finite factored sets, which associates conditional orthogonality with conditional independence in probability distributions.
5.1. Characteristic Polynomials
Definition 28. Given a finite factored set F=(S,B), let PolyF denote the ring of polynomials with coefficients in R and variables in P(S).
Definition 29. Given a finite factored set F=(S,B), a p∈PolyF, and an f:P(S)→R, we write p(f)∈R for the evaluation of p at f, computed by replacing each E⊆S with f(E).
Definition 30. Given a finite factored set F=(S,B), and a polynomial p∈PolyF, supp(p)⊆P(S) denotes the set of all variables v∈P(S) that appear in p. supp(p) is called the support of p.
Definition 31. Given a finite factored set F=(S,B) and an E⊆S, let QFE∈PolyF be given by QFE=∑s∈E∏b∈B[s]b. QFE is called the characteristic polynomial of E (in F).
We will be building up to an understanding of how to factor QFE into irreducibles. For that, we will first need to give some basic notation for manipulating polynomials in PolyF.
Definition 32. Given a finite factored set F=(S,B), an s∈S, and a C⊆B, let monoFC(s)∈PolyF be given by monoFC(s)=∏b∈C[s]b.
Definition 33.Given a finite factored set F=(S,B), an E⊆S, and a C⊆B, let monosFC(E)∈P(PolyF) be given by monosFC(E)={monoFC(s)∣s∈E}.
Definition 34.Given a finite factored set F=(S,B), an E⊆S, and a C⊆B, let polyFC(E)∈PolyF be given by polyFC(E)=∑m∈monosFC(E)m.
Proposition 26. Let F=(S,B) be a finite factored set, and let E⊆S. Then QFE=polyFB(E).
Proof. We start by showing that for all s≠t∈S, monoFB(s)≠monoFB(t).
Let s≠t∈S be arbitrary. By Proposition 3, if s≠t, there must be some b∈B such that [s]b≠[t]b. Then, note that [s]b∈supp(monoFB(s)). If [s]b were also in supp(monoFB(t)), then t would be in both [s]b and [t]b, contradicting the fact that these two sets are disjoint. Therefore monoFB(s)≠monoFB(t).
Thus monosFB(E) has exactly one element for each element of E, so we have that ∑m∈monosFB(E)m=∑s∈EmonoFB(s)=QFE. □
Proposition 27. Let F=(S,B) be a finite factored set, and let E0,E1⊆S be subsets of S. Let C0,C1⊆B be disjoint subsets of B. Let E2=χFC0(E0,E1), and let C2=C0∪C1. Then polyFC2(E2)=polyFC0(E0)⋅polyFC1(E1).
Proof. For i∈{0,1,2}, let Mi=monosFCi(Ei). We will start by showing that f:M0×M1→M2, given by f(m0,m1)=m0m1, is a well-defined function and a bijection.
First, observe that it follows immediately from the definition that for all s0,s1∈S, if s2=χFC0(s0,s1) we have that monoFC0(s0)=monoFC0(s2), monoFC1(s1)=monoFC1(s2), and monoFC0(s2)⋅monoFC1(s2)=monoFC2(s2). Combining these, we get that monoFC0(s0)⋅monoFC1(s1)=monoFC2(χFC0(s0,s1)).
For all (m0,m1)∈M0×M1, there exists some s0∈E0 such that m0=monoFC0(s0), and some s1∈E1 such that m1=monoFC1(s1), and this gives us that m0m1=monoFC0(s0)monoFC1(s1)=monoFC2(χFC(s0,s1))∈M2. Thus, f is well-defined.
To see that f is surjective, observe that for all m2∈M2, there exists an s2∈E2 such that m2=monoFC2(s2), and there exist s0∈E0 and s1∈E1 such that s2=χFC(s0,s1), and we have f(monoFC0(s0),monoFC1(s1))=m2.
To see that f is injective, observe that for i∈{0,1}, for all mi∈Mi, supp(mi)⊆⋃b∈Cib. Further, ⋃b∈C0b and ⋃b∈C1b are disjoint. Thus, for all m0∈M0 and m1∈M1, supp(mi)=supp(m0m1)∩⋃b∈Cib.
This means that for all m0,m′0∈M0 and m1,m′1∈M1, if m0m1=m′0m′1, then supp(m0)=supp(m′0) and supp(m1)=supp(m′1). However, every monomial in M0 or M1 is just equal to the product of all variables in its support. Thus m0=∏v∈supp(m0)v=m′0 and m1=∏v∈supp(m1)v=m′1. Thus f is injective, and thus a bijection between M0×M1 and M2.
Proposition 28. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. If p divides QFE, then p=r⋅polyFC(E), for some r∈R and C⊆B.
Proof. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Let p,q∈PolyF satisfy pq=QFE. We thus must have supp(p)∪supp(q)=supp(QFE).
If there were some T∈supp(p)∩supp(q), then the degree of T in QFE would be at least 2, contradicting the definition of QFE and Corollary 1. Thus, supp(p)∩supp(q)={}.
There can be no combining like terms, then, in the product pq. The monomial terms in QFE are in bijective correspondence to the pairs of monomial terms in p and monomial terms in q.
In particular, this means that since all the coefficients in pq are equal to 1, all the coefficients in p must be equal to some r∈R, and all of the coefficients in q must be equal to 1/r.
Further, for all b∈B, if b∩supp(p) is nonempty, b∩supp(q) must be empty, since otherwise QFE would contain a term with two factors in b, which clearly never happens according to the definition of QFE.
Since E is nonempty, for each b∈B there must be some T∈b∩supp(QFE). Thus at least one of b∩supp(p) and b∩supp(q) must be nonempty, so exactly one of b∩supp(p) and b∩supp(q) must be nonempty.
Let C be the set of all b∈B such that b∩supp(p) is nonempty.
For every b∈C, every term of QFE has exactly one factor in b. Thus, every term in p has exactly one factor in b. These cover all variables in the support of p, so each term in p must have total degree |C|.
For each m∈monosFC(E), m divides a term in QFE. Since m has no common support with q, m must also divide a term in p. Thus r⋅m must be a term in p. Conversely, every term in p divides a term in QFE, and thus must be in monosFC(E). Thus every term in p is of the form r⋅m for some m∈monosFC(E). Thus p=∑m∈monosFC(E)r⋅m=r⋅polyFC(E). □
5.2. Factoring Characteristic Polynomials
We will now show how to factor characteristic polynomials into irreducibles.
Definition 35. Given a finite factored set F=(S,B), and a nonempty subset E⊆S, let IrrF(E)⊆P(B) denote the set of all C⊆B such that:
C is nonempty,
χFC(E,E)=E, and
there is no nonempty strict subset D⊂C such that χFD(E,E)=E.
Proposition 29. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Then IrrF(E)∈Part(B).
Proof. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. It suffices to show that the sets in IrrF(E) are pairwise disjoint and cover B.
We start by showing that the set of all C⊆B satisfying χFC(E,E)=E is closed under intersection. Indeed, if χFC0(E,E)=E and χFC1(E,E)=E, then χFC0∩C1(E,E)=χFC0(E,χFC1(E,E))=χFC0(E,E)=E.
Next, observe that χFB(E,E)=E. Thus, for all b∈B, we can consider Cb=⋂C⊆B,b∈C,χFC(E,E)=EC. Since Cb is an intersection of a finite nonempty collection of sets C satisfying χFC(E,E)=E, we have that χFCb(E,E)=E. Further, b∈Cb, so Cb is nonempty.
Assume for the purpose of contradiction that there is some nonempty strict subset D⊂Cb such that χFD(E,E)=E. If b∈D, then we have a contradiction by the definition of Cb. If b∉D, then note that χFB∖D(E,E)=E, so χFCb∖D(E,E)=E, and Cb∖D is a nonempty strict subset of Cb that contains b, contradicting the definition of Cb.
Thus Cb∈IrrF(E) for all b∈B, and since b∈Cb, this means that the sets in IrrF(E) cover B.
Next, we need to show that the sets in IrrF(E) are pairwise disjoint. Let C0,C1∈IrrF(E) be arbitrary distinct elements. We have that χFC0∩C1(E,E)=E, and C0∩C1 is a subset of C0 and C1, and thus a strict subset of at least one of them. Thus C0∩C1 is empty.
Thus IrrF(E)∈Part(B). □
The following two propositions constitute a factorization of QFE into irreducibles.
Proposition 30.Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Then QFE=∏C∈IrrF(E)polyFC(E).
Proof. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Let n=|IrrF(E)|, and let IrrF(E)={C0,…,Cn−1}. For 0≤k<n, let C≤k=⋃ki=0Ci.
We will show by induction on k that ∏ki=0polyFCi(E)=polyFC≤k(E) for all 0≤k<n.
If k=0, the result is trivial, as ∏0i=0polyFCi(E)=polyFC0(E)=polyFC≤0(E).
For k>0, observe that Ck and C≤k−1 are disjoint, and that E=χFCk(E,E), thus by Proposition 27, we have polyFCk(E)⋅polyFC≤k−1(E)=polyFC≤k(E). Thus, by induction, we get ∏ki=0polyFCi(E)=polyFC≤k(E).
In the case where k=n−1, this gives that ∏C∈IrrF(E)polyFC(E)=polyFB(E)=QFE. □
Proposition 31.Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Then polyFC(E) is irreducible for all C∈IrrF(E).
Proof. Let F=(S,B) be a finite factored set, let E be a nonempty subset of S, and let C∈IrrF(E).
Assume for the purpose of contradiction that p0⋅p1=polyFC(E), and that both p0 and p1 have nonempty support. By Proposition 28, we have that pi=ri⋅polyFCi(E), for some r0,r1∈R, and C0,C1⊆B.
We will first need to show that C0 and C1 are nonempty and disjoint. They must be nonempty, because p0 and p1 have nonempty support. Assume for the purpose of contradiction that b∈C0∩C1. Let s be an element of E, and note that for i∈{0,1}, we have [s]b∈supp polyFCi(E). Thus [s]b must be degree at least 2 in polyFC(E), which contradicts the fact that every variable clearly has degree at most 1 in polyFC(E).
Next, we need to show that C0∪C1=C. We already know that
Let s be an element of E. Given an arbitrary b∈B, we have that b∈C if and only if [s]b∈supp(polyFC(E)) if and only if [s]b∈supp(polyFCi(E)) for some i∈{0,1} if and only if b∈C0∪C1.
We now have that C0 and C1 are disjoint and that C=C0∪C1. Thus, by Proposition 27, we have that polyFC0(E)⋅polyFC1(E)=polyFC(χFC0(E,E)). Thus polyFC(E)=r0r1polyFC(χFC0(E,E)), so monosFC(E)=monosFC(χFC0(E,E)).
Let s0,s1∈E be arbitrary, and let s2=χFC0(s0,s1). Note that monoFC(s2)∈monosFC(χFC0(E,E))=monosFC(E), so there is some s3∈E such that monoFC(s2)=monoFC(s3). Thus s2∼bs3 for all b∈C. However, we also have that s2∼bs1 for all b∈B∖C, so s2=χFC(s3,s1). Since C∈IrrF(E), χFC(E,E)=E, so s2=χFC0(s0,s1)∈E. Since s0 and s1 were arbitrary elements of E, we have that χFC0(E,E)=E. Since C0 is a nonempty strict subset of C, this contradicts the fact that C∈IrrF(E).
Thus, polyFC(E) is irreducible for all C∈IrrF(E). □
5.3. Characteristic Polynomials and Orthogonality
We can now give an alternate characterization of conditional orthogonality in terms of divisibility of characteristic polynomials.
Lemma 3. Let F=(S,B) be a finite factored set, and let X,Y,Z∈Part(S) be partitions of S. The following are equivalent.
X⊥FY|Z.
QFz divides QFx∩z⋅QFy∩z for all x∈X, y∈Y, and z∈Z.
QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z for all x∈X, y∈Y, and z∈Z.
Proof. Clearly condition 3 implies condition 2. We will first show that condition 1 implies condition 3, and then show that condition 2 implies condition 1.
Let F=(S,B), and let X,Y,Z∈Part(S) satisfy X⊥FY|Z. Consider an arbitrary x∈X, y∈Y, and z∈Z. We want to show that QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z.
Let C=hF(X|z). Clearly C⊢FX|z. We thus have that χFC(z,z)=z, so χFB∖C(z,z)=z. We also have that hF(Y|z)⊆B∖C, so Y|z≤z(⋁S(B∖C))|z. These two together give that B∖C⊢FY|z.
Since C⊢FX|z, we have that χFC(x∩z,z)=x∩z. Thus, by Proposition 27, we have that polyFC(x∩z)⋅polyFB∖C(z)=QFx∩z. Similarly, since B∖C⊢FY|z, we have that polyFC(z)⋅polyFB∖C(y∩z)=QFy∩z.
Since χFC(x∩z,y∩z)⊆χFC(x∩z,z)=x∩z, and χFC(x∩z,y∩z)⊆χFC(z,y∩z)=y∩z, we have χFC(x∩z,y∩z)⊆x∩y∩z. We also have that
χFC(x∩z,y∩z)⊇χFC(x∩y∩z,x∩y∩z)⊇x∩y∩z.
Thus χFC(x∩z,y∩z)=x∩y∩z.
By Proposition 27, this gives that polyFC(x∩z)⋅polyFB∖C(y∩z)=QFx∩y∩z.
Finally, since χFC(z,z)=z, we have that polyFC(z)⋅polyFB∖C(z)=QFz.
Thus, QFz⋅QFx∩y∩z and QFx∩z⋅QFy∩z are both equal to polyFC(x∩z)⋅polyFB∖C(y∩z)⋅polyFC(z)⋅polyFB∖C(z).
Thus, condition 1 implies condition 3. It remains to show that condition 2 implies condition 1.
Fix F=(S,B), and X,Y,Z∈Part(S), and let QFz divide QFx∩z⋅QFy∩z for all x∈X, y∈Y, and z∈Z. Assume for the purpose of contradiction that it is not the case that X⊥FY|Z. Thus, there exists some z∈Z such that hF(X|z)∩hF(Y|z)≠{}. Let z∈Z and b∈B satisfy b∈hF(X|z)∩hF(Y|z)≠{}.
Let C⊆B be such that b∈C and C∈IrrF(z), and let p=polyFC(z). Thus, p is an irreducible factor of QFz.
Either p divides QFx∩z for all x∈X or p divides QFy∩z for all y∈Y, since otherwise there would exist an x∈X and a y∈Y such that p divides neither QFx∩z nor QFx∩z, but does divide their product, contradicting the fact that p is irreducible, and thus prime.
Assume without loss of generality that p divides QFx∩z for all x∈X. Fix an x∈X. Let us first restrict attention to the case where x∩z is nonempty.
Let QFx∩z=p⋅q. By Proposition 28, p=r0⋅polyFC0(x∩z) and q=r1⋅polyFC1(x∩z) for some r0,r1∈R and C0,C1⊆B. We will show that C0=C,C1=B∖C, and r0=r1=1.
Let s be an element of x∩z. Then for all b∈B, b∈C if and only if [s]b∈supp(p) if and only if [s]b∈supp(polyFC0(x∩z)) if and only if b∈C0. Thus C0=C.
For all b∈B∖C, we have [s]b∈supp(QFx∩z) and [s]b∉supp(p), so [s]b∈supp(q), so b∈C1. Similarly, for all b∈C1, [s]b∈supp(q), so [s]b∉supp(p), so b∈B∖C. Thus C1=B∖C.
Since p and polyFC0(x∩z) both have all coefficients equal to 1, we have r0=1. Thus, p=polyFC(x∩z).
Similarly, since all the coefficients of p are 1 and all the coefficients of QFx∩z are 1, all the coefficients of q are 1, so r1=1. Thus, q=polyFB∖C(x∩z).
We thus have that QFx∩z=polyFC(z)⋅polyFB∖C(x∩z).
In the case where x∩z is empty, we also have QFx∩z=polyFC(z)⋅polyFB∖C(x∩z), since both sides are 0.
By Proposition 27, QFx∩z=polyFB(χFC(z,x∩z)). Thus, monosFB(x∩z)=monosFB(χFC(z,x∩z)), so x∩z=χFC(z,x∩z)=χFB∖C(x∩z,z).
Since x∩z=χFB∖C(x∩z,z) for all x∈X, we have that B∖C⊢FX|z. However, this contradicts the fact that b∉B∖C, and b∈hF(X|z).
Thus, condition 2 implies condition 1. □
5.4. Probability Distributions on Finite Factored Sets
The primary purpose of all this discussion of characteristic polynomials has been to build up to thinking about the relationship between orthogonality and probabilistic independence. We will now discuss probability distributions on finite factored sets.
Recall the definition of a probability distribution.
Definition 36. Given a finite set S, a probability distribution on S is a function P:P(S)→R such that
A probability distribution on a finite factored set F is a probability distribution on its underlying set that also satisfies another condition, which represents the probability distribution coming from a product of distributions on the underlying factors.
Definition 37. Given a finite factored set F=(S,B), a probability distribution on F is a probability distribution P on S such that for all s∈S, we have P({s})=∏b∈BP([s]b).
Proposition 32. Given a finite factored set F=(S,B), a probability distribution on S is a probability distribution P on F if and only if P(E)=QFE(P) for all E⊆S.
Proof. If P(E)=QFE(P) for all E⊆S, in particular this means that P({s})=QF{s}(P)=(∏b∈B[s]b)(P)=∏b∈BP([s]b) for all s∈S.
Conversely, if P({s})=∏b∈BP([s]b) for all s∈S, then for all E⊆S,P(E)=∑s∈E∏b∈BP([s]b)=(∑s∈E∏b∈B[s]b)(P)=QFE(P). □
5.5. The Fundamental Theorem of Finite Factored Sets
We are now ready to state and prove the fundatmental theorem of finite factored sets.
Theorem 3. Let F=(S,B) be a finite factored set, and let X,Y,Z∈Part(S) be partitions of S. Then X⊥FY|Z if and only if for all probability distributions P on F and all x∈X, y∈Y, and z∈Z, we have P(x∩z)⋅P(y∩z)=P(x∩y∩z)⋅P(z).
Proof. We already have by Lemma 3 that if X⊥FY|Z, then for all x∈X, y∈Y, and z∈Z, QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z. Thus for any probability distribution P on F, we have
Conversely, assume that for all probability distributions P on F, and all x∈X, y∈Y, and z∈Z, we have P(x∩z)⋅P(y∩z)=P(x∩y∩z)⋅P(z).
If S is empty, then {} is the unique partition of S, and we have {}⊥F{}|{}. Thus, we can restrict our attention to the case where S is nonempty.
Fix an arbitrary x∈X, y∈Y, and z∈Z. Let q=QFx∩z⋅QFy∩z−QFx∩y∩z⋅QFz. We will first show that q(f)=0 for all f:P(S)→R>0.
Given an arbitrary f:P(S)→R>0, we can define Pf:P(S)→R by Pf(E)=QFE(f)/QFS(f), and we will show that Pf is a distribution on F.
Pf is well-defined because QFS(f) is a nonempty sum of products of positive real numbers, and thus positive. Further, since QFE(f) is a sum of products of positive real numbers, Pf(E)≥0 for all E⊆S. Since QF{}=0, we also have Pf({})=0. Clearly Pf(S)=1. Finally, for all E0,E1⊆S with E0∩E1={}, we have
Therefore Pf is a distribution on S. We still need to show that Pf is a distribution on F.
Observe that for all s∈S and b∈B, since χF{b}([s]b,S)=[s]b, we have that QF[s]b(f)=polyF{b}([s]b)⋅polyFB∖{b}(S), and since χF{b}(S,S)=S, we have that QFS(f)=polyF{b}(S)⋅polyFB∖{b}(S). Thus, we have that
Thus, q is a polynomial that is zero on an open subset of inputs, so q is the zero polynomial. Thus QFx∩z⋅QFy∩z−QFz⋅QFx∩y∩z=0, so QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z. Since x∈X, y∈Y, and z∈Z were arbitrary, by Lemma 3, we have X⊥FY|Z. □
In the next two posts, we will introduce temporal inference using finite factored sets, and discuss future potential research directions.
Finite Factored Sets: Polynomials and Probability
In this post, given a finite factored set F=(S,B), we will show how to associate each E⊆S with a characteristic polynomial, QFE. We will discuss how to factor these characteristic polynomials, and use these characteristic polynomials to build up to the fundamental theorem of finite factored sets, which associates conditional orthogonality with conditional independence in probability distributions.
5.1. Characteristic Polynomials
Definition 28. Given a finite factored set F=(S,B), let PolyF denote the ring of polynomials with coefficients in R and variables in P(S).
Definition 29. Given a finite factored set F=(S,B), a p∈PolyF, and an f:P(S)→R, we write p(f)∈R for the evaluation of p at f, computed by replacing each E⊆S with f(E).
Definition 30. Given a finite factored set F=(S,B), and a polynomial p∈PolyF, supp(p)⊆P(S) denotes the set of all variables v∈P(S) that appear in p. supp(p) is called the support of p.
Definition 31. Given a finite factored set F=(S,B) and an E⊆S, let QFE∈PolyF be given by QFE=∑s∈E∏b∈B[s]b. QFE is called the characteristic polynomial of E (in F).
We will be building up to an understanding of how to factor QFE into irreducibles. For that, we will first need to give some basic notation for manipulating polynomials in PolyF.
Definition 32. Given a finite factored set F=(S,B), an s∈S, and a C⊆B, let monoFC(s)∈PolyF be given by monoFC(s)=∏b∈C[s]b.
Definition 33. Given a finite factored set F=(S,B), an E⊆S, and a C⊆B, let monosFC(E)∈P(PolyF) be given by monosFC(E)={monoFC(s)∣s∈E}.
Definition 34. Given a finite factored set F=(S,B), an E⊆S, and a C⊆B, let polyFC(E)∈PolyF be given by polyFC(E)=∑m∈monosFC(E)m.
Proposition 26. Let F=(S,B) be a finite factored set, and let E⊆S. Then QFE=polyFB(E).
Proof. We start by showing that for all s≠t∈S, monoFB(s)≠monoFB(t).
Let s≠t∈S be arbitrary. By Proposition 3, if s≠t, there must be some b∈B such that [s]b≠[t]b. Then, note that [s]b∈supp(monoFB(s)). If [s]b were also in supp(monoFB(t)), then t would be in both [s]b and [t]b, contradicting the fact that these two sets are disjoint. Therefore monoFB(s)≠monoFB(t).
Thus monosFB(E) has exactly one element for each element of E, so we have that ∑m∈monosFB(E)m=∑s∈EmonoFB(s)=QFE. □
Proposition 27. Let F=(S,B) be a finite factored set, and let E0,E1⊆S be subsets of S. Let C0,C1⊆B be disjoint subsets of B. Let E2=χFC0(E0,E1), and let C2=C0∪C1. Then polyFC2(E2)=polyFC0(E0)⋅polyFC1(E1).
Proof. For i∈{0,1,2}, let Mi=monosFCi(Ei). We will start by showing that f:M0×M1→M2, given by f(m0,m1)=m0m1, is a well-defined function and a bijection.
First, observe that it follows immediately from the definition that for all s0,s1∈S, if s2=χFC0(s0,s1) we have that monoFC0(s0)=monoFC0(s2), monoFC1(s1)=monoFC1(s2), and monoFC0(s2)⋅monoFC1(s2)=monoFC2(s2). Combining these, we get that monoFC0(s0)⋅monoFC1(s1)=monoFC2(χFC0(s0,s1)).
For all (m0,m1)∈M0×M1, there exists some s0∈E0 such that m0=monoFC0(s0), and some s1∈E1 such that m1=monoFC1(s1), and this gives us that m0m1=monoFC0(s0)monoFC1(s1)=monoFC2(χFC(s0,s1))∈M2. Thus, f is well-defined.
To see that f is surjective, observe that for all m2∈M2, there exists an s2∈E2 such that m2=monoFC2(s2), and there exist s0∈E0 and s1∈E1 such that s2=χFC(s0,s1), and we have f(monoFC0(s0),monoFC1(s1))=m2.
To see that f is injective, observe that for i∈{0,1}, for all mi∈Mi, supp(mi)⊆⋃b∈Cib. Further, ⋃b∈C0b and ⋃b∈C1b are disjoint. Thus, for all m0∈M0 and m1∈M1, supp(mi)=supp(m0m1)∩⋃b∈Cib.
This means that for all m0,m′0∈M0 and m1,m′1∈M1, if m0m1=m′0m′1, then supp(m0)=supp(m′0) and supp(m1)=supp(m′1). However, every monomial in M0 or M1 is just equal to the product of all variables in its support. Thus m0=∏v∈supp(m0)v=m′0 and m1=∏v∈supp(m1)v=m′1. Thus f is injective, and thus a bijection between M0×M1 and M2.
Now, we have that
polyFC0(E0)⋅polyFC1(E1)=⎛⎝∑m0∈M0m0⎞⎠⎛⎝∑m1∈M1m1⎞⎠=∑m0∈M0∑m1∈M1m0m1=∑(m0,m1)∈M0×M1m0m1=∑(m0,m1)∈M0×M1f(m0,m1)=∑m2∈M2m2=polyFC2(E2).□
Proposition 28. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. If p divides QFE, then p=r⋅polyFC(E), for some r∈R and C⊆B.
Proof. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Let p,q∈PolyF satisfy pq=QFE. We thus must have supp(p)∪supp(q)=supp(QFE).
If there were some T∈supp(p)∩supp(q), then the degree of T in QFE would be at least 2, contradicting the definition of QFE and Corollary 1. Thus, supp(p)∩supp(q)={}.
There can be no combining like terms, then, in the product pq. The monomial terms in QFE are in bijective correspondence to the pairs of monomial terms in p and monomial terms in q.
In particular, this means that since all the coefficients in pq are equal to 1, all the coefficients in p must be equal to some r∈R, and all of the coefficients in q must be equal to 1/r.
Further, for all b∈B, if b∩supp(p) is nonempty, b∩supp(q) must be empty, since otherwise QFE would contain a term with two factors in b, which clearly never happens according to the definition of QFE.
Since E is nonempty, for each b∈B there must be some T∈b∩supp(QFE). Thus at least one of b∩supp(p) and b∩supp(q) must be nonempty, so exactly one of b∩supp(p) and b∩supp(q) must be nonempty.
Let C be the set of all b∈B such that b∩supp(p) is nonempty.
For every b∈C, every term of QFE has exactly one factor in b. Thus, every term in p has exactly one factor in b. These cover all variables in the support of p, so each term in p must have total degree |C|.
For each m∈monosFC(E), m divides a term in QFE. Since m has no common support with q, m must also divide a term in p. Thus r⋅m must be a term in p. Conversely, every term in p divides a term in QFE, and thus must be in monosFC(E). Thus every term in p is of the form r⋅m for some m∈monosFC(E). Thus p=∑m∈monosFC(E)r⋅m=r⋅polyFC(E). □
5.2. Factoring Characteristic Polynomials
We will now show how to factor characteristic polynomials into irreducibles.
Definition 35. Given a finite factored set F=(S,B), and a nonempty subset E⊆S, let IrrF(E)⊆P(B) denote the set of all C⊆B such that:
C is nonempty,
χFC(E,E)=E, and
there is no nonempty strict subset D⊂C such that χFD(E,E)=E.
Proposition 29. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Then IrrF(E)∈Part(B).
Proof. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. It suffices to show that the sets in IrrF(E) are pairwise disjoint and cover B.
We start by showing that the set of all C⊆B satisfying χFC(E,E)=E is closed under intersection. Indeed, if χFC0(E,E)=E and χFC1(E,E)=E, then χFC0∩C1(E,E)=χFC0(E,χFC1(E,E))=χFC0(E,E)=E.
Next, observe that χFB(E,E)=E. Thus, for all b∈B, we can consider Cb=⋂C⊆B,b∈C,χFC(E,E)=EC. Since Cb is an intersection of a finite nonempty collection of sets C satisfying χFC(E,E)=E, we have that χFCb(E,E)=E. Further, b∈Cb, so Cb is nonempty.
Assume for the purpose of contradiction that there is some nonempty strict subset D⊂Cb such that χFD(E,E)=E. If b∈D, then we have a contradiction by the definition of Cb. If b∉D, then note that χFB∖D(E,E)=E, so χFCb∖D(E,E)=E, and Cb∖D is a nonempty strict subset of Cb that contains b, contradicting the definition of Cb.
Thus Cb∈IrrF(E) for all b∈B, and since b∈Cb, this means that the sets in IrrF(E) cover B.
Next, we need to show that the sets in IrrF(E) are pairwise disjoint. Let C0,C1∈IrrF(E) be arbitrary distinct elements. We have that χFC0∩C1(E,E)=E, and C0∩C1 is a subset of C0 and C1, and thus a strict subset of at least one of them. Thus C0∩C1 is empty.
Thus IrrF(E)∈Part(B). □
The following two propositions constitute a factorization of QFE into irreducibles.
Proposition 30. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Then QFE=∏C∈IrrF(E)polyFC(E).
Proof. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Let n=|IrrF(E)|, and let IrrF(E)={C0,…,Cn−1}. For 0≤k<n, let C≤k=⋃ki=0Ci.
We will show by induction on k that ∏ki=0polyFCi(E)=polyFC≤k(E) for all 0≤k<n.
If k=0, the result is trivial, as ∏0i=0polyFCi(E)=polyFC0(E)=polyFC≤0(E).
For k>0, observe that Ck and C≤k−1 are disjoint, and that E=χFCk(E,E), thus by Proposition 27, we have polyFCk(E)⋅polyFC≤k−1(E)=polyFC≤k(E). Thus, by induction, we get ∏ki=0polyFCi(E)=polyFC≤k(E).
In the case where k=n−1, this gives that ∏C∈IrrF(E)polyFC(E)=polyFB(E)=QFE. □
Proposition 31. Let F=(S,B) be a finite factored set, and let E be a nonempty subset of S. Then polyFC(E) is irreducible for all C∈IrrF(E).
Proof. Let F=(S,B) be a finite factored set, let E be a nonempty subset of S, and let C∈IrrF(E).
Assume for the purpose of contradiction that p0⋅p1=polyFC(E), and that both p0 and p1 have nonempty support. By Proposition 28, we have that pi=ri⋅polyFCi(E), for some r0,r1∈R, and C0,C1⊆B.
We will first need to show that C0 and C1 are nonempty and disjoint. They must be nonempty, because p0 and p1 have nonempty support. Assume for the purpose of contradiction that b∈C0∩C1. Let s be an element of E, and note that for i∈{0,1}, we have [s]b∈supp polyFCi(E). Thus [s]b must be degree at least 2 in polyFC(E), which contradicts the fact that every variable clearly has degree at most 1 in polyFC(E).
Next, we need to show that C0∪C1=C. We already know that
supp(polyFC(E))=supp(r0r1polyFC0(E)polyFC1(E))=supp(polyFC0(E))∪supp(polyFC1(E)).Let s be an element of E. Given an arbitrary b∈B, we have that b∈C if and only if [s]b∈supp(polyFC(E)) if and only if [s]b∈supp(polyFCi(E)) for some i∈{0,1} if and only if b∈C0∪C1.
We now have that C0 and C1 are disjoint and that C=C0∪C1. Thus, by Proposition 27, we have that polyFC0(E)⋅polyFC1(E)=polyFC(χFC0(E,E)). Thus polyFC(E)=r0r1polyFC(χFC0(E,E)), so monosFC(E)=monosFC(χFC0(E,E)).
Let s0,s1∈E be arbitrary, and let s2=χFC0(s0,s1). Note that monoFC(s2)∈monosFC(χFC0(E,E))=monosFC(E), so there is some s3∈E such that monoFC(s2)=monoFC(s3). Thus s2∼bs3 for all b∈C. However, we also have that s2∼bs1 for all b∈B∖C, so s2=χFC(s3,s1). Since C∈IrrF(E), χFC(E,E)=E, so s2=χFC0(s0,s1)∈E. Since s0 and s1 were arbitrary elements of E, we have that χFC0(E,E)=E. Since C0 is a nonempty strict subset of C, this contradicts the fact that C∈IrrF(E).
Thus, polyFC(E) is irreducible for all C∈IrrF(E). □
5.3. Characteristic Polynomials and Orthogonality
We can now give an alternate characterization of conditional orthogonality in terms of divisibility of characteristic polynomials.
Lemma 3. Let F=(S,B) be a finite factored set, and let X,Y,Z∈Part(S) be partitions of S. The following are equivalent.
X⊥FY|Z.
QFz divides QFx∩z⋅QFy∩z for all x∈X, y∈Y, and z∈Z.
QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z for all x∈X, y∈Y, and z∈Z.
Proof. Clearly condition 3 implies condition 2. We will first show that condition 1 implies condition 3, and then show that condition 2 implies condition 1.
Let F=(S,B), and let X,Y,Z∈Part(S) satisfy X⊥FY|Z. Consider an arbitrary x∈X, y∈Y, and z∈Z. We want to show that QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z.
Let C=hF(X|z). Clearly C⊢FX|z. We thus have that χFC(z,z)=z, so χFB∖C(z,z)=z. We also have that hF(Y|z)⊆B∖C, so Y|z≤z(⋁S(B∖C))|z. These two together give that B∖C⊢FY|z.
Since C⊢FX|z, we have that χFC(x∩z,z)=x∩z. Thus, by Proposition 27, we have that polyFC(x∩z)⋅polyFB∖C(z)=QFx∩z. Similarly, since B∖C⊢FY|z, we have that polyFC(z)⋅polyFB∖C(y∩z)=QFy∩z.
Since χFC(x∩z,y∩z)⊆χFC(x∩z,z)=x∩z, and χFC(x∩z,y∩z)⊆χFC(z,y∩z)=y∩z, we have χFC(x∩z,y∩z)⊆x∩y∩z. We also have that
χFC(x∩z,y∩z)⊇χFC(x∩y∩z,x∩y∩z)⊇x∩y∩z.Thus χFC(x∩z,y∩z)=x∩y∩z.
By Proposition 27, this gives that polyFC(x∩z)⋅polyFB∖C(y∩z)=QFx∩y∩z.
Finally, since χFC(z,z)=z, we have that polyFC(z)⋅polyFB∖C(z)=QFz.
Thus, QFz⋅QFx∩y∩z and QFx∩z⋅QFy∩z are both equal to polyFC(x∩z)⋅polyFB∖C(y∩z)⋅polyFC(z)⋅polyFB∖C(z).
Thus, condition 1 implies condition 3. It remains to show that condition 2 implies condition 1.
Fix F=(S,B), and X,Y,Z∈Part(S), and let QFz divide QFx∩z⋅QFy∩z for all x∈X, y∈Y, and z∈Z. Assume for the purpose of contradiction that it is not the case that X⊥FY|Z. Thus, there exists some z∈Z such that hF(X|z)∩hF(Y|z)≠{}. Let z∈Z and b∈B satisfy b∈hF(X|z)∩hF(Y|z)≠{}.
Let C⊆B be such that b∈C and C∈IrrF(z), and let p=polyFC(z). Thus, p is an irreducible factor of QFz.
Either p divides QFx∩z for all x∈X or p divides QFy∩z for all y∈Y, since otherwise there would exist an x∈X and a y∈Y such that p divides neither QFx∩z nor QFx∩z, but does divide their product, contradicting the fact that p is irreducible, and thus prime.
Assume without loss of generality that p divides QFx∩z for all x∈X. Fix an x∈X. Let us first restrict attention to the case where x∩z is nonempty.
Let QFx∩z=p⋅q. By Proposition 28, p=r0⋅polyFC0(x∩z) and q=r1⋅polyFC1(x∩z) for some r0,r1∈R and C0,C1⊆B. We will show that C0=C,C1=B∖C, and r0=r1=1.
Let s be an element of x∩z. Then for all b∈B, b∈C if and only if [s]b∈supp(p) if and only if [s]b∈supp(polyFC0(x∩z)) if and only if b∈C0. Thus C0=C.
For all b∈B∖C, we have [s]b∈supp(QFx∩z) and [s]b∉supp(p), so [s]b∈supp(q), so b∈C1. Similarly, for all b∈C1, [s]b∈supp(q), so [s]b∉supp(p), so b∈B∖C. Thus C1=B∖C.
Since p and polyFC0(x∩z) both have all coefficients equal to 1, we have r0=1. Thus, p=polyFC(x∩z).
Similarly, since all the coefficients of p are 1 and all the coefficients of QFx∩z are 1, all the coefficients of q are 1, so r1=1. Thus, q=polyFB∖C(x∩z).
We thus have that QFx∩z=polyFC(z)⋅polyFB∖C(x∩z).
In the case where x∩z is empty, we also have QFx∩z=polyFC(z)⋅polyFB∖C(x∩z), since both sides are 0.
By Proposition 27, QFx∩z=polyFB(χFC(z,x∩z)). Thus, monosFB(x∩z)=monosFB(χFC(z,x∩z)), so x∩z=χFC(z,x∩z)=χFB∖C(x∩z,z).
Since x∩z=χFB∖C(x∩z,z) for all x∈X, we have that B∖C⊢FX|z. However, this contradicts the fact that b∉B∖C, and b∈hF(X|z).
Thus, condition 2 implies condition 1. □
5.4. Probability Distributions on Finite Factored Sets
The primary purpose of all this discussion of characteristic polynomials has been to build up to thinking about the relationship between orthogonality and probabilistic independence. We will now discuss probability distributions on finite factored sets.
Recall the definition of a probability distribution.
Definition 36. Given a finite set S, a probability distribution on S is a function P:P(S)→R such that
P(E)≥0 for all E⊆S,
P({})=0,
P(S)=1, and
P(E0∪E1)=P(E0)+P(E1) whenever E0,E1⊆S satisfy E0∩E1={}.
A probability distribution on a finite factored set F is a probability distribution on its underlying set that also satisfies another condition, which represents the probability distribution coming from a product of distributions on the underlying factors.
Definition 37. Given a finite factored set F=(S,B), a probability distribution on F is a probability distribution P on S such that for all s∈S, we have P({s})=∏b∈BP([s]b).
Proposition 32. Given a finite factored set F=(S,B), a probability distribution on S is a probability distribution P on F if and only if P(E)=QFE(P) for all E⊆S.
Proof. If P(E)=QFE(P) for all E⊆S, in particular this means that P({s})=QF{s}(P)=(∏b∈B[s]b)(P)=∏b∈BP([s]b) for all s∈S.
Conversely, if P({s})=∏b∈BP([s]b) for all s∈S, then for all E⊆S,P(E)=∑s∈E∏b∈BP([s]b)=(∑s∈E∏b∈B[s]b)(P)=QFE(P). □
5.5. The Fundamental Theorem of Finite Factored Sets
We are now ready to state and prove the fundatmental theorem of finite factored sets.
Theorem 3. Let F=(S,B) be a finite factored set, and let X,Y,Z∈Part(S) be partitions of S. Then X⊥FY|Z if and only if for all probability distributions P on F and all x∈X, y∈Y, and z∈Z, we have P(x∩z)⋅P(y∩z)=P(x∩y∩z)⋅P(z).
Proof. We already have by Lemma 3 that if X⊥FY|Z, then for all x∈X, y∈Y, and z∈Z, QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z. Thus for any probability distribution P on F, we have
P(z)⋅P(x∩y∩z)=QFz(P)⋅QFx∩y∩z(P)=QFx∩z(P)⋅QFy∩z(P)=P(x∩z)⋅P(y∩z).Conversely, assume that for all probability distributions P on F, and all x∈X, y∈Y, and z∈Z, we have P(x∩z)⋅P(y∩z)=P(x∩y∩z)⋅P(z).
If S is empty, then {} is the unique partition of S, and we have {}⊥F{}|{}. Thus, we can restrict our attention to the case where S is nonempty.
Fix an arbitrary x∈X, y∈Y, and z∈Z. Let q=QFx∩z⋅QFy∩z−QFx∩y∩z⋅QFz. We will first show that q(f)=0 for all f:P(S)→R>0.
Given an arbitrary f:P(S)→R>0, we can define Pf:P(S)→R by Pf(E)=QFE(f)/QFS(f), and we will show that Pf is a distribution on F.
Pf is well-defined because QFS(f) is a nonempty sum of products of positive real numbers, and thus positive. Further, since QFE(f) is a sum of products of positive real numbers, Pf(E)≥0 for all E⊆S. Since QF{}=0, we also have Pf({})=0. Clearly Pf(S)=1. Finally, for all E0,E1⊆S with E0∩E1={}, we have
Pf(E0∪E1)=QFE0∪E1(f)/QFS(f)=(QFE0(f)+QFE1(f))/QFS(f)=Pf(E0)+Pf(E1).Therefore Pf is a distribution on S. We still need to show that Pf is a distribution on F.
Observe that for all s∈S and b∈B, since χF{b}([s]b,S)=[s]b, we have that QF[s]b(f)=polyF{b}([s]b)⋅polyFB∖{b}(S), and since χF{b}(S,S)=S, we have that QFS(f)=polyF{b}(S)⋅polyFB∖{b}(S). Thus, we have that
Pf([s]b)=polyF{b}([s]b)(f)/polyF{b}(S)(f)=f([s]b)/polyF{b}(S)(f).Thus, for all s∈S,
∏b∈BPf([s]b)=(∏b∈Bf([s]b))/(∏b∈BpolyF{b}(S)(f))=QF{s}(f)/QFS(f)=Pf({s}).Thus Pf is a distribution on F.
It follows that Pf(x∩z)⋅Pf(y∩z)=Pf(x∩y∩z)⋅Pf(z). We therefore have that
q(f)=QFx∩z(f)⋅QFy∩z(f)−QFx∩y∩z(f)⋅QFz(f)=(Pf(x∩z)⋅Pf(y∩z)−Pf(x∩y∩z)⋅Pf(z))⋅QFS(f)2=0⋅QFS(f)2=0.Thus, q is a polynomial that is zero on an open subset of inputs, so q is the zero polynomial. Thus QFx∩z⋅QFy∩z−QFz⋅QFx∩y∩z=0, so QFz⋅QFx∩y∩z=QFx∩z⋅QFy∩z. Since x∈X, y∈Y, and z∈Z were arbitrary, by Lemma 3, we have X⊥FY|Z. □
In the next two posts, we will introduce temporal inference using finite factored sets, and discuss future potential research directions.