Infra-Bayesian physicalism: proofs part I
This post is an appendix to “Infra-Bayesian physicalism: a formal theory of naturalized induction”.
Lemma 1: prΓ×Φ(χy∈α(s∗(θ)))∈Θ holds for all s:Γ→Γ iff, for all g:Γ×Φ→[0,1] and s:Γ→Γ, θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
This lemma will be implicitly used all over the place, in order to deal with the “presence in the bridge transform” condition algebraically, in terms of function expectations. The bulk of the condition for a contribution to lie in the bridge transform is the endofunction condition (the support condition is trivial most of the time), so it’s advantageous to reformulate it. First off, we have that
prΓ×Φ(χy∈α(s∗(θ)))∈Θ
iff, for all g:Γ×Φ→[0,1], we have
prΓ×Φ(χy∈α(s∗(θ)))(λyx.g(y,x))≤Θ(λyx.g(y,x))
This holds by LF-duality for ultradistributions, proved in “Less Basic Inframeasure Theory”: a contribution lies in an ultracontribution set iff its expectations, w.r.t. all functions, are less than or equal to the ultracontribution’s expectations.
Now, we just need to unpack the left-hand-side into our desired form. Start off with
prΓ×Φ(χy∈α(s∗(θ)))(λyx.g(y,x))
Apply how projections work
=χy∈α(s∗(θ))(λyαx.g(y,x))
Now, we can move the indicator function into the function we’re taking the expectation of, because there’s no difference between deleting all measure outside of an event, or taking the expectation of a function that’s 0 outside of that event. So, we get
=s∗(θ)(λyαx.χy∈αg(y,x))
Then we use how pushforwards are defined in probability theory.
=θ(λyαx.χs(y)∈αg(s(y),x))
And that’s our desired form. So, we’ve shown that for any particular s:Γ→Γ, we have prΓ×Φ(χy∈α(s∗(θ)))∈Θ iff, for all g:Γ×Φ→[0,1],
θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
And so, we get our desired iff statement by going from the iff for one s to the iff for all s. ■
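To make this reformulation concrete, here is a minimal numerical sketch for finite Γ and Φ. The representation is our own invention, not notation from the post: a contribution is a dict mapping points to masses, and an ultracontribution Θ is a finite list of contributions whose max realizes Θ(g). Since the condition quantifies over all g:Γ×Φ→[0,1], the check below only samples random g; a True answer is evidence, not proof.

```python
import itertools, random

# Hypothetical finite representation: a contribution on
# Gamma x 2^Gamma x Phi is a dict (y, alpha, x) -> mass, with alpha a
# frozenset; an ultracontribution Theta on Gamma x Phi is a finite list
# of contributions realizing Theta(g) = max_theta theta(g).

def expect(theta, fn):
    """Expectation of fn under a contribution (a finite measure)."""
    return sum(mass * fn(*key) for key, mass in theta.items())

def endofunction_condition_holds(theta, Theta, Gamma, Phi, samples=1000):
    """Spot-check Lemma 1's condition: for every s: Gamma -> Gamma and
    (sampled) g, theta(chi_{s(y) in alpha} g(s(y), x)) <= Theta(g).
    Sampling g can only falsify, never prove, the condition."""
    for image in itertools.product(Gamma, repeat=len(Gamma)):
        s = dict(zip(Gamma, image))  # one endofunction s
        for _ in range(samples):
            g = {(y, x): random.random() for y in Gamma for x in Phi}
            lhs = expect(theta, lambda y, alpha, x:
                         g[(s[y], x)] if s[y] in alpha else 0.0)
            rhs = max(expect(th, lambda y, x: g[(y, x)]) for th in Theta)
            if lhs > rhs + 1e-9:
                return False
    return True
```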
Proposition 2.1: For any Γ, Φ and Θ∈□c(Γ×Φ),
Br(Θ) exists and satisfies prΓ×ΦBr(Θ)=Θ.
In particular, if Θ∈□(Γ×Φ) then Br(Θ)∈□(elΓ×Φ).
Proof sketch: We’ll show that for any particular contribution θ∈Θ, there’s a contribution θ∗ which lies within Br(Θ) that projects down to equal θ. And then, show the other direction, that any contribution in Br(Θ) lands within Θ when you project it down. Thus, the projection of Br(Θ) must be Θ exactly.
For the first direction, given some θ∈Θ, let θ∗:=θ⋉(λy.{y}). Note that since θ∈Δc(Γ×Φ), this means that θ∗∈Δc(Γ×2Γ×Φ), so the type signatures line up. Clearly, projecting θ∗ down to Γ×Φ makes θ again. So that leaves showing that θ∗∈Br(Θ). Applying Lemma 1, an equivalent way of stating the bridge transform is that it consists precisely of all the θ′∈Δc(Γ×2Γ×Φ) s.t. for all s:Γ→Γ and g:Γ×Φ→[0,1],
θ′(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
and also, supp θ′⊆elΓ×Φ.
Clearly, for our given θ∗, everything works out with the support condition, so that leaves the endofunction condition. Let s,g be arbitrary.
θ∗(λyαx.χs(y)∈αg(s(y),x))=(θ⋉(λy.{y}))(λyαx.χs(y)∈αg(s(y),x))
=θ(λyx.δ{y}(λα.χs(y)∈αg(s(y),x)))
=θ(λyx.χs(y)∈{y}g(s(y),x))
=θ(λyx.χs(y)=yg(s(y),x))
=θ(λyx.χs(y)=yg(y,x))
≤θ(λyx.g(y,x))
≤maxθ∈Θθ(λyx.g(y,x))=Θ(λyx.g(y,x))
In order, the steps were: unpacking the semidirect product, substituting the Dirac delta in, reexpressing the condition for the indicator function, using that s(y)=y inside the indicator function, applying monotonicity of θ, using that θ∈Θ, and then packing up the definition of Θ.
And that inequality has been fulfilled, so, yes, θ∗ lies within Br(Θ). Since θ was arbitrary within Θ, this shows that the projection set is as-big-or-bigger than Θ.
Now to show the reverse direction, that anything in the projection set lies within Θ. For any particular θ′∈Br(Θ), remember, it must fulfill, for all g,s, that
θ′(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
So, in particular, we can let s:Γ→Γ be the identity function, and g be anything, and we’d get
θ′(λyαx.χy∈αg(y,x))≤Θ(λyx.g(y,x))
And, since θ′ is in Br(Θ), it’s supported on the (y,α) pairs s.t. y∈α, so that indicator function drops out of existence, and we get
θ′(λyαx.g(y,x))≤Θ(λyx.g(y,x))
And then this can be written as a projection
prΓ×Φ(θ′)(λyx.g(y,x))≤Θ(λyx.g(y,x))
And since this holds for all the various g, this means that
prΓ×Φ(θ′)∈Θ
And we’re done with the other direction: the projection must be exactly equal to Θ, since projections of contributions in Br(Θ) always land in Θ, and every contribution in Θ is hit. ■
Proposition 2.2: Let X be a poset, θ,η∈ΔcX. Then, θ⪯η if and only if there exists κ:X→ΔX s.t.:
For all x∈X, y∈supp κ(x): x≤y.
κ∗θ≤η
Vanessa proved this one, with a very nice proof involving the max-flow min-cut theorem.
Make the following directed graph: The nodes are the elements of {s}∪{t}∪X×{0,1}. Basically, two copies of the finite poset X, a source node s, and a sink node t. x will be used to denote a variable from X, while a subscript of 0 or 1 denotes the copy of x in X0 or X1, respectively.
As for the edges, there is an edge from s to every point in X0. And an edge from every point in X1 to t. And, for some x0∈X0 and y1∈X1, there’s an edge iff x≤y. The capacities of the edges are, for all x∈X, cs,x0:=θ(x), and for all x∈X, cx1,t:=η(x), and for x,y∈X s.t. x≤y, cx0,y1:=∞.
To work up to applying min-cut max-flow, we must show that the cut of all the edges between s and X0 is a min-cut of the network.
Remember, all cuts are induced by partitions of the nodes into two sets, S and T, where S includes the source s and T includes the sink t. And the value of a cut is
c(S,T)=∑a∈S,b∈Tca,b=∑x0∈Tcs,x0+∑x0∈S,y1∈T:x≤ycx0,y1+∑y1∈Scy1,t
Implicitly converting both of the following sets to be subsets of X, we have (S∩X1)⊇(S∩X0)↑ for any non-infinite cut. Why? Well, if it was false, then there’d be some y≥x where x0∈S, and yet y1∉S. So y1∈T. Then, the cost of the cut would include cx0,y1=∞, contradiction. So, now, we have that for any non-infinite cut,
c(S,T)=∑a∈S,b∈Tca,b=∑x0∈Tcs,x0+∑y1∈Scy1,t
and (S∩X1)⊇(S∩X0)↑ holds.
Now, we’ll change our cut. Given some cut (S,T), let our new cut (S′,T′) be defined as S′:={s}∪((S∩X0)↑×{0,1}) (which still respects that superset property listed above since both sets would be the same), and T′ be the complement. Letting some stuff cancel out, we get
c(S,T)−c(S′,T′)=∑x0∈Tcs,x0+∑y1∈Scy1,t−∑x0∈T′cs,x0−∑y1∈S′cy1,t
=∑x0∈T∖T′cs,x0+∑y1∈S∖S′cy1,t−∑x0∈T′∖Tcs,x0−∑y1∈S′∖Scy1,t
Now, because S′∩X1=(S∩X0)↑×{1}⊆S∩X1 by that subset property, it means that there’s no y1∈S′∖S, so we get
=∑x0∈T∖T′cs,x0+∑y1∈S∖S′cy1,t−∑x0∈T′∖Tcs,x0
As for that other minus term, we have that S′∩X0=(S∩X0)↑×{0}⊇S∩X0
by how it was defined, so T′∩X0⊆T∩X0, so that other minus term is 0, and we get
=∑x0∈T∖T′cs,x0+∑y1∈S∖S′cy1,t≥0
Rearranging this a bit, we have c(S,T)≥c(S′,T′).
And now we’ll show that c(S′,T′) is bounded below by c({s},(X×{0,1})∪{t}). Let’s go. We have
c(S′,T′)−c({s},(X×{0,1})∪{t})=∑x0∈T′cs,x0+∑y1∈S′cy1,t−∑x0∈(X×{0,1})∪{t}cs,x0−∑y1∈{s}cy1,t
This simplifies a bit as
=∑x0∈T′cs,x0+∑y1∈S′cy1,t−∑x0∈X0cs,x0
This can be rewritten a bit as
=(∑x0∈X0cs,x0−∑x0∈S′cs,x0)+∑y1∈S′cy1,t−∑x0∈X0cs,x0
=∑y1∈S′cy1,t−∑x0∈S′cs,x0
And then, using what S′ is defined as, we can get
=∑y∈(S∩X0)↑cy1,t−∑x∈(S∩X0)↑cs,x0
and using what the costs are, it’s
=∑y∈(S∩X0)↑η(y)−∑x∈(S∩X0)↑θ(x)
=η((S∩X0)↑)−θ((S∩X0)↑)≥0
This holds because θ⪯η and the indicator function for (S∩X0)↑ is monotone. And so, we get c(S′,T′)≥c({s},(X×{0,1})∪{t}). And we previously showed that c(S,T)≥c(S′,T′). So, this means that the cut around s is a minimal cut; it lower-bounds all other cuts.
By the max-flow min-cut theorem, there’s a way of having stuff flow from s to t that saturates the capacities of all the edges that are cut. fx0,y1 will be used to denote the flow from x0 to y1 according to this maximum flow. Let’s finish things up.
Define κ:X→ΔX as follows. For some x, if fs,x0>0, then
κ(x)(y):=fx0,y1/fs,x0
If fs,x0=0, let κ(x) be any probability distribution on x↑. Note that κ(x) is always a probability distribution supported on x↑, by fiat if fs,x0=0, and otherwise,
∑y∈Xκ(x)(y)=∑y∈Xfx0,y1/fs,x0=fs,x0/fs,x0=1
This is because, the flow out of x0 must equal the flow into x0 from the source. And κ(x) is supported on x↑ because the only edges out of x0 go to y1 where y≥x. Now that we’ve got this demonstrated, we’ll show our desired inequality. Fix an arbitrary y. We have
κ∗(θ)(y)=∑x∈Xθ(x)⋅κ(x)(y)=∑x∈Xcs,x0⋅fx0,y1/fs,x0
And then we use that all the capacities of the edges cut in the min-cut are saturated according to the max-flow plan, so cs,x0=fs,x0, and we have
=∑x∈Xfs,x0⋅fx0,y1/fs,x0
Now, if fs,x0=0, then because the flow in equals the flow out, that means that fx0,y1=0, and otherwise we can cancel out, so we can rewrite as
=∑x∈Xfx0,y1=fy1,t≤cy1,t=η(y)
Where the first equality came from flow in equals flow out, the inequality came from the flow plan respecting the capacity of the edges, and the last equality came from how the capacities were defined. So, for all y∈X, we have κ∗(θ)(y)≤η(y), so we have κ∗(θ)≤η.
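The construction above is directly computable for a finite poset. Below is a minimal sketch, assuming networkx is available; the function name and dict representation are ours. It builds the flow network from the proof, computes a maximum flow, and reads off κ. Edges without a capacity attribute are treated by networkx as infinite-capacity, matching the cx0,y1:=∞ edges.

```python
import networkx as nx

def build_kappa(X, leq, theta, eta):
    """Construct kappa: X -> Delta(X) as in the proof, given finite
    contributions theta, eta (dicts X -> mass) with theta <= eta in the
    stochastic order; leq(x, y) tests x <= y in the poset."""
    G = nx.DiGraph()
    for x in X:
        G.add_edge('s', ('0', x), capacity=theta[x])
        G.add_edge(('1', x), 't', capacity=eta[x])
        for y in X:
            if leq(x, y):
                # No capacity attribute: networkx treats the edge
                # as infinite-capacity, like c_{x0,y1} = infinity.
                G.add_edge(('0', x), ('1', y))
    _, flow = nx.maximum_flow(G, 's', 't')
    kappa = {}
    for x in X:
        out = flow['s'][('0', x)]
        if out > 0:
            kappa[x] = {y: flow[('0', x)].get(('1', y), 0.0) / out
                        for y in X if leq(x, y)}
        else:
            kappa[x] = {x: 1.0}  # arbitrary distribution on the up-set
    return kappa

# Example: chain a <= b with theta = (0.5, 0.5), eta = (0.2, 0.8).
X = ['a', 'b']
leq = lambda u, v: u == v or (u == 'a' and v == 'b')
print(build_kappa(X, leq, {'a': 0.5, 'b': 0.5}, {'a': 0.2, 'b': 0.8}))
```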
For the reverse direction, if there’s some κ:X→ΔX s.t. κ(x) is supported on x↑ and κ∗(θ)≤η, then for any monotone function f, we have
η(f)≥κ∗(θ)(f)=θ(λx.κ(x)(f))
And then we use that, since κ(x) is supported on x↑, and f is monotone, f(x) is less than or equal to the expectation of f w.r.t. κ(x) (remember, for κ(x) you have a 100 percent chance of drawing something at-or-above x, which guarantees that f of whatever you picked is at least f(x)). And so, we get
≥θ(f)
And since this holds for every monotone f, we have θ⪯η.■
Proposition 2.3: Let X be a poset, θ,η∈ΔcX. Then, θ⪯η if and only if there exists κ:X→ΔX s.t.:
For all x∈X, y∈supp κ(x): x≥y.
θ≤κ∗η
Proof: Because θ⪯η, we can get a maximum flow in exactly the same way as Proposition 2.2. Then, just flip the direction of all flows, which will be denoted by swapping the order of the subscripts. Now, define
κ(y)(x):=fy1,x0/ft,y1
And, again, if ft,y1=0, κ(y) should be an arbitrary probability distribution supported on y↓.
Note that κ(y) is always a probability distribution supported on y↓, by fiat if ft,y1=0, and otherwise,
∑x∈Xκ(y)(x)=∑x∈Xfy1,x0/ft,y1=ft,y1/ft,y1=1
This is because, the flow out of y1 must equal the flow into y1 from the source t (the old sink). And κ(y) is supported on y↓ because the only edges out of y1 go to x0 where y≥x. Now that we’ve got this demonstrated, we’ll show our desired inequality. Fix an arbitrary x. We have
κ∗(η)(x)=∑y∈Xη(y)⋅κ(y)(x)=∑y∈Xcy1,t⋅fy1,x0/ft,y1
Then use that cy1,t≥ft,y1 (because even with the flow reversed, the flow through an edge must be at most its capacity). Accordingly, we get
≥∑y∈Xfy1,x0=fx0,s=cs,x0=θ(x)
And we’re done, inequality established, using definitions and the flow saturating all the paths out of s.
So, for all x∈X, we have κ∗(η)(x)≥θ(x) so we have κ∗(η)≥θ.
For the reverse direction, if there’s some κ:X→ΔX s.t. κ(y) is supported on y↓ and κ∗(η)≥θ, then for any monotone function f, we have
θ(f)≤κ∗(η)(f)=η(λx.κ(x)(f))
And then we use that, since κ(x) is supported on x↓, and f is monotone, f(x) is greater than or equal to the expectation of f w.r.t. κ(x) (remember, for κ(x) you have a 100 percent chance of drawing something at-or-below x, which guarantees that f of whatever you picked is at most f(x)). And so, we get
≤η(f)
And since this holds for every monotone f, we have θ⪯η. ■
Proposition 2.4: For any Γ, Φ and Θ∈□c(Γ×Φ), Br(Θ) is downwards closed w.r.t. the induced order on Δc(elΓ×Φ). That is, if θ∈Br(Θ) and η⪯θ then η∈Br(Θ).
Remember, the condition for some θ to lie in Br(Θ) is that it be supported on elΓ, and that, for all s:Γ→Γ, and g:Γ×Φ→[0,1],
θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
So, given that η⪯θ (lower score for all monotone functions) we’ll show that η fulfills both conditions. The support thing is taken care of by η∈Δc(elΓ×Φ). As for the other one, we have
η(λyαx.χs(y)∈αg(s(y),x))≤θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
That inequality occurs because as you make α larger, ie, go up in the partial ordering, the indicator χs(y)∈α can only switch from 0 to 1, so the function value can only increase. So, the function is monotone in α, and we then use that η⪯θ. ■
Proposition 2.5: Consider a finite set X, ϕ?∈ΔX, ϕ!:{0,1}→ΔX, p∈[0,1] and ϕ:=(1−p)ϕ?+pϕ!. Then, p≥dTV(ϕ(0),ϕ(1)). Conversely, consider any ϕ:{0,1}→ΔX. Then, there exist ϕ?∈ΔX and ϕ!:{0,1}→ΔX s.t. ϕ=(1−p)ϕ?+pϕ! for p:=dTV(ϕ(0),ϕ(1)).
Ok, so let f be some arbitrary function X→[0,1]. We have an alternate characterization of the total variation distance as
dTV(ϕ(0),ϕ(1)):=supf:X→[0,1]|ϕ(0)(f)−ϕ(1)(f)|
And then from there we can go
=supf:X→[0,1]|((1−p)ϕ?(f)+pϕ!(0)(f))−((1−p)ϕ?(f)+pϕ!(1)(f))|
=supf:X→[0,1]|pϕ!(0)(f)−pϕ!(1)(f)|
=p⋅supf:X→[0,1]|ϕ!(0)(f)−ϕ!(1)(f)|
=p⋅dTV(ϕ!(0),ϕ!(1))
And since p≥pdTV(ϕ!(0),ϕ!(1)), we have p≥dTV(ϕ(0),ϕ(1)) and are done.
Conversely, let the value of p be 1−(ϕ(0)∧ϕ(1))(1), and ϕ? be (1/(1−p))(ϕ(0)∧ϕ(1)). It is clear that this is a probability distribution because of how p was defined. The ∧ is the minimum/common overlap of the two probability distributions. Then, let ϕ!(0)=(1/p)(ϕ(0)−(ϕ(0)∧ϕ(1))), and similarly for ϕ!(1). Well, as long as p>0. If p=0, ϕ! can be any probability distribution you want. ϕ!(0) is a probability distribution because
ϕ!(0)(1)=(1/p)(ϕ(0)(1)−(ϕ(0)∧ϕ(1))(1))=(1/p)(1−(ϕ(0)∧ϕ(1))(1))
=(1−(ϕ(0)∧ϕ(1))(1))/(1−(ϕ(0)∧ϕ(1))(1))=1
And since p>0, everything works out. Now we just need to show that these add up to make ϕ and that p is the same as the total variation distance.
ϕ(0)=(ϕ(0)∧ϕ(1))+ϕ(0)−(ϕ(0)∧ϕ(1))
=(1−p)⋅(1/(1−p))(ϕ(0)∧ϕ(1))+p⋅(1/p)(ϕ(0)−(ϕ(0)∧ϕ(1)))
=(1−p)ϕ?+pϕ!(0)
And this works symmetrically for ϕ(1), showing that we indeed have equality. As for showing that p is the total variation distance, referring back to what we’ve already proved, we have
dTV(ϕ(0),ϕ(1))=pdTV(ϕ!(0),ϕ!(1))
And now, since ϕ!(0) is proportional to ϕ(0)−(ϕ(0)∧ϕ(1)) and ϕ!(1) is proportional to ϕ(1)−(ϕ(0)∧ϕ(1)), the supports of these two probability distributions are disjoint, which implies that the total variation distance between them is 1, so we have dTV(ϕ(0),ϕ(1))=p. ■
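As a quick numerical illustration of this decomposition, here is a sketch with made-up distributions; the names (phi_q for ϕ?, phi_b for ϕ!) and the dict representation are ours, and the degenerate cases p=0 and p=1, where the proof allows arbitrary choices, are returned as empty dicts.

```python
def decompose(phi0, phi1):
    """Split phi into (1 - p) * phi_q + p * phi_b as in the proof,
    with p = d_TV(phi(0), phi(1)); phi0, phi1 are dicts over finite X."""
    X = set(phi0) | set(phi1)
    overlap = {x: min(phi0.get(x, 0.0), phi1.get(x, 0.0)) for x in X}
    p = 1.0 - sum(overlap.values())      # total variation distance
    phi_q = {x: overlap[x] / (1 - p) for x in X} if p < 1 else {}
    phi_b0 = {x: (phi0.get(x, 0.0) - overlap[x]) / p for x in X} if p > 0 else {}
    phi_b1 = {x: (phi1.get(x, 0.0) - overlap[x]) / p for x in X} if p > 0 else {}
    return p, phi_q, phi_b0, phi_b1

p, phi_q, phi_b0, phi_b1 = decompose({'u': 0.7, 'v': 0.3}, {'u': 0.4, 'v': 0.6})
print(p)  # 0.3 (up to float rounding); phi_b0, phi_b1 have disjoint supports
```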
Proposition 2.6: Consider any Φ and ϕ:{0,1}→ΔΦ. Denote U:={0,1}×{{0,1}}×Φ (the event “program is unrealized”). Let Λ:=Br(⊤{0,1}⋉ϕ). Then, Λ(χU)=1−dTV(ϕ(0),ϕ(1))
This will take some work to establish. One direction, showing that the bridge transform value exceeds the total variation distance value, is fairly easy. As for the other direction, it’ll take some work. Let’s begin.
Λ(χU)=Br(⊤{0,1}⋉ϕ)(χU)=Br(⊤{0,1}⋉ϕ)(λyαx.χα={0,1})
=maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1})
We’ll make a particular contribution θ∗. It is defined as
δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0}+δ0×(ϕ(0)∧ϕ(1))×δ{0,1}
Or, put another way, restricted to y=0,α={0}, it’s ϕ(0)−ϕ(0)∧ϕ(1), and restricted to y=0,α={0,1}, it’s ϕ(0)∧ϕ(1). So, for a given s:{0,1}→{0,1}, we have
θ∗(λyαx.χs(y)∈αg(s(y),x))
=(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0}+δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χs(y)∈αg(s(y),x))
=(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0})(λyαx.χs(y)∈αg(s(y),x))+(δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χs(y)∈αg(s(y),x))
=(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χs(0)∈{0}g(s(0),x))+(ϕ(0)∧ϕ(1))(λx.χs(0)∈{0,1}g(s(0),x))
Now, we’ll go through two exhaustive possibilities for what s could be. If it maps 0 to 0, then it’s
=(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χ0∈{0}g(0,x))+(ϕ(0)∧ϕ(1))(λx.χ0∈{0,1}g(0,x))
=(ϕ(0)−ϕ(0)∧ϕ(1))(λx.g(0,x))+(ϕ(0)∧ϕ(1))(λx.g(0,x))
=ϕ(0)(λx.g(0,x))≤maxy∈{0,1}ϕ(y)(λx.g(y,x))
=⊤{0,1}(λy.ϕ(y)(λx.g(y,x)))=(⊤{0,1}⋉ϕ)(λyx.g(y,x))
And our desired inequality is established. If s maps 0 to 1, then it’s
=(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χ1∈{0}g(1,x))+(ϕ(0)∧ϕ(1))(λx.χ1∈{0,1}g(1,x))
=(ϕ(0)∧ϕ(1))(λx.g(1,x))≤ϕ(1)(λx.g(1,x))
≤maxy∈{0,1}ϕ(y)(λx.g(y,x))=(⊤{0,1}⋉ϕ)(λyx.g(y,x))
And that is taken care of. So, our θ∗ lies in Br(⊤{0,1}⋉ϕ).
Now, where were we? Ah right, we were at
Λ(χU)=maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1})
But now we can continue with
≥θ∗(λyαx.χα={0,1})
Which unpacks as
=(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0}+δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χα={0,1})
=(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0})(λyαx.χα={0,1})+(δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χα={0,1})
=(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χ{0}={0,1})+(ϕ(0)∧ϕ(1))(λx.χ{0,1}={0,1})
=(ϕ(0)∧ϕ(1))(1)
And then, the total mass of the overlap between two distributions is 1−dTV(ϕ(0),ϕ(1)). So, we’ve shown one direction; we have
Λ(χU)≥1−dTV(ϕ(0),ϕ(1))
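For a concrete check of this lower bound, here is a sketch (same hypothetical dictionary representation as before) constructing θ∗ and evaluating χα={0,1} against it; the result is the overlap mass 1−dTV(ϕ(0),ϕ(1)).

```python
def theta_star(phi0, phi1):
    """The contribution theta* from the proof: mass phi(0) - overlap on
    (0, {0}, x), and the overlap phi(0) ^ phi(1) on (0, {0,1}, x)."""
    out = {}
    for x in set(phi0) | set(phi1):
        lo = min(phi0.get(x, 0.0), phi1.get(x, 0.0))
        if phi0.get(x, 0.0) > lo:
            out[(0, frozenset({0}), x)] = phi0[x] - lo
        if lo > 0:
            out[(0, frozenset({0, 1}), x)] = lo
    return out

th = theta_star({'u': 0.7, 'v': 0.3}, {'u': 0.4, 'v': 0.6})
# The expectation of chi_{alpha = {0,1}} is the overlap mass 1 - d_TV:
print(sum(m for (y, a, x), m in th.items() if a == frozenset({0, 1})))  # 0.7
```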
In the other direction of the equality we’re trying to prove, we need to constrain what θ might be. Starting off with what we definitely know, we have
Λ(χU)=maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1})
Now it’s time to show that for any θ∈Br(⊤{0,1}⋉ϕ), that the measure on α={0,1} can’t be too high. Specifically, to proceed further, we’ll need to show that
∀x′∈Φ:θ(λyαx.χx=x′∧α={0,1})≤(ϕ(0)∧ϕ(1))(λx.χx=x′)
Time to establish it. On x′, either ϕ(0)(x′)≤ϕ(1)(x′), or vice-versa. Without loss of generality, assume ϕ(0)(x′)≤ϕ(1)(x′).
Then, let s be the constant-0 function, and g be the function that is 1 at (0,x′) and 0 everywhere else.
Then, we have
θ(λyαx.χx=x′∧α={0,1})≤θ(λyαx.χs(y)∈αg(s(y),x))≤(⊤{0,1}⋉ϕ)(λyx.g(y,x))
=(⊤{0,1}⋉ϕ)(λyx.χy=0∧x=x′)=maxy∈{0,1}ϕ(y)(λx.χy=0∧x=x′)
=ϕ(0)(λx.χx=x′)=ϕ(0)(x′)=(ϕ(0)∧ϕ(1))(x′)
In order: abbreviating, using that θ lies in the bridge transform, deabbreviating with what g is, unpacking things a little bit, and canceling out. At the end we used that ϕ(0)(x′)≤ϕ(1)(x′).
Ok, so, now that we’ve established the key fact that
∀x′∈Φ:θ(λyαx.χx=x′∧α={0,1})≤(ϕ(0)∧ϕ(1))(λx.χx=x′)
We can resume our work. We were previously at
=maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1})
and we can proceed to
=maxθ∈Br(⊤{0,1}⋉ϕ)∑x′∈Φθ(λyαx.χx=x′∧α={0,1})
and from there, using the inequality we proved, proceed to
≤∑x′∈Φ(ϕ(0)∧ϕ(1))(λx.χx=x′)=(ϕ(0)∧ϕ(1))(1)=1−dTV(ϕ(0),ϕ(1))
And we’re done, we established that it’s an upper bound as well a lower bound, so we have equality. ■
Lemma 2: If there’s a function h:Φ→2Γ s.t. Θ(λyx.χy∉h(x))=0, then every θ∈Br(Θ) is supported on the set {(y,α,x)|α⊆h(x)}.
Assume that the conclusion is false, that there is some nonzero probability of drawing an (x∗,α∗) pair where α∗⊈h(x∗). In particular, there must be some special y∗∈Γ value that witnesses that α∗ isn’t a subset, ie, y∗∈α∗ and y∗∉h(x∗). Remember that for all s:Γ→Γ and g:Γ×Φ→[0,1], since θ∈Br(Θ), we have
θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
Now, in particular, let g:=λyx.χy∉h(x), and s be the constant function that maps everything to y∗. Then, this turns into
0<θ(λyαx.χy∗∈αχy∗∉h(x))≤Θ(λyx.χy∉h(x))=0
which is impossible. The equality at the end was our starting assumption. The middle inequality was just specializing our inequality to a particular pair of functions. And it’s greater than 0 because there’s a nonzero probability of drawing (x∗,α∗), and y∗ was selected to lie in α∗ and outside of h(x∗), so there’s a nonzero probability of getting a nonzero value. Therefore, the result follows. ■
Lemma 3: For any s:Γ→Γ and Θ∈□c(Γ×Φ), if θ∈Br(Θ), then χelΓ(s∗(θ))∈Br(Θ).
So, the support condition is trivially fulfilled, because we’re restricting to the event y∈α. That just leaves the endofunction condition. Let s′:Γ→Γ be arbitrary, and g:Γ×Φ→[0,1] be arbitrary. Then we have
χelΓ(s∗(θ))(λyαx.χs′(y)∈αg(s′(y),x))=s∗(θ)(λyαx.χy∈α∧s′(y)∈αg(s′(y),x))
≤s∗(θ)(λyαx.χs′(y)∈αg(s′(y),x))
=θ(λyαx.χs′(s(y))∈αg(s′(s(y)),x))
And, since s′∘s:Γ→Γ, we can use that θ∈Br(Θ) to get
≤Θ(λyx.g(y,x))
And we’re done, we’ve established that our modified version of θ remains in Br(Θ). ■
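Concretely, the operation used in Lemma 3 (pushforward along s on the Γ coordinate, then restriction to the event y∈α) looks like this on finite contributions; a minimal sketch in the same hypothetical representation as before:

```python
def push_and_restrict(theta, s):
    """The operation chi_elGamma(s_*(theta)) from Lemma 3: push the
    Gamma coordinate forward along s, then delete all mass outside
    the event y in alpha."""
    out = {}
    for (y, alpha, x), mass in theta.items():
        y2 = s[y]          # pushforward: y becomes s(y); alpha, x fixed
        if y2 in alpha:    # restriction to elGamma
            key = (y2, alpha, x)
            out[key] = out.get(key, 0.0) + mass
    return out
```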
Proposition 2.7: If X is a poset and Θ∈□cX, then Θ↓ will denote the downward closure of Θ. For any Γ, Φ and Θ∈□c(Γ×Φ): if (y,α,x)∈supp Br(Θ) and y′∈α, then (y′,x)∈supp Θ. Moreover, define susΘ:Φ→2Γ by susΘ(x):={y∈Γ∣(y,x)∈supp Θ}. Then, Br(Θ)⊆(Θ⋉susΘ)↓ (we slightly abuse notation by treating susΘ as a mapping Γ×Φ→2Γ that doesn’t depend on the first argument, and also playing loose with the order of factors in the set on which our HUCs live).
This proof splits into two parts. Part 1 will be proving the support statement. Part 2 will be proving the thing with susΘ.
Part 1: Here’s the basics of how this part will work. There’s some (y∗,α∗,x∗) tuple in the support of Br(Θ). Since the support of Br(Θ) is the union of the supports for the θ∈Br(Θ), there’s some θ that assigns nonzero probability to that event. Also, y′∈α∗. The key part of the proof is showing there’s some θ∗∈Br(Θ) that assigns nonzero measure to the event (y′,x∗,α∗). Once we’ve got that, since we know by Proposition 2.1 that Br(Θ) projects down to equal Θ, that means that the projection of θ∗ will land in Θ, and it will assign nonzero measure to the event (y′,x∗), so the event (y′,x∗) lies in the support of Θ.
So, that just leaves appropriately defining our θ∗ and showing that it lies in Br(Θ) and assigns nonzero probability to our event of interest. Our particular choice of θ∗ will be as follows. Let cy′:Γ→Γ be the constant function mapping everything to y′.
θ∗:=χelΓ((cy′)∗(θ))
Applying Lemma 3, this is in Br(Θ), so that just leaves showing that it assigns nonzero probability to the event y′,x∗,α∗. We have
χelΓ((cy′)∗(θ))(λyαx.χy=y′∧x=x∗∧α=α∗)=(cy′)∗(θ)(λyαx.χy=y′∧x=x∗∧α=α∗∧y∈α)
We can simplify this a bit, because if the first three properties hold, that tells you something about what α is.
=(cy′)∗(θ)(λyαx.χy=y′∧x=x∗∧α=α∗∧y∈α∗)
unpack the pushforward
=θ(λyαx.χcy′(y)=y′∧x=x∗∧α=α∗∧cy′(y)∈α∗)
Then we use that it’s a constant function
=θ(λyαx.χy′=y′∧x=x∗∧α=α∗∧y′∈α∗)
Now, since y′∈α∗ by the problem setup, and y′=y′ is a tautology, we can remove those two events.
=θ(λyαx.χx=x∗∧α=α∗)
and use an inequality
≥θ(λyαx.χy=y∗∧x=x∗∧α=α∗)>0
Because θ assigned nonzero probability to that event. So, we know that our θ∗ assigns nonzero measure to the event of interest. And that’s it! It fulfills the appropriate properties to carry the proof through.
Proof part 2: First off, we can reexpress
Br(Θ)⊆(Θ⋉susΘ)↓
as the equivalent statement that, for all f:elΓ×Φ→[0,1], we have
Br(Θ)(λyαx.f(y,x,α))≤(Θ⋉susΘ)↓(λyαx.f(y,x,α))
Although, since Br(Θ) and (Θ⋉susΘ)↓ are downwards-closed, we actually only need to demonstrate this inequality for monotone functions f.
The reason we only need to demonstrate this inequality for monotone f is because of
Br(Θ)(f)=Br(Θ)(fmax)≤(Θ⋉susΘ)↓(fmax)=(Θ⋉susΘ)↓(f)
Where the equalities follow from Proposition 3.1 and the downward closure of the two ultracontributions, and the inequality is what we’re trying to prove (since fmax in the sense of Proposition 3.1 is always a monotone function).
So, let f be monotone, and we’re trying to prove the desired inequality. We’ll unpack it bit by bit.
Br(Θ)(λyαx.f(y,x,α))=maxθ∈Br(Θ)θ(λyαx.f(y,x,α))
And now we remember that the set {x,y|y∈susΘ(x)} is a support for Θ, because it’s the same as {x,y|(y,x)∈supp Θ}, ie, the support of Θ. So, by Lemma 2, we can conclude that any θ∈Br(Θ) is supported on α,x pairs s.t. α⊆susΘ(x). In particular, this means that α≤susΘ(x), and since f is monotone, swapping out α for susΘ(x) always produces an increase in expected value, so we get
≤maxθ∈Br(Θ)θ(λyαx.f(y,x,susΘ(x)))
and then, since all θ∈Br(Θ) are supported on y,α s.t. y∈α, we can go
=maxθ∈Br(Θ)θ(λyαx.χy∈αf(y,x,susΘ(x)))
And then, since all the θ∈Br(Θ) fulfill the endofunction property, we can let s be identity and g be f, and go
≤Θ(λyx.f(y,x,susΘ(x)))
And rewrite that as
=Θ(λyx.δsusΘ(x)(λα.f(y,x,α)))=(Θ⋉susΘ)(λyαx.f(y,x,α))
and then since f is monotone
=(Θ⋉susΘ)↓(λyαx.f(y,x,α))
This holds because all contributions added when you take the downward closure can only produce lower expectation values than the existing contributions due to monotonicity of f, and so they’re ignored.
And now we’re done! We got the inequality going appropriately to hit our proof target. ■
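For concreteness, susΘ is directly computable from the supports of the contributions making up Θ; a minimal sketch (same hypothetical representation, with Θ as a finite list of contributions on Γ×Φ):

```python
def sus(Theta, Phi):
    """sus_Theta(x) = {y | (y, x) in supp Theta}, where supp Theta is
    the union of the supports of the contributions making up Theta."""
    return {x: frozenset(y for th in Theta
                         for (y, x2), mass in th.items()
                         if mass > 0 and x2 == x)
            for x in Phi}
```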
Proposition 2.8: For any Γ, Φ and Θ∈□c(Γ×Φ):
Br(prΓ×2ΓBr(Θ))=[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓
For notation, we’ll use β for the set which is being treated as part of the background environment, and α as the usual set.
The way to establish this is to show equal expectation values for all functions monotone in the last argument (the relevant set-based one), which is all we need to do as both sets are downwards-closed. We’ll do this by establishing the two inequalities separately. Our first order of business is showing that for all f:elΓ×2Γ→[0,1] monotone in the last coordinate, we have
Br(prΓ×2ΓBr(Θ))(λyαβ.f(y,β,α))≥[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓(λyαβ.f(y,β,α))
Let’s begin. Pick an f that’s monotone in the last argument.
[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓(λyαβ.f(y,β,α))=[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)](λyαβ.f(y,β,α))
=prΓ×2ΓBr(Θ)(λyβ.f(y,β,β))
=Br(Θ)(λyβx.f(y,β,β))
=maxθ′∈Br(Θ)θ′(λyβx.f(y,β,β))
=maxθ′∈Br(Θ)θ′(λyβx.δβ(λα.f(y,β,α)))
=maxθ′∈Br(Θ)prΓ×2Γ(θ′)(λyβ.δβ(λα.f(y,β,α)))
=maxθ′∈Br(Θ)(prΓ×2Γ(θ′)⋉id2Γ)(λyβα.f(y,β,α))
We’ll set this to the side for a moment to show that if θ′∈Br(Θ), then prΓ×2Γ(θ′)⋉id2Γ∈Br(prΓ×2ΓBr(Θ)).
The support part is easy. Since θ′∈Br(Θ), it’s supported on (y,β) pairs where y∈β. Taking semidirect product with identity means that y∈α always happens, because β=α always happens. So, that leaves showing the endofunction condition. We have
(prΓ×2Γ(θ′)⋉id2Γ)(λyαβ.χs(y)∈αg(s(y),β))=prΓ×2Γ(θ′)(λyβ.δβ(λα.χs(y)∈αg(s(y),β)))
=prΓ×2Γ(θ′)(λyβ.χs(y)∈βg(s(y),β))
=θ′(λyβx.χs(y)∈βg(s(y),β))
And start packing things up a bit
=s∗(θ′)(λyβx.χy∈βg(y,β))=χelΓ(s∗(θ′))(λyβx.g(y,β))
And, since θ′∈Br(Θ), this update of the pushforward of θ′ lands in Br(Θ) by Lemma 3, so we get
≤maxθ′′∈Br(Θ)θ′′(λyβx.g(y,β))=Br(Θ)(λyβx.g(y,β))=prΓ×2ΓBr(Θ)(λyβ.g(y,β))
Now, since we’ve established that inequality for all choices of s,g, we have that
prΓ×2Γ(θ′)⋉id2Γ∈Br(prΓ×2ΓBr(Θ))
Resuming where we last left off, the last place we were at in our chain of inequalities was
=maxθ′∈Br(Θ)(prΓ×2Γ(θ′)⋉id2Γ)(λyαβ.f(y,β,α))
Since these things are always in Br(prΓ×2ΓBr(Θ)), we can go
≤maxθ′′∈Br(prΓ×2ΓBr(Θ))θ′′(λyαβ.f(y,β,α))=Br(prΓ×2ΓBr(Θ))(λyαβ.f(y,β,α))
And we’re done with that inequality direction.
Now to show the reverse direction, which actually will use that f is monotone in the last argument. We start with
Br(prΓ×2ΓBr(Θ))(λyαβ.f(y,β,α))=maxθ′′∈Br(prΓ×2ΓBr(Θ))θ′′(λyαβ.f(y,β,α))
And now, we can use that prΓ×2ΓBr(Θ) is supported on (y,β) pairs where y∈id(β), along with Lemma 2 applied to id, to conclude that all the θ′′ must be supported on the event α⊆id(β)=β. Since larger sets lie higher in the partial ordering and get a higher value, and f is monotone in the last argument, we get
≤maxθ′′∈Br(prΓ×2ΓBr(Θ))θ′′(λyαβ.f(y,β,β))
Now, we can use the endofunction property of all the θ′′ w.r.t. prΓ×2ΓBr(Θ) (taking s to be the identity, with the indicator handled by the support condition) to get a uniform upper bound of
≤prΓ×2ΓBr(Θ)(λyβ.f(y,β,β))
and then go
=[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)](λyαβ.f(y,β,α))=[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓(λyαβ.f(y,β,α))
and we’re done, we got the inequality going in the other direction, so we have equality for arbitrary monotone functions, and thus equality. ■
Proposition 2.9: For any Γ, Φ and Θ1,Θ2∈□c(Γ×Φ), if Θ1⊆Θ2 then Br(Θ1)⊆Br(Θ2).
Proof: This is really easy to show, we just need to take some contribution θ′∈Br(Θ1) and show that it lies in Br(Θ2). The support condition is easily fulfilled, so that leaves showing the endofunction condition. Let’s begin.
θ′(λyαx.χs(y)∈αg(s(y),x))≤Θ1(λyx.g(y,x))=maxθ∈Θ1θ(λyx.g(y,x))
And now, since Θ1⊆Θ2, we have
≤maxθ∈Θ2θ(λyx.g(y,x))=Θ2(λyx.g(y,x))
Done. ■
Proposition 2.10: For any Γ, Φ1, Φ2, t:Φ2→Φ1 and Θ∈□c(Γ×Φ2):
(idelΓ×t)∗Br(Θ)⊆Br((idΓ×t)∗Θ)
As usual, we just need to establish that for all f:elΓ×Φ1→[0,1], the expectation value on the left-hand side is at most the one on the right-hand side, so our proof goal will be
(idelΓ×t)∗Br(Θ)(λyαx1.f(y,x1,α))≤Br((idΓ×t)∗Θ)(λyαx1.f(y,x1,α))
Let’s begin trying to show this.
(idelΓ×t)∗Br(Θ)(λyαx1.f(y,x1,α))=Br(Θ)(λyαx2.f(y,t(x2),α))
=maxθ′∈Br(Θ)θ′(λyαx2.f(y,t(x2),α))
=maxθ′∈Br(Θ)(idelΓ×t)∗(θ′)(λyαx1.f(y,x1,α))
Now we will show that all the contributions of this form lie in Br((idΓ×t)∗Θ). Clearly the y∈α condition is always fulfilled for these pushforwards, so that just leaves the endofunction condition. Let’s begin.
(idelΓ×t)∗(θ′)(λyαx1.χs(y)∈αg(s(y),x1))=θ′(λyαx2.χs(y)∈αg(s(y),t(x2)))
≤Θ(λyx2.g(y,t(x2)))=(idΓ×t)∗(Θ)(λyx1.g(y,x1))
And we’re done, we established our desired result that the pushforward lands in the appropriate set. So, we can proceed by going
≤maxθ′′∈Br((idΓ×t)∗(Θ))θ′′(λyαx1.f(y,x1,α))=Br((idΓ×t)∗(Θ))(λyαx1.f(y,x1,α))
And we’re done! ■
Proposition 2.11: Consider any Γ, Φ1, Φ2, t:Φ2→Φ1, Ξ:Φ1→□Φ2 and Θ∈□c(Γ×Φ1) s.t. t∘Ξ=idΦ1. Then,
(idelΓ×Ξ)∗Br(Θ)⊆Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ)
In particular, prelΓBr(Θ)=prelΓBr((idΓ×Ξ)∗Θ)
Well, let’s get started on proving these various things. To begin with, to prove
(idelΓ×Ξ)∗Br(Θ)⊆Br((idΓ×Ξ)∗Θ)
We need to prove that, for all f:elΓ×Φ2→[0,1],
(idelΓ×Ξ)∗Br(Θ)(λyαx2.f(y,x2,α))≤Br((idΓ×Ξ)∗Θ)(λyαx2.f(y,x2,α))
Let’s establish this. Unpacking the left-hand side, we have
(idelΓ×Ξ)∗Br(Θ)(λyαx2.f(y,x2,α))=Br(Θ)(λyαx1.Ξ(x1)(λx2.f(y,x2,α)))
=maxθ′∈Br(Θ)θ′(λyαx1.Ξ(x1)(λx2.f(y,x2,α)))
=maxθ′∈Br(Θ)(idelΓ×Ξ)∗(θ′)(λyαx2.f(y,x2,α))
=maxθ′∈Br(Θ)maxθ′′∈(idelΓ×Ξ)∗(θ′)θ′′(λyαx2.f(y,x2,α))
Now we’ll show that any such θ′′ lies in Br((idΓ×Ξ)∗Θ). The support condition is trivial, so that leaves the endofunction condition.
θ′′(λyαx2.χs(y)∈αg(s(y),x2))≤(idelΓ×Ξ)∗(θ′)(λyαx2.χs(y)∈αg(s(y),x2))
=θ′(λyαx1.Ξ(x1)(λx2.χs(y)∈αg(s(y),x2)))
Then, by homogeneity of Ξ, we can pull a constant out, the indicator function, to get
=θ′(λyαx1.χs(y)∈αΞ(x1)(λx2.g(s(y),x2)))
Then, by the endofunction condition, we get
≤Θ(λyx1.Ξ(x1)(λx2.g(y,x2)))=(idΓ×Ξ)∗(Θ)(λyx2.g(y,x2))
And bam, we’ve showed that θ′′ lies in the appropriate set. We were last at
=maxθ′∈Br(Θ)maxθ′′∈(idelΓ×Ξ)∗(θ′)θ′′(λyαx2.f(y,x2,α))
So we can impose an upper bound of
≤maxθ′′∈Br((idΓ×Ξ)∗Θ)θ′′(λyαx2.f(y,x2,α))=Br((idΓ×Ξ)∗(Θ))(λyαx2.f(y,x2,α))
And we’re done, we’ve established our desired inequality. Now for proving another inequality.
Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ)
This can be proved by applying Proposition 2.10. Start out with Proposition 2.10.
(idelΓ×t)∗Br(Θ′)⊆Br((idΓ×t)∗Θ′)
Specialize Θ′ to (idΓ×Ξ)∗(Θ), yielding
(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆Br((idΓ×t)∗(idΓ×Ξ)∗Θ)
The two pushforwards can be folded into one.
(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆Br((idΓ×t∘Ξ)∗Θ)
Now use that t∘Ξ is identity
(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆Br(Θ)
Apply pullback along t to both sides
(idelΓ×t)∗(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ)
On the other hand, we also have
Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)
Combining the last two lines, we get the desired result:
Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ)
That just leaves showing the equality result, now that we’ve got both subset inclusions.
We start out with
(idelΓ×Ξ)∗Br(Θ)⊆Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ)
Applying projection yields
prelΓ((idelΓ×Ξ)∗Br(Θ))⊆prelΓ(Br((idΓ×Ξ)∗Θ))⊆prelΓ((idelΓ×t)∗Br(Θ))
For any given f, we have
prelΓ((idelΓ×Ξ)∗Br(Θ))(λyα.f(y,α))≤prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))≤prelΓ((idelΓ×t)∗Br(Θ))(λyα.f(y,α))
Unpacking the projection on the two ends, we get
(idelΓ×Ξ)∗Br(Θ)(λyαx2.f(y,α))≤prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))≤(idelΓ×t)∗Br(Θ)(λyαx2.f(y,α))
And then, for the smallest and largest quantities, the pullback or pushforward only affects the x1 or x2 coordinates, everything else remains the same. In particular, f doesn’t depend on such coordinates. So, we get
Br(Θ)(λyαx1.f(y,α))≤prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))≤Br(Θ)(λyαx1.f(y,α))
The left and right hand sides are equal, so we have
prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))=Br(Θ)(λyαx1.f(y,α))
Then, packing the projection back up, we have
prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))=prelΓBr(Θ)(λyα.f(y,α))
And since there’s equality for all functions, the two ultradistributions are equal, so we have
prelΓ(Br((idΓ×Ξ)∗Θ))=prelΓBr(Θ)
And we’re done. ■
Proposition 2.12: For any Γ, Φ, Θ1,Θ2∈□c(Γ×Φ) and p∈[0,1]:
pBr(Θ1)+(1−p)Br(Θ2)⊆Br(pΘ1+(1−p)Θ2)
We’ll do this by showing that all contributions of the form pθ+(1−p)θ′ with θ∈Br(Θ1) and θ′∈Br(Θ2) lie within Br(pΘ1+(1−p)Θ2).
The support property is trivial, so that leaves the endofunction property.
(pθ+(1−p)θ′)(λyαx.χs(y)∈αg(s(y),x))=pθ(λyαx.χs(y)∈αg(s(y),x))+(1−p)θ′(λyαx.χs(y)∈αg(s(y),x))
≤pΘ1(λyx.g(y,x))+(1−p)Θ2(λyx.g(y,x))
=(pΘ1+(1−p)Θ2)(λyx.g(y,x))
Done, the mix lies in Br(pΘ1+(1−p)Θ2). ■
Proposition 2.13: Consider some Γ, Φ1, Φ2, Θ1∈□c(Γ×Φ1), Θ2∈□c(Γ×Φ2) and p∈[0,1]. Regard Φ1 and Φ2 as subsets of Φ1⊔Φ2, so that pΘ1+(1−p)Θ2∈□c(Γ×(Φ1⊔Φ2)). Then,
pBr(Θ1)+(1−p)Br(Θ2)=Br(pΘ1+(1−p)Θ2)
We already established one subset direction in Proposition 2.12. We just need the other direction, to take any θ′′∈Br(pΘ1+(1−p)Θ2), and write it as pθ+(1−p)θ′ where θ∈Br(Θ1) and θ′∈Br(Θ2).
We trivially have equality when p=0 or 1, so that leaves the cases where p∈(0,1). With that, our attempted θ will be (1/p)χΦ1θ′′, and our attempted θ′ will be (1/(1−p))χΦ2θ′′.
Now, clearly, these mix to make θ′′, because
pθ+(1−p)θ′=p⋅(1/p)χΦ1θ′′+(1−p)⋅(1/(1−p))χΦ2θ′′=χΦ1θ′′+χΦ2θ′′=θ′′
Remember, Φ1 and Φ2 are the two components of a space made by disjoint union.
They clearly both fulfill the support condition, because they’re made from θ′′ which fulfills the support condition. This leaves the endofunction condition, which is somewhat nontrivial to show, and will require some setup. Clearly, without loss of generality, we just need to show it for θ, and θ′ follows by identical arguments. For some g:Γ×(Φ1⊔Φ2)→[0,1], g′ will denote the function that mimics g on Γ×Φ1, and is 0 on Γ×Φ2.
Also, note that θ, as defined, is supported entirely on Γ×Φ1, and θ′, as defined, is supported entirely on Γ×Φ2. Let’s get started on showing our endofunction condition. Let g and s be arbitrary.
θ(λyαx.χs(y)∈αg(s(y),x))=θ(λyαx.χs(y)∈αg(s(y),x))+((1−p)/p)θ′(λyαx.0)
This is because the expectation of an all-0 function is 0. Then,
=θ(λyαx.χs(y)∈αg′(s(y),x))+((1−p)/p)θ′(λyαx.χs(y)∈αg′(s(y),x))
Why does this happen? Well, θ is supported entirely on Φ1, and g′ mimics g perfectly on Φ1. And θ′ is supported entirely on Φ2, and g′ mimics 0 perfectly on Φ2. So it doesn’t matter that we changed the functions outside of the contribution support.
=(θ+((1−p)/p)θ′)(λyαx.χs(y)∈αg′(s(y),x))
=(1/p)(pθ+(1−p)θ′)(λyαx.χs(y)∈αg′(s(y),x))
=(1/p)θ′′(λyαx.χs(y)∈αg′(s(y),x))
Now we use the endofunction condition on θ′′ to get
≤(1/p)(pΘ1+(1−p)Θ2)(λyx.g′(y,x))
=(Θ1+((1−p)/p)Θ2)(λyx.g′(y,x))
=Θ1(λyx.g′(y,x))+((1−p)/p)Θ2(λyx.g′(y,x))
And then we use that Θ1 is supported on Φ1 and Θ2 is supported on Φ2, and on Φ1, g′=g, and on Φ2, g′=0, to get
=Θ1(λyx.g(y,x))+((1−p)/p)Θ2(λyx.0)=Θ1(λyx.g(y,x))
And we’re done! The endofunction condition goes through, which is the last piece of the proof we needed. ■
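The splitting used in this proof is simple enough to state as code; a minimal sketch (hypothetical representation as before), restricting θ′′ to each component of the disjoint union and rescaling:

```python
def split_mixture(theta2, Phi1, p):
    """Split theta'' on Gamma x 2^Gamma x (Phi1 u Phi2) into the pair
    (theta, theta') from the proof: restrict to each component of the
    disjoint union (the x coordinate is key[2]) and rescale."""
    assert 0 < p < 1
    theta = {k: m / p for k, m in theta2.items() if k[2] in Phi1}
    theta_prime = {k: m / (1 - p) for k, m in theta2.items()
                   if k[2] not in Phi1}
    return theta, theta_prime
```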
Proposition 2.14: Consider some Γ, Φ, Θ∈□c(Γ×Φ) and F⊆Γ. Let ∩F:2Γ→2Γ be defined by ∩F(α):=F∩α. Then,
BrΓ(Θ∩⊤F×Φ)=(idΓ×∩F×idΦ)∗(BrΓ(Θ)∩⊤F×2Γ×Φ)
Moreover, let ι:Γ′→Γ be an injection and elι:elΓ′→elΓ be the injection induced by ι. Then,
BrΓ′((ι×idΦ)∗(Θ∩⊤im(ι)×Φ))=(elι×idΦ)∗BrΓ(Θ∩⊤im(ι)×Φ)
We’ll have to prove both directions separately. Notice that for our proofs so far, they involve transforming one side of the equation to a certain form, then there’s a critical step in the middle that’s tricky to show, then we just transform back down. So our first step will be unpacking the two sides of the equation up to the critical inequality.
Also, intersecting with the top distribution corresponding to a particular set is the same as the raw-update on that set, we’ll use that. We’ll suppress identity functions and unused coordinates for ⊤ in the notation, so if some coordinate isn’t mentioned, assume it’s either the identity function for pushforwards, or that the set for ⊤ is the specified set times the entire space for unmentioned coordinates. With that notational note, we have
BrΓ(Θ∩⊤F×Φ)(λyαx.f(y,x,α))
Suppressing notation
=Br(Θ∩⊤F)(λyαx.f(y,x,α))
And rewriting the bridge transform,
=maxθ∈Br(Θ∩⊤F)θ(λyαx.f(y,x,α))
This is our unpacking in one direction.
In the other direction, we have
(idΓ×∩F×idΦ)∗(BrΓ(Θ)∩⊤F×2Γ×Φ)(λyαx.f(y,x,α))
Suppress some notation,
=(∩F)∗(Br(Θ)∩⊤F)(λyαx.f(y,x,α))
Unpack the pushforward and intersection
=(Br(Θ)∩⊤F)(λyαx.f(y,x,α∩F))=Br(Θ)(λyαx.χy∈Ff(y,x,α∩F))
And unpack the bridge transform
=maxθ′∈Br(Θ)θ′(λyαx.χy∈Ff(y,x,α∩F))
Pack back up
=maxθ′∈Br(Θ)(θ′∩⊤F)(λyαx.f(y,x,α∩F))=maxθ′∈BrΓ(Θ)(∩F)∗(θ′∩⊤F)(λyαx.f(y,x,α))
So, our goal to show equality overall is to establish the equality
maxθ∈Br(Θ∩⊤F)θ(λyαx.f(y,x,α))=maxθ′∈Br(Θ)(∩F)∗(θ′∩⊤F)(λyαx.f(y,x,α))
This can be done by establishing the following two things. First: if θ′∈Br(Θ), then (∩F)∗(θ′∩⊤F)∈Br(Θ∩⊤F). That establishes the ≥ inequality direction.
For the reverse direction, we’ll need to show that for any θ∈Br(Θ∩⊤F), θ also lies in Br(Θ), and that (∩F)∗(θ∩⊤F)=θ. This establishes the ≤ inequality direction, since any choice of θ for the left hand side can be duplicated on the right hand side.
Let’s switch our proof target to showing these two things. First off, assume θ′∈Br(Θ). The goal is to show that (∩F)∗(θ′∩⊤F)∈Br(Θ∩⊤F)
For once, the support condition is not entirely trivial. However, notice that θ′∩⊤F always has y∈F holding (because we updated) and y∈α always holds (because θ′ fulfills the support condition due to being in Br(Θ)). So, it’s guaranteed that y∈α∩F. Then, applying the pushforward that turns α into α∩F doesn’t change that there’s a guarantee that y lies in the set it’s paired with.
Now for the endofunction condition.
(∩F)∗(θ′∩⊤F)(λyαx.χs(y)∈αg(s(y),x))=(θ′∩⊤F)(λyαx.χs(y)∈α∩Fg(s(y),x))
=θ′(λyαx.χy∈Fχs(y)∈α∩Fg(s(y),x))
≤θ′(λyαx.χs(y)∈α∩Fg(s(y),x))
=θ′(λyαx.χs(y)∈αχs(y)∈Fg(s(y),x))
≤Θ(λyx.χy∈Fg(y,x))
=(Θ∩⊤F)(λyx.g(y,x))
And we’re done, that half of the proof works out.
Now for the reverse direction, establishing that for any θ∈Br(Θ∩⊤F), θ also lies in Br(Θ), and that (∩F)∗(θ∩⊤F)=θ.
The first part of this is easy.
θ(λyαx.χs(y)∈αg(s(y),x))≤(Θ∩⊤F)(λyx.g(y,x))=Θ(λyx.χy∈Fg(y,x))≤Θ(λyx.g(y,x))
And we’re done. For the second part, it’s a bit trickier. In short, what happens is that a piece of measure from θ is deleted if y∉F, and then α gets mapped to α∩F. So, if we knew that θ was supported on F×2F, we’d know that neither of the two transformations do anything whatsoever, and so you just get θ out again.
So, let’s show that θ∈Br(Θ∩⊤F) is supported on F×2F.
For this, the way we do it is use Proposition 2.7 to get an upper bound on the bridge transformation, and show that said upper bound is also supported on F×2F. By proposition 2.7,
Br(Θ∩⊤F)⊆((Θ∩⊤F)⋉susΘ∩⊤F)↓
The downward closure only adds contributions that move mass to smaller subsets, so it doesn’t matter if we remove that; it won’t affect whether the support is a subset of F×2F. Clearly, Θ∩⊤F is supported on F. And, for sus, we have
susΘ∩⊤F(x)={y|(y,x)∈supp (Θ∩⊤F)}
So clearly it can only produce sets of y which are subsets of F since that’s an upper bound on the support of Θ∩⊤F. So we have support on F×2F, and we’re done with this half of the theorem.
So now we move onto the second half of the theorem with the injections. Without loss of generality, we can reduce this problem to a slightly simpler form, where we assume that Θ is supported over im(ι). This is because any Θ supported over im(ι) has Θ∩⊤im(ι)=Θ, and also for any Θ′, Θ′∩⊤im(ι) is always supported over im(ι).
Accordingly, assume that Θ is supported over im(ι). Suppressing identity functions, and writing F for Γ′, our goal is to prove that
BrF((ι)∗Θ)=(elι)∗BrΓ(Θ)
This is surprisingly aggravating to prove. Again, we’ll try to rewrite things until we get to a point where there’s just one equality to show, and then put in the work of showing the endofunction condition in both directions. Time to start rewriting. Subscripts F will be used to denote when a particular y is part of F, rather than of Γ. Similarly with 2F and 2Γ.
BrF((ι)∗Θ)(λyFαFx.f(yF,x,αF))=maxθF∈BrF((ι)∗Θ)θF(λyFαFx.f(yF,x,αF))
Rewriting the other direction, we have
(elι)∗BrΓ(Θ)(λyFαFx.f(yF,x,αF))
Now, the pullback is defined as follows (with the convention that a maximum over the empty set is 0).
=BrΓ(Θ)(λyαx.maxyF,αF∈(elι)−1(y,α)f(yF,x,αF))
=maxθ∈BrΓ(Θ)θ(λyαx.maxyF,αF∈(elι)−1(y,α)f(yF,x,αF))
=maxθ∈BrΓ(Θ)(elι)∗(θ)(λyFαFx.f(yF,x,αF))
And so we now must embark on showing these two things are equal, by showing that every value that one of the maxes is capable of producing, the other is capable of producing too. Our proof goal is
maxθF∈BrF((ι)∗Θ)θF(λyFαFx.f(yF,x,αF))=maxθ∈BrΓ(Θ)(elι)∗(θ)(λyFαFx.f(yF,x,αF))
Now, for one direction of this, we need to show that if θ∈BrΓ(Θ), then (elι)∗(θ)∈BrF((ι)∗Θ).
The support condition is easily fulfilled, because the pullback is only nonzero when y,α∈im(elι), and ι is an injection, so there’s a unique preimage point yF,αF; and since y∈α always occurs, yF∈αF always occurs. That leaves the endofunction condition. Let sF:F→F, gF:F×Φ→[0,1]. Let s′ be defined as follows. For points in im(ι), it maps y to ι(sF(ι−1(y))). Everywhere else, it’s the identity. Also, g′, for points in im(ι), maps y,x to gF(ι−1(y),x), and is 0 everywhere else. Both of these are uniquely defined because ι is an injection. We have
(elι)∗(θ)(λyFαFx.χsF(yF)∈αFgF(sF(yF),x))=θ(λyαx.maxyF,αF∈(elι)−1(y,α)χsF(yF)∈αFgF(sF(yF),x))
We can partially write the maximum as an indicator function checking whether y∈im(ι) and α⊆im(ι) (because otherwise the maximum is taken over an empty set and 0 is returned). And also, since there’s only one point in that inverse, since ι is an injection, there’s a canonical inverse and we can swap everything out for that, yielding
=θ(λyαx.χy∈im(ι)∧α⊆im(ι)χsF(ι−1(y))∈ι−1(α)gF(sF(ι−1(y)),x))
Now, since ι is an injection, applying it to the point and the set in the indicator function don’t affect whether the point is in the relevant set, so we can go
=θ(λyαx.χy∈im(ι)∧α⊆im(ι)χι(sF(ι−1(y)))∈ι(ι−1(α))gF(sF(ι−1(y)),x))
And we can also apply ι and ι−1 to the point in the gF function, as that’s just identity.
=θ(λyαx.χy∈im(ι)∧α⊆im(ι)χι(sF(ι−1(y)))∈ι(ι−1(α))gF(ι−1(ι(sF(ι−1(y)))),x))
Use our s′ and g′ abbreviations since we know that the relevant point is in im(ι) if it made it past the indicator function. Also do some canceling out of identity functions around the α.
=θ(λyαx.χy∈im(ι)∧α⊆im(ι)χs′(y)∈αgF(ι−1(s′(y)),x))
And use our abbreviation g′:
=θ(λyαx.χy∈im(ι)∧α⊆im(ι)χs′(y)∈αg′(s′(y),x))
Remove the indicator function
≤θ(λyαx.χs′(y)∈αg′(s′(y),x))
Apply endofunction property
≤Θ(λyx.g′(y,x))
Now, g′ is 0 when y∉im(ι), and is gF(ι−1(y),x) otherwise, so we have
=Θ(λyx.χy∈im(ι)gF(ι−1(y),x))=Θ(λyx.maxyF∈ι−1(y)gF(yF,x))=(ι)∗Θ(λyFx.gF(yF,x))
And we’re done here, the endofunction condition has been shown, establishing one direction of the inequality.
For the reverse direction, we’ll be working on showing that if θF∈BrF((ι)∗Θ), then (elι)∗θF∈BrΓ(Θ), though it will take a little bit of extra work at the end to show how this implies our desired equality. The support property obviously holds, we’re just applying ι to our point and set, so that leaves the endofunction property. s and g are as usual.
(elι)∗θF(λyαx.χs(y)∈αg(s(y),x))=θF(λyFαFx.χs(ι(yF))∈ι(αF)g(s(ι(yF)),x))
Now, if s(ι(yF))∈im(ι), it might pass the indicator function, but if s(ι(yF))∉im(ι), it definitely won’t. So let’s place that indicator function.
=θF(λyFαFx.χs(ι(yF))∈im(ι)χs(ι(yF))∈ι(αF)g(s(ι(yF)),x))
Now, since we know this stuff is in the image of ι (and ι(αF) definitely is), and ι is an injection, we can safely take the inverse of everything, yielding
=θF(λyFαFx.χs(ι(yF))∈im(ι)χι−1(s(ι(yF)))∈ι−1(ι(αF))g(ι(ι−1(s(ι(yF)))),x))
In order, this was because we could reverse the injection as long as we’re past that indicator function, reversing the injection doesn’t affect whether the point is an element of the set, and ι∘ι−1 is identity. Cancel out some of the identities.
=θF(λyFαFx.χs(ι(yF))∈im(ι)χι−1(s(ι(yF)))∈αFg(ι(ι−1(s(ι(yF)))),x))
Now, at this point, let sF:F→F be defined as sF(yF)=ι−1(s(ι(yF))) when s(ι(yF))∈im(ι), and arbitrary otherwise. And let gF be defined as gF(yF,x)=g(ι(yF),x).
Now, using these abbreviations, we can go
=θF(λyFαFx.χs(ι(yF))∈im(ι)χsF(yF)∈αFg(ι(sF(yF)),x))=θF(λyFαFx.χs(ι(yF))∈im(ι)χsF(yF)∈αFgF(sF(yF),x))
Now we can safely strip away the indicator function.
≤θF(λyFαFx.χsF(yF)∈αFgF(sF(yF),x))
And apply the endofunction condition
≤(ι)∗(Θ)(λyFx.gF(yF,x))
Unpacking the pullback, we get
=Θ(λyx.maxyF∈ι−1(y)gF(yF,x))
Now unpack the definition of gF:
=Θ(λyx.maxyF∈ι−1(y)g(ι(yF),x))
Use that Θ is supported on the image of ι, so that max is always nonempty. Then, things cancel out to yield
=Θ(λyx.g(y,x))
And we’re done there.
Ok, so… we’ve shown one inequality direction already, all that remained to be shown was
maxθF∈BrF((ι)∗Θ)θF(λyFαFx.f(yF,x,αF))≤maxθ∈BrΓ(Θ)(elι)∗(θ)(λyFαFx.f(yF,x,αF))
How does it help to know that for any θF∈BrF((ι)∗Θ), then (elι)∗θF∈BrΓ(Θ)? Well, for any particular θF chosen on the left-hand side, you can let the choice of θ for the right hand side be (elι)∗θF, which lies in the appropriate set. Then, the contribution for the right-hand-side would be (elι)∗((elι)∗(θF)), which happens to be θF. So anything the left-hand-side can do, the right-hand-side can do as well, showing our one remaining direction of inequality, and concluding the proof.
Well, actually, I should flesh out that claim that (elι)∗((elι)∗(θF))=θF. This happens because
(elι)∗((elι)∗(θF))(λyFαFx.f(yF,x,αF))=(elι)∗(θF)(λyαx.χ(y,α)∈elFmaxyF,αF∈(elι)−1(y,α)f(yF,x,αF))
=θF(λyFαFx.χ(ι(yF),ι(αF))∈elFmaxy′F,α′F∈(elι)−1(ι(yF),ι(αF))f(y′F,x,α′F))
Now, that initial indicator condition is always true because θF is supported on elF, so we have
=θF(λyFαFx.maxy′F,α′F∈(elι)−1(ι(yF),ι(αF))f(y′F,x,α′F))
And then, since elι is injective, applying it then taking the partial inverse is just identity, so we get
=θF(λyFαFx.f(yF,x,αF))
And that’s why we have equality. Alright, proof finally done. ■
Proposition 2.15: Consider some Γ0, Γ1, r:Γ0→Γ1, Φ, Θ∈□c(Γ0×Φ). Let ι:Γ0→Γ0×Γ1 be given by ι(y):=(y,r(y)). Then,
BrΓ0×Γ1((ι×idΦ)∗Θ)=(elι×idΦ)∗BrΓ0(Θ)
This can be established by Proposition 2.14. Taking Proposition 2.14, and using Θ′ as an abbreviation for (ι×idΦ)∗Θ, and im(ι) as the F, we get
BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)=(idΓ0×Γ1×∩im(ι)×idΦ)∗(BrΓ0×Γ1(Θ′)∩⊤im(ι)×2Γ×Φ)
We’ll need to abbreviate in a bit, so ignore some of the identity functions to get
BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)=(∩im(ι))∗(BrΓ0×Γ1(Θ′)∩⊤im(ι)×2Γ×Φ)
Now, deabbreviating, Θ′, ie, (ι×idΦ)∗Θ, is already supported on im(ι), so intersecting with ⊤im(ι)×Φ does nothing.
BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)=(∩im(ι))∗(BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)∩⊤im(ι)×2Γ×Φ)
Now, since Θ′∩⊤im(ι)×Φ is only supported on im(ι), the bridge transform will only be supported on elim(ι). So, we can pull back along the function elι and push forward, and it will be identity, as our ultracontribution is entirely supported on the area that can be pulled back. So, we get
BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)=(∩im(ι))∗((elι×idΦ)∗((elι×idΦ)∗(BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)))∩⊤im(ι)×2Γ×Φ)
Applying the second part of Proposition 2.14, this can be rewritten as
BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)=(∩im(ι))∗((elι×idΦ)∗(BrΓ0((ι×idΦ)∗(Θ′∩⊤im(ι)×Φ)))∩⊤im(ι)×2Γ×Φ)
Since Θ′ deabbreviates as (ι×idΦ)∗Θ, it’s supported on im(ι), so intersecting with im(ι) does nothing. Getting rid of that part, and deabbreviating, we get
BrΓ0×Γ1((ι×idΦ)∗Θ)=(∩im(ι))∗((elι×idΦ)∗(BrΓ0((ι×idΦ)∗((ι×idΦ)∗Θ)))∩⊤im(ι)×2Γ×Φ)
Now, since ι is an injection, pushforward-then-pullback is identity, so we get
BrΓ0×Γ1((ι×idΦ)∗Θ)=(∩im(ι))∗((elι×idΦ)∗(BrΓ0(Θ))∩⊤im(ι)×2Γ×Φ)
Then, since the pushforward through elι is supported on im(ι), the intersection does nothing
BrΓ0×Γ1((ι×idΦ)∗Θ)=(∩im(ι))∗((elι×idΦ)∗(BrΓ0Θ))
Then note that any set pushed forward through elι must be a subset of im(ι), so applying ∩im(ι) does nothing, and we get
BrΓ0×Γ1((ι×idΦ)∗Θ)=(elι×idΦ)∗(BrΓ0Θ)
And the result is proven. ■
Infra-Bayesian physicalism: proofs part I
This post is an appendix to “Infra-Bayesian physicalism: a formal theory of naturalized induction”.
Lemma 1: prΓ×Φ(χy∈α(s∗(θ)))∈Θ occurs for all s:Γ→Γ iff, for all g:Γ×Φ→[0,1] and s:Γ→Γ, θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x))
This lemma will be implicitly used all over the place, in order to deal with the “presence in the bridge transform” condition algebraically, in terms of function expectations. The bulk of the conditions for a contribution to lie in the bridge transform is this endofunction condition (the support condition is trivial most of the time), so it’s advantageous to reformulate it. First-off, we have that prΓ×Φ(χy∈α(s∗(θ)))∈Θ iff, for all g:Γ×Φ→[0,1], we have prΓ×Φ(χy∈α(s∗(θ)))(λyx.g(y,x))≤Θ(λyx.g(y,x)) By LF-duality for ultradistributions, proved in Less Basic Inframeasure theory. A contribution lies in an ultracontribution set iff its expectations, w.r.t. all functions, are less than or equal to the ultracontribution expectations.
Now, we just need to unpack the left-hand-side into our desired form. Start off with prΓ×Φ(χy∈α(s∗(θ)))(λyx.g(y,x)) Apply how projections work =χy∈α(s∗(θ))(λyαx.g(y,x)) Now, we can move the indicator function into the function we’re taking the expectation again, because there’s no difference between deleting all measure outside of an event, or taking the expectation of a function that’s 0 outside of that event. So, we get =s∗(θ)(λyαx.χy∈αg(y,x)) Then we use how pushforwards are defined in probability theory. =θ(λyαx.χs(y)∈αg(s(y),x)) And that’s our desired form. So, we’ve shown that for any particular s:Γ→Γ, we have prΓ×Φ(χy∈α(s∗(θ)))∈Θ iff, for all g:Γ×Φ→[0,1], θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x)) And so, we get our desired iff statement by going from the iff for one s to the iff for all s. ■
Proposition 2.1: For any Γ, Φ and Θ∈□c(Γ×Φ), Br(Θ) exists and satisfies prΓ×ΦBr(Θ)=Θ. In particular, if Θ∈□(Γ×Φ) then Br(Θ)∈□(elΓ×Φ).
Proof sketch: We’ll show that for any particular contribution in θ∈Θ, there’s a contribution θ∗ which lies within Br(Θ) that projects down to equal θ. And then, show the other direction, that any contribution in Br(Θ) lands within Θ when you project it down. Thus, the projection of Br(Θ) must be Θ exactly.
For the first direction, given some θ∈Θ, let θ∗:=θ⋉(λy.{y}). Note that since θ∈Δc(Γ×Φ), this means that θ∗∈Δc(Γ×2Γ×Φ), so the type signatures line up. Clearly, projecting θ∗ down to Γ×Φ makes θ again. So that leaves showing that θ∗∈Br(Θ). Applying Lemma 1, an equivalent way of stating the bridge transform is that it consists precisely of all the θ′∈Δc(Γ×2Γ×Φ) s.t. for all s:Γ→Γ and g:Γ×Φ→[0,1], θ′(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x)) and also, supp θ′⊆elΓ×Φ.
Clearly, for our given θ∗, everything works out with the support condition, so that leaves the endofunction condition. Let s,g be arbitrary. θ∗(λyαx.χs(y)∈αg(s(y),x))=(θ⋉(λy.{y}))(λyαx.χs(y)∈αg(s(y),x)) =θ(λyx.δ{y}(λα.χs(y)∈αg(s(y),x)))=θ(λyx.χs(y)∈{y}g(s(y),x)) =θ(λyx.χs(y)=yg(s(y),x))=θ(λyx.χs(y)=yg(y,x))≤θ(λyx.g(y,x)) ≤maxθ∈Θθ(λyx.g(y,x))=Θ(λyx.g(y,x)) In order, the equalities were unpacking the semidirect product, substituting the dirac-delta in, reexpressing the condition for the indicator function, using that s(y)=y inside the indicator function, applying monotonicity of θ, using that θ∈Θ, and then packing up the definition of Θ. And that inequality has been fulfilled, so, yes, θ∗ lies within Br(Θ). Since θ was arbitrary within Θ, this shows that the projection set is as-big-or-bigger than Θ.
Now to show the reverse direction, that anything in the projection set lies within Θ. For any particular θ′∈Θ, remember, it must fulfill, for all g,s, that θ′(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x)) So, in particular, we can let s:Γ→Γ be the identity function, and g be anything, and we’d get θ′(λyαx.χy∈αg(y,x))≤Θ(λyx.g(y,x)) And, since θ′ is in Br(Θ), it’s supported on the (y,α) pairs s.t. y∈α, so that indicator function drops out of existence, and we get θ′(λyαx.g(y,x))≤Θ(λyx.g(y,x)) And then this can be written as a projection prΓ×Φ(θ′)(λyx.g(y,x))≤Θ(λyx.g(y,x)) And since this holds for all the various g, this means that prΓ×Φ(θ′)∈Θ And we’re done with the other direction of things, the projection must be exactly equal to Θ since projections of contributions in Br(Θ) always land in Θ, and we can surject onto Θ. ■
Proposition 2.2: Let X be a poset, θ,η∈ΔcX. Then, θ⪯η if and only if there exists κ:X→ΔX s.t.: For all x∈X, y∈supp κ(x): x≤y. κ∗θ≤η
Vanessa proved this one, with a very nice proof involving the max-flow min-cut theorem.
Make the following directed graph: The nodes are the elements of {s}∪{t}∪X×{0,1}. Basically, two copies of the finite poset X, a source node s, and a sink node t. x will be used to denote a variable from X, while a subscript of 0 or 1 denotes that the 0 or 1 version of that x, respectively.
As for the edges, there is an edge from s to every point in X0. And an edge from every point in X1 to t. And, for some x0∈X0 and y1∈X1, there’s an edge iff x≤y. The capacities of the edges are, for all x∈X, cs,x0:=θ(x), and for all x∈X, cx1,t:=η(x), and for x,y∈X s.t. x≤y, cx0,y1:=∞.
To work up to applying min-cut max-flow, we must show that the cut of all the edges between s and X0 is a min-cut of the network.
Remember, all cuts are induced by partitions of the nodes into two sets, S and T, where S includes the source s and T includes the sink t. And the value of a cut is c(S,T)=∑a∈S,b∈Tca,b=∑x0∈Tcs,x0+∑x0∈S,y1∈T:x≤ycx0,y1+∑y1∈Scy1,t Implicitly converting both of the following sets to be subsets of X, we have (S∩X1)⊇(S∩X0)↑ for any non-infinite cut. Why? Well, if it was false, then there’d be some y≥x where x0∈S, and yet y1∉S. So y1∈T. Then, the cost of the cut would include cx0,y1=∞, contradiction. So, now, we have that for any non-infinite cut, c(S,T)=∑a∈S,b∈Tca,b=∑x0∈Tcs,x0+∑y1∈Scy1,t and (S∩X1)⊇(S∩X0)↑ holds.
Now, we’ll change our cut. Given some cut (S,T), let our new cut (S′,T′) be defined as S′:={s}∪((S∩X0)↑×{0,1}) (which still respects that superset property listed above since both sets would be the same), and T′ be the complement. Letting some stuff cancel out, we get c(S,T)−c(S′,T′)=∑x0∈Tcs,x0+∑y1∈Scy1,t−∑x0∈T′cs,x0−∑y1∈S′cy1,t =∑x0∈T/T′cs,x0+∑y1∈S/S′cy1,t−∑x0∈T′/Tcs,x0−∑y1∈S′/Scy1,t Now, because S′∩X1=(S∩X0)↑×{1}⊆S∩X1 by that subset property, it means that there’s no y1∈S′/S, so we get =∑x0∈T/T′cs,x0+∑y1∈S/S′cy1,t−∑x0∈T′/Tcs,x0 As for that other minus term, we have that S′∩X0=(S∩X0)↑×{0}⊇S∩X0 by how it was defined, so T′∩X0⊆T∩X0, so that other minus term is 0, and we get =∑x0∈T/T′cs,x0+∑y1∈S/S′cy1,t≥0 Rearranging this a bit, we have c(S,T)≥c(S′,T′).
And now we’ll show that c(S′,T′) is underscored by c({s},(X×{0,1})∪{t}). Let’s go. We have c(S′,T′)−c({s},(X×{0,1})∪{t}) =∑x0∈T′cs,x0+∑y1∈S′cy1,t−∑x0∈(X×{0,1})∪{t}cs,x0−∑y1∈{s}cy1,t This simplifies a bit as =∑x0∈T′cs,x0+∑y1∈S′cy1,t−∑x0∈X0cs,x0 This can be rewritten a bit as =⎛⎝∑x0∈X0cs,x0−∑x0∈S′cs,x0⎞⎠+∑y1∈S′cy1,t−∑x0∈X0cs,x0 =∑y1∈S′cy1,t−∑x0∈S′cs,x0 And then, using what S′ is defined as, we can get =∑y∈(S∩X0)↑cy1,t−∑x∈(S∩X0)↑cs,x0 and using what the costs are, it’s =∑y∈(S∩X0)↑η(y)−∑x∈(S∩X0)↑θ(y) =η((S∩X0)↑)−θ((S∩X0)↑)≥0 This holds because θ⪯η and the indicator function for (S∩X0)↑ is monotone. And so, we get c(S′,T′)≥c({s},(X×{0,1})∪{t}). And we previously showed that c(S,T)≥c(S′,T′). So, this means that the cut around s is a minimal cut, it underscores all other cuts.
By the max-flow min-cut theorem, there’s a way of having stuff flow from s to t that saturates the capacities of all the edges that are cut.fx0,y1 will be used to denote the flow from x0 to y1 according to this max-flow way. Let’s finish things up.
Define κ:X→ΔX as follows. For some x, if fs,x0>0, then κ(x)(y):=fx0,y1fs,x0 If fs,x0=0, let κ(x) be any probability distribution on x↑. Note that κ(x) is always a probability distribution supported on x↑, by fiat if fs,x0=0, and otherwise, ∑y∈Xκ(x)(y)=∑y∈Xfx0,y1fs,x0=fs,x0fs,x0=1 This is because, the flow out of x0 must equal the flow into x0 from the source. And κ(x) is supported on x↑ because the only edges out of x0 go to y1 where y≥x. Now that we’ve got this demonstrated, we’ll show our desired inequality. Fix an arbitrary y. We have κ∗(θ)(y)=∑x∈Xθ(x)⋅κ(x)(y)=∑x∈Xcs,x0⋅fx0,y1fs,x0 And then we use that all the capacities of the edges cut in the min-cut are saturated according to the max-flow plan, so cs,x0=fs,x0, and we have =∑x∈Xfs,x0⋅fx0,y1fs,x0 Now, if fs,x0=0, then because the flow in equals the flow out, that means that fx0,y1=0, and otherwise we can cancel out, so we can rewrite as =∑x∈Xfx0,y1=fy1,t≤cy1,t=η(y) Where the first equality came from flow in equals flow out, the inequality came from the flow plan respecting the capacity of the paths, and the equality came from how the capacities were defined. So, for all y∈X, we have κ∗(θ)(y)≤η(y) so we have κ∗(θ)≤η.
For the reverse direction, if there’s some κ:X→ΔX s.t κ(x) is supported on x↑ s.t. κ∗(θ)≤η, then for any monotone function f, we have η(f)≥κ∗(θ)(f)=θ(λx.κ(x)(f)) And then we use that, since κ(x) is supported on x↑, and f is monotone, f(x) is less than or equal to the expectation of f w.r.t κ(x) (remember, for κ(x) you have a 100 percent chance of drawing something at-or-above x, which guarantees that f of whatever you picked is above f(x)). And so, we get ≥θ(f) And since this holds for every monotone f, we have θ⪯η.■
Proposition 2.3: Let X be a poset, θ,η∈ΔcX. Then, θ⪯η if and only if there exists κ:X→ΔX s.t. For all x∈X, y∈supp κ(x): x≥y. θ≤κ∗η
Proof: Because θ⪯η, we can get a maximum flow in exactly the same way as Proposition 2.2. Then, just flip the direction of all flows, which will be denoted by swapping the order of the subscripts. Now, define κ(y)(x):=fy1,x0ft,y1 And, again, if it’s 0, it should be an arbitrary probability distribution supported on y↓. Note that κ(y) is always a probability distribution supported on y↓, by fiat if f′t,y1=0, and otherwise, ∑x∈Xκ(y)(x)=∑x∈Xfy1,x0ft,y1=ft,y1ft,y1=1 This is because, the flow out of y1 must equal the flow into y1 from the source t (the old sink). And κ(y) is supported on y↓ because the only edges out of y1 go to x0 where y≥x. Now that we’ve got this demonstrated, we’ll show our desired inequality. Fix an arbitrary x. We have κ∗(η)(x)=∑y∈Xη(y)⋅κ(y)(x)=∑y∈Xcy1,t⋅fy1,x0ft,y1 Then use that cy1,t≥ft,y1 (because even with the flow reversed, the flow through a path must be less than or the same as the capacity. Accordingly, we get ≥∑y∈Xfy1,x0=fx0,s=cs,x0=θ(x) And we’re done, inequality established, using definitions and the flow saturating all the paths out of s.
So, for all x∈X, we have κ∗(η)(x)≥θ(x) so we have κ∗(η)≥θ.
For the reverse direction, if there’s some κ:X→ΔX s.t. κ(y) is supported on y↓ and κ∗(η)≥θ, then for any monotone function f, we have θ(f)≤κ∗(η)(f)=η(λy.κ(y)(f)) And then we use that, since κ(y) is supported on y↓ and f is monotone, f(y) is greater than or equal to the expectation of f w.r.t. κ(y) (remember, for κ(y) you have a 100 percent chance of drawing something at-or-below y, which guarantees that f of whatever you picked is at most f(y)). And so, we get ≤η(f) And since this holds for every monotone f, we have θ⪯η. ■
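As a side note, the relation θ⪯η that both propositions start from is itself easy to check by brute force on a small poset: every monotone f:X→[0,1] is a mixture of indicators of upper sets (by the layer-cake decomposition, and the superlevel sets of a monotone function are upper sets), so θ⪯η iff θ(U)≤η(U) for every upper set U. A small sketch, reusing X, leq, theta, eta from the previous snippet:

```python
# Brute-force check of theta ⪯ eta on a small poset: it suffices to compare
# measures of upper sets, since monotone [0,1]-valued functions are mixtures
# of their indicators.
from itertools import chain, combinations

def is_upper(U, X, leq):
    return all(y in U for x in U for y in X if (x, y) in leq)

def dominated(theta, eta, X, leq):
    subsets = chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))
    return all(sum(theta[x] for x in U) <= sum(eta[x] for x in U) + 1e-9
               for U in map(set, subsets) if is_upper(U, X, leq))

assert dominated(theta, eta, X, leq)
```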
Proposition 2.4: For any Γ, Φ and Θ∈□c(Γ×Φ), Br(Θ) is downwards closed w.r.t. the induced order on Δc(elΓ×Φ). That is, if θ∈Br(Θ) and η⪯θ then η∈Br(Θ).
Remember, the condition for some θ to lie in Br(Θ) is that it be supported on elΓ×Φ, and that, for all s:Γ→Γ, and g:Γ×Φ→[0,1], θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x)) So, given that η⪯θ (lower score for all monotone functions) we’ll show that η fulfills both conditions. The support condition is taken care of by η∈Δc(elΓ×Φ). As for the other one, we have η(λyαx.χs(y)∈αg(s(y),x))≤θ(λyαx.χs(y)∈αg(s(y),x)) ≤Θ(λyx.g(y,x)) The first inequality occurs because the function λyαx.χs(y)∈αg(s(y),x) is monotone: as you make α larger, i.e., go up in the partial ordering, the indicator χs(y)∈α can only flip from 0 to 1, so the function value can only increase. So, it’s a monotone function, and we then use that η⪯θ. ■
Proposition 2.5: Consider a finite set X, ϕ?∈ΔX, ϕ!:{0,1}→ΔX, p∈[0,1] and ϕ:=(1−p)ϕ?+pϕ!. Then, p≥dTV(ϕ(0),ϕ(1)). Conversely, consider any ϕ:{0,1}→ΔX. Then, there exist ϕ?∈ΔX and ϕ!:{0,1}→ΔX s.t. ϕ=(1−p)ϕ?+pϕ! for p:=dTV(ϕ(0),ϕ(1)).
Ok, so we have an alternate characterization of the total variation distance, as a supremum over functions f:X→[0,1]: dTV(ϕ(0),ϕ(1)):=supf∈X→[0,1]|ϕ(0)(f)−ϕ(1)(f)| And then from there we can go =supf∈X→[0,1]|((1−p)ϕ?(f)+pϕ!(0)(f))−((1−p)ϕ?(f)+pϕ!(1)(f))| =supf∈X→[0,1]|pϕ!(0)(f)−pϕ!(1)(f)| =psupf∈X→[0,1]|ϕ!(0)(f)−ϕ!(1)(f)| =pdTV(ϕ!(0),ϕ!(1)) And since p≥pdTV(ϕ!(0),ϕ!(1)), we have p≥dTV(ϕ(0),ϕ(1)) and are done.
Conversely, let the value of p be 1−(ϕ(0)∧ϕ(1))(1), and ϕ? be (1/(1−p))(ϕ(0)∧ϕ(1)). The ∧ is the minimum/common overlap of the two probability distributions. It is clear that ϕ? is a probability distribution because of how p was defined (as long as p<1; if p=1, ϕ? can be any probability distribution you want). Then, let ϕ!(0)=(1/p)(ϕ(0)−(ϕ(0)∧ϕ(1))), and similar for the 1. Well, as long as p>0. If p=0, it can be any probability distribution you want. It’s a probability distribution because ϕ!(0)(1)=(1/p)(ϕ(0)(1)−(ϕ(0)∧ϕ(1))(1))=(1/p)(1−(ϕ(0)∧ϕ(1))(1)) =(1−(ϕ(0)∧ϕ(1))(1))/(1−(ϕ(0)∧ϕ(1))(1))=1 And since p>0, everything works out. Now we just need to show that these add up to make ϕ and that p is the same as the total variation distance. ϕ(0)=(ϕ(0)∧ϕ(1))+ϕ(0)−(ϕ(0)∧ϕ(1)) =(1−p)⋅(1/(1−p))(ϕ(0)∧ϕ(1))+p⋅(1/p)(ϕ(0)−(ϕ(0)∧ϕ(1)))=(1−p)ϕ?+pϕ!(0) And this works symmetrically for ϕ(1), showing that we indeed have equality. As for showing that p is the total variation distance, referring back to what we’ve already proved, we have dTV(ϕ(0),ϕ(1))=pdTV(ϕ!(0),ϕ!(1)) And now, since ϕ!(0)=(1/p)(ϕ(0)−(ϕ(0)∧ϕ(1))) and ϕ!(1)=(1/p)(ϕ(1)−(ϕ(0)∧ϕ(1))), the supports of these two probability distributions are disjoint, which implies that the total variation distance between them is 1, so we have dTV(ϕ(0),ϕ(1))=p. ■
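The decomposition in this proof is completely mechanical for finite distributions: take the pointwise minimum as the common part, and rescale the remainders. Here’s a small illustrative sketch (the arrays are made-up toy data):

```python
# Sketch of the Proposition 2.5 decomposition for finite distributions.
import numpy as np

phi0 = np.array([0.5, 0.3, 0.2])       # toy phi(0)
phi1 = np.array([0.1, 0.3, 0.6])       # toy phi(1)

overlap = np.minimum(phi0, phi1)       # phi(0) ∧ phi(1)
p = 1.0 - overlap.sum()                # = d_TV(phi(0), phi(1))
assert np.isclose(p, 0.5 * np.abs(phi0 - phi1).sum())

phi_q = overlap / (1.0 - p)            # common part (assumes p < 1)
phi_e0 = (phi0 - overlap) / p          # residual parts (assume p > 0);
phi_e1 = (phi1 - overlap) / p          # their supports are disjoint

assert np.allclose((1 - p) * phi_q + p * phi_e0, phi0)
assert np.allclose((1 - p) * phi_q + p * phi_e1, phi1)
```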
Proposition 2.6: Consider any Φ and ϕ:{0,1}→ΔΦ. Denote U:={0,1}×{{0,1}}×Φ (the event “program is unrealized”). Let Λ:=Br(⊤{0,1}⋉ϕ). Then, Λ(χU)=1−dTV(ϕ(0),ϕ(1))
This will take some work to establish. One direction, showing that the bridge transform value is at least 1−dTV(ϕ(0),ϕ(1)), is fairly easy. As for the other direction, it’ll take some work. Let’s begin. Λ(χU)=Br(⊤{0,1}⋉ϕ)(χU)=Br(⊤{0,1}⋉ϕ)(λyαx.χα={0,1}) =maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1}) We’ll make a particular contribution θ∗. It is defined as δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0}+δ0×(ϕ(0)∧ϕ(1))×δ{0,1} Or, put another way, restricting to (0,{0}), it’s ϕ(0)−ϕ(0)∧ϕ(1), and restricting to (0,{0,1}), it’s ϕ(0)∧ϕ(1). So, for a given s:{0,1}→{0,1}, we have θ∗(λyαx.χs(y)∈αg(s(y),x)) =(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0}+δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χs(y)∈αg(s(y),x)) =(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0})(λyαx.χs(y)∈αg(s(y),x)) +(δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χs(y)∈αg(s(y),x)) =(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χs(0)∈{0}g(s(0),x))+(ϕ(0)∧ϕ(1))(λx.χs(0)∈{0,1}g(s(0),x)) Now, we’ll go through two exhaustive possibilities for what s could be. If it maps 0 to 0, then it’s =(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χ0∈{0}g(0,x))+(ϕ(0)∧ϕ(1))(λx.χ0∈{0,1}g(0,x)) =(ϕ(0)−ϕ(0)∧ϕ(1))(λx.g(0,x))+(ϕ(0)∧ϕ(1))(λx.g(0,x)) =ϕ(0)(λx.g(0,x))≤maxy∈{0,1}ϕ(y)(λx.g(y,x)) =⊤{0,1}(λy.ϕ(y)(λx.g(y,x)))=(⊤{0,1}⋉ϕ)(λyx.g(y,x)) And our desired inequality is established. If s maps 0 to 1, then it’s =(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χ1∈{0}g(1,x))+(ϕ(0)∧ϕ(1))(λx.χ1∈{0,1}g(1,x)) =(ϕ(0)∧ϕ(1))(λx.g(1,x))≤ϕ(1)(λx.g(1,x)) ≤maxy∈{0,1}ϕ(y)(λx.g(y,x))=(⊤{0,1}⋉ϕ)(λyx.g(y,x)) And that is taken care of. So, our θ∗ lies in Br(⊤{0,1}⋉ϕ).
Now, where were we? Ah right, we were at Λ(χU)=maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1}) But now we can continue with ≥θ∗(λyαx.χα={0,1}) Which unpacks as =(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0}+δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χα={0,1}) =(δ0×(ϕ(0)−ϕ(0)∧ϕ(1))×δ{0})(λyαx.χα={0,1}) +(δ0×(ϕ(0)∧ϕ(1))×δ{0,1})(λyαx.χα={0,1}) =(ϕ(0)−ϕ(0)∧ϕ(1))(λx.χ{0}={0,1})+(ϕ(0)∧ϕ(1))(λx.χ{0,1}={0,1}) =(ϕ(0)∧ϕ(1))(1) And then, the total mass of the overlap of two distributions is 1−dTV(ϕ(0),ϕ(1)). So, we’ve shown one direction, we have Λ(χU)≥1−dTV(ϕ(0),ϕ(1)) In the other direction of the equality we’re trying to prove, we need to constrain what θ might be. Starting off with what we definitely know, we have Λ(χU)=maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1}) Now it’s time to show that for any θ∈Br(⊤{0,1}⋉ϕ), the measure on α={0,1} can’t be too high. Specifically, to proceed further, we’ll need to show that ∀x′∈Φ:θ(λyαx.χx=x′∧α={0,1})≤(ϕ(0)∧ϕ(1))(λx.χx=x′) Time to establish it. On x′, either ϕ(0)(x′)≥ϕ(1)(x′), or vice-versa. Without loss of generality, assume that it’s ϕ(0)(x′) that’s lower.
Then, let s be the constant-0 function, and g(0,x′) be 1, and 0 everywhere else. Then, we have θ(λyαx.χx=x′∧α={0,1})≤θ(λyαx.χs(y)∈αg(s(y),x)) ≤(⊤{0,1}⋉ϕ)(λyx.g(y,x))=(⊤{0,1}⋉ϕ)(λyx.χy=0∧x=x′) =maxy∈{0,1}ϕ(y)(λx.χy=0∧x=x′)=ϕ(0)(λx.χx=x′)=ϕ(0)(x′) =(ϕ(0)∧ϕ(1))(x′) Just abbreviating, using that θ lies in the bridge transform, deabbreviating with what g is, unpacking things a little bit and canceling out. At the end we used that ϕ(0)(x′)≤ϕ(1)(x′).
Ok, so, now that we’ve established the key fact that ∀x′∈Φ:θ(λyαx.χx=x′∧α={0,1})≤(ϕ(0)∧ϕ(1))(λx.χx=x′) We can resume our work. We were previously at =maxθ∈Br(⊤{0,1}⋉ϕ)θ(λyαx.χα={0,1}) and we can proceed to =maxθ∈Br(⊤{0,1}⋉ϕ)∑x′∈Φθ(λyαx.χx=x′∧α={0,1}) and from there, using the inequality we proved, proceed to ≤∑x′∈Φ(ϕ(0)∧ϕ(1))(λx.χx=x′)=(ϕ(0)∧ϕ(1))(1)=1−dTV(ϕ(0),ϕ(1)) And we’re done, we established that it’s an upper bound as well as a lower bound, so we have equality. ■
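Since the endofunction condition quantifies over all g, it can’t be verified exhaustively, but the construction of θ∗ above is easy to spot-check numerically against random g. A sketch under those caveats (toy ϕ, random sampling rather than a proof):

```python
# Random spot-check of the theta* construction from Proposition 2.6.
import numpy as np

rng = np.random.default_rng(0)
phi = {0: np.array([0.5, 0.3, 0.2]), 1: np.array([0.1, 0.3, 0.6])}  # toy data
overlap = np.minimum(phi[0], phi[1])

# theta*: a measure over Phi attached to each (y, alpha) pair.
theta_star = {(0, frozenset({0})): phi[0] - overlap,
              (0, frozenset({0, 1})): overlap}

# Its mass on alpha = {0,1} is exactly 1 - d_TV(phi(0), phi(1)).
assert np.isclose(theta_star[(0, frozenset({0, 1}))].sum(),
                  1.0 - 0.5 * np.abs(phi[0] - phi[1]).sum())

# Endofunction condition against Theta = ⊤_{0,1} ⋉ phi, for random g.
for s in [lambda y: 0, lambda y: 1, lambda y: y, lambda y: 1 - y]:
    for _ in range(100):
        g = rng.uniform(size=(2, 3))                 # g: {0,1} x Phi -> [0,1]
        lhs = sum(mu @ g[s(y)] for (y, alpha), mu in theta_star.items()
                  if s(y) in alpha)
        rhs = max(phi[y] @ g[y] for y in (0, 1))     # (⊤ ⋉ phi)(g)
        assert lhs <= rhs + 1e-9
```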
Lemma 2: If there’s a function h:Φ→2Γ and Θ(λyx.χy∉h(x))=0, then for all θ∈Br(Θ), θ is supported on the set {α,x|α⊆h(x)}.
Assume that the conclusion is false, that there is some nonzero probability of drawing an (x∗,α∗) pair where α∗⊈h(x∗). In particular, there must be some special y∗∈Γ value that witnesses that α∗ isn’t a subset, via y∗∈α∗ and y∗∉h(x∗). Remember that for all s:Γ→Γ and g:Γ×Φ→[0,1], since θ∈Br(Θ), we have θ(λyαx.χs(y)∈αg(s(y),x))≤Θ(λyx.g(y,x)) Now, in particular, let g:=χy∉h(x), and s be the constant function that maps everything to y∗. Then, this turns into 0<θ(λyαx.χy∗∈αχy∗∉h(x))≤Θ(λyx.χy∉h(x))=0 which is impossible. The equality at the end was our starting assumption. The middle inequality was just specializing our inequality to a particular pair of functions. And it’s greater than 0 because there’s a nonzero probability of drawing x∗,α∗. And y∗ was selected to lie in α∗ and outside of h(x∗), so there’s a nonzero probability of getting a nonzero value. Therefore, the result follows. ■
Lemma 3: For any s:Γ→Γ, and Θ∈□c(Γ×Φ), if θ∈Br(Θ), then χelΓ(s∗(θ))∈Br(Θ).
So, the support condition is trivially fulfilled, because we’re restricting to the event y∈α. That just leaves the endofunction condition. Let s′:Γ→Γ be arbitrary, and g:Γ×Φ→[0,1] be arbitrary. Then we have χelΓ(s∗(θ))(λyαx.χs′(y)∈αg(s′(y),x))=s∗(θ)(λyαx.χy∈α∧s′(y)∈αg(s′(y),x)) ≤s∗(θ)(λyαx.χs′(y)∈αg(s′(y),x))=θ(λyαx.χs′(s(y))∈αg(s′(s(y)),x)) And, since s′∘s:Γ→Γ, we can use that θ∈Br(Θ) to get ≤Θ(λyx.g(y,x)) And we’re done, we’ve established that our modified version of θ remains in Br(Θ). ■
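The operation χelΓ(s∗(θ)) used in this lemma (and repeatedly below) is very concrete for finite spaces: relabel the Γ coordinate along s, then delete the mass that escapes the event y∈α. A minimal sketch with a dict-based representation of contributions (all names illustrative):

```python
# Sketch of the Lemma 3 operation on a finite contribution: pushforward along
# s on the Gamma coordinate, then restriction to the event y ∈ alpha.
def push_restrict(theta, s):
    out = {}
    for (y, alpha, x), mass in theta.items():
        key = (s(y), alpha, x)                       # s_* : y -> s(y)
        out[key] = out.get(key, 0.0) + mass
    return {k: m for k, m in out.items() if k[0] in k[1]}   # chi_{elGamma}

theta = {(0, frozenset({0}), "x"): 0.3, (1, frozenset({1}), "x"): 0.7}
print(push_restrict(theta, lambda y: 0))
# {(0, frozenset({0}), 'x'): 0.3} -- the mass pushed to (0, {1}) is deleted,
# since 0 ∉ {1}.
```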
Proposition 2.7: If X is a poset and Θ∈□cX, then Θ↓ will denote the downward closure of Θ. For any Γ, Φ and Θ∈□c(Γ×Φ), if (y,α,x)∈supp Br(Θ) and y′∈α, then (y′,x)∈supp Θ. Moreover, define susΘ:Φ→2Γ by susΘ(x):={y∈Γ∣(y,x)∈supp Θ}. Then, Br(Θ)⊆(Θ⋉susΘ)↓ (we slightly abuse notation by treating susΘ as a mapping Γ×Φ→2Γ that doesn’t depend on the first argument, and also playing loose with the order of factors in the set on which our HUCs live).
This proof splits into two parts. Part 1 will be proving the support statement. Part 2 will be proving the thing with susΘ.
Part 1: Here’s the basics of how this part will work. There’s some (y∗,α∗,x∗) tuple in the support of Br(Θ). Since the support of Br(Θ) is the union of the supports for the θ∈Br(Θ), there’s some θ that assigns nonzero probability to that event. Also, y′∈α∗. The key part of the proof is showing there’s some θ∗∈Br(Θ) that assigns nonzero measure to the event (y′,x∗,α∗). Once we’ve got that, since we know by Proposition 2.1 that Br(Θ) projects down to equal Θ, that means that the projection of θ∗ will land in Θ, and it will assign nonzero measure to the event (y′,x∗), so the event (y′,x∗) lies in the support of Θ.
So, that just leaves appropriately defining our θ∗ and showing that it lies in Br(Θ) and assigns nonzero probability to our event of interest. Our particular choice of θ∗ will be as follows. Let cy′:Γ→Γ be the constant function mapping everything to y′. θ∗:=χelΓ((cy′)∗(θ)) Applying Lemma 3, this is in Br(Θ), so that just leaves showing that it assigns nonzero probability to the event y′,x∗,α∗. We have χelΓ((cy′)∗(θ))(λyαx.χy=y′∧x=x∗∧α=α∗) =(cy′)∗(θ)(λyαx.χy=y′∧x=x∗∧α=α∗∧y∈α) We can simplify this a bit, because if the first three properties hold, that tells you something about what α is. =(cy′)∗(θ)(λyαx.χy=y′∧x=x∗∧α=α∗∧y∈α∗) unpack the pushforward =θ(λyαx.χcy′(y)=y′∧x=x∗∧α=α∗∧cy′(y)∈α∗) Then we use that it’s a constant function =θ(λyαx.χy′=y′∧x=x∗∧α=α∗∧y′∈α∗) Now, since y′∈α∗ by the problem setup, and y′=y′ is a tautology, we can remove those two events. =θ(λyαx.χx=x∗∧α=α∗) and use an inequality ≥θ(λyαx.χy=y∗∧x=x∗∧α=α∗)>0 Because θ assigned nonzero probability to that event. So, we know that our θ∗ assigns nonzero measure to the event of interest. And that’s it! It fulfills the appropriate properties to carry the proof through.
Proof part 2: First off, we can reexpress Br(Θ)⊆(Θ⋉susΘ)↓ as the equivalent statement that, for all f:elΓ×Φ→[0,1], we have Br(Θ)(λyαx.f(y,x,α))≤(Θ⋉susΘ)↓(λyαx.f(y,x,α)) Although, since Br(Θ) and (Θ⋉susΘ)↓ are downwards-closed, we actually only need to demonstrate this inequality for monotone functions f.
The reason we only need to demonstrate this inequality for monotone f is because of Br(Θ)(f)=Br(Θ)(fmax)≤(Θ⋉susΘ)↓(fmax)=(Θ⋉susΘ)↓(f) Where the equalities follow from Proposition 3.1 and the downward closure of the two ultracontributions, and the inequality is what we’re trying to prove (since fmax in the sense of Proposition 3.1 is always a monotone function).
So, let f be monotone, and we’re trying to prove the desired inequality. We’ll unpack it bit by bit. Br(Θ)(λyαx.f(y,x,α))=maxθ∈Br(Θ)θ(λyαx.f(y,x,α)) And now we remember that the set {x,y|y∈susΘ(x)} is a support for Θ, because it’s the same as {x,y|(y,x)∈supp Θ}, i.e., the support of Θ. So, by Lemma 2, we can conclude that any θ∈Br(Θ) is supported on α,x pairs s.t. α⊆susΘ(x). In particular, this means that α≤susΘ(x), and since f is monotone, swapping out α for susΘ(x) can only increase the expected value, so we get ≤maxθ∈Br(Θ)θ(λyαx.f(y,x,susΘ(x))) and then, since all θ∈Br(Θ) are supported on y,α s.t. y∈α, we can go =maxθ∈Br(Θ)θ(λyαx.χy∈αf(y,x,susΘ(x))) And then, since all the θ∈Br(Θ) fulfill the endofunction property, we can let s be the identity and g be f, and go ≤Θ(λyx.f(y,x,susΘ(x))) And rewrite that as =Θ(λyx.δsusΘ(x)(λα.f(y,x,α))) =(Θ⋉susΘ)(λyαx.f(y,x,α)) and then since f is monotone =(Θ⋉susΘ)↓(λyαx.f(y,x,α)) This holds because all contributions added when you take the downward closure can only produce lower expectation values than the existing contributions due to monotonicity of f, and so they’re ignored.
And now we’re done! We got the inequality going appropriately to hit our proof target. ■
Proposition 2.8: For any Γ, Φ and Θ∈□c(Γ×Φ) Br(prΓ×2ΓBr(Θ))=[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓
For notation, we’ll use β for the set which is being treated as part of the background environment, and α as the usual set.
The way to establish this is to show equal expectation values for all functions monotone in the last argument (the relevant set-based one), which is all we need to do as both sets are downwards-closed. We’ll do this by establishing the two inequalities separately. Our first order of business is showing that for all f:elΓ×2Γ→[0,1] monotone in the last argument, we have Br(prΓ×2ΓBr(Θ))(λyαβ.f(y,β,α))≥[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓(λyαβ.f(y,β,α)) Let’s begin. Pick an f that’s monotone in the last argument. [(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓(λyαβ.f(y,β,α)) =[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)](λyαβ.f(y,β,α)) (dropping the ↓ doesn’t change the value, since f is monotone in the last argument) =prΓ×2ΓBr(Θ)(λyβ.f(y,β,β))=Br(Θ)(λyβx.f(y,β,β)) =maxθ′∈Br(Θ)θ′(λyβx.f(y,β,β))=maxθ′∈Br(Θ)θ′(λyβx.δβ(λα.f(y,β,α))) =maxθ′∈Br(Θ)prΓ×2Γ(θ′)(λyβ.δβ(λα.f(y,β,α))) =maxθ′∈Br(Θ)(prΓ×2Γ(θ′)⋉id2Γ)(λyβα.f(y,β,α)) We’ll set this to the side for a moment to show that if θ′∈Br(Θ), then prΓ×2Γ(θ′)⋉id2Γ∈Br(prΓ×2ΓBr(Θ)).
The support part is easy. Since θ′∈Br(Θ), it’s supported on (y,β) pairs where y∈β. Taking semidirect product with identity means that y∈α always happens, because β=α always happens. So, that leaves showing the endofunction condition. We have (prΓ×2Γ(θ′)⋉id2Γ)(λyαβ.χs(y)∈αg(s(y),β)) =prΓ×2Γ(θ′)(λyβ.δβ(λα.χs(y)∈αg(s(y),β))) =prΓ×2Γ(θ′)(λyβ.χs(y)∈βg(s(y),β))=θ′(λyβx.χs(y)∈βg(s(y),β)) And start packing things up a bit =s∗(θ′)(λyβx.χy∈βg(y,β)) =χelΓ(s∗(θ′))(λyβx.g(y,β)) And, since θ′∈Br(Θ), this update of the pushforward of θ′ lands in Br(Θ) by Lemma 3, so we get ≤maxθ′′∈Br(Θ)θ′′(λyβx.g(y,β)) =Br(Θ)(λyβx.g(y,β)) =prΓ×2ΓBr(Θ)(λyβ.g(y,β)) Now, since we’ve established that inequality for all choices of s,g, we have that prΓ×2Γ(θ′)⋉id2Γ∈Br(prΓ×2ΓBr(Θ)) Resuming where we last left off, the last place we were at in our chain of inequalities was =maxθ′∈Br(Θ)(prΓ×2Γ(θ′)⋉id2Γ)(λyαβ.f(y,β,α)) Since these things are always in Br(prΓ×2ΓBr(Θ)), we can go ≤maxθ′′∈Br(prΓ×2ΓBr(Θ))θ′′(λyαβ.f(y,β,α)) =Br(prΓ×2ΓBr(Θ))(λyαβ.f(y,β,α)) And we’re done with that inequality direction.
Now to show the reverse direction, which actually will use that f is monotone in the last argument. We start with Br(prΓ×2ΓBr(Θ))(λyαβ.f(y,β,α)) =maxθ′′∈Br(prΓ×2ΓBr(Θ))θ′′(λyαβ.f(y,β,α)) And now, we can use that prΓ×2ΓBr(Θ) is supported on (y,β) pairs where y∈id(β), along with Lemma 2 applied to id, to conclude that all the θ′′ must be supported on the event α⊆id(β)=β. Since bigger sets are higher in the partial order, and f is monotone in the last argument, swapping α out for β can only increase the value, so we get ≤maxθ′′∈Br(prΓ×2ΓBr(Θ))θ′′(λyαβ.f(y,β,β)) Now, we can use the endofunction property of all the θ′′ w.r.t. prΓ×2ΓBr(Θ) to get a uniform upper bound of ≤prΓ×2ΓBr(Θ)(λyβ.f(y,β,β)) and then go =[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)](λyαβ.f(y,β,α)) =[(idΓ×diag2Γ)∗prΓ×2ΓBr(Θ)]↓(λyαβ.f(y,β,α)) and we’re done, we got the inequality going in the other direction, so we have equality for arbitrary monotone functions, and thus equality. ■
Proposition 2.9: For any Γ, Φ and Θ1,Θ2∈□c(Γ×Φ), if Θ1⊆Θ2 then Br(Θ1)⊆Br(Θ2).
Proof: This is really easy to show, we just need to take some contribution θ′∈Br(Θ1) and show that it lies in Br(Θ2). The support condition is easily fulfilled, so that leaves showing the endofunction condition. Let’s begin. θ′(λyαx.χs(y)∈αg(s(y),x))≤Θ1(λyx.g(y,x))=maxθ∈Θ1θ(λyx.g(y,x)) And now, since Θ1⊆Θ2, we have ≤maxθ∈Θ2θ(λyx.g(y,x))=Θ2(λyx.g(y,x)) Done. ■
Proposition 2.10: For any Γ, Φ1, Φ2, t:Φ2→Φ1 and Θ∈□c(Γ×Φ2) (idelΓ×t)∗Br(Θ)⊆Br((idΓ×t)∗Θ)
As usual, we just need to establish that for all f:elΓ×Φ1→[0,1], the expectation value is lower for the first ultracontribution than the second, so our proof goal will be (idelΓ×t)∗Br(Θ)(λyαx1.f(y,x1,α))≤Br((idΓ×t)∗Θ)(λyαx1.f(y,x1,α)) Let’s begin trying to show this. (idelΓ×t)∗Br(Θ)(λyαx1.f(y,x1,α)) =Br(Θ)(λyαx2.f(y,t(x2),α)) =maxθ′∈Br(Θ)θ′(λyαx2.f(y,t(x2),α)) =maxθ′∈Br(Θ)(idelΓ×t)∗(θ′)(λyαx1.f(y,x1,α)) Now we will show that all the contributions of this form lie in Br((idΓ×t)∗Θ). Clearly the y∈α condition is always fulfilled for these pushforwards, so that just leaves the endofunction condition. Let’s begin. (idelΓ×t)∗(θ′)(λyαx1.χs(y)∈αg(s(y),x1))=θ′(λyαx2.χs(y)∈αg(s(y),t(x2))) ≤Θ(λyx2.g(y,t(x2)))=(idΓ×t)∗(Θ)(λyx1.g(y,x1)) And we’re done, we established our desired result that the pushforward lands in the appropriate set. So, we can proceed by going ≤maxθ′′∈Br((idΓ×t)∗(Θ))θ′′(λyαx1.f(y,x1,α)) =Br((idΓ×t)∗(Θ))(λyαx1.f(y,x1,α)) And we’re done! ■
Proposition 2.11: Consider any Γ, Φ1, Φ2, t:Φ2→Φ1, Ξ:Φ1→□Φ2 and Θ∈□c(Γ×Φ1) s.t. t∘Ξ=idΦ1. Then, (idelΓ×Ξ)∗Br(Θ)⊆Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ) In particular, prelΓBr(Θ)=prelΓBr((idΓ×Ξ)∗Θ)
Well, let’s get started on proving these various things. To begin with, to prove (idelΓ×Ξ)∗Br(Θ)⊆Br((idΓ×Ξ)∗Θ) We need to prove that, for all f:elΓ×Φ2→[0,1], (idelΓ×Ξ)∗Br(Θ)(λyαx2.f(y,x2,α))≤Br((idΓ×Ξ)∗Θ)(λyαx2.f(y,x2,α)) Let’s establish this. Unpacking the left-hand side, we have (idelΓ×Ξ)∗Br(Θ)(λyαx2.f(y,x2,α)) =Br(Θ)(λyαx1.Ξ(x1)(λx2.f(y,x2,α))) =maxθ′∈Br(Θ)θ′(λyαx1.Ξ(x1)(λx2.f(y,x2,α))) =maxθ′∈Br(Θ)(idelΓ×Ξ)∗(θ′)(λyαx2.f(y,x2,α)) =maxθ′∈Br(Θ)maxθ′′∈(idelΓ×Ξ)∗(θ′)θ′′(λyαx2.f(y,x2,α)) Now we’ll show that any such θ′′ lies in Br((idΓ×Ξ)∗Θ). The support condition is trivial, so that leaves the endofunction condition. θ′′(λyαx2.χs(y)∈αg(s(y),x2))≤(idelΓ×Ξ)∗(θ′)(λyαx2.χs(y)∈αg(s(y),x2)) =θ′(λyαx1.Ξ(x1)(λx2.χs(y)∈αg(s(y),x2))) Then, by homogeneity of Ξ, we can pull a constant out, the indicator function, to get =θ′(λyαx1.χs(y)∈αΞ(x1)(λx2.g(s(y),x2))) Then, by the endofunction condition, we get ≤Θ(λyx1.Ξ(x1)(λx2.g(y,x2))) =(idΓ×Ξ)∗(Θ)(λyx2.g(y,x2)) And bam, we’ve shown that θ′′ lies in the appropriate set. We were last at =maxθ′∈Br(Θ)maxθ′′∈(idelΓ×Ξ)∗(θ′)θ′′(λyαx2.f(y,x2,α)) So we can impose an upper bound of ≤maxθ′′∈Br((idΓ×Ξ)∗Θ)θ′′(λyαx2.f(y,x2,α)) =Br((idΓ×Ξ)∗(Θ))(λyαx2.f(y,x2,α)) And we’re done, we’ve established our desired inequality. Now for proving another inequality. Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ) This can be proved by applying Proposition 2.10. Start out with Proposition 2.10. (idelΓ×t)∗Br(Θ′)⊆Br((idΓ×t)∗Θ′) Specialize Θ′ to (idΓ×Ξ)∗(Θ), yielding (idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆Br((idΓ×t)∗(idΓ×Ξ)∗Θ) The two pushforwards can be folded into one. (idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆Br((idΓ×t∘Ξ)∗Θ) Now use that t∘Ξ is the identity (idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆Br(Θ) Apply pullback along t to both sides (idelΓ×t)∗(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ)
On the other hand, we also have
Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗(idelΓ×t)∗Br((idΓ×Ξ)∗Θ)
Combining the last two lines, we get the desired result:
Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ) That just leaves showing the equality result, now that we’ve got both subset inclusions.
We start out with (idelΓ×Ξ)∗Br(Θ)⊆Br((idΓ×Ξ)∗Θ)⊆(idelΓ×t)∗Br(Θ) Applying projection yields prelΓ((idelΓ×Ξ)∗Br(Θ))⊆prelΓ(Br((idΓ×Ξ)∗Θ))⊆prelΓ((idelΓ×t)∗Br(Θ)) For any given f, we have prelΓ((idelΓ×Ξ)∗Br(Θ))(λyα.f(y,α)) ≤prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α)) ≤prelΓ((idelΓ×t)∗Br(Θ))(λyα.f(y,α)) Unpacking the projection on the two ends, we get (idelΓ×Ξ)∗Br(Θ)(λyαx2.f(y,α)) ≤prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α)) ≤(idelΓ×t)∗Br(Θ)(λyαx2.f(y,α)) And then, for the smallest and largest quantities, the pullback or pushforward only affects the x1 or x2 coordinates, everything else remains the same. In particular, f doesn’t depend on such coordinates. So, we get Br(Θ)(λyαx1.f(y,α)) ≤prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α)) ≤Br(Θ)(λyαx1.f(y,α)) The left and right hand sides are equal, so we have prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))=Br(Θ)(λyαx1.f(y,α)) Then, packing the projection back up, we have prelΓ(Br((idΓ×Ξ)∗Θ))(λyα.f(y,α))=prelΓBr(Θ)(λyα.f(y,α)) And since there’s equality for all functions, the two ultradistributions are equal, so we have prelΓ(Br((idΓ×Ξ)∗Θ))=prelΓBr(Θ) And we’re done. ■
Proposition 2.12: For any Γ, Φ, Θ1,Θ2∈□c(Γ×Φ) and p∈[0,1] pBr(Θ1)+(1−p)Br(Θ2)⊆Br(pΘ1+(1−p)Θ2)
We’ll do this by showing that all contributions of the form pθ+(1−p)θ′ with θ∈Br(Θ1) and θ′∈Br(Θ2) lie within Br(pΘ1+(1−p)Θ2). The support property is trivial, so that leaves the endofunction property. (pθ+(1−p)θ′)(λyαx.χs(y)∈αg(s(y),x)) =pθ(λyαx.χs(y)∈αg(s(y),x))+(1−p)θ′(λyαx.χs(y)∈αg(s(y),x)) ≤pΘ1(λyx.g(y,x))+(1−p)Θ2(λyx.g(y,x)) =(pΘ1+(1−p)Θ2)(λyx.g(y,x)) Done, the mix lies in Br(pΘ1+(1−p)Θ2). ■
Proposition 2.13: Consider some Γ, Φ1, Φ2, Θ1∈□c(Γ×Φ1), Θ2∈□c(Γ×Φ2) and p∈[0,1]. Regard Φ1,2 as subsets of Φ1⊔Φ2, so that pΘ1+(1−p)Θ2∈□c(Γ×(Φ1⊔Φ2)). Then, pBr(Θ1)+(1−p)Br(Θ2)=Br(pΘ1+(1−p)Θ2)
We already established one subset direction in Proposition 2.12. We just need the other direction, to take any θ′′∈Br(pΘ1+(1−p)Θ2), and write it as pθ+(1−p)θ′ where θ∈Br(Θ1) and θ′∈Br(Θ2).
We trivially have equality when p=0 or 1, so that leaves the cases where p∈(0,1). With that, our attempted θ will be χΦ1θ′′/p, and our attempted θ′ will be χΦ2θ′′/(1−p).
Now, clearly, these mix to make θ′′, because pθ+(1−p)θ′=p⋅χΦ1θ′′/p+(1−p)⋅χΦ2θ′′/(1−p)=χΦ1θ′′+χΦ2θ′′=θ′′ Remember, Φ1 and Φ2 are the two components of a space made by disjoint union.
They clearly both fulfill the support condition, because they’re made from θ′′ which fulfills the support condition. This leaves the endofunction condition, which is somewhat nontrivial to show, and will require some setup. Clearly, without loss of generality, we just need to show it for θ, and θ′ follows by identical arguments. For some g:Γ×(Φ1⊔Φ2)→[0,1], g′ will denote the function that mimics g on Γ×Φ1, and is 0 on Γ×Φ2.
Also, note that θ, as defined, is supported entirely on Γ×Φ1, and θ′, as defined, is supported entirely on Γ×Φ2. Let’s get started on showing our endofunction condition. Let g and s be arbitrary. θ(λyαx.χs(y)∈αg(s(y),x)) =θ(λyαx.χs(y)∈αg(s(y),x))+((1−p)/p)θ′(λyαx.0) This is because the expectation of an all-0 function is 0. Then, =θ(λyαx.χs(y)∈αg′(s(y),x))+((1−p)/p)θ′(λyαx.χs(y)∈αg′(s(y),x)) Why does this happen? Well, θ is supported entirely on Φ1, and g′ mimics g perfectly on Φ1. And θ′ is supported entirely on Φ2, and g′ mimics 0 perfectly on Φ2. So it doesn’t matter that we changed the functions outside of the contribution support. =(θ+((1−p)/p)θ′)(λyαx.χs(y)∈αg′(s(y),x)) =(1/p)(pθ+(1−p)θ′)(λyαx.χs(y)∈αg′(s(y),x)) =(1/p)θ′′(λyαx.χs(y)∈αg′(s(y),x)) Now we use the endofunction condition on θ′′ to get ≤(1/p)(pΘ1+(1−p)Θ2)(λyx.g′(y,x)) =(Θ1+((1−p)/p)Θ2)(λyx.g′(y,x)) =Θ1(λyx.g′(y,x))+((1−p)/p)Θ2(λyx.g′(y,x)) And then we use that Θ1 is supported on Γ×Φ1 and Θ2 is supported on Γ×Φ2, and on Φ1, g′=g, and on Φ2, g′=0, to get =Θ1(λyx.g(y,x))+((1−p)/p)Θ2(λyx.0) =Θ1(λyx.g(y,x)) And we’re done! The endofunction condition goes through, which is the last piece of the proof we needed. ■
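The decomposition used in this proof is, again, mechanical for finite contributions: restrict θ′′ to each component of the disjoint union and rescale. A small sketch of the split-and-remix identity, with made-up tags marking which component a point of Φ1⊔Φ2 belongs to:

```python
# Sketch of the Proposition 2.13 split: restrict to each component of the
# disjoint union, rescale, and check that mixing recovers the original.
def split(theta, in_phi1, p):
    th1 = {k: m / p for k, m in theta.items() if in_phi1(k[2])}
    th2 = {k: m / (1 - p) for k, m in theta.items() if not in_phi1(k[2])}
    return th1, th2

p = 0.25
theta2 = {(0, frozenset({0}), ("L", "a")): 0.2,      # "L"/"R" tag Phi1/Phi2
          (0, frozenset({0}), ("R", "b")): 0.8}
th1, th2 = split(theta2, lambda x: x[0] == "L", p)
mixed = {k: p * m for k, m in th1.items()}
for k, m in th2.items():
    mixed[k] = mixed.get(k, 0.0) + (1 - p) * m
assert all(abs(mixed[k] - m) < 1e-12 for k, m in theta2.items())
```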
Proposition 2.14: Consider some Γ, Φ, Θ∈□c(Γ×Φ) and F⊆Γ. Let ∩F:2Γ→2Γ be defined by ∩F(α):=F∩α. Then, BrΓ(Θ∩⊤F×Φ)=(idΓ×∩F×idΦ)∗(BrΓ(Θ)∩⊤F×2Γ×Φ) Moreover, let ι:Γ′→Γ be an injection and elι:elΓ′→elΓ be the injection induced by ι. Then, BrΓ′((ι×idΦ)∗(Θ∩⊤im(ι)×Φ))=(elι×idΦ)∗BrΓ(Θ∩⊤im(ι)×Φ)
We’ll have to prove both directions separately. Notice that our proofs so far involve transforming one side of the equation to a certain form; then there’s a critical step in the middle that’s tricky to show, and then we just transform back down. So our first step will be unpacking the two sides of the equation up to the critical inequality.
Also, intersecting with the top distribution corresponding to a particular set is the same as the raw-update on that set; we’ll use that. We’ll suppress identity functions and unused coordinates for ⊤ in the notation, so if some coordinate isn’t mentioned, assume it’s either the identity function for pushforwards, or that the set for ⊤ is the specified set times the entire space for unmentioned coordinates. With that notational note, we have BrΓ(Θ∩⊤F×Φ)(λyαx.f(y,x,α)) Suppressing notation =Br(Θ∩⊤F)(λyαx.f(y,x,α)) And rewriting the bridge transform, =maxθ∈Br(Θ∩⊤F)θ(λyαx.f(y,x,α)) This is our unpacking in one direction.
In the other direction, we have (idΓ×∩F×idΦ)∗(BrΓ(Θ)∩⊤F×2Γ×Φ)(λyαx.f(y,x,α)) Suppress some notation, =(∩F)∗(Br(Θ)∩⊤F)(λyαx.f(y,x,α)) Unpack the pushforward and intersection =(Br(Θ)∩⊤F)(λyαx.f(y,x,α∩F)) =Br(Θ)(λyαx.χy∈Ff(y,x,α∩F)) And unpack the bridge transform =maxθ′∈Br(Θ)θ′(λyαx.χy∈Ff(y,x,α∩F)) Pack back up =maxθ′∈Br(Θ)(θ′∩⊤F)(λyαx.f(y,x,α∩F)) =maxθ′∈BrΓ(Θ)(∩F)∗(θ′∩⊤F)(λyαx.f(y,x,α)) So, our goal to show equality overall is to establish the equality maxθ∈Br(Θ∩⊤F)θ(λyαx.f(y,x,α)) =maxθ′∈Br(Θ)(∩F)∗(θ′∩⊤F)(λyαx.f(y,x,α)) This can be done by establishing the following two things. First, is, if θ′∈Br(Θ), then (∩F)∗(θ′∩⊤F)∈Br(Θ∩⊤F). That establishes the ≥ inequality direction.
For the reverse direction, we’ll need to show that for any θ∈Br(Θ∩⊤F), θ also lies in Br(Θ), and that (∩F)∗(θ∩⊤F)=θ. This establishes the ≤ inequality direction, since any choice of θ for the left hand side can be duplicated on the right hand side.
Let’s switch our proof target to showing these two things. First off, assume θ′∈Br(Θ). The goal is to show that (∩F)∗(θ′∩⊤F)∈Br(Θ∩⊤F)
For once, the support condition is not entirely trivial. However, notice that θ′∩⊤F always has y∈F holding (because we updated) and y∈α always holds (because θ′ fulfills the support condition due to being in Br(Θ)). So, it’s guaranteed that y∈α∩F. Then, applying the pushforward that turns α into α∩F doesn’t change that there’s a guarantee that y lies in the set it’s paired with.
Now for the endofunction condition. (∩F)∗(θ′∩⊤F)(λyαx.χs(y)∈αg(s(y),x)) =(θ′∩⊤F)(λyαx.χs(y)∈α∩Fg(s(y),x)) =θ′(λyαx.χy∈Fχs(y)∈α∩Fg(s(y),x)) ≤θ′(λyαx.χs(y)∈α∩Fg(s(y),x))=θ′(λyαx.χs(y)∈αχs(y)∈Fg(s(y),x)) ≤Θ(λyx.χy∈Fg(y,x))=(Θ∩⊤F)(λyx.g(y,x)) And we’re done, that half of the proof works out.
Now for the reverse direction, establishing that for any θ∈Br(Θ∩⊤F), θ also lies in Br(Θ), and that (∩F)∗(θ∩⊤F)=θ.
The first part of this is easy. θ(λyαx.χs(y)∈αg(s(y),x))≤(Θ∩⊤F)(λyx.g(y,x)) =Θ(λyx.χy∈Fg(y,x))≤Θ(λyx.g(y,x)) And we’re done. For the second part, it’s a bit trickier. In short, what happens is that a piece of measure from θ is deleted if y∉F, and then α gets mapped to α∩F. So, if we knew that θ was supported on F×2F, we’d know that neither of the two transformations do anything whatsoever, and so you just get θ out again.
So, let’s show that θ∈Br(Θ∩⊤F) is supported on F×2F. For this, the way we do it is to use Proposition 2.7 to get an upper bound on the bridge transform, and show that said upper bound is also supported on F×2F. By Proposition 2.7, Br(Θ∩⊤F)⊆((Θ∩⊤F)⋉susΘ∩⊤F)↓ The downwards arrow means that we’re passing from larger sets to smaller sets, so it doesn’t matter if we remove it; that won’t affect whether the support is a subset of F×2F. Clearly, Θ∩⊤F is supported on F. And, for sus, we have susΘ∩⊤F(x)={y|(y,x)∈supp (Θ∩⊤F)} So clearly it can only produce sets of y which are subsets of F, since F is an upper bound on the support of Θ∩⊤F. So we have support on F×2F, and we’re done with this half of the theorem.
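For concreteness, the map (∩F)∗(−∩⊤F) appearing throughout this half of the proof acts on finite contributions as: keep the mass with y∈F, then intersect the set coordinate with F. A sketch in the same dict representation as before:

```python
# Sketch of the map used in Proposition 2.14: update on the event y ∈ F,
# then push the set coordinate forward along alpha -> alpha ∩ F.
def update_and_intersect(theta, F):
    out = {}
    for (y, alpha, x), mass in theta.items():
        if y in F:                                   # the update -- ∩ ⊤_F
            key = (y, frozenset(alpha & F), x)       # (∩F)_* on the set coord
            out[key] = out.get(key, 0.0) + mass
    return out

theta3 = {(0, frozenset({0, 1}), "x"): 0.5, (1, frozenset({1}), "x"): 0.5}
print(update_and_intersect(theta3, F={0}))
# {(0, frozenset({0}), 'x'): 0.5} -- the y=1 piece is deleted by the update.
```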
So now we move onto the second half of the theorem with the injections. Without loss of generality, we can reduce this problem to a slightly simpler form, where we assume that Θ is supported over im(ι). This is because any Θ supported over im(ι) has Θ∩⊤im(ι)=Θ, and also for any Θ′, Θ′∩⊤im(ι) is always supported over im(ι).
Accordingly, assume that Θ is supported over im(ι). Suppressing identity functions, and writing F as shorthand for Γ′, our goal is to prove that BrF((ι)∗Θ)=(elι)∗BrΓ(Θ) This is surprisingly aggravating to prove. Again, we’ll try to rewrite things until we get to a point where there’s just one equality to show, and then put in the work of showing the endofunction condition in both directions. Time to start rewriting. Subscripts F will be used to denote when a particular y or α is drawn from F or 2F, rather than from Γ or 2Γ. BrF((ι)∗Θ)(λyFαFx.f(yF,x,αF)) =maxθF∈BrF((ι)∗Θ)θF(λyFαFx.f(yF,x,αF)) Rewriting the other direction, we have (elι)∗BrΓ(Θ)(λyFαFx.f(yF,x,αF)) Now, the pullback is defined like this, with the convention that a maximum over the empty set is 0. =BrΓ(Θ)(λyαx.maxyF,αF∈(elι)−1(y,α)f(yF,x,αF)) =maxθ∈BrΓ(Θ)θ(λyαx.maxyF,αF∈(elι)−1(y,α)f(yF,x,αF)) =maxθ∈BrΓ(Θ)(elι)∗(θ)(λyFαFx.f(yF,x,αF)) And so we now must embark on showing these two things are equal, by showing that every value that one of the maxes is capable of producing, the other is capable of producing too. Our proof goal is maxθF∈BrF((ι)∗Θ)θF(λyFαFx.f(yF,x,αF))=maxθ∈BrΓ(Θ)(elι)∗(θ)(λyFαFx.f(yF,x,αF)) Now, for one direction of this, we need to show that if θ∈BrΓ(Θ), then (elι)∗(θ)∈BrF((ι)∗Θ).
The support condition is easily fulfilled, because the only time the pullback produces anything is when y,α∈im(elι), and ι is an injection, so there’s a unique preimage point yF,αF, and since y∈α always holds, yF∈αF always holds. That leaves the endofunction condition. Let sF:F→F, gF:F×Φ→[0,1]. Let s′ be defined as follows. For points in im(ι), it maps y to ι(sF(ι−1(y))). Everywhere else, it’s the identity. Also, g′, for points in im(ι), maps y,x to gF(ι−1(y),x), and is 0 everywhere else. Both of these are uniquely defined because ι is an injection. We have (elι)∗(θ)(λyFαFx.χsF(yF)∈αFgF(sF(yF),x)) =θ(λyαx.maxyF,αF∈(elι)−1(y,α)χsF(yF)∈αFgF(sF(yF),x)) We can partially write the maximum as an indicator function checking whether y∈im(ι) and α⊆im(ι) (because if y∉im(ι) or α⊈im(ι), the maximum is taken over an empty set, and 0 is returned). And also, since there’s only one point in that inverse, since ι is an injection, there’s a canonical inverse and we can swap everything out for that, yielding =θ(λyαx.χy∈im(ι)∧α⊆im(ι)χsF(ι−1(y))∈ι−1(α)gF(sF(ι−1(y)),x)) Now, since ι is an injection, applying it to the point and the set in the indicator function doesn’t affect whether the point is in the relevant set, so we can go =θ(λyαx.χy∈im(ι)∧α⊆im(ι)χι(sF(ι−1(y)))∈ι(ι−1(α))gF(sF(ι−1(y)),x)) And we can also apply ι and ι−1 to the point in the gF function, as that’s just identity. =θ(λyαx.χy∈im(ι)∧α⊆im(ι)χι(sF(ι−1(y)))∈ι(ι−1(α))gF(ι−1(ι(sF(ι−1(y)))),x)) Use our s′ and g′ abbreviations, since we know that the relevant point is in im(ι) if it made it past the indicator function. Also do some canceling out of identity functions around the α. =θ(λyαx.χy∈im(ι)∧α⊆im(ι)χs′(y)∈αgF(ι−1(s′(y)),x)) And use our abbreviation of g′ =θ(λyαx.χy∈im(ι)∧α⊆im(ι)χs′(y)∈αg′(s′(y),x)) Remove the indicator function ≤θ(λyαx.χs′(y)∈αg′(s′(y),x)) Apply the endofunction property ≤Θ(λyx.g′(y,x)) Now, g′ is 0 when y∉im(ι), and is gF(ι−1(y),x) otherwise, so we have =Θ(λyx.χy∈im(ι)gF(ι−1(y),x)) =Θ(λyx.maxyF∈ι−1(y)gF(yF,x))=(ι)∗Θ(λyFx.gF(yF,x)) And we’re done here, the endofunction condition has been shown, establishing one direction of the inequality.
For the reverse direction, we’ll be working on showing that if θF∈BrF((ι)∗Θ), then (elι)∗θF∈Br(Θ), though it will take a little bit of extra work at the end to show how this implies our desired equality. The support property obviously holds, we’re just applying ι to our point and set, so that leaves the endofunction property. s and g are as usual. (elι)∗θF(λyαx.χs(y)∈αg(s(y),x)) =θF(λyFαFx.χs(ι(yF))∈ι(αF)g(s(ι(yF)),x)) Now, if s(ι(yF))∈im(ι), it might pass the indicator function, but if s(ι(yF))∉im(ι), it definitely won’t. So let’s place that indicator function. =θF(λyFαFx.χs(ι(yF))∈im(ι)χs(ι(yF))∈ι(αF)g(s(ι(yF)),x)) Now, since we know this stuff is in the image of ι (and ι(αF) definitely is), and ι is an injection, we can safely take the inverse of everything, yielding =θF(λyFαFx.χs(ι(yF))∈im(ι)χι−1(s(ι(yF)))∈ι−1(ι(αF))g(ι(ι−1(s(ι(yF)))),x)) In order, this was because we could reverse the injection as long as we’re past that indicator function, reversing the injection doesn’t affect whether the point is an element of the set, and ι∘ι−1 is identity. Cancel out some of the identity. =θF(λyFαFx.χs(ι(yF))∈im(ι)χι−1(s(ι(yF)))∈αFg(ι(ι−1(s(ι(yF)))),x)) Now, at this point, let sF:F→F be defined as sF(yF)=ι−1(s(ι(yF))) when s(ι(yF))∈im(ι), and arbitrary otherwise. And let gF be defined as gF(yF,x)=g(ι(yF),x) Now, using these abbreviations, we can go =θF(λyFαFx.χs(ι(yF))∈im(ι)χsF(yF)∈αFg(ι(sF(yF)),x)) =θF(λyFαFx.χs(ι(yF))∈im(ι)χsF(yF)∈αFgF(sF(yF),x)) Now we can safely strip away the indicator function. ≤θF(λyFαFx.χsF(yF)∈αFgF(sF(yF),x)) And apply the endofunction condition ≤(ι)∗(Θ)(λyFx.gF(yF,x)) Unpacking the pullback, we get =Θ(λyx.maxyF∈ι−1(y)gF(yF,x)) Now unpack the definition of gF =Θ(λyx.maxyF∈ι−1(y)g(ι(yF),x)) Use that Θ is supported on the image of ι, so that max is always nonempty. Then, things cancel out to yield =Θ(λyx.g(y,x)) And we’re done there.
Ok, so… we’ve shown one inequality direction already, all that remained to be shown was maxθF∈BrF((ι)∗Θ)θF(λyFαFx.f(yF,x,αF))≤maxθ∈BrΓ(Θ)(elι)∗(θ)(λyFαFx.f(yF,x,αF)) How does it help to know that for any θF∈BrF((ι)∗Θ), then (elι)∗θF∈BrΓ(Θ)? Well, for any particular θF chosen on the left-hand side, you can let the choice of θ for the right hand side be (elι)∗θF, which lies in the appropriate set. Then, the contribution for the right-hand-side would be (elι)∗((elι)∗(θF)), which happens to be θF. So anything the left-hand-side can do, the right-hand-side can do as well, showing our one remaining direction of inequality, and concluding the proof.
Well, actually, I should flesh out that claim that (elι)∗((elι)∗(θF))=θF. This happens because (elι)∗((elι)∗(θF))(λyFαFx.f(yF,x,αF)) =(elι)∗(θF)(λyαx.χ(y,α)∈im(elι)maxyF,αF∈(elι)−1(y,α)f(yF,x,αF)) =θF(λyFαFx.χ(ι(yF),ι(αF))∈im(elι)maxy′F,α′F∈(elι)−1(ι(yF),ι(αF))f(y′F,x,α′F)) Now, that initial indicator condition is always true, because θF is supported on elF, and elι maps elF onto im(elι), so we have =θF(λyFαFx.maxy′F,α′F∈(elι)−1(ι(yF),ι(αF))f(y′F,x,α′F)) And then, since elι is injective, applying it and then taking the preimage is just identity, so we get =θF(λyFαFx.f(yF,x,αF)) And that’s why we have equality. Alright, proof finally done. ■
Proposition 2.15: Consider some Γ0, Γ1, r:Γ0→Γ1, Φ, Θ∈□c(Γ0×Φ). Let ι:Γ0→Γ0×Γ1 be given by ι(y):=(y,r(y)). Then, BrΓ0×Γ1((ι×idΦ)∗Θ)=(elι×idΦ)∗BrΓ0(Θ)
This can be established by Proposition 2.14. Taking Proposition 2.14, and using Θ′ as an abbreviation for (ι×idΦ)∗Θ, and im(ι) as the F, we get BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ) =(idΓ0×Γ1×∩im(ι)×idΦ)∗(BrΓ0×Γ1(Θ′)∩⊤im(ι)×2Γ×Φ) We’ll need to abbreviate in a bit, so ignore some of the identity functions to get BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ) =(∩im(ι))∗(BrΓ0×Γ1(Θ′)∩⊤im(ι)×2Γ×Φ) Now, deabbreviating, Θ′, i.e., (ι×idΦ)∗Θ, is already supported on im(ι), so intersecting with ⊤im(ι)×Φ does nothing. BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ) =(∩im(ι))∗(BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)∩⊤im(ι)×2Γ×Φ) Now, since Θ′∩⊤im(ι)×Φ is only supported on im(ι), the bridge transform will only be supported on elim(ι). So, we can pull back along the function elι and push forward, and it will be the identity, as our ultracontribution is entirely supported on the area that can be pulled back. So, we get BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ) =(∩im(ι))∗((elι×idΦ)∗((elι×idΦ)∗(BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ)))∩⊤im(ι)×2Γ×Φ) Applying the second part of Proposition 2.14, this can be rewritten as BrΓ0×Γ1(Θ′∩⊤im(ι)×Φ) =(∩im(ι))∗((elι×idΦ)∗(BrΓ0((ι×idΦ)∗(Θ′∩⊤im(ι)×Φ)))∩⊤im(ι)×2Γ×Φ) Since Θ′ deabbreviates as (ι×idΦ)∗Θ, it’s supported on im(ι), so intersecting with im(ι) does nothing. Getting rid of that part, and deabbreviating, we get BrΓ0×Γ1((ι×idΦ)∗Θ) =(∩im(ι))∗((elι×idΦ)∗(BrΓ0((ι×idΦ)∗((ι×idΦ)∗Θ)))∩⊤im(ι)×2Γ×Φ) Now, since ι is an injection, pushforward-then-pullback is the identity, so we get BrΓ0×Γ1((ι×idΦ)∗Θ) =(∩im(ι))∗((elι×idΦ)∗(BrΓ0(Θ))∩⊤im(ι)×2Γ×Φ) Then, since the pushforward through elι is supported on elim(ι), the intersection does nothing BrΓ0×Γ1((ι×idΦ)∗Θ)=(∩im(ι))∗((elι×idΦ)∗(BrΓ0Θ)) Then note that any set pushed forward through elι must be a subset of im(ι), so the pushforward along ∩im(ι) does nothing, and we get BrΓ0×Γ1((ι×idΦ)∗Θ)=(elι×idΦ)∗(BrΓ0Θ) And the result is proven. ■