Logical counterfactuals using optimal predictor schemes
We give a sufficient condition for a logical conditional expectation value defined using an optimal predictor scheme to be stable on counterfactual conditions.
Definition
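Given $\Delta$ an error space of rank $r$, the stabilizer of $\Delta$, denoted $\operatorname{stab} \Delta$, is the set of functions $\gamma: \mathbb{N}^r \rightarrow \mathbb{R}^{>0}$ s.t. for any $\delta \in \Delta$ we have $\gamma \delta \in \Delta$.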
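As a quick orientation, here is a worked observation about which functions lie in the stabilizer automatically. It is a sketch that assumes the usual axioms for error spaces, which are not restated in this post: closure under addition, and closure under passing to pointwise smaller non-negative functions. Take $\delta \in \Delta$ and a positive function $\gamma$ bounded by a positive integer $M$ (a name introduced only for this remark):

$$0 \leq \gamma \delta \leq M \delta = \underbrace{\delta + \ldots + \delta}_{M \text{ times}} \in \Delta \quad \Longrightarrow \quad \gamma \delta \in \Delta$$

Under these assumptions every bounded positive function is in the stabilizer, so requiring the inverse of a function to be in the stabilizer is a genuine restriction only when that function is allowed to approach 0.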
Theorem
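Consider $\Delta$ an error space of rank 2, $D \subseteq \{0,1\}^*$, $\mu$ a word ensemble and $\hat{P}_D$ a $\Delta(poly,log)$-optimal predictor scheme for $(\chi_D,\mu)$. Assume $\epsilon: \mathbb{N}^2 \rightarrow \mathbb{R}^{>0}$ is s.t.
(i) $\epsilon^{-1} \in \operatorname{stab} \Delta$
(ii) $\hat{P}_D^{kj} \geq \epsilon(k,j)$
Consider $f: D \cap \operatorname{supp} \mu \rightarrow [0,1]$ and $\hat{P}_1$, $\hat{P}_2$ $\Delta(poly,log)$-optimal predictor schemes for $(f,\mu \mid D)$. Then, $\hat{P}_1 \overset{\mu}{\underset{\Delta}{\simeq}} \hat{P}_2$.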
Note
This result can be interpreted as stability on counterfactual conditions since the similarity is relative to $\mu$ rather than only relative to $\mu \mid D$. That is, $\hat{P}_1$ and $\hat{P}_2$ are similar outside of $D$ as well.
Proof of Theorem
We will refer to the previously established results about Δ(poly,log)-optimal predictor schemes by L.N where N is the number in the linked post. Thus Theorem 1 there becomes Theorem L.1 here and so on.
By Theorem L.A.7
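$$E_{(\mu^k \mid D) \times U^{r_1(k,j)+r_2(k,j)}}[(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] \in \Delta$$

$$\frac{E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[\chi_D(x)(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2]}{\mu^k(D)} \in \Delta$$

$$E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[\chi_D(x)(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] \in \Delta$$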
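The three lines above can be unpacked as follows. This is a sketch that assumes $\mu^k(D) > 0$, so that $\mu^k \mid D$ is defined, and that $\Delta$ is closed under passing to pointwise smaller non-negative functions (part of the usual definition of an error space, not restated here). The shorthand $\Theta^{kj}(x) := (\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2$ is introduced only for this remark.

$$E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[\chi_D(x)\Theta^{kj}(x)] = \mu^k(D) \cdot E_{(\mu^k \mid D) \times U^{r_1(k,j)+r_2(k,j)}}[\Theta^{kj}(x)] \leq E_{(\mu^k \mid D) \times U^{r_1(k,j)+r_2(k,j)}}[\Theta^{kj}(x)]$$

The equality is the definition of conditioning on $D$, and the inequality holds because $\mu^k(D) \leq 1$. The right-hand side is in $\Delta$, so under the stated assumption the left-hand side is as well.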
On the other hand, by Lemma L.B.3
$$E_{\mu^k \times U^{r(k,j)+r_1(k,j)+r_2(k,j)}}[(\hat{P}_D^{kj}(x)-\chi_D(x))(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] \in \Delta$$
Combining the last two statements we conclude that
$$E_{\mu^k \times U^{r(k,j)+r_1(k,j)+r_2(k,j)}}[\hat{P}_D^{kj}(x)(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] \in \Delta$$
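The combination step is an instance of linearity of expectation. As a sketch, write $\Theta^{kj}(x) := (\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2$ (shorthand introduced only for this remark), abbreviate the expectation over $\mu^k \times U^{r(k,j)+r_1(k,j)+r_2(k,j)}$ as $E$, and assume error spaces are closed under addition and under passing to pointwise smaller non-negative functions:

$$E[\hat{P}_D^{kj}(x)\Theta^{kj}(x)] = E[(\hat{P}_D^{kj}(x)-\chi_D(x))\Theta^{kj}(x)] + E[\chi_D(x)\Theta^{kj}(x)]$$

The first term is controlled by Lemma L.B.3. The second term does not involve $\hat{P}_D$, so it does not depend on the extra $r(k,j)$ random bits and equals the expectation over $\mu^k \times U^{r_1(k,j)+r_2(k,j)}$ bounded earlier. The left-hand side is non-negative and bounded by a sum of two elements of $\Delta$, hence it is in $\Delta$ under the stated assumptions.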
It follows that
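$$E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] = \epsilon(k,j)^{-1} E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[\epsilon(k,j)(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2]$$

$$E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] \leq \epsilon(k,j)^{-1} E_{\mu^k \times U^{r(k,j)+r_1(k,j)+r_2(k,j)}}[\hat{P}_D^{kj}(x)(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2]$$

$$E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[(\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2] \in \Delta$$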
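The final chain is where both hypotheses on $\epsilon$ enter. As a sketch, write $\Theta^{kj}(x) := (\hat{P}_1^{kj}(x)-\hat{P}_2^{kj}(x))^2$ and let $\delta(k,j)$ denote the expectation of $\hat{P}_D^{kj}(x)\Theta^{kj}(x)$ over $\mu^k \times U^{r(k,j)+r_1(k,j)+r_2(k,j)}$, already shown to be in $\Delta$ (both names are introduced only for this remark). Since $\Theta^{kj} \geq 0$, condition (ii) gives a pointwise bound, and taking expectations yields

$$\epsilon(k,j)\Theta^{kj}(x) \leq \hat{P}_D^{kj}(x)\Theta^{kj}(x) \quad \Longrightarrow \quad E_{\mu^k \times U^{r_1(k,j)+r_2(k,j)}}[\Theta^{kj}(x)] \leq \epsilon(k,j)^{-1}\delta(k,j)$$

By condition (i), $\epsilon^{-1}\delta \in \Delta$. The left-hand side is non-negative and dominated by an element of $\Delta$, so it is in $\Delta$, assuming error spaces are closed under passing to pointwise smaller non-negative functions. Assuming further that similarity relative to $\mu$, as defined in the linked post, means exactly that this mean squared difference is in $\Delta$, this is the claim of the theorem.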