Δ2ll,ϕ is an error space for any ϕ∈Φ. Δ2ll is an error space.
Proposition 2
Consider a polynomial q:N2→N. There is a function λq:N3→[0,1] s.t.
(i) ∀k,j∈N:∑i∈Nλq(k,j,i)=1
(ii) For any function ϵ:N2→[0,1] we have
ϵ(k,j)−∑i∈Nλq(k,j,i)ϵ(k,q(k,j)+i)∈Δ2ll
The proofs of Propositions 1 and 2 are in the Appendix. The following are proved using exactly like the analogous statements for Δ2avg and we omit the proofs.
Lemma
Consider (f,μ) a distributional estimation problem, (P,r,a), (Q,s,b)(poly,log)-predictor schemes. Suppose p:N2→N a polynomial and δ∈Δ2ll are s.t.
∀i,k,j∈N:E[(Pk,p(k,j)+i−f)2]≤E[(Qkj−f)2]+δ(k,j)
Then ∃δ′∈Δ2avg s.t.
E[(Pkj−f)2]≤E[(Qkj−f)2]+δ′(k,j)
Theorem 1
Consider (f,μ) a distributional estimation problem. Define Υ:N2×{0,1}∗3alg−→[0,1] by
Υkj(x,y,Q):=β(evj(Q,x,y))
Define υf,μ:N2→{0,1}∗ by
υkjf,μ:=argmin|Q|≤logjEμk×Uj[(Υkj(x,y,Q)−f(x))2]
Then, (Υ,j,υf,μ) is a Δ2ll(poly,log)-optimal predictor scheme for (f,μ).
Theorem 2
There is an oracle machine Λ that accepts an oracle of signature SF:N×{0,1}∗→{0,1}∗×[0,1] and a polynomial r:N→N where the allowed oracle calls are SFk(x) for |x|=r(k) and computes a function of signature N2×{0,1}∗2→[0,1] s.t. for any ϕ∈Φ, (f,μ) a distributional estimation problem and G:=(S,F,rS,aS) a corresponding Δ1ϕ(log)-generator, Λ[G] is a Δ2ll,ϕ(poly,log)-optimal predictor scheme for (f,μ).
Appendix
Proof of Proposition 1
The only slightly non-obvious condition is (v). We have
Improved error space for universal optimal predictor schemes
We construct an error space which is smaller than Δ2avg but admits analogous existence theorems for optimal predictor schemes.
Results
Construction
Given ϕ∈Φ we define Δ1ϕ to be the set of functions δ:N→R≥0 s.t. ∃ϵ>0:limk→∞ϕ(k)ϵδ(k)=0. It is easily seen Δ1ϕ is an error space.
Given ϕ∈Φ, denote tϕ(k):=⌊2(logk)ϕ(k)⌋. We define Δ2ll,ϕ to be the set of bounded functions δ:N2→R≥0 s.t. for any ϕ′∈Φ, if ϕ′≤ϕ then
tϕ′(k)−1∑j=2(loglog(j+1)−loglogj)δ(k,j)loglogtϕ′(k)∈Δ1ϕ′
We define Δ2ll:=⋂ϕ∈ΦΔ2ll,ϕ.
Proposition 1
Δ2ll,ϕ is an error space for any ϕ∈Φ. Δ2ll is an error space.
Proposition 2
Consider a polynomial q:N2→N. There is a function λq:N3→[0,1] s.t.
(i) ∀k,j∈N:∑i∈Nλq(k,j,i)=1
(ii) For any function ϵ:N2→[0,1] we have
ϵ(k,j)−∑i∈Nλq(k,j,i)ϵ(k,q(k,j)+i)∈Δ2ll
The proofs of Propositions 1 and 2 are in the Appendix. The following are proved using exactly like the analogous statements for Δ2avg and we omit the proofs.
Lemma
Consider (f,μ) a distributional estimation problem, (P,r,a), (Q,s,b) (poly,log)-predictor schemes. Suppose p:N2→N a polynomial and δ∈Δ2ll are s.t.
∀i,k,j∈N:E[(Pk,p(k,j)+i−f)2]≤E[(Qkj−f)2]+δ(k,j)
Then ∃δ′∈Δ2avg s.t.
E[(Pkj−f)2]≤E[(Qkj−f)2]+δ′(k,j)
Theorem 1
Consider (f,μ) a distributional estimation problem. Define Υ:N2×{0,1}∗3alg−→[0,1] by
Υkj(x,y,Q):=β(evj(Q,x,y))
Define υf,μ:N2→{0,1}∗ by
υkjf,μ:=argmin|Q|≤logjEμk×Uj[(Υkj(x,y,Q)−f(x))2]
Then, (Υ,j,υf,μ) is a Δ2ll(poly,log)-optimal predictor scheme for (f,μ).
Theorem 2
There is an oracle machine Λ that accepts an oracle of signature SF:N×{0,1}∗→{0,1}∗×[0,1] and a polynomial r:N→N where the allowed oracle calls are SFk(x) for |x|=r(k) and computes a function of signature N2×{0,1}∗2→[0,1] s.t. for any ϕ∈Φ, (f,μ) a distributional estimation problem and G:=(S,F,rS,aS) a corresponding Δ1ϕ(log)-generator, Λ[G] is a Δ2ll,ϕ(poly,log)-optimal predictor scheme for (f,μ).
Appendix
Proof of Proposition 1
The only slightly non-obvious condition is (v). We have
limsupk→∞ϕ(k)ϵαEκkϕ[δ(k,j)α]≤limsupk→∞ϕ(k)ϵαEκkϕ[δ(k,j)]α
limsupk→∞ϕ(k)ϵαEκkϕ[δ(k,j)α]≤(limsupk→∞ϕ(k)ϵEκkϕ[δ(k,j)])α
limk→∞ϕ(k)ϵαEκkϕ[δ(k,j)α]=0
Proof of Proposition 2
Given functions q1,q2:N2→N s.t.q1(k,j)≥q2(k,j) for k,j≫0, the proposition for q1 implies the proposition for q2 by setting
λq2(k,j,i):={λq1(k,j,i−q1(k,j)+q2(k,j))if i−q1(k,j)+q2(k,j)≥00if i−q1(k,j)+q2(k,j)<0
Therefore, it is enough to prove to proposition for functions of the form q(k,j)=jm+nlogk for m>0.
Consider any ϕ∈Φ. We have
limk→∞ϕ(k)−12=0
limk→∞ϕ(k)12loglogkϕ(k)loglogk=0
limk→∞ϕ(k)12log(m+nlogk)ϕ(k)loglogk=0
limk→∞ϕ(k)122m+nlogk∫x=2d(loglogx)ϕ(k)loglogk=0
Since ϵ takes values in [0,1]
limk→∞ϕ(k)122m+nlogk∫x=2ϵ(k,⌊x⌋)d(loglogx)ϕ(k)loglogk=0
Similarly
limk→∞ϕ(k)12tϕ(k)m+nlogk∫x=tϕ(k)ϵ(k,⌊x⌋)d(loglogx)ϕ(k)loglogk=0
The last two equations imply that
limk→∞ϕ(k)12tϕ(k)∫x=2ϵ(k,⌊x⌋)d(loglogx)−tϕ(k)m+nlogk∫x=2m+nlogkϵ(k,⌊x⌋)d(loglogx)ϕ(k)loglogk=0
Raising x to a power is equivalent to adding a constant to loglogx, therefore
limk→∞ϕ(k)12tϕ(k)∫x=2ϵ(k,⌊x⌋)d(loglogx)−tϕ(k)∫x=2ϵ(k,⌊xm+nlogk⌋)d(loglogx)ϕ(k)loglogk=0
limk→∞ϕ(k)12tϕ(k)∫x=2(ϵ(k,⌊x⌋)−ϵ(k,⌊xm+nlogk⌋))d(loglogx)ϕ(k)loglogk=0
Since ⌊xm+nlogk⌋≥⌊x⌋m+nlogk we can choose λq satisfying condition (i) so that
j+1∫x=jϵ(k,⌊xm+nlogk⌋)d(loglogx)=(loglog(j+1)−loglogj)∑iλq(k,j,i)ϵ(k,jm+nlogk+i)
It follows that
j+1∫x=jϵ(k,⌊xm+nlogk⌋)d(loglogx)=j+1∫x=j∑iλq(k,⌊x⌋,i)ϵ(k,⌊x⌋m+nlogk+i)d(loglogx)
limk→∞ϕ(k)12tϕ(k)∫x=2(ϵ(k,⌊x⌋)−∑iλq(k,⌊x⌋,i)ϵ(k,⌊x⌋m+nlogk+i))d(loglogx)ϕ(k)loglogk=0
limk→∞ϕ(k)12∑tϕ(k)−1j=2(loglog(j+1)−loglogj)(ϵ(k,j)−∑iλq(k,j,i)ϵ(k,jm+nlogk+i))ϕ(k)loglogk=0
ϵ(k,j)−∑i∈Nλq(k,j,i)ϵ(k,q(k,j)+i)∈Δ2ll