We construct a family of oracles relative to which all estimation problems which have approximate solutions in exponential time are generatable. Relative to these oracles, we can use the optimal predictor scheme $Λ$ for assigning expectation values to arbitrary deterministic computations within time that is superquasipolynomial in the logarithm of the computation time. In particular, this should allow reflection in the sense of Garrabrant once we succeed in derandomizing $Λ$ relative to the same oracle.

Results

New notation

Given $O \subseteq \{0, 1\}^{*}$ and appropriate sets $X$ and $Y$, $A : X \xrightarrow{O} Y$ denotes a pair consisting of $O$ and an oracle machine that, supplied with oracle $O$, halts on every input in $X$ and produces an output in $Y$. In particular, it defines a function from $X$ to $Y$ in the obvious way.

For every $n \in N$, we fix a prefix free universal oracle machine $U^{O}_{n}$ with one program tape, $n$ input tapes and one output tape (we are only going to use finitely many values of $n$). Given $O \subseteq \{0, 1\}^{*}$, $ev_{O, n} : N \times (\{0, 1\}^{*})^{n + 1} \xrightarrow{O} \{0, 1\}^{*}$ is the following. When $ev_{O, n}^{k} (Q, x_{1} \dots x_{n})$ is computed, $Q$ is interpreted as a program for $U^{O}_{n}$ and $Q (x_{1} \dots x_{n})$ is executed with oracle $O$ for time $k$. The resulting output is produced.

The notation $e v_{O}^{k} (Q, x_{1} \dots x_{n})$ means $e v_{O, n}^{k} (Q, x_{1} \dots x_{n})$ .

Given $A : X \xrightarrow{O} Y$ and $X^{'} \subseteq X$, $T_{A} (X^{'})$ will denote the maximal runtime of $A$ on elements of $X^{'}$. $Q_{A} (X^{'})$ will denote the set of queries made to the oracle when applying $A$ to elements of $X^{'}$.

All the theory of predictors with logarithmic advice can be relativized with respect to an arbitrary oracle. We denote the corresponding concepts using the oracle as a superscript. For example, the relativized version of a $Δ (poly, log)$-optimal predictor scheme is a $Δ (poly, log)^{O}$-optimal predictor scheme, etc.

Given $μ$ a word ensemble, $supp\, μ$ denotes $\bigcup_{k \in N} supp\, μ^{k}$. We previously defined a distributional estimation problem to be a pair $(f, μ)$ where $μ$ is a word ensemble and $f$ is a function from $\{0, 1\}^{*}$ to $[0, 1]$. From now on, we allow $f$ to be defined only on $supp\, μ$. This doesn’t affect any of the previous results.

Omitting parentheses after an error space will mean zero advice. For example a $Δ$ -sampler is a $Δ (0)$ -sampler.

We start by constructing an exponential time estimation problem which is complete in the class of exponential time estimation problems with samplable word ensembles with respect to $Δ$-pseudo-invertible reductions (the concept of $Δ$-pseudo-invertible reductions was previously introduced for unidistributional problems but it is defined in the same way for distributional problems, see Appendix A). Everything is relative to an arbitrary fixed oracle.

Construction 1

Fix $O \subseteq \{0, 1\}^{*}$.

Define $μ_{S a m p (O)}$ to be the following word ensemble. Sampling $μ_{S a m p (O)}^{k}$ is equivalent to sampling $S$ , $F$ and $y$ independently from $U^{k}$ and producing $(k, S, F, e v_{O}^{k} (S, k, y))$ .

Define $f_{E X P (O)} : supp μ_{S a m p (O)} \to [0, 1]$ by

$f_{E X P (O)} (k, S, F, x) := β (e v_{O}^{2^{k}} (F, x))$
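As a rough illustration of Construction 1, the sampling procedure for $μ_{Samp (O)}^{k}$ can be sketched in Python. The function `toy_ev` below is a hypothetical stand-in for $ev_{O}^{k}$: it is not a universal oracle machine and it ignores the oracle entirely. Only the shape of the sampled tuple follows the construction.

```python
import random

def uniform_bits(k, rng):
    """Sample a word from U^k, represented as a string of '0'/'1'."""
    return "".join(rng.choice("01") for _ in range(k))

def toy_ev(program, k, y, time_bound):
    """Hypothetical stand-in for ev_O^k(S, k, y).

    A real implementation would run the universal oracle machine on
    program S with inputs (k, y) for `time_bound` steps. Here we just
    XOR the program with y and truncate to `time_bound` bits.
    """
    out = "".join(str(int(a) ^ int(b)) for a, b in zip(program, y))
    return out[:time_bound]

def sample_mu_samp(k, rng, ev=toy_ev):
    """Sample (k, S, F, ev^k(S, k, y)) with S, F, y independent and uniform."""
    S = uniform_bits(k, rng)
    F = uniform_bits(k, rng)
    y = uniform_bits(k, rng)
    return (k, S, F, ev(S, k, y, k))

rng = random.Random(0)
sample = sample_mu_samp(4, rng)
```

The auxiliary word $y$ is discarded after evaluation, so the fourth component is distributed as the output of a uniformly random program $S$ run for time $k$ on uniformly random input.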

Theorem 1

Fix error spaces $Δ^{1}$, $Δ^{2}$ of ranks 1 and 2 respectively. Assume that for any $δ \in Δ^{1}$ the function $δ^{'} (k, j) := δ (k)$ is in $Δ^{2}$. Consider $(f, μ)$ a distributional estimation problem. Suppose $q : N \to N$ is a polynomial, $F : \{0, 1\}^{*} \xrightarrow{O} [0, 1]$ and $\hat{S} = (S, r_{S})$ is a $(Δ^{1})^{O}$-sampler for $μ$ s.t.

(i) $E_{μ^{k}} [(F (x) - f (x))^{2}] \in Δ^{1}$

(ii) $\forall k \in N : T_{F} (supp μ^{k}) \leq 2^{q (k)}$

(iii) $\forall k \in N : | F | \leq q (k)$

(iv) $\forall k \in N : T_{S} (\{0, 1\}^{r_{S} (k)}) \leq q (k)$

(v) $\forall k \in N : | S | \leq q (k)$

It is possible to choose a polynomial $p : N \to N$ and $\tilde{S}$ a $(poly, 0)^{O}$-scheme of signature $1 \to \{0, 1\}^{*}$ s.t.

(i) $\forall k \in N : p (k) \geq q (k)$

(ii) $\forall k \in N, y \in \{0, 1\}^{p (k)} : \tilde{S}^{p (k)} (y) = S^{k} (y_{< r_{S} (k)})$

(iii) $\forall k \in N : T_{\tilde{S}} (\{0, 1\}^{p (k)}) \leq p (k)$

Denote by $\hat{ζ}_{\tilde{S}, F, p} = (ζ_{\tilde{S}, F, p}, r_{\tilde{S}, F, p})$ the $\{0, 1\}^{*}$-valued $(poly, 0)$-scheme defined by

$r_{\tilde{S}, F, p} (k) = 2 p (k) - | \tilde{S} | - | F |$

$\forall k \in N, x \in \{0, 1\}^{*}, y_{1} \in \{0, 1\}^{p (k) - | \tilde{S} |}, y_{2} \in \{0, 1\}^{p (k) - | F |} : ζ_{\tilde{S}, F, p}^{k} (x, y_{1} y_{2}) = (p (k), \tilde{S} y_{1}, F y_{2}, x)$

Then, $\hat{ζ}_{\tilde{S}, F, p}$ is a $Δ^{2}$-pseudo-invertible reduction of $(f, μ)$ to $(f_{EXP (O)}, μ_{Samp (O)})$.
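The reduction in Theorem 1 only pads the two fixed programs with random bits up to length $p (k)$ and passes $x$ through unchanged. A minimal Python sketch of this map, with words represented as bit strings and with hypothetical program encodings chosen purely for illustration:

```python
def zeta(p_k, S_tilde, F, x, y1, y2):
    """Map x to (p(k), S~ y1, F y2, x), padding both programs to length p(k)."""
    if len(y1) != p_k - len(S_tilde) or len(y2) != p_k - len(F):
        raise ValueError("padding lengths must be p(k) - |S~| and p(k) - |F|")
    return (p_k, S_tilde + y1, F + y2, x)

def padding_length(p_k, S_tilde, F):
    """r(k) = 2 p(k) - |S~| - |F|: the number of random bits zeta consumes."""
    return 2 * p_k - len(S_tilde) - len(F)

# Hypothetical encodings: S~ = "101", F = "11", with p(k) = 6.
instance = zeta(6, "101", "11", "0010", "000", "1111")
```

Here `instance` is `(6, "101000", "111111", "0010")`: both program components have length $p (k) = 6$, matching the format of words sampled from $μ_{Samp (O)}^{p (k)}$.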

The proofs of this and subsequent results are in Appendix B.

In particular, if $(f_{E X P (O)}, μ_{S a m p (O)})$ has a uniform $Δ^{2} (p o l y, l o g)^{O}$ -optimal predictor scheme then we can construct a $Δ^{2} (p o l y, l o g)^{O}$ -optimal predictor scheme for any $Δ^{1}$ -samplable distributional estimation problem which has an exponential time approximation with error in $Δ^{1}$ . We don’t expect this to be true for reasonable error spaces and $O = \emptyset$ , but as the following construction (drawing heavily from Heller) shows it is true for some oracles.

Construction 2

Consider $O \subseteq \{0, 1\}^{*}$, $p : N \to N$ a polynomial and $σ : N \times \{0, 1\}^{*} \xrightarrow{O} \{0, 1\}^{*}$. We give a recursive definition of $O_{p, k}^{σ} \subseteq \{0, 1\}^{*}$ for $k \in N$.

Recursion base:

$O_{p, 0}^{σ} := \{0 x \mid x \in O\}$

Denote

$O_{p, k}^{σ, +} := \{1 w α \mid | α | = k \land \exists S, F, y \in \{0, 1\}^{k}, z \in \{0, 1\}^{p (k)} : σ^{k} (w) = z S F y \land f_{EXP (O_{p, k}^{σ})}^{k} (S, F, ev_{O_{p, k}^{σ}}^{k} (S, k, y)) \geq β (α)\}$

$O_{p, k}^{σ, -} := (\{0, 1\}^{*} \setminus O_{p, k}^{σ}) \cap (Q_{f_{EXP (O_{p, k}^{σ})}} (supp\, μ_{Samp (O_{p, k}^{σ})}^{k}) \cup Q_{ev_{O_{p, k}^{σ}}} (\{0, 1\}^{k} \times \{k\} \times \{0, 1\}^{k}))$

Recursion step:

$O_{p, k + 1}^{σ} := O_{p, k}^{σ} \cup O_{p, k}^{σ, +} \setminus \bigcup_{i \leq k} O_{p, i}^{σ, -}$

Finally, we define

$O_{p}^{σ} := \bigcup_{k \in N} O_{p, k}^{σ}$
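The set-theoretic shape of the recursion can be sketched on finite toy data. In the actual construction the sets $O_{p, k}^{σ, +}$ and $O_{p, k}^{σ, -}$ are computed from $O_{p, k}^{σ}$ itself; in the Python sketch below they are supplied by hand, which is a simplification, and the recursion step is read as $O_{p, k}^{σ} \cup (O_{p, k}^{σ, +} \setminus \bigcup_{i \leq k} O_{p, i}^{σ, -})$. The function name and the example sets are hypothetical.

```python
def oracle_stages(base, plus, minus):
    """Return [O_0, O_1, ..., O_n] for toy inputs.

    base  : the set O
    plus  : plus[k] plays the role of O^{sigma,+}_{p,k}
    minus : minus[k] plays the role of O^{sigma,-}_{p,k}
    """
    stages = [{"0" + x for x in base}]   # recursion base: prefix O with 0
    forbidden = set()                     # union of minus[i] for i <= k
    for added, removed in zip(plus, minus):
        forbidden |= removed
        # Recursion step: add new words unless some stage i <= k froze them out.
        stages.append(stages[-1] | (added - forbidden))
    return stages

stages = oracle_stages(
    base={"", "1"},
    plus=[{"10", "11"}, {"100", "101"}],
    minus=[{"11", "101"}, {"111"}],
)
final = stages[-1]
```

In this example the word `"101"` is proposed at stage 1 but never enters the oracle, because it was already recorded as a negative query at stage 0. Words are never removed once added, so the stages form an increasing chain.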

Construction 3

Define $Δ_{p o w}^{1}$ to be the set of functions $δ : N \to R^{\geq 0}$ s.t.

$\exists ϵ > 0 : \lim_{k \to \infty} k^{ϵ} δ (k) = 0$

It is easy to see $Δ_{p o w}^{1}$ is an error space.
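For instance, $δ (k) = \frac{1}{k + 1}$ lies in $Δ_{pow}^{1}$: taking $ϵ = \frac{1}{2}$ gives, for $k \geq 1$,

$k^{1/2} \cdot \frac{1}{k + 1} \leq k^{- 1/2} \to 0$

By contrast, $δ (k) = \frac{1}{\log (k + 2)}$ tends to zero but is not in $Δ_{pow}^{1}$, since for every $ϵ > 0$

$\lim_{k \to \infty} \frac{k^{ϵ}}{\log (k + 2)} = + \infty$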

Theorem 2

Consider $O \subseteq \{0, 1\}^{*}$ and $σ : N \times \{0, 1\}^{*} \xrightarrow{O} \{0, 1\}^{*}$ s.t. $σ^{k}$ is a bijection on $\{0, 1\}^{3 k + p (k)}$. Then, for any sufficiently large polynomial $p : N \to N$, $(f_{EXP (O_{p}^{σ})}, μ_{Samp (O_{p}^{σ})})$ is $(Δ_{pow}^{1})^{O_{p}^{σ}}$-generatable.

Note 1

For $σ$ the identity function, $(f_{E X P (O_{p}^{σ})}, μ_{S a m p (O_{p}^{σ})})$ admits a Monte Carlo algorithm. This is too strong for our purposes since it means there is no logical uncertainty in exponential time problems. Such oracles can be regarded as bounded analogues of reflective oracles.

Note 2

It is easy to see that $O_{p}^{σ} \in E X P^{O}$ .

Theorem 2 is insufficient by itself to enable reflection with optimal predictor schemes relative to $O_{p}^{σ}$ since $Λ$ is a randomized algorithm. Potentially this can be solved by derandomization, but at present we don’t know whether, or under what conditions, derandomization with respect to $O_{p}^{σ}$ is possible.

Appendix A

Fix $Δ$ an error space of rank 2.

Definition A

Consider $(f, μ)$, $(g, ν)$ distributional estimation problems, $\hat{ζ} = (ζ, r_{ζ}, a_{ζ})$ a $\{0, 1\}^{*}$-valued $(poly, log)$-bischeme. $\hat{ζ}$ is called a $Δ$-pseudo-invertible reduction of $(f, μ)$ to $(g, ν)$ when there is a polynomial $p : N \to N$ s.t. the following conditions hold:

(i) $E_{μ^{k} \times U^{r_{ζ} (k, j)}} [(g (\hat{ζ}^{k j} (x)) - f (x))^{2}] \in Δ$

(ii) $Pr_{μ^{k} \times U^{r_{ζ} (k, j)}} [ν^{p (k)} (\hat{ζ}^{k j} (x)) = 0] \in Δ$

(iii) There is $M > 0$ and $\hat{R} = (R, r_{R}, a_{R})$ a $Q \cap [0, M]$-valued $(poly, log)$-bischeme s.t.

$E_{ν^{p (k)} \times U^{r_{R} (k, j)}} [(\hat{R}^{k j} (y) - \frac{Pr_{μ^{k} \times U^{r_{ζ} (k, j)}} [\hat{ζ}^{k j} (x) = y]}{ν^{p (k)} (y)})^{2}] \in Δ$

(iv) There is a $\{0, 1\}^{*}$-valued $(poly, log)$-scheme $\hat{ξ} = (ξ, r_{ξ}, a_{ξ})$ s.t.

$E_{μ^{k} \times U^{r_{ζ} (k, j)}} [\sum_{x^{'} \in \{0, 1\}^{*}} | Pr_{U^{r_{ξ} (k, j)}} [\hat{ξ}^{k j} (\hat{ζ}^{k j} (x, z), w) = x^{'}] - Pr_{μ^{k} \times U^{r_{ζ} (k, j)}} [x^{''} = x^{'} \mid \hat{ζ}^{k j} (x^{''}, z^{'}) = \hat{ζ}^{k j} (x, z)] |] \in Δ$

Such $\hat{ξ}$ is called a $Δ$-pseudo-inverse of $\hat{ζ}$.

A $\{0, 1\}^{*}$-valued $(poly, log)$-scheme $\hat{ζ} = (ζ, r_{ζ}, a_{ζ})$ is called a $Δ$-pseudo-invertible reduction of $(f, μ)$ to $(g, ν)$ when it becomes one upon adding trivial dependence on $j$.

Theorem A

Suppose there is a polynomial $h : N^{2} \to N$ s.t. $h^{- 1} \in Δ$. Consider $(f, μ)$, $(g, ν)$ distributional estimation problems, $\hat{ζ}$ a $Δ$-pseudo-invertible reduction of $(f, μ)$ to $(g, ν)$ and $\hat{P}_{g}$ a $Δ (poly, log)$-optimal predictor scheme for $(g, ν)$. Define $\hat{P}_{f}$ by $\hat{P}_{f}^{k j} (x) := \hat{P}_{g}^{k j} (\hat{ζ}^{k j} (x))$. Then, $\hat{P}_{f}$ is a $Δ (poly, log)$-optimal predictor scheme for $(f, μ)$.

Theorem A is proved in exactly the same way as Theorem 8 for unidistributional problems, and we leave out the details.

Appendix B

Proof of Theorem 1

We need to show the 4 defining conditions of $Δ^{2}$ -pseudo-invertible reductions.

(i) $E_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [(f_{EXP (O)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) - f (x))^{2}] = E_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [(f_{EXP (O)}^{p (k)} (\tilde{S} y_{1}, F y_{2}, x) - f (x))^{2}]$

$E_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [(f_{EXP (O)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) - f (x))^{2}] = E_{μ^{k}} [(β (ev_{O}^{2^{p (k)}} (F, x)) - f (x))^{2}]$

$E_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [(f_{EXP (O)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) - f (x))^{2}] = E_{μ^{k}} [(F (x) - f (x))^{2}]$

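Here the second equality holds because the universal machine is prefix free, so the padding $y_{2}$ is ignored, and the third because, by assumption (ii) and $p \geq q$, $F$ halts within time $2^{p (k)}$ on $supp\, μ^{k}$. Using assumption (i), we get the required result.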

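(ii) $Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) = 0] = Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (p (k), \tilde{S} y_{1}, F y_{2}, x) = 0]$

$Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) = 0] = Pr_{μ^{k}} [x \notin S^{k} (\{0, 1\}^{r_{S} (k)})]$

$Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) = 0] = \sum_{x \in \{0, 1\}^{*} \setminus S^{k} (\{0, 1\}^{r_{S} (k)})} μ^{k} (x)$

$Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) = 0] = \sum_{x \in \{0, 1\}^{*} \setminus S^{k} (\{0, 1\}^{r_{S} (k)})} | μ^{k} (x) - Pr_{\{0, 1\}^{r_{S} (k)}} [S^{k} (y) = x] |$

$Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) = 0] \leq \sum_{x \in \{0, 1\}^{*}} | μ^{k} (x) - Pr_{\{0, 1\}^{r_{S} (k)}} [S^{k} (y) = x] |$

Since $\hat{S}$ is a $(Δ^{1})^{O}$-sampler for $μ$, we get

$Pr_{μ^{k} \times U^{r_{\tilde{S}, F, p} (k)}} [μ_{Samp (O)}^{p (k)} (\hat{ζ}_{\tilde{S}, F, p}^{k} (x)) = 0] \in Δ^{1}$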

(iii) Define $\hat{R}$ by

$\hat{R}^{k j} (i, S^{'}, F^{'}, x) = \begin{cases} 0 & \text{if } i \neq p (k) \\ 0 & \text{if } \tilde{S} \text{ is not a prefix of } S^{'} \\ 2^{| \tilde{S} | + | F |} & \text{in other cases} \end{cases}$

Consider $k \in N$, $x^{'} \in \{0, 1\}^{*}$, $y_{1} \in \{0, 1\}^{p (k) - | \tilde{S} |}$, $y_{2} \in \{0, 1\}^{p (k) - | F |}$.

$\frac{Pr_{μ^{k} \times U^{r_{ζ} (k, j)}} [\hat{ζ}_{\tilde{S}, F, p}^{k} (x) = (p (k), \tilde{S} y_{1}, F y_{2}, x^{'})]}{μ_{Samp (O)}^{p (k)} (p (k), \tilde{S} y_{1}, F y_{2}, x^{'})} = \frac{2^{- p (k) + | \tilde{S} |} 2^{- p (k) + | F |} Pr_{U^{r_{S} (k)}} [S^{k} (y) = x^{'}]}{2^{- p (k)} 2^{- p (k)} Pr_{U^{r_{S} (k)}} [S^{k} (y) = x^{'}]} = 2^{| \tilde{S} | + | F |}$

(iv) Define $\hat{ξ}$ by

$\hat{ξ}^{k j} (i, S^{'}, F^{'}, x) = x$

Proposition B

Consider $O \subseteq \{0, 1\}^{*}$, $p : N \to N$ a polynomial and $σ : N \times \{0, 1\}^{*} \xrightarrow{O} \{0, 1\}^{*}$. Then

(i) $\forall k \in N, i > k, x \in \{0, 1\}^{*} : x \in O_{p, k}^{σ} \implies x \in O_{p, i}^{σ} \land x \in O_{p}^{σ}$

(ii) $\forall x \in \{0, 1\}^{*} : 0 x \in O_{p}^{σ} \iff x \in O$

(iii) $\forall k \in N, \forall x \in \{0, 1\}^{4 k + p (k)} : 1 x \in O_{p}^{σ} \iff 1 x \in O_{p, k}^{σ, +} \setminus \bigcup_{i \leq k} O_{p, i}^{σ, -}$

(iv) $\forall k \in N, S \in \{0, 1\}^{k}, y \in \{0, 1\}^{k} : ev_{O_{p}^{σ}}^{k} (S, y) = ev_{O_{p, k}^{σ}}^{k} (S, y)$

(v) $\forall k \in N, F \in \{0, 1\}^{k}, x \in supp\, μ_{Samp (O_{p}^{σ})}^{k} : ev_{O_{p}^{σ}}^{2^{k}} (F, x) = ev_{O_{p, k}^{σ}}^{2^{k}} (F, x)$

Proof of Proposition B

(i) By induction: $x \in O_{p, k + 1}^{σ, -}$ implies $x \notin O_{p, k}^{σ}$ .

(ii) If $x \notin O$ then $0 x \notin O_{p}^{σ}$ since $0 x \notin O_{p, 0}^{σ}$ and all words in $O_{p, k}^{σ, +}$ begin with $1$. If $x \in O$ then $0 x \in O_{p, 0}^{σ}$, so $0 x \in O_{p}^{σ}$ by (i).

(iii) If $1 x \in O_{p, k}^{σ, +} ∖ ⋃_{i \leq k} O_{p, i}^{σ, -}$ then $1 x \in O_{p, k + 1}^{σ}$ and by (i) $1 x \in O_{p}^{σ}$ . If $1 x \in O_{p}^{σ}$ then there is $n \in N$ s.t. $1 x \in O_{p, n}^{σ, +} ∖ ⋃_{i \leq n} O_{p, i}^{σ, -}$ and $n = k$ because any word in $O_{p, n}^{σ, +}$ has length $4 n + p (n) + 1$ .

(iv) All oracle queries made in $e v_{O_{p, k}^{σ}}^{k} (S, y)$ return the same values when addressed to $O_{p, i}^{σ}$ for $i > k$ : queries that return $0$ keep returning $0$ since they are included in $O_{p, k}^{σ, -}$ and queries that return $1$ keep returning $1$ by (i).

(v) Note that (iv) implies $supp μ_{S a m p (O_{p}^{σ})}^{k} = supp μ_{S a m p (O_{p, k}^{σ})}^{k}$ and apply the same argument as in (iv) to $e v_{O_{p, i}^{σ}}^{2^{k}} (F, x)$ .

Proof of Theorem 2

We describe $\hat{G}$, the $(Δ_{pow}^{1})^{O_{p}^{σ}}$-generator for $(f_{EXP (O_{p}^{σ})}, μ_{Samp (O_{p}^{σ})})$. $\hat{G} = (G, 3 k + p (k))$ is the $(poly, 0)^{O_{p}^{σ}}$-scheme of signature $1 \to \{0, 1\}^{*} \times [0, 1]$ defined as follows. Consider the computation of $G^{k} (w)$.

$z S F y := σ^{k} (w)$ is computed, where $| S | = | F | = | y | = k$. $x := ev_{O_{p}^{σ}}^{k} (S, k, y)$ is computed. For each $i \in \{0, 1, \dots, k\}$, we compute $α_{i}$ defined as $\frac{i}{k}$ rounded to $k$ binary digits. $α := \frac{1}{k + 1} | \{i \in \{0, 1, \dots, k\} \mid 1 w α_{i} \in O_{p}^{σ}\} |$ is computed within $k$ binary digits. The output is $((k, S, F, x), α)$.
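The counting step that produces $α$ can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the oracle is replaced by a hypothetical threshold test that answers whether a hidden value is at least $α_{i}$ (as membership in $O_{p, k}^{σ, +}$ would, ignoring removed words), and exact fractions are used instead of rounding to $k$ binary digits.

```python
from fractions import Fraction

def estimate_alpha(k, is_in_oracle, w):
    """alpha = |{i in 0..k : 1 w alpha_i in O}| / (k + 1), with alpha_i = i / k.

    Requires k >= 1. `is_in_oracle(w, alpha_i)` stands in for the query
    '1 w alpha_i in O'.
    """
    hits = sum(1 for i in range(k + 1) if is_in_oracle(w, Fraction(i, k)))
    return Fraction(hits, k + 1)

def threshold_oracle(value):
    """Toy membership test: '1 w alpha' is in the oracle iff value >= alpha."""
    return lambda w, alpha: value >= alpha

k = 10
value = Fraction(37, 100)
alpha = estimate_alpha(k, threshold_oracle(value), w="0101")
```

With these toy inputs the thresholds $0, \frac{1}{10}, \frac{2}{10}, \frac{3}{10}$ pass, so `alpha` equals $\frac{4}{11}$, which differs from the hidden value $\frac{37}{100}$ by $\frac{7}{1100} \leq \frac{1}{k + 1}$.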

$\hat{G}_{1}$ reproduces $μ_{Samp (O_{p}^{σ})}$ precisely since $σ^{k}$ preserves $U^{3 k + p (k)}$.

It is easy to see that there is a polynomial $q$ s.t. for any $p$ we have $| \bigcup_{i \leq k} O_{p, i}^{σ, -} | \leq 2^{q (k)}$. Take any polynomial $p$ s.t. $\lim_{k \to \infty} (3 k + p (k) - q (k)) = + \infty$. By Proposition B (iii) we have

$α = \frac{1}{k + 1} (| \{i \in \{0, 1, \dots, k\} \mid 1 w α_{i} \in O_{p, k}^{σ, +}\} | - | \{i \in \{0, 1, \dots, k\} \mid 1 w α_{i} \in O_{p, k}^{σ, +} \cap \bigcup_{n \leq k} O_{p, n}^{σ, -}\} |)$

The first term approximates $β (ev_{O_{p, k}^{σ}}^{2^{k}} (F, x))$ within error $\frac{1}{k + 1} \in Δ_{pow}^{1}$ by the construction of $O_{p, k}^{σ, +}$. By Proposition B (v) it approximates $f_{EXP (O_{p}^{σ})} (k, S, F, x) = β (ev_{O_{p}^{σ}}^{2^{k}} (F, x))$ within the same error. The second term vanishes for all but a negligible fraction of the words $w$ by the properties of $p$ and $q$.
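To spell out the last step under the stated assumptions: each word of the form $1 w α_{i}$ determines $w$, so the number of $w \in \{0, 1\}^{3 k + p (k)}$ for which the second term is nonzero is at most $| \bigcup_{n \leq k} O_{p, n}^{σ, -} | \leq 2^{q (k)}$. Hence

$Pr_{w \sim U^{3 k + p (k)}} [\text{second term} \neq 0] \leq 2^{q (k) - 3 k - p (k)}$

Since $3 k + p (k) - q (k)$ is a polynomial tending to infinity, it grows at least linearly in $k$, so this bound decays exponentially and in particular lies in $Δ_{pow}^{1}$.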

Towards reflection with relative optimal predictor schemes