Optimal predictor schemes pass a Benford test
We formulate a concept analogous to Garrabrant’s irreducible pattern in the complexity theoretic language underlying optimal predictor schemes and prove a formal relation to Garrabrant’s original definition. We prove that optimal predictor schemes pass the corresponding version of the Benford test (namely, on irreducible patterns they are Δ-similar to a constant).
Results
All the proofs are given in the appendix.
Definition 1
Fix Δ an error space of rank 2. Consider (f,μ) a distributional estimation problem, α∈[0,1]. (f,μ) is called Δ(poly,log)-irreducible with expectation α when for any {0,1}-valued (poly,log)-bischeme ^A=(A,rA,aA) we have
$$\left|\mathrm{E}_{\mu^k \times U^{r_A(k,j)}}\left[(f-\alpha)\hat{A}^{kj}\right]\right| \in \Delta$$
Theorem 1
Fix Δ an error space of rank 2. Assume h:N2→N is a polynomial s.t. h−1∈Δ. Given t∈[0,1], define πkjh(t) to be t rounded within error h(k,j)−1. Consider (f,μ) a distributional estimation problem, α∈[0,1]. Define the (poly,log)-predictor scheme ^Pα,h by ^Pkjα,h(x):=πkjh(α). Then, (f,μ) is Δ(poly,log)-irreducible with expectation α if and only if ^Pα,h is a Δ(poly,log)-optimal predictor scheme for (f,μ).
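As a concrete illustration of the rounding and the constant scheme, here is a minimal Python sketch; the polynomial h below is a hypothetical choice, not one fixed by the theorem, and pi_h / P_alpha_h are illustrative names.

```python
from fractions import Fraction

def pi_h(t, k, j, h):
    # pi^{kj}_h(t): round t in [0,1] to the nearest multiple of 1/h(k,j),
    # so the result is within h(k,j)**-1 of t.
    denom = h(k, j)
    return Fraction(round(t * denom), denom)

# Hypothetical polynomial error bound h(k,j) = (k+2)(j+2), so h**-1 vanishes.
h = lambda k, j: (k + 2) * (j + 2)

def P_alpha_h(x, k, j, alpha):
    # The scheme ^P_{alpha,h} ignores its input x and outputs the rounded
    # constant pi_h(alpha) on every instance.
    return pi_h(alpha, k, j, h)
```

For example, with k = j = 0 the rounding grid has spacing 1/4, so 0.48 rounds to 1/2.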
Corollary 1
Consider Δ1, Δ2 error spaces of rank 2 s.t. Δ1⊆Δ2. Assume there is a polynomial h:N2→N s.t. h−1∈Δ2. Consider (f,μ) a distributional estimation problem, α∈[0,1] and ^P a Δ1(poly,log)-optimal predictor scheme for (f,μ). Suppose (f,μ) is Δ2(poly,log)-irreducible with expectation α. Then, ^Pμ≃Δ2α.
The following definition is adapted from Garrabrant with relatively minor modifications.
Definition 2
Denote U≤k the uniform probability measure on {0,1}≤k. Given r:N→N and A:{0,1}∗2alg−→{0,1}, denote mA,r(k):=∑x∈{0,1}≤kPrUr(|x|)[A(x,y)=1]. Fix t:N→N. f:{0,1}∗→[0,1] is called a t(poly)-irreducible pattern with expectation α∈[0,1] when for any polynomial p:N→N there is cp>0 s.t. for any W:{0,1}∗2alg−→{0,1}, if W runs within time p(t(|x|)) on any input (x,y) with |y|=p(t(|x|)), then
$$\forall k \in \mathbb{N}: m_{W,p(t)}(k) \geq 3 \implies \left|\mathrm{E}_{U^{\leq k} \times U^{p(t(k))}}\left[f(x) \mid W(x, y_{\leq p(t(|x|))}) = 1\right] - \alpha\right| \leq \frac{c_p |W| \sqrt{\log\log m_{W,p(t)}(k)}}{\sqrt{m_{W,p(t)}(k)}}$$
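For intuition, \(m_{A,r}(k)\) is the expected number of strings of length at most \(k\) that the randomized test accepts. A brute-force Python sketch, feasible only for tiny k and r (the acceptor A_even is a hypothetical stand-in, not Garrabrant's provability test):

```python
from itertools import product

def m(A, r, k):
    # m_{A,r}(k): the expected number of strings x of length <= k that the
    # randomized test A accepts, where the coin sequence y has length r(|x|).
    # Computed by exact enumeration over all x and y.
    total = 0.0
    for n in range(k + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            ys = ["".join(y) for y in product("01", repeat=r(n))]
            total += sum(A(x, y) for y in ys) / len(ys)
    return total

# Hypothetical deterministic acceptor: accept exactly the strings of even
# length (it ignores its random input y).
A_even = lambda x, y: 1 if len(x) % 2 == 0 else 0
```

For k = 2 this counts the empty string and the four strings of length 2, giving m = 5.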
Note 1
Definition 2 differs from Garrabrant’s original definition in several respects:
We consider an arbitrary function rather than the characteristic function of the set of provable sentences.
Instead of a single time bound, we consider a family of time bounds differing by a polynomial.
We allow W to be a random algorithm rather than requiring it to be deterministic.
Definition 3
Fix t:N→N. Δ2G,t is the set of bounded functions δ:N2→R≥0 for which there are c,d>0 s.t.
$$\forall k: \forall j \leq t(k): \delta(k,j)^d \leq c\,2^{-k}\log(k+2)(\log(j+2))^2$$
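Membership can be probed numerically on an initial segment of (k, j); a hedged Python sketch, where t and delta are illustrative choices and natural logarithms stand in for the unspecified log base:

```python
import math

def witnesses(delta, t, c, d, k_max):
    # Test whether (c, d) witness delta in Delta^2_{G,t} on the finite range
    # k <= k_max: delta(k,j)**d <= c * 2**-k * log(k+2) * log(j+2)**2 for all
    # j <= t(k). Evidence only, not a proof of membership.
    return all(
        delta(k, j)**d <= c * 2.0**-k * math.log(k + 2) * math.log(j + 2)**2
        for k in range(k_max + 1)
        for j in range(t(k) + 1)
    )

# Illustrative example: delta(k,j) = 2**-k with t(k) = k + 1; the witness
# d = 1 works once c is large enough.
t = lambda k: k + 1
delta = lambda k, j: 2.0**-k
```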
Proposition 1
Δ2G,t is an error space of rank 2.
Theorem 2
Assume t:N→N is s.t. for some polynomial p:N→N we have p(t(k))≥k. Consider f:{0,1}∗→[0,1], α∈[0,1]. Assume f is a t(poly)-irreducible pattern with expectation α. Then, (f,U) is Δ2G,t(poly,log)-irreducible with expectation α.
Definition 4
Denote Φ the set of functions ϕ:N→R≥0 s.t. limk→∞ϕ(k)=∞. Given ϕ∈Φ, define tϕ(k):=⌊2(logk)ϕ(k)⌋. Fix ϕ∈Φ. Δ2avg,ϕ is the set of bounded functions δ:N2→R≥0 s.t. for any ϕ′∈Φ, if ϕ′≤ϕ then
$$\lim_{k \to \infty} \frac{\sum_{j=2}^{t_{\phi'}(k)-1}\left(\log\log(j+1) - \log\log j\right)\delta(k,j)}{\log\log t_{\phi'}(k)} = 0$$
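The weights \(\log\log(j+1)-\log\log j\) telescope, so for a constant \(\delta \equiv c\) the expression equals \(c\,(1 - \log\log 2/\log\log t_{\phi'}(k))\), which tends to \(c\) rather than 0; nonzero constants are thus excluded from \(\Delta^2_{avg,\phi}\), as one would want. A small Python sketch of the inner expression (illustrative only, natural logarithms):

```python
import math

def loglog(x):
    return math.log(math.log(x))

def weighted_avg(delta, k, T):
    # The inner expression of Definition 4: sum over 2 <= j < T of
    # (loglog(j+1) - loglog(j)) * delta(k, j), normalized by loglog(T).
    num = sum((loglog(j + 1) - loglog(j)) * delta(k, j) for j in range(2, T))
    return num / loglog(T)
```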
Proposition 2
Δ2avg,ϕ is an error space of rank 2.
Note that Δ2avg=⋂ϕ∈ΦΔ2avg,ϕ.
Proposition 3
Consider ϕ∈Φ s.t. limk→∞2−k(logk)2ϕ(k)+1=0. Then Δ2G,tϕ⊆Δ2avg,ϕ.
Corollary 2
Fix ϕ∈Φ s.t. limk→∞2−k(logk)2ϕ(k)+1=0. Consider f:{0,1}∗→[0,1], α∈[0,1] and ^P a Δ2avg-optimal predictor scheme for (f,U). Suppose f is a tϕ(poly)-irreducible pattern with expectation α. Then ^PU≃Δ2avg,ϕα.
Note 2
It is possible to repeat this theory without randomness, thus relating the deterministic version of irreducible patterns to Δ(log)-optimal predictor schemes (which have logarithmic advice and no coin flips, or, equivalently, a logarithmic number of coin flips).
Appendix
We will refer to the previously established results about Δ(poly,log)-optimal predictor schemes by L.N where N is the number in the linked post. Thus Theorem 1 there becomes Theorem L.1 here and so on.
Proposition 4
Fix Δ an error space of rank 2. Consider (f,μ) a distributional estimation problem, α∈[0,1]. Suppose (f,μ) is Δ(poly,log)-irreducible with expectation α. Then, there is a Δ-moderate function δ:N4→[0,1] s.t. for any k,j,s∈N and A:{0,1}∗2alg−→{0,1}
$$\left|\mathrm{E}_{\mu^k \times U^s}\left[(f(x)-\alpha)A(x,y)\right]\right| \leq \delta\!\left(k, j, T^\mu_A(k,s), 2^{|A|}\right)$$
Proof of Proposition 4
Take δ to be
$$\delta(k,j,t,u) := \max_{\substack{A:\, T^\mu_A(k,s) \leq t \\ |A| \leq \log u}} \left|\mathrm{E}_{\mu^k \times U^s}\left[(f(x)-\alpha)A(x,y)\right]\right|$$
If δ is not Δ-moderate then there is a polynomial s:N2→N and a family of programs {Akj:{0,1}∗2alg−→{0,1}}k,j∈N s.t. TμAkj(k,s(k,j)) is bounded by a polynomial in k,j and |Akj| is logarithmically bounded in k,j but

$$\left|\mathrm{E}_{\mu^k \times U^{s(k,j)}}\left[(f(x)-\alpha)A^{kj}(x,y)\right]\right| \notin \Delta$$
We can unite the Akj into a single (poly,log)-predictor scheme ^A by providing them as advice for a universal program. This contradicts the assumption on (f,μ).
Proof of Theorem 1
If ^Pα,h is a Δ(poly,log)-optimal predictor scheme for (f,μ) then (f,μ) is Δ(poly,log)-irreducible with expectation α by Lemma L.B.3.
Assume (f,μ) is Δ(poly,log)-irreducible with expectation α. Consider ^Q=(Q,rQ,aQ) any Q∩[−1,+1]-valued (poly,log)-bischeme. By Lemma L.B.3, it is sufficient to prove that
$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| \in \Delta$$
We have
$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| = \left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\pi^{kj}_h(\alpha)-f)\hat{Q}\right]\right|$$

$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| \leq \left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\alpha-f)\hat{Q}\right]\right| + h(k,j)^{-1}$$

$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| \leq \left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\alpha-f)\,\pi^{kj}_h(\hat{Q})\right]\right| + 2h(k,j)^{-1}$$

$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| \leq \left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\alpha-f)\int_0^1 \theta\left(\hat{Q}-\pi^{kj}_h(t)\right)dt\right]\right| + 2h(k,j)^{-1}$$

$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| \leq \int_0^1 \left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\alpha-f)\,\theta\left(\hat{Q}-\pi^{kj}_h(t)\right)\right]\right| dt + 2h(k,j)^{-1}$$
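The last two steps use a step-function representation of the rounding; its unrounded prototype is the identity \(q = \int_0^1 \theta(q-t)\,dt\) for \(q \in [0,1]\), with \(\theta\) the Heaviside step. A quick numeric sanity check in Python (a sketch, not part of the proof):

```python
def theta(x):
    # Heaviside step: 1 on nonnegative input, else 0.
    return 1.0 if x >= 0 else 0.0

def integral_theta(q, n=100_000):
    # Midpoint Riemann sum for the integral of theta(q - t) over t in [0,1];
    # for q in [0,1] the exact value of the integral is q.
    return sum(theta(q - (i + 0.5) / n) for i in range(n)) / n
```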
Applying Proposition 4 to θ(^Q−πkj(t)), we conclude that there is δ∈Δ s.t.
$$\forall t \in [0,1]: \left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\alpha-f)\,\theta\left(\hat{Q}-\pi^{kj}_h(t)\right)\right]\right| \leq \delta(k,j)$$
Summing up
$$\left|\mathrm{E}_{\mu^k \times U^{r_Q(k,j)}}\left[(\hat{P}_{\alpha,h}-f)\hat{Q}\right]\right| \leq \delta(k,j) + 2h(k,j)^{-1}$$
Proof of Corollary 1
Trivially follows from Theorem 1 and Theorem L.A.7.
Proposition 5
Fix t:N→N. Assume δ:N2→R≥0 is bounded and c,d>0 are s.t.
$$\forall k: \forall j \leq t(k): \delta(k,j)^d \leq c\,2^{-k}\log(k+2)(\log(j+2))^2$$
Consider d′≥d. Then, there is c′>0 s.t.
$$\forall k: \forall j \leq t(k): \delta(k,j)^{d'} \leq c'\,2^{-k}\log(k+2)(\log(j+2))^2$$
Proof of Proposition 5
$$\delta^{d'} \leq (\sup\delta)^{d'-d}\,\delta^{d}$$
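The inequality holds pointwise because \(\delta^{d'-d} \leq (\sup\delta)^{d'-d}\) when \(d' \geq d\). A throwaway numeric spot-check in Python (the sample values are arbitrary):

```python
def prop5_holds(values, d, d_prime):
    # Pointwise check of v**d' <= (sup)**(d'-d) * v**d for a bounded delta
    # sampled at finitely many points, assuming d' >= d. The small additive
    # slack guards against floating-point rounding.
    sup = max(values)
    return all(v**d_prime <= sup**(d_prime - d) * v**d + 1e-12
               for v in values)
```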
Proof of Proposition 1
The only part that is not entirely obvious is additivity. Consider δ1,δ2∈Δ2G,t. Suppose d1,c1,d2,c2∈R>0 are s.t.
$$\forall k: \forall j \leq t(k): \delta_1(k,j)^{d_1} \leq c_1\,2^{-k}\log(k+2)(\log(j+2))^2$$

$$\forall k: \forall j \leq t(k): \delta_2(k,j)^{d_2} \leq c_2\,2^{-k}\log(k+2)(\log(j+2))^2$$
For sufficiently large \(d \in \mathbb{N}\), \((\delta_1+\delta_2)^d\) can be written as a sum of terms of the form \(\delta_1^{d_1'}\delta_2^{d_2'}\) with \(d_1' \geq d_1\), \(d_2' \geq d_2\). Applying Proposition 5, we conclude that there is \(c > 0\) s.t.

$$\forall k: \forall j \leq t(k): (\delta_1+\delta_2)^d \leq c\,4^{-k}(\log(k+2))^2(\log(j+2))^4$$

$$\forall k: \forall j \leq t(k): (\delta_1+\delta_2)^{d/2} \leq \sqrt{c}\,2^{-k}\log(k+2)(\log(j+2))^2$$
Proposition 6
For any t:N→N, n∈N, ϵ∈(0,2], M>0
$$\min\left(2^{-k}(\log(k+2))^n(\log(j+2))^{2-\epsilon},\, M\right) \in \Delta^2_{G,t}$$
Proof of Proposition 6
$$\left(2^{-k}(\log(k+2))^n(\log(j+2))^{2-\epsilon}\right)^{\frac{2}{2-\epsilon}} = \left(2^{-k}(\log(k+2))^n\right)^{\frac{2}{2-\epsilon}}(\log(j+2))^2$$

Since \(\frac{2}{2-\epsilon} > 1\), we can choose \(c > 0\) s.t. \(\left(2^{-k}(\log(k+2))^n\right)^{\frac{2}{2-\epsilon}} \leq c\,2^{-k}\). We get

$$\left(2^{-k}(\log(k+2))^n(\log(j+2))^{2-\epsilon}\right)^{\frac{2}{2-\epsilon}} \leq c\,2^{-k}(\log(j+2))^2$$
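The exponent bookkeeping here raises the expression to the power \(\frac{2}{2-\epsilon}\), which restores the exponent 2 on the \(\log(j+2)\) factor. A numeric spot-check in Python (values are arbitrary; natural logarithms):

```python
import math

def restores_square(a, j, eps):
    # (a * log(j+2)**(2-eps)) ** (2/(2-eps)) should coincide with
    # a ** (2/(2-eps)) * log(j+2)**2, as used in the proof of Proposition 6.
    L = math.log(j + 2)
    lhs = (a * L**(2 - eps))**(2 / (2 - eps))
    rhs = a**(2 / (2 - eps)) * L**2
    return abs(lhs - rhs) <= 1e-9 * max(1.0, rhs)
```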
Proof of Theorem 2
Consider ^A=(A,rA,aA) a {0,1}-valued (poly,log)-bischeme. Define {Wkj:{0,1}∗2alg−→{0,1}}k,j∈N by
$$W^{kj}(x,y) = \begin{cases}\hat{A}^{kj}(x, y_{\leq r_A(k,j)}) & \text{if } |x| = k\\ 0 & \text{if } |x| \neq k\end{cases}$$
Choose a polynomial p:N→N s.t. Wkj(x,y) runs within time p(t(|x|)) for j≤t(k), |x|≤k and |y|=p(t(|x|)), and s.t. rA(k,j)≤p(t(k)) for j≤t(k). We have
$$\forall k: \forall j \leq t(k): m_{W^{kj},p(t)}(k) \geq 3 \implies \left|\mathrm{E}_{U^{\leq k} \times U^{p(t(k))}}\left[f(x) \mid W^{kj}(x, y_{\leq p(t(|x|))}) = 1\right] - \alpha\right| \leq \frac{c_p |W^{kj}| \sqrt{\log\log m_{W^{kj},p(t)}(k)}}{\sqrt{m_{W^{kj},p(t)}(k)}}$$

Conditioned on \(W^{kj} = 1\), the string \(x\) has length exactly \(k\) and only the first \(r_A(k,j)\) bits of \(y\) are read; moreover \(|W^{kj}|\) is logarithmically bounded in \(k,j\). Hence there are \(c_1, c_2 > 0\) s.t.

$$\forall k: \forall j \leq t(k): m_{W^{kj},p(t)}(k) \geq 3 \implies \left|\mathrm{E}_{U^k \times U^{r_A(k,j)}}\left[f \mid \hat{A}^{kj} = 1\right] - \alpha\right| \leq \frac{(c_1\log(k+2)+c_2\log(j+2))\sqrt{\log\log m_{W^{kj},p(t)}(k)}}{\sqrt{m_{W^{kj},p(t)}(k)}}$$

Since \(W^{kj}\) vanishes off inputs of length \(k\), we have \(m_{W^{kj},p(t)}(k) = \Pr_{U^k \times U^{r_A(k,j)}}[\hat{A}^{kj}=1]\,2^k \leq 2^{k+1}\), and therefore \(\log\log m_{W^{kj},p(t)}(k) \leq \log(k+1)\). Substituting, and writing the conditional expectation as \(\mathrm{E}[(f-\alpha)\hat{A}^{kj}]/\Pr[\hat{A}^{kj}=1]\):

$$\forall k: \forall j \leq t(k): m_{W^{kj},p(t)}(k) \geq 3 \implies \frac{\left|\mathrm{E}_{U^k \times U^{r_A(k,j)}}\left[(f-\alpha)\hat{A}^{kj}\right]\right|}{\Pr_{U^k \times U^{r_A(k,j)}}[\hat{A}^{kj}=1]} \leq \frac{(c_1\log(k+2)+c_2\log(j+2))\sqrt{\log(k+1)}}{\sqrt{\Pr_{U^k \times U^{r_A(k,j)}}[\hat{A}^{kj}=1]\,2^k}}$$

Multiplying both sides by \(\Pr[\hat{A}^{kj}=1]\), using \(\Pr[\hat{A}^{kj}=1] \leq 1\), and squaring:

$$\forall k: \forall j \leq t(k): m_{W^{kj},p(t)}(k) \geq 3 \implies \mathrm{E}_{U^k \times U^{r_A(k,j)}}\left[(f-\alpha)\hat{A}^{kj}\right]^2 \leq 2^{-k}(c_1\log(k+2)+c_2\log(j+2))^2\log(k+1)$$
On the other hand
$$\forall k: \forall j \leq t(k): m_{W^{kj},p(t)}(k) < 3 \implies \Pr_{U^k \times U^{r_A(k,j)}}[\hat{A}^{kj}=1] < 3 \cdot 2^{-k}$$

Since \(\left|\mathrm{E}[(f-\alpha)\hat{A}^{kj}]\right| \leq \Pr[\hat{A}^{kj}=1] \leq 1\):

$$\forall k: \forall j \leq t(k): m_{W^{kj},p(t)}(k) < 3 \implies \mathrm{E}_{U^k \times U^{r_A(k,j)}}\left[(f-\alpha)\hat{A}^{kj}\right]^2 < 3 \cdot 2^{-k}$$
Combining the two cases
$$\forall k: \forall j \leq t(k): \mathrm{E}_{U^k \times U^{r_A(k,j)}}\left[(f-\alpha)\hat{A}^{kj}\right]^2 \leq 2^{-k}\max\left((c_1\log(k+2)+c_2\log(j+2))^2\log(k+1),\, 3\right)$$
Using Proposition 6, we conclude that
$$\left|\mathrm{E}_{U^k \times U^{r_A(k,j)}}\left[(f-\alpha)\hat{A}^{kj}\right]\right| \in \Delta^2_{G,t}$$
Proof of Proposition 2
Proven exactly the same way as for Δ2avg.
Proof of Proposition 3
Consider δ∈Δ2G,tϕ. Take c,d>0 s.t.
$$\forall k: \forall j \leq t_\phi(k): \delta(k,j)^d \leq c\,2^{-k}\log(k+2)(\log(j+2))^2$$
Given ϕ′∈Φ s.t. ϕ′≤ϕ, we have
$$\frac{\sum_{j=2}^{t_{\phi'}(k)-1}\left(\log\log(j+1)-\log\log j\right)\delta(k,j)^d}{\log\log t_{\phi'}(k)} \leq c\,2^{-k}\log(k+2)\,\frac{\sum_{j=2}^{t_{\phi'}(k)-1}\left(\log\log(j+1)-\log\log j\right)(\log(j+2))^2}{\log\log t_{\phi'}(k)}$$

Bounding each factor \((\log(j+2))^2\) by \((\log(t_{\phi'}(k)+1))^2\) and telescoping the remaining sum (it equals \(\log\log t_{\phi'}(k) - \log\log 2 \leq \log\log t_{\phi'}(k)\)):

$$\frac{\sum_{j=2}^{t_{\phi'}(k)-1}\left(\log\log(j+1)-\log\log j\right)\delta(k,j)^d}{\log\log t_{\phi'}(k)} \leq c\,2^{-k}\log(k+2)\left(\log(t_{\phi'}(k)+1)\right)^2$$

Since \(\log t_{\phi'}(k) \leq (\log k)^{\phi'(k)}\) and \(\phi' \leq \phi\), the right hand side is at most \(c\,2^{-k}(\log k)^{2\phi'(k)+1}\) up to an asymptotically negligible correction, and the assumption on \(\phi\) gives

$$\lim_{k\to\infty}\frac{\sum_{j=2}^{t_{\phi'}(k)-1}\left(\log\log(j+1)-\log\log j\right)\delta(k,j)^d}{\log\log t_{\phi'}(k)} = 0$$
We conclude that δd∈Δ2avg,ϕ and therefore δ∈Δ2avg,ϕ.
Proof of Corollary 2
Follows trivially from Theorem 2, Proposition 3 and Corollary 1.