Optimal predictors for global probability measures

There are two commonly used formalisms in average-case complexity theory: problems with a single global probability measure on the parameter space, and problems with a family of such probability measures. Previous posts about optimal predictors focused on the family approach. In this post we give the formalism for global measures.

Results

New notation

will denote the one-element set.

Given a probability measure on , a set, , stands for the maximal runtime of for , .

Definition 1

A unidistributional estimation problem is a pair where is a probability measure on and is an arbitrary function.

Definition 2

Given appropriate sets , , consider , a polynomial and . The triple is called a -scheme of signature when

(i) The runtime of is bounded by with polynomial.

(ii) for some .

A -scheme of signature will also be called a -predictor.

Given , , will denote the -valued random variable where is sampled from the probability measure . We will also use the notation .


Fix an error space of rank 1.

Definition 3

Consider a unidistributional estimation problem and a -predictor. is called a -optimal predictor for when for any -predictor , there is s.t.
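For concreteness, the condition after "s.t." has the same shape as in the multidistributional setting of the previous posts. The following is a schematic rendering only; the predictor symbols, the coin-toss argument and the squared-error loss are assumptions rather than quotations:

```latex
% Schematic optimality condition (assumed form): P is the candidate
% predictor, Q an arbitrary competing Gamma-predictor, r the internal
% coin tosses, and delta a function in the error space Gamma.
\operatorname{E}_{x \sim \mu,\ r}\!\left[\left(P^{k}(x,r) - f(x)\right)^{2}\right]
\;\le\;
\operatorname{E}_{x \sim \mu,\ r}\!\left[\left(Q^{k}(x,r) - f(x)\right)^{2}\right]
\;+\; \delta(k)
```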


-optimal predictors for unidistributional problems have properties and existence theorems analogous to those of -optimal predictor schemes, where the role of the rank 2 error space is taken by the rank 1 error space (see Definition 8). The theorems are listed below. The proofs of Theorem 4, Theorem 8 (which is stated in a stronger form than its previous analogues) and Theorem 10 are given in the appendix. Adapting the other proofs is straightforward.

Theorem 1

Suppose . Consider a unidistributional estimation problem and a -optimal predictor for . Suppose , are s.t.

Define

Assume that either have a number of digits logarithmically bounded in , or produces outputs with a number of digits logarithmically bounded in (by Theorem A.7, if any -optimal predictor exists for , then a -optimal predictor with this property exists as well). Then, .

Theorem 2

Consider a probability measure on and s.t. . Suppose is a -optimal predictor for and is a -optimal predictor for . Then is a -optimal predictor for .

Theorem 3

Consider a probability measure on and s.t. . Suppose is a -optimal predictor for and is a -optimal predictor for . Then, is a -optimal predictor for .

Definition 4

Fix an error space of rank 1. A probability measure on is called -sampleable when there is a -scheme of signature such that

is called a -sampler for .

Definition 5

Consider an error space of rank 1. A unidistributional estimation problem is called -generatable when there is a -scheme of signature such that

(i) is a -sampler for .

(ii)

is called a -generator for .
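To illustrate Definitions 4 and 5, here is a minimal Python sketch of a sampler/generator pair for a toy problem; the class and method names are illustrative assumptions, and the error-space conditions hold trivially because this toy generator is exact:

```python
import random

class UniformParityGenerator:
    """Toy sketch of Definitions 4 and 5: mu_k is the uniform measure
    on {0,1}^k and f(x) is the parity of x. All names here are
    illustrative, not the post's notation."""

    def sample(self, k: int) -> str:
        # Sampler: draw x ~ mu_k using k coin tosses.
        return "".join(random.choice("01") for _ in range(k))

    def generate(self, k: int) -> tuple[str, float]:
        # Generator: return a pair (x, y) with y approximating f(x).
        # Here y = f(x) exactly, so condition (ii) holds with zero error.
        x = self.sample(k)
        y = float(x.count("1") % 2)
        return x, y
```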

Theorem 4

Consider , unidistributional estimation problems with respective -optimal predictors and . Assume is -sampleable and is -generatable. Define by . Then, is a -optimal predictor for .

Theorem 5

Consider a probability measure on and , . Assume is a -optimal predictor for and is a -optimal predictor for . Define by . Then is a -optimal predictor for .

Theorem 6

Fix a polynomial s.t. . Consider a probability measure on , and non-empty. Assume is a -optimal predictor for and is a -optimal predictor for . Define by

Then, is a -optimal predictor for .

Definition 6

Consider a probability measure on and , -predictors. We say is -similar to relative to (denoted ) when .

Theorem 7

Consider a unidistributional estimation problem, a -optimal predictor for and a -predictor. Then, is a -optimal predictor for if and only if .

Definition 7

Consider , unidistributional estimation problems, a -scheme of signature . is called a -pseudo-invertible reduction of to when the following conditions hold:

(i)

(ii)

(iii) There is and a -scheme of signature s.t.

(iv) There is a -scheme of signature s.t.

Such is called a -pseudo-inverse of .

Note 1

The conditions of Definition 7 are weaker than the corresponding definitions in previous posts, in the sense that exact equalities were replaced by approximate equalities with error bounds related to . However, analogous relaxations can be made in the “multidistributional” theory too.

Theorem 8

Suppose . Consider , unidistributional estimation problems, a -pseudo-invertible reduction of to and a -optimal predictor for . Define by . Then, is a -optimal predictor for .

Definition 8

is the set of bounded functions s.t.

It is easily seen that is a rank 1 error space.

Theorem 9

Consider a unidistributional estimation problem. Define by

Define by

Then, is a -optimal predictor for .

Theorem 10

There is an oracle machine that accepts an oracle of signature and a polynomial , where the allowed oracle calls are for , and computes a function of signature s.t. for any unidistributional estimation problem and corresponding -generator, is a -optimal predictor for .


The following is the description of . Consider and a polynomial . We describe the computation of , where the extra argument of is regarded as internal coin tosses.

We loop over the first words in lexicographic order, interpreting each word as a program . For each program we loop over “test runs”. At test run , we generate by evaluating for sampled from . We then sample from and compute . At the end of the test runs, we compute the average error . At the end of the loop over programs, the program with the lowest error is selected and its output is produced.
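As a sketch, this search can be phrased in Python as follows, assuming a generator oracle in the style of the toy class above; the helper run_program and the parameters n_programs and n_test_runs stand in for the elided bounds and are assumptions, not the post's notation:

```python
def optimal_predictor_search(generator, run_program, k: int,
                             n_programs: int, n_test_runs: int,
                             x: str) -> float:
    """Sketch of the oracle machine M: empirical risk minimization
    over the first n_programs words interpreted as programs, scored
    on samples drawn from the generator oracle."""
    best_program, best_error = 0, float("inf")
    # Loop over the first n_programs words in lexicographic order.
    for program in range(n_programs):
        total_error = 0.0
        # Test runs: draw (sample, label) pairs from the generator
        # and score this program's prediction against the label.
        for _ in range(n_test_runs):
            sample, label = generator.generate(k)
            prediction = run_program(program, k, sample)
            total_error += (prediction - label) ** 2
        average_error = total_error / n_test_runs
        # Keep the program with the lowest average empirical error.
        if average_error < best_error:
            best_program, best_error = program, average_error
    # Produce the selected program's output on the actual input x.
    return run_program(best_program, k, x)
```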

Appendix

Definition 9

Given , a function is called -moderate when

(i) is non-decreasing in arguments to .

(ii) For any collection of polynomials ,

Lemma 1

Fix a unidistributional estimation problem and a -predictor. Then, is -optimal iff there is a -moderate function s.t. for any ,

The proof is analogous to the case of -optimal predictor schemes and we omit it.

Lemma 2

Suppose . Fix a unidistributional estimation problem and a corresponding -optimal predictor. Consider a -predictor, and a -scheme of signature . Assume . Then there is s.t.

The proof is analogous to the case of -optimal predictor schemes and we omit it.

Lemma 3

Consider a unidistributional estimation problem and a -optimal predictor for . Then there are and a -moderate function s.t. for any ,

Conversely, consider and a -scheme of signature . Suppose that for any -scheme of signature we have

Define to be a -predictor s.t. computing is equivalent to computing rounded to digits after the binary point, where . Then, is a -optimal predictor for .

The proof is analogous to the case of -optimal predictor schemes and we omit it.
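The rounding step in the construction above can be made concrete; a minimal sketch, with the function name assumed:

```python
def round_binary(value: float, digits: int) -> float:
    """Round value to the given number of binary digits after the
    binary point, i.e. to the nearest multiple of 2**(-digits)
    (ties go to the even multiple, Python's default)."""
    scale = 2 ** digits
    return round(value * scale) / scale
```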

Lemma 4

Consider a probability measure on . Suppose is a -sampler for . Then, there is s.t. for any bounded function

Proof of Lemma 4

Since is a -sampler, we get the desired result.

Lemma 5

Consider a family of sets and a family of probability measures on . Denote . Consider a function and a bounded function. Suppose that

Then it follows that

Proof of Lemma 5

Proposition 1

Consider a unidistributional estimation problem, a -generator for and a bounded function. Then

Proof of Proposition 1

By Lemma 4 we have

By property (ii) of generators and Lemma 5 we have

Combining the two we get the desired result.

Proof of Theorem 4

We have

Therefore, for any -scheme of signature

By Lemma 3, it is sufficient to show an appropriate bound for each of the terms on the right hand side. Suppose is a -generator for . Applying Proposition 1 to the first term, we get

where .

Applying Lemma 3 for , we get

where .

Suppose is a -sampler for . Applying Lemma 4 to the second term, we get

where .

Applying Lemma 3 for , we get

where . Again, we get the required bound.

Lemma 6

Consider a family of sets and a family of probability measures on . Denote . Consider bounded functions and a uniformly bounded family of functions indexed by some set . Suppose that

Then there is s.t.

Proof of Lemma 6

Proposition 2

Consider , unidistributional estimation problems. Suppose is a -pseudo-invertible reduction of to and is its -pseudo-inverse. Then there is s.t. for any bounded function

Proof of Proposition 2

Denote . According to the defining property of

where . Therefore

Proof of Theorem 8

Consider a -predictor. Let be the -predictor defined by

Applying Lemma 2 we get

where .

Using the defining property of we can apply Lemma 5 to the left-hand side and get

where . Using property (ii) of pseudo-invertible reductions, we get

Using the definition of and Lemma 6 applied via property (i) of pseudo-invertible reductions, we get

where .

Using the defining property of , Lemma 5 and property (ii) of pseudo-invertible reductions on the right-hand side, we get

where . Applying Proposition 2

where . Applying Lemma 6 via property (i) of pseudo-invertible reductions

Putting everything together

for .

Proposition 3

Consider and a non-increasing function. Suppose that

Then

Proof of Proposition 3

Assume to the contrary that there is and an unbounded sequence s.t.

Define . For any we have

Therefore

Since we can choose infinitely many non-overlapping intervals of the form , we reach the contradiction
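For concreteness, here is the interval computation in its standard form; the names below (a non-increasing integrable g, a threshold epsilon, and the unbounded sequence x_j) are assumptions matching the proof's structure, not the post's notation:

```latex
% Assumed reading: g is non-increasing with \int_0^\infty g(x)\,dx < \infty,
% while x_j\, g(x_j) \ge \epsilon along an unbounded sequence x_j.
% Monotonicity gives, on each interval [x_j/2,\, x_j]:
\int_{x_j/2}^{x_j} g(x)\,dx \;\ge\; \frac{x_j}{2}\, g(x_j) \;\ge\; \frac{\epsilon}{2}
% Choosing infinitely many pairwise disjoint such intervals then forces
\int_0^\infty g(x)\,dx \;\ge\; \sum_j \int_{x_j/2}^{x_j} g(x)\,dx \;=\; \infty
% contradicting integrability.
```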

Proposition 4

Consider a polynomial . There is a function s.t.

(i)

(ii) For any non-increasing function we have

Proof of Proposition 4

Given polynomials s.t. for , the proposition for implies the proposition for by setting

Therefore, it is enough to prove the proposition for polynomials of the form for .

We have

For any

Since we can choose satisfying condition (i) so that

It follows that

Applying Proposition 3, we get the desired result.

Lemma 7

Consider a unidistributional estimation problem, , -predictors. Suppose a polynomial and are s.t.

Then there is s.t.

Proof of Lemma 7

Define .

By Proposition 4 we have

Proposition 5

Consider . Define . Then, .

Proof of Proposition 5

Suppose is s.t. . We claim that . Assume to the contrary that for some we have an unbounded sequence s.t. . We can then choose s.t. and . But this implies , which is a contradiction.

Proof of Theorem 10

Consider a -predictor. Choose a non-constant polynomial s.t. evaluating involves running until it halts “naturally” (such exists because runs in at most polynomial time and has at most logarithmic advice). Given , consider the execution of . The standard deviation of with respect to the internal coin tosses of is at most . Applying Lemma 6 followed by Lemma 4, the expectation value is where for . By Chebyshev’s inequality,

Hence

The standard deviation of for any is also at most . The expectation value is where . Therefore

The extra factor comes from summing probabilities over programs. Combining the above, we get

By Proposition 5, we can assume is non-increasing without loss of generality. Therefore

Applying Lemma 7 we get the desired result.
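Schematically, the probabilistic core of this proof is a Chebyshev bound for a single program's empirical error followed by a union bound over the candidate programs; the letters below (empirical error \hat{e}, its mean e, standard deviation \sigma, threshold t, and the number N of programs examined) are illustrative:

```latex
% Chebyshev's inequality for one program's empirical average error:
\Pr\left[\,|\hat{e} - e| > t\,\right] \;\le\; \frac{\sigma^{2}}{t^{2}}
% Union bound over the N candidate programs (the "extra factor" above):
\Pr\left[\,\exists\ \text{program with } |\hat{e} - e| > t\,\right]
\;\le\; N\, \frac{\sigma^{2}}{t^{2}}
```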