Nice! I think I believe your claim, and I would like to chat with you to verify stuff and talk about future directions.
I have thought about algorithms very similar to this, and using such an algorithm got an M which is either good, or bad in the first sense while outputting probabilities converging to 2/3, or bad in the second sense while outputting probabilities converging to 1/3. I had thought that if epsilon was shrinking quickly enough to keep bad1 from going to infinity, it would be shrinking so quickly that you could get locked in the while loop with bad2 increasing. I don't think I actually checked this claim carefully, so I guess maybe I was wrong.
If this algorithm works as claimed, I wonder if you can extend it to three advisors (which may not be constant).
Ah, I think I can stymie M with 2 nonconstant advisors. Namely, let $A_1(n)=\frac12-\frac1{n+3}$ and $A_2(n)=\frac12+\frac1{n+3}$. We (setting up an adversarial E) precommit to setting $E(n)=0$ if $p(n)\ge A_2(n)$ and $E(n)=1$ if $p(n)\le A_1(n)$; now we can assume that M always chooses $p(n)\in[A_1(n),A_2(n)]$, since this is better for M.
Now define $b'_i(j)=|A_i(j)+E(j)-1|-|p(j)+E(j)-1|$ and $b_i(n)=\sum_{j<n}b'_i(j)$. Note that if we also define $\mathrm{bad}_i(n)=\sum_{j<n}\bigl(\log|A_i(j)+E(j)-1|-\log|p(j)+E(j)-1|\bigr)$ then, writing $\mathrm{bad}'_i(j)$ for the $j$-th summand, $\sum_{j<n}|2b'_i(j)-\mathrm{bad}'_i(j)|\le\sum_{j<n}\bigl(2A_1(j)-1-\log(2A_1(j))\bigr)=\sum_{j<n}O\bigl((\tfrac12-A_1(j))^2\bigr)$ is bounded; therefore if we can force $b_1(n)\to\infty$ or $b_2(n)\to\infty$ then we win.
Let's reparametrize by writing $\delta(n)=A_2(n)-A_1(n)=\frac2{n+3}$ and $q(n)=\frac{p(n)-A_1(n)}{\delta(n)}$, so that $b'_i(j)=\delta(j)\bigl(|i-2+E(j)|-|q(j)-1+E(j)|\bigr)$.
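To make this concrete, here is a minimal sketch of the setup in code (the helper names are mine, not part of the construction):

def A1(n): return 0.5 - 1.0 / (n + 3)    # first advisor
def A2(n): return 0.5 + 1.0 / (n + 3)    # second advisor
def delta(n): return A2(n) - A1(n)       # = 2 / (n + 3)

def precommitted_E(n, p_n):
    # the precommitment: punish any p(n) outside [A1(n), A2(n)]
    if p_n >= A2(n): return 0
    if p_n <= A1(n): return 1
    return None   # inside the interval, E(n) is chosen by the round logic below

def q(n, p_n):
    # position of p(n) within [A1(n), A2(n)], rescaled to [0, 1]
    return (p_n - A1(n)) / delta(n)

def b_prime(i, n, p_n, E_n):
    # per-step gap b'_i(n) = |A_i(n) + E(n) - 1| - |p(n) + E(n) - 1|
    A = A1(n) if i == 1 else A2(n)
    return abs(A + E_n - 1) - abs(p_n + E_n - 1)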
Now, similarly to how M worked for constant advisors, let's look at the problem in rounds: let $s_0=0$, and $s_n=\lfloor\exp(s_{n-1}-1)\rfloor+1$ for $n>0$. When determining $E(s_{n-1}),\dots,E(s_n-1)$, we can look at $p(s_{n-1}),\dots,p(s_n-1)$. Let $t_n=\lfloor b_2(s_n)-\frac1n\rfloor$. Let's set all of $E(s_{n-1}),\dots,E(s_n-1)$ to 1 if $\sum_{j=s_{n-1}}^{s_n-1}\delta(j)(1-q(j))\ge1$; otherwise we'll do something more complicated, but maintain the constraint that $b_2(s_n)\ge b_2(s_{n-1})-\frac1{n(n-1)}\ge t_{n-1}+\frac1n$: this guarantees that $t_n$ is nondecreasing and that $\liminf_{j\to\infty}b_2(j)\ge\lim_{n\to\infty}t_n$.
If $t_n\to\infty$ then $b_2(n)\to\infty$ and we win. Otherwise, let $t=\lim_{n\to\infty}t_n$, and consider $n$ such that $t_{n-1}=t$.
We have $\sum_{j=s_{n-1}}^{s_n-1}\delta(j)(1-q(j))<1$. Let $J\subseteq\{s_{n-1},\dots,s_n-1\}$ be a set of indices with $q(j)\ge q(j')$ for all $j\in J$, $j'\notin J$, that is maximal under the constraint that $\sum_{j\in J}\delta(j)(1-q(j))\le\frac1{n(n-1)}$; thus we will still have $\sum_{j\in J}\delta(j)(1-q(j))\ge\frac1{n(n-1)}-\delta(s_{n-1})$. We shall set $E(j)=0$ for all $j\in J$.
By the definition of $J$: $\sum_{j\in J}b'_1(j)=\sum_{j\in J}\delta(j)q(j)\ge\Bigl(\sum_{j\in J}\delta(j)(1-q(j))\Bigr)\dfrac{\sum_{j=s_{n-1}}^{s_n-1}\delta(j)q(j)}{\sum_{j=s_{n-1}}^{s_n-1}\delta(j)(1-q(j))}\ge\Bigl(\frac1{n(n-1)}-\delta(s_{n-1})\Bigr)\dfrac{\sum_{j=s_{n-1}}^{s_n-1}\delta(j)-1}{1}\ge\Bigl(\frac1{n(n-1)}-\delta(s_{n-1})\Bigr)\Bigl(2\log\frac{s_n+3}{s_{n-1}+3}-1\Bigr)\ge2$ if $n\gg0$.
For $j'\notin J$, we'll proceed iteratively, greedily minimizing $\bigl\|\sum_{j'=s_{n-1}}^{j}\mathbf1_{j'\notin J}\,(b'_1(j'),b'_2(j'))\bigr\|$. Then: $\min_{s_{n-1}\le j<s_n}\sum_{j'=s_{n-1}}^{j}\mathbf1_{j'\notin J}\,b'_1(j')\ge-\sqrt{\sum_{j=s_{n-1}}^{s_n-1}\delta(j)^2}=-2\sqrt{\sum_{j=s_{n-1}+3}^{s_n+2}\frac1{j^2}}\ge-2\sqrt{\sum_{j=s_{n-1}+3}^{s_n+2}\Bigl(\frac1{j-1}-\frac1j\Bigr)}\ge-\frac2{\sqrt{s_{n-1}+2}}\ge-1$ if $n\gg0$.
Keeping this constraint, we can flip (or not flip) all the $E(j')$s for $j'\notin J$ so that $\sum_{j'=s_{n-1}}^{s_n-1}\mathbf1_{j'\notin J}\,b'_2(j')>0$. Then we have $b_2(s_n)\ge b_2(s_{n-1})-\frac1{n(n-1)}$, $b_1(s_n)-b_1(s_{n-1})=\sum_{j=s_{n-1}}^{s_n-1}(\mathbf1_{j\in J}+\mathbf1_{j\notin J})\,b'_1(j)\ge2-1=1$ if $n\gg0$, and for $s_{n-1}\le j\le s_n$, $b_1(j)\ge b_1(s_{n-1})+\sum_{j'=s_{n-1}}^{j-1}\mathbf1_{j'\notin J}\,b'_1(j')\ge b_1(s_{n-1})-1$ if $n\gg0$. Therefore $b_1(j)\to\infty$, so we win.
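Here is a hedged sketch of one round ($n\ge2$) of this adversary, reusing the helpers above; again the function names are mine, and the tie case run2 = 0 is glossed over:

from math import exp, floor, hypot

def round_boundary(s_prev):
    # next round boundary: s_n = floor(exp(s_{n-1} - 1)) + 1
    return floor(exp(s_prev - 1)) + 1

def adversary_round(n, js, p):
    # js: the indices s_{n-1}, ..., s_n - 1 of round n; p: M's predictions,
    # assumed already clamped into [A1(j), A2(j)]; returns {j: E(j)}
    qs = {j: q(j, p[j]) for j in js}
    if sum(delta(j) * (1 - qs[j]) for j in js) >= 1:
        return {j: 1 for j in js}          # E = 1 everywhere: b2 gains >= 1
    # J: the indices with the largest q(j), maximal subject to the budget
    budget = 1.0 / (n * (n - 1))
    J, spent = set(), 0.0
    for j in sorted(js, key=lambda j: -qs[j]):
        cost = delta(j) * (1 - qs[j])
        if spent + cost > budget:
            break
        J.add(j)
        spent += cost
    E = {j: 0 for j in J}                  # on J, b1 gains sum of delta * q
    # outside J, choose E(j) greedily to keep the running (b1, b2) vector short
    run1 = run2 = 0.0
    for j in js:
        if j in J:
            continue
        E[j] = min((0, 1), key=lambda e:
                   hypot(run1 + b_prime(1, j, p[j], e),
                         run2 + b_prime(2, j, p[j], e)))
        run1 += b_prime(1, j, p[j], E[j])
        run2 += b_prime(2, j, p[j], E[j])
    if run2 <= 0:                          # flipping all non-J choices negates
        for j in set(E) - J:               # both sums, preserving the norm bound
            E[j] = 1 - E[j]
    return E

The final flip works because b_prime(i, j, p, 1) = -b_prime(i, j, p, 0), so a global flip of the non-J choices negates both running sums without changing their norm.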
I don't yet know whether I can extend it to two nonconstant advisors, but I do know I can extend it to a countably infinite number of constant-prediction advisors. Let $(P_i)_{i=0,\dots}$ be an enumeration of their predictions that contains each one an infinite number of times. Then:
from itertools import count
from math import exp, floor, log

def M(p, E, P):
    # p: M's predictions (written here); E: the environment's bits,
    # readable up through the rounds already played; P: the advisors'
    # constant predictions, each value appearing infinitely often
    prev, this, next = 0, 0, 1

    def bad(i):
        return sum(log(abs((E[k] + P[i] - 1) /
                           (E[k] + p[k] - 1)))
                   for k in range(prev))

    for k in range(this, next): p[k] = 0.5
    prev, this, next = this, next, floor(exp(next - 1)) + 1
    for i in count():
        for k in range(this, next): p[k] = P[i]
        prev, this, next = this, next, floor(exp(next - 1)) + 1
        # bad(i) is now up to date through E[:this], not just E[:prev]
        bound = 2 * bad(i)
        for j in count():
            if P[j] == P[i]: continue
            flip = P[j] < P[i]
            p1, p2 = abs(P[i] - flip), abs(P[j] - flip)
            for k in range(this, next): p[k] = abs(p1 - flip)
            prev, this, next = this, next, floor(exp(next - 1)) + 1
            if bad(i) <= 0: break
            while bad(i) > 0 and bad(j) > 0:
                # won't let bad(i) surpass bound
                eps = (bound - bad(i)) / 2 / abs(1 - p1 - flip) / (next - this)
                # that eps is just for early iterations of the inner loop;
                # in the limit, eps should be just enough for bad(i) to go
                # halfway to bound if we let p = abs(p1 + eps - flip):
                while (eps >= 1 - p1 or
                       bound <= bad(i) + (next - this) *
                                log((1 - p1) / (1 - p1 - eps))):
                    eps /= 2
                for k in range(this, next): p[k] = abs(p1 + eps - flip)
                prev, this, next = this, next, floor(exp(next - 1)) + 1
                for k in range(this, next): p[k] = abs(p1 - flip)
                # this is where the P[i] + d * eps block enters bad(i)
Consider $q=\dfrac{\log(1-p_1)-\log(1-p_2)}{\log(1-p_1)-\log(1-p_2)+\log(p_2)-\log(p_1)}$. This $q$ is the probability between p1 and p2 such that if E[k] is chosen with probability $|q-\mathrm{flip}|$ then that will have an equal impact on bad(i) and bad(j). Now consider some $q'$ between p1 and $q$.
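A quick numeric sanity check of that equal-impact property (my own illustration, with arbitrarily chosen p1 and p2 and the flip = 0 case):

from math import log

def equal_impact_q(p1, p2):
    # the q between p1 and p2 at which E[k] ~ Bern(q) hurts the two
    # advisors' log scores equally
    A = log(1 - p1) - log(1 - p2)
    B = log(p2) - log(p1)
    return A / (A + B)

p1, p2 = 0.3, 0.6
qq = equal_impact_q(p1, p2)
for p in (p1, p2):
    # expected per-step log score under E[k] ~ Bern(qq)
    print(qq * log(p) + (1 - qq) * log(1 - p))   # both lines print the same value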
Every iteration where mean(|E[prev:this] − flip|) ≤ q′ will decrease bad(j) by a positive quantity that's at least linear in this − prev, so (at least after the first few such iterations) the total decrease will exceed $\mathrm{prev}\cdot\bigl(-\log\max_{k:\,P_k\text{ has been reached}}\max(P_k,1-P_k)\bigr)>\mathrm{bad}(j)$, so it will turn bad(j) negative. If this happens for all j, then M cannot be bad for E. If it doesn't, then let's look at the first j where it doesn't. After a finite number of iterations, every iteration must have mean(|E[prev:this] − flip|) > q′. However, this will cause bad(i) to decrease by a positive quantity that's at least proportional to bound − bad(i); therefore, after a finite number of such iterations, we must reach bad(i) < 0. So if M is bad for E, then for each value of i we will eventually make bad(i) < 0 and then move on to the next value of i. This implies M is not bad for E.
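For concreteness, one hypothetical way to build an enumeration that contains each constant prediction an infinite number of times, as the construction requires:

from itertools import count

def repeating_enumeration(pred):
    # pred(k): the k-th advisor's constant prediction
    # triangular ordering P_0; P_0, P_1; P_0, P_1, P_2; ... visits
    # every prediction infinitely often
    for m in count(1):
        for k in range(m):
            yield pred(k)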
Emboldened by this, we can also consider the problem of building an M that isn’t outperformed by any constant advisor. However, this cannot be done, according to the following handwavy argument:
Let $q$ be some incompressible number, and let $E(i)\overset{\text{iid}}{\sim}\mathrm{Bern}(q)$. When computing $p(n)$, M can't do appreciably better than Laplace's law of succession, which will give it standard error $\sqrt{\frac{q(1-q)}{\log n}}$, and relative badness $\sim\frac{q(1-q)}{\log n}\bigl(\frac1q+\frac1{1-q}\bigr)=\frac1{\log n}$ (relative to the $q$-advisor) on average. For $i\le n$ and $n\gg0$, the greatest deviation of the badness from the $\sum_{j=2}^i\frac1{\log j}\ge\frac{i-1}{\log i}$ trend is $\approx\sqrt{2n\log\log(n)\,q(1-q)}$ (according to the law of the iterated logarithm), which isn't enough to counteract the expected badness; therefore the badness will converge to infinity.
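As a toy illustration of that drift (my own check, and only of the trend: it simulates plain Laplace's rule, whose expected relative badness grows like (1/2) log n, rather than the log(n)-rate error model assumed above; q = 0.62 just stands in for an incompressible number):

import random
from math import log

def laplace_badness(q_true, n, seed=0):
    # cumulative badness of Laplace's rule relative to the constant
    # q-advisor on E(i) ~ iid Bern(q_true); grows like (1/2) * log(n)
    # in expectation
    rng = random.Random(seed)
    heads, bad = 0, 0.0
    for i in range(n):
        p_hat = (heads + 1) / (i + 2)      # Laplace's law of succession
        e = 1 if rng.random() < q_true else 0
        bad += log(abs(e + q_true - 1)) - log(abs(e + p_hat - 1))
        heads += e
    return bad

for n in (10**3, 10**4, 10**5):
    print(n, laplace_badness(0.62, n))     # slow, unbounded upward drift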