every iteration where mean(𝙴[𝚙𝚛𝚎𝚟:𝚝𝚑𝚒𝚜])≥2/5 will cause bound—bad1() to grow exponentially (by a factor of 11/10=1+(1/2)(−1+2/5𝚙𝟷))
a little more? I don’t follow. (I think I follow the overall structure of the proof, and if I believed this step I would believe the proof.)
We have that eps is about (2/3)(1-exp([bad1() - bound]/(next-this))), or at least half that, but I don’t see how to get a lower bound on the decrease of bad1() (as a fraction of bound-bad1() ).
You are correct that you use the fact that 1+eps is at approximately e^(eps).
The concrete way this is used in this proof is replacing the ln(1+3eps) you subtract from bad1 when the environment is a 1 with 3eps=(bound—bad1) / (next—this), and replacing the ln(1-3eps/2) you subtract from bad1 when the environment is a 0 with −3eps/2=-(bound—bad1) / (next—this)/2
Therefore, you subtract from bad1 approximately at least (next-this)((2/5)(bound—bad1) / (next—this)-(3/5)*(bound—bad1) / (next—this)/2).
This comes out to (bound—bad1)/10.
I believe the inequality is the wrong direction to just use e^(eps) as a bound for 1+eps, but when next-this gets big, the approximation gets close enough.
which, using the log(1+x) = x approximation, is about
(1/3) ([bound—bad1()] / [next—this] ).
Then Scott’s comment gives the rest. I was worried about the fact that we seem to be taking the exponential of the error in our approximation, or something. But Scott points out that this is not an issue because we can make [next-this] as big as we want, if necessary, without increasing bad1() at all, by guessing p1 for a very long time until [bound—bad1()] / [next—this]] is close enough to zero that the error is too small to matter.
Could you spell out the step
every iteration where mean(𝙴[𝚙𝚛𝚎𝚟:𝚝𝚑𝚒𝚜])≥2/5 will cause bound—bad1() to grow exponentially (by a factor of 11/10=1+(1/2)(−1+2/5𝚙𝟷))
a little more? I don’t follow. (I think I follow the overall structure of the proof, and if I believed this step I would believe the proof.)
We have that eps is about (2/3)(1-exp([bad1() - bound]/(next-this))), or at least half that, but I don’t see how to get a lower bound on the decrease of bad1() (as a fraction of bound-bad1() ).
You are correct that you use the fact that 1+eps is at approximately e^(eps).
The concrete way this is used in this proof is replacing the ln(1+3eps) you subtract from bad1 when the environment is a 1 with 3eps=(bound—bad1) / (next—this), and replacing the ln(1-3eps/2) you subtract from bad1 when the environment is a 0 with −3eps/2=-(bound—bad1) / (next—this)/2
Therefore, you subtract from bad1 approximately at least (next-this)((2/5)(bound—bad1) / (next—this)-(3/5)*(bound—bad1) / (next—this)/2).
This comes out to (bound—bad1)/10.
I believe the inequality is the wrong direction to just use e^(eps) as a bound for 1+eps, but when next-this gets big, the approximation gets close enough.
In case anyone shared my confusion:
The while loop where we ensure that eps is small enough so that
bound > bad1() + (next—this) * log((1 - p1) / (1 - p1 - eps))
is technically necessary to ensure that bad1() doesn’t surpass bound, but it is immaterial in the limit. Solving
bound = bad1() + (next—this) * log((1 - p1) / (1 - p1 - eps))
gives
eps >= (1/3) (1 - e^{ -[bound—bad1()] / [next—this]] })
which, using the log(1+x) = x approximation, is about
(1/3) ([bound—bad1()] / [next—this] ).
Then Scott’s comment gives the rest. I was worried about the fact that we seem to be taking the exponential of the error in our approximation, or something. But Scott points out that this is not an issue because we can make [next-this] as big as we want, if necessary, without increasing bad1() at all, by guessing p1 for a very long time until [bound—bad1()] / [next—this]] is close enough to zero that the error is too small to matter.