I agree with Psy-Kosh: I don’t think the proofs are comparable.
The difference between the two derivations is that #1 uses a weak upper bound for W, while #2 uses an equality. This means that the final bound is much weaker in #1 than in #2.
It’s possible to redo the derivation for the nonrandomized version and get exactly the same bound as for the randomized case. The trick is to write M as the number of samples for which F_i > 1/2. Under weak assumptions on the performance of the experts, you can show that this count is less than Sum_i F_i.
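To make the comparison concrete, here is a rough sketch that tracks both quantities over a run of weighted majority voting. The setup is my own assumption, not something taken from the post: binary predictions, wrong experts' weights multiplied by a factor `beta`, ties broken toward predicting 1, and the names `run_weighted_majority`, `predictions` and `outcomes` are made up for illustration. F_i is the fraction of total weight sitting on the experts who turn out wrong in round i, so Sum_i F_i is the expected mistake count of the randomized version.

```python
def run_weighted_majority(predictions, outcomes, beta=0.5):
    """Return (M, fractions) for the nonrandomized weighted majority vote.

    predictions: one tuple of 0/1 expert predictions per round (hypothetical data)
    outcomes:    the true 0/1 label for each round
    beta:        assumed penalty factor applied to the weight of each wrong expert
    """
    weights = [1.0] * len(predictions[0])
    fractions = []  # F_i: share of weight on the wrong experts in round i
    mistakes = 0    # M: mistakes made by the deterministic majority vote

    for preds, y in zip(predictions, outcomes):
        total = sum(weights)
        wrong = sum(w for w, p in zip(weights, preds) if p != y)
        fractions.append(wrong / total)

        # Deterministic vote: follow the heavier side (ties go to 1).
        weight_for_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote = 1 if weight_for_one >= total - weight_for_one else 0
        if vote != y:
            mistakes += 1

        # Penalize every expert that was wrong this round.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]

    return mistakes, fractions


# Tiny made-up example: 3 experts, 4 rounds.
predictions = [(1, 0, 1), (0, 0, 1), (1, 1, 0), (0, 1, 1)]
outcomes = [1, 0, 0, 1]
M, F = run_weighted_majority(predictions, outcomes)
print("M =", M)
print("rounds with F_i >= 1/2 =", sum(1 for f in F if f >= 0.5))
print("Sum_i F_i =", sum(F))
```

The vote can only be wrong in a round where F_i >= 1/2, so M is at most the number of such rounds, and that count is always at most 2 * Sum_i F_i. Getting it below Sum_i F_i itself is where the extra assumptions on the experts come in; the sketch only lets you check it on particular data, it doesn't prove anything.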
Eliezer: thanks for the AI brain-teaser. I hope to see more posts like this in the future.