Unfortunately, on-policy expected information gain goes to 0 pretty fast (Theorem 5 here).
Where’s the “pretty fast”? The theorem makes a claim in the limit but says nothing about the rate of convergence. (I haven’t read the rest of the paper.)
Oh yeah, sorry, that isn’t shown there. But I believe the sum over all timesteps of the m-step expected info gain at each timestep is finite w.p. 1, which would make it o(1/t) w.p. 1.
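To spell out that step: summability alone only forces the per-step gain to be o(1/t) along a subsequence, so here is a minimal sketch under the extra assumption (mine, not the paper’s) that the expected info gain a_t ≥ 0 is eventually nonincreasing:

$$
\frac{t}{2}\, a_t \;\le\; \sum_{k=\lceil t/2 \rceil}^{t} a_k \;\longrightarrow\; 0 \quad (t \to \infty),
$$

since the middle sum is a tail segment of a convergent series; hence t·a_t → 0, i.e. a_t = o(1/t). Applied pathwise on the probability-1 event where the sum is finite, this gives the w.p. 1 statement.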