Is this a prediction that a cyclic learning rate—that goes up and down—will work out better than a decreasing one? If so, that seems false, as far as I know.
https://www.youtube.com/watch?v=GM6XPEQbkS4 (talk) / https://arxiv.org/abs/2307.06324 prove faster convergence with a periodic learning rate, though on a specific space that is ‘nicer’ than reality, and (if I remember correctly) they’re comparing against a good bound for a constant stepsize of 1.
So it may be one of those papers that applies in theory but not often in practice, but I think it is somewhat indicative.
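To make the comparison concrete, here is a rough toy sketch of gradient descent with a constant normalized stepsize of 1 versus a periodic one. Everything in it is an assumption made for illustration: the diagonal quadratic objective, the starting point, the step count, and especially the two-step pattern `[1.5, 2.9]`, which is a made-up pattern and not taken from the linked paper. It only shows what "periodic stepsize" means mechanically and says nothing about real training.

```python
def gradient_descent(eigs, x0, pattern, n_steps):
    """Run gradient descent on f(x) = 0.5 * sum(eig_i * x_i**2).

    `pattern` holds normalized stepsizes that are cycled through;
    each one is divided by the smoothness constant L = max(eigs).
    Returns the final objective value.
    """
    L = max(eigs)
    x = list(x0)
    for k in range(n_steps):
        h = pattern[k % len(pattern)] / L
        # The gradient of f in coordinate i is eig_i * x_i.
        x = [xi - h * lam * xi for xi, lam in zip(x, eigs)]
    return 0.5 * sum(lam * xi * xi for lam, xi in zip(eigs, x))


# Hypothetical toy problem: 50 curvatures spread between 0.01 and 1.
eigs = [0.01 + 0.99 * i / 49 for i in range(50)]
x0 = [1.0] * 50

f_constant = gradient_descent(eigs, x0, [1.0], 200)       # stepsize 1 every step
f_periodic = gradient_descent(eigs, x0, [1.5, 2.9], 200)  # made-up periodic pattern

print("constant:", f_constant)
print("periodic:", f_periodic)
```

On this particular toy problem the periodic pattern ends at a lower objective value than the constant stepsize, even though the 2.9 step on its own overshoots along the steepest direction. That fits the "somewhat indicative" reading, but a hand-picked quadratic is about as ‘nice’ as spaces get.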