I’m curious about the retraction. Is it because of the later comments in the story, about how people change afterwards?
No, I just thought about it some more, and I realized that increasing the learning rate of a model (assuming the optimizer is something like SGD) would inject more randomness, just like increasing the temperature of simulated annealing would.
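The analogy can be sketched numerically. Below is a minimal, hypothetical illustration (not from the original discussion): SGD on a quadratic loss with noisy gradients, where a larger learning rate amplifies the gradient noise and leaves the iterates wandering more widely around the minimum, much as a higher temperature keeps simulated annealing exploring more broadly.

```python
import random

def sgd_iterate_spread(lr, steps=2000, seed=0):
    """Run noisy SGD on f(x) = x^2 and return the mean squared
    distance of the iterates from the optimum at x = 0."""
    rng = random.Random(seed)
    x, spread = 1.0, 0.0
    for _ in range(steps):
        # True gradient (2x) plus Gaussian noise standing in for
        # mini-batch sampling noise.
        grad = 2 * x + rng.gauss(0, 1)
        x -= lr * grad
        spread += x * x
    return spread / steps

# A larger learning rate scales up the injected noise, so the
# iterates stay "hotter" -- spread out further from the minimum.
hot = sgd_iterate_spread(0.1)
cold = sgd_iterate_spread(0.01)
print(hot > cold)
```

With the larger learning rate the stationary variance of the iterates is roughly an order of magnitude higher, mirroring how raising the temperature in annealing widens the distribution of accepted states.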