I’m a bit confused here. When you talk about the Hessian, are you talking about the Hessian evaluated at the point of minimum loss? If so, isn’t the statement below not strictly right?
If we start at our minimum and walk away in a principal direction, the loss as a function of distance traveled is $L(x)=\frac{1}{2}\lambda_i x^2$, where $\lambda_i$ is the Hessian eigenvalue for that direction.
Like, isn’t $L(x)=\frac{1}{2}\lambda_i x^2$ just an approximation of the loss here?
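To spell out what I mean, here is a sketch of the second-order Taylor expansion I have in mind. The symbols $\theta^*$ (the minimum), $H$ (the Hessian evaluated at $\theta^*$), and $v_i$ (a unit eigenvector of $H$ with eigenvalue $\lambda_i$) are my own notation, not the post's, and I am assuming the loss is smooth enough for the remainder term to be cubic:

$$
L(\theta^* + x\,v_i) = L(\theta^*) + x\,\nabla L(\theta^*)^\top v_i + \tfrac{1}{2}\,x^2\,v_i^\top H\,v_i + O(x^3) = L(\theta^*) + \tfrac{1}{2}\lambda_i x^2 + O(x^3)
$$

The linear term drops out because the gradient is zero at a minimum, and $v_i^\top H v_i = \lambda_i$ for a unit eigenvector. So, measuring loss relative to $L(\theta^*)$, the quoted expression would be exact only if the loss is exactly quadratic along that direction; otherwise it holds up to the $O(x^3)$ remainder.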
Thanks for posting this.
Yes, it is an approximation, as noted at the start of that section: