I think the linked post is assuming that the parameters are real numbers.
I’m still confused by “Unless you have a loss function that has a finite minimum value like squared loss (not cross entropy or softmax)” because cross entropy is bounded below by zero.
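One possible reading, and this is my guess rather than anything the post states: "finite minimum" might mean a minimum that is attained at finite parameter values, not just a finite lower bound. Here is a minimal sketch with a toy setup of my own (a single positive example with x = 1, a scalar weight w, and hypothetical helper names). The cross entropy keeps shrinking toward zero as w grows but stays positive for every finite w, while the squared loss hits exactly zero at w = 1.

```python
import math

def cross_entropy(w, x=1.0):
    """Logistic cross entropy for one positive example: log(1 + exp(-w * x))."""
    return math.log1p(math.exp(-w * x))

def squared_loss(w, x=1.0, y=1.0):
    """Squared error of a linear prediction w * x against target y."""
    return (w * x - y) ** 2

# Toy weights chosen for illustration; very large w would underflow to 0.0 in floats.
weights = [1.0, 5.0, 10.0, 20.0]
ce_values = [cross_entropy(w) for w in weights]

for w, ce in zip(weights, ce_values):
    print(f"w={w:5.1f}  cross_entropy={ce:.3e}  squared_loss={squared_loss(w):.3e}")
```

If that reading is right, the bound at zero is an infimum that real-valued parameters only approach, which would fit with the post assuming real-valued parameters.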