Two more thoughts: the above is probably more common in [what I intuitively think of as] “physical” problems where the parameters have some sort of geometric or causal relationship, which is maybe less meaningful for neural networks?
Also, for constrained optimization more broadly, constraints give you a natural way to end up with many parameters that can't be moved to decrease your objective, without requiring a massive coincidence: the optimum often sits on the boundary of the feasible region, which is lower-dimensional, so many constraints are active at once. Again, I guess not something deep learning has to worry about in full generality.
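To make the boundary point concrete, here's a minimal sketch (all names hypothetical) of projected gradient descent on a simple quadratic with nonnegativity constraints. Every coordinate whose unconstrained optimum is negative ends up pinned exactly at the boundary `x = 0` — no coincidence needed, since the gradient keeps pushing those coordinates outward and the projection keeps clipping them back:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
c = rng.normal(size=d)  # unconstrained optimum; roughly half its entries are negative

# Minimize f(x) = 0.5 * ||x - c||^2 subject to x >= 0,
# via projected gradient descent: gradient step, then project
# back onto the feasible region with a clip.
x = np.ones(d)
for _ in range(500):
    x = x - 0.1 * (x - c)    # gradient of f is (x - c)
    x = np.maximum(x, 0.0)   # projection onto the constraint set x >= 0

on_boundary = int(np.sum(x < 1e-8))
print(on_boundary, "of", d, "parameters pinned at the boundary")
```

Here the set of active constraints matches the negative entries of `c` exactly, so a sizable fraction of the parameters sit at the boundary at convergence, which is the generic situation rather than a fluke.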