Support vector machines [2], which pick the separating hyperplane that maximizes the margin (the distance to the closest training points) in order to reduce generalization error, are one example of this: the algorithm is explicitly maximizing a worst-case quantity over the training set.
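One way to make that reading concrete (a sketch of the standard hard-margin formulation, not necessarily the exact sense intended above): the hyperplane is chosen to maximize the smallest normalized margin over the training points,

$$
\max_{\mathbf{w},\,b}\;\min_{i}\;\frac{y_i(\mathbf{w}^\top \mathbf{x}_i + b)}{\lVert \mathbf{w} \rVert},
$$

where the inner $\min$ is the worst-case margin over the data and the outer $\max$ picks the hyperplane that makes that worst case as large as possible.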
Could you expand on this a little? I’ve always thought of SVMs as minimizing an expected loss (the sum over hinge losses) rather than any best-worst-case approach. Are you referring to the “max min” in the dual QP? I’m interested in other interpretations...
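For reference, the soft-margin objective being referred to here (the standard formulation, with $C$ the usual slack/regularization trade-off) is

$$
\min_{\mathbf{w},\,b}\;\tfrac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} \max\bigl(0,\,1 - y_i(\mathbf{w}^\top \mathbf{x}_i + b)\bigr),
$$

i.e. a regularized sum of hinge losses rather than an explicit max-min.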