Here, the classifier predicts the labels almost perfectly, so it’s not learning only about outliers.
If there’s one near-perfect separating hyperplane, then there are usually lots of near-perfect separating hyperplanes; which one the linear classifier chooses is determined mostly by outliers/borderline cases. That’s what I mean when I say it’s mostly measuring outliers.
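To make that concrete, here is a rough toy sketch rather than anything from the actual experiment. It assumes 2D points, hyperplanes sampled as 360 unit directions, and a max-margin rule standing in for "the linear classifier"; the data and the names `separating_directions` and `max_margin_angle` are made up for illustration.

```python
import math

def separating_directions(neg, pos, steps=360):
    """Return (angle_deg, margin) for every sampled unit direction w
    such that all pos points project strictly above all neg points."""
    found = []
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        w = (math.cos(theta), math.sin(theta))
        hi_neg = max(w[0] * x + w[1] * y for x, y in neg)
        lo_pos = min(w[0] * x + w[1] * y for x, y in pos)
        if lo_pos > hi_neg:
            # Half the gap between the classes along w is the margin.
            found.append((360.0 * k / steps, (lo_pos - hi_neg) / 2))
    return found

def max_margin_angle(neg, pos):
    """Angle (degrees) of the sampled direction with the widest margin."""
    return max(separating_directions(neg, pos), key=lambda t: t[1])[0]

# Hypothetical toy data: two well-separated clusters.
neg = [(0, 0), (0, 1), (1, 0), (1, 1)]
pos = [(4, 0), (4, 1), (5, 0), (5, 1)]

print(len(separating_directions(neg, pos)))   # how many sampled directions separate perfectly
print(max_margin_angle(neg, pos))             # direction chosen from the bulk alone
print(max_margin_angle(neg, pos + [(2, 2)]))  # same bulk plus one borderline positive
```

On this toy data, 143 of the 360 sampled directions separate the two clusters perfectly. The max-margin pick is 0° for the clusters alone and swings to 45° once the single borderline point `(2, 2)` is added, even though the other eight points are unchanged. A different training rule (logistic regression, say) would weight borderline points differently, so treat this as an illustration of the mechanism, not a claim about any particular classifier.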