The fact that Recursive CCS finds strictly more than one good direction means that CCS does not efficiently locate all of the truth-related information: it fails to find a single direction containing as much information as the direction obtained by taking the difference of the class means. Note: Logistic Regression seems to be about as leaky as CCS.
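To make "leaky" concrete, here is a minimal sketch on toy Gaussian data (an illustration I made up, not the paper's activations): ablate a probe's direction, then check whether a fresh logistic regression probe can still recover the label.

```python
# Toy sketch of "leakiness": remove a probe's direction, then see whether a
# fresh linear probe still recovers the label from what is left.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Class means differ along (2, 2), but the noise is much wider along the second
# coordinate, so the best classification direction is not the mean difference.
n = 20_000
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) * np.array([1.0, 3.0]) + 2.0 * y[:, None]

def ablate(X, v):
    """Project out the component of each row of X along direction v."""
    v = v / np.linalg.norm(v)
    return X - np.outer(X @ v, v)

d_lr = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]   # LR direction
d_mean = X[y == 1].mean(0) - X[y == 0].mean(0)                # diff of means

for name, d in [("the LR direction", d_lr), ("the diff-of-means direction", d_mean)]:
    X_abl = ablate(X, d)
    fresh = LogisticRegression(max_iter=1000).fit(X_abl, y)
    print(f"probe accuracy after removing {name}: {fresh.score(X_abl, y):.2f}")

# Removing the LR direction leaves a fresh probe well above chance (~0.62 here),
# i.e. LR is leaky; removing the difference of the means drops it to ~0.50.
```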
Perhaps a naive question, but if a single linear projection exists that contains the desired information, why isn't this 'global minimum' found by LR (or CCS)?
That's because a classifier only needs to find a direction which correctly classifies the data, not a direction whose removal makes other classifiers fail. A direction which removes all linearly available information is not always as good (at classification) as the direction found by LR.
Maybe figure 2 from the paper that introduced mean difference as a way to remove linear information would help.
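To illustrate the point above with the same kind of toy data as in the earlier sketch (again an illustration, not the paper's setup): the difference-of-means direction is the one whose removal destroys the linearly available information, yet as a classifier it scores worse than the direction LR finds, so LR's objective has no reason to converge to it.

```python
# Compare classification accuracy along the diff-of-means direction
# vs. along the direction logistic regression finds.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) * np.array([1.0, 3.0]) + 2.0 * y[:, None]

def accuracy_along(X, y, v):
    """Accuracy of a midpoint-threshold classifier on the 1-D projection X @ v."""
    s = X @ (v / np.linalg.norm(v))
    thresh = (s[y == 1].mean() + s[y == 0].mean()) / 2
    return max(np.mean((s > thresh) == y), np.mean((s < thresh) == y))

d_mean = X[y == 1].mean(0) - X[y == 0].mean(0)
d_lr = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]

print("accuracy along diff-of-means:", accuracy_along(X, y, d_mean))  # ~0.74
print("accuracy along LR direction: ", accuracy_along(X, y, d_lr))    # ~0.85
```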