Assume we have a disease-detecting CV algorithm that looks at microscope images of tissue for cancerous cells. Maybe there’s a specific protein cluster (A) that shows up on the images which indicates a cancerous cell with 0.99 AUC. Maybe there’s also another protein cluster (B) that shows up and only has 0.989 AUC, A overlaps with B in 99.9999% of true positive. But B looks big and ugly and black and cancery to a human eye, A looks perfectly normal, it’s almost indistinguishable from perfectly benign protein clusters even to the most skilled oncologist.
If I understand this thought experiment right, we are also to assume that we know the slight difference in AUC is not just statistical noise (even with the high co-linearity between the A cluster and the B cluster)? So, say we assume that you still get a slightly higher AUC for A on a data set of cells that have either only A or neither versus a data set of cells with either only B or neither?
In that case, I would say that the model that weighs A a bit more is actually “explainable” in the relevant sense of the term - it is just that some people find the explanation aesthetically unpleasing. You can show what features the model is looking at to assign a probability that some cell is cancerous. You can show how, in the vast majority of cases, a model that looks at the presence or absence of the A cluster assigns a higher probability of a cell being cancerous to cells that actually are cancerous. And you can show how a model that looks at B does that also, but that A is slightly better at it.
If the treatment is going to be slightly different for a patient depending on how much weight you give to A versus B, and if I were the patient, I would want to use the treatment that has the best chance of working without negative side effects based on the data, regardless of whether A or B looks uglier. If some other patients want a version of the treatment that is statistically less likely to work based on their aesthetic sense of A versus B, I would think that is a silly risk to take (though also a very slight risk if A and B are that strongly correlated), but that would be their problem not mine.
If I understand this thought experiment right, we are also to assume that we know the slight difference in AUC is not just statistical noise (even with the high co-linearity between the A cluster and the B cluster)? So, say we assume that you still get a slightly higher AUC for A on a data set of cells that have either only A or neither versus a data set of cells with either only B or neither?
In that case, I would say that the model that weighs A a bit more is actually “explainable” in the relevant sense of the term - it is just that some people find the explanation aesthetically unpleasing. You can show what features the model is looking at to assign a probability that some cell is cancerous. You can show how, in the vast majority of cases, a model that looks at the presence or absence of the A cluster assigns a higher probability of a cell being cancerous to cells that actually are cancerous. And you can show how a model that looks at B does that also, but that A is slightly better at it.
If the treatment is going to be slightly different for a patient depending on how much weight you give to A versus B, and if I were the patient, I would want to use the treatment that has the best chance of working without negative side effects based on the data, regardless of whether A or B looks uglier. If some other patients want a version of the treatment that is statistically less likely to work based on their aesthetic sense of A versus B, I would think that is a silly risk to take (though also a very slight risk if A and B are that strongly correlated), but that would be their problem not mine.