I’m understanding that by “interpretability” you mean “we can attach values meaningful to people to internal nodes of the model”[1].
My guess is that logical/probabilistic models are regarded as more interpretable than DNNs mostly for two reasons:
1. They tend to have a small number of inputs, and the inputs are heavily engineered features (so the inputs themselves are already meaningful to people).
2. Internal nodes combine features in quite simple ways, particularly when the number of inputs is small (so the meaning of the inputs cannot be distorted or diluted too much by the internal nodes, if you will).
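To make that contrast concrete, here is a toy sketch (the feature names, data, and labels are all made up for illustration, not anyone's actual model): the same simple model class trained once over engineered features, where each learned weight attaches to something a person can name, and once over raw pixels, where it attaches to nothing meaningful.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Engineered-feature model: the inputs already mean something to people ---
# Hypothetical features for a "is this a cat photo?" classifier.
feature_names = ["has_pointy_ears", "has_whiskers", "fur_texture_score"]
X_feat = rng.random((200, 3))
y = (0.8 * X_feat[:, 0] + 0.6 * X_feat[:, 1] + 0.2 * X_feat[:, 2] > 0.8).astype(int)

clf_feat = LogisticRegression(max_iter=1000).fit(X_feat, y)
# Each coefficient is tied to a named feature, so the "internal node"
# (the weighted sum) is easy to read off.
for name, w in zip(feature_names, clf_feat.coef_[0]):
    print(f"{name}: weight {w:+.2f}")

# --- Raw-pixel model: same model class, but inputs carry no high-level meaning ---
X_pix = rng.random((200, 32 * 32))        # fake 32x32 grayscale images
y_pix = rng.integers(0, 2, size=200)      # fake labels
clf_pix = LogisticRegression(max_iter=1000).fit(X_pix, y_pix)
# 1024 weights over individual pixels: nothing here names "ear" or "whisker",
# even though the model class is identical.
print("number of pixel weights:", clf_pix.coef_.shape[1])
```

The point of the toy example is only that the interpretability here comes from the inputs and the simple combination rule, not from the model class itself.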
I think what you are saying is: let's assume that the inputs are not engineered and have no high-level meaning to people (e.g. raw pixel data), but the output does, e.g. it detects cats in pictures. The question is: can we find parts of the model which correspond to some human-understandable categories (e.g. an ear detector)?
In that case, I agree this seems equally hard regardless of the model, holding complexity constant. I just wouldn't call the hardness of doing this specific thing "uninterpretability".
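For concreteness, here is a minimal sketch of one common way people attempt this, a linear probe on intermediate activations. Everything here is a placeholder: in practice H would be activations from some internal layer of the trained model collected on labeled images, and the "ear visible" labels would come from human annotation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_samples, n_hidden = 500, 64

H = rng.normal(size=(n_samples, n_hidden))       # placeholder internal activations
has_ear = rng.integers(0, 2, size=n_samples)     # placeholder "ear visible" labels
# Pretend one hidden direction weakly encodes the concept, so the probe
# has something to find in this toy data.
H[:, 7] += 2.0 * has_ear

H_train, H_test, y_train, y_test = train_test_split(H, has_ear, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(H_train, y_train)

# If a simple probe predicts the human concept well from the activations,
# that is (weak) evidence the layer encodes something like an "ear detector".
print("probe test accuracy:", probe.score(H_test, y_test))
```

Note that this probing work looks essentially the same whether the internal nodes belong to a DNN, a large Bayes net, or a logic program of comparable size, which is the sense in which the difficulty seems independent of model class.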
[1] Is that definition standard? I'm not a fan; I'd go closer to "interpretable model" = "model humans can reason about, other than by running black-box experiments on the model".