I’m not actually convinced that interpretability is doomed—in the OP I was exploring something of a worst case possibility.
Might be useful to mark this in the post. Perhaps a comment at the beginning about this post exploring a model, and its implications.
Might be useful to mark this in the post. Perhaps a comment at the beginning about this post exploring a model, and its implications.