I wasn’t aware that method had a name, but I’ve seen that idea suggested before when this topic comes up. For neural networks in particular, you can just look at the gradient of the output with respect to each input to see how its output changes as you change that input.
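To make the gradient idea concrete, here is a minimal sketch in plain Python. The tiny network, its weights, and the function names are all made up for illustration; a real model would use an autodiff library instead of a hand-written chain rule. It computes the derivative of the output with respect to each input and checks it against a finite-difference estimate.

```python
import math

# Hypothetical toy network: 2 inputs -> 2 tanh hidden units -> 1 linear output.
# The weights are arbitrary illustrative values, not from any trained model.
W1 = [[0.5, -1.0], [1.5, 0.25]]  # W1[j][i]: weight from input i to hidden unit j
B1 = [0.1, -0.2]
W2 = [2.0, -0.5]
B2 = 0.3


def forward(x):
    """Run the toy network on an input vector x and return the scalar output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, B1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def input_gradient(x):
    """Return d(output)/d(x_i) for each input, computed with the chain rule."""
    grads = [0.0] * len(x)
    for row, b, w2 in zip(W1, B1, W2):
        pre = sum(w * xi for w, xi in zip(row, x)) + b
        dtanh = 1.0 - math.tanh(pre) ** 2  # derivative of tanh at the pre-activation
        for i, w in enumerate(row):
            grads[i] += w2 * dtanh * w
    return grads


def finite_difference_gradient(x, eps=1e-6):
    """Estimate the same gradient by nudging each input up and down by eps."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((forward(up) - forward(down)) / (2 * eps))
    return grads


x = [0.2, -0.4]
print("analytic gradient:  ", input_gradient(x))
print("finite differences: ", finite_difference_gradient(x))
```

The sign and size of each entry say how the output moves if you nudge that input near this particular point, which is exactly the "what", and nothing more.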
I think the problem people have is that this just tells you what the machine is doing, not why. Machine learning can never really offer understanding.
For example, there was a program created specifically to train human-understandable models. It worked by fitting the simplest possible mathematical expression to the data, in the hope that simple mathematical expressions would be easy for humans to interpret.
One biologist found an expression that perfectly fit his data. It was simple, and he was really excited by it. But he couldn’t understand what it meant at all. And he couldn’t publish it, because how can you publish an equation without any explanation?
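For anyone unfamiliar with the expression-fitting approach, here is a rough sketch of the idea. This is not the program from the story: the candidate list, the complexity scores, and the penalty weight are all assumptions I picked for illustration, and real tools search a far larger space of expressions than this brute-force loop does. It scores each candidate by fit error plus a complexity penalty and keeps the best one.

```python
import math

# Hypothetical candidate expressions, each with a hand-assigned complexity score.
CANDIDATES = [
    ("x", 1, lambda x: x),
    ("x**2", 2, lambda x: x * x),
    ("sin(x)", 2, math.sin),
    ("exp(x)", 2, math.exp),
    ("x**2 + x", 3, lambda x: x * x + x),
    ("x * sin(x)", 3, lambda x: x * math.sin(x)),
]


def fit_scale(f, xs, ys):
    """Least-squares scale factor a for the model y = a * f(x)."""
    num = sum(f(x) * y for x, y in zip(xs, ys))
    den = sum(f(x) ** 2 for x in xs)
    return num / den if den else 0.0


def simplest_fit(xs, ys, penalty=0.01):
    """Pick the candidate minimising mean squared error + penalty * complexity."""
    best = None
    for name, complexity, f in CANDIDATES:
        a = fit_scale(f, xs, ys)
        mse = sum((a * f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        score = mse + penalty * complexity
        if best is None or score < best[0]:
            best = (score, name, a, mse)
    return best[1], best[2], best[3]


# Synthetic data generated from y = 3 * x**2, so the right answer is known.
xs = [i / 10 for i in range(1, 31)]
ys = [3.0 * x * x for x in xs]

name, scale, mse = simplest_fit(xs, ys)
print(f"best expression: y = {scale:.3f} * {name} (mse = {mse:.2e})")
```

Note that the search hands back a formula and a fit score and nothing else, which is the biologist's problem in miniature.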
Isn’t that exactly what causality and do-notation are for? Generate the “how” answer, and then do causal analysis to get the “why”.