Point 8, about the opacity of decision-making, reminded me of something I’m surprised I haven’t seen on LW before:
LIME (Local Interpretable Model-agnostic Explanations) can produce a human-readable explanation of why any classification algorithm made a particular decision. It would be harder to apply the method to an optimizer than to a classifier, but I see no principled reason why an approach like this wouldn’t help in understanding any algorithm with a locally smooth-ish mapping from inputs to outputs.
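For anyone who wants to see the trick concretely, here’s a minimal from-scratch sketch of the core idea on tabular data (my own toy code, not the official lime package; the function name and the sampling/kernel parameters are illustrative choices): perturb the instance, query the black-box classifier, and fit a locally weighted linear surrogate whose coefficients are the explanation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(predict_proba, x, feature_names,
                    n_samples=5000, scale=0.5, kernel_width=0.75, seed=0):
    """Hypothetical LIME-style local explanation for one instance x."""
    rng = np.random.default_rng(seed)
    # 1. Sample perturbations in a neighbourhood of x.
    X_pert = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 2. Query the black-box classifier for the probability of class 1.
    y = predict_proba(X_pert)[:, 1]
    # 3. Weight samples by proximity to x (exponential kernel on distance).
    dist = np.linalg.norm(X_pert - x, axis=1)
    weights = np.exp(-(dist / kernel_width) ** 2)
    # 4. Fit an interpretable linear surrogate on the weighted samples.
    surrogate = Ridge(alpha=1.0).fit(X_pert, y, sample_weight=weights)
    # 5. The coefficients are the (local) human-readable explanation.
    return sorted(zip(feature_names, surrogate.coef_),
                  key=lambda fw: abs(fw[1]), reverse=True)
```

Calling explain_locally(clf.predict_proba, X_test[0], feature_names) on a scikit-learn-style classifier returns the features ranked by how strongly they pushed that one prediction. The real LIME adds sparsity, interpretable feature representations, and a more careful sampling scheme, but this is the basic move.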
I wasn’t aware that method had a name, but I’ve seen the idea suggested before when this topic comes up. For neural networks in particular, you can just look at the gradient of the output with respect to the inputs to see how the output changes as you change each input.
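A crude but framework-agnostic way to do that (a toy sketch of my own, using finite differences where a real network would use autograd):

```python
import numpy as np

def input_sensitivities(predict, x, eps=1e-4):
    """Approximate d(output)/d(input_i) for a model with a scalar output.

    predict: callable mapping a 1-D input array to a scalar (e.g. a class score).
    """
    x = np.asarray(x, dtype=float)
    base = predict(x)
    grads = np.zeros_like(x)
    for i in range(x.shape[0]):
        x_step = x.copy()
        x_step[i] += eps
        # Finite-difference estimate of the gradient for feature i.
        grads[i] = (predict(x_step) - base) / eps
    return grads
```

With an actual neural network you’d get the same vector exactly, and much faster, from backpropagation, i.e. the gradient of the chosen output logit with respect to the input.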
I think the problem people have is that this just tells you what the machine is doing, not why. Machine learning can never really offer understanding.
For example, there was a program created specifically to produce human-understandable models. It worked by fitting the simplest possible mathematical expression to the data, the hope being that simple expressions would be easy for humans to interpret (toy sketch of the idea below).
One biologist found an expression that fit his data perfectly. It was simple, and he was really excited about it, but he couldn’t understand what it meant at all. And he couldn’t publish it, because how do you publish an equation without any explanation?
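For readers who haven’t seen that kind of program, here’s a deliberately tiny, hypothetical version of the idea (real symbolic-regression systems search an enormous expression space, typically with genetic programming; this just scores a handful of candidate formulas by fit plus a complexity penalty):

```python
import numpy as np

# Candidate expressions, ordered from simplest to most complex. Each is
# linear in its two free parameters (a, b), so a least-squares fit suffices.
CANDIDATES = [
    ("a*x + b",      lambda x: x),
    ("a*x**2 + b",   lambda x: x**2),
    ("a*sin(x) + b", lambda x: np.sin(x)),
    ("a*exp(x) + b", lambda x: np.exp(x)),
]

def fit_simplest(x, y, complexity_penalty=0.01):
    """Return (expression, a, b) with the best fit-vs-simplicity trade-off."""
    best = None
    for complexity, (name, basis) in enumerate(CANDIDATES):
        # Least-squares fit of a and b for this candidate expression.
        A = np.column_stack([basis(x), np.ones_like(x, dtype=float)])
        (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
        mse = np.mean((A @ [a, b] - y) ** 2)
        score = mse + complexity_penalty * complexity
        if best is None or score < best[0]:
            best = (score, name, a, b)
    return best[1:]
```

The output is exactly the kind of artifact the anecdote describes: a short formula that fits, with no accompanying story about why it fits.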
I almost think that LIME article merits its own Link post. What do you think?
It should be posted, but by someone who can more rigorously describe its application to an optimizer than “probably needs to be locally smooth-ish.”
On the “what vs. why” point: isn’t that exactly what causality and do-notation are for? Use the model to generate the “how” answer, then run a causal analysis on it to get the why.
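To make that concrete, here’s a toy simulation with entirely made-up numbers. A fitted model’s input-output behaviour gives you something like P(Y | X=1); the causal question is P(Y | do(X=1)), and the two come apart as soon as there’s a confounder:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
C = rng.binomial(1, 0.5, n)                           # hidden confounder
X_obs = rng.binomial(1, 0.2 + 0.6 * C)                # C influences X
Y_obs = rng.binomial(1, 0.1 + 0.3 * X_obs + 0.4 * C)  # both influence Y

# Observational conditioning: inherits the confounded correlation.
p_cond = Y_obs[X_obs == 1].mean()

# Intervention do(X=1): set X by fiat and leave C alone.
X_do = np.ones(n, dtype=int)
Y_do = rng.binomial(1, 0.1 + 0.3 * X_do + 0.4 * C)
p_do = Y_do.mean()

print(f"P(Y=1 | X=1)     ~ {p_cond:.3f}")   # inflated by the confounder
print(f"P(Y=1 | do(X=1)) ~ {p_do:.3f}")     # the actual causal effect
```

With these numbers the first quantity comes out around 0.72 and the second around 0.6, because conditioning on X=1 also tells you C is probably 1, while intervening on X doesn’t.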