Point 8, about the opacity of decision-making, reminded me of something I’m surprised I haven’t seen on LW before:
LIME (Local Interpretable Model-agnostic Explanations) can produce a human-readable explanation of why any classification algorithm made a particular decision. It would be harder to apply the method to an optimizer than to a classifier, but I see no principled reason why an approach like this wouldn’t help in understanding any algorithm with a locally smooth-ish mapping from inputs to outputs.
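For anyone who wants to see the trick concretely, here’s a minimal from-scratch sketch of the core idea on tabular data (my own toy code, not the official lime package; the function name and the sampling/kernel parameters are illustrative choices): perturb the instance, query the black-box classifier, and fit a locally weighted linear surrogate whose coefficients are the explanation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(predict_proba, x, feature_names,
                    n_samples=5000, scale=0.5, kernel_width=0.75, seed=0):
    """Hypothetical LIME-style local explanation for one instance x."""
    rng = np.random.default_rng(seed)
    # 1. Sample perturbations in a neighbourhood of x.
    X_pert = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 2. Query the black-box classifier for the probability of class 1.
    y = predict_proba(X_pert)[:, 1]
    # 3. Weight samples by proximity to x (exponential kernel on distance).
    dist = np.linalg.norm(X_pert - x, axis=1)
    weights = np.exp(-(dist / kernel_width) ** 2)
    # 4. Fit an interpretable linear surrogate on the weighted samples.
    surrogate = Ridge(alpha=1.0).fit(X_pert, y, sample_weight=weights)
    # 5. The coefficients are the (local) human-readable explanation.
    return sorted(zip(feature_names, surrogate.coef_),
                  key=lambda fw: abs(fw[1]), reverse=True)
```

Calling explain_locally(clf.predict_proba, X_test[0], feature_names) on a scikit-learn-style classifier returns the features ranked by how strongly they pushed that one prediction. The real LIME adds sparsity, interpretable feature representations, and a more careful sampling scheme, but this is the basic move.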
I wasn’t aware that method had a name, but I’ve seen the idea suggested before when this topic comes up. For neural networks in particular, you can just look at the gradient of the output with respect to the inputs to see how the output changes as you change each input.
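A crude but framework-agnostic way to do that (a toy sketch of my own, using finite differences where a real network would use autograd):

```python
import numpy as np

def input_sensitivities(predict, x, eps=1e-4):
    """Approximate d(output)/d(input_i) for a model with a scalar output.

    predict: callable mapping a 1-D input array to a scalar (e.g. a class score).
    """
    x = np.asarray(x, dtype=float)
    base = predict(x)
    grads = np.zeros_like(x)
    for i in range(x.shape[0]):
        x_step = x.copy()
        x_step[i] += eps
        # Finite-difference estimate of the gradient for feature i.
        grads[i] = (predict(x_step) - base) / eps
    return grads
```

With an actual neural network you’d get the same vector exactly, and much faster, from backpropagation, i.e. the gradient of the chosen output logit with respect to the input.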
I think the problem people have is that this just tells you what the machine is doing, not why. Machine learning can never really offer understanding.
For example, there was a program created specifically to produce human-understandable models. It worked by fitting the simplest possible mathematical expression to the data, the hope being that simple expressions would be easy for humans to interpret (toy sketch of the idea below).
One biologist found an expression that fit his data perfectly. It was simple, and he was really excited about it, but he couldn’t understand what it meant at all. And he couldn’t publish it, because how do you publish an equation without any explanation?
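For readers who haven’t seen that kind of program, here’s a deliberately tiny, hypothetical version of the idea (real symbolic-regression systems search an enormous expression space, typically with genetic programming; this just scores a handful of candidate formulas by fit plus a complexity penalty):

```python
import numpy as np

# Candidate expressions, ordered from simplest to most complex. Each is
# linear in its two free parameters (a, b), so a least-squares fit suffices.
CANDIDATES = [
    ("a*x + b",      lambda x: x),
    ("a*x**2 + b",   lambda x: x**2),
    ("a*sin(x) + b", lambda x: np.sin(x)),
    ("a*exp(x) + b", lambda x: np.exp(x)),
]

def fit_simplest(x, y, complexity_penalty=0.01):
    """Return (expression, a, b) with the best fit-vs-simplicity trade-off."""
    best = None
    for complexity, (name, basis) in enumerate(CANDIDATES):
        # Least-squares fit of a and b for this candidate expression.
        A = np.column_stack([basis(x), np.ones_like(x, dtype=float)])
        (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
        mse = np.mean((A @ [a, b] - y) ** 2)
        score = mse + complexity_penalty * complexity
        if best is None or score < best[0]:
            best = (score, name, a, b)
    return best[1:]
```

The output is exactly the kind of artifact the anecdote describes: a short formula that fits, with no accompanying story about why it fits.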
I almost think that LIME article merits its own Link post. What do you think?
It should be posted, but by someone who can more rigorously describe its application to an optimizer than “probably needs to be locally smooth-ish.”
On the “what vs. why” point: isn’t that exactly what causality and do-notation are for? Use the model to generate the “how” answer, then run a causal analysis on it to get the why.
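To make that concrete, here’s a toy simulation with entirely made-up numbers. A fitted model’s input-output behaviour gives you something like P(Y | X=1); the causal question is P(Y | do(X=1)), and the two come apart as soon as there’s a confounder:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
C = rng.binomial(1, 0.5, n)                           # hidden confounder
X_obs = rng.binomial(1, 0.2 + 0.6 * C)                # C influences X
Y_obs = rng.binomial(1, 0.1 + 0.3 * X_obs + 0.4 * C)  # both influence Y

# Observational conditioning: inherits the confounded correlation.
p_cond = Y_obs[X_obs == 1].mean()

# Intervention do(X=1): set X by fiat and leave C alone.
X_do = np.ones(n, dtype=int)
Y_do = rng.binomial(1, 0.1 + 0.3 * X_do + 0.4 * C)
p_do = Y_do.mean()

print(f"P(Y=1 | X=1)     ~ {p_cond:.3f}")   # inflated by the confounder
print(f"P(Y=1 | do(X=1)) ~ {p_do:.3f}")     # the actual causal effect
```

With these numbers the first quantity comes out around 0.72 and the second around 0.6, because conditioning on X=1 also tells you C is probably 1, while intervening on X doesn’t.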