I think the word “explainable” isn’t really the best fit. What we actually mean is that the model has to be able to construct theories of the world and prioritize the more compact ones. An AI that has simply memorized that a stone falls when it’s exactly 5, 5.37, 7.8 (etc.) meters above the ground is not explainable in that sense, whereas one that discovered general relativity would be.
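To make “more compact” concrete, here’s a rough sketch (my own toy illustration, not something from the thread): treat each theory as a piece of text and use compressed size as a crude stand-in for description length. The zlib proxy and the toy fall-time “theories” are assumptions for illustration only.

```python
import zlib

# Theory A: a memorized lookup table of fall times for many specific heights.
heights = [round(0.5 + 0.37 * i, 2) for i in range(200)]
lookup_theory = repr({h: round((2 * h / 9.81) ** 0.5, 3) for h in heights})

# Theory B: the general law, written once as a tiny program.
law_theory = "def fall_time(h): return (2 * h / 9.81) ** 0.5"

def description_length(source: str) -> int:
    """Crude proxy for description length: compressed size in bytes."""
    return len(zlib.compress(source.encode()))

print(description_length(lookup_theory))  # keeps growing as more cases are memorized
print(description_length(law_theory))     # stays tiny and covers every height
```

The memorized table’s description length grows with every new case, while the general law stays short no matter how many situations it covers, which is the sense in which the compact theory wins.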
And yeah, at some point even maximally compressed theories become so complex that no human can hope to understand them. But explainability should be viewed as an intrinsic property of AI models rather than as something defined relative to humans.
Or maybe we should think of “explainability” as the AI’s lossy compression quality for its theories, in which case it has to be evaluated relative to our own abilities, just as all modern lossy compression takes the human ear, eye, and brain into account. In that case it could be measured by how closely our reconstruction matches the real theory for each compression.
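One way to picture that measurement (again just a sketch with made-up pieces, not a standard metric): let the “reconstruction” be a drastically simpler model fitted to the AI’s theory, and score explainability by how well it reproduces the theory’s predictions at a complexity a human can still read. The ai_theory function and the polynomial-fit reconstruction below are arbitrary stand-ins.

```python
import numpy as np

def ai_theory(x):
    """Stand-in for the AI's full theory: some complicated learned function."""
    return np.sin(3 * x) * np.exp(-0.3 * x) + 0.1 * x**2

def human_reconstruction(x, degree):
    """The lossy compression: a low-degree polynomial a human could inspect."""
    coeffs = np.polyfit(x, ai_theory(x), degree)
    return np.polyval(coeffs, x)

x = np.linspace(0, 5, 500)
for degree in (1, 3, 7):
    error = np.mean((ai_theory(x) - human_reconstruction(x, degree)) ** 2)
    print(f"degree {degree}: reconstruction error {error:.4f}")

# Lower error at a degree a human can still follow = better "lossy compression
# quality", i.e. more explainable in this sense.
```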
My view is along these lines; see the first link for an interesting example vis-à-vis this (start at minute 17, or just read the linked paper in the description).