What if slightly twiddling the RGB values produces something that is basically “spherical field of white, etc. with enough noise on top of it that humans can’t see it”?
That would all hinge on what it means for an image to be “hidden” beneath noise, I suppose. The more noise you layer on top of an image, the more room for interpretation there is in classifying it, and the less salient any particular classification candidate will be. If a scrutable system can come up with compelling arguments for a strange classification that human beings would not make, then its choices would naturally be less ridiculous than otherwise. But to say that “humans conceivably may suffer from the same problem” is a bit of a dodge, especially in light of the fact that these systems are making mistakes we clearly would not.
But either way, what you’re proposing and what Unknowns was arguing are different. Unknowns was (if I understood him rightly) arguing that the assignment of different probability weights to pixels (or, more likely, groups of pixels) representing a particular feature of an object is an explanation of why they’re classified the way they are. But such an “explanation” is inscrutable; we cannot ourselves easily translate it into the language of lines, curves, apparent depth, etc. (unless we write some piece of software to do this, which is then effectively part of the agent).
Look at it from the other end: You can take a picture of a baseball and overlay noise on top of it. There could, at least plausibly, be a point where the noise destroys humans’ ability to see the baseball, but the information is actually still present (and could, for instance, be recovered by applying a noise-reduction algorithm to it). Perhaps when you are twiddling the pixels of random noise, you’re actually constructing such a noisy baseball image a pixel at a time.
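To make the “information is still present” point concrete, here’s a minimal sketch (Python, assuming numpy and scipy are available; a synthetic bright disk stands in for the baseball photo, and a Gaussian blur stands in for whatever noise-reduction algorithm you’d actually use):

```python
# Bury a simple synthetic "baseball" (a bright disk) under heavy Gaussian
# noise, then show that a basic noise-reduction step recovers much of it.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# Synthetic "baseball": a bright disk on a dark background.
size = 64
y, x = np.ogrid[:size, :size]
ball = ((x - size / 2) ** 2 + (y - size / 2) ** 2 < (size / 4) ** 2).astype(float)

# Overlay noise strong enough that the disk is no longer obvious by eye.
noisy = ball + rng.normal(scale=2.0, size=ball.shape)

# Simple noise reduction: a Gaussian blur averages the noise toward zero,
# while the disk's low-frequency structure survives.
denoised = gaussian_filter(noisy, sigma=3)

def correlation(a, b):
    """Pearson correlation between two images, as a crude visibility score."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

print("correlation with original, noisy:   ", correlation(ball, noisy))
print("correlation with original, denoised:", correlation(ball, denoised))
```

The denoised correlation comes out much higher than the noisy one: the baseball was invisible but never gone.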
Agree with all you said, but have to comment on:

“Perhaps when you are twiddling the pixels of random noise, you’re actually constructing such a noisy baseball image a pixel at a time.”
You could be constructing a noisy image of a baseball one pixel at a time. In fact, if you actually are, then your network would be amazingly robust. But in a non-robust network, it seems much more probable that you’re just exploiting the system’s weaknesses and milking them for all they’re worth.
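To illustrate what “exploiting the system’s weaknesses and milking them” might look like, here’s a toy sketch in Python. The “network” is just a frozen random linear scorer (a hypothetical placeholder, not a real classifier); greedy pixel twiddling drives its confidence toward 1 while the image stays, to a human eye, pure noise:

```python
# Greedily nudge individual pixels of random noise so a stand-in
# classifier's confidence climbs. The search optimizes whatever the
# scorer keys on, which need not look like a baseball to a human.
import numpy as np

rng = np.random.default_rng(1)
size = 16 * 16  # a flattened 16x16 "image"

# Stand-in "network": a frozen random linear map squashed to (0, 1).
weights = rng.normal(size=size)

def confidence(image):
    """The toy scorer's confidence that 'image' is a baseball."""
    return 1.0 / (1.0 + np.exp(-image @ weights))

image = rng.normal(size=size)  # start from pure noise
step = 0.05

for _ in range(2000):
    i = rng.integers(size)                  # pick one pixel
    delta = step * rng.choice([-1.0, 1.0])  # try a small nudge
    trial = image.copy()
    trial[i] += delta
    if confidence(trial) > confidence(image):
        image = trial                       # keep the twiddle if it helps

print("final confidence:", confidence(image))
```

The confidence climbs toward 1.0, yet nothing baseball-like is being built: the twiddling just pushes each pixel in whichever direction the scorer happens to reward.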