Abstraction is about what information you throw away. A ReLU activation function maps every negative component of its input to zero; you lose information there, in a way that you don’t when applying an invertible linear transformation.
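To make that concrete, here’s a minimal numpy sketch (the toy values are my own): two inputs that differ only in their negative components collapse to the same output under ReLU, while an invertible linear map keeps them distinct.

```python
import numpy as np

relu = lambda v: np.maximum(v, 0.0)

# Two distinct inputs that differ only in their negative components.
x1 = np.array([-2.0, 1.0, -0.5])
x2 = np.array([-7.0, 1.0, -3.0])

print(relu(x1))  # [0. 1. 0.]
print(relu(x2))  # [0. 1. 0.]  <- identical: the negative components are gone

# An invertible linear map, by contrast, keeps distinct inputs distinct,
# so no information is destroyed.
W = np.diag([2.0, 3.0, 1.0])
print(W @ x1)  # [ -4.   3.  -0.5]
print(W @ x2)  # [-14.   3.  -3. ]
```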
Imagine your model (or a submodule thereof) as a mapping from one vector space to another. In order to focus on the features relevant to the questions you care about (is the image a truck, is it a lizard, …) you throw away information that is not relevant to those questions: you give it less real estate in your representation-space. We can expect the more out-of-distribution regions of input-space to be “pinched” by the model; they’re not represented as expressively as the more in-distribution regions are.

So if the cosine similarity between the representations of an input and a slightly nudged copy of it decreases less, you’re in a more “pinched” region of input-space, and if it decreases more, you’re in a more “expanded” region of input-space, which means the model was tuned to focus on that region, which means the input is more in-distribution.
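Here’s a rough sketch of that test, assuming `model` maps an input vector to a representation vector; the function name, the perturbation scale, and the toy model are all my own illustrative choices, not anything from a particular codebase.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nudge_score(model, x, eps=1e-2, n_probes=8, seed=0):
    """Average drop in cosine similarity between the representations of x
    and of slightly nudged copies of x. A small drop suggests a 'pinched'
    (more out-of-distribution) region; a large drop suggests an 'expanded'
    (more in-distribution) one."""
    rng = np.random.default_rng(seed)
    h = model(x)
    drops = []
    for _ in range(n_probes):
        nudge = rng.normal(size=x.shape)
        nudge *= eps * np.linalg.norm(x) / np.linalg.norm(nudge)  # fixed relative size
        drops.append(1.0 - cosine(h, model(x + nudge)))
    return float(np.mean(drops))

# Toy "model": a random linear layer followed by ReLU.
W = np.random.default_rng(1).normal(size=(16, 8))
model = lambda v: np.maximum(W @ v, 0.0)
print(nudge_score(model, np.ones(8)))
```

On this picture, ranking inputs by this score and flagging the low scorers would be one way to flag out-of-distribution inputs.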
Ok, that’s fascinating! Thanks for the explanation.