Abstraction is about what information you throw away. A ReLU activation function maps every negative component of its input to zero; you lose information there, in a way that you don’t when applying an invertible linear transformation.
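To make that concrete, here’s a minimal numpy sketch (the toy values are my own): two inputs that differ only in their negative components collapse to the same output under ReLU, while an invertible linear map keeps them distinct.

```python
import numpy as np

relu = lambda v: np.maximum(v, 0.0)

# Two distinct inputs that differ only in their negative components.
x1 = np.array([-2.0, 1.0, -0.5])
x2 = np.array([-7.0, 1.0, -3.0])

print(relu(x1))  # [0. 1. 0.]
print(relu(x2))  # [0. 1. 0.]  <- identical: the negative components are gone

# An invertible linear map, by contrast, keeps distinct inputs distinct,
# so no information is destroyed.
W = np.diag([2.0, 3.0, 1.0])
print(W @ x1)  # [ -4.   3.  -0.5]
print(W @ x2)  # [-14.   3.  -3. ]
```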
Imagine your model (or a submodule thereof) as a mapping from one vector space to another. In order to focus on the features relevant to the questions you care about (is the image a truck, is it a lizard, …) you throw away information that is not relevant to those questions: you give it less real estate in your representation-space. We can expect the more out-of-distribution regions of input-space to be “pinched” by the model; they’re not represented as expressively as the more in-distribution regions are.

So if the cosine similarity between the representations of an input and a slightly nudged copy of it decreases less, you’re in a more “pinched” region of input-space, and if it decreases more, you’re in a more “expanded” region of input-space, which means the model was tuned to focus on that region, which means the input is more in-distribution.
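Here’s a rough sketch of that test, assuming `model` maps an input vector to a representation vector; the function name, the perturbation scale, and the toy model are all my own illustrative choices, not anything from a particular codebase.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nudge_score(model, x, eps=1e-2, n_probes=8, seed=0):
    """Average drop in cosine similarity between the representations of x
    and of slightly nudged copies of x. A small drop suggests a 'pinched'
    (more out-of-distribution) region; a large drop suggests an 'expanded'
    (more in-distribution) one."""
    rng = np.random.default_rng(seed)
    h = model(x)
    drops = []
    for _ in range(n_probes):
        nudge = rng.normal(size=x.shape)
        nudge *= eps * np.linalg.norm(x) / np.linalg.norm(nudge)  # fixed relative size
        drops.append(1.0 - cosine(h, model(x + nudge)))
    return float(np.mean(drops))

# Toy "model": a random linear layer followed by ReLU.
W = np.random.default_rng(1).normal(size=(16, 8))
model = lambda v: np.maximum(W @ v, 0.0)
print(nudge_score(model, np.ones(8)))
```

On this picture, ranking inputs by this score and flagging the low scorers would be one way to flag out-of-distribution inputs.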
Ok, that’s fascinating! Thanks for the explanation.