It makes sense that negative pairs would help to a large extent, but not all contrastive papers used negative examples, e.g. BYOL (ref). Edit: but now I’m realizing that this might no longer fit the definition of contrastive learning (it may instead be ordinary self-supervised learning), so I apologize for the error/confusion in that case.
If memory serves, with BYOL you use the current encoder's representation of an input x1 to predict the representation of a related input x2, but the representation of x2 comes from an old version of the encoder. So, as long as you start with a non-collapsed initial encoder, the fact that you are predicting the outputs of a past, non-collapsed encoder ensures that the current encoder you learn will also be non-collapsed.
(Mostly my point is that there are specific algorithmic reasons to expect that you don’t get the collapsed solutions; it isn’t just a tendency of neural nets to avoid collapsed solutions.)
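In case it's useful, here is a rough, toy-scale sketch of that mechanism in PyTorch. This is my own illustrative code, not the paper's implementation: the encoder architecture, the way the two "views" are generated, and the hyperparameters are all placeholder assumptions, and the real method also symmetrizes the loss across both views. The point it shows is just the structure described above: an online encoder plus a predictor is trained to match the representation produced by a slowly-updated "old" target encoder, with a stop-gradient on the target.

```python
# Hypothetical BYOL-style sketch (toy code, illustrative assumptions throughout).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(in_dim=32, out_dim=16):
    # Stand-in for the backbone + projection head used in the real method.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

online = make_encoder()
predictor = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))
target = copy.deepcopy(online)          # the "old version of the encoder"
for p in target.parameters():
    p.requires_grad_(False)             # target is never updated by gradients

opt = torch.optim.Adam(list(online.parameters()) + list(predictor.parameters()), lr=1e-3)
tau = 0.99                              # EMA coefficient for the target network

for step in range(100):
    x = torch.randn(256, 32)
    # Two "related" views of the same input (toy stand-in for augmentations).
    x1 = x + 0.1 * torch.randn_like(x)
    x2 = x + 0.1 * torch.randn_like(x)

    # Online branch predicts the target branch's representation of the other view.
    p1 = F.normalize(predictor(online(x1)), dim=-1)
    with torch.no_grad():               # stop-gradient on the target branch
        z2 = F.normalize(target(x2), dim=-1)
    loss = 2 - 2 * (p1 * z2).sum(dim=-1).mean()   # normalized MSE / cosine loss

    opt.zero_grad()
    loss.backward()
    opt.step()

    # Slowly move the target toward the online encoder (exponential moving average),
    # so the prediction target stays a lagging, non-collapsed reference.
    with torch.no_grad():
        for pt, po in zip(target.parameters(), online.parameters()):
            pt.mul_(tau).add_(po, alpha=1 - tau)
```

The stop-gradient plus the slowly-moving target is the "predicting a past encoder" part of the argument: the online network is always chasing a reference that was non-collapsed a few steps ago, rather than being free to map everything to a single point.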
but now I’m realizing that this might no longer fit the definition of contrastive learning (instead just ordinary self supervised learning), so I apologize about the error/confusion in that case.
No worries, I think it’s still a relevant example for thinking about “collapsed” solutions.