Maybe you can train a sequence of reward functions $r_1, \ldots, r_n$ such that each $r_i$ is discouraged from attending to the input features that are most salient to the previous $i-1$ reward functions?
I.e., you’d train $r_1$ normally. Then, while training $r_2$, you’d use gradient saliency (or similar methods) to find which regions of the input are most salient for $r_1$ and $r_2$, then penalize $r_2$ for sharing salient features with $r_1$. Similarly, each $r_i$ would be penalized w.r.t. the saliency maps of $\{r_j\}_{j<i}$.
Note that for gradient saliency specifically, you can optimize the penalty term directly with SGD, because differentiation is itself a differentiable operation. You can take a term like $\sum \lvert \nabla_x r_1 - \nabla_x r_2 \rvert$ (the elementwise difference between the two input-saliency maps, which you’d maximize as a diversity bonus, i.e. subtract from the training loss) and compute its gradient with respect to the model parameters (Some notes on doing this with PyTorch). Be aware, though, that some gradient saliency methods seem to fail basic sanity checks.
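As a concrete illustration, here is a minimal PyTorch sketch of that double-backprop trick. The toy models, data, and the choice to subtract the saliency-difference term from the loss (so SGD maximizes disagreement) are all illustrative assumptions, not anything from the links above:

```python
import torch
import torch.nn as nn

def input_saliency(model, x, keep_graph=False):
    # Gradient saliency: d(model output)/d(input). With keep_graph=True the
    # saliency map stays on the autograd graph, so a loss built from it can
    # itself be backpropagated into the model's parameters.
    x = x.detach().requires_grad_(True)
    (grad,) = torch.autograd.grad(model(x).sum(), x, create_graph=keep_graph)
    return grad

torch.manual_seed(0)
x = torch.randn(64, 16)   # toy batch of 16-dimensional inputs
y = torch.randn(64, 1)    # toy regression targets for the reward model

# Stand-ins for r1 and r2; Tanh rather than ReLU so second derivatives are
# non-trivial (ReLU's second derivative is zero almost everywhere).
r1 = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))
r2 = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(r2.parameters(), lr=1e-3)
lam = 0.1                 # weight of the saliency-diversity term

# Pretend r1 is already trained: its saliency map is a fixed target.
sal1 = input_saliency(r1, x).detach()

for step in range(100):
    opt.zero_grad()
    task_loss = nn.functional.mse_loss(r2(x), y)
    sal2 = input_saliency(r2, x, keep_graph=True)
    # Subtract the sum|sal1 - sal2| term so that SGD *increases* the
    # disagreement between the two saliency maps while fitting the task.
    loss = task_loss - lam * (sal1 - sal2).abs().mean()
    loss.backward()
    opt.step()
```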
Non-differentiable saliency methods like Shapley values can still serve as an optimization target, but you’ll need to use reinforcement learning or other non-gradient optimization approaches. That would probably be very hard.
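Even so, a crude sketch of what that might look like, using occlusion scores as a cheap stand-in for Shapley values and greedy random search as a stand-in for RL or evolution strategies (everything here is an illustrative assumption):

```python
import copy
import torch
import torch.nn as nn

def occlusion_saliency(model, x):
    # Non-differentiable saliency: score each input feature by how much
    # zeroing it changes the model's output (a crude Shapley-value stand-in).
    with torch.no_grad():
        base = model(x)
        scores = []
        for j in range(x.shape[1]):
            xm = x.clone()
            xm[:, j] = 0.0
            scores.append((base - model(xm)).abs().mean())
        return torch.stack(scores)

torch.manual_seed(0)
x = torch.randn(64, 8)
y = torch.randn(64, 1)

r1 = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 1))
sal1 = occlusion_saliency(r1, x)   # frozen r1's saliency profile

def score(model):
    # Task loss minus a bonus for differing from r1's saliency profile.
    with torch.no_grad():
        task = nn.functional.mse_loss(model(x), y)
    return (task - 0.1 * (sal1 - occlusion_saliency(model, x)).abs().sum()).item()

r2 = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 1))
best = score(r2)
for _ in range(500):
    cand = copy.deepcopy(r2)
    with torch.no_grad():
        for p in cand.parameters():
            p.add_(0.02 * torch.randn_like(p))   # random perturbation
    s = score(cand)
    if s < best:                                  # greedy hill climbing
        r2, best = cand, s
```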
You can also steer optimization to find ‘diverse’ models, as in Ridge Rider, which follows eigenvectors of the Hessian to find qualitatively diverse solutions: https://arxiv.org/abs/2011.06505
I’m not sure how necessary that is. If you want diverse good solutions, that sounds a lot like ‘sampling from the posterior’, and we know, thanks to Google burning a huge number of TPU-hours on true HMC sampling from Bayesian neural networks, that ‘deep ensembles’ (i.e., training multiple random initializations from scratch on the same dataset) actually give you a pretty good sample from the posterior. If there are lots of equally decent ways to classify an image expressible in a NN, a deep ensemble will sample from them (which is presumably why ensembling improves performance: each member is doing something different, rather than weighting the same features the same amount). If that’s not adequate, it’d be good to think about what one really wants instead, and how to build that in (maybe data augmentation that erases color from one dataset/model and shapes from another, to encourage a ventral-dorsal split or something).
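For concreteness, a deep ensemble in this sense is just the following (a minimal sketch with a toy model and toy data):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 16)
y = (x[:, 0] > 0).long()   # toy binary labels

def train_member(seed):
    # One ensemble member: same data, same architecture, its own random init.
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    return model

ensemble = [train_member(seed) for seed in range(5)]

with torch.no_grad():
    # Average the members' predictive distributions; each member acts as
    # (approximately) one sample from the posterior over functions.
    probs = torch.stack([m(x).softmax(-1) for m in ensemble]).mean(0)
```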
Thanks! Very useful feedback.