Vanessa Kosoy comments on Formal Solution to the Inner Alignment Problem

Vanessa Kosoy 26 Feb 2021 12:19 UTC
LW: 5 AF: 4
I mean, there’s obviously a lot more work to do, but this is progress. Specifically, if SGD is MAP then it seems plausible that e.g. SGD + random initial conditions or simulated annealing would give you something like top N posterior models. You can also extract confidence from the NNGP.
- evhub 26 Feb 2021 20:29 UTC
  LW: 4 AF: 3
  I agree that this is progress (now that I understand it better), though:
  
  > if SGD is MAP then it seems plausible that e.g. SGD + random initial conditions or simulated annealing would give you something like top N posterior models
  
  I think there is strong evidence that the behaviors of models trained via the same basic training process are likely to be highly correlated. This sort of correlation corresponds to low variance in the bias-variance tradeoff sense, and there is evidence not only that massive neural networks tend to have pretty low variance, but also that this variance is likely to continue to decrease as networks become larger.
  - Vanessa Kosoy 26 Feb 2021 20:54 UTC
    LW: 3 AF: 2
    Hmm, added to reading list, thank you.
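
The exchange above turns on how much independently seeded training runs actually differ. A minimal sketch of how one might measure that on a toy problem follows. Everything in it is an assumption for illustration, not something from the thread: the noisy-sine task, the random-feature model standing in for "one SGD run", and the helper names `make_data`, `fit_random_feature_model`, and `run_to_run_stats`. It reports the variance of predictions across seeds and the mean pairwise correlation between runs, and makes no claim to reproduce the evidence evhub refers to.

```python
import numpy as np


def make_data(n, rng):
    # Hypothetical 1-D regression task: a noisy sine wave.
    x = rng.uniform(-3.0, 3.0, size=(n, 1))
    y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=n)
    return x, y


def fit_random_feature_model(x, y, width, seed, ridge=1e-3):
    # Stand-in for one training run (an assumption, not real SGD):
    # the seed fixes a random hidden layer, and only the output
    # weights are fit, by ridge regression.
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x.shape[1], width))
    b = rng.uniform(0.0, 2.0 * np.pi, size=width)

    def features(inputs):
        return np.cos(inputs @ w + b) / np.sqrt(width)

    feats = features(x)
    gram = feats.T @ feats + ridge * np.eye(width)
    out_weights = np.linalg.solve(gram, feats.T @ y)
    return lambda x_new: features(x_new) @ out_weights


def run_to_run_stats(width, n_runs=8, n_train=40, n_test=200):
    # Same training data for every run; only the initialization seed changes.
    data_rng = np.random.default_rng(0)
    x_train, y_train = make_data(n_train, data_rng)
    x_test = np.linspace(-3.0, 3.0, n_test).reshape(-1, 1)

    # preds has shape (n_runs, n_test): one row of predictions per seed.
    preds = np.stack([
        fit_random_feature_model(x_train, y_train, width, seed)(x_test)
        for seed in range(1, n_runs + 1)
    ])

    # Variance across seeds at each test point, averaged over test points.
    variance = preds.var(axis=0).mean()

    # Mean correlation between the predictions of distinct runs.
    corr = np.corrcoef(preds)
    mean_corr = corr[~np.eye(n_runs, dtype=bool)].mean()
    return variance, mean_corr


if __name__ == "__main__":
    for width in (20, 200, 2000):
        variance, mean_corr = run_to_run_stats(width)
        print(f"width={width:5d}  variance across runs={variance:.4f}  "
              f"mean pairwise corr={mean_corr:.3f}")
```

If the runs come out highly correlated, an ensemble built from random restarts would cover less of the posterior than "top N models" suggests, which is the concern raised in the reply. Whether this toy shows that pattern depends on the assumed task and model.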