Past Account comments on Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian

Past Account 28 Feb 2021 18:22 UTC
LW: 1 AF: 1
[Deleted]
- Joar Skalse 3 Mar 2021 14:46 UTC
  LW: 2 AF: 2
  What I’m suggesting is that volume in high dimensions can concentrate on the boundary.
  Yes. I imagine this is why overtraining doesn’t make a huge difference.
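  To make the volume claim concrete, here is a minimal sketch under a simplifying assumption that is not from the paper: the region is treated as a unit ball, and `shell_fraction` is a hypothetical helper name. Since the volume of a ball of radius r scales as r^d, the fraction of volume within distance eps of the surface is 1 - (1 - eps)^d, which approaches 1 as the dimension d grows.

```python
def shell_fraction(dim: int, eps: float) -> float:
    """Fraction of a unit ball's volume lying within distance eps of its surface.

    Assumption: the region is a Euclidean unit ball, chosen only for illustration.
    The inner ball of radius (1 - eps) holds a (1 - eps)**dim share of the volume.
    """
    return 1.0 - (1.0 - eps) ** dim


if __name__ == "__main__":
    # Share of volume in the outer 1% shell for increasing dimension.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  outer-shell share={shell_fraction(dim, 0.01):.6f}")
```

  Real regions of parameter space are not balls, so this only shows the general geometric effect, not the shape of any particular function's preimage.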
  Falsifiable Hypothesis: Compare SGD with overtraining to the random sampling algorithm. You will see that functions that are unlikely to be generated by random sampling will be more likely under SGD with overtraining. Moreover, functions that are more likely with random sampling will become less likely under SGD with overtraining.
  See e.g., page 47 in the main paper.
  - Past Account 4 Mar 2021 5:48 UTC
    1 point
    [Deleted]