My take on the LTH is that pruning is basically just a weird way of doing optimization, so it's not that surprising that you can get good performance.
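To make the "pruning is just optimization" framing concrete, here's a minimal sketch (mine, not from the papers under discussion) in the spirit of the "supermask" result from Zhou et al. (2019): the weights are frozen at their random initialization, and gradient descent trains only a binary pruning mask via a straight-through estimator. The `MaskedLinear` class and the XOR task are hypothetical, purely illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Linear layer with frozen random weights; only a pruning mask is trained."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Weights stay fixed at random init and are never updated.
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features) / in_features ** 0.5,
            requires_grad=False,
        )
        # Real-valued scores; their sign determines which weights survive pruning.
        self.scores = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        # Straight-through estimator: binarize on the forward pass,
        # but let gradients flow to the underlying scores.
        hard_mask = (self.scores > 0).float()
        mask = hard_mask + self.scores - self.scores.detach()
        return F.linear(x, self.weight * mask)

# Tiny demo: "prune" a random net to fit XOR, updating only the mask scores.
net = nn.Sequential(MaskedLinear(2, 64), nn.ReLU(), MaskedLinear(64, 1))
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])
opt = torch.optim.Adam([p for p in net.parameters() if p.requires_grad], lr=0.01)
for step in range(2000):
    opt.zero_grad()
    loss = F.mse_loss(net(x), y)
    loss.backward()
    opt.step()
print(loss.item())  # loss falls toward 0 without a single weight ever changing
```

The point being: searching over which of the ~2^N weights to keep is itself a (gradient-guided) optimization over subnetworks, so the fact that a good subnetwork can be found by pruning is less mysterious than it might first sound.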
+1 to this in particular; I think this is the main point Daniel (and many people like Daniel) is missing here. There's a very big difference between "car detector functions exist somewhere in the random jumble of a sufficiently big randomly initialized NN" and "the net can be pruned to yield a car detector function", and the LTH papers show the latter.
I think I get this distinction, and I realize the LTH papers show the latter; I guess our disagreement is about how big a deal / how surprising this is.