Kolmogorov complexity is definitely a misleading path here, and it’s unfortunate that Joar chose it as the “leading” example of complexity in the post. Note this passage:
However, they do not give a detailed answer to the question of precisely which complexity measure they minimise—they merely show that this result holds for many different complexity measures. For example, I would expect that fully connected neural networks are biased towards functions with low Boolean circuit complexity, or something very close to that. Verifying this claim, and deriving similar results about other kinds of network architectures, would make it easier to reason about what kinds of functions we should expect a neural network to be likely or unlikely to learn. This would also make it easier to reason about out-of-distribution generalisation, etc.
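That claim seems testable in miniature (a minimal sketch below, assuming a 1-hidden-layer tanh MLP and compressed truth-table length as the complexity proxy; none of this comes from the post itself): sample fully connected networks with random weights, read off the Boolean function each one computes, and check whether those truth tables compress better than those of uniformly random functions.

```python
# Minimal sketch, not the authors' experiment: sample random-weight MLPs on n
# Boolean inputs and use compressed truth-table length as a crude, computable
# proxy for descriptive complexity. If fully connected networks are biased
# towards simple functions, their truth tables should compress far better than
# those of uniformly random Boolean functions.
import itertools
import zlib
import numpy as np

rng = np.random.default_rng(0)
n = 10                                             # truth table has 2**n bits
X = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))  # all 2**n inputs

def random_mlp_truth_table(width=64):
    """Boolean function computed by a 1-hidden-layer tanh MLP with i.i.d. Gaussian weights."""
    W1 = rng.normal(size=(n, width)) / np.sqrt(n)
    b1 = rng.normal(size=width)
    w2 = rng.normal(size=width) / np.sqrt(width)
    return ((np.tanh(X @ W1 + b1) @ w2) > 0).astype(np.uint8)

def proxy_complexity(bits):
    """Compressed length in bytes: an upper-bound-style stand-in for K(f)."""
    return len(zlib.compress(np.packbits(bits).tobytes(), 9))

mlp = [proxy_complexity(random_mlp_truth_table()) for _ in range(200)]
unif = [proxy_complexity(rng.integers(0, 2, size=2**n, dtype=np.uint8))
        for _ in range(200)]
print(f"random MLPs:      mean compressed size = {np.mean(mlp):.1f} bytes")
print(f"random functions: mean compressed size = {np.mean(unif):.1f} bytes")
```

Compressed length is of course not Boolean circuit complexity, but the two tend to order functions similarly, which is all a sanity check like this needs.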
This quote from the above comment is better:
If we want to explain generalisation in neural networks, then we must explain if and how their inductive bias aligns with our (human) priors. Moreover, our human priors are (in most contexts) largely captured by computational complexity. Therefore, we must somewhere, in some way, connect neural networks to computational complexity.
I’ve expressed this idea with some links here.

Then if we combine two claims:

Joar’s “DNNs are (kind of) Bayesian” (for reasons I don’t claim to understand, since I haven’t read their papers; I just take his word here), and

Fields et al.’s “brains are ‘almost’ Bayesian, because Bayesian learning is information-efficient (and hence energy-efficient), and there is strong evolutionary pressure for animal brains to be energy-efficient”,

is this an explanation of DNNs’ remarkable generalisation ability? Or should more quantification be added to both of these claims to turn this into a good explanation?
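To sketch what “more quantification” might look like (this form is my own; neither source states it this way, and the constants a, b and the proxy K̃ are placeholders):

```latex
% Sketch only: how the two claims might combine, with placeholder constants.
% \tilde{K}(f) is any computable complexity proxy
% (e.g. compressed description length of f).
\begin{align*}
  \text{Claim 1, quantified:}\quad & P(f) \approx 2^{-a\,\tilde{K}(f) - b},
      \qquad a, b > 0, \\
  \text{Bayesian update:}\quad & P(f \mid D) =
      \frac{P(D \mid f)\,P(f)}{\sum_{f'} P(D \mid f')\,P(f')}.
\end{align*}
```

Under a prior of this shape, the posterior concentrates on the simplest functions consistent with the data; making the combined claim precise would amount to estimating a and b for real networks and checking that K̃ tracks the complexity of the target functions we actually care about.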
Thank you for this—I agree with what you are saying here. In the post, I went with a somewhat loose equivocation between “good priors” and “a prior towards low Kolmogorov complexity”, but this does skim past a lot of nuance. I also very much do not want to say that the DNN prior is exactly a prior towards low Kolmogorov complexity (this would be uncomputable), but only that it is mostly correlated with Kolmogorov complexity for typical problems.
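To make the “correlated, not exact” point concrete (a sketch with arbitrarily chosen compressors and test strings, not anything from our papers): K(x) itself is uncomputable, but every compressor gives a computable upper bound on it, and different such bounds tend to rank objects in nearly the same order.

```python
# Sketch of the "correlated proxies" point (my own choice of compressors and
# test data): K(x) is uncomputable, but any compressor's output length is a
# computable upper bound on it, and different such bounds largely agree on
# which objects are simple and which are complex.
import zlib
import lzma
import numpy as np

rng = np.random.default_rng(0)

# Strings with varying amounts of structure: a random core of length 2**k,
# tiled out to 512 bytes. Small k => highly regular; large k => near-random.
strings = []
for k in range(1, 9):
    core = rng.bytes(2 ** k)
    strings.append((core * (512 // len(core)))[:512])

zlib_len = np.array([len(zlib.compress(s, 9)) for s in strings])
lzma_len = np.array([len(lzma.compress(s)) for s in strings])

def ranks(a):
    return np.argsort(np.argsort(a))

# Spearman (rank) correlation between the two computable upper bounds on K.
r = np.corrcoef(ranks(zlib_len), ranks(lzma_len))[0, 1]
print("rank correlation between zlib and lzma proxies:", round(float(r), 3))
```

This is also what makes a statement like “mostly correlated with Kolmogorov complexity” testable at all: one never measures K directly, only a family of proxies that agree with one another.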