johnswentworth comments on Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian

johnswentworth 3 Jan 2021 1:36 UTC
LW: 4 AF: 3
That’s a clever example, I like it.
Based on that description, it should be straightforward to generalize the Levin bound to neural networks. The main step would be to replace the Huffman code with a turbo code (or any other near-Shannon-bound code), at which point the compressibility is basically identical to the log probability density, and we can take the limit to continuous function space without any trouble. The main change is that entropy would become relative entropy (as is normal when taking info theory bounds to a continuous limit). Intuitively, it’s just using the usual translation between probability theory and minimum description length, and applying it to the probability density of parameter space.
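To make the "translation between probability theory and minimum description length" concrete, here is a minimal sketch. It is not from the post; it assumes a single parameter with a standard Gaussian prior, discretized into cells of width `eps` centred on the parameter value, and all function names are hypothetical. An ideal near-Shannon code spends about -log2(P(cell)) bits on a cell, which is roughly -log2(p(theta)) - log2(eps). The -log2(eps) term blows up as eps shrinks, but it cancels whenever two code lengths are compared, which is roughly why relative entropy is the quantity that survives the continuous limit.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x (the assumed prior over the parameter)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2) at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def cell_code_length_bits(theta, eps):
    """Ideal code length for the width-eps cell centred on theta."""
    p_cell = gaussian_cdf(theta + eps / 2.0) - gaussian_cdf(theta - eps / 2.0)
    return -math.log2(p_cell)

def density_code_length_bits(theta, eps):
    """Small-eps approximation: -log2 of the density, plus the eps offset."""
    return -math.log2(gaussian_pdf(theta)) - math.log2(eps)

def extra_bits(theta, reference, eps):
    """Bits needed for theta beyond what the reference point needs."""
    return cell_code_length_bits(theta, eps) - cell_code_length_bits(reference, eps)

if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        exact = cell_code_length_bits(0.5, eps)
        approx = density_code_length_bits(0.5, eps)
        gap = extra_bits(2.0, 0.0, eps)
        print(f"eps={eps:g}  cell={exact:.4f}  density={approx:.4f}  extra={gap:.4f}")
```

The absolute code length grows without bound as `eps` shrinks, while the `extra` column stays near log2(p(0)/p(2)) = 2/ln 2 bits regardless of the discretization.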