In the ML example, generalization won’t work when approximating a function which is a completely random jumble of points.
Nice article, minor question. You seem to be treating random functions as qualitatively different from regular/some-flavor-of-deterministic ones (please correct me if that's not the case). Outside of purely mathematical settings, I'm not sure how that works, since you would expect some random noise in (or on top of) the data you are recording (and feeding your model), and that same noise would contaminate that determinism.
Also, when approximating a completely random jumble of points, can't you build models to infer the distribution those points are drawn from? I get it won't be as accurate when predicting, but I fail to see why that's not a matter of degree.
By random I just meant "no simple underlying regularity describes it succinctly." For example, a low-degree polynomial has a very short description length, while a random jumble of points doesn't (you need to write out the points one by one). This of course already assumes a description language.
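To make the description-length contrast concrete, here's a minimal sketch (the use of NumPy, a degree-3 polynomial, and a fixed random seed are all my own illustrative assumptions, not anything from the article): 100 samples of a low-degree polynomial are reproduced exactly from just 4 coefficients, while 100 random points leave a large residual no matter which 4 numbers you pick.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)

# Regular data: a degree-3 polynomial. Four coefficients
# describe all 100 points -- a very short description.
y_poly = 2 * x**3 - x + 0.5

# "Random jumble": 100 independent values. There is no shorter
# description than listing the points one by one.
y_rand = rng.standard_normal(100)

# Compress both down to 4 numbers (a degree-3 fit) and see
# how much of the data those 4 numbers actually recover.
for name, y in [("polynomial", y_poly), ("random jumble", y_rand)]:
    coeffs = np.polyfit(x, y, deg=3)
    resid = np.max(np.abs(np.polyval(coeffs, x) - y))
    print(f"{name}: max residual from 4 coefficients = {resid:.3f}")
```

The polynomial's residual is essentially zero (the 4 coefficients *are* the function), while the random data's residual stays large — which is the "issue of degrees" collapsing into a qualitative gap as the data gets less compressible.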