This binary distinction is a gross oversimplification.
If you consider the space of all theories, the solomonoff prior—and thus regularized bayesian inference—is correct: the best model of data is an ensemble of submodels weighted by their data compression. That solution works out to a distribution over fundamental physical theories of everything.
But that isn’t the whole story—for each such minimal K-complexity theory there is an infinite hierarchy of functionally equivalent higher-complexity theories, and far more loose equivalents once we consider approximations.
These approximate, higher time-complexity (T) theories derive their correctness from how well they approximate some minimal-K theory. So once you factor in practical compute constraints, the most useful world models tend to be complex approximations of physics—as used in video games/simulations or ANNs.
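To make the “ensemble of submodels weighted by their data compression” picture concrete, here is a toy sketch (my own illustration, not anything from the post): each candidate “theory” gets a Solomonoff-style prior weight of 2^(-K) for an assumed description length K, and its posterior weight trades that complexity penalty off against how well it predicts the data. The hypothesis names and K values are made up for illustration.

```python
from fractions import Fraction

# Each hypothesis: (name, assumed description length K in bits, predictor giving P(next bit = 1 | history)).
HYPOTHESES = [
    ("always-0",    2, lambda h: Fraction(1, 100)),
    ("always-1",    2, lambda h: Fraction(99, 100)),
    ("alternating", 3, lambda h: 1 - (Fraction(h[-1]) if h else Fraction(1, 2))),
    ("fair-coin",   1, lambda h: Fraction(1, 2)),
]

def posterior_weights(data):
    """Weight each theory by 2^(-K) * P(data | theory): complexity penalty times predictive fit."""
    weights = []
    for name, k, predict_one in HYPOTHESES:
        likelihood = Fraction(1)
        for i, bit in enumerate(data):
            p_one = predict_one(data[:i])
            likelihood *= p_one if bit == 1 else 1 - p_one
        weights.append((name, Fraction(1, 2**k) * likelihood))
    total = sum(w for _, w in weights)
    return [(name, w / total) for name, w in weights]

def ensemble_predict(data):
    """Mixture prediction for the next bit: each theory's own prediction, weighted by its posterior."""
    return sum(w * predict_one(data)
               for (_, w), (_, _, predict_one) in zip(posterior_weights(data), HYPOTHESES))

history = [1, 0, 1, 0, 1, 0]
print(posterior_weights(history))          # the short "alternating" theory dominates the mixture
print(float(ensemble_predict(history)))    # so the ensemble assigns high probability to a 1 next
```

The point of the toy is only the shape of the solution: the “best model” isn’t a single winner but a distribution over theories, with shorter theories that fit the data getting most of the weight.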
what do you mean “the solomonoff prior is correct”? do you mean that you assign high prior likelihood to theories with low kolmogorov complexity?
this post claims: many people assign high prior likelihood to theories with low time complexity. and this is somewhat rational for them to do if they think that they would otherwise be susceptible to fallacious reasoning.
what do you mean “the solomonoff prior is correct”?
I mean it is so fundamentally correct that it is just how statistical learning works—all statistical learning systems that actually function well approximate bayesian learning (which uses a solomonoff/complexity prior). This includes the brain and modern DL systems, which implement various forms of P(M|E) ∝ P(E|M) P(M), i.e. they find approximate models which ‘compress’ the data by balancing predictive capability against model complexity.
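As a rough illustration of that P(M|E) ∝ P(E|M) P(M) balancing act (a minimal sketch of my own, not how the brain or any particular DL system implements it): score polynomial models of noisy data with a BIC-style approximation of the evidence plus a complexity prior. The noise level, the 8-bits-per-parameter prior, and the data-generating law are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
y = 1.5 * x - 0.8 * x**2 + rng.normal(0, 0.1, x.size)   # data from a simple degree-2 law plus noise

SIGMA = 0.1   # assumed (known) noise level

def log_posterior(degree):
    """Approximate log P(M|E) up to a constant: BIC-style log-evidence plus a complexity prior."""
    k = degree + 1                                              # number of parameters in this model
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    log_likelihood = -0.5 * np.sum((residuals / SIGMA) ** 2)
    log_evidence = log_likelihood - 0.5 * k * np.log(x.size)    # Laplace/BIC approximation of log P(E|M)
    log_prior = -k * 8 * np.log(2)                              # P(M) ∝ 2^(-bits), assuming ~8 bits per parameter
    return log_evidence + log_prior

for d in range(8):
    print(d, round(log_posterior(d), 1))
# Extra parameters are charged more by the evidence and prior terms than the marginal fit
# improvement they buy, so the score peaks at the low true degree rather than at the
# overfitting higher-degree models: prediction is balanced against model complexity.
```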
You could still be doing perfect bayesian reasoning regardless of your prior credences. Bayesian reasoning (at least as I’ve seen the term used) is agnostic about the prior, so there’s nothing defective about assigning a low prior to programs with high time-complexity.
This is true in the abstract, but the physical world seems to be such that difficult computations are done for free in the physical substrate (e.g., when you throw a ball, this seems to happen instantaneously, rather than having to wait for a lengthy derivation of the path it traces). This suggests a correct bias in favor of low-complexity theories regardless of their computational cost, at least in physics.
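For concreteness (my example, not the commenter's): Newton's laws plus drag are a very short, low-K theory of a thrown ball, yet extracting the path from them takes a couple of thousand computational steps that the physical substrate performs in real time for free. The mass-free drag coefficient, timestep, and launch velocity below are arbitrary assumptions.

```python
import numpy as np

G = np.array([0.0, -9.81])   # gravity, m/s^2
DRAG = 0.05                  # assumed linear drag coefficient, 1/s
DT = 0.001                   # timestep: many small steps stand in for the "lengthy derivation"

def trajectory(pos, vel):
    """Integrate the low-K theory step by step until the ball returns to the ground."""
    points = [pos]
    while pos[1] >= 0.0:
        accel = G - DRAG * vel            # the whole theory fits in one line...
        vel = vel + accel * DT
        pos = pos + vel * DT
        points.append(pos)
    return np.array(points)               # ...but producing the path takes ~2000 integration steps

path = trajectory(np.array([0.0, 0.0]), np.array([10.0, 10.0]))
print(len(path), "steps of the low-K theory; landing point ≈", path[-1].round(2))
```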