Why should the Occamian prior work so well in the real world? It’s a seemingly profound mystery that is asking to be dissolved.
To begin with, I propose a Lazy Razor and a corresponding Lazy prior:
Given several competing models of reality, we should select the one that is easiest to work with.
This is merely a formulation of the obvious trade-off between accuracy and cost. I would rather have a bad prediction today than a good prediction tomorrow or a great prediction ten years from now. Ultimately, this prior will deliver a good model, because it will let you try out many different models fast.
The concept of “easiness” may seem even more vague than “complexity”, but I believe that in any specific context its measurement should be clear. Note that “easiness” is measured in man-hours, dollars, etc.; it’s not to be confused with “hardness” in the sense of P and NP. If you still don’t know how to measure “easiness” in your context, you should use the Lazy prior to choose an “easiness” measurement procedure. To break the recursive loop, know that the Laziest of all models is called “pulling numbers out of your ass”.
Now let’s return to the first question. Why should the Occamian prior work so well in the real world?
The answer is that it doesn’t, not really. Of all the possible priors, the Occamian prior holds no special place. Its greatest merit is that it often resembles the Lazy prior in the probabilities it offers. Indeed, it is easy to see that a random model with a billion parameters is disliked by both priors, and that a model with two parameters is loved by both. By the way, its second greatest merit is being easy to work with.
Note that the priors are not interchangeable. One case where they disagree is in making use of existing resources. Suppose mathematics has derived powerful tools for working with A-theory but not B-theory. Then the Lazy prior would suggest that a complex model based on A-theory may be preferable to a simpler one based on B-theory. Or, suppose some process took millions of years to produce abundant and powerful meat-based computers. Then the Lazy prior would suggest that we make use of them in our models, regardless of their complexity, while the Occamian prior would object.
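To make the agreement and the disagreement concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the three candidate models, their parameter counts and man-hour costs, and the choice to weight each model by the inverse of its complexity (Occamian) or the inverse of its cost (Lazy). The post does not commit to any particular functional form, so treat this as one possible reading rather than a definition.

```python
# Hypothetical candidates: (name, complexity in parameters, cost in man-hours).
# All names and numbers are invented for illustration.
CANDIDATES = [
    ("simple B-theory model", 2, 400.0),       # few parameters, no tools to work with it
    ("complex A-theory model", 20, 40.0),      # more parameters, powerful existing tools
    ("random billion-parameter model", 1_000_000_000, 1_000_000.0),
]


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def occam_prior(candidates):
    """Assumed form: prior weight is inversely proportional to complexity."""
    return normalize({name: 1.0 / complexity for name, complexity, _ in candidates})


def lazy_prior(candidates):
    """Assumed form: prior weight is inversely proportional to cost of use."""
    return normalize({name: 1.0 / cost for name, _, cost in candidates})


if __name__ == "__main__":
    occam = occam_prior(CANDIDATES)
    lazy = lazy_prior(CANDIDATES)
    for name, _, _ in CANDIDATES:
        print(f"{name:32s} occam={occam[name]:.6f} lazy={lazy[name]:.6f}")
```

Under these made-up numbers both priors give the billion-parameter model the least weight, but they disagree about the top choice: the Occamian prior favors the simple B-theory model, while the Lazy prior favors the more complex A-theory model because the existing tools make it cheaper to use.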
Against Occam’s Razor