For every program T, all but a finite number of programs longer than T have lower probabilities.
Yeah, that’s correct. Note that it’s also correct if you remove “longer than T”: “For every program T, all but a finite number of other programs have lower probabilities”. Which is obviously true because you can’t have an infinite number of programs whose probability is above some threshold.
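To spell out the threshold step, here is a short derivation. The notation is mine, not from the thread: write p(S) for the probability of program S, and take the threshold to be ε = p(T), assuming p(T) > 0. Since the probabilities of all programs sum to at most 1:

```latex
\[
  \varepsilon \cdot \bigl|\{\, S : p(S) \ge \varepsilon \,\}\bigr|
  \;\le\; \sum_{S \,:\, p(S) \ge \varepsilon} p(S)
  \;\le\; \sum_{S} p(S)
  \;\le\; 1
  \quad\Longrightarrow\quad
  \bigl|\{\, S : p(S) \ge p(T) \,\}\bigr| \;\le\; \frac{1}{p(T)} .
\]
```

So at most 1/p(T) programs can be at least as probable as T, and every other program has strictly lower probability. Nothing here depends on program length.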
It’s a well-known argument, I learned it from Shalizi’s note. There’s other work trying to justify Occam’s razor when the set of hypotheses is finite, e.g. Kevin Kelly’s work.
Bayesian explanations of Ockham’s razor are based on a circular appeal to a prior bias toward simple possibilities.
I think it is possible to appeal to simpler principles, modifying some of the points I made above.
Indeed, I think it is possible to non-circularly explain why
the Razor is the rule which says “among the theories compatible with the evidence, choose the simplest”, and the question is why this leads to the truth better than a rule like “among the theories compatible with the evidence, choose the one whose statement involves the fewest occurrences of the letter ‘e’”, or even “the one which most glorifies Divine Providence”.
Sure, now that you’ve pointed it out I see that my conjecture was trivially true :)
I guess on the same line of thought you can informally deconstruct Occam’s razor:
every finite-complexity state of affairs can be equivalently explained by longer and longer hypotheses;
one of them must be true;
for any explanation, the list of longer explanations is infinite, but only grabs the finite remaining portion of probability mass;
so, except for a finite number of instances, all of them must have lower and lower probabilities.
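The steps above can be checked numerically on a toy case. This is only a sketch under assumptions of my own: every binary string counts as a "program", and the prior is a made-up length-based one, p(s) = 2^-(2n+1) for a string of length n, chosen because it sums to exactly 1 over all strings. It is not the Solomonoff prior, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def prior(program: str) -> Fraction:
    """Hypothetical prior: 2^-(2n+1) for a binary string of length n.

    There are 2^n strings of length n, so length n carries total mass
    2^-(n+1), and the masses over all lengths sum to 1.
    """
    return Fraction(1, 2 ** (2 * len(program) + 1))


def programs_up_to(max_len: int):
    """Enumerate every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def rivals_at_least_as_probable(t: str, max_len: int) -> int:
    """Count programs other than t (up to max_len) with prior >= prior(t)."""
    threshold = prior(t)
    return sum(
        1 for s in programs_up_to(max_len) if s != t and prior(s) >= threshold
    )


t = "0110"
count = rivals_at_least_as_probable(t, max_len=10)
bound = 1 / prior(t)  # no more than 1/p(T) programs can reach the threshold
print(count, bound)
```

With this toy prior the rivals are exactly the other strings of length at most 4, so the count is 30 against a bound of 512, and raising `max_len` does not change it: the infinitely many longer programs all fall below the threshold.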
Might be worth a post to informally deduce Occam’s razor.
Thanks for the very interesting papers.
In Kelly’s page, this
I think it is possible to appeal to simpler principles, modifying some of the points I made above.
Indeed, I think it is possible to non-circularly explain why