Am I understanding correctly that, if we use an exponential prior as the universal prior, the simplest hypothesis is twice as probable as the next hypothesis, which is 1 bit longer?
And that the simplest hypothesis has the same probability as all the longer hypotheses combined?
Yep, that’s right! And it’s a good thing to point out, since there’s a very strong bias towards whatever can be expressed simply. So the particular universal Turing machine you choose can matter a lot.
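To make the arithmetic concrete, here is a minimal sketch in Python. It assumes each hypothesis of length n bits gets the unnormalized weight 2^-n and, as in the question, that there is exactly one hypothesis per length; `length_prior` and `shortest` are names made up for this illustration.

```python
from fractions import Fraction


def length_prior(length_bits: int) -> Fraction:
    """Unnormalized weight 2^-length for a hypothesis of the given length in bits."""
    return Fraction(1, 2 ** length_bits)


# Hypothetical length (in bits) of the simplest hypothesis.
shortest = 1

# Ratio between the simplest hypothesis and the one that is 1 bit longer.
ratio = length_prior(shortest) / length_prior(shortest + 1)

# Combined weight of the next 50 longer hypotheses, assuming one per length.
tail = sum(length_prior(shortest + k) for k in range(1, 51))

# What is still missing for the tail to equal the simplest hypothesis.
gap = length_prior(shortest) - tail

print("ratio:", ratio)
print("gap:", gap)
```

The ratio is exactly 2, and the gap after 50 terms is 2^-51, which keeps halving as more terms are added. So the simplest hypothesis matches the combined weight of all the longer ones only in the limit, and only under the one-hypothesis-per-length assumption.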
However, in another sense, the choice is irrelevant. No matter which universal Turing machine is used for the universal prior, AIXI will still converge to the true probability distribution in the limit. Furthermore, for a certain very general definition of prior, the universal prior assigns more* probability to every possible hypothesis than any other prior does.
*“More” here means more up to a constant factor. So f(x) = x counts as more than g(x) = 2x, because we are allowed to say f(x) > (1/3)·g(x) for all x > 0.
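The footnote's example can be spot-checked with a small Python sketch. The functions `f` and `g` and the constant 1/3 come from the footnote; the sample points in `xs` are arbitrary positive values chosen for illustration, so this is a sanity check rather than a proof.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    return x


def g(x: Fraction) -> Fraction:
    return 2 * x


# Constant factor from the footnote.
c = Fraction(1, 3)

# Arbitrary positive sample points (an assumption; any x > 0 works).
xs = [Fraction(1, 1000), Fraction(1, 2), Fraction(1), Fraction(10), Fraction(10 ** 6)]

# f "is more than" g up to the factor c if f(x) > c * g(x) at every point.
dominates = all(f(x) > c * g(x) for x in xs)

print(dominates)
```

The check passes because c · g(x) = 2x/3, which is below x for every positive x, even though g(x) itself is larger than f(x).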