Not sure that theorem gives us very much. Yeah, a mixture of all programs must include some programs that stop without outputting anything, so M(empty string) must be strictly greater than M(0)+M(1). But we can also make a semimeasure where M(empty string)=1 and M(0)=M(1)=1/2 by fiat, and which otherwise defers to a mixture. That semimeasure can’t itself be a mixture of all programs, but it will be just as good for sequence prediction. That’s all the theorem says. Basically, if a Swiss army knife solves all problems, we shouldn’t be surprised by the existence of other tools (like a Swiss army knife with an added fishing hook) that also solve all problems.
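To make the construction concrete, here is a minimal sketch on a finite toy example. It is not the universal mixture: the four "programs" and their weights are made up, and the halving rule for strings of length 2 or more is just one assumed reading of "otherwise defer to a mixture".

```python
from fractions import Fraction

# Hypothetical toy "programs": each prints a finite string and then stops.
# The outputs and weights are invented for illustration; weights sum to 1.
TOY_PROGRAMS = {
    "": Fraction(1, 4),    # halts without printing anything
    "0": Fraction(1, 4),
    "01": Fraction(1, 4),
    "1": Fraction(1, 4),
}


def mixture(x: str) -> Fraction:
    """Total weight of the toy programs whose output starts with x."""
    return sum(
        (w for out, w in TOY_PROGRAMS.items() if out.startswith(x)),
        Fraction(0),
    )


def patched(x: str) -> Fraction:
    """Set the first level by fiat, then defer to the mixture (halved)."""
    if x == "":
        return Fraction(1)
    if len(x) == 1:
        return Fraction(1, 2)
    # Assumed reading of "otherwise defer to a mixture": scale by 1/2 so
    # the semimeasure inequality still holds below the first bit.
    return mixture(x) / 2


# The program that prints nothing leaves a gap at the root of the mixture.
gap = mixture("") - (mixture("0") + mixture("1"))
# The patched version has no such gap: patched("") == patched("0") + patched("1").
```

In this toy case the mixture loses weight 1/4 at the root while the patched function loses none, and the patched function is never smaller than half the mixture, which is the sense in which it predicts "just as well" up to a constant factor.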
Yes, it’s true that the theorem doesn’t show that there’s anything exciting that’s interestingly different from a universal mixture. It only means that, AFAIK, we can’t rule that possibility out, and that if we want to rule it out, the theorem forces us to come up with a non-trivial notion of ‘interestingly different’.