I think that this is a slightly wrong account of the case for Solomonoff induction. The claim is not just that Solomonoff induction predicts computable environments better than any computable predictor, but rather that the Solomonoff prior is an enumerable semimeasure that is also a mixture over every enumerable semimeasure, and therefore predicts computable environments at least as well as any other enumerable semimeasure. So, using your notation, R∈S={all enumerable semimeasures}. It still fails as a theory of embedded agency, since it only predicts computable environments, but it's not true that we must only compare it to prediction strategies strictly weaker than itself. The paper "(Non-)Equivalence of Universal Priors" has a decent discussion of this.
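To spell out the dominance argument behind "at least as well as any other enumerable semimeasure": if M is a mixture over the class S of enumerable semimeasures, with prior weight w_ν > 0 on each ν∈S, then

```latex
M(x) \;=\; \sum_{\nu \in S} w_\nu\, \nu(x) \;\ge\; w_\nu\, \nu(x)
\quad\Longrightarrow\quad
-\log M(x_{1:n}) \;\le\; -\log \nu(x_{1:n}) \,+\, \log \tfrac{1}{w_\nu}.
```

So M's cumulative log-loss on any sequence exceeds that of any ν in the class by at most the constant log(1/w_ν), uniformly in n. This is the standard dominance bound; the point is that it applies to every enumerable semimeasure in S, not merely to computable predictors.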
Although it’s also worth noting that as per Theorem 16 of the above paper, not all universally dominant enumerable semimeasures are versions of the Solomonoff prior, so there’s the possibility that the Solomonoff prior only does well by finding a good non-Solomonoff distribution and mimicking that.
Not sure that theorem gives us very much. Yeah, a mixture of all programs must include some programs that halt without outputting anything, so M(empty string) must be strictly greater than M(0)+M(1). But we can also define a semimeasure that sets M(empty string)=1 and M(0)=M(1)=1/2 by fiat, and otherwise defers to a (renormalized) universal mixture. Such a semimeasure can't itself be a mixture of all programs, but it will be just as good for sequence prediction. That's all the theorem says. Basically, if a Swiss army knife solves all problems, we shouldn't be surprised by the existence of other tools (like a Swiss army knife with an added fishing hook) that also solve all problems.
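A sketch of that patched semimeasure, writing M for a universal mixture, ε for the empty string, and b for the first bit (and glossing over whether the renormalization preserves lower semicomputability):

```latex
M'(\varepsilon) = 1, \qquad
M'(b) = \tfrac{1}{2} \;\; \text{for } b \in \{0,1\}, \qquad
M'(bx) = \tfrac{1}{2}\,\frac{M(bx)}{M(b)} \;\; \text{for } x \neq \varepsilon.
```

Then M'(ε) = M'(0) + M'(1), so M' loses no mass at the root and cannot be a mixture over all programs; yet since M(b) ≤ 1 we get M'(bx) ≥ M(bx)/2 everywhere, so M' dominates M and predicts at least as well as M does.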
Yes, it's true that the theorem doesn't show that there's anything exciting that's interestingly different from a universal mixture; just that, AFAIK, we can't rule that out, and the theorem forces us to come up with a non-trivial notion of 'interestingly different' if we want to.