To be clear, SI (Solomonoff induction) can’t learn *which* program, only *some* program with the same functional behavior (or, depending on the setup, the same functional behavior on the prefixes of the string in question).
long-run truths that can be learned by universal priors
Hm. Say we have universal priors P and Q, and write x|n for the first n bits of x. For computable infinite strings x, P’s and Q’s predictions on x|n converge to each other, because they both converge to the right answer. For any x, P(x|n) and Q(x|n) are always within some fixed constant factor of each other, by the mutual-domination property of universal priors. I conjecture, very unconfidently, that for any P there’s a universal Q and a string x such that |P(x|n) − Q(x|n)| > c for some fixed c > 0, for infinitely many n. I don’t even have an idea for how to show this, and don’t know if it’s true, but the intuition is: an adversary choosing x should be able to find a machine M (or rather, an equivalence class of machines that make the same predictions; this trips me up in constructing Q) on which Q and P put different prior weight, and which is consistent so far with the bits x|n already chosen. The adversary then confirms M heavily by choosing more bits of x, so that the posterior mass on M dominates, and the difference between Q’s and P’s weight on M is ballooned up maximally (subject to the constraint that P and Q each dominate the other up to a multiplicative constant). Then the adversary picks a different M′ and repeats, and so on.
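As a toy sanity check of the convergence claim (not real Solomonoff induction: a tiny finite hypothesis class of deterministic sequences stands in for the space of programs, and two hand-picked priors stand in for the universal priors P and Q), a sketch of how two different Bayesian mixtures agree on a computable string once the data rules out the competing hypotheses:

```python
# Toy stand-in for universal priors: Bayesian mixtures over a tiny class of
# deterministic infinite binary strings. All names here are illustrative.
hyps = [
    lambda n: 0,      # "program" 0: all zeros
    lambda n: 1,      # "program" 1: all ones
    lambda n: n % 2,  # "program" 2: alternating 0101...
]

# Two different priors over the same hypotheses (stand-ins for P and Q).
P0 = [0.8, 0.1, 0.1]
Q0 = [0.1, 0.8, 0.1]

def predict_next(prior, prefix):
    """Posterior-predictive probability that the next bit is 0,
    after conditioning the prior on the observed prefix."""
    n = len(prefix)
    # A deterministic hypothesis has likelihood 1 if it reproduces the
    # observed prefix exactly, and 0 otherwise.
    post = [w if all(h(i) == b for i, b in enumerate(prefix)) else 0.0
            for w, h in zip(prior, hyps)]
    total = sum(post)
    post = [w / total for w in post]
    return sum(w for w, h in zip(post, hyps) if h(n) == 0)

# Before any data, the two mixtures disagree a lot; on the computable
# string 0000..., two bits suffice to eliminate the rivals, after which
# their predictions coincide exactly.
prefix = []
for n in range(5):
    prefix.append(0)
print(abs(predict_next(P0, prefix) - predict_next(Q0, prefix)))  # -> 0.0
```

In this toy model the disagreement vanishes after finitely many bits because the hypothesis class is finite; the conjecture above is about whether, with the genuinely infinite class behind a universal prior, an adversary can keep re-inflating the disagreement forever.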