I thought about it some more and you’re right, my argument doesn’t work. I was imagining the universal prior as a mixture of deterministic programs printing infinite strings, which is wrong. Even for something as simple as a uniform prior, the only way to get it as a mixture of programs is by using programs that print finite strings, and letting the semimeasure renormalization do its magic (allowing longer programs to “inherit” the weight of shorter ones that terminate early). That’s how it can beat the mixture of all programs that print infinite strings, which can’t “inherit” each other’s weight in the same way.
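To make the “inheriting” concrete, here is a toy sketch, not the actual universal prior. It assumes a made-up machine in which each “program” just prints one finite bit string `s` and halts, with a hypothetical weight of (1/2)·4^(-len(s)), truncated at `MAX_LEN` bits. The names `weight`, `semimeasure`, and `normalized_measure` are mine, chosen for illustration. The raw mixture is only a semimeasure, since some mass stops at every prefix, but renormalizing each step hands that stopped mass to the continuations, and in this toy the result is exactly uniform.

```python
from fractions import Fraction
from itertools import product

MAX_LEN = 6  # assumption: truncate the toy mixture at strings of this length


def weight(s):
    """Hypothetical prior weight of the program that prints s and then halts."""
    return Fraction(1, 2) * Fraction(1, 4) ** len(s)


def all_strings(max_len):
    """Yield every bit string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def semimeasure(x, max_len=MAX_LEN):
    """Total weight of programs whose (finite) output starts with x."""
    return sum(
        (weight(s) for s in all_strings(max_len) if s.startswith(x)),
        Fraction(0),
    )


def normalized_conditional(x, bit, max_len=MAX_LEN):
    """Probability of the next bit after renormalizing away the halted mass."""
    denom = semimeasure(x + "0", max_len) + semimeasure(x + "1", max_len)
    return semimeasure(x + bit, max_len) / denom


def normalized_measure(x, max_len=MAX_LEN):
    """Chain the renormalized conditionals to get a proper measure on prefixes."""
    p = Fraction(1)
    for i, bit in enumerate(x):
        p *= normalized_conditional(x[:i], bit, max_len)
    return p


# Mass that halts exactly at "0" is missing from its two continuations...
deficit = semimeasure("0") - (semimeasure("00") + semimeasure("01"))
# ...but after renormalization every prefix of length n gets 2^-n.
uniform_check = normalized_measure("0110")
```

In this sketch `deficit` equals `weight("0")`, which is the weight the longer programs “inherit” once you renormalize. A countable mixture of programs printing infinite strings has no such deficit to redistribute, and it puts positive mass on each individual infinite string, which the uniform measure never does.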