A hypothesis can’t exclude things, only make positive predictions
Internally, the algorithm could work by ruling things out (“There are no black swans, so the world can’t be X”), but it must still completely specify everything. This may be clearer once you have the answer to your question, “What counts as a hypothesis for Solomonoff induction?”: a halting program for some universal Turing machine. The possible worlds are (in correspondence with) the elements of the space of possible outputs of that machine. So every “hypothesis” pins down everything exactly.
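To make that concrete, here is a toy sketch (not real Solomonoff induction, which is uncomputable). The three generator "programs", their names, and their lengths in bits are all made up for illustration. The point is only that each hypothesis outputs one complete sequence, and an observation "rules out" a hypothesis just by disagreeing with that output.

```python
from fractions import Fraction
from itertools import count, islice


def primes():
    """Yield 2, 3, 5, 7, ... using trial division."""
    for n in count(2):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n


def squares():
    """Yield 0, 1, 4, 9, ..."""
    for n in count(0):
        yield n * n


def evens():
    """Yield 0, 2, 4, 6, ..."""
    for n in count(0):
        yield 2 * n


# name -> (program, assumed length in bits). The lengths are invented
# stand-ins for real program lengths on a universal Turing machine.
HYPOTHESES = {
    "primes": (primes, 12),
    "squares": (squares, 10),
    "evens": (evens, 9),
}


def posterior(observed):
    """Weight each program by 2^-length, keep those matching the data."""
    weights = {}
    for name, (program, bits) in HYPOTHESES.items():
        prefix = list(islice(program(), len(observed)))
        if prefix == list(observed):
            weights[name] = Fraction(1, 2 ** bits)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

In this toy setup, `posterior([0])` leaves only "squares" and "evens" with nonzero weight, and `posterior([0, 1])` leaves only "squares". No hypothesis says "anything except primes"; each one commits to a full output.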
You may have also read some stuff about the Minimum Message Length formalization of Occam’s razor, and it may be affecting your intuitions. In that formalization, a message has two parts, a statement of the model and the data compressed using that model, and it’s more natural to use logical operations in the model part. That is, you could say something like “It’s the list of all primes OR the list of all squares. Compressed data: first number is zero”. Here, we’ve used a logical operation in the statement of the model, but it’s made our lossless compression of the data longer. This is a meaningful thing to do in this formalization (whereas it’s not really in Solomonoff induction), but the message we ended up with is definitely not the shortest one. So it doesn’t affect the prior, which depends only on the minimum message length.
“That is, you could say something like ‘It’s the list of all primes OR the list of all squares. Compressed data: first number is zero’”
Just to clarify here (because it took me a couple of seconds): you only need the first number in the compressed data because that is sufficient to distinguish whether you have the list of primes (which starts at 2) or the list of squares (which starts at 0). But as Pongo said, you could describe that same list in a much more compressed way by skipping the irrelevant half of the OR statement.
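A rough way to see the length difference, as a sketch with invented numbers: the bit costs in `MODEL_BITS` and `OR_BITS` are assumptions, not real encodings, and the function names are hypothetical. The OR message pays for both model descriptions, plus the connective, plus the data needed to pick a disjunct, while the direct message pays for one model only.

```python
import math

# Assumed, made-up costs in bits for stating each model.
MODEL_BITS = {"primes": 12, "squares": 10}
# Assumed cost of writing one OR connective.
OR_BITS = 2
# First element of each list, used to tell the disjuncts apart.
FIRST_TERM = {"primes": 2, "squares": 0}


def disjunctive_message_bits(disjuncts):
    """Length of 'A OR B ...' plus the data that selects one disjunct."""
    model = sum(MODEL_BITS[d] for d in disjuncts)
    model += OR_BITS * (len(disjuncts) - 1)
    selector = math.ceil(math.log2(len(disjuncts)))
    return model + selector


def direct_message_bits(name):
    """Length of stating one model; no selecting data is needed."""
    return MODEL_BITS[name]


def select(disjuncts, first_number):
    """Return the single disjunct whose list starts with first_number."""
    matches = [d for d in disjuncts if FIRST_TERM[d] == first_number]
    if len(matches) != 1:
        raise ValueError("first number does not pick out exactly one model")
    return matches[0]
```

With these assumed costs, `disjunctive_message_bits(["primes", "squares"])` comes to 25 bits against 10 for `direct_message_bits("squares")`, and `select(["primes", "squares"], 0)` picks "squares". Whatever the real costs are, the OR message contains the direct one as a part, so it can't be the minimum.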