In Solomonoff induction it is important to use a two-tape Turing machine where one tape is for the program and one is for the input and work space. The program tape is an infinite random string, but the program length is defined to be the number of bits that the Turing machine actually reads during its execution. This way the set of possible programs becomes a prefix free set. It follows that the prior probabilities will add up to one when you weight by 2^(-l) where l is program length. (I believe this was realized by Leonid Levin. In Solomonoff’s original scheme the prior probabilities did not add to one.) This also allows the beautiful interpretation that the program tape is assigned by independent coin flips for each bit, and the 2^-l weighting arises naturally rather than as an artificial assumption. I believe this is discussed in the information theory book by Cover and Thomas.
In Solomonoff induction it is important to use a two-tape Turing machine where one tape is for the program and one is for the input and work space. The program tape is an infinite random string, but the program length is defined to be the number of bits that the Turing machine actually reads during its execution. This way the set of possible programs becomes a prefix free set. It follows that the prior probabilities will add up to one when you weight by 2^(-l) where l is program length. (I believe this was realized by Leonid Levin. In Solomonoff’s original scheme the prior probabilities did not add to one.) This also allows the beautiful interpretation that the program tape is assigned by independent coin flips for each bit, and the 2^-l weighting arises naturally rather than as an artificial assumption. I believe this is discussed in the information theory book by Cover and Thomas.