I am struggling to understand part of how Solomonoff’s Induction is justified. Once we have all hypothesis that satisfy the data represented as a series of 0s and 1s, how can we say that some hypothesis are more likely than others? I understand that some strings have a higher probability of appearing within a certain sets (e.g. 101 is 2x as likely in the set of binaries of length 3 as 1111 is in the set of binaries of length 4) but how can we say that we are more likely to encounter 101 versus 1111 in the set of strings of length n (where n > 4)? Won’t both strings occur exactly once? Or is it the case that, let’s say, we are deciding between one hypothesis looks like “101” + “either 0 or 1 or nothing” versus 1111. In this case it is clear that the first is more likely than the second, but this would only be valid if 1010 and 1011 both were also valid algorithms, which doesn’t seem like it would necessarily be the case. Can someone help me understand this?
I am struggling to understand part of how Solomonoff’s Induction is justified. Once we have all hypothesis that satisfy the data represented as a series of 0s and 1s, how can we say that some hypothesis are more likely than others? I understand that some strings have a higher probability of appearing within a certain sets (e.g. 101 is 2x as likely in the set of binaries of length 3 as 1111 is in the set of binaries of length 4) but how can we say that we are more likely to encounter 101 versus 1111 in the set of strings of length n (where n > 4)? Won’t both strings occur exactly once? Or is it the case that, let’s say, we are deciding between one hypothesis looks like “101” + “either 0 or 1 or nothing” versus 1111. In this case it is clear that the first is more likely than the second, but this would only be valid if 1010 and 1011 both were also valid algorithms, which doesn’t seem like it would necessarily be the case. Can someone help me understand this?