Is there a nice way to quantify how fast does these theoretical priors drop off with the length of something? By how much should I favor simple explanation X over only mediumly more complicated explanation Y.
Interesting question. If you have a countable infinity of mutually exclusive explanations (e.g. they are all finite strings using letters from some finite alphabet), then your only constraint is that the infinite sum of all their prior probabilities must converge to 1. Otherwise you’re free to choose. You could make the convergence really fast (say, by making the prior of a hypothesis inversely proportional to the exponent of the exponent of its length), or slower if you wish to. A very natural and popular choice is restricting the hypotheses to form a “prefix-free set” (no hypothesis can begin with another shorter hypothesis) and then assigning every hypothesis of N bits a prior of 2^-N, which makes the sum converge by Kraft’s inequality.
Apart from giving a simple formula for the prior, it comes in handy in other theoretical constructions. For example, if you have a “universal Turing machine” (a computer than can execute arbitrary programs) and feed it an infinite input stream of bits, perhaps coming from a random source because you intend to “execute a random program”… then it needs to know where the program ends. You could introduce an end-of-program marker, but a more general solution is to make valid programs form a prefix-free set, so that when the machine has finished reading a valid program, it knows that reading more bits won’t result in a longer but still valid program. (Note that adding an end-of-program marker is one of the ways to make your set of programs prefix-free!)
Overall this is a nice example of an idea that “just smells good” to a mathematician’s intuition.
Ah! I must have had a brain-stnank—this makes total sense in math / theoretical CS terms, I was substituting an incorrect interpretation of “hypothesis” when reading the comment out of context. Thanks :)
Yeah, I agree. The shortest god-programs are probably longer than the shortest physics-programs, just not “enormously” longer.
Probably enormously longer if you want it to produce a god that would cause the world to act in a way as if basic EM held.
ie, you don’t just need a mind, you need to specify the sort of mind that would want to cause the world to be in a specific way...
Is there a nice way to quantify how fast does these theoretical priors drop off with the length of something? By how much should I favor simple explanation X over only mediumly more complicated explanation Y.
Interesting question. If you have a countable infinity of mutually exclusive explanations (e.g. they are all finite strings using letters from some finite alphabet), then your only constraint is that the infinite sum of all their prior probabilities must converge to 1. Otherwise you’re free to choose. You could make the convergence really fast (say, by making the prior of a hypothesis inversely proportional to the exponent of the exponent of its length), or slower if you wish to. A very natural and popular choice is restricting the hypotheses to form a “prefix-free set” (no hypothesis can begin with another shorter hypothesis) and then assigning every hypothesis of N bits a prior of 2^-N, which makes the sum converge by Kraft’s inequality.
What is the reasoning behind using a prefix-free set?
Apart from giving a simple formula for the prior, it comes in handy in other theoretical constructions. For example, if you have a “universal Turing machine” (a computer than can execute arbitrary programs) and feed it an infinite input stream of bits, perhaps coming from a random source because you intend to “execute a random program”… then it needs to know where the program ends. You could introduce an end-of-program marker, but a more general solution is to make valid programs form a prefix-free set, so that when the machine has finished reading a valid program, it knows that reading more bits won’t result in a longer but still valid program. (Note that adding an end-of-program marker is one of the ways to make your set of programs prefix-free!)
Overall this is a nice example of an idea that “just smells good” to a mathematician’s intuition.
Ah! I must have had a brain-stnank—this makes total sense in math / theoretical CS terms, I was substituting an incorrect interpretation of “hypothesis” when reading the comment out of context. Thanks :)