How do you define an extended version of a program?
I agree to what you’re saying only as long an “extended version” of a program is the same program, just padded with meaningless bits at the end.
I don’t agree that its true regarding longer programs in general. A “longer version” of a program is any program that produces the same output, including programs that use different algorithms, programs that are padded at the beginning, padded in the middle etc.
To get the mathematical expression of the Universal prior, we count all programs, and here the extended versions comes into play. Intuitively, a program padded with a single bit is counted twice (as it may be padded with 0 or 1), a program padded with 2 bits is counted 4 times, etc. So a program with length L+k is 2^k times less likely than as program with length L. That’s where the 2^{-length(p)} probability comes from.
How do you define an extended version of a program?
I agree to what you’re saying only as long an “extended version” of a program is the same program, just padded with meaningless bits at the end.
I don’t agree that its true regarding longer programs in general. A “longer version” of a program is any program that produces the same output, including programs that use different algorithms, programs that are padded at the beginning, padded in the middle etc.
To get the mathematical expression of the Universal prior, we count all programs, and here the extended versions comes into play. Intuitively, a program padded with a single bit is counted twice (as it may be padded with 0 or 1), a program padded with 2 bits is counted 4 times, etc. So a program with length L+k is 2^k times less likely than as program with length L. That’s where the 2^{-length(p)} probability comes from.