You’re over-counting programs. I didn’t spell out definitions of “programming language” and “length”, but an important disideratum is that there has to only be finitely much “length” to go around, in the sense that ∑p∈Prexp(length(p)) must converge.
Under your proposal, the total amount of “length” is ∑∞n=12n⋅2−n=∑∞n=11=∞, so this is not an admissible notion of length.
(Note: this argument has nothing to do with the choice of base 2, and it applies equally well for all bases.)
Two common ways of repairing your notion of length are:
Using a prefix-free code for programs, such that if p is a program then it is not the prefix of any other program.
Thinking of programs as infinite bitstrings (which you might intuitively factor into a finite prefix describing a probabilistic program, and an infinite suffix serving as the random seed).
In the former case, when 001 is a valid code for a program p, then any extension to that code (like 0010 or 0011) is not valid, and so a total of 2−3 length is allocated to p and no other length is allocated to any program that starts with 001, and it’s not too hard to see that a total of 1 length is allocated overall.
In the latter case, we don’t count 001 as separate from 0010... and 0011...; we only count the infinite programs, and observe that a total of 2−3 length is allocated to the collection of all programs that start with 001, and that 1 length is allocated overall.
You ran into trouble because you were trying to count both the prefixes like 001and all their continuations, which is overcounting :-).
You’re over-counting programs. I didn’t spell out definitions of “programming language” and “length”, but an important disideratum is that there has to only be finitely much “length” to go around, in the sense that ∑p∈Prexp(length(p)) must converge.
Under your proposal, the total amount of “length” is ∑∞n=12n⋅2−n=∑∞n=11=∞, so this is not an admissible notion of length.
(Note: this argument has nothing to do with the choice of base 2, and it applies equally well for all bases.)
Two common ways of repairing your notion of length are:
Using a prefix-free code for programs, such that if p is a program then it is not the prefix of any other program.
Thinking of programs as infinite bitstrings (which you might intuitively factor into a finite prefix describing a probabilistic program, and an infinite suffix serving as the random seed).
In the former case, when 001 is a valid code for a program p, then any extension to that code (like 0010 or 0011) is not valid, and so a total of 2−3 length is allocated to p and no other length is allocated to any program that starts with 001, and it’s not too hard to see that a total of 1 length is allocated overall.
In the latter case, we don’t count 001 as separate from 0010... and 0011...; we only count the infinite programs, and observe that a total of 2−3 length is allocated to the collection of all programs that start with 001, and that 1 length is allocated overall.
You ran into trouble because you were trying to count both the prefixes like 001 and all their continuations, which is overcounting :-).
Thanks!