Take a simple set of rules. There is only one tape that corresponds to these precise rules.
Not quite. There’s an infinite number of tapes that correspond to these precise rules exactly. Each “program of length L” is also two programs of length L+1 , four programs of length L+2, and so on—the following bits are simply not read.
In general, there’s a large number of different ways how a program with same behaviour could be implemented, and for programs up to a specific length, the number is larger for programs that describe fundamentally simpler behaviours.
Not quite. There’s an infinite number of tapes that correspond to these precise rules exactly. Each “program of length L” is also two programs of length L+1 , four programs of length L+2, and so on—the following bits are simply not read.
That only works under some methods of specification of the model machine, and I don’t consider it to be a reasonable generalization to assume. It’s at best contested, and at worse semantic distinction-without-a-difference.
That only works under some methods of specification of the model machine
Should not be a problem for you to name one where it doesn’t work, right?
Universal TMs all have to permit setting up an interpreter for running the code for another UTM, you really have a lot less lee-way for “methods of specification” than you think.
If you are hung up on there being an actual difference in the output, rather than hand-waving about the special circumstances or such, you ought to provide at least a sketch of a set up (so that people can understand clearly how you get more of the simpler programs). For example, for programs that do halt, instead of halting you can make them copy extra bits from the input tape to the output tape—that would be really easy to set up. And the programs in general can be compactly made to halt after something like BusyBeaver(10) steps.
Whether those L+1, L+2, etc. count as different programs from the length-L one is, last I checked, contentious, and because theorists feel strongly about this, various different widely-used formalisms for defining what makes something a UTM disagree about whether unread tape matters. If someone pokes a hole in the argument in the great-grandparent so that the general conclusion works iff the 2 L+1, 4 L+2, etc. count as different universes, then it would become worth addressing. But as long as it works without it, I’m going to stick with the most difficult case.
Again, give an example for the assertions being made.
As for your argument, as others have pointed out, you did not prove anything about the extra length for setting up special output in very specific circumstances, or sketched how that could be accomplished. “Sticking with the most difficult case” you got into the region where you are unable to actually produce an argument. It is far less than obviously true (and may well be false) that the programs which are simple programs but with special output in very specific circumstances, are a notable fraction of large programs.
My knowledge of algorithmic information theory marks the approach advocated by private_messaging as “the best way” as established by years of experimenting with ways to specify priors over programs. I admit to little knowledge of the controversy, but agree with private_messaging’s insistence for a burden of providing-the-alternative on your part.
the approach advocated by private_messaging as “the best way”
What? There is no mention of this anywhere. I have no idea what you’re referring to with this phrase.
I’m not going to provide an alternative because it doesn’t matter. Using the alternate formulation where you count every padded program with the same results as being unique, you get the same results, with the addition of another underlying assumption. By assuming they do not count as the same, I (am attempting to) force potential objections to confront the actual basis for the argument.
So go ahead, if you like. Treat those as distinct and work out the consequences. You’ll reach precisely the same conclusions.
Not quite. There’s an infinite number of tapes that correspond to these precise rules exactly. Each “program of length L” is also two programs of length L+1 , four programs of length L+2, and so on—the following bits are simply not read.
In general, there’s a large number of different ways how a program with same behaviour could be implemented, and for programs up to a specific length, the number is larger for programs that describe fundamentally simpler behaviours.
That only works under some methods of specification of the model machine, and I don’t consider it to be a reasonable generalization to assume. It’s at best contested, and at worse semantic distinction-without-a-difference.
Should not be a problem for you to name one where it doesn’t work, right?
Universal TMs all have to permit setting up an interpreter for running the code for another UTM, you really have a lot less lee-way for “methods of specification” than you think.
If you are hung up on there being an actual difference in the output, rather than hand-waving about the special circumstances or such, you ought to provide at least a sketch of a set up (so that people can understand clearly how you get more of the simpler programs). For example, for programs that do halt, instead of halting you can make them copy extra bits from the input tape to the output tape—that would be really easy to set up. And the programs in general can be compactly made to halt after something like BusyBeaver(10) steps.
Whether those L+1, L+2, etc. count as different programs from the length-L one is, last I checked, contentious, and because theorists feel strongly about this, various different widely-used formalisms for defining what makes something a UTM disagree about whether unread tape matters. If someone pokes a hole in the argument in the great-grandparent so that the general conclusion works iff the 2 L+1, 4 L+2, etc. count as different universes, then it would become worth addressing. But as long as it works without it, I’m going to stick with the most difficult case.
Again, give an example for the assertions being made.
As for your argument, as others have pointed out, you did not prove anything about the extra length for setting up special output in very specific circumstances, or sketched how that could be accomplished. “Sticking with the most difficult case” you got into the region where you are unable to actually produce an argument. It is far less than obviously true (and may well be false) that the programs which are simple programs but with special output in very specific circumstances, are a notable fraction of large programs.
My knowledge of algorithmic information theory marks the approach advocated by private_messaging as “the best way” as established by years of experimenting with ways to specify priors over programs. I admit to little knowledge of the controversy, but agree with private_messaging’s insistence for a burden of providing-the-alternative on your part.
What? There is no mention of this anywhere. I have no idea what you’re referring to with this phrase.
I’m not going to provide an alternative because it doesn’t matter. Using the alternate formulation where you count every padded program with the same results as being unique, you get the same results, with the addition of another underlying assumption. By assuming they do not count as the same, I (am attempting to) force potential objections to confront the actual basis for the argument.
So go ahead, if you like. Treat those as distinct and work out the consequences. You’ll reach precisely the same conclusions.