There are entirely too many ways to formalize S.I. that are basically equivalent, which makes it really difficult to discuss. I standardize on the “random bits on a read-only input tape, write-only output tape, and a single work tape initialized with zeroes” model, with the probability of a specific output string s being the prior for s. The machine M must be such that for any other machine M’ you can construct a prefix which, when placed at the beginning of the input tape for M, makes it exactly equivalent to M’. (The prefix works as an emulator: a program for the machine M’, with this prefix prepended, will run on M exactly as if it were running on M’.) It’s a very concise description, and you can get all the things like 2^-l out of that.
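To illustrate where a factor like 2^-l comes from in the random-tape picture, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_machine` is a hypothetical, deliberately non-universal machine (a 0 bit halts, a 1 bit copies the next input bit to the output), the work tape is omitted, and "the prior for s" is read as "the probability that the output begins with s". Since the shortest input prefix that forces the output to begin with an n-bit string is 2n bits long on this machine, exhaustive enumeration of the random tapes should give exactly 2^-(2n).

```python
from fractions import Fraction
from itertools import product


def toy_machine(tape):
    """Hypothetical toy machine, not universal: read a flag bit;
    0 halts, 1 copies the following input bit to the output."""
    out = []
    bits = iter(tape)
    for flag in bits:
        if flag == 0:
            break  # halt instruction
        nxt = next(bits, None)
        if nxt is None:
            break  # ran out of input mid-instruction
        out.append(nxt)
    return out


def prior(s, tape_len):
    """Exact probability that the output begins with s when the first
    tape_len input bits are uniformly random (needs tape_len >= 2 * len(s))."""
    target = [int(c) for c in s]
    hits = sum(
        1
        for tape in product((0, 1), repeat=tape_len)
        if toy_machine(tape)[: len(target)] == target
    )
    return Fraction(hits, 2 ** tape_len)


if __name__ == "__main__":
    # The shortest prefix producing "10" is 1,1,1,0 (length l = 4).
    print(prior("10", 6), Fraction(1, 2 ** 4))
```

On a real universal machine several distinct prefixes can produce the same output, so the prior is a sum of such 2^-l terms rather than a single one; the toy machine just makes one term visible.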
You mean “also two programs of length l+1”, right?
I think this comment by gjm addresses the “longer programs are as likely” idea: http://lesswrong.com/r/discussion/lw/jhm/understanding_and_justifying_solomonoff_induction/adfc
Yeah, that was a typo.