“Best”: most accurate—i.e. when Occam’s razor says A is a more likely hypothesis than B, then that is actually true.
Then I would go for Turing machines, lambda calculus, or something similar. These languages are very simple and can easily handle input and output.
Even simpler languages, such as the Rule 110 cellular automaton or combinatory logic, might be better still, but it is quite difficult to get those to handle input and output correctly.
The reason simple languages, or universal machines, should be better is that, according to algorithmic information theory, the error in estimating probabilities is bounded by a factor of 2 to the power of the length of the program that simulates one language in the other.
So the simpler the language, the smaller that simulator can be, and the more accurate the relative probabilities it gives.
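In symbols (my notation, not from the comment above), this is the invariance theorem of algorithmic information theory: for any two universal languages U and V,

```latex
K_U(x) \;\le\; K_V(x) + c_{V \to U} \qquad \text{for every string } x,
```

where $c_{V \to U}$ is the length of a V-interpreter written in U. The algorithmic priors $2^{-K_U(x)}$ and $2^{-K_V(x)}$ therefore agree within a multiplicative factor of $2^{c}$, which is why a language with short interpreters gives a tighter bound on the error.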
The answer I gave before this one was for a different question: maximize the likelihood that the language will produce the output corresponding to the observed events.
You, however, want to compare two hypotheses: obtain two probabilities and compare them. In that case the absolute sizes of the probabilities do not matter, only their relative sizes, so you can just go for the simplest language that is practical to use.
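As a toy illustration of comparing hypotheses by relative description length only: this is a sketch, using zlib compression as a crude, computable stand-in for the (uncomputable) Kolmogorov complexity, and `simpler_hypothesis` is a name I have made up.

```python
import zlib

def description_length_bits(data: bytes) -> int:
    # Compressed size in bits: a rough, computable proxy for the
    # Kolmogorov complexity of the hypothesis encoding.
    return 8 * len(zlib.compress(data, 9))

def simpler_hypothesis(a: bytes, b: bytes) -> bytes:
    # Occam's razor via approximate MDL: prefer the hypothesis with the
    # shorter description.  Only the relative lengths matter, so any
    # constant overhead of the compressor cancels in the comparison.
    return a if description_length_bits(a) <= description_length_bits(b) else b

# Two encodings of equal raw length (1240 bytes each): one highly
# regular, one with little repeated structure for the compressor to use.
regular = b"all observed ravens are black; " * 40
irregular = bytes((i * 89 + 17) % 256 for i in range(1240))
```

Any real compressor imposes its own "language constant", but as noted above, that constant cancels when only the relative sizes are compared.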
What do you want to use this for?
Using simple languages is the conventional approach. However, simple languages typically result in more complex programs. The Game of Life is very simple, yet try writing a program in it. If you are trying to minimise the size of an emulator for other languages, then extremely simple languages don't seem to fit the bill.
Why would one want a decent formulation of Occam’s razor? To help solve the problem of the priors.
I agree. We seem to have the same goal, so my first advice stands, not my second.
I am currently trying to develop a language that is both simple and expressive, and I am making some progress. The overall design is finished, and I am now down to deciding which instructions it should have. It is a general bi-graph, but with a sequential program structure and no separation of program and data.
It is somewhat different from what you want, as I also need something that has measurable use of time and memory and is provably able to run fast.
Could I have a link, or maybe some more information about this language? It’s something I’m very interested in (merging expressiveness and guarantees about resource usage).
I'm sure that's not possible to do in general (in an algorithmic fashion), since it would solve, in a flash, certain mathematical problems that aren't supposed to be solvable in polynomial time, or even at all (like particular halting problems).
You'll have to hope for something less.
I think Tim’s intended meaning was more along the lines of “a language L such that when A is a more likely hypothesis than B, L’s MDL for A is shorter than its MDL for B more often than for any other language”.
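One way to make that precise (my own formalization, not Tim's wording, and it assumes some distribution over hypothesis pairs $(A, B)$):

```latex
L^{*} \;=\; \arg\max_{L}\; \Pr_{(A,B)}\!\left[ \mathrm{MDL}_{L}(A) < \mathrm{MDL}_{L}(B) \;\middle|\; P(A) > P(B) \right]
```

That is, the best language is the one that most often assigns the shorter description to the hypothesis that is actually more probable.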