It’s a dichotomy in this specific case where science says “don’t care, same math, same predictions” and EY’s Bayes says “my model is simpler than yours, so it’s better”. The dichotomy disappears once the models are different experimentally, except that one should still strive to find the Kolmogorov-simplest model with the same predictive power. In any case, EY’s point, the way I understood it, is that when the scientific method fails (different models are not easily testable, like in economics, for example), one should “fall back” on Bayes.
“Striving to find it” and “moving closer to it at every opportunity” can be very different things.
When the “perfection” in question is something that you know is impossible to achieve (and in any given nontrivial case, you know you’ll be unable to establish you’ve achieved it even if by chance you did), establishing it as your goal—which is what “striving to find it” is—is foolish.
On the other hand, finding simpler models certainly is a good idea. But it’s good not because it gets us “closer at every opportunity” to the Kolmogorov-simplest model, for two reasons. One is stated in the parentheses above, and the second is that “closer” is almost meaningless, when you know that you not only cannot compute K, you can’t in general put upper bounds on it either (by Chaitin’s Incompleteness), which means that you have no idea how closer you’re getting to the ideal value with every particular simplification.
Well, I’m not buying K-complexity goal in particular, which is why I said only “perfection”; I’m making a different point. The thing about goals is that they are not up for grabs, they can’t in themselves be foolish, only actions or subgoals can be foolish. Foolishness must follow from incongruity with some higher goal (that cares nothing for efficiency or probability of success, mere instrumental drives), so if one’s goal is to optimize some hard-to-optimize quality whose level of optimality is also hard to gauge, that’s still what one should do, if only by taking hard-to-arrange accidental opportunities for improvement.
I guess we’re talking past each other then, because I (plausibly, I think, given the context) took your original reply to still refer to the Kolmogorov complexity goal. My beef is with that particular formulation, because I find it sometimes to be illegitimately overused for (what amounts to merely) emotional effect. I’m all for working on optimizing imperfectly-defined, hard-to-pin-down goals! Been doing that for a while with my life.
(the results are mixed)
The hypothetical still applies, I think. Suppose minimizing K-complexity happened to be one’s goal, then there are probably some steps that can be taken in its pursuit, and in any case it wouldn’t be right to call it “foolish” if it’s indeed the goal, even in the unlikely situation where nothing whatsoever can predictably advance it (maybe one should embark on a quest to find a Turing oracle or something). It might be foolish to think that it’s a human (sub)goal though, where it would clash with what we actually seek.
Is it known what is the highest complexity, beyond which the Chaitin’s Incompleteness applies? If it is relatively large, it is possible that all hypotheses interesting for humans have complexity lower than that...
In particular I wish to extract from that paper the following very simple (albeit non-constructive) proof of Chaitin’s Incompleteness:
Given a (reasonable, sound) formal theory F, we know that F cannot prove all true sentences of the form “Program P never halts” (the reason is that if it could, we could solve the halting problem by searching over all possible proofs in F for the proof of P either halting with a particular run, or never halting, being sure our search will finish in finite time). Consider the shortest program P such that P never halts but F cannot prove that fact. Let L(P) be its length. Claim: F can never prove that Kolmogorov complexity of anything can be greater than L(P). Proof: given any output X, F can never refute the possibility that P might yet halt at some future time and output exactly X. Therefore L(P) must remain a candidate for the Kolmogorov complexity of X as far as F is concerned.
Edit: nevermind this. I’ve realized the proof is wrong. It’s only true that “F can never refute the possibility that P might yet halt at some time and output some Y”, but it is not true that “F can never refute the possibility that P might yet halt and output this specific X”. It’s conceivable (albeit unusual) that P doesn’t halt, F is unable to prove that, but is able to prove that should P halt, its output will not be X. For example, think of P as a Turing machine with one halt state, which is easily “backwards-traceable” to a sequence of actions that erases the entire tape so far and writes out “123″. Then F can easily be strong enough to be able to prove that if P halts at all, it outputs “123” and not anything else.
I emailed the article’s author and he replied acknowledging the problem, which has been raised by a bunch of people before, and giving me links to a few paywalled articles with the correct exposition. However, this correct exposition is nowhere as succint and attractive as the short and faulty proof above.
Thanks! The paper does give an answer, obvious in hindsight: the threshold constant for a formal system F is determined by the shortest Turing Machine that does not halt, but that fact is not provable in F.
That’s a pity. Still, given any non-halting TM for which F cannot prove that, it is easy (costs very little additional constant complexity) to build a TM for which F also cannot prove that it does not return any x. And this still answers my original question (whether the threshold is very high) in the negative. So, bad for K-complexity, we need something better.
I was simply trying to interpret what EY wrote. I should have said something like “prefer a K-simpler model”. Personally, I do not support expending much effort looking for a simpler model with the same predictive power, except as a step to finding a model with better predictive power.
There are occasional examples of one thing being strictly simpler than another. For example, “lightning is thrown by Thor, and also Maxwell’s equations, Coulomb’s Law, and the atomic theory of matter are true” is simpler if you just cut out Thor. So you should cut out Thor. So you should at least strive to that extent :P
Agreed that striving to find simpler theories is, generally speaking, a worthy goal. What I tried to emphasize is that striving to find the simplest one—in the particular Kolmogorov sense—isn’t.
It’s a dichotomy in this specific case where science says “don’t care, same math, same predictions” and EY’s Bayes says “my model is simpler than yours, so it’s better”. The dichotomy disappears once the models are different experimentally, except that one should still strive to find the Kolmogorov-simplest model with the same predictive power. In any case, EY’s point, the way I understood it, is that when the scientific method fails (different models are not easily testable, like in economics, for example), one should “fall back” on Bayes.
It is foolish to strive to find the Kolmogorov-simplest model, because that task is known to be impossible.
Fallacy of Gray.
The Fallacy of Gray is a good post, but my comment is not an instance of such a fallacy.
The point he was trying to make is that it is not foolish to strive for a Kolmogorov-simpler model.
If always finding perfection is impossible, striving to find it and moving closer to it at every opportunity isn’t.
“Striving to find it” and “moving closer to it at every opportunity” can be very different things.
When the “perfection” in question is something that you know is impossible to achieve (and in any given nontrivial case, you know you’ll be unable to establish you’ve achieved it even if by chance you did), establishing it as your goal—which is what “striving to find it” is—is foolish.
On the other hand, finding simpler models certainly is a good idea. But it’s good not because it gets us “closer at every opportunity” to the Kolmogorov-simplest model, for two reasons. One is stated in the parentheses above, and the second is that “closer” is almost meaningless, when you know that you not only cannot compute K, you can’t in general put upper bounds on it either (by Chaitin’s Incompleteness), which means that you have no idea how closer you’re getting to the ideal value with every particular simplification.
Well, I’m not buying K-complexity goal in particular, which is why I said only “perfection”; I’m making a different point. The thing about goals is that they are not up for grabs, they can’t in themselves be foolish, only actions or subgoals can be foolish. Foolishness must follow from incongruity with some higher goal (that cares nothing for efficiency or probability of success, mere instrumental drives), so if one’s goal is to optimize some hard-to-optimize quality whose level of optimality is also hard to gauge, that’s still what one should do, if only by taking hard-to-arrange accidental opportunities for improvement.
I guess we’re talking past each other then, because I (plausibly, I think, given the context) took your original reply to still refer to the Kolmogorov complexity goal. My beef is with that particular formulation, because I find it sometimes to be illegitimately overused for (what amounts to merely) emotional effect. I’m all for working on optimizing imperfectly-defined, hard-to-pin-down goals! Been doing that for a while with my life. (the results are mixed)
The hypothetical still applies, I think. Suppose minimizing K-complexity happened to be one’s goal, then there are probably some steps that can be taken in its pursuit, and in any case it wouldn’t be right to call it “foolish” if it’s indeed the goal, even in the unlikely situation where nothing whatsoever can predictably advance it (maybe one should embark on a quest to find a Turing oracle or something). It might be foolish to think that it’s a human (sub)goal though, where it would clash with what we actually seek.
Is it known what is the highest complexity, beyond which the Chaitin’s Incompleteness applies? If it is relatively large, it is possible that all hypotheses interesting for humans have complexity lower than that...
In particular I wish to extract from that paper the following very simple (albeit non-constructive) proof of Chaitin’s Incompleteness:
Given a (reasonable, sound) formal theory F, we know that F cannot prove all true sentences of the form “Program P never halts” (the reason is that if it could, we could solve the halting problem by searching over all possible proofs in F for the proof of P either halting with a particular run, or never halting, being sure our search will finish in finite time). Consider the shortest program P such that P never halts but F cannot prove that fact. Let L(P) be its length. Claim: F can never prove that Kolmogorov complexity of anything can be greater than L(P). Proof: given any output X, F can never refute the possibility that P might yet halt at some future time and output exactly X. Therefore L(P) must remain a candidate for the Kolmogorov complexity of X as far as F is concerned.
Edit: nevermind this. I’ve realized the proof is wrong. It’s only true that “F can never refute the possibility that P might yet halt at some time and output some Y”, but it is not true that “F can never refute the possibility that P might yet halt and output this specific X”. It’s conceivable (albeit unusual) that P doesn’t halt, F is unable to prove that, but is able to prove that should P halt, its output will not be X. For example, think of P as a Turing machine with one halt state, which is easily “backwards-traceable” to a sequence of actions that erases the entire tape so far and writes out “123″. Then F can easily be strong enough to be able to prove that if P halts at all, it outputs “123” and not anything else.
I emailed the article’s author and he replied acknowledging the problem, which has been raised by a bunch of people before, and giving me links to a few paywalled articles with the correct exposition. However, this correct exposition is nowhere as succint and attractive as the short and faulty proof above.
LOL
I don’t know much about it, but searching leads to this pretty interesting-looking paper which argues that the bound is largely incidental: http://www.mv.helsinki.fi/home/praatika/chaitinJPL.pdf
Thanks! The paper does give an answer, obvious in hindsight: the threshold constant for a formal system F is determined by the shortest Turing Machine that does not halt, but that fact is not provable in F.
Unfortunately, this turns out to be subtly wrong—see my update in a sibling comment.
That’s a pity. Still, given any non-halting TM for which F cannot prove that, it is easy (costs very little additional constant complexity) to build a TM for which F also cannot prove that it does not return any x. And this still answers my original question (whether the threshold is very high) in the negative. So, bad for K-complexity, we need something better.
I was simply trying to interpret what EY wrote. I should have said something like “prefer a K-simpler model”. Personally, I do not support expending much effort looking for a simpler model with the same predictive power, except as a step to finding a model with better predictive power.
There are occasional examples of one thing being strictly simpler than another. For example, “lightning is thrown by Thor, and also Maxwell’s equations, Coulomb’s Law, and the atomic theory of matter are true” is simpler if you just cut out Thor. So you should cut out Thor. So you should at least strive to that extent :P
Agreed that striving to find simpler theories is, generally speaking, a worthy goal. What I tried to emphasize is that striving to find the simplest one—in the particular Kolmogorov sense—isn’t.